Title The ParlaMint corpora of parliamentary proceedings
Authors Erjavec, Tomaz ; Ogrodniczuk, Maciej ; Osenova, Petya ; Ljubesic, Nikola ; Simov, Kiril ; Pancur, Andrej ; Rudolf, Michal ; Kopp, Matyas ; Barkarson, Starkadur ; Steingrimsson, Steinthor ; Coltekin, Cagri ; de Does, Jesse ; Depuydt, Katrien ; Agnoloni, Tommaso ; Venturi, Giulia ; Calzada Perez, Maria ; de Macedo, Luciana D ; Navarretta, Costanza ; Luxardo, Giancarlo ; Coole, Matthew ; Rayson, Paul ; Morkevičius, Vaidas ; Krilavičius, Tomas ; Dargis, Roberts ; Ring, Orsolya ; van Heusden, Ruben ; Marx, Maarten ; Fiser, Darja
DOI 10.1007/s10579-021-09574-0
Is Part of Language resources and evaluation. Dordrecht : Springer Nature. 2023, vol. 57, iss. 1, p. 415-448. ISSN 1574-020X. eISSN 1574-0218
Keywords [eng] parliamentary proceedings ; comparable corpora ; TEI
Abstract [eng] This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis.
Published Dordrecht : Springer Nature
Type Journal article
Language English
Publication date 2023