Title ParlaMint: comparable corpora of European parliamentary data /
Authors Erjavec, Tomaž ; Ogrodniczuk, Maciej ; Osenova, Petya ; Pančur, Andrej ; Ljubešič, Nikola ; Agnoloni, Tommaso ; Barkarson, Starkaður ; Pérez, María Calzada ; Çöltekin, Çağrı ; Coole, Matthew ; Dargis, Roberts ; de Macedo, Luciana D ; de Does, Jesse ; Depuydt, Katrien ; Diwersy, Sascha ; Hansen, Dorte Haltrup ; Kopp, Matyáš ; Krilavičius, Tomas ; Luxardo, Giancarlo ; Marx, Maarten ; Morkevičius, Vaidas ; Navarretta, Costanza ; Rayson, Paul ; Ring, Orsolya ; Rudolf, Michał ; Simov, Kiril ; Steingrímsson, Steinþór ; Üveges, István ; van Heusden, Ruben ; Venturi, Giulia
Is Part of Proceedings of CLARIN annual conference 2021, 27-29 September, 2021, virtual edition / edited by Monica Monachini, Maria Eskevich. Utrecht : Utrecht University. 2021, p. 20-25
Abstract [eng] This paper outlines the ParlaMint project from the perspective of its goals, tasks, participants, results and applications potential. The project produced language corpora from the sessions of the national parliaments of 17 countries, almost half a billion words in total. The corpora are split into COVID-related subcorpora (from November 2019) and reference corpora (to October 2019). The corpora are uniformly encoded according to the ParlaMint schema with the same Universal Dependencies linguistic annotations. Samples of the corpora and conversion scripts are available from the project’s GitHub repository. The complete corpora are openly available via the CLARIN.SI repository for download, and through the NoSketch Engine and KonText concordancers as well as through the Parlameter interface for exploration and analysis.
Published Utrecht : Utrecht University
Type Conference paper
Language English
Publication date 2021