ParGramBank: The ParGram parallel treebank

Sebastian Sulger, Miriam Butt, Tracy Holloway King, Paul Meurer, Tibor Laczkó, György Rákosi, Cheikh Bamba Dione, Helge Dyvik, Victoria Rosén, Koenraad De Smedt, Agnieszka Patejuk, Özlem Çetinoǧlu, I. Wayan Arka, Meladel Mistica

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

This paper discusses the construction of a parallel treebank currently involving ten languages from six language families. The treebank is based on deep LFG (Lexical-Functional Grammar) grammars that were developed within the framework of the ParGram (Parallel Grammar) effort. The grammars produce output that is maximally parallelized across languages and language families. This output forms the basis of a parallel treebank covering a diverse set of phenomena. The treebank is publicly available via the INESS treebanking environment, which also allows for the alignment of language pairs. We thus present a unique, multilayered parallel treebank that represents more and different types of languages than are available in other treebanks, that represents deep linguistic knowledge and that allows for the alignment of sentences at several levels: dependency structures, constituency structures and POS information.

Original language: English
Title of host publication: 51st Annual Meeting of the Association for Computational Linguistics
Subtitle of host publication: Proceedings of the Conference
Place of Publication: Sofia, Bulgaria
Publisher: Association for Computational Linguistics (ACL)
Pages: 550-560
Number of pages: 11
Volume: 1
ISBN (Print): 9781937284503
Publication status: Published - 1 Jan 2013
Externally published: Yes
Event: Annual Meeting of the Association for Computational Linguistics 2013 - Sofia, Bulgaria
Duration: 4 Aug 2013 to 9 Aug 2013
Conference number: 51st
http://www.acl2013.org/site/
https://aclweb.org/conference/acl2013/

Conference

Conference: Annual Meeting of the Association for Computational Linguistics 2013
Abbreviated title: ACL 2013
Country: Bulgaria
City: Sofia
Period: 4/08/13 to 9/08/13

Keywords

  • Treebanks

Cite this

Sulger, S., Butt, M., King, T. H., Meurer, P., Laczkó, T., Rákosi, G., ... Mistica, M. (2013). ParGramBank: The ParGram parallel treebank. In 51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference (Vol. 1, pp. 550-560). Sofia, Bulgaria: Association for Computational Linguistics (ACL).
Sulger, Sebastian ; Butt, Miriam ; King, Tracy Holloway ; Meurer, Paul ; Laczkó, Tibor ; Rákosi, György ; Dione, Cheikh Bamba ; Dyvik, Helge ; Rosén, Victoria ; De Smedt, Koenraad ; Patejuk, Agnieszka ; Çetinoǧlu, Özlem ; Arka, I. Wayan ; Mistica, Meladel. / ParGramBank : The ParGram parallel treebank. 51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference. Vol. 1 Sofia, Bulgaria : Association for Computational Linguistics (ACL), 2013. pp. 550-560
@inproceedings{4bc8e4f688e9429cb90705a286c8e3ec,
title = "ParGramBank: The ParGram parallel treebank",
abstract = "This paper discusses the construction of a parallel treebank currently involving ten languages from six language families. The treebank is based on deep LFG (Lexical-Functional Grammar) grammars that were developed within the framework of the ParGram (Parallel Grammar) effort. The grammars produce output that is maximally parallelized across languages and language families. This output forms the basis of a parallel treebank covering a diverse set of phenomena. The treebank is publicly available via the INESS treebanking environment, which also allows for the alignment of language pairs. We thus present a unique, multilayered parallel treebank that represents more and different types of languages than are available in other treebanks, that represents deep linguistic knowledge and that allows for the alignment of sentences at several levels: dependency structures, constituency structures and POS information.",
keywords = "Treebanks",
author = "Sebastian Sulger and Miriam Butt and King, {Tracy Holloway} and Paul Meurer and Tibor Laczk{\'o} and Gy{\"o}rgy R{\'a}kosi and Dione, {Cheikh Bamba} and Helge Dyvik and Victoria Ros{\'e}n and {De Smedt}, Koenraad and Agnieszka Patejuk and {\"O}zlem {\c{C}}etino{\v{g}}lu and Arka, {I. Wayan} and Meladel Mistica",
year = "2013",
month = "1",
day = "1",
language = "English",
isbn = "9781937284503",
volume = "1",
pages = "550--560",
booktitle = "51st Annual Meeting of the Association for Computational Linguistics",
publisher = "Association for Computational Linguistics (ACL)",

}

Sulger, S, Butt, M, King, TH, Meurer, P, Laczkó, T, Rákosi, G, Dione, CB, Dyvik, H, Rosén, V, De Smedt, K, Patejuk, A, Çetinoǧlu, Ö, Arka, IW & Mistica, M 2013, ParGramBank: The ParGram parallel treebank. in 51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference. vol. 1, Association for Computational Linguistics (ACL), Sofia, Bulgaria, pp. 550-560, Annual Meeting of the Association for Computational Linguistics 2013, Sofia, Bulgaria, 4/08/13.

ParGramBank : The ParGram parallel treebank. / Sulger, Sebastian; Butt, Miriam; King, Tracy Holloway; Meurer, Paul; Laczkó, Tibor; Rákosi, György; Dione, Cheikh Bamba; Dyvik, Helge; Rosén, Victoria; De Smedt, Koenraad; Patejuk, Agnieszka; Çetinoǧlu, Özlem; Arka, I. Wayan; Mistica, Meladel.

51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference. Vol. 1 Sofia, Bulgaria : Association for Computational Linguistics (ACL), 2013. p. 550-560.

TY - GEN
T1 - ParGramBank
T2 - The ParGram parallel treebank
AU - Sulger, Sebastian
AU - Butt, Miriam
AU - King, Tracy Holloway
AU - Meurer, Paul
AU - Laczkó, Tibor
AU - Rákosi, György
AU - Dione, Cheikh Bamba
AU - Dyvik, Helge
AU - Rosén, Victoria
AU - De Smedt, Koenraad
AU - Patejuk, Agnieszka
AU - Çetinoǧlu, Özlem
AU - Arka, I. Wayan
AU - Mistica, Meladel
PY - 2013/1/1
Y1 - 2013/1/1
AB - This paper discusses the construction of a parallel treebank currently involving ten languages from six language families. The treebank is based on deep LFG (Lexical-Functional Grammar) grammars that were developed within the framework of the ParGram (Parallel Grammar) effort. The grammars produce output that is maximally parallelized across languages and language families. This output forms the basis of a parallel treebank covering a diverse set of phenomena. The treebank is publicly available via the INESS treebanking environment, which also allows for the alignment of language pairs. We thus present a unique, multilayered parallel treebank that represents more and different types of languages than are available in other treebanks, that represents deep linguistic knowledge and that allows for the alignment of sentences at several levels: dependency structures, constituency structures and POS information.
KW - Treebanks
UR - http://www.scopus.com/inward/record.url?scp=84906928249&partnerID=8YFLogxK
M3 - Conference Paper
SN - 9781937284503
VL - 1
SP - 550
EP - 560
BT - 51st Annual Meeting of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
CY - Sofia, Bulgaria
ER -

Sulger S, Butt M, King TH, Meurer P, Laczkó T, Rákosi G et al. ParGramBank: The ParGram parallel treebank. In 51st Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference. Vol. 1. Sofia, Bulgaria: Association for Computational Linguistics (ACL). 2013. p. 550-560