Improving word alignment of rare words with word embeddings

Masoud Jalili Sabet, Heshaam Faili, Gholamreza Haffari

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

3 Citations (Scopus)

Abstract

We address the problem of inducing word alignments for language pairs by developing an unsupervised model that can also be applied to other generative alignment models. We approach the task by: i) proposing a new alignment model, based on IBM alignment Model 1, that uses vector representations of words, and ii) examining the use of similar source words to overcome the problem of rare source words and improve the alignments. We apply our method to English-French corpora and run experiments with different numbers of sentence pairs. Our results show competitive performance against the baseline and, in some cases, improve on it by up to 6.9% in terms of precision.

Original language: English
Title of host publication: COLING 2016 - 26th International Conference on Computational Linguistics
Subtitle of host publication: Proceedings of COLING 2016: Technical Papers
Editors: Yuji Matsumoto, Rashmi Prasad
Place of Publication: Singapore
Publisher: Association for Computational Linguistics (ACL)
Pages: 3209-3215
Number of pages: 7
ISBN (Electronic): 9784879747020
Publication status: Published - 2016
Event: International Conference on Computational Linguistics 2016 - Osaka, Japan
Duration: 11 Dec 2016 to 16 Dec 2016
Conference number: 26th
https://coling2016.anlp.jp/
https://dblp.org/db/conf/coling/coling2016.html (Proceedings)

Conference

Conference: International Conference on Computational Linguistics 2016
Abbreviated title: COLING 2016
Country/Territory: Japan
City: Osaka
Period: 11/12/16 to 16/12/16