Mitigating Translationese in Low-resource Languages: The Storyboard Approach

  • Garry Kuwanto
  • Eno Abasi Urua
  • Priscilla Amuok
  • Shamsuddeen Hassan Muhammad
  • Anuoluwapo Aremu
  • Verrah Otiende
  • Loice Nanyanga
  • Teresiah Nyoike
  • Aniefon Akpan
  • Nsima Udouboh
  • Idongesit Archibong
  • Idara Moses
  • Ifeoluwatayo Ige
  • Benjamin Ajibade
  • Olumide Awokoya
  • Idris Abdulmumin
  • Saminu Mohammad Aliyu
  • Ruqayya Iro
  • Ibrahim Said Ahmad
  • Deontae Smith
  • Praise E.L. Michaels
  • David Ifeoluwa Adelani
  • Derry Tanti Wijaya
  • Anietie Andy

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Low-resource languages often face challenges in acquiring high-quality language data due to the reliance on translation-based methods, which can introduce the translationese effect. This phenomenon results in translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach for data collection by leveraging storyboards to elicit more fluent and natural sentences. Our method involves presenting native speakers with visual stimuli in the form of storyboards and collecting their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency. Human annotators and quantitative metrics were used to assess translation quality. The results indicate a preference for text translation in terms of accuracy, while our method demonstrates lower accuracy but better fluency in the target language.

Original language: English
Title of host publication: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Place of Publication: Paris, France
Publisher: European Language Resources Association (ELRA)
Pages: 11349-11360
Number of pages: 12
ISBN (Electronic): 9782493814104
Publication status: Published - 2024
Event: Joint International Conference on Computational Linguistics and International Conference on Language Resources and Evaluation 2024 - Hybrid, Torino, Italy
Duration: 20 May 2024 to 25 May 2024
https://aclanthology.org/volumes/2024.lrec-main/ (Proceedings)
https://lrec-coling-2024.org/ (Website)

Conference

Conference: Joint International Conference on Computational Linguistics and International Conference on Language Resources and Evaluation 2024
Abbreviated title: LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 20/05/24 to 25/05/24

Keywords

  • Low-resource languages
  • Translation Data
  • Translationese
