Linguistic unit discovery from multi-modal inputs in unwritten languages: summary of the “Speaking Rosetta” JSALT 2017 workshop

Odette Scharenborg, Laurent Besacier, Alan Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

Abstract

We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding the discovery of linguistic units (subwords and words) in a language without orthography. We study the replacement of orthographic transcriptions by images and/or translated text in a well-resourced language to help unsupervised discovery from raw speech.
Original language: English
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Editors: Dan Schonfeld
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 4979-4983
Number of pages: 5
ISBN (Electronic): 9781538646588, 9781538646571
ISBN (Print): 9781538646595
Publication status: Published - 2017
Externally published: Yes
Event: IEEE International Conference on Acoustics, Speech and Signal Processing 2018 - Calgary, Canada
Duration: 15 Apr 2018 to 20 Apr 2018
https://2018.ieeeicassp.org/
https://ieeexplore.ieee.org/xpl/conhome/8450881/proceeding (Proceedings)

Conference

Conference: IEEE International Conference on Acoustics, Speech and Signal Processing 2018
Abbreviated title: ICASSP 2018
Country/Territory: Canada
City: Calgary
Period: 15/04/18 to 20/04/18

Keywords

  • unwritten languages
  • multi-modal data
  • unsupervised unit discovery
  • image retrieval
  • machine translation
