Abstract
Computational morphological resources are a crucial component in providing the morphological information needed to create a morphological analyser. Acquiring these resources manually involves two main components, preprocessing and morphology induction, which raise two issues from the perspective of under-resourced languages: i) the process is time-consuming, and ii) managing the resources is prone to ambiguity. To overcome these issues, we propose a tool for the automatic acquisition of morphological resources that extends the manual approach. The tool comprises three main modules: i) tokenisation, which tokenises raw text and generates a wordlist; ii) conversion, which converts a softcopy of morphological resources into the required formats; and iii) integration of segmentation tools, which integrates two established segmentation tools, Linguistica and Morfessor, to obtain morphological information from the generated wordlist. Two testing methods were conducted: component testing and integration testing. The results indicate that the proposed tool has been successfully devised and that it is effective, allowing linguists to obtain their wordlists and segmented data easily. We believe the proposed tool will help other researchers construct computational morphological resources for under-resourced languages in an automated way.
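The tokenisation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token pattern, the function names `build_wordlist` and `format_wordlist`, the lowercasing step, and the "count word" output layout are all assumptions made for illustration, and the sample sentence is invented.

```python
import re
from collections import Counter

# Assumed token pattern (not from the paper): runs of Unicode letters,
# optionally joined by internal hyphens or apostrophes, so that
# reduplicated forms such as "kucing-kucing" stay as one token.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def build_wordlist(raw_text):
    """Tokenise raw text and return a Counter mapping word -> frequency."""
    tokens = TOKEN_RE.findall(raw_text.lower())
    return Counter(tokens)


def format_wordlist(counts):
    """Render the wordlist as 'count word' lines (a hypothetical layout),
    sorted by descending frequency and then alphabetically."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{count} {word}" for word, count in ordered)


if __name__ == "__main__":
    # Invented sample text, used only to exercise the functions.
    sample = "Kucing itu makan. Kucing-kucing itu sedang makan ikan."
    print(format_wordlist(build_wordlist(sample)))
```

A wordlist produced this way would then need to be converted into whatever input format the chosen segmentation tool expects, which is the role the abstract assigns to the conversion module.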
Original language | English |
---|---|
Title of host publication | Proceedings of the International Conference on Asian Language Processing 2014, IALP 2014 |
Editors | Minghui Dong, Yanfeng Lu, Rafael E. Banchs, Bali Ranaivo-Malancon |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 203-206 |
Number of pages | 4 |
ISBN (Electronic) | 9781479953301 |
Publication status | Published - 3 Dec 2014 |
Externally published | Yes |
Event | International Conference on Asian Language Processing (IALP) 2014, Kuching, Malaysia. Duration: 20 Oct 2014 → 22 Oct 2014. Proceedings: https://ieeexplore.ieee.org/xpl/conhome/6960147/proceeding |
Conference
Conference | International Conference on Asian Language Processing (IALP) 2014 |
---|---|
Abbreviated title | IALP 2014 |
Country/Territory | Malaysia |
City | Kuching |
Period | 20/10/14 → 22/10/14 |
Keywords
- computational morphological resources
- morphological analyser
- pre-processing
- under-resourced language