Abstract
Malaysian English is a low-resource creole language that poses distinct challenges for natural language processing tasks such as Named Entity Recognition (NER) and Relation Extraction (RE). In this paper, we propose a Human-in-the-Loop (HITL) annotation methodology that addresses these challenges and improves the annotation process for NER and RE in Malaysian English. Applying this methodology, we expanded the MEN Dataset from 6,061 to 12,456 entities and from 4,095 to 7,794 relation instances, roughly doubling both. This outcome suggests that the same methodology could be used to expand resources for other low-resource languages.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of 2025 International Conference on Asian Language Processing, IALP 2025 |
| Editors | Lei Wang, Rong Tong, Sarah Flora Samson Juan, Yanfeng Lu, Ping Ping Tan, Suhaila Saee, Minghui Dong |
| Place of Publication | Piscataway NJ USA |
| Publisher | IEEE, Institute of Electrical and Electronics Engineers |
| Pages | 165-169 |
| Number of pages | 5 |
| ISBN (Electronic) | 9798331589790 |
| ISBN (Print) | 9798331589806 |
| Publication status | Published - 2025 |
| Event | International Conference on Asian Language Processing (IALP) 2025, Sarawak, Malaysia; Duration: 4 Aug 2025 → 6 Aug 2025; Conference number: 29th; https://ieeexplore.ieee.org/xpl/conhome/11156192/proceeding (Proceedings); https://www.colips.org/conferences/ialp2025/wp/ (Website) |
Conference
| Conference | International Conference on Asian Language Processing (IALP) 2025 |
|---|---|
| Abbreviated title | IALP 2025 |
| Country/Territory | Malaysia |
| City | Sarawak |
| Period | 4/08/25 → 6/08/25 |
Keywords
- Human-in-the-Loop Annotation
- Low-Resource Language
- Malaysian English
- Named Entity Recognition
- Relation Extraction