Abstract
Sociocultural norms serve as guiding principles for personal conduct in social interactions within a particular society or culture. The study of norm discovery has developed significantly over the last few years, producing a variety of approaches. However, these approaches are difficult to apply to a new culture, because they rely on either human annotations or real-world dialogue content. This paper presents a robust automatic norm discovery pipeline that combines the cultural knowledge of GPT-3.5 Turbo (ChatGPT) with several social factors. By using these social factors and ChatGPT, our pipeline avoids both human dialogues, which tend to be limited to specific scenarios, and human annotations, which make enlarging the dataset difficult and costly. The resulting database, Multicultural Norm Base (MNB), covers 6 distinct cultures, with over 150k sociocultural norm statements in total. A state-of-the-art Large Language Model (LLM), Llama 3, fine-tuned on our proposed dataset, achieves strong results on various downstream tasks, significantly outperforming models fine-tuned on other datasets.
Original language | English |
---|---|
Title of host publication | CoNLL 2024 - 28th Conference on Computational Natural Language Learning, Proceedings of the Conference |
Editors | Libby Barak, Malihe Alikhani |
Place of Publication | Kerrville TX USA |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 24-35 |
Number of pages | 12 |
ISBN (Electronic) | 9798891761780 |
Publication status | Published - 2024 |
Event | Conference on Natural Language Learning 2024 - Miami, United States of America; Duration: 15 Nov 2024 → 16 Nov 2024; Conference number: 28th; https://aclanthology.org/volumes/2024.conll-1/ (Proceedings); https://conll.org/2024 (Website) |
Conference
Conference | Conference on Natural Language Learning 2024 |
---|---|
Abbreviated title | CoNLL 2024 |
Country/Territory | United States of America |
City | Miami |
Period | 15/11/24 → 16/11/24 |