Automatic identification of protagonist in fairy tales using verb

Hui Ngo Goh, Lay Ki Soon, Su Cheng Haw

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

9 Citations (Scopus)

Abstract

Named entity recognition (NER) is a well-studied problem in text mining that locates atomic elements in text and classifies them into predefined categories, of which "names of people" is among the most commonly studied. Numerous NER techniques have been developed to meet the needs of particular applications. However, most of this research has focused on the non-fiction domain. The fiction domain makes locating the protagonist more complex and uncertain, because a character that takes the role of a person can come from a diverse spectrum, ranging from living things (animals, plants, people) to non-living things (vehicles, furniture). This paper proposes automated protagonist identification in the fiction domain, particularly in fairy tales. Verbs are used as a determinant to substantiate the existence of a protagonist, with the assistance of WordNet. The experimental results show that it is viable to use verbs to identify named entities, particularly in the "people" category, and that the approach can be applied to small texts.
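The idea of using verbs as evidence for a character can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the algorithm, so the function name `candidate_protagonists`, the small hand-written `AGENT_VERBS` set (a hypothetical stand-in for a WordNet lookup), and the `SKIP_WORDS` list are all assumptions made for illustration. The sketch simply counts the words that appear immediately before a verb that usually takes an animate agent.

```python
import re
from collections import Counter

# Assumption: a tiny hand-made verb list standing in for a WordNet-based
# check of whether a verb typically describes the action of a person.
AGENT_VERBS = {"said", "asked", "cried", "thought", "ran",
               "walked", "replied", "laughed", "wept", "decided"}

# Assumption: function words and pronouns that should not be counted
# as character mentions in this toy example.
SKIP_WORDS = {"the", "a", "an", "and", "then", "he", "she", "it", "they"}


def candidate_protagonists(text):
    """Rank words by how often they directly precede an agent verb."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", text)]
    counts = Counter()
    # Look at each adjacent (word, next word) pair in the text.
    for subject, verb in zip(tokens, tokens[1:]):
        if verb in AGENT_VERBS and subject not in SKIP_WORDS:
            counts[subject] += 1
    return counts.most_common()


if __name__ == "__main__":
    tale = ("The fox said hello. The crow laughed. "
            "Then the fox ran away and the fox thought about the cheese.")
    print(candidate_protagonists(tale))
```

On the sample text, "fox" precedes three agent verbs and "crow" one, so "fox" ranks first even though it is an animal rather than a conventional person name. A real system would need part-of-speech tagging, lemmatisation, and pronoun resolution, none of which this sketch attempts.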

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 16th Pacific-Asia Conference, PAKDD 2012, Proceedings
Pages: 395-406
Number of pages: 12
Edition: PART 2
Publication status: Published - 2012
Externally published: Yes
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2012 - Kuala Lumpur, Malaysia
Duration: 29 May 2012 to 1 Jun 2012
Conference number: 16th
https://link.springer.com/book/10.1007/978-3-642-30217-6 (Proceedings)

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 7301 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2012
Abbreviated title: PAKDD 2012
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 29/05/12 to 1/06/12
Keywords

  • characters
  • fairy tales
  • Named entity recognition
  • text mining
