Classification and pattern discovery of mood in weblogs

Thin Nguyen, Dinh Phung, Brett Adams, Truyen Tran, Svetha Venkatesh

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

19 Citations (Scopus)

Abstract

Automatic data-driven analysis of mood from text is an emerging problem with many potential applications. Unlike generic text categorization, mood classification based on textual features is complicated by various factors, including its context- and user-sensitive nature. We present a comprehensive study of different feature selection schemes in machine learning for the problem of mood classification in weblogs. Notably, we introduce the novel use of a feature set based on the Affective Norms for English Words (ANEW) lexicon studied in psychology. This feature set is computationally efficient while maintaining accuracy comparable to the other state-of-the-art feature sets we evaluated. In addition, we present results of data-driven clustering on a dataset of over 17 million blog posts with mood ground truth. Our analysis reveals an interesting, and readily interpreted, structure in the linguistic expression of emotion, one that constitutes valuable empirical evidence in support of existing psychological models of emotion, in particular the dipoles pleasure-displeasure and activation-deactivation.
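The abstract does not spell out how the lexicon-based features are computed. As a rough illustration only, one plausible reading is a bag-of-words vector restricted to the lexicon's vocabulary, which keeps the feature space small. In the sketch below, `LEXICON` is a hypothetical four-word stand-in rather than the actual ANEW word list, and `lexicon_features` is an illustrative name, not a function from the paper.

```python
import re
from collections import Counter

# Hypothetical stand-in vocabulary; the real ANEW lexicon is not reproduced here.
LEXICON = ["happy", "sad", "angry", "calm"]


def lexicon_features(text, lexicon=LEXICON):
    """Return one count per lexicon word, ignoring all other tokens."""
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # Fixed-length vector: its dimension equals the lexicon size,
    # regardless of how large the post's own vocabulary is.
    return [counts[word] for word in lexicon]


if __name__ == "__main__":
    print(lexicon_features("So happy today, happy and calm."))
```

Under this assumption, each post maps to a fixed-length vector that a standard classifier or clustering method could consume. Whether the authors used raw counts, weights, or the lexicon's ratings is not stated in this record.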

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 14th Pacific-Asia Conference, PAKDD 2010, Proceedings
Pages: 283-290
Number of pages: 8
Edition: PART 2
DOIs: https://doi.org/10.1007/978-3-642-13672-6_28
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2010 - Hyderabad, India
Duration: 21 Jun 2010 → 24 Jun 2010
Conference number: 14th
http://oldwww.iiit.ac.in/conferences/pakdd2010/
https://link.springer.com/book/10.1007/978-3-642-13657-3

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6119 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2010
Abbreviated title: PAKDD 2010
Country: India
City: Hyderabad
Period: 21/06/10 → 24/06/10

Cite this

Nguyen, T., Phung, D., Adams, B., Tran, T., & Venkatesh, S. (2010). Classification and pattern discovery of mood in weblogs. In Advances in Knowledge Discovery and Data Mining - 14th Pacific-Asia Conference, PAKDD 2010, Proceedings (PART 2 ed., pp. 283-290). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6119 LNAI, No. PART 2). https://doi.org/10.1007/978-3-642-13672-6_28