Dual COnsistency-enhanced semi-supervised Sentiment Analysis towards COVID-19 tweets

Teng Sun, Liqiang Jing, Yinwei Wei, Xuemeng Song, Zhiyong Cheng, Liqiang Nie

Research output: Contribution to journal › Article › Research › peer-review

12 Citations (Scopus)

Abstract

In the context of COVID-19, numerous people present their opinions through social networks. It is thus highly desirable to conduct sentiment analysis on COVID-19 tweets to learn the public’s attitudes and help the government make proper guidelines for avoiding social unrest. Although many efforts have studied text-based sentiment classification in various domains (e.g., delivery and shopping reviews), it is hard to directly use these classifiers for sentiment analysis on COVID-19 tweets due to the domain gap. In fact, developing a sentiment classifier for COVID-19 tweets is mainly challenged by the limited annotated training data, as well as the diverse and informal expressions of user-generated posts. To address these challenges, we construct a large-scale COVID-19 dataset from Weibo and propose a dual COnsistency-enhanced semi-superVIseD network for Sentiment Analysis (COVID-SA). In particular, we first introduce a knowledge-based augmentation method to augment data and enhance the model’s robustness. We then employ BERT as the text encoder backbone for labeled, unlabeled, and augmented data. Moreover, we propose a dual consistency (i.e., label-oriented consistency and instance-oriented consistency) regularization to improve model performance. Extensive experiments on our self-constructed dataset and three public datasets show the superiority of COVID-SA over state-of-the-art baselines on various applications.
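The abstract names a dual consistency regularization but does not give its form. As a rough illustration only, the sketch below shows one common way such terms are built in semi-supervised learning: a label-oriented term that compares the predicted class distributions of a post and its augmented copy, and an instance-oriented term that compares their encoder embeddings. The function names, the choice of KL divergence and cosine distance, and the weights `w_label` and `w_instance` are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_consistency(logits_orig, logits_aug):
    """Assumed label-oriented term: KL(p_orig || p_aug) over class distributions."""
    p = softmax(logits_orig)
    q = softmax(logits_aug)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def instance_consistency(emb_orig, emb_aug):
    """Assumed instance-oriented term: cosine distance between two embeddings."""
    dot = sum(a * b for a, b in zip(emb_orig, emb_aug))
    norm_orig = math.sqrt(sum(a * a for a in emb_orig))
    norm_aug = math.sqrt(sum(b * b for b in emb_aug))
    return 1.0 - dot / (norm_orig * norm_aug)


def combined_loss(supervised, label_term, instance_term,
                  w_label=1.0, w_instance=1.0):
    """Hypothetical weighted sum of the supervised loss and both consistency terms."""
    return supervised + w_label * label_term + w_instance * instance_term


# Toy values standing in for encoder outputs on a post and its augmented copy.
label_term = label_consistency([2.0, 0.5, 0.1], [1.8, 0.7, 0.2])
instance_term = instance_consistency([0.3, 0.9, 0.1], [0.25, 0.95, 0.05])
print(combined_loss(0.42, label_term, instance_term))
```

Both terms are zero when the original and augmented inputs yield identical outputs, so under these assumptions the regularizer only penalizes disagreement between the two views.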

Original language: English
Pages (from-to): 12605-12617
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 35
Issue number: 12
DOIs
Publication status: Published - 1 Dec 2023
Externally published: Yes

Keywords

  • Bit error rate
  • Blogs
  • COVID-19
  • Data models
  • Knowledge based systems
  • Semi-supervised Text Classification
  • Sentiment Analysis
  • Sentiment analysis
  • Social Media Dataset on COVID-19
  • Social networking (online)
