Abstract
During the COVID-19 pandemic, many people have expressed their opinions through social networks. Sentiment analysis of COVID-19 tweets is therefore highly desirable: it reveals the public’s attitudes and helps governments issue appropriate guidelines to avoid social unrest. Although many studies have addressed text-based sentiment classification in various domains (e.g., delivery and shopping reviews), these classifiers are hard to apply directly to COVID-19 tweets because of the domain gap. In fact, developing a sentiment classifier for COVID-19 tweets is mainly challenged by the limited annotated training data, as well as the diverse and informal expressions of user-generated posts. To address these challenges, we construct a large-scale COVID-19 dataset from Weibo and propose a dual COnsistency-enhanced semi-superVIseD network for Sentiment Analysis (COVID-SA). In particular, we first introduce a knowledge-based augmentation method to augment the data and enhance the model’s robustness. We then employ BERT as the text encoder backbone for the labeled, unlabeled, and augmented data. Moreover, we propose a dual consistency regularization (i.e., label-oriented consistency and instance-oriented consistency) to improve model performance. Extensive experiments on our self-constructed dataset and three public datasets show the superiority of COVID-SA over state-of-the-art baselines on various applications.
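The abstract does not spell out how the two consistency terms are defined, so the following is only a generic sketch of consistency regularization in semi-supervised classification, not the COVID-SA objective. It assumes a classifier has already produced logits for labeled posts, unlabeled posts, and augmented copies of the unlabeled posts; the function names and the `weight` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Supervised loss for one labeled example."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def semi_supervised_loss(labeled_logits, labels,
                         unlabeled_logits, augmented_logits, weight=1.0):
    """Supervised cross-entropy plus a generic consistency penalty.

    The penalty asks the model to predict similar distributions for an
    unlabeled post and its augmented copy. This is an assumed, simplified
    stand-in, not the paper's label- or instance-oriented consistency.
    """
    supervised = sum(
        cross_entropy(l, y) for l, y in zip(labeled_logits, labels)
    ) / len(labels)
    consistency = sum(
        kl_divergence(softmax(u), softmax(a))
        for u, a in zip(unlabeled_logits, augmented_logits)
    ) / len(unlabeled_logits)
    return supervised + weight * consistency
```

When the predictions for a post and its augmented copy agree, the consistency term vanishes and only the supervised loss remains; any disagreement increases the total loss, which is what lets unlabeled posts contribute a training signal.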
Original language | English |
---|---|
Pages (from-to) | 12605-12617 |
Number of pages | 13 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 35 |
Issue number | 12 |
DOIs | |
Publication status | Published - 1 Dec 2023 |
Externally published | Yes |
Keywords
- Bit error rate
- Blogs
- COVID-19
- Data models
- Knowledge based systems
- Semi-supervised Text Classification
- Sentiment Analysis
- Social Media Dataset on COVID-19
- Social networking (online)