The Development of Hate Speech and Message Monitoring Ahead of the 2024 Election

Project: Research

Project Details

Project Description

As observed in almost all countries during general elections, hate speech and disinformation spread intensively and widely in society. Their intensity exceeds the capacity of mass media organizations and fact-checkers to verify and respond quickly. Therefore, a platform is needed that enables the collection, analysis, modeling, and monitoring of hate speech and divisive messages in society, with the results presented on a dashboard. This monitoring dashboard will help various elements of society anticipate and mitigate hate speech and disinformation that lead to social polarization. It is a collaboration between the Indonesian Independent Journalists Alliance (Aliansi Jurnalis Independen, AJI) and Monash University Indonesia.

Work Outputs:
Analysis and topic modeling of various types of hate speech within vulnerable groups.
Dashboard to display the results of data analysis.
Automated conversation monitoring.

Work Phases:
1. The Preparation phase includes collecting and analyzing hate speech data within vulnerable groups. A hate speech dataset in the Indonesian language already exists. However, it has not yet been used to analyze hate speech within vulnerable groups specifically. The required data samples will come from the following platforms: 1. Facebook Groups, 2. Facebook Pages, 3. Instagram, 4. Twitter, 5. News media articles. The keywords we select for dataset collection will be tailored to the context of hate speech within vulnerable groups in Indonesia.
Activities in this phase are: a. Presentation from AJI experts/team regarding hate speech within vulnerable groups. b. Review panel to review and/or add keywords to the existing dataset. Five experts/researchers specializing in hate speech within vulnerable groups are needed. c. Sampling for annotation of at least 20,000 social media posts (2019-2023) and 5,000 mass media articles.
d. Analyzing sampled media content and social media content according to predefined categories. Six coders/annotators will work for approximately 4 weeks. These could include journalists, research assistants, and representatives from vulnerable groups.
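The keyword-based collection and sampling steps in this phase could be sketched roughly as follows. This is only an illustration under stated assumptions: the keywords are placeholders (the real list would come from the expert review panel), and the `platform` and `text` field names and the `sample_for_annotation` helper are hypothetical, not part of the project's actual tooling.

```python
import random

# Placeholder keywords (assumption); the real list is defined by the review panel.
KEYWORDS = ["keyword_a", "keyword_b"]


def matches_keywords(text, keywords):
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(k.lower() in lowered for k in keywords)


def sample_for_annotation(posts, keywords, sample_size, seed=0):
    """Keep posts that match a keyword, then draw a reproducible random sample."""
    candidates = [p for p in posts if matches_keywords(p["text"], keywords)]
    if len(candidates) <= sample_size:
        return candidates
    # A fixed seed makes the sample repeatable across runs.
    return random.Random(seed).sample(candidates, sample_size)


# Toy records; the field names are assumptions for illustration.
posts = [
    {"platform": "Twitter", "text": "An example mentioning KEYWORD_A"},
    {"platform": "Instagram", "text": "Unrelated content"},
    {"platform": "Facebook Pages", "text": "Another post with keyword_b inside"},
]
sample = sample_for_annotation(posts, KEYWORDS, sample_size=2)
print(len(sample))
```

Fixing the random seed means the same posts are handed to the annotators if the sampling step has to be rerun.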

2. The Data Analysis and Modeling phase covers analyzing the data and creating a machine learning model that automatically predicts which articles/posts contain hate speech towards vulnerable groups, based on the categories determined in Phase 1. Two research assistants are needed for data analysis and model training. Activities: a. Research team analyzes the data prepared in Phase 1. b. Research team trains a machine learning model based on the data prepared in Phase 1.
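The project description does not name the model to be trained. As a rough illustration of training a text classifier on annotated posts, the sketch below uses a small multinomial Naive Bayes baseline written in plain Python. The class name, the `hate` / `not_hate` labels, and the toy texts are all assumptions for illustration, not the project's categories or data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; real Indonesian text would need richer preprocessing."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Minimal multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy annotated examples with placeholder wording (assumption).
texts = [
    "group_x are bad people",
    "nice weather today",
    "group_x should leave",
    "good food today",
]
labels = ["hate", "not_hate", "hate", "not_hate"]

model = NaiveBayesClassifier().fit(texts, labels)
print(model.predict("group_x are bad"))
print(model.predict("nice food today"))
```

A baseline like this is mainly useful as a point of comparison; the annotated dataset from Phase 1 would determine which model family is actually appropriate.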

3. The Dashboard Creation phase consists of developing a dashboard to display the results of data analysis. Activities: a. Research team evaluates the predictions made by the machine learning model built in Phase 2. b. Research team designs and builds a dashboard to display the results of data analysis. c. Subsequently, data from the platforms will be automatically linked to the dashboard.
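The evaluation of model predictions mentioned in this phase is not specified further. One common way to do it is to compare predictions against held-out annotated labels using precision, recall, and F1, as in this minimal sketch. The `evaluate` helper, the label names, and the toy label lists are assumptions for illustration only.

```python
def evaluate(gold, predicted, positive="hate"):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy labels (assumption): annotator labels versus model predictions.
gold = ["hate", "hate", "not_hate", "not_hate", "hate"]
predicted = ["hate", "not_hate", "not_hate", "hate", "hate"]

metrics = evaluate(gold, predicted)
print(metrics)
```

Reporting precision and recall separately shows whether the model tends to over-flag ordinary posts or to miss hateful ones, which matters for how dashboard figures are read.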
Effective start/end date: 3/11/23 to 30/03/24