Abstract
Classifying educational forum posts is a longstanding task in Learning Analytics and Educational Data Mining research. Though this task has been tackled with both traditional Machine Learning (ML) approaches (e.g., Logistic Regression and Random Forest) and up-to-date Deep Learning (DL) approaches, a systematic examination of the performance difference between these two types of approaches is lacking. To better guide researchers and practitioners in selecting the model that best suits their needs, this study aimed to systematically compare the effectiveness of these two types of approaches for this specific task. Specifically, we selected a total of six representative models and explored their capabilities by equipping them with either extensive input features that were widely used in previous studies (traditional ML models) or the state-of-the-art pre-trained language model BERT (DL models). Through extensive experiments on two real-world datasets (one of which is open-sourced), we demonstrated that: (i) DL models uniformly achieved better classification results than traditional ML models, with the performance difference ranging from 1.85% to 5.32% depending on the evaluation metric; (ii) when applying traditional ML models, different features should be explored and engineered to tackle different classification tasks; (iii) when applying DL models, adapting BERT to the specific classification task by fine-tuning its model parameters tends to be a promising approach. We have publicly released our code at https://github.com/lsha49/LL_EDU_FORUM_CLASSIFIERS.
Original language | English |
---|---|
Title of host publication | Proceedings of the 14th International Conference on Educational Data Mining |
Editors | I-Han (Sharon) Hsiao, Shaghayegh (Sherry) Sahebi, Francois Bouchet, Jill-Jenn Vie |
Place of Publication | Massachusetts USA |
Publisher | International Educational Data Mining Society |
Pages | 228-239 |
Number of pages | 12 |
ISBN (Electronic) | 9781733673624 |
Publication status | Published - 2021 |
Event | Educational Data Mining 2021 - Online, Paris, France; Duration: 29 Jun 2021 → 2 Jul 2021; Conference number: 14th; https://educationaldatamining.org/edm2021/; https://educationaldatamining.org/edm2021/proceedings/ (Proceedings) |
Conference
Conference | Educational Data Mining 2021 |
---|---|
Abbreviated title | EDM 2021 |
Country/Territory | France |
City | Paris |
Period | 29/06/21 → 2/07/21 |
Internet address | https://educationaldatamining.org/edm2021/ |