Which hammer should I use? A systematic evaluation of approaches for classifying educational forum posts

Lele Sha, Mladen Rakovic, Yuheng Li, Alex Whitelock-Wainwright, David Carroll, Dragan Gašević, Guanliang Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

7 Citations (Scopus)


Classifying educational forum posts is a longstanding task in Learning Analytics and Educational Data Mining research. Though this task has been tackled with both traditional Machine Learning (ML) approaches (e.g., Logistic Regression and Random Forest) and up-to-date Deep Learning (DL) approaches, a systematic examination of how the two types of approaches differ in performance is lacking. To help researchers and practitioners select the model that best suits their needs, this study aimed to systematically compare the effectiveness of these two types of approaches for this specific task. Specifically, we selected a total of six representative models and explored their capabilities by equipping them with either extensive input features that were widely used in previous studies (traditional ML models) or the state-of-the-art pre-trained language model BERT (DL models). Through extensive experiments on two real-world datasets (one of which is open-sourced), we demonstrated that: (i) DL models uniformly achieved better classification results than traditional ML models, with the performance difference ranging from 1.85% to 5.32% depending on the evaluation metric; (ii) when applying traditional ML models, different features should be explored and engineered to tackle different classification tasks; (iii) when applying DL models, adapting BERT to the specific classification task by fine-tuning its model parameters tends to be a promising approach. We have publicly released our code at https://github.com/lsha49/LL_EDU_FORUM_CLASSIFIERS.
Original language: English
Title of host publication: Proceedings of the 14th International Conference on Educational Data Mining
Editors: I-Han (Sharon) Hsiao, Shaghayegh (Sherry) Sahebi, Francois Bouchet, Jill-Jenn Vie
Place of Publication: Massachusetts USA
Publisher: International Educational Data Mining Society
Number of pages: 12
ISBN (Electronic): 9781733673624
Publication status: Published - 2021
Event: Educational Data Mining 2021 - Online, Paris, France
Duration: 29 Jun 2021 to 2 Jul 2021
Conference number: 14th
https://educationaldatamining.org/edm2021/proceedings/ (Proceedings)


Conference: Educational Data Mining 2021
Abbreviated title: EDM 2021