Towards improving rhetorical categories classification and unveiling sequential patterns in students' writing

Sehrish Iqbal, Mladen Rakovic, Guanliang Chen, Tongguang Li, Jasmine Bajaj, Rafael Ferreira Mello, Yizhou Fan, Naif Radi Aljohani, Dragan Gasevic

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

3 Citations (Scopus)

Abstract

To meet the growing demand for future professionals who can present information to an audience and create quality written products, educators are increasingly setting writing assignments that require students to gather information from multiple sources, reorganise and reinterpret knowledge from source materials, and plan for rhetorical structure goals in order to meet the task requirements. When evaluating an essay's coherence, scorers manually look for the presence of required rhetorical categories, which takes time. Supervised Machine Learning (ML) techniques have proven to be an effective tool for the automatic detection of rhetorical categories that approximate students' cognitive engagement with source information. Previous studies that addressed this problem used relatively small datasets and reported relatively low kappa scores, limiting the use of such models in real-world scenarios. Moreover, to empower educators to effectively evaluate the overall quality of students' writing, the associations between the sequential patterns of rhetorical categories in students' writing and writing performance must be examined, which remains largely unexplored in the educational domain. Therefore, to fill these gaps, our study aimed to i) investigate the impact of data augmentation approaches on the performance of deep learning algorithms in classifying rhetorical categories in student essays according to Bloom's taxonomy and ii) explore the sequential patterns of rhetorical categories in students' writing that can influence writing performance. Our findings showed that the deep learning-based model BERT achieved 20% higher Cohen's kappa on data augmented with Easy Data Augmentation (EDA) than on normal (non-augmented) data, and we discovered that students in different performance groups were statistically different in terms of rhetorical patterns. Our study is valuable in building a data analytic foundation that can be used to create formative feedback on students' writing, based on the patterns of rhetorical categories, to improve essay quality.
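The abstract reports results in terms of Cohen's kappa, which measures agreement between two sets of labels after correcting for the agreement expected by chance. As a minimal sketch, assuming two equal-length label sequences (for example, human codes and model predictions), the standard formula can be written as follows. The function name `cohen_kappa` and the example labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: share of items that received the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two
    # marginal proportions, summed over all categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    # Degenerate case: both sequences use a single identical label.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical example labels (illustrative only, not data from the study).
human = ["remember", "understand", "apply", "apply"]
model = ["remember", "understand", "apply", "understand"]
kappa = cohen_kappa(human, model)
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why kappa is often reported alongside or instead of raw accuracy when categories are imbalanced.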

Original language: English
Title of host publication: LAK 2024 Conference Proceedings - The Fourteenth International Conference on Learning Analytics & Knowledge
Editors: Srecko Joksimovic, Andrew Zamecnik
Place of Publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Pages: 656-666
Number of pages: 11
ISBN (Electronic): 9798400716188
Publication status: Published - 2024
Event: International Learning Analytics & Knowledge Conference 2024 - Kyoto, Japan
Duration: 18 Mar 2024 to 22 Mar 2024
Conference number: 14th
https://dl.acm.org/doi/proceedings/10.1145/3636555 (Conference Proceedings)
https://www.solaresearch.org/events/lak/lak24/
https://ceur-ws.org/Vol-3667/ (LAK 2024 Workshop Proceedings)

Conference

Conference: International Learning Analytics & Knowledge Conference 2024
Abbreviated title: LAK 2024
Country/Territory: Japan
City: Kyoto
Period: 18/03/24 to 22/03/24

Keywords

  • automatic classification
  • deep learning
  • essay analysis
  • ordered network analysis
  • Rhetorical structure
