Does the Prompt-Based Large Language Model Recognize Students’ Demographics and Introduce Bias in Essay Scoring?

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Large Language Models (LLMs) are widely used in Automated Essay Scoring (AES) due to their ability to capture semantic meaning. Traditional fine-tuning approaches require technical expertise, limiting their accessibility to educators without programming backgrounds. Prompt-based tools like ChatGPT, however, have made AES more accessible, enabling educators to obtain machine-generated scores using natural-language prompts (i.e., the prompt-based paradigm). Despite these advancements, prior studies have shown bias in fine-tuned LLMs, particularly against disadvantaged groups, and it remains unclear whether such biases persist or are amplified in the prompt-based paradigm with cutting-edge tools. Since such biases are believed to stem from the demographic information embedded in pre-trained models (i.e., the ability of LLMs' text embeddings to predict demographic attributes), this study explores the relationship between a model's power to predict students' demographic attributes from their written work and its predictive bias in the scoring task under the prompt-based paradigm. Using a publicly available dataset of over 25,000 students' argumentative essays, we designed prompts to elicit demographic inferences (i.e., gender and first-language background) from GPT-4o and assessed the fairness of its automated scores. We then conducted a multivariate regression analysis to explore the impact of the model's ability to predict demographics on its scoring outcomes. Our findings reveal that (i) the prompt-based LLM can infer students' demographics from their essays to some extent, particularly their first-language backgrounds; (ii) scoring biases are more pronounced when the LLM correctly predicts students' first-language background than when it does not; and (iii) scoring error for non-native English speakers increases when the LLM correctly identifies them as non-native.

Original language: English
Title of host publication: Artificial Intelligence in Education - 26th International Conference, AIED 2025, Palermo, Italy, July 22–26, 2025, Proceedings, Part II
Editors: Alexandra I. Cristea, Erin Walker, Yu Lu, Olga C. Santos, Seiji Isotani
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 75-89
Number of pages: 15
ISBN (Electronic): 9783031984174
ISBN (Print): 9783031984167
DOIs
Publication status: Published - 2025
Event: International Conference on Artificial Intelligence in Education 2025 - Palermo, Italy
Duration: 22 Jul 2025 – 26 Jul 2025
Conference number: 26th
https://link.springer.com/book/10.1007/978-3-031-98465-5 (Published Proceedings)
https://aied2025.itd.cnr.it/ (Website)

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 15878
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Artificial Intelligence in Education 2025
Abbreviated title: AIED 2025
Country/Territory: Italy
City: Palermo
Period: 22/07/25 – 26/07/25

Keywords

  • Automated Essay Scoring
  • Bias
  • Large Language Model
