
Leveraging prompt-based LLMs for automated scoring and feedback generation in higher education

Research output: Contribution to journal, Article, Research, peer-review

Abstract

As demand grows for personalized, scalable assessments in higher education (including both scoring and feedback provision), large language models (LLMs) have emerged as promising tools. While human educators typically perform scoring and feedback in a sequential and interrelated manner, existing research has largely addressed these tasks separately. This raises important questions about LLMs’ ability to handle scoring and feedback within a single workflow and the extent to which task sequencing affects their performance. To address this gap, this study investigates how two factors affect the performance of GPT-4o, a cutting-edge LLM, in postgraduate open-ended assessments: whether scoring and feedback are prompted together in a single prompt (prompt composition) or separately in two consecutive prompts (prompt decomposition), and the order in which the two tasks are prompted. We analyzed the scoring performance across student groups of varying performance levels. To tailor GPT-4o-generated feedback to individual student learning needs, we embedded well-established learner-centered feedback principles into the prompt design and assessed the quality of the generated feedback based on these principles. The scoring results revealed that prompt effectiveness varied modestly across student groups, with higher scoring errors on lower-quality submissions. In terms of generated feedback, GPT-4o demonstrated greater support for learner agency. Task order influenced how this agency was expressed: prompting feedback first fostered learner autonomy, while prompting it after scoring emphasized the student–teacher connection.

Original language: English
Article number: 105511
Number of pages: 14
Journal: Computers and Education
Volume: 243
DOIs
Publication status: Published - Apr 2026

Keywords

  • Automated essay scoring
  • Feedback generation
  • Higher education
  • Learner-centered feedback
  • Prompt engineering
