Modeling classifiers for virtual internships without participant data

Dipesh Gautam, Zachari Swiecki, David W. Shaffer, Arthur C. Graesser, Vasile Rus

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

3 Citations (Scopus)


Virtual internships are online simulations of professional practice in which students play the role of interns at a fictional company. During virtual internships, participants complete activities and then submit write-ups in the form of short-answer digital notebook entries. Prior work used classifiers trained on participant data to automatically assess notebook entries from these learning environments. However, when teachers create new internships using available authoring tools, no such data exists. We evaluate a method for generating classifiers from specifications that teachers provide during the authoring process, rather than from participant data. Our models rely on Latent Semantic Analysis-based and neural network-based semantic similarity approaches in which notebook entries are compared to ideal, expert-generated responses. We also investigated a regular expression-based model. Experiments with the proposed models on unseen data showed high precision and recall for some classifiers using a similarity-based approach. Regular expression-based classifiers performed better where the other two approaches did not, suggesting that these approaches may complement one another in future work.
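The idea of classifying a notebook entry by comparing it to teacher-authored ideal responses, with a regular-expression fallback, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: a plain bag-of-words cosine similarity stands in for the LSA and neural-network similarity models, and the function names, the 0.5 threshold, the example texts, and the OR-combination of the two classifiers are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity (a simple stand-in for LSA or neural similarity)."""
    vec_a, vec_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_classifier(entry, ideal_responses, threshold=0.5):
    """Accept an entry if it is close enough to any ideal response.

    The threshold is an assumed value chosen for illustration only.
    """
    return any(cosine_similarity(entry, ideal) >= threshold for ideal in ideal_responses)


def regex_classifier(entry, patterns):
    """Accept an entry if any teacher-specified pattern matches it."""
    return any(re.search(p, entry, flags=re.IGNORECASE) is not None for p in patterns)


def combined_classifier(entry, ideal_responses, patterns, threshold=0.5):
    """Hypothetical combination: accept if either classifier accepts."""
    return (similarity_classifier(entry, ideal_responses, threshold)
            or regex_classifier(entry, patterns))


# Hypothetical teacher specifications and a student notebook entry.
ideal = ["The membrane design must balance cost and reliability."]
patterns = [r"\bcost\b"]
entry = "Our membrane design balances cost against reliability"

print(similarity_classifier(entry, ideal))
print(regex_classifier(entry, patterns))
```

Because neither function needs labeled participant data, a sketch like this only requires the ideal responses and patterns supplied at authoring time; a real system would replace the bag-of-words similarity with a trained semantic model.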

Original language: English
Title of host publication: Proceedings of the 10th International Conference on Educational Data Mining
Editors: X. Hu, T. Barnes, A. Hershkovitz, L. Paquette
Place of Publication: Wuhan, China
Publisher: International Educational Data Mining Society
Number of pages: 6
Publication status: Published - 2017
Externally published: Yes
Event: Educational Data Mining 2017 - Central China Normal University, Wuhan, China
Duration: 25 Jun 2017 to 28 Jun 2017
Conference number: 10th


Conference: Educational Data Mining 2017
Abbreviated title: EDM 2017


  • Automated assessment
  • LSA
  • Neural network
  • Regular expressions
  • Semantic similarity
  • Text classification
