Evaluations of NLG systems: common corpus and tasks or common dimensions and metrics

Cecile Paris, Nathalie F Colineau, Ross Gordon Wilkinson

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Other

3 Citations (Scopus)


In this position paper, we argue that a common task and corpus are not the only ways to evaluate Natural Language Generation (NLG) systems. They might, in fact, embody too narrow a view of evaluation and thus not be the best way to evaluate these systems. The aim of a common task and corpus is to allow for a comparative evaluation of systems, looking at the systems' performances. It is thus a system-oriented view of evaluation. We argue here that, if we are to take a system-oriented view of evaluation, the community might be better served by enlarging the view of evaluation, defining common dimensions and metrics to evaluate systems and approaches. We also argue that end-user (or usability) evaluations form another important aspect of a system's evaluation and should not be forgotten.
Original language: English
Title of host publication: INLG-06 Fourth International Natural Language Generation Conference Proceedings
Editors: Nathalie Colineau, Cecile Paris, Stephen Wan, Robert Dale
Place of Publication: Stroudsburg PA, United States of America
Publisher: Association for Computational Linguistics (ACL)
Pages: 127-129
Number of pages: 3
ISBN (Print): 1932432728
Publication status: Published - 2006
Externally published: Yes
Event: International Natural Language Generation Conference 2006 - Sydney NSW Australia, Stroudsburg PA United States of America
Duration: 1 Jan 2006 → …


Conference: International Natural Language Generation Conference 2006
City: Stroudsburg PA United States of America
Period: 1/01/06 → …
