Abstract
In this position paper, we argue that a common task and corpus are not the only ways to evaluate Natural Language Generation (NLG) systems. Indeed, this might be too narrow a view of evaluation and thus not the best way to evaluate these systems. The aim of a common task and corpus is to allow for a comparative evaluation of systems, looking at the systems' performances; it is thus a system-oriented view of evaluation. We argue here that, if we are to take a system-oriented view of evaluation, the community might be better served by enlarging the view of evaluation, defining common dimensions and metrics to evaluate systems and approaches. We also argue that end-user (or usability) evaluations form another important aspect of a system's evaluation and should not be forgotten.
Original language | English |
---|---|
Title of host publication | INLG-06 Fourth International Natural Language Generation Conference Proceedings |
Editors | Nathalie Colineau, Cecile Paris, Stephen Wan, Robert Dale |
Place of Publication | Stroudsburg PA United States of America |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 127 - 129 |
Number of pages | 3 |
ISBN (Print) | 1932432728 |
Publication status | Published - 2006 |
Externally published | Yes |
Event | International Natural Language Generation Conference 2006 - Sydney NSW Australia Duration: 1 Jan 2006 → … |
Conference
Conference | International Natural Language Generation Conference 2006 |
---|---|
City | Sydney NSW Australia |
Period | 1/01/06 → … |