Abstract
In our recent work on the measurement of (collective) intelligence, we used a dynamic intelligence test to measure and compare the performance of artificial agents. In this paper we give a detailed technical description of the testing framework, its design and implementation, showing how it can be used to quantitatively evaluate general-purpose, single- and multi-agent artificial intelligence (AI). The source code and scripts to run experiments have been released as open source, and instructions on how to administer the test to artificial agents are provided. This allows new agent behaviours to be evaluated and the scope of the test to be extended. Alternative testing environments are discussed, along with other considerations relevant to the robustness of multi-agent performance tests. The intention is to encourage the AI community to quantitatively evaluate new types of heuristics and algorithms, individually and collectively, using different communication and interaction protocols, and thus pave the way towards a rigorous, formal and unified testing framework for general-purpose agents.
| Original language | English |
|---|---|
| Title of host publication | EGPAI 2016 - Evaluating General Purpose AI 2016 |
| Editors | Christos Dimitrakakis, Jose Hernandez-Orallo, Claes Strannegard, Kristinn R. Thorisson |
| Place of Publication | Palo Alto, CA, USA |
| Publisher | Association for the Advancement of Artificial Intelligence (AAAI) |
| Number of pages | 8 |
| Publication status | Published - 2016 |
| Event | Evaluating General Purpose AI 2016, The Hague, Netherlands, 30 Aug 2016 → 30 Aug 2016 (http://dmip.webs.upv.es/EGPAI2016/) |
Conference
| Conference | Evaluating General Purpose AI 2016 |
|---|---|
| Abbreviated title | EGPAI 2016 |
| Country/Territory | Netherlands |
| City | The Hague |
| Period | 30/08/16 → 30/08/16 |
| Internet address | http://dmip.webs.upv.es/EGPAI2016/ |