Accelerating progress in Artificial General Intelligence: Choosing a benchmark for natural world interaction
Measuring progress in the field of Artificial General Intelligence (AGI) can be difficult without commonly accepted methods of evaluation. An AGI benchmark would allow evaluation and comparison of the many computational intelligence algorithms that have been developed. In this paper I propose that a benchmark for natural world interaction should possess seven key characteristics: fitness, breadth, specificity, low cost, simplicity, range, and task focus. I also outline two benchmark examples that meet most of these criteria. In the first, the direction task, a human coach directs a machine to perform a novel task in an unfamiliar environment. The direction task is extremely broad, but may be idealistic. In the second, the AGI battery, AGI candidates are evaluated based on their performance on a collection of more specific tasks. The AGI battery is designed to be appropriate to the capabilities of currently existing systems. Both the direction task and the AGI battery would require further definition before they could be implemented. The paper concludes with a description of a task that might be included in the AGI battery: the search and retrieve task.