Accelerating progress in Artificial General Intelligence: Choosing a benchmark for natural world interaction
Measuring progress in the field of Artificial General Intelligence (AGI) can be difficult without commonly accepted methods of evaluation. An AGI benchmark would allow evaluation and comparison of the many computational intelligence algorithms that have been developed. In this paper I propose that a benchmark for natural world interaction would possess seven key characteristics: fitness, breadth, specificity, low cost, simplicity, range, and task focus. I also outline two benchmark examples that meet most of these criteria. In the first, the direction task, a human coach directs a machine to perform a novel task in an unfamiliar environment. The direction task is extremely broad, but may be idealistic. In the second, the AGI battery, AGI candidates are evaluated based on their performance on a collection of more specific tasks. The AGI battery is designed to be appropriate to the capabilities of currently existing systems. Both the direction task and the AGI battery would require further definition before they could be implemented. The paper concludes with a description of a task that might be included in the AGI battery: the search and retrieve task.