In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied to strategy generation even if rewards are delayed. We compare the efficiency of the proposed model with that of reinforcement learning using the farmer-pest domain in configurations of varying complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge can be analyzed by humans, which makes it possible to track the learning process.
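The idea of choosing actions with a classifier can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `ActionClassifier` and `label_episode`, the "good"/"bad" labels, the rule that every step of an episode inherits a label from the episode's final (delayed) reward, and the frequency-count estimator that stands in for whatever classifier the model actually uses.

```python
from collections import defaultdict


class ActionClassifier:
    """Toy stand-in classifier (hypothetical): estimates P(good | state, action)
    from labeled examples by simple frequency counting."""

    def __init__(self):
        # (state, action) -> [number of "bad" examples, number of "good" examples]
        self.counts = defaultdict(lambda: [0, 0])

    def fit(self, examples):
        # Each example is a (state, action, label) triple with label "good" or "bad".
        for state, action, label in examples:
            self.counts[(state, action)][1 if label == "good" else 0] += 1

    def p_good(self, state, action):
        # Laplace-smoothed estimate; unseen pairs get probability 0.5.
        bad, good = self.counts.get((state, action), (0, 0))
        return (good + 1) / (bad + good + 2)

    def choose(self, state, actions):
        # Pick the action most likely to be classified as "good" in this state.
        return max(actions, key=lambda a: self.p_good(state, a))


def label_episode(steps, final_reward, threshold=0.0):
    """Assumed labeling rule for delayed rewards: every (state, action) step of
    an episode gets the same label, derived from the reward seen at the end."""
    label = "good" if final_reward > threshold else "bad"
    return [(state, action, label) for state, action in steps]


# Usage: two one-step episodes with hypothetical states, actions, and rewards.
examples = (
    label_episode([("pest_near", "attack")], final_reward=1.0)
    + label_episode([("pest_near", "wait")], final_reward=-1.0)
)
clf = ActionClassifier()
clf.fit(examples)
best = clf.choose("pest_near", ["wait", "attack"])
```

Because the learned knowledge in this sketch is just a table of counts per state and action, it can be inspected directly, which loosely mirrors the point about human-readable knowledge representations.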