Open Access

Feature Reinforcement Learning: Part II. Structured MDPs

Jun 14, 2021

The Feature Markov Decision Process (ΦMDP) model developed in Part I (Hutter, 2009b) is well-suited for learning agents in general environments. Nevertheless, unstructured (Φ)MDPs are limited to relatively simple environments. Structured MDPs such as Dynamic Bayesian Networks (DBNs) are used for large-scale real-world problems. In this article I extend ΦMDP to ΦDBN. The primary contribution is a cost criterion that allows the most relevant features to be extracted automatically from the environment, leading to the “best” DBN representation. I discuss all building blocks required for a complete general learning algorithm, and compare the novel ΦDBN model to the prevalent POMDP approach.

eISSN: 1946-0163
Language: English
Publication timeframe: 2 times per year
Journal Subjects: Computer Sciences, Artificial Intelligence