On Global Optimization of Walking Gaits for the Compliant Humanoid Robot, COMAN Using Reinforcement Learning

Open access

Abstract

In zero-moment point (ZMP) trajectory generation with simple models, obtaining locally stable gaits by manually tuning the gait parameters often involves considerable trial and error. In this paper, a 15-degree-of-freedom dynamic model of a compliant humanoid robot is combined with reinforcement learning to perform a global search in the parameter space and produce stable gaits. It is shown that, for a given speed, the learning algorithm finds multiple sets of parameters, namely step sizes and lateral sways, that lead to stable walking. The resulting set of gaits can be studied further in terms of parameter sensitivity, and additional optimization criteria can be applied to narrow down the walking trajectories chosen for the humanoid robot.
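The idea of searching the whole parameter space and keeping every stable parameter set, rather than tuning toward a single one, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_stability` is a made-up score with two peaks that stands in for a rollout of the dynamic model, the parameter ranges and the threshold are arbitrary, and plain seeded random sampling replaces the reinforcement learning algorithm used in the paper.

```python
import math
import random

# Hypothetical centres of two "stable" regions in (step_size, lateral_sway),
# both in metres. These numbers are made up for illustration only.
STABLE_CENTRES = [(0.05, 0.03), (0.12, 0.06)]
WIDTH = 0.01  # hypothetical spread of each stable region, in metres


def toy_stability(step_size, lateral_sway):
    """Toy stand-in for a simulated rollout: returns a score in [0, 1]."""
    best = 0.0
    for centre_step, centre_sway in STABLE_CENTRES:
        dist2 = (step_size - centre_step) ** 2 + (lateral_sway - centre_sway) ** 2
        best = max(best, math.exp(-dist2 / (2.0 * WIDTH ** 2)))
    return best


def global_search(n_samples, threshold=0.8, seed=0):
    """Sample the whole parameter box and keep every set scoring at or above threshold."""
    rng = random.Random(seed)
    stable = []
    for _ in range(n_samples):
        step_size = rng.uniform(0.0, 0.2)      # assumed search range, metres
        lateral_sway = rng.uniform(0.0, 0.1)   # assumed search range, metres
        if toy_stability(step_size, lateral_sway) >= threshold:
            stable.append((step_size, lateral_sway))
    return stable


if __name__ == "__main__":
    for step_size, lateral_sway in global_search(n_samples=5000):
        print(f"step_size={step_size:.3f} m, lateral_sway={lateral_sway:.3f} m")
```

Because the search covers the full box instead of following one local improvement path, the returned list can contain parameter sets from more than one stable region, which is what allows a later comparison by sensitivity or by additional criteria.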

Cybernetics and Information Technologies

The Journal of Institute of Information and Communication Technologies of Bulgarian Academy of Sciences
