Fixed-Final-Time Optimal Adaptive Control of Linear Discrete-Time Systems in Input-Output Form

In this paper, the fixed-final-time adaptive optimal regulation of discrete-time linear systems with unknown system dynamics is addressed. First, by transforming the linear system into input/output form, the adaptive optimal control design depends only on measured outputs and past inputs rather than state measurements. Next, to handle the time-varying nature of the finite-horizon problem, a novel online adaptive estimator is proposed that uses an online approximator to relax the requirement of known system dynamics. An additional error term corresponding to the terminal constraint is defined and minimized over time. The novel parameter update law is applied once per sampling interval, so no policy or value iteration is performed. The proposed control design yields an online, forward-in-time solution with significant practical advantages. Stability of the closed-loop system is demonstrated by Lyapunov analysis, while simulation results verify the effectiveness of the proposed approach.
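To make the ideas in the abstract concrete, the following is a minimal, hypothetical sketch of the kind of once-per-interval update the paper describes. All names (`phi`, `update_theta`, `S_N`) and the specific gradient form are illustrative assumptions, not the paper's actual update law: a finite-horizon value function is approximated as a quadratic in the augmented input/output vector with time-varying weights, and one gradient step per sampling interval reduces both the Bellman residual and a terminal-constraint residual, with no policy or value iteration.

```python
import numpy as np

def phi(z):
    """Quadratic basis of the augmented input/output vector z
    (upper-triangular entries of the outer product)."""
    outer = np.outer(z, z)
    return outer[np.triu_indices(len(z))]

def update_theta(theta, z_k, z_next, r_k, k, N, S_N, alpha=0.05):
    """One gradient step per sampling interval (no iterative sweeps).

    theta  -- current value-function weight estimate
    z_k, z_next -- augmented input/output vectors at steps k and k+1
    r_k    -- measured one-step cost
    N      -- final time; S_N is the terminal weighting matrix
    """
    # Temporal-difference (Bellman) residual for the finite-horizon cost
    e_b = r_k + theta @ phi(z_next) - theta @ phi(z_k)
    # Additional terminal error: at the horizon the value estimate
    # must match the terminal cost z^T S_N z
    if k == N - 1:
        e_t = theta @ phi(z_next) - z_next @ S_N @ z_next
    else:
        e_t = 0.0
    grad = e_b * (phi(z_next) - phi(z_k)) + e_t * phi(z_next)
    return theta - alpha * grad
```

Because the update uses only measured outputs, past inputs, and the observed one-step cost, it runs online and forward in time, consistent with the model-free, input/output character of the design.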


Journal of Artificial Intelligence and Soft Computing Research

The Journal of the Polish Neural Network Society, the University of Social Sciences in Lodz, and Czestochowa University of Technology
