Adaptive Critics Design with Support Vector Machine for Spacecraft Finite-Horizon Optimal Control
Publication: Journal of Aerospace Engineering
Volume 32, Issue 1
Abstract
In this study, an adaptive critics design based on a support vector machine (SVM) is adopted to design a finite-horizon optimal feedback controller. The adaptive critics design consists of actor and critic networks. The actor (control input) and critic (cost-to-go) networks are trained off-line with respect to various initial states and final times within a finite step. Using the well-trained actor-critic, the near-optimal feedback control solution can be obtained online. In the process of applying SVM to the adaptive critics, an adequate kernel function and the parameters that depend on that kernel function must be selected. In this study, a polynomial function and a radial basis function are used as SVM kernel functions to implement the algorithm. A minimum control effort problem with final constraints for spacecraft rendezvous is considered to demonstrate the performance of the proposed algorithm with respect to each kernel function and to show its potential for designing an optimal controller.
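As a rough illustration of the two kernel choices named in the abstract, the sketch below evaluates a polynomial kernel and a radial basis function kernel on a few sample state vectors. It is not the authors' implementation: the function names, the sample states, and the parameter values (`degree`, `coef0`, `gamma`) are assumptions chosen only for illustration, since the abstract does not state the values used in the study.

```python
import math

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel k(x, y) = (x . y + coef0) ** degree (parameters assumed)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel k(x, y) = exp(-gamma * ||x - y||^2) (gamma assumed)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram_matrix(samples, kernel):
    """Kernel (Gram) matrix over a list of sample vectors."""
    return [[kernel(a, b) for b in samples] for a in samples]

# Hypothetical two-dimensional sample states, for illustration only.
states = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
K_poly = gram_matrix(states, polynomial_kernel)
K_rbf = gram_matrix(states, rbf_kernel)
```

Both kernels produce a symmetric Gram matrix. The polynomial kernel depends on the degree and offset, while the radial basis function kernel depends on a single width parameter, which is why the kernel-dependent parameters must be chosen together with the kernel itself.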
Acknowledgments
This work has been supported by the National GNSS Research Center program of the Defense Acquisition Program Administration and the Agency for Defense Development.
Copyright
©2018 American Society of Civil Engineers.
History
Received: Jan 24, 2018
Accepted: May 29, 2018
Published online: Sep 10, 2018
Published in print: Jan 1, 2019
Discussion open until: Feb 10, 2019