Technical Papers
Aug 11, 2021

Coordinated Control Based on Reinforcement Learning for Dual-Arm Continuum Manipulators in Space Capture Missions

Publication: Journal of Aerospace Engineering
Volume 34, Issue 6

Abstract

The increasing number of defunct and fragmented spacecraft poses a growing hazard to operational on-orbit assets. Highly flexible, redundant continuum manipulators give dual-arm robotic systems clear advantages in active debris removal missions. Existing autonomously coordinated control approaches for dual-arm continuum manipulators require a real-time inverse kinematic solution and a safety mechanism against possible collisions, both of which are difficult to scale to space debris capture systems with high-speed maneuverability. In this paper, collision avoidance and input saturation are considered in a multiagent reinforcement learning approach, the multiagent twin delayed deep deterministic policy gradient (MATD3), which generates real-time inverse kinematic solutions for the coordinated manipulators. During training, MATD3 exhibits less value overestimation than the multiagent deep deterministic policy gradient (MADDPG) algorithm. A feedback dynamics controller is then designed for the continuum manipulators. Guided by the policy networks, each agent plans its joint trajectory online from the collaborator's state and the target debris information. During the capture operation, a competitive anticollision mechanism is built into the reward functions to keep the two arms at a safe distance. Simulation results show that the average accuracy of the proposed approach in inverse kinematic trajectory planning is 42% higher than that of MADDPG, and the designed integrated tracking controller performs capture missions effectively in the simulation environment. Multiagent reinforcement learning thus shows promise for future on-orbit servicing missions.
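
The algorithmic distinction the abstract draws between MATD3 and MADDPG is the TD3-style twin-critic (clipped double-Q) target: two centralized critics are trained toward the minimum of their target estimates, with clipped smoothing noise on the target actions, which curbs the value overestimation mentioned above. The Python sketch below illustrates only that target computation under assumed toy dimensions and a single shared actor; all names and sizes are illustrative assumptions, not the authors' implementation.

# Minimal illustrative sketch (assumed, not the article's code) of the clipped
# double-Q target that distinguishes MATD3 (twin critics) from MADDPG.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, N_AGENTS = 8, 2, 2            # assumed toy dimensions (two arms)
JOINT_DIM = N_AGENTS * (OBS_DIM + ACT_DIM)      # centralized critic input size

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

# One shared actor stands in for the per-agent actors to keep the sketch short;
# the critics are centralized and see both agents' observations and actions.
target_actor = mlp(OBS_DIM, ACT_DIM)
target_critic1 = mlp(JOINT_DIM, 1)
target_critic2 = mlp(JOINT_DIM, 1)

def clipped_double_q_target(next_joint_obs, reward, done,
                            gamma=0.99, noise_std=0.2, noise_clip=0.5):
    """Compute y = r + gamma * (1 - done) * min(Q1', Q2') with target-policy smoothing."""
    with torch.no_grad():
        per_agent_obs = next_joint_obs.view(-1, N_AGENTS, OBS_DIM)
        next_actions = target_actor(per_agent_obs)
        noise = (torch.randn_like(next_actions) * noise_std).clamp(-noise_clip, noise_clip)
        next_actions = (next_actions + noise).clamp(-1.0, 1.0)   # bounded (saturated) actions
        critic_in = torch.cat([next_joint_obs, next_actions.flatten(1)], dim=-1)
        q_next = torch.min(target_critic1(critic_in), target_critic2(critic_in))
        return reward + gamma * (1.0 - done) * q_next

# Each critic is then regressed toward this shared target, and actor updates are
# delayed relative to critic updates, as in TD3; MADDPG instead uses a single critic.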

Data Availability Statement

All data, models, and code generated or used during the study appear in the published article.

Acknowledgments

This work was supported by the Key Program of the National Natural Science Foundation of China (No. 91748203) and the Qian Xuesen Laboratory of Space Technology Seed Fund (No. QXSZZJJ03-07).


Information & Authors

Information

Published In

Journal of Aerospace Engineering
Volume 34, Issue 6, November 2021

History

Received: Feb 23, 2021
Accepted: Jun 3, 2021
Published online: Aug 11, 2021
Published in print: Nov 1, 2021
Discussion open until: Jan 11, 2022

Authors

Affiliations

Ph.D. Student, Dept. of Engineering Mechanics, Dalian Univ. of Technology, Dalian 116023, PR China. Email: [email protected]
Haijun Peng [email protected]
Professor, Dept. of Engineering Mechanics, Dalian Univ. of Technology, Dalian 116023, PR China (corresponding author). Email: [email protected]
Professor, Dept. of Engineering Mechanics, Dalian Univ. of Technology, Dalian 116023, PR China. Email: [email protected]
Professor, State Key Laboratory of Structural Analysis for Industrial Equipment, Dalian Univ. of Technology, Dalian 116023, PR China. Email: [email protected]

Cited by

  • An enhanced deep deterministic policy gradient algorithm for intelligent control of robotic arms, Frontiers in Neuroinformatics, 10.3389/fninf.2023.1096053, 17, (2023).
  • Path Planning of a Continuum Robot's End-effector for Assembly Missions in Unstructured Environments, 2022 IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), 10.1109/IMCEC55388.2022.10019843, (539-543), (2022).
  • An Integrated Tracking Control Approach Based on Reinforcement Learning for a Continuum Robot in Space Capture Missions, Journal of Aerospace Engineering, 10.1061/(ASCE)AS.1943-5525.0001426, 35, 5, (2022).
  • Safe reward‐based deep reinforcement learning control for an electro‐hydraulic servo system, International Journal of Robust and Nonlinear Control, 10.1002/rnc.6235, 32, 13, (7646-7662), (2022).
