Technical Papers
Oct 4, 2023

Cooperative Landing on Mobile Platform for Multiple Unmanned Aerial Vehicles via Reinforcement Learning

Publication: Journal of Aerospace Engineering
Volume 37, Issue 1

Abstract

This paper proposes a cooperative landing algorithm for multiple unmanned aerial vehicles (UAVs) based on deep reinforcement learning. First, to address the partial-observation problem, we use a recurrent neural network to predict the trajectory of the moving platform. Then, within a centralized multiagent framework, we present a parameter-sharing method to realize multi-UAV cooperation. Finally, to handle the sensor noise that arises in actual UAV flight, we propose a noise-compensation recurrent proximal policy optimization (NC-RPPO) algorithm that extracts image features to compensate for inertial measurement unit (IMU) and GPS errors. We use AirSim to construct a simulated 3D environment resembling an offshore oil development zone and, in this setting, evaluate the effectiveness of the proposed multi-UAV cooperative landing algorithm in the presence of sensor noise. Experiments demonstrate that the NC-RPPO algorithm enables UAVs to accurately predict the trajectory of a mobile platform and land on it cooperatively in real time. Notably, the results obtained with the image-assisted noise-correction method closely align with those of the ground-truth experiment.
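The abstract sketches three mechanisms: a recurrent network that infers the moving platform's trajectory from observation histories (the partial-observability fix), parameter sharing so that all UAVs learn a single cooperative policy, and an image-feature branch that compensates noisy IMU/GPS readings. The following PyTorch sketch is not the authors' implementation; it is a minimal illustration, under assumed network sizes and observation layouts, of how a shared recurrent actor-critic along these lines could be wired together.

```python
# Minimal sketch of a shared recurrent actor-critic in the spirit of NC-RPPO.
# All dimensions, layer sizes, and the observation layout are illustrative
# assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

class SharedRecurrentActorCritic(nn.Module):
    def __init__(self, state_dim=12, action_dim=4, hidden_dim=128):
        super().__init__()
        # CNN branch: extracts features from an assumed 3x64x64 onboard camera
        # image, intended to compensate noisy IMU/GPS observations.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 4 * 4, 64), nn.ReLU(),
        )
        # LSTM over the fused observation sequence: summarizes the history so
        # the platform's motion can be inferred despite partial observation.
        self.lstm = nn.LSTM(state_dim + 64, hidden_dim, batch_first=True)
        self.actor = nn.Linear(hidden_dim, action_dim)  # mean of a Gaussian policy
        self.critic = nn.Linear(hidden_dim, 1)          # state-value estimate for PPO
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, states, images, hidden=None):
        # states: (batch, T, state_dim) noisy IMU/GPS readings
        # images: (batch, T, 3, 64, 64) camera frames
        b, t = states.shape[:2]
        img_feat = self.cnn(images.flatten(0, 1)).view(b, t, -1)
        x, hidden = self.lstm(torch.cat([states, img_feat], dim=-1), hidden)
        dist = torch.distributions.Normal(self.actor(x), self.log_std.exp())
        return dist, self.critic(x), hidden

# Parameter sharing: every UAV queries the *same* network instance, so
# experience from all agents updates one cooperative policy.
policy = SharedRecurrentActorCritic()
states = torch.randn(3, 8, 12)           # 3 UAVs, 8-step observation history
images = torch.randn(3, 8, 3, 64, 64)
dist, value, _ = policy(states, images)
actions = dist.sample()                  # one action per UAV per timestep
```

Training such a network with the clipped PPO objective over whole observation sequences, rather than single steps, is what would make the policy "recurrent" in the sense the abstract describes.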

Data Availability Statement

Some data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request. Some data, models, or code generated or used during the study are proprietary or confidential in nature and may only be provided with restrictions.

Acknowledgments

This work was funded by the National Natural Science Foundation of China (Grant No. 62203050).

Published In

Journal of Aerospace Engineering
Volume 37, Issue 1, January 2024

History

Received: Jan 13, 2023
Accepted: Jul 12, 2023
Published online: Oct 4, 2023
Published in print: Jan 1, 2024
Discussion open until: Mar 4, 2024

Authors

Affiliations

School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China (corresponding author). ORCID: https://orcid.org/0000-0002-2979-5797. Email: [email protected]
Jingtai Li, Ph.D., Equipment Industry Development Center, Ministry of Industry and Information Technology, Beijing 100804, China. Email: [email protected]
Bi Wu, Ph.D., First Laboratory, Beijing Blue Sky Innovation Center for Frontier Science, Beijing 100085, China. Email: [email protected]
Junqi Wu, Ph.D., School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China. Email: [email protected]
Hongbin Deng, Professor, School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China. Email: [email protected]
Professor, Dept. of Mechanical Engineering, Composite Material Research Laboratory, Univ. of New Orleans, New Orleans, LA 70148. Email: [email protected]
