Autonomous Navigation for Cellular-Connected UAV in Highly Dynamic Environments: A Deep Reinforcement Learning Approach
Publication: Journal of Aerospace Engineering
Volume 37, Issue 5
Abstract
This study investigates the navigation problem for cellular-connected unmanned aerial vehicles (UAVs) in highly dynamic urban environments. To solve this problem, the UAV must not only evade high-speed obstacles in the airspace but also avoid the coverage holes of cellular base stations (BSs), while still reaching its destination to complete the navigation task. It is therefore essential to balance action selection between collision evasion and destination approaching, treating the expected communication outage duration as a key decision criterion. To address this multiobjective optimization challenge, we propose a deep reinforcement learning (DRL)-based algorithm that enables the UAV to learn an optimal decision-making policy. Specifically, we formulate the navigation problem as a Markov decision process (MDP) and develop a layered recurrent soft actor–critic (RSAC)-based DRL framework that guides the UAV in solving the two fundamental subtasks of UAV navigation. Furthermore, we develop a multilayer perceptron (MLP)-based integrated evaluation network that selects a particular action from the two subsolutions, satisfying the demands of the entire navigation problem. The layered architecture simplifies the navigation problem, thereby improving the convergence speed of the proposed algorithm. Numerical results indicate that the layered-RSAC-based UAV can autonomously complete scheduled navigation tasks in our simulated urban environments with superior effectiveness.
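The layered decision structure described above can be sketched in code. The snippet below is an illustrative, untrained mock-up, not the authors' implementation: two recurrent sub-policies (standing in for the RSAC actors of the collision-evasion and destination-approaching subtasks) each propose a candidate action, and an MLP-based evaluation network scores the candidates and selects one. All dimensions, weights, and names are hypothetical assumptions chosen only to show the control flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random-weight MLP parameters (illustrative; trained networks would learn these)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass with tanh hidden activations and a linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

class RecurrentSubPolicy:
    """Hypothetical stand-in for one layered-RSAC sub-policy (e.g., collision
    evasion or destination approaching). A simple recurrent hidden state
    summarizes the observation history, as the RSAC actor's RNN would."""
    def __init__(self, obs_dim, act_dim, hid=32):
        self.Wh = rng.standard_normal((obs_dim + hid, hid)) * 0.1
        self.head = mlp([hid, 32, act_dim])
        self.h = np.zeros(hid)

    def act(self, obs):
        # Fold the new observation into the recurrent state, then emit
        # a bounded action (e.g., a heading/velocity command in [-1, 1]).
        self.h = np.tanh(np.concatenate([obs, self.h]) @ self.Wh)
        return np.tanh(forward(self.head, self.h))

# MLP-based integrated evaluation network: scores a (state, action) pair.
# Input size = 8 state features + 3 action components (assumed dimensions).
eval_net = mlp([8 + 3, 32, 1])

def select_action(state, candidates):
    """Pick the candidate sub-action with the highest evaluation score."""
    scores = [forward(eval_net, np.concatenate([state, a]))[0] for a in candidates]
    return candidates[int(np.argmax(scores))]

obs = rng.standard_normal(8)      # fused sensing + link-quality features (assumed)
avoid = RecurrentSubPolicy(8, 3)  # subtask 1: evade obstacles / coverage holes
reach = RecurrentSubPolicy(8, 3)  # subtask 2: approach the destination
action = select_action(obs, [avoid.act(obs), reach.act(obs)])
print(action.shape)  # (3,)
```

In this sketch the evaluation network arbitrates between the two subsolutions at every step, which is the mechanism that lets each sub-policy stay simple while the combined behavior serves the whole navigation task.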
Data Availability Statement
Some or all data, models, or codes that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgments
This work was supported by the Natural Science Foundation of Hainan Province (624MS036), the China Post-Doctoral Science Foundation under Grant 2022M722053, the Oceanic Interdisciplinary Program of Shanghai Jiao Tong University under Grant SL2022PT112, and the National Natural Science Foundation of China under Grant 52201369.
Published In
Copyright
© 2024 American Society of Civil Engineers.
History
Received: May 18, 2023
Accepted: Apr 10, 2024
Published online: Jul 11, 2024
Published in print: Sep 1, 2024
Discussion open until: Dec 11, 2024
ASCE Technical Topics:
- Algorithms
- Artificial intelligence (AI)
- Artificial intelligence and machine learning
- Computer programming
- Computing in civil engineering
- Engineering fundamentals
- Geomatics
- Infrastructure
- Layered systems
- Markov process
- Mathematics
- Navigation (geomatic)
- Neural networks
- Probability
- Stochastic processes
- Systems engineering
- Systems management
- Urban and regional development
- Urban areas