Article ID: 2024EAL2075
Deep reinforcement learning (DRL) has become an active research topic in unmanned aerial vehicle (UAV) path planning and resource allocation. However, current DRL methods do not coordinate spectrum, path, and power decisions, which wastes spectrum resources. A coordinated routing and resource allocation Q network (CRRQN) algorithm with low computational complexity is proposed for multi-UAV scenarios, together with a co-optimization module that enhances coordination among path planning, spectrum allocation, and power allocation in CRRQN through the design of their reward functions. Moreover, a double deep Q network (DDQN) is employed to ensure its stability. Simulation results show that, compared with existing algorithms, CRRQN reduces flight time by about 4% and improves channel capacity by about 15%. The running time per test epoch of CRRQN is reduced by about 35%.
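
For readers unfamiliar with DDQN, the following minimal sketch shows the standard double deep Q network target for a single transition: the online network selects the greedy next action and the target network evaluates it, which reduces the overestimation bias of plain Q-learning. This is the generic textbook update, not the CRRQN implementation. The function name, argument names, and numeric values are illustrative assumptions, since the abstract does not specify the network structure, reward functions, or discount factor.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Generic Double DQN target for one transition (illustrative only).

    q_online_next: Q-values of the next state from the online network.
    q_target_next: Q-values of the next state from the target network.
    gamma: discount factor (value assumed here, not taken from the paper).
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for a next state with three actions.
q_online = [0.2, 0.9, 0.5]
q_target = [0.4, 0.3, 0.8]

ddqn_y = double_dqn_target(1.0, False, q_online, q_target, gamma=0.9)
# For comparison, a plain DQN target takes the max over the target network.
dqn_y = 1.0 + 0.9 * max(q_target)
```

In this toy case the Double DQN target bootstraps from the target network's value for action 1 (the online network's choice), whereas the plain DQN target bootstraps from the largest target-network value, so the Double DQN target is the smaller of the two.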