Article ID: 2024ECS6016
The Agile Earth Observation Satellite Constellations Mission Planning (AEOSCMP) problem seeks to maximize the global cumulative reward by selecting and scheduling observation tasks across the Earth's surface while satisfying the resource constraints of each satellite. The problem is further complicated by the different observation intervals required for different targets and by the need to coordinate multiple satellites, which raises issues of synchronization and data consistency in mission planning. Deep reinforcement learning (DRL) and target clustering are two complementary methods that together improve the autonomy and observation efficiency of AEOSCMP. This letter proposes an approach that unifies the two: the Integrated Clustering and Planning with Proximal Policy Optimization Algorithm (ICP3O). The framework retains the decision-making capability of DRL while substantially improving observation efficiency.