Article ID: 22.20250546
This paper presents an enhanced parameter-adaptive Q-learning algorithm for triple-phase-shift (TPS) modulation of dual-active-bridge (DAB) converters, which overcomes the difficulty traditional algorithms have in reaching the global optimum. TPS control provides three degrees of freedom through its phase-shift angles, and this added flexibility can be used to reduce current stress and conduction losses under light-load conditions. However, TPS modulation based on a power-loss model requires complex computation when the load and the voltage conversion ratio both vary. This work proposes an enhanced parameter-adaptive Q-learning-based modulation strategy that efficiently obtains the globally optimal solutions. By leveraging frequency-domain unified phasor analysis, the optimization process avoids the manual mode selection otherwise required for different voltage conversion ratios and load conditions. The algorithm updates its parameters adaptively within a phased framework, using phase-based reward functions and a sequence-adaptive ε-greedy strategy. Finally, experimental results demonstrate efficiency improvements of 3% and 8% under rated-load and light-load conditions, respectively.
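The general idea of Q-learning with a phased ε-greedy schedule over three phase-shift angles can be illustrated with a minimal sketch. It is not the algorithm proposed in this paper: the grid `GRID`, the optimum location `TARGET`, the quadratic `surrogate_cost` (a stand-in for a current-stress or loss model derived from phasor analysis), the three-stage `epsilon_for` schedule, and all hyperparameters are assumptions chosen only for illustration.

```python
import random

# Hypothetical coarse grid for the three phase-shift ratios (not the paper's discretization).
GRID = [0.0, 0.125, 0.25, 0.375, 0.5]
N = len(GRID)
TARGET = (3, 1, 2)  # assumed location of the surrogate optimum, illustration only

# Actions: raise or lower one angle by one grid step, or hold all three.
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0),
           (0, 0, 1), (0, 0, -1), (0, 0, 0)]


def surrogate_cost(state):
    """Placeholder for a converter loss model: a simple quadratic bowl."""
    return sum((GRID[i] - GRID[t]) ** 2 for i, t in zip(state, TARGET))


def step(state, action):
    """Apply an action, clamping each index to the grid."""
    return tuple(min(N - 1, max(0, s + d)) for s, d in zip(state, action))


def epsilon_for(episode, total):
    """Assumed three-stage exploration schedule: explore, refine, exploit."""
    frac = episode / total
    if frac < 1 / 3:
        return 0.9
    if frac < 2 / 3:
        return 0.3
    return 0.05


def train(episodes=2000, steps=30, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning; the reward is the cost reduction achieved by each move."""
    rng = random.Random(seed)
    q = {}  # (state, action_index) -> value, missing entries count as 0.0
    n_actions = len(ACTIONS)
    for ep in range(episodes):
        eps = epsilon_for(ep, episodes)
        state = tuple(rng.randrange(N) for _ in range(3))
        for _ in range(steps):
            if rng.random() < eps:
                a = rng.randrange(n_actions)  # explore
            else:
                a = max(range(n_actions), key=lambda k: q.get((state, k), 0.0))
            nxt = step(state, ACTIONS[a])
            reward = surrogate_cost(state) - surrogate_cost(nxt)
            best_next = max(q.get((nxt, k), 0.0) for k in range(n_actions))
            old = q.get((state, a), 0.0)
            # Standard Q-learning temporal-difference update.
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_rollout(q, state, steps=20):
    """Follow the learned greedy policy from a given starting state."""
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda k: q.get((state, k), 0.0))
        state = step(state, ACTIONS[a])
    return state


q_table = train()
start = (0, 0, 0)
final = greedy_rollout(q_table, start)
print("start cost:", surrogate_cost(start), "final cost:", surrogate_cost(final))
```

In this sketch the exploration rate depends only on how far training has progressed, so early episodes sample the angle space broadly and later episodes mostly exploit the learned table. A practical implementation would replace `surrogate_cost` with a converter model and would likely adapt the learning rate and reward in each stage as well.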