Abstract: An energy storage station (ESS) usually includes multiple battery systems operating in parallel. In each battery system, a power conversion system (PCS) connects the battery pack to the power system. When the ESS power must be allocated among multiple parallel PCSs under fluctuating operating conditions, existing power control methods for parallel PCSs struggle to achieve optimal efficiency over long time periods. In addition, existing Q-learning algorithms for adaptive power allocation suffer from the curse of dimensionality. To overcome these challenges, this paper proposes an adaptive power control method based on a double-layer Q-learning algorithm for n parallel PCSs of the ESS. First, a selection method for the power allocation coefficient is developed to avoid repeated actions. Then, the outer action space is divided into n + 1 power allocation modes according to the power allocation characteristics of the optimal operation efficiency. The inner layer uses an actor neural network to determine the optimal action strategy for power allocation in the non-steady state. Compared with existing power control methods, the proposed method achieves better performance in both static and dynamic operation efficiency optimization, and it effectively optimizes the overall operation efficiency of the PCSs under fluctuating ESS power outputs.
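As a rough illustration of why the outer action space can be organized into n + 1 modes, the following sketch enumerates one candidate interpretation: mode k means that k of the n PCSs are active and share the ESS power equally, with k = 0 meaning all PCSs are idle. This interpretation, the quadratic loss model, its coefficients, and the function names `pcs_loss`, `mode_efficiency`, and `best_mode` are all assumptions made for illustration; the abstract does not define the modes or the efficiency curve, and this is not the proposed Q-learning method itself.

```python
def pcs_loss(p, rated=1.0, a=0.01, b=0.005, c=0.02):
    """Hypothetical per-PCS loss: no-load + linear + quadratic terms (per unit)."""
    if p == 0:
        return 0.0  # assumption: an idle PCS is switched off and has no loss
    x = p / rated
    return rated * (a + b * x + c * x * x)


def mode_efficiency(total_power, k):
    """Overall efficiency when k PCSs share total_power equally (assumed mode k)."""
    if k == 0 or total_power == 0:
        return 0.0
    loss = k * pcs_loss(total_power / k)
    return total_power / (total_power + loss)


def best_mode(total_power, n, rated=1.0):
    """Return the mode k in 0..n with the highest efficiency for total_power."""
    if total_power == 0:
        return 0  # mode 0: every PCS idle
    # Only modes that keep each active PCS within its rating are feasible.
    feasible = [k for k in range(1, n + 1) if total_power / k <= rated]
    if not feasible:
        raise ValueError("total_power exceeds the combined rating of n PCSs")
    return max(feasible, key=lambda k: mode_efficiency(total_power, k))


if __name__ == "__main__":
    n = 4
    for p in (0.0, 0.2, 2.0, 3.0):
        print(f"P = {p:.1f} p.u. -> best mode k = {best_mode(p, n)}")
```

Under these assumed loss coefficients, a low ESS power favors a single active PCS, while higher power favors more active units, so the best mode shifts as the ESS output fluctuates. A learned policy would have to track this shift without knowing the loss curve in advance.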