Abstract: Residential heating, ventilation and air conditioning (HVAC) provides an important demand response resource for the new power system with a high proportion of renewable energy. Residential HVAC scheduling strategies that adapt to real-time electricity price signals issued by demand response programs and to the ambient temperature can significantly reduce electricity costs while ensuring occupants' comfort. However, since the pricing process and weather conditions are affected by many factors, conventional model-based methods struggle to meet the scheduling requirements in such complex environments. To solve this problem, we propose an adaptive optimal scheduling strategy for residential HVAC based on deep reinforcement learning (DRL). The scheduling problem is formulated as a Markov decision process (MDP). The proposed method adaptively learns the state transition probability to make economical decisions while keeping comfort violations within tolerance. Specifically, the residential thermal parameters obtained by least-squares parameter estimation (LSPE) provide a basis for the state transition probability of the MDP. Daily simulations are carried out on electricity price and temperature data sets, and extensive experimental results demonstrate the effectiveness of the proposed method.
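The abstract mentions that residential thermal parameters are identified by least-squares parameter estimation (LSPE) and then used in the MDP transition model. As a minimal illustrative sketch only, the snippet below estimates the coefficients of a hypothetical first-order thermal model, T[t+1] = a·T[t] + b·T_out[t] + c·P[t], from synthetic data via ordinary least squares; the model form, variable names, and data are assumptions, not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for a hypothetical first-order thermal model:
# indoor temperature T, outdoor temperature To, HVAC cooling power P.
a_true, b_true, c_true = 0.9, 0.1, -0.05
To = 30 + 3 * rng.standard_normal(96)          # outdoor temperature (°C)
P = rng.uniform(0, 5, 96)                       # HVAC power (kW)
T = [22.0]                                      # initial indoor temperature
for t in range(95):
    # Simulate indoor temperature with small process noise.
    T.append(a_true * T[t] + b_true * To[t] + c_true * P[t]
             + 0.01 * rng.standard_normal())
T = np.array(T)

# LSPE step: regress T[t+1] on (T[t], To[t], P[t]) to recover (a, b, c).
X = np.column_stack([T[:-1], To[:-1], P[:-1]])
y = T[1:]
a, b, c = np.linalg.lstsq(X, y, rcond=None)[0]
print(a, b, c)  # estimates close to (0.9, 0.1, -0.05)
```

In a DRL scheduling pipeline, the identified parameters (a, b, c) would parameterize the indoor-temperature transition used when training or evaluating the agent; this sketch only shows the identification step itself.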