Journal of Modern Power Systems and Clean Energy

ISSN 2196-5625 CN 32-1884/TK


Deep Active Learning for Solvability Prediction in Power Systems

  • Yichen Zhang (Senior Member, IEEE)
  • Jianzhe Liu (Member, IEEE)
  • Feng Qiu (Senior Member, IEEE)
  • Tianqi Hong (Member, IEEE)
  • Rui Yao (Senior Member, IEEE)
Argonne National Laboratory, Lemont, IL 60439, USA

Updated: 2022-11-20

DOI:10.35833/MPCE.2021.000424


Abstract

Traditional methods for solvability region analysis yield only inner approximations of unquantified conservatism and handle limited types of power flow models. In this letter, we propose a deep active learning framework for solvability prediction in power systems. Compared with passive learning, where training is performed after all instances are labeled, active learning selects the most informative instances to be labeled and therefore significantly reduces the size of the labeled dataset required for training. In the active learning framework, the acquisition functions, which correspond to different sampling strategies, are defined in terms of the on-the-fly posterior probability from the classifier. First, the IEEE 39-bus system is employed to validate the proposed framework, where a two-dimensional case is illustrated to visualize the effectiveness of the sampling method, followed by high-dimensional numerical experiments. Then, the Northeast Power Coordinating Council (NPCC) 140-bus system is used to validate the performance on large-scale power systems.

I. Introduction

POWER systems under the stochastic power injections of renewable energy may exceed their loadability limits, resulting in voltage collapse. Therefore, it is important to quickly assess whether the power flow has a solution (i.e., is solvable) given a set of power injections. The conventional approach is to solve the power flow equations numerically using iterative methods. However, many real-time operation scenarios call for non-iterative, analytical approaches to determine solvability. Earlier research focused on solvability conditions of decoupled power flow models [1], [2]. The fixed-point theorem has been used to obtain the solvability of coupled full power flow models in distribution networks [3]. Improvements over [3] have been achieved in [4]-[6]. Reference [7] has derived a seminal explicit sufficient solvability condition that certifies the existence and uniqueness of solutions, which dominates earlier works [3]-[6].

Despite these innovative works, state-of-the-art analytical conditions still cannot handle coupled full power flow models with different types of buses. The most recent work in [7] can handle a system with only slack and PQ buses. To consider generator buses, i.e., PV buses, an assumption has to be made that the voltage phasors are constant [8]. However, this assumption may fail when systems are close to their steady-state stability limits [7]. Other results considering the PV-bus model are either quite conservative or restricted to systems under certain modeling assumptions [9]-[11].

Machine learning techniques have long been employed to amend the shortcomings of analytical methods. The recent success of deep learning has facilitated its application to power flow problems [12]-[17]. References [12] and [13] have developed deep reinforcement learning algorithms to solve optimal power flow. The N-1 contingency screening using a deep convolutional neural network is presented in [14]; since it focuses on contingency screening, the load power injections are fixed. The security-constrained DC optimal power flow is solved with the aid of deep learning in [15]. References [16] and [17] propose physics-informed learning models to solve the power flow. The results are promising, but the capability of these models over the full power injection space is not demonstrated. In a nutshell, existing works cannot be generalized to the solvability problem since the AC power flow under the full power injection space has not been investigated.

To this end, we propose to use deep active learning for solvability prediction, which consists of two phases: off-line training and online prediction. In the off-line training phase, we sample power injections over all permissible ranges. This results in very high volumes of samples. At the same time, the labeling process requires solving the AC power flow problem for every sample and demands considerable computation resources. Therefore, we employ the active learning framework - a family of machine learning methods that query the data instances to be labeled for training by an oracle (e.g., a human annotator) - to achieve higher accuracy with far fewer labeled examples than passive learning for solvability prediction. Active learning integrates intelligent sampling and machine learning as a closed loop, and it is valuable in problems where unlabeled data are available but obtaining training labels is expensive. Although sampling towards more informative subspaces has been studied in [18], closed-loop integration of machine learning and intelligent sampling like active learning has not been explored yet. This letter innovatively uses deep active learning for solvability prediction.

II. Deep Active Learning for Power Flow Solvability Approximation

Consider an N_B-bus power network with N_G generator buses and N_D load buses. Let 𝒩_B, 𝒩_G, and 𝒩_D denote the sets of all buses, generator buses, and load buses, respectively. 𝒩_PQ and 𝒩_PV represent the sets of PQ and PV buses, respectively. The AC power flow equations are as follows:

$$\begin{aligned}
P_i^G - P_i^D &= \sum_{j \in \mathcal{N}_B} V_i V_j \left( G_{ij} \cos\varphi_{ij} + B_{ij} \sin\varphi_{ij} \right) \quad i \in \mathcal{N}_{PQ} \cup \mathcal{N}_{PV} \\
Q_i^G - Q_i^D &= \sum_{j \in \mathcal{N}_B} V_i V_j \left( G_{ij} \sin\varphi_{ij} - B_{ij} \cos\varphi_{ij} \right) \quad i \in \mathcal{N}_{PQ}
\end{aligned} \tag{1}$$

where P_i^G and Q_i^G are the samples of active and reactive generation power injections at generator bus i, respectively; P_i^D and Q_i^D are the samples of active and reactive load power injections at load bus i, respectively; G_ij and B_ij are the line conductance and susceptance, respectively; the voltages V_i and angles φ_i at bus i are the system state variables; and the angle difference between buses i and j is φ_ij = φ_i - φ_j. The existence of solutions to (1) depends on the values of the power injections. Note that in this letter we relax the feasibility conditions, i.e., the limits of voltage and line flow rating are not considered, although feasibility scenarios such as voltage violation and transmission line overloading occur more frequently than the solvability problem; hence, feasibility constraints are the binding ones in most cases. Nonetheless, the solvability problem can provide insights when the system is feasible but heavily loaded. In this case, a small perturbation can change both the solvability and feasibility conditions of the power flow solution. In other words, understanding the distance from the current operating point or feasibility boundary to the solvability boundary can be significant for early warning and remedial actions. Characterizing the solvability boundary is the first step toward such an endeavor.
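As a concrete illustration, the mismatch (residual) form of (1) can be evaluated for a candidate state. The following is a minimal NumPy sketch, assuming per-unit quantities and a bus admittance matrix with real part G and imaginary part B; it is not tied to any particular solver:

```python
import numpy as np

def power_flow_mismatch(V, phi, G, B, P_inj, Q_inj, pq, pv):
    """Residuals of the AC power flow equations (1).

    V, phi : bus voltage magnitudes and angles
    G, B   : real and imaginary parts of the bus admittance matrix
    P_inj, Q_inj : net injections P_i^G - P_i^D and Q_i^G - Q_i^D
    pq, pv : index arrays of PQ and PV buses
    """
    dphi = phi[:, None] - phi[None, :]            # phi_ij = phi_i - phi_j
    P = V * ((G * np.cos(dphi) + B * np.sin(dphi)) @ V)
    Q = V * ((G * np.sin(dphi) - B * np.cos(dphi)) @ V)
    idx = np.concatenate([pq, pv])
    return np.concatenate([P_inj[idx] - P[idx],   # active-power residuals
                           Q_inj[pq] - Q[pq]])    # reactive only at PQ buses
```

A set of injections is solvable when an iterative method such as Newton's drives these residuals to zero.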

Hence, the goal is to build a classifier using a multi-layer perceptron (MLP) model that separates the solvable power injections from the non-solvable ones. The inputs to the deep neural network are the power injections, defined as X = [P_1^G, P_2^G, ..., P_{N_G}^G, Q_1^G, Q_2^G, ..., Q_{N_G}^G, P_1^D, P_2^D, ..., P_{N_D}^D, Q_1^D, Q_2^D, ..., Q_{N_D}^D]. Let s, N_S, and 𝒩_S denote the sample index, the number of samples, and the set of sample indices, respectively. Each sample in X, denoted as x_s = X[s,:] (the subscript [s,:] denotes the sth row of a matrix), will be solved by the Power System Simulator for Engineering (PSS/E), which labels its solvability: 0 indicates that X[s,:] is not solvable and 1 otherwise. It is worth mentioning that the proposed framework can handle both PV- and PQ-type generator buses, provided proper Q limits are set in PSS/E. To obtain PV-type generator buses, we set the reactive power upper and lower limits of all generators to be sufficiently large. To obtain PQ-type generator buses, we set the reactive power upper and lower limits of all generators equal to the corresponding samples. We denote the non-solvable and solvable classes as C_q for q = 1, 2, respectively. The label data after one-hot encoding read:

$$\mathbf{Y}^* = \begin{bmatrix} \mathbf{y}_1^* & \mathbf{y}_2^* & \cdots & \mathbf{y}_{N_S}^* \end{bmatrix}^{\mathrm{T}}, \qquad \mathbf{y}_s^* = \begin{bmatrix} y_{s,1}^* & y_{s,2}^* & \cdots & y_{s,N_q}^* \end{bmatrix} \tag{2}$$

where y_{s,q}^* = 1 (q = 1, 2, ..., N_q) indicates that sample s belongs to class q. We then apply probabilistic smoothing approximations to the discrete label values [19]. It is well known that when the targets are one-hot encoded and an appropriate loss function is used, an MLP directly estimates the posterior probability of class membership C_q conditioned on the input variables x_s, denoted by p(C_q|x_s). Denote the MLP classifier as y_s = f(x_s; θ) = [y_{s,1}, y_{s,2}] = [p_θ(C_1|x_s), p_θ(C_2|x_s)], where p_θ(C_q|x_s) for q = 1, 2 denotes the posterior probability of class membership q given by the classifier under parameters θ. The network parameters θ can be estimated by maximum likelihood. Therefore, we minimize the negative logarithm of the likelihood function, known as the cross-entropy loss:

$$L = -\frac{1}{N_S} \sum_{s=1}^{N_S} \sum_{q=1}^{2} y_{s,q}^* \ln y_{s,q} \tag{3}$$

Since the output values of the MLP are interpreted as probabilities, they each must lie in the range of (0,1), and they must sum to unity. This can be achieved by using a softmax activation function at the output layer of the MLP.
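For concreteness, the softmax output layer and the loss in (3) can be sketched as follows. This is a generic NumPy illustration of the two functions, not the authors' implementation:

```python
import numpy as np

def softmax(z):
    """Softmax over the last axis; outputs lie in (0, 1) and sum to unity."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss (3): mean negative log-likelihood of one-hot targets."""
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=-1))
```

The shift by the row maximum inside `softmax` leaves the result unchanged mathematically but avoids overflow for large logits.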

III. Active Learning Framework

Assume that we randomly generate a feature set 𝒳 that is sufficiently large to represent the underlying physical features. In traditional passive supervised learning, we would generate labels for the entire feature set 𝒳, denoted as 𝒴_𝒳, using the simulation software and result parser, which together are regarded as the oracle. The labeling process is computationally demanding if the dataset is large and becomes intractable for high-dimensional problems. This is known as the labeling bottleneck, which occurs not only in power systems but also in computer vision, natural language processing, and other machine learning tasks. The active learning framework can overcome this labeling bottleneck. The pseudocode of the active learning algorithm is formally presented in Algorithm 1. Clearly, the querying strategy a(·,·) is what differentiates active learning from passive learning; active learning under a random querying strategy is equivalent to passive learning. The queries can be selected either serially (one at a time) or in batches (several to be labeled at once). Algorithm 1 presents the batch-mode version. Given the machine learning model f, unlabeled pool 𝒰, and inputs X_𝒰, the querying strategy can be represented as a function a, referred to as the acquisition function:

$$\mathbf{x}^* = \mathop{\arg\max}_{\mathbf{x} \in X_{\mathcal{U}}} a\big(\mathbf{x}, f(\cdot\,;\theta)\big) \tag{4}$$

where x* denotes the most informative sample selected by the corresponding strategy.

Algorithm 1: batch-mode active learning

input: labeled set L, unlabeled set U, query strategy a(·,·), query batch size B, labeling oracle Oracle(·), deep neural net f(·;θ), neural network training function Train(·,·)

1: A ← ∅                           //Initialize the set to store acquisition instances
2: repeat
3:   θ ← Train(L, f(·;θ))          //Train the model using current L
4:   for i ← 1 to B do
5:     x_i* ← argmax_{x∈U} a(x, f(·;θ))   //Query the instance from the unlabeled set
6:     y_i* ← Oracle(x_i*)         //Label the acquisition instance
7:     L ← L ∪ {(x_i*, y_i*)}      //Add the labeled query to L
8:     U ← U \ {x_i*}              //Remove the labeled query from U
9:     A ← A ∪ {(x_i*, y_i*)}      //Store the acquisition instance
10:  end
11: until some stopping criterion is met

output: trained deep neural net f(·;θ), all acquisition instances A
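Algorithm 1 can be sketched in Python as below. The `train`, `acquire`, and `oracle` callables are placeholders standing in for the MLP training routine, the acquisition functions, and the PSS/E-based labeling, respectively; this is an illustrative skeleton, not the authors' code:

```python
import numpy as np

def active_learning(train, acquire, oracle, L_x, L_y, U, batch_size, n_rounds):
    """Batch-mode active learning (Algorithm 1).

    train(L_x, L_y) -> model      retrains the classifier on the labeled set
    acquire(model, U) -> scores   informativeness score per unlabeled sample
    oracle(x) -> label            labels one instance (e.g., a power flow solve)
    """
    A_x, A_y = [], []                         # acquisition instances
    for _ in range(n_rounds):                 # "until some stopping criterion"
        model = train(L_x, L_y)
        # pick the batch_size most informative unlabeled instances
        picked = np.argsort(acquire(model, U))[-batch_size:]
        for x in U[picked]:
            y = oracle(x)                     # label the acquisition instance
            L_x = np.vstack([L_x, x]); L_y = np.append(L_y, y)
            A_x.append(x); A_y.append(y)
        U = np.delete(U, picked, axis=0)      # remove labeled queries from U
    return train(L_x, L_y), (A_x, A_y)
```

In practice the stopping criterion would be an accuracy threshold rather than a fixed round count, as in Section IV.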

The query strategy aims at evaluating the informativeness of unlabeled instances. Many ways to formulate such query strategies have been proposed in the literature [20]. Among them, the most widely used and computationally efficient strategy is uncertainty sampling, in which the active learner queries the instances that are most difficult for the deep learning model trained at the current stage to classify. When interpreting the binary classification using a probabilistic model, uncertainty sampling queries the instance whose posterior probability provided by the classifier is closest to 0.5 [20]. In other words, the selected sample is the one the classifier is least confident about. For a general multi-class problem, this least-confident sampling [20] can be expressed as:

$$\mathbf{x}_{\mathrm{LC}}^* = \mathop{\arg\max}_{\mathbf{x} \in \mathcal{U}} \left( 1 - \max_{q=1,2} p_\theta(C_q|\mathbf{x}) \right) \tag{5}$$

In the case of multi-class classification, this metric omits information about the remaining labels. To compensate for this omission, margin sampling is introduced, which queries the instance with the smallest gap between the class posteriors:

$$\mathbf{x}_{\mathrm{M}}^* = \mathop{\arg\min}_{\mathbf{x} \in \mathcal{U}} \left| p_\theta(C_2|\mathbf{x}) - p_\theta(C_1|\mathbf{x}) \right| \tag{6}$$

Besides the aforementioned metrics, entropy sampling is also widely used. Entropy measures the amount of information needed to encode a distribution and serves as a common uncertainty metric in active learning:

$$\mathbf{x}_{\mathrm{E}}^* = \mathop{\arg\max}_{\mathbf{x} \in \mathcal{U}} \left( -\sum_{q=1}^{2} p_\theta(C_q|\mathbf{x}) \ln p_\theta(C_q|\mathbf{x}) \right) \tag{7}$$

As pointed out in [20] and other references, although all strategies generally outperform passive baselines, the best strategy may be application-dependent. Thus, we apply all three strategies to the solvability problem in this letter.

It is worth mentioning that (5)-(7) all entail only a simple selection problem that finds the extremal value from a finite set of numerical values; hence, the computational complexity of the three sampling strategies is the same. Efficient algorithms for this problem have been extensively studied, especially in computer science. To select the most informative samples globally, we perform this operation on all samples, but the algorithm can flexibly operate on randomly grouped subsets of the overall sample set to reduce the computation time if the entire sample set is too large.
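The three acquisition scores can be computed directly from the classifier's posterior probabilities. A minimal sketch for the binary case follows, with all scores oriented so that the most informative sample maximizes the score (the margin is negated accordingly); the function names are illustrative:

```python
import numpy as np

def least_confident(p):
    """(5): high score when the top posterior probability is low."""
    return 1.0 - p.max(axis=1)

def margin(p):
    """(6): high score when the two class posteriors are close (small margin)."""
    return -np.abs(p[:, 1] - p[:, 0])

def entropy(p, eps=1e-12):
    """(7): high score when the posterior distribution is near-uniform."""
    return -np.sum(p * np.log(p + eps), axis=1)
```

Here `p` is an array of shape (n_samples, 2) whose rows are [p_θ(C_1|x), p_θ(C_2|x)]; for two classes all three scores rank samples identically, which is consistent with the similar performance observed in Section IV.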

IV. Case Study

We use the IEEE 39-bus system and the Northeast Power Coordinating Council (NPCC) 140-bus system [21] to demonstrate the approach. The structure of the deep neural network is shown in Fig. 1, where FC is the fully connected layer; ReLU is the rectified linear unit; X is the feature dataset; n is the feature number; p is the sample number; Z_i is the intermediate output of the neural network; and Y is the prediction. The labeling process obtains the power flow solution for a given set of power injections: the power injections are the inputs (features) for the predictor, and the convergence flag is the label. In this letter, we use the PSS/E software with the built-in Newton method to obtain the convergence flag. Theoretically speaking, the certificate from PSS/E is not a necessary and sufficient condition for solvability. However, considering that necessary and sufficient conditions for full-model power flow solvability with mixed PV and PQ buses remain an open problem, we believe that labels from the most widely used tool in the power community provide sufficiently trustworthy results to guide system operators.

Fig. 1  Structure of deep neural network.

During the training, we also face a data imbalance issue, as the number of unsolvable samples is larger than that of solvable samples. The classification accuracy, the most commonly used metric for evaluating classification models, can be misleading under this circumstance, since a high accuracy does not guarantee prediction capacity for the minority class. Here, we employ an under-sampling strategy to resolve this issue: we randomly remove a subset of samples from the class with more instances so that the number of samples from each class matches. In the active learning algorithm, the under-sampling step takes place after the oracle labels all selected samples.
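Random under-sampling of the majority class can be sketched as follows; this is an illustrative helper under the stated balancing rule, not the authors' exact procedure:

```python
import numpy as np

def undersample(X, y, rng=None):
    """Randomly drop majority-class samples so both classes are equally sized."""
    rng = np.random.default_rng(rng)
    idx0, idx1 = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
    n = min(len(idx0), len(idx1))                 # target per-class count
    keep = np.concatenate([rng.choice(idx0, n, replace=False),
                           rng.choice(idx1, n, replace=False)])
    rng.shuffle(keep)                             # avoid class-ordered batches
    return X[keep], y[keep]
```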

A. Two-dimensional Solvability Region of IEEE 39-bus System

First, we illustrate a two-dimensional case for visualization purposes. In this case, we uniformly sample the active power loads at buses 3 and 4 from -3000 MW to 3000 MW. Before the training starts, all samples are normalized. We allocate 80% of the samples for training and 20% for testing. The active learner randomly selects 100 samples from the training dataset to label for the initial training phase and queries ten instances in each iteration using the margin sampling strategy. The algorithm terminates if the average testing accuracy of the last four iterations exceeds 95% or after 30 iterations. The margin sampling strategy terminates after seven iterations and achieves 95.3% accuracy with only 170 labeled samples, while the random strategy fails to meet the accuracy criterion after 30 iterations, achieving only 94.6% accuracy with 400 labeled samples. The samples queried by the active learner are plotted as filled dots in Fig. 2, where the decision boundary of the neural network is illustrated by the colored areas. Meanwhile, the labeled dataset, plotted as unfilled dots in the background, indicates that the estimation is not conservative. As can be observed, the margin sampling strategy precisely selects instances at the solvability boundary, indicating significantly high sampling efficiency.

Fig. 2  Queried instances by margin sampling strategy that precisely selects instances at solvability boundary.

B. Solvability Prediction Under Full Power Injections of IEEE 39-bus System

Second, a high-dimensional scenario is illustrated. Except for the slack bus (Generator 39 at bus 10), the active and reactive power outputs of all generators are sampled uniformly between the dispatchable limits. Meanwhile, the active and reactive power demands of all loads are sampled from normal distributions that use the base values as the means and admit a 50% standard deviation. We have 57 features in total. All samples are normalized, among which 80% are allocated for training and 20% for testing. In the active learning, 2000 samples are randomly selected for the initial training phase, followed by 2000-sample query iterations. We perform ten iterations and compare all the aforementioned sampling strategies: random (baseline), least-confident, margin, and entropy. We conduct five runs with different random seeds and illustrate the results in Fig. 3, where the solid lines indicate the mean values and the shaded areas represent the standard deviations. All three active learning strategies have similar performance and are all superior to random sampling. Compared with the random strategy, active learning achieves an accuracy improvement of roughly 5%. The actual accumulated size of the training dataset after under-sampling is plotted in Fig. 4. In the initial step, all strategies randomly select 2000 samples, which yield approximately 400 samples after under-sampling. Then, the active learner builds up a more balanced training dataset, as its actual accumulated training dataset sizes are larger than those of the random strategy. This further verifies that the active learner samples towards the decision boundary and can potentially resolve the data imbalance issue.

Fig. 3  Testing accuracy of different sampling strategies of IEEE 39-bus system.

Fig. 4  Actual accumulated size of training dataset after under-sampling.

C. Solvability Prediction Under Full Power Injections of NPCC 140-bus System

We employ the NPCC 140-bus system [21] to validate the scalability of the proposed method. The system contains 45 generators and 82 loads; therefore, we have 254 features in total. In this case, the PV-type generator bus is considered. Similarly, 5000 samples are generated and split equally between training and testing. During the active learning, we use 100 samples for the initial training and query 100 samples in each iteration. The number of training epochs is 50 and the learning rate is 5×10^-3. We conduct three runs for each sampling strategy. The out-of-sample testing accuracies of the different sampling strategies on the NPCC 140-bus system are illustrated in Fig. 5, where the solid lines indicate the mean values and the shaded areas represent the standard deviations. As can be observed, with the same number of samples, the testing accuracies using active learning are about 4% higher than those using random sampling.

Fig. 5  Out-of-sample testing accuracies of different sampling strategies of NPCC 140-bus system.

V. Conclusion

This letter proposes a deep active learning framework for solvability prediction in power systems with full AC power flow models. In this problem, sampling over the full power injection space is necessary, which results in a high volume of data to be labeled. To achieve higher labeling and training efficiency, active learning is employed, where the most informative instances are selected for labeling. This allows the framework to achieve higher accuracy with far fewer labeled examples.

The sampling effectiveness is first visualized in a two-dimensional case. Then, four different sampling strategies are compared in the high-dimensional solvability prediction. The results indicate that active learning significantly outperforms passive methods and can resolve the data imbalance issue.

REFERENCES

1

F. Wu and S. Kumagai, “Steady-state security regions of power systems,” IEEE Transactions on Circuits and Systems, vol. 29, no. 11, pp. 703-711, Nov. 1982.

2

M. Ilic, “Network theoretic conditions for existence and uniqueness of steady state solutions to electric power circuits,” in Proceedings of 1992 IEEE International Symposium on Circuits and Systems (ISCAS), San Diego, USA, May 1992, pp. 2821-2828.

3

S. Bolognani and S. Zampieri, “On the existence and linear approximation of the power flow solution in power distribution networks,” IEEE Transactions on Power Systems, vol. 31, no. 1, pp. 163-172, Feb. 2015.

4

C. Wang, A. Bernstein, J.-Y. le Boudec et al., “Explicit conditions on existence and uniqueness of load-flow solutions in distribution networks,” IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 953-962, May 2016.

5

H. D. Nguyen, K. Dvijotham, S. Yu et al., “A framework for robust long-term voltage stability of distribution systems,” IEEE Transactions on Smart Grid, vol. 10, no. 5, pp. 4827-4837, Sept. 2019.

6

K. Dvijotham, H. Nguyen, and K. Turitsyn, “Solvability regions of affinely parameterized quadratic equations,” IEEE Control Systems Letters, vol. 2, no. 1, pp. 25-30, Jan. 2018.

7

B. Cui and X. A. Sun. (2019, Apr.). Solvability of power flow equations through existence and uniqueness of complex fixed point. [Online]. Available: https://arxiv.org/abs/1904.08855v1

8

B. Cui and X. A. Sun, “A new voltage stability-constrained optimal power-flow model: sufficient condition, SOCP representation, and relaxation,” IEEE Transactions on Power Systems, vol. 33, no. 5, pp. 5092-5102, Sept. 2018.

9

F. Dörfler, M. Chertkov, and F. Bullo, “Synchronization in complex oscillator networks and smart grids,” Proceedings of the National Academy of Sciences, vol. 110, no. 6, pp. 2005-2010, Jan. 2013.

10

J. W. Simpson-Porco, “A theory of solvability for lossless power flow equations – Part I: fixed-point power flow,” IEEE Transactions on Control of Network Systems, vol. 5, no. 3, pp. 1361-1372, Sept. 2018.

11

J. W. Simpson-Porco, “A theory of solvability for lossless power flow equations – Part II: conditions for radial networks,” IEEE Transactions on Control of Network Systems, vol. 5, no. 3, pp. 1373-1385, Sept. 2018.

12

D. Cao, W. Hu, X. Xu et al., “Deep reinforcement learning based approach for optimal power flow of distribution networks embedded with renewable energy and storage devices,” Journal of Modern Power Systems and Clean Energy, vol. 9, no. 5, pp. 1101-1110, Sept. 2021.

13

Y. Zhou, W.-J. Lee, R. Diao et al., “Deep reinforcement learning based real-time AC optimal power flow considering uncertainties,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 5, pp. 1098-1109, Sept. 2022.

14

Y. Du, F. Li, J. Li et al., “Achieving 100x acceleration for N-1 contingency screening with uncertain scenarios using deep convolutional neural network,” IEEE Transactions on Power Systems, vol. 34, no. 4, pp. 3303-3305, Jul. 2019.

15

X. Pan, T. Zhao, M. Chen et al., “DeepOPF: a deep neural network approach for security-constrained DC optimal power flow,” IEEE Transactions on Power Systems, vol. 36, no. 3, pp. 1725-1735, May 2021.

16

X. Hu, H. Hu, S. Verma et al., “Physics-guided deep neural networks for power flow analysis,” IEEE Transactions on Power Systems, vol. 36, no. 3, pp. 2082-2092, May 2021.

17

X. Lei, Z. Yang, J. Yu et al., “Data-driven optimal power flow: a physics-informed machine learning approach,” IEEE Transactions on Power Systems, vol. 36, no. 1, pp. 346-354, Jan. 2021.

18

V. Krishnan and J. D. McCalley, “Importance sampling based intelligent test set generation for validating operating rules used in power system operational planning,” IEEE Transactions on Power Systems, vol. 28, no. 3, pp. 2222-2231, Aug. 2013.

19

R. A. Dunne, A Statistical Approach to Neural Networks for Pattern Recognition. Hoboken: John Wiley & Sons, 2007.

20

B. Settles, “Active learning literature survey,” Tech. Rep., Department of Computer Sciences, University of Wisconsin-Madison, Madison, USA, 2009.

21

W. Ju, “Modeling, simulation, and analysis of cascading outages in power systems,” Ph.D. dissertation, Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, USA, 2018.