Journal of Modern Power Systems and Clean Energy

ISSN 2196-5625 CN 32-1884/TK


Data-driven Reactive Power Optimization of Distribution Networks via Graph Attention Networks

  • Wenlong Liao 1
  • Dechang Yang 2
  • Qi Liu 3
  • Yixiong Jia 4
  • Chenxi Wang 4
  • Zhe Yang 5
1. Wind Engineering and Renewable Energy Laboratory, Ecole Polytechnique Federale de Lausanne (EPFL), Lausanne 1015, Switzerland; 2. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; 3. College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China; 4. Department of Electrical and Electronic Engineering (Energy Digitalization Laboratory), The University of Hong Kong, Hong Kong, China; 5. Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong, China

Updated:2024-05-20

DOI:10.35833/MPCE.2023.000546


Abstract

Reactive power optimization of distribution networks is traditionally addressed by physical model based methods, which often lead to locally optimal solutions and require heavy online inference time consumption. To improve the quality of the solution and reduce the inference time burden, this paper proposes a new graph attention networks based method to directly map the complex nonlinear relationship between graphs (topology and power loads) and reactive power scheduling schemes of distribution networks, from a data-driven perspective. The graph attention network is tailored specifically to this problem and incorporates several innovative features such as a self-loop in the adjacency matrix, a customized loss function, and the use of max-pooling layers. Additionally, a rule-based strategy is proposed to adjust infeasible solutions that violate constraints. Simulation results on multiple distribution networks demonstrate that the proposed method outperforms other machine learning based methods in terms of the solution quality and robustness to varying load conditions. Moreover, its online inference is significantly faster than that of traditional physical model based methods, particularly for large-scale distribution networks.

I. Introduction

REACTIVE power optimization (RPO) plays an important role in distribution networks to reduce power flow along the distribution lines and maintain the desired voltage profile under various demand loads [1]. The scheduling of reactive power minimizes active power losses in distribution networks through the optimal adjustment of various control devices such as capacitor banks (CBs), transformer taps, and static var compensators (SVCs).

Normally, the optimal scheduling of these control devices can be considered as a combinatorial optimization problem subject to different nonlinear operational constraints. Earlier, a large number of heuristic algorithms were applied to solve the RPO model of distribution networks. For example, the work in [2] combines a roulette wheel selection and a genetic algorithm (GA) to optimize reactive power compensators, while the model in [3] employs a wolf pack algorithm to optimize reactive power sources and generator terminal voltages. In [4], an artificial bee colony algorithm is designed to formulate the day-ahead plans of distributed generations, on-load tap changer (OLTC), and CBs. Other commonly used heuristic algorithms in RPO include the ant colony algorithm, simulated annealing, the imperial competition algorithm, the wolf pack algorithm, etc. [5]. Generally, a major drawback of these heuristic algorithms is that most of them yield locally optimal solutions rather than global optima [6]. Also, numerous iterations inside these algorithms result in heavy online inference time consumption, especially for large-scale distribution networks. In contrast, this paper aims to address this problem by designing a machine learning based model that treats the RPO problem as a functional mapping between operational states of the distribution networks and solutions of the RPO problem, from a data-driven perspective.

There is a substantial body of publications on developing machine learning technologies for the RPO problem. Most of these previous works may be divided into two groups: ① similarity-based algorithms; ② model-based algorithms.

Specifically, the similarity-based algorithms calculate the distance between the current case and historical cases to find the historical case closest to the current one, and then assign the historical solution to the current case. For instance, the case-based reasoning (CBR) and principal component analysis are integrated to screen historical cases for the RPO problem in [7]. The work in [8] applies the Apriori algorithm to search for the most suitable scheduling solution for the current case from the historical data based on the association rule learning and frequent itemset mining. In [9], the voltage control strategies are determined by a ranking of the similarity between the historical voltage profile modes. In fact, the current load may dramatically differ from the historical load due to a variety of reasons such as changes in electricity consumption patterns. In light of this, the solutions of similarity-based algorithms have limited accuracy for highly volatile load conditions, since they directly assign historical scheduling schemes to the current case without changes.

In contrast, the model-based algorithms show stronger adaptability for these highly volatile load conditions, since they aim to create a new scheduling scheme by inputting loads to a supervised learning model that maps the nonlinear relationship between loads and reactive power scheduling schemes. For example, a three-layer multi-layer perceptron (MLP) is presented to obtain the optimal adjustment of different control devices in [10]. To avoid solving complex physical models, [11] employs the stacked extreme learning machine to project the nonlinear mapping between reactive power strategies and high-dimensional statistical features. In [12], an improved convolutional neural network (CNN) is generalized from the computer vision into the RPO problem. In [13] and [14], deep reinforcement learning techniques are designed to obtain the optimal voltage control strategies. Similarly, [15] presents a safe deep reinforcement learning to solve the optimal operation problem of distribution networks considering battery storage systems, voltage regulators, and distributed generators. One of the great challenges for the above-mentioned models is the neglect of topology information of distribution networks. The voltage profile and active power loss of distribution networks depend mainly on the topology and load conditions, but these models have difficulty in considering the topology information of nodes.

Graph neural networks (GNNs) are extensions of traditional neural networks from the Euclidean domain to the graph domain. Compared with the traditional neural networks, the inputs of GNNs include both feature matrices and adjacency matrices. This unique characteristic makes the GNNs ideal candidates for the RPO in distribution networks. So far, the application of GNNs in RPO has been relatively limited. In [16] and [17], the graph convolutional network (GCN) is employed to determine the power generation of generators for the optimal power flow task. To protect distribution lines from being overloaded under line contingency, a two-layer GCN is utilized to predict an optimal load-shedding ratio by supervised learning [18]. However, the way GCN aggregates neighboring nodes is structure-dependent, which always limits its generalizability and performance [19].

Further, a new supervised learning framework, called graph attention network (GAT), has been proposed to address this problem by introducing attention mechanisms to assign larger weights to more important nodes in computer vision [20]. Compared with GCN, GAT has shown stronger performance in a wide variety of graph inference tasks [21] such as node classification, link prediction, and social recommendation. Therefore, GAT should have the potential to optimize control devices for the RPO problem. However, many challenges remain on how to migrate the GAT to the RPO problem. For example, how to model the load and topology into a graph as inputs? How to design the structure and loss function of GAT? What should be done if the solution of GAT is not feasible?

In this context, this paper specifically tailors a new method called RPOGAT for the RPO of distribution networks by using GAT. Compared with traditional physical model based methods, the proposed RPOGAT has the following benefits.

1) The heavy online inference time burden due to iterations of power flow calculations can be avoided, since the physical model is replaced by the direct mapping.

2) The proposed RPOGAT addresses the RPO problem from a data-driven perspective, and does not require the construction of complex physical models based on expert knowledge.

The key contributions of this paper are listed as follows.

1) Unlike most machine learning based methods, which handle the RPO problem in the Euclidean domain, the proposed RPOGAT addresses it from a new perspective in the graph domain.

2) The power load, power output of generators, and topology information are modeled as a graph to take into account the strong correlation between nodes in distribution networks, which is usually ignored in most machine learning based methods.

3) The proposed RPOGAT is tailored specifically to the RPO problem and incorporates several innovative features such as a self-loop in the adjacency matrix, a customized loss function, and the use of max-pooling layers. Additionally, a rule-based strategy is proposed to adjust infeasible solutions that violate constraints.

The rest of this paper is organized as follows. Section II briefly introduces the RPO of distribution networks. Section III presents the details of the proposed RPOGAT. Section IV performs and analyzes simulation results. The conclusion and future works are given in Section V.

II. RPO of Distribution Networks

A. RPO Model

Normally, the goal of the RPO problem is to reduce power losses and maintain the desired voltage profile. This goal can be achieved by regulating the state of various control devices such as CBs, transformer taps, and SVCs. Therefore, the state of the control devices is considered as the variable to be optimized, and the objective function can be defined as the minimization of active power loss PL:

$$\min P_L = \sum_{i=1}^{n} G_{ij}\left[U_i^2 + U_j^2 - 2U_iU_j\cos\left(\theta_i - \theta_j\right)\right] \tag{1}$$

where n is the number of branches in distribution networks; Gij is the mutual conductance of the branch between the ith node and the jth node; Ui is the voltage magnitude at the ith node; and θi is the phase angle of the voltage at the ith node.
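As an illustration of (1), the following minimal sketch evaluates the active power loss for a given voltage profile. The function name `active_power_loss`, the `(i, j, G_ij)` branch tuples, and the toy three-node values are assumptions made for this example and are not taken from the paper.

```python
import math

def active_power_loss(branches, U, theta):
    """Evaluate Eq. (1): sum over branches of
    G_ij * (U_i^2 + U_j^2 - 2*U_i*U_j*cos(theta_i - theta_j)).

    branches: list of (i, j, G_ij) tuples with 0-based node indices (assumed layout).
    U: voltage magnitude of each node; theta: voltage phase angle of each node (rad).
    """
    loss = 0.0
    for i, j, g in branches:
        loss += g * (U[i] ** 2 + U[j] ** 2
                     - 2.0 * U[i] * U[j] * math.cos(theta[i] - theta[j]))
    return loss

# Toy three-node feeder with made-up per-unit values (illustrative only).
branches = [(0, 1, 2.0), (1, 2, 1.5)]
U = [1.00, 0.98, 0.97]
theta = [0.0, 0.0, 0.0]
loss = active_power_loss(branches, U, theta)
```

With equal phase angles, each branch term reduces to G_ij (U_i - U_j)^2, which makes the toy case easy to check by hand.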

Furthermore, the power flow equations and some operational constraints should be considered [12], [22].

1) Power flow equations

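$$\begin{cases} P_i - U_i\sum\limits_{j=1}^{m} U_j\left(G_{ij}\cos\theta_{ij} + B_{ij}\sin\theta_{ij}\right) = 0 \\ Q_i - Q_{\mathrm{SVC},i} - Q_{\mathrm{CB},i} - U_i\sum\limits_{j=1}^{m} U_j\left(G_{ij}\sin\theta_{ij} - B_{ij}\cos\theta_{ij}\right) = 0 \end{cases} \tag{2}$$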

where m is the number of nodes; θij is the phase angle difference between the ith node and the jth node; Bij is the mutual susceptance of the branch between the ith node and the jth node; Pi is the active power load of the ith node; QSVC,i is the reactive power at the ith SVC, which can be leading or lagging power; QCB,i is the reactive power at the ith shunt CB; and Qi is the reactive power load of the ith node.

2) Voltage constraints

$$U_i^{\min} \le U_i \le U_i^{\max} \quad i = 1, 2, \ldots, m \tag{3}$$

where Uimin is the lower bound of voltage magnitude at the ith node; and Uimax is the upper bound of voltage magnitude at the ith node.

3) Current constraints

$$I_i \le I_i^{\max} \quad i = 1, 2, \ldots, n \tag{4}$$

where Ii is the current at the ith branch; and Iimax is the upper bound of current at the ith branch.

4) Transformer tap constraints

$$T_i^{\min} \le T_i \le T_i^{\max} \quad i = 1, 2, \ldots, n_T \tag{5}$$

where Ti is the tap position at the ith transformer; Timin is the lower bound of tap position at the ith transformer; Timax is the upper bound of tap position at the ith transformer; and nT is the number of transformers.

5) Reactive power constraints of SVC

$$Q_{\mathrm{SVC},i}^{\min} \le Q_{\mathrm{SVC},i} \le Q_{\mathrm{SVC},i}^{\max} \quad i = 1, 2, \ldots, n_S \tag{6}$$

where QSVC,imin is the lower bound of reactive power at the ith SVC; QSVC,imax is the upper bound of reactive power at the ith SVC; and nS is the number of SVCs.

6) CB constraints

$$0 \le Q_{\mathrm{CB},i} \le Q_{\mathrm{CB},i}^{\max} \quad i = 1, 2, \ldots, n_C \tag{7}$$

where QCB,imax is the upper bound of reactive power at the ith shunt CB; and nC is the number of shunt CBs.

Note that only the active power loss is treated as the objective function, which is essential for day-ahead planning and scheduling of distribution networks in practice [23]. This is because this paper focuses on the performance of the proposed RPOGAT to map the nonlinear relationship between inputs (i.e., load conditions, power outputs of generators, and topology information) and outputs (scheduling schemes of control devices) for the RPO problem (the framework is shown in Fig. 1), rather than exploring the balance between multiple objectives (e.g., the cost of control device regulation), which can also be easily added in future work.

Fig. 1  Framework of proposed RPOGAT.

Given that the focus of this paper is to map the nonlinear relationship between inputs and outputs for the RPO problem, the widely-used SVC, shunt CBs, and on-load regulator transformers are also considered as control devices, as in previous publications [1]-[4]. The integration of other control devices can be considered in the extension work. For example, the static synchronous condensers, energy storage devices, heat pumps, and distributed generators (i.e., active power dispatching) are not considered here, but their mathematical models are similar to (6).

Further, the constrained optimization model is transformed into an unconstrained one by adding penalty functions to operational constraints, since it is difficult for neural networks to consider constraints directly.

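$$\min F = P_L + \alpha\sum_{i=1}^{m}\left[\varepsilon\left(U_i^{\min} - U_i\right) + \varepsilon\left(U_i - U_i^{\max}\right)\right] + \beta\sum_{i=1}^{n}\varepsilon\left(I_i - I_i^{\max}\right) \tag{8}$$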

where F is the new form of the objective function; α and β are the penalty coefficients; and ε is the step function.

If the voltage and current are within the constraints, the objective function F is equal to the active network loss PL; otherwise, F is much larger than PL. The constraints of different control devices are considered by a value coding method in the following subsection.
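The penalized objective (8) can be sketched as follows. The function names, the default penalty coefficients, and the convention that the step function returns 0 at exactly 0 are assumptions for illustration; the paper does not report the values of α and β.

```python
def step(x):
    """Step function epsilon(x): 1 for x > 0, else 0 (value at x = 0 is an assumption)."""
    return 1.0 if x > 0 else 0.0

def penalized_objective(p_loss, U, U_min, U_max, I, I_max, alpha=1e3, beta=1e3):
    """Eq. (8): active power loss plus penalties for voltage and current violations.

    U, U_min, U_max: per-node voltage magnitudes and bounds.
    I, I_max: per-branch currents and upper bounds.
    alpha, beta: penalty coefficients (illustrative defaults, not from the paper).
    """
    voltage_violations = sum(step(lo - u) + step(u - hi)
                             for u, lo, hi in zip(U, U_min, U_max))
    current_violations = sum(step(i - i_max) for i, i_max in zip(I, I_max))
    return p_loss + alpha * voltage_violations + beta * current_violations

# Feasible case: F equals the active power loss.
F_ok = penalized_objective(0.2, [1.0, 0.95], [0.9, 0.9], [1.1, 1.1], [0.5], [1.0])
# One under-voltage node and one overloaded branch: F is dominated by the penalties.
F_bad = penalized_objective(0.2, [1.0, 0.85], [0.9, 0.9], [1.1, 1.1], [1.2], [1.0],
                            alpha=100.0, beta=10.0)
```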

B. Value Coding Method to Control Devices

To summarize the above discussion, the objective function is (8), and the variables to be optimized include the tap position of transformers, the operational number of shunt CBs, and the reactive power provided by SVCs. The first two are discrete variables, while the third is a continuous one. Considering the difficulty of the binary coding method to handle continuous variables accurately, this paper employs the value coding method to encode and decode the variables to be optimized.

Given a discrete variable with N positions, the ith position can be encoded into a value Xe ranging from 0 to 1:

$$X_e = \frac{2i - 1}{2N} \quad i = 1, 2, \ldots, N \tag{9}$$

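In the same way, the value predicted by the RPOGAT X^e can be decoded into an integer Xd ranging from 1 to N:

$$X_d = \begin{cases} 1, & 0 \le \hat{X}_e \le \dfrac{1}{N} \\ i, & \dfrac{i-1}{N} < \hat{X}_e \le \dfrac{i}{N},\ i = 2, 3, \ldots, N \end{cases} \tag{10}$$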

For a continuous variable, the reactive power provided by control devices Qd also can be encoded into a value Xce ranging from 0 to 1:

$$X_{ce} = \frac{Q_d - Q_d^{\min}}{Q_d^{\max} - Q_d^{\min}} \tag{11}$$

where Qdmin and Qdmax are the lower bound and upper bound of reactive power provided by control devices, respectively.

Further, the value predicted by RPOGAT X^ce can be decoded into a real number Xcd ranging from Qdmin to Qdmax:

$$X_{cd} = \hat{X}_{ce}\left(Q_d^{\max} - Q_d^{\min}\right) + Q_d^{\min} \tag{12}$$

Up to this point, the states of control devices have been bijectively related to real numbers from 0 to 1. Also, the value encoding methods mentioned above can ensure that the variables to be optimized are within the constraints.
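The value coding in (9)-(12) can be sketched in a few lines. The function names are hypothetical, and clamping the predicted value into [0, 1] before decoding is an added safeguard assumed here, not a step stated in the paper.

```python
import math

def encode_discrete(i, N):
    """Eq. (9): map position i (1..N) to the midpoint of its interval in [0, 1]."""
    return (2 * i - 1) / (2 * N)

def decode_discrete(x_hat, N):
    """Eq. (10): map a predicted value back to a position in 1..N."""
    x = min(max(x_hat, 0.0), 1.0)      # clamp (assumption, see lead-in)
    return max(1, math.ceil(x * N))    # (i-1)/N < x <= i/N  ->  i

def encode_continuous(q, q_min, q_max):
    """Eq. (11): min-max scaling of a reactive power value into [0, 1]."""
    return (q - q_min) / (q_max - q_min)

def decode_continuous(x_hat, q_min, q_max):
    """Eq. (12): inverse of the min-max scaling."""
    return x_hat * (q_max - q_min) + q_min

# Round trip for a 17-position transformer tap and a +/-500 kvar SVC.
taps = [decode_discrete(encode_discrete(i, 17), 17) for i in range(1, 18)]
q_back = decode_continuous(encode_continuous(120.0, -500.0, 500.0), -500.0, 500.0)
```

Because each position is encoded as the midpoint of its interval, a prediction may deviate by up to 1/(2N) from the encoded value and still decode to the same position.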

III. RPOGATs

The RPOGAT takes topology, power outputs of generators, and load conditions as inputs and returns scheduling schemes of control devices as output. In this section, the power load, power output of generators, and topology information are modeled as graphs, which are fed to the RPOGAT. Then, the suitable structure of the RPOGAT is tailored specifically to the RPO problem and incorporates several innovative features, such as a self-loop in the adjacency matrix, a customized loss function, and the use of max-pooling layers. Finally, a rule-based strategy is proposed to adjust infeasible solutions that violate constraints.

A. Raw Data to Graphs

To consider the correlation between nodes in distribution networks, the raw data (i.e., the power load, power output of generators, and topology information) are modeled as the graph, as shown in Fig. 2.

Fig. 2  Strategy of transforming raw data into graphs.

Specifically, the graph consists of a feature matrix and an adjacency matrix of nodes. The feature matrix can be defined as a concatenation between the active power load and reactive power load of nodes:

$$X_F = \begin{bmatrix} P_1 & Q_1 \\ P_2 & Q_2 \\ \vdots & \vdots \\ P_m & Q_m \end{bmatrix} \tag{13}$$

where XF is the feature matrix of nodes with m rows and two columns.

If the nodes contain generators, the reactive power output and active power output of generators should also be taken into account in the feature matrix:

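$$X_F = \begin{bmatrix} P_1 - P_{1,g} & Q_1 - Q_{1,g} \\ P_2 - P_{2,g} & Q_2 - Q_{2,g} \\ \vdots & \vdots \\ P_m - P_{m,g} & Q_m - Q_{m,g} \end{bmatrix} \tag{14}$$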

where Pi,g is the active power output of the generator at the ith node; and Qi,g is the reactive power output of the generator at the ith node.

In addition, this paper only considers distribution networks with constant topology. In fact, when distribution networks are installed with a large number of soft open points instead of tie switches [24], the topology of the distribution network is fixed for each sample. For distribution networks with network reconfiguration, the model must be retrained. One of the ways to solve the RPO model with dynamic topology is to construct the adjacency matrix for each sample, but this imposes a heavy computational burden, which will be discussed in future work. Here, the adjacency matrix A is constructed as:

$$A_{ij} = \begin{cases} 1, & \text{there is a branch between node } i \text{ and node } j \\ 0, & \text{otherwise} \end{cases} \tag{15}$$

As one of the innovative points, to avoid exploding gradients and numerical instabilities in the original GAT [19], this paper generalizes the self-loop mechanism from the GCN [18] to obtain the new form of adjacency matrix for the GAT: $\hat{A} = \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$, with $\tilde{A} = A + I$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, where $I$ is the identity matrix.
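A minimal sketch of how the graph inputs could be assembled is given below: the feature matrix follows (13) and (14), and the adjacency matrix follows (15) with the self-loop and symmetric normalization described above. The function names and the toy three-node chain are assumptions for illustration.

```python
import math

def feature_matrix(P, Q, P_g=None, Q_g=None):
    """Eq. (13)/(14): one row (P_i - P_{i,g}, Q_i - Q_{i,g}) per node.
    Generator outputs default to zero when a node has no generator."""
    m = len(P)
    P_g = P_g or [0.0] * m
    Q_g = Q_g or [0.0] * m
    return [[P[i] - P_g[i], Q[i] - Q_g[i]] for i in range(m)]

def normalized_adjacency(edges, m):
    """Build A from the branch list (Eq. (15)), add self-loops (A~ = A + I),
    and return D~^{-1/2} A~ D~^{-1/2}."""
    A = [[0.0] * m for _ in range(m)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    for i in range(m):
        A[i][i] += 1.0
    deg = [sum(row) for row in A]
    return [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(m)]
            for i in range(m)]

# Toy three-node chain 0-1-2 with a generator at node 1 (illustrative values).
X_F = feature_matrix([100.0, 90.0, 120.0], [60.0, 40.0, 80.0],
                     P_g=[0.0, 30.0, 0.0], Q_g=[0.0, 10.0, 0.0])
A_hat = normalized_adjacency([(0, 1), (1, 2)], 3)
```

For the IEEE 33-bus case described later, the same construction would yield a 32×2 feature matrix and a 32×32 adjacency matrix once the slack node is excluded.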

Moreover, there are two possible ways to address the dynamics of the topology (i.e., reconfiguration) in future works.

1) Extension 1: a separate model is trained for each topology. This is suitable for cases where the topology does not change much.

2) Extension 2: the adjacency matrix is dynamically changing. In other words, the adjacency matrix included in each sample may vary. However, this way requires the samples in the training set to include different topologies, leading to difficulties in obtaining the training samples.

B. Network Architecture

As shown in Fig. 3, the RPOGAT is proposed to predict scheduling schemes of control devices given a graph with adjacency matrix A^ and feature matrix XF as inputs, whereas traditional models (e.g., MLP) use only the feature matrix XF as inputs and ignore the topology.

Fig. 3  Sample architectures of RPOGAT and traditional MLP. (a) RPOGAT. (b) MLP.

The suitable structure of the RPOGAT is tailored specifically to the RPO problem and incorporates several innovative features, such as a self-loop in the adjacency matrix, a customized loss function, and the use of max-pooling layers.

In particular, the graph attentional layer is used to capture latent features of inputs and correlations between nodes. The role of the max-pooling layer is to reduce the complexity of neural networks by down-sampling latent features. Finally, the dense layer outputs the scheduling scheme for the RPO problem.

1) Graph Attentional Layer

Unlike most other GNNs (e.g., GCN) that explicitly assign non-parametric weights to neighbors based on the structural properties of graphs, the GAT employs attention mechanisms, which assign larger weights to the more important nodes implicitly. This choice is not without motivation, since the attention mechanism has previously achieved state-of-the-art-level results on various machine translation tasks.

Specifically, a graph attentional layer applies K independent attention heads (multi-head attention, each head with its own parameters), and the outputs are concatenated feature-wise:

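$$H_{\mathrm{GAT},i}' = \Big\Vert_{k=1}^{K}\, \sigma_{\mathrm{ELU}}\Bigg(\sum_{j \in N_i} \alpha_{ij}^{k} W_{\mathrm{GAT},k} H_{\mathrm{GAT},j}\Bigg) \tag{16}$$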

where || denotes the concatenating operation; HGAT,i denotes the input features of the ith node; HGAT,i' denotes the output features of the ith node; Ni is a set of nodes connected with the ith node; σELU· is the function of the exponential linear unit (ELU); WGAT,k is the weight matrix specifying the linear transformation for the kth replica; and αijk is the attention coefficient of the jth node to the ith node derived by the kth replica. The detail of attention mechanism in GAT is shown in Fig. 4.

Fig. 4  Attention mechanism in GAT.

These attention coefficients are typically normalized using the softmax function, in order to be comparable across different neighborhoods:

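$$\alpha_{ij} = \frac{\exp\left(e_{ij}\right)}{\sum_{k \in N_i}\exp\left(e_{ik}\right)} \tag{17}$$

$$e_{ij} = G\left(H_{\mathrm{GAT},i}, H_{\mathrm{GAT},j}\right) \tag{18}$$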

where G is a simple single-layer neural network.

With the previous settings, this fully specifies a graph attentional layer.
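To make the layer concrete, the following sketch implements a multi-head graph attentional layer in plain Python. It follows the standard GAT formulation, in which the transformed features of the neighbors j are aggregated with softmax-normalized coefficients. The scoring function (a learnable vector `a` applied to the concatenated transformed features, followed by a LeakyReLU) is an assumption taken from the original GAT formulation, since the text only states that G is a single-layer neural network. All names and numerical values are illustrative.

```python
import math

def matvec(W, h):
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def elu(x):
    return x if x > 0 else math.exp(x) - 1.0

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_head(H, neighbors, W, a):
    """One attention head.
    H: list of node feature vectors; neighbors[i]: indices in N_i (node i itself
    is included when self-loops are used); W: weight matrix; a: attention vector
    whose length is twice the number of output features."""
    Z = [matvec(W, h) for h in H]                      # linear transformation
    outputs, coefficients = [], []
    for i, nbrs in enumerate(neighbors):
        # e_ij = G(H_i, H_j), here LeakyReLU(a^T [W H_i || W H_j]) (assumed form)
        scores = [leaky_relu(sum(c * v for c, v in zip(a, Z[i] + Z[j])))
                  for j in nbrs]
        peak = max(scores)                             # numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]              # Eq. (17)
        agg = [sum(al * Z[j][f] for al, j in zip(alpha, nbrs))
               for f in range(len(Z[i]))]
        outputs.append([elu(v) for v in agg])
        coefficients.append(alpha)
    return outputs, coefficients

def gat_layer(H, neighbors, heads):
    """Concatenate the outputs of K heads; heads is a list of (W, a) pairs."""
    per_head = [gat_head(H, neighbors, W, a)[0] for W, a in heads]
    return [sum((head[i] for head in per_head), []) for i in range(len(H))]

# Toy three-node chain with self-loops, two input features, two heads.
H = [[1.0, 0.5], [0.8, 0.2], [0.3, 0.1]]
neighbors = [[0, 1], [0, 1, 2], [1, 2]]
heads = [([[0.5, -0.2], [0.1, 0.3]], [0.3, -0.1, 0.2, 0.4]),
         ([[-0.4, 0.6], [0.2, 0.1]], [0.1, 0.2, -0.3, 0.5])]
H_out = gat_layer(H, neighbors, heads)
_, alphas = gat_head(H, neighbors, *heads[0])
```

In practice, a library layer such as the GAT layer provided by Spektral would be used instead of this hand-written version, which is meant only to expose the computation.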

2) Max-pooling Layer

The main advantage of GATs is that they can learn the importance of each neighbor adaptively. However, the computational cost and memory consumption of previous GATs increase rapidly because the attention weights between each pair of neighbors must be computed.

To accelerate the training process and reduce the training time, this paper employs a max-pooling layer to down-sample the features from the previous graph attentional layer, which is one of the innovative points.

$$H_P' = \sigma_{\mathrm{ELU}}\left(\max_{i,j \in R} H_{P,ij}\right) \tag{19}$$

where HP,ij is the input feature of max-pooling layers; HP' is the output feature of max-pooling layers; and R is the max-pooling region.
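A minimal sketch of (19) is shown below. The shape of the pooling region R is not specified in the text, so this example assumes non-overlapping windows of adjacent rows, pooled separately for each feature column; the function names are hypothetical.

```python
import math

def elu(x):
    return x if x > 0 else math.exp(x) - 1.0

def max_pool(H, size=2):
    """Eq. (19): take the maximum over each pooling region, then apply ELU.
    The region is assumed to be `size` consecutive rows of H, per column."""
    pooled = []
    for start in range(0, len(H) - size + 1, size):
        window = H[start:start + size]
        pooled.append([elu(max(column)) for column in zip(*window)])
    return pooled

# Four rows with two features are reduced to two rows.
H_P = [[1.0, -2.0], [3.0, -1.0], [0.0, 5.0], [-4.0, 2.0]]
H_P_out = max_pool(H_P, size=2)
```

With a pooling size of 2, the number of rows passed to the dense layers is halved, which is where the reduction in parameters and training time comes from.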

3) Dense Layer

As one of the most commonly used layers in GNNs, the dense layer is generally located at the top of the neural network. In the RPOGAT, a dense layer is used to connect the features output by the max-pooling layer and another dense layer is employed to output the results (i.e., scheduling schemes of control devices):

$$H_D' = \sigma_S\left(W_D H_D + B_D\right) \tag{20}$$

where HD is the input feature of dense layers; WD is the weight vector of dense layers; BD is the bias vector of dense layers; σS is the sigmoid function; and HD' is the output feature of dense layers. Note that the output feature of the second dense layer is the scheduling schemes of control devices.

4) Loss Function

The original GAT is generally used for node classification tasks and employs cross-entropy as its loss function, which is not applicable to the RPO problem.

In this paper, the RPOGAT directly maps nonlinear relationships between the graphs and scheduling schemes of control devices. Therefore, the RPOGAT can be considered as a complex regression model, in which a simple yet robust mean absolute error (MAE) is employed as the customized loss function for the RPOGAT:

$$\mathrm{MAE} = \frac{1}{M}\sum_{i=1}^{M}\left|y_i - \hat{y}_i\right| \tag{21}$$

where M is the total number of predicted points; yi is the predicted value; and y^i is the real value.
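For reference, (21) amounts to the following few lines. The function name and the sample values are illustrative; in a TensorFlow implementation, the built-in MAE loss would normally be used instead.

```python
def mean_absolute_error(y, y_hat):
    """Eq. (21): average absolute difference between two equally long sequences."""
    if len(y) != len(y_hat) or not y:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

# Encoded device states in [0, 1] (made-up values).
mae = mean_absolute_error([0.5, 0.25, 1.0], [0.5, 0.75, 0.0])
```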

C. A Rule-based Strategy to Adjust Infeasible Solutions

In this subsection, a rule-based strategy is proposed to adjust infeasible solutions that violate constraints.

As shown in Fig. 5, to check the feasibility of solutions for the distribution network state, power flow analysis is performed to detect whether constraints are violated. If all constraints are satisfied, the solutions are implemented as real-time RPO control. Otherwise, if any constraint is violated, these solutions should be adjusted based on a rule-based strategy. Normally, the solution obtained by the RPOGAT can be used as an initial point for a rule-based strategy (e.g., linear programming, nonlinear programming, and heuristic algorithm) to speed up the convergence.

Fig. 5  Framework of rule-based strategy.

The GA is used as an example to illustrate how to adjust solutions. Other rule-based strategies can be treated in a similar way. Traditional GA uses random noises to initialize the chromosomes and their initial fitness functions are inferior. To improve the initial fitness functions, the solution obtained by RPOGAT is employed to initialize chromosomes. Also, in order to increase the diversity of chromosomes in the population, the initial chromosomes need to be mutated. In other words, one or more elements of each chromosome are replaced using random noises. The subsequent iterative process is consistent with the traditional GA. Overall, the chromosome of GA is initialized by the solution of RPOGAT to obtain a high-quality initial population, which can also accelerate the convergence of GA.
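The initialization step described above can be sketched as follows. The function name, the choice to keep one unmutated copy of the RPOGAT solution in the population, and the uniform noise in [0, 1) are assumptions for illustration; the six-element chromosome mirrors a setup with three SVCs, two CB groups, and one transformer.

```python
import random

def seed_population(solution, pop_size, n_mutations=1, rng=None):
    """Build an initial GA population from an encoded RPOGAT solution.

    Each chromosome is a copy of `solution` in which `n_mutations` randomly
    chosen genes are replaced by uniform noise in [0, 1), matching the value
    coding of the control devices. The first chromosome is left unmutated
    (an assumption, so that the seed itself is never lost).
    """
    rng = rng or random.Random()
    population = [list(solution)]
    while len(population) < pop_size:
        chromosome = list(solution)
        for idx in rng.sample(range(len(chromosome)), n_mutations):
            chromosome[idx] = rng.random()
        population.append(chromosome)
    return population

# Encoded solution predicted by the model (made-up values).
solution = [0.62, 0.41, 0.55, 0.21, 0.79, 0.50]
population = seed_population(solution, pop_size=50, n_mutations=1,
                             rng=random.Random(42))
```

The subsequent selection, crossover, and mutation steps are those of a conventional GA and are not shown here.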

The reason for using GA here is to adjust infeasible solutions that do not satisfy the constraints to ensure the feasibility of the solution. GA is used as an example because it is a commonly used rule-based optimization algorithm for the RPO problem. However, it is also recognized that the decision to use GA may require consideration of computational time constraints, especially in online environments. In practice, depending on computational resources and time requirements, other rule-based optimization algorithms (e.g., linear programming) may be considered, which are more suitable for online scenarios.

IV. Case Study

A. Simulation Setup

To compare the performance of the proposed RPOGAT and the popular benchmarks, simulations and analyses are performed on an IEEE 33-bus distribution network, whose parameters can be found in [25]. Further, various control devices (e.g., CBs, transformer, and SVCs) are added to the feeders, as shown in Fig. 6.

Fig. 6  Framework of IEEE 33-bus distribution network.

In particular, the voltage base value is 12.66 kV. The transformer tap includes 17 ratios, which vary from -8×1.25% to 8×1.25%. Various control devices are generally decentralized and located at the end of feeders to reduce power loss and boost voltage. Therefore, this paper makes the following assumptions about the location and capacity of SVCs and CBs: SVCs are added to the 9th node, 21st node, and 24th node. The reactive power provided by each SVC ranges from -500 kvar to 500 kvar. Two groups of CBs are added to the 17th node and 32nd node, respectively. Each group has 7 CBs. The reactive power provided by each CB is 100 kvar. The voltage limits at all nodes are 0.9-1.1 p.u.

The original IEEE 33-bus distribution network only includes one moment of loads, which cannot be used to train and test the performance of each model. To construct the dataset, this paper assumes that the load of each node is multiplied by a noise from a truncated Gaussian distribution. This is because the load level is considered to obey the truncated Gaussian distribution according to previous publications [12]. The standard deviation is 0.85 p.u., and the mean value is 1.12 p.u. The upper boundary for all noise is 2 p.u. and the lower boundary is 0.25 p.u. In this case, this paper randomly generates 4000 training samples, 500 validation samples, and 500 test samples. To obtain the labels of the training set and validation set, GA is run 30 times, and then the optimal solution is used as the label of these samples.
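The load sampling described above can be sketched as follows, using rejection sampling to draw from the truncated Gaussian distribution. The function names, the use of one multiplier per node applied to both active and reactive load, and the two base-load values are assumptions for illustration.

```python
import random

def truncated_gauss(rng, mean=1.12, std=0.85, low=0.25, high=2.0):
    """Draw from a Gaussian and reject values outside [low, high]."""
    while True:
        x = rng.gauss(mean, std)
        if low <= x <= high:
            return x

def make_sample(base_load, rng):
    """Scale the (P, Q) load of every node by its own noise factor.
    Using the same factor for P and Q of a node is an assumption."""
    sample = []
    for p, q in base_load:
        k = truncated_gauss(rng)
        sample.append((p * k, q * k))
    return sample

# Base loads in kW and kvar for two nodes (illustrative values).
base_load = [(100.0, 60.0), (90.0, 40.0)]
rng = random.Random(0)
samples = [make_sample(base_load, rng) for _ in range(100)]
```

Each generated sample would then be labeled by running the GA, as described above, before being used for training or validation.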

To illustrate the superiority of the proposed RPOGAT, simulations are analyzed in comparison with CNN [12], MLP [10], GA, CBR [7], support vector machine (SVM), random forest (RF), and more recent and advanced GCN [17]. Note that node 0 is a slack node, so it is not considered as an input to the model. For GNNs (e.g., GCN and RPOGAT), the inputs include a 32×32 adjacency matrix (i.e., the connection relationship among the 1st node to 32nd node) and a 32×2 feature matrix (i.e., the active power and reactive power of the 1st node to 32nd node). For CNN, MLP, SVM, and RF, their inputs only include the feature matrix, since they cannot handle the adjacency matrix. Further, the determination of hyperparameters is a challenge of deep learning models. The control variable method in [12] and Bayesian optimization in [26] are employed to determine parameters and structures of each model after many experiments in the training set and validation set.

1) RPOGAT: this model consists of two graph attentional layers, a max-pooling layer, and two dense layers. In each graph attentional layer, the number of output channels is 8, and the number of attention heads is 4. The pooling size is 2 in the max-pooling layer. The numbers of neurons in the two dense layers are 64 and 6, respectively.

2) GCN: this model has the same structure and parameters as RPOGAT, except that graph convolutional layers are used to replace graph attentional layers. In each graph convolutional layer, the number of output channels is 8.

3) CNN: this model has the same structure and parameters as RPOGAT, except that traditional convolutional layers are used to replace graph attentional layers. The numbers of filters in two convolutional layers are 8 and 16, respectively. The size of the convolutional kernel is 2.

4) GA: the population includes 50 chromosomes. The probability of chromosomal crossover is 0.7, and the probability of gene mutation is 0.2. The maximum number of iterations is 100.

5) CBR: this model utilizes similarity to filter historical cases and directly assigns the historical scheduling scheme to the current case without any changes. The specific framework can be found in [6].

6) SVM: this model is implemented for regression. The kernel type is linear, and the cache size is 300.

7) RF: this model improves the predictive accuracy by fitting a large number of decision trees. The number of decision trees is 200, and the minimum number of samples required to split an internal node is 2.

In addition, the neural networks mentioned above have the following parameters in common. The activation function of the output layer is the sigmoid function, and the activation function of the other layers is the ELU function. The train epoch is 200, and batch size is 32. The optimizer is the adaptive moment estimation (Adam) algorithm.

All models are implemented in Python using libraries including Spektral 1.0 and TensorFlow 2.0. After obtaining the scheduling scheme, the forward and backward substitution algorithms are performed to analyze power flows, power losses, and voltages in MATLAB 2018a. The key parameters of the computer are as follows: an Intel Core i5-8265U processor with a 1.80 GHz base frequency and 8 GB of memory.

B. Comparative Analysis with Popular Benchmarks

Each model is independently trained 30 times to obtain the solution of the test set, as shown in Table I.

TABLE I  Results of Different Models in IEEE 33-bus Distribution Network

| Model | Mean power loss (kW) | Maximum power loss (kW) | Minimum power loss (kW) | Standard deviation (kW) | Cross-constraint ratio (%) |
|---|---|---|---|---|---|
| RPOGAT | 235.230 | 235.256 | 235.213 | 0.037 | 0 |
| GCN | 235.942 | 235.956 | 235.913 | 0.047 | 0 |
| CNN | 236.349 | 236.397 | 236.319 | 0.095 | 0 |
| MLP | 236.386 | 236.394 | 236.379 | 0.050 | 0 |
| RF | 236.397 | 236.465 | 236.337 | 0.090 | 0 |
| GA | 236.924 | 236.942 | 236.902 | 0.262 | 0 |
| SVM | 237.666 | 237.666 | 237.666 | 0 | 0 |
| CBR | 240.075 | 240.075 | 240.075 | 0 | 0 |

1) High-quality solutions. Comparing the mean, maximum, and minimum power losses of each model, it is found that the GNNs including RPOGAT and GCN, which emphasize the importance of modeling both the topology information and feature of nodes, generally have better performance than other popular benchmarks (CNN, MLP, RF, SVM, and CBR). This is mainly because models such as the CNN, MLP, RF, and SVM only process the feature matrix of the nodes and ignore the topology information. The performance of CBR is the worst, since it uses similarity to filter historical cases and assign historical scheduling schemes directly to current cases without making any changes. In other words, CBR has difficulty adapting to various load conditions. Note that the labels of the training and validation sets are generated by GA, while RPOGAT outperforms GA, which indicates that RPOGAT does not simply copy the labels generated by GA, but can adaptively generate suitable scheduling schemes according to different load conditions.

2) Stable solutions. SVM is a statistical model, while the results of CBR depend only on the search strategy and historical cases, so both have a standard deviation of 0. In other words, independently trained SVMs or CBRs are identical across runs. In contrast, the other models (e.g., RPOGAT, GCN, CNN, MLP, RF, and GA) initialize weights or populations with random noise, so each trained model performs differently. To analyze the effect of random noise on model performance, Table I shows the standard deviation of the power loss on the test set for each model. Apart from SVM and CBR, RPOGAT has the best stability, as its standard deviation is smaller than those of GCN, CNN, MLP, RF, and GA.

3) Low cross-constraint ratio. After obtaining the scheduling schemes for the test set, the power flow analysis is performed to detect if constraints are violated. From the 6th column of Table I, all constraints are satisfied, and solutions of each model can be implemented as real-time RPO control.
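As a rough illustration of how such a ratio could be computed from power flow results, the following Python sketch counts the share of test samples with at least one bus voltage outside its limits. It assumes that only voltage magnitude constraints are checked and uses hypothetical limits of 0.95 p.u. and 1.05 p.u.; neither the function name nor the limits are taken from the paper.

```python
import numpy as np

def cross_constraint_ratio(voltages, v_min=0.95, v_max=1.05):
    """Percentage of samples that violate a voltage limit at any bus.

    voltages : array of shape (n_samples, n_buses) with voltage magnitudes in p.u.
    v_min, v_max : hypothetical limits (assumed here, not taken from the paper)
    """
    voltages = np.asarray(voltages, dtype=float)
    violated = ((voltages < v_min) | (voltages > v_max)).any(axis=1)
    return 100.0 * violated.mean()

# Four hypothetical samples on a two-bus system; only the second one is infeasible
example_voltages = [[1.00, 0.98], [1.06, 1.00], [0.99, 0.97], [1.00, 1.00]]
ratio = cross_constraint_ratio(example_voltages)
```

A ratio of 0 then means that every scheduling scheme in the test set is feasible.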

Further, Table II shows the mean off-line training time and online inference time of each model. Specifically, GA and CBR do not need to pre-train their models, so their off-line training time is 0 s. Neural network and decision tree based models (RPOGAT, GCN, CNN, MLP, and RF) take longer to train off-line than statistical models (e.g., SVM), because the former have thousands of parameters to be optimized.

TABLE II  Mean Off-line Training Time and Online Inference Time of Different Models in IEEE 33-bus Distribution Network
Model | Off-line training time (s) | Online inference time (s)
RPOGAT | 588.17 | 0.57
GCN | 45.29 | 0.09
CNN | 38.14 | 0.04
MLP | 20.04 | 0.02
RF | 110.31 | 0.12
GA | 0 | 22.18
SVM | 0.25 | 0.01
CBR | 0 | 3.21

The main limitation of RPOGAT is that its mean off-line training time (588.17 s) is longer than that of the other models, but even a few hours of off-line training time is acceptable in practical engineering. Compared with traditional physical model based algorithms (e.g., GA, with 22.18 s), the online inference time of the neural networks is much shorter (0.57 s for RPOGAT), which is one of the advantages of the proposed RPOGAT.

C. Sensitivity Analysis of Training Set Size

When applying machine learning based models under realistic conditions, the amount of historical data available in each region varies widely, which may affect the quality of scheduling schemes for RPO. In this subsection, the sensitivity of each model to the training set size is analyzed.

First, 11 different cases are set to vary the training set size, i.e., the number of samples in the training set, as shown in Table III. The samples of each case are randomly sampled from the original training set. Note that the test set and the validation set are kept constant. Then, each machine learning based model is independently trained 30 times to obtain the mean power loss of the test set and mean off-line training time, as shown in Fig. 7.
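The random subsampling of the original training set can be sketched as follows. This is a minimal example under the assumption that sampling is done without replacement from the 4000 original training samples; the function name, the per-case seeds, and the use of NumPy are illustrative choices rather than details given in the paper.

```python
import numpy as np

# Training set sizes of cases 1 to 11, as listed in Table III
CASE_SIZES = [4000, 3500, 3000, 2500, 2000, 1500, 1000, 500, 300, 100, 50]
N_ORIGINAL = 4000  # size of the original training set

def subsample_indices(n_total, n_keep, seed=0):
    """Draw n_keep distinct sample indices out of n_total (assumed: no replacement)."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_total, size=n_keep, replace=False)

# One index set per case; the validation and test sets are left untouched
cases = {
    case_no: subsample_indices(N_ORIGINAL, size, seed=case_no)
    for case_no, size in enumerate(CASE_SIZES, start=1)
}
```

Each index set would then select the rows of the training features and labels used to retrain a model for that case.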

TABLE III  Training Set Size in Different Cases in IEEE 33-bus Distribution Network
Case No. | Number of training samples | Case No. | Number of training samples | Case No. | Number of training samples
1 | 4000 | 5 | 2000 | 9 | 300
2 | 3500 | 6 | 1500 | 10 | 100
3 | 3000 | 7 | 1000 | 11 | 50
4 | 2500 | 8 | 500 | |

Fig. 7  Mean results of different cases. (a) Power losses. (b) Off-line training time.

On the one hand, the performance of each model on the test set does not change significantly as the number of training samples decreases, indicating that these machine learning based models are not sensitive to the training set size. For example, the power loss of the proposed RPOGAT in case 11 is only 0.15% higher than that in case 1. In addition, a small training set can significantly reduce the mean off-line training time without noticeably affecting the performance of the models, so machine learning based models do not require a large training set, which is one of the advantages of the proposed RPOGAT.

On the other hand, the power loss of the proposed RPOGAT is smaller than that of the other models regardless of how the training set size varies, indicating that the proposed RPOGAT has better adaptability to different training set sizes.

D. Robustness Analysis of Extreme Load Conditions

In practice, the consumption habits or market-based behavior of users may change the load profiles, causing the current load distributions to differ dramatically from the historical ones. For example, very rare light loads or heavy loads may occasionally occur. In this subsection, the robustness of each model to extreme load conditions is tested.

Firstly, only common load conditions (i.e., medium loads) are used to construct the 4000 training samples and 500 validation samples. To obtain these samples, the raw power load of each node is multiplied by a Gaussian noise ranging from 0.75 p.u. to 1.75 p.u.

Secondly, extreme load conditions (i.e., light loads and heavy loads) are used to construct the test set, which includes 250 light load conditions and 250 heavy load conditions. To obtain the samples of light loads, the raw power load of each node is multiplied by a Gaussian noise ranging from 0.25 p.u. to 0.5 p.u. To obtain the samples of heavy loads, the raw power load of each node is multiplied by a Gaussian noise ranging from 2 p.u. to 2.5 p.u.

Finally, each model is independently trained 30 times to obtain the results, as shown in Table IV and Table V. Note that the proposed rule-based strategy is not used here, and it will be considered in Section IV-E.

TABLE IV  Results of Different Models for Light Loads in IEEE 33-bus Distribution Network
Model | Mean power loss (kW) | Maximum power loss (kW) | Minimum power loss (kW) | Standard deviation (kW) | Cross-constraint ratio (%)
RPOGAT | 28.482 | 28.725 | 28.231 | 0.162 | 10.76
GCN | 29.292 | 30.820 | 28.008 | 1.216 | 14.20
CNN | 30.058 | 32.348 | 28.870 | 1.588 | 20.88
MLP | 29.987 | 30.209 | 29.850 | 0.102 | 17.32
RF | 31.568 | 32.153 | 28.967 | 1.065 | 22.80
GA | 25.876 | 25.883 | 25.868 | 0.064 | 0
SVM | 31.471 | 31.471 | 31.471 | 0 | 28.40
CBR | 31.863 | 31.863 | 31.863 | 0 | 22.80

TABLE V  Results of Different Models for Heavy Loads in IEEE 33-bus Distribution Network
Model | Mean power loss (kW) | Maximum power loss (kW) | Minimum power loss (kW) | Standard deviation (kW) | Cross-constraint ratio (%)
RPOGAT | 717.277 | 717.503 | 716.697 | 0.335 | 0
GCN | 718.061 | 718.245 | 717.880 | 0.185 | 0
CNN | 718.876 | 721.119 | 718.123 | 1.199 | 0
MLP | 719.155 | 719.493 | 718.960 | 0.153 | 0
RF | 719.026 | 719.158 | 718.910 | 0.093 | 0
GA | 716.712 | 717.972 | 716.629 | 1.051 | 0
SVM | 720.887 | 720.887 | 720.887 | 0 | 0
CBR | 729.740 | 728.360 | 728.360 | 0 | 0

Under both light and heavy load conditions, the proposed RPOGAT outperforms the other machine learning based models, including GCN, CNN, MLP, RF, SVM, and CBR, since it achieves the lowest mean power loss among them.

On the other hand, some solutions of all machine learning based models do not satisfy the constraints for light load conditions. This is because the light load conditions are significantly different from the common load conditions. The scheduling schemes from these machine learning based models provide too much reactive power, causing the voltage to exceed the upper limit. Therefore, these bad solutions cannot be implemented as real-time RPO control, and they should be adjusted based on a rule-based strategy.

For example, the bad solutions from RPOGAT account for 10.76% of the light load conditions in the test set. This ratio is lower by approximately 3.44, 10.12, 6.56, 12.04, 17.64, and 12.04 percentage points than those of GCN, CNN, MLP, RF, SVM, and CBR, respectively.
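These differences follow directly from the cross-constraint ratios in Table IV and can be checked with a few lines of Python. The dictionary and variable names are illustrative; the values are copied from the table.

```python
# Cross-constraint ratios (%) for light loads, copied from Table IV
light_load_ratio = {
    "RPOGAT": 10.76,
    "GCN": 14.20,
    "CNN": 20.88,
    "MLP": 17.32,
    "RF": 22.80,
    "SVM": 28.40,
    "CBR": 22.80,
}

# Absolute difference (in percentage points) between each benchmark and RPOGAT
reduction = {
    model: round(ratio - light_load_ratio["RPOGAT"], 2)
    for model, ratio in light_load_ratio.items()
    if model != "RPOGAT"
}
```

Note that these are absolute differences between two percentages, not relative reductions.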

E. Performance Analysis of Rule-based Strategy to Adjust Infeasible Solutions

The previous subsections have shown that the proposed RPOGAT and other machine learning based models may yield infeasible solutions for extreme load conditions (e.g., light load conditions). This subsection will analyze whether the proposed strategy can adjust infeasible solutions.

The solutions obtained by RPOGAT are utilized to initialize the chromosomes of GA. In addition, one element of each chromosome is replaced with random noise to increase the diversity of chromosomes in the population. Figure 8 shows the iterative processes of two populations, one initialized with random noise and the other with the solutions obtained by RPOGAT.

Fig. 8  Iterative process of two populations.

Although a bad solution obtained by RPOGAT cannot be used directly for real-time RPO control, it can be used to form the initial population of GA, which accelerates the convergence of GA. After initializing the population using the solution obtained by RPOGAT, the mean number of iterations to GA convergence is 38, while that of the original GA is 55. In other words, the solutions of RPOGAT reduce the number of iterations, and hence the online inference time of GA, by about 30.91%.
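The initialization step described above can be sketched in Python as follows. This is a simplified illustration that treats every control variable as continuous within hypothetical bounds, whereas the actual control devices include discrete settings; the function name, bounds, and seed solution are assumptions. The last lines reproduce the 30.91% figure from the two iteration counts.

```python
import numpy as np

def warm_start_population(seed_solution, lower, upper, pop_size, seed=0):
    """Build a GA population from one seed solution.

    Every chromosome is a copy of the seed solution in which exactly one
    randomly chosen element is replaced by a random value within its bounds.
    """
    rng = np.random.default_rng(seed)
    seed_solution = np.asarray(seed_solution, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    population = np.tile(seed_solution, (pop_size, 1))
    genes = rng.integers(0, seed_solution.size, size=pop_size)  # one gene per chromosome
    rows = np.arange(pop_size)
    population[rows, genes] = rng.uniform(lower[genes], upper[genes])
    return population

# Hypothetical infeasible solution with four control variables scaled to [0, 1]
seed_solution = np.array([0.2, 0.5, 0.8, 0.1])
population = warm_start_population(seed_solution, np.zeros(4), np.ones(4), pop_size=20)

# Relative reduction in the mean number of iterations to convergence
iters_random_init, iters_warm_start = 55, 38
reduction_percent = round(100 * (iters_random_init - iters_warm_start) / iters_random_init, 2)
```

Starting every chromosome close to the RPOGAT solution keeps the search near a good region, while the single perturbed element preserves some diversity in the population.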

Generally, the proposed strategy ensures that the constraints are satisfied, and the solution derived from RPOGAT as a starting point can speed up convergence.

F. Comparison of Dynamic RPO

The previous subsections have verified the effectiveness of the proposed RPOGAT for static RPO, and dynamic RPO can be simplified into multiple static ones. It can be inferred that the proposed RPOGAT is also applicable to dynamic RPO.

To confirm the above inference, the raw loads are extended into a typical daily load curve in [6], including commercial, domestic, and industrial loads. One day is divided into five time intervals as a simple example, and the models are trained 30 times at each interval to obtain mean results, as shown in Table VI.

TABLE VI  Mean Results of Different Models for Dynamic RPO
Model | Power loss in 1st time interval (kW) | Power loss in 2nd time interval (kW) | Power loss in 3rd time interval (kW) | Power loss in 4th time interval (kW) | Power loss in 5th time interval (kW) | Power loss in one day (kW)
RPOGAT | 131.57 | 91.51 | 547.18 | 1589.63 | 566.33 | 2926.22
GCN | 132.75 | 92.41 | 547.93 | 1590.29 | 567.07 | 2930.43
CNN | 133.65 | 93.48 | 548.72 | 1591.25 | 567.88 | 2934.98
MLP | 140.60 | 97.09 | 550.25 | 1592.97 | 569.41 | 2950.32
RF | 142.78 | 98.18 | 550.88 | 1593.48 | 570.06 | 2955.38
SVM | 147.93 | 98.72 | 551.41 | 1601.32 | 569.75 | 2969.13
CBR | 158.45 | 108.39 | 567.36 | 1597.86 | 573.54 | 3005.59

After dividing the day into multiple time intervals, the proposed RPOGAT has lower power losses in each interval than the other machine learning based models, which indicates that the above inference is correct, i.e., RPOGAT also outperforms popular benchmarks for dynamic RPO.

G. Performance Comparison of Large-scale Distribution Networks

To validate the effectiveness of the proposed RPOGAT for large-scale distribution networks, simulations and analyses are performed on the IEEE 69-bus distribution network and a 118-bus distribution network, whose parameters can be found in [27], [28]. Further, various control devices (e.g., CBs, transformers, and SVCs) are added to the feeders, as shown in Fig. 9 and Fig. 10. Other parameters (e.g., voltage base value, types of transformers, and capacities of SVCs and CBs) are the same as those in the IEEE 33-bus distribution network.

Fig. 9  Framework of IEEE 69-bus distribution network.

Fig. 10  Framework of 118-bus distribution network.

Similarly, to construct 5000 samples (4000 training samples, 500 validation samples, and 500 test samples), the load of each node is multiplied by a Gaussian noise ranging from 0.25 p.u. to 1.5 p.u. The GA is also employed to obtain the labels of the training set and validation set. Each model is independently trained 30 times to obtain the solution of the test set, as shown in Tables VII-X.

TABLE VII  Results of Different Models in IEEE 69-bus Distribution Network
Model | Mean power loss (kW) | Maximum power loss (kW) | Minimum power loss (kW) | Standard deviation (kW) | Cross-constraint ratio (%)
RPOGAT | 289.046 | 289.054 | 289.036 | 0.034 | 0
GCN | 289.860 | 289.866 | 289.851 | 0.051 | 0
CNN | 290.097 | 290.101 | 290.092 | 0.065 | 0
MLP | 290.216 | 290.225 | 290.205 | 0.078 | 0
RF | 290.112 | 290.127 | 290.100 | 0.074 | 0
GA | 290.128 | 290.145 | 290.110 | 0.147 | 0
SVM | 291.725 | 291.725 | 291.725 | 0 | 0
CBR | 291.220 | 291.220 | 291.220 | 0 | 0

TABLE VIII  Results of Different Models in 118-bus Distribution Network
Model | Mean power loss (kW) | Maximum power loss (kW) | Minimum power loss (kW) | Standard deviation (kW) | Cross-constraint ratio (%)
RPOGAT | 569.427 | 569.453 | 569.412 | 0.031 | 0
GCN | 570.465 | 570.595 | 570.430 | 0.090 | 0
CNN | 571.633 | 571.723 | 571.547 | 0.168 | 0
MLP | 571.592 | 571.598 | 571.584 | 0.072 | 0
RF | 571.222 | 571.234 | 571.212 | 0.039 | 0
GA | 570.719 | 570.750 | 570.688 | 0.324 | 0
SVM | 571.659 | 571.659 | 571.659 | 0 | 0
CBR | 572.422 | 572.422 | 572.422 | 0 | 0

The simulation results of large-scale distribution networks are similar to those of the IEEE 33-bus distribution network, i.e., the mean power loss of the proposed RPOGAT is smaller than that of the other benchmarks, and the proposed RPOGAT is more stable than the other neural network and decision tree based models.

Real-time power systems typically require an appropriate dispatching solution within 1 min [29], during which the distribution networks acquire metering data and then obtain the optimal adjustment for all control devices (e.g., transformers, CBs, and SVCs). From Table IX and Table X, it is found that the online inference time of GA rises dramatically with the scale of the distribution network (67.25 s for the IEEE 69-bus network and 287.51 s for the 118-bus network), while that of the proposed RPOGAT is not sensitive to the scale (0.59 s and 0.69 s, respectively), since the physical model is replaced by the direct RPOGAT mapping. In a nutshell, the proposed RPOGAT is more suitable than the traditional physical model based algorithms (e.g., GA) for real-time RPO control in large-scale distribution networks.
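The scaling argument can be made concrete with a short Python check against the 1 min requirement, using the online inference times reported in Tables II, IX, and X. The variable names are illustrative, and the check assumes that the reported mean inference time is the quantity to compare with the limit.

```python
# Mean online inference time (s), keyed by number of buses (Tables II, IX, and X)
online_time_s = {
    "GA": {33: 22.18, 69: 67.25, 118: 287.51},
    "RPOGAT": {33: 0.57, 69: 0.59, 118: 0.69},
}
LIMIT_S = 60.0  # dispatching solution required within 1 min

summary = {}
for model, times in online_time_s.items():
    summary[model] = {
        # Growth of inference time from the 33-bus to the 118-bus network
        "growth_33_to_118": times[118] / times[33],
        # Whether each network size meets the 1 min requirement
        "within_limit": {buses: t <= LIMIT_S for buses, t in times.items()},
    }
```

Under this reading, GA meets the limit only on the 33-bus network, whereas RPOGAT stays below 1 s on all three networks.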

TABLE IX  Mean Off-line Training Time and Online Inference Time of Different Models in IEEE 69-bus Distribution Network
Model | Off-line training time (s) | Online inference time (s)
RPOGAT | 1111.29 | 0.59
GCN | 50.67 | 0.11
CNN | 49.21 | 0.05
MLP | 21.25 | 0.02
RF | 267.45 | 0.18
GA | 0 | 67.25
SVM | 1.40 | 0.04
CBR | 0 | 5.89

TABLE X  Mean Off-line Training Time and Online Inference Time of Different Models in 118-bus Distribution Network
Model | Off-line training time (s) | Online inference time (s)
RPOGAT | 2173.83 | 0.69
GCN | 77.73 | 0.43
CNN | 93.09 | 0.07
MLP | 21.34 | 0.02
RF | 826.61 | 0.30
GA | 0 | 287.51
SVM | 4.18 | 0.13
CBR | 0 | 9.26

V. Conclusion

To improve accuracy and reduce online inference time of RPO, the GAT is migrated from computer vision into the RPO problem. Simulation and analysis of several distribution networks lead to the following conclusions.

1) The proposed RPOGAT achieves state-of-the-art performance with high-quality and stable solutions for both static and dynamic RPOs. For common load conditions, the scheduling schemes generated by proposed RPOGAT do not violate the constraints. For extreme load conditions, the proposed RPOGAT also has a lower cross-constraint ratio than other machine learning based models when the proposed rule-based strategy is not used.

2) The proposed rule-based strategy ensures that the constraints are satisfied, and the solution derived from the proposed RPOGAT as a starting point can speed up convergence.

3) The proposed RPOGAT is not sensitive to the training set size, and it has better adaptability than the other machine learning based models regardless of how the training set size varies.

4) The main limitation of the proposed RPOGAT is that its off-line training time is longer than that of the other models, but an off-line training time of a few hours is acceptable in practical engineering. The proposed RPOGAT is more suitable than traditional physical model based algorithms (e.g., GA) for real-time RPO control in large-scale distribution networks, since its online inference is significantly faster.

For future work, the single objective function may be extended into multiple ones. Besides, renewable energy sources, active power dispatching, energy storage devices, and topology changes may also be considered [30], [31].

References

[1] P. Li, Z. Wu, C. Zhang et al., “Multi-timescale affinely adjustable robust reactive power dispatch of distribution networks integrated with high penetration of PV,” Journal of Modern Power Systems and Clean Energy, vol. 11, no. 1, pp. 324-334, Jan. 2023.

[2] W. Ma, W. Wang, Z. Chen et al., “Voltage regulation methods for active distribution networks considering the reactive power optimization of substations,” Applied Energy, vol. 284, pp. 2472-2480, Feb. 2021.

[3] Q. Zhao, W. Liao, S. Wang et al., “Robust voltage control considering uncertainties of renewable energies and loads via improved generative adversarial network,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 6, pp. 1104-1114, Nov. 2020.

[4] L. Chen, Z. Deng, and X. Xu, “Two-stage dynamic reactive power dispatch strategy in distribution network considering the reactive power regulation of distributed generations,” IEEE Transactions on Power Systems, vol. 34, no. 2, pp. 1021-1032, Mar. 2019.

[5] W. Liao, S. Wang, B. Bak-Jensen et al., “Robust reactive power scheduling of distribution networks based on modified bootstrap technique,” Journal of Modern Power Systems and Clean Energy, vol. 12, no. 1, pp. 154-166, Jan. 2024.

[6] H. Zhou, F. Tang, D. Liu et al., “Active distribution network dynamic reconfiguration and DG dynamic control strategy considering time-variant load,” Power System Technology, vol. 40, no. 8, pp. 2423-2430, Aug. 2016.

[7] L. Wen, S. Wang, L. Qi et al., “Reactive power optimization of distribution network based on case-based reasoning,” in Proceedings of 2018 IEEE PES General Meeting (PESGM), Portland, USA, Aug. 2018, pp. 1-5.

[8] G. Chen, T. Zhang, S. Hao et al., “Association mining based intelligent identification method of key parameters for reactive power optimization,” Automation of Electric Power Systems, vol. 41, no. 23, pp. 109-116, Sept. 2017.

[9] H. Ma, G. Wang, X. Gao et al., “An adaptive voltage control using local voltage profile mode and similarity ranking,” Frontiers in Energy Research, vol. 10, pp. 1-12, May 2022.

[10] M. Shahriyari, A. Safari, A. Quteishat et al., “A short-term voltage stability online assessment based on multi-layer perceptron learning,” Electric Power Systems Research, vol. 223, pp. 1-14, Oct. 2023.

[11] X. Lei, Z. Yang, J. Yu et al., “Data-driven optimal power flow: a physics-informed machine learning approach,” IEEE Transactions on Power Systems, vol. 36, no. 1, pp. 346-354, Jan. 2021.

[12] W. Liao, J. Chen, Q. Liu et al., “Data-driven reactive power optimization for distribution networks using capsule networks,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 5, pp. 1274-1287, Sept. 2022.

[13] D. Cao, J. Zhao, W. Hu et al., “Model-free voltage control of active distribution system with PVs using surrogate model-based deep reinforcement learning,” Applied Energy, vol. 306, pp. 1-15, Jan. 2022.

[14] S. Li, D. Cao, W. Hu et al., “Multi-energy management of interconnected multi-microgrid system using multi-agent deep reinforcement learning,” Journal of Modern Power Systems and Clean Energy, vol. 11, no. 5, pp. 1606-1617, Sept. 2023.

[15] H. Li and H. He, “Learning to operate distribution networks with safe deep reinforcement learning,” IEEE Transactions on Smart Grid, vol. 13, no. 3, pp. 1860-1872, May 2022.

[16] B. Donon, R. Clement, B. Donnot et al., “Neural networks for power flow: graph neural solver,” Electric Power Systems Research, vol. 189, pp. 1-9, Dec. 2020.

[17] M. Gao, J. Yu, Z. Yang et al., “A physics-guided graph convolution neural network for optimal power flow,” IEEE Transactions on Power Systems, doi: 10.1109/TPWRS.2023.3238377

[18] C. Kim, K. Kim, P. Balaprakash et al., “Graph convolutional neural networks for optimal load shedding under line contingency,” in Proceedings of 2019 IEEE PES General Meeting (PESGM), Atlanta, USA, Aug. 2019, pp. 1-5.

[19] R. Wang, L. Wang, X. Wei et al., “Dynamic graph-level neural network for SAR image change detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 19, pp. 1-5, Jan. 2022.

[20] F. Xia, K. Sun, S. Yu et al., “Graph learning: a survey,” IEEE Transactions on Artificial Intelligence, vol. 2, no. 2, pp. 109-127, Apr. 2021.

[21] Y. Liu, S. Yang, Y. Xu et al., “Contextualized graph attention network for recommendation with item knowledge graph,” IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 1, pp. 181-195, Jan. 2023.

[22] J. Zhao, H. Niu, and Y. Wang, “Dynamic reconfiguration of active distribution network based on information entropy of time intervals,” Power System Technology, vol. 41, no. 2, pp. 402-408, Jan. 2017.

[23] P. Wang, Q. Wu, S. Huang et al., “ADMM-based distributed active and reactive power control for regional AC power grid with wind farms,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 3, pp. 588-596, May 2022.

[24] Y. Huo, P. Li, H. Ji et al., “Data-driven adaptive operation of soft open points in active distribution networks,” IEEE Transactions on Industrial Informatics, vol. 17, no. 12, pp. 8230-8242, Dec. 2021.

[25] M. Baran and F. Wu, “Network reconfiguration in distribution systems for loss reduction and load balancing,” IEEE Transactions on Power Delivery, vol. 4, no. 2, pp. 1401-1407, Apr. 1989.

[26] J. Wu, X. Chen, H. Zhang et al., “Hyperparameter optimization for machine learning models based on Bayesian optimization,” Journal of Electronic Science and Technology, vol. 17, no. 1, pp. 26-40, Mar. 2019.

[27] M. Baran and F. Wu, “Optimal capacitor placement on radial distribution systems,” IEEE Transactions on Power Delivery, vol. 4, no. 1, pp. 725-734, Jan. 1989.

[28] D. Zhang, Z. Fu, and L. Zhang, “An improved TS algorithm for loss-minimum reconfiguration in large-scale distribution systems,” Electric Power Systems Research, vol. 77, no. 5, pp. 725-734, Apr. 2007.

[29] Q. Nan. (2016, Jun.). Voltage control in the future power transmission systems. [Online]. Available: https://vbn.aau.dk/ws/portalfiles/portal/254173904/

[30] W. Liao, S. Wang, B. Bak-Jensen et al., “Ultra-short-term interval prediction of wind power based on graph neural network and improved bootstrap technique,” Journal of Modern Power Systems and Clean Energy, vol. 11, no. 4, pp. 1100-1114, Jul. 2023.

[31] Z. Yang, W. Liao, C. Bak et al., “Active control based three-phase reclosing scheme for single transmission line with PMSGs,” IEEE Transactions on Industrial Electronics, doi: 10.1109/TIE.2023.3283709