Fault Location and Classification for Distribution Systems Based on Deep Graph Learning Methods

Jiaxiang Hu; Weihao Hu; Jianjun Chen; Di Cao; Zhengyuan Zhang; Zhou Liu; Zhe Chen; Frede Blaabjerg


Jiaxiang Hu; Weihao Hu (Senior Member, IEEE); Jianjun Chen; Di Cao; Zhengyuan Zhang (Senior Member, IEEE); Zhou Liu (Senior Member, IEEE); Zhe Chen (Fellow, IEEE); Frede Blaabjerg (Fellow, IEEE)

School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Siemens Gamesa Renewable Energy A/S, Lyngby, Denmark; Aalborg University, Aalborg, Denmark

Updated: 2023-01-19

DOI: 10.35833/MPCE.2022.000204


Abstract

Accurate and timely fault diagnosis is of great significance for the safe operation and power supply reliability of distribution systems. However, traditional intelligent methods limit the use of the physical structures and data information of power networks. To this end, this study proposes a fault diagnostic model for distribution systems based on deep graph learning. This model considers the physical structure of the power network as a significant constraint during model training, which endows the model with stronger information perception to resist abnormal data input and unknown application conditions. In addition, a special spatiotemporal convolutional block is utilized to enhance the waveform feature extraction ability. This enables the proposed fault diagnostic model to be more effective in dealing with both fault waveform changes and the spatial effects of faults. In addition, a multi-task learning framework is constructed for fault location and fault type analysis, which improves the performance and generalization ability of the model. The IEEE 33-bus and IEEE 37-bus test systems are modeled to verify the effectiveness of the proposed fault diagnostic model. Finally, different fault conditions, topological changes, and interference factors are considered to evaluate the anti-interference and generalization performance of the proposed model. Experimental results demonstrate that the proposed model outperforms other state-of-the-art methods.

Keywords

Fault diagnosis; fault location; fault type analysis; distribution system; deep graph learning; multi-task learning

I. Introduction

With the expansion of modern distribution systems and the increase in load access, distribution systems are more likely to suffer from faults due to the occurrence of stochastic events such as lightning strikes, insulation breakdowns, and improper operations [1]. These faults affect production and people's livelihoods and can even cause considerable economic losses due to accidental power outages [2]. Therefore, effective fault location and fault type analysis are critical to the safe and stable operation of distribution networks and the reliability of the power supply.

With the development of distribution automation (DA), more operational data are obtained from intelligent electronic devices [3] and other DA devices. The data can be analyzed for fault diagnosis and protection. Several studies have contributed to this field of research. Traditional fault diagnostic methods can be divided into impedance-based methods [4], [5], voltage sag based methods [6], traveling wave based methods [7], [8], and machine learning based methods [9]. For example, [4] proposes an impedance-based method to locate faults in distribution networks using an impedance matrix. Reference [6] utilizes a voltage sag to determine the fault location in a distribution network. Reference [7] examines fault locations using traveling waves. Reference [9] extracts fault features using wavelet transformation. Reference [10] proposes a novel method for fault location, isolation, and service restoration for active distribution networks based on distributed processing. Reference [11] proposes a novel two-stage localization method for single-phase earth faults in resonant grounding systems. Reference [12] proposes a smart protection scheme which utilizes micro-phasor measurement units (µPMUs) to obtain continuous rapid synchronized phasor measurement data. Reference [13] proposes a precise and rapid technique for identifying the fault section in a low-voltage DC (LVDC) distribution system. These methods skillfully utilize the network structure information and fault characteristics to solve fault diagnostic tasks. However, the performances of some traditional methods are limited by their relatively weak feature extraction abilities and complex analytical processes. Artificial intelligence (AI) technology provides a new way to establish the mapping from fault features to fault locations.

With the continuous development of AI technology, deep learning methods are widely utilized in the field of fault diagnosis [14], [15] due to their powerful ability to extract features from a large amount of fault information without human intervention. Reference [16] introduces an artificial neural network into the fault location of a distribution network. Reference [17] utilizes a neural network to extract features and applies a support vector machine (SVM) as the classifier. Reference [18] uses a convolutional neural network (CNN) for fault identification and classification. Reference [19] applies a CNN to deal with voltage dip and tests the model using different datasets. Although these traditional deep learning techniques can effectively extract fault features from Euclidean space, they have limitations such as low training efficiency, difficulty in extracting effective features, and poor generalization ability when dealing with increasingly complex information from distribution systems. In addition, traditional deep learning technology cannot effectively utilize the physical structural information of the power network. Therefore, more powerful fault information processing methods are required to solve these problems.

A graph neural network (GNN) is a novel type of neural network model based on spatial structural information. The physical structural relationship acts as a significant constraint during the model learning process, which gives the GNN a stronger feature extraction ability and a faster training speed. The latest studies have implemented GNN for fault diagnosis [20]. For example, [21] applies graph learning to fault diagnosis of power transformers. Reference [22] constructs a graph structure based on data similarities and applies it to bearing fault diagnosis. References [23] and [24] utilize graph convolution networks (GCNs) for fault diagnosis of transmission and distribution networks. However, these methods still have some limitations. For example, some methods require accurate line parameter information, which is difficult to obtain for an actual distribution network. The fault waveform features are not effectively utilized due to relatively weak feature extraction. In addition, these methods do not consider the conditions of different fault resistances, topological changes, and different data interferences. To address these limitations, this paper proposes a novel fault diagnostic model for distribution systems based on spatiotemporal graph learning. The proposed model embeds topological information into the model learning process and operates without line parameters. Compared with common data-driven methods, the model can learn the deeper structural information of data, making the proposed model more resistant to abnormal data and condition changes. In addition, the spatiotemporal convolutional blocks improve the feature extraction ability of the fault diagnostic model. The process of the proposed fault diagnostic model includes: ① constructing the graph structure according to the structural relationship of the measured data; ② collecting measurement data; ③ training the model offline; and ④ detecting the fault type and fault location in real time. Experiments show that the proposed method has better generalization performance and anti-interference ability under different fault resistances, topological changes, and different data interferences. The main contributions of this study are as follows.

1) A novel fault diagnostic model based on spatiotemporal graph learning is proposed to complete fault location and fault classification in the distribution system. The measurement information processing derives from the physical structure of distribution network. Compared with traditional data-driven methods, the graph-based method can embed topological information into the model learning process, which makes the proposed model learn the deeper structure information of data and be more resistant to abnormal data and condition changes.

2) To improve the information processing capabilities of the model, a special spatiotemporal convolutional block is designed to extract fault features. This structure employs an efficient process for dealing with waveform and spatial information, which can combine the feature extraction of data numerical features and data structural information. Compared with the common GCNs, the proposed method has stronger feature extraction and anti-interference abilities. The results show that the proposed blocks improve the diagnostic results and narrow the input data windows, which ensures the speed and sensitivity of relay protection devices.

3) With the effective utilization of the structural information and the enhancement of feature extraction ability, the proposed model can deal with unknown system conditions. Experiments show that the proposed model offers a better generalization performance. It can maintain the performance of fault diagnosis under different topologies, fault resistances, and noise interference conditions. The effectiveness of the proposed model is verified using the IEEE 33-bus and IEEE 37-bus test systems. In addition, other state-of-the-art intelligent methods are utilized for comparative experiments.

The remainder of this paper is organized as follows. Section II describes the significance of fault type analysis and fault location in distribution network. Section III introduces the proposed fault diagnostic framework in terms of theoretical basis and technical details of the implementation. Section IV verifies the effectiveness of the proposed model through case studies. Section V concludes the study.

II. Significance of Fault Type Analysis and Fault Location in Distribution Network

Different handling techniques can be used for different fault scenarios in distribution networks. For example, when the neutral point is not effectively grounded, a single-phase-to-ground fault does not need to be removed immediately [25], which means the system can continue to operate for a period. Phase-to-phase short-circuit faults in the distribution networks must be removed immediately. Although other metallic short-circuit faults do not occur frequently, the fault diagnostic model can judge them correctly to reduce the influence range of the protective action. In addition, fast and accurate fault location is critical for the power supply reliability of distribution networks. More effective measures should be implemented to reduce the influence range of faults according to fault location results. Therefore, based on the judgment results of the fault type analysis and fault location, the sectionalizer can selectively cut off the fault line instead of all three-phase lines with larger ranges. A sectionalizer is a relatively economical switchgear device, which can be used where feeder sectioning is required but fault interruption capability is not. Traditionally, a sectionalizer is applied as the last automatic device on the feeder and is set to coordinate with the circuit breaker (CB). The operating principle for sectionalizers and fault location [26] is illustrated using a simple example in Fig. 1. When a fault occurs, the CB is triggered and opened to isolate the substation from the faulted system. The fault line is then located by the fault location function of the protection system and is isolated by operating the normally closed (N.C.) sectionalizers. Next, the load may be transferred from the fault line to the normal feeder by the action of normally open (N.O.) sectionalizers. Finally, the system returns to normal operation and waits for maintenance personnel.

Fig. 1 Operating principle for sectionalizers and fault location.

In this case, accurate fault location and fault type analysis reflect the higher automation level of the distribution system. Fault type analysis and fault location can be combined to determine the specific area affected by the fault and the accurate fault diagnosis will make the loss caused by fault as small as possible. In addition, narrowing the data window can improve the performance of protection devices. A smaller time window can make the protection device obtain a faster response speed, which also means that the protection device needs to have a stronger information extraction ability. To realize an intelligent fault diagnostic model using global information for global judgment, a novel fault diagnostic model based on deep graph learning is proposed in this study. In addition, this study verifies the adaptability of the proposed model under different conditions.

III. Proposed Fault Diagnostic Model

In this section, the principles and structures of GNN and spatiotemporal graph convolutional network (STGCN) are introduced. The proposed fault diagnostic framework based on the STGCN and multi-task learning is then illustrated in detail.

A. Spectral Convolution on Graphs

The space-based GCN primarily originates from the convolutional operation of traditional CNNs [27]. A brief introduction to spectral graph theory is presented in [28]. Consider an undirected graph $G = (V, E)$ , where $V$ is the set of nodes; and $E$ is the set of edges. Its Laplacian matrix is defined as $L = D - A$ , where $A$ is the adjacency matrix of the graph, $D$ is a diagonal matrix, and $D_{i i} = \sum_{j} A_{i j}$ . If the edges have weights, $A$ is converted to $W$ , and the Laplacian matrix is converted to $L = D - W$ .

Note that $L$ has a series of important properties that give it a major role in spectral analysis. In addition, the traditional Fourier transform, which is structurally similar to graph convolution, is defined as:

F (ω) = F [f (t)] = \int f (t) e^{- j ω t} d t

(1)

where the time-domain signal $f (t)$ is converted into the frequency-domain signal $F (ω)$ by the basis function $e^{- j ω t}$ , which is the characteristic function (eigenfunction) of the Laplacian operator and satisfies $Δ e^{- j ω t} = \frac{\partial^{2}}{\partial t^{2}} e^{- j ω t} = - ω^{2} e^{- j ω t}$ . When the Laplacian operator is extended to the graph structure with $N$ nodes, the function $f$ is an $N$ -dimensional vector denoted as $f = [f_{1}, f_{2}, \dots, f_{N}]$ , where $f_{i}$ is the function value of $f$ at node $i$ . The gain between nodes $i$ and $j$ in the weighted graph is $w_{i j} (f_{i} - f_{j})$ , and the operation of the Laplacian operator at node $i$ is given as:

Δ f_{i} = \sum_{j} w_{i j} (f_{i} - f_{j}) = (\sum_{j} w_{i j}) f_{i} - \sum_{j} w_{i j} f_{j}

(2)

This holds for any $i = 1,2, \dots, N$ , and we have:

Δ f = [\begin{matrix} Δ f_{1} \\ Δ f_{2} \\ ⋮ \\ Δ f_{N} \end{matrix}] = [\begin{matrix} (\sum_{j} w_{1 j}) f_{1} - \sum_{j} w_{1 j} f_{j} \\ (\sum_{j} w_{2 j}) f_{2} - \sum_{j} w_{2 j} f_{j} \\ ⋮ \\ (\sum_{j} w_{N j}) f_{N} - \sum_{j} w_{N j} f_{j} \end{matrix}] = (D - W) f = L f

(3)
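The equivalence between the node-wise form in (2) and the matrix form in (3) can be checked numerically. The following is a minimal sketch using only the Python standard library; the three-node graph, its weights, and the signal values are hypothetical and chosen only for illustration.

```python
def laplacian(weights):
    """Return L = D - W for a symmetric weight matrix given as nested lists."""
    n = len(weights)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        degree = sum(weights[i])  # D_ii = sum_j w_ij
        for j in range(n):
            lap[i][j] = (degree if i == j else 0.0) - weights[i][j]
    return lap


def laplacian_at_node(weights, f, i):
    """Node-wise form (2): sum_j w_ij * (f_i - f_j)."""
    return sum(w * (f[i] - f_j) for w, f_j in zip(weights[i], f))


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Hypothetical weighted path graph 0 - 1 - 2 and a signal on its nodes.
W = [[0.0, 2.0, 0.0],
     [2.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
f = [1.0, 3.0, 0.0]

L = laplacian(W)
node_wise = [laplacian_at_node(W, f, i) for i in range(len(f))]
matrix_form = matvec(L, f)
print(node_wise, matrix_form)
```

Both computations give the same vector, which is what (3) states: applying the Laplacian operator node by node is the same as multiplying the signal by $L = D - W$.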

According to (2) and (3), the graph Fourier transform is constructed as:

F (λ_{k}) = \sum_{i = 1}^{N} f (i) u_{k} (i)

(4)

where $λ_{k}$ and $u_{k}$ are obtained from the eigen-analysis of the Laplacian matrix, $L u_{k} = λ_{k} u_{k}$ . Here, $u_{k} (i)$ is the $i^{th}$ element of $u_{k}$ . The eigenvector $u_{k}$ is a column of the orthogonal matrix $U$ , which is obtained by the eigenvalue decomposition of the Laplacian matrix $L = U Λ U^{T}$ , where $Λ$ is a diagonal matrix. The Fourier transform on graph $\hat{f}$ is represented by matrix multiplication as:

\hat{f} = [\begin{matrix} {\hat{f}}_{1} \\ {\hat{f}}_{2} \\ ⋮ \\ {\hat{f}}_{N} \end{matrix}] = [\begin{matrix} u_{1} (1) & u_{1} (2) & \dots & u_{1} (N) \\ u_{2} (1) & u_{2} (2) & \dots & u_{2} (N) \\ ⋮ & ⋮ & ⋱ & ⋮ \\ u_{N} (1) & u_{N} (2) & \dots & u_{N} (N) \end{matrix}] [\begin{matrix} f_{1} \\ f_{2} \\ ⋮ \\ f_{N} \end{matrix}]

(5)

Therefore, the matrix form of the graph Fourier transform can be expressed as $\hat{f} = U^{T} f$ , and its inverse transform is $f = U U^{T} f = U \hat{f}$ . Based on graph convolutional theory, many studies have conducted extensive research on reducing the computational burden and improving the performance [29]. The approximation of Chebyshev polynomials is utilized as a convolutional kernel, and the form of the graph convolution within the neural network is:

Z = D^{- \frac{1}{2}} A D^{- \frac{1}{2}} X θ

(6)

where $Z \in R^{N \times F}$ is the signal matrix after convolution; $X \in R^{N \times C}$ is the input signal, and $C$ is the feature dimension; and $θ \in R^{C \times F}$ is the parameter matrix of filter. The feature dimension $C$ is changed to $F$ by convolution. In this case, the graph convolution is transformed into a learning neural network $θ$ .
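As a small worked example of the transform pair $\hat{f} = U^{T} f$ and $f = U \hat{f}$ used in this subsection, consider a two-node graph joined by a single edge of unit weight. This example is an illustration added here under that assumption and is not taken from the test systems.

```latex
% Two-node graph with one unit-weight edge (illustrative assumption)
L = D - W =
\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix},
\qquad
\lambda_{1} = 0,\; u_{1} = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix},
\qquad
\lambda_{2} = 2,\; u_{2} = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}

% Forward transform: project the signal onto the eigenvectors
\hat{f} = U^{T} f
= \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
  \begin{bmatrix} f_{1} \\ f_{2} \end{bmatrix}
= \tfrac{1}{\sqrt{2}} \begin{bmatrix} f_{1} + f_{2} \\ f_{1} - f_{2} \end{bmatrix}

% Inverse transform recovers the original signal
U \hat{f}
= \tfrac{1}{2} \begin{bmatrix} (f_{1} + f_{2}) + (f_{1} - f_{2}) \\ (f_{1} + f_{2}) - (f_{1} - f_{2}) \end{bmatrix}
= \begin{bmatrix} f_{1} \\ f_{2} \end{bmatrix}
```

The component at $λ_{1} = 0$ carries the part of the signal common to both nodes, and the component at $λ_{2} = 2$ carries the difference across the edge.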

B. Spatiotemporal Graph Learning Method

STGCNs have already been used in human dynamic action recognition [30] and traffic prediction [31]. They can aggregate spatial and temporal features simultaneously during feature extraction. Spatiotemporal convolution is suitable for scenes with interrelated temporal and spatial features [32]. In the power network, the voltage, current, and other features present waveform variations along the time axis. Waveform variation is a major source of information, but GCN is not sensitive to information related to time variations. In the proposed method, the feature extraction of temporal information is combined with spatial feature aggregation. Temporal feature extraction and spatial feature aggregation are combined in a special structure, i.e., the spatiotemporal convolutional block, which greatly improves the processing of power information in the proposed method.

1) GCN

The structure and feature-update mode of GCN [33] are illustrated in Fig. 2.

Fig. 2 Structure and feature-update mode of GCN.

The node connection relationship is considered to be a constraint to update the node features in the graph. Each node in the graph updates its information at each feature update by gathering information from its neighbors, where the closer neighbors have a greater effect on node features. This forms the basic rule for node feature updates. The node feature update policy of GCN is given by:

H^{(l + 1)} = σ (\hat{A} H^{(l)} W^{(l)})

(7)

where $H^{(l)}$ is the feature information at layer $l$ ; $W^{(l)}$ is the neural network weight at layer $l$ ; $σ (\cdot)$ is the activation function of layer $l$ ; and $\hat{A}$ is the calculation rule derived from the original adjacency matrix $A$ . The transformation process of $\hat{A}$ is given as:

\hat{A} = {\tilde{D}}^{- \frac{1}{2}} \tilde{A} {\tilde{D}}^{- \frac{1}{2}}

(8)

where $\tilde{A} = A + I$ , and $I$ is the identity matrix; and ${\tilde{D}}_{i i} = \sum_{j} {\tilde{A}}_{i j}$ . Adding a self-loop $A + I$ integrates the node's own information. Introducing the degree matrix $\tilde{D}$ deals more effectively with multiple connection relationships. Normalization keeps the node feature values within a reasonable range. The processed adjacency matrix $\hat{A}$ reflects the rules of information flow and has a constant value during the calculation operation at any layer. Here, the feature mapping ability of the network derives from the learnable matrix $W^{(l)}$ of each layer. However, when the number of GCN layers increases, the features of nodes tend toward an average, which is called over-smoothing [34]. Therefore, significant restrictions are placed on the number of GCN layers, which limits the feature perception ability of GCN.
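The update rule in (7) and the normalization in (8) can be sketched in a few lines. This is a minimal standard-library illustration, not the implementation used in this study; the three-node path graph, the one-dimensional features, and the weight matrix are hypothetical, and ReLU is assumed as the activation $σ (\cdot)$.

```python
import math


def normalized_adjacency(adj):
    """A_hat = D_tilde^(-1/2) (A + I) D_tilde^(-1/2), as in (8)."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # D_tilde_ii = sum_j A_tilde_ij
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(a_hat, h, w):
    """One feature update H' = ReLU(A_hat H W), as in (7) with sigma = ReLU."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, x) for x in row] for row in z]


# Hypothetical path graph 0 - 1 - 2; only node 0 carries a nonzero feature.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
A_hat = normalized_adjacency(A)
H0 = [[1.0], [0.0], [0.0]]
W0 = [[1.0]]
H1 = gcn_layer(A_hat, H0, W0)
print(H1)
```

After one layer, the feature of node 0 has reached its neighbor node 1 but not node 2, which is two hops away. Stacking layers widens this reach, and it is also what drives node features toward an average when too many layers are used.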

2) Spatiotemporal Convolutional Block

To improve its feature perception ability, the STGCN contains a special spatiotemporal convolutional block. The temporal convolution module extracts the node features along the time axis before and after the spatial information aggregation based on GCN rules, as illustrated in Fig. 3. The temporal convolution can be defined as $\mathrm{conv}_{x, y} = \sum_{i = 1}^{p q} k_{i} v_{i}$ , where $(x, y)$ is the input feature size of the waveform; $p q$ is the size (number of weights) of the convolution kernel; $k$ is the weight of the convolution kernel; $v$ is the waveform feature; and $i$ is the index of the kernel weight and feature. The convolution process is the inner product of the convolution kernel and the input waveform feature.
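The inner-product view of the temporal convolution can be illustrated with a one-dimensional kernel sliding along the time axis. The sketch below is a simplified, standard-library illustration; the waveform and the kernel weights are hypothetical.

```python
def temporal_conv(waveform, kernel):
    """Slide the kernel along the time axis; each output is an inner product
    sum_i k_i * v_i between the kernel and the waveform segment under it."""
    k = len(kernel)
    return [sum(k_i * v_i for k_i, v_i in zip(kernel, waveform[t:t + k]))
            for t in range(len(waveform) - k + 1)]


# Hypothetical waveform samples and a difference-like kernel.
wave = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
kernel = [-1.0, 0.0, 1.0]
out = temporal_conv(wave, kernel)
print(out)
```

With no padding, a kernel of length 3 shortens a 6-point waveform to 4 output points, and each output responds to the local change of the waveform within its window.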

Fig. 3 Calculation of spatiotemporal convolutional block.

In the spatiotemporal convolutional block, convolutional kernels similar to one-dimensional convolution are utilized to extract waveform features. The kernels can scan all waveform features along the time axis and map them into new features with stronger expression. Different from the traditional GCN, the feature processing of STGCN is composed of a type of temporal-spatiotemporal structure, and the first temporal convolutional layer is expressed as:

H_{T 1}^{(l + 1)} = φ (\mathrm{ReLU} (h_{c 1}^{(l + 1)} + h_{c 2}^{(l + 1)} + h_{c 3}^{(l + 1)}))

(9)

h_{c 1}^{(l + 1)} = \mathrm{Sigmoid} (H^{(l)} * y_{c 1}^{(l + 1)})

(10)

h_{c 2}^{(l + 1)} = φ (H^{(l)} * y_{c 2}^{(l + 1)})

(11)

h_{c 3}^{(l + 1)} = H^{(l)} * y_{c 3}^{(l + 1)}

(12)

where $H_{T 1}^{(l + 1)}$ is the output feature of the first temporal convolutional layer; $H^{(l)}$ is the feature of layer $l$ in the GNN, and when $l = 0$ , $H^{(0)}$ indicates the input fault features of the model; $y_{c 1}^{(l + 1)}$ , $y_{c 2}^{(l + 1)}$ , and $y_{c 3}^{(l + 1)}$ are three convolutional kernels with the same shape; $φ (\cdot)$ represents the batch normalization operation; and $\mathrm{Sigmoid} (\cdot)$ and $\mathrm{ReLU} (\cdot)$ are the activation functions. After the first temporal convolutional layer, the spatial convolutional layer is expressed as:

H_{s}^{(l + 1)} = \mathrm{ReLU} (\hat{A} g^{(l + 1)} W_{5}^{(l + 1)})

(13)

g^{(l + 1)} = g_{1}^{(l + 1)} ∥ g_{2}^{(l + 1)} ∥ g_{3}^{(l + 1)} ∥ g_{4}^{(l + 1)}

(14)

g_{1}^{(l + 1)} = \mathrm{ReLU} (H_{T 1}^{(l + 1)} W_{1}^{(l + 1)} + b_{1}^{(l + 1)})

(15)

g_{2}^{(l + 1)} = \mathrm{ReLU} (H_{T 1}^{(l + 1)} W_{2}^{(l + 1)} + b_{2}^{(l + 1)})

(16)

g_{3}^{(l + 1)} = \mathrm{ReLU} (H_{T 1}^{(l + 1)} W_{3}^{(l + 1)} + b_{3}^{(l + 1)})

(17)

g_{4}^{(l + 1)} = \mathrm{ReLU} (H_{T 1}^{(l + 1)} W_{4}^{(l + 1)} + b_{4}^{(l + 1)})

(18)

where $H_{s}^{(l + 1)}$ is the output feature of the spatial convolutional layer; $W_{1}^{(l + 1)}$ - $W_{5}^{(l + 1)}$ and $b_{1}^{(l + 1)}$ - $b_{4}^{(l + 1)}$ represent the learnable parameters; and the sign $∥$ denotes the feature splicing (concatenation) operation. After the spatial convolutional layer, the temporal convolutional layer with the same structure as the first temporal convolutional layer further extracts data features. The feature operation process is expressed as:

H^{(l + 1)} = φ (\mathrm{ReLU} (h_{s 1}^{(l + 1)} + h_{s 2}^{(l + 1)} + h_{s 3}^{(l + 1)}))

(19)

h_{s 1}^{(l + 1)} = \mathrm{Sigmoid} (H_{s}^{(l + 1)} * y_{s 1}^{(l + 1)})

(20)

h_{s 2}^{(l + 1)} = φ (H_{s}^{(l + 1)} * y_{s 2}^{(l + 1)})

(21)

h_{s 3}^{(l + 1)} = H_{s}^{(l + 1)} * y_{s 3}^{(l + 1)}

(22)

where $H^{(l + 1)}$ is the output feature from the last temporal convolutional layer, which is also a feature of the $l + 1$ layer in the GNN; and $y_{s 1}^{(l + 1)}$ , $y_{s 2}^{(l + 1)}$ , and $y_{s 3}^{(l + 1)}$ are three convolutional kernels with the same shape.
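The three-branch structure of the temporal convolutional layers in (9)-(12) and (19)-(22) can be sketched for a single waveform channel as follows. This is a simplified illustration under stated assumptions, not the implementation used in this study: batch normalization $φ (\cdot)$ is replaced by a plain standardization over the sequence without learnable scale and shift, and the kernel weights and the 50 Hz test waveform are hypothetical.

```python
import math


def conv1d(x, kernel):
    """Valid one-dimensional convolution (sliding inner product) along time."""
    k = len(kernel)
    return [sum(w * v for w, v in zip(kernel, x[t:t + k]))
            for t in range(len(x) - k + 1)]


def standardize(x, eps=1e-5):
    """Stand-in for batch normalization: zero mean, unit variance (assumption)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def temporal_layer(h, y1, y2, y3):
    """Three parallel branches over the same input, then ReLU and normalization."""
    h1 = [sigmoid(v) for v in conv1d(h, y1)]   # Sigmoid branch, as in (10)
    h2 = standardize(conv1d(h, y2))            # normalized branch, as in (11)
    h3 = conv1d(h, y3)                         # linear branch, as in (12)
    summed = [max(0.0, a + b + c) for a, b, c in zip(h1, h2, h3)]  # ReLU
    return standardize(summed)                 # outer normalization, as in (9)


# Hypothetical input: 21 samples of a 50 Hz sine sampled at 1 kHz.
wave = [math.sin(2.0 * math.pi * 50.0 * t / 1000.0) for t in range(21)]
y1, y2, y3 = [0.5, 0.0, -0.5], [0.2, 0.6, 0.2], [-1.0, 0.0, 1.0]
out = temporal_layer(wave, y1, y2, y3)
print(len(out))
```

The three kernels have the same shape, so the branch outputs have equal length and can be added element by element before the ReLU and normalization.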

C. Graph Structure of Distribution Network

Converting the power network into a graph structure is essential to the application of the GNN. A power network consists of transmission lines, power equipment, and buses. The graph is composed of edges and nodes. During the operation of GNN, the node message flows through the connection relationships of the edges. In the power network, the effects of state changes on the buses are also carried over to other buses and electrical equipment along the lines. In this case, the relationship between the power line and bus can be transformed into the relationship between the edge and node. The bus voltages, branch currents, and line impedances are the basic features of the power network. With buses as nodes and lines as edges, distribution networks can be converted into graph structures. Voltage can be considered as the node feature, and the current and line impedances as edge features. However, it is not easy to obtain accurate line impedance on a large scale. In addition, various distribution networks exhibit different impedance characteristics, and a fixed impedance feature would reduce the generalization performance of the model. Therefore, the line impedance feature is not utilized in the proposed model. In addition, the current value that flows into the node is considered as a node feature. In this case, the information of each node includes the voltage and current features converging to the bus. The features at node $i$ are denoted by $N_{(V_{i}, I_{i})} = (V_{i 1}, V_{i 2}, V_{i 3}, I_{i 1}, I_{i 2}, I_{i 3}) \in R^{6}$ , where 1, 2, and 3 represent the three phases; and $i \in \{1, 2, \dots, m\}$ indexes the $m$ observable nodes. All features are root mean square (RMS) values and no features exist at the edges. The input feature $H^{(0)}$ of the GNN model is the input node feature $N_{(V, I)}$ . The objective of the fault diagnostic model is to obtain the fault type and location information $N_{(T, L)}$ through the observed node information $N_{(V, I)}$ . In addition, the topology structure of the distribution network is also an input of the fault diagnostic model for embedding physical information. The function of the fault diagnostic model is expressed as:

N_{(T, L)} = f (G, N_{(V, I)})

(23)

where $f (G, N_{(V, I)})$ represents the fault diagnostic model. Note that $G$ is the connection relationship of the nodes, which is derived from the structure of the distribution network, and $G$ specifies the feature update rules for the graph convolutional layer. The model $f (G, N_{(V, I)})$ maps the fault features to the fault type and fault location information. The algorithm in this study is constructed using the Deep Graph Library (DGL) and the PyTorch deep learning framework.
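The two inputs of $f (G, N_{(V, I)})$ can be sketched as plain data structures: a graph with buses as nodes and lines as edges, and a six-dimensional feature per node. The sketch below uses only the Python standard library rather than a graph learning library; the four-bus feeder and the per-unit RMS values are hypothetical, and each line is stored in both directions so that information can pass either way along it.

```python
def build_graph(num_buses, lines):
    """Buses are nodes and lines are edges; each line is stored in both
    directions so that messages can pass either way along it."""
    neighbors = {bus: [] for bus in range(1, num_buses + 1)}
    for u, v in lines:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return neighbors


def node_feature(v_rms, i_rms):
    """Six-dimensional node feature (V1, V2, V3, I1, I2, I3) of RMS values."""
    if len(v_rms) != 3 or len(i_rms) != 3:
        raise ValueError("expected three phase voltages and three phase currents")
    return tuple(v_rms) + tuple(i_rms)


# Hypothetical four-bus feeder: bus 2 branches to buses 3 and 4.
lines = [(1, 2), (2, 3), (2, 4)]
graph = build_graph(4, lines)

# Hypothetical per-unit RMS measurements, identical at every bus for brevity.
features = {bus: node_feature((1.0, 1.0, 1.0), (0.1, 0.1, 0.1)) for bus in graph}
print(graph[2], features[2])
```

No line impedance appears anywhere in this structure, which mirrors the choice to leave edges without features.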

D. Proposed Fault Diagnostic Framework

In this study, a fault diagnostic framework based on the STGCN structure is proposed to combine fault type analysis and fault location in the distribution network. During the fault process, the fault impact spreads from the fault point to the entire graph based on the connection between nodes and edges. Different fault types generate different fault waveforms and affect the surrounding nodes. The spatiotemporal convolutional block can effectively perceive differences in node information and generate stronger feature expressions. In this case, the proposed fault diagnostic framework outputs accurate fault diagnostic results. The structure of the proposed fault diagnostic framework using a simple circuit as an example is shown in Fig. 4.

Fig. 4 Structure of proposed fault diagnostic framework.

1) Multi-task Learning

Multi-task learning [35] can improve the generalization performance of neural networks by weighing the input information for different tasks. The proposed fault diagnostic framework outputs the fault location and fault type. The faults have the greatest effects on the nodes directly connected to the fault position, so that the task of fault location depends more on the structural information of the network. With the increasing distance of the node, the effect of fault gradually decreases. However, the fault type analysis requires more fault information about waveform feature changes. In this case, a shared spatiotemporal convolutional block is applied to extract the basic features of the proposed fault diagnostic framework. In addition, independent spatiotemporal convolutional blocks are constructed to further aggregate the fault features required by both tasks. Finally, two specially constructed classifiers are used to output the network classification results. It should be noted that the parallel structure $W_{i}$ is utilized in the fault diagnostic framework shown in Fig. 4. This structure can expand the model capacity and strengthen its feature-mapping ability.
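One common way to train two task heads on a shared feature extractor is to add the losses of the two tasks. The sketch below illustrates this idea with a weighted sum of two cross-entropy terms. The loss form, the weights `w_loc` and `w_type`, and the logits are assumptions made for illustration, since the training loss is not specified in this section.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class."""
    return -math.log(softmax(logits)[target])


def multitask_loss(loc_logits, loc_target, type_logits, type_target,
                   w_loc=1.0, w_type=1.0):
    """Weighted sum of the location loss and the fault type loss (assumed form)."""
    return (w_loc * cross_entropy(loc_logits, loc_target)
            + w_type * cross_entropy(type_logits, type_target))


# Hypothetical untrained outputs: uniform logits over 32 lines and 10 types.
loss = multitask_loss([0.0] * 32, 5, [0.0] * 10, 2)
print(loss)
```

With uniform outputs, the combined loss equals $\ln 32 + \ln 10 = \ln 320$, which is a convenient sanity check for an untrained model.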

2) Classifier and Label Design

Two independent classifiers are designed for the two tasks because of different feature requirements of fault location and fault type analysis. In the fault location task, the output depends mainly on the spatial information of the nodes.

The positional relationship between the input features and output results is relatively regular. Therefore, the classifier for the fault location task is constructed using fully connected layers. However, faults can occur at any line in the distribution network. Therefore, the classification of fault types requires a classifier to identify valid mapping features among all nodes. CNN can search for effective features regardless of their positions in the input information. Therefore, a CNN is utilized to construct the fault type classifier for the features extracted by STGCN. The fault directly affects the edge and then severely affects the nodes connected to the edge. Therefore, in the fault location task, two nodes can be used to indicate that a fault occurs on the edge between them. In this case, the two nodes with the largest activation values correspond to the two buses most severely affected by the fault, which indicates that the fault occurs between these two buses. In the fault type analysis, the information extracted from the blocks will be identified by a CNN-based classifier. This can be considered as a common multi-classification task.
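The two-node labeling rule for fault location can be sketched as an encode/decode pair. This is a minimal illustration; the multi-hot label layout, the five-bus example, and the activation values are assumptions, and the decoder does not check that the two selected buses are actually connected by a line.

```python
def encode_location(num_nodes, line):
    """Mark the two end buses (1-based) of the faulted line in a node-wise label."""
    u, v = line
    label = [0.0] * num_nodes
    label[u - 1] = 1.0
    label[v - 1] = 1.0
    return label


def decode_location(activations):
    """Return the two buses (1-based) with the largest activation values."""
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    return tuple(sorted(i + 1 for i in order[:2]))


# Hypothetical five-bus example: the fault lies on the line between buses 3 and 4.
label = encode_location(5, (3, 4))
scores = [0.05, 0.10, 0.80, 0.70, 0.20]
print(label, decode_location(scores))
```

A practical decoder would likely also verify that the selected pair corresponds to an existing line and fall back to the best-scoring valid line otherwise.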

IV. Case Study

In this section, the fault data are obtained from simulated test systems, the structure of the proposed spatiotemporal convolutional block is introduced in detail, and the diagnostic results of the model are presented and analyzed. In addition, different conditions such as fault resistance changes, topological changes, and data interference are considered to verify the performance of the proposed model.

A. Simulation and Data Collection

To obtain labeled transient fault samples, a simulated distribution network model based on the IEEE 33-bus test system [36] is constructed to collect fault data. MATLAB/Simulink is used as the simulation software. The standard test system consists of 32 three-phase lines and loads. The graph structure of the standard test system is constructed according to the relationship between the bus and line, as shown in Fig. 5.

Fig. 5 Graph structure of IEEE 33-bus test system for fault diagnosis.

Because of the mutual influence between buses, the information transmission on the edge is set as bidirectional. In addition, because the current information in the distribution network is attributed to node information, the edges in the graph only play the role of indicating the direction of information transmission during this task. In this case, the network uses three-phase voltage and current data of all nodes for global fault diagnosis.

The fault location task can be regarded as a 32-class classification task because of the 32 lines in the test system. In addition, each fault point contains 10 types of faults, and therefore the fault type analysis can be regarded as a 10-class classification task. Thus, the total number of output categories of the proposed model is $32 \times 10 = 320$ . When the fault occurs, the three-phase voltage and current on the bus are collected. Taking a fault on the line between bus 9 and bus 10 as an example, the voltage waveforms of different fault types are shown in Fig. 6.

Fig. 6 Voltage waveforms of different fault types. (a) Phase-A-to-ground (AN). (b) Phase-A-to-phase-B (AB). (c) Two-phases-to-ground (ABN). (d) Three phases (ABC).

The proposed model can accept a data window with fewer sampling points instead of complete waveforms. In this study, the transient metallic faults are implemented to verify the proposed fault diagnostic model. During the simulation, the total sampling time after the faults is 0.05 s.

The sampling frequency is 1 kHz, and the original fault data contain 50 time points. After downsampling, each fault sample contains 21 time points. The entire fault waveform is resampled repeatedly in steps at each sampling point. The 21 time points indicate that each fault sample for fault diagnosis covers 0.02 s of fault data. In this case, the proposed model can utilize any 0.02 s of fault data within 0.05 s after the fault to determine the fault type and location. The length of the downsampling window determines the amount of fault information contained in each sample. With a longer data window, the sample contains more fault information, and the difficulty of fault diagnosis is reduced. Twenty-five sub-samples of each original fault sample are obtained for training, and the number of samples for each fault resistance condition is $32 \times 10 \times 25 = 8000$ , where 70% of the samples are selected for model training, and the remaining data are used for model testing. Because the subsampling process of fault data is suitable for real-time operation, the fault diagnostic model can be applied to real-time fault diagnosis. The parameters of fault states for training are listed in Table I.
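The windowing and sample-count arithmetic can be sketched as follows. How the 25 windows are positioned within the 50 recorded points is not stated, so taking the first 25 start offsets with a step of one sample is an assumption, as is the helper name `sliding_windows`.

```python
from typing import List, Sequence


def sliding_windows(
    waveform: Sequence[float],
    window: int = 21,
    n_windows: int = 25,
    step: int = 1,
) -> List[List[float]]:
    """Cut fixed-length sub-samples out of one recorded waveform."""
    last_start = len(waveform) - window
    # Assumption: use the earliest n_windows start offsets.
    starts = list(range(0, last_start + 1, step))[:n_windows]
    if len(starts) < n_windows:
        raise ValueError("waveform too short for the requested windows")
    return [list(waveform[s:s + window]) for s in starts]


recording = [float(k) for k in range(50)]   # placeholder for one 50-point channel
subs = sliding_windows(recording)
print(len(subs), len(subs[0]))              # 25 21

n_samples = 32 * 10 * len(subs)             # lines x fault types x sub-samples
n_train = round(0.7 * n_samples)
print(n_samples, n_train, n_samples - n_train)  # 8000 5600 2400
```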

TABLE I Parameters of Fault States for Training

Fault parameter	Values or types
Fault type	AN(1), BN(2), CN(3), AB(4), AC(5), BC(6), ABN(7), CAN(8), BCN(9), ABC(10)
Fault position	Midpoint of 32 lines
Fault resistance (Ω)	0.01, 0.1, 0.5, 1, 2, 5, 10, 15, 20, 50, 100, 150, 200, 300, 400, 500, 600
Operating load	Basic load

B. Construction of Spatiotemporal Convolutional Blocks

In the proposed fault diagnostic model, the per-unit values of the voltage and current are directly utilized as the input of the model without other data processing. The structure and parameters of the proposed spatiotemporal convolutional block are shown in Fig. 7, where W₁-W₄ are four groups of learnable parameters that can convert the dimensions of the features from 64 to 128; and W₅ causes a feature to return to its original dimension.

Fig. 7 Proposed spatiotemporal convolutional block.

To improve the performance of the model, parallel structures and regularization techniques are implemented in both temporal and spatial convolutions. For example, the data shape of the fault samples is (33, 21, 6), where 33 indicates that the number of buses is 33; 21 indicates that each sample contains 21 time points; and 6 indicates the number of features.

In the forward processing of features through a single spatiotemporal convolutional block, the samples first pass through the temporal convolution. The 2D convolutional kernel size is (1, 3), and the number of filter channels is 64. Because the convolutional kernels must scan the waveform information along the time axis, it is necessary to adjust the dimensions of the sample features. Two temporal convolutions and one spatial convolution form a spatiotemporal convolutional block. Multi-channel feature extraction makes it easier for the network to capture key fault features. The two types of convolutional layers are combined to extract the fault features in greater depth. In this case, the feature extraction capability of STGCN is improved by the extended learnable parameters.
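The effect of a (1, 3) temporal kernel on the sample shape can be sketched in plain Python. This is only an illustration of a convolution along the time axis that is shared by all nodes: the absence of padding, bias, activation, and gating is an assumption, the weights are random placeholders, and `temporal_conv` is a hypothetical helper rather than the block shown in Fig. 7.

```python
import random
from typing import List

Tensor3 = List[List[List[float]]]  # indexed [node][time][channel]


def temporal_conv(x: Tensor3, weights: Tensor3, kernel: int = 3) -> Tensor3:
    """Unpadded convolution along the time axis, applied to every node.

    weights[k][c_in][c_out] holds one (1, kernel) filter bank.
    """
    n_time = len(x[0])
    c_in = len(x[0][0])
    c_out = len(weights[0][0])
    out = []
    for node in x:
        node_out = []
        for t in range(n_time - kernel + 1):
            row = [0.0] * c_out
            for k in range(kernel):
                frame = node[t + k]
                for ci in range(c_in):
                    value = frame[ci]
                    w = weights[k][ci]
                    for co in range(c_out):
                        row[co] += value * w[co]
            node_out.append(row)
        out.append(node_out)
    return out


rng = random.Random(0)
# One fault sample of shape (33 buses, 21 time points, 6 features).
sample = [[[rng.uniform(-1.0, 1.0) for _ in range(6)]
           for _ in range(21)] for _ in range(33)]
# Placeholder filter bank: kernel 3, 6 input channels, 64 output channels.
filters = [[[rng.gauss(0.0, 0.1) for _ in range(64)]
            for _ in range(6)] for _ in range(3)]
features = temporal_conv(sample, filters)
print(len(features), len(features[0]), len(features[0][0]))  # 33 19 64
```

Without padding, each temporal convolution shortens the time axis by two points; a padded variant would keep all 21 points.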

C. Experiment Results with Fault Resistance of 0.1 Ω Condition

The proposed fault diagnostic model has two main functions: fault type analysis and fault location. The accuracy of the proposed model during the training process is illustrated in Fig. 8, where Train-acc and Test-acc are the accuracies of the model on the training and test datasets during model training, respectively. It can be observed that the accuracy of the proposed fault diagnostic model for both fault location and fault type analysis exceeds 98%, which demonstrates the effectiveness of the proposed model. Therefore, the proposed fault diagnostic model can accurately perform fault classification and location tasks.

Fig. 8 Accuracy of proposed model for training process. (a)　Fault type analysis. (b)　Fault location.

In addition, GCN [24], CNN [18], and principal component analysis SVM (PCA-SVM) [17] are used for comparative validation, as shown in Table II.

TABLE II Comparison Results of Different Models

Model	Accuracy
Model	Fault type analysis	Fault location
Proposed	0.999	0.992
GCN [24]	0.973	0.917
CNN [18]	0.899	0.981
PCA-SVM [17]	0.920	0.870

The results show that the proposed model has a better performance than other models. The improved spatiotemporal convolution block can significantly improve the feature extraction ability of the model.

In addition, the outputs of the penultimate layer from different models are extracted to represent the feature space for the tasks. We utilize t-distributed stochastic neighbor embedding (t-SNE) [37] to show the distribution of the feature space. Fig. 9 shows that the proposed model achieves better feature extraction results than GCN.

Fig. 9 Visualization of output of each task from GCN and proposed model. (a)　Visualization of fault type output from GCN. (b) Visualization of fault type output from proposed model. (c) Visualization of fault location from GCN. (d) Visualization of fault location from proposed model.

As shown in the two-dimensional space, the feature outputs of the proposed model have a more reasonable and accurate distribution, illustrating that the proposed model has higher accuracy and better generalization performance. Fig. 9(a) and (c) show that the outputs of GCN contain many confused samples, which lead to the incorrect fault diagnoses reflected in Table II. In Fig. 9(a), samples of different fault types are obviously confused.

In Fig. 9(c), the features of different fault locations are scattered and mixed with one another. In Fig. 9(d), the feature output of the proposed model is more concentrated for different fault locations, meaning that the proposed model has a better feature extraction ability and fault diagnosis performance.

Figure 10 shows the test accuracies of the proposed model and CNN during the training process. The proposed model reaches a markedly better performance soon after parameter initialization. In addition, the proposed model converges faster than the traditional algorithms, which significantly reduces the calculation costs. This means that the effective use of physical structural information significantly improves the feature processing ability and training efficiency of the model. The improvement in training efficiency is also of great significance for the practical application of deep learning models in relay protection devices, because it makes online model updating using power operation data more feasible.

Fig. 10 Test accuracies of proposed model and CNN for training process. (a)　Fault type analysis. (b)　Fault location.

D. Performance Verification for Signal Interference

In practice, data acquisition devices are affected by different levels of noise or data loss depending on their working environments. In addition, changes in load also affect the performance of the fault diagnostic model. Therefore, the generalization ability is a core feature of the fault diagnostic model. In our study, different interference factors are considered to verify the generalization and anti-interference performance of the model.

1)　Effectiveness of Proposed Model Under Different Noise Conditions

Electrical measurements are easily influenced by electromagnetic interference and other environmental factors. In this study, Gaussian white noise is used to simulate the interference of environmental factors. The signal-to-noise ratios (SNRs) [38] of the data are set as 10, 15, 20, 25, 30, 35, and 40 dB, so the model is studied under seven noisy environments. The effects of different SNRs on the waveform of bus 9-bus 10 under AN fault are shown in Fig. 11.
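A minimal sketch of this noise injection is given below. It assumes the usual power-ratio definition $\mathrm{SNR} = 10 \log_{10}(P_{\text{signal}} / P_{\text{noise}})$ and a hypothetical clean test signal; the helper names `add_white_noise` and `measured_snr_db` are illustrative.

```python
import math
import random
from typing import List, Sequence


def add_white_noise(
    signal: Sequence[float], snr_db: float, rng: random.Random
) -> List[float]:
    """Add zero-mean Gaussian noise whose expected power gives the target SNR."""
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean: Sequence[float], noisy: Sequence[float]) -> float:
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)
# Hypothetical clean channel: 50 Hz sinusoid sampled at 1 kHz.
clean = [math.sin(2.0 * math.pi * 50.0 * k / 1000.0) for k in range(20000)]
for snr_db in (10, 15, 20, 25, 30, 35, 40):
    noisy = add_white_noise(clean, snr_db, rng)
    print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

The long test signal is used only so that the measured SNR is close to the target; for a 21-point window the realized SNR of a single sample fluctuates more.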

Fig. 11 Effects of different SNRs on waveform of bus 9-bus 10 under AN fault. (a) SNR is 35 dB. (b) SNR is 25 dB. (c) SNR is 15 dB. (d) SNR is 10 dB.

The test accuracies for fault type analysis and fault location of the proposed model, GCN, and PCA-SVM under different SNRs are shown in Fig. 12(a) and (b), respectively. It can be observed that the traditional PCA-SVM is severely disturbed by noise, whereas the proposed model can reach satisfactory performance under the effects of noise.

Fig. 12 Test accuracies for fault type analysis and fault location of proposed model, GCN, and PCA-SVM under different SNRs. (a) Fault type analysis. (b) Fault location.

2)　Effectiveness of Proposed Model Under Different Outlier Conditions

Typically, the sampled signal may contain outliers due to inaccurate measurements and interference. Thus, resisting the interference of abnormal values is a major requirement of the fault diagnostic model. Outliers are simulated by multiplying the standard measurements by random numbers between 0.7 and 1.3. The proportions of outliers are set as 1%, 2%, 5%, 10%, and 20% of the total sampled data, and the model is verified under these five outlier conditions. The effect of different outlier rates on the waveform of bus 9-bus 10 under AN fault is shown in Fig. 13.
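The outlier simulation can be sketched as follows. The uniform distribution of the multiplier and the random choice of corrupted positions are assumptions, since only the range 0.7 to 1.3 and the outlier rates are stated; `inject_outliers` is a hypothetical helper.

```python
import random
from typing import List, Sequence


def inject_outliers(
    samples: Sequence[float],
    rate: float,
    rng: random.Random,
    low: float = 0.7,
    high: float = 1.3,
) -> List[float]:
    """Scale a randomly chosen fraction of the samples by a factor in [low, high]."""
    n_outliers = round(rate * len(samples))
    corrupted = list(samples)
    for idx in rng.sample(range(len(samples)), n_outliers):
        corrupted[idx] *= rng.uniform(low, high)
    return corrupted


rng = random.Random(0)
measurements = [1.0] * 1000            # placeholder per-unit measurements
for rate in (0.01, 0.02, 0.05, 0.10, 0.20):
    corrupted = inject_outliers(measurements, rate, rng)
    changed = sum(1 for a, b in zip(measurements, corrupted) if a != b)
    print(f"rate={rate:.2f} changed={changed}")
```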

Fig. 13 Effect of different outlier rates on waveform of bus 9-bus 10 under AN fault. (a) Outlier rate is 10%. (b) Outlier rate is 20%.

The test accuracies for fault type analysis and fault location of the proposed model, GCN, and PCA-SVM under different outlier rates are shown in Fig. 14(a) and (b), respectively. It can be observed that the accuracy of traditional machine learning models decreases significantly as the outlier rate increases. However, the proposed model can resist the interference of outliers and maintain its original high performance.

Fig. 14 Test accuracies for fault type analysis and fault location of proposed model, GCN, and PCA-SVM under different outlier rates. (a) Fault type analysis. (b) Fault location.

3)　Effectiveness of Proposed Model Under Different Data Missing Rates

In practice, smart meters package and upload the collected information. In this process, missing data cannot be ignored because of network and equipment factors. Therefore, the information of individual nodes is likely to be missing. When a meter fails to upload data, the voltage and current values collected by that meter cannot be obtained by the model, and the model must diagnose the fault using the remaining information. During the verification, the probability of data missing at each node is set to be 0.5%, 1%, 2%, 5%, and 10%. In the worst case with a data missing rate of 10%, each data window has only a $(1 - 10\%)^{33} \approx 3.1\%$ probability of containing complete original data. The test accuracies for fault type analysis and fault location of the proposed model, GCN, and CNN under different data missing rates are shown in Fig. 15(a) and (b), respectively. It can be observed that the proposed model has significant advantages in tolerating missing data. When the data missing rate increases gradually, the performances of all models diminish to different degrees. Compared with the original condition, CNN is seriously affected by missing data, possibly because CNN cannot effectively use the structural information of the data. By contrast, GCN shows good robustness when the input data are partly missing, which indicates that the use of topology information helps the model resist anomalies in the input data. The proposed model has better fault diagnostic performance under data missing conditions.

Fig. 15 Test accuracies for fault type analysis and fault location of proposed model, GCN, and PCA-SVM under different data missing rates. (a) Fault type analysis. (b) Fault location.

It can be observed from Fig. 15 that CNN is more affected by missing values because it relies more on the numerical information of the data, which makes it greatly affected by abnormal values. The graph-based models embed topological information into the feature extraction process, which enables the model to extract deeper data structural information and makes it more resistant to abnormal values. In the worst case with data missing rate of 10%, the proposed model could still achieve a high performance, which is of great significance for practical relay protection systems.
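The probability of an intact data window and a simple node-level masking scheme can be sketched as follows. Independent missing events per node follow from the stated formula; representing missing data by zero-filling the whole node is an assumption, and both helper names are hypothetical.

```python
import random
from typing import List

Window = List[List[List[float]]]  # indexed [node][time][feature]


def prob_complete_window(p_missing: float, n_nodes: int = 33) -> float:
    """Probability that no node is missing when nodes fail independently."""
    return (1.0 - p_missing) ** n_nodes


def mask_missing_nodes(
    window: Window, p_missing: float, rng: random.Random, fill: float = 0.0
) -> Window:
    """Replace all measurements of each missing node by a fill value."""
    masked = []
    for node in window:
        if rng.random() < p_missing:
            masked.append([[fill] * len(frame) for frame in node])
        else:
            masked.append([list(frame) for frame in node])
    return masked


for p in (0.005, 0.01, 0.02, 0.05, 0.10):
    print(f"missing rate {p:.1%}: complete window {prob_complete_window(p):.1%}")
# The last line reports 3.1% for a 10.0% missing rate.
```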

4)　Effectiveness of Proposed Model Under Various Load Conditions

Various load conditions are simulated by multiplying the basic load value by a random number between 0.7 and 1.3. The test accuracy for fault type analysis and fault location of the proposed model under various load conditions is shown in Fig. 16. The random number is changed for each fault line in the simulation, and the experiment is repeated 10 times. Therefore, the load changes $10 \times 32 = 320$ times, which adequately simulates the load variation within a certain range. It can be observed from Fig. 16 that the load change in the normal range has little effect on the proposed fault diagnostic model.

Fig. 16 Test accuracies for fault type analysis and fault location of proposed model under various load conditions.

E. Performance Verification Under Different Fault Resistances

In an actual distribution system, the fault resistances are unknown and may differ from the fault conditions during model training. In this case, fault diagnostic models need to have a good adaptability to untrained fault resistance conditions. To further verify the effectiveness of the model under different unknown fault resistance conditions, the model trained with a 0.1 Ω fault resistance is directly applied to other fault resistance conditions. In addition, CNN and GCN are compared to verify the effectiveness of the proposed model. The test results of the proposed model under different fault resistances are presented in Table III.

TABLE III Results Under Different Fault Resistances in Training Case of 0.1 Ω Fault Resistance

Fault resistance (Ω)	Accuracy of proposed model		Accuracy of CNN		Accuracy of GCN
Fault resistance (Ω)	Fault type analysis	Fault location	Fault type analysis	Fault location	Fault type analysis	Fault location
0.01	0.998	0.989	0.894	0.982	0.975	0.924
0.1 (train)	0.999	0.992	0.899	0.981	0.973	0.917
0.5	0.999	0.992	0.894	0.981	0.973	0.915
1	0.997	0.992	0.893	0.983	0.974	0.908
2	0.997	0.992	0.891	0.983	0.969	0.884
5	0.993	0.993	0.883	0.981	0.970	0.831
10	0.990	0.991	0.869	0.976	0.971	0.739
15	0.988	0.986	0.852	0.965	0.955	0.640
20	0.981	0.975	0.835	0.955	0.932	0.584
50	0.941	0.897	0.693	0.879	0.789	0.414

Table III shows that all the models perform in a manner similar to that of the training situation under low fault resistance conditions. When the fault resistance increases gradually, the performance of all models decreases to various degrees. Table III shows that the fault classification performance of CNN and the fault location performance of GCN are both significantly affected by the increase in fault resistance. Due to the spatiotemporal convolutional structure and a series of parallel structures, the proposed model has a stronger feature extraction ability, which means it can be effectively applied to different fault resistance conditions after being trained under 0.1 Ω fault resistance condition. The performance of the proposed model is the least affected by fault resistance changes. In other words, the proposed model exhibits a stronger generalization performance under different fault resistance conditions.

In an actual distribution system, the performance of the proposed model needs to be generalized to a specific range of fault resistance. To verify the effectiveness of the proposed model under a specific fault resistance range, the proposed model is trained with 0.01 Ω+50 Ω fault resistance and tested in the range from 0.01 Ω to 50 Ω. In addition, CNN and GCN are used in comparative experiments to verify the effectiveness of the proposed model.

Table IV shows that the performances of CNN and GCN decrease significantly when the fault resistance increases. Even when the 50 Ω fault resistance is included in the training conditions, CNN and GCN cannot adapt to the entire fault resistance range because of their insufficient feature extraction ability. This indicates that the embedding of topological information and the designed processing of waveform features are effective for the fault diagnostic model.

TABLE IV Results Under Different Fault Resistances in Training Case of 0.01 Ω+50 Ω Fault Resistance

Fault resistance (Ω)	Accuracy of proposed model		Accuracy of CNN		Accuracy of GCN
Fault resistance (Ω)	Fault type analysis	Fault location	Fault type analysis	Fault location	Fault type analysis	Fault location
0.01 (train)	0.998	0.991	0.893	0.981	0.972	0.910
0.1	0.997	0.991	0.894	0.981	0.974	0.901
0.5	0.999	0.992	0.894	0.979	0.973	0.901
1	0.999	0.993	0.896	0.979	0.974	0.900
2	0.999	0.992	0.896	0.978	0.973	0.871
5	0.998	0.995	0.896	0.977	0.975	0.844
10	0.998	0.996	0.895	0.979	0.973	0.799
15	0.998	0.996	0.893	0.979	0.971	0.765
20	0.999	0.996	0.893	0.981	0.969	0.763
50 (train)	0.996	0.994	0.890	0.945	0.968	0.802

The higher fault impedances indicate weaker fault features, which affect the performance and generalization ability of the fault diagnostic model. To verify the performance and generalization ability of the proposed model under a high-impedance fault, fault data with a 300 Ω fault resistance are selected for model training, and 100-600 Ω fault resistances are selected for model testing. The comparative results of the proposed model, CNN, and GCN experiments are presented in Table V.

TABLE V Results Under Different Fault Resistances in Training Case of 300 Ω Fault Resistance

Fault resistance (Ω)	Accuracy of proposed model		Accuracy of CNN		Accuracy of GCN
Fault resistance (Ω)	Fault type analysis	Fault location	Fault type analysis	Fault location	Fault type analysis	Fault location
100	0.973	0.960	0.867	0.676	0.881	0.429
150	0.989	0.991	0.912	0.817	0.932	0.551
200	0.994	0.999	0.934	0.905	0.967	0.677
300 (train)	0.999	0.999	0.960	0.956	0.978	0.739
400	0.999	0.999	0.947	0.931	0.969	0.650
500	0.996	0.998	0.905	0.883	0.960	0.496
600	0.993	0.993	0.853	0.832	0.934	0.385

It can be observed from Table V that the proposed model outperforms the other models under different fault resistances. The accuracy of GCN cannot meet the requirements of diagnosis when the fault resistance increases. This may be because the representation ability of GCN is insufficient, which prevents it from extracting key features under different fault resistances. The proposed model has better fault diagnostic and generalization performance under high-resistance faults because of its stronger feature extraction ability and physical information embedding.

F. Performance Verification Under Different Topological Structures

In an actual distribution system, the topology of the distribution network can be changed because of different operating states. To test the performance of the proposed diagnostic model for topological changes, three types of mesh distribution networks and different fault resistance conditions are modeled. Three mesh topological structures are obtained from the original IEEE 33-bus topology by connecting different branches. It should be noted that to verify the effectiveness of the model on a weak mesh distribution network, the line impedance parameters in the experiments are five times that of the original system. The topology numbers and connection methods of three mesh topological structures are listed in Table VI.

TABLE VI Topology Numbers and Connection Methods of Three Mesh Topological Structures

Topology number	Connection method
G1	Connect bus 8-bus 21, bus 9-bus 15, bus 12-bus 22, bus 18-bus 33, and bus 25-bus 29
G2	Connect bus 9-bus 15, bus 12-bus 22, bus 18-bus 33, and bus 25-bus 29
G3	Connect bus 8-bus 21, bus 12-bus 22, bus 18-bus 33, and bus 25-bus 29

To verify the generalization performance of the proposed model for mesh topologies and topological changes, the data under the G1 topology are utilized as the training dataset, and the data under the G2 and G3 topologies are utilized as the verification dataset. The fault state parameters utilized for the collected data are listed in Table VII.

TABLE VII Fault State Parameters Utilized for Collected Data

Fault parameter	Values or types
Fault type	AN(1), BN(2), CN(3), AB(4), AC(5), BC(6), ABN(7), CAN(8), BCN(9), ABC(10)
Fault position	The midpoint of length at bus 1-bus 2, bus 2-bus 3, bus 3-bus 23, bus 6-bus 7, bus 8-bus 9, bus 11-bus 12, bus 14-bus 15, bus 17-bus 18, bus 21-bus 22, bus 26-bus 27, bus 29-bus 30, and bus 32-bus 33
Fault resistance (Ω)	0.01, 0.1, 1, 10, 20, 50, 100, 200, 300, 500

For the G1, G2, and G3 topologies, 10 types of fault resistance conditions are set. The fault resistances of 1 Ω+200 Ω in the G1 topology will be utilized as the training environment of the model. Other fault resistance conditions in the G2 and G3 topologies will be utilized as the test environment. The graph structure of G1 topology is shown in Fig. 17.

Fig. 17 Graph structure of G1 topology.

CNN and GCN are utilized in comparative experiments to show the effectiveness of the proposed model. The performances of different models for G1 topology in the training case of 1 Ω+200 Ω fault resistances are presented in Table VIII.

TABLE VIII Performances of Different Models for G1 Topology in Training Case of 1 Ω+200 Ω Fault Resistance

Model	Accuracy
Model	Fault type analysis	Fault location
Proposed	0.998	0.999
GCN [24]	0.973	0.899
CNN [18]	0.799	0.984

Table VIII shows that the proposed model performs better on the training data of the mesh topological structure. The fault location accuracy of the proposed model is 10.0 percentage points higher than that of GCN. To verify the effectiveness of the proposed model when the topology changes, each model is tested using the data of the G2 and G3 topologies under different fault resistances without any further training. The results of the proposed model, CNN, and GCN are listed in Table IX.

TABLE IX Results of Different Models Using Data of G2 and G3 Topologies Under Different Fault Resistances in Training Case of 1 Ω+200 Ω Fault Resistances

Fault resistance(Ω)	Accuracy of proposed model (G2)		Accuracy of CNN (G2)		Accuracy of GCN (G2)		Accuracy of proposed model (G3)		Accuracy of CNN (G3)		Accuracy of GCN (G3)
Fault resistance(Ω)	Fault type	Fault location	Fault type	Fault location	Fault type	Fault location	Fault type	Fault location	Fault type	Fault location	Fault type	Fault location
0.01	0.998	0.989	0.789	0.801	0.970	0.912	0.999	0.993	0.794	0.913	0.970	0.907
0.1	0.999	0.990	0.795	0.804	0.971	0.911	0.997	0.990	0.794	0.912	0.971	0.916
1 (train)	0.997	0.991	0.796	0.802	0.977	0.917	0.998	0.993	0.795	0.913	0.975	0.912
10	0.999	0.988	0.797	0.795	0.964	0.914	0.999	0.988	0.792	0.913	0.968	0.879
20	0.999	0.988	0.797	0.791	0.968	0.809	0.999	0.985	0.791	0.907	0.967	0.803
50	0.999	0.985	0.785	0.792	0.968	0.761	0.999	0.977	0.776	0.898	0.969	0.776
100	0.998	0.986	0.764	0.775	0.963	0.766	0.998	0.972	0.768	0.862	0.964	0.788
200 (train)	0.992	0.981	0.726	0.724	0.936	0.790	0.993	0.976	0.744	0.791	0.930	0.816
300	0.968	0.984	0.697	0.679	0.913	0.775	0.979	0.982	0.729	0.740	0.867	0.801
500	0.898	0.971	0.656	0.642	0.823	0.724	0.912	0.965	0.709	0.685	0.787	0.721

It can be observed that the proposed model performs better when directly generalized to similar topologies because of its stronger feature extraction ability. In addition, CNN is most affected by topological changes. This may be because CNN-based fault diagnostic models cannot embed physical topological information, so CNN extracts more numerical features of the data but cannot learn the deeper structural information. When dealing with changes in the data, the reliability of CNN is significantly reduced. The graph-based models can learn the structural information of the data, which gives them reliable performance under different topological changes. With the improved spatiotemporal convolutional operations, the proposed model shows more effective generalization performance under topological changes, which means that the model could be implemented in existing distribution systems.

In an actual distribution system with lower voltage levels, the $R / X$ values of the line parameters may be larger. In addition, the three-phase load may be unbalanced due to different operating conditions and load levels. To verify the proposed model under such conditions, the IEEE 37-bus test system is also modeled. The structure of IEEE 37-bus test system for fault diagnosis is shown in Fig. 18. The fault parameters are listed in Table X.

Fig. 18 Structure of IEEE 37-bus test system for fault diagnosis.

TABLE X Fault Parameters in IEEE 37-bus Test System

Fault parameter	Values or types
Fault type	AN(1), BN(2), CN(3), AB(4), AC(5), BC(6), ABN(7), CAN(8), BCN(9), ABC(10)
Fault position	The midpoint of fault lines
Fault resistance (Ω)	0.01, 0.1, 1, 10, 20, 30, 40, 50, 100

The fault resistance of 0.1 Ω is utilized as the training environment, and other fault resistance conditions are utilized as the test conditions. The results are listed in Table XI.

TABLE XI Results Under Different Fault Resistances in Training Case of 0.1 Ω Fault Resistance

Fault resistance (Ω)	Accuracy of proposed model		Accuracy of CNN		Accuracy of GCN
Fault resistance (Ω)	Fault type analysis	Fault location	Fault type analysis	Fault location	Fault type analysis	Fault location
0.01	0.998	0.998	0.878	0.971	0.980	0.923
0.1 (train)	0.999	0.999	0.881	0.970	0.977	0.923
1	0.998	0.999	0.881	0.971	0.978	0.919
10	0.979	0.994	0.877	0.933	0.957	0.797
20	0.959	0.959	0.854	0.853	0.905	0.602
30	0.925	0.924	0.812	0.778	0.845	0.603
40	0.895	0.871	0.785	0.709	0.774	0.548
50	0.887	0.822	0.706	0.617	0.738	0.501

Table XI shows that the proposed model performs better than the other models in the training environment with a 0.1 Ω fault resistance. The accuracy of GCN cannot satisfy the requirements of diagnosis when the fault resistance increases. The proposed model still performs better under different fault resistances on the IEEE 37-bus system. The verification on different systems further shows that the proposed model has stronger generalization performance and adaptability. The effective utilization of fault waveform and data structural information is thus critical to the proposed model.

G. Fault Diagnostic Model for Practical Application

In an actual distribution network, the system scale is relatively large, and measuring devices may not be available at every bus. The topology of the distribution network can be simplified to a smaller topology formed by the key buses. In this manner, the proposed model can realize fault analysis of the entire distribution network through limited measurement data. The fault diagnostic model can locate a fault in a specific subregion through the key nodes. In this study, the experimental results verify the fault type analysis and fault location performances of the proposed model for subregions. Eleven key nodes are selected from the IEEE 33-bus test system, and the system is divided into 10 subregions to verify the effectiveness of the proposed model. This verifies the capability of the proposed fault diagnostic framework with less measurement data and a simplified topology.

The structure of simplified topology of IEEE 33-bus test system is presented in Fig. 19, where the nodes 0-10 refer to the 11 selected key nodes, and edges refer to the subregions.

Fig. 19 Structure of simplified topology of IEEE 33-bus test system.

The fault diagnostic model utilizes the information of these key nodes. The data sources of the nodes and the definitions of edges are presented in Table XII. The fault location labels in the original topology are replaced with the labels in the simplified topology.

TABLE XII Data Sources of Nodes and Definitions of Edges

Node No.	Data source	Edge	Location label (IEEE 33-bus test system)
0	1	0-1	Bus 1-bus 2
1	2	1-2	Bus 2-bus 3
2	3	2-3	Bus 3-bus 6
3	6	3-4	Bus 6-bus 11
4	11	4-5	Bus 11-bus 14
5	14	5-6	Bus 14-bus 18
6	18	1-7	Bus 2-bus 19, bus 19-bus 22
7	22	2-8	Bus 3-bus 23, bus 23-bus 25
8	25	3-9	Bus 6-bus 26, bus 26-bus 29
9	29	9-10	Bus 29-bus 33
10	33

For the simplified system, the required measurement information is the data from the key nodes. The proposed fault diagnostic model can determine the fault type and locate the fault in the subregion between these key nodes. For example, if the model output edge is 9-10, the sub-region between buses 29 and 33 in the original IEEE 33-bus system has experienced a fault. CNN and GCN are utilized for comparative experiments, as shown in Table XIII.
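The relabeling from original lines to subregions can be sketched as a lookup table. The key buses and subregion end points are taken from Table XII, whereas the intermediate buses on each path are an assumption based on the commonly used radial layout of the IEEE 33-bus feeder; the names `SUBREGION_PATHS` and `build_relabel_map` are hypothetical.

```python
from typing import Dict, List, Tuple

# Simplified-topology edge -> bus path in the original system.
# End points follow Table XII; intermediate buses are assumed.
SUBREGION_PATHS: Dict[Tuple[int, int], List[int]] = {
    (0, 1): [1, 2],
    (1, 2): [2, 3],
    (2, 3): [3, 4, 5, 6],
    (3, 4): [6, 7, 8, 9, 10, 11],
    (4, 5): [11, 12, 13, 14],
    (5, 6): [14, 15, 16, 17, 18],
    (1, 7): [2, 19, 20, 21, 22],
    (2, 8): [3, 23, 24, 25],
    (3, 9): [6, 26, 27, 28, 29],
    (9, 10): [29, 30, 31, 32, 33],
}


def build_relabel_map(
    paths: Dict[Tuple[int, int], List[int]]
) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Map every original line (bus_a, bus_b) to its simplified edge label."""
    relabel = {}
    for edge, buses in paths.items():
        for a, b in zip(buses, buses[1:]):
            relabel[(a, b)] = edge
    return relabel


RELABEL = build_relabel_map(SUBREGION_PATHS)
print(len(RELABEL))        # 32
print(RELABEL[(30, 31)])   # (9, 10)
```

Under these assumptions, a fault on bus 30-bus 31 in the original system is labeled as edge 9-10 in the simplified topology, which matches the example of the subregion between buses 29 and 33.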

TABLE XIII Results Under Different Fault Resistances on Simplified Topology

Fault resistance (Ω)	Accuracy of proposed model		Accuracy of CNN		Accuracy of GCN
Fault resistance (Ω)	Fault type analysis	Fault location	Fault type analysis	Fault location	Fault type analysis	Fault location
0.01	0.999	0.990	0.973	0.926	0.910	0.976
2	0.999	0.980	0.966	0.922	0.888	0.952
50	0.954	0.923	0.751	0.760	0.770	0.855

Table XIII shows that the proposed model performs better in terms of fault classification and fault location under the simplified topology. The proposed model shows stronger generalization ability and adaptability under unknown fault resistances. The results show that the proposed model not only has a stronger fitting ability on the training dataset but also has a stronger generalization ability under different fault resistance conditions. The proposed model performs well on the simplified topology, suggesting that it is promising for application in actual distribution systems. For larger systems, the key buses that provide data for the proposed model can be varied to adjust the sub-regions of the fault location.

The measurement information of the bus is represented by the RMS values of the voltage and current. PMUs can be implemented as measuring devices for bus data. The sampling frequency of a PMU can reach 10 kHz, and its real-time data transmission is within 20 ms. Accordingly, PMUs can be applied to the actual measurements of the proposed fault diagnostic framework. In addition, the proposed fault diagnostic framework is based on AI technology. The input sample length, sampling frequency, and fault detection interval can be determined based on actual situations.

V. Conclusion

In this study, a combined fault type analysis and fault location model based on spatiotemporal graph learning is proposed to perform fault diagnostic tasks for distribution systems. Based on the excellent feature processing ability of the spatiotemporal convolutional block, fault type analysis and fault location can be performed accurately in a multi-task learning model. The topological information of the distribution network is embedded to act as a significant constraint during model training, enabling the model to learn the deeper structural information of the fault data and giving it stronger resistance to abnormal data. Meanwhile, the waveform features and structural information are effectively combined by the spatiotemporal convolutional block, significantly improving the performance of the fault diagnostic model. Thus, the proposed fault diagnostic model has higher accuracy and stronger generalization ability under topological changes, unknown fault resistance conditions, and different types of signal interference. The effectiveness of the proposed framework is verified under different test systems and fault conditions. The results show that the proposed framework has better performance and generalization ability than GCN and other intelligent models.

References

R. H. Salim, K. R. C. de Oliveira, A. D. Filomena et al., “Hybrid fault diagnosis scheme implementation for power distribution systems automation,” IEEE Transactions on Power Delivery, vol. 23, no. 4, pp. 1846-1856, Oct. 2008.

S. Su, X. Duan, and W. Chan, “Probability distribution of fault in distribution system,” IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 1521-1522, Aug. 2008.

Y. Jiang, “Data-driven probabilistic fault location of electric power distribution systems incorporating data uncertainties,” IEEE Transactions on Smart Grid, vol. 12, no. 5, pp. 4522-4534, Sept. 2021.

A. D. Filomena, M. Resener, R. H. Salim et al., “Fault location for underground distribution feeders: an extended impedance-based formulation with capacitive current compensation,” International Journal of Electrical Power & Energy Systems, vol. 31, no. 9, pp. 489-496, Oct. 2009.

Z. Li, Y. Wan, L. Wu et al., “Study on wide-area protection algorithm based on composite impedance directional principle,” International Journal of Electrical Power & Energy Systems, vol. 115, p. 05518, Feb. 2020.

S. Lotfifard, M. Kezunovic, and M. J. Mousavi, “Voltage sag data utilization for distribution fault location,” IEEE Transactions on Power Delivery, vol. 26, no. 2, pp. 1239-1246, Apr. 2011.

R. Liang, “Fault location method in power network by applying accurate information of arrival time differences of modal traveling waves,” IEEE Transactions on Industrial Informatics, vol. 16, no. 5, pp. 3124-3132, May 2020.

O. D. Naidu and A. K. Pradhan, “Precise traveling wave-based transmission line fault location method using single-ended data,” IEEE Transactions on Industrial Informatics, vol. 17, no. 8, pp. 5197-5207, Aug. 2021.

M. M. Leal, F. B. Costa, and J. T. L. S. Campos, “Improved traditional directional protection by using the stationary wavelet transform,” International Journal of Electrical Power & Energy Systems, vol. 105, pp. 59-69, Feb. 2019.

J. Weng, D. Liu, N. Luo et al., “Distributed processing based fault location, isolation, and service restoration method for active distribution network,” Journal of Modern Power Systems and Clean Energy, vol. 3, no. 4, pp. 494-503, Dec. 2015.

J. Chen, E. Chu, Y. Li et al., “Faulty feeder identification and fault area localization in resonant grounding system based on wavelet packet and Bayesian classifier,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 4, pp. 760-767, Jul. 2020.

M. S. Elbana, N. Abbasy, A. Meghed et al., “µPMU-based smart adaptive protection scheme for microgrids,” Journal of Modern Power Systems and Clean Energy, vol. 7, no. 4, pp. 887-898, Jul. 2019.

C. Noh, C. Kim, G. Gwon et al., “Development of fault section identification technique for low voltage DC distribution systems by using capacitive discharge current,” Journal of Modern Power Systems and Clean Energy, vol. 6, no. 3, pp. 509-520, May 2018.

Y. Deng, X. Liu, R. Jia et al., “Sag source location and type recognition via attention-based independently recurrent neural network,” Journal of Modern Power Systems and Clean Energy, vol. 9, no. 5, pp. 1018-1031, Sept. 2021.

H. Wang, Z. Liu, D. Peng et al., “Understanding and learning discriminant features based on multiattention 1DCNN for wheelset bearing fault diagnosis,” IEEE Transactions on Industrial Informatics, vol. 16, no. 9, pp. 5735-5745, Sept. 2020.

A. Rafinia and J. Moshtagh, “A new approach to fault location in three-phase underground distribution system using combination of wavelet analysis with ANN and FLS,” International Journal of Electrical Power & Energy Systems, vol. 55, pp. 261-274, Feb. 2014.

D. Thukaram, H. P. Khincha, and H. P. Vijaynarasimha, “Artificial neural network and support vector machine approach for locating faults in radial distribution systems,” IEEE Transactions on Power Delivery, vol. 20, no. 2, pp. 710-721, Apr. 2005.

S. Ananwattanaporn and A. Ngaopitakkul, “Study of multi-distributed generation behavior when fault occurrence in distribution system using wavelet transform,” in Proceedings of 2016 Joint 8th International Conference on Soft Computing and Intelligent Systems (SCIS) and 17th International Symposium on Advanced Intelligent Systems (ISIS), Sapporo, Japan, Aug. 2016, pp. 148-153.

A. Bagheri, I. Y. H. Gu, M. H. J. Bollen et al., “A robust transform-domain deep convolutional network for voltage dip classification,” IEEE Transactions on Power Delivery, vol. 33, no. 6, pp. 2794-2802, Dec. 2018.

W. Liao, B. Bak-Jensen, J. R. Pillai et al., “A review of graph neural networks and their applications in power systems,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 2, pp. 345-360, Mar. 2022.

W. Liao, D. Yang, Y. Wang et al., “Fault diagnosis of power transformers using graph convolutional network,” CSEE Journal of Power and Energy Systems, vol. 7, no. 2, pp. 241-249, Mar. 2021.

X. Zhao, M. Jia, and Z. Liu, “Semisupervised graph convolution deep belief network for fault diagnosis of electromechanical system with limited labeled data,” IEEE Transactions on Industrial Informatics, vol. 17, no. 8, pp. 5450-5460, Aug. 2021.

H. Tong, R. C. Qiu, D. Zhang et al., “Detection and classification of transmission line transient faults based on graph convolutional neural network,” CSEE Journal of Power and Energy Systems, vol. 7, no. 3, pp. 456-471, May 2021.

K. Chen, J. Hu, Y. Zhang et al., “Fault location in power distribution systems via deep graph convolutional networks,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 1, pp. 119-131, Jan. 2020.

J. Ma, X. Yan, B. Fan et al., “A novel line protection scheme for a single phase-to-ground fault based on voltage phase comparison,” IEEE Transactions on Power Delivery, vol. 31, no. 5, pp. 2018-2027, Oct. 2016.

J. Teng, W. Huang, and S. Luan, “Automatic and fast faulted line-section location method for distribution systems based on fault indicators,” IEEE Transactions on Power Systems, vol. 29, no. 4, pp. 1653-1662, Jul. 2014.

J. Choung, S. Lim, S. H. Lim et al., “Automatic discontinuity classification of wind-turbine blades using a-scan-based convolutional neural network,” Journal of Modern Power Systems and Clean Energy, vol. 9, no. 1, pp. 210-218, Jan. 2021.

M. Onuki, S. Ono, M. Yamagishi et al., “Graph signal denoising via trilateral filter on graph spectral domain,” IEEE Transactions on Signal and Information Processing over Networks, vol. 2, no. 2, pp. 137-148, Jun. 2016.

M. Niepert, M. Ahmed, and K. Kutzkov, “Learning convolutional neural networks for graphs,” in Proceedings of International Conference on Machine Learning (ICML), New York, USA, Jun. 2016, pp. 2014-2023.

W. Peng, J. Shi, and G. Zhao, “Spatial temporal graph deconvolutional network for skeleton-based human action recognition,” IEEE Signal Processing Letters, vol. 28, pp. 244-248, Jan. 2021.

Y. Luo, C. Lu, L. Zhu et al., “Data-driven short-term voltage stability assessment based on spatial-temporal graph convolutional network,” International Journal of Electrical Power & Energy Systems, vol. 130, p. 106753, Sept. 2021.

X. Dong, Y. Sun, Y. Li et al., “Spatio-temporal convolutional network based power forecasting of multiple wind farms,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 2, pp. 388-398, Mar. 2022.

G. Wang, Z. Zhang, Z. Bian et al., “A short-term voltage stability online prediction method based on graph convolutional networks and long short-term memory networks,” International Journal of Electrical Power & Energy Systems, vol. 127, p. 106647, May 2021.

Y. Wang, Z. Hu, Y. Ye et al., “Demystifying graph neural network via graph filter assessment,” in Proceedings of International Conference on Learning Representations (ICLR), online, 2020, pp. 1-15.

J. Qin, Y. Zhang, S. Fan et al., “Multi-task short-term reactive and active load forecasting method based on attention-LSTM model,” International Journal of Electrical Power & Energy Systems, vol. 135, p. 107517, Feb. 2022.

M. A. Kashem, V. Ganapathy, G. B. Jasmon et al., “A novel method for loss minimization in distribution networks,” in Proceedings of DRPT 2000 International Conference on Electric Utility Deregulation and Restructuring and Power Technologies, London, UK, Apr. 2000, pp. 251-256.

R. Zhang, W. Yao, Z. Shi et al., “A graph attention networks-based model to distinguish the transient rotor angle instability and short-term voltage instability in power systems,” International Journal of Electrical Power & Energy Systems, vol. 137, p. 107783, May 2022.

A. B. Waqas, Y. Saifullah, and M. M. Ashraf, “A hybrid quantum inspired particle swarm optimization and least square framework for real-time harmonic estimation,” Journal of Modern Power Systems and Clean Energy, vol. 9, no. 6, pp. 1548-1556, Nov. 2021.
