1 Introduction

A power grid is an interconnected network for delivering electricity from generators to loads through transmission and distribution systems, overlaid with monitoring, protection, and control components that ensure its stability and reliability. The power grid is now being transformed into the “smart grid” to supply more reliable, more sustainable, and more affordable electricity to customers. Achieving these attributes requires a variety of smart grid technologies [1–4].

The synchrophasor system, also known as the synchronized phasor measurement system, is an important smart grid technology. It uses advanced information and communication technologies (ICTs), such as the global positioning system (GPS), wide-area measurement systems (WAMS), phasor measurement units (PMUs), and phasor data concentrators (PDCs), to implement low-latency, high-precision, time-synchronized power system measurement, and thereby makes power system planning, operation, and analysis more efficient and responsive [5–10].

In the past decade, an increasing number of PMUs have been installed around the world, and a variety of PMU-based synchrophasor systems have become available in power grids. It should be noted that most of these projects are subsidized. The technical and economic benefits of synchrophasor systems are not fully identified, and the potential of various synchrophasor applications needs to be further explored [8]. In practice, the synchrophasor system as a physical network involves communication constraints, such as data quality and data security. The robustness of many synchrophasor applications to data quality issues is relatively unknown, and their performance may be degraded or even disabled by data flaws [3, 6–8]. Therefore, this work investigates the data quality issue for synchrophasor applications.

This work is divided into two parts: Part I attempts to provide an overall picture of the data quality issue, and Part II explores the potential reasons and solutions for the data quality issue.

Specifically, this paper reviews the standards of synchrophasor systems and the classifications and data requirements of synchrophasor applications, and presents the statistical results of actual data quality events. To the best of the authors’ knowledge, this is the first paper that reviews the data quality issue for synchrophasor applications. Moreover, the statistics of real-world synchronization signal loss, synchrophasor data loss, and latency are published here for the first time.

2 Synchrophasor systems

To investigate the data quality issue for synchrophasor applications, it is helpful to first understand synchrophasor systems. This section introduces the basic components, the deployment, and the standards of synchrophasor systems in turn.

2.1 Synchrophasor system components

A synchrophasor, or synchronized phasor, refers to a phasor calculated from data samples using a standard time signal, e.g., a GPS signal, as the reference for the measurement. A typical synchrophasor system is shown in Fig. 1; it primarily consists of the PMU, PDC, data storage, and communication network [11–16].

Fig. 1 Synchrophasor system framework [13]

In general, the PMU is a function or a device that provides synchrophasor, frequency, and rate of change of frequency (ROCOF) measurements from voltage and/or current signals and a time synchronizing signal; the PDC is a function that collects synchrophasor data and discrete event data from multiple PMUs and/or other PDCs, aligns the data by time tags to create a time-synchronized dataset, and transmits the dataset to a control center and/or various applications; and the data storage stores synchrophasor data and makes them conveniently available for post-event analysis (note that PDCs may buffer data for a short time period but do not store data [16]). If a PDC collects data from 100 PMUs, each providing 20 measurements at 30 samples per second, the data storage requires a capacity of over 50 GB/day and 1.5 TB/month [17].
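
The storage figure quoted above can be reproduced with a back-of-the-envelope calculation. The sketch below is only illustrative: the size of one stored measurement is not stated in the text, so the value of 10 bytes per measurement is an assumption chosen because it is consistent with the quoted numbers, and the function name and the 30-day month are likewise hypothetical.

```python
def storage_per_day_gb(num_pmus, measurements_per_pmu, samples_per_second,
                       bytes_per_measurement=10):
    """Raw daily archive size in GB (1 GB = 1e9 bytes).

    bytes_per_measurement is an assumed figure, not taken from the paper.
    """
    seconds_per_day = 24 * 60 * 60
    values_per_day = (num_pmus * measurements_per_pmu
                      * samples_per_second * seconds_per_day)
    return values_per_day * bytes_per_measurement / 1e9


# The scenario from the text: 100 PMUs, 20 measurements each, 30 samples/s.
daily_gb = storage_per_day_gb(num_pmus=100, measurements_per_pmu=20,
                              samples_per_second=30)
monthly_tb = daily_gb * 30 / 1000  # assumes a 30-day month

print(f"{daily_gb:.1f} GB/day, {monthly_tb:.2f} TB/month")
# prints "51.8 GB/day, 1.56 TB/month"
```

Under these assumptions the result is slightly above 50 GB/day and 1.5 TB/month, in line with the quoted capacity; a different per-measurement size or any compression would scale the result proportionally.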

In practice, PMUs are typically installed at substations or power plants, and PDCs are located at different levels, i.e., at substations, regional control rooms, and centralized control rooms, as shown in Fig. 1. Local PDCs aggregate and align the synchrophasor data from multiple PMUs, while mid- and higher-level PDCs collect the synchrophasor data from multiple PDCs, check the data quality, and deliver the data to a control center or a variety of applications.
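
The time-alignment step performed by a PDC can be illustrated with a minimal sketch. It assumes each incoming frame is reduced to a hypothetical `(pmu_id, time_tag, value)` tuple with an integer time tag in milliseconds; a real PDC additionally enforces a maximum wait time and handles full data frames, which is omitted here.

```python
from collections import defaultdict


def align_by_time_tag(frames, expected_pmus):
    """Group (pmu_id, time_tag, value) frames into one row per time tag.

    A PMU that did not report for a given time tag is filled with None,
    so that downstream applications can see the gap.
    """
    rows = defaultdict(dict)
    for pmu_id, time_tag, value in frames:
        rows[time_tag][pmu_id] = value
    return [
        (time_tag, [rows[time_tag].get(pmu) for pmu in expected_pmus])
        for time_tag in sorted(rows)
    ]


# Frames may arrive out of order and with gaps (hypothetical values).
frames = [
    ("PMU_B", 0, 0.98), ("PMU_A", 0, 1.00),
    ("PMU_B", 33, 0.97),                      # PMU_A frame is missing
    ("PMU_A", 67, 1.01), ("PMU_B", 67, 0.99),
]
aligned = align_by_time_tag(frames, expected_pmus=["PMU_A", "PMU_B"])
for time_tag, values in aligned:
    print(time_tag, values)
```

Filling missing entries with `None` rather than dropping the whole row keeps the data loss visible, which matters for the data availability discussion later in the paper.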

2.2 Synchrophasor system deployment

In the mid-1980s, the first PMUs were developed at Virginia Tech, and PMU-based synchrophasor systems are now deployed globally [5–7]. According to the latest statistics from the North American Synchrophasor Initiative (NASPI), there are almost 2000 commercial-grade PMUs installed across North America, and many local and regional PDCs collecting real-time, high-speed, time-synchronized power grid information [8].

The map in Fig. 2a shows the PMU locations and the way in which the synchrophasor data are shared between power plant and transmission owners (which own the PMUs) and grid operators. The synchrophasor system and synchrophasor data provide a real-time wide-area view of North American power systems, and enhance wide-area monitoring, protection and control, and other functions for better system performance.

Fig. 2 Synchrophasor system development

In addition, in 2003, a low-cost and quickly deployable phasor measurement device named the frequency disturbance recorder (FDR) was developed, and subsequently a wide-area frequency measurement system known as FNET or FNET/GridEye went online. The FDR, as the key component of FNET/GridEye, measures voltage magnitude, angle, and frequency with high precision. The measurements are computed at 100 ms intervals and then transmitted across the public Internet to a PDC, where they are synchronized, analyzed, and archived. Because the FDR is installed at ordinary 120 V outlets, it is relatively inexpensive and simple to install compared with a typical PMU [9, 10].

The FNET/GridEye system is currently operated by the University of Tennessee-Knoxville (UTK) and Oak Ridge National Laboratory (ORNL). As shown in Fig. 2b, it collects synchrophasor data from over 200 FDRs located across the continent and around the world. Additional FDRs are constantly being installed so as to provide better observation of power grids.

2.3 Synchrophasor system standards

To promote synchrophasor system development, the NASPI, the National Institute of Standards and Technology (NIST), IEEE, IEC, and the electric industry (e.g., utilities, vendors, and academics) have made a joint effort to develop a set of synchrophasor system standards and guides [11–20]. A brief review is presented below to outline the history and key points of these technical rules.

IEEE Std. 1344-1995 (R2001), released in 1995 and reaffirmed in 2001, is the first IEEE standard for synchrophasors for power systems [11]. It defined the phasor and the synchronized phasor, and specified synchronizing sources, synchronization methods, and the synchrophasor message format (i.e., data frame, configuration frame, and header frame). IEEE Std. 1344-1995 (R2001) defined the synchrophasor measurement in terms of waveform sampling, timing, and the basic phasor definition, and did not specify the synchrophasor communication [17–19].

IEEE Std. C37.118-2005 is the revision of IEEE Std. 1344-1995 (R2001) [12]. It revised the synchronized phasor definition, and specified the synchronization requirements, the accuracy requirements under steady-state conditions, and the synchrophasor message format (i.e., data frame, configuration frame, header frame, and command frame). Specifically, IEEE Std. C37.118-2005 introduced the total vector error (TVE) criterion to quantify the accuracy of synchrophasor measurements. This shifted the focus from measurement methods to measurement results, allowing the use of any method or algorithm that produces good results [17–19].

IEEE Std. C37.118-2011 is the current IEEE standard for synchrophasors for power systems [13, 14]. To gain wider international acceptance, the IEEE and the IEC initiated a joint project in 2009 to harmonize IEEE Std. C37.118 with the IEC 61850 standard. As a result, IEEE Std. C37.118-2011 is split into two parts [17–20].

The first part, IEEE Std. C37.118.1-2011 for synchrophasor measurements, deals with synchrophasor measurements and related performance requirements. It retains the steady-state synchrophasor measurements and their performance requirements from IEEE Std. C37.118-2005, and introduces dynamic synchrophasor measurements, frequency and ROCOF estimates, and their performance requirements.

The second part, IEEE Std. C37.118.2-2011 for synchrophasor data transfer, standardizes the synchrophasor communication. It is based on the portion of IEEE Std. C37.118-2005 that specifies data communication and on a portion of the IEC 61850-90-5 standard. IEEE Std. C37.118.2-2011 allows more communication protocols and systems to be used with synchrophasor measurements and communication, which greatly promotes the development and deployment of synchrophasor systems.

In addition, IEEE Std. C37.238-2011 specifies the precision time protocol for power system applications, IEEE Std. C37.111-2013 standardizes the common format for transient data exchange (COMTRADE) for power systems, and IEEE Std. C37.242-2013 and C37.244-2013 were developed to guide PMU utilization (e.g., synchronization, calibration, testing, and installation) and PDC definitions and functions, respectively [15–17]. These critical standards and guides for synchrophasor systems are summarized in Fig. 3.

Fig. 3 IEEE Standards for synchrophasor systems

3 Synchrophasor applications

In the past decade, synchrophasor systems have become prevalent in power grids, and a large number of actual and potential synchrophasor applications have been reported in the literature [5–8, 21–25]. The classifications of these applications, together with their data requirements and sensitivities, are discussed in this section.

3.1 Classifications

Synchrophasor applications can be broadly classified into two categories: real-time and off-line applications. The former require real-time data and must respond within seconds or even sub-seconds of receiving the data; they improve real-time operations through enhanced visibility and situational awareness, and also support wide-area protection and control actions, such as special protection schemes, remedial action schemes, emergency control systems, and wide-area control systems [26–28].

In contrast, the latter use archived data and may be conducted off-line days or months after the data were collected; and they primarily improve power system analysis and planning, such as baselining, post-event analysis, and model calibration and validation.

Specifically, the NASPI has been working on a phasor application taxonomy. In 2008, the NASPI created a table for phasor application classification and condensed various applications into four categories as shown in Fig. 4: situational awareness, monitoring/alarming, analysis/assessment, and advanced applications [21]. Later, the applications were classified by their working fields in [22, 23], e.g., reliability operation, market operation, planning, and others, and grouped according to their maturity levels in [24], e.g., Level-1 (conceptualization), Level-2 (development), Level-3 (implementation), Level-4 (operationalized), and Level-5 (integrated and highly mature). A set of metrics to describe and characterize each maturity level was also given in [24].

Fig. 4 Classification of synchrophasor applications [21–23]

In practice, different groups of synchrophasor applications have different requirements on synchrophasor data, such as data rate, data volume, data quality, and data security. It is important to understand the variety of synchrophasor applications and their data requirements. It is also advantageous to develop a set of consistent and quantifiable data requirements for the applications, which would help existing and new users assess the applications’ capability and suitability in their particular scenarios. The data quality requirements for synchrophasor applications are investigated in this paper.

3.2 Data quality requirements

The term “data quality” for synchrophasor applications has not been defined in the existing standards. In this paper, data quality is characterized by three attributes: data accuracy, data availability, and data timeliness [24].

Generally, data accuracy requires the synchrophasor measurements, such as phasor measurements, frequency and ROCOF estimates, and time synchronization, to be within acceptable errors; data availability requires the measurement data to be complete, consistent, and without loss; and data timeliness requires the measurement data to be delivered to their destinations within acceptable latencies.

Data accuracy is largely determined by PMU performance, since the measurement data are measured, digitized, and packaged by PMUs. As mentioned above, the data accuracy requirements under steady-state and dynamic conditions are well specified in C37.118.1-2011, and TVE is used to quantify the measurement accuracy. For example, the maximum TVE is required to be 1% in steady-state synchrophasor measurement, and the corresponding maximum timing error of the PMU is 26.4 μs for a 60 Hz power grid (assuming the PMU has no magnitude measurement error, 1% TVE corresponds to a 0.57 degree phase error or a 26.4 μs timing error). Moreover, two PMU performance classes, “M” class and “P” class, are also standardized in C37.118.1-2011. The former emphasizes high precision and supports applications that are sensitive to signal aliasing but immune to delays (e.g., measurement devices), whereas the latter emphasizes low latencies and is used for applications that require minimal delays in responding to dynamic changes (e.g., protective relays).
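
The relation between TVE, phase error, and timing error can be checked numerically. The sketch below uses the standard definition of TVE as the magnitude of the phasor error divided by the magnitude of the reference phasor; the function names are hypothetical. With no magnitude error, a pure phase error θ gives TVE = 2 sin(θ/2), so the exact figures come out at about 0.573 degrees and 26.5 μs; the 26.4 μs quoted above follows from the rounded 0.57 degree value.

```python
import cmath
import math


def total_vector_error(measured, reference):
    """TVE = |measured - reference| / |reference| for complex phasors."""
    return abs(measured - reference) / abs(reference)


def phase_error_for_tve(tve):
    """Phase error (rad) that alone produces the given TVE.

    With unit magnitude ratio, |exp(j*theta) - 1| = 2*sin(theta/2).
    """
    return 2.0 * math.asin(tve / 2.0)


theta = phase_error_for_tve(0.01)                  # 1% TVE
theta_deg = math.degrees(theta)                    # about 0.573 degrees
timing_us = theta / (2 * math.pi * 60.0) * 1e6     # about 26.5 us at 60 Hz

# Starting from the rounded 0.57 degree figure gives the quoted 26.4 us.
timing_from_rounded_us = 0.57 / 360.0 / 60.0 * 1e6

# Cross-check: a unit phasor rotated by theta has a TVE of 1%.
tve_check = total_vector_error(cmath.rect(1.0, theta), 1 + 0j)

print(f"{theta_deg:.3f} deg, {timing_us:.1f} us, "
      f"{timing_from_rounded_us:.1f} us, TVE = {tve_check:.4f}")
```

For a 50 Hz grid, the same phase error corresponds to a proportionally longer timing error, since one electrical degree lasts longer at the lower frequency.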

Data availability and timeliness depend on the joint performance of PMUs, PDCs, and communication links. The IEEE standards mention the data loss and latency issues for synchrophasor systems and applications, but have not formalized the related quantitative requirements. In recent years, the NASPI has been working on synchrophasor application classification, and has attempted to define the applications’ requirements on data accuracy, data loss, and latency. For example, a list of applications’ requirements is shown in Table 1, in which the applications are condensed into four categories and three metrics [25]. Note that Table 1 only gives a high-level analysis. In fact, the robustness of many applications to data quality issues is relatively unknown, and the requirements and sensitivities of various applications with respect to data quality are still worth further investigation [8, 25].

Table 1 Classification of PMU applications [25]

This paper pays particular attention to the issues of time synchronization accuracy, data loss, and data latency, since the issue of phasor measurement accuracy has been thoroughly discussed and resolved in the literature [29–32].

Specifically, the historical PMU data and FDR data from OpenPDC and FNET/GridEye are used, and the related events, including synchronization signal loss, synchrophasor data loss, and latency, are extracted and analyzed. The statistical results and analysis are expected to help in understanding the data quality issue in practice and in exploring the potential reasons and solutions. To the best of the authors’ knowledge, these statistics are published here for the first time.

4 Statistical results and analysis

4.1 Synchronization signal loss

In practice, GPS-timing signal loss is the main factor affecting synchronization signal accuracy, since most PMUs and FDRs use GPS-timing signals as time synchronization references. The GPS-timing-signal loss events in historical PMU and FDR data are studied first.

The PMU data frame contains a one-bit GPS status flag as shown in Fig. 5a [14], in which the GPS state (“1” or “0”) indicates whether GPS loss has occurred, and changes in the GPS state indicate when a GPS loss starts and ends. The number and the duration of GPS loss events can then be obtained as shown in Fig. 5b. In this way, the numbers of PMUs suffering GPS loss over different time periods (e.g., annually, monthly, and hourly), and the numbers of the related PMUs with different GPS-loss durations, are obtained in Fig. 6a–d, respectively. Fig. 6a shows that the number of surveyed PMUs increased from 26 in 2009 to 83 in 2012, and Fig. 6b shows the distribution of the GPS loss events from 2009 to 2012 over different time durations.
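
The extraction of GPS loss events from the status flag can be sketched as a simple scan for flag transitions. This is only an illustration of the idea, not the authors’ tool: the input is assumed to be a time-ordered list of hypothetical `(timestamp_seconds, flag)` pairs, and the flag polarity (here “1” means loss) is an assumption exposed as a parameter.

```python
def gps_loss_events(samples, loss_value=1):
    """Return (start_time, duration) for each run of consecutive loss flags.

    samples: iterable of (timestamp_seconds, flag), sorted by time.
    loss_value: flag value that marks GPS loss (assumed polarity).
    """
    events = []
    start = None
    last_time = None
    for t, flag in samples:
        if flag == loss_value and start is None:
            start = t                          # transition into loss
        elif flag != loss_value and start is not None:
            events.append((start, t - start))  # transition back to lock
            start = None
        last_time = t
    if start is not None:
        # Loss still ongoing at the end of the record.
        events.append((start, last_time - start))
    return events


# Hypothetical one-sample-per-second record with two loss events.
samples = [(0, 0), (1, 1), (2, 1), (3, 0), (4, 0), (5, 1), (6, 0)]
events = gps_loss_events(samples)
print(events)                         # [(1, 2), (5, 1)]
print(len(events), "loss events")
```

The list of durations returned here is what would feed a histogram such as the duration distribution of the loss events.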

Fig. 5 Statistical approaches

Fig. 6 Statistical results of GPS loss events in PMUs from 2009 to 2012

The FDR data frame does not include the GPS status flag, but records GPS signal strengths. To be specific, an FDR updates the number of locked GPS satellites every minute, which represents the strength of the GPS signal and indicates the likelihood of GPS loss events. For example, four FDRs with different GPS signal strengths are shown in Fig. 7: (a) strong strength (i.e., the FDR always locks 6–12 GPS satellites), (b) medium strength (i.e., the FDR locks 2–6 GPS satellites), (c) weak strength (i.e., the FDR only locks 0 or 1 GPS satellite and GPS-signal-loss events occur frequently), and (d) variable strength, in which the number of locked GPS satellites varies randomly or with certain patterns. Using a statistical approach similar to that in Fig. 5b, the numbers of FDRs suffering GPS loss over different time periods (e.g., annually, monthly, and hourly), and the numbers of the related FDRs with different GPS-loss durations, are presented in Fig. 8a–d, respectively.
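
The four signal-strength cases described above can be expressed as a small classifier over the per-minute counts of locked satellites. The thresholds below simply mirror the ranges given in the examples; treating them as hard classification rules, and the function name itself, are assumptions made for illustration only.

```python
def classify_gps_strength(locked_counts):
    """Label a series of per-minute locked-satellite counts.

    Thresholds mirror the four example cases in the text; they are
    illustrative, not a published classification rule.
    """
    if not locked_counts:
        raise ValueError("need at least one per-minute satellite count")
    lowest, highest = min(locked_counts), max(locked_counts)
    if lowest >= 6:
        return "strong"      # always 6 or more satellites locked
    if lowest >= 2 and highest <= 6:
        return "medium"      # stays within 2 to 6 satellites
    if highest <= 1:
        return "weak"        # only 0 or 1 satellite locked
    return "variable"        # spans several ranges over time


print(classify_gps_strength([8, 9, 12, 7]))    # strong
print(classify_gps_strength([3, 4, 2, 5]))     # medium
print(classify_gps_strength([0, 1, 0, 0]))     # weak
print(classify_gps_strength([0, 7, 3, 10]))    # variable
```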

Fig. 7 Statistical results of locked satellites in FDRs

Fig. 8 Statistical results of GPS loss events in FDRs from 2010 to 2012

Figures 6a–b and 8a–b show that a large number of PMUs and FDRs experienced GPS loss, and that as PMUs and FDRs were increasingly deployed in the past years, the number of GPS loss events grew steadily. The average GPS loss rate and average GPS loss duration for the PMU from 2009 to 2012 were 5 times per day and 6.7 seconds, respectively, and the average GPS loss rate for the FDR from 2010 to 2012 was about 6 to 10 times per day. Moreover, the statistical results of both PMUs and FDRs suggest that the majority of GPS loss events recover within a short period of time, and that the number of GPS loss events decreases exponentially as the GPS recovery time increases. Note that FDRs stop sending data if they lose the GPS timing signal for more than 1 or 2 hours, which leads to high count values at 60 minutes and 120 minutes in Fig. 8b.

Figures 6c–d and 8c–d show monthly and hourly trends of the surveyed GPS loss events of PMUs and FDRs, respectively. It is observed that (1) the GPS loss events of PMUs occurred more frequently at certain hours of the day, e.g., 11 AM and 7 PM UTC (Coordinated Universal Time), whereas the GPS loss events of FDRs were evenly distributed over the day; and (2) specific patterns were diluted in the large amount of statistical data, so that there is no obvious seasonal or monthly trend or universal daily pattern that holds for all the units. Moreover, the large amount of statistical data can be used for big data and machine learning studies, which are becoming very popular in modern power systems. This will be followed up in future work.

4.2 Synchrophasor data loss and latency

Partly for confidentiality reasons, there are no public data or statistics showing PMU data loss and/or latency events in detail. This paper takes advantage of the GPS-synchronized wide-area FNET/GridEye and records the FDR data loss and latency events over four weeks, as shown in Fig. 9.

Fig. 9 Data loss and latency events in FNET/GridEye

It is observed from Fig. 9a and b that data loss events occur randomly and are often accompanied by high communication delays. The data loss events display diverse scenarios, but 95% of them involve only one to three consecutive packet losses. This implies that large bursts of packet loss are low-probability events. In addition, it is found from Fig. 9a and c that the communication delay may vary dramatically over short periods (e.g., one minute) and that its probability distribution changes with the time period considered (i.e., hourly, daily, and weekly). The real-time communication delay thus presents strong dynamic characteristics.
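
The two quantities discussed above, the length of each run of consecutive lost packets and the per-packet communication delay, can be computed from a receive log as sketched below. The log format is an assumption: packets are identified by a hypothetical integer sequence index (one per 100 ms reporting interval), and the delay is taken as arrival time minus the GPS time stamp, both in milliseconds. The sample data are invented for illustration and do not reproduce the reported 95% figure.

```python
def loss_runs(received_indices):
    """Lengths of runs of consecutive missing packets between received ones."""
    ordered = sorted(set(received_indices))
    runs = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur - prev - 1
        if gap > 0:
            runs.append(gap)
    return runs


def fraction_short_runs(runs, max_len=3):
    """Share of loss events involving at most max_len consecutive packets."""
    if not runs:
        return 0.0
    return sum(1 for r in runs if r <= max_len) / len(runs)


def delays_ms(packets):
    """Delay of each packet, given (gps_stamp_ms, arrival_ms) pairs."""
    return [arrival - stamp for stamp, arrival in packets]


# Hypothetical receive log: indices 3, 6-7, and 11-15 never arrived.
received = [0, 1, 2, 4, 5, 8, 9, 10, 16]
runs = loss_runs(received)
print(runs)                           # [1, 2, 5]
print(fraction_short_runs(runs))      # share of events with 1-3 lost packets

print(delays_ms([(0, 120), (100, 230), (200, 900)]))   # [120, 130, 700]
```

Counting loss events by run length, rather than counting individual lost packets, is what allows a statement such as “95% of loss events involve one to three consecutive losses” to be checked against a data set.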

5 Conclusions

Currently, the data quality issues (i.e., data accuracy, availability, and timeliness) have not been clearly specified in the existing standards, and the robustness of various synchrophasor applications to data quality issues has not been thoroughly identified. This work investigates the data quality issue for synchrophasor applications. Part I presents a review and statistics, and attempts to provide an overall picture of the data quality issue.

Specifically, this paper reviews synchrophasor applications’ classifications and data requirements, and points out that it is necessary to formalize a set of consistent and quantifiable data quality requirements for synchrophasor applications. These requirements and technical rules can help existing and new users understand various applications’ capability and suitability in their particular scenarios, and further promote the development of synchrophasor systems and enhance the performance of power grids.

Further, this paper takes advantage of OpenPDC and FNET/GridEye, and shows real-world data quality issues including synchronization signal loss, synchrophasor data loss, and latency. The related statistical results and analysis suggest that although the data quality issues are random and variable, the majority of GPS loss events recover within a short period of time and about 95% of data loss events involve only one to three consecutive packet losses. These points will be further discussed, and the potential reasons and solutions will be presented, in Part II.