1 Introduction

The smart grid has complex dependencies between its physical and cyber realms [1,2,3,4]. This has been demonstrated by recent attacks on the smart grid, which are summarized in Table 1 [5,6,7,8,9,10]. These attacks exploited limited visibility of the system and inadequate support from reliability coordinators [11,12,13,14,15,16,17,18,19,20,21,22]. Wide-area measurement systems (WAMS) increase the situation awareness (SA) of operators [23,24,25]. WAMS devices that are part of wide-area monitoring, protection, automation and control include phasor measurement units (PMUs) at transmission, frequency disturbance recorders (FDRs) at low-voltage distribution and micro-PMUs (µ-PMUs) for distributed renewables; collectively, these devices are called synchrophasors [26,27,28,29,30,31,32,33,34,35].

Table 1 Summary of the recent cyberattacks on smart grid impacting data quality

Significant challenges to the implementation of synchrophasors have emerged in communication, data quality and cybersecurity. The existing communication infrastructure is slow, expensive and inflexible. To leverage SA and support timeliness, adequate quality checking methods must be in place at the phasor data concentrators (PDCs), which aggregate and process raw data and flag corrupt data. Due to their ubiquity, synchrophasors have an increased attack surface. The applications and challenges of synchrophasors are well researched [36,37,38,39,40,41]. However, the challenges of data quality and cybersecurity are treated independently of each other when, in reality, they are interdependent [42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69]. Further, the literature does not leverage the knowledge of one challenge to address the other. For example, studying changes in data quality can be key to identifying an underlying attack vector or an unexploited vulnerability.

This paper makes three main contributions: it ① maps the dependencies between data quality and cybersecurity challenges of synchrophasors; ② reviews the methods to evaluate the challenges; and ③ surveys how data quality checking methods can leverage their observations to detect issues related to security. The paper also provides a high-level overview of synchrophasors, their standards, key applications, and challenges [70,71,72,73]. It is important to note that poor quality can also be due to device errors or communication problems such as congestion and packet collision. Similarly, not all cyberattacks impact the data, although a reduction in quality is one of the most observable consequences of a successful attack. A layout of WAMS comprising synchrophasors is shown in Fig. 1. This paper explores the challenges for PMUs at the transmission level and FDRs at the distribution level.

Fig. 1 Layout of smart grid WAMS comprising PMUs, µ-PMUs, FDRs and PDCs

This survey paper considers data quality and cybersecurity as challenges, each of which has different issues. Issues are the ways in which a particular challenge manifests when observed. Figure 2 maps the challenges to their corresponding issues. The challenge of quality manifests in three ways: noise, outliers and missingness. Noise can be due to logical inconsistencies in data values or attributes, while outliers result from poor integrity and origination. Missing data is a direct consequence of poor completeness and availability. Accuracy is impacted by noise, outliers and missingness, while plausibility is impacted by noise and outliers. These characteristics are discussed in Section 3.1. Cybersecurity manifests as delay/loss, manipulation or theft. Delay/loss corresponds to a packet delay or drop due to congestion, timeout, buffer fullness or an intentional attack that affects availability, whereas manipulation deals with attacks that alter the information, thereby impacting integrity. Theft captures attacks that compromise the confidentiality of data, such as snooping, spoofing or espionage. These attacks occur at different levels of the synchrophasor hierarchy: Device corresponds to edge devices like PMUs, FDRs or µ-PMUs, while Aggregator implies Local PDCs or SuperPDCs. Communication refers to the synchrophasor network, while Application contains the different power system applications that use synchrophasor data.

Fig. 2 Proposed structure

The rest of the paper is organized as follows. Section 2 summarizes the architecture, major applications and key challenges of synchrophasors. The characteristics are described in Section 3, and their interdependencies mapped in Section 4. While evaluation methods for data quality and cybersecurity are discussed in Section 4.1, Section 4.2 surveys methods which use data quality characteristics to detect potential cyber-attacks. Section 5 highlights future directions of research in synchrophasor data analytics and cybersecurity.

2 Architecture, applications, challenges

Synchrophasors can be standalone devices with dedicated purposes, or be part of a larger system such as a substation, depending on various functional and operational requirements. With increased penetration of renewables and smart loads, synchrophasors are used at distribution transformers and points of common coupling to study frequency disturbances and harmonics. The architecture of synchrophasor devices is summarized at the device and network levels below.

  1. 1)

    PMU device: It comprises current transformers (CTs) and potential transformers (PTs) that measure current and voltage magnitudes, which are then converted to digital data, and a microprocessor module that compiles these values, computes phasors, and synchronizes them with the coordinated universal time (UTC) standard reference used by global positioning system (GPS) receivers, which acquire a time-lag based on the atomic clock of GPS satellites [23, 74,75,76,77]. PMUs measure local frequency and its rate of change, and can record individual phase voltage and current along with harmonics and negative and zero sequence values [78]. The information paints a dynamic picture of the grid at a given time. PMUs and PDCs transmit measured data as frames [79]. A 16-bit cyclic redundancy check guards the integrity of each frame. PDCs equipped with logging functionality use comma separated values or transient data exchange for data logs, and common format for event data exchange for event logs [80, 81]. The data transfer rate of PMUs, which determines the message processing delays and network latencies, depends greatly on the timing requirements of applications.

  2. 2)

    PMU network: If there are multiple PMUs in a substation, Local PDCs aggregate site-level data and then transmit it to a SuperPDC. PDCs conduct various data quality checks and set flags according to the issues encountered, log performance, validate, transform, scale and normalize data, and convert between protocols [82]. There is typically a direct interface between the PDC and the utility’s SCADA or energy management system. PDCs can be deployed as standalone devices or integrated with other systems in the grid.

  3. 3)

    FDR device: The Oak Ridge National Laboratory and the University of Tennessee Knoxville have been leading the FNET/GridEye project since 2004, under which FDRs have been installed and managed to capture dynamic behaviors of the grid. Although FDRs are essentially PMUs, they are connected at 120 V, and hence incur lower installation costs than traditional PMUs do [83]. FDRs are largely deployed at renewable integration zones of the grid, and measure nearly 1440 samples per second with a hardware accuracy of ±0.5 mHz, while PMUs measure between 10 and 240 samples per second and use GPS receivers that have 1 µs accuracy for synchronization [84,85,86,87]. Since the architecture is discussed extensively in [88, 89], it is beyond the scope of this paper.

  4. 4)

    FDR network: FDRs use the internet to send data directly to the central servers for analytics and can provide information on transients, load shedding, breaker reclosing and the switching operations of capacitor banks and load tap changers [87]. Unlike PMUs, FDRs can be installed at buildings and offices.

  5. 5)

    Synchrophasor standards: Multiple standards exist for PMU data measurement, transfer and communication, proposed by IEEE, the National Institute of Standards and Technology (NIST), the North American Electric Reliability Corporation (NERC) and the International Electrotechnical Commission (IEC) [90,91,92,93,94,95,96,97]. Because there are multiple specifications and guidelines, their recommendations may contradict one another [70,71,72,73, 98]. A North American SynchroPhasor Initiative (NASPI) report in early 2016 identified the need for standardizing definitions related to synchrophasor data quality and availability by establishing the PMU applications requirements task force (PARTF) [99]. IEEE standard C37.X deals with WAMS, specifically PMUs [82, 100, 101]. These standards are summarized in Table 2 with their core contributions highlighted. A more comprehensive review of the synchrophasor standards is documented in [102].

  6. 6)

    Applications: Synchrophasors streamline the security, reliability and stability of power systems. They have online and offline applications [103]. Online applications of PMUs include enhancing real-time SA, analyzing faults and disturbances, detecting and appraising oscillations and harmonics that impact power quality, and improving the accuracy and reducing the computational time of state estimation. Offline applications include congestion management, providing effective protection schemes, benchmarking, system restoration, overload monitoring and dynamic rating, validating the network model of SCADA, and improving overall power quality [25, 104, 105]. Real-time (online) applications of FDRs include a frequency monitoring interface, to be integrated with command and control centers in the future for power system health diagnosis to prevent cascading failures, and an event trigger module that detects and reports mismatches between generation and load, which show up as frequency variations. Offline applications include event visualization that renders the data read from the event data files [106].

  7. 7)

    Challenges: One of the major drawbacks of synchrophasors is the lack of a transmission protocol, which makes them vulnerable to spoofing attacks [26]. The existing architecture is not scalable since it entails a high initial investment. NASPI’s research initiative task force (RITT) emphasizes optimal placement as a significant challenge, but one that also depends on the nature of the applications the utility intends to use the devices for [18]. The literature offers multiple models to address this challenge, including but not limited to genetic algorithms, simulated annealing, Tabu search, Madtharad’s method, particle swarm optimization, artificial neural networks, binary search and binary integer programming [27, 28, 31,32,33,34, 107, 108]. More recently, managing and analyzing large volumes of synchrophasor data has become increasingly difficult, and the lack of standardized data management solutions for the smart grid has only compounded the problem. The ubiquitous presence of these devices has expanded their attack surface, making them vulnerable to different types of attacks. These two challenges are elaborated in the following section since they percolate to applications that operate directly upon the streaming data, which is subject to minimal processing owing to timeliness requirements.
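
As a rough illustration of the 16-bit cyclic redundancy check mentioned for PMU frames in item 1), the sketch below computes a CRC-CCITT value (polynomial x^16 + x^12 + x^5 + 1, initial value 0xFFFF, no final XOR), which is assumed here to be the variant used for synchrophasor frames. The function names `crc_ccitt`, `append_check` and `frame_is_intact` and the sample payload are hypothetical and are not taken from any standard or vendor implementation.

```python
def crc_ccitt(data: bytes, init: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with polynomial 0x1021, MSB first, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 8              # bring the next byte into the high bits
        for _ in range(8):
            if crc & 0x8000:          # top bit set: shift and apply polynomial
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def append_check(payload: bytes) -> bytes:
    """Return the payload followed by its 2-byte check word (big-endian)."""
    return payload + crc_ccitt(payload).to_bytes(2, "big")


def frame_is_intact(frame: bytes) -> bool:
    """Recompute the check word over the payload and compare with the trailer."""
    payload, received = frame[:-2], int.from_bytes(frame[-2:], "big")
    return crc_ccitt(payload) == received


# Hypothetical payload, used only to exercise the functions.
frame = append_check(b"illustrative PMU payload")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]   # flip one bit in transit
```

Here `frame_is_intact(frame)` holds while `frame_is_intact(corrupted)` does not. Note that such a check only catches accidental corruption: an attacker who alters a frame can simply recompute the check word, which is one reason integrity checks alone do not address the manipulation attacks discussed later.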

Table 2 Various standards and guidelines for synchrophasors

3 Data quality and cybersecurity challenges in synchrophasors

Due to the wide-ranging communication and automation capabilities of synchrophasors, the challenges of their data quality and cybersecurity have gained prominence.

3.1 Data quality challenges

NERC’s real-time tools best practices task force (RTBPTF) and NASPI’s PARTF impose requirements to ensure synchrophasor data quality [42, 109]. Data quality can be contextualized in different ways, depending on the needs of the concerned domain. For instance, data quality requirements of a smart meter recording energy consumption might differ from those of a net meter at a solar photovoltaic (PV) power plant. NASPI contextualizes synchrophasor data quality to determine “fitness of use” in terms of accuracy and lineage for static data points; lineage, completeness and logical consistency for static datasets; and availability, timeliness and origination for streams of data points [42].

Poor data quality can have different causes, as follows.

  1. 1)

    Device: poor calibration of the device, biases due to CTs and PTs, erroneous filter design, poor synchronization of timing measurements, and issues due to the measurement channel;

  2. 2)

    Communication: latency exceeding stipulated limits, network congestion, signal interferences and failure of communication nodes;

  3. 3)

    Aggregator: data transformation resulting in errors, late-arriving packets being dropped for exceeding the time limit, and unwanted duplication or corruption of data during computations;

  4. 4)

    Application: storage and maintenance issues, insufficient training size, erroneous manipulations to the data and poor association of context.

Although data quality requirements vary with applications, they have been extensively documented [42, 52, 102, 109]. The existing literature on synchrophasor data quality is summarized in Table 3.

Table 3 Summary of existing research in synchrophasor data quality challenges and solutions
  1. 1)

    Completeness: focuses on the gaps between values and accounts for missing values [42]. The attributes of completeness defined at the device and aggregator levels are: gap rate, the number of gaps in data per unit time; mean gap size, the mean length of known gaps; and largest known gap, the length of the largest among the known gaps. While completeness is impacted by device malfunction, packet drops and communication link failure, the literature does not recognize the possibility of an attack behind such causes.

  2. 2)

    Accuracy: can be of the value or the attribute, and is primarily measured in total vector error (TVE), which according to IEEE standard C37.118 is the magnitude of the vector difference between the measured and expected phasor values, normalized by the magnitude of the expected phasor; it therefore combines magnitude and angle errors. Accuracy is categorized into that of: data values, impacted by factors like the difference between expected and observed signals or the introduction of noise to the data within the synchrophasor; and data attributes, affected by factors like the accuracy of the measured timestamp, the agreement between encoded and actual location coordinates of the device, and the alignment of the location recorded in the power system topology with its actual location [46].

  3. 3)

    Plausibility and Availability: Measurement specifiers are the attributes of data that describe whether the process of measuring some phenomenon of the power system (observed value) and calculating its value (expected value) is documented effectively in terms of standard units to a given precision and falls within a stated confidence interval [46, 48]. These specifiers have decisive sub-attributes influencing the qualitative value of data: data representing the measurement of quality or condition of the grid, and data represented in SI units up to 3 decimal places with a confidence interval included.

    Network availability plays an important role in streaming data [49], and in turn affects data availability. Under high network latency, the incoming data streams from different synchrophasors are delayed or lost, causing applications to perceive them as missing or incomplete. Hence, network availability can be considered an indirect attribute affecting quality. This can be mitigated if the overlying applications are programmed to account for the delays, or if a more lenient waiting time limit is set. However, the second solution depends on the kind of applications the synchrophasors cater to, since the latency requirements recommended for synchrophasors by the standards are very stringent.

  4. 4)

    Origination: the source from which the data is measured. Its trustworthiness is associated with the background and source. Its attributes are as follows. ① Point of origin: the class of device from which the data originated (measurement (M) or performance (P) for PMUs), the standard followed by the device, and any data manipulation or standardization techniques through which the data passes [42, 118]; ② Coverage: physical location of the device based on its geospatial or electrical topology location [44, 45]; ③ Transformation applied to the data at the device, aggregator or application level.

  5. 5)

    Consistency: determines how well the data agrees with the overall structure of its type. Incompatibility of attributes between datasets, in terms of measurement rates or header labeling, results in outliers, leading to inconsistent results from an application. The attributes of consistency are as follows. ① Header frame consistency: consistency of the header frame of the device. It is categorized into persistence of the PMU header and persistence of the PDC header, which state whether the respective header structure is consistent over time. ② Data frame consistency: consistency of the data frames of the device. It is categorized into persistence of the PMU data frame and persistence of the PDC data frame, which state whether the respective data frame structure is consistent over time. ③ Order consistency of data frames: whether the order in which the data frames are recorded is consistent in the device. ④ Consistency in compliance with the standards recommended for the PMU and all devices associated with it. ⑤ Consistency of reporting rate: whether the data reporting rate is consistent across all devices.

    Emerging research in this area has lately focused on determining solutions for ensuring data quality. These solutions include using omnidirectional antennas to improve GPS availability, context-aware determination of missing data streams using accurate timing information, network time protocol (NTP) and associated chip scale atomic clocks (CSACs) as backups for synchronization when GPS fails, imputation, interpolation and extrapolation, stochastic forecasting with prediction error minimization (PEM) and data substitution [52].

  6. 6)

    Evaluation of quality: Methods to evaluate quality are discussed in Section 4.1. The approach for performance evaluation is to first study the impacts of device calibration and network conditions on quality, and then examine how poor quality reduces application performance [42]. Two effective methods are proposed to evaluate the impact of quality on performance: ① Benchmarking, which tests an application multiple times with numerous erroneous datasets and contrasts the results with those from datasets with no known errors, and ② Standardization, which documents, for each application, the level of tolerable errors.
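
The completeness attributes described in item 1) above (gap rate, mean gap size and largest known gap) can be computed directly from the indices of the frames that actually arrived. The following is a minimal sketch under the assumption that frames carry consecutive integer indices and that a "gap" is a run of consecutive missing frames; the function name `completeness_metrics` and the sample numbers are illustrative only.

```python
def completeness_metrics(received, expected_count, duration_s):
    """Summarize gaps in a stream of frames.

    received       -- indices of frames that arrived, each in [0, expected_count)
    expected_count -- number of frames that should have arrived
    duration_s     -- length of the observation window in seconds
    """
    present = set(received)
    gaps = []          # length of each run of consecutive missing frames
    run = 0
    for index in range(expected_count):
        if index in present:
            if run:
                gaps.append(run)
            run = 0
        else:
            run += 1
    if run:            # a gap that extends to the end of the window
        gaps.append(run)
    return {
        "gap_rate": len(gaps) / duration_s,                      # gaps per second
        "mean_gap_size": sum(gaps) / len(gaps) if gaps else 0.0,  # frames
        "largest_gap": max(gaps, default=0),                     # frames
    }


# Hypothetical one-second window at 30 frames per second with frames
# 5, 6, 7 and 20 missing: two gaps, of sizes 3 and 1.
missing = {5, 6, 7, 20}
received = [i for i in range(30) if i not in missing]
metrics = completeness_metrics(received, expected_count=30, duration_s=1.0)
```

Tracking these three numbers over time, rather than only flagging individual missing frames, is one simple way to notice a change in the pattern of missingness that may warrant a closer look.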

3.2 Cybersecurity challenges

Synchrophasors cater to applications like state estimation, contingency analysis and optimal power flow that need real-time, high-resolution data measurement, communication and analytics [119]. Therefore, a successful attack on these devices might cause erroneous SA or cascading failures [56, 120]. Yet, many industrial organizations do not consider synchrophasors to be critical cyber assets. The recent cyberattacks on the smart grid listed in Table 1 mostly used powerful malware like worms, viruses or Trojan horses, but a few attacks, like the one on the Pacific Gas & Electric transmission substation, relied on physical means. These attacks jeopardized not just the availability of power but also that of control data (information). In Table 4, the cybersecurity challenges of synchrophasors are categorized into: ① Device, Aggregator; ② Communication; and ③ Control center application.

Table 4 Summary of existing research in synchrophasor cybersecurity challenges and solutions
  1. 1)

    Device, Aggregator: The NASPI network (NASPInet) is logically capable of integrating WAMS across multiple geographically distant organizations using phasor gateways (PGWs). Attacks at this level compromise data integrity, targeting devices from individual PMUs to PDCs, SuperPDCs or even PGWs. Some attacks include: ① tampering with the signal measurement units of devices through interference; ② illicitly changing the calibration of devices so that they report erroneous readings; ③ forging data to reflect wrong measurements; and ④ GPS spoofing, which broadcasts fabricated signals to the receiver to yield erroneous synchronization of the computed phasors, modifies satellite positions, or replays legitimate GPS signals at later timestamps [54].

    GPS spoofing can be mitigated by enabling the receiver to predict visible GPS satellites at a given position and time instant and use the coarse/acquisition (C/A) code from those satellites. Another strategy compares the measured GPS signal to the estimated signal and computes the anomaly error which must have an accuracy of ≤ 40 ns for nearly 95% of the values according to IEEE C37.118 [44]. Synchrophasors must be subject to rigorous testing before installation. Some methods include port scans, device security feature robustness, network congestion testing, denial of service testing, network traffic sniffing and disclosure testing [55]. These tests should be periodically conducted by certified white hat penetration testers after installation. Regular patches and configuration updates must be made down to the end-device level.

  2. 2)

    Communication: Synchrophasors support bidirectional communication channels, where data measurements flow from devices to the control center while control signals flow the other way. The vulnerabilities of the protocols used by the devices also affect the overall security. Attacks on communication channels compromise integrity, availability and confidentiality. Some attacks include: ① Denial of service (DoS), which overwhelms PMUs, PDCs or other aggregation devices higher in the hierarchy with bogus frames so that legitimate frames are lost, delayed, denied or dropped; ② Man-in-the-middle (MITM) attacks, in which a malicious entity poses as a PDC (to a PMU) or a PGW (to a PDC) and sends malicious commands that cause PMUs/PDCs to behave abnormally and trigger failures; ③ False data injection (FDI), which intercepts frames over the channel and alters or replaces them with malicious information that then propagates to higher levels of the WAMS; ④ Snooping, in which the attacker eavesdrops on the channel for incoming or outgoing frames, typically not modifying or stealing but just capturing a copy of that information for packet replay or espionage; and ⑤ Delay, caused by compromising communication routers so that they deliberately induce propagation latencies that critically affect the grid’s SA.

    Many authentication and authorization algorithms have been proposed to secure synchrophasor data over communication channels [57]. These methods range from conventional encryption to cyber trust. Due to the ubiquity and wide geographic spread of these devices, key distribution and management become a problem. Mutual authentication has also been proposed to account for trust [61], and decentralized, blockchain-based trust acquisition is being considered too. The publish-subscribe hub-spoke architecture proposed by NASPInet supports dynamic sharing of device data to alleviate shortcomings of the communication medium like delays and latencies. Standards like IEC 61850-90-5 recommend a trusted key distribution center to generate and distribute keys that meet system requirements [63,64,65,66].

  3. 3)

    Application: Although applications are protected by enterprise security tools for intrusion detection and prevention, virtualization, segmentation, authentication, authorization and access control, cyberattacks still proliferate [67, 68]. Any successful attack at the other two levels that is perpetrated in a manner undetectable by the enterprise security systems can pose a significant threat. The attacks at this level are the most dangerous, since crucial power system applications use data from WAMS to conduct analyses that address reliability, power quality, network topology, and faults. An adverse impact on these calculations could compromise the “self-healing” nature of the grid. More recent solutions include game theory, machine learning, proactive data visualization, and defense-in-depth [12, 123].

3.2.1 Evaluation of security

Several works have tested the resilience of PMUs and PDCs against different attacks. The authors in [124] conducted penetration testing of a synchrophasor network in the IEEE 68-bus system to map vulnerabilities against the common vulnerabilities and exposures (CVE) database. Potential corrective measures to ensure the security of PMUs and PDCs are proposed in [125]. Considering security at the substation and information levels, the authors provide a wide range of tools to mitigate breaches on both fronts. A multilayered architecture at the substation is proposed, where different levels of data abstraction are provided between the PMUs and the external environment, supplemented by firewalls, secure user datagram protocol (UDP) for communication over untrusted networks, and remote access using secure shell (SSH).

4 Data quality-cybersecurity dependency

The severity of an attack can be understood from the extent of its impacts on the targeted system. With the smart grid encouraging interoperability between devices, information, applications, and protocols, a transparent and direct information exchange is now feasible. This also means that if information in one of the interconnected systems is infected, it is bound to propagate to other systems upon exchange, affecting the whole network. Synchrophasor devices harbor such vulnerabilities, as summarized in Section 3.2. However, to mitigate cyberattacks on interconnected systems, the relationship between devices and data must be known.

Table 5 summarizes key interdependencies between the two challenges. There is a tight coupling between data quality and cyber-attacks, implying it is wise to study synchrophasor cybersecurity by accounting for the impacts on quality. In most attacks, plausibility, completeness, accuracy and consistency are primarily impacted [126, 127]. In Section 4.1, specific evaluation methods for quantifying this relationship are reviewed. Section 4.2 looks at how data quality characteristics can be used as markers to detect potential cyber-attacks within the context of synchrophasors. Results from these subsections are summarized in Tables 6 and 7, respectively.

Table 5 Summary of the interdependency between quality and cybersecurity challenges
Table 6 Summary of evaluation methods for quality (DQ) and cybersecurity (CS) issues
Table 7 Summary showing how quality can help identify cybersecurity issues

4.1 Interdependency evaluation methods

Next to communications, cybersecurity was found to have the largest impact on the design and installation costs of synchrophasors [141]. This is because synchrophasors are critical to the mission-support systems of the grid. The report also identifies different practical ways for utilities to mitigate quality issues like accuracy, timeliness and consistency. Some methods include employing dedicated communication channels between PMUs and PDCs, encrypting PMU data before communication, and hardening communication endpoints using firewalls and routers. The report, however, does not delve into the details of how such methods could impact latency (and hence, timeliness) and availability of the data.

Given different manufacturers of devices, there will be differences in measurement and calibration quality despite adhering to the standards. The varying application requirements cause differences in application-level PMU performance, of which data quality is a major one. The static and dynamic PMU testing efforts of the Performance and Standards Task Team (PSTT) of NASPI and the PMU performance characterization are briefly summarized in [142]. In it, the different steady-state tests performed on magnitude, phase and frequency evaluate their conformance to accuracy requirements, which is an important attribute of data quality and is a direct target of many cyberattacks. Given the impact of instrumentation channels on the quality, they have been well-characterized and evaluated for impacts on accuracy in the literature. The errors induced by them could be rectified through model-based correction algorithms and state estimation based error filtering. Some other avenues where data quality could be evaluated include the cable configurations, testing and validating the devices to ensure accurate, consistent performance and interoperability at all levels [143, 144]. Although not explicit, these works hint at the improvement in the resilience of synchrophasor devices against potentially malicious activities by accounting for proper testing methods to characterize and evaluate the different sources of errors prior to deployment that might contribute to poor quality.

Further conclusions can be drawn from [145]. The report by the Pacific Northwest National Laboratory (PNNL) analyzes existing synchrophasor networks in terms of their communication and information-level interoperability, security and performance. It concluded that latency is a key issue for future synchrophasor designs, and that PDC functionality is expected to compound it. It also emphasized that substations generally did not employ redundancy, and that there is little consistency in the adoption of security methods for synchrophasor networks. Some of the tools in use include link-level encryption, virtual private networks (VPNs), IDS/IPS, firewalls and access control lists (ACLs). Further, existing data quality checking methods locate a compromise in integrity by identifying faulted data values (due to measurement errors, communication delays or external events) but not those resulting from device tampering, MITM, spoofing or FDI. Since both faults and attacks can have the same impact on quality, it is important to differentiate the two causes while checking attributes such as accuracy, consistency and timeliness.

To summarize, the following measures can be used as metrics to quantify data quality: TVE, errors in magnitude, phase, frequency and rate of change of frequency (ROCOF), harmonics and noise for measurement accuracy; comparison between measured and expected results, confidence interval and precision for measurement specifiers; temporal, geospatial and topological accuracy for attribute accuracy; device model specifications, geospatial and topological coordinates, coverage and content for origination; persistence in header and data frames, standards compliance, reporting rate and order for logical consistency; and gap rate, gap size and largest known gap for completeness. Benchmarking and standardization are two methods that can be used to evaluate data quality. Similarly, cybersecurity can be quantified by conducting extensive penetration testing of synchrophasor networks integrated into benchmark IEEE bus systems for different types of attacks (DoS, MITM, FDI, spoofing, probing, cache poisoning) and discovering potential vulnerabilities that could be exploited. While doing so, it is important to repeat the evaluation of the quality attributes using the above metrics, and to explore how they are impacted by the specific attacks and whether they violate the industry standard requirements specified for different applications.
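
Of the metrics listed above, TVE is the most compact to state: it is the magnitude of the difference between the measured and reference phasors, divided by the magnitude of the reference phasor. A minimal sketch follows, representing phasors as Python complex numbers; the helper names `phasor` and `total_vector_error` and the example values are assumptions made for illustration.

```python
import cmath
import math


def phasor(magnitude: float, angle_deg: float) -> complex:
    """Build a phasor from a magnitude and an angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))


def total_vector_error(measured: complex, reference: complex) -> float:
    """TVE as a fraction: |measured - reference| / |reference|."""
    return abs(measured - reference) / abs(reference)


reference = phasor(1.0, 0.0)

# A 1 % magnitude error with no angle error gives a TVE of about 0.01.
tve_magnitude_only = total_vector_error(phasor(1.01, 0.0), reference)

# An angle error of 0.01 rad (about 0.573 degrees) with no magnitude error
# also gives a TVE of about 0.01.
tve_angle_only = total_vector_error(phasor(1.0, math.degrees(0.01)), reference)
```

Because TVE folds magnitude and angle errors into a single number, it cannot by itself say which of the two has degraded; the separate magnitude and phase error metrics listed above are needed for that.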

4.2 Addressing cyber-attacks using quality issues

It can be seen from Table 7 that successful cyberattacks compromise synchrophasor data quality since the security requirements are violated [146]. Given that synchrophasors use TCP/UDP at the transport layer for their communications, attacks typical of the TCP/IP stack, such as DoS, MITM, packet replay or spoofing, are possible in the synchrophasor domain as well.

Physical attacks like device tampering cause loss or theft of critical information, which is easily observed through large gap sizes, poor accuracy in the obtained values and unreliable origin. The lost data is typically handled through substitution, either statistical or intelligent [128, 129]. The best way to prevent physical attacks such as cable disconnects and direct damage to devices is to ensure that the devices are shielded from weather and from human interference.

Spoofing synchrophasor data is achievable through polynomial fitting or data mirroring techniques. The impact of such attacks on quality manifests as outliers or noise. Several methods have been proposed to counter these attacks, including intra-PMU and inter-PMU correlations, which determine the relationship between the parameters of a single PMU and across PMUs in a locality, respectively, and machine learning techniques such as support vector machines (SVMs) [130,131,132].
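The inter-PMU correlation idea can be sketched as follows, assuming that two PMUs in the same locality normally report strongly correlated voltage magnitudes over a short window. The Pearson coefficient, the threshold of 0.9 and the sample values are assumptions chosen for illustration; the cited works may use different statistics and windows.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        # A flat series carries no trend to compare; treat as uncorrelated.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / norm


def is_decorrelated(series_a, series_b, threshold=0.9):
    """Flag a window in which neighbouring PMUs stop tracking each other."""
    return pearson(series_a, series_b) < threshold


if __name__ == "__main__":
    # Hypothetical per-unit voltage magnitudes from two nearby PMUs.
    pmu_a = [1.00, 1.01, 1.02, 1.01, 1.00, 0.99, 0.98, 0.99]
    pmu_b = [v + 0.002 for v in pmu_a]          # tracks pmu_a closely
    spoofed = [0.99, 0.98, 0.99, 1.00, 1.01, 1.02, 1.01, 1.00]
    print(is_decorrelated(pmu_a, pmu_b))
    print(is_decorrelated(pmu_a, spoofed))
```

A low correlation is only a marker: a genuine local disturbance can also decorrelate neighbouring PMUs, so such a flag would need corroboration before an attack is declared.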

GPS spoofing exploits publicly available civilian GPS signals, transmitted over the air or through a cable, to produce signals that initially align with the original but slowly increase in power until they drown out the authentic signal, thereby compromising the receiver [54, 133]. By introducing errors in time synchronization, these attacks induce changes in data consistency and plausibility, which can be used as markers to identify the likelihood of the attack [134, 135, 147].

In a successful DoS attack where multiple synchrophasor devices are compromised, packet delay or loss is observed. This impact on quality can serve as a clue to the onset of DoS-style attacks. Typical solutions involve adding inline blocking tools, using high-bandwidth connections, disabling IP broadcasts and hardening ports.
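A minimal sketch of how delay and loss could be turned into such a clue is given below. It assumes that each frame carries its measurement timestamp and that the PDC records an arrival time, with `None` marking a frame that never arrived. The latency budget of 0.1 s, the alarm fraction of 0.3 and the function names are hypothetical values chosen for this example.

```python
def late_or_lost_fraction(frames, max_latency=0.1):
    """Fraction of frames that arrived late or not at all.

    frames: list of (timestamp_s, arrival_s) pairs; arrival_s is None
    when the frame was lost.
    """
    if not frames:
        raise ValueError("no frames to evaluate")
    bad = 0
    for timestamp, arrival in frames:
        if arrival is None or (arrival - timestamp) > max_latency:
            bad += 1
    return bad / len(frames)


def possible_dos(frames, max_latency=0.1, alarm_fraction=0.3):
    """Raise a marker when too many frames in the window are late or lost."""
    return late_or_lost_fraction(frames, max_latency) > alarm_fraction


if __name__ == "__main__":
    # Hypothetical window: two timely frames, two delayed, one lost.
    window = [(0.0, 0.02), (0.1, 0.13), (0.2, 0.9), (0.3, 1.2), (0.4, None)]
    print(late_or_lost_fraction(window))
    print(possible_dos(window))
```

Since congestion and packet collision produce the same symptoms, a marker of this kind indicates only that further investigation is warranted.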

MITM is possible in synchrophasors where the attacker poses as a legitimate PDC to the PMUs and vice versa, thereby intercepting and/or modifying all messages exchanged. Quality checking methods notice this as poor accuracy and consistency between the values sent by the PMU and those received by the PDC. It can be avoided by having the devices employ mutual authentication and a digital certificate mechanism with actively managed certificate revocation lists (CRLs) and certificate authorities [59, 129].

FDI impacts the consistency, accuracy and plausibility of the data. The effects are typically observed as spatio-temporal outliers in the data. Quality checking methods check for this anomaly and may employ correlation across different timestamps to identify the corruption of data. FDI is one of the most widely explored attacks in the synchrophasor domain, with solutions such as determining the mismatch between the values obtained from PMUs and those observed in SCADA, monitoring the line impedances, which are affected when data is manipulated, and using density-based local outlier factor (LOF) analysis [136,137,138,139,140].
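The PMU-versus-SCADA mismatch check mentioned above can be sketched in a few lines, assuming that both sources provide a voltage magnitude per bus in the same units. The bus names, the values and the 2% tolerance are hypothetical; this is a simplified illustration rather than the method of any cited work.

```python
def flag_mismatches(pmu, scada, tolerance=0.02):
    """Return the buses whose PMU value deviates from SCADA beyond tolerance.

    pmu, scada: dicts mapping bus name -> voltage magnitude (same units).
    tolerance:  allowed relative mismatch (assumed value, 2% by default).
    """
    flagged = []
    # Only buses observed by both systems can be cross-checked.
    for bus in sorted(pmu.keys() & scada.keys()):
        reference = scada[bus]
        if reference == 0:
            continue  # relative mismatch is undefined for a zero reference
        if abs(pmu[bus] - reference) / abs(reference) > tolerance:
            flagged.append(bus)
    return flagged


if __name__ == "__main__":
    # Hypothetical per-unit voltage magnitudes.
    pmu_values = {"bus1": 1.00, "bus2": 1.08, "bus3": 0.97}
    scada_values = {"bus1": 1.01, "bus2": 1.00, "bus3": 0.98}
    print(flag_mismatches(pmu_values, scada_values))
```

In practice the two sources report at very different rates, so the values would first need to be aligned in time; the sketch assumes that alignment has already been done.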

Sometimes, attackers simply capture the packets flowing in a channel with an intent to listen. Such sniffing/snooping attacks have been conducted using Wireshark and have revealed that messages are exchanged in plaintext. This attack is difficult to detect using data quality checking methods because, most often, no quality characteristic is impacted: the attackers do not actively affect the data. However, technologies like VPN, encryption of selective messages (to reduce the overall processing overhead), transport layer security (TLS)/secure sockets layer (SSL) or secure shell (SSH) can be used to mitigate them. While TLS has been shown to be susceptible to poisoning attacks and VPN to side-channel attacks, careful network design can account for them [129, 148].

With the increased frequency of campaign efforts and nation-sponsored attacks against the grid, synchrophasors could be lucrative targets for sophisticated attacks like advanced persistent threats (APTs), social engineering, watering-hole attacks and malware-based intrusions [149,150,151,152,153]. Because these attacks scale beyond specific devices in the synchrophasor hierarchy, quality checking methods alone would not be sufficient [123]. A defense-in-depth model would play a critical role when augmented with stakeholder interactions, awareness and training, and intelligent solutions such as machine learning for attack data classification and/or event prediction, root-cause analysis of observed events, evolving defense topographies built using moving target defense, and advanced visualization techniques for efficient cognition of events.

The key takeaway from this section is that impacts on data quality can provide strong markers for an underlying cyber-attack. Noise, outliers and missing values are all commonly observed issues which quality checking methods may be programmed to detect, analyze and base decisions on. Certain sophisticated attacks like APTs, insider threats, sniffing and social engineering have indirect impacts on quality which a checking method may not be able to detect with enough confidence or precision. Additional solutions are required to mitigate such attacks in the synchrophasor domain. These solutions include statistical methods like divergence, correlation, regression and substitution; intelligent methods like neural networks and evolutionary algorithms for event classification and prediction, and logistic regression for substitution; technologies like VPNs, firewalls, ID/IPS, anomaly detectors, selective encryption, port hardening, network isolation and the use of TLS/SSL and SSH; and human-in-the-loop solutions like advanced visualization techniques, awareness and training, and stakeholder engagement. While the impacts on quality can also be due to underlying device or measurement errors, most works in the literature assume that the data has been intentionally subjected to delay/loss, manipulation or theft. This paves the way for the recommendation that upcoming research in this area must look at ways to differentiate the impacts on data quality due to attacks from those due to errors.

5 Future directions of research and conclusion

The future directions of research in the areas of synchrophasor data quality, cybersecurity and communications are multi-faceted. Addressing data quality challenges must begin with a strong push for the adoption of industry-wide, vendor-agnostic data management, processing and storage standards for smart grid. Most recent cyber-attacks were successful because of the cognitive gap, that is, the difference between the speed at which machine data is created and the speed at which the information generated by automated vulnerability detection tools is understood [123]. The design of synchrophasor devices is also expected to improve in the future [103]. With the quality challenges in mind, an improvement to PDC design called flexible integrated synchrophasor system (FIPS) was proposed to minimize issues in quality and communication. It tackles specific PDC tasks such as data alignment, employs cryptographic methods to ensure confidential exchange of data without jeopardizing integrity, and establishes relevance to the NASPInet [121]. To ensure device and application-level interoperability, the development of technical standards and conformance testing rules is expected. Further, the emergence of distribution-level µ−PMUs will create the need for measurement, communication, quality and security standards. Finally, with the deployment of distributed renewable sources, electric and autonomous vehicles, energy storage and transactive energy, there is a strong impetus for enhancing the technologies behind monitoring and control, in which synchrophasors will play a major role [141].

To conclude, while existing research has focused on the synchrophasor challenges of quality and cybersecurity individually, their interdependency has largely been ignored. This paper makes one of the first attempts at highlighting the impacts of cyber-attacks on various quality attributes. It recommends that future research on the design and development of security solutions should account for their impacts on quality as well, and that quality checking methods can use different quality characteristics to flag potential cyber-attacks. Plausibility, completeness, accuracy and consistency are among the attributes most adversely impacted by the majority of attacks on synchrophasors. At the same time, not all cases of poor data quality imply a successful cyber-attack. Different metrics that could be used to quantify quality attributes were summarized, and the methods that help evaluate the impacts of quality and security on performance were briefly highlighted. This paper serves as a starting point for researchers entering these areas, as it summarizes the two challenges and establishes their interdependency and relevance to smart grid security.