1 Introduction

With mounting concerns over global warming and fossil fuel depletion, in recent years wind power generation has been greatly increased in the worldwide. However, due to the stochastic behaviors of wind resources cannot be fully predicted, the induced uncertainties significantly influence the power system operations, such as reserve deployment, unit commitment, power dispatch, power system’s security and reliability assessments, etc [16]. Therefore, a thorough uncertainty model for wind power forecasting error is imperative for power system operators and participants, in terms of evaluating the benefits and costs of wind power integration and thus supporting trade-off decisions.

The short-term uncertainty of wind power generation has been typically modelled by the distribution of the wind power forecasting error. Numerous distributions and modelling methods have been proposed, such as the Gaussian distribution [7], Beta distribution [8] and non-parametric approach [9]. These studies mainly focused on the error distribution of a single wind farm based on a per-prediction horizon. However, due to the similar meteorological conditions, the errors of close prediction horizons and nearby farms influence with each other. The temporal dependence of forecast series is crucial for multistage decision-making problems, such as energy storage planning and scheduling. In parallel, the spatial dependence of multiple farms has been particular concerned to cluster operators in terms of transmission congestion and reserve deployment. Thus more detailed models which consider the temporal or spatial dependence of wind power forecasting errors have drawn wide attentions in wind power integration applications.

Under this background, researchers are inspired to develop feasible models for the correlation structure of wind power forecasting [1016]. General methods can be roughly classified into three main kinds: time series model, Markov model, and high-dimensional joint probability model. Time series model is able to discover change rules of historical series, the most widely-used of which is auto regression moving average (ARMA) model. ARMA model has advantages of simple structure and easily fitting [10]. However, ARMA model faces difficulty in dealing with non-stationary stochastic process. Additionally, since there exists essential difference in characteristics between temporal dependence and spatial dependence, simply using the same noise signal for temporal and spatial modelling lacks theoretical support [11]. Markov model depicts the state transition distribution of random variables, with advantages of low requirement to sample size and high fitting effect for short-term characteristics [12]. Yet, due to the restriction of model order, it may be difficult to apply Markov model in uncertainty modelling for multiple wind farms [13]. By regarding forecasting errors of different wind farms and different times as dependent multivariate random variables, high-dimensional joint probability model performs mathematical and statistical description for uncertainty characteristics through joint probability distribution function. Copula theory is a typical method in this area, which has been widely applied in financial market analyses, portfolio investments, and risk assessments [14, 15].

Recently copula theory has been applied in power system uncertainty analyses. The spatial dependence of regional wind power outputs using Gaussian Copula was investigated in [17]. Similar methods were used to model the temporal relevance of forecast errors [1821]. Although Gaussian distribution has been the default Copula choice in numerous studies for practical reasons, its appropriateness for wind power forecasting dependent structure has not been rigorously investigated. Reference [22] evaluated the goodness-of-fit of several bivariate elliptical and Archimedean Copulas for modelling the wind power spatial dependence and concluded that Gumbel Copula was preferred over Gaussian Copula. However, when extending the bivariate model to higher dimensions, Díaz [23] noted that the multivariate Gumbel Copula did not outperform Gaussian Copula due to its monotonous structure and fewer parameters [24]. Thus, although Archimedean Copula family provides various dependent structures, its applicability in high dimensions is significantly restricted due to a lack of feasible extension methods.

In response to these limitations, this paper introduces Pair-Copula theory to model the temporal and spatial dependence of wind power forecasting errors. The paper is organized as follows. Section 2 introduces the Pair-Copula theory and proposes the detailed procedures of modelling wind power forecasting dependence and sampling wind power uncertainty scenarios. Based on the regional wind power dataset from the National Renewable Energy Laboratory (NREL), the stochastic characteristics of temporal and spatial dependence have been analyzed and modelled in Sect. 3. In Sect. 4, different dependent models have been compared to validate the effectiveness of the proposed method, and the effects of temporal and spatial dependence on power system operations are addressed. Finally, concluding remarks have been presented in Sect. 5.

2 Pair-Copula theory based uncertainty modelling algorithm

2.1 Review of Copula and Pair-Copula theory

2.1.1 Copula theory

Assume {x 1, x 2,…, x n } are correlated random variables. According to the Sklar theorem [25], Copula function links the margin distribution functions to form the joint multivariate cumulative distribution function (CDF), as shown below.

$$F(x_{1} ,x_{2} , \ldots ,x_{n} ) = C_{1,2, \ldots ,n} (F(x_{1} ),F(x_{2} ), \ldots ,F(x_{n} ))$$
(1)

where F(x i ) (i = 1, 2, …, n) is the margin distribution of x i ; F(x 1, x 2, …, x n ) is the joint distribution function. Thus, C is the Copula function.

The corresponding Copula probability density function (PDF) is defined as follows:

$$\begin{aligned} f(x_{1} ,x_{2} , \ldots ,x_{n} ) & = c_{1,2, \cdots ,n} (F(x_{1} ),F(x_{2} ), \ldots ,F(x_{n} )) \\ & \quad \cdot f(x_{1} )f(x_{2} ) \ldots f(x_{n} ) \\ \end{aligned}$$
(2)

where f (x i ) is the margin density function; f(x 1, x 2,…, x n ) is the joint PDF; c is the Copula density function.

The Copula modelling approach has several attractive properties [26]: 1) it allows the marginal distributions and dependent structure to be modelled separately; 2) it does not require the margins to be normally or uniformly distributed; 3) it contains various Copula functions, which makes it flexible in modelling complex dependences, including nonlinear and asymmetric relationships.

2.1.2 Pair-Copula theory

In practical applications, high-dimensional dependence requires feasible multivariate Copulas and even flexible combinations of different Copula functions. To resolve this problem, hierarchical and nested approaches were proposed, which used sub-models of lower dimensions to construct the joint Copula function [27, 28]. However, these techniques faced the compatibility problem. Strict limitations were imposed on the selected types and parameters to satisfy the nested condition [28]. On this basis, the Pair-Copula construction method was proposed to build flexible multivariate distributions.

Based on the conditional probability theory, the joint PDF can be formulated as follows:

$$\begin{aligned} f(x_{1} ,x_{2} , \ldots ,x_{n} ) & = f(x_{1} )f(x_{2} |x_{1} ) \ldots f(x_{n} |x_{1} ,x_{2} , \ldots ,x_{n - 1} ) \\ \, & = \prod\limits_{t = 2}^{n} {f(x_{t} |x_{1} ,x_{2} , \ldots ,x_{t - 1} )} \cdot f(x_{1} ) \\ \end{aligned}$$
(3)

According to the definition of Copula density in (2), f(x t |x 1, x 2,…, x t−1) can be calculated recursively as:

$$\begin{aligned} f(x_{t} |x_{1} ,x_{2} , \ldots ,x_{t - 1} ) & = c_{1,t|2,3, \ldots ,t - 1} f(x_{t} |x_{2} ,x_{3} , \ldots ,x_{t - 1} ){ = } \cdots \\ & = \left(\prod\limits_{s = 1}^{t - 2} {c_{s,t|s + 1, \ldots ,t - 1} }\right )c_{t - 1,t} f(x_{t} ) \\ \end{aligned}$$
(4)

Thus the joint density of (3) can be further decomposed as [29]:

$$f(x_{1} ,x_{2} , \ldots ,x_{n} ) = \left( {\prod\limits_{i = 1}^{n - 1} {\prod\limits_{j = 1}^{n - i} {c_{j,i + j|j + 1, \ldots ,i + j - 1} } } } \right)\left( {\prod\limits_{k = 1}^{n} {f(x_{k} )} } \right)$$
(5)

where c j,i+j|j+1,…,i+j−1 is an abbreviated expression for the conditional bivariate Copula density c j,i+j|j+1,…,i+j−1(F(x j |x j+1,…,x i+j−1), (x i+j |x j+1,…,x i+j−1)). Equation (5) indicates that the joint density consists of the conditional densities of all pairs and the marginal densities, which explain why this form of decomposition is called Pair-Copula.

Pair-Copula theory maintains the merits of Copula theory, and further provides a flexible and intuitive way to build the multivariate Copula using bivariate Copula blocks. Compared with the other multivariate Copula extension methods, Pair-Copula imposes no restrictions on the selected types or parameters and is thus more flexible and practical for modelling and analysing complex dependences.

2.2 Model construction for wind power forecasting errors

To specify the Pair-Copula model for the wind power forecasting uncertainty, the temporal dependence model are derived as an example, and the spatial dependence can be constructed in the same manner. Suppose that {e 1, e 2, …, e n } are the forecast errors of prediction horizons. According to (5), the model construction includes two parts: 1) margin distribution estimation and 2) dependent model construction, i.e., the estimation of every bivariate pair density c j,i+j|j+1,…,i+j−1 evaluated at conditional CDF F(e j |e j+1,…,e i+j−1) and F(e i+j |e j+1,…,e i+j−1). As the modelling methods for the marginal distribution have been widely investigated, the estimation of dependent model is proposed in the following.

2.2.1 Parameter estimation and Copula selection

For the given candidate bivariate Copula, the corresponding parameters are estimated by maximum log-likelihood estimate (MLE) algorithm [30] in this paper. The parameter θ j,i+j of pair c j,i+j is obtained as:

$$\begin{aligned} & \hbox{max} \ln l(\theta_{j,i + j} ) \\ & \quad = \hbox{max} \sum\limits_{k = 1}^{m} {\ln c_{j,i + j} (F(e_{j}^{k} |e_{j + 1}^{k} , \ldots ,e_{i + j - 1}^{k}),F(e_{i + j}^{k} |e_{j + 1}^{k} , \ldots ,e_{i + j - 1}^{k} ))} \\ \end{aligned}$$
(6)

where e k j is the k th observation of historical data e j ; m is the number of observations.

To find the best-fitted family for a pair, several potential Copula functions are selected and parameters are estimated for each candidate. Then the empirical CDF, as formulated in (7), is regarded as the criterion to evaluate the goodness-of-fit for candidates. The Copula with the closest Euclidean distance to empirical CDF will be selected.

$$\begin{aligned} & C_{n} (x,y) \\ & \quad = \frac{1}{m}\sum\limits_{k = 1}^{m} I (F(e_{j}^{k} |e_{j + 1}^{k} , \ldots ,e_{i + j - 1}^{k} ) \le x)I(F(e_{i + j}^{k} |e_{j + 1}^{k} , \ldots ,e_{i + j - 1}^{k} ) \le y) \\ \end{aligned}$$
(7)

where (x, y) is the sample point in distributed space; I(·) is the indicative function. When the defined condition is satisfied in parenthesis, I(·) = 1, otherwise I(·) = 0.

2.2.2 Dependent structure estimation

As mentioned above, the Copula densities are fitted based on conditional CDFs, which can be calculated by [31]:

$$F(e_{j} |D) = \frac{{\partial C_{{e_{j} ,e_{k} |(D\backslash e_{k} )}} \left[ {F(e_{j} {\mathbf{|(}}D\backslash e_{k} )),F(e_{k} |{\mathbf{(}}D\backslash e_{k} ))} \right]}}{{\partial F(e_{k} |{\mathbf{(}}D\backslash e_{k} ))}}$$
(8)

where D is the conditional variable set, \(e_{k} \in D\); \(D\backslash e_{k}\) means removing element e k from D; \(\partial\) is the partial differential signal.

Equation (8) indicates the conditional CDFs used in high dimensions can be calculated by the fitted bivariate Copulas of lower dimensions. Therefore, the n(n − 1)/2 underestimated pairs c j,i+j|j+1,…,i+j−1 can be divided into (n − 1) trees according to the value of i, as shown in Fig. 1. The pair densities are estimated in the sequence of the tree number. As the Copula functions in lower trees have been selected and parameterized, the fitted Copula function can be applied to calculate conditional CDFs for the next tree by (8). Thus the entire model is achieved in a recursive manner.

Fig. 1
figure 1

Mechanism sketch of dependent model construction process

Through the construction process, it can be seen that the Pair-Copula method allows mixing different Copula types and sets no limitations on types or parameters, thus it can select the best-fitted distributions and gain a higher accuracy. When the Gaussian Copula is nominated for all pairs, the Pair-Copula model is proven to be equal to the multivariate Gaussian Copula. Therefore, the Pair-Copula model can replace the conventional multivariate models without sacrificing the accuracy of the model.

2.3 Wind power uncertainty scenario simulation

The most important application of the wind power uncertainty model is to evaluate the benefits and costs of wind power integration. For end-users, Monte Carlo simulated scenarios provide uncertainty information in a discrete form, which are practical for risk assessments and optimization programming. The simulation process is also based on conditional probability theory.

First, the independent uniform variables {w 1, …, w n } is randomized to follow the stratified sampling principle. Next the correlated uniform forecast errors {z 1, …, z n } for horizon 1 to n are calculated by (9). The conditional CDFs of (9) have been developed when fitting the dependent model; thus, they can be used directly in the simulation process. The uniform errors then will be transferred back to the original domain by inverse marginal CDFs, and finally form the wind power scenarios.

$$\left\{ \begin{aligned} z_{1} = w_{1} \hfill \\ z_{2} = F^{ - 1} (w_{2} |z_{1} ) \hfill \\ \vdots \hfill \\ z_{n} = F^{ - 1} (w_{n} |z_{1} ,z_{2} , \ldots ,z_{n - 1} ) \hfill \\ \end{aligned} \right.$$
(9)

Therefore, the entire procedure for modelling and generating wind power forecasting scenarios considering the temporal dependence is described as follows, and the mechanism sketch of each step is drawn in Fig. 2 for further clarification.

Fig. 2
figure 2

Mechanism sketch of entire modelling and simulation process

  • Step 1 Form the n-dimensional dataset E = {e 1, e 2, …, e n } of prediction errors for multiple forecast horizons or wind farms using the historical data.

  • Step 2 Develop the marginal distribution F(e i ) (i = 1, 2,.., n) and transform the data from the actual domain to the uniform domain.

  • Step 3 Construct the multivariate dependent model based on the uniform data.

  • Step 4 Sample correlated uniform error {z 1, z 2,…, z n } using (9) based on the model estimated in Step 3 .

  • Step 5 Transform samples back to the actual domain using the inverse margin e i  = F −1(z i ).

  • Step 6 Obtain wind power uncertainty scenarios by adding the given point forecasts and simulated errors.

Further extension to spatial-temporal dependence modelling also follows a similar principle, but the model dimension is higher.

3 Modelling temporal and spatial dependence of wind power forecasting errors

3.1 Dataset

In this section, the temporal and spatial dependences of wind power forecasting errors are modelled based on the Wind Integration datasets of the NREL [32]. The wind speeds in the dataset are generated by MASS v.6.8 mesoscale model, which is initialized with input from Atmospheric Research Global Reanalysis and assimilates both surface and rawindsonde data. The wind power outputs are computed by composite turbine power curves, and the forecasts are produced by a statistical forecast tool named SynForecast. Though the data are simulated, they are verified to contain similar stochastic features of actual wind farms and reflect the geographic diversity of wind power outputs. Thus they are suitable for researches of wind power temporal and spatial relevance [22, 23, 33].

In this paper, regional wind plants in Vermont are selected. The data consist of 17 onshore wind farms varying in capacity from 100 to 180 MW and distributed spatially from 10 to 250 km. Detailed spatial locations of these wind farms are presented in Table 1. The wind power outputs and day-ahead forecasts span 3 years and have a 1 h resolution. All data have been normalized by the rated capacity.

Table 1 Detailed spatial locations of concerning 17 wind farms

3.2 Temporal dependence model

The Gaussian model for temporal dependence is first built to analyse the linear dependence. Figure 3a shows the fitted parameters of the Gaussian model, where the matrix element in position (j, i) represents the linear correlation coefficient of errors of horizon j and i. High correlations are observed for close prediction horizons, whereas the relevance decreases dramatically as the intervals increase. Furthermore, Fig. 3b presents the colour histogram of the forecast errors in horizons 1 and 2 in the uniform domain. The dependent structure of adjacent errors is generally elliptical. Moreover, the tail is steep and symmetric, which highlights a higher coherence when large prediction errors occur. This characteristic should be addressed in multistage decision-making problems, as it would significantly affect the risk index.

Fig. 3
figure 3

Data analysis for temporal dependence

To develop the Pair-Copula model, five typical Copula function, i.e. Gaussian, Student t, Clayton, Gumbel and Frank Copulas are selected as candidates for estimating the pair densities. The independent function is also included here, and its bivariate CDF is the product of the marginal distributions. When the CDF of independent type owns the nearest distance to the empirical CDF, the concerning variables are judged to be independent. As parameters of different Copula families vary in different ranges, to facilitate model analyses and comparisons, the fitted parameters are translated into Kendall’s rank correlation coefficients, which can be identically formulated according to Copula class and parameters [29]. Figure 4 shows the selected Copula types and corresponding parameters (translated into Kendall’s rank correlations) for all pairs. The element in position (j, i) denotes the j th pair in the i th Tree, i.e., the conditional dependence of the errors of horizon j and i + j. Figure 4a denotes the best fitted Copula type with different colours, and Fig. 4b demonstrates the estimated parameter for each Copula function.

Fig. 4
figure 4

Estimated result of Pair-Copula temporal model

As the dimension is 24 (the forecast errors of 24 h), there are 23 trees in the temporal model. Tree1 has 23 pairs, each pair represents the temporal correlation of horizon j and j + 1. In parallel, Tree 2 has 22 pairs, each pair represents the conditional temporal correlation of horizon j and j + 2. The rest trees can be analysed in the same way. The fitted results show that the independent function dominates in Trees 2–23, which implies that non-adjacent errors are conditionally independent, i.e., the wind power forecasting error has a one-order Markov property. This result is not surprising, as wind speed and wind power outputs have been proven to be Markov chains in many studies [12, 34]. Using the Pair-Copula approach in this paper, the day-ahead wind power forecasting errors are verified to possess a Markov property with an optimal order of 1.

For the dependence of adjacent errors in Tree 1, the Student t is the most favored Copula function due to its elliptical structure and strong tail dependence, and the Gumbel function is also selected for some pairs. For parameter results, strong correlations are found for adjacent errors, and the variation of parameters in Tree 1 is small, indicating that the correlations are not significantly affected by the prediction horizons 4 figures and tables.

3.3 Spatial dependence model

Seventeen wind farms in Vermont are included in the spatial dependence study and are sequenced by distance. The analysis and modelling process are similar to those described above. The parameters of the Gaussian model are shown in Fig. 5a. Compared with the temporal dependence, the spatial correlation is remarkably weak, i.e., the smoothing effect of aggregate wind farms is considerable. Fig. 5b displays the colour histogram of wind farm 1 and 2 with a correlation of 0.452. It is discovered that the spatial dependence has an upper tail, which indicates that the probability of simultaneous large positive errors is higher than that of negative errors. The asymmetry property should be taken into consideration in reserve deployments and transmission congestion problems in wind farm cluster integration.

Fig. 5
figure 5

Data analysis for spatial dependence

The Copula types and parameters of the Pair-Copula spatial model are shown in Fig. 6, where the element in position (j, i) denotes the conditional dependence of errors of wind farm j and i + j. Unlike the temporal model, the spatial dependences are inherent; thus, Markov phenomenon is not observed. Furthermore, the asymmetric characteristic is common among the selected wind farms, thus the Gumbel function is superior to other candidates owing to its upper tail structure; yet the symmetrical Student t and Frank functions are also selected for some pairs of wind farms. Comparing Fig. 6b and the spatial distribution of the wind farms, the spatial correlation generally decreases when distance between wind farms increasing. For instance, farm a and b is nearest, thus the correlation coefficient is the largest (the correlation of pair c 1,2 is 0.3177). However, the distance of farm b and c is farther than farm h and i, but the correlation coefficient is larger (the correlation of pair c 2,3 and c 8,9 is 0.2652 and 0.2522 respectively). Therefore, the spatial dependence is affected by factors like distance, elevation, topographical and surface structure, and it is more variable and complex.

Fig. 6
figure 6

Estimated result of Pair-Copula spatial model

4 Model analysis and comparisons

To illustrate the effectiveness of the proposed method, the Pair-Copula, Gaussian-dependent and the independent (i.e., temporal or spatial correlation is ignored) models are compared in this section. The goodness-of-fit evaluation of multivariate distribution needs huge computation and is time-consuming, especially for high dimensional cases. Hence, the typical application scenes in which the temporal or spatial dependence of wind power forecasting errors are required are investigated. And according to the effects of the dependences in application scenes, the corresponding criterion is proposed to map the multivariate distribution to a univariate one, hence facilitating the comparisons between different multivariate models.

4.1 Temporal dependence model comparison

The scheduling and planning of an energy storage system (ESS) is a typical multistage decision-making problem. When the ESS is applied to balance the wind power forecasting deviations, the absorption or production energy of the ESS is determined by both the magnitude and sequence of the prediction errors, as formulated in (10). Therefore, the temporal dependence needs to be considered.

$$\begin{aligned} E_{\text{b}} & = \sum\limits_{{i \in T_{\text{b}} }} {e_{i} \Delta t} \\ T_{\text{b}} & = \{ e_{i} \ge 0,\forall i \in T_{\text{b}} \} {\text{ or }}\{ e_{i} \le 0,\forall i \in T_{\text{b}} \} \\ \end{aligned}$$
(10)

where E b is the charge or discharge energy of ESS; Δt is the time interval; T b is the charging or discharging period during which only positive or negative prediction errors occur.

Therefore, the distribution of E b is regarded as the criterion to evaluate the accuracy of different temporal models. To develop the univariate distribution, temporal- related prediction errors are generated by the Monte Carlo algorithm based on different temporal models, E b can then be calculated by (9), and the empirical distribution can be further obtained using the simulated data.

Figure 8 shows the probability density frequency of the ESS energy based on the measurements and the three models studied, and the data are normalized by the wind farm rated capacity. The measured data exhibit a fat-tailed characteristic due to the presence of continuous errors with the same sign, which agrees with the analysis in temporal modelling. The independent model does not reflect this property and therefore will greatly underestimate the necessary ESS energy capacity. On the other hand, the Pair-Copula and Gaussian models are alike because the selected Student t Copula has similar stochastic characteristics with Gaussian Copula, and they both fit the data well.

To further qualify the goodness-of-fit of the estimated model, Cramer-von Mises (CvM) and Kolmogorov-Smirnov (KS) indices [35], which respectively measure the average and maximum Euclidean distances between the estimated and empirical CDF, are calculated and shown in Table 2. Both indices are negatively oriented, meaning that lower values are preferable. The Pair-Copula model is better than the Gaussian model in terms of both indices.

Table 2 Goodness-of-fit indices of three temporal models

4.2 Spatial dependence model comparison

Due to the weak spatial correlation, the prediction errors of disperse wind farms is offset thus reserve demand can be reduced. Thus the spatial dependence is desired for the assessment of power system operation reliability and security. The distribution of the aggregated prediction error of regional wind farms is selected as the criterion to evaluate the accuracy of different spatial models.

The probability density frequency of aggregated forecast error of 17 wind farms in Vermont based on the measurements and three estimated models are shown in Fig. 7. Apparently, the independent model significantly underestimates the error range and reserve demands. And the measurements are slightly right-skewed due to the asymmetry spatial dependent structure, which is reflected in the Pair-Copula model but not in the Gaussian model. The goodness-of-fit indices are listed in Table 3, and Pair-Copula model is still better than the Gaussian model for both evaluated indices.

Fig. 7
figure 7

Probability density frequency of ESS energy of measurements and three estimated models

Table 3 Goodness-of-fit indices of three spatial models

Furthermore, it should be highlighted that the asymmetry distribution would result in different requirements on reserves. Suppose the upward and downward reserves providing for the wind power uncertainty are 0.2 p.u. In this case, when negative error exceeds upward reserve, load shedding will occur. Similarly, if the positive error is greater than the downward reserve, the surplus wind power must be curtailed, as shown in Fig. 8. Table 4 lists the load shedding and wind curtailing probabilities. The wind curtailing probability is clearly larger than the load shedding probability due to the asymmetry distribution. The probabilities of the Pair-Copula model are the closest to the measured data, while the Gaussian model overestimates the load shedding probability, which will significantly increase the operation costs to maintain the required reliability level.

Fig. 8
figure 8

Probability density frequency of aggregated forecast error of measurements and three estimated models

Table 4 Load shedding and wind curtailing probabilities

4.3 Feasibility evaluation of model

More comparisons for different wind farms are carried out to test the feasibility of the proposed method. Figure 9 draws the CvM and KS index of Pair-Copula model and Gaussian temporal model for 17 wind farms in Vermont, where the box shows the median, 25% and 75% percentiles of the indices, while the whiskers are the extreme indices. The Pair-Copula model has generally lower indices than Gaussian model for different wind farms.

Fig. 9
figure 9

Temporal goodness-of-fit indices for different wind farms

Figure 10 shows the goodness-of-fit indices of spatial models in different aggregated scale. Each column represents the index range of 60 wind farm clusters with the specified scale, which are randomly selected from the regional wind plants in Vermont. For both compared models, the goodness-of-fit index worsens as the model dimension increases. The Pair-Copula model yields greater accuracy for all dimensions, and the improvement is more evident than that of the temporal case. It can be concluded that the Pair-Copula model is flexible in different dependent relationships, and the superiority is more prominent for nonlinear and asymmetric structures when compared to the traditional Gaussian model.

Fig. 10
figure 10

Spatial goodness-of-fit indices for different scales of wind farm

5 Conclusion

Because of the highly variable forecast errors of wind power generation, a thorough uncertainty model describing both the predictive distribution and dependent structure of forecast errors is essential. In this paper, a Pair-Copula modelling approach has been proposed to construct joint distribution and generate scenarios that mimic wind power stochastic behaviours. The feasibility of the Pair-Copula approach for wind power temporal and spatial dependence is verified, and the advantages of the Pair-Copula are noted. First, owing to the basic of conditional dependence, the Pair-Copula model can reflect the essential relationships between random variables, which the other multivariate models fail to reveal, e.g., the one-order Markov characteristic of the temporal model. Furthermore, the Pair-Copula approach is more adaptive and practical in various applications, as it allows the combination of different types of Copulas and thus yields the best-fitted model. In the illustrated example, the symmetrical Student t function is selected for the temporal model, while the asymmetrical Gumbel Copula is chosen for the spatial model.

Based on the developed models, the effects of the temporal and spatial dependence of wind power forecasting errors on power system operations are analysed. For the temporal model, the tail dependence indicates that large prediction errors have stronger correlations, resulting in a fat-tail distribution of the ESS energy. Thus, a large ESS capacity is required to balance the prediction errors. For the spatial model, the upper tail makes the probability of excess generation higher than that of deficits, causing different demands for upward and downward reserves. Furthermore, goodness-of-fit indices of different models are calculated and comparison results show that Pair-Copula model gives greater accuracy in various application scenes.

The main contribution of this paper is to significantly advance the uncertainty model of wind power and to provide a systematic procedure for modelling and generating wind power scenarios, which can then be applied to a range of wind power integration problems. Although only wind power uncertainty modelling is discussed in this paper, the Pair-Copula approach provides a powerful tool for constructing flexible multivariate distributions that it can also be applied to a wide range of statistical analyses and stochastic decision fields of power systems.