1 Introduction

Electricity supply and demand are strongly influenced by weather conditions, with temperature, wind speed, and solar radiation being among the most influential factors. Temperature has a pronounced effect on energy use by individuals and, thus, on the demand side of the electricity system, because heating and cooling loads depend largely on ambient temperature. Wind and solar generation are increasingly important as renewable energy gains in popularity. Wind power is growing at a rate of \(30\%\) annually, with a worldwide installed capacity of 283 GW at the end of 2012. The installed capacity of solar photovoltaic (PV) grew by \(41\%\) in 2012, reaching 100 GW.

However, the limited predictability of wind speed and solar radiation raises operational challenges for power systems as the penetrations of these technologies increase. Accurate very short-term forecasting (i.e., up to 12 hours ahead) of the two resources could improve the operational efficiency of power systems. Although it is not the focus of this work, longer-term weather forecasting is also beneficial for power system planning. For example, Maleki et al. [1] employ Monte Carlo simulation of wind and solar conditions to optimally design a grid-independent hybrid renewable energy system.

There are many works dealing with weather forecasting, and Widén et al. [2] provide a comprehensive survey of forecasting techniques in the literature. Some of these works forecast temperature, frost, or related financial derivatives. Others forecast solar radiation, cloud motion, or solar production. Others still forecast wind speed and wind production. Here we review some of the literature that forecasts temperature, solar radiation, or wind speed.

Most works focusing on temperature forecasting analyze financial weather derivatives as the primary application. Besides atmospheric methods, models attempting to capture temperature dynamics can be divided into two categories: stochastic approaches (e.g., Monte Carlo simulation) and time-series models.

There are numerous examples of the stochastic approaches [3,4,5]. Alaton et al. [3] suggest a stochastic process that describes the evolution of temperature for the pricing of weather derivatives. Benth and Šaltytė-Benth [4] model daily average temperature with a mean-reverting Ornstein-Uhlenbeck process. Taylor and Buizza [5] investigate temperature ensemble predictions and compare them with time-series models.

There are also examples of time-series models [6, 7]. Campbell and Diebold [6] forecast daily average temperature using a nonstructural time-series approach. Šaltytė-Benth et al. [7] propose a stochastic model, which includes trend, seasonality, and mean reversion. Oetomo and Stevenson [8] review different temperature-forecasting models, including those relying on autoregressive moving average (ARMA) processes and Monte Carlo simulation.

Numerical weather prediction (NWP) models are a popular approach for solar radiation forecasting, and are used to generate forecasts up to several days ahead. Most short-term solar-radiation forecasts range from 30 minutes to six hours ahead and rely on satellite-derived cloud-motion forecasts [9,10,11,12]. Akarslan et al. [13] incorporate temperature, extraterrestrial irradiance, and derivatives of these data with a multi-dimensional linear prediction filter to improve solar forecasts. Alonso-Montesinos and Batlles [14] forecast solar radiation up to three hours ahead under a variety of atmospheric conditions, because such conditions have a major influence on solar forecasting. Perez et al. [15] use sky cover predictions as inputs when forecasting solar radiation. Heinemann et al. [11] and Remund et al. [16] note that comparing the forecasts of different methods is useful in providing comparative statistics to validate a forecasting model.

Wind speeds are typically forecasted several minutes to several days ahead, with statistical methods being extensively applied. For example, Erdem and Shi [17] use ARMA-based approaches. Liu et al. [18] propose a novel time-series technique that is based on the Taylor Kriging model. Other works combine multiple numerical techniques to produce ensemble wind forecasts [19,20,21]. Wang and Xiong [22] develop a hybrid forecasting method based on an ARMA process, outlier detection, and fuzzy time series to forecast the daily wind speed in Taiwan. Jiang et al. [21] propose a hybrid approach that employs a Boosting algorithm to improve the forecasting performance of a traditional ARMA model. They demonstrate the effectiveness of this technique using wind-production data from the east coast of Jiangsu Province, China. There are also some artificial intelligence-based models in the literature—Li and Shi [23] apply artificial neural networks (ANN) and Hong et al. [24] forecast wind power and wind speed up to one-hour ahead with a multi-layer feed-forward neural network (MFNN). Maleki et al. [25] take an ANN-based approach to forecasting solar radiation, wind speed, and temperature in optimizing the operations of a hybrid solar- and wind-powered water-desalination system. Giebel et al. [26] provide a detailed review of the techniques that are available for wind-speed forecasting.

In this paper, we use time-series methods to model and generate hourly temperature, wind-speed, and solar-radiation forecasts at 61 locations in the United States. The three weather variables at the 61 locations are response variables in a vector autoregression (VAR) model. In addition to estimating the model, we also conduct out-of-sample validation to test the quality of the forecasts that are produced. We compare our forecasting errors to those that are reported for other techniques in the literature, including persistence forecasts, showing that our method performs as well or better.

In light of the existing literature on weather forecasting, our work makes three contributions. First, we employ a VAR model, which allows correlations between the three different weather variables to be captured. This matters because there are likely strong correlations between temperature, wind speed, and solar radiation. Second, the autoregressive structure of the VAR model allows important temporal autocorrelations to be captured. Finally, the structure of the VAR model also allows time-lagged correlations between different locations to be captured. For instance, weather conditions in one location at time t may be correlated with weather conditions at another location at time \(t^{\prime} \not = t\). These three features of our proposed VAR model lead to its outperforming a number of other forecasting techniques that are reported in the literature.

The remainder of this paper is organized as follows. In Sect. 2 we provide descriptive statistics for the three weather variables. In Sect. 3 the model and estimation methods are introduced. For a large model of this form, we seek a lag structure that ensures good forecasting performance while maintaining a reasonable model size and degrees of freedom. Thirty lags of each time series are used, and each equation is estimated separately with either ordinary or weighted least squares. In Sect. 4 we examine the forecasting performance up to six-hours ahead and provide comparative statistics with other models. Conclusions and suggestions for future research are provided in Sect. 5.

2 Weather data

We use data from the National Solar Radiation Database (NSRDB), which is produced by the National Renewable Energy Laboratory, National Climatic Data Center, and other partners. The NSRDB contains ground-based solar and meteorological data for 1454 sites around the United States. Nearly all of the solar data are modeled, while meteorological elements, including wind speed and dry-bulb temperature, are observed values. The hourly solar data are modeled global horizontal irradiance (GHI), i.e., the sum of modeled direct and diffuse solar radiation received on a horizontal surface during the 60-minute period ending at the timestamp. Much of the data come from a model developed by the State University of New York at Albany that uses Geostationary Operational Environmental Satellite imagery to estimate solar radiation. The dry-bulb temperature and wind speed are instantaneous values observed at or near each hour, following meteorological measurement practice. Wind speeds are measured at \(2\)-m heights. Wilcox [27] provides further details regarding the NSRDB.

We model hourly wind speed, global solar radiation, and dry bulb temperature at the 61 locations that are shown in Fig. 1 in a single VAR model. The 61 locations are chosen to provide roughly even coverage of the continental United States. Moreover, locations that are close to population centers and areas with good solar and wind resource availability are included in the dataset. Data covering the years 1991–2008 are used, because these data are complete and do not require any modification. Among the 18 years of hourly data, 16 years are used for model estimation and two years are used for out-of-sample model validation.

Fig. 1 Sixty-one locations in the United States that are modeled

Table 1 Descriptive statistics of wind speed data (m/s)
Table 2 Descriptive statistics of global solar radiation data (\(\hbox {Wh}/\hbox {m}^2\))
Table 3 Descriptive statistics of dry bulb temperature data (K)

To get an overall feel for the data, Tables 1, 2, and 3 summarize some simple descriptive statistics of the wind speed, solar radiation, and temperature data, respectively, at six locations. Temperature data are reported in kelvin in Table 3 and throughout this paper. This is because we use relative root mean square error, which is undefined when the average observation equals zero, as a metric for model validation in Sect. 4.

Figures 2, 3, and 4 show wind speed, global solar radiation, and dry bulb temperature, respectively, in Las Vegas, NM (not to be confused with the popular gambling destination) from 2006 to 2008. These figures clearly show seasonal patterns for the three weather variables.

Fig. 2 Time series of wind speed in Las Vegas, NM from 2006 to 2008

Fig. 3 Time series of global solar radiation in Las Vegas, NM from 2006 to 2008

Fig. 4 Time series of dry bulb temperature in Las Vegas, NM from 2006 to 2008

3 Methodology

A time series approach is proposed in this work to capture the characteristics of the three weather variables. Our approach consists of three parts that are integrated with one another into our overall model: (1) a linear trend, (2) a seasonal component, which is represented by Fourier series and Chebyshev polynomials, and (3) a VAR to model the stochastic component of the time series. We detail these three components below.

3.1 Trend

To check for the presence of a linear trend, we run a simple linear regression of the weather data against hourly time. Both the intercept and time parameters are significant at the \(1\%\) level (although the estimated time parameter is small in magnitude). Hence, a linear trend, though slight, should be included in our model. We represent this trend component by including a term of the form:

$${\text{ trend}}_{t} = \beta _{0} + \beta _{1} t $$

in our model.
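
As an illustration, the following Python sketch checks the significance of such a linear trend with an ordinary least-squares regression; the synthetic series and variable names are ours and merely stand in for one NSRDB weather series.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic stand-in for one hourly weather series (the real data come from the NSRDB).
rng = np.random.default_rng(0)
t = np.arange(140256)                                 # hourly time index over 16 years
y = 288.0 + 1e-5 * t + rng.standard_normal(t.size)    # level + slight trend + noise

X = sm.add_constant(t)                                # regressors [1, t]
fit = sm.OLS(y, X).fit()
print(fit.params)                                     # beta_0 (intercept) and beta_1 (trend)
print(fit.pvalues)                                    # check significance at the 1% level
```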

3.2 Seasonality

As discussed in Sect. 2 and illustrated in Figs. 2, 3 and 4, there are strong seasonal variations in all three of the weather variables. Because of the hourly time step in our data, it is important to model both diurnal and season-of-the-year seasonality. Because the three weather variables exhibit different diurnal patterns, we use different approaches to represent their diurnal seasonality.

For wind and temperature, diurnal seasonality is represented by a Fourier series of the form:

$$\begin{aligned} \text{ daySeas}_t = \sum _{p=1}^P \left[ \delta _{c,p} \cos \left( 2\pi p \frac{d(t)}{24}\right) + \delta _{s,p} \sin \left( 2\pi p \frac{d(t)}{24}\right) \right] \end{aligned}$$

where P is the order of the Fourier series, \(\delta _{c,p}\) and \(\delta _{s,p}\) are coefficients on the cosine and sine terms, respectively, and:

$$\begin{aligned} d(t) = (t \text{ mod } 24) \end{aligned}$$
(1)

converts t to hours of the day. Season-of-the-year seasonality is similarly captured by a Fourier series of the form:

$${\text{ annSeas}}_t = \sum _{p=1}^{\hat{P}} \left[ \hat{\delta}_{c,p} \cos \left( 2\pi p \frac{\hat{d}(t)}{365}\right) + \hat{\delta}_{s,p} \sin \left( 2\pi p \frac{\hat{d}(t)}{365}\right) \right]$$
(2)

where \(\hat{P}\) is the order of the Fourier series, \(\hat{\delta}_{c,p}\) and \(\hat{\delta}_{s,p}\) are coefficients on the cosine and sine terms, respectively, and:

$$\begin{aligned} \hat{d}(t) = \left\lceil \frac{t}{24} \right\rceil \end{aligned}$$
(3)

where \(\left\lceil \cdot \right\rceil\) is the ceiling operator; \(\hat{d}(t)\) converts t into the day of the year.

Fourier series can produce a smooth seasonal pattern with a significant reduction in the number of parameters to be estimated as compared to dummy variables [6]. To find the proper order of the Fourier series, we estimate models with between first- and fifth-order terms. Examining modeled and observed seasonality with different-ordered Fourier series shows that a third-order series is sufficient to capture the seasonality dynamics. We also compare the forecasting performance of the model with third- and fifth-order Fourier series, finding them to be similar. This finding further suggests that third-order terms are sufficient. Thus, we include third-order Fourier series for daily and season-of-the-year seasonality of wind and temperature.
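
For concreteness, a minimal Python sketch of how these third-order Fourier regressors might be constructed is given below; the function name fourier_terms and the indexing are ours, assuming the hour-of-day and day indices d(t) and \(\hat{d}(t)\) defined in (1) and (3).

```python
import numpy as np

def fourier_terms(idx, period, order):
    """Cosine/sine regressors of the given order for a cyclic index."""
    cols = []
    for p in range(1, order + 1):
        cols.append(np.cos(2 * np.pi * p * idx / period))
        cols.append(np.sin(2 * np.pi * p * idx / period))
    return np.column_stack(cols)

t = np.arange(1, 140257)                         # hourly index, t = 1, ..., 140256
d = t % 24                                       # hour of the day, d(t)
d_hat = np.ceil(t / 24)                          # day index, d-hat(t)

day_seas = fourier_terms(d, 24, order=3)         # diurnal terms for wind and temperature
ann_seas = fourier_terms(d_hat, 365, order=3)    # season-of-the-year terms
```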

The season-of-the-year seasonality of solar radiation is given by the same Fourier series that is shown in (2). Daily seasonality is modeled using second-order Chebyshev polynomials, as opposed to Fourier series. To define the Chebyshev polynomials [28] we first convert our independent variable, x, where we assume \(x \in [a,b]\), to the normalized variable:

$$\begin{aligned} z = \frac{2(x-a)}{b-a}-1 \end{aligned}$$

By definition we have \(z \in [-1,1]\). We then define the Chebyshev polynomials recursively as:

$$T_j(z) = 2zT_{j-1}(z) - T_{j-2}(z)$$

where:

$$T_0(z) = 1$$

and:

$$T_1(z) = z$$

Thus, the second-order Chebyshev polynomial that is used to model diurnal solar radiation seasonality is given by:

$$ {\text{ daySeas}}_t =\alpha _0 + \alpha _1 \left[ \frac{2(x_t-a_t)}{b_t-a_t}-1\right] + \alpha _2 \left\{ 2 \left[ \frac{2(x_t-a_t)}{b_t-a_t}-1\right] ^2-1 \right\} $$
(4)

We use Chebyshev polynomials, as opposed to Fourier series, to model the diurnal seasonality for a number of reasons. First, we only need to model solar radiation during daytime hours, because there is (by definition) zero solar radiation at night. Moreover, solar radiation follows a predictable diurnal pattern, in that it peaks in the middle of the day. A second-order Chebyshev polynomial is better able to produce this shape of a diurnal pattern than a Fourier series is. This is confirmed by our model estimates, because second-order Chebyshev polynomials provide much better goodness-of-fit than Fourier series do.

Table 4 AIC and BIC for VAR models for Los Angeles, CA with different lag structures

Based on these properties of the diurnal pattern, we define:

$$\begin{aligned} x_t = \frac{d(t)-r_{{\hat{d}}(t)}}{s_{{\hat{d}}(t)}-r_{{\hat{d}}(t)}} \end{aligned}$$

where d(t) and \({\hat{d}}(t)\) are as defined in (1) and (3), \(r_{{\hat{d}}(t)}\) and \(s_{{\hat{d}}(t)}\) are the sunrise and sunset times, respectively, on the day \({\hat{d}}(t)\), and \(a_t\) and \(b_t\) are the minimum and maximum values, respectively, that \(x_t\) takes on day \({\hat{d}}(t)\). Sunrise and sunset times are computed, based on the day of the year and geographic coordinates of each location modeled, using MATLAB functions that are developed by the U.S. Geological Survey [29].
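
The following Python sketch evaluates the second-order Chebyshev regressors of (4) for a single day, assuming illustrative sunrise and sunset times; in the actual model these times come from the USGS routines cited above, and the function name chebyshev_day_terms is ours.

```python
import numpy as np

def chebyshev_day_terms(hour, sunrise, sunset):
    """Second-order Chebyshev regressors [1, T1(z), T2(z)] for one day's daylight hours."""
    x = (hour - sunrise) / (sunset - sunrise)    # x_t for the day
    a, b = x.min(), x.max()                      # a_t and b_t for the day
    z = 2.0 * (x - a) / (b - a) - 1.0            # normalize to [-1, 1]
    T1 = z                                       # first-order Chebyshev polynomial
    T2 = 2.0 * z ** 2 - 1.0                      # second-order Chebyshev polynomial
    return np.column_stack([np.ones_like(z), T1, T2])

# Illustrative day with sunrise at 06:00 and sunset at 18:00.
daylight_hours = np.arange(6, 19)
X_day = chebyshev_day_terms(daylight_hours, sunrise=6.0, sunset=18.0)
```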

3.3 VAR model

VAR is a statistical model that captures the linear interdependencies among multiple time series. Hence, it is beneficial in modeling temporal and spatial correlations among wind speed, solar radiation, and temperature in different locations. Each variable at each location has an equation explaining its evolution based on time-lagged values of all of the weather variables at all locations.

As a result, a VAR model is able to capture three important types of autocorrelations in the data. The first is temporal autocorrelation in an individual weather variable (e.g., the time-\(t\) temperature at location n may be correlated with the time-\(t^{\prime}\) temperature at the same location, where \(t^{\prime} \not = t\)). The second is cross-correlation between individual weather variables (e.g., the time-\(t\) temperature at location n may be correlated with contemporaneous solar radiation at the same location). The third is temporal autocorrelation and cross-correlations between locations. For instance, the time-\(t\) temperature at one location may be correlated with contemporaneous or temporally-offset temperature at another location. Along the same lines, the VAR model can also capture correlations between variables, locations, and time (e.g., temperature at one location may be correlated with time-lagged solar radiation at another location). Thus, the VAR is highly flexible in terms of relationships among the weather data that can be captured.

VAR models assume that all of the response variables are stationary. Thus, it is important to test our time-series data for stationarity before fitting the proposed VAR model. Augmented Dickey–Fuller (ADF) tests against trend-stationary alternatives are applied to 16 years of weather data for several of the locations modeled. The ADF tests are conducted after removing the seasonal components that are discussed in Sect. 3.2. Our results show that the data are trend-stationary with p-values smaller than 0.01 for all locations. Moreover, our model includes a linear trend term which is statistically significant, as discussed in Sect. 3.1. Inclusion of this term removes any long-term trend from the response variables (especially temperature). The results of the ADF tests and the inclusion of the linear trend suggest that stationarity is not an issue with our data. The ADF tests further favor the trend-stationary alternative. This suggests that deterministic trends, which are what we model through the linear and seasonality terms that are discussed in Sects. 3.1 and 3.2, are more appropriate than stochastic trends.
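
A minimal sketch of such an ADF test against a trend-stationary alternative, using the adfuller routine in statsmodels on a synthetic stand-in for a deseasonalized series, is shown below.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# Synthetic stand-in for one deseasonalized weather series: a stationary AR(1)
# process plus a slight deterministic trend.
rng = np.random.default_rng(0)
e = rng.standard_normal(5000)
series = np.empty_like(e)
series[0] = e[0]
for i in range(1, e.size):
    series[i] = 0.8 * series[i - 1] + e[i]
series += 1e-4 * np.arange(e.size)

# ADF test with a constant and a linear trend ('ct') as the deterministic terms.
stat, pvalue, usedlag, nobs, crit, icbest = adfuller(series, regression="ct")
print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.4f}")
```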

Modeling the three weather variables at 61 locations in a single VAR gives 183 response variables in total. Given the large model size, it is important to determine a suitable number of autoregressive lags and which time-lagged values to include in the model. To do this, we regard one week (168 hours) as the maximum lag to be considered. We estimate multiple VAR models with up to 168 lags using only two response variables at a time. After estimating several pairs, we find that, regardless of the distance between locations, autoregressive lags of 1 and multiples of 24 are significant for most location pairs. This lag structure gives us the spatial relationships among locations.

Akaike and Bayesian information criteria (AIC and BIC) are further used to determine the lag structure. AIC and BIC provide estimates of the information lost when a given model is used to represent the process that generates a given dataset. Smaller AIC and BIC values indicate a better relative model fit to the data. Due to the extreme size (and computational burden) of a 61-location VAR model, AICs and BICs are calculated for single-location VAR models. A single-location VAR model only has three response variables, as opposed to 183 for a 61-location VAR model. The single-location models are estimated using two years of hourly weather data. The lag structures that are estimated, which are listed in Table 4, are \(\hbox {VAR}(24)\), \(\hbox {VAR}(48)\), \(\hbox {VAR}(72)\), \(\hbox {VAR}(96)\), \(\hbox {VAR}(120)\), \(\hbox {VAR}(168)\), and \(\hbox {VAR}(1{-}24,\,\,48,\,\,72,\,\,96,\,\,120,\,\,144,\,\,168)\). Table 4 summarizes AIC and BIC values of these VAR models using weather data from Los Angeles, CA.

The \(\hbox {VAR}(1{-}24,\,\,48,\,\,72,\,\,96,\,\,120,\,\,144,\,\,168)\) has the lowest BIC, which penalizes the number of parameters more strongly than AIC. This result, favoring the \(\hbox {VAR}(1{-}24,\,\,48,\,\,72,\,\,96,\,\, 120,\,\,144,\,\,168)\) structure, is consistent across the locations that are modeled. Thus, to fully capture the relationship between observations in adjacent periods, we use a VAR model with lags one through 24 and multiples of 24 up to 168 of the form:

$$\begin{aligned} \varvec{Y}_t = \text{ trend}_t + \text{ daySeas}_t + \text{ annSeas}_t + \sum _{l \in L} \varvec{A}_l \varvec{Y}_{t-l} + \varvec{U}_t \end{aligned}$$

where \(\varvec{Y}_t = (y_{1,t}, y_{2,t}, \ldots , y_{183,t})^{\text{T}}\) is a \(183 \times 1\) vector of hour-t response variables; \(L = \{1,2,\ldots ,24,48,72,96,120,144,168\}\) is the set of lags modeled; \(\varvec{A}_l\) are \(183 \times 183\) coefficient matrices for the lagged response variables; and \(\varvec{U}_t = (u_{1,t}, u_{2,t}, \ldots , u_{183,t})^{\text{T}}\) is a \(183 \times 1\) vector of residuals. Because our data set covers 16 years of hourly observations, we have \(t = 1, 2, \ldots , 140256\).
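
To illustrate the lag structure, the Python sketch below builds the corresponding lagged regressor matrix for a small synthetic panel; in the actual model the columns of Y would be the 183 detrended, deseasonalized weather series, and the names lagged_design and LAGS are ours.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the response panel (six series instead of 183).
rng = np.random.default_rng(0)
Y = pd.DataFrame(rng.standard_normal((1000, 6)),
                 columns=[f"series{i}" for i in range(6)])

LAGS = list(range(1, 25)) + [48, 72, 96, 120, 144, 168]    # lags 1-24 and multiples of 24

def lagged_design(Y, lags):
    """Stack lagged copies of every response variable into one regressor matrix."""
    blocks = {f"{col}_lag{l}": Y[col].shift(l) for l in lags for col in Y.columns}
    return pd.DataFrame(blocks).dropna()                   # drop rows lost to the lags

X = lagged_design(Y, LAGS)
Y_aligned = Y.loc[X.index]    # left-hand-side rows that match the surviving regressor rows
```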

3.4 Parameter estimation

A VAR model of the size that is proposed is difficult to estimate as a whole system due to the computational and memory limitations of computers (the entire system consists of more than 25 million equations). Because the model is a seemingly unrelated regression system in which every equation has the same set of regressors, estimating each equation separately by least squares is equivalent to estimating the system jointly; we therefore take this equation-by-equation approach. The data that are used for estimation are hourly observations from 1991 to 2006. The variance/covariance matrix of the residuals is calculated after the estimation.

For wind and temperature, ordinary least squares is used for parameter estimation. Weighted least squares is applied for solar radiation, with nighttime observations assigned weights of zero and daytime observations assigned weights of one. We do this because the VAR model is only used to forecast solar radiation during the day; solar radiation is fixed equal to zero during the night because, by definition, there is no sunlight at night. By applying these weights, the estimated coefficients are better suited to forecasting daytime solar radiation, because nighttime observations are ignored. As discussed in Sect. 3.2, we calculate sunrise and sunset times for each location based on geographic coordinates and the day of the year.
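
A minimal sketch of this weighting scheme with statsmodels is shown below on synthetic data; the daylight indicator with fixed 06:00–18:00 daylight hours is an illustrative assumption, whereas the model uses location-specific sunrise and sunset times.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
X = sm.add_constant(rng.standard_normal((n, 3)))                 # stand-in regressors
y = X @ np.array([1.0, 0.5, -0.2, 0.3]) + rng.standard_normal(n)

# Illustrative daylight indicator: hours 6-17 of each day count as daytime.
hour_of_day = np.arange(n) % 24
weights = ((hour_of_day >= 6) & (hour_of_day < 18)).astype(float)

# Zero weights effectively drop nighttime observations from the solar-radiation equation.
fit = sm.WLS(y, X, weights=weights).fit()
print(fit.params)
```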

Figure 5 shows the residuals of the three weather variables in Chicago, IL and Las Vegas, NM. It is clear that the residuals display heteroskedasticity. However, Durbin’s alternative test reveals no serial correlation in the residuals.

Fig. 5 Residuals for Chicago, IL and Las Vegas, NM from 1991 to 2006

4 Forecasting and validation

To validate our model, we generate out-of-sample forecasts and compare the performance of our VAR model to a number of benchmark competitors. In doing so, we consider forecasts that are up to six hours ahead and use two years of out-of-sample data covering the years 2007 and 2008. As noted in Sect. 3.2, we fix solar radiation forecasts equal to zero between sunset and sunrise on each day. We further truncate any negative forecasts to zero, because it is physically impossible for these values to be negative. Evaluation of solar forecasts is restricted to daylight hours, because nighttime solar radiation is trivially zero and is not challenging to forecast.
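
The post-processing of the raw solar forecasts amounts to the following simple rule, sketched here in Python with illustrative values; the function name truncate_solar is ours.

```python
import numpy as np

def truncate_solar(forecast, is_daylight):
    """Fix nighttime solar forecasts to zero and truncate negative forecasts to zero."""
    forecast = np.where(is_daylight, forecast, 0.0)    # zero radiation between sunset and sunrise
    return np.clip(forecast, 0.0, None)                # no physically impossible negative values

raw = np.array([-12.0, 350.0, 80.0, 5.0])
daylight = np.array([False, True, True, True])
print(truncate_solar(raw, daylight))                   # -> [0. 350. 80. 5.]
```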

We use two types of benchmark competitors in this validation. One compares the performance of our VAR model to other forecasting techniques appearing in the literature. This is a ‘more desirable’ benchmark competitor, because it allows our model to be directly contrasted with others. However, there are two issues with focusing exclusively on direct comparisons to other models. First, other models may be applied in different regions, where forecasting some weather variables may be easier or more difficult. For example, temperatures may be more stable and easier to forecast in one region compared to another. This could make one model appear better than another, due solely to the underlying weather conditions where the models are applied. Second, there may be differences in the weather variables being forecasted. For instance, our model relies on the NSRDB for wind speeds, which are measured at a \(2\) m height. Other works may use wind speeds at greater heights. Differences in wind speeds and patterns at different heights may also confound differences when comparing our model performance to other methods that are reported in the literature.

For these two reasons, we also compare the performance of our VAR model to persistence-type forecasting methods (cf. Sect. 4.1 for further discussion). Comparisons of our VAR model to persistence-type forecasts can partially control for the effects of regional and weather-variable differences in assessing the forecasting capability. Moreover, persistence-type methods are commonly used in assessing forecasting performance [30].

Numerous metrics are used in the literature to evaluate forecast accuracy. These include mean absolute error (MAE), root mean square error (RMSE), and relative root mean square error (RMSE%). We use all three of these metrics in our validation. To define these metrics, we let \(F_i\) and \(O_i\) denote forecast and observed values, respectively, of a given variable in hour i and N the number of out-of-sample forecasts used. MAE is then defined as:

$$\begin{aligned} \frac{1}{N} \sum _{i=1}^N |F_i-O_i| \end{aligned}$$

RMSE is defined as:

$$\begin{aligned} \sqrt{\frac{1}{N} \sum _{i=1}^N (F_i-O_i)^2} \end{aligned}$$

and RMSE% is defined as:

$$\begin{aligned} \frac{\sqrt{\frac{1}{N} \sum \limits _{i=1}^N (F_i-O_i)^2}}{\frac{1}{N} \sum \limits _{i=1}^N O_i} \end{aligned}$$
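
These three metrics are straightforward to compute; a short Python sketch follows, with illustrative forecast and observation arrays.

```python
import numpy as np

def mae(F, O):
    """Mean absolute error."""
    return np.mean(np.abs(F - O))

def rmse(F, O):
    """Root mean square error."""
    return np.sqrt(np.mean((F - O) ** 2))

def rmse_pct(F, O):
    """Relative RMSE: RMSE divided by the mean observation (undefined when that mean is zero)."""
    return rmse(F, O) / np.mean(O)

F = np.array([5.1, 4.0, 6.3])    # illustrative forecasts
O = np.array([5.0, 4.5, 6.0])    # illustrative observations
print(mae(F, O), rmse(F, O), rmse_pct(F, O))
```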

In addition to these metrics, it is also common to benchmark one forecasting model to another reference model. Such a benchmark provides what is known as a skill score. The benefit of a skill score is that it can mitigate some of the issues that are associated with directly comparing the forecasting performance of our model to performance metrics that are reported in the literature (i.e., issues associated with the models being applied to different regions or to different weather variables). A skill score is typically defined in terms of a metric used to evaluate forecast accuracy (e.g., MAE, RMSE, and RMSE%). The metric used is referred to as the score. Let \(\sigma\) represent the score of the model being benchmarked and \(\sigma _r\) the score of the reference model (to which the model being benchmarked is compared). Also define \(\sigma _p\) as the score of a perfect model (i.e., one with no forecast error). The skill score is defined as:

$$\frac{\sigma -\sigma _r}{\sigma _p-\sigma _r}$$

The skill score indicates the fractional improvement in the score from using the benchmarked model relative to the reference model. A perfect forecast would have a skill score of 1. We use MAE and RMSE as scores and persistence forecast models (cf. Sect. 4.1 for further discussion) as the reference forecast in our analysis. We fix \(\sigma _p = 0\) with both the MAE and RMSE scores, reflecting zero forecast error in the perfect model.
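
In code, the skill score reduces to the following one-line function; the example numbers are purely illustrative.

```python
def skill_score(score, score_ref, score_perfect=0.0):
    """Fractional improvement over the reference model; a perfect forecast scores 1."""
    return (score - score_ref) / (score_perfect - score_ref)

# Example: an RMSE of 1.6 m/s against a persistence RMSE of 1.8 m/s.
print(skill_score(1.6, 1.8))    # -> about 0.11
```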

4.1 Persistence forecasts

We compare two kinds of persistence-type forecasting methods to our VAR model. The persistence-type methods are also used as reference models in computing skill scores. The first persistence-type method is the simple persistence model, which we denote the SP model. The SP model relies upon the weather condition at the current time to forecast future conditions. Letting \(O_i\) denote the hour-i observation, the SP forecast of the hour-\((i+\Delta i)\) weather variable generated at hour i is defined as \(O_i\). That is, the SP model assumes that the weather variable has the same value at hour (\(i+\Delta i\)) as it does at hour i. This persistence forecast is applied to all three weather variables for comparison with the VAR model.

We also use the clearness persistence forecast proposed by Marquez and Coimbra [30], which we denote the CP model, to provide an additional benchmark for the solar irradiance forecasts that are generated by our VAR. The CP model relies on extraterrestrial solar radiation and takes the solar zenith angle as an input. Let \(\theta _i\) represent the solar zenith angle at hour i. We then define hour-i extraterrestrial solar radiation as:

$$S_i = C \cos (\theta _i)$$

where \(C = 1367\,\hbox {W}/\hbox {m}^2\) is the solar constant. The CP forecast of the hour-\((i+\Delta i)\) solar irradiance is then given by:

$$O_i \cdot \frac{S_{i+\Delta i}}{S_i}$$

We use the hourly mean solar zenith angle that is recorded in the NSRDB to generate our CP forecasts.
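
Both persistence-type benchmarks are simple to express in code; the sketch below assumes the zenith angles are given in degrees, and the function names are ours.

```python
import numpy as np

SOLAR_CONSTANT = 1367.0    # solar constant C, in W/m^2

def sp_forecast(obs_now):
    """Simple persistence: the forecast equals the current observation."""
    return obs_now

def cp_forecast(obs_now, zenith_now_deg, zenith_future_deg):
    """Clearness persistence: scale the current observation by the ratio of
    extraterrestrial radiation at the target hour to that at the current hour."""
    s_now = SOLAR_CONSTANT * np.cos(np.radians(zenith_now_deg))
    s_future = SOLAR_CONSTANT * np.cos(np.radians(zenith_future_deg))
    return obs_now * s_future / s_now

# Example: 600 Wh/m^2 observed at a 40-degree zenith angle, forecasting an hour
# at which the zenith angle is 60 degrees.
print(cp_forecast(600.0, 40.0, 60.0))    # -> about 392
```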

4.2 Results

Tables 5, 6, and 7 summarize the forecasting performance of our VAR model in producing wind, solar, and temperature forecasts, respectively. These comparisons are made on the basis of the different metrics that are discussed above (i.e., MAE, RMSE, and RMSE%). The tables report the average (among the 61 locations modeled), minimum, and maximum MAE for the three weather variables. Average RMSE and RMSE% are reported as well.

The values that are reported in Tables 5, 6, and 7 are compared with results that are reported in the literature for other forecasting techniques in Sect. 4.3. As noted before, there are important caveats in directly comparing forecasting performance of our model to others in the literature.

Table 5 Average, minimum, and maximum (among 61 locations modeled) MAE, RMSE, and RMSE% of wind forecasts produced by VAR model
Table 6 Average, minimum, and maximum (among 61 locations modeled) MAE, RMSE, and RMSE% of solar forecasts produced by VAR model
Table 7 Average, minimum, and maximum (among 61 locations modeled) MAE, RMSE, and RMSE% of temperature forecasts produced by VAR model

Tables 8 and 9 summarize the average MAE and RMSE, respectively, of the VAR model in producing temperature, solar-radiation, and wind-speed forecasts. They also summarize the average MAE and RMSE of the SP model. The tables show that the VAR model outperforms the SP model by between \(6\%\) and \(80\%\), with the advantage growing as the forecasting horizon increases. This is also illustrated in Fig. 6, which shows the average RMSE of the different models as a function of the forecasting horizon.

Tables 10 and 11 report skill scores on the basis of MAE and RMSE, respectively, using the SP model as the reference model. The skill scores obtained for our VAR model are compared to skill scores that are reported in the literature in Sect. 4.4.

4.3 Comparative studies

Our VAR model provides good forecasting performance compared to other methods reported in the literature, showing that our model can be used for providing very short-term forecasts of temperature, wind speed, and solar radiation. The average (across the 61 locations modeled) performance of our model is comparable to other works. Moreover, our model performs significantly better at some locations, as indicated by the minimum values of the MAE that are reported in Tables 5, 6, and 7. Tables 5, 6, and 7 also suggest that our VAR model provides relatively robust weather forecasts up to six-hours ahead.

Perez et al. [15] forecast wind speed using a blended ensemble, which consists of the Weather Research and Forecasting Single Column Model and time series forecasts that are calibrated with Bayesian model averaging. The MAEs of their hour-ahead wind speed forecasts are between \(0.9\,\hbox {m}/\hbox {s}\) and \(0.95\,\hbox {m}/\hbox {s}\) during the day and are between \(1.01\,\hbox {m}/\hbox {s}\) and \(1.07\,\hbox {m}/\hbox {s}\) overnight. Erdem and Shi [17] compare four approaches that are based on an ARMA method for hour-ahead wind forecasting. Their method has MAEs ranging from \(0.8\,\hbox {m}/\hbox {s}\) to \(2.3\,\hbox {m}/\hbox {s}\). Li and Shi [23] present a comparison study on the application of different ANN in hour-ahead wind-speed forecasting and measure forecasting performance in terms of MAE and RMSE. The best MAE and RMSE among the locations that they model are \(0.950\,\hbox {m}/\hbox {s}\) and \(1.254\,\hbox {m}/\hbox {s}\), respectively. Chen et al. [20] produce wind-speed forecasts using a Gaussian process that is applied to the outputs of an NWP model. Their hour-ahead and five-hours-ahead forecasts have RMSEs of \(1.8\,\hbox {m}/\hbox {s}\) and \(2.2\,\hbox {m}/\hbox {s}\), respectively. Hong et al. [24] produce hour-ahead wind-speed forecasts using their cascaded MFNN method with MAEs of 1.12 m/s in the summer, \(1.22\,\hbox {m}/\hbox {s}\) in the winter, \(1.13\,\hbox {m}/\hbox {s}\) in the spring, and \(1.03\,\hbox {m}/\hbox {s}\) in the autumn.

Much short-term solar radiation forecasting is done using cloud motion derived from satellite images [11, 12, 19]. Perez et al. [12] report an increase in the RMSE% from \(25\%\) to \(42\%\) as the forecasting horizon goes from hour-ahead to six-hours-ahead. Traiteur et al. [19] compare their forecasts against single point ground-truth stations and report RMSEs that vary from \(68\,\hbox {Wh}/\hbox {m}^2\) to \(120\,\hbox {Wh}/\hbox {m}^2\) for hour-ahead forecasts and \(140\,\hbox {Wh}/\hbox {m}^2\) to \(200\,\hbox {Wh}/\hbox {m}^2\) for six-hours-ahead forecasts. Erdem and Shi [17] generate one-, two-, and three-hours-ahead solar forecasts and report RMSE%s of \(23\%\), \(32\%\), and \(38\%\), respectively. Remund et al. [16] compare short-term global radiation forecasts of three different models and find that ECMWF (Global Model of the European Centre for Medium-Range Weather Forecasts) is the best, with an RMSE% that stays at about \(38\%\) for one- to five-hours-ahead forecasting. The RMSEs of one- to three-hours-ahead forecasts of global radiation that are produced by Alonso-Montesinos and Batlles [14] are all greater than \(100\,\hbox {W}/\hbox {m}^2\), except for those under clear-sky conditions.

Table 8 Average (among 61 locations modeled) MAE of VAR and SP models
Table 9 Average (among 61 locations modeled) RMSE of VAR and persistence-type models
Fig. 6 Average (among 61 locations modeled) RMSE of solar radiation forecasts produced by VAR, CP, and SP models

Table 10 MAE-based skill scores of VAR model using SP model as reference model
Table 11 RMSE-based skill scores of VAR model using SP model as reference model

Taylor and Buizza [5] compare point forecasts of daily air temperature generated by six different models to actual observations. The best MAE of an hour-ahead forecast that they report is \(0.9\,\hbox {K}\), as opposed to an average of \(0.68\,\hbox {K}\) that is generated by our model. Smith et al. [31] develop an ANN model to predict air temperature at hourly intervals from one to twelve hours ahead. Their network is trained using data from sites that are selected to encompass a broad range of conditions. The MAEs of one- to six-hour-ahead predictions, averaged from two evaluation datasets, are \(0.53\,\hbox {K}\), \(0.87\,\hbox {K}\), \(1.15\,\hbox {K}\), \(1.37\,\hbox {K}\), \(1.59\,\hbox {K}\), and \(1.77\,\hbox {K}\), which are similar to the forecasting performance of our VAR model.

4.4 Skill scores comparison

As discussed before, direct comparison of MAE or RMSE among different datasets provides a limited picture of forecasting performance. This is because forecasting performance is governed, in part, by the local climate conditions of the region in question. If one is comparing forecasts from different regions, differences in MAE, RMSE, and other scores may be confounded by the effects of climate. Moreover, differences in the underlying weather variables being forecasted can confound forecasting performance. We use skill scores to allow for a clearer comparison of our VAR model to other forecasting methods that are reported in the literature.

Perez et al. [12] forecast short-term hourly average GHI one to six hours ahead. Their method uses cloud motion derived from consecutive geostationary satellite images. They compare the forecasts that are generated by their model to one year of ground measurements. They also report RMSEs for forecasts that are generated by their model at seven locations in the United States and the RMSEs of forecasts that are generated by a CP model. Using these reported RMSEs, we compute skill scores for their solar forecasts and compare them to the skill scores of solar forecasts that are generated by our VAR model across the 61 locations that we model (using a CP model as the reference model in both cases). Figure 7 compares the RMSE-based skill score of our model to that proposed by Perez et al. [12] for different forecast horizons. It is clear that the two models perform similarly up to two hours ahead. However, our model outperforms that of Perez et al. [12] for longer forecasting horizons.

Fig. 7 RMSE-based skill score of VAR model and model proposed by Perez et al. [12] for solar radiation forecasting using CP model as reference model

Marquez et al. [32] predict GHI at temporal horizons of 30, 60, 90, and \(120\,\hbox {minutes}\). They use a hybrid method that combines information from processed satellite images with ANN. They apply their forecasting method to data from two distinct locations in the San Joaquin Valley. The raw data are captured at \(30\hbox{-second}\) intervals and are then averaged to \(30\hbox{-minute}\) intervals. Inman et al. [33] summarize the forecast skill of these two ANN-based models compared to a clear-sky-deviation persistence model. These comparisons are summarized in Table 12. The input variables for Models 1 and 2 are data from satellite images and lagged GHI data, respectively. Although Marquez et al. [32] and Inman et al. [33] examine forecasts at \(30\)-minute intervals, it is nevertheless helpful to compare the performance of their model to our VAR using the same number of forecasting steps. For our VAR model, the skill scores of solar forecasts between one and four time steps ahead are 0.08, 0.18, 0.31, and 0.43. The one-time-step-ahead solar forecasts generated by our VAR model have a lower skill score than those produced by the model of Marquez et al. [32]. However, our remaining solar forecasts have higher skill scores than the corresponding forecasts that are produced by their model.

Table 12 RMSE-based skill scores of solar radiation forecasts produced by ANN-based models of Marquez et al. [32] using CP model as reference model reported by Inman et al. [33]

Abdel-Aal et al. [34] forecast mean hourly wind speeds at Dhahran, Saudi Arabia using group method of data handling abductive networks. The overall MAE for one-hour-ahead forecasts produced by their model is \(0.85\,\hbox {m}/\hbox {s}\). Their method achieves an \(8.2\%\) MAE reduction compared to hourly persistence forecasts, giving an MAE-based skill score of 0.08. For six-hour-ahead forecasts, the MAE of their method is \(1.20\,\hbox {m}/\hbox {s}\). This corresponds to an MAE-based skill score of 0.15 using a day-to-day persistence model as the reference model. Abdel-Aal et al. [34] conclude that the relative improvements of their model compared to persistence forecasts exceed those reported for a number of machine learning approaches that are discussed in the literature. Table 13 reports the MAE-based skill scores of wind speed forecasts produced by the method that is proposed by Abdel-Aal et al. [34] and our VAR method, using a persistence model as the reference model. The table shows that our VAR model performs similarly for one-hour-ahead forecasts but performs better for six-hour-ahead forecasts.

Table 13 MAE-based skill scores of wind speed forecasts produced by model of Abdel-Aal et al. [34] and VAR model using SP model as reference model

Sfetsos [35] compares a number of approaches to forecast mean hourly wind speed. These approaches include traditional linear ARMA models, feed-forward and recurrent neural networks, and more exotic approaches, such as adaptive neuro-fuzzy inference systems and neural logic networks. Sfetsos [35] identifies a neural logic network that incorporates logic rules as having the least error (among those surveyed), with an RMSE-based skill score of about 0.05. Our VAR model has an RMSE-based skill score for one-hour-ahead wind forecasts of 0.11 (cf. Table 11), which is better than the performance of the neural logic network model. Wang et al. [36] predict wind speed using an ANN-based method and then adjust the results according to long-term patterns. Their wind-speed data are sampled every twenty minutes. Compared to an SP model, the RMSE-based skill scores of their four- and six-hour-ahead forecasts are 0.16 and 0.13, respectively, which are lower than the corresponding skill scores of our VAR model. Fonte et al. [37] present an ANN-based method to predict average hourly wind speed. The RMSE-based skill score of one-hour-ahead forecasts for their model is 0.1 using an SP model as the reference model. The comparable skill score for our VAR model is 0.2.

5 Conclusion

In this paper, we propose a time-series VAR model to forecast temperature, solar radiation, and wind speed at 61 locations around the United States. The proposed VAR model structure captures multiple types of temporal and cross-sectional autocorrelations in and between weather variables and locations. This is a novelty compared to other forecasting techniques in the literature. The forecasting performance is good for all three weather variables. Given the influence of the three weather variables on electricity systems, the model is able to provide suitable inputs for electricity-supply and -demand modeling.

The consideration of spatial relationships allows the model to provide accurate forecasts. The proposed VAR model is also flexible in size. The forecasting performance is similar when it is used to forecast the three weather variables for fewer locations (results for these more limited models are excluded for the sake of brevity). We also show that the VAR model performs similarly to or better than other methods proposed in the literature, including persistence forecasts.

Another important contribution of this paper is that it shows that a time series approach can be used to provide robust short-term solar radiation forecasts with good forecasting performance.

This work does suggest several areas of future research. Although the VAR model proposed provides good forecasts, it may be overparameterized given its large size. Each equation has about five thousand parameters to be estimated, and not every one of these parameters contributes to the overall forecast. Thus, it may be possible to further customize the model and its autoregressive structure to better exploit the correlations in the data. There may also be additional exogenous variables that could be added to the model to improve its performance. That being said, the currently proposed model performs as well as or outperforms other models that are proposed in the literature. The residuals also display heteroskedasticity, which weighted or generalized least-squares techniques may reduce.