Journal of Modern Power Systems and Clean Energy

ISSN 2196-5625 CN 32-1884/TK


Online Demand Response Characterization Based on Variability in Customer Behavior

  • Lester Marrero
  • Daniel Sbárbaro (Senior Member, IEEE)
  • Luis García-Santander
Department of Electrical Engineering, University of Concepción, Concepción 4070409, Chile

Updated:2024-05-20

DOI:10.35833/MPCE.2023.000516


Abstract

This paper proposes an online framework to characterize demand response (DR) over time. The proposed framework facilitates obtaining and updating the daily consumption patterns of customers. The essential concept of response profile class (RPC) is introduced for characterization and complemented by the measure of the variability in customer behavior. This paper uses a modified version of the incremental clustering by fast search and find of density peaks (CFSFDP) algorithm for daily profiles, considering the multivariate normal kernel density estimator and incremental forms of the Davies-Bouldin (iDB) and Xie-Beni (iXB) validity indices. Case studies conducted using real-world and simulated daily profiles of residential and commercial Chilean end-users have demonstrated how the proposed framework can continuously characterize DR. The proposed framework is proven to achieve realistic customer models for effective energy management by estimating the customer response to price signals at the distribution system operator (DSO) level.

I. Introduction

Through deep coordination among grid operators and active customers, the capability of facilities for demand response (DR) and distributed energy resource management will be valuable for ancillary services [1]. In particular, the implementation of DR targets the control of the power-consuming behavior of customers to meet the following objectives: ① reduction of the peak power consumption; ② reduction of the total required power generation, as the main result of the prior objective; ③ change of the demand to follow the available supply, especially with high penetration of renewable energy sources (RESs); and ④ elimination of overloads in the distribution system [2].

The intrinsic socio-demographic characteristics (individual preferences) of customers and real-time externalities (environmental factors) can influence their response [3] in price-based DR programs. Typically, this information is private and unknown. However, understanding how customers respond under these influences is essential for estimating their flexibility potential and designing pricing schemes that match the operational needs of the distribution system. To this end, daily load profiles must first be processed and then characterized. These tasks are challenging because electricity data arrive as data streams; thus, online clustering is necessary to handle this problem.

Although there are many studies in the literature on the foundations and algorithms of online (or stream) clustering (e.g., the recent surveys [4]-[6]), few directly address its application to electricity data, which is the focus of this investigation. The same observation is made in related studies [7] and [8].

However, several studies have recently analyzed the electricity consumption of customers by exploiting important offline clustering methods. For example, [9] classifies 2613 households under diverse load conditions such as calendar seasons and days. A clustering is applied in [10] to load patterns represented as images, and then periods with similar consumption levels are identified by considering load variation and uncertainty. A bilevel load shape dictionary is developed in [11], where extracted features such as weekly and seasonal patterns and segment entropy characterize the energy usage of customers. Consumption dynamics for each end-user are formulated first in [12]; then, from a clustering process, an evaluation of variability in the resulting clusters is performed by an entropy analysis. Reference [13] extracts, classifies, and verifies the reliability of the clustering and discovers clusters that describe end-users according to their demand and variability. Reference [14] uses an encoding system with a load-shape dictionary in 44 million daily profiles, focusing on segmenting customers’ lifestyles. Finally, a distributed-centralized identification method is proposed in [15] to extract and characterize typical daily patterns for industrial customers.

Despite the valuable contributions of these studies, none of them develops an online characterization. Meanwhile, [16] presents an interim summarization function for the load data streams and an algorithm that incrementally learns and accumulates characterized patterns in six smart meters (SMs). Reference [17] implements the division of the load data streams in time windows, where objects in each window are assigned to different clusters (or concepts) whose structural change over time is analyzed, although they are not updated recursively. Similarly, an approach is presented in [8], which performs change detection and improves forecasting on aggregated time series within clusters by extracting interpretable features from the consumer data. An incremental algorithm is devised in [7] to detect pattern drifts through load pattern extraction, intergradation, and modification, but it handles only one customer at a time. Finally, [18] introduces an online adaptive clustering for load profiling. However, these studies mainly focus on analyzing the changes in consumption patterns rather than characterizing the customer behavior. The main advantage of this characterization for the distribution system operator (DSO) is obtaining online mathematical models to estimate the customer response according to consumption preferences and environmental factors. The DSO can then procure electrical energy with more certainty from the electricity pool.

This paper proposes an online framework to characterize the DR from the DSO perspective. The continuous processing of daily profiles makes it possible to know the response profile classes (RPCs) of customers and compute their variability. The framework comprises a modified version of the incremental clustering by fast search and find of density peaks (CFSFDP) algorithm [19] based on the seminal work in [20]. Previous experiences using the CFSFDP algorithm for load profiling in related studies [12], [15] support this selection. The incremental formulation adapts this original offline algorithm to work in an online setup. The proposed framework considers the multivariate normal kernel density estimator, which is robust to the number of objects in clusters during online processing, and incremental forms of the Davies-Bouldin (iDB) and Xie-Beni (iXB) validity indices [21] to provide the information about the algorithmic performance. The DSO can obtain the expected consumption of customers from their corresponding RPCs.

The customer response to the price signal has also been studied recently in the literature in the context of DR pricing. For example, in [22], the strategies for setting real-time prices are developed by implicitly learning the price elasticity of consumers online, although only own-price elasticity is considered in load changes. In [23], a model from the aggregator perspective runs a pricing program with distribution system constraints and learns the price sensitivities of customers. Reference [24] adopts a non-intrusive load monitoring-based pricing approach that estimates the DR potential of thermostatically controlled loads, and then models the price responsiveness of customers. Although the generation of time-varying price signals is beyond the scope of this paper, the proposed framework provides the DSO with two powerful instruments to achieve the following goals: ① the updated customer models for the estimation of responses to different price signals; and ② the modeling of the underlying probability distribution of random deviations in demand from expected values, inherent to the stochastic behavior of customers. According to [25], demand uncertainty generally has a normal distribution; therefore, the proposed framework is suitable for finding the information about the statistical moments.

The main contributions of the paper are summarized below.

1) An innovative online framework is proposed to characterize the DR. The paper presents the modified incremental CFSFDP algorithm, defined to work in a Hilbert space. Online clustering introduces the multivariate normal kernel density estimator for robustness to the number of objects in clusters and the monitoring of algorithmic performance through the iDB and iXB validity indices.

2) Application of the proposed framework allows the DSO to perform two essential activities: updating the RPC and variability when customer response materializes (at the end of the previous day) and estimating the customer behavior to price signals based on a known RPC (within the current day).

3) The proposed framework is tested with real-world and simulated daily profiles. Results show the online process for obtaining RPCs of residential and commercial Chilean end-users. This paper also provides a comparative analysis with the online algorithm in [18] and a sensitivity analysis employing different combinations of the number of representatives and the shrink factor.

The organization of this paper is as follows. Section II describes the proposed framework and provides theoretical foundations and mathematical models. Section III describes the solution methodology. Section IV presents two case studies with real-world and simulated daily profiles to verify the proposed framework. The performance monitoring and the comparison and sensitivity analysis are discussed in Section V. Finally, Section VI concludes this paper.

II. Proposed Framework

The DSO needs to ensure the reliability of the distribution system, which may include small distributed solar generation units and generally has tight capacity constraints. By appropriately choosing dynamic price signals to be broadcast to consumers enrolled in a price-based DR program, the DSO can reduce the distribution system costs and increase its reliability, for example, by shifting flexible consumption to the periods with high stochastic production [26].

This paper considers the price-setting DSO that aims at managing the demand flexibilities and pursues the estimation of customers’ behavior to price signals. To this end, the DSO performs the daily processing of load profiles through the modified incremental CFSFDP algorithm to update the RPCs and variability of customers. Using the corresponding RPCs, the DSO can generate the expected consumption profiles in response to control price signals. Therefore, it can decide with high certainty the amount of electricity to trade, for example, in the balancing market within the day. The proposed framework, focused on residential and commercial customers, comprises both the previous day, at the end of which the DSO knows the power responses of customers and updates their RPCs and individual variability, and the current day, during which the DSO estimates the customers’ response to a price signal based on known RPCs.

Let $\bar{s}_{l,t}=\bar{p}_{l,t}+j\bar{q}_{l,t}$ designate an expected complex power value to be consumed at time point $t$ by a customer $l$ under a contract, with $\bar{s}_{l,t}\in\mathcal{S}_{l,t}$, where $\mathcal{S}_{l,t}$ is positioned in the complex plane. It is possible to obtain a bounded and convex approximation $\mathcal{S}'_{l,t}$ of this region given practical bounds, both for active and reactive power. Although the estimation of $\bar{q}_{l,t}$ is fundamental for the analysis at the distribution system level, this paper focuses specifically on the active power responses of end-users. The following linear model for setting a flexible active power profile is defined [26]:

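$$p_{l,t}^{\min}\le\bar{p}_{l,t}\le p_{l,t}^{\max}\quad\forall l\in L,\ t\in T \tag{1}$$
$$-r_{l,t}^{d}\Delta t\le\bar{p}_{l,t}-\bar{p}_{l,t-1}\le r_{l,t}^{u}\Delta t\quad\forall l\in L,\ t=2,3,\ldots,T \tag{2}$$
$$\sum_{t=1}^{T}\bar{p}_{l,t}\Delta t\ge e_{l}\quad\forall l\in L \tag{3}$$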

Formula (1) provides the expected response $\bar{p}_{l,t}$ between a minimum bound $p_{l,t}^{\min}$ and a maximum bound $p_{l,t}^{\max}$ for customer $l$ at time point $t$. Also, $\bar{p}_{l,t}$ can increase or decrease depending on the market price due to the combined use of shifting and shedding loads. Formula (2) forces ramp limits on the decrease and increase of active power in two successive time points. Finally, a minimum daily energy $e_l$ is specified by (3) to account for basic activities.
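As an illustration of how constraints (1)-(3) act together, the following minimal Python sketch checks whether a candidate expected profile is admissible. The function name `is_feasible_profile`, the tolerance `tol`, the default step `dt` (15 min expressed in hours), and the toy numbers are assumptions made for this example and are not part of the original formulation.

```python
def is_feasible_profile(p, p_min, p_max, ramp_down, ramp_up, e_min,
                        dt=0.25, tol=1e-9):
    """Return True if the expected profile p satisfies (1)-(3).

    ramp_down[t] and ramp_up[t] hold the products r^d*dt and r^u*dt for the
    step from t-1 to t; index 0 is unused. dt is the time step in hours.
    """
    # (1): lower and upper bounds at every time point
    for value, low, high in zip(p, p_min, p_max):
        if value < low - tol or value > high + tol:
            return False
    # (2): ramp limits between two successive time points
    for t in range(1, len(p)):
        step = p[t] - p[t - 1]
        if step < -ramp_down[t] - tol or step > ramp_up[t] + tol:
            return False
    # (3): minimum daily energy
    return sum(p) * dt >= e_min - tol


# Toy four-point profile (hypothetical values, kW)
profile = [1.0, 1.5, 1.2, 1.0]
ok = is_feasible_profile(profile, [0.5] * 4, [2.0] * 4,
                         [0.0, 0.4, 0.4, 0.4], [0.0, 0.6, 0.6, 0.6], e_min=1.0)
```

Tightening the upward ramp limit below the largest observed increase, or raising the minimum energy above the energy of the profile, makes the same profile infeasible.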

An important observation for capturing the actual consumption features of consumers is that each region $\mathcal{S}'_{l,t}$ is time-varying since the bounds vary over time based on their preferences and environmental factors. This is exploited in this paper by developing an online processing and subsequent characterization of daily load profiles. From the corresponding outcome, it is possible to differentiate the behaviors of consumers through RPCs, where each RPC represents a portion (of similar daily profiles or vectors) of the polytope that entirely contains the customer’s load scenarios in the vector space. Therefore, based on these classes or portions, a more refined estimation of the consumption activity is feasible.

From a set of daily profiles associated with an RPC, each pair of parameters $p_{l,t}^{\min}$ and $p_{l,t}^{\max}$ of the model can be obtained as the corresponding extreme values, providing the convex (inner) approximation of the active power.

The maximum ramp rates in (2) are related to the speed at which the consumer can decrease or increase its demand; hence, they differ for each time point of the day and between RPCs. The strategy for their online determination is to consider the load changes from time point $t-1$ to $t$ ($t=2,3,\ldots,T$) within the set of daily profiles of the RPC. Then, the following expressions are obtained:

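$$r_{l,t}^{d}\Delta t=\max\{p_{l,t-1}-p_{l,t}:\ p_{l,t-1}-p_{l,t}>0\}\quad\forall l\in L,\ t=2,3,\ldots,T \tag{4}$$
$$r_{l,t}^{u}\Delta t=\max\{p_{l,t}-p_{l,t-1}:\ p_{l,t}-p_{l,t-1}>0\}\quad\forall l\in L,\ t=2,3,\ldots,T \tag{5}$$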

Formulas (4) and (5) capture the largest decrease and the largest increase in demand relative to the previous time point, respectively.

Lastly, the minimum daily energy can be the lowest total consumption among all the profiles within the RPC.
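The parameter estimation described above can be sketched in a few lines of Python. This is only an illustrative reading of the text: the function name `rpc_model_parameters`, the dictionary keys, and the convention of setting a ramp value to zero when no decrease (or increase) is observed at a time point are assumptions of this example.

```python
def rpc_model_parameters(profiles, dt=0.25):
    """Estimate the parameters of (1)-(3) from the daily profiles of one RPC.

    profiles: list of equally long daily profiles (lists of active power).
    dt: time step in hours. Ramp lists store the products r*dt; index 0 is unused.
    """
    T = len(profiles[0])
    # Bounds of (1): extreme values observed at each time point
    p_min = [min(p[t] for p in profiles) for t in range(T)]
    p_max = [max(p[t] for p in profiles) for t in range(T)]
    # Ramp products of (2): largest observed decrease and increase from t-1 to t
    ramp_down = [0.0] + [max(0.0, max(p[t - 1] - p[t] for p in profiles))
                         for t in range(1, T)]
    ramp_up = [0.0] + [max(0.0, max(p[t] - p[t - 1] for p in profiles))
                       for t in range(1, T)]
    # Minimum daily energy of (3): lowest total consumption among the profiles
    e_min = min(sum(p) * dt for p in profiles)
    return {"p_min": p_min, "p_max": p_max,
            "ramp_down": ramp_down, "ramp_up": ramp_up, "e_min": e_min}


# Two hypothetical three-point profiles with a one-hour step
params = rpc_model_parameters([[1.0, 2.0, 1.5], [2.0, 1.0, 3.0]], dt=1.0)
```

Because every parameter is an extreme value over the profiles of the class, adding a new daily profile to the RPC can only widen the bounds and ramp limits or lower the minimum energy.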

III. Solution Methodology

Daily processing of load profiles favors the appropriate characterization of customer behavior. As consumers can be associated based on the similarity of their consumption patterns, to make this analysis scalable, this paper presents an incremental clustering, which is described in this section. Then, the section considers the performance monitoring and the estimation of the variability of customer response.

Algorithm 1 outlines the online workflow that runs at the end of each day when the customer response materializes.

Algorithm 1: Online DR characterization

Input: set $P$, $G_k^0$, $R_k^0$, $k=1,2,\ldots,K$

1: for $t\in T$ do
2:   Assign each object $p_l$ to the corresponding cluster by (14) and solve (15) to update the representative
3:   for each cluster $G_k$ do
4:     Compute the local density by (7) and the minimum distance by (8) and (9) for each object, and select new cluster centers through their products if they exist
5:     Assign each remaining object to the corresponding cluster and select the representatives by (10)-(13)
6:   end for
7:   if new clusters arise then
8:     Compute the minimum distance between all pairs of clusters by (16) and merge accordingly
9:     In the merged clusters, compute the local density by (7) and the minimum distance by (8) and (9) for each object, and select as center the object with the highest product value and the representatives by (10)-(13)
10:   end if
11:   Compute the iDB and iXB validity indices
12:   Update the RPCs and variability of customers by (17)
13: end for

A. Modified Incremental CFSFDP Algorithm

Let $P^0=\{p_l^0\}$ denote an initial set of power profiles collected during $\tau$ days from $L$ customers equipped with SMs, with each vector $p_l^0=[p_{l,1}^0,p_{l,2}^0,\ldots,p_{l,\tau T}^0]^{\mathrm{T}}$. Let the initial power profile of each customer be recast into $\tau$ daily load profiles. By gathering all these new profiles, the initial set is then reformulated with a total of $N^0=\tau L$ vectors of $T$-tuples, i.e., $P^0=\{p_i^0,\ i=1,2,\ldots,N^0\}$, with each vector $p_i^0=[p_{i,1}^0,p_{i,2}^0,\ldots,p_{i,T}^0]^{\mathrm{T}}$. Furthermore, a set $P=\{p_l\}$ of load profiles is processed daily after this historical collection.

Any structure in the vector space depends on the similarity metric and the clustering criterion, which expresses how to use the metric. In this paper, the metric $d_2:\mathbb{R}^T\times\mathbb{R}^T\to\mathbb{R}$, defined in (6) in terms of the $\ell_2$ norm for any pair of vectors $p_i^0$ and $p_j^0$, is employed. Hence, $\mathbb{R}^T$ is a normed linear vector space, particularly a Hilbert space, due to the induced norm [27].

$$d_2(p_i^0,p_j^0)=\|p_i^0-p_j^0\|_2\quad\forall p_i^0,p_j^0\in P^0 \tag{6}$$

The implementation of the modified incremental CFSFDP algorithm involves the following stages.

1) Application of CFSFDP Algorithm

The CFSFDP algorithm uses the initial set $P^0$, where the clustering criterion relies on the computation of two magnitudes for each object $p_i^0$, i.e., the local density $\rho_i$ and the minimum distance $\delta_i$ concerning the vectors of higher density:

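$$\rho_i=\frac{1}{N^0h_1h_2\cdots h_T}\sum_{j=1}^{N^0}\prod_{t=1}^{T}k\left(\frac{p_{i,t}^0-p_{j,t}^0}{h_t}\right)\quad i=1,2,\ldots,N^0 \tag{7}$$
$$\delta_i=\min_{p_j^0\in P^0}d_2(p_i^0,p_j^0)\quad\text{s.t.}\ \rho_j>\rho_i\quad i,j=1,2,\ldots,N^0 \tag{8}$$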

Unlike [19] and [20], which use a cutoff distance that is highly sensitive to a small number of objects, (7) denotes the general form of a multivariate product kernel estimator of the probability density function at point $p_i^0$, in which the same (univariate) kernel function $k$ is used at each time point $t$ with a different smoothing parameter $h_t$. Without loss of generality, let the normal kernel be selected. Then, according to Scott’s rule [28] in $\mathbb{R}^T$, each smoothing parameter can be approximated as $h_t=\sigma_t(N^0)^{-1/(T+4)}$, with $\sigma_t$ as the standard deviation for time point $t$.

For the object $p_i^0$ with the highest density, the distance is:

$$\delta_i=\max_{p_j^0\in P^0}d_2(p_i^0,p_j^0) \tag{9}$$

Each cluster center $c_k^0$ ($k=1,2,\ldots,K$) describes a dominant consumption pattern and is expected to be surrounded by a neighborhood with lower local density and at a relatively large distance from any object with a higher local density. To identify the centers, both the plot of $\delta_i$ as a function of $\rho_i$ for each object (the decision graph) and the plot of quantities $\gamma_1,\gamma_2,\ldots,\gamma_{N^0}$ in decreasing order (each $\gamma_i=\rho_i\delta_i$) can be used [20]. After detecting the centers, each remaining object is assigned to the same cluster as its nearest neighbor with higher density. Each cluster is denoted as $G_k^0=\{c_k^0,p_{k,2}^0,p_{k,3}^0,\ldots,p_{k,N_k^0}^0\}$, where $N_k^0$ is the number of daily profiles.
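The computation of the local density, the minimum distance, and their product can be sketched as follows. This is a minimal standard-library Python illustration on a toy two-dimensional data set, not the implementation used in the paper: the function names, the use of the sample standard deviation in Scott's rule, and the toy points are assumptions. For real profiles with $T=96$, the product of bandwidths and kernels would be better evaluated in the log domain to avoid numerical underflow.

```python
import math


def normal_kernel(u):
    """Univariate standard normal kernel k(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def scott_bandwidths(profiles):
    """Scott's rule: h_t = sigma_t * N**(-1/(T+4)), one value per time point."""
    n, T = len(profiles), len(profiles[0])
    factor = n ** (-1.0 / (T + 4))
    bandwidths = []
    for t in range(T):
        column = [p[t] for p in profiles]
        mean = sum(column) / n
        sigma = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
        bandwidths.append(sigma * factor)  # assumes sigma > 0 at every time point
    return bandwidths


def density_peaks(profiles):
    """Return (rho, delta, gamma) following (7)-(9) and gamma_i = rho_i * delta_i."""
    h = scott_bandwidths(profiles)
    norm = len(profiles) * math.prod(h)
    rho = [
        sum(math.prod(normal_kernel((a - b) / ht) for a, b, ht in zip(pi, pj, h))
            for pj in profiles) / norm
        for pi in profiles
    ]
    delta = []
    for i, pi in enumerate(profiles):
        denser = [math.dist(pi, pj)
                  for j, pj in enumerate(profiles) if rho[j] > rho[i]]
        # (8) for ordinary objects, (9) for the object with the highest density
        delta.append(min(denser) if denser
                     else max(math.dist(pi, pj) for pj in profiles))
    gamma = [r * d for r, d in zip(rho, delta)]
    return rho, delta, gamma


# Toy data: one group of five points and one group of three points
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
       (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
rho, delta, gamma = density_peaks(toy)
centers = sorted(range(len(toy)), key=lambda i: gamma[i], reverse=True)[:2]
```

In this toy case, the two largest products single out one object in each group, which mirrors how the decision graph separates centers from the remaining objects.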

2) Determination of Representatives

Each cluster can be represented by a fixed number of representative points generated by selecting well-scattered points and then shrinking them toward the center by a specified fraction. This approach helps to identify the clusters with non-spherical shapes and wide variances in size [29].

Let $N_r$ be the number of representative points of clusters and $R_k^0=\{r_{k,u}^0,\ u=1,2,\ldots,N_r\}$ be the set of these points to select for any cluster $k$. Their determination follows a sequential order. A selected vector $p_{k,i}^0$ becomes a representative vector $r_{k,u}^0$. The first representative is:

$$r_{k,1}^0=\arg\max_{p_{k,i}^0\in G_k^0}d_2(p_{k,i}^0,c_k^0) \tag{10}$$

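And the rest of them are selected one by one as:

$$r_{k,u}^0=\arg\max_{p_{k,i}^0\in G_k^0\setminus R_k^0}\ \min_{r_{k,u}^0\in R_k^0}d_2(p_{k,i}^0,r_{k,u}^0)\quad\text{s.t.}\ d_2(p_{k,i}^0,r_{k,u}^0)\ge e^{-s(G_k^0)}\quad k=1,2,\ldots,K,\ u=2,3,\ldots,N_r \tag{11}$$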

To select the second representative, $r_{k,u}^0$ is $r_{k,1}^0$; to select the third representative, $r_{k,u}^0$ could be $r_{k,1}^0$ or $r_{k,2}^0$, and so on. The structure distance $s(G_k^0)$ is expressed in (12) based on the mean $\mu_{k,t}$ and standard deviation $\sigma_{k,t}$ of time point $t$ [19].

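$$s(G_k^0)=\sum_{t=1}^{T}\left(\frac{\sigma_{k,t}}{\mu_{k,t}}\right)^{2}\cdot\frac{1}{T}\sum_{t=1}^{T}\mu_{k,t}\quad k=1,2,\ldots,K \tag{12}$$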

The shrinking process of the representative points depends on the (user-defined) shrink factor $\alpha\in[0,1]$:

$$r_{k,u}^0=r_{k,u}^0+\alpha(c_k^0-r_{k,u}^0)\quad k=1,2,\ldots,K,\ u=1,2,\ldots,N_r \tag{13}$$

Shrinking the scattered points toward the center undoes surface abnormalities and mitigates the effect of outliers since these are typically further away from the cluster center [29].
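The selection of well-scattered points and their shrinking can be illustrated with the minimal Python sketch below. It follows the farthest-point logic of (10) and (11) and the update in (13); the constraint involving the structure distance is deliberately omitted for brevity, and the function names and toy cluster are assumptions of this example.

```python
import math


def select_representatives(cluster, center, n_rep):
    """Pick up to n_rep well-scattered points of a cluster.

    The first point is the farthest from the center, as in (10); each next
    point maximizes its distance to the closest representative already
    chosen, as in (11) without the structure-distance constraint.
    """
    reps = [max(cluster, key=lambda p: math.dist(p, center))]
    while len(reps) < min(n_rep, len(cluster)):
        candidates = [p for p in cluster if p not in reps]
        if not candidates:
            break
        reps.append(max(candidates,
                        key=lambda p: min(math.dist(p, r) for r in reps)))
    return reps


def shrink(reps, center, alpha):
    """Move every representative toward the center by the fraction alpha, as in (13)."""
    return [[r_t + alpha * (c_t - r_t) for r_t, c_t in zip(r, center)]
            for r in reps]


# Toy cluster in two dimensions (hypothetical values)
cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (1.0, 1.0)]
center = (1.0, 1.0)
reps = select_representatives(cluster, center, n_rep=2)
shrunk = shrink(reps, center, alpha=0.5)
```

With $\alpha=0$ the representatives stay on the cluster boundary, whereas with $\alpha=1$ they all collapse onto the center, so intermediate values trade shape fidelity against robustness to outliers.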

3) Assignment of New Load Profiles

With a new daily set $P$, each object $p_l$ is assigned to the cluster $G_k^0$ with the nearest representative. To this end, the local density $\rho_l$ is set initially to be zero, and the minimum distance $\delta_l$ is given as:

$$\delta_l=\min_{r_{k,u}^0\in R_k^0}d_2(p_l,r_{k,u}^0) \tag{14}$$
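The assignment rule in (14) amounts to a nearest-representative search over all clusters, which can be sketched in Python as follows. The function name `assign_to_cluster`, the dictionary layout, and the toy values are assumptions of this example.

```python
import math


def assign_to_cluster(profile, representatives):
    """Return (cluster_id, delta) for a new daily profile.

    representatives maps each cluster identifier to its list of
    representative vectors; delta is the minimum distance in (14).
    """
    best_id, best_dist = None, math.inf
    for cluster_id, reps in representatives.items():
        for r in reps:
            d = math.dist(profile, r)
            if d < best_dist:
                best_id, best_dist = cluster_id, d
    return best_id, best_dist


# Two hypothetical clusters described by their representatives
representatives = {1: [(0.0, 0.0), (1.0, 0.0)], 2: [(5.0, 5.0)]}
cluster_id, delta_l = assign_to_cluster((0.9, 0.1), representatives)
```

Since only the representatives are compared, the cost of assigning one profile grows with the number of clusters times $N_r$ rather than with the number of stored daily profiles.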

The assignment causes a change in the cluster structure, which requires updating the representative points. Assuming $r_{k,u}^0$ as the nearest representative to the assigned vector $p_{k,l}$, the solution of the problem below is found:

$$\arg\max_{\{p_{k,l},r_{k,u}^0\}\in A}\ \min_{r_{k,u}^0\in R_k^0\setminus r_{k,u}^0}d_2(A_{1:T,c},r_{k,u}^0)\quad\text{s.t.}\ d_2(A_{1:T,c},r_{k,u}^0)\ge e^{-s(G_k^0)}\quad k=1,2,\ldots,K,\ \forall l\in L,\ c=1,2 \tag{15}$$

where the auxiliary matrix is $A=[p_{k,l},r_{k,u}^0]$, with $A_{1:T,c}$ denoting its column $c$.

From this result, if $p_{k,l}$ produces the maximum value, $p_{k,l}$ replaces $r_{k,u}^0$ as the new representative [19]. Each cluster can be denoted as $G_k=\{c_k^0,p_{k,2},p_{k,3},\ldots,p_{k,N_k}\}$.

4) Splitting Procedure

In this stage, the algorithm looks for clusters with more than one dominant pattern. Specifically, in each new cluster $G_k$, the parameters $\rho_i$ and $\delta_i$ and the quantity $\gamma_i$ are first obtained for each object $p_{k,i}$. Based on these product values, new cluster centers can be identified, and the remaining objects can be assigned as in the CFSFDP algorithm [19]. Since only very few clusters arise in practice, an empirical criterion is used: the mean of the ten highest quantities $\gamma_i$ is computed, and the objects whose value exceeds this mean are taken as new centers.
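One possible reading of this empirical criterion is sketched below in Python. The function name `candidate_centers` and the toy values are assumptions; under this reading, the object with the largest product always exceeds the mean, so a result with a single index corresponds to a cluster that is not split.

```python
def candidate_centers(gamma, top=10):
    """Indices of objects whose gamma exceeds the mean of the `top` highest values.

    gamma: list with the product rho_i * delta_i of every object in a cluster.
    """
    if not gamma:
        return []
    ranked = sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)[:top]
    threshold = sum(gamma[i] for i in ranked) / len(ranked)
    return [i for i in ranked if gamma[i] > threshold]


# Hypothetical products for a cluster with two dominant patterns
gamma = [9.0, 0.5, 0.4, 8.0, 0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.05]
new_centers = candidate_centers(gamma)
```

Here two objects stand far above the mean of the ten highest products, which suggests splitting the cluster around them.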

5) Merging Procedure

This stage takes place if new clusters arise, and it looks for clusters containing a similar dominant pattern. Specifically, a graph is constructed to find the connected components among clusters. The minimum distance between any two clusters $G_k$ and $G_m$ ($k,m=1,2,\ldots,K$) is computed as:

$$D(G_k,G_m)=\min_{r_{k,u}\in R_k,\ r_{m,u}\in R_m}d_2(r_{k,u},r_{m,u}) \tag{16}$$

If $D(G_k,G_m)\le s(G_k)$ and $D(G_k,G_m)\le s(G_m)$, an edge is added between them. After adding all the edges, a graph with multiple components results, and the clusters in the same component can be merged [19].
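The merging step can be sketched as a connected-component search over clusters. The Python example below is illustrative only: it assumes that an edge is added when the distance in (16) does not exceed the structure distance of either cluster, and the function names, the dictionary layout, and the toy values are hypothetical.

```python
import math


def cluster_distance(reps_a, reps_b):
    """Minimum distance between the representatives of two clusters, as in (16)."""
    return min(math.dist(a, b) for a in reps_a for b in reps_b)


def merge_groups(representatives, structure_distance):
    """Group cluster identifiers that fall in the same connected component.

    representatives: cluster id -> list of representative vectors.
    structure_distance: cluster id -> structure distance s(G) of that cluster.
    """
    ids = list(representatives)
    neighbors = {k: set() for k in ids}
    for i, k in enumerate(ids):
        for m in ids[i + 1:]:
            d = cluster_distance(representatives[k], representatives[m])
            # Assumed edge rule: close with respect to both structure distances
            if d <= structure_distance[k] and d <= structure_distance[m]:
                neighbors[k].add(m)
                neighbors[m].add(k)
    seen, groups = set(), []
    for k in ids:
        if k in seen:
            continue
        component, stack = set(), [k]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbors[node] - component)
        seen |= component
        groups.append(component)
    return groups


# Three hypothetical clusters: the first two are close, the third is far away
representatives = {1: [(0.0, 0.0)], 2: [(0.5, 0.0)], 3: [(10.0, 10.0)]}
groups = merge_groups(representatives, {1: 1.0, 2: 1.0, 3: 1.0})
```

Each returned set with more than one identifier corresponds to clusters that would be merged into a single one.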

B. Incremental Indices for Performance Monitoring

Recently, [21], [30], and [31] have extended several offline cluster validity indices to deal with clustering over data streams. This paper implements the iDB and iXB validity indices to monitor the performance of the online algorithm.

For the computation of validity indices, the compactness term is essential. This paper considers the incremental formulation proposed in [21], which is a hard-clustering counterpart of the compactness calculation considered in [30] (in the context of fuzzy clustering). This formulation is applied to the clusters that do not undergo splitting or merging, and the corresponding equations in [21] are employed to calculate both indices. Instead, this paper applies the offline forms of both indices to the clusters that arise due to a split or merger. However, the compactness formulation in this paper is more straightforward than that described in [21] due to the following differences: ① the calculation is executed after the assignment of all objects in $P$ to clusters and not after the assignment of each object; and ② a cluster center $c_k$ is used instead of the centroid, which changes with the added objects. In both indices, smaller values mean better solutions, whereas sudden changes indicate changes in the cohesion and separation of clusters produced by the online algorithm [6].

C. Variability of Customer Response

To measure the uncertainty about the daily RPC for each customer, the entropy value [27] is used:

$$H_l=-\mathbb{E}[\lg c_k]\quad\forall l\in L,\ k=1,2,\ldots,K \tag{17}$$

Equation (17) is solved after each daily observation of RPCs and gives the amount of information inherent to each, i.e., the variability in customer behavior.
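A minimal Python sketch of the entropy computation is given below. It assumes that the expectation in (17) is taken over the empirical probabilities with which a customer uses each RPC and that the logarithm is base 10; both are assumptions of this example, chosen because they are consistent with the entropy values reported for customers 99 and 630 in Section IV. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy_from_probabilities(probabilities):
    """Entropy -sum(p * log10(p)) of a customer's RPC usage probabilities."""
    return -sum(p * math.log10(p) for p in probabilities if p > 0)


def rpc_entropy(daily_rpcs):
    """Entropy computed from the sequence of RPC labels observed day by day."""
    counts = Counter(daily_rpcs)
    total = len(daily_rpcs)
    return entropy_from_probabilities(c / total for c in counts.values())


# Rounded probabilities reported for customers 99 and 630 in Section IV
h_99 = entropy_from_probabilities([0.047, 0.905, 0.024, 0.024])
h_630 = entropy_from_probabilities([0.286, 0.238, 0.286, 0.19])
```

With these rounded probabilities, the results agree with the reported values to about three decimal places; a customer who always uses the same RPC has zero entropy.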

IV. Case Studies

This section presents two case studies to demonstrate the benefits of the proposed framework. The first case study involves residential and commercial electricity data recorded over six weeks (February 1 to March 13, 2020), with 15 min intervals. With 96 readings per day, the Hilbert space is then $\mathbb{R}^{96}$. The number of customers is 925, charged with a regulated tariff; however, this paper assumes that they practice an optimizing behavior, using electricity in known off-peak periods. Therefore, Algorithm 1 is applied continuously when each day ends, which simulates the succession of days during this period. As a result, the DSO can update the RPCs and variability of customers at the end of the day. The second case study considers numerical simulations for an additional week, where the DSO broadcasts price signals daily. Based on each daily update of RPCs from Algorithm 1, this case study allows obtaining the expected load profiles of customers for the next day.

A. Case Study 1: Using Real-world Data Set

Incompleteness is common in electricity data. Therefore, cleaning is performed by identifying and discarding daily profiles with missing and inconsistent values. The initial set $P^0$ comprises the first week of measurement data, and each set $P$ of profiles corresponding to the remaining days is processed incrementally. Likewise, each daily profile is normalized by its maximum value to facilitate the clustering process.

The application of the CFSFDP algorithm to $P^0$ allows the identification of four initial clusters through the decision graph and the plot of quantities $\gamma_1,\gamma_2,\ldots,\gamma_{N^0}$ in decreasing order. The results are depicted in Fig. 1, where the selected centers are highlighted with bigger blue dots. One of them is remarkably different, whereas the remaining three differ from the rest mainly due to the distance parameter, which means that these points do not have a neighborhood as dense as that of the first.

Fig. 1  Plots for identification of initial clusters. (a) Decision graph. (b) Quantities $\gamma_1,\gamma_2,\ldots,\gamma_{N^0}$ in decreasing order.

Representative points in clusters are the basis for assigning new profiles. The incremental clustering uses the following parameter values: $N_r=8$ and $\alpha=0.4$. To better observe the shrinking of representatives, Fig. 2 shows the shrinking process for the points that belong to the initial cluster 3, which has the lowest cardinality, projected into a two-dimensional space using the principal component analysis (PCA) [27]. Representative points and the cluster center are highlighted as bigger black dots and a blue diamond, respectively. In $\mathbb{R}^{96}$, this comprises the translation of the points around the center.

Fig. 2  Scatter data of cluster 3 for PCA. (a) Before shrinking process. (b) After shrinking process.

The online algorithm produces five final clusters considering the real-world data set. Figure 3 illustrates their evolution starting from the initial four. Following the trajectories, it is possible to identify the origin of each cluster, i.e., the one from which it arises, and the corresponding date. In the same way, for the clusters that fade, the cluster into which they merge and the corresponding date can be identified. The colored dots in Fig. 3 indicate the sequence of days, which helps to identify the dates on which clusters arise and fade. Black lines are used to represent the splits and mergers of the different-colored clusters. Cluster 1 is the only one that persists from the beginning. Also, several clusters arise and fade in the last days of the period, which is attributed to a change in the consumption behavior of most customers since February is a vacation month and March is a month of more work activities. Finally, some isolated objects (with small density and great distance) are classified as outliers and discarded during online processing.

Fig. 3  Evolution of clusters for measurement period.

Since customers generally have well-defined behaviors (at least for a specific period) and these behaviors repeat between them, continuous splitting of clusters is uncommon. Also, a splitting rarely generates more than two clusters, which also happens with merging. Figure 3 confirms these observations, with some exceptions in the last days when splits and mergers have increased. Furthermore, since a cluster might contain at most two dominant patterns, the empirical criterion presented in the splitting procedure to split it directly (without analysis as in Fig. 1) is simple but effective. This includes the isolated objects which are treated as outliers.

Figure 4 depicts the response profiles for the final clusters with cluster centers in black in case study 1, and Table I gives their cardinality and daily average consumption.

Fig. 4  Daily consumption profiles of final clusters in case study 1. (a) Cluster 1. (b) Cluster 2. (c) Cluster 3. (d) Cluster 4. (e) Cluster 5.

TABLE I  Cardinality and Daily Average Consumption in Final Clusters in Case Study 1
Final cluster    Cardinality    Average consumption (kWh)
1                4153           50.5
2                26684          30.3
3                2643           37.4
4                3629           27.8
5                349            19.9

The large share of cluster 2 (over 70% of the daily profiles) indicates that this pattern is present in most residential end-users; however, it also includes commercial establishments and small businesses. Cluster 1 presents the highest consumption and more stable behavior. Cluster 3 shows a slightly lower consumption in the morning. Daily profiles with irregular behavior are much more noticeable in cluster 4, and cluster 5 has a typical residential pattern with low consumption.

The appropriate characterization of customer behavior is essential in a control-by-price strategy. Table II summarizes the RPCs of customers and their combinations, which is the main benefit of the investigation. The largest group of consumers (294 of 925) uses four RPCs within the measurement period.

TABLE II  RPCs of Customers and Their Combinations in Case Study 1
Number of RPCs    Number of customers    Combination of RPCs (number of customers of combination)
1                 130                    2 (122), 4 (4), 1 (4)
2                 215                    1-2 (85), 2-4 (100), 1-4 (2), 2-3 (8), 1-3 (6), 2-5 (12), 3-4 (2)
3                 264                    1-2-3 (58), 1-2-4 (69), 2-4-5 (32), 2-3-4 (91), 1-3-4 (6), 2-3-5 (4), 1-2-5 (4)
4                 294                    1-2-4-5 (14), 1-2-3-4 (248), 2-3-4-5 (29), 1-2-3-5 (3)
5                 22                     1-2-3-4-5 (22)

To complement this, the variability between the patterns in the combination is fundamental. For example, customers 99 and 630 (internal identifiers used to preserve privacy) both have the most common combination, 1-2-3-4; however, their probabilities of energy usage are very different, namely 0.047-0.905-0.024-0.024 and 0.286-0.238-0.286-0.19, respectively, and their corresponding entropy values are $H_{99}=0.1796$ and $H_{630}=0.5965$ (with a general mean of 0.2436).

The lower value is because of the predominance of the second pattern and the very low probability of the rest, which implies less uncertainty about the daily RPC for customer 99. Figure 5 confirms this information based on the daily consumption profiles of both consumers for the measurement period, where a more uncertain daily behavior characterizes customer 630.

Fig. 5  Daily consumption profiles for measurement period. (a) Customer 99. (b) Customer 630.

B. Case Study 2: Using Simulated Data Set

This case explores the effect of relaxing the regulatory condition to allow the price to fall in off-peak periods and to increase in peak periods.

For simplicity, the DSO broadcasts a single price signal to all consumers each day of the specified week (from March 14 to March 20, 2020); however, customized price signals can be designed according to customer behavior and broadcast, e.g., each hour of the day to exploit newly available information of system states. Figure 6 depicts these price signals generated according to the demand of the Chilean power system.

Fig. 6  Price signals generated according to demand of Chilean power system.

The execution of numerical simulations considers the following ideas: ① a realistic situation is applied where consumers use the daily RPC with the highest probability for that day; ② to estimate active power responses to the price signal, customers practice a daily cost minimization; hence, this paper employs the linear programming problem in (18) [26]; and ③ to account for the presence of stochasticity in the responses, each random deviation from the expected active power value at time point $t$ follows a normal distribution with mean zero and variance of 0.01 kWh$^2$.

$$\min_{\bar{p}_l} \sum_{t=1}^{T} \lambda_t \bar{p}_{l,t} \Delta t \quad \text{s.t. } (1)\text{-}(3) \tag{18}$$
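As an illustration of (18), the following sketch solves a daily cost minimization with `scipy.optimize.linprog` and then adds the zero-mean normal deviation described in item ③. Constraints (1)-(3) are defined earlier in the paper and are not reproduced here; based on the Nomenclature, the sketch assumes they consist of the minimum and maximum bounds, the maximum ramp-down and ramp-up rates, and the minimum daily energy. The function names and the two-point example are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def expected_response(prices, p_min, p_max, ramp_down, ramp_up, e_min, dt=1.0):
    """Minimize sum_t prices[t] * p[t] * dt under the assumed constraints.

    Assumed constraints (stand-ins for (1)-(3), not taken from the paper):
      p_min[t] <= p[t] <= p_max[t]
      -ramp_down[t] * dt <= p[t] - p[t-1] <= ramp_up[t] * dt
      sum_t p[t] * dt >= e_min
    """
    prices = np.asarray(prices, dtype=float)
    T = prices.size
    # Row t-1 of D computes p[t] - p[t-1].
    D = np.eye(T)[1:] - np.eye(T)[:-1]
    up = np.broadcast_to(np.asarray(ramp_up, dtype=float), (T,))[1:]
    down = np.broadcast_to(np.asarray(ramp_down, dtype=float), (T,))[1:]
    # Stack ramp-up, ramp-down, and (negated) minimum-energy constraints.
    A_ub = np.vstack([D, -D, -dt * np.ones((1, T))])
    b_ub = np.concatenate([up * dt, down * dt, [-e_min]])
    bounds = list(zip(np.broadcast_to(p_min, (T,)), np.broadcast_to(p_max, (T,))))
    res = linprog(prices * dt, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise ValueError(res.message)
    return res.x


def simulated_response(expected, rng, variance=0.01):
    """Add a zero-mean normal deviation to each time point (item 3 above)."""
    return expected + rng.normal(0.0, np.sqrt(variance), size=expected.shape)


# Hypothetical two-point example: the cheap first period is used up to the
# point where the ramp-down limit and the minimum energy both bind.
p_bar = expected_response([1.0, 3.0], 0.0, 2.0, ramp_down=0.5, ramp_up=10.0, e_min=3.0)
p_sim = simulated_response(p_bar, np.random.default_rng(0))
```

Because the objective is linear, the optimum shifts consumption toward low-price time points until a bound, a ramp limit, or the energy requirement becomes binding.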

For example, considering the same two customers on March 14, the RPC with the highest probability for this day is RPC 2 for customer 99 and RPC 3 for customer 630. Figure 7 represents their corresponding mathematical models according to these RPCs, from which the optimizing behavior can be found. Each model involves the minimum and maximum bounds (yellow curves), which in general follow the shape of clusters 2 and 3 in Fig. 4, respectively, and the maximum ramp values given by the products $r_{l,t}^{d}\Delta t$ and $r_{l,t}^{u}\Delta t$. For both customers, one of the two ramp rates clearly predominates at some time points, which implies a higher flexibility in that direction. For customer 630, the lower values around the 6th hour indicate less flexibility at this time. Finally, the minimum daily energy is 73.73 kWh for customer 99 and 34.45 kWh for customer 630.

Fig. 7  Ramp values $r_{l,t}^{d}\Delta t$ and $r_{l,t}^{u}\Delta t$, and the minimum and maximum bounds. (a) Customer 99. (b) Customer 630.

After the simulation period, the number of final clusters increases to six. Figure 8 illustrates their response profiles, and Table III gives their cardinality and daily average consumption.

Fig. 8  Daily consumption profiles of final clusters in case study 2. (a) Cluster 1. (b) Cluster 2. (c) Cluster 3. (d) Cluster 4. (e) Cluster 5. (f) Cluster 6.

TABLE III  Cardinality and Daily Average Consumption in Final Clusters in Case Study 2

| Final cluster | Cardinality (ratio of new daily profiles (p.u.)) | Average consumption (kWh) |
| --- | --- | --- |
| 1 | 4227 (0.01) | 51.7 |
| 2 | 12003 (0.05) | 45.4 |
| 3 | 2467 (0.03) | 34.2 |
| 4 | 6041 (0.32) | 31.1 |
| 5 | 18960 (0.58) | 17.5 |
| 6 | 190 (0.01) | 21.0 |

The main difference from the case with the real-world data set is the newly arising cluster 5, whose shape is related to that of the previous cluster 2. Most of the new daily profiles obtained under the effect of the price signals are added to this new cluster, as Table III also indicates. Given the shape of the control signals, with lower prices in the early morning, and the corresponding constraints on the bounds and the maximum ramp rates during these hours, most of the new daily profiles present lower consumption to reduce the final cost. In addition, the algorithm assigns the lower-consumption profiles of cluster 2 to this new cluster. As a result, cluster 5 shows the lowest energy value, whereas the cardinality of cluster 2 decreases notably and its average consumption increases. Finally, the cluster centers remain the same as in the case using the real-world data set.

Lastly, Table IV summarizes the RPCs and their combinations including the simulation period. Most consumers continue to use four RPCs, but the most common combination is now 1-2-3-4-5.

TABLE IV  RPCs of Customers and Their Combinations in Case Study 2

| Number of RPCs | Number of customers | Combination of RPCs (number of customers of combination) |
| --- | --- | --- |
| 1 | 9 | 2 (5), 5 (3), 4 (1) |
| 2 | 67 | 4-5 (34), 2-5 (27), 3-4 (2), 1-2 (1), 1-4 (1), 1-5 (1), 3-5 (1) |
| 3 | 242 | 1-4-5 (6), 2-4-5 (172), 3-4-5 (29), 1-2-4 (7), 1-3-4 (11), 2-5-6 (7), 1-2-5 (10) |
| 4 | 305 | 1-2-4-5 (135), 1-3-4-5 (40), 2-3-4-5 (85), 2-4-5-6 (24), 1-2-3-5 (5), 1-2-3-4 (12), 1-4-5-6 (3), 1-2-5-6 (1) |
| 5 | 282 | 1-2-4-5-6 (17), 1-2-3-4-5 (252), 1-3-4-5-6 (3), 2-3-4-5-6 (10) |
| 6 | 20 | 1-2-3-4-5-6 (20) |
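A summary such as Table IV can be tallied directly from the RPC assigned to each customer on each day. The sketch below is one possible way to do so; the customer identifiers and daily assignments are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter


def combination_counts(rpcs_by_customer):
    """Group customers by the set of RPCs they used.

    rpcs_by_customer maps a customer id to the list of daily RPC labels.
    Returns {number_of_rpcs: {combination_string: number_of_customers}}.
    """
    counts = Counter(
        "-".join(str(r) for r in sorted(set(rpcs)))
        for rpcs in rpcs_by_customer.values()
    )
    grouped = {}
    for combo, n in counts.items():
        grouped.setdefault(combo.count("-") + 1, {})[combo] = n
    return grouped


# Hypothetical daily RPC assignments for four customers.
example = {"a": [2, 5, 4, 2], "b": [4, 5, 2], "c": [5, 5], "d": [1, 2]}
summary = combination_counts(example)
```

Here customers "a" and "b" share the combination 2-4-5 even though they visit the RPCs in a different order and with different frequencies, which is why the variability measure is needed alongside the combination itself.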

V. Performance Monitoring and Comparison and Sensitivity Analysis

Considering the real-world data set, this section first analyzes the monitoring of the clustering algorithm based on the iDB and iXB validity indices. The comparison with the online algorithm in [18] and the sensitivity analysis employing the number of representatives and the shrinking factor are also investigated.

A. Performance Monitoring of Online Clustering

To assess the cohesion and separation of the clusters produced in the online processing, Fig. 9 depicts the iDB and iXB validity indices. The two indices are highly correlated over the measurement period, and the resulting trend of their values indicates good performance of the algorithm, based on an adequate assignment of objects.

Fig. 9  Evolution of iDB and iXB validity indices.

On the days when splitting or merging events happen (as shown in Fig. 3), the indices almost always show a slight increase or decrease, respectively. Conversely, both indices remain constant when the clusters hold from one day to the next. This confirms the usefulness of these incremental forms for monitoring the performance of the online algorithm, as also observed for other correctly and poorly partitioned data sets [21]. The higher values in the last few days are due to a change in the daily patterns of most consumers.
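For reference, the quantities tracked by the two indices can be illustrated with their standard batch definitions. The sketch below is not the incremental formulation of [21] used in the paper, and it represents each cluster by its centroid for simplicity; it only shows how cohesion and separation enter each index, with lower values indicating a better partition. The function name and the toy data are hypothetical.

```python
import numpy as np


def batch_validity_indices(X, labels):
    """Batch Davies-Bouldin and Xie-Beni indices for a hard partition.

    Requires at least two clusters. Centroids are used as cluster
    prototypes, which is a simplifying assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centers = np.array([X[labels == k].mean(axis=0) for k in ids])
    # Cohesion: mean distance of the members of each cluster to its center.
    scatter = np.array([
        np.linalg.norm(X[labels == k] - c, axis=1).mean()
        for k, c in zip(ids, centers)
    ])
    # Separation: pairwise distances between centers (diagonal excluded).
    center_dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(center_dist, np.inf)
    db = np.mean(np.max((scatter[:, None] + scatter[None, :]) / center_dist, axis=1))
    sse = sum(np.sum((X[labels == k] - c) ** 2) for k, c in zip(ids, centers))
    xb = sse / (len(X) * np.min(center_dist) ** 2)
    return float(db), float(xb)


# Two compact, well-separated toy clusters.
X_far = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
db_far, xb_far = batch_validity_indices(X_far, [0, 0, 1, 1])
# The same clusters moved closer together give larger (worse) indices.
X_near = [[0.0, 0.0], [0.0, 2.0], [3.0, 0.0], [3.0, 2.0]]
db_near, xb_near = batch_validity_indices(X_near, [0, 0, 1, 1])
```

Both indices grow when clusters become less compact or less separated, which is consistent with the spikes being read as poorer assignments.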

B. Comparison Analysis

This paper uses the algorithm in [18] for comparison purposes since it also processes daily profiles. Compared with the proposed framework, this algorithm has two main differences: ① it uses centroids, which is not always valid in practice since clusters ideally need to have a spherical structure in the vector space; and ② it discards past daily profiles and preserves this information only through a distance matrix and the centroids themselves, which are updated daily.

The algorithm is applied using the $l_2$ norm and the first week for the consensus clustering [18], in which the k-means method uses the set $\{2, 3, \ldots, 8\}$ for the number of clusters per instance. Table V gives the result of two cases with different parameter values for the facility cost $C_f$ that decides the split and the minimum distance between centroids $d_{\min}$ that favors the merger, where $D_p$ is the mean of the probability distances. The rest of the parameter values comprise the initial number of clusters $K_0$ obtained from the corresponding dendrogram, the exponential forgetting $\nu$, and the number of disruptive loads $\gamma_{\min}$, which remain as in the application with electricity data in [18]. Specifically, different parameter values in the second case produce many more splits and mergers.

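TABLE V  Parameter Values and Final Clusters

| Case | $K_0$ | $\nu$ | $\gamma_{\min}$ | $C_f$ | $d_{\min}$ | Final clusters |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 7 | 0.85 | 5 | $6D_p$ | $0.5D_p$ | 6 |
| 2 | 7 | 0.85 | 5 | $5D_p$ | $0.8D_p$ | 8 |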

Figure 10 illustrates the evolution of the iDB and iXB validity indices in these two case studies with different parameter values using the algorithm in [18], where high-magnitude spikes that suggest a poorer assignment of daily profiles are present in both.

Fig. 10  Evolution of iDB and iXB validity indices in two case studies. (a) Case study 1. (b) Case study 2.

C. Sensitivity Analysis

The sensitivity analysis evaluates the influence of changes in the number of representatives $N_r$ and the shrinking factor $\alpha$. Table VI gives the solutions for a group of selected values of these parameters.

TABLE VI  Final Clusters, Mean, Standard Deviation, and Difference Between the Maximum and Minimum iDB and iXB Validity Indices for Different Values of $N_r$ and $\alpha$

| $N_r$ | $\alpha$ | Final clusters | iDB: mean, standard deviation, max-min difference | iXB: mean, standard deviation, max-min difference |
| --- | --- | --- | --- | --- |
| 8 | 0.4 | 5 | 1.57, 0.65, 2.89 | 1.02, 0.65, 2.37 |
| 4 | 0.4 | 6 | 2.92, 5.81, 29.68 | 1.52, 0.83, 3.27 |
| 6 | 0.4 | 5 | 1.47, 0.30, 1.04 | 1.20, 0.67, 2.23 |
| 10 | 0.4 | 3 | 1.27, 0.20, 0.88 | 0.61, 0.31, 1.36 |
| 12 | 0.4 | 3 | 1.19, 0.16, 0.76 | 0.60, 0.35, 1.34 |
| 8 | 0.3 | 3 | 1.41, 0.52, 3.23 | 1.14, 0.78, 3.50 |
| 8 | 0.5 | 5 | 1.28, 0.21, 0.96 | 0.75, 0.42, 1.52 |
| 1 | 1.0 | 9 | 1.86, 0.19, 0.76 | 1.34, 0.25, 1.22 |

Considering $N_r$, the trend is to obtain fewer clusters as this number increases. With one representative (the cluster center itself), nine final clusters are obtained, which is close to the eight final clusters of case study 2 using the algorithm in [18] (which employs the centroid). With $N_r=4$, a large difference between the maximum and minimum values is obtained for the iDB index. Generally, it is hard to establish the final number of these scattered points in the vector space. With a relatively small number, a large cluster can split incorrectly, but with a large number, two close clusters are more likely to merge. According to the values of the indices in Table VI with the real-world data set, using $N_r \geq 6$ produces adequate results.

Considering $\alpha$, it is not practical to take very low or very high values [29]. In addition to $\alpha=1$, Table VI presents the results with $\alpha=0.3$ and $\alpha=0.5$. Similarly, with a smaller value of $\alpha$, the scattered points shrink little, and merging between two clusters is more likely to occur, whereas larger values cause these points to move more toward the cluster center and favor the splitting.
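The roles of $N_r$ and $\alpha$ can be made concrete with a short sketch of the representative-point construction popularized by CURE [29]: well-scattered points are selected by a farthest-point rule and then moved toward the cluster center by the fraction $\alpha$. This follows the generic CURE procedure and is only an assumption about how the modified algorithm in this paper builds its representatives; the function name and the toy cluster are hypothetical.

```python
import numpy as np


def shrunk_representatives(points, center, n_rep, alpha):
    """Select up to n_rep scattered points and shrink them toward center.

    alpha = 0 leaves the scattered points unchanged; alpha = 1 collapses
    every representative onto the cluster center.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    n_rep = min(n_rep, len(points))
    # The first pick is the point farthest from the center; each later pick
    # is the point farthest from all representatives chosen so far.
    min_dist = np.linalg.norm(points - center, axis=1)
    chosen = []
    for _ in range(n_rep):
        idx = int(np.argmax(min_dist))
        chosen.append(idx)
        d = np.linalg.norm(points - points[idx], axis=1)
        min_dist = d if len(chosen) == 1 else np.minimum(min_dist, d)
    scattered = points[chosen]
    return scattered + alpha * (center - scattered)


# Toy cluster of four two-dimensional points with its mean as center.
pts = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
ctr = pts.mean(axis=0)
reps_none = shrunk_representatives(pts, ctr, n_rep=2, alpha=0.0)
reps_half = shrunk_representatives(pts, ctr, n_rep=2, alpha=0.5)
reps_full = shrunk_representatives(pts, ctr, n_rep=2, alpha=1.0)
```

With $\alpha=1$ the representatives coincide with the center, which corresponds to the single-representative row of Table VI, while small values of $\alpha$ keep them near the cluster boundary, where close clusters are more easily judged to touch.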

VI. Conclusion

This paper proposes an online framework for DR characterization. In particular, the DSO can use the proposed framework to obtain and update the RPCs and variability of customers daily, and to estimate the customer response to a price signal based on a known RPC, which is suitable for effective energy management on the demand side.

Furthermore, the underlying probability distribution of random deviations in demand from expected values is generally unknown. However, the proposed framework contributes to modeling it since each daily deviation can be considered a realization of the corresponding random variable. Using the parameters of this empirical distribution overcomes the limitation of making distributional assumptions, which can result in risky or overly conservative and costly solutions.
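A minimal sketch of this idea is given below: the deviations observed on each day are treated as samples, and their mean and variance are estimated per time point. The array layout (days in rows, time points in columns), the function name, and the numbers are assumptions made for illustration.

```python
import numpy as np


def empirical_deviation_parameters(observed, expected):
    """Per-time-point sample mean and variance of daily deviations.

    observed has shape (n_days, T); expected has shape (T,) or (n_days, T).
    """
    deviations = np.asarray(observed, dtype=float) - np.asarray(expected, dtype=float)
    return deviations.mean(axis=0), deviations.var(axis=0, ddof=1)


# Hypothetical three days of two-point profiles around an expected response.
obs = [[1.1, 2.0], [0.9, 2.2], [1.0, 1.8]]
dev_mean, dev_var = empirical_deviation_parameters(obs, [1.0, 2.0])
```

These sample moments can then be passed to whichever uncertainty model the DSO adopts, without fixing a parametric distribution in advance.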

Some technical and practical complications arise with each new set of load profiles and the growing amount of data that needs to be processed daily. In particular: ① more computational effort is demanded; ② as clusters extend in the vector space, a larger $N_r$ might eventually be more appropriate for capturing their geometry; and ③ the time to find the information about the RPCs of customers and estimate their responses could be longer than the time available to the DSO. Thus, a suitable strategy to be included in the online algorithm is the well-known sliding window [4], according to which the number of daily profiles remains practically constant while the information of interest is retained. Algorithm 1 favors this inclusion.
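The sliding-window idea can be sketched with a fixed-length queue of days: when a new day arrives, the oldest day drops out, so the number of stored profiles stays roughly constant. The class below is a hypothetical illustration and is not part of Algorithm 1.

```python
from collections import deque


class SlidingProfileWindow:
    """Keep only the daily profiles of the most recent window_days days."""

    def __init__(self, window_days):
        # A deque with maxlen discards the oldest day automatically.
        self._days = deque(maxlen=window_days)

    def add_day(self, profiles):
        """Store the profiles measured on one day."""
        self._days.append(list(profiles))

    def profiles(self):
        """Return all retained profiles, oldest day first."""
        return [p for day in self._days for p in day]


# Hypothetical stream of five days with one two-point profile per day.
window = SlidingProfileWindow(window_days=3)
for day in range(5):
    window.add_day([[float(day), float(day)]])
retained = window.profiles()
```

The window length trades responsiveness to new behavior against the amount of history available for estimating the RPC probabilities.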

While processing and analyzing load data streams represent a significant challenge, case studies using real-world and simulated daily profiles of Chilean end-users have demonstrated the applicability of the proposed framework. According to the results, most consumers use four RPCs within the measurement period. Furthermore, the behavior of iDB and iXB validity indices and the comparison analysis have verified the adequate assignment of objects of the online clustering. Future work needs to address the following issues in more depth: ① the impact of dimensionality; ② the definition of the most reliable values for parameters Nr and α; and ③ the correct RPC for estimating the expected response of customers (not necessarily the RPC with the highest probability).

Nomenclature

| Symbol | Definition |
| --- | --- |
| $\alpha$ | Shrink factor |
| $\gamma_i$ | Product of $\rho_i$ and $\delta_i$ |
| $\Delta t$ | Interval between two consecutive time points |
| $\delta_i$ | The minimum distance |
| $\lambda_t$ | Electricity price at time point $t$ |
| $\rho_i$ | Local density |
| $A$ | Auxiliary matrix $p_{k,l}, r_{k,v}^{0}$ |
| $c_k^0$, $c_k$ | Initial and non-initial cluster centers of cluster $k$ ($k=1,2,\ldots,K$) |
| $e_l$ | The minimum daily energy of customer $l$ |
| $\mathbb{E}$ | Expectation operator |
| $G_k^0$, $G_k$ | Clusters $k$ of daily profiles, $\{c_k^0, p_{k,2}^0, \ldots, p_{k,N_k^0}^0\}$ and $\{c_k, p_{k,2}, \ldots, p_{k,N_k}\}$ |
| $\mathrm{H}_l$ | Variability of customer $l$ |
| $h_t$ | Smoothing parameter of time point $t$ |
| $L$ | Set of customers |
| $N^0$ | Total number of vectors of $T$-tuples |
| $N_k$ | Number of daily profiles |
| $N_r$ | Number of representative points |
|  | Probability of cluster center followed up to current day |
| $P^0$ | Initial set of daily profiles $p_i^0$, $i=1,2,\ldots,N^0$ |
| $P$ | Set of daily profiles $p_l$ |
| $p_i^0$ | The $i$th initial active power profile $[p_{i,1}^0, p_{i,2}^0, \ldots, p_{i,T}^0]^{\mathrm{T}}$ |
| $\bar{p}_l$ | Expected active power profile of customer $l$, $[\bar{p}_{l,1}, \bar{p}_{l,2}, \ldots, \bar{p}_{l,T}]^{\mathrm{T}}$ |
| $p_l$ | Active power profile of customer $l$, $[p_{l,1}, p_{l,2}, \ldots, p_{l,T}]^{\mathrm{T}}$ |
| $p_{l,t}^{\min}$, $p_{l,t}^{\max}$ | The minimum and maximum bounds for $\bar{p}_{l,t}$ |
| $\bar{p}_{l,t}$, $\bar{q}_{l,t}$ | Expected active and reactive power profiles of customer $l$ at time point $t$ |
| $R_k^0$, $R_k$ | Sets of representative points of cluster $k$, $\{r_{k,u}^0, u=1,2,\ldots,N_r\}$ and $\{r_{k,u}, u=1,2,\ldots,N_r\}$ |
| $r_{l,t}^{d}$, $r_{l,t}^{u}$ | The maximum ramp-down and ramp-up rates for $\bar{p}_{l,t}$ |
| $r_{k,u}^0$, $r_{k,u}$ | Representative vector in $R_k^0$ and $R_k$ |
| $s_{G_k^0}$, $s_{G_k}$ | Structure distances of $G_k^0$ and $G_k$ |
| $\mathcal{S}_{l,t}$ | Consumption region that contains active and reactive power values of customers at time point $t$ |
| $T$ | Set of time points |

References

[1] K. Schmitt, R. Bhatta, M. Chamana et al., “A review on active customers participation in smart grids,” Journal of Modern Power Systems and Clean Energy, vol. 11, no. 1, pp. 3-16, Jun. 2023.

[2] J. S. Vardakas, N. Zorba, and C. V. Verikoukis, “A survey on demand response programs in smart grids: pricing methods and optimization algorithms,” IEEE Communications Surveys & Tutorials, vol. 17, no. 1, pp. 152-178, Jan.-Mar. 2015.

[3] X. Chen, Y. Li, J. Shimada et al., “Online learning and distributed control for residential demand response,” IEEE Transactions on Smart Grid, vol. 12, no. 6, pp. 4843-4853, Nov. 2021.

[4] J. A. Silva, E. R. Faria, R. C. Barros et al., “Data stream clustering: a survey,” ACM Computing Surveys, vol. 46, no. 1, pp. 1-31, Oct. 2013.

[5] M. Carnein and H. Trautmann, “Optimizing data stream representation: an extensive survey on stream clustering algorithms,” Business & Information Systems Engineering, vol. 61, pp. 277-297, Jun. 2019.

[6] J. C. Bezdek and J. M. Keller, “Streaming data analysis: clustering or classification?” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 51, no. 1, pp. 91-102, Jan. 2021.

[7] Z. Jiang, R. Lin, and F. Yang, “An incremental clustering algorithm with pattern drift detection for IoT-enabled smart grid system,” Sensors, vol. 21, no. 19, p. 6466, Oct. 2021.

[8] P. Laurinec and M. Lucká, “Interpretable multiple data streams clustering with clipped streams representation for the improvement of electricity consumption forecasting,” Data Mining and Knowledge Discovery, vol. 33, pp. 413-445, Mar. 2019.

[9] M. Sun, I. Konstantelos, and G. Strbac, “C-vine copula mixture model for clustering of residential electrical load pattern data,” IEEE Transactions on Power Systems, vol. 32, no. 3, pp. 2382-2393, May 2017.

[10] M. Charwand, M. Gitizadeh, P. Siano et al., “Clustering of electrical load patterns and time periods using uncertainty-based multi-level amplitude thresholding,” International Journal of Electrical Power & Energy Systems, vol. 117, p. 105624, May 2020.

[11] H. Liang and J. Ma, “Develop load shape dictionary through efficient clustering based on elastic dissimilarity measure,” IEEE Transactions on Smart Grid, vol. 12, no. 1, pp. 442-452, Jan. 2021.

[12] Y. Wang, Q. Chen, C. Kang et al., “Clustering of electricity consumption behavior dynamics toward big data applications,” IEEE Transactions on Smart Grid, vol. 7, no. 5, pp. 2437-2447, Sept. 2016.

[13] S. Haben, C. Singleton, and P. Grindrod, “Analysis and clustering of residential customers energy behavioral demand using smart meter data,” IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 136-144, Jan. 2016.

[14] J. Kwac, J. Flora, and R. Rajagopal, “Lifestyle segmentation based on energy consumption data,” IEEE Transactions on Smart Grid, vol. 9, no. 4, pp. 2409-2418, Jul. 2018.

[15] F. Lu, X. Cui, J. Xing et al., “Electricity load profile characterisation for industrial users based on normal cloud model and iCFSFDP algorithm,” IEEE Transactions on Power Systems, vol. 38, no. 4, pp. 3799-3813, Jul. 2023.

[16] D. de Silva, X. Yu, D. Alahakoon et al., “A data mining framework for electricity consumption analysis from meter data,” IEEE Transactions on Industrial Informatics, vol. 7, no. 3, pp. 399-407, Aug. 2011.

[17] M. A. Masud, J. Huang, M. Zhong et al., “Cluster survival model of concept drift in load profile data,” IEEE Access, vol. 6, pp. 51269-51285, Sept. 2018.

[18] G. L. Ray and P. Pinson, “Online adaptive clustering algorithm for load profiling,” Sustainable Energy, Grids and Networks, vol. 17, p. 100181, Mar. 2019.

[19] L. Zhao, Z. Chen, Y. Yang et al., “ICFS clustering with multiple representatives for large data,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 3, pp. 728-738, Mar. 2019.

[20] A. Rodriguez and A. Laio, “Clustering by fast search and find of density peaks,” Science, vol. 344, no. 6191, pp. 1492-1496, Jun. 2014.

[21] L. E. B. da Silva, N. M. Melton, and D. C. Wunsch, “Incremental cluster validity indices for online learning of hard partitions: extensions and comparative study,” IEEE Access, vol. 8, pp. 22025-22047, Jan. 2020.

[22] S.-J. Kim and G. B. Giannakis, “An online convex optimization approach to real-time energy pricing for demand response,” IEEE Transactions on Smart Grid, vol. 8, no. 6, pp. 2784-2793, Nov. 2017.

[23] N. Tucker, A. Moradipari, and M. Alizadeh, “Constrained Thompson sampling for real-time electricity pricing with grid reliability constraints,” IEEE Transactions on Smart Grid, vol. 11, no. 6, pp. 4971-4983, Nov. 2020.

[24] Y. Tao, J. Qiu, S. Lai et al., “Customer-centered pricing strategy based on privacy-preserving load disaggregation,” IEEE Transactions on Smart Grid, vol. 14, no. 5, pp. 3401-3412, Sept. 2023.

[25] L. Roald, F. Oldewurtel, B. Van Parys et al. (2015, Aug.). Security constrained optimal power flow with distributionally robust chance constraints. [Online]. Available: https://arxiv.org/abs/1508.06061

[26] J. M. Morales, A. J. Conejo, H. Madsen et al., Integrating Renewables in Electricity Markets: Operational Problems. New York: Springer, 2014.

[27] T. K. Moon and W. C. Stirling, Mathematical Methods and Algorithms for Signal Processing. Upper Saddle River: Prentice Hall, 2000.

[28] D. W. Scott, Multivariate Density Estimation: Theory, Practice, and Visualization, 2nd ed. Hoboken: John Wiley & Sons, 2015.

[29] S. Guha, R. Rastogi, and K. Shim, “Cure: an efficient clustering algorithm for large databases,” Information Systems, vol. 26, no. 1, pp. 35-58, Mar. 2001.

[30] M. Moshtaghi, J. C. Bezdek, S. M. Erfani et al., “Online cluster validity indices for performance monitoring of streaming data clustering,” International Journal of Intelligent Systems, vol. 34, no. 4, pp. 541-563, Apr. 2019.

[31] O. A. Ibrahim, J. M. Keller, and J. C. Bezdek, “Evaluating evolving structure in streaming data with modified Dunn’s indices,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 5, no. 2, pp. 262-273, Apr. 2021.