An iterative unsupervised non-intrusive load identification method and system

By employing an iterative, unsupervised, non-intrusive load identification method based on variable-length topic discovery and adaptive similarity threshold setting, the problem of low accuracy in identifying complex electrical appliances is solved, enabling autonomous discovery and accurate identification of electrical appliance usage patterns.

CN115932428B · Active · Publication Date: 2026-06-26 · TIANJIN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
TIANJIN UNIV
Filing Date
2022-09-14
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing unsupervised non-intrusive load identification methods are not very accurate when identifying complex electrical appliances. They cannot effectively detect power waveform feature samples in the intermediate process, and it is difficult to manually set similarity thresholds, resulting in poor identification performance.

Method used

A variable-length topic discovery method is adopted. By detecting key points in the load power time series, load power subsequences are extracted, and the DTW algorithm is used to match templates. Combined with adaptive similarity threshold settings, the autonomous discovery and recognition of electrical appliance power waveform topic sequences are realized.

Benefits of technology

It improves the accuracy of identifying complex electrical appliances, adapts to the diversity of different scenarios and appliance brands, and shows better performance, especially when identifying complex appliances such as air compressors and air conditioners.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses an iterative unsupervised non-intrusive load identification method and system. The method comprises the following steps: obtaining a load power time series; preprocessing the load power time series; detecting key points of the load power time series; extracting load power subsequences and matching them with templates; judging whether the number of unknown load power subsequences has accumulated to 30; performing topic discovery on the load power subsequences; judging whether the number of subsequences in each clique of an undirected graph is not less than 10; performing electrical appliance power waveform theme sequence pattern mining; and constructing the corresponding electrical appliance load imprint templates and updating the load imprint template library of a non-intrusive load monitoring system. The system comprises the corresponding processing modules. The application can autonomously discover simple and complex power consumption patterns of different types of electrical appliances in various unfamiliar scenes, and adapts to the diversity of power consumption patterns among appliances of the same type and to the differences among the power consumption patterns of a single appliance.

Description

Technical Field

[0001] This invention belongs to the field of non-intrusive load identification technology, and specifically relates to an iterative unsupervised non-intrusive load identification method and system.

Background Technology

[0002] Non-intrusive load identification, also known as non-intrusive load monitoring (NILM), can acquire information such as the operating status, power consumption, and cumulative electricity consumption of each appliance or each type of appliance within a load simply by collecting and analyzing the total load current and terminal voltage. It replaces the sensor network of intrusive load monitoring systems with software algorithms, offering advantages such as simplicity, economy, reliability, good data integrity, and ease of rapid deployment. It is expected to become a core technology for next-generation smart meters in the Advanced Metering Infrastructure (AMI) of smart grids, providing support for advanced smart power distribution functions.

[0003] Unsupervised, non-intrusive load identification methods can autonomously analyze raw total load data without real labels to identify the operating status of electrical appliances and decompose electricity consumption, showing potential for widespread application. To date, many studies have used Hidden Markov Models (HMMs) or their variants to model electrical appliances and total load. However, these methods implicitly require prior knowledge of the number of electrical appliances in the target scenario and the number of states for each appliance in order to achieve unsupervised HMM parameter learning based on the total load power. Therefore, strictly speaking, they are not truly unsupervised NILM methods.

[0004] Another classic unsupervised NILM method does not require prior knowledge of the load composition information of the scene. It first extracts electrical load feature samples through load event detection, then classifies the feature samples and load events through unsupervised clustering analysis, and finally performs event matching to build the state model of the electrical appliances, thus achieving load identification.

[0005] In theory, unsupervised NILM methods based on load event detection have advantages such as low algorithm complexity, high computational efficiency, and strong scalability. However, most existing studies only achieve good modeling and recognition results for simple electrical appliances whose transient power changes are step-like and whose power is constant during intermediate processes. For scenarios involving complex electrical appliances, the results are poor or the case is rarely discussed.

[0006] In real-world power consumption scenarios, the power of some complex electrical appliances varies significantly throughout their operation, including both transient and intermediate processes. Furthermore, the power changes during their start-up and shutdown may not satisfy the zero loop-sum constraint required by existing load event matching methods, resulting in low accuracy in unsupervised identification.

[0007] The reasons are twofold: firstly, existing methods do not include the intermediate process power waveform feature samples of complex electrical appliances in the detection scope, making it impossible to effectively achieve their autonomous modeling and recognition; secondly, load event detection often uses a sliding window to scan load power data for a fixed duration, which often makes it difficult to completely detect the transient process power waveform feature samples of complex electrical appliances in unfamiliar scenarios, which have diverse time scales and shapes.

[0008] In real life, users usually have certain habits in using electrical appliances, so the power waveform feature samples of the same appliance will appear repeatedly in the load power data. These similar power waveform feature samples correspond to the specific power consumption patterns of the appliance, and appear as recurring subsequences in the total load power time series.

[0009] In the field of time series analysis and data mining, topic discovery (also known as motif discovery) can detect recurring subsequence patterns of arbitrary duration and shape in long time series. Theoretically, topic discovery can be used to mine similar power waveform feature samples in the total load power time series, achieving power waveform topic detection. Based on this, an unsupervised, non-intrusive load identification method can be established, thereby improving the overall accuracy of unsupervised electrical appliance modeling and identification, especially for complex electrical appliances.

[0010] In recent years, there has been a lot of research on topic discovery both at home and abroad. Some methods require users to specify the topic length and can only discover topics of a preset length, while other methods support variable-length topic discovery, eliminating the need for algorithms to rely on predefined parameters such as sliding window length.

[0011] Calculating the distance or similarity between different subsequences is a crucial step in topic discovery. To ensure accuracy, Euclidean distance or DTW distance can be directly calculated between the original time series. To improve computation speed, the dimensionality reduction representation obtained from the original time series using SAX can be processed. Based on the calculated distance or similarity between subsequences, a set threshold needs to be used to determine whether different subsequences belong to the same topic. Existing research typically involves manually setting distance or similarity thresholds for different scenarios based on expert experience. However, the calculated distance or similarity results between subsequences are not intuitive, and manually determining appropriate thresholds is often difficult. Although a few studies have avoided setting similarity thresholds, they can only achieve topic discovery for fixed lengths, which has limitations in practical applications.

[0012] To improve the accuracy of unsupervised non-intrusive load identification, and especially its adaptability to complex electrical appliances, this invention applies the variable-length topic discovery method to unsupervised non-intrusive load identification and designs an iterative unsupervised non-intrusive load identification method based on variable-length topic discovery.

Summary of the Invention

[0013] This invention is proposed to address the problems existing in the prior art, and its purpose is to provide an iterative, unsupervised, non-intrusive load identification method and system.

[0014] The technical solution of this invention is: an iterative unsupervised non-intrusive load identification method, comprising the following steps:

[0015] i. Obtain the load power time series

[0016] Read the total active power signal of the user within a certain time period to form a load power time series;

[0017] ii. Preprocess the load power time series

[0018] Preprocess the load power time series to filter out power spikes in the load power time series;

[0019] iii. Detect key points of the load power time series

[0020] Detect key points in the load power time series and mark them in the load power time series;

[0021] iv. Extract the load power subsequence based on key points and match it with the template.

[0022] First, the load power subsequence is extracted based on key points, and then the load power subsequence is matched with the template.

[0023] Then, if the load power subsequence matches the template successfully, proceed to step x;

[0024] Finally, if the load power subsequence fails to match the template, proceed to step i.

[0025] v. Determine if the number of unknown load power subsequences has accumulated to 30.

[0026] If the number of unknown load power subsequences accumulates to 30, proceed to step ⅵ;

[0027] If the number of unknown load power subsequences is still less than 30, proceed to step i;

[0028] vi. Perform topic discovery for load power subsequences;

[0029] Discover topics in shorter subsequences;

[0030] Discover topics in longer subsequences;

[0031] vii. Determine whether the number of subsequences in each clique of an undirected graph is not less than 10;

[0032] If the number of subsequences in each clique of the undirected graph is not less than 10, it is considered a valid result and enters step ⅷ together with other valid results;

[0033] If the number of subsequences in each clique of the undirected graph is less than 10, proceed to step i and wait for subsequent data accumulation;

[0034] ⅷ. Perform pattern mining of electrical power waveform themes;

[0035] Electrical power waveform theme sequence pattern mining is performed to obtain power waveform theme sequence patterns;

[0036] ⅸ. Construct corresponding electrical load imprint templates and update the load imprint template library of the non-intrusive load monitoring system;

[0037] First, the mined power waveform theme sequence patterns are constructed into corresponding electrical load imprint templates;

[0038] Then, the load imprint template library of the non-intrusive load monitoring system is updated, so that the templates in the library can be matched against subsequently extracted unknown power waveform feature samples;

[0039] x. Achieve identification of electrical appliance operating status and decomposition of power consumption;

[0040] The template number in the load imprint template library is assigned to the successfully matched original power waveform feature sample, and the information is recorded. Based on the above information, the power consumption curve of a single appliance is reconstructed, thereby realizing the identification of the appliance's working status and the decomposition of its power consumption.
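
Step ii above does not name a specific filter for removing power spikes. As one possible illustration, a sliding median filter suppresses isolated spikes while keeping step changes sharp. The function name `median_filter`, the window length, and the sample values below are assumptions made for this sketch, not part of the described method.

```python
from statistics import median

def median_filter(power, window=3):
    """Replace each sample by the median of a centered window.

    Near the edges the window is simply truncated (an assumption of this sketch).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(power)
    return [median(power[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]

# Hypothetical total active power samples (W): one isolated spike, one real step.
raw = [100, 100, 101, 900, 100, 99, 100, 1500, 1500, 1501, 1500, 1500]
smooth = median_filter(raw, window=3)
```

In this toy series the isolated 900 W sample is removed, while the sustained step to about 1500 W is preserved.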

[0041] Furthermore, the key points detected in the load power time series in step iii include important extreme points and important trend turning points.

[0042] Furthermore, the definition of important extreme points in step iii is as follows:

[0043] For a given distance function $dist$ and compression ratio $R$, an element $a_i$ of the time series $a_1, a_2, \ldots, a_n$ is an important minimum (or maximum) point if there exist indices $i_l$ and $i_m$, where $i_l < i < i_m$, such that the following condition is satisfied:

[0044] $a_i$ is the minimum value of the sequence $a_{i_l}, \ldots, a_{i_m}$ (the maximum value, for an important maximum point), and $dist(a_i, a_{i_l}) \ge R$, $dist(a_i, a_{i_m}) \ge R$.

[0045] Furthermore, step iii defines the key trend turning points as follows:

[0046] For a given turning point error threshold $b$, let $a_j$ be a turning point of the time series $a_1, a_2, \ldots, a_n$ at position $j$, where $1 < j < n$; for any position $k$ with $k > j$, three slope values are defined:

[0047]

[0048]

[0049]

[0050] If $S_c(k) > \min(S_u(k'))$ or $S_c(k) < \max(S_l(k'))$, then $a_k$ is a new turning point of the time series at position $k$.

[0051] Furthermore, step iii involves detecting key points in the load power time series, the specific process of which is as follows:

[0052] First, detect all important extreme points in the load power time series.

[0053] Then, detect all turning points between every two adjacent important extreme points.

[0054] Finally, for each of these turning points, given an important turning point threshold $t$: if the current turning point $a_i$ and the turning points or important extreme points immediately before and after it, $a_{i-1}$ and $a_{i+1}$, satisfy the condition shown in equation (4), then $a_i$ is an important trend turning point.

[0055]

[0056] Furthermore, step iv extracts the load power subsequence based on key points and matches it with the template. The specific process is as follows:

[0057] First, the key points divide the load power time series into multiple load power subsequences of different lengths;

[0058] Then, the load power subsequence is extracted based on the key points;

[0059] Finally, the load power subsequence corresponds to the power waveform feature samples of the electrical appliances. The DTW algorithm is used to measure the similarity between the templates in the original load imprint template library and these samples, and template matching is performed.

[0060] Furthermore, step ⅵ involves topic discovery of the load power subsequence, the specific process of which is as follows:

[0061] A. Determine whether the number of data points contained in the load power subsequence is less than 10. If it is less than 10, proceed to step (B); otherwise, proceed to step (C).

[0062] B. Calculate the difference in active power change between the i-th and j-th subsequences, use it as the distance between the i-th and j-th subsequences, and store it in the i-th row and j-th column of the distance matrix D1;

[0063] Create an adjacency matrix L1 with the same dimensions as the distance matrix D1 and all initial elements set to 0, and set a similarity threshold.

[0064] Connect the m-th and n-th subsequences corresponding to the elements less than the threshold in the distance matrix D1, and set the element in the m-th row and n-th column of the adjacency matrix L1 to 1; proceed to step (D);

[0065] C. Calculate the DTW distance between the i-th and j-th subsequences and store it in the i-th row and j-th column of the distance matrix D2;

[0066] Create an adjacency matrix L2 with the same dimensions as the distance matrix D2 and all initial elements set to 0, and set a similarity threshold.

[0067] Connect the m-th and n-th subsequences corresponding to the elements in distance matrix D2 whose DTW distance is less than the threshold, and set the element in the m-th row and n-th column of adjacency matrix L2 to 1; proceed to step (D);

[0068] D. Subsequence theme discovery is achieved using graph theory. The connection results in adjacency matrices L1 and L2 are converted into an undirected graph. Each vertex in the undirected graph represents a subsequence, and an edge between two vertices represents the matching relationship between the corresponding two subsequences. The two subsequences corresponding to each position where the element of adjacency matrix L1 or L2 is 1 are connected in the undirected graph. Finally, subsequence clustering is achieved by finding cliques in the undirected graph, and each clique corresponds to a power waveform theme.

[0069] E. Adaptively set similarity threshold.

[0070] Using the minimum value of the upper triangular elements of the distance matrix as the lower limit and the maximum value as the upper limit, a preliminary threshold search range is determined. Within this range, a set of candidate thresholds is selected at low resolution and equal intervals, and the threshold-similarity curve is plotted to obtain its first peak and the two adjacent troughs. The candidate thresholds corresponding to the two troughs are used as the new lower limit $th_l$ and upper limit $th_u$, which determine the new threshold search range;

[0071] Within the new search range, select a new set of candidate thresholds with high resolution and equal intervals, and repeat the above process until the newly determined search range no longer changes from the previous one;

[0072] The candidate threshold corresponding to the peak of the threshold-similarity curve at this point is taken as the final similarity threshold, and the corresponding topic discovery result is the final result.

[0073] Furthermore, the similarity calculation process is as follows:

[0074] First, the distance matrix $D_q$ ($q \in \{1, 2\}$) is transformed into a similarity matrix $A_q$ of the same dimensions using the Gaussian similarity function, according to equation (5);

[0075] Then, assuming that the cluster of the $l$-th topic contains $n$ subsequences whose indices in the similarity matrix $A_q$ are $i_1, \ldots, i_s, \ldots, i_n$ (where $i_s$ is the index of the central subsequence), the intra-cluster similarity of the $l$-th topic is calculated according to equation (6);

[0076] Finally, after calculating the intra-cluster similarity for each topic, their average value is taken as the average intra-cluster similarity corresponding to the candidate threshold.

[0077]

[0078] where $s_q$ is the standard deviation of all upper triangular elements of the distance matrix $D_q$.

[0079]

[0080] A system implementing the iterative unsupervised non-intrusive load identification method includes a total load power information acquisition and preprocessing module. The output of the total load power information acquisition and preprocessing module is connected to the input of a key point detection and subsequence extraction module. The output of the key point detection and subsequence extraction module is connected to the input of a template matching and load identification module. The output of the template matching and load identification module is connected to the input of a load power waveform theme discovery module. The output of the load power waveform theme discovery module is connected to the input of an electrical power waveform theme sequence pattern mining module. The output of the electrical power waveform theme sequence pattern mining module is connected to the input of a load imprint template library update module.

[0081] The beneficial effects of this invention are as follows:

[0082] This invention applies the variable-length topic discovery method to unsupervised non-intrusive load identification, and establishes an iterative unsupervised non-intrusive load identification method based on variable-length topic discovery, which autonomously discovers the power consumption patterns of electrical appliances with diverse time scales and shapes during the complete operation of electrical appliances.

[0083] This invention can autonomously discover simple and complex power consumption patterns of different types of electrical appliances in various unfamiliar scenarios, adapting to the diversity of power consumption patterns of different brands of the same appliance and the differences in power consumption patterns of different appliances. It performs particularly well in identifying complex appliances such as air compressors and air conditioners.

Description of the Drawings

[0084] Figure 1 is a system schematic diagram of the present invention;

[0085] Figure 2 is a flowchart of the method of the present invention;

[0086] Figure 3 is a schematic diagram of important minimum points in this invention;

[0087] Figure 4 is a schematic diagram of the turning point in this invention;

[0088] Figure 5 is a schematic diagram illustrating the principle of detecting important turning points in this invention;

[0089] Figure 6 is a schematic diagram of the key point detection and load power subsequence extraction results in this invention;

[0090] Figure 7 is a schematic diagram of the adaptive threshold search based on the threshold-similarity curve in this invention;

[0091] Where: (a) The candidate threshold-average intra-cluster similarity curve obtained in the first iteration

[0092] (b) Candidate threshold-average intra-cluster similarity curve obtained from the second iteration

[0093] (c) Candidate threshold-average intra-cluster similarity curve obtained in the third iteration

[0094] (d) Candidate threshold-average intra-cluster similarity curve obtained in the fourth iteration;

[0095] Figure 8 is an example diagram of the power consumption patterns in the REDD dataset (house 4) in this invention;

[0096] Where: (a) kitchen socket, (b) washing machine-dryer, (c) electric stove, (d) dishwasher, (e) stove - power consumption mode 1, (f) stove - power consumption mode 2;

[0097] Figure 9 is an example diagram of the power consumption pattern of the air conditioner in Pecan Street (ID 9278) in this invention;

[0098] Figure 10 is an example diagram of the power consumption patterns of the air conditioner in private scenario 1;

[0099] Where: (a) air conditioner - power consumption mode 1, (b) air conditioner - power consumption mode 2;

[0100] Figure 11 is an example diagram of the power consumption patterns of the air conditioners in private scenario 2;

[0101] Where: (a) variable frequency air conditioner, (b) fixed frequency air conditioner.

Detailed Implementation

[0102] The present invention will now be described in detail with reference to the accompanying drawings and embodiments:

[0103] As shown in Figures 1 to 11, an iterative unsupervised non-intrusive load identification method includes the following steps:

[0104] i. Obtain the load power time series

[0105] Read the total active power signal of the user within a certain time period to form a load power time series;

[0106] ii. Preprocess the load power time series

[0107] Preprocess the load power time series to filter out power spikes in the load power time series;

[0108] iii. Detect key points of the load power time series

[0109] Detect key points in the load power time series and mark them in the load power time series;

[0110] iv. Extract the load power subsequence based on key points and match it with the template.

[0111] First, the load power subsequence is extracted based on key points, and then the load power subsequence is matched with the template.

[0112] Then, if the load power subsequence matches the template successfully, proceed to step x;

[0113] Finally, if the load power subsequence fails to match the template, proceed to step i.

[0114] v. Determine if the number of unknown load power subsequences has accumulated to 30.

[0115] If the number of unknown load power subsequences accumulates to 30, proceed to step ⅵ;

[0116] If the number of unknown load power subsequences is still less than 30, proceed to step i;

[0117] vi. Perform topic discovery for load power subsequences;

[0118] Discover topics in shorter subsequences;

[0119] Discover topics in longer subsequences;

[0120] vii. Determine whether the number of subsequences in each clique of an undirected graph is not less than 10;

[0121] If the number of subsequences in each clique of the undirected graph is not less than 10, it is considered a valid result and enters step ⅷ together with other valid results;

[0122] If the number of subsequences in each clique of the undirected graph is less than 10, proceed to step i and wait for subsequent data accumulation;

[0123] ⅷ. Perform pattern mining of electrical power waveform themes;

[0124] Electrical power waveform theme sequence pattern mining is performed to obtain power waveform theme sequence patterns;

[0125] ⅸ. Construct corresponding electrical load imprint templates and update the load imprint template library of the non-intrusive load monitoring system;

[0126] First, the mined power waveform theme sequence patterns are constructed into corresponding electrical load imprint templates;

[0127] Then, the load imprint template library of the non-intrusive load monitoring system is updated, so that the templates in the library can be matched against subsequently extracted unknown power waveform feature samples;

[0128] x. Achieve identification of electrical appliance operating status and decomposition of power consumption;

[0129] The template number in the load imprint template library is assigned to the successfully matched original power waveform feature sample, and the information is recorded. Based on the above information, the power consumption curve of a single appliance is reconstructed, thereby realizing the identification of the appliance's working status and the decomposition of its power consumption.
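
The control flow of steps iv to ix can be sketched roughly as follows. This is only an illustration under stated assumptions: `process_batch`, `match`, and `discover` are hypothetical names, and the pattern mining and template construction of steps ⅷ and ⅸ are replaced by a placeholder that simply takes the first member of each valid clique as the new template. The counts 30 and 10 are taken from steps v and vii.

```python
MIN_UNKNOWN = 30  # step v: unknown subsequences needed before topic discovery
MIN_CLIQUE = 10   # step vii: minimum clique size for a valid result

def process_batch(subsequences, templates, unknown, match, discover):
    """Run one pass of steps iv to ix on newly extracted subsequences.

    match(sub, templates)  -> template index or None      (hypothetical helper)
    discover(unknown)      -> list of cliques, each a list of
                              indices into `unknown`       (hypothetical helper)
    Returns the (subsequence, template index) pairs identified in step x.
    """
    identified = []
    for sub in subsequences:
        idx = match(sub, templates)               # step iv
        if idx is not None:
            identified.append((sub, idx))         # step x
        else:
            unknown.append(sub)                   # kept until step v fires
    if len(unknown) >= MIN_UNKNOWN:               # step v
        cliques = discover(unknown)               # step vi
        valid = [c for c in cliques if len(c) >= MIN_CLIQUE]  # step vii
        used = set()
        for clique in valid:
            # Placeholder for steps viii and ix (assumption): the first member
            # of the clique stands in for the mined template.
            templates.append(unknown[clique[0]])
            used.update(clique)
        # Subsequences in too-small cliques wait for more data to accumulate.
        unknown[:] = [s for i, s in enumerate(unknown) if i not in used]
    return identified
```

The lists `templates` and `unknown` are updated in place, so repeated calls mimic the iterative growth of the template library as more data arrives.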

[0130] Furthermore, the key points detected in the load power time series in step iii include important extreme points and important trend turning points.

[0131] Furthermore, the definition of important extreme points in step iii is as follows:

[0132] For a given distance function $dist$ and compression ratio $R$, an element $a_i$ of the time series $a_1, a_2, \ldots, a_n$ is an important minimum (or maximum) point if there exist indices $i_l$ and $i_m$, where $i_l < i < i_m$, such that the following condition is satisfied:

[0133] $a_i$ is the minimum value of the sequence $a_{i_l}, \ldots, a_{i_m}$ (the maximum value, for an important maximum point), and $dist(a_i, a_{i_l}) \ge R$, $dist(a_i, a_{i_m}) \ge R$.
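
The important extreme point definition above can be illustrated with a short sketch. It assumes the absolute difference as the distance function dist and treats R as a threshold on that distance. The name `important_extrema` is hypothetical, and on a flat plateau only the leftmost sample is reported, which is a tie-breaking choice of this sketch rather than something the text specifies.

```python
def important_extrema(a, R, dist=lambda x, y: abs(x - y)):
    """Return (minima, maxima) index lists for the series a.

    A point qualifies when, walking left and walking right, a sample at
    distance >= R is reached before any sample that beats the candidate.
    """
    def reaches(i, step, sign, strict):
        j = i + step
        while 0 <= j < len(a):
            diff = sign * (a[j] - a[i])  # negative: a[j] beats the candidate
            if diff < 0 or (strict and diff == 0):
                return False
            if dist(a[i], a[j]) >= R:
                return True
            j += step
        return False  # ran off the series without finding a distant sample

    idx = range(len(a))
    minima = [i for i in idx
              if reaches(i, -1, +1, True) and reaches(i, +1, +1, False)]
    maxima = [i for i in idx
              if reaches(i, -1, -1, True) and reaches(i, +1, -1, False)]
    return minima, maxima

# Hypothetical power samples (W): small ripples plus two large swings.
series = [10, 12, 11, 50, 52, 51, 10, 9, 10, 60]
minima, maxima = important_extrema(series, R=30)
```

With R = 30 the small ripples around 10 W and 50 W are ignored, and only the trough at index 7 and the crest at index 4 are kept.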

[0134] Furthermore, step iii defines the key trend turning points as follows:

[0135] For a given turning point error threshold $b$, let $a_j$ be a turning point of the time series $a_1, a_2, \ldots, a_n$ at position $j$, where $1 < j < n$; for any position $k$ with $k > j$, three slope values are defined:

[0136]

[0137]

[0138]

[0139] If $S_c(k) > \min(S_u(k'))$ or $S_c(k) < \max(S_l(k'))$, then $a_k$ is a new turning point of the time series at position $k$.

[0140] Furthermore, step iii involves detecting key points in the load power time series, the specific process of which is as follows:

[0141] First, detect all important extreme points in the load power time series.

[0142] Then, detect all turning points between every two adjacent important extreme points.

[0143] Finally, for each of these turning points, given an important turning point threshold $t$: if the current turning point $a_i$ and the turning points or important extreme points immediately before and after it, $a_{i-1}$ and $a_{i+1}$, satisfy the condition shown in equation (4), then $a_i$ is an important trend turning point.

[0144]

[0145] Furthermore, step iv extracts the load power subsequence based on key points and matches it with the template. The specific process is as follows:

[0146] First, the key points divide the load power time series into multiple load power subsequences of different lengths;

[0147] Then, the load power subsequence is extracted based on the key points;

[0148] Finally, the load power subsequence corresponds to the power waveform feature samples of the electrical appliances. The DTW algorithm is used to measure the similarity between the templates in the original load imprint template library and these samples, and template matching is performed.
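
The extraction and matching just described can be sketched as follows. The sketch assumes that each pair of consecutive key points bounds one subsequence, that DTW uses the absolute difference as local cost, and that a match is accepted when the smallest DTW distance does not exceed a limit `max_dist`. The function names and `max_dist` are hypothetical and are not specified in the text.

```python
def split_at_key_points(series, key_points):
    """Cut the series into variable-length pieces between consecutive key points."""
    pts = sorted(set(key_points))
    return [series[s:e + 1] for s, e in zip(pts, pts[1:])]

def dtw_distance(x, y):
    """Classic dynamic-programming DTW with absolute-difference local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(y)       # row for the empty prefix of x
    for xi in x:
        cur = [inf]
        for j, yj in enumerate(y, start=1):
            cur.append(abs(xi - yj) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]

def match_template(sub, templates, max_dist):
    """Return the index of the closest template, or None if none is close enough."""
    dists = [dtw_distance(sub, t) for t in templates]
    if not dists or min(dists) > max_dist:
        return None
    return dists.index(min(dists))

# Hypothetical data: one appliance switching on and off.
series = [0, 0, 100, 100, 100, 0, 0]
subs = split_at_key_points(series, key_points=[1, 4, 6])
templates = [[0, 100, 100], [100, 0]]
labels = [match_template(s, templates, max_dist=10) for s in subs]
```

Because DTW warps the time axis, the four-sample rising edge still matches the three-sample rising template with zero distance, which is the property that makes it suitable for subsequences of different lengths.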

[0149] Furthermore, step ⅵ involves topic discovery of the load power subsequence, the specific process of which is as follows:

[0150] A. Determine whether the number of data points contained in the load power subsequence is less than 10. If it is less than 10, proceed to step (B); otherwise, proceed to step (C).

[0151] B. Calculate the difference in active power change between the i-th and j-th subsequences, use it as the distance between the i-th and j-th subsequences, and store it in the i-th row and j-th column of the distance matrix D1;

[0152] Create an adjacency matrix L1 with the same dimensions as the distance matrix D1 and all initial elements set to 0, and set a similarity threshold.

[0153] Connect the m-th and n-th subsequences corresponding to the elements less than the threshold in the distance matrix D1, and set the element in the m-th row and n-th column of the adjacency matrix L1 to 1; proceed to step (D);

[0154] C. Calculate the DTW distance between the i-th and j-th subsequences and store it in the i-th row and j-th column of the distance matrix D2;

[0155] Create an adjacency matrix L2 with the same dimensions as the distance matrix D2 and all initial elements set to 0, and set a similarity threshold.

[0156] Connect the m-th and n-th subsequences corresponding to the elements in distance matrix D2 whose DTW distance is less than the threshold, and set the element in the m-th row and n-th column of adjacency matrix L2 to 1; proceed to step (D);

[0157] D. Subsequence theme discovery is achieved using graph theory. The connection results in adjacency matrices L1 and L2 are converted into an undirected graph. Each vertex in the undirected graph represents a subsequence, and an edge between two vertices represents the matching relationship between the corresponding two subsequences. The two subsequences corresponding to each position where the element of adjacency matrix L1 or L2 is 1 are connected in the undirected graph. Finally, subsequence clustering is achieved by finding cliques in the undirected graph, and each clique corresponds to a power waveform theme.

[0158] E. Adaptively set similarity threshold.

[0159] Using the minimum value of the upper triangular elements of the distance matrix as the lower limit and the maximum value as the upper limit, a preliminary threshold search range is determined. Within this range, a set of candidate thresholds is selected at low resolution and equal intervals, and the threshold-similarity curve is plotted to obtain its first peak and the two adjacent troughs. The candidate thresholds corresponding to the two troughs are used as the new lower limit $th_l$ and upper limit $th_u$, which determine the new threshold search range;

[0160] Within the new search range, select a new set of candidate thresholds with high resolution and equal intervals, and repeat the above process until the newly determined search range no longer changes from the previous one;

[0161] The candidate threshold corresponding to the peak of the threshold-similarity curve at this point is taken as the final similarity threshold, and the corresponding topic discovery result is the final result.
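The coarse-to-fine search of step E can be sketched as follows, under several stated assumptions. The callable `score` stands in for the average intra-cluster similarity obtained at a candidate threshold; the same number of equally spaced candidates is reused on each narrower range (which raises the resolution); the first peak is taken as the first interior local maximum; and the adjacent troughs are found by walking downhill from the peak. All names and the `two_peak` test curve are hypothetical.

```python
import math


def refine_threshold(score, lower, upper, n_candidates=11, max_iter=20):
    """Iteratively narrow [lower, upper] around the first peak of score."""
    best = lower
    for _ in range(max_iter):
        step = (upper - lower) / (n_candidates - 1)
        cands = [lower + k * step for k in range(n_candidates)]
        vals = [score(c) for c in cands]
        # First peak: first interior local maximum, else the global maximum.
        peak = next(
            (k for k in range(1, n_candidates - 1)
             if vals[k - 1] < vals[k] >= vals[k + 1]),
            None,
        )
        if peak is None:
            peak = max(range(n_candidates), key=vals.__getitem__)
        # Adjacent troughs: walk downhill to the left and right of the peak.
        left = peak
        while left > 0 and vals[left - 1] <= vals[left]:
            left -= 1
        right = peak
        while right < n_candidates - 1 and vals[right + 1] <= vals[right]:
            right += 1
        best = cands[peak]
        new_range = (cands[left], cands[right])
        if new_range == (lower, upper):
            break  # the search range no longer changes
        lower, upper = new_range
    return best


def two_peak(t):
    """Hypothetical threshold-similarity curve with peaks near 2 and 7."""
    return math.exp(-(t - 2) ** 2) + 2 * math.exp(-(t - 7) ** 2 / 0.5)


print(refine_threshold(two_peak, 0.0, 10.0))  # 2.0
```

In this example the first pass narrows the range from [0, 10] to [0, 5], and the second pass leaves it unchanged, so the search stops at the first peak rather than at the higher second peak.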

[0162] Furthermore, the similarity calculation process is as follows:

[0163] First, the distance matrix D_q (q∈{1,2}) is transformed into a similarity matrix A_q of the same dimension using the Gaussian similarity function according to equation (5);

[0164] Then, assuming that the cluster of the l-th topic has n subsequences whose indices in the similarity matrix A_q are i_1, ..., i_s, ..., i_n (where i_s is the index of the central subsequence), the intra-cluster similarity of the l-th topic is calculated according to equation (6);

[0165] Finally, after calculating the intra-cluster similarity for each topic, their average value is taken as the average intra-cluster similarity corresponding to the candidate threshold.

[0166]

[0167] where s_q is the standard deviation of all upper triangular elements of the distance matrix D_q.

[0168]
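Equations (5) and (6) are not reproduced in this text, so the following Python sketch shows only one plausible reading and should not be taken as the patented formulas. It assumes the common Gaussian kernel exp(-d²/(2·s_q²)), takes the central subsequence to be the cluster member with the largest total similarity to the other members, and defines the intra-cluster similarity as the mean similarity between the central subsequence and the remaining members. The value returned for single-member clusters and for s_q = 0 is likewise an assumption.

```python
import math
from statistics import pstdev


def gaussian_similarity(D):
    """Map a distance matrix to similarities with an assumed Gaussian kernel."""
    n = len(D)
    upper = [D[i][j] for i in range(n) for j in range(i + 1, n)]
    s = pstdev(upper)  # standard deviation of the upper triangular elements
    if s == 0:
        # Assumed fallback: identical distances give no basis for scaling.
        return [[1.0] * n for _ in range(n)]
    return [[math.exp(-D[i][j] ** 2 / (2 * s ** 2)) for j in range(n)]
            for i in range(n)]


def intra_cluster_similarity(A, members):
    """Mean similarity between the assumed central member and the others."""
    if len(members) < 2:
        return 1.0  # assumed convention for single-member clusters
    centre = max(members, key=lambda c: sum(A[c][m] for m in members))
    others = [m for m in members if m != centre]
    return sum(A[centre][m] for m in others) / len(others)


def average_intra_cluster_similarity(A, clusters):
    """Average of the intra-cluster similarities over all topics."""
    return sum(intra_cluster_similarity(A, c) for c in clusters) / len(clusters)


# Hypothetical distance matrix: subsequences 0 and 1 are close, 2 is far away.
D = [[0, 1, 5], [1, 0, 5], [5, 5, 0]]
A = gaussian_similarity(D)
print(average_intra_cluster_similarity(A, [[0, 1], [2]]))
```

The value returned by `average_intra_cluster_similarity` is what would be plotted against each candidate threshold to form the threshold-similarity curve.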

[0169] Specifically, step ⅷ involves mining the theme sequence pattern of electrical power waveforms. The specific process is as follows:

[0170] a. Based on the aforementioned key point detection results, identify all balancing windows in the load power time series.

[0171] The balance window refers to an active power time series P_1, P_2, ..., P_T that satisfies the following conditions:

[0172] ① P_1 and P_T are both key points, and P_2 − P_1 > 0 ∧ P_T − P_{T−1} < 0 ∧ |P_1 − P_T| ≤ d;

[0173] ② every intermediate point P_t satisfies P_t > P_1 and P_t > P_T.

[0174] In a load power time series, there is usually more than one balancing window that meets the above conditions, and there may be nested windows. After finding all the balancing windows in the time series, all the nested small balancing windows need to be removed from the large balancing window until all the balancing windows are no longer nested.
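The balance window conditions and the removal of nested windows can be sketched in Python as follows. The function names are hypothetical, condition ② is assumed to apply to every point strictly between the two endpoints, and the nesting rule is read as keeping only the outermost windows; the text above does not fix these details further.

```python
def find_balance_windows(P, key_points, d):
    """Return (start, end) index pairs that satisfy conditions 1 and 2."""
    keys = sorted(key_points)
    windows = []
    for a, s in enumerate(keys):
        for e in keys[a + 1:]:
            if e - s < 2:
                continue  # a window needs at least one interior point
            rises = P[s + 1] - P[s] > 0
            falls = P[e] - P[e - 1] < 0
            closes = abs(P[s] - P[e]) <= d
            above = all(P[t] > P[s] and P[t] > P[e] for t in range(s + 1, e))
            if rises and falls and closes and above:
                windows.append((s, e))
    return windows


def drop_nested(windows):
    """Keep only windows that are not contained in another window (assumed reading)."""
    return [
        w for w in windows
        if not any(o != w and o[0] <= w[0] and w[1] <= o[1] for o in windows)
    ]


# Hypothetical series: a 100 W appliance on from index 1, a 200 W one inside it.
P = [10, 110, 110, 310, 310, 110, 110, 10, 10]
keys = [0, 2, 5, 7]
found = find_balance_windows(P, keys, d=5)
print(found)               # [(0, 7), (2, 5)]
print(drop_nested(found))  # [(0, 7)]
```

Here the inner window (2, 5) lies inside (0, 7), so only the outer window remains and both appliance cycles end up in the same theme sequence record.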

[0175] b. Within the balance window identified in step a, the unknown power waveform themes discovered in step E are grouped into a sequence record of electrical power waveform themes to be mined. All power waveform theme sequence records within the balance window together constitute the electrical power waveform theme sequence database.

[0176] c. For the established electrical power waveform theme sequence database, a frequent sequence pattern mining algorithm is used to realize the mining of frequent power waveform theme sequence patterns.
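As a deliberately simplified illustration of frequent sequence pattern mining over the theme sequence database, the sketch below counts contiguous theme sub-sequences and keeps those whose support reaches a minimum. The description does not name a specific algorithm; general sequential pattern miners (for example PrefixSpan) also allow gaps between items, which this sketch does not. The function name and the `min_support` and `min_len` parameters are assumptions.

```python
from collections import Counter


def frequent_contiguous_patterns(records, min_support, min_len=2):
    """Count contiguous sub-sequences once per record and keep the frequent ones."""
    counts = Counter()
    for rec in records:
        seen = set()
        for i in range(len(rec)):
            for j in range(i + min_len, len(rec) + 1):
                seen.add(tuple(rec[i:j]))
        counts.update(seen)  # each pattern counts at most once per record
    return {p: c for p, c in counts.items() if c >= min_support}


# Hypothetical database: one theme sequence record per balance window.
database = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"]]
print(frequent_contiguous_patterns(database, min_support=3))
# {('A', 'B'): 3}
```

Lowering `min_support` to 2 would additionally return ("B", "C") and ("A", "B", "C"), which is why the subsequent filtering rules are needed to pick out the patterns that represent a complete appliance cycle.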

[0177] Based on this, and taking into account the characteristics of electrical appliance power consumption patterns and the usage patterns of electrical appliances, three rules are designed to filter out frequent sequence patterns that correspond to the complete power consumption patterns of electrical appliances.

[0178] Suppose the frequent power waveform theme sequence pattern corresponding to the complete power consumption pattern of a certain appliance is E = [m_1, m_2, ..., m_N]^T, where N represents the total number of power waveform themes in this frequent sequence pattern. The rules are as follows:

[0179] First, the complete power consumption pattern of an appliance must begin with a subsequence where the difference in active power at the start and end times is positive, and end with a subsequence where the difference in power is negative, thus satisfying equation (7):

[0180]

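[0181] where the first quantity in equation (7) represents the difference in active power between the start and end times of the central subsequence of the cluster (cluster number m_1) corresponding to the first power waveform theme in a frequent power waveform theme sequence pattern; the second quantity is defined similarly for the last power waveform theme (cluster number m_N).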

[0182] Then, in a certain power waveform theme sequence record, if there is one and only one frequent sequence pattern, then that pattern must correspond to the complete power consumption pattern or working cycle of a certain appliance.

[0183] Finally, the complete power consumption pattern of the electrical appliance must meet the zero loop-sum constraint. Considering that the power of the appliance fluctuates during operation due to its own operating characteristics and random noise, this invention sets the constraint proportional coefficient a∈(0,1), as shown in equation (8).

[0184]
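The first and third rules can be sketched as simple predicates. Equations (7) and (8) are not reproduced in this text, so the exact forms below are assumptions: the first rule is taken as "power change of the first theme is positive and that of the last theme is negative", and the third rule is taken as "the net power change over the pattern is at most a times the largest single change". The dictionary `delta_p`, which maps each theme to the power change of its central subsequence, and both function names are hypothetical.

```python
def starts_up_ends_down(pattern, delta_p):
    """Rule 1 (assumed form): first theme raises power, last theme lowers it."""
    return delta_p[pattern[0]] > 0 and delta_p[pattern[-1]] < 0


def satisfies_zero_loop_sum(pattern, delta_p, a=0.1):
    """Rule 3 (assumed form): power changes over a full cycle nearly cancel."""
    steps = [delta_p[m] for m in pattern]
    return abs(sum(steps)) <= a * max(abs(s) for s in steps)


# Hypothetical power changes (in watts) of three theme centres.
delta_p = {"A": 1000, "B": -400, "C": -580}
print(starts_up_ends_down(("A", "B", "C"), delta_p))      # True
print(satisfies_zero_loop_sum(("A", "B", "C"), delta_p))  # True
print(satisfies_zero_loop_sum(("A", "B"), delta_p))       # False
```

The pattern A, B, C returns almost to its starting power (net change 20 W against a 100 W tolerance), whereas A, B leaves 600 W unaccounted for and would be rejected as an incomplete cycle.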

[0185] Specifically, the power waveform themes in step D correspond to the power consumption patterns of different electrical appliances or to different power consumption patterns of the same electrical appliance. These patterns are power change patterns with diverse time scales and shapes that occur during the complete operation of the appliance. Detecting them captures the power change patterns over the complete operation of the appliance and effectively addresses the multi-time-scale characteristics and complex power waveform shapes of electrical appliance power waveform feature samples.

[0186] Specifically, the complete operation of an electrical appliance includes transient processes and intermediate processes.

[0187] Specifically, the similarity threshold adaptive setting method used for topic discovery in step E can automatically set and optimize the time series similarity threshold for the target scenario without human intervention, effectively achieving unsupervised topic discovery.

[0188] Specifically, in step ⅷ, within the multiple balance windows determined from the key points of the power time series, the frequent sequence pattern mining method is used to mine frequent sequence patterns composed of different power waveform themes, thereby determining the complete power consumption pattern or working cycle of the electrical appliance. This allows complex electrical appliances, such as multi-state appliances and appliances with continuously varying states, to be modeled and identified accurately.

[0189] This invention is an incremental learning process that gradually increases the number of identifiable unfamiliar electrical appliances through continuous learning, thereby achieving iterative electrical appliance modeling, load identification, and electricity consumption decomposition, which helps to improve overall efficiency.

[0190] A system for an iterative, unsupervised, non-intrusive load identification method includes a load total power information acquisition and preprocessing module. The output of the load total power information acquisition and preprocessing module is connected to the input of a key point detection and subsequence extraction module. The output of the key point detection and subsequence extraction module is connected to the input of a template matching and load identification module. The output of the template matching and load identification module is connected to the input of a load power waveform theme discovery module. The output of the load power waveform theme discovery module is connected to the input of an electrical power waveform theme sequence pattern mining module. The output of the electrical power waveform theme sequence pattern mining module is connected to the input of a load imprint template library update module.

[0191] Specifically, the load total power information acquisition and preprocessing module, according to the system settings, acquires the active power signal at the monitoring point and performs median filtering data preprocessing on the acquired active power signal.

[0192] Specifically, the key point detection and subsequence extraction module detects key points in the load power time series, marks the key points in the time series, and divides the time series into multiple subsequences of different lengths based on the key points.

[0193] Specifically, the template matching and load identification module matches the templates in the load imprint template library with the power waveform feature samples, assigns the template number from the template library to the successfully matched power waveform feature samples, and records their information, including the index and power of each point in the time series. Based on this information, the power consumption curve of a single appliance is reconstructed, thereby realizing the identification of the appliance's working status and the decomposition of its power consumption.

[0194] Specifically, the load power waveform theme discovery module divides subsequences into subsequences with more than or equal to 10 data points and subsequences with less than 10 data points based on the length of the subsequence duration. Different distance calculation methods are used to improve the accuracy and computational efficiency of electrical power consumption pattern mining. At the same time, an adaptive similarity threshold setting method is adopted to achieve unsupervised load power waveform theme discovery.

[0195] Specifically, the electrical appliance power waveform theme sequence pattern mining module uses all unknown power waveform themes within the load power time series balancing window to form a record of electrical appliance power waveform theme sequences to be mined. All power waveform theme sequence records within the balancing window together constitute the electrical appliance power waveform theme sequence database, enabling the mining of frequent power waveform theme sequence patterns. Based on this, three rules are designed to filter out frequent sequence patterns corresponding to the complete power consumption patterns of electrical appliances.

[0196] Specifically, the load imprint template library update module uses the mined power waveform theme sequence pattern as the corresponding electrical load imprint template and updates the load imprint template library of the non-intrusive load monitoring system.

[0197] Specifically, the data storage module saves the data analysis and processing results of other functional modules and provides data access interfaces for other functional modules.

[0198] Specifically, the external interaction module enables the system, based on the iterative unsupervised non-intrusive load identification method using variable-length topic discovery, to exchange necessary data and information with the outside world. This includes, but is not limited to, displaying and outputting the results of load power time series key point detection, load power subsequence extraction, template matching and load identification, load power waveform topic discovery, electrical power waveform topic sequence pattern mining, and load imprint template library update.

[0199] Another embodiment

[0200] A system for implementing the iterative unsupervised non-intrusive load identification method based on variable-length topic discovery of this invention is shown in Figure 1. The system mainly consists of eight functional modules, and the functions of each module are as follows:

[0201] Specifically, the total load power information acquisition and preprocessing module is used to acquire active power signals at monitoring points according to system settings. During the transient process of starting appliances such as refrigerators and air conditioners, active power usually exhibits spike waveforms. However, due to the randomness of the occurrence time of transient disturbances in the voltage cycle, the amplitude consistency of power spike waveforms is poor. In order to improve the accuracy of appliance startup transient power mode mining, it is necessary to preprocess the original active power time series and use the median filtering method to filter out these power spikes, thereby improving the repeatability of startup transient power waveforms.

[0202] Specifically, the key point detection and subsequence extraction module is used to detect key points in the load power time series. A diagram of an important minimum point is shown in Figure 3. Its physical meaning is as follows: a_i is the minimum value on a segment a_{il}, a_{il+1}, ..., a_{im} of the time series, and the distances from the two endpoints a_{il} and a_{im} of the segment to a_i are both greater than R, meaning that the values at both endpoints are much larger than a_i; therefore, a_i is an important local minimum point in this segment. A diagram of the inflection point is shown in Figure 4, in which the inflection point is marked with ○ and the three slope values S_c(k), S_u(k), and S_l(k) are marked with dashed lines.

[0203] First, all important extreme points in the load power time series are detected. Then, all inflection points between any two adjacent important extreme points are detected. For each inflection point, if its distance to the line connecting the inflection points or important extreme points immediately before and after it is greater than t, that inflection point is considered an important inflection point, as shown in Figure 5. After all key points are found, they are marked in the time series, as shown in Figure 6, where square dots □ represent important extreme points and black dots ● represent important turning points. The different load power subsequences extracted based on the key points are labeled in the figure; subsequences marked with the same symbol belong to the same power waveform theme.

[0204] Specifically, the template matching and load identification module is used to match templates in the load imprint template library with power waveform feature samples. Considering that there is often a time offset and local scale scaling between the power waveform templates in the load imprint template library and the power waveform feature samples in the original power time series, template matching is performed based on the DTW algorithm. Then, the template number in the template library is assigned to the successfully matched power waveform feature sample, and its information is recorded. The information includes the index and power of each point in the time series. Based on the information, the power consumption curve of a single appliance is reconstructed, thereby realizing the identification of the appliance's working status and the decomposition of its power consumption.

[0205] Specifically, the load power waveform theme discovery module divides subsequences into subsequences with 10 or more data points and subsequences with fewer than 10 data points based on the duration of the subsequence. Different distance calculation methods are used to improve the accuracy and computational efficiency of appliance power consumption pattern mining. Simultaneously, an adaptive similarity threshold setting method is employed to achieve unsupervised load power waveform theme discovery.

[0206] The adaptive setting of the similarity threshold is illustrated in Figure 7. Panel (a) is the candidate threshold versus average intra-cluster similarity curve obtained within the initially determined upper and lower threshold range, where the first peak of the curve is marked with a square dot □ and the two adjacent troughs are marked with ○. Panels (b) and (c) are the threshold-similarity curves obtained within the threshold search ranges determined in the second and third iterations. Panel (d) is the threshold-similarity curve obtained within the threshold search range determined in the fourth iteration, where the candidate threshold corresponding to the point marked with Δ is the finally determined similarity threshold.

[0207] As shown in Figure 8, for the REDD (House 4) dataset, (a)-(d) are the power waveform theme detection results for simple appliances: kitchen sockets (themes A, B), washer-dryers (themes C, D), electric stoves (themes E, F, G), and dishwashers (themes H, I), respectively; (e) and (f) show two power consumption modes of a complex appliance, the stove (mode 1: themes J-P; mode 2: themes Q-T). Figure 9 shows three power waveform themes (A, B, and C) of a complex appliance (an air conditioner) mined from the Pecan Street (ID 9278) dataset. The power waveform theme detection results on the private dataset are shown in Figure 10 (Scenario 1) and Figure 11 (Scenario 2): the method discovered two power consumption modes of a complex appliance, the air conditioner, in Scenario 1 (mode 1: themes A-G; mode 2: themes H, B, C, I, G), as well as a variable-frequency air conditioner (themes A-C) and a fixed-frequency air conditioner (themes D-G) in Scenario 2.

[0208] Specifically, the appliance power waveform theme sequence pattern mining module, following the conventions of frequent sequence pattern mining, uses all unknown power waveform themes within the load power time series balancing window to form a record of appliance power waveform theme sequences to be mined. All power waveform theme sequence records within the balancing window together constitute the appliance power waveform theme sequence database, enabling frequent power waveform theme sequence pattern mining. Based on this, three rules are designed to filter out frequent sequence patterns corresponding to the complete power consumption patterns of appliances.

[0209] Specifically, the load imprint template library update module uses the mined power waveform theme sequence pattern as the corresponding electrical load imprint template and updates the load imprint template library of the non-intrusive load monitoring system. For electrical appliances with existing templates, online identification of their power waveform samples can be achieved based on template matching to determine the specific working status of the corresponding electrical appliance and estimate its power consumption.

[0210] Specifically, the data storage module, as needed, saves the data analysis and processing results from other functional modules and provides data access interfaces for those modules; specifically as follows:

[0211] The processing results of the load total power information acquisition and preprocessing module, the key point detection and subsequence extraction module, the template matching and load identification module, the load power waveform theme discovery module, the electrical power waveform theme sequence pattern mining module, and the load imprint template library update module, as well as some output information of the external interaction function module, can be stored in the data information storage module. Moreover, the key point detection and subsequence extraction module, the template matching and load identification module, the load power waveform theme discovery module, the electrical power waveform theme sequence pattern mining module, the load imprint template library update module, and the external interaction function module can access the data information storage module to obtain the required data in order to implement the defined functions.

[0212] Specifically, the external interaction module enables the system, based on the iterative unsupervised non-intrusive load identification method using variable-length topic discovery, to exchange necessary data and information with the outside world. This includes, but is not limited to, displaying and outputting the results of load power time series key point detection, load power subsequence extraction, template matching and load identification, load power waveform topic discovery, electrical power waveform topic sequence pattern mining, and load imprint template library update.

[0213] As shown in Figure 2, the steps for implementing iterative unsupervised non-intrusive load identification based on variable-length topic discovery using the above system are as follows:

[0214] Step i: Read the user's total active power signal for one day to form a load power time series;

[0215] Step ii: During the start-up transient process of appliances such as refrigerators and air conditioners, active power usually exhibits spike waveforms. However, because the time at which the transient disturbance occurs within the voltage cycle is random, the amplitude consistency of the power spike waveform is poor.

[0216] To improve the accuracy of mining appliance start-up transient power patterns, this step preprocesses the load power time series using a median filtering method to remove the power spikes from the load power time series.
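A minimal sketch of the median filtering in step ii is given below. The window length is not stated in the description, so the default of 5 samples is an assumption, as are the function name and the choice to shrink the window at the ends of the series.

```python
from statistics import median


def median_filter(series, window=5):
    """Replace each sample by the median of the samples in a centred window."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    n = len(series)
    # Near the ends the window is truncated rather than padded (assumption).
    return [median(series[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


# Hypothetical start-up: a one-sample 900 W spike, then a step to 300 W.
raw = [100, 100, 900, 100, 100, 300, 300, 300]
print(median_filter(raw, window=3))
```

With a 3-sample window the isolated spike is removed while the later step from 100 W to 300 W is preserved, which is the behaviour the preprocessing step relies on.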

[0217] Step iii: Detect key points in the load power time series and mark them in the load power time series. The key points include important extreme points and important trend turning points. The concepts and detection methods for important extreme points and important trend turning points are as follows:

[0218] For important extreme points:

[0219] For a given distance function dist and compression ratio R, an element a_i of the time series a_1, a_2, ..., a_n is an important minimum point if there exist indices il and im, with il < i < im, such that the following condition is satisfied (important maximum points are defined analogously):

[0220] a_i is the minimum value in the sequence a_{il}, ..., a_{im}, and dist(a_i, a_{il}) ≥ R, dist(a_i, a_{im}) ≥ R;
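The definition of an important extreme point can be checked directly with a brute-force scan, sketched below. The absolute difference is assumed as the default distance function, maxima are handled by mirroring the comparison, and the function name and `kind` parameter are hypothetical. On a flat bottom or top, every sample of the plateau would be reported.

```python
def important_extrema(a, R, kind="min", dist=lambda x, y: abs(x - y)):
    """Return indices of important minima (or maxima) of a for compression ratio R."""
    sign = 1 if kind == "min" else -1
    n = len(a)
    found = []
    for i in range(n):

        def reaches(step):
            # Walk away from i while a[i] stays the extreme value of the
            # segment; succeed once an endpoint at distance >= R is found.
            k = i + step
            while 0 <= k < n and sign * a[k] >= sign * a[i]:
                if dist(a[i], a[k]) >= R:
                    return True
                k += step
            return False

        if reaches(-1) and reaches(1):
            found.append(i)
    return found


# Hypothetical series with one deep dip and one high peak.
a = [5, 1, 5, 4, 5, 9, 5]
print(important_extrema(a, R=3, kind="min"))  # [1]
print(important_extrema(a, R=3, kind="max"))  # [5]
```

The shallow dip at index 3 is not reported because neither neighbour is at least R away before a lower value is met, which is how R suppresses minor fluctuations.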

[0221] For key trend turning points:

[0222] For a given inflection point error threshold b, let a_j be an inflection point of the time series a_1, a_2, ..., a_n at position j, where 1 < j < n; for any position k > j, there are three slope values:

[0223]

[0224]

[0225]

[0226] If, over the positions k′ considered, S_c(k) > min(S_u(k′)) or S_c(k) < max(S_l(k′)), then a_k is a new turning point of the time series at position k;

[0227] First, detect all important extreme points in the load power time series. Then, detect all inflection points between every two adjacent important extreme points. For each such inflection point a_i, let a_{i-1} and a_{i+1} be the turning points or important extreme points immediately before and after it. Given an important turning point threshold t, if the following condition is met:

[0228]

[0229] then the turning point a_i is an important trend turning point.
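The condition itself is not reproduced in this text. Based on the description of the same test elsewhere in the document (the distance from the turning point to the line connecting its neighbouring points must exceed t), one possible reading is sketched below. Treating each point as an (index, power) pair and using the perpendicular distance are assumptions, as are the function names; a vertical distance would be an equally plausible reading.

```python
import math


def point_line_distance(p, q1, q2):
    """Perpendicular distance from point p to the line through q1 and q2."""
    (x0, y0), (x1, y1), (x2, y2) = p, q1, q2
    num = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
    return num / math.hypot(x2 - x1, y2 - y1)


def is_important_turning_point(prev_pt, pt, next_pt, t):
    """Assumed test: the point lies farther than t from the neighbour line."""
    return point_line_distance(pt, prev_pt, next_pt) > t


# Hypothetical (index, power) points.
print(point_line_distance((1, 1), (0, 0), (2, 0)))                # 1.0
print(is_important_turning_point((0, 0), (1, 1), (2, 0), t=0.5))  # True
```

A turning point that lies almost on the straight line between its neighbours adds little shape information, so discarding it keeps the number of extracted subsequences manageable.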

[0230] Step iv: The key points detected in step iii divide the load power time series into multiple load power subsequences of different lengths, which are extracted based on the key points. These subsequences correspond to the power waveform feature samples of electrical appliances. The DTW algorithm is used to measure the similarity between the templates in the original load imprint template library and these samples, and template matching is performed. If the matching is successful, proceed to step x; otherwise, the load power subsequence is marked as unknown.

[0231] Step v: Determine whether the number of unknown load power subsequences has accumulated to 30. If so, proceed to step vi; otherwise, return to step i.

[0232] Step vi: Load power waveform theme discovery. The steps are as follows:

[0233] A. Determine if the number of data points in the subsequence is less than 10. If the number of data points in the subsequence is less than 10, proceed to step B; otherwise, proceed to step C.

[0234] B. For shorter subsequences, calculate the difference in active power change between the i-th and j-th subsequences, use it as the distance between the i-th and j-th subsequences, and store it in the i-th row and j-th column of the distance matrix D1; create an adjacency matrix L1 with the same dimension as the distance matrix D1 and all initial elements are 0, set a similarity threshold, connect the m-th and n-th subsequences corresponding to the elements in the distance matrix D1 that are less than the threshold, and set the element in the m-th row and n-th column of the adjacency matrix L1 to 1; proceed to step D.
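Step B can be illustrated with the short sketch below. It assumes that the "active power change" of a subsequence is its last value minus its first value and that the distance between two subsequences is the absolute difference of these changes; the function names are hypothetical.

```python
def power_change(subseq):
    """Active power change of a subsequence: end value minus start value (assumed)."""
    return subseq[-1] - subseq[0]


def short_subsequence_adjacency(subsequences, threshold):
    """Return (D1, L1) for short subsequences using power-change distances."""
    deltas = [power_change(s) for s in subsequences]
    k = len(deltas)
    D1 = [[abs(deltas[i] - deltas[j]) for j in range(k)] for i in range(k)]
    L1 = [[1 if i != j and D1[i][j] < threshold else 0 for j in range(k)]
          for i in range(k)]
    return D1, L1


# Hypothetical short subsequences: two switch-on edges and one switch-off edge.
subs = [[50, 1050], [60, 1040, 1055], [1050, 50]]
D1, L1 = short_subsequence_adjacency(subs, threshold=20)
print(D1[0][1], D1[0][2])  # 5 2000
print(L1)                  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

Comparing only the net power change avoids running DTW on subsequences that are too short to carry meaningful shape information, which is where the computational saving for short subsequences comes from.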

[0235] C. For longer subsequences, calculate the DTW distance between the i-th and j-th subsequences and store it in the i-th row and j-th column of the distance matrix D2; create an adjacency matrix L2 with the same dimension as the distance matrix D2 and all initial elements set to 0; set a similarity threshold; connect the m-th and n-th subsequences corresponding to the elements in the distance matrix D2 whose DTW distance is less than the threshold; and set the element in the m-th row and n-th column of the adjacency matrix L2 to 1; proceed to step D.

[0236] D. Subsequence theme discovery is achieved using graph theory: the connection results in the adjacency matrices L1 and L2 constructed in steps B and C are converted into an undirected graph. Each vertex in the undirected graph represents a subsequence, and the edge between each pair of vertices represents the matching relationship between the corresponding two subsequences. The two subsequences corresponding to the positions where the element is 1 in the adjacency matrices L1 and L2 are connected in the undirected graph. Finally, subsequence clustering is achieved by finding clusters in the undirected graph, and each cluster corresponds to a power waveform theme.

[0237] E. Adaptively set the similarity threshold. The minimum value of the upper triangular elements of the distance matrix is used as the lower threshold and the maximum value as the upper threshold, initially determining the range of the upper and lower thresholds. Within this range, a set of candidate thresholds is selected at low resolution and equal intervals. A threshold-similarity curve is plotted, initially obtaining the first peak and its two adjacent troughs. The candidate thresholds corresponding to the two troughs are used as the new lower limit th_l and upper limit th_u, which determine a new threshold search range. Then, within the new search range, a new set of candidate thresholds is selected at high resolution and equal intervals. The above process is repeated until the newly determined search range no longer changes from the previous one. The candidate threshold corresponding to the peak of the threshold-similarity curve at this point is taken as the final similarity threshold, and the corresponding topic discovery result is the final result.

[0238] The average similarity is calculated as follows:

[0239] First, the distance matrix D_q (q∈{1,2}) is transformed into a similarity matrix A_q of the same dimension using the Gaussian similarity function according to equation (5). Then, assuming that the cluster of the l-th topic has n subsequences whose indices in the similarity matrix A_q are i_1, ..., i_s, ..., i_n (where i_s is the index of the central subsequence), the intra-cluster similarity of the l-th topic is calculated according to equation (6). After the intra-cluster similarity of each topic is calculated, their average value is taken as the average intra-cluster similarity corresponding to the candidate threshold.

[0240]

[0241] where s_q is the standard deviation of all upper triangular elements of the distance matrix D_q.

[0242]

[0243] Step vii: Determine whether the number of subsequences in each clique of the undirected graph in step D is not less than 10, that is, whether the load feature sample is repeated at least 10 times. If so, the clique is considered a valid result and enters step viii together with the other valid results. Otherwise, return to step i and wait for subsequent data to accumulate.

[0244] Step viii: Mine the electrical appliance power waveform theme sequence patterns. The steps are as follows:

[0245] a. Based on the aforementioned key point detection results, identify all balance windows in the load power time series. A balance window refers to an active power time series P_1, P_2, ..., P_T that satisfies the following conditions: ① P_1 and P_T are both key points, and P_2 − P_1 > 0 ∧ P_T − P_{T−1} < 0 ∧ |P_1 − P_T| ≤ d; ② every intermediate point P_t satisfies P_t > P_1 and P_t > P_T. In a load power time series, there is usually more than one balance window that meets the above conditions, and windows may be nested. After all balance windows in the time series are found, all nested smaller balance windows need to be removed from the larger balance windows until no balance windows are nested.

[0246] b. Within each balance window identified in step a, the discovered unknown power waveform themes are grouped into a sequence record of electrical appliance power waveform themes to be mined. All power waveform theme sequence records within the balance windows together constitute the electrical appliance power waveform theme sequence database.

[0247] c. For the established electrical appliance power waveform theme sequence database, a frequent sequence pattern mining algorithm is used to mine frequent power waveform theme sequence patterns. Based on this, and considering the characteristics of electrical appliance power consumption patterns and appliance usage patterns, three rules are designed to filter out the frequent sequence patterns corresponding to the complete power consumption patterns of electrical appliances. Assume that the frequent power waveform theme sequence pattern corresponding to the complete power consumption pattern of a certain electrical appliance is E = [m_1, m_2, ..., m_N]^T, where N represents the total number of power waveform themes in this frequent sequence pattern. The rules are as follows:

[0248] ① The complete power consumption pattern of an electrical appliance must begin with a subsequence where the difference in active power at the start and end times is positive, and end with a subsequence where the difference in power is negative, thus satisfying equation (7):

[0249]

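[0250] where the first quantity in equation (7) represents the difference in active power between the start and end times of the central subsequence of the cluster (cluster number m_1) corresponding to the first power waveform theme in a frequent power waveform theme sequence pattern; the second quantity is defined similarly for the last power waveform theme (cluster number m_N).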

[0251] ② In a certain power waveform theme sequence record, if there is one and only one frequent sequence pattern, then the pattern must correspond to the complete power consumption pattern or working cycle of a certain electrical appliance.

[0252] ③ The complete power consumption pattern of an electrical appliance must meet the zero loop-sum constraint. Considering that the power of electrical appliances fluctuates during operation due to their own operating characteristics and random noise, this invention sets the constraint proportional coefficient a∈(0,1), as shown in equation (8).

[0253]

[0254] Step ix: Construct the power waveform theme sequence patterns mined in step viii into the corresponding electrical load imprint templates, thereby updating the load imprint template library of the non-intrusive load monitoring system, so that the templates in the template library can be matched with subsequently extracted unknown power waveform feature samples;

[0255] Step x: Assign the template number from the load imprint template library to each successfully matched original power waveform feature sample, and record its information, including the index and power of each point in the time series. Based on this information, the power consumption curve of a single appliance is reconstructed, thereby realizing the identification of the appliance's working status and the decomposition of its power consumption.

[0256] In the field of non-intrusive load identification, existing methods are mostly based on load event detection. On the one hand, load event detection does not include the intermediate-process power waveform feature samples of complex electrical appliances in its detection scope, so such appliances cannot be modeled and identified autonomously. On the other hand, load event detection often uses a fixed-duration sliding window to scan load power data, which often fails to completely detect the transient-process power waveform feature samples of complex electrical appliances in unfamiliar scenarios, because these samples have diverse time scales and shapes. This invention adopts a variable-length topic discovery method for detecting and mining electrical appliance power waveform themes. Compared with traditional load event detection, it deals more effectively with the multi-time-scale characteristics of electrical appliance power waveform feature samples and with the complex power waveform shapes of complex electrical appliances. It autonomously discovers power consumption patterns with diverse time scales and shapes over the complete operation process of electrical appliances (including transient and intermediate processes), improving the accuracy of modeling and identifying complex electrical appliances with multiple states and continuously varying states.

[0257] According to the above embodiments, this invention iteratively reads and analyzes the active power data at the electricity inlet of residential users, extracts load power subsequences, classifies them by duration, applies different distance or similarity calculation methods to short and long sequences, and adaptively sets thresholds to achieve theme discovery. It mines frequently occurring electrical appliance power waveform theme sequence patterns, constructs load imprint templates, and updates the load imprint template library. The templates in the template library are matched with the extracted power waveform feature samples, thereby achieving unsupervised, non-intrusive load identification. Clearly, this scheme can improve the accuracy and computational efficiency of electrical appliance modeling and identification. Therefore, this invention can achieve unsupervised, non-intrusive load identification, autonomously discovering diverse electrical appliance power consumption patterns over the complete operation process of electrical appliances (including transient and intermediate processes), which demonstrates the feasibility and applicability of the theme discovery method in unsupervised, non-intrusive load identification.
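
The grouping stage summarized above (pairwise distances, thresholding into an adjacency matrix, and clustering by cliques of the resulting undirected graph) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the active power change of a short subsequence is assumed to be its last value minus its first value, the similarity threshold is assumed to be given, and all function names are hypothetical. Maximal cliques are enumerated with the basic Bron-Kerbosch algorithm, which is one standard way of finding cliques.

```python
def power_change_distance(a, b):
    """Assumed short-sequence distance: difference in active power change."""
    return abs((a[-1] - a[0]) - (b[-1] - b[0]))


def distance_matrix(seqs, dist):
    """Symmetric matrix of pairwise distances between subsequences."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = dist(seqs[i], seqs[j])
    return d


def adjacency_matrix(d, threshold):
    """1 where two distinct subsequences are closer than the threshold."""
    n = len(d)
    return [[1 if i != j and d[i][j] < threshold else 0 for j in range(n)]
            for i in range(n)]


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch algorithm."""
    n = len(adj)
    nbrs = [{j for j in range(n) if adj[i][j]} for i in range(n)]
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & nbrs[v], x & nbrs[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(n)), set())
    return cliques
```

For longer subsequences the same `distance_matrix` helper could be called with a DTW distance function in place of `power_change_distance`. Cliques that contain too few subsequences would then be set aside until more data accumulates.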

[0258] This invention applies the variable-length topic discovery method to unsupervised non-intrusive load identification, and establishes an iterative unsupervised non-intrusive load identification method based on variable-length topic discovery, which autonomously discovers the power consumption patterns of electrical appliances with diverse time scales and shapes during the complete operation of electrical appliances.

[0259] This invention can autonomously discover simple and complex power consumption patterns of different types of electrical appliances in various unfamiliar scenarios, adapting to the diversity of power consumption patterns of different brands of the same appliance and the differences in power consumption patterns of different appliances. It performs particularly well in identifying complex appliances such as air compressors and air conditioners.

Claims

1. An iterative unsupervised non-intrusive load identification method, characterized in that it includes the following steps:
(i) Obtaining the load power time series: read the total active power signal of the user within a certain time period to form a load power time series;
(ii) Preprocessing the load power time series: preprocess the load power time series to filter out power spikes in the load power time series;
(iii) Detecting key points of the load power time series: detect key points in the load power time series and mark them in the load power time series;
(iv) Extracting load power subsequences based on the key points and matching them with templates: first, the load power subsequences are extracted based on the key points; then each load power subsequence is matched with the templates; if the load power subsequence matches a template successfully, proceed to step (x); if the load power subsequence fails to match any template, proceed to step (v);
(v) Determining whether the number of unknown load power subsequences has accumulated to 30: if the number of unknown load power subsequences has accumulated to 30, proceed to step (vi); if it is less than 30, proceed to step (i);
(vi) Topic discovery of load power subsequences: discover topics in shorter subsequences; discover topics in longer subsequences; the specific process for topic discovery of load power subsequences is as follows:
(A) Determine whether the number of data points contained in the load power subsequence is less than 10; if it is less than 10, proceed to step (B); if it is greater than 10, proceed to step (C);
(B) For each pair of shorter subsequences, calculate the difference in active power change between the two subsequences, take it as the distance between them, and store it in the distance matrix of the shorter subsequences, in the row and column corresponding to the two subsequences; create an adjacency matrix of the same dimension as this distance matrix, initially containing all zeros; set a similarity threshold; for each element of the distance matrix that is smaller than the threshold, connect the two corresponding subsequences and set the element in the same row and column of the adjacency matrix to 1; proceed to step (D);
(C) For each pair of longer subsequences, calculate the DTW distance between the two subsequences and store it in the distance matrix of the longer subsequences, in the row and column corresponding to the two subsequences; create an adjacency matrix of the same dimension as this distance matrix, initially containing all zeros; set a similarity threshold; for each element of the distance matrix whose DTW distance is less than the threshold, connect the two corresponding subsequences and set the element in the same row and column of the adjacency matrix to 1; proceed to step (D);
(D) Use graph theory to discover the topics of the subsequences: the connection results in the adjacency matrix of the shorter subsequences and the adjacency matrix of the longer subsequences are converted into an undirected graph, where each vertex represents a subsequence and the edge between any two vertices represents the matching relationship between the corresponding two subsequences; the two subsequences corresponding to each position where an element of either adjacency matrix is 1 are connected in the undirected graph; finally, the subsequences are clustered by finding the cliques in the undirected graph, and each clique corresponds to a power waveform theme;
(E) Adaptively set the similarity threshold: using the minimum value of the upper triangular elements of the distance matrix as the lower threshold and the maximum value as the upper threshold, a preliminary range of upper and lower thresholds is determined; within this range, a set of candidate thresholds is selected at low resolution and equal intervals, and a threshold-similarity curve is plotted to initially obtain the first peak and its two adjacent troughs; the candidate thresholds corresponding to the two troughs are used as the new lower limit and upper limit, which determine the new threshold search range; within the new search range, a new set of candidate thresholds is selected at high resolution and equal intervals, and the above process is repeated until the newly determined search range no longer changes from the previous one; the candidate threshold corresponding to the peak of the threshold-similarity curve at this point is taken as the final similarity threshold, and the corresponding topic discovery result is the final result;
(vii) Determining whether the number of subsequences in each clique of the undirected graph is not less than 10: if the number of subsequences in a clique of the undirected graph is not less than 10, the clique is considered a valid result and enters step (viii) together with the other valid results; if the number of subsequences in a clique of the undirected graph is less than 10, proceed to step (i) and wait for subsequent data accumulation;
(viii) Performing electrical power waveform theme sequence pattern mining: electrical power waveform theme sequence pattern mining is performed to obtain power waveform theme sequence patterns;
(ix) Constructing the corresponding electrical load imprint templates and updating the load imprint template library of the non-intrusive load monitoring system: first, the mined power waveform theme sequence patterns are constructed into the corresponding electrical load imprint templates; then, the load imprint template library of the non-intrusive load monitoring system is updated, and the templates in the template library are matched against the subsequently extracted unknown power waveform feature samples;
(x) Identifying the working status of electrical appliances and decomposing their power consumption: the template number in the load imprint template library is assigned to the successfully matched original power waveform feature sample, and its information is recorded; based on this information, the power consumption curve of a single appliance is reconstructed, thereby realizing the identification of the appliance's working status and the decomposition of its power consumption.
The similarity calculation process is as follows: first, the distance matrix is converted, using the Gaussian similarity function according to equation (5), into a similarity matrix of the same dimension; then, for a given theme cluster, whose subsequences each have an index in the similarity matrix and one of which is the central subsequence, the intra-cluster similarity of that theme is calculated according to equation (6); finally, after the intra-cluster similarity has been calculated for each theme, their average value is taken as the average intra-cluster similarity corresponding to the candidate threshold. The parameter of the Gaussian similarity function in equation (5) is the standard deviation of all upper triangular elements of the distance matrix.
Step (viii) involves mining the theme sequence patterns of electrical power waveforms. The specific process is as follows:
a. Based on the aforementioned key point detection results, identify all balance windows in the load power time series. A balance window is an active power time series that meets conditions ① and ②, of which condition ① requires, among other things, that two points of the window both be key points. In a load power time series, there is usually more than one balance window that meets these conditions, and windows may be nested. After all the balance windows in the time series have been found, all the nested small balance windows are removed from the large balance windows until no balance windows are nested.
b. Within each balance window found in step a, the unknown power waveform themes discovered in step (D) are grouped into a sequence record of electrical power waveform themes to be mined. All power waveform theme sequence records within the balance windows together constitute the electrical power waveform theme sequence database.
c. For the established electrical power waveform theme sequence database, a frequent sequence pattern mining algorithm is used to mine frequent power waveform theme sequence patterns.
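
The similarity calculation referred to in claim 1 can be illustrated with a short Python sketch. Equations (5) and (6) are not reproduced in this text, so the sketch rests on assumptions that are labeled in the code: the Gaussian similarity is taken in the common form exp(-d^2 / (2 sigma^2)), sigma is taken as the population standard deviation of the upper triangular elements, the central subsequence of a cluster is taken to be the member most similar to the others, and the intra-cluster similarity is taken to be the mean similarity of the other members to that central subsequence. All function names are hypothetical.

```python
import math
import statistics


def upper_triangle(d):
    """All elements strictly above the diagonal of a square matrix."""
    n = len(d)
    return [d[i][j] for i in range(n) for j in range(i + 1, n)]


def gaussian_similarity(d):
    """Assumed form of equation (5): s = exp(-d^2 / (2 * sigma^2))."""
    sigma = statistics.pstdev(upper_triangle(d))
    if sigma == 0:
        raise ValueError("all pairwise distances are equal; sigma is zero")
    n = len(d)
    return [[math.exp(-d[i][j] ** 2 / (2 * sigma ** 2)) for j in range(n)]
            for i in range(n)]


def intra_cluster_similarity(s, members):
    """Assumed stand-in for equation (6): mean similarity to the center."""
    if len(members) < 2:
        return 1.0
    # Assumption: the central subsequence is the member most similar overall.
    center = max(members, key=lambda c: sum(s[c][m] for m in members))
    others = [m for m in members if m != center]
    return sum(s[center][m] for m in others) / len(others)


def average_intra_cluster_similarity(s, clusters):
    """Score of one candidate threshold: mean over its theme clusters."""
    return sum(intra_cluster_similarity(s, c) for c in clusters) / len(clusters)
```

Evaluating `average_intra_cluster_similarity` for each candidate threshold gives the points of a threshold-similarity curve; the coarse-to-fine search over peaks and troughs is not covered by this sketch.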

2. The iterative unsupervised non-intrusive load identification method according to claim 1, characterized in that: in step (iii), the key points detected in the load power time series include important extreme points and important trend turning points.

3. The iterative unsupervised non-intrusive load identification method according to claim 2, characterized in that: the important extreme points in step (iii) are defined as follows: for a given distance function and compression ratio, a point of a time series is an important minimum or maximum point if there exist two indices bounding the point such that the following conditions are met: the point is the minimum value (respectively, the maximum value) of the sequence between the two indices, and the conditions set by the distance function and the compression ratio hold.

4. The iterative unsupervised non-intrusive load identification method according to claim 3, characterized in that: the important trend turning points in step (iii) are defined as follows: for a given turning point error threshold, let a point be a turning point of the time series; for any position, three slope values are calculated; if either of the two stated conditions is met, then that position is a new turning point of the time series.

5. The iterative unsupervised non-intrusive load identification method according to claim 4, characterized in that: in step (iii), the key points in the load power time series are detected as follows: first, detect all important extreme points in the load power time series; then, detect all turning points between every two adjacent important extreme points; finally, for each such turning point, if the current turning point, the turning points or important extreme points immediately before and after it, and the given important turning point threshold satisfy the condition shown in equation (4), the turning point is an important trend turning point.

6. The iterative unsupervised non-intrusive load identification method according to claim 1, characterized in that: in step (iv), the load power subsequences are extracted based on the key points and matched with the templates as follows: first, the key points divide the load power time series into multiple load power subsequences of different lengths; then, the load power subsequences are extracted based on the key points; finally, the load power subsequences correspond to the power waveform feature samples of the electrical appliances, the DTW algorithm is used to measure the similarity between the templates in the original load imprint template library and these samples, and template matching is performed.
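
The DTW-based template matching of step (iv) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it uses the textbook dynamic-programming form of DTW with absolute difference as the local cost, and it assumes that a sample matches a template when their DTW distance is below a given threshold. The function names, the dictionary layout of the template library and the threshold value are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance with absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # cost of aligning two empty prefixes
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def match_template(sample, templates, threshold):
    """Return the number of the closest template, or None if none is close.

    templates: dict mapping template number -> power waveform (assumed layout).
    """
    best_id, best_dist = None, float("inf")
    for template_id, waveform in templates.items():
        dist = dtw_distance(sample, waveform)
        if dist < best_dist:
            best_id, best_dist = template_id, dist
    return best_id if best_dist < threshold else None
```

Because DTW allows points to be stretched in time, a sample such as `[0, 100, 100, 0]` has distance 0 to a template `[0, 100, 0]`, which is why it suits waveforms of different lengths. A sample that returns `None` would be treated as an unknown subsequence and passed on to step (v).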