A power grid terminal anomaly detection method

By parsing power grid terminal messages layer by layer, extracting features, and combining a one-class support vector machine (OCSVM) with the K-Means clustering algorithm, an anomaly detection model is constructed. This addresses detection at both the network side and the business command level in the power grid smart terminal system, realizes comprehensive security detection of power grid terminals, and improves the security and real-time performance of the power grid system.

CN117240511B · Active · Publication Date: 2026-06-23 · STATE GRID FUJIAN ELECTRIC POWER CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
STATE GRID FUJIAN ELECTRIC POWER CO LTD
Filing Date
2023-08-25
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing technologies are insufficient to effectively detect abnormal security events at the network side and business command level in smart power grid terminal systems, especially attacks such as forged control commands.

Method used

The power grid terminal messages are parsed layer by layer to extract network layer and application layer features, and a one-class support vector machine (OCSVM) is combined with clustering learning to detect abnormal traffic and abnormal business instructions. Key fields are matched against syntax and semantic rules and an attack feature database, and a detection model based on the K-Means clustering algorithm is constructed to achieve comprehensive anomaly detection of power grid terminals.

Benefits of technology

It improves the security of power grid terminals, enabling simultaneous detection of abnormal security events at both the network side and the service command level, thereby enhancing the security and real-time performance of the power grid system.

✦ Generated by Eureka AI based on patent content.


Abstract

The application relates to a power grid terminal anomaly detection method, which comprises the following steps: performing protocol layer-by-layer analysis on a received power grid terminal message, extracting network layer data and application layer instruction level features; performing IP identification on the power grid terminal message, extracting flow features, and realizing flow anomaly detection through one-class support vector machine (OCSVM); extracting application layer instruction level features, extracting key fields capable of identifying protocol features, respectively matching syntax semantic rules and an attack feature library, and realizing detection of abnormal messages and attack messages; and extracting service instruction behavior feature values, and detecting illegal service instructions through a clustering detection model based on clustering learning. The method can simultaneously detect network side and service instruction level abnormal security events of the power grid terminal, and improves the security of the power grid terminal.

Description

Technical Field

[0001] This invention relates to the field of power grid information security detection technology, and specifically to a method for detecting anomalies in power grid terminals.

Background Technology

[0002] In power grids, the values of key operational feature codes, such as telemetry and remote signaling, are primarily discrete variables. For a discrete variable, a normal value is one of the discrete values within the variable's range, and abnormal data appears as outliers. Anomaly detection for constants and discrete variables can therefore be transformed into outlier detection. Currently, outlier detection algorithms are mainly based on four methods: statistical, distance, density, and clustering. Data distribution in industrial environments is difficult to model statistically and often exhibits characteristics such as localized concentration. Compared to outlier detection algorithms based on statistics, distance, and density, clustering-based detection algorithms avoid choosing statistical distribution models, fully consider the local characteristics of the data, and have linear or near-linear time and space complexity, better meeting the real-time requirements of industrial control systems.

[0003] Currently, cyberattacks have become a new type of weapon, and hostile forces have successfully used cyberattacks to damage critical national infrastructure such as power systems. Attacks on power grid smart terminals typically target the unique protocols and specific business logic of the power sector, characterized by clear targeting, covert operation, and long latency periods. They are generally carried out through group-level or even nation-level attacks. Currently, power grid smart terminal systems primarily rely on mature technologies from traditional IT systems for attack detection, detecting network-side security events. They are unable to detect abnormal security events targeting the system's business command level, such as forged control commands.

Summary of the Invention

[0004] The purpose of this invention is to provide a method for detecting anomalies in power grid terminals. This method can simultaneously detect network-side and service instruction-level anomalies in power grid terminals, thereby improving the security of power grid terminals.

[0005] To achieve the above objectives, the technical solution adopted by the present invention is: a method for detecting anomalies in power grid terminals, comprising:

[0006] The received power grid terminal messages are parsed layer by layer to extract network layer data and application layer instruction-level features;

[0007] IP identification is performed on the power grid terminal messages, traffic features are extracted, and traffic anomaly detection is achieved through a one-class support vector machine (OCSVM);

[0008] By extracting the application layer instruction-level features, the keyword fields that can identify protocol features are extracted, and they are respectively matched with the syntax and semantic rules and the attack feature library to detect malformed packets and attack packets; the business instruction behavior feature values are extracted, and the illegal business instructions are detected through a clustering detection model based on clustering learning.

[0009] Further, the syntax and semantic rules are designed according to the protocol specifications; the attack feature library adopts the Snort network attack rule library.

[0010] Further, the implementation method for detecting illegal business instructions through a clustering detection model based on clustering learning is as follows:

[0011] From a large number of training packet samples, the business instruction behavior feature values, including the business instruction feature codes and the business instruction frequencies, are extracted and provided to the clustering algorithm for learning;

[0012] Through the K-Means clustering algorithm, the features of each sample are learned, so that the business instruction behavior features of the same class are aggregated into the same cluster, realizing the classification of the business instruction behavior features, forming multiple business instruction behavior clusters, and then obtaining a trained clustering detection model;

[0013] In the monitoring stage, the business instruction behavior feature values are extracted from the power grid terminal packets collected in real time, and then the extracted business instruction behavior feature values are analyzed through the trained clustering detection model to determine whether there is an abnormality in the business features.

[0014] Further, the frequency clustering analysis for the business instructions of the power grid intelligent terminal includes the following steps:

[0015] 1) Determine the clustering analysis object: perform clustering analysis on the feature codes and frequencies of different business instructions of the power grid intelligent terminal;

[0016] 2) Construct the feature vector: construct a five-dimensional feature vector <IP, type identifier, transmission reason, information object address, business instruction frequency per unit time> for the business instruction frequency, and the five-dimensional feature vector represents the feature code and the feature code frequency of a certain type of business instruction transmitted by the power grid intelligent terminal of a certain IP per unit time;

[0017] 3) Collect training sample data: collect the normal network data sample traffic, parse and identify the business instruction types, and count the frequencies that appear per unit time;

[0018] 4) Construct the training vector set: according to the five-dimensional vector structure, generate a data set $X = \{x_1, x_2, \dots, x_n\}$ containing n five-dimensional data points;

[0019] 5) Clustering and constructing a clustering detection model: The K-Means clustering algorithm is used to organize the data objects in the dataset into K partitions $C = \{c_k,\ k = 1, \dots, K\}$. Each partition represents a class $c_k$, and each class $c_k$ has a class center $\mu_k$. Euclidean distance is selected as the similarity and distance judgment criterion, and the sum of squared distances from each point in the class to the cluster center is calculated; thus, a clustering detection model is constructed.

[0020] 6) Online detection of business instruction frequency: The clustering detection model established after training is used to classify the detection vectors. If the detection vector does not belong to any class, it is judged that an anomaly has occurred.

[0021] Furthermore, this method targets high-dimensional safety monitoring big data objects from multi-source heterogeneous power grid terminals, primarily processing raw text data, raw image data, and log data. For raw text and image data, features are first extracted to form a feature matrix. Then, principal component analysis is used to analyze the internal structure of the correlation or covariance matrix of the original variables, transforming multiple variables into a few comprehensive variables, i.e., principal components, thereby achieving dimensionality reduction. For log data, an association network is constructed based on the sequential adjacency relationships of different event units in the log. On this basis, a deep random walk network embedding model is used to learn the low-dimensional vector representation of key elements in the log, and the correlation and aggregation characteristics between key elements in the log are calculated to automatically discover specific behavioral patterns and outliers of abnormal behavior. By processing data from smart power grid terminals, abnormal behaviors can be detected.

[0022] Compared with the prior art, the present invention has the following beneficial effects: it provides a method for detecting anomalies in power grid terminals. This method can not only monitor network layer traffic anomalies, malformed packets, and attack packets, but also proposes an instruction-level feature detection method based on clustering learning, which can detect anomalies in service features, thereby improving the comprehensiveness of power grid terminal anomaly detection and enhancing the security of power grid terminals.

Attached Figure Description

[0023] Figure 1 is a block diagram illustrating the implementation principle of the method according to an embodiment of the present invention;

[0024] Figure 2 is a flowchart illustrating the implementation of clustering learning-based detection of illegal business instructions in this embodiment of the invention.

[0025] Figure 3 is a flowchart illustrating the implementation of frequency clustering analysis for smart grid terminal service commands in this embodiment of the invention.

[0026] Figure 4 is a flowchart illustrating the process of processing power grid terminal data in an embodiment of the present invention.

Detailed Implementation

[0027] The present invention will be further described below with reference to the accompanying drawings and embodiments.

[0028] It should be noted that the following detailed descriptions are exemplary and intended to provide further explanation of this application. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains.

[0029] It should be noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to this application. As used herein, the singular form is intended to include the plural form as well, unless the context clearly indicates otherwise. Furthermore, it should be understood that when the terms "comprising" and / or "including" are used in this specification, they indicate the presence of features, steps, operations, devices, components, and / or combinations thereof.

[0030] As shown in Figure 1, this embodiment provides a method for detecting anomalies in power grid terminals, including:

[0031] 1) Perform layer-by-layer protocol parsing on the received power grid terminal messages to extract network layer data and application layer instruction-level features.

[0032] 2) IP identification is performed on the power grid terminal messages, traffic features are extracted, and traffic anomaly detection is achieved through a one-class support vector machine (OCSVM).

[0033] 3) Application layer instruction-level features are extracted, and key fields that can identify protocol features are extracted and matched against the syntax and semantic rules and the attack feature database, respectively, to detect malformed packets and attack packets; business instruction behavior feature values are extracted, and illegal business instructions are detected by a clustering detection model based on clustering learning.

[0034] In this embodiment, the syntax and semantic rules are designed according to the protocol specification. The attack signature database adopts the Snort network attack rule base.
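The rule and signature matching described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the `AppMessage` record, the allowed field values, and the single byte-pattern signature are hypothetical placeholders, and a real Snort rule base is far richer than a substring match.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical parsed application-layer fields; the names mirror the key
# fields named in the description, not any specific protocol library.
@dataclass(frozen=True)
class AppMessage:
    type_id: int        # type identifier
    cause: int          # transmission reason
    object_addr: int    # information object address
    payload: bytes      # raw application-layer bytes

# Assumed syntax/semantic rules (placeholder values, not taken from a spec).
ALLOWED_CAUSES_BY_TYPE = {1: {3, 20}, 13: {1, 3, 20}, 45: {6, 7, 10}}
MAX_OBJECT_ADDR = 0xFFFFFF

# Assumed attack signatures: byte patterns standing in for a rule library.
ATTACK_SIGNATURES = {"example-nop-sled": b"\x90\x90\x90\x90"}

def check_message(msg: AppMessage) -> List[str]:
    """Return a list of findings; an empty list means nothing matched."""
    findings = []
    # Syntax/semantic checks: field values must be allowed and consistent.
    if msg.type_id not in ALLOWED_CAUSES_BY_TYPE:
        findings.append("malformed: unknown type identifier")
    elif msg.cause not in ALLOWED_CAUSES_BY_TYPE[msg.type_id]:
        findings.append("malformed: transmission reason not valid for type")
    if not 0 <= msg.object_addr <= MAX_OBJECT_ADDR:
        findings.append("malformed: object address out of range")
    # Signature checks: any known pattern in the payload marks an attack.
    for name, pattern in ATTACK_SIGNATURES.items():
        if pattern in msg.payload:
            findings.append(f"attack: {name}")
    return findings
```

Keeping the two checks separate lets a message be reported as malformed, as an attack, or as both, which matches the description's distinction between malformed packets and attack packets.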

[0035] This method constructs a real-time interaction anomaly detection framework for power grid terminals. It achieves network-side traffic security detection during real-time interaction of power grid smart terminals through traffic anomaly detection, and application-layer instruction-level security detection during real-time interaction of power grid smart terminals through business instruction-level detection.
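As a rough illustration of the network-side branch, the sketch below groups packets by source IP and time window and derives a small traffic feature vector for each group. The patent does not list which traffic features are used, so the three chosen here (packet count, total bytes, mean packet size), the `Packet` tuple layout, and the 60-second window are all assumptions.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# A packet record is (timestamp_seconds, source_ip, length_bytes); this
# layout is a hypothetical stand-in for the output of a capture parser.
Packet = Tuple[float, str, int]

def traffic_features(
    packets: Iterable[Packet], window: float = 60.0
) -> Dict[Tuple[str, int], List[float]]:
    """Group packets by (source IP, window index) and compute
    [packet count, total bytes, mean packet size] for each group."""
    sizes = defaultdict(list)
    for ts, ip, length in packets:
        sizes[(ip, int(ts // window))].append(length)
    return {
        key: [float(len(v)), float(sum(v)), float(mean(v))]
        for key, v in sizes.items()
    }
```

Vectors collected from normal traffic could then be used to fit a one-class SVM, for example scikit-learn's `OneClassSVM`, with vectors from live traffic scored against the fitted model. That library choice is a suggestion, not something the source specifies.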

[0036] As shown in Figure 2, the implementation method for detecting illegal business instructions using a clustering detection model based on clustering learning is as follows:

[0037] 1) Message feature extraction: Extract the business instruction behavior feature values, such as frequency, feature code, etc., from a large number of training message samples, and provide them to the clustering algorithm for learning.

[0038] 2) Clustering learning: Use the K-Means clustering algorithm to learn the features of each sample, so that the business instruction behavior features of the same class are aggregated into the same cluster, realizing the classification of business instruction behavior features, forming multiple business instruction behavior clusters, and then obtaining a trained clustering detection model.

[0039] The K-Means clustering algorithm is an iterative process, aiming to minimize the sum of the squares of the distances from all samples in the clustering domain to the cluster center.

[0040] Through the K-Means clustering algorithm, the business behaviors of the same class are aggregated into the same cluster, realizing the functional classification of business instruction behaviors. The clustering forms multiple sets of business instruction behaviors, such as: remote control commands, telemetry commands, electric energy summons commands, remote parameter reading and writing, file transfer, etc.

[0041] 3) In the monitoring stage, extract the business instruction behavior feature values from the real-time collected power grid terminal messages, and then analyze the extracted business instruction behavior feature values through the trained clustering detection model to determine whether there is an abnormal business feature.
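The clustering-learning step can be sketched with a plain Lloyd-style K-Means iteration, which alternates between assigning each sample to its nearest center and moving each center to the mean of its samples, reducing the sum of squared distances to the cluster centers. This is a minimal standard-library sketch, not the patent's implementation; the function names, the random initialization, and the iteration cap are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]

def nearest(x: Vector, centers: List[List[float]]) -> Tuple[int, float]:
    """Return (index, Euclidean distance) of the center closest to x."""
    dists = [math.dist(x, c) for c in centers]
    k = min(range(len(centers)), key=dists.__getitem__)
    return k, dists[k]

def kmeans(data: List[Vector], k: int, iters: int = 100, seed: int = 0) -> List[List[float]]:
    """Lloyd iteration: assign points to the nearest center, then move
    each center to the mean of its assigned points, until stable."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, k)]  # assumed init scheme
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            clusters[nearest(x, centers)[0]].append(x)
        new_centers = [
            [sum(col) / len(pts) for col in zip(*pts)] if pts else centers[i]
            for i, pts in enumerate(clusters)
        ]
        if new_centers == centers:  # assignments no longer change
            break
        centers = new_centers
    return centers

def sse(data: List[Vector], centers: List[List[float]]) -> float:
    """Sum of squared distances from each point to its nearest center."""
    return sum(nearest(x, centers)[1] ** 2 for x in data)
```

The returned centers play the role of the trained clustering detection model. In the monitoring stage, `nearest` gives the cluster a new feature vector would fall into and how far it lies from that cluster's center.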

[0042] As shown in Figure 3, the frequency clustering analysis process for power grid intelligent terminal business instructions includes the following steps:

[0043] 1) Determine the clustering analysis object: Conduct clustering analysis on the feature codes and frequencies of different business instructions of power grid intelligent terminals (such as changing remote signals, changing telemetry and remote control, etc.).

[0044] 2) Construct a feature vector: Construct a five-dimensional feature vector <IP, type identifier, transmission reason, information object address, business instruction frequency per unit time> for the business instruction frequency. The five-dimensional feature vector represents the feature code and feature code frequency of a certain type of business instruction transmitted by a power grid intelligent terminal of a certain IP per unit time.

[0045] 3) Training sample data collection: Collect the normal network data sample traffic, parse and identify the business instruction types, and count the frequencies that appear per unit time.

[0046] 4) Construct a five-dimensional training vector set: According to the five-dimensional vector structure, generate a data set $X = \{x_1, x_2, \dots, x_n\}$ containing n five-dimensional data points.

[0047] 5) Clustering and constructing a clustering detection model: The K-Means clustering algorithm is used to organize the data objects in the dataset into K partitions $C = \{c_k,\ k = 1, \dots, K\}$. Each partition represents a class $c_k$, and each class $c_k$ has a class center $\mu_k$. Euclidean distance is selected as the similarity and distance judgment criterion, and the sum of squared distances from each point in the class to the cluster center is calculated; thus, a clustering detection model is constructed.

[0048] 6) Online detection of business instruction frequency: The clustering detection model established during the training phase is used to classify the detection vectors. If the detection vector does not belong to any class, it is judged that an anomaly has occurred.
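Steps 2) to 4) and step 6) above can be sketched as follows. The description does not say how the IP and code fields are encoded numerically, nor how "does not belong to any class" is decided, so two assumptions are made here: the IP address is encoded as its integer value, and a vector is treated as belonging to a cluster only when it lies within a per-cluster radius. The `Record` layout and the function names are hypothetical.

```python
import math
from collections import Counter
from ipaddress import ip_address
from typing import Iterable, List, Sequence, Tuple

# Hypothetical parsed instruction record:
# (timestamp_seconds, ip, type_identifier, transmission_reason, object_address)
Record = Tuple[float, str, int, int, int]

def build_vectors(records: Iterable[Record], window: float = 60.0) -> List[List[float]]:
    """Count how often each (IP, type id, reason, address) instruction occurs
    per time window and emit one five-dimensional vector per count."""
    counts = Counter(
        (int(ts // window), ip, tid, cot, ioa) for ts, ip, tid, cot, ioa in records
    )
    return [
        # Assumed encoding: the IP address becomes its integer value.
        [float(int(ip_address(ip))), float(tid), float(cot), float(ioa), float(n)]
        for (_, ip, tid, cot, ioa), n in counts.items()
    ]

def is_anomalous(
    x: Sequence[float], centers: Sequence[Sequence[float]], radii: Sequence[float]
) -> bool:
    """Flag x when it lies outside the assumed radius of every cluster."""
    return all(math.dist(x, c) > r for c, r in zip(centers, radii))
```

With raw values like these, the IP dimension dominates the Euclidean distance, so a practical variant would likely scale the features or train one model per terminal. The per-cluster radius could be set, for instance, to the largest training distance observed in that cluster.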

[0049] As shown in Figure 4, this method targets high-dimensional safety monitoring big data objects from multi-source heterogeneous power grid terminals, primarily processing raw text data, raw image data, and log data. For raw text and image data, features are first extracted to form a feature matrix. Then, principal component analysis is used to analyze the internal structure of the correlation or covariance matrix of the original variables, transforming multiple variables into a few comprehensive variables, i.e., principal components, thereby achieving dimensionality reduction. For log data, an association network is constructed based on the sequential adjacency relationships of different event units in the log. Based on this, a deep random walk network embedding model is used to learn the low-dimensional vector representation of key elements in the log, and the correlation and clustering characteristics between key elements in the log are calculated, thereby automatically discovering specific behavioral patterns and outliers. By processing data from smart power grid terminals, abnormal behaviors can be detected.
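The log-processing branch can be partly illustrated in code. The sketch below builds an association network from the sequential adjacency of log event units and samples random walks over it, which is the corpus a DeepWalk-style embedding model would be trained on. The source does not specify the embedding model beyond "deep random walk network embedding", so the DeepWalk interpretation, the walk parameters, and the function names are assumptions, and the embedding training step itself is omitted.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence

def build_adjacency(events: Sequence[str]) -> Dict[str, List[str]]:
    """Link each log event unit to the one that immediately follows it."""
    graph = defaultdict(list)
    for a, b in zip(events, events[1:]):
        # Repeated entries keep transition frequency as an implicit weight.
        graph[a].append(b)
    return dict(graph)

def random_walks(
    graph: Dict[str, List[str]],
    walk_len: int = 5,
    walks_per_node: int = 10,
    seed: int = 0,
) -> List[List[str]]:
    """Sample fixed-length random walks starting from every node that has
    at least one successor; a walk stops early at a node with none."""
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len and walk[-1] in graph:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks
```

Each walk can be treated like a sentence of event units, so that a skip-gram style model learns a low-dimensional vector per event unit. Event units whose vectors sit far from every dense group would then be candidates for the outliers mentioned in the description.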

[0050] This invention analyzes and extracts the protocol features of service messages from smart grid terminals, constructing feature vectors covering the control domain, application layer function codes, command direction, and command transmission time. Based on this, it uses the K-means clustering algorithm to classify terminal service behaviors, building a smart grid terminal service behavior model. Combined with power grid terminal service commands, the model performs real-time comparison of service commands to detect command-level attacks, achieving online identification of command-level attacks and solving the challenge of identifying numerous novel protocol command-level attacks in smart grid terminals.

[0051] The present invention also provides a power grid terminal anomaly detection system, including a memory, a processor, and computer program instructions stored in the memory and executable by the processor. When the processor executes the computer program instructions, it can implement the above-described method steps.

[0052] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0053] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0054] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0055] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more blocks of the block diagram.

[0056] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention in any other way. Any person skilled in the art may make changes or modifications to the above-disclosed technical content to create equivalent embodiments. However, any simple modifications, equivalent changes, and modifications made to the above embodiments based on the technical essence of the present invention without departing from the scope of the present invention shall still fall within the protection scope of the present invention.

Claims

1. A method for detecting anomalies in power grid terminals, characterized in that it includes: performing layer-by-layer protocol analysis on the received power grid terminal message, and extracting network layer data and application layer instruction-level features; performing IP identification on the power grid terminal message, extracting traffic features, and implementing traffic anomaly detection of the power grid terminal through a one-class support vector machine (OCSVM); extracting application layer instruction-level features, extracting keyword fields that can identify protocol features, and matching them with syntax and semantic rules and attack feature libraries respectively to implement the detection of malformed messages and attack messages; extracting business instruction behavior feature values, and detecting illegal business instructions through a clustering detection model based on clustering learning;

The implementation method for detecting illegal business instructions through a clustering detection model based on clustering learning is as follows: extract business instruction behavior feature values from a large number of training message samples, including business instruction feature codes and business instruction frequencies, and provide them to the clustering algorithm for learning; learn the features of each sample through the K-Means clustering algorithm, so that the business instruction behavior features of the same class are aggregated into the same cluster, implement the classification of business instruction behavior features, form multiple classes of business instruction behavior clusters, and then obtain a trained clustering detection model; in the monitoring stage, extract business instruction behavior feature values from the real-time collected power grid terminal messages, and then analyze the extracted business instruction behavior feature values through the trained clustering detection model to determine whether there is an abnormality in business features;

The frequency clustering analysis for the business instructions of the power grid intelligent terminal includes the following steps:

1) Determine the clustering analysis object: perform clustering analysis on the feature codes and frequencies of different business instructions of the power grid intelligent terminal;

2) Construct a feature vector: construct a five-dimensional feature vector <IP, type identifier, transmission reason, information object address, business instruction frequency per unit time> for the business instruction frequency, and the five-dimensional feature vector represents the feature code and feature code frequency of a certain type of business instruction transmitted by the power grid intelligent terminal of a certain IP per unit time;

3) Training sample data collection: collect normal network data sample traffic, analyze and identify the business instruction type, and count the frequency of occurrence per unit time;

4) Constructing the training vector set: based on the five-dimensional vector structure, generate a dataset $X = \{x_1, x_2, \dots, x_n\}$ containing n five-dimensional data points;

5) Clustering and constructing a clustering detection model: the K-Means clustering algorithm is used to organize the data objects in the dataset into K partitions $C = \{c_k,\ k = 1, \dots, K\}$; each partition represents a class $c_k$, and each class $c_k$ has a class center $\mu_k$; Euclidean distance is selected as the similarity and distance judgment criterion, and the sum of squared distances from each point in the class to the cluster center is calculated; thus, a clustering detection model is constructed;

6) Online detection of business instruction frequency: classify the detection vector using the trained clustering detection model. If the detection vector does not belong to any class, it is determined that an abnormality has occurred.

2. The method for detecting anomalies in a power grid terminal according to claim 1, characterized in that the syntax and semantic rules are designed according to protocol specifications, and the attack feature library uses the Snort network attack rule library.

3. The method for detecting anomalies in a power grid terminal according to claim 1, characterized in that the method targets high-dimensional safety monitoring big data objects from multi-source heterogeneous power grid terminals, primarily processing raw text data, raw image data, and log data. For raw text and image data, features are first extracted to form a feature matrix. Then, principal component analysis is used to analyze the internal structure of the correlation or covariance matrix of the original variables, transforming multiple variables into a few comprehensive variables, i.e., principal components, thereby achieving dimensionality reduction. For log data, an association network is constructed based on the sequential adjacency relationships of different event units in the log. On this basis, a deep random walk network embedding model is used to learn the low-dimensional vector representation of key elements in the log, and the correlation and aggregation characteristics between key elements in the log are calculated to automatically discover specific behavioral patterns and outliers. By processing data from smart power grid terminals, abnormal behaviors can be detected.