AI-driven industrial manufacturing data stream real-time processing method and system
By employing an AI-driven real-time data stream processing method for industrial manufacturing, and utilizing similarity analysis of historical and real-time data sequences, gradual anomalies can be dynamically identified. This solves the problem of frequent misjudgments in existing technologies and achieves more accurate anomaly detection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANDONG BLUEBIRD IND INTERNET CO LTD
- Filing Date
- 2026-05-22
- Publication Date
- 2026-06-19
AI Technical Summary
Existing methods for detecting gradual anomalies are inadequate to identify the cumulative state of gradual drift during long-term production on a production line, leading to equipment damage and frequent misjudgments.
An AI-driven real-time data stream processing method for industrial manufacturing acquires historical and real-time data sequences, calculates drift significance, structural similarity, and feature similarity, and combines them with the probability of state occurrence to dynamically identify gradual anomalies.
It improves the accuracy of gradual anomaly identification, reduces the risk of misjudgment, and can more accurately capture the correlation structural differences of different drift states, thereby improving the sensitivity of equipment anomaly detection.
Figure CN122241542A_ABST
Description
Technical Field
[0001] This invention relates to the field of data processing, and more particularly to an AI-driven real-time data stream processing method and system for industrial manufacturing.
Background Technology
[0002] In modern industrial manufacturing, with the deepening of Industry 4.0 and intelligent manufacturing, production equipment is becoming increasingly complex, automated, and interconnected, generating massive amounts of multi-source heterogeneous data streams in real time (such as vibration, temperature, pressure, speed, current, process parameters, etc.). During industrial manufacturing, factors such as progressive tool wear, early bearing degradation, lubricant deterioration, slight deviations in cutting parameters from specifications, and gradual increases in surface roughness can easily lead to gradually changing abnormal data in the manufacturing data stream.
[0003] In existing technologies, gradual anomaly detection methods mostly rely on statistical and traditional machine learning methods such as control charts (SPC), Mahalanobis distance, isolation forest, local outlier factor (LOF), and one-class SVM. These methods assume that the data distribution is relatively stable and that the data has a single, clearly defined drift state. They struggle to distinguish the different cumulative states of asymptotic drift that arise during long-term production on a production line and are prone to misjudging the cumulative state, so gradual anomalies are identified too late and production line equipment is damaged.
Summary of the Invention
[0004] The main objective of this invention is to provide an AI-driven real-time processing method and system for industrial manufacturing data streams, aiming to solve the technical problems of existing gradual anomaly detection for manufacturing data streams: the cumulative state of asymptotic drift is misjudged, gradual anomaly data is easily missed, and equipment is damaged as a result.
[0005] To achieve the above objective, in a first aspect this invention provides an AI-driven real-time processing method for industrial manufacturing data streams, the method comprising:
Obtaining historical data sequences of production equipment in each dimension within a set historical time period, dividing the historical time period into multiple sub-time periods, and obtaining the sub-data sequence of each sub-time period in each dimension;
Obtaining the effective value of each sub-time period in each dimension based on the sub-data sequences, and calculating the drift significance of each sub-time period in each dimension based on the effective values;
Calculating the historical correlation of each sub-time period between dimensions based on the sub-data sequences;
Acquiring real-time data sequences of the production equipment in each dimension within a real-time time period, and calculating the real-time correlation between the real-time data sequences;
Calculating the structural similarity between the real-time time period and each sub-time period based on the historical correlation and the real-time correlation, and calculating the feature similarity between the real-time time period and each sub-time period based on the sub-data sequences and the real-time data sequences;
Fusing the structural similarity and the feature similarity according to the drift significance to obtain the probability of state occurrence of the real-time time period relative to each sub-time period;
Calculating the gradual anomaly score of the real-time time period relative to each sub-time period based on the historical correlation and the real-time correlation; weighting the gradual anomaly scores by the probability of state occurrence to obtain the degree of gradual anomaly of the real-time time period; and, if the degree of gradual anomaly is greater than a preset threshold, determining that a gradual anomaly exists in the real-time time period.
[0006] Preferably, dividing the historical time period into multiple sub-time periods includes: A Fourier transform is performed on the historical data sequence of each dimension to obtain its spectrum. The amplitudes of all dimensions at the same frequency are summed, and a frequency is selected based on this cumulative sum. The reciprocal of the selected frequency is used as the segment length to divide the historical time period into multiple sub-time periods of equal length.
[0007] Preferably, the acquisition of the significance of drift in each dimension within the sub-time period includes: Take any sub-time period as the target segment, perform differential processing on the sub-data sequences of each dimension in the target segment, and calculate the effective value of each dimension in the target segment based on the differential processing results; Using any dimension as the target dimension, calculate the drift difference between the effective value of the target dimension and the effective values of each dimension in the target segment, and then fuse the drift differences to obtain the drift significance of each dimension in the target segment.
[0008] Preferably, the acquisition of the historical correlation and the real-time correlation includes: Based on the partial correlation coefficients between the sub-data sequences of the target segment in each dimension, the historical correlation of the target segment between any two dimensions is obtained. Based on the partial correlation coefficients between the real-time data sequences of the real-time time period in each dimension, the real-time correlation of the real-time time period between any two dimensions is obtained.
[0009] Preferably, the acquisition of the structural similarity between the real-time time period and the sub-time period includes: Based on the historical correlation of the target segment between the target dimension and each dimension, a historical correlation vector of the target segment in the target dimension is constructed, and based on the real-time correlation of the real-time time period between the target dimension and each dimension, a real-time correlation vector of the real-time time period in the target dimension is constructed. The Canberra distance between the historical correlation vector and the real-time correlation vector is calculated to obtain the structural similarity between the real-time time period and the target segment.
[0010] Preferably, the acquisition of feature similarity between the real-time time period and the sub-time period includes: The Canberra distance is calculated based on the real-time data sequence of the target dimension within the real-time time period and the sub-data sequence of the target dimension within the target segment to obtain the feature similarity between the real-time time period and the target segment in the target dimension.
[0011] Preferably, the acquisition of the probability of state occurrence for the real-time time period and the sub-time period includes: The structural similarity and the feature similarity between the real-time time period and the target segment in the target dimension are fused to obtain the fusion similarity. The fusion similarity is weighted by the drift significance of the target segment in the target dimension to obtain the state similarity; the state similarities of the real-time time period and the target segment in each dimension are fused to obtain the probability of state occurrence of the real-time time period relative to the target segment.
[0012] Preferably, the acquisition of the gradual anomaly score between the real-time time period and the sub-time period includes: The fluctuation difference is calculated based on the effective value of the real-time time period in the target dimension and the effective value of the target segment in the target dimension. The fluctuation differences of the real-time time period and the target segment in each dimension are fused to obtain the waveform anomaly degree of the real-time time period relative to the target segment. The real-time time period is divided into multiple sub-windows according to a preset window length to obtain the local correlation between the dimensions in each sub-window; a local correlation vector of each sub-window is constructed based on the local correlation between the dimensions in that sub-window, and the Canberra distance between the local correlation vectors of the sub-windows is calculated. The Canberra distances are then fused to obtain the correlation anomaly degree. The waveform anomaly degree and the correlation anomaly degree are fused to obtain the gradual anomaly score of the real-time time period relative to the target segment.
[0013] Preferably, the acquisition of the gradual anomaly degree over the real-time time period includes: The gradual anomaly scores are weighted according to the probability of state occurrence, and the weighted results are fused to obtain the degree of gradual anomaly in the real-time time period.
[0014] In a second aspect, an AI-driven real-time processing system for industrial manufacturing data streams includes: a processor and a memory, wherein the memory stores computer program instructions, and when the computer program instructions are executed by the processor, the real-time processing method for industrial manufacturing data streams described in any one of the above claims is implemented.
[0015] The present invention has the following beneficial effects: 1. This invention calculates the significance of drift in different dimensions, analyzes the magnitude of change in different dimensions under standardized conditions based on first-order difference values, quantifies the drift intensity in different dimensions, and focuses on the stronger drift dimensions. At the same time, it calculates structural similarity to analyze the differences in local correlation structures when different drift states are generated in multiple dimensions, and calculates feature similarity to analyze the intuitive numerical differences between time series data in multiple sub-time periods. Compared with single intuitive numerical differences, the two-dimensional similarity analysis can more accurately capture the subtle differences in correlation structures between different drift states, effectively analyze the probability of occurrence of different drift states, and significantly reduce the risk of misjudging gradual anomalies.
[0016] 2. This invention constructs a gradual anomaly identification system based on the probability distribution of multiple drift states. By establishing the probability distribution of drift states represented by the real-time time period relative to the sub-time period, it is beneficial to analyze the probability of each drift state occurring during real-time production of production equipment. Compared with single drift state analysis, the comprehensive analysis of multiple drift states can more comprehensively identify the occurrence of gradual drift behavior, effectively distinguish between gradual anomalies and gradual drifts under the multiple drift state distribution, improve the detection sensitivity of gradual anomaly data, and significantly improve the accuracy of gradual anomaly identification.
[0017] The above and other objects, advantages and features of the present invention will become more apparent to those skilled in the art from the following detailed description of specific embodiments of the invention in conjunction with the accompanying drawings.
Attached Figure Description
[0018] The following sections will describe some specific embodiments of the invention in detail by way of example and not limitation, with reference to the accompanying drawings. The same reference numerals in the drawings denote the same or similar parts or portions. Those skilled in the art should understand that these drawings are not necessarily drawn to scale. In the drawings:
Figure 1 is a schematic flowchart of an AI-driven real-time processing method for industrial manufacturing data streams according to an embodiment of the present invention;
Figure 2 is a schematic structural block diagram of an AI-driven real-time data stream processing system for industrial manufacturing according to an embodiment of the present invention.
The objectives, features, and advantages of this invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed Implementation
[0019] It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of the invention.
[0020] With reference to Figures 1 and 2, this embodiment describes an AI-driven real-time data stream processing method and system for industrial manufacturing. In this description, it should be understood that the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, a feature defined with "first" or "second" may explicitly or implicitly include at least one of that feature, that is, include one or more of that feature. In this description, "multiple" means at least two, such as two, three, etc., unless otherwise explicitly specified. When a feature "includes or contains" one or more of the features it encompasses, unless otherwise specifically described, this indicates that other features are not excluded and may be further included.
[0021] In the description of this embodiment, the terms "one embodiment," "some embodiments," "illustrative embodiment," "example," "specific example," or "some examples," etc., refer to specific features, structures, materials, or characteristics described in connection with that embodiment or example, which are included in at least one embodiment or example of the present invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
[0022] Referring to Figure 1, the AI-driven real-time processing method for industrial manufacturing data streams includes steps S1-S7, as detailed below:
S1: Obtain the historical data sequence of the production equipment in each dimension within a set historical time period, divide the historical time period into multiple sub-time periods, and obtain the sub-data sequence of each sub-time period in each dimension.
[0023] In the gradual anomaly detection of industrial manufacturing data streams, there are complex gradual drift states that accumulate. To quantify this state accumulation and implement anomaly detection, it is necessary to first obtain the "historical state distribution" and "current state" of the drift accumulation. The historical data stream represents the historical state distribution, and the data stream to be processed represents the current state.
[0024] In this embodiment of the invention, the historical data sequences are collected at a sampling frequency of 1 kHz as follows: multiple sensors are deployed on the production equipment of the production line, and the central system reads the data collected by each sensor to obtain the historical data sequence of the production equipment in each dimension within a set historical time period. The historical data sequence in each dimension is then Z-score standardized to obtain the historical data stream.
[0025] It should be noted that the various dimensions mentioned specifically refer to equipment characteristics such as temperature, pressure, and rotation speed that have a real impact on product production; the specific implementation of Z-score standardization is to calculate the standard deviation of the dimension to which the value to be standardized belongs as the denominator, calculate the difference between the value to be standardized and the mean of the dimension as the numerator, and use the ratio of the two as the standardized value to achieve standardization.
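The Z-score standardization described above can be sketched as follows. This is a minimal illustration, assuming the data is held as a NumPy array with one column per dimension; the function name `zscore` and the handling of constant channels are assumptions, not part of the source.

```python
import numpy as np

def zscore(x: np.ndarray) -> np.ndarray:
    """Standardize each column (dimension) of x, shape (samples, dimensions)."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    # Assumption: a constant channel (std == 0) is mapped to zeros instead of
    # dividing by zero; the source does not discuss this case.
    std = np.where(std == 0, 1.0, std)
    # Numerator: value minus the dimension mean; denominator: dimension std.
    return (x - mean) / std
```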
[0026] In industrial manufacturing, depending on the content and intensity of production, production equipment exhibits significant asymptotic drift phenomena and accumulates states of varying intensities and time spans. Therefore, it is necessary to automatically segment historical time periods into sub-time periods to quantify the asymptotic drift states and analyze the significance of drift in each dimension under each state.
[0027] In one embodiment, the historical time period is divided into multiple sub-time periods, as follows: To obtain the segment length of a sub-time period, firstly, Fourier transforms are performed on the historical data sequence for each dimension of the historical time period to obtain the historical spectrum for each dimension. Taking one frequency in the historical spectrum as an example, this frequency is designated as the target frequency. Based on the amplitude of the target frequency in the historical spectrum for each dimension, the cumulative sum of the amplitudes is calculated as the frequency intensity of the target frequency. The frequency intensity is then calculated for each frequency in the historical spectrum, and the reciprocal of the frequency corresponding to the maximum frequency intensity is used as the segment length.
[0028] The method of the above embodiment can divide the historical time period. However, signals in different dimensions of industrial manufacturing differ in rate of change and amplitude, so when the amplitudes are simply summed, a dimension with higher signal energy can be smoothed by lower-energy dimensions and lose too much information, while a lower-energy dimension can be dominated by higher-energy dimensions and lose too much detail. For example, vibration has higher energy than other dimensions, so it can easily dominate the segment length and cause the drift state details of other dimensions to be lost.
[0029] To address the aforementioned issues, in other embodiments, dividing the historical time period into multiple sub-time periods further includes: To obtain the segment length of a sub-time period, firstly, Fourier transforms are performed on the historical data sequence for each dimension of the historical time period to obtain the historical spectrum for each dimension. Then, the cumulative amplitude of all frequencies in the historical spectrum for each dimension is calculated to obtain the dimensional intensity for each dimension of the historical time period. Taking one frequency in the historical spectrum as an example, and assuming this frequency is the target frequency, the product between the amplitude of the target frequency and the dimensional intensity for each dimension is calculated sequentially. The product of the target frequency in each dimension is then summed to obtain the frequency intensity of the target frequency. The frequency intensity is calculated for each frequency in the historical spectrum, and the reciprocal of the frequency corresponding to the maximum frequency intensity is used as the segment length.
[0030] Specifically, this embodiment introduces the dimensional strength of each dimension, which can be differentiated and weighted according to the overall amplitude difference of each dimension. Dimensions with higher dimensional strength (more complex states and more diverse dimensions) will be given higher weights, which solves the limitations of simply superimposing each dimension and makes the quantification of drift state more in line with the difference in drift strength of each dimension in the data stream.
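The two ways of choosing the segment length described above (plain summation of amplitudes, and summation weighted by dimensional intensity) might be sketched like this. The function name `segment_length`, the `weighted` flag, and the exclusion of the zero-frequency bin (whose reciprocal is undefined) are assumptions made for this illustration.

```python
import numpy as np

def segment_length(history: np.ndarray, fs: float, weighted: bool = False) -> int:
    """Return a segment length in samples for history of shape (samples, dims)."""
    n = history.shape[0]
    # Amplitude spectrum per dimension; drop the DC bin (assumption).
    amp = np.abs(np.fft.rfft(history, axis=0))[1:]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)[1:]
    if weighted:
        # Dimensional intensity: cumulative amplitude of all frequencies per dimension.
        dim_intensity = amp.sum(axis=0)
        intensity = (amp * dim_intensity).sum(axis=1)
    else:
        # Frequency intensity: plain sum of amplitudes over dimensions.
        intensity = amp.sum(axis=1)
    best = freqs[int(np.argmax(intensity))]
    # Reciprocal of the strongest frequency is a period in seconds; convert to samples.
    return max(1, int(round(fs / best)))
```

With a strong 5 Hz component in one dimension and a spectrally richer second dimension, the weighted variant can select a different frequency than the plain sum, which is the effect the paragraph above describes.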
[0031] The historical data sequence is divided into equal-length segments based on the obtained segment length to obtain multiple sub-time periods.
[0032] Specifically, when the length of the historical time period is not an integer multiple of the segment length, neighbor interpolation is used to pad it so that equal-length segmentation is possible. Neighbor interpolation is implemented by repeatedly filling the data of each dimension at the first timestamp of the historical time period onto the left side of the historical time period, stopping once the segment length is satisfied.
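A minimal sketch of the padding and equal-length segmentation, assuming that "meeting the segment length" means the total length becomes an integer multiple of it. The name `split_equal` is hypothetical.

```python
import numpy as np

def split_equal(history: np.ndarray, seg_len: int) -> np.ndarray:
    """Split history (samples, dims) into segments of shape (M, seg_len, dims)."""
    n, dims = history.shape
    pad = (-n) % seg_len  # samples missing to reach a multiple of seg_len
    if pad:
        # Left-pad by repeating the first timestamp's values in every dimension.
        first = np.repeat(history[:1], pad, axis=0)
        history = np.concatenate([first, history], axis=0)
    return history.reshape(-1, seg_len, dims)
```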
[0033] S2: Obtain the effective values of the sub-time period in each dimension based on the sub-data sequence, and calculate the drift significance of the sub-time period in each dimension based on the effective values.
[0034] Taking one sub-time period as an example, let this sub-time period be the target segment. First-order differencing is performed on the sub-data sequences of each dimension within the target segment to obtain the first-order differencing sequence for each dimension. The RMS effective value is then calculated for each dimension's first-order differencing sequence to obtain the effective value for each dimension in the target segment. Since the first-order differencing value characterizes the magnitude of change in dimensional data at adjacent time points, and the effective value is calculated based on the first-order differencing value, it can characterize the differences and fluctuations in each dimension under the drift state presented in the target segment.
[0035] Specifically, the effective value of RMS is obtained by calculating the sum of squares of each element in the first-order difference sequence, taking the mean, and then taking the square root.
[0036] Taking one dimension as an example, let this dimension be the target dimension. The difference between the effective value of the target dimension and the effective value of each dimension in the target segment is calculated in turn as the drift difference between the target dimension and that dimension. All drift differences are then summed to obtain the drift significance of the target dimension within the target segment. Since the effective value characterizes the magnitude of fluctuation, the drift difference calculated from it characterizes the magnitude of change in each dimension. The greater the magnitude of change, the greater the fluctuation difference when drift occurs, and the more significant the drift phenomenon.
[0037] Specifically, the drift significance satisfies the following relationship:
$$D_i = \sum_{j=1}^{N} \left( R_i - R_j \right)$$
In the formula, $D_i$ denotes the drift significance of the target dimension $i$ within the target segment; $N$ denotes the number of dimensions in the data segment; $R_i$ denotes the effective value of the target dimension in the target segment; and $R_j$ denotes the effective value of the $j$-th dimension in the target segment.
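The effective value and drift significance of step S2 could be computed along the following lines. The function names are hypothetical. One assumption is flagged loudly: the text speaks only of a "difference" between effective values, and this sketch takes its absolute value so that the significance is non-negative and usable as a weight; a signed difference would be a different reading.

```python
import numpy as np

def effective_values(segment: np.ndarray) -> np.ndarray:
    """RMS of the first-order difference per dimension; segment is (samples, dims)."""
    diff = np.diff(segment, axis=0)
    return np.sqrt(np.mean(diff ** 2, axis=0))

def drift_significance(rms: np.ndarray) -> np.ndarray:
    """Sum of drift differences between each dimension and every dimension."""
    # ASSUMPTION: absolute differences are used; the source says only "difference".
    return np.abs(rms[:, None] - rms[None, :]).sum(axis=1)
```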
[0038] S3: Calculate the historical correlation of the sub-time period between dimensions based on the sub-data sequences, and construct a historical correlation vector based on the historical correlation.
[0039] The partial correlation coefficients between the target segment's sub-data sequences in the target dimension and the sub-data sequences in each dimension are calculated sequentially as the historical correlation of the target segment between the target dimension and each dimension. Based on the historical correlation of the target segment between the target dimension and each dimension, a historical correlation vector of the target segment in the target dimension is constructed.
[0040] Construct a historical correlation vector of the target segment in the target dimension based on the historical correlation between the target segment and each dimension.
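[0041] Specifically, the partial correlation coefficient is obtained by calculating the prediction residuals of the two sequences, that is, what remains of each sequence after the influence of the other dimensions has been removed, and correlating those residuals, which yields a correlation coefficient value between -1 and 1.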
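One common way to compute a partial correlation from prediction residuals is to regress each of the two sequences on the remaining dimensions and correlate what is left. The sketch below follows that approach with ordinary least squares; the choice of a linear regression and the name `partial_corr` are assumptions, since the source does not state how the residuals are predicted.

```python
import numpy as np

def partial_corr(data: np.ndarray, i: int, j: int) -> float:
    """Partial correlation of columns i and j of data (samples, dims),
    controlling for all other columns."""
    n, d = data.shape
    others = [k for k in range(d) if k not in (i, j)]
    # Design matrix: intercept plus the controlling dimensions.
    Z = np.column_stack([np.ones(n)] + [data[:, k] for k in others])

    def residual(y: np.ndarray) -> np.ndarray:
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        return y - Z @ beta

    ri, rj = residual(data[:, i]), residual(data[:, j])
    denom = np.linalg.norm(ri) * np.linalg.norm(rj)
    return float(ri @ rj / denom) if denom > 0 else 0.0

def correlation_vector(data: np.ndarray, target: int) -> np.ndarray:
    """Correlation of the target dimension with each dimension."""
    return np.array([partial_corr(data, target, j) for j in range(data.shape[1])])
```

The same helper applies to a historical sub-data sequence and to a real-time data sequence, giving the historical and real-time correlation vectors respectively.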
[0042] S4: Obtain the real-time data sequence of production equipment in each dimension within the real-time time period, and calculate the real-time correlation of the real-time time period in each dimension based on the real-time data sequence of each dimension within the real-time time period.
[0043] This step is the same as the step of obtaining the historical correlation described above and is not repeated in detail here. Following the procedure of step S3, the real-time data sequences of each dimension within the real-time time period are processed in turn to obtain the real-time correlation of the real-time time period between the target dimension and each dimension. Based on this real-time correlation, a real-time correlation vector of the real-time time period in the target dimension is constructed.
[0044] S5: Calculate the structural similarity between the real-time time period and the sub-time period based on the historical correlation and the real-time correlation, and calculate the feature similarity between the real-time time period and the sub-time period based on the sub-data sequence and the real-time data sequence.
[0045] First, based on the real-time correlation vector of the real-time time period in the target dimension and the historical correlation vector of the target segment in the target dimension, the Canberra distance between the real-time time period and the target segment is calculated as the structural distance. The negative exponential of the structural distance is then taken as the structural similarity between the real-time time period and the target segment in the target dimension. The negative exponential acts as an inverse mapping: the smaller the structural distance, the greater the structural similarity.
[0046] Secondly, based on the real-time data sequence of the real-time time period in the target dimension and the sub-data sequence of the target segment in the target dimension, the Canberra distance between the real-time time period and the target segment is calculated as the feature distance. The negative exponential of the feature distance is then taken as the feature similarity between the real-time time period and the target segment in the target dimension. The negative exponential again acts as an inverse mapping: the smaller the feature distance, the greater the feature similarity.
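[0047] Specifically, the Canberra distance satisfies the following relationship:
$$d(\mathbf{x}, \mathbf{y}) = \sum_{k=1}^{n} \frac{\lvert x_k - y_k \rvert}{\lvert x_k \rvert + \lvert y_k \rvert}$$
In the formula, $d(\mathbf{x}, \mathbf{y})$ denotes the Canberra distance between vector $\mathbf{x}$ and vector $\mathbf{y}$; $n$ denotes the length of the vectors; $x_k$ denotes the $k$-th element of vector $\mathbf{x}$; and $y_k$ denotes the $k$-th element of vector $\mathbf{y}$.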
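The Canberra distance and the similarity derived from it can be illustrated as follows. The sketch reads "negative exponential" as exp(-d), assumes the two vectors have equal length, and treats terms whose denominator is zero as contributing 0, a common convention that the source does not specify. The function names are hypothetical.

```python
import numpy as np

def canberra(x, y) -> float:
    """Canberra distance between two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    # Assumption: a term with x_k = y_k = 0 contributes 0 instead of 0/0.
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return float(terms.sum())

def similarity(x, y) -> float:
    """Map a distance to a similarity in (0, 1]; smaller distance, larger similarity."""
    return float(np.exp(-canberra(x, y)))
```

Passing two correlation vectors gives the structural similarity, and passing the two data sequences of a dimension gives the feature similarity.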
[0048] S6: Based on the degree of drift significance, the structural similarity and feature similarity are fused to obtain the probability of state occurrence for the real-time time period and the sub-time period.
[0049] The fusion similarity is obtained by summing the structural similarity and the feature similarity between the real-time time period and the target segment in the target dimension. The state similarity is obtained by multiplying the fusion similarity by the drift significance of the target segment in the target dimension. The state similarities of all dimensions are then summed, and this sum is normalized to obtain the probability of state occurrence of the real-time time period relative to the target segment.
[0050] Specifically, the summation normalization is implemented by iterating through the sum of the state similarities of the real-time time period relative to any sub-time period, calculating the cumulative sum as the denominator, taking the sum of the state similarities of the real-time time period relative to the target segment as the numerator, and using the ratio as the normalized value to achieve summation normalization.
[0051] Specifically, the sum of state similarities satisfies the following relationship:
$$Z = \sum_{i=1}^{N} D_i \left( SS_i + FS_i \right)$$
In the formula, $Z$ denotes the sum of state similarities between the real-time time period and the target segment; $N$ denotes the number of dimensions in the data segment; $D_i$ denotes the drift significance of the $i$-th dimension in the target segment; $SS_i$ denotes the structural similarity between the real-time time period and the target segment in the $i$-th dimension; $FS_i$ denotes the feature similarity between the real-time time period and the target segment in the $i$-th dimension; and $SS_i + FS_i$ denotes the fusion similarity between the real-time time period and the target segment in the $i$-th dimension.
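Step S6 can be sketched as a weighted sum per sub-time period followed by summation normalization across sub-time periods. The array layout (one row per sub-time period, one column per dimension) and the name `state_probabilities` are assumptions for illustration.

```python
import numpy as np

def state_probabilities(drift_sig: np.ndarray,
                        struct_sim: np.ndarray,
                        feat_sim: np.ndarray) -> np.ndarray:
    """All inputs have shape (M sub-time periods, N dimensions).
    Returns M probabilities that sum to 1."""
    fused = struct_sim + feat_sim             # fusion similarity per dimension
    z = (drift_sig * fused).sum(axis=1)       # sum of state similarities per period
    # Summation normalization; assumes the total is non-zero.
    return z / z.sum()
```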
[0052] S7: Calculate the gradual anomaly score of the real-time time period relative to the sub-time period based on the historical correlation and the real-time correlation. Weight the gradual anomaly scores according to the probability of state occurrence to obtain the degree of gradual anomaly for the real-time time period.
[0053] First, the difference between the effective values of the real-time time period and the target segment in the target dimension is calculated to obtain the fluctuation difference between the real-time time period and the target segment in the target dimension. The product of the drift significance of the target dimension in the target segment and the fluctuation difference of the target dimension is calculated. The products are summed over all dimensions to give the waveform anomaly degree of the real-time time period relative to the target segment.
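A small sketch of the waveform anomaly degree follows. As with the drift significance, the text says only "difference"; this illustration assumes an absolute difference so that increases and decreases in fluctuation both count, and the function name is hypothetical.

```python
import numpy as np

def waveform_anomaly(drift_sig: np.ndarray,
                     rms_realtime: np.ndarray,
                     rms_segment: np.ndarray) -> float:
    """Drift-significance-weighted sum of fluctuation differences over dimensions."""
    # ASSUMPTION: absolute fluctuation difference; the source says "difference".
    fluctuation = np.abs(rms_realtime - rms_segment)
    return float(np.sum(drift_sig * fluctuation))
```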
[0054] Furthermore, the target segment is divided into multiple sub-windows according to a preset window length, and the partial correlation coefficient between the window data sequences of each dimension within each sub-window is calculated sequentially to obtain the local correlation between each dimension within the sub-window. A local correlation vector for the sub-time period is constructed based on the local correlation between each dimension within the sub-window, and the local Canberra distance between the local correlation vectors of each sub-window is calculated. The mean of all local Canberra distances is taken to obtain the correlation distance of the target segment.
[0055] Specifically, the preset window length is typically determined by those skilled in the art through traversal experiments on the abnormal data sequences of the various dimensions within a known abnormal time period. For each candidate window length, the standard deviation of the Canberra distances between pairs of sub-windows is calculated, and the window length with the largest standard deviation is selected as the preset value. A larger standard deviation indicates a greater difference in correlation structure between different locations within the abnormal time period, meaning that changes in the correlation structure are identified more clearly.
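The selection rule above can be illustrated with a short sketch. It assumes that the local correlation vectors for each candidate window length have already been computed, that distances are taken between every pair of sub-windows, and that the population standard deviation is used; the text does not specify these details. The name `select_window_length` and the input mapping are hypothetical.

```python
from itertools import combinations
from statistics import pstdev


def canberra(u, v):
    """Canberra distance; terms whose denominator is zero are skipped."""
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(u, v) if abs(a) + abs(b) > 0)


def select_window_length(vectors_by_length):
    """Pick the window length whose pairwise Canberra distances vary the most.

    vectors_by_length maps a candidate window length to the list of local
    correlation vectors of its sub-windows within a known abnormal period.
    """
    def spread(vectors):
        distances = [canberra(u, v) for u, v in combinations(vectors, 2)]
        return pstdev(distances)

    return max(vectors_by_length, key=lambda w: spread(vectors_by_length[w]))


candidates = {
    4: [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],  # identical structure everywhere
    8: [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]],  # structure differs by location
}
print(select_window_length(candidates))
```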
[0056] Secondly, the correlation distance of the real-time time period is obtained by the same method, and the absolute difference between the correlation distance of the real-time time period and the correlation distance of the target segment is calculated to obtain the correlation anomaly degree of the real-time time period relative to the target segment.
[0057] The waveform anomaly degree and the correlation anomaly degree of the real-time time period relative to the target segment are summed to obtain the gradual anomaly score of the real-time time period relative to the target segment. The gradual anomaly score is then multiplied by the probability of state occurrence of the real-time time period relative to the target segment. Finally, the products for all sub-time periods are accumulated to obtain the degree of gradual anomaly of the real-time time period.
[0058] Specifically, the degree of gradual anomaly satisfies the following relationship: $$G = \sum_{k=1}^{K} P_k \, Q_k$$ In the formula, $G$ indicates the degree of gradual anomaly of the real-time time period; $K$ indicates the number of sub-time periods; $P_k$ indicates the probability of state occurrence of the real-time time period relative to the $k$-th sub-time period; and $Q_k$ indicates the gradual anomaly score of the real-time time period relative to the $k$-th sub-time period.
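The scoring and weighting steps of S7 can be sketched as follows. The inputs (waveform anomaly degrees, correlation distances, and state occurrence probabilities) are assumed to have been computed by the earlier steps, and the function name `gradual_anomaly_degree` is hypothetical.

```python
def gradual_anomaly_degree(waveform_anomalies, segment_corr_distances,
                           realtime_corr_distance, state_probabilities):
    """Probability-weighted sum of gradual anomaly scores over sub-time periods.

    For each sub-time period: the correlation anomaly degree is the absolute
    difference of correlation distances, the gradual anomaly score is the
    waveform anomaly degree plus the correlation anomaly degree, and the score
    is weighted by the probability of state occurrence.
    """
    degree = 0.0
    for wave, seg_dist, prob in zip(waveform_anomalies,
                                    segment_corr_distances,
                                    state_probabilities):
        correlation_anomaly = abs(realtime_corr_distance - seg_dist)
        score = wave + correlation_anomaly
        degree += prob * score
    return degree


# Two historical sub-time periods; all values are illustrative only.
print(gradual_anomaly_degree(
    waveform_anomalies=[0.5, 1.0],
    segment_corr_distances=[0.2, 0.6],
    realtime_corr_distance=0.4,
    state_probabilities=[0.25, 0.75],
))
```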
[0059] S8: Input the degree of gradual anomaly of the real-time time period into the central system to determine whether a gradual anomaly exists, completing the real-time processing of the industrial manufacturing data stream for gradual anomaly analysis.
[0060] The degree of gradual anomaly of the real-time time period is Min-Max standardized to map it to the range 0 to 1, and the standardized degree is then input into the central system. When the standardized degree of gradual anomaly is greater than the preset threshold of 0.75, the real-time time period is marked as an abnormal segment and relevant personnel are assigned to inspect the equipment, completing the detection and handling of abnormal data in the industrial manufacturing data stream.
[0061] Specifically, the preset threshold is set by having relevant personnel on the production line perform traversal tests on multiple known abnormal data segments and normal data segments to obtain the degree of gradual anomaly of the known abnormal data segments and the degree of gradual anomaly of the normal data segments. The midpoint between the average degree of gradual anomaly of the abnormal data segments and the average degree of gradual anomaly of the normal data segments is used as the preset threshold. Relevant personnel can adjust this threshold within a fluctuation range according to the production content and intensity of the production line.
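One way to read this calibration is as the midpoint of two class averages. The sketch below assumes the degrees of gradual anomaly have already been standardized to the range 0 to 1 and that a list of degrees from normal segments is available alongside the abnormal ones; the name `calibrate_threshold` and the sample values are hypothetical.

```python
from statistics import mean


def calibrate_threshold(abnormal_degrees, normal_degrees):
    """Midpoint between the average degree of abnormal and of normal segments."""
    return (mean(abnormal_degrees) + mean(normal_degrees)) / 2


# Illustrative standardized degrees from labeled test segments.
abnormal = [0.9, 1.0, 0.8]
normal = [0.5, 0.6, 0.7]
print(calibrate_threshold(abnormal, normal))
```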
[0062] Specifically, the Min-Max standardization process involves calculating the difference between the value to be standardized and the minimum calculated value of the gradual anomaly level as the numerator, and the difference between the maximum and minimum calculated values of the gradual anomaly level as the denominator. The ratio of these two values is then used as the standardized value. Furthermore, the real-time processing effect of this invention is manifested in anomaly detection within computer data processing.
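The standardization and threshold check can be sketched as follows. It assumes the minimum and maximum are taken over a collection of previously calculated degrees of gradual anomaly, and it returns 0 when all values are equal, a case the text does not address. The function names are hypothetical.

```python
def min_max(value, lowest, highest):
    """(value - min) / (max - min); returns 0.0 if max equals min (assumption)."""
    if highest == lowest:
        return 0.0
    return (value - lowest) / (highest - lowest)


def is_gradual_anomaly(degree, calculated_degrees, threshold=0.75):
    """Flag the real-time time period when the standardized degree exceeds the threshold."""
    lowest, highest = min(calculated_degrees), max(calculated_degrees)
    return min_max(degree, lowest, highest) > threshold


history = [2.0, 4.0, 6.0]
print(min_max(5.0, 2.0, 6.0))
print(is_gradual_anomaly(5.0, history))
print(is_gradual_anomaly(5.5, history))
```

Because the comparison is strict, a standardized degree exactly equal to the threshold is not flagged.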
[0063] In summary, the technical solution of this embodiment can dynamically determine the contribution weight of each dimension to anomaly detection based on the changes in drift intensity of each dimension in different drift states. Compared with the prior art, it can adapt anomaly data detection to the complex drift state scenarios of multiple working conditions in industrial manufacturing, thereby improving the sensitivity of anomaly data detection and real-time processing of anomaly data in industrial manufacturing data streams.
[0064] This invention also provides an AI-driven real-time data stream processing system for industrial manufacturing. As shown in Figure 2, the system includes a processor and a memory. The memory stores computer program instructions which, when executed by the processor, implement the AI-driven real-time processing method for industrial manufacturing data streams according to the first aspect of the present invention. The system also includes other components well known to those skilled in the art, such as a communication bus and a communication interface, whose configuration and functions are known in the art and are not described further here.
[0065] It should be noted that those skilled in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of this invention. Therefore, the scope of protection of this patent should be determined by the appended claims.
Claims
1. An AI-driven real-time data stream processing method for industrial manufacturing, characterized in that it comprises: obtaining historical data sequences of production equipment in each dimension within a set historical time period, dividing the historical time period into multiple sub-time periods, and obtaining sub-data sequences of each sub-time period in each dimension; obtaining the effective value of each sub-time period in each dimension based on the sub-data sequences, and calculating the drift significance of each sub-time period in each dimension based on the effective values; calculating the historical correlation between the dimensions of each sub-time period based on the sub-data sequences; obtaining real-time data sequences of the production equipment in each dimension within a set real-time time period, and calculating the real-time correlation between the real-time data sequences; calculating the structural similarity between the real-time time period and each sub-time period based on the historical correlation and the real-time correlation, and calculating the feature similarity between the real-time time period and each sub-time period based on the sub-data sequences and the real-time data sequences; fusing the structural similarity and the feature similarity according to the drift significance to obtain the probability of state occurrence of the real-time time period relative to each sub-time period; calculating the gradual anomaly score of the real-time time period relative to each sub-time period based on the historical correlation and the real-time correlation; weighting the gradual anomaly scores by the probability of state occurrence to obtain the degree of gradual anomaly of the real-time time period; and, if the degree of gradual anomaly is greater than a preset threshold, determining that a gradual anomaly exists in the real-time time period.
2. The AI-driven real-time processing method for industrial manufacturing data streams according to claim 1, characterized in that dividing the historical time period into multiple sub-time periods comprises: performing a Fourier transform on the historical data sequence of each dimension to obtain its spectrum; calculating the cumulative sum of the amplitudes of the dimensions at the same frequency and filtering the frequencies based on the cumulative sums; and using the reciprocal of the filtered frequency as the length to divide the historical time period into multiple sub-time periods of equal length.
3. The AI-driven real-time processing method for industrial manufacturing data streams according to claim 1, characterized in that determining the drift significance of each sub-time period in each dimension comprises: taking any sub-time period as the target segment, performing differential processing on the sub-data sequence of each dimension in the target segment, and calculating the effective value of each dimension in the target segment based on the differential processing results; and taking any dimension as the target dimension, calculating the drift differences between the effective value of the target dimension and the effective values of each dimension in the target segment, and fusing the drift differences to obtain the drift significance of each dimension in the target segment.
4. The AI-driven real-time processing method for industrial manufacturing data streams according to claim 1, characterized in that obtaining the historical correlation and the real-time correlation comprises: obtaining the historical correlation between any two dimensions of the target segment based on the partial correlation coefficients between the sub-data sequences of the dimensions in the target segment; and obtaining the real-time correlation between any two dimensions of the real-time time period based on the partial correlation coefficients between the real-time data sequences of the dimensions in the real-time time period.
5. The AI-driven real-time processing method for industrial manufacturing data streams according to claim 1, characterized in that obtaining the structural similarity between the real-time time period and the sub-time period comprises: constructing a historical correlation vector of the target segment in the target dimension based on the historical correlation between the target dimension and each dimension in the target segment, and constructing a real-time correlation vector of the real-time time period in the target dimension based on the real-time correlation between the target dimension and each dimension in the real-time time period; and calculating a first Canberra distance based on the historical correlation vector and the real-time correlation vector to obtain the structural similarity between the real-time time period and the target segment.
6. The AI-driven real-time processing method for industrial manufacturing data streams according to claim 1, characterized in that obtaining the feature similarity between the real-time time period and the sub-time period comprises: calculating the Canberra distance between the real-time data sequence of the target dimension within the real-time time period and the sub-data sequence of the target dimension within the target segment to obtain the feature similarity between the real-time time period and the target segment in the target dimension.
7. The AI-driven real-time processing method for industrial manufacturing data streams according to claim 1, characterized in that obtaining the probability of state occurrence of the real-time time period relative to the sub-time period comprises: fusing the structural similarity and the feature similarity between the real-time time period and the target segment in the target dimension to obtain the fusion similarity; weighting the fusion similarity by the drift significance of the target segment in the target dimension to obtain the state similarity; and fusing the state similarities of the real-time time period and the target segment in each dimension to obtain the probability of state occurrence of the real-time time period relative to the target segment.
8. The AI-driven real-time processing method for industrial manufacturing data streams according to claim 1, characterized in that obtaining the gradual anomaly score of the real-time time period relative to the sub-time period comprises: calculating the fluctuation difference based on the effective value of the real-time time period in the target dimension and the effective value of the target segment in the target dimension; fusing the fluctuation differences of the real-time time period and the target segment in each dimension to obtain the waveform anomaly degree of the real-time time period relative to the target segment; dividing the real-time time period into multiple sub-windows according to a preset window length to obtain the local correlation between the dimensions in each sub-window; constructing a local correlation vector of each sub-window based on the local correlation between the dimensions in the sub-window, calculating the Canberra distance between the local correlation vectors of the sub-windows, and fusing the Canberra distances to obtain the correlation anomaly degree; and fusing the waveform anomaly degree and the correlation anomaly degree to obtain the gradual anomaly score of the real-time time period relative to the target segment.
9. The AI-driven real-time processing method for industrial manufacturing data streams according to claim 1, characterized in that obtaining the degree of gradual anomaly of the real-time time period comprises: weighting the gradual anomaly scores by the probability of state occurrence, and fusing the weighted results to obtain the degree of gradual anomaly of the real-time time period.
10. An AI-driven real-time data stream processing system for industrial manufacturing, characterized in that it comprises: a processor and a memory, the memory storing computer program instructions that, when executed by the processor, implement the steps of the AI-driven real-time processing method for industrial manufacturing data streams according to any one of claims 1-9.