A time series data monitoring method and related apparatus

By sampling, grouping, and processing time-series data in a cloud-native environment using predictive models, the problem of high false alarm rates in existing time-series data monitoring has been solved, achieving more reliable time-series data monitoring.

CN122285437A · Pending · Publication Date: 2026-06-26 · BEIJING QIYI CENTURY SCI & TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD
Filing Date: 2026-04-02
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing time-series data monitoring methods in cloud-native environments have high false alarm rates and cannot meet the requirements for high reliability. This is mainly because fixed thresholds cannot adapt to fluctuations in business traffic and historical percentiles ignore the trends and seasonality of time-series data.

Method used

After obtaining time-series data from monitoring data sources, the data is sampled, grouped, and sorted. A pre-trained time-series data prediction model is then used to make predictions, generating accurate time-series data monitoring results. This includes data cleaning, sorting of autocorrelation function values, and comparison of expected confidence intervals to generate alarm information.

Benefits of technology

It reduces the false alarm rate, improves the reliability and accuracy of time-series data monitoring, better adapts to changes in business traffic, and reduces anomaly detection latency.


[Figure CN122285437A_ABST]

Abstract

This application discloses a time-series data monitoring method and related apparatus, relating to the field of cloud-native technology. After acquiring the time-series data to be processed, this application performs equidistant time-series processing, groups the data according to a preset grouping interval, and sorts multiple equidistant time-series data pairs within each group according to their correlation from high to low, obtaining more accurate time-series data to be predicted. This data is then input into a pre-trained time-series data prediction model to obtain more accurate time-series data prediction results. Finally, the actual time-series data at each predicted time point in the prediction results is compared with the prediction results to obtain the time-series data monitoring results for the processed time-series data. The time-series data monitoring results generated by this application based on accurate time-series data prediction results have a low false alarm rate and high reliability.

Description

Technical Field

[0001] This application relates to the field of cloud-native technology, and in particular to a time-series data monitoring method and related apparatus.

Background Technology

[0002] With the rapid development of cloud-native technologies, container orchestration platforms represented by the open-source Kubernetes, together with monitoring systems such as Prometheus, an open-source monitoring and alerting system built on a time-series database, have become core infrastructure of modern distributed systems.

[0003] In cloud-native environments, monitoring time-series data is highly complex, with monitoring metrics exhibiting characteristics of high dimensionality, dynamism, and massive volume. A single cluster can generate terabytes of time-series data daily. Monitoring systems typically use alarm mechanisms based on set thresholds or historical percentiles to alert on abnormal data.

[0004] As business scenarios become increasingly complex, current time-series data monitoring methods have a high false alarm rate, and their reliability cannot meet higher requirements.

[0005] In conclusion, how to provide a highly reliable time-series data monitoring method has become an urgent problem to be solved.

Summary of the Invention

[0006] In view of the above problems, this application provides a time-series data monitoring method and related apparatus to achieve the purpose of accurately monitoring time-series data in a cloud-native environment. The specific solution is as follows:

[0007] The first aspect of this application provides a time-series data monitoring method, including:

[0008] In response to the arrival of the preset time series data monitoring time, the time series data to be processed is obtained from the monitoring data source; the monitoring data source is the data terminal that generates the time series data; the time series data to be processed is the target monitoring time series data.

[0009] The time series data to be processed is sampled according to a preset sampling interval to obtain an equidistant time series dataset.

[0010] The equidistant time series dataset is grouped at a preset grouping interval to obtain equidistant time series data groups; each equidistant time series data group includes multiple equidistant time series data pairs arranged in descending order of correlation.

[0011] Arrange each equidistant time series data group in order from nearest to farthest sampling time corresponding to the preset sampling interval to obtain the time series data to be predicted;

[0012] The time series data to be predicted is input into a pre-trained time series data prediction model to obtain the time series data prediction results;

[0013] Obtain the actual time series data at each predicted time point in the time series data prediction results, and compare the actual time series data with the time series data prediction results to obtain the time series data monitoring results of the time series data to be processed.

[0014] In one possible implementation, the time-series data to be processed is sampled at a preset sampling interval to obtain an equidistant time-series dataset, including:

[0015] Identify and remove invalid values from the time series data to be processed to obtain valid time series data;

[0016] Identify and remove outliers from the valid time series data to obtain smoothed time series data;

[0017] Smooth time series data are sampled according to a preset sampling interval to obtain an equidistant time series dataset.

[0018] In one possible implementation, the time-series data to be processed is sampled at a preset sampling interval to obtain an equidistant time-series dataset, including:

[0019] Identify and remove invalid values from the time series data to be processed to obtain valid time series data;

[0020] The effective time series data is sampled according to the preset sampling interval to obtain the equidistant time series data to be cleaned;

[0021] Identify and remove outliers from the equidistant time series data to be cleaned to obtain the equidistant time series dataset.

[0022] In one possible implementation, the equidistant time-series dataset is grouped at a preset grouping interval to obtain equidistant time-series data groups, including:

[0023] The time interval for each group is determined based on the preset grouping interval;

[0024] The time interval to which the equidistant time series data belongs is determined according to the sampling time of each equidistant time series data in the equidistant time series dataset, thus obtaining each equidistant time series data group;

[0025] The equidistant time series data in the equidistant time series data group are combined in pairs to obtain each equidistant time series data pair;

[0026] Calculate the autocorrelation function values used to characterize the degree of correlation between each pair of equidistant time series data to obtain the set of autocorrelation function values;

[0027] The autocorrelation function values in the set of autocorrelation function values are sorted from largest to smallest to obtain the autocorrelation function value sequence;

[0028] Arrange the equidistant time series data pairs in descending order of their autocorrelation function values to obtain the reordered equidistant time series data groups.

[0029] In one possible implementation, the time-series data monitoring results include alarm information. The actual time-series data is compared with the predicted time-series data to obtain the time-series data monitoring results for the data to be processed, including:

[0030] The expected confidence interval for each time series data to be processed is determined based on the upper and lower bounds of the prediction results for each time series data to be processed.

[0031] Compare the actual time series data with the expected confidence intervals of the time series data to be processed at the same prediction time point;

[0032] Determine whether the actual time series data exceeds the expected confidence interval. When the actual time series data exceeds the expected confidence interval, generate an alarm message.

[0033] In one possible implementation, before inputting the time series data to be predicted into a pre-trained time series data prediction model, the following steps are also included:

[0034] A fixed time window for the time series data to be predicted is determined according to a preset downsampling time interval;

[0035] The time series data to be predicted is grouped according to a fixed time window to obtain regrouped equidistant time series data groups. The adjusted time series data to be predicted is composed of the regrouped equidistant time series data groups.

[0036] A second aspect of this application provides a time-series data monitoring device, comprising:

[0037] The acquisition unit is used to acquire time-series data to be processed from the monitoring data source in response to the arrival of a preset time-series data monitoring time; the monitoring data source is the data terminal that generates time-series data; the time-series data to be processed is the target monitoring time-series data.

[0038] The sampling unit is used to sample the time series data to be processed according to a preset sampling interval to obtain an equidistant time series dataset.

[0039] A grouping unit is used to group the equidistant time series dataset at a preset grouping interval to obtain equidistant time series data groups. Each equidistant time series data group includes multiple equidistant time series data pairs arranged in descending order of correlation.

[0040] The sorting unit is used to arrange each equidistant time series data group in order from the nearest to the farthest sampling time corresponding to the preset sampling interval, so as to obtain the time series data to be predicted;

[0041] The prediction unit is used to input the time series data to be predicted into a pre-trained time series data prediction model to obtain the time series data prediction result;

[0042] The comparison unit is used to acquire the real time series data at each prediction time point in the time series data prediction results, and compare the real time series data with the time series data prediction results to obtain the time series data monitoring results of the time series data to be processed.

[0043] A third aspect of this application provides a time-series data monitoring device, comprising at least one processor and a memory connected to the processor, wherein:

[0044] Memory is used to store computer programs;

[0045] The processor is used to execute computer programs to enable the time-series data monitoring device to implement the time-series data monitoring method of the first aspect or any implementation thereof.

[0046] The fourth aspect of this application provides a computer program product including computer-readable instructions that, when executed on an electronic device, cause the electronic device to implement the time-series data monitoring method of the first aspect or any implementation thereof.

[0047] The fifth aspect of this application provides a computer storage medium carrying one or more computer programs, which, when executed by an electronic device, enable the electronic device to implement the time-series data monitoring method described in the first aspect or any implementation thereof.

[0048] Using the above technical solution, the time series data monitoring method and related apparatus provided in this application obtain the time series data to be processed from the monitoring data source and sample it according to a preset sampling interval to obtain an equidistant time series dataset. Then, the equidistant time series dataset is grouped according to a preset grouping interval to obtain each equidistant time series data group. Multiple equidistant time series data pairs in each equidistant time series data group are sorted from high to low according to their correlation. Then, the time series data to be predicted is input into a pre-trained time series data prediction model to obtain a more accurate time series data prediction result. The actual time series data at each predicted time point in the time series data prediction result is compared with the time series data prediction result to obtain the time series data monitoring result of the time series data to be processed.

[0049] The time-series data monitoring results generated by this application based on accurate time-series data prediction results have a low false alarm rate and high reliability.

Attached Figure Description

[0050] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic, and the components and elements are not necessarily drawn to scale.

[0051] Figure 1 is a flowchart of the time-series data monitoring method provided in this application;

[0052] Figure 2 is a flowchart of the time-series data monitoring method provided in this application;

[0053] Figure 3 is a schematic diagram of a time-series data monitoring device provided in this application;

[0054] Figure 4 is a schematic diagram of the time-series data monitoring device provided in this application.

Detailed Implementation

[0055] The embodiments of this application are described below with reference to the accompanying drawings. The terminology used in the implementation section of this application is for explaining specific embodiments only and is not intended to limit the scope of this application.

[0056] Those skilled in the art will recognize that, with technological advancements and the emergence of new scenarios, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.

[0057] The terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such terms are interchangeable where appropriate; this is merely a way of distinguishing objects with the same attributes in the embodiments of this application. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion, so that a process, method, system, product, or apparatus that comprises a series of elements is not necessarily limited to those elements, but may include other elements not explicitly listed or inherent to those processes, methods, products, or apparatuses.

[0058] In cloud-native environments, time-series data monitoring metrics, such as CPU utilization, request latency, and memory usage, exhibit dynamic changes, with a single cluster generating up to terabytes of time-series data per day.

[0059] When dealing with time-series data in complex business scenarios, existing monitoring systems use alarm mechanisms based on fixed thresholds or historical percentiles. This approach has several problems:

[0060] Fixed thresholds cannot adapt to the periodic fluctuations in business traffic, resulting in a high false alarm rate; historical quantile methods ignore the trends and seasonality of time-series data, leading to delays in anomaly detection.

[0061] Taking a fixed threshold alarm mechanism as an example, the time-series data samples at different time points have the same expected value, which leads to alarm delays, false alarms, or missed alarms.

[0062] Therefore, how to fully utilize the massive amounts of collected and stored monitoring metrics data to predict the current expected value of time-series data in order to achieve accurate alerts for time-series data under abnormal service conditions is a key issue that urgently needs to be addressed in cloud-native environments.

[0063] To address the aforementioned issues, this application provides a time-series data monitoring method and related apparatus.

[0064] Optionally, referring to Figure 1, this application provides a flowchart of a time-series data monitoring method.

[0065] As shown in Figure 1, the time-series data monitoring method includes the following steps:

[0066] Step 101: In response to the arrival of the preset time series data monitoring time, obtain the time series data to be processed from the monitoring data source; the monitoring data source is the data terminal that generates the time series data; the time series data to be processed is the target monitoring time series data;

[0067] It should be noted that the preset time series data monitoring time is a pre-set time point or time period used to trigger the time series data acquisition operation. The monitoring data sources include: the Prometheus client library; real-time running data of underlying resources such as physical machines, virtual machines, and containers; running status data of middleware such as databases, message queues, and caches; status and event data of K8s resource objects; short-term task data actively reported through the Prometheus monitoring system interface; and time series databases such as VictoriaMetrics and openGemini.

[0068] Optionally, based on the pre-configured data acquisition time point, the target monitoring time series data, i.e. the time series data to be processed, is obtained from the data terminal that generates the time series data.

[0069] Step 102: Sample the time series data to be processed according to the preset sampling interval to obtain an equidistant time series dataset.

[0070] It should be noted that equidistant time series data refers to time series data in which the time interval between any two adjacent data points is the same. In this application, equidistant time series data is obtained by sampling the time series data to be processed according to a preset sampling interval. The equidistant time series data obtained in this step conforms to the data format required by the pre-trained time series data prediction model.

[0071] There are two ways to implement this step:

[0072] The first implementation method is to identify and remove invalid values from the time series data to be processed to obtain valid time series data, then identify and remove outliers from the valid time series data to obtain smoothed time series data, and finally sample the smoothed time series data according to the preset sampling interval to obtain an equidistant time series dataset.

[0073] Specifically, this implementation cleans first and samples afterwards. First, the most obvious invalid values, such as null values, incorrectly formatted values, and values that clearly exceed physical or logical ranges, are removed from the time series data to be processed. This ensures that the data used in later steps is basically valid. Next, outliers are removed. Because the complete original timestamps and data sequence are still available, algorithms can detect and remove outliers, which may be genuine sudden fault signals or noise introduced during acquisition or transmission. Finally, the cleaned, outlier-free smoothed time series data is sampled at intervals, such as every 5 seconds or every 10 points. This step resolves any unequal intervals in the original data and yields a standardized, equidistant time series dataset that is easy to analyze in subsequent steps.
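The clean-then-sample order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the valid range of 0 to 100, the median/MAD outlier rule, and the "last value per bucket" sampling rule are all hypothetical choices, since the application does not name specific algorithms.

```python
import math
from statistics import median


def remove_invalid(points, lo=0.0, hi=100.0):
    """Drop null, non-numeric, NaN, and out-of-range values.

    points is a list of (timestamp, value); lo/hi are an assumed valid range.
    """
    valid = []
    for ts, v in points:
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            continue  # null or wrongly formatted value
        if math.isnan(v) or not (lo <= v <= hi):
            continue  # outside the physical or logical range
        valid.append((ts, float(v)))
    return valid


def remove_outliers(points, k=3.5):
    """Drop points whose modified z-score (median/MAD based) exceeds k."""
    if not points:
        return []
    values = [v for _, v in points]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(points)  # no spread, nothing to flag
    return [(ts, v) for ts, v in points if 0.6745 * abs(v - med) / mad <= k]


def sample_equidistant(points, interval):
    """Keep the last observation in each interval-wide bucket,
    stamped with the bucket start time."""
    buckets = {}
    for ts, v in sorted(points):
        buckets[ts - ts % interval] = v
    return sorted(buckets.items())


def clean_then_sample(points, interval):
    """First implementation: invalid values, then outliers, then sampling."""
    return sample_equidistant(remove_outliers(remove_invalid(points)), interval)
```

Buckets that receive no data are simply absent from the result in this sketch; a real pipeline would still have to decide how to fill such gaps so that the output is strictly equidistant.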

[0074] The second implementation method is as follows: identify and remove invalid values from the time series data to be processed to obtain valid time series data, then sample the valid time series data according to the preset sampling interval to obtain the equidistant time series data to be cleaned, identify and remove outliers in the equidistant time series data to be cleaned, and obtain the equidistant time series dataset.

[0075] Specifically, this implementation samples first and cleans afterwards. First, the most obvious invalid values, such as null values, incorrectly formatted values, and values that clearly exceed physical or logical ranges, are removed from the time series data to be processed. This ensures that the data used in later steps is basically valid. Then, the valid time series data obtained after removing invalid values is sampled at equal intervals to obtain the equidistant time series data to be cleaned. At this point, the data is already equidistant but may still contain outliers. Finally, outlier detection and removal are performed on the equidistant time series data to be cleaned. During this process, the time dimension of the data is uniform, so the anomaly detection algorithm can evaluate the continuity and smoothness of the data based on fixed time intervals.

[0076] In summary, the core logic of the first implementation method, which involves cleaning before sampling, is to perform quality screening at the highest information resolution. The input data for anomaly detection is the original, high-resolution, and potentially unequal-distance time-series data sequence. By using the original high-resolution data, it is possible to capture brief and sharp instantaneous anomalies, avoiding missed detections caused by averaging out anomalies during sampling. This makes anomaly identification more sensitive and accurate.

[0077] The second implementation, sampling followed by cleaning, focuses on precise correction within the target data structure. The input data for anomaly detection is a sampled, equidistant sequence with potentially reduced resolution. Sampling significantly reduces the data volume, allowing subsequent computationally intensive anomaly detection algorithms to run faster and with less resource consumption, resulting in higher computational efficiency and a more stable process. Standardizing the data to equidistant intervals prevents misjudgments by some anomaly detection algorithms due to uneven timestamps in the original data, effectively avoiding false anomalies introduced by sampling.

[0078] Step 103: Group the equidistant time series datasets at a preset grouping interval to obtain each equidistant time series data group; the equidistant time series data group includes multiple equidistant time series data pairs arranged in descending order of correlation.

[0079] It should be noted that the preset grouping interval can be half a day, a day, or several hours.

[0080] In a specific embodiment of this application, the autocorrelation function value can be used to characterize the correlation between equidistant time series data. For equidistant time series data, the autocorrelation function value characterizes the strength of correlation at different time lags. When its absolute value is close to 1, the data at the current time and at the lagged time are strongly related: a value close to 1 indicates that they are highly similar, with a strong trend or periodicity, while a value close to -1 indicates that they fluctuate inversely. When its absolute value is close to 0, the data at the current time and at the lagged time are essentially uncorrelated. Intermediate values indicate a moderate degree of correlation.
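For reference, one common way to compute such a value is the standard sample autocorrelation at lag k. The application does not state which estimator it uses, so the formula below is an assumption; x_1, ..., x_n denote the equidistant values in one group and \bar{x} their mean.

```latex
r_k = \frac{\sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x})}
           {\sum_{t=1}^{n} (x_t - \bar{x})^2},
\qquad \bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t,
\qquad -1 \le r_k \le 1 .
```

With this estimator, r_0 = 1, and a series that repeats with period p tends to give a large positive r_p.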

[0081] Optionally, this step can be implemented as follows:

[0082] 1> Determine the time interval for each group based on the preset grouping interval.

[0083] First, set a fixed time interval as the grouping interval, such as 1 hour or 1 day.

[0084] Based on this grouping interval, the entire equidistant time series dataset is divided into multiple time periods, with each time period being a grouped time interval.

[0085] 2> Determine the grouping time interval to which the equidistant time series data belongs based on the sampling time of each equidistant time series data in the equidistant time series dataset, and obtain each equidistant time series data group.

[0086] Each equidistant time series data point in the equidistant time series dataset is categorized into its corresponding time interval, forming equidistant time series data groups grouped according to time windows.

[0087] Specifically, the data can be allocated based on the sampling timestamp of each equidistant time series data point. All equidistant time series data points falling into the same time interval constitute an equidistant time series data group.

[0088] 3> Combine each equidistant time series data in the equidistant time series data group in pairs to obtain each equidistant time series data pair.

[0089] This step prepares for calculating the temporal correlation between any two equally spaced time series data points within each equally spaced time series data group.

[0090] Specifically, within an equidistant time series data group, any two equidistant time series data are combined to obtain all possible equidistant time series data pairs.

[0091] 4> Calculate the autocorrelation function values used to characterize the degree of correlation between each equidistant time series data pair, and obtain the set of autocorrelation function values.

[0092] The autocorrelation function of each equidistant time series data pair is calculated to quantify the degree of linear correlation between each pair of equidistant time series data. The higher the autocorrelation function value, the more similar the two equidistant time series data in terms of change patterns.

[0093] The results calculated for all equidistant time series data pairs in each equidistant time series data group constitute the set of autocorrelation function values.

[0094] 5> Sort the autocorrelation function values in the set of autocorrelation function values from largest to smallest to obtain the autocorrelation function value sequence.

[0095] Sort all the autocorrelation function values in this set in descending order to obtain a sequence of values from most correlated to least correlated, i.e., the autocorrelation function value sequence.

[0096] 6> Arrange the equidistant time series data pairs in descending order of their autocorrelation function values to obtain the reordered equidistant time series data groups.

[0097] Finally, the equidistant time series data groups are obtained by reordering them according to the inherent correlation between the equidistant time series data.

[0098] In this step, the time series data in each time series data group are arranged in descending order of autocorrelation function values. The time series data with strong correlations are placed first, which provides structured data support for subsequent time series data prediction and can improve the prediction efficiency of subsequent time series data prediction models.

[0099] In summary, this step is a process of intelligently grouping and internally sorting equidistant time series data based on time autocorrelation. The purpose is to discover and strengthen the temporal correlation between equidistant time series data within the same equidistant time series data group, and to obtain each equidistant time series data group after reordering.
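Sub-steps 1> to 6> can be sketched as follows. The sketch makes one interpretive assumption that the application leaves open: the autocorrelation value of a pair is taken to be the group's sample autocorrelation at the lag separating the two points, which in turn assumes the group has no missing samples. All function names are hypothetical.

```python
from itertools import combinations


def group_by_interval(points, group_interval):
    """Sub-steps 1> and 2>: assign each (timestamp, value) to the
    grouping interval that contains its sampling time."""
    groups = {}
    for ts, v in points:
        groups.setdefault(ts - ts % group_interval, []).append((ts, v))
    return {start: sorted(members) for start, members in sorted(groups.items())}


def acf(values, lag):
    """Sample autocorrelation of values at the given lag (0.0 if undefined)."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0 or lag >= n:
        return 0.0
    num = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return num / denom


def rank_pairs(group, sampling_interval):
    """Sub-steps 3> to 6>: build all pairs, score each pair with the
    autocorrelation at its lag, and sort from largest to smallest."""
    values = [v for _, v in group]
    scored = []
    for (ts_a, va), (ts_b, vb) in combinations(group, 2):
        lag = (ts_b - ts_a) // sampling_interval  # assumes a gap-free group
        scored.append((acf(values, lag), (ts_a, va), (ts_b, vb)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Example: one group of four samples taken one second apart.
group = group_by_interval([(0, 1.0), (1, -1.0), (2, 1.0), (3, -1.0)], 4)[0]
ranked = rank_pairs(group, sampling_interval=1)
```

Because the number of pairs grows quadratically with the group size, a sketch like this is only practical for modest group sizes; caching the autocorrelation per lag would avoid recomputing it for pairs that share the same lag.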

[0100] Step 104: Arrange each equidistant time series data group in order from nearest to farthest sampling time corresponding to the preset sampling interval to obtain the time series data to be predicted.

[0101] Optionally, the equidistant time series data groups can be sorted from the nearest to the farthest sampling time to obtain the time series data to be predicted.

[0102] It should be noted that similar equidistant time series data and equidistant time series data groups whose data generation time is close to the current time have a significant impact on model prediction. Therefore, the grouping in the previous step and the order reordering in this step can improve the running performance and prediction effect of the time series data prediction model.
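The ordering in Step 104 amounts to placing the most recent group first while keeping each group's internal order. A minimal sketch, assuming groups are stored in a dictionary keyed by group start time (a hypothetical layout, not one specified by the application):

```python
def order_groups_recent_first(groups):
    """Concatenate groups from the most recent start time to the oldest,
    preserving the order of items inside each group."""
    ordered = []
    for start in sorted(groups, reverse=True):
        ordered.extend(groups[start])
    return ordered


# Example with three half-day groups keyed by start time in seconds.
to_predict = order_groups_recent_first({0: ["g0"], 86400: ["g2"], 43200: ["g1"]})
```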

[0103] Step 105: Input the time series data to be predicted into the pre-trained time series data prediction model to obtain the time series data prediction result.

[0104] It should be noted that the pre-trained time series data prediction model used in this application can be SARIMA (Seasonal Autoregressive Integrated Moving Average), which is a time series data prediction model. This step uses this model to predict time series data in the cloud-native environment.

[0105] Before using this model for time series data prediction, the regression parameters in the model need to be adjusted. Specifically, the seasonal and non-seasonal parameters of the SARIMA model can be adjusted. After adjusting the regression parameters of the model based on experiments or other methods, the model can be used for time series data prediction.

[0106] Optionally, the time series data prediction model can be invoked to perform regression prediction on the time series data to be predicted, and the prediction results of the time series data for the preset prediction period can be obtained.

[0107] The time series data prediction results include the predicted values of the time series data within a preset time period and the expected confidence intervals corresponding to the predicted values.
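To make the shape of such a prediction result concrete, the sketch below returns a predicted value with lower and upper bounds for each future step. It does not implement SARIMA: a much simpler seasonal-naive forecaster stands in for the model, and the function name, the interval rule (normal approximation on one-season-ahead errors), and the default z = 1.96 are all assumptions made for illustration.

```python
from statistics import pstdev


def seasonal_naive_forecast(history, season, steps, z=1.96):
    """Stand-in for the prediction model (NOT SARIMA).

    Each future value is predicted as the value observed one season
    earlier; the interval half-width is z times the spread of the
    in-sample one-season-ahead errors.
    """
    if len(history) < 2 * season:
        raise ValueError("need at least two full seasons of history")
    residuals = [history[t] - history[t - season] for t in range(season, len(history))]
    sigma = pstdev(residuals)
    result = []
    for h in range(steps):
        predicted = history[len(history) - season + (h % season)]
        result.append({
            "step": h + 1,
            "predicted": predicted,
            "lower": predicted - z * sigma,
            "upper": predicted + z * sigma,
        })
    return result


# Example: a series alternating between a trough and a peak (season = 2).
forecast = seasonal_naive_forecast([100, 1000, 100, 1000, 110, 990], season=2, steps=2)
```

In practice a fitted SARIMA model from a statistics library would take the place of this function, as long as it yields the same three quantities per prediction time point.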

[0108] Step 106: Obtain the actual time series data at each predicted time point in the time series data prediction results, and compare the actual time series data with the time series data prediction results to obtain the time series data monitoring results of the time series data to be processed.

[0109] It should be noted that the time series data prediction results output by the pre-trained time series data prediction model determine an expected confidence interval for the time series data to be processed. This expected confidence interval can be understood as a normal fluctuation range. Then, the actual time series data at each prediction time point is compared with the expected confidence interval at that prediction time point. Once the actual time series data exceeds the expected confidence interval, it is judged as abnormal and an alarm is triggered.

[0110] Optionally, based on the upper and lower prediction limits of each time series data to be processed in the time series data prediction results, the expected confidence interval of each time series data to be processed is determined. The actual time series data is compared with the expected confidence interval of the time series data to be processed at the same prediction time point to determine whether the actual time series data exceeds the expected confidence interval. When the actual time series data exceeds the expected confidence interval, an alarm message is generated.

[0111] Specifically, for the time series data prediction result at each prediction time point, it is necessary to first obtain the actual time series data for that time point. This can be achieved by collecting the actual observed values at that time point from the real monitoring system. This step prepares two sets of basic data, the predicted values and the measured values, for subsequent comparison.

[0112] Then, based on the upper and lower bounds of the prediction for this prediction time point in the time series prediction results, the expected confidence interval is determined. The actual time series data is compared with its corresponding expected confidence interval to determine whether the actual time series data falls within the expected confidence interval. If it does, the fluctuation of the actual time series data is considered to be in line with expectations and is considered normal, requiring no alarm; if not, it is determined to be a statistically significant anomaly, and an alarm message is generated.
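The comparison described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the `Forecast` structure, the function names, and the alarm message format are all hypothetical, and the interval is assumed to be closed (a value exactly on a bound counts as normal).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Forecast:
    """One predicted time point (hypothetical structure for illustration)."""
    timestamp: int    # prediction time point, e.g. epoch seconds
    predicted: float  # predicted value
    lower: float      # lower prediction limit
    upper: float      # upper prediction limit


def check_point(forecast: Forecast, actual: float) -> Optional[str]:
    """Return an alarm message if `actual` lies outside [lower, upper], else None."""
    if forecast.lower <= actual <= forecast.upper:
        return None  # fluctuation is within the expected confidence interval
    return (f"t={forecast.timestamp}: actual {actual} outside expected interval "
            f"[{forecast.lower}, {forecast.upper}]")


def monitor(forecasts: List[Forecast], actuals: List[float]) -> List[str]:
    """Compare each actual value with the interval at the same prediction time point."""
    alarms = []
    for forecast, actual in zip(forecasts, actuals):
        message = check_point(forecast, actual)
        if message is not None:
            alarms.append(message)
    return alarms
```

Calling `monitor` with one forecast per prediction time point and the matching list of observed values returns one message per out-of-interval point, and an empty list when every point is normal.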

[0113] Take QPS (queries per second) as an example, with a peak value of 1000 and a trough value of 100. Existing monitoring methods set a static threshold range of 0 to 1100, so no alarm is generated when QPS reaches 500 during the trough period, even though that is five times the normal trough value. This application instead sets different thresholds for different time periods, which enables it to detect the abnormal value of 500 during the trough. This application can therefore provide a more realistic predictive threshold, greatly improving the accuracy of indicator alerts.
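The QPS example can be worked through in a few lines. The static range of 0 to 1100 comes from the example; the per-period intervals below are made-up numbers chosen only to illustrate the idea, since in the method itself they would come from the prediction model's expected confidence intervals.

```python
STATIC_RANGE = (0, 1100)  # static threshold range from the QPS example

# Hypothetical per-period expected intervals; the bounds are illustrative only.
PERIOD_INTERVALS = {
    "peak": (800, 1100),
    "trough": (50, 150),
}


def static_alarm(qps: float) -> bool:
    """Alarm only if QPS leaves the single static range."""
    low, high = STATIC_RANGE
    return not (low <= qps <= high)


def dynamic_alarm(qps: float, period: str) -> bool:
    """Alarm if QPS leaves the interval expected for the given period."""
    low, high = PERIOD_INTERVALS[period]
    return not (low <= qps <= high)


print(static_alarm(500))             # False: the static range misses the anomaly
print(dynamic_alarm(500, "trough"))  # True: 500 is abnormal for the trough
print(dynamic_alarm(1000, "peak"))   # False: 1000 is normal at the peak
```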

[0114] In summary, the time series data monitoring method provided in this application obtains the time series data to be processed from the monitoring data source and samples it according to a preset sampling interval to obtain an equidistant time series dataset. Then, the equidistant time series dataset is grouped according to a preset grouping interval to obtain each equidistant time series data group. Multiple equidistant time series data pairs in each equidistant time series data group are sorted from high to low according to their correlation. Then, the time series data to be predicted is input into a pre-trained time series data prediction model to obtain a more accurate time series data prediction result. The actual time series data at each predicted time point in the time series data prediction result is compared with the time series data prediction result to obtain the time series data monitoring result of the time series data to be processed.

[0115] In another specific embodiment, before inputting the time series data to be predicted into a pre-trained time series data prediction model, the method further includes:

[0116] A fixed time window for the time series data to be predicted is determined according to a preset downsampling time interval. Then, the time series data to be predicted is grouped according to the fixed time window to obtain regrouped equidistant time series data groups. The regrouped equidistant time series data groups form the adjusted time series data to be predicted. Then, the adjusted time series data to be predicted is input into the pre-trained time series data prediction model.

[0117] It should be noted that this is the process of downsampling each equidistant time series data group in the time series data to be predicted. The time series data to be predicted composed of the regrouped equidistant time series data groups is the downsampled equidistant time series data group.
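A minimal sketch of downsampling by a fixed time window is given below. The application does not state how the points inside one window are combined, so taking their mean is an assumption here, as are the function name and the `(timestamp, value)` tuple layout.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value); layout is assumed


def downsample(points: List[Point], window_seconds: int) -> List[Point]:
    """Group points into fixed time windows and keep one mean value per window."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for timestamp, value in points:
        # Align each timestamp to the start of its fixed window
        window_start = (timestamp // window_seconds) * window_seconds
        buckets[window_start].append(value)
    # Assumption: the mean represents the window; max, min, or last are alternatives
    return [(start, mean(values)) for start, values in sorted(buckets.items())]
```

With a 60-second window, four points sampled every 30 seconds collapse into two points, which halves the amount of data passed to the prediction model.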

[0118] Understandably, this application performs equidistant sampling, grouping, and downsampling on the time series data to be processed, which can effectively improve the prediction efficiency and accuracy of the time series data prediction model. If the time series data to be processed is input directly into the time series data prediction model, the large amount of time series data can easily cause the model to struggle to fit, or to fail to fit altogether. The series of operations performed in this application before the data is input into the time series data prediction model effectively reduces the amount of time series data, enabling the model to fit normally and improving prediction accuracy while ensuring prediction efficiency.

[0119] This application provides another specific embodiment of the above-described time-series data monitoring method.

[0120] For example, see Figure 2, which is a flowchart illustrating a time-series data monitoring method provided in this application.

[0121] As shown in Figure 2, the process covers time-series data processing and intelligent monitoring and early warning. The entire process includes data input/output, preprocessing, analysis and modeling, and finally, decision output.

[0122] The first step is data access and data cleaning.

[0123] As shown in the figure, the process begins with a time-series data source, which is usually a system that continuously generates time-series data, such as sensors, server metrics, or business systems.

[0124] First, data validation is performed to obtain a time-series dataset. The primary step is invalid value filtering, which removes or corrects "dirty data" in the time-series data that clearly does not conform to physical or logical definitions, such as null values, infinite values, and incorrectly formatted values, to ensure basic data quality for subsequent analysis.
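Invalid value filtering of this kind might look like the sketch below. It assumes records arrive as `(timestamp, value)` pairs and that invalid records are dropped rather than corrected; the function name is hypothetical.

```python
import math
from typing import Any, List, Tuple


def filter_invalid(raw: List[Tuple[int, Any]]) -> List[Tuple[int, float]]:
    """Keep only (timestamp, value) records whose value is a finite number."""
    valid = []
    for timestamp, value in raw:
        try:
            number = float(value)  # raises for None and for malformed strings
        except (TypeError, ValueError):
            continue               # null value or incorrect format: drop it
        if math.isfinite(number):  # rejects NaN and positive/negative infinity
            valid.append((timestamp, number))
    return valid
```

Numeric strings such as `"2"` are accepted and converted, which is a design choice of this sketch; a stricter filter could reject any value that is not already a number.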

[0125] The second step is to detect and handle anomalies.

[0126] This step identifies and handles data points in the validated time-series dataset whose values are within the valid range but whose behavioral patterns deviate significantly from historical or expected patterns, thereby obtaining the cleaned time-series dataset.
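The application does not name a specific outlier rule. As one possible illustration, the sketch below swaps in a common technique, the modified z-score based on the median absolute deviation (MAD), and removes points whose score exceeds a threshold. The threshold of 3.5 and the function name are assumptions.

```python
from statistics import median
from typing import List


def remove_outliers(values: List[float], threshold: float = 3.5) -> List[float]:
    """Drop values whose MAD-based modified z-score exceeds `threshold`."""
    if not values:
        return []
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return list(values)  # no spread to measure against; keep everything
    # 0.6745 rescales MAD so the score is comparable to a standard z-score
    return [v for v in values if 0.6745 * abs(v - center) / mad <= threshold]
```

A median-based rule is used here because a single extreme point barely moves the median, whereas it can inflate a mean and standard deviation enough to hide itself.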

[0127] The third step is time-related feature analysis and model prediction.

[0128] Feature detection involves in-depth analysis of the cleaned time-series dataset, which may include periodicity identification, trend decomposition, statistical feature calculation, and so on.

[0129] Then, grouping and downsampling are performed. Specifically, grouping can be determined based on a preset grouping interval, and sampling can be carried out according to a preset sampling interval, thereby reducing the data density of the time series data group and retaining the main trend features.

[0130] Finally, model prediction is performed, specifically using the SARIMA (seasonal autoregressive integrated moving average) model to capture trends, seasonality, and autocorrelation in the time series data. Based on the time series dataset processed in the above steps, the indicator values and their possible fluctuation ranges are predicted for one or more future time points; that is, an expected confidence interval is predicted for each prediction time point.
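For reference, the standard textbook form of a SARIMA(p, d, q)(P, D, Q)_s model in backshift-operator notation is shown below, followed by one common way to build an interval around an h-step forecast. The application does not state the model orders, the seasonal period s, or the confidence level, so all symbols here are generic, and the Gaussian interval in the second line is an assumption rather than the application's stated construction.

```latex
% B is the backshift operator (B y_t = y_{t-1}); \varepsilon_t is white noise.
% \phi_p, \theta_q: non-seasonal AR and MA polynomials of orders p and q.
% \Phi_P, \Theta_Q: seasonal AR and MA polynomials of orders P and Q.
\Phi_P(B^{s})\,\phi_p(B)\,(1-B)^{d}\,(1-B^{s})^{D}\,y_t
  = \Theta_Q(B^{s})\,\theta_q(B)\,\varepsilon_t

% Expected interval at horizon h, assuming Gaussian forecast errors with
% standard deviation \hat{\sigma}_h and confidence level 1-\alpha:
\left[\hat{y}_{t+h} - z_{1-\alpha/2}\,\hat{\sigma}_h,\;
      \hat{y}_{t+h} + z_{1-\alpha/2}\,\hat{\sigma}_h\right]
```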

[0131] The fourth step is monitoring, early warning, and result generation.

[0132] The actual time series data at each predicted time point is compared with the expected confidence interval corresponding to that predicted time point, which is determined by the model prediction result output by the SARIMA model. When the actual time series data exceeds the expected confidence interval, it is judged as a statistical anomaly, and an alarm message is automatically generated.

[0133] Ultimately, all data, analysis results, and alerts can be integrated and displayed through Grafana dashboards.

[0134] The above describes a time-series data monitoring method provided by this application. The following describes the apparatus for performing the above-described time-series data monitoring method.

[0135] Please see Figure 3, which is a schematic diagram of a time-series data monitoring device provided in this application. As shown in Figure 3, the device includes:

[0136] The system comprises: acquisition unit 10, sampling unit 20, grouping unit 30, sorting unit 40, prediction unit 50, and comparison unit 60; wherein:

[0137] Acquisition unit 10 is used to acquire time-series data to be processed from the monitoring data source in response to the arrival of the preset time-series data monitoring time; the monitoring data source is the data terminal that generates time-series data; the time-series data to be processed is the target monitoring time-series data;

[0138] Sampling unit 20 is used to sample the time series data to be processed according to a preset sampling interval to obtain an equidistant time series dataset;

[0139] Grouping unit 30 is used to group the equidistant time series dataset at a preset grouping interval to obtain each equidistant time series data group; the equidistant time series data group includes multiple equidistant time series data pairs arranged in descending order of correlation.

[0140] The sorting unit 40 is used to arrange each equidistant time series data group in order from the nearest to the farthest sampling time corresponding to the preset sampling interval, so as to obtain the time series data to be predicted;

[0141] The prediction unit 50 is used to input the time series data to be predicted into a pre-trained time series data prediction model to obtain the time series data prediction result;

[0142] The comparison unit 60 is used to acquire the real time series data at each prediction time point in the time series data prediction result, and compare the real time series data with the time series data prediction result to obtain the time series data monitoring result of the time series data to be processed.

[0143] In one embodiment, the sampling unit 20 is specifically used for:

[0144] Identify and remove invalid values from the time series data to be processed to obtain valid time series data;

[0145] Identify and remove outliers from the valid time series data to obtain smoothed time series data;

[0146] Sample the smoothed time series data according to the preset sampling interval to obtain the equidistant time series dataset.
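Sampling at a preset interval can be sketched as resampling irregular observations onto a regular grid. The application does not say which value is taken at each grid point; this sketch assumes the most recent observation at or before the grid time is used, and the function name and tuple layout are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value); layout is assumed


def sample_equidistant(points: List[Point], interval: int) -> List[Point]:
    """Resample irregular points onto a regular grid using the latest prior value."""
    if not points:
        return []
    ordered = sorted(points)
    timestamps = [t for t, _ in ordered]
    start, end = timestamps[0], timestamps[-1]
    sampled = []
    for grid_time in range(start, end + 1, interval):
        # Index of the last observation at or before this grid time
        index = bisect_right(timestamps, grid_time) - 1
        sampled.append((grid_time, ordered[index][1]))
    return sampled
```

The output timestamps are exactly `interval` seconds apart, which is the equidistant property the later grouping and prediction steps rely on.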

[0147] In one embodiment, the sampling unit 20 is specifically used for:

[0148] Identify and remove invalid values from the time series data to be processed to obtain valid time series data;

[0149] Sample the valid time series data according to the preset sampling interval to obtain the equidistant time series data to be cleaned;

[0150] Identify and remove outliers from the equidistant time series data to be cleaned to obtain the equidistant time series dataset.

[0151] In one embodiment, the grouping unit 30 is specifically used for:

[0152] The time interval for each group is determined based on the preset grouping interval;

[0153] The time interval to which the equidistant time series data belongs is determined according to the sampling time of each equidistant time series data in the equidistant time series dataset, thus obtaining each equidistant time series data group;

[0154] The equidistant time series data in the equidistant time series data group are combined in pairs to obtain each equidistant time series data pair;

[0155] Calculate the autocorrelation function values used to characterize the degree of correlation within each equidistant time series data pair to obtain the set of autocorrelation function values;

[0156] The autocorrelation function values in the set of autocorrelation function values are sorted from largest to smallest to obtain the autocorrelation function value sequence;

[0157] Arrange the equidistant time series data pairs in descending order of the autocorrelation function value sequence to obtain the reordered equidistant time series data groups.
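The grouping and reordering steps above can be sketched as follows. The application does not define how the autocorrelation function value of a single pair is computed, so this sketch makes a loud assumption: a pair of points at positions i and j in a group is scored by the group's sample autocorrelation at lag j - i. All function names and the tuple layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (sampling time in seconds, value); layout is assumed


def group_by_interval(points: List[Point], grouping_interval: int) -> Dict[int, List[Point]]:
    """Assign each point to the time interval that contains its sampling time."""
    groups: Dict[int, List[Point]] = {}
    for timestamp, value in sorted(points):
        start = (timestamp // grouping_interval) * grouping_interval
        groups.setdefault(start, []).append((timestamp, value))
    return groups


def autocorrelation(values: List[float], lag: int) -> float:
    """Sample autocorrelation of `values` at the given lag (0.0 if constant)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    if variance == 0:
        return 0.0
    covariance = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return covariance / variance


def sort_pairs(group: List[Point]) -> List[Tuple[Point, Point, float]]:
    """Form all pairs in a group and order them by descending score."""
    values = [v for _, v in group]
    scored = [
        # Assumption: a pair is scored by the group's autocorrelation at lag j - i
        (group[i], group[j], autocorrelation(values, j - i))
        for i, j in combinations(range(len(group)), 2)
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)
```

Under this scoring, every pair with the same lag receives the same value, so the ordering is effectively by lag; a different pairwise measure would give a different ordering.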

[0158] In one embodiment, the time-series data monitoring results in the comparison unit 60 include alarm information. Specifically, the comparison unit 60 is used for:

[0159] The expected confidence interval for each time series data to be processed is determined based on the upper and lower bounds of the prediction results for each time series data to be processed.

[0160] Compare the actual time series data with the expected confidence intervals of the time series data to be processed at the same prediction time point;

[0161] Determine whether the actual time series data exceeds the expected confidence interval. When the actual time series data exceeds the expected confidence interval, generate an alarm message.

[0162] In one embodiment, the time-series data monitoring device further includes a downsampling unit, which is specifically used for:

[0163] A fixed time window for the time series data to be predicted is determined according to a preset downsampling time interval;

[0164] The time series data to be predicted is grouped according to a fixed time window to obtain regrouped equidistant time series data groups. The adjusted time series data to be predicted is composed of the regrouped equidistant time series data groups.

[0165] This application also provides a time-series data monitoring device in its embodiments. Figure 4 illustrates a structure suitable for implementing the time-series data monitoring device provided in this application. The time-series data monitoring device in this embodiment may include, but is not limited to, terminals such as mobile phones, laptops, PDAs (personal digital assistants), PADs (tablet computers), and desktop computers. The time-series data monitoring device shown in Figure 4 is merely an example and should not impose any limitations on the functionality and scope of use of the embodiments of this application.

[0166] As shown in Figure 4, the time-series data monitoring device may include a processing unit (e.g., a central processing unit, a graphics processing unit, etc.) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. When the time-series data monitoring device is powered on, the RAM 603 also stores various programs and data required for the operation of the time-series data monitoring device. The processing unit 601, ROM 602, and RAM 603 are interconnected via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

[0167] Typically, the following devices can be connected to the I/O interface 605: input devices 606 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 607 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; storage devices 608 including, for example, memory cards, hard drives, etc.; and communication devices 609. The communication device 609 allows the time-series data monitoring device to communicate with other devices, wirelessly or by wire, to exchange data. Although Figure 4 shows a time-series data monitoring device with various devices, it should be understood that implementing or possessing all the devices shown is not required. More or fewer devices may alternatively be implemented or provided.

[0168] This application also provides a computer program product including computer-readable instructions, which, when executed on an electronic device, cause the electronic device to implement any of the time-series data monitoring methods provided in this application.

[0169] This application also provides a computer storage medium that carries one or more computer programs. When the one or more computer programs are executed by an electronic device, the electronic device can implement any of the time-series data monitoring methods provided in this application.

[0170] It should also be noted that the device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. In addition, in the device embodiment drawings provided in this application, the connection relationship between modules indicates that they have a communication connection, which can be implemented as one or more communication buses or signal lines.

[0171] Through the above description of the embodiments, those skilled in the art can clearly understand that this application can be implemented by means of software plus necessary general-purpose hardware, or it can be implemented by special-purpose hardware including application-specific integrated circuits, special-purpose CPUs, special-purpose memory, special-purpose components, etc. Generally, any function performed by a computer program can be easily implemented by corresponding hardware, and the specific hardware structure used to implement the same function can also be diverse, such as analog circuits, digital circuits, or special-purpose circuits. However, for this application, software program implementation is more often the preferred implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, mobile hard disk, ROM, RAM, magnetic disk, or optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, training equipment, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0172] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product.

[0173] The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a training device or data center that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid-state drives (SSDs)).

Claims

1. A time-series data monitoring method, characterized in that it includes: in response to reaching the preset time series data monitoring time, retrieving the time series data to be processed from the monitoring data source, wherein the monitoring data source is the data terminal that generates time-series data and the time series data to be processed is the target monitoring time series data; sampling the time series data to be processed according to a preset sampling interval to obtain an equidistant time series dataset; grouping the equidistant time series dataset at a preset grouping interval to obtain equidistant time series data groups, wherein each equidistant time series data group includes multiple equidistant time series data pairs arranged in descending order of correlation; arranging the equidistant time series data groups in order from nearest to farthest sampling time corresponding to the preset sampling interval to obtain the time series data to be predicted; inputting the time series data to be predicted into a pre-trained time series data prediction model to obtain the time series data prediction result; and obtaining the actual time-series data at each predicted time point in the time-series data prediction result, and comparing the actual time-series data with the time-series data prediction result to obtain the time-series data monitoring result of the time-series data to be processed.

2. The time-series data monitoring method according to claim 1, characterized in that the step of sampling the time series data to be processed according to a preset sampling interval to obtain an equidistant time series dataset includes: identifying and removing invalid values from the time series data to be processed to obtain valid time series data; identifying and removing outliers from the valid time series data to obtain smoothed time series data; and sampling the smoothed time series data according to the preset sampling interval to obtain the equidistant time series dataset.

3. The time-series data monitoring method according to claim 1, characterized in that the step of sampling the time series data to be processed according to a preset sampling interval to obtain an equidistant time series dataset includes: identifying and removing invalid values from the time series data to be processed to obtain valid time series data; sampling the valid time series data according to the preset sampling interval to obtain the equidistant time series data to be cleaned; and identifying and removing outliers from the equidistant time series data to be cleaned to obtain the equidistant time series dataset.

4. The time-series data monitoring method according to claim 1, characterized in that the step of grouping the equidistant time series dataset at a preset grouping interval to obtain each equidistant time series data group includes: determining the time interval for each group based on the preset grouping interval; determining the time interval to which each equidistant time series data belongs according to the sampling time of each equidistant time series data in the equidistant time series dataset, thus obtaining each equidistant time series data group; combining the equidistant time series data in the equidistant time series data group in pairs to obtain each equidistant time series data pair; calculating the autocorrelation function values used to characterize the degree of correlation within each equidistant time series data pair to obtain a set of autocorrelation function values; sorting the autocorrelation function values in the set of autocorrelation function values from largest to smallest to obtain the autocorrelation function value sequence; and arranging the equidistant time series data pairs in descending order of the autocorrelation function value sequence to obtain the reordered equidistant time series data groups.

5. The time-series data monitoring method according to claim 1, characterized in that the time-series data monitoring result includes alarm information, and the step of comparing the actual time-series data with the time-series data prediction result to obtain the time-series data monitoring result for the time-series data to be processed includes: determining the expected confidence interval of each time series data to be processed based on the upper and lower prediction limits of each time series data to be processed in the time series data prediction results; comparing the actual time series data with the expected confidence interval of the time series data to be processed at the same prediction time point; and determining whether the actual time-series data exceeds the expected confidence interval and, if the actual time-series data exceeds the expected confidence interval, generating the alarm information.

6. The time-series data monitoring method according to claim 1, characterized in that, before inputting the time series data to be predicted into a pre-trained time series data prediction model, the method further includes: determining a fixed time window for the time series data to be predicted according to a preset downsampling time interval; and grouping the time series data to be predicted according to the fixed time window to obtain regrouped equidistant time series data groups, wherein the adjusted time series data to be predicted is composed of the regrouped equidistant time series data groups.

7. A time-series data monitoring device, characterized in that it includes: an acquisition unit used to acquire time-series data to be processed from the monitoring data source in response to the arrival of a preset time-series data monitoring time, wherein the monitoring data source is the data terminal that generates time-series data and the time series data to be processed is the target monitoring time series data; a sampling unit used to sample the time series data to be processed according to a preset sampling interval to obtain an equidistant time series dataset; a grouping unit used to group the equidistant time series dataset at a preset grouping interval to obtain equidistant time series data groups, wherein the equidistant time series data groups include multiple equidistant time series data pairs arranged in descending order of correlation; a sorting unit used to arrange each equidistant time series data group in order from the nearest to the farthest sampling time corresponding to the preset sampling interval, so as to obtain the time series data to be predicted; a prediction unit used to input the time series data to be predicted into a pre-trained time series data prediction model to obtain the time series data prediction result; and a comparison unit used to obtain the actual time series data at each prediction time point in the time series data prediction result, and compare the actual time series data with the time series data prediction result to obtain the time series data monitoring result of the time series data to be processed.

8. A time-series data monitoring device, characterized in that it includes at least one processor and a memory connected to the processor, wherein: the memory is used to store computer programs; and the processor is used to execute the computer program so that the time-series data monitoring device can implement the time-series data monitoring method as described in any one of claims 1 to 6.

9. A computer program product, characterized in that it includes computer-readable instructions that, when executed on an electronic device, cause the electronic device to implement the time-series data monitoring method as described in any one of claims 1 to 6.

10. A computer storage medium, characterized in that the storage medium carries one or more computer programs that, when executed by an electronic device, enable the electronic device to implement the time-series data monitoring method as described in any one of claims 1 to 6.