Alarm detection method, device, equipment and medium
By using aggregation and smoothing algorithms to identify latency fluctuations, the problem of maintenance personnel being unable to identify effective alarms in a timely manner during low-latency services has been solved, achieving efficient alarm detection and improving troubleshooting efficiency and service stability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGZHOU HUANJU SHIDAI INFORMATION TECH CO LTD
- Filing Date
- 2023-02-15
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, maintenance personnel need to frequently modify alarm thresholds, and valid alarm information is mixed in with a large number of invalid alarm information, making it impossible to identify and process them in a timely manner, increasing the workload and difficulty of troubleshooting, which may lead to business loss, especially in low-latency business scenarios.
By aggregating latency data to the minute level, and using time series smoothing and dynamic variance algorithms, latency fluctuation values are identified within a preset time window, invalid alarms are filtered out, latency alarms are accurately triggered, and frequent threshold modifications are avoided.
It significantly improved the accuracy of effective alarm identification, reduced the workload of operation and maintenance personnel, improved the efficiency of fault diagnosis and handling, ensured the real-time and low-latency nature of online services, and avoided business losses.
Figure CN116185786B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of data processing, and more particularly to an alarm detection method, a corresponding device, an electronic device, and a computer-readable storage medium.
Background Technology
[0002] Online service latency alarms are alarm messages that operations and maintenance personnel urgently need to handle. These alarm messages are signals of online faults, which means that there may be problems with online tasks. Especially for some business scenarios with high requirements for low latency, if effective latency alarm messages cannot be identified and processed in a timely manner, it will greatly reduce the user experience and lead to business loss.
[0003] Currently, operations and maintenance (O&M) personnel monitor alarm information by setting alarm thresholds. However, according to the current alarm threshold rules, a large number of alarms are invalid. O&M personnel need to review and analyze each alarm from a large number of alarms to find the truly valid ones. When a real fault occurs, the O&M personnel's processing speed is slow, and they cannot find valid alarms in time. This may lead to major losses due to minor faults. Moreover, over time, O&M personnel may become fatigued by the tedious alarm information and their enthusiasm will decrease.
[0004] Current technologies for monitoring alarm information require frequent adjustments to alarm thresholds depending on the service's state. Otherwise, operations and maintenance personnel may receive too many invalid alarms or miss valid ones, and fixed thresholds cannot handle situations such as the addition of new business modules or changes in online traffic. Valid alarm information is often mixed in with a large number of invalid alarms. Since alarms are generated whenever the alarm threshold is reached, a large number of invalid alarms will be generated during network fluctuations and short-term traffic surges, while valid alarms often constitute a very small percentage. This makes it difficult for operations and maintenance personnel to promptly identify valid alarms from a large number of alarms, increasing their workload and the difficulty of troubleshooting.
Summary of the Invention
[0005] The purpose of this application is to solve the above-mentioned problems by providing an alarm detection method, a corresponding device, an electronic device, and a computer-readable storage medium.
[0006] To achieve the various objectives of this application, the following technical solution is adopted:
[0007] An alarm detection method proposed for one of the purposes of this application includes the following steps:
[0008] Acquire the latency data corresponding to each indicator from the time the requester initiates the request to the time the processor receives the response result, and aggregate the latency data corresponding to each indicator into latency data at a preset time level.
[0009] A sliding time window of a preset time length is constructed based on the time delay data corresponding to each aggregated indicator. The time delay data corresponding to each aggregated indicator within the sliding time window is smoothed based on a time series smoothing algorithm.
[0010] The time delay fluctuation value of each indicator is obtained by calculating the variance ratio of the time delay data corresponding to each indicator after smoothing between adjacent time nodes.
[0011] The latency fluctuation value is compared with a preset threshold. If the latency fluctuation value reaches the preset threshold, a latency alarm is triggered.
[0012] Optionally, before comparing the time delay fluctuation value with a preset threshold, the following steps are also included:
[0013] Obtain latency data corresponding to each indicator in valid latency alarms within the historical period;
[0014] Calculate the mean variance of the latency data corresponding to each indicator in the effective latency alarm;
[0015] The maximum value of the variance mean is used as the fluctuation threshold.
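The three steps above leave some room for interpretation. The following Python sketch shows one possible reading, not the definitive procedure: for each indicator, the variance of the latency data is computed for every historical valid alarm, those variances are averaged per indicator, and the largest per-indicator average becomes the fluctuation threshold. The function name `fluctuation_threshold`, the layout of `history`, and the sample values are hypothetical.

```python
from statistics import mean, pvariance


def fluctuation_threshold(history):
    """Derive a fluctuation threshold from historical valid alarms.

    history maps an indicator name to a list of latency series, one series
    per historical valid alarm (an assumed layout, not from the source).
    """
    # Average the per-alarm population variances for each indicator.
    variance_means = {
        indicator: mean([pvariance(series) for series in alarm_series])
        for indicator, alarm_series in history.items()
    }
    # Use the largest per-indicator variance mean as the threshold.
    return max(variance_means.values())


# Hypothetical, already normalized latency data from past valid alarms.
history = {
    "network_transmission": [[0.0, 1.0], [0.0, 0.5]],
    "server_processing": [[0.25, 0.75]],
}
threshold = fluctuation_threshold(history)
```

Taking the maximum across indicators only makes sense when the series share a scale, which is presumably why the method normalizes the historical data first.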
[0016] Optionally, before comparing the time delay fluctuation value with a preset threshold, the following steps are also included:
[0017] Determine the frequency threshold of the time delay data corresponding to each indicator;
[0018] A maximum value function expression is constructed based on the fluctuation threshold and the frequency threshold, and the maximum value function expression is used to select the maximum value between the frequency threshold and the fluctuation threshold;
[0019] Use the constructed maximum value function expression as the preset threshold.
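As a minimal sketch of the maximum value function described above, the preset threshold can be expressed as the larger of the two thresholds. The function name and the numeric values are illustrative assumptions. How the frequency threshold itself is determined is not specified here, so it is passed in as a given.

```python
def preset_threshold(fluctuation_threshold: float, frequency_threshold: float) -> float:
    """Select the larger of the fluctuation threshold and the frequency threshold."""
    return max(fluctuation_threshold, frequency_threshold)


# Hypothetical values: the frequency threshold dominates in this case.
example_threshold = preset_threshold(fluctuation_threshold=35.0, frequency_threshold=50.0)
```

Using the maximum means an alarm fires only when the fluctuation value clears both thresholds, which is the stricter of the two possible combinations.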
[0020] Optionally, the steps for obtaining the latency data corresponding to each indicator in the valid latency alarms within the historical period include the following steps:
[0021] Perform data cleaning or fill in missing values on the latency data corresponding to each indicator in the valid latency alarms obtained;
[0022] Normalize the time delay data corresponding to each indicator after data cleaning or missing value imputation.
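The preprocessing above might look like the following sketch. The method does not name a specific imputation or normalization technique, so mean imputation and min-max scaling are used here purely as assumptions. The function names and sample values are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def fill_missing(values: Sequence[Optional[float]]) -> list[float]:
    """Replace missing samples (None) with the mean of the observed samples."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed samples to impute from")
    fallback = mean(observed)
    return [fallback if v is None else v for v in values]


def min_max_normalize(values: Sequence[float]) -> list[float]:
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map every point to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw = [100.0, None, 300.0, 200.0]      # hypothetical latency samples in ms
cleaned = fill_missing(raw)            # the gap is filled with the observed mean
normalized = min_max_normalize(cleaned)
```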
[0023] Optionally, the step of aggregating the latency data corresponding to each of the indicators into latency data at a preset time level includes the following steps:
[0024] The latency data corresponding to each indicator is aggregated into latency data at the minute level;
[0025] Calculate the average latency data corresponding to each indicator within the minute level;
[0026] The average value of the latency data corresponding to each indicator is used as the aggregated latency data corresponding to each indicator.
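The minute-level aggregation can be sketched as follows, assuming each raw sample arrives as a tuple of indicator name, Unix timestamp in seconds, and latency in milliseconds. That tuple layout, the function name, and the sample values are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_minute(samples):
    """Average raw latency samples per indicator and per one-minute bucket.

    samples: iterable of (indicator, unix_seconds, latency_ms).
    Returns {(indicator, minute_start_seconds): mean latency}.
    """
    buckets = defaultdict(list)
    for indicator, timestamp, latency in samples:
        minute_start = int(timestamp) // 60 * 60  # floor to the start of the minute
        buckets[(indicator, minute_start)].append(latency)
    return {key: mean(values) for key, values in buckets.items()}


samples = [
    ("network", 0, 100.0),
    ("network", 20, 140.0),
    ("network", 59, 120.0),
    ("network", 60, 200.0),
    ("server", 95, 30.0),
    ("server", 100, 50.0),
]
per_minute = aggregate_by_minute(samples)
```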
[0027] Optionally, the step of constructing a sliding time window of a preset time length based on the latency data corresponding to each aggregated indicator includes the following steps:
[0028] Determine the size of the sliding time window;
[0029] Use the time delay data corresponding to each aggregated indicator as the sliding time window value;
[0030] A sliding time window of a preset time length is constructed based on the size and value of the sliding time window.
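A sliding window with a fixed size and the aggregated values as its contents can be sketched with a bounded deque. The class name `SlidingWindow` and the sample values are hypothetical. With minute-level data, a size of 5 corresponds to a 5-minute window.

```python
from collections import deque


class SlidingWindow:
    """Keep the most recent `size` aggregated latency values for one indicator."""

    def __init__(self, size: int):
        if size <= 0:
            raise ValueError("window size must be positive")
        self._values = deque(maxlen=size)

    def push(self, value: float) -> None:
        # Appending to a full deque silently drops the oldest value.
        self._values.append(value)

    def is_full(self) -> bool:
        return len(self._values) == self._values.maxlen

    def values(self) -> list[float]:
        return list(self._values)


window = SlidingWindow(size=5)
for minute_average in [120.0, 118.0, 125.0, 130.0, 122.0, 210.0]:
    window.push(minute_average)
```

After the sixth value is pushed, the window holds the five most recent minute averages and the first one has been discarded.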
[0031] Optionally, the step of smoothing the time delay data corresponding to each aggregated indicator within the sliding time window based on a time series smoothing algorithm includes the following steps:
[0032] The change in time delay data corresponding to each aggregated index between adjacent time nodes within the sliding time window is calculated based on the first-order difference equation.
[0033] The time delay data corresponding to each aggregated indicator within the sliding time window is smoothed based on the amount of change.
[0034] An alarm detection device provided for another purpose of this application includes:
[0035] The data aggregation module is configured to acquire the latency data corresponding to each indicator from the time the requester initiates the request to the time the processor receives the response result, and aggregate the latency data corresponding to each indicator into latency data at a preset time level.
[0036] The data processing module is configured to construct a sliding time window of a preset time length based on the time delay data corresponding to each aggregated indicator, and to perform smoothing processing on the time delay data corresponding to each aggregated indicator within the sliding time window based on a time series smoothing algorithm.
[0037] The fluctuation value determination module is set to calculate the variance ratio of the time delay data corresponding to each indicator after smoothing between adjacent time nodes to obtain the time delay fluctuation value of each indicator.
[0038] The alarm triggering module is configured to compare the latency fluctuation value with a preset threshold. If the latency fluctuation value reaches the preset threshold, a latency alarm is triggered.
[0039] An alarm detection device provided for another purpose of this application includes a central processing unit and a memory, wherein the central processing unit is used to invoke and run a computer program stored in the memory to perform the steps of the alarm detection method described in this application.
[0040] A computer-readable storage medium is provided for another purpose of this application, which stores, in the form of computer-readable instructions, a computer program implemented according to the alarm detection method, which, when invoked by a computer, performs the steps included in the corresponding method.
[0041] Compared to existing technologies, this application addresses the problems of frequent alarm threshold modifications by operations and maintenance personnel, the mixing of valid alarm information with a large number of invalid alarms, and the inability of operations and maintenance personnel to promptly identify and process valid alarms from a large volume of alarms. It aggregates scattered and irregular raw latency data into minute-level data, making the latency data actionable. Within a preset time window, based on dynamic variance and time series smoothing algorithms, it accurately identifies valid latency alarm points, filtering out a large number of invalid alarms and significantly reducing alarm noise. This provides valuable alarm information for operations and maintenance personnel to further process. By improving the accuracy of valid alarm identification, it significantly reduces the workload of operations and maintenance personnel in dealing with invalid alarms, allowing them to focus their time and energy on each valid alarm. It also avoids the need for operations and maintenance personnel to frequently modify alarm thresholds, thereby significantly improving the efficiency of troubleshooting and processing online service faults, ensuring the real-time performance and low latency of online services, and preventing business losses due to the inability to promptly identify and process valid alarms.
Attached Figure Description
[0042] The above and / or additional aspects and advantages of this application will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, wherein:
[0043] Figure 1 is an exemplary network architecture used in the alarm detection method of this application;
[0044] Figure 2 is a flowchart illustrating the alarm detection method in the embodiments of this application;
[0045] Figure 3 is a schematic diagram of the process for determining the fluctuation threshold of latency data in an embodiment of this application;
[0046] Figure 4 is a flowchart illustrating the process of determining a preset threshold for latency data in an embodiment of this application;
[0047] Figure 5 is a flowchart illustrating the normalization process for the latency data corresponding to each indicator in the effective latency alarm in this embodiment of the application;
[0048] Figure 6 is a schematic diagram of the process of aggregating the latency data corresponding to each indicator into latency data at a preset time level in the embodiments of this application;
[0049] Figure 7 is a flowchart illustrating the process of constructing a sliding time window of a preset time length based on the latency data corresponding to the aggregated indicators in this embodiment of the application;
[0050] Figure 8 is a flowchart illustrating the process of smoothing the time delay data corresponding to each aggregated indicator within a sliding time window based on a time series smoothing algorithm in this embodiment of the application;
[0051] Figure 9 is a schematic block diagram of the alarm detection device in the embodiments of this application;
[0052] Figure 10 is a schematic diagram of the alarm detection device in the embodiments of this application.
Detailed Implementation
[0053] The embodiments of this application are described in detail below. Examples of these embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain this application, and should not be construed as limiting this application.
[0054] Those skilled in the art will understand that, unless specifically stated otherwise, the singular forms “a,” “an,” “said,” and “the” used herein may also include the plural forms. It should be further understood that the term “comprising” as used in this application means the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is said to be “connected” or “coupled” to another element, it can be directly connected or coupled to the other element, or there may be intermediate elements. Furthermore, “connected” or “coupled” as used herein can include wireless connections or wireless coupling. The term “and/or” as used herein includes all or any units and all combinations of one or more associated listed items.
[0055] It will be understood by those skilled in the art that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains. It should also be understood that terms such as those defined in general dictionaries should be understood to have the same meaning as in the context of the prior art, and should not be interpreted in an idealized or overly formal sense unless specifically defined as herein.
[0056] Those skilled in the art will understand that the terms "client," "terminal," and "terminal device" as used herein include both devices that have only a wireless signal receiver without transmission capability, and devices with receiving and transmitting hardware capable of bidirectional communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, such as personal computers or tablets, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices that can combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants) that may include a radio frequency receiver, pager, internet/intranet access, web browser, notepad, calendar, and/or GPS (Global Positioning System) receiver; and conventional laptops and/or handheld computers or other devices that have and/or include radio frequency receivers. As used herein, a "client," "terminal," or "terminal device" can be portable, transportable, installed in a means of transportation (air, sea, and/or land), or suitable and/or configured to operate locally and/or in a distributed manner at any other location on Earth and/or in space. A "client," "terminal," or "terminal device" as used herein can also be a communication terminal, an internet access terminal, or a music/video playback terminal, such as a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback capabilities, or a smart TV, set-top box, etc.
[0057] The hardware referred to by the names "server," "client," and "service node" in this application is essentially an electronic device with the equivalent capabilities of a personal computer. It is a hardware device with the necessary components revealed by the Von Neumann architecture, such as a central processing unit (including an arithmetic logic unit and a control unit), memory, input devices, and output devices. The computer program is stored in its memory, and the central processing unit loads the program stored in the secondary storage into the main memory to run it, execute the instructions in the program, and interact with the input and output devices to complete specific functions.
[0058] Referring to Figure 1, an exemplary application scenario of this application uses a network architecture including a terminal device 80, a media server 81, and an application server 82. The application server 82 can be used to deploy a live streaming service. The media server 81 or the terminal device 80 can run a computer program product implemented according to the alarm detection method of this application. By running this product, the various steps of the method are implemented, enabling the detection of valid alarm information in online live streaming, providing valid alarm information to maintenance personnel, and avoiding invalid alarm information being mixed with valid alarm information. The terminal device 80 allows broadcasters or viewers to log into the live streaming room supported by the live streaming service. Broadcasters can use the camera unit in their terminal device 80 to acquire recordings as a live video stream and submit them to the media server. Viewers can use their terminal device 80 to receive and play the live video stream pushed by the media server.
[0059] It should be noted that the concept of "server" used in this application can also be extended to the case of server clusters. Based on the network deployment principles understood by those skilled in the art, the servers should be logically divided. Physically, these servers can be independent of each other but accessible through interfaces, or they can be integrated into a single physical computer or a computer cluster. Those skilled in the art should understand this flexibility and should not use it to constrain the implementation of the network deployment method in this application.
[0060] Based on the above exemplary scenarios, and referring to Figure 2, one embodiment of the alarm detection method of this application includes the following steps:
[0061] Step S1100: Obtain the time delay data corresponding to each indicator from the time the requester initiates the request to the time the processor receives the response result, and aggregate the time delay data corresponding to each indicator into time delay data at a preset time level.
[0062] Data requiring alarm detection using the technical solution of this application can be considered as the latency data described in this application. The type and source of the latency data described in this application can be determined according to the actual application scenario. In the scenario of live streaming, which differs from general audio and video call applications, the requirements for real-time performance and low latency are higher. Live streaming users are more aware of latency. In cases of network jitter or sudden surges in traffic, a large number of invalid alarm messages will be generated, while the proportion of valid alarm messages is often very low. Valid alarm messages will be mixed in with a large number of invalid alarm messages, making it very difficult to locate the fault point. This makes it impossible for maintenance personnel to promptly identify and process valid alarm messages from a large number of alarm messages, greatly increasing the workload of maintenance personnel and the difficulty of troubleshooting. If valid alarm messages cannot be detected and processed in a timely manner, the user experience will be greatly reduced, leading to business loss.
[0063] The latency data mentioned in this embodiment can be the latency generated during live streaming. The latency data can be the latency from the camera unit of the terminal device capturing the corresponding live video stream to pushing it to the client, the network transmission latency, the server processing latency, or the client display latency, etc. It can also be the response latency, network transmission latency, server processing latency, or client display latency from the client initiating a business request to receiving the response result during live streaming. The sources and types of the latency data are wide and are not limited here. In live streaming scenarios, the latency data generated by various metrics during the process from a client initiating a service request to receiving a response is unpredictable and fragmented. This means that the latency data generated by the terminal device's camera unit capturing the live video stream and pushing it to the client, as well as the latency generated by the client initiating a service request and receiving the response during the live stream, is generated unpredictably and fragmented. Therefore, a time aggregation module is needed to aggregate the latency data corresponding to these metrics into latency data at a preset time level. This preset time level can be at the minute level, and the minute level can be one minute, two minutes, or three minutes, etc., without limitation. Since the alarm information generated by each metric needs to consider the latency data generated within the preset time level, rather than the latency data generated at a single point, the average of the latency data generated by each metric within the preset time level is used as the aggregated latency data corresponding to each metric.
[0064] In some embodiments, when a user starts a live stream, they can perform activities such as dancing, singing, or exercising. The camera unit of their terminal device captures the corresponding live video stream and submits it to a server for encoding. The encoded live video stream is then submitted to a media server. After receiving the encoded live video stream, the media server decodes it before pushing it to the client for display. It's easy to understand that the latency data can be the response latency from the camera unit capturing the live video stream to pushing it to the client, the network transmission latency, etc. It can also be the server processing latency generated by the server encoding the live video stream or the media server decoding the encoded live video stream, the client display latency generated by the client displaying the decoded video stream, etc. Furthermore, it can be the response latency, network transmission latency, server processing latency, or client display latency from the client initiating a service request to receiving the response result during the live stream. For example, when a user enthusiastically likes the streamer in a live stream... The latency data can be the response latency, network transmission latency, and client display latency from the user's request to send comments, send bullet comments, or send gifts in the live stream to the receipt of the corresponding response. It can also be the server processing latency generated by the server processing the business data corresponding to the user's requests to send comments, send bullet comments, or send gifts. Since the generation of the response latency, network transmission latency, client display latency, or server processing latency data is irregular and scattered, it is necessary to aggregate the response latency, network transmission latency, client display latency, or server processing latency into latency data at the minute level.
[0065] Specifically, in a live streaming scenario, from the camera unit of the terminal device acquiring the corresponding live video stream to pushing it to the client, the network transmission latency can be the RTT (Round Trip Time) latency, the server processing latency can be the average encoding time of the live video stream by the server, including format conversion, memory copying, and video encoding, etc., the client display latency can be the rendering time, decoding time, and delay time in the jitter buffer, etc., and the response latency can be the sum of the network transmission latency, server processing latency, and client display latency.
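The relationship stated above, where the response latency is the sum of the three component latencies, can be written out as a small sketch. The class name, field names, and millisecond values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyBreakdown:
    """Component latencies of one request, all in milliseconds."""

    network_rtt_ms: float        # network transmission latency (round-trip time)
    server_processing_ms: float  # e.g. format conversion, memory copying, encoding
    client_display_ms: float     # e.g. decoding, rendering, jitter-buffer delay

    @property
    def response_ms(self) -> float:
        # Response latency is the sum of the three components.
        return self.network_rtt_ms + self.server_processing_ms + self.client_display_ms


sample = LatencyBreakdown(network_rtt_ms=40.0, server_processing_ms=25.0, client_display_ms=60.0)
total = sample.response_ms
```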
[0066] Step S1200: Construct a sliding time window of a preset time length based on the time delay data corresponding to each aggregated indicator, and smooth the time delay data corresponding to each aggregated indicator within the sliding time window based on a time series smoothing algorithm.
[0067] For some business scenarios with high requirements for real-time performance and low latency, it is necessary to constantly monitor the latency data corresponding to various indicators within a short time window. Therefore, a sliding time window of a preset time length is constructed from the aggregated latency data corresponding to the various indicators. This facilitates the timely detection of alarm information corresponding to the various indicators and its processing by relevant personnel, ensuring the real-time performance and low latency of live streaming and related services. The latency data corresponding to the various indicators aggregated within the sliding time window is smoothed using a time series smoothing algorithm, such as a first-order difference equation, to reduce the impact of extreme values in the latency data within the sliding time window. This ensures the accuracy of the latency data corresponding to the various indicators within the sliding time window, thereby improving the accuracy and effectiveness of alarms. The preset time length can be 5 min, 10 min, or 15 min, etc., and is not limited here.
[0068] In some embodiments, when a broadcaster conducts a live online event, a 5-minute sliding time window is constructed so that latency alarms corresponding to the various indicators can be raised accurately and handled in a timely manner. The window is built from the minute-level aggregated latency data, including the response latency, network transmission latency, server processing latency, or client display latency of the live video stream collected by the camera unit of the terminal device, as well as the response latency, network transmission latency, server processing latency, or client display latency from the client initiating a business request to receiving the response result during the live online event. The time series of latency data corresponding to each indicator is framed according to the 5-minute time unit length. Based on a first-order difference equation, the aggregated latency data corresponding to each indicator within the sliding time window is smoothed to reduce the impact of extreme values, so that the fluctuation of the latency data corresponding to each indicator within the frame can then be calculated.
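The first-order difference between adjacent time nodes is simply the change from one value to the next. The text does not spell out how that change is then used to smooth the data, so the sketch below adds an assumed rule purely for illustration: each step is limited to a hypothetical bound `max_step`, measured against the previously smoothed value. The function names and sample values are also hypothetical.

```python
def first_order_differences(values):
    """Change between each pair of adjacent time nodes: y[t] - y[t-1]."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


def smooth_by_step_limit(values, max_step):
    """Assumed smoothing rule: cap the change per time node at +/- max_step.

    Each new value is compared with the previously smoothed value, so a
    single extreme value is damped and the series can recover afterwards.
    """
    if not values:
        return []
    smoothed = [values[0]]
    for value in values[1:]:
        change = value - smoothed[-1]
        bounded = max(-max_step, min(max_step, change))
        smoothed.append(smoothed[-1] + bounded)
    return smoothed


window_values = [120.0, 118.0, 125.0, 400.0, 122.0]  # one extreme value at 400
deltas = first_order_differences(window_values)
smoothed = smooth_by_step_limit(window_values, max_step=50.0)
```

In this example the spike at 400 is reduced to 175, and the next value settles at 125 rather than the raw 122 because the return step is capped as well.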
[0069] Step S1300: Calculate the variance ratio of the time delay data corresponding to each indicator after smoothing between adjacent time nodes to obtain the time delay fluctuation value of each indicator.
[0070] Within a preset sliding time window, the proportion of effective delay alarms corresponding to each indicator is usually very low. The effective delay alarms can be detected based on the delay data fluctuations of each indicator within the preset sliding time window. The delay data fluctuation detection can be based on a dynamic variance algorithm to calculate the variance ratio of the delay data corresponding to each indicator after smoothing between adjacent time nodes to obtain the delay fluctuation value of each indicator. The fluctuation of the delay data corresponding to each indicator is judged based on the delay fluctuation value.
[0071] In some embodiments, in a live streaming scenario, the smoothed latency data corresponding to each indicator is added to the corresponding time series in real time. This covers the response latency, network transmission latency, server processing latency, or client display latency from the camera unit of the terminal device capturing the live video stream to pushing it to the client, as well as the latency from the client initiating a business request to receiving the response result during the live streaming process. First, the variance of the latency data corresponding to each indicator at the previous time point is calculated in the corresponding time series. Then, the variance of the latency data corresponding to each indicator at the current time point is calculated in the corresponding time series. From these two variances, the variance ratio of the smoothed latency data corresponding to each indicator between the current time point and the previous time point is calculated, and this variance ratio is used as the latency fluctuation value of each indicator. Based on the latency fluctuation value, the range of change of the latency data corresponding to each indicator in the live stream can be determined. When the range of change reaches a certain threshold, the latency alarm corresponding to that indicator needs to be triggered.
[0072] Specifically, let the time nodes be T = {T1, T2, ..., Tn-1, Tn} and the latency data values corresponding to each indicator be Y = {Y1, Y2, ..., Yn}, so that the latency data value at the current time node Tn is Yn. First, the variance of the latency data corresponding to each indicator at the previous time node is calculated in the corresponding time series, that is, the variance σ1 over the time period from T1 to Tn-1. Then, the variance at the current time node is calculated: the latency data Yn at the current time node is added to the time series, and the variance σ2 over the time period from T1 to Tn is calculated. The variance cycle ratio (the relative change in variance from one time point to the next) is calculated for every two adjacent time points to determine the data fluctuation.
[0073] The formula for the variance between the current time point and the previous time point is:
[0074] $$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(\mu_i - X\right)^2$$
[0075] Where X represents the average of the latency data values corresponding to each indicator, μ_i represents the i-th latency data value corresponding to each indicator, and N represents the number of latency data values corresponding to each indicator.
[0076] Formula for variance cycle ratio: Variance cycle ratio = [(Variance value at the current time point - Variance value at the previous time point) / Variance value at the previous time point] × 100%
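As an illustration of paragraphs [0072] to [0076], the following minimal Python sketch computes the variance cycle ratio for one indicator. The function name `variance_cycle_ratio` and the error handling for short or constant series are assumptions made for this example and are not part of the application; the population variance is used because the formula divides by N.

```python
from statistics import pvariance


def variance_cycle_ratio(series):
    """Percentage change in variance when the newest point Yn is added.

    sigma_1 is the variance over T1..Tn-1 and sigma_2 the variance over
    T1..Tn, following paragraph [0072].
    """
    if len(series) < 3:
        # Assumption: at least two earlier points are needed for sigma_1.
        raise ValueError("need at least three data points")
    previous_var = pvariance(series[:-1])  # sigma_1 over T1..Tn-1
    current_var = pvariance(series)        # sigma_2 over T1..Tn
    if previous_var == 0:
        # Assumption: a constant history makes the ratio undefined.
        raise ValueError("previous variance is zero; ratio is undefined")
    return (current_var - previous_var) / previous_var * 100.0


if __name__ == "__main__":
    # Hypothetical minute-level latencies in milliseconds.
    print(variance_cycle_ratio([1.0, 3.0, 5.0]))
```

With the series 1, 3, 5 the previous variance is 1 and the current variance is 8/3, so the ratio is about 166.67%.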
[0077] Step S1400: Compare the latency fluctuation value with a preset threshold. If the latency fluctuation value reaches the preset threshold, a latency alarm is triggered.
[0078] After calculating the latency fluctuation values corresponding to each indicator, they need to be compared with preset thresholds to accurately detect valid latency alarm points, filter out a large number of invalid alarm messages, and trigger latency alarms accurately. This avoids business losses caused by the inability to promptly investigate and process valid alarm messages. When the latency fluctuation values of each indicator reach the corresponding preset thresholds, it is considered that the online latency indicators at the current time point are deteriorating, and a latency alarm needs to be issued in a timely manner and relevant personnel need to be notified for handling to prevent the occurrence and spread of online service failures. The preset thresholds can be determined based on the historical alarm data of each indicator to enhance the effectiveness and accuracy of the alarms.
[0079] In some embodiments, in a live streaming scenario, the latency fluctuation values of various indicators are calculated, including the latency from the camera unit of the terminal device acquiring the corresponding live video stream to pushing it to the client, the network transmission latency, the server processing latency, or the client display latency, and the latency from the client initiating a service request to receiving the response result during the live streaming process. The latency fluctuation values of each indicator are compared with the corresponding preset thresholds. If the latency fluctuation value reaches the corresponding preset threshold, a latency alarm for the corresponding indicator at the current time node is triggered.
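Step S1400 can be sketched per indicator as follows. This is a minimal sketch, assuming that "reaches" means greater than or equal to; the indicator names and numeric values are hypothetical.

```python
def triggered_alarms(fluctuations, thresholds):
    """Return the indicators whose fluctuation value reaches its threshold.

    Both arguments map an indicator name to a number; "reaches" is read
    here as >=, which is an assumption.
    """
    return [
        name
        for name, value in fluctuations.items()
        if value >= thresholds[name]
    ]


if __name__ == "__main__":
    # Hypothetical fluctuation values and preset thresholds, in percent.
    fluctuations = {"response": 30.0, "network": 5.0, "display": 20.0}
    thresholds = {"response": 20.0, "network": 20.0, "display": 20.0}
    print(triggered_alarms(fluctuations, thresholds))
```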
[0080] As can be seen from the above embodiments, this application addresses the problems of frequent alarm threshold modifications by operations and maintenance personnel, the mixing of valid alarm information with a large number of invalid alarm information, and the inability of operations and maintenance personnel to promptly identify and process valid alarm information from a large number of alarm information. It aggregates scattered and irregular raw latency data into minute-level data, thereby making the latency data operable. Within a preset time window, based on dynamic variance algorithms and time series smoothing algorithms, it accurately identifies valid latency alarm points, filters out a large number of invalid alarm information, significantly reduces alarm noise, and provides valuable alarm information to operations and maintenance personnel for further processing. By improving the accuracy of identifying valid alarms, it significantly reduces the workload of operations and maintenance personnel in dealing with invalid alarm information, allowing them to focus their time and energy on each valid alarm. It also avoids frequent modifications to alarm thresholds by operations and maintenance personnel, thereby significantly improving the efficiency of troubleshooting and processing online service faults, ensuring the real-time performance and low latency of online services, and preventing business losses due to the inability to promptly identify and process valid alarm information.
[0081] Based on any embodiment of this application, please refer to Figure 3. Before the step of comparing the latency fluctuation value with a preset threshold, the following steps are also included:
[0082] Step S1301: Obtain the latency data corresponding to each indicator in the valid latency alarms within the historical period;
[0083] The system acquires latency data corresponding to various indicators, including the response latency, network transmission latency, server processing latency, and client display latency of the live video stream collected by the camera unit of the terminal device within a historical period and pushed to the client, as well as the response latency, network transmission latency, server processing latency, and client display latency from the client initiating a business request to receiving the response result during the live broadcast. The historical period can be 3 months or 6 months, etc., and is not limited here.
[0084] Step S1303: Calculate the mean variance of the latency data corresponding to each indicator in the effective latency alarm;
[0085] In order to fully consider the changes in the corresponding latency data in the effective latency alarms of each indicator and to accurately detect the effective latency alarm points, it is necessary to calculate the mean variance of the latency data corresponding to each effective latency alarm point in the historical period, and compare the magnitude of the mean variance of the latency data corresponding to each effective latency alarm point to determine the corresponding fluctuation threshold of each indicator.
[0086] Specifically, in the scenario of live streaming, taking the acquisition of the corresponding live video stream by the camera unit of the terminal device and the effective response latency alarm generated within 3 months as an example, the response latency data corresponding to the acquired effective response latency alarm is used to construct a time series, the variance mean of the response latency data corresponding to each effective latency alarm point within 3 months is calculated, and the magnitude of the variance mean is compared to determine the fluctuation threshold.
[0087] Step S1305: Use the maximum value of the variance mean as the fluctuation threshold.
[0088] Based on a comprehensive consideration of online services, to avoid triggering invalid alarm information, and to avoid frequent modification of alarm thresholds by maintenance personnel, thereby reducing their workload, the maximum value of the mean variance of the latency data corresponding to the valid latency alarm points within the historical period is used as the fluctuation threshold of the corresponding indicator.
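Steps S1301 to S1305 leave the exact data layout open. The sketch below shows one possible reading, under the assumption that each valid historical alarm is stored together with several windows of latency samples: the variance of each window is computed, the variances are averaged per alarm, and the largest average becomes the fluctuation threshold. The function name and the nested-list layout are hypothetical.

```python
from statistics import fmean, pvariance


def fluctuation_threshold(alarm_windows):
    """Maximum over alarms of the mean window variance.

    alarm_windows: list of alarms; each alarm is a list of windows; each
    window is a list of latency samples (assumed layout).
    """
    mean_variances = []
    for windows in alarm_windows:
        variances = [pvariance(window) for window in windows]
        mean_variances.append(fmean(variances))
    return max(mean_variances)


if __name__ == "__main__":
    # Two hypothetical valid alarms from the historical period.
    alarm_a = [[1.0, 3.0], [2.0, 2.0]]  # window variances 1.0 and 0.0
    alarm_b = [[0.0, 4.0]]              # window variance 4.0
    print(fluctuation_threshold([alarm_a, alarm_b]))
```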
[0089] Based on any embodiment of this application, please refer to Figure 4. Before the step of comparing the latency fluctuation value with a preset threshold, the following steps are also included:
[0090] Step S1302: Determine the frequency threshold of the time delay data corresponding to each indicator;
[0091] The frequency threshold needs to take into account the differences between different services. The frequency threshold for the latency data corresponding to each indicator can be set as needed according to the actual service situation. The frequency threshold can be 15%, 20%, or 25%, etc.
[0092] Step S1304: Construct a maximum value function expression based on the fluctuation threshold and the frequency threshold, wherein the maximum value function expression is used to select the maximum value between the frequency threshold and the fluctuation threshold;
[0093] To further avoid triggering invalid alarm messages, reduce the time maintenance personnel spend on them, and let personnel focus on each valid alarm message, the fluctuation threshold and the frequency threshold are associated with each indicator. The indicators include the response latency, network transmission latency, server processing latency, or client display latency from the camera unit of the terminal device acquiring the live video stream to pushing it to the client, as well as the corresponding latencies from the client initiating a business request to receiving the response result during the live broadcast. A maximum value function expression is constructed based on the fluctuation threshold and the frequency threshold of each indicator. The expression selects the larger of the frequency threshold and the fluctuation threshold, which keeps too many invalid alarm messages from being included in the alarm information.
[0094] Step S1306: Use the constructed maximum value function expression as the preset threshold.
[0095] The maximum value function expression constructed based on the fluctuation threshold and frequency threshold corresponding to each indicator can be used as a preset threshold to filter out a large number of invalid alarm messages, thereby improving the accuracy of triggering delay alarms and facilitating timely handling of faults caused by online services by operation and maintenance personnel.
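Steps S1302 to S1306 reduce to taking, for each indicator, the larger of the two thresholds. A minimal sketch follows, assuming both thresholds are expressed on the same scale (here, percent); the function name, indicator names, and values are hypothetical.

```python
def preset_thresholds(fluctuation_thresholds, frequency_thresholds):
    """Per-indicator preset threshold = max(frequency, fluctuation)."""
    return {
        name: max(fluctuation_thresholds[name], frequency_thresholds[name])
        for name in fluctuation_thresholds
    }


if __name__ == "__main__":
    # Hypothetical thresholds in percent.
    fluctuation = {"response": 18.0, "network": 32.5}
    frequency = {"response": 20.0, "network": 20.0}
    print(preset_thresholds(fluctuation, frequency))
```

Taking the maximum means the frequency threshold acts as a floor: a small historical fluctuation threshold cannot make the alarm more sensitive than the configured frequency threshold.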
[0096] Based on any embodiment of this application, please refer to Figure 5. The step of obtaining the latency data corresponding to each indicator in the valid latency alarms within the historical period includes the following steps:
[0097] Step S13011: Clean or fill in missing values for the latency data corresponding to each indicator in the obtained valid latency alarms.
[0098] The latency data corresponding to each indicator in the valid latency alarms within the historical period may contain a large number of missing values, considerable noise, and abnormal data caused by manual input errors. The indicators include the response latency, network transmission latency, server processing latency, or client display latency from the camera unit of the terminal device acquiring the live video stream to pushing it to the client, as well as the corresponding latencies from the client initiating a business request to receiving the response result during the live broadcast. If these data are not cleaned or their missing values are not filled, the accuracy and reliability of the fluctuation threshold of each indicator will be affected. Therefore, data cleaning or missing value filling is performed on the latency data of each indicator in the valid latency alarms to ensure the completeness and accuracy of the data and of the resulting fluctuation thresholds, thereby ensuring the accuracy of the latency alarms.
[0099] Step S13013: Normalize the time delay data corresponding to each indicator after data cleaning or missing value filling.
[0100] Since the dimensions and magnitudes of the latency data corresponding to each indicator in the effective latency alarm within the historical period may be inconsistent, in order to eliminate the influence of dimensions and magnitudes, it is necessary to normalize the latency data corresponding to each indicator after data cleaning or missing value filling.
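The application does not name a specific filling or normalization method. The sketch below assumes mean imputation for missing values (represented as `None`) and min-max scaling to the range [0, 1]; both choices and both function names are assumptions made for illustration.

```python
from statistics import fmean


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to fill from")
    fill = fmean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly to [0, 1]; a constant series maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    raw = [10.0, None, 30.0]  # hypothetical latencies with one gap
    filled = fill_missing(raw)
    print(filled, min_max_normalize(filled))
```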
[0101] Based on any embodiment of this application, please refer to Figure 6. The step of aggregating the latency data corresponding to each indicator into latency data at a preset time level includes the following steps:
[0102] Step S1101: Aggregate the latency data corresponding to each indicator into latency data at the minute level;
[0103] The generation of latency data corresponding to various indicators, such as the response latency, network transmission latency, server processing latency, or client display latency, from the camera unit of the terminal device acquiring the corresponding live video stream to pushing it to the client, as well as the response latency, network transmission latency, server processing latency, or client display latency from the client initiating a business request to receiving the response result during the live broadcast, is irregular and scattered. Therefore, it is necessary to aggregate the latency data corresponding to each indicator into minute-level latency data based on the time aggregation module.
[0104] Step S1103: Calculate the mean value of the latency data corresponding to each indicator within the minute level;
[0105] Obtain the corresponding latency data of each indicator at each single point, calculate the average of the corresponding latency data of each indicator at each single point within the minute level, and use the average of the corresponding latency data of each indicator at each single point within the minute level as the aggregated latency data of each indicator.
[0106] Step S1105: Take the average value of the latency data corresponding to each indicator as the aggregated latency data corresponding to each indicator.
[0107] Since the alarm information generated by each indicator needs to take into account the latency data generated within minutes, rather than the latency data generated by a single point, the average of the latency data generated by each indicator within minutes is used as the aggregated latency data for each indicator.
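Steps S1101 to S1105 can be sketched as grouping raw samples by minute and averaging each group. The input layout (Unix timestamps in seconds paired with latencies in milliseconds) and the function name are assumptions made for this example.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_by_minute(samples):
    """Map each minute (start timestamp in seconds) to its mean latency.

    samples: iterable of (unix_timestamp_seconds, latency) pairs.
    """
    buckets = defaultdict(list)
    for timestamp, latency in samples:
        minute_start = int(timestamp // 60) * 60
        buckets[minute_start].append(latency)
    return {minute: fmean(values) for minute, values in sorted(buckets.items())}


if __name__ == "__main__":
    # Hypothetical irregular single-point samples.
    samples = [(0, 100.0), (30, 200.0), (61, 50.0)]
    print(aggregate_by_minute(samples))
```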
[0108] Based on any embodiment of this application, please refer to Figure 7. The step of constructing a sliding time window of a preset time length based on the latency data corresponding to each aggregated indicator includes the following steps:
[0109] Step S1201: Determine the size of the sliding time window;
[0110] In the context of live streaming, there are high requirements for real-time performance and low latency. It is necessary to constantly monitor the latency data corresponding to various indicators within a short time window. Therefore, the length of the sliding time window is determined to be 5 min, which facilitates the timely detection of alarm information corresponding to various indicators and handing it over to relevant personnel for processing, so as to ensure the real-time performance and low latency of live streaming and related services.
[0111] Step S1203: Use the time delay data corresponding to each aggregated indicator as the sliding time window value;
[0112] The latency data values corresponding to various indicators, such as the response latency, network transmission latency, server processing latency, or client display latency of the camera unit of the aggregated terminal device acquiring the corresponding live video stream and pushing it to the client, as well as the response latency, network transmission latency, server processing latency, or client display latency from the client initiating a business request to receiving the response result during the live broadcast, are used as the sliding time window value.
[0113] Step S1205: Construct a sliding time window of a preset time length based on the size of the sliding time window and the value of the sliding time window.
[0114] In order to accurately and effectively respond to the latency alarms corresponding to each indicator, and to process the latency alarms in a timely manner to avoid business losses caused by the inability to promptly investigate and process effective alarm information, a sliding time window of a preset time length is constructed based on the determined sliding time window size and sliding time window value.
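A sliding time window of five minute-level values can be sketched with a bounded double-ended queue, which discards the oldest value when a new one arrives. The class name `SlidingWindow` and its methods are hypothetical; only the 5-minute default comes from paragraph [0110].

```python
from collections import deque


class SlidingWindow:
    """Holds the most recent minute-level latency values of one indicator."""

    def __init__(self, size_minutes=5):
        # deque with maxlen drops the oldest entry automatically.
        self._values = deque(maxlen=size_minutes)

    def push(self, aggregated_latency):
        """Add the newest minute-level value and return the window contents."""
        self._values.append(aggregated_latency)
        return list(self._values)

    def is_full(self):
        return len(self._values) == self._values.maxlen


if __name__ == "__main__":
    window = SlidingWindow(size_minutes=5)
    for value in [120.0, 118.0, 125.0, 300.0, 122.0, 119.0]:
        contents = window.push(value)
    print(contents, window.is_full())
```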
[0115] Based on any embodiment of this application, please refer to Figure 8. The step of smoothing the latency data corresponding to each aggregated indicator within the sliding time window based on a time series smoothing algorithm includes the following steps:
[0116] Step S1207: Calculate the change in time delay data corresponding to each aggregated index between adjacent time nodes within the sliding time window based on the first-order difference equation.
[0117] Specifically, based on a first-order difference equation, the changes in latency data corresponding to various indicators, such as the response latency, network transmission latency, server processing latency, or client display latency of the terminal device's camera unit acquiring the corresponding live video stream from adjacent time nodes within the sliding time window to pushing it to the client, and the response latency, network transmission latency, server processing latency, or client display latency from the client initiating a service request to receiving the response result during the live broadcast, are calculated. The first-order difference equation is as follows:
[0118] $\Delta y_x = y_{x+1} - y_x \quad (x = 0, 1, 2, \ldots)$
[0119] Where $y_{x+1}$ represents the latency data value corresponding to each indicator at the current time node, $y_x$ represents the latency data value corresponding to each indicator at the adjacent time node of the current time node, and $\Delta y_x$ represents the change in latency data for each indicator between adjacent time nodes.
[0120] Step S1209: Smooth the time delay data corresponding to each aggregated indicator within the sliding time window according to the amount of change.
[0121] In order to reduce the impact of extreme values of the latency data corresponding to each indicator within the sliding time window, and thus accurately calculate the latency fluctuation value of the latency data corresponding to each indicator within the window, the latency data corresponding to each aggregated indicator within the sliding time window is smoothed according to the amount of change.
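The application states that smoothing is driven by the first-order differences but does not give the smoothing rule. The sketch below computes the differences of step S1207 and then applies one hypothetical rule for step S1209: each smoothed point may move at most `max_step` away from the previous smoothed point. The clamp and the `max_step` parameter are assumptions, not the claimed method.

```python
def first_order_differences(values):
    """Return delta_y[x] = y[x+1] - y[x] for adjacent time nodes."""
    return [values[i + 1] - values[i] for i in range(len(values) - 1)]


def smooth_by_change(values, max_step):
    """Limit the change between consecutive smoothed points to max_step.

    Assumed rule: the step from the previous smoothed point towards the
    next raw point is clamped to [-max_step, +max_step].
    """
    if not values:
        return []
    smoothed = [values[0]]
    for raw in values[1:]:
        delta = raw - smoothed[-1]
        clamped = max(-max_step, min(max_step, delta))
        smoothed.append(smoothed[-1] + clamped)
    return smoothed


if __name__ == "__main__":
    window = [10.0, 100.0, 12.0]  # hypothetical spike at the second node
    print(first_order_differences(window))
    print(smooth_by_change(window, max_step=20.0))
```

With this rule the spike of 100 is reduced to 30, while the following value of 12 is kept because it lies within 20 of the smoothed predecessor.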
[0122] Please refer to Figure 9. An alarm detection device provided to meet one of the purposes of this application includes a data aggregation module 1100, a data processing module 1200, a fluctuation value determination module 1300, and an alarm triggering module 1400. The data aggregation module 1100 is configured to acquire latency data corresponding to various indicators from the time the requester initiates a request to the time the processor receives the response, and aggregate the latency data corresponding to each indicator into latency data at a preset time level. The data processing module 1200 is configured to construct a sliding time window of a preset time length based on the aggregated latency data corresponding to each indicator, and smooth the aggregated latency data corresponding to each indicator within the sliding time window based on a time series smoothing algorithm. The fluctuation value determination module 1300 is configured to calculate the variance cycle ratio of the smoothed latency data corresponding to each indicator between adjacent time nodes to obtain the latency fluctuation value of each indicator. The alarm triggering module 1400 is configured to compare the latency fluctuation value with a preset threshold; if the latency fluctuation value reaches the preset threshold, a latency alarm is triggered.
[0123] Based on any embodiment of this application, the alarm detection device of this application further includes: an acquisition module, configured to acquire latency data corresponding to each indicator in the effective latency alarms within a historical period; a calculation module, configured to calculate the mean variance of the latency data corresponding to each indicator in the effective latency alarms; and a fluctuation threshold determination module, configured to use the maximum value of the mean variance as the fluctuation threshold.
[0124] Based on any embodiment of this application, the alarm detection device of this application further includes: a frequency threshold determination module, configured to determine the frequency threshold of the delay data corresponding to each indicator; a function construction module, configured to construct a maximum value function expression based on the fluctuation threshold and the frequency threshold, wherein the maximum value function expression is used to select the maximum value between the frequency threshold and the fluctuation threshold; and a preset threshold determination module, configured to use the constructed maximum value function expression as a preset threshold.
[0125] Based on any embodiment of this application, the acquisition module includes: a preprocessing unit, configured to perform data cleaning or missing value filling on the latency data corresponding to each indicator in the acquired valid latency alarms; and
[0126] a normalization processing unit, configured to normalize the latency data corresponding to each indicator after data cleaning or missing value filling.
[0127] Based on any embodiment of this application, the data aggregation module 1100 includes: an aggregation unit configured to aggregate the latency data corresponding to each indicator into latency data at the minute level; an average calculation unit configured to calculate the average of the latency data corresponding to each indicator within the minute level; and a data determination unit configured to use the average of the latency data corresponding to each indicator as the aggregated latency data corresponding to each indicator.
[0128] Based on any embodiment of this application, the data processing module 1200 includes: a window size determination unit, configured to determine the size of the sliding time window; a window value determination unit, configured to use the time delay data corresponding to each aggregated indicator as the sliding time window value; and a window construction unit, configured to construct a sliding time window of a preset time length according to the size of the sliding time window and the sliding time window value.
[0129] Based on any embodiment of this application, the data processing module 1200 includes: a change determination unit, configured to calculate the change in the time delay data corresponding to each aggregated indicator between adjacent time nodes within the sliding time window based on a first-order difference equation; and a smoothing processing unit, configured to smooth the time delay data corresponding to each aggregated indicator within the sliding time window according to the change.
[0130] Based on any embodiment of this application, please refer to Figure 10. Another embodiment of this application also provides an alarm detection device, which can be implemented by a computer device. Figure 10 shows the internal structure of the computer device. The computer device includes a processor, a computer-readable storage medium, a memory, and a network interface connected via a system bus. The computer-readable storage medium stores an operating system, a database, and computer-readable instructions. The database may store control information sequences. When the computer-readable instructions are executed by the processor, the processor can implement an alarm detection method. The processor of the computer device provides computing and control capabilities to support the operation of the entire computer device. The memory of the computer device may store computer-readable instructions. When the computer-readable instructions are executed by the processor, the processor can execute the alarm detection method of this application. The network interface of the computer device is used for communication with a terminal. Those skilled in the art will understand that the structure shown in Figure 10 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.
[0131] In this embodiment, the processor is used to execute the specific functions of each module and its sub-modules shown in Figure 9, and the memory stores the program code and various data required to execute these modules or sub-modules. The network interface is used for data transmission between the user terminal and the server. In this embodiment, the memory stores the program code and data required to execute all modules and sub-modules in the alarm detection device of this application, and the server can call this program code and data to execute the functions of all sub-modules.
[0132] This application also provides a storage medium storing computer-readable instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of the alarm detection method described in any embodiment of this application.
[0133] This application also provides a computer program product, including a computer program / instructions that, when executed by one or more processors, implement the steps of the alarm detection method described in any embodiment of this application.
[0134] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments of this application can be implemented by a computer program instructing related hardware. This computer program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the methods described above. The aforementioned storage medium can be a magnetic disk, optical disk, read-only memory (ROM), or random access memory (RAM), etc.
[0135] The above description is only a partial embodiment of this application. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of this application, and these improvements and modifications should also be considered within the scope of protection of this application.
[0136] In summary, this application aggregates scattered and irregular raw latency data into minute-level data, making the latency data operable. Within a preset time window, based on dynamic variance and time series smoothing algorithms, it accurately identifies valid latency alarm points, filters out a large number of invalid alarm messages, and significantly reduces alarm noise. This provides valuable alarm information to operations and maintenance personnel for further processing. By improving the accuracy of identifying valid alarms, it significantly reduces the workload of operations and maintenance personnel on invalid alarm messages, allowing them to focus their time and energy on each valid alarm message. It also avoids the need for operations and maintenance personnel to frequently modify alarm thresholds, thereby significantly improving the efficiency of troubleshooting and handling online service faults, ensuring the real-time nature and low latency of online services, and preventing business losses due to the inability to promptly investigate and handle valid alarm information.
Claims
1. An alarm detection method, characterized in that it includes the following steps:
Acquiring the latency data corresponding to each indicator from the time the requester initiates the request to the time the processor receives the response result, and aggregating the latency data corresponding to each indicator into latency data at a preset time level, wherein the latency data is the latency generated during live streaming;
Constructing a sliding time window of a preset time length based on the latency data corresponding to each aggregated indicator, and smoothing the latency data corresponding to each aggregated indicator within the sliding time window based on a time series smoothing algorithm, which includes: calculating the change in the latency data corresponding to each aggregated indicator between adjacent time nodes within the sliding time window based on a first-order difference equation; and smoothing the latency data corresponding to each aggregated indicator within the sliding time window based on the change;
Calculating the variance ratio of the smoothed latency data corresponding to each indicator between adjacent time nodes to obtain the latency fluctuation value of each indicator;
Obtaining the latency data corresponding to each indicator in the valid latency alarms within the historical period, calculating the mean variance of the latency data corresponding to each indicator in the valid latency alarms, and taking the maximum value of the mean variance as the fluctuation threshold;
Determining the frequency threshold of the latency data corresponding to each indicator within the historical period, constructing a maximum value function expression based on the fluctuation threshold and the frequency threshold, wherein the maximum value function expression is used to select the maximum value between the frequency threshold and the fluctuation threshold, and setting a preset threshold based on the constructed maximum value function expression;
Comparing the latency fluctuation value with the preset threshold, and triggering a latency alarm if the latency fluctuation value reaches the preset threshold.
2. The alarm detection method according to claim 1, characterized in that the step of obtaining the latency data corresponding to each indicator in the valid latency alarms within the historical period includes the following steps: performing data cleaning or missing value filling on the latency data corresponding to each indicator in the obtained valid latency alarms; and normalizing the latency data corresponding to each indicator after data cleaning or missing value filling.
3. The alarm detection method according to claim 1, characterized in that the step of aggregating the latency data corresponding to each indicator into latency data at a preset time level includes the following steps: aggregating the latency data corresponding to each indicator into latency data at the minute level; calculating the average of the latency data corresponding to each indicator within the minute level; and using the average of the latency data corresponding to each indicator as the aggregated latency data corresponding to each indicator.
4. The alarm detection method according to claim 1, characterized in that the step of constructing a sliding time window of a preset time length based on the latency data corresponding to each aggregated indicator includes the following steps: determining the size of the sliding time window; using the latency data corresponding to each aggregated indicator as the sliding time window value; and constructing a sliding time window of a preset time length based on the size of the sliding time window and the sliding time window value.
5. An alarm detection device, characterized by comprising:
a data aggregation module configured to acquire the latency data corresponding to each indicator from the time the requester initiates a request to the time the processor receives the response result, and to aggregate the latency data corresponding to each indicator into latency data at a preset time level, the latency data being the latency generated during live streaming;
a data processing module configured to construct a sliding time window of a preset time length based on the latency data corresponding to each aggregated indicator, and to smooth the latency data corresponding to each aggregated indicator within the sliding time window based on a time series smoothing algorithm, including: calculating, based on a first-order difference equation, the change in the latency data corresponding to each aggregated indicator between adjacent time nodes within the sliding time window; and smoothing the latency data corresponding to each aggregated indicator within the sliding time window based on the change;
a fluctuation value determination module configured to calculate the variance ratio, between adjacent time nodes, of the smoothed latency data corresponding to each indicator, to obtain the latency fluctuation value of each indicator;
an acquisition module configured to acquire the latency data corresponding to each indicator in the valid latency alarms within a historical period;
a calculation module configured to calculate the mean variance of the latency data corresponding to each indicator in the valid latency alarms;
a fluctuation threshold determination module configured to use the maximum value of the mean variance as the fluctuation threshold;
a frequency threshold determination module configured to determine the frequency threshold of the latency data corresponding to each indicator within the historical period;
a function construction module configured to construct a maximum value function expression based on the fluctuation threshold and the frequency threshold, wherein the maximum value function expression is used to select the maximum value between the frequency threshold and the fluctuation threshold;
a preset threshold determination module configured to set the preset threshold based on the constructed maximum value function expression; and
an alarm triggering module configured to compare the latency fluctuation value with the preset threshold and to trigger a latency alarm if the latency fluctuation value reaches the preset threshold.
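The data flow through the modules of claim 5 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the claim does not give the smoothing formula, so the sketch damps each first-order difference by a hypothetical factor `alpha`; it reads the "variance ratio between adjacent time nodes" as the ratio of the variances of two adjacent windows; and it reads "mean variance" as the population variance of each series. All function names are hypothetical.

```python
from statistics import pvariance


def first_differences(window):
    """Change between adjacent time nodes: x[t] - x[t-1]."""
    return [b - a for a, b in zip(window, window[1:])]


def smooth(window, alpha=0.5):
    """Assumed smoothing rule: rebuild the series from damped differences."""
    smoothed = [float(window[0])]
    for change in first_differences(window):
        smoothed.append(smoothed[-1] + alpha * change)
    return smoothed


def fluctuation_value(prev_window, curr_window):
    """Assumed variance ratio: var(current smoothed) / var(previous smoothed)."""
    prev_var = pvariance(smooth(prev_window))
    curr_var = pvariance(smooth(curr_window))
    if prev_var == 0:
        # Guard against division by zero for a perfectly flat previous window.
        return float("inf") if curr_var > 0 else 0.0
    return curr_var / prev_var


def preset_threshold(valid_alarm_series, frequency_threshold):
    """max(fluctuation threshold, frequency threshold), as the claim recites.

    The fluctuation threshold is the largest variance found among the
    historical valid-alarm series (variance reading is an assumption).
    """
    fluctuation_threshold = max(pvariance(s) for s in valid_alarm_series)
    return max(fluctuation_threshold, frequency_threshold)


def should_alarm(fluctuation, threshold):
    """Trigger when the fluctuation value reaches the preset threshold."""
    return fluctuation >= threshold
```

A caller would compute `fluctuation_value` for each pair of adjacent windows of one indicator and pass the result, together with the output of `preset_threshold`, to `should_alarm`. How the frequency threshold itself is derived is not stated in this claim, so it is taken here as an input.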
6. The alarm detection device according to claim 5, characterized in that the acquisition module comprises: a preprocessing unit configured to perform data cleaning or missing value filling on the acquired latency data corresponding to each indicator in the valid latency alarms; and a normalization processing unit configured to normalize the latency data corresponding to each indicator after the data cleaning or missing value filling.
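The preprocessing and normalization units of claim 6 might look like the sketch below. The claim names neither a filling strategy nor a normalization scheme, so forward filling and min-max scaling are assumptions chosen for illustration, as are the function names.

```python
def fill_missing(series):
    """Fill None entries with the most recent valid value (assumed strategy).

    Leading gaps, which have no earlier value, take the first valid value.
    """
    valid = [v for v in series if v is not None]
    if not valid:
        raise ValueError("series has no valid samples")
    last = valid[0]
    filled = []
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def min_max_normalize(series):
    """Scale values into [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(value - lo) / (hi - lo) for value in series]


# Hypothetical latency samples (ms) from one valid alarm, with gaps.
raw = [None, 100.0, None, 300.0]
normalized = min_max_normalize(fill_missing(raw))
```

Normalizing before computing the fluctuation threshold puts indicators with different latency scales on a comparable footing.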
7. The alarm detection device according to claim 5, characterized in that the data aggregation module comprises: an aggregation unit configured to aggregate the latency data corresponding to each indicator into latency data at the minute level; a mean calculation unit configured to calculate the mean of the minute-level latency data corresponding to each indicator; and a data determination unit configured to use the mean of the latency data corresponding to each indicator as the aggregated latency data corresponding to that indicator.
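The minute-level aggregation of claim 7 can be sketched as follows, assuming each raw sample is a hypothetical `(epoch_seconds, latency_ms)` pair for a single indicator. The sample format and the function name are not taken from the claim.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_to_minutes(samples):
    """Group samples by minute and return {minute_start_epoch: mean latency}."""
    buckets = defaultdict(list)
    for timestamp, latency in samples:
        minute_start = int(timestamp) // 60 * 60  # truncate to the minute
        buckets[minute_start].append(latency)
    # The mean of each bucket becomes that minute's aggregated latency.
    return {minute: fmean(values) for minute, values in sorted(buckets.items())}


# Hypothetical raw samples spanning two minutes.
samples = [(0, 100.0), (30, 200.0), (60, 50.0), (119, 150.0)]
per_minute = aggregate_to_minutes(samples)
```

The resulting per-minute means are the values that feed the sliding time window.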
8. The alarm detection device according to claim 5, characterized in that the data processing module comprises: a window size determination unit configured to determine the size of the sliding time window; a window value determination unit configured to use the latency data corresponding to each aggregated indicator as the sliding time window values; and a window construction unit configured to construct the sliding time window of the preset time length based on the size and the values of the sliding time window.
9. An alarm detection device, comprising a central processing unit and a memory, characterized in that the central processing unit is configured to invoke and run a computer program stored in the memory to perform the steps of the method according to any one of claims 1 to 4.
10. A computer-readable storage medium, characterized in that it stores, in the form of computer-readable instructions, a computer program implementing the method according to any one of claims 1 to 4, which, when invoked and run by a computer, executes the steps of that method.