Abnormal operation information determination method and device, computer device, and storage medium
By intelligently analyzing the abnormal operation information of the host's batch critical operation paths, the problem of time-consuming and labor-intensive traditional manual analysis is solved, and efficient and precise identification of abnormal information is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INDUSTRIAL AND COMMERCIAL BANK OF CHINA
- Filing Date
- 2023-11-29
- Publication Date
- 2026-06-16
AI Technical Summary
Traditional methods of manually analyzing abnormal operation information of critical operation paths of host computers consume a lot of human resources and have poor timeliness, resulting in low analysis efficiency.
By acquiring the target host's overall operating time and multiple operating data, key operating paths are identified, and the causes of time anomalies or abnormal operating data are filtered and analyzed to generate abnormal operating information. Intelligent methods are used to improve the comprehensiveness and precision of the analysis.
While avoiding the poor timeliness of manual operation, it improves the comprehensiveness and precision of the analysis of abnormal operation information of key operation paths, and enhances the analysis efficiency.
[Figure: CN117435775B abstract drawing]
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, and in particular to a method, apparatus, computer device, and storage medium for determining abnormal operation information.
Background Technology
[0002] The mainframe handles the bank's core business data. In recent years, as the bank's overall business volume has continued to grow, batch processing time has risen steadily. The timeliness requirement for the bank's core batch processing is 150 minutes (excluding special days). Meeting it is both an important foundation for protecting the version release window and an important guarantee that business reports are delivered on time. As business volume increases, keeping the core batch processing time within the 150-minute limit (excluding special days) requires system maintenance personnel to analyze the abnormal operation information of the mainframe batch critical operation path from the macro level down to the micro level and to take targeted measures. How to analyze the abnormal operation information of the mainframe batch critical operation path is therefore a current research focus.
[0003] In traditional methods for identifying abnormal operation information, staff acquire the business data corresponding to all business volumes and then filter the data of the critical operation path item by item in order to analyze the abnormal operation information of the mainframe batch critical operation path. However, this approach requires a significant amount of manpower, and manual analysis is not timely, so the efficiency of analyzing the abnormal operation information of the mainframe batch critical operation path is low.
Summary of the Invention
[0004] Therefore, it is necessary to provide a method, apparatus, computer device, computer-readable storage medium, and computer program product for determining abnormal operation information to address the above-mentioned technical problems.
[0005] Firstly, this application provides a method for determining abnormal operation information. The method includes:
[0006] Obtain the target host's overall runtime and multiple runtime data points, and identify the runtime path corresponding to each runtime data point;
[0007] Among the various running paths, key running paths are selected, and the core batch running time of the key running paths is identified based on the overall running time of the target host.
[0008] If the core batch running time is greater than a preset running time threshold, identify the cause of the time anomaly in the key running path; if the core batch running time is not greater than the preset running time threshold, identify the abnormal running data in the key running path.
[0009] Based on the time anomaly cause of the critical operation path or the abnormal operation data of the critical operation path, the abnormal operation information of the critical operation path is identified, and the abnormal operation information of the critical operation path is used as the target abnormal operation information of the target host.
[0010] Optionally, identifying the running path corresponding to each piece of running data includes:
[0011] Identify the running sequence to which each running data belongs, and cluster the running data according to each running sequence to obtain multiple data groups;
[0012] Identify the batch path corresponding to each data group, and use the batch path corresponding to each data group as the running path corresponding to each running data in each data group.
[0013] Optionally, identifying the cause of the time anomaly in the critical running path includes:
[0014] For each running data in the data group corresponding to the key running path, the running data is distributed and sorted according to the running sequence corresponding to each running data to obtain the running data distribution information corresponding to the key running path.
[0015] Identify the fluctuation range of the operation data distribution information and the fluctuation frequency of the operation data distribution information, and based on a preset time anomaly distribution strategy, identify the anomaly causes corresponding to the fluctuation range and the fluctuation frequency to obtain the time anomaly causes of the key operation path.
[0016] Optionally, identifying abnormal running data in the critical running path includes:
[0017] Identify the waiting time of each piece of data waiting to run in the key running path, and classify each piece of data waiting to run according to the date type to which each piece of data belongs;
[0018] For each date type, based on the waiting time threshold corresponding to the date type, waiting running data with waiting times greater than the waiting time threshold are filtered out as target waiting running data, and each target waiting running data and the date type corresponding to each target waiting running data are regarded as abnormal running data of the critical running path.
[0019] Optionally, identifying the abnormal operation information of the critical operation path based on the time anomaly cause of the critical operation path or the abnormal operation data of the critical operation path includes:
[0020] If the core batch execution time exceeds a preset execution time threshold, based on the cause of the time anomaly in the critical execution path and the path report template, abnormal execution information for the critical execution path is generated, or
[0021] If the core batch running time is not greater than a preset running time threshold, for each abnormal running data, identify the data change information of each abnormal running data, and based on the data change information, identify the abnormal running cause of each abnormal running data; based on each abnormal running cause and the path report template, generate the abnormal running information of the key running path.
[0022] Optionally, the step of identifying data change information for each abnormal operating data point, and identifying the cause of abnormal operation for each abnormal operating data point based on the data change information, includes:
[0023] Obtain the historical running data corresponding to the running data of each abnormal running data, and based on each running data and each historical running data, identify the data volume change information and the level change information of each running data;
[0024] Based on the data volume change information, the level change information, the preset data volume change threshold, and the preset level change threshold, the abnormal change information of each running data is identified, and the abnormal change information of each running data is used as the abnormal running cause of the abnormal running data corresponding to each running data.
[0025] Optionally, after using the abnormal operation information of the critical operation path as the target abnormal operation information of the target host, the method further includes:
[0026] If the core batch running time is greater than a preset running time threshold, based on the time anomaly cause corresponding to the target abnormal running information, the abnormal running program corresponding to the time anomaly cause is identified, and the abnormal running program is adjusted through a preset program optimization strategy to obtain a new running program.
[0027] The target host containing all new running programs will be designated as the optimized target host.
[0028] Optionally, after using the abnormal operation information of the critical operation path as the target abnormal operation information of the target host, the method further includes:
[0029] If the core batch running time is not greater than a preset running time threshold, among the abnormal running data, target abnormal running data corresponding to waiting time greater than the waiting time threshold is selected, and data compression processing is performed on each target abnormal running data to obtain target compressed data, and the target compressed data replaces the target abnormal running data.
[0030] The indexing efficiency of each indexing program in the target host is identified, and target indexing programs whose indexing efficiency is below the efficiency threshold are filtered out. The target indexing programs are adjusted and reorganized to obtain optimized target indexing programs. The target host containing all the replaced target abnormal running data and all the optimized target indexing programs is taken as the optimized target host.
[0031] Secondly, this application also provides an apparatus for determining abnormal operation information. The apparatus includes:
[0032] The acquisition module is used to acquire the overall runtime of the target host and multiple runtime data, and identify the runtime path corresponding to each runtime data.
[0033] The first identification module is used to filter key running paths among the various running paths, and identify the core batch running time of the key running paths based on the overall running time of the target host.
[0034] The second identification module is used to identify the cause of time anomalies in the key running path when the core batch running time is greater than a preset running time threshold, and to identify abnormal running data in the key running path when the core batch running time is not greater than the preset running time threshold.
[0035] The third identification module is used to identify the abnormal operation information of the critical operation path based on the time anomaly cause of the critical operation path or the abnormal operation data of the critical operation path, and to use the abnormal operation information of the critical operation path as the target abnormal operation information of the target host.
[0036] Optionally, the acquisition module is specifically used for:
[0037] Identify the running sequence to which each running data belongs, and cluster the running data according to each running sequence to obtain multiple data groups;
[0038] Identify the batch path corresponding to each data group, and use the batch path corresponding to each data group as the running path corresponding to each running data in each data group.
[0039] Optionally, the second identification module is specifically used for:
[0040] For each running data in the data group corresponding to the key running path, the running data is distributed and sorted according to the running sequence corresponding to each running data to obtain the running data distribution information corresponding to the key running path.
[0041] Identify the fluctuation range of the operation data distribution information and the fluctuation frequency of the operation data distribution information, and based on a preset time anomaly distribution strategy, identify the anomaly causes corresponding to the fluctuation range and the fluctuation frequency to obtain the time anomaly causes of the key operation path.
[0042] Optionally, the second identification module is specifically used for:
[0043] Identify the waiting time of each piece of data waiting to run in the key running path, and classify each piece of data waiting to run according to the date type to which each piece of data belongs;
[0044] For each date type, based on the waiting time threshold corresponding to the date type, waiting running data with waiting times greater than the waiting time threshold are filtered out as target waiting running data, and each target waiting running data and the date type corresponding to each target waiting running data are regarded as abnormal running data of the critical running path.
[0045] Optionally, the third identification module is specifically used for:
[0046] If the core batch execution time exceeds a preset execution time threshold, based on the cause of the time anomaly in the critical execution path and the path report template, abnormal execution information for the critical execution path is generated, or
[0047] If the core batch running time is not greater than a preset running time threshold, for each abnormal running data, identify the data change information of each abnormal running data, and based on the data change information, identify the abnormal running cause of each abnormal running data; based on each abnormal running cause and the path report template, generate the abnormal running information of the key running path.
[0048] Optionally, the third identification module is specifically used for:
[0049] Obtain the historical running data corresponding to the running data of each abnormal running data, and based on each running data and each historical running data, identify the data volume change information and the level change information of each running data;
[0050] Based on the data volume change information, the level change information, the preset data volume change threshold, and the preset level change threshold, the abnormal change information of each running data is identified, and the abnormal change information of each running data is used as the abnormal running cause of the abnormal running data corresponding to each running data.
[0051] Optionally, the device further includes:
[0052] The adjustment module is used to, when the core batch running time is greater than a preset running time threshold, identify the abnormal running program corresponding to the time anomaly cause based on the time anomaly cause corresponding to the target abnormal running information, and adjust the abnormal running program through a preset program optimization strategy to obtain a new running program;
[0053] The determination module is used to identify the target host containing all new running programs as the optimized target host.
[0054] Optionally, the device further includes:
[0055] The compression module is used to, when the core batch running time is not greater than a preset running time threshold, filter out target abnormal running data corresponding to waiting time greater than a waiting time threshold from each of the abnormal running data, perform data compression processing on each of the target abnormal running data to obtain target compressed data, and replace the target abnormal running data with the target compressed data;
[0056] The optimization module is used to identify the indexing efficiency of each indexing program in the target host, filter out target indexing programs whose indexing efficiency is below the efficiency threshold, and obtain optimized target indexing programs by adjusting and reorganizing each target indexing program. The target host containing all the replaced target abnormal running data and all the optimized target indexing programs is taken as the optimized target host.
[0057] Thirdly, this application provides a computer device. The computer device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to implement the steps of the method described in any one of the first aspects.
[0058] Fourthly, this application provides a computer-readable storage medium having a computer program stored thereon that, when executed by a processor, implements the steps of the method described in any one of the first aspects.
[0059] Fifthly, this application provides a computer program product. The computer program product includes a computer program that, when executed by a processor, implements the steps of the method described in any one of the first aspects.
[0060] The aforementioned method, apparatus, computer equipment, and storage medium for determining abnormal operation information acquire the overall operating time of the target host and multiple operating data, and identify the operating path corresponding to each operating data; among the operating paths, key operating paths are selected, and based on the overall operating time of the target host, the core batch operating time of the key operating path is identified; if the core batch operating time is greater than a preset operating time threshold, the cause of the time anomaly in the key operating path is identified; if the core batch operating time is not greater than the preset operating time threshold, abnormal operating data in the key operating path is identified; based on the cause of the time anomaly in the key operating path or the abnormal operating data of the key operating path, abnormal operating information of the key operating path is identified, and the abnormal operating information of the key operating path is used as the target abnormal operating information of the target host. By filtering critical running paths within the target host and analyzing the abnormal running information of these paths based on the causes of time anomalies corresponding to the core batch running times of these critical running paths, or the abnormal running data within these critical running paths, the system avoids the timeliness issues caused by manual operation. This not only improves the comprehensiveness of the analysis of abnormal running information by analyzing both abnormal time and abnormal running data, but also enhances the precision of the analysis by determining abnormal running information at the granular level of each abnormal running data point within the critical running paths. 
Furthermore, by intelligently filtering the running data of critical running paths and analyzing their abnormal running information, the system improves the efficiency of analyzing abnormal running information of the host's batch critical running paths.
Attached Figure Description
[0061] Figure 1 is a flowchart illustrating a method for determining abnormal operation information in one embodiment;
[0062] Figure 2 is a schematic diagram of batch time in one embodiment;
[0063] Figure 3 is a flowchart illustrating an example of determining abnormal operation information in one embodiment;
[0064] Figure 4 is a structural block diagram of an abnormal operation information determination device in one embodiment;
[0065] Figure 5 is an internal structural diagram of a computer device in one embodiment.
Detailed Implementation
[0066] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0067] The abnormal operation information determination method provided in this application embodiment can be applied to the application environment corresponding to the critical path analysis process of a mainframe. This method can be applied to terminals, servers, and systems including both terminals and servers, and is implemented through interaction between the terminals and servers. The terminals can be, but are not limited to, various personal computers, laptops, smartphones, tablets, etc. The terminal filters the critical operation path out of the running paths of the target host and analyzes the abnormal operation information of the critical operation path based on the time anomaly cause corresponding to the core batch running time of the critical operation path, or on the abnormal operation data in the critical operation path. While avoiding the poor timeliness of manual operation, this method improves the comprehensiveness of the analysis by examining both abnormal time and abnormal operation data, and improves its precision by determining abnormal operation information at the granularity of each abnormal operation data point in the critical operation path. Furthermore, by intelligently filtering the running data of the critical operation path and analyzing its abnormal operation information, it improves the efficiency of analyzing the abnormal operation information of the host's batch critical operation path.
[0068] In one embodiment, as shown in Figure 1, a method for determining abnormal operation information is provided. Taking the application of this method to a terminal as an example, the method includes the following steps:
[0069] Step S101: Obtain the overall running time of the target host and multiple running data, and identify the running path corresponding to each running data.
[0070] In this embodiment, after obtaining authorization from the target host, the terminal acquires multiple runtime data points of the target host during historical time periods. The terminal then calculates the average of the runtime of each runtime data point in each core batch data processing session within that historical time period, using this average as the overall runtime of the target host. The runtime data includes runtime data of multiple time types, including special day types and non-special day types. The terminal then identifies the runtime path corresponding to each runtime data point based on the runtime sequence corresponding to each runtime data point.
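The averaging in step S101 can be illustrated with a minimal sketch. It assumes each historical core batch session has already been reduced to one duration in minutes and tagged with its date type. The function name `overall_running_time_by_type`, the tuple layout, and the labels `"non_special"` and `"special"` are hypothetical and do not come from the application.

```python
from collections import defaultdict
from statistics import mean


def overall_running_time_by_type(sessions):
    """Average historical session durations (minutes) per date type.

    `sessions` is an iterable of (date_type, minutes) tuples; this layout
    is an assumption made for illustration only.
    """
    grouped = defaultdict(list)
    for date_type, minutes in sessions:
        grouped[date_type].append(minutes)
    # One overall running time per date type.
    return {date_type: mean(values) for date_type, values in grouped.items()}


history = [
    ("non_special", 140.0),
    ("non_special", 150.0),
    ("special", 170.0),
]
print(overall_running_time_by_type(history))
```

Keeping the two date types separate lets a later step compare a special-day batch against the previous special-day figure rather than against the ordinary-day average.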
[0071] Step S102: Among the various running paths, select the key running paths and identify the core batch running time of the key running paths based on the overall running time of the target host.
[0072] In this embodiment, the terminal filters the key running path out of the various running paths. Each running path consists of multiple jobs (running data) arranged in a certain order in the job scheduling management program TWS. This order is the job sequence from the starting job CBBSTAT (running data) to the core batch's closing job CBBSEND (running data). The key running path is the path formed by the job sequence whose running data have the longest total running time among the multiple job sequences. Then, based on the overall running time of the target host, the terminal identifies the core batch running time of the key running path. The core batch running time of the key running path is the sum of the data processing times of all running data in the key running path. For example, Figure 2 is a batch time diagram, which represents the total running time of the core batch running data at different time points.
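The selection rule in step S102 (take the job sequence with the longest total running time) can be sketched as follows. Only the job names CBBSTAT and CBBSEND come from the description. The intermediate jobs `JOB_A`, `JOB_B`, `JOB_C`, the path labels, and the durations are hypothetical, and the sketch assumes per-job running times are already available as a dictionary.

```python
def critical_path(job_sequences, job_minutes):
    """Return (path_name, total_minutes) for the longest job sequence.

    job_sequences: dict mapping a path name to its ordered list of job names.
    job_minutes:   dict mapping a job name to its running time in minutes.
    """
    totals = {
        name: sum(job_minutes[job] for job in jobs)
        for name, jobs in job_sequences.items()
    }
    # The key running path is the sequence with the largest total time;
    # that total is the core batch running time of the path.
    key_path = max(totals, key=totals.get)
    return key_path, totals[key_path]


job_minutes = {"CBBSTAT": 1.0, "JOB_A": 40.0, "JOB_B": 55.0,
               "JOB_C": 30.0, "CBBSEND": 2.0}
paths = {
    "path_1": ["CBBSTAT", "JOB_A", "JOB_C", "CBBSEND"],
    "path_2": ["CBBSTAT", "JOB_B", "JOB_C", "CBBSEND"],
}
print(critical_path(paths, job_minutes))  # ('path_2', 88.0)
```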
[0073] Step S103: If the core batch running time is greater than the preset running time threshold, identify the cause of time anomalies in the key running path; if the core batch running time is not greater than the preset running time threshold, identify abnormal running data in the key running path.
[0074] In this embodiment, the terminal presets a running time threshold and determines whether the core batch running time exceeds it. If the core batch running time exceeds the preset running time threshold, the cause of the time anomaly in the critical running path is identified; if the core batch running time does not exceed the preset running time threshold, abnormal running data in the critical running path is identified. The specific identification processes are explained in detail later. If the date type of each running data corresponding to the critical running path is the non-special day type, the running time threshold is set to 150 minutes; if the date type is the special day type, the terminal sets the running time threshold such that the running time fluctuates by no more than 5 minutes relative to the overall running time corresponding to the previous special day type.
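A minimal sketch of the threshold choice and the two-way branch in step S103 follows. The 150-minute and 5-minute figures are taken from the description. Reading the special-day rule as an upper bound of "previous special-day overall running time plus 5 minutes" is an assumption, as are the function names and the string labels used for date types and branches.

```python
NON_SPECIAL_THRESHOLD_MIN = 150.0   # stated in the description
SPECIAL_DAY_TOLERANCE_MIN = 5.0     # stated in the description


def running_time_threshold(date_type, previous_special_overall=None):
    """Pick the running time threshold (minutes) for a date type."""
    if date_type == "non_special":
        return NON_SPECIAL_THRESHOLD_MIN
    if previous_special_overall is None:
        raise ValueError("special days need the previous special-day running time")
    # Assumption: the allowed fluctuation is treated as an upper bound.
    return previous_special_overall + SPECIAL_DAY_TOLERANCE_MIN


def choose_branch(core_batch_minutes, threshold):
    """Return which analysis step S103 runs for this batch."""
    if core_batch_minutes > threshold:
        return "time_anomaly_cause"
    return "abnormal_running_data"


print(choose_branch(156.0, running_time_threshold("non_special")))
print(choose_branch(182.0, running_time_threshold("special", 180.0)))
```

A batch that exactly meets the threshold falls into the second branch, matching the "not greater than" wording of the method.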
[0075] Step S104: Based on the time anomaly cause of the critical operation path or the abnormal operation data of the critical operation path, identify the abnormal operation information of the critical operation path, and use the abnormal operation information of the critical operation path as the target abnormal operation information of the target host.
[0076] In this embodiment, the terminal identifies the abnormal operation information of the critical operation path based on either the cause of the time anomaly or the abnormal operation data of the critical operation path, and uses the abnormal operation information of the critical operation path as the target abnormal operation information of the target host. Specifically, which kind of abnormal operation information is produced depends on the result of comparing the core batch operation time with the preset operation time threshold. The identification process is explained in detail later.
[0077] Based on the above scheme, by screening the critical running path out of the running paths of the target host, and analyzing the abnormal running information of the critical running path based on the time anomaly cause corresponding to its core batch running time or on the abnormal running data in it, the system avoids the poor timeliness caused by manual operation. It improves the comprehensiveness of the analysis by examining both abnormal time and abnormal running data, and improves the precision of the analysis by determining the abnormal running information at the granularity of each abnormal running data in the critical running path. Furthermore, by intelligently filtering the running data of the critical running path and analyzing its abnormal running information, the system improves the efficiency of analyzing the abnormal running information of the host's batch critical running path.
[0078] Optionally, identifying the running path corresponding to each running data includes: identifying the running sequence to which each running data belongs, and clustering each running data according to each running sequence to obtain multiple data groups; identifying the batch path corresponding to each data group, and using the batch path corresponding to each data group as the running path corresponding to each running data in each data group.
[0079] In this embodiment, the terminal identifies the running sequence to which each piece of running data belongs, and clusters the running data according to each running sequence to obtain multiple data groups. Then, the terminal identifies the batch path corresponding to each data group, and uses the batch path corresponding to each data group as the running path corresponding to each piece of running data in each data group.
[0080] Based on the above scheme, by identifying the running sequence to which each running data belongs, the running path corresponding to each running data is determined, thereby improving the accuracy of running path determination.
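The grouping described above can be sketched as a plain group-by on the running sequence, which is one simple way to realize the "clustering" step when each record already carries its sequence label. The record fields `job` and `sequence`, the sequence labels, and the mapping from sequence to batch path are hypothetical.

```python
from collections import defaultdict


def group_by_sequence(records):
    """Group running data records by the running sequence they belong to."""
    groups = defaultdict(list)
    for record in records:
        groups[record["sequence"]].append(record)
    return dict(groups)


def assign_running_paths(records, sequence_to_batch_path):
    """Map each job to the batch path of the data group it falls into."""
    assignment = {}
    for sequence, group in group_by_sequence(records).items():
        batch_path = sequence_to_batch_path[sequence]
        for record in group:
            assignment[record["job"]] = batch_path
    return assignment


records = [
    {"job": "JOB_A", "sequence": "seq_1"},
    {"job": "JOB_B", "sequence": "seq_2"},
    {"job": "JOB_C", "sequence": "seq_1"},
]
print(assign_running_paths(records, {"seq_1": "path_1", "seq_2": "path_2"}))
```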
[0081] Optionally, identifying the causes of time anomalies in critical operating paths includes: sorting the operating data in the data group corresponding to the critical operating path according to the operating sequence corresponding to each operating data to obtain the operating data distribution information corresponding to the critical operating path; identifying the fluctuation range of the operating distribution information and the fluctuation frequency of the operating distribution information; and based on a preset time anomaly distribution strategy, identifying the anomaly causes corresponding to the fluctuation range and fluctuation frequency to obtain the causes of time anomalies in critical operating paths.
[0082] In this embodiment, the terminal sorts the running data in the data group corresponding to the key running path according to the running sequence corresponding to each running data, to obtain the running data distribution information corresponding to the key running path (as shown in Figure 2). In another embodiment, the running data distribution information is arranged in the chronological order of the running time points corresponding to the running data. Then, the terminal identifies the fluctuation range and the fluctuation frequency of the running data distribution information, and based on a preset time anomaly distribution strategy, identifies the anomaly causes corresponding to the fluctuation range and fluctuation frequency, thereby obtaining the time anomaly causes of the key running path. The time anomaly causes include, but are not limited to, causes corresponding to factors such as CPU busyness, disk response fluctuations, and business data volume fluctuations.
[0083] Based on the above scheme, the terminal identifies the fluctuation range and frequency of the key operating path based on the distribution information of the operating data corresponding to the key operating path, thereby determining the cause of the time anomaly and improving the accuracy of determining the cause of the time anomaly.
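The description does not define how the fluctuation range and fluctuation frequency are computed, nor how the preset time anomaly distribution strategy maps them to causes. The sketch below is therefore only one possible reading: range as maximum minus minimum, frequency as the number of consecutive steps of at least `min_step`, and a hypothetical rule table. The three cause labels come from the description. The limits and the pairing of each rule with a cause are assumptions.

```python
def fluctuation_range(values):
    """Spread between the largest and smallest running time."""
    return max(values) - min(values)


def fluctuation_frequency(values, min_step):
    """Count consecutive changes whose size is at least `min_step`."""
    return sum(1 for a, b in zip(values, values[1:]) if abs(b - a) >= min_step)


def time_anomaly_cause(values, range_limit=10.0, freq_limit=3, min_step=5.0):
    """Apply a hypothetical rule table to the two fluctuation measures."""
    rng = fluctuation_range(values)
    freq = fluctuation_frequency(values, min_step)
    if rng >= range_limit and freq >= freq_limit:
        return "disk response fluctuation"
    if rng >= range_limit:
        return "business data volume fluctuation"
    if freq >= freq_limit:
        return "CPU busy"
    return "no cause matched"


# One large, isolated jump in running time (minutes).
print(time_anomaly_cause([140.0, 141.0, 139.0, 158.0, 159.0]))
# Repeated large swings.
print(time_anomaly_cause([140.0, 150.0, 141.0, 152.0, 140.0]))
```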
[0084] Optionally, identifying abnormal running data in the critical running path includes: identifying the waiting time of each waiting running data in the critical running path, and classifying each waiting running data according to the date type to which each waiting running data belongs; for each date type, based on the waiting time threshold corresponding to the date type, filtering out waiting running data with waiting times greater than the waiting time threshold as target waiting running data, and classifying each target waiting running data and the date type corresponding to each target waiting running data as abnormal running data of the critical running path.
[0085] In this embodiment, the terminal identifies the waiting time of each waiting running data in the critical running path and classifies each waiting running data according to its date type. Then, for each date type, the terminal filters out the waiting running data whose waiting time exceeds the waiting time threshold corresponding to that date type, and designates it as target waiting running data. Different date types have different waiting time thresholds; for example, the threshold for non-special day types is 60 seconds, while the threshold for special day types is 30 seconds. Finally, the terminal takes each target waiting running data, together with its corresponding date type, as abnormal running data of the critical running path.
[0086] Based on the above scheme, only the waiting running data that exceeds the threshold for its date type is retained as abnormal running data, which reduces the amount of data to be examined in later steps and improves the efficiency of identifying abnormal running data.
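The per-date-type filtering described in paragraphs [0084] to [0086] can be sketched as follows. The 60-second and 30-second thresholds come from the example in the text; the `WaitRecord` type, its field names, and the date-type keys are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass

# Thresholds in seconds, taken from the example in paragraph [0085].
WAIT_THRESHOLDS_S = {"non_special": 60, "special": 30}


@dataclass(frozen=True)
class WaitRecord:
    job: str          # hypothetical job identifier
    date_type: str    # "non_special" or "special"
    wait_s: float     # observed waiting time in seconds


def abnormal_waits(records, thresholds=WAIT_THRESHOLDS_S):
    """Return (record, date_type) pairs whose wait exceeds the threshold
    configured for the record's own date type."""
    return [
        (rec, rec.date_type)
        for rec in records
        if rec.wait_s > thresholds[rec.date_type]
    ]
```

The comparison is strict, so a waiting time exactly equal to the threshold is not flagged, matching the "greater than" wording of the text.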
[0087] Optionally, based on the time anomaly cause of the critical operation path or the abnormal operation data of the critical operation path, identify the abnormal operation information of the critical operation path, including: when the core batch operation time is greater than a preset operation time threshold, generate the abnormal operation information of the critical operation path based on the time anomaly cause of the critical operation path and the path report template; when the core batch operation time is not greater than the preset operation time threshold, identify the data change information of each abnormal operation data for each abnormal operation data, and identify the abnormal operation cause of each abnormal operation data based on the data change information; generate the abnormal operation information of the critical operation path based on each abnormal operation cause and the path report template.
[0088] In this embodiment, when the core batch processing time exceeds a preset processing time threshold, the terminal generates abnormal processing information for the critical processing path based on the time anomaly cause and the path report template. Conversely, when the core batch processing time does not exceed the preset processing time threshold, the terminal identifies the data change information of each abnormal processing data and, based on this data change information, identifies the abnormal processing cause of each abnormal processing data. The specific process of identifying these causes is explained in detail later. Finally, the terminal generates abnormal processing information for the critical processing path based on each abnormal processing cause and the path report template. The data change information includes, but is not limited to, data volume change information and I/O change information.
[0089] Based on the above scheme, by identifying the data change information of abnormal operating data, the cause of abnormal operation of each abnormal operating data can be determined, thereby improving the accuracy of identifying the cause of abnormal operation of abnormal operating data.
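The two branches of paragraphs [0087] and [0088] can be illustrated with a minimal sketch. The report wording, the function name `abnormal_info`, and its parameters are assumptions; the application does not disclose the contents of the path report template, and the threshold value is left to the caller.

```python
from string import Template

# Hypothetical path report template; the real template is not given in the text.
REPORT_TEMPLATE = Template("Critical path $path: $causes")


def abnormal_info(path, core_batch_min, threshold_min, time_causes, data_causes):
    """Fill the report template with time anomaly causes when the core batch
    time exceeds the threshold, otherwise with per-data abnormal causes."""
    if core_batch_min > threshold_min:
        causes = time_causes
    else:
        causes = data_causes
    return REPORT_TEMPLATE.substitute(path=path, causes="; ".join(causes))
```

For example, `abnormal_info("PATH_1", 160, 150, ["CPU busy"], ["data volume change"])` selects the time anomaly branch, while a core batch time of 150 against the same threshold selects the data branch because the comparison is strict.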
[0090] Optionally, identify the data change information for each abnormal operating data, and based on the data change information, identify the cause of abnormal operation for each abnormal operating data, including: obtaining the historical operating data corresponding to the operating data of each abnormal operating data, and based on each operating data and each historical operating data, identifying the data volume change information and the level change information of each operating data; based on the data volume change information, the level change information, the preset data volume change threshold, and the preset level change threshold, identify the abnormal change information of each operating data, and use the abnormal change information of each operating data as the cause of abnormal operation for each operating data.
[0091] In this embodiment, the terminal acquires the historical operating data corresponding to each abnormal operating data, and identifies the data volume change information and the level change information of each operating data based on each operating data and each historical operating data. Then, the terminal identifies the abnormal change information of each operating data based on the data volume change information, the level change information (i.e., I/O change information), a preset data volume change threshold, and a preset level change threshold. Specifically, when only the data volume change information is greater than the preset data volume change threshold, the abnormal change information of that operating data is the data volume change information; when only the level change information is greater than the preset level change threshold, the abnormal change information of that operating data is the level change information; when the data volume change information and the level change information are each greater than their respective preset thresholds, the abnormal change information of that operating data includes both the data volume change information and the level change information. Finally, the terminal uses the abnormal change information of each operating data as the abnormal operation cause of the corresponding abnormal operating data.
[0092] Based on the above scheme, by analyzing the data volume change information and level change information in each data change information, the abnormal operation cause of each abnormal operation data is determined, thereby improving the accuracy of identifying the abnormal operation cause.
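The threshold comparison in paragraphs [0090] to [0092] can be sketched as below. Expressing each change as a relative change against the historical value, the 20% default limits, and the dictionary keys `volume` and `io` are assumptions for illustration; the application only states that each change is compared with its own preset threshold.

```python
def relative_change(current, historical):
    """Relative change of the current value against its historical baseline."""
    if historical <= 0:
        raise ValueError("historical value must be positive")
    return (current - historical) / historical


def abnormal_change_info(current, historical, volume_limit=0.2, level_limit=0.2):
    """Return the changes that exceed their own threshold.

    The result may contain the data volume change, the level (I/O) change,
    both, or neither, mirroring the three cases described in the text.
    """
    volume_change = relative_change(current["volume"], historical["volume"])
    level_change = relative_change(current["io"], historical["io"])
    causes = {}
    if volume_change > volume_limit:
        causes["data_volume_change"] = volume_change
    if level_change > level_limit:
        causes["level_change"] = level_change
    return causes
```

Keeping the two checks independent makes the "both exceed" case fall out naturally, without a separate branch.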
[0093] Optionally, after using the abnormal operation information of the critical operation path as the target abnormal operation information of the target host, the method further includes: when the core batch operation time exceeds a preset operation time threshold, identifying the abnormal operation program corresponding to the time anomaly cause based on the time anomaly cause corresponding to the target abnormal operation information, and adjusting the abnormal operation program through a preset program optimization strategy to obtain a new operation program; using the target host containing all the new operation programs as the optimized target host; when the core batch operation time does not exceed the preset operation time threshold, filtering, from each abnormal operation data, the target abnormal operation data whose waiting time exceeds the waiting time threshold, performing data compression processing on each target abnormal operation data to obtain target compressed data, and replacing the target abnormal operation data with the target compressed data; calculating the indexing efficiency of each indexing program in the target host, filtering target indexing programs below the efficiency threshold, adjusting and reorganizing each target indexing program to obtain an optimized target indexing program, and using the target host containing the operation data in which all target abnormal operation data have been replaced and all the optimized target indexing programs as the optimized target host.
[0094] In this embodiment, when the core batch running time exceeds the preset running time threshold, the terminal identifies the abnormal running program corresponding to the time anomaly cause based on the time anomaly cause corresponding to the target abnormal running information, and adjusts the abnormal running program through the preset program optimization strategy to obtain a new running program; the target host containing all the new running programs is taken as the optimized target host.
[0095] Specifically: 1. For batch processes that frequently access many large files with high I/O volumes, implementing Data-in-Memory technology can effectively improve batch I/O performance, thereby optimizing batch execution efficiency. Data-in-Memory reduces the number and duration of I/O operations by reading or writing data to a cache in the processor's memory (instead of tape or disk), thus reducing data I/O time during job execution and allowing individual jobs to run faster. 2. DSNTIAUL (UNLOAD tool) optimization: This applies to batch processes that use the DSNTIAUL program for data processing. The DSNTIAUL program is a traditional DB2 UNLOAD tool that uses dynamic SQL to access data tables, resulting in a long execution time. Using the higher-performance DB2 built-in UNLOAD tool can significantly reduce the execution time of related jobs.
[0096] If the core batch execution time does not exceed a preset execution time threshold, target abnormal execution data with waiting times exceeding a threshold are selected from each abnormal execution data set. These target abnormal execution data sets are then compressed to obtain compressed target data, which replaces the original target abnormal execution data. Next, the indexing efficiency of each indexing program on the target host is assessed, and target indexing programs with efficiency below a threshold are selected. These target indexing programs are then adjusted and reorganized to obtain optimized target indexing programs. The target host containing all the replaced target abnormal execution data and all the optimized target indexing programs is then designated as the optimized target host.
[0097] Specifically: 3. Optimization of batch programs with long execution times: For the long-running programs shown in the DB2PM report, DB2 EXPLAIN is used to study their access paths and to identify full table scans, full index scans, and inefficient index usage. For these jobs, especially those on the critical path, DB2 access performance is improved by adding indexes and performing reorganization operations, thereby improving DB2 efficiency. 4. DB2 data compression: The system contains some historical data tables and master data tables with large file sizes. Implementing DB2 data compression on these DB2 tables reduces the I/O of related data processing and increases the buffer pool hit rate, thus optimizing the execution efficiency of batch jobs.
[0098] Based on the above scheme, different abnormal operation information of the target is used to set different abnormal solutions, thereby optimizing the target host, avoiding abnormal operation during the batch data processing of the target host, and improving the stability of the target host during the batch data processing.
[0099] This application also provides an example of determining abnormal operation information. As shown in Figure 3, the specific processing procedure includes the following steps:
[0100] Step S301: Obtain the overall running time of the target host and multiple running data.
[0101] Step S302: Identify the running sequence to which each running data belongs, and perform clustering processing on each running data according to each running sequence to obtain multiple data groups.
[0102] Step S303: Identify the batch path corresponding to each data group, and use the batch path corresponding to each data group as the running path corresponding to each running data in each data group.
[0103] Step S304: Among the various running paths, select the key running paths and identify the core batch running time of the key running paths based on the overall running time of the target host.
[0104] Step S305: If the core batch running time is greater than the preset running time threshold, sort the running data in the data group corresponding to the key running path according to the running sequence corresponding to each running data to obtain the running data distribution information corresponding to the key running path.
[0105] Step S306: Identify the fluctuation range and the fluctuation frequency of the running data distribution information, and based on the preset time anomaly distribution strategy, identify the anomaly causes corresponding to the fluctuation range and fluctuation frequency to obtain the time anomaly causes of the key operation path.
[0106] Step S307: If the core batch running time is not greater than the preset running time threshold, identify the waiting time of each waiting data in the key running path, and classify each waiting data according to the date type to which each waiting data belongs.
[0107] Step S308: For each date type, based on the waiting time threshold corresponding to the date type, filter the waiting running data corresponding to the waiting time exceeding the waiting time threshold as target waiting running data, and use each target waiting running data and the date type corresponding to each target waiting running data as abnormal running data of the critical running path.
[0108] Step S309: If the core batch running time exceeds the preset running time threshold, generate abnormal running information for the critical running path based on the time anomaly cause of the critical running path and the path report template.
[0109] Step S310: If the core batch running time is not greater than the preset running time threshold, for each abnormal running data, obtain the historical running data corresponding to the running data of each abnormal running data, and based on each running data and each historical running data, identify the data volume change information and the level change information of each running data.
[0110] Step S311: Based on the data volume change information, the level change information, the preset data volume change threshold, and the preset level change threshold, identify the abnormal change information of each running data, and use the abnormal change information of each running data as the abnormal running cause of the abnormal running data corresponding to each running data.
[0111] Step S312: Based on the causes of each abnormal operation and the path report template, generate abnormal operation information for the key operation paths.
[0112] Step S313: When the core batch running time is greater than the preset running time threshold, based on the time abnormality cause corresponding to the target abnormal running information, identify the abnormal running program corresponding to the time abnormality cause, and adjust the abnormal running program through the preset program optimization strategy to obtain a new running program.
[0113] Step S314: The target host containing all the new running programs is designated as the optimized target host.
[0114] Step S315: If the core batch running time is not greater than the preset running time threshold, select the target abnormal running data corresponding to the waiting time that is greater than the waiting time threshold from each abnormal running data, perform data compression processing on each target abnormal running data to obtain target compressed data, and replace the target abnormal running data with the target compressed data.
[0115] Step S316: Identify the indexing efficiency of each indexing program in the target host, and filter the target indexing programs that are below the efficiency threshold. By adjusting and reorganizing each target indexing program, an optimized target indexing program is obtained. The target host containing the running data in which all target abnormal running data have been replaced and all optimized target indexing programs is taken as the optimized target host.
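Steps S302 and S303 above amount to grouping the running data by running sequence and then labelling every record in a group with that group's batch path. A minimal sketch follows, assuming each record is a dictionary with hypothetical `job` and `sequence` keys and that the sequence-to-path mapping is supplied by the caller; the application does not describe how the batch path of a data group is looked up.

```python
from collections import defaultdict


def group_by_sequence(records):
    """Step S302 (sketch): collect running data that share a running sequence."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["sequence"]].append(rec)
    return dict(groups)


def assign_paths(groups, path_of_sequence):
    """Step S303 (sketch): give every record the batch path of its data group.

    `path_of_sequence` is a hypothetical lookup from running sequence to
    batch path.
    """
    return {
        rec["job"]: path_of_sequence[seq]
        for seq, recs in groups.items()
        for rec in recs
    }
```

Simple key-based grouping stands in here for the "clustering processing" named in the step; a different clustering technique could be substituted without changing the later steps.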
[0116] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.
[0117] Based on the same inventive concept, this application also provides an abnormal operation information determining device for implementing the abnormal operation information determining method described above. The solution provided by this device is similar to the solution described in the above method; therefore, the specific limitations in one or more embodiments of the abnormal operation information determining device provided below can be found in the limitations of the abnormal operation information determining method described above, and will not be repeated here.
[0118] In one embodiment, as shown in Figure 4, an abnormal operation information determination device is provided, including: an acquisition module 410, a first identification module 420, a second identification module 430, and a third identification module 440, wherein:
[0119] The acquisition module 410 is used to acquire the overall running time of the target host and multiple running data, and identify the running path corresponding to each running data.
[0120] The first identification module 420 is used to filter key running paths among the various running paths, and identify the core batch running time of the key running paths based on the overall running time of the target host.
[0121] The second identification module 430 is used to identify the cause of time anomalies in the key running path when the core batch running time is greater than a preset running time threshold, and to identify abnormal running data in the key running path when the core batch running time is not greater than the preset running time threshold.
[0122] The third identification module 440 is used to identify the abnormal operation information of the critical operation path based on the time abnormality cause of the critical operation path or the abnormal operation data of the critical operation path, and to use the abnormal operation information of the critical operation path as the target abnormal operation information of the target host.
[0123] Optionally, the acquisition module 410 is specifically used for:
[0124] Identify the running sequence to which each running data belongs, and cluster the running data according to each running sequence to obtain multiple data groups;
[0125] Identify the batch path corresponding to each data group, and use the batch path corresponding to each data group as the running path corresponding to each running data in each data group.
[0126] Optionally, the second identification module 430 is specifically used for:
[0127] For each running data in the data group corresponding to the key running path, the running data is distributed and sorted according to the running sequence corresponding to each running data to obtain the running data distribution information corresponding to the key running path.
[0128] Identify the fluctuation range and the fluctuation frequency of the running data distribution information, and based on a preset time anomaly distribution strategy, identify the anomaly causes corresponding to the fluctuation range and the fluctuation frequency to obtain the time anomaly causes of the key operation path.
[0129] Optionally, the second identification module 430 is specifically used for:
[0130] Identify the waiting time of each piece of data waiting to run in the key running path, and classify each piece of data waiting to run according to the date type to which each piece of data belongs;
[0131] For each date type, based on the waiting time threshold corresponding to the date type, waiting running data with waiting times greater than the waiting time threshold are filtered out as target waiting running data, and each target waiting running data and the date type corresponding to each target waiting running data are regarded as abnormal running data of the critical running path.
[0132] Optionally, the third identification module 440 is specifically used for:
[0133] If the core batch execution time exceeds a preset execution time threshold, based on the cause of the time anomaly in the critical execution path and the path report template, abnormal execution information for the critical execution path is generated, or
[0134] If the core batch running time is not greater than a preset running time threshold, for each abnormal running data, identify the data change information of each abnormal running data, and based on the data change information, identify the abnormal running cause of each abnormal running data; based on each abnormal running cause and the path report template, generate the abnormal running information of the key running path.
[0135] Optionally, the third identification module 440 is specifically used for:
[0136] Obtain the historical running data corresponding to the running data of each abnormal running data, and based on each running data and each historical running data, identify the data volume change information and the level change information of each running data;
[0137] Based on the data volume change information, the level change information, the preset data volume change threshold, and the preset level change threshold, the abnormal change information of each running data is identified, and the abnormal change information of each running data is used as the abnormal running cause of the abnormal running data corresponding to each running data.
[0138] Optionally, the device further includes:
[0139] The adjustment module is used to, when the core batch running time is greater than a preset running time threshold, identify the abnormal running program corresponding to the time anomaly cause based on the time anomaly cause corresponding to the target abnormal running information, and adjust the abnormal running program through a preset program optimization strategy to obtain a new running program;
[0140] The determination module is used to identify the target host containing all new running programs as the optimized target host.
[0141] Optionally, the device further includes:
[0142] The compression module is used to, when the core batch running time is not greater than a preset running time threshold, filter out target abnormal running data corresponding to waiting time greater than a waiting time threshold from each of the abnormal running data, perform data compression processing on each of the target abnormal running data to obtain target compressed data, and replace the target abnormal running data with the target compressed data;
[0143] The optimization module is used to identify the indexing efficiency of each indexing program in the target host, filter target indexing programs that are below the efficiency threshold, and obtain optimized target indexing programs by adjusting and reorganizing each target indexing program. The target host containing the running data in which all target abnormal running data have been replaced and all the optimized target indexing programs is taken as the optimized target host.
[0144] The modules in the aforementioned abnormal operation information determination device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in the processor of a computer device in hardware form or independent of it, or stored in the memory of a computer device in software form, so that the processor can call and execute the operations corresponding to each module.
[0145] In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in Figure 5. The computer device includes a processor, memory, communication interface, display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, mobile cellular networks, NFC (Near Field Communication), or other technologies. When executed by the processor, the computer program implements a method for determining abnormal operating information. The display screen can be an LCD screen or an e-ink screen. The input devices can be a touch layer covering the display screen, buttons, a trackball, or a touchpad mounted on the computer device casing, or an external keyboard, touchpad, or mouse.
[0146] Those skilled in the art will understand that the structure shown in Figure 5 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.
[0147] In one embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the method described in any one of the first aspects.
[0148] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the steps of the method described in any one of the first aspects.
[0149] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps of the method described in any one of the first aspects.
[0150] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties.
[0151] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.
[0152] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0153] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A method for determining abnormal operation information, characterized in that, The method includes: Obtain the target host's overall runtime and multiple runtime data points, and identify the runtime path corresponding to each runtime data point; Among the various running paths, key running paths are selected, and the core batch running time of the key running paths is identified based on the overall running time of the target host. If the core batch running time is greater than a preset running time threshold, identify the cause of the time anomaly in the key running path; if the core batch running time is not greater than the preset running time threshold, identify the abnormal running data in the key running path. Based on the time anomaly cause of the critical operation path or the abnormal operation data of the critical operation path, the abnormal operation information of the critical operation path is identified, and the abnormal operation information of the critical operation path is used as the target abnormal operation information of the target host.
2. The method according to claim 1, characterized in that, The process of identifying the running path corresponding to each piece of running data includes: Identify the running sequence to which each running data belongs, and cluster the running data according to each running sequence to obtain multiple data groups; Identify the batch path corresponding to each data group, and use the batch path corresponding to each data group as the running path corresponding to each running data in each data group.
3. The method according to claim 2, characterized in that, The identification of the cause of the time anomaly in the critical running path includes: For each running data in the data group corresponding to the key running path, the running data is distributed and sorted according to the running sequence corresponding to each running data to obtain the running data distribution information corresponding to the key running path. Identify the fluctuation range and the fluctuation frequency of the running data distribution information, and based on a preset time anomaly distribution strategy, identify the anomaly causes corresponding to the fluctuation range and the fluctuation frequency to obtain the time anomaly causes of the key operation path.
4. The method according to claim 1, characterized in that, The identification of abnormal operation data in the critical operation path includes: Identify the waiting time of each piece of data waiting to run in the key running path, and classify each piece of data waiting to run according to the date type to which each piece of data belongs; For each date type, based on the waiting time threshold corresponding to the date type, waiting running data with waiting times greater than the waiting time threshold are filtered out as target waiting running data, and each target waiting running data and the date type corresponding to each target waiting running data are regarded as abnormal running data of the critical running path.
5. The method according to claim 1, characterized in that, The process of identifying abnormal operation information of the critical operation path based on the cause of the time anomaly in the critical operation path or the abnormal operation data of the critical operation path includes: If the core batch execution time exceeds a preset execution time threshold, based on the cause of the time anomaly in the critical execution path and the path report template, abnormal execution information for the critical execution path is generated, or If the core batch running time is not greater than a preset running time threshold, for each abnormal running data, identify the data change information of each abnormal running data, and based on the data change information, identify the abnormal running cause of each abnormal running data; based on each abnormal running cause and the path report template, generate the abnormal running information of the key running path.
6. The method according to claim 5, characterized in that identifying the data change information of each piece of abnormal running data and, based on the data change information, identifying the abnormal running cause of each piece of abnormal running data includes: obtaining the historical running data corresponding to each running data of each piece of abnormal running data, and, based on each running data and each historical running data, identifying the data volume change information and the level change information of each running data; and, based on the data volume change information, the level change information, a preset data volume change threshold, and a preset level change threshold, identifying the abnormal change information of each running data, and taking the abnormal change information of each running data as the abnormal running cause of the abnormal running data corresponding to that running data.
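A minimal sketch of the comparison in claim 6 is given below, under stated assumptions: the claim does not say what "level" measures or how change is quantified, so this example treats level as an integer metric, measures data volume change as relative growth against the historical value, and uses hypothetical names and thresholds throughout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSnapshot:
    data_volume: int  # assumed unit: number of records processed
    level: int        # assumed integer metric; its meaning is not given in the claim


def abnormal_changes(current, history, volume_threshold, level_threshold):
    """Compare a current run with its historical counterpart and return a
    description of every change that exceeds its preset threshold."""
    if history.data_volume <= 0:
        raise ValueError("historical data volume must be positive")

    # Relative growth of the data volume against the historical run.
    volume_change = (current.data_volume - history.data_volume) / history.data_volume
    # Absolute change in level.
    level_change = current.level - history.level

    causes = []
    if volume_change > volume_threshold:
        causes.append(f"data volume grew by {volume_change:.0%}")
    if level_change > level_threshold:
        causes.append(f"level rose by {level_change}")
    return causes


current = RunSnapshot(data_volume=1_500_000, level=4)
history = RunSnapshot(data_volume=1_000_000, level=3)
causes = abnormal_changes(current, history, volume_threshold=0.2, level_threshold=0)
print(causes)
```

Returning a list lets one piece of running data carry several abnormal changes at once, each of which could then feed the path report template mentioned in claim 5.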
7. The method according to claim 5, characterized in that, after taking the abnormal operation information of the critical running path as the target abnormal operation information of the target host, the method further includes: if the core batch running time is greater than the preset running time threshold, identifying, based on the time anomaly cause corresponding to the target abnormal operation information, the abnormal running program corresponding to the time anomaly cause, and adjusting the abnormal running program through a preset program optimization strategy to obtain a new running program; and taking the target host containing all new running programs as the optimized target host.
8. The method according to claim 5, characterized in that, after taking the abnormal operation information of the critical running path as the target abnormal operation information of the target host, the method further includes: if the core batch running time is not greater than the preset running time threshold, selecting, among the abnormal running data, the target abnormal running data whose waiting time is greater than the waiting time threshold, performing data compression on each piece of target abnormal running data to obtain target compressed data, and replacing the target abnormal running data with the target compressed data; identifying the indexing efficiency of each indexing program in the target host, filtering out the target indexing programs whose indexing efficiency is below the threshold, and adjusting and reorganizing the target indexing programs to obtain optimized target indexing programs; and taking the target host containing all replaced target abnormal running data and all optimized target indexing programs as the optimized target host.
9. A device for determining abnormal operation information, characterized in that the device includes: an acquisition module, used to acquire the overall running time of the target host and multiple running data, and to identify the running path corresponding to each running data; a first identification module, used to filter the critical running path out of the running paths, and to identify the core batch running time of the critical running path based on the overall running time of the target host; a second identification module, used to identify the time anomaly cause of the critical running path when the core batch running time is greater than a preset running time threshold, and to identify the abnormal running data of the critical running path when the core batch running time is not greater than the preset running time threshold; and a third identification module, used to identify the abnormal operation information of the critical running path based on the time anomaly cause of the critical running path or the abnormal running data of the critical running path, and to take the abnormal operation information of the critical running path as the target abnormal operation information of the target host.
10. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, the steps of the method according to any one of claims 1 to 8 are implemented.
11. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 8 are implemented.
12. A computer program product comprising a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 8 are implemented.