Method and device for monitoring abnormality of automated operation, electronic device and storage medium

By collecting and parsing log data from automated operations, generating structured data, calculating health scores, and identifying and locating the root causes of anomalies, the problem of low efficiency in automated operation monitoring due to reliance on manual execution is solved, enabling proactive monitoring and efficient fault handling.

CN122241516A · Pending · Publication Date: 2026-06-19 · TRAVELSKY TECHNOLOGY LIMITED

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
TRAVELSKY TECHNOLOGY LIMITED
Filing Date
2026-03-18
Publication Date
2026-06-19

Abstract

This invention discloses a method, apparatus, electronic device, and storage medium for anomaly monitoring of automated operations, relating to the field of automated data processing or other related technical fields. The method includes: collecting log data from automated operations and parsing the log data to generate structured data; calculating a health score for each automated operation based on the structured data; identifying automated operations with health scores below a preset health threshold, thus identifying abnormal operations; extracting execution indicator data and dependency relationship data from the abnormal operations, and identifying the root cause of the abnormality based on the execution indicator data and dependency relationship data. This invention solves the technical problem in related technologies where anomaly monitoring of automated operations relies on manual execution, resulting in low monitoring efficiency.

Description

Technical Field

[0001] This invention relates to the field of automated data processing or other related technical fields. Specifically, it relates to a method and apparatus for monitoring anomalies in automated operations, as well as electronic equipment and storage media.

Background Technology

[0002] With the rapid development of information technology and the continuous upgrading of business needs, the core business systems of the modern civil aviation industry are facing unprecedented challenges. In particular, the departure system, as a core component of airlines' daily operations, needs to integrate various subsystems with different architectures and technology stacks, ranging from mainframe systems to open systems (such as cloud-based microservice architectures), as well as various databases, message queues, and third-party services. These heterogeneous systems perform a large number of automated tasks daily, including but not limited to data synchronization, report generation, and system maintenance, to ensure data consistency, integrity, and business continuity.

[0003] In related technologies, monitoring automated operations mostly relies on post-event alarms and manual responses, which is inefficient, unable to identify performance bottlenecks and potential faults in a timely manner, and unable to take proactive measures to prevent business interruption or data loss, leaving maintenance work often in a passive state.

[0004] There is currently no effective solution to the above problems.

Summary of the Invention

[0005] This invention provides a method, apparatus, electronic device, and storage medium for anomaly monitoring in automated operations, to at least solve the technical problem in related technologies where anomaly monitoring in automated operations relies on manual execution, resulting in low monitoring efficiency.

[0006] According to one aspect of the present invention, an anomaly monitoring method for automated jobs is provided, comprising: collecting log data of automated jobs and parsing the log data to generate structured data; calculating a health score for each automated job based on the structured data; identifying automated jobs whose health scores are less than a preset health threshold to obtain abnormal jobs; extracting execution indicator data and dependency data of the abnormal jobs, and identifying the root cause of the abnormality of the abnormal jobs based on the execution indicator data and the dependency data.

[0007] Furthermore, the step of parsing the log data to generate structured data includes: querying job data at each level through a retrieval script to obtain hierarchical job data; and parsing the hierarchical job data according to the structured data templates for each level to obtain the structured data.

[0008] Further, the step of calculating the health score of each automated task based on the structured data includes: extracting task execution progress data from the structured data and calculating the progress health of the automated task based on the task execution progress data; extracting task execution timeliness data from the structured data and calculating the timeliness health of the automated task based on the task execution timeliness data; extracting task quality data from the structured data and calculating the quality health of the automated task based on the task quality data; and calculating the health score of the automated task based on the progress health, the weight value corresponding to the progress health, the timeliness health, the weight value corresponding to the timeliness health, the quality health, and the weight value corresponding to the quality health.

[0009] Furthermore, before identifying the root cause of the abnormal operation based on the execution metric data and the dependency data, the method further includes: obtaining historical log data for each of the automated operations; analyzing the historical log data; and constructing a baseline profile of the metrics for each of the automated operations based on the analysis results, wherein the baseline profile includes at least one of the following: performance metric baseline, log metric baseline, and time-series metric baseline.

[0010] Further, the step of identifying the root cause of the abnormal operation based on the execution indicator data and the dependency relationship data includes: comparing the execution indicator data with the indicator baseline profile, and identifying the indicator baseline deviation type of the abnormal operation based on the comparison result; identifying automated operations with dependencies based on the dependency relationship data to obtain dependent operations, and obtaining the abnormal situation of the dependent operations; and identifying the root cause of the abnormal operation based on the indicator baseline deviation type of the abnormal operation and the abnormal situation of the dependent operations.

[0011] Furthermore, after identifying the root cause of the abnormal operation based on the execution indicator data and the dependency relationship data, the method further includes: matching the abnormal operation with an anomaly repair strategy based on the root cause to obtain a matching result; if the matching result indicates that the abnormal operation and the anomaly repair strategy are successfully matched, repairing the abnormal situation of the abnormal operation based on the anomaly repair strategy; or, if the matching result indicates that the abnormal operation and the anomaly repair strategy are not matched, generating an anomaly alarm message and sending the anomaly alarm message to the operation and maintenance terminal.
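The match-then-repair-or-alarm branch in [0011] can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the root cause keys (`upstream_failure`, `disk_full`), the strategy table, and the list standing in for the operation and maintenance terminal are all hypothetical.

```python
from typing import Callable, Dict, List

def handle_anomaly(
    job: str,
    root_cause: str,
    strategies: Dict[str, Callable[[str], None]],
    alarms: List[str],
) -> str:
    """Repair the job if a strategy matches its root cause, otherwise raise an alarm."""
    strategy = strategies.get(root_cause)
    if strategy is None:
        # No match: generate an alarm message for the operations terminal
        # (represented here by a plain list, which is an assumption).
        alarms.append(f"ALARM: {job} abnormal, root cause '{root_cause}', no repair strategy")
        return "alarmed"
    strategy(job)  # Match: apply the repair strategy to the abnormal job.
    return "repaired"

# Hypothetical strategy table: rerunning a job is recorded by appending to a list.
reruns: List[str] = []
alarms: List[str] = []
strategies = {"upstream_failure": reruns.append}

print(handle_anomaly("JOB2", "upstream_failure", strategies, alarms))
print(handle_anomaly("JOB3", "disk_full", strategies, alarms))
```

Keeping the strategy table as data rather than branching logic means new repair strategies can be registered without touching the dispatch code.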

[0012] Furthermore, after generating structured data, the method further includes: creating entity nodes based on the automated job, and creating a tree diagram based on the structured data corresponding to each entity node, wherein the tree diagram is updated based on the execution status of the automated job, wherein the execution status includes at least one of the following: execution progress, health score, abnormal status, and root cause of the abnormality.

[0013] According to another aspect of the present invention, an anomaly monitoring device for automated operations is also provided, comprising: a collection unit for collecting log data of automated operations and parsing the log data to generate structured data; a calculation unit for calculating a health score for each automated operation based on the structured data; an acquisition unit for identifying automated operations whose health scores are less than a preset health threshold, thereby obtaining abnormal operations; and an identification unit for extracting execution indicator data and dependency relationship data of the abnormal operations, and identifying the root cause of the abnormality of the abnormal operations based on the execution indicator data and the dependency relationship data.

[0014] Furthermore, the collection unit includes: a first query module, used to query the job data of each level through a retrieval script to obtain the hierarchical job data; and a first parsing module, used to parse the hierarchical job data according to the structured data template of each level to obtain the structured data.

[0015] Further, the calculation unit includes: a first calculation module, used to extract job execution progress data from the structured data, and calculate the progress health of the automated job based on the job execution progress data; a second calculation module, used to extract job execution timeliness data from the structured data, and calculate the timeliness health of the automated job based on the job execution timeliness data; a third calculation module, used to extract job quality data from the structured data, and calculate the quality health of the automated job based on the job quality data; and a fourth calculation module, used to calculate the health score of the automated job based on the progress health, the weight value corresponding to the progress health, the timeliness health, the weight value corresponding to the timeliness health, the quality health, and the weight value corresponding to the quality health.

[0016] Furthermore, the anomaly monitoring device for the automated operation further includes: a first acquisition module, used to acquire historical log data of each of the automated operations; and a first analysis module, used to analyze the historical log data and construct an indicator baseline profile of each of the automated operations based on the analysis results, wherein the indicator baseline profile includes at least one of the following: performance indicator baseline, log indicator baseline, and time series indicator baseline.

[0017] Furthermore, the identification unit includes: a first comparison module, used to compare the execution indicator data with the indicator baseline profile, and identify the indicator baseline deviation type of the abnormal job based on the comparison result; a first identification module, used to identify automated jobs with dependencies based on the dependency data, obtain dependent jobs, and acquire the abnormal situation of the dependent jobs; and a second identification module, used to identify the abnormal root cause of the abnormal job based on the indicator baseline deviation type of the abnormal job and the abnormal situation of the dependent jobs.

[0018] Furthermore, the automated operation anomaly monitoring device further includes: a first matching module, used to match the abnormal operation with an anomaly repair strategy based on the root cause of the abnormality to obtain a matching result; a first repair module, used to repair the abnormal situation of the abnormal operation based on the anomaly repair strategy when the matching result indicates that the abnormal operation and the anomaly repair strategy are successfully matched; and a first generation module, used to generate an anomaly alarm message and send the anomaly alarm message to the operation and maintenance terminal when the matching result indicates that the abnormal operation and the anomaly repair strategy are not matched.

[0019] Furthermore, the anomaly monitoring device for the automated operation further includes: a first creation module, used to create entity nodes based on the automated operation, and to create a tree diagram based on the structured data corresponding to each entity node, wherein the tree diagram is updated based on the execution status of the automated operation, wherein the execution status includes at least one of the following: execution progress, health score, abnormal status, and root cause of the abnormality.

[0020] According to another aspect of the present invention, a computer-readable storage medium is also provided, the computer-readable storage medium including a stored computer program, wherein, when the computer program is executed, the device where the computer-readable storage medium is located is controlled to perform any of the above-described anomaly monitoring methods for automated operations.

[0021] According to another aspect of the present invention, an electronic device is also provided, including one or more processors and a memory, the memory being used to store one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement any of the above-described anomaly monitoring methods for automated operations.

[0022] In this application, the following steps are performed: log data of automated jobs are collected and parsed to generate structured data; health scores of each automated job are calculated based on the structured data; automated jobs with health scores less than a preset health threshold are identified to obtain abnormal jobs; execution indicator data and dependency data of abnormal jobs are extracted; and the root cause of abnormality of abnormal jobs is identified based on the execution indicator data and dependency data.

[0023] In this application, log data generated during the execution of automated operations is collected in real time and parsed into structured data, on which a health calculation system is built. The health score of each automated operation is evaluated comprehensively to give a preliminary anomaly assessment. For the abnormal operations so identified, the root cause of the anomaly is determined from the indicator data and the dependencies between operations. This achieves proactive monitoring of automated operations, improves monitoring efficiency, overcomes the data silos that hamper anomaly identification, and improves the accuracy of root cause results, thereby solving the technical problem in related technologies where anomaly monitoring of automated operations relies on manual execution, resulting in low monitoring efficiency.

Attached Figure Description

[0024] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this application, illustrate exemplary embodiments of the invention and, together with their description, serve to explain the invention and do not constitute an undue limitation thereof. In the drawings:

[0025] Figure 1 is a flowchart of an optional method for detecting anomalies in an automated operation according to an embodiment of the present invention;

[0026] Figure 2 is a schematic diagram of an optional automated operation anomaly monitoring process according to an embodiment of the present invention;

[0027] Figure 3 is a schematic diagram of an optional automated operation anomaly monitoring device according to an embodiment of the present invention;

[0028] Figure 4 is a hardware structure block diagram of an electronic device (or mobile device) for performing an abnormality monitoring method for automated operations according to an embodiment of the present invention.

Detailed Implementation

[0029] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.

[0030] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0031] To facilitate understanding of the present invention by those skilled in the art, some terms or nouns involved in the various embodiments of the present invention are explained below:

[0032] Elasticsearch Cluster is a distributed search and analytics engine primarily used to handle large-scale data search, analysis, and storage needs.

[0033] Domain Specific Language (DSL) scripts are used to perform complex retrieval, sorting, and aggregation operations on data stored in Elasticsearch.

[0034] It should be noted that the method and apparatus for monitoring anomalies in automated operations in this application can be used in the field of automated data processing for monitoring anomalies in automated operations, and can also be used in any field other than automated data processing for monitoring anomalies in automated operations. This application does not limit the application field of the method and apparatus for monitoring anomalies in automated operations.

[0035] It should be noted that the relevant information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, stored data, and displayed data) involved in this application are information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, storage, use, processing, transmission, provision, disclosure, and application of such data all comply with the relevant laws, regulations, and standards of the relevant regions, necessary confidentiality measures have been taken, and they do not violate public order and good morals. Corresponding operation entry points are provided for users to choose to authorize or refuse. For example, this system has interfaces with relevant users or organizations. Before obtaining relevant information, a request to obtain the information needs to be sent to the aforementioned user or organization through the interface, and the relevant information is obtained only after receiving consent from the aforementioned user or organization.

[0036] It should be noted that in this application, when collecting and analyzing customer information, users are provided with corresponding operation entry points to choose whether to agree to or reject the automated decision-making results; if the user chooses to reject, the process will proceed to the expert decision-making process.

[0037] The following embodiments of the present invention can be applied to various automated operation anomaly monitoring systems, applications, and equipment. The present invention introduces a mechanism for evaluating automated operation anomalies based on health score calculation and for locating the root causes of anomalies based on dynamic baselines. This enables the system to adapt to changes in business cycles and data scale, achieving accurate anomaly identification and risk assessment, and shifting operation and maintenance from passive alarms to proactive protection.

[0038] Simultaneously, the system transforms raw, discrete monitoring data points into topological profiles rich in contextual relationships. This enables the system to understand the dependencies of automated operations, thereby identifying the impact chain of faulty operations and achieving a leap from indicator monitoring to business situation understanding, ultimately enabling precise root cause localization.

[0039] The present invention will now be described in detail with reference to various embodiments.

[0040] Example 1

[0041] According to an embodiment of the present invention, an embodiment of an anomaly monitoring method for automated operations is provided. It should be noted that the steps shown in the flowchart in the accompanying drawings can be executed in a computer system such as a set of computer-executable instructions. Furthermore, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be executed in a different order than that shown here.

[0042] Figure 1 is a flowchart of an optional anomaly detection method for automated operations according to an embodiment of the present invention. As shown in Figure 1, the method includes the following steps:

[0043] Step S101: Collect log data from automated operations and parse the log data to generate structured data.

[0044] In step S101 above, all departure extension function systems participating in the automated operation are identified, including their operating environment, log file location, and log format, ensuring accurate collection of all log data related to the automated operation. Log data is captured in real-time from multiple data sources, and the collected log data is parsed and standardized based on a pre-written DSL script. The DSL script can accurately identify key information in the logs, such as job name, start time, end time, job status, and resource usage information, and convert them into predefined, uniformly formatted structured data.

[0045] The collected execution log data is stored in the Elasticsearch cluster in real time, ensuring high availability and fast retrieval capabilities. Simultaneously, Elasticsearch's horizontal scaling capabilities guarantee smooth data collection and storage without data loss, even under large-scale data streams.
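As an illustration of the kind of retrieval script described above, the following minimal sketch builds an Elasticsearch query DSL body as a Python dictionary. The `bool`, `filter`, `term`, `sort`, and `size` constructs are standard query DSL, but the field names `job_name`, `level`, and `start_time` are hypothetical, since the application does not disclose the index mapping.

```python
import json

def build_level_query(job_name: str, level: str, size: int = 100) -> dict:
    """Build a query DSL body selecting the log records of one job at one level."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter context: exact matches, no relevance scoring needed.
                "filter": [
                    {"term": {"job_name": job_name}},  # hypothetical field
                    {"term": {"level": level}},        # hypothetical field
                ]
            }
        },
        # Return records in execution order (hypothetical timestamp field).
        "sort": [{"start_time": {"order": "asc"}}],
    }

body = build_level_query("JOB1", "table")
print(json.dumps(body, indent=2))
```

The resulting dictionary could be sent as the body of a search request against the cluster; running one such query per level is one way to obtain the hierarchical job data described in the following paragraphs.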

[0046] A standardized data model is defined for storing the parsed log information. This model ensures that all job-related data can be processed consistently by the system, facilitating subsequent monitoring, analysis, and decision-making. The model includes at least three levels: total task data, table task data, and block task data, covering job information at the software, task, table, and block levels, respectively.

[0047] Furthermore, the steps for parsing log data and generating structured data include: querying job data at each level through a retrieval script to obtain hierarchical job data; and parsing the hierarchical job data according to the structured data templates for each level to obtain structured data.

[0048] Specifically, when parsing data, a pre-constructed retrieval script retrieves job data at each level from the collected log data set, dividing the original log data into job data for each level to obtain hierarchical job data. Then, for each level, a structured data template is called to parse the hierarchical job data. The template must clearly specify data fields, formats, and meanings to ensure data consistency and interpretability. For example, the total task template may include the name of the job, the software it belongs to, the status, the total elapsed time, and the number of tables processed; the table task template may further specify the names of the tables processed, the operation types, the progress, and so on. Structured data is thus obtained. For example, the software-level structured data ["DE", "JOB1", "Night Maintenance Data Deletion", "TaskEnd", "20240403161923", "SUCCESS", "30", "10"] is parsed based on the total task template, and the table-level structured data ["JOB1", "TableEnd", "TABLE1", "20240403162023", "2 / 10", "32", "SUCCESS", "10", "32000", "20240401"] is parsed based on the table task template.
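Template-based parsing of this kind can be sketched as a mapping from positional values to named fields. The field names below are assumptions inferred from the description of the total task template; the application does not state which position carries which field, so the assignment of "30" to elapsed time and "10" to table count is a guess made only for illustration.

```python
from typing import Dict, List

# Hypothetical template: ordered field names for total-task records.
TEMPLATES: Dict[str, List[str]] = {
    "total": [
        "software", "job_id", "job_name", "event",
        "timestamp", "status", "elapsed", "table_count",
    ],
}

def parse_record(level: str, values: List[str]) -> Dict[str, str]:
    """Pair each positional value with the field name its level template assigns."""
    fields = TEMPLATES[level]
    if len(values) != len(fields):
        raise ValueError(
            f"{level} template expects {len(fields)} values, got {len(values)}"
        )
    return dict(zip(fields, values))

record = parse_record(
    "total",
    ["DE", "JOB1", "Night Maintenance Data Deletion", "TaskEnd",
     "20240403161923", "SUCCESS", "30", "10"],
)
print(record["job_id"], record["status"])
```

Rejecting records whose length does not match the template surfaces log format drift early instead of silently shifting values into the wrong fields.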

[0049] Through the above steps, the original log data is transformed into a structured data form that can be understood and processed by the system, ensuring data consistency, integrity, and availability, providing a solid foundation for subsequent analysis and decision-making.

[0050] Further, after generating the structured data, it also includes: creating entity nodes based on automated jobs, and creating a tree diagram based on the structured data corresponding to each entity node. The tree diagram is updated based on the execution status of the automated job, where the execution status includes at least one of the following: execution progress, health score, abnormal status, and root cause of the abnormality.

[0051] Specifically, during the execution of an automated job, for each job, an entity node is generated based on its structured data. These entity nodes include but are not limited to total task nodes, table task nodes, and block task nodes. Each node represents a specific execution unit in the automated job and carries key information about job execution. Based on the hierarchical relationship between entity nodes, a tree diagram model is constructed. The root node of the tree diagram is the total task of the entire automated job, and the child nodes are successively refined into table tasks and block tasks downward, forming a clear job flow view from the top layer to the bottom layer. The construction of the tree diagram ensures the visualization of the job execution status, facilitating operation and maintenance personnel to grasp the overall execution status of the entire job from a macroscopic perspective. At the same time, the construction of the tree diagram can also uncover the dependency relationships between jobs.

[0052] As automated jobs are executed, the information of entity nodes is continuously updated, including their execution progress, health score, abnormal status, and root cause. This information is mapped in real time to the corresponding nodes in the tree diagram. For example, when the health score of a job decreases or an abnormal status occurs, the corresponding entity node will be marked with a specific color or icon in the tree diagram, intuitively alerting operations personnel to pay attention. The dynamic updating characteristic of the tree diagram allows it to display the latest status of the job workflow in real time, providing immediate data support for operations and maintenance decisions.
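A minimal sketch of such a tree diagram model follows, assuming a simple in-memory node class. The class name `JobNode`, its attributes, and the helper `abnormal_paths` are hypothetical and chosen only to illustrate the total task, table task, and block task hierarchy with status updates.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class JobNode:
    """One entity node: a total task, table task, or block task."""
    name: str
    level: str                      # "total", "table", or "block"
    health_score: Optional[float] = None
    abnormal: bool = False
    children: List["JobNode"] = field(default_factory=list)

    def add_child(self, child: "JobNode") -> "JobNode":
        self.children.append(child)
        return child

def abnormal_paths(node: JobNode, prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Depth-first walk returning the root-to-node path of every abnormal node."""
    path = prefix + (node.name,)
    found = [path] if node.abnormal else []
    for child in node.children:
        found.extend(abnormal_paths(child, path))
    return found

# Build a three-level tree and update one node's status as execution proceeds.
root = JobNode("JOB1", "total")
table = root.add_child(JobNode("TABLE1", "table"))
table.add_child(JobNode("BLOCK1", "block"))
table.health_score, table.abnormal = 42.0, True

print(abnormal_paths(root))
```

The returned paths give a display layer enough information to highlight the affected node together with its ancestors.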

[0053] Step S102: Calculate the health score of each automated operation based on structured data.

[0054] In step S102 above, based on the previously generated structured data, a score reflecting the overall health status of the automated operation is derived by quantitatively evaluating multiple aspects of the automated operation. This helps operations and maintenance personnel quickly identify potential problems in the operation and take timely measures to ensure business continuity and stability.

[0055] Furthermore, the steps for calculating the health score of each automated task based on structured data include: extracting task execution progress data from structured data and calculating the progress health of the automated task based on the task execution progress data; extracting task execution timeliness data from structured data and calculating the timeliness health of the automated task based on the task execution timeliness data; extracting task quality data from structured data and calculating the quality health of the automated task based on the task quality data; and calculating the health score of the automated task based on the progress health, the weight value corresponding to the progress health, the timeliness health, the weight value corresponding to the timeliness health, the quality health, and the weight value corresponding to the quality health.

[0056] Specifically, when calculating the health score, job execution progress data is extracted from structured data. This process involves reading and analyzing job status information stored in the database, including the number of completed jobs, the number of jobs in progress, and the total number of jobs. This data is used to measure whether the execution progress of automated jobs meets expectations. By comparing the number of completed jobs with the expected number of jobs, a numerical value reflecting the progress health status is calculated according to a formula. For example, the progress health score can be calculated as follows: Progress Health Score = (Number of successfully completed jobs / Number of jobs expected to be completed at the current moment) × 100%. The higher this value, the more ideal the job execution progress.
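The progress health formula above translates directly into code. In this sketch the function and parameter names are hypothetical, and the handling of the case where no jobs are yet expected (returning 100) is an assumption, since the application does not address it.

```python
def progress_health(completed_ok: int, expected_by_now: int) -> float:
    """Progress health = successfully completed jobs / jobs expected by now, as a percentage."""
    if expected_by_now <= 0:
        # Assumption: with nothing due yet, progress is treated as fully healthy.
        return 100.0
    return completed_ok * 100.0 / expected_by_now

print(progress_health(8, 10))
```

Note that the formula as stated is not capped, so a run that is ahead of schedule yields a value above 100.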

[0057] Job execution timeliness data is extracted from the structured data. This data pertains to job execution time, including the current job's average, longest, and shortest execution times, and is used to assess whether jobs will be completed on time and whether there is a risk of exceeding the time limit. The timeliness health of the automated jobs is then calculated by analyzing how far job execution time deviates from historical normal levels. For example, a baseline model can be built using a weighted average time and standard deviation, and the timeliness health is calculated from the deviation of the current job execution time from that model. Specifically, timeliness health = 100 - (weighted deduction value for risky jobs), where weighted deduction value for risky jobs = Σ (timeliness risk coefficient of each automated job × task weight), timeliness risk coefficient = Max(0, Min(1, (predicted remaining time / remaining available time window))), and remaining available time window = latest planned completion time of the task - current time.
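The timeliness formulas can be sketched as follows. Two points are assumptions not stated in the application: task weights are taken to sum to 100 across all jobs, so that the deduction stays within the 0 to 100 range, and a job whose time window has already closed is assigned the maximum risk coefficient of 1. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

@dataclass
class JobTiming:
    predicted_remaining: timedelta   # predicted remaining execution time
    latest_planned_end: datetime     # latest planned completion time
    weight: float                    # assumption: weights of all jobs sum to 100

def risk_coefficient(job: JobTiming, now: datetime) -> float:
    """Max(0, Min(1, predicted remaining time / remaining available window))."""
    window = (job.latest_planned_end - now).total_seconds()
    if window <= 0:
        return 1.0  # assumption: an already-missed deadline counts as full risk
    ratio = job.predicted_remaining.total_seconds() / window
    return max(0.0, min(1.0, ratio))

def timeliness_health(jobs: List[JobTiming], now: datetime) -> float:
    """100 minus the weighted sum of per-job risk coefficients."""
    deduction = sum(risk_coefficient(j, now) * j.weight for j in jobs)
    return 100.0 - deduction

now = datetime(2024, 4, 3, 16, 0)
deadline = datetime(2024, 4, 3, 17, 0)
jobs = [
    JobTiming(timedelta(minutes=30), deadline, weight=60.0),  # ratio 0.5
    JobTiming(timedelta(hours=2), deadline, weight=40.0),     # ratio 2.0, clamped to 1
]
print(timeliness_health(jobs, now))
```

In this example the first job contributes a deduction of 30 and the second a deduction of 40, giving a timeliness health of 30.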

[0058] Job quality data is extracted from the structured data. Quality data covers the accuracy and completeness of the job, such as whether the job results contain data errors or omissions and whether the job produced the correct results as expected. The quality health of the automated jobs is then calculated by checking the consistency between the job results and the expected output. For example, failed jobs and data validation errors are checked, and a numerical value reflecting job quality is calculated from them: Quality Health = 100 - (Failed Job Deduction + Quality Warning Deduction), where Failed Job Deduction = (Total weight of completed jobs with failed results / Total weight of all jobs) × 100. The Quality Warning Deduction is applied for automated jobs whose data volume or execution time is abnormal (significantly lower or higher than historical averages).
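A minimal sketch of the quality health calculation follows. Because the application gives no formula for the quality warning deduction, it is passed in as a precomputed parameter here, which is an assumption, as are the function and type names.

```python
from typing import List, Tuple

def quality_health(jobs: List[Tuple[float, bool]], warning_deduction: float = 0.0) -> float:
    """Quality health = 100 - (failed job deduction + quality warning deduction).

    jobs: (weight, failed) pairs, one per job.
    warning_deduction: supplied by the caller; its derivation is not specified
    in the source, so it is treated as an input.
    """
    total_weight = sum(weight for weight, _ in jobs)
    if total_weight <= 0:
        raise ValueError("total weight of all jobs must be positive")
    failed_weight = sum(weight for weight, failed in jobs if failed)
    failed_deduction = failed_weight * 100.0 / total_weight
    return 100.0 - (failed_deduction + warning_deduction)

# One failed job with weight 2 out of a total weight of 10, plus a warning deduction of 5.
print(quality_health([(2.0, True), (3.0, False), (5.0, False)], warning_deduction=5.0))
```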

[0059] Finally, based on the above calculation results and combined with preset weight values, the overall health score of the automated operation is calculated. The weight values reflect the relative importance of each health indicator (progress, timeliness, quality) in the overall health evaluation.
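One plausible way to combine the three health values is a weighted sum, sketched below. The application states only that the score is calculated from the three values and their weights, so the weighted-sum form, the requirement that weights sum to 1, and the default weights (0.3, 0.3, 0.4) are all assumptions.

```python
from typing import Tuple

def health_score(
    progress: float,
    timeliness: float,
    quality: float,
    weights: Tuple[float, float, float] = (0.3, 0.3, 0.4),  # hypothetical values
) -> float:
    """Weighted sum of progress, timeliness, and quality health."""
    w_progress, w_timeliness, w_quality = weights
    if abs(w_progress + w_timeliness + w_quality - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return w_progress * progress + w_timeliness * timeliness + w_quality * quality

print(round(health_score(80.0, 30.0, 75.0), 2))
```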

[0060] Based on the calculated health score, each automated job is tagged with a label indicating its health level.

[0061] Step S103: Identify automated jobs with health scores lower than the preset health threshold to obtain abnormal jobs.

[0062] In step S103 above, the calculated health score is compared with a preset health threshold, and automated jobs with health scores lower than the preset health threshold are selected and marked as abnormal jobs. By comparing the health scores of automated jobs with the preset health threshold in real time, automatic detection and identification of abnormal job states are achieved, improving the efficiency of abnormal detection.
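
A minimal sketch of the comparison in step S103 is given below. The threshold value of 60 and the job names are hypothetical; the source only requires that the threshold be preset.

```python
def find_abnormal_jobs(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the names of jobs whose health score is strictly below the threshold."""
    return sorted(name for name, score in scores.items() if score < threshold)


scores = {"JOB1": 92.5, "JOB2": 58.0, "JOB3": 60.0}
abnormal = find_abnormal_jobs(scores, threshold=60.0)
```

A job whose score equals the threshold is not marked abnormal here, following the "lower than" wording of the step.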

[0063] Step S104: Extract the execution indicator data and dependency data of the abnormal job, and identify the root cause of the abnormal job based on the execution indicator data and dependency data.

[0064] In step S104 above, extracting the execution indicator data of an abnormal job includes obtaining the relevant performance indicators of that job, such as execution time, resource consumption (CPU, memory), and number of I/O operations. By parsing the structured job data stored in the database, the indicators that exceed their normal range can be identified, providing a basis for subsequent anomaly localization. Dependency data is also collected. Automated jobs are rarely isolated and may have complex dependencies on one another. The dependencies between jobs can therefore be extracted from a pre-built tree diagram, including the calling order between tasks, the data flow paths, and the degree to which a job depends on specific resources (such as database connections).
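
One way to extract the jobs that an abnormal job depends on is a breadth-first walk over the dependency data. The sketch below assumes the dependencies are available as a mapping from each job to the jobs it directly depends on; this representation and the job names are hypothetical and are not prescribed by the source.

```python
from collections import deque


def upstream_jobs(job: str, depends_on: dict[str, list[str]]) -> list[str]:
    """Return every direct and transitive upstream dependency of `job`, breadth-first."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque(depends_on.get(job, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.add(current)
        order.append(current)
        queue.extend(depends_on.get(current, []))
    return order


depends_on = {
    "REPORT": ["AGGREGATE"],
    "AGGREGATE": ["LOAD_A", "LOAD_B"],
    "LOAD_A": ["EXTRACT"],
    "LOAD_B": ["EXTRACT"],
}
chain = upstream_jobs("REPORT", depends_on)
```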

[0065] The step of identifying the root causes of abnormal jobs based on execution metric data and dependency data can utilize statistical analysis, machine learning models, or expert systems to conduct in-depth analysis of the aforementioned data and identify the root causes leading to job anomalies. For example, by comparing the current job execution time with historical baselines and considering the status of dependent jobs, it can be determined whether the problem is caused by resource bottlenecks (such as high CPU utilization) or upstream job failures.

[0066] Furthermore, before identifying the root cause of abnormal operations based on execution metric data and dependency data, the process also includes: obtaining historical log data for each automated operation; analyzing the historical log data; and constructing a baseline profile of metrics for each automated operation based on the analysis results. The baseline profile of metrics includes at least one of the following: performance metric baseline, log metric baseline, and time-series metric baseline.

[0067] Specifically, this embodiment of the invention introduces an indicator baseline profile for identifying the root causes of anomalies. First, historical log data is obtained for each automated job that executed normally. This data can be extracted from the Elasticsearch cluster and covers detailed information on past job executions, including but not limited to execution time, resource consumption, data processing volume, and any anomaly records. The historical log data is then analyzed to extract key performance indicators, such as average and peak CPU utilization, memory usage, disk I/O operations, and network bandwidth usage. At the same time, abnormal patterns and periodic behaviors in the log data are analyzed, such as how job latency and peak resource usage change within specific time periods.

[0068] Based on the analysis results, a baseline profile of indicators is constructed. This stage aims to create a template or benchmark reflecting normal operational behavior as a reference for subsequent anomaly detection. The baseline profile typically includes three types: performance indicator baselines, reflecting expected performance indicators during job execution, such as normal execution time, average and fluctuating range of resource consumption; log indicator baselines, based on historical log data, identifying typical log messages and patterns during normal job operation, used as a standard for anomaly log detection; and time-series indicator baselines, considering the periodic impact of job execution (such as differences between weekdays and weekends, day and night), constructing a baseline reflecting operational behavior within a specific period to enable accurate anomaly detection in different time periods.

[0069] By constructing baseline profiles of metrics involving performance, logs, and time series, a solid data foundation is provided for anomaly detection, making the identification of root causes of anomalies more accurate and reducing the possibility of false positives and false negatives.

[0070] Furthermore, the steps for identifying the root causes of abnormal operations based on execution indicator data and dependency data include: comparing the execution indicator data with the indicator baseline profile, and identifying the deviation type of the indicator baseline of the abnormal operation based on the comparison results; identifying automated operations with dependencies based on dependency data, obtaining dependent operations, and obtaining the abnormal conditions of dependent operations; and identifying the root causes of abnormal operations based on the deviation type of the indicator baseline of the abnormal operation and the abnormal conditions of dependent operations.

[0071] Specifically, the system compares the execution metric data with the baseline metric profile. It receives and analyzes the execution metric data of the current automated job (such as CPU utilization, memory usage, and execution time) in real time and compares it with a pre-built baseline metric profile. The baseline metric profile includes the distribution of metrics during historical normal operation, such as the mean, standard deviation, and outlier thresholds. This identifies deviations in metrics during job execution—that is, which metrics exceed the normal range, potentially causing job anomalies.
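
The comparison against the baseline profile can be illustrated with a simple mean and standard deviation check. This is a sketch under assumptions: the metric names are hypothetical, and the outlier rule of three standard deviations (`k=3.0`) is a common convention chosen for the example, not a value given in the source.

```python
from statistics import mean, stdev


def build_baseline(history: dict[str, list[float]]) -> dict[str, tuple[float, float]]:
    """Map each metric to (mean, sample standard deviation) of its normal history."""
    return {metric: (mean(vals), stdev(vals)) for metric, vals in history.items()}


def deviations(current: dict[str, float],
               baseline: dict[str, tuple[float, float]],
               k: float = 3.0) -> dict[str, str]:
    """Label each metric outside mean +/- k * std as 'high' or 'low'."""
    result: dict[str, str] = {}
    for metric, value in current.items():
        mu, sigma = baseline[metric]
        if value > mu + k * sigma:
            result[metric] = "high"
        elif value < mu - k * sigma:
            result[metric] = "low"
    return result


history = {
    "exec_minutes": [28, 30, 32, 30, 30],
    "cpu_pct": [40, 45, 50, 45, 45],
}
baseline = build_baseline(history)
found = deviations({"exec_minutes": 48, "cpu_pct": 47}, baseline)
```

Metrics that stay inside the band are omitted from the result, so an empty dictionary means no baseline deviation was found.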

[0072] The system identifies automated jobs with dependencies based on dependency data. By analyzing dependency data between jobs (e.g., which jobs' outputs are inputs for other jobs), the system determines other jobs related to the current anomalous job, i.e., dependent jobs. It also identifies the upstream and downstream jobs of the anomalous job to check for any propagation of the anomaly caused by other jobs.

[0073] The system then obtains the anomalies of the dependent jobs. If a dependent job has been marked as abnormal, the system collects the details of that anomaly, including the job's health score, execution metric data, the time at which the anomaly occurred, and other relevant information. It then assesses whether the current abnormal job is directly affected by the abnormal dependent job, so as to pinpoint the problem area more accurately.

[0074] Finally, by combining the identified deviations in metrics with the anomalies in dependent jobs, the root cause of the abnormal job is determined. This may include resource bottlenecks, upstream data errors, software defects, or other system failures. This allows for a multi-faceted understanding of the background of the abnormal job, leading to a more accurate identification of the root cause and providing direction for subsequent troubleshooting.
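
How the two inputs might be combined can be sketched with a small rule set. The rules, their order of precedence, the metric names, and the category labels below are all hypothetical; the source leaves the decision logic open (statistical analysis, machine learning models, or expert systems may be used).

```python
RESOURCE_METRICS = {"cpu_pct", "memory_pct"}  # hypothetical metric names


def identify_root_cause(deviation_types: dict[str, str],
                        abnormal_upstream: list[str]) -> str:
    """Combine baseline deviations with dependent-job anomalies into one category."""
    # Assumed rule 1: an abnormal upstream job is the most likely origin.
    if abnormal_upstream:
        return "upstream_job_failure"
    # Assumed rule 2: a resource metric above its baseline suggests a bottleneck.
    if any(m in RESOURCE_METRICS and d == "high" for m, d in deviation_types.items()):
        return "resource_bottleneck"
    # Assumed rule 3: some other metric deviates but no rule explains it.
    if deviation_types:
        return "unclassified_metric_deviation"
    return "unknown"


cause_a = identify_root_cause({"exec_minutes": "high"}, ["LOAD_A"])
cause_b = identify_root_cause({"cpu_pct": "high", "exec_minutes": "high"}, [])
```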

[0075] Furthermore, after identifying the root cause of abnormal operations based on execution indicator data and dependency data, the process also includes: matching the abnormal operation with an anomaly repair strategy based on the root cause to obtain a matching result; if the matching result indicates that the abnormal operation and the anomaly repair strategy are successfully matched, repairing the abnormal situation of the abnormal operation based on the anomaly repair strategy; or, if the matching result indicates that the abnormal operation and the anomaly repair strategy are not matched, generating an anomaly alarm message and sending the anomaly alarm message to the operation and maintenance terminal.

[0076] Specifically, the system matches anomaly repair strategies to abnormal jobs based on their root causes. After identifying the root cause of an anomaly, the system automatically searches for a corresponding repair strategy for the abnormal job based on the built-in anomaly handling manual or preset handling rules. If the matching result indicates that the abnormal job and the anomaly repair strategy are successfully matched, the abnormal situation of the abnormal job is repaired based on the anomaly repair strategy. If the system finds an applicable repair strategy, it will automatically execute the strategy to attempt to restore the normal operation of the abnormal job. This may include operations such as restarting the service, releasing resources, clearing the cache, and performing backup and restore. This achieves automated fault recovery without manual intervention, reduces the workload of operation and maintenance personnel, and improves the system's self-healing capabilities.

[0077] If the matching result indicates that the abnormal operation and the abnormal repair strategy fail to match, an abnormal alarm message is generated and sent to the operations and maintenance terminal. If the system cannot find an effective repair solution, or if the abnormal situation is too complex to be handled automatically, the system will generate an alarm message containing abnormal details, possible root causes, and suggested actions, and send the alarm message to the operations and maintenance terminal via email, SMS, or other communication channels. This ensures that the fault can be handled manually in a timely manner to prevent the problem from escalating.
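
The match-then-repair-or-alarm flow described above can be sketched as a lookup in a rule table. The rule table, the root cause labels, and the repair functions are hypothetical; the repair functions only return a description of the action and do not restart or release anything.

```python
from typing import Callable


def restart_service(job: str) -> str:
    return f"restart service for {job}"


def release_resources(job: str) -> str:
    return f"release resources held by {job}"


# Hypothetical rule table: root cause category -> repair action.
REPAIR_RULES: dict[str, Callable[[str], str]] = {
    "service_unavailable": restart_service,
    "resource_bottleneck": release_resources,
}


def handle_abnormal_job(job: str, root_cause: str,
                        notify: Callable[[dict], None]) -> str:
    """Run the matching repair action, or send an alarm if no strategy matches."""
    action = REPAIR_RULES.get(root_cause)
    if action is None:
        notify({
            "job": job,
            "root_cause": root_cause,
            "suggestion": "manual investigation required",
        })
        return "alarm"
    return action(job)


alarms: list[dict] = []
outcome_repaired = handle_abnormal_job("JOB1", "resource_bottleneck", alarms.append)
outcome_alarm = handle_abnormal_job("JOB2", "upstream_data_error", alarms.append)
```

Passing the notification channel in as a callable keeps the sketch independent of any particular email or SMS service.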

[0078] Through the above steps, log data of automated operations is collected, parsed, and structured data is generated. Based on the structured data, the health score of each automated operation is calculated, and automated operations with health scores less than the preset health threshold are identified as abnormal operations. The execution indicator data and dependency data of abnormal operations are extracted, and the root cause of the abnormality of the abnormal operations is identified based on the execution indicator data and dependency data.

[0079] In this embodiment, for automated operations, log data during the execution process is collected in real time, parsed, and structured data is established to construct a health computing system. The health scores of each automated operation are comprehensively evaluated, and a preliminary anomaly assessment is conducted. For the identified abnormal operations, the root causes of the anomalies are identified based on the indicator data and the dependencies between operations. This enables proactive monitoring of automated operations, improves monitoring efficiency, breaks down the data silos in the identification of anomalies in automated operations, improves the accuracy of the root cause monitoring results, and thus solves the technical problem in related technologies where the anomaly monitoring of automated operations relies on manual execution, resulting in low monitoring efficiency.

[0080] The following describes in detail another optional implementation method.

[0081] Figure 2 is a schematic diagram of an optional anomaly monitoring process for automated operations according to an embodiment of the present invention. As shown in Figure 2, the anomaly monitoring process for automated operations includes:

[0082] Step 1, Begin;

[0083] Step 2: Collect job execution logs within time t;

[0084] Collect job execution logs from multiple automated job log systems within a time period t, for example, setting t=10min. Use a DSL script to query and obtain structured hierarchical job data: software level, task level, data table level, and block level.
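
The source does not say which DSL is used. Since the description elsewhere mentions an Elasticsearch cluster as a log store, the sketch below assumes Elasticsearch Query DSL and only builds the request body as a Python dictionary; it does not call any client library. The field names `job_level` and `@timestamp` are hypothetical.

```python
def build_log_query(level: str, window_minutes: int = 10) -> dict:
    """Build a query body for one job level within the last t minutes."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # "job_level" and "@timestamp" are hypothetical field names.
                    {"term": {"job_level": level}},
                    {"range": {"@timestamp": {
                        "gte": f"now-{window_minutes}m",
                        "lte": "now",
                    }}},
                ]
            }
        },
        "sort": [{"@timestamp": {"order": "asc"}}],
    }


# One query per level named in the text: software, task, data table, block.
LEVELS = ["software", "task", "table", "block"]
queries = {level: build_log_query(level) for level in LEVELS}
```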

[0085] Step 3: Job parsing and database storage;

[0086] The raw task data is parsed and standardized, and can be stored according to the following templates. The overall task includes [software name, task name, task description, task operation, timestamp, task status, total time, total number of tables processed]; the table task includes [task name, table operation, table name, timestamp, execution progress of table, total number of tables processed, task status, total time, number of records processed, processing date]; the block task includes [task name, table operation, table name, timestamp, execution progress of table, total number of tables processed, task status, total time, number of records processed, processing date]. For example, the overall task parsing result for the night maintenance operation is ["DE", "JOB1", "Night Maintenance Deletion", "TaskEnd", "20240403161923", "SUCCESS", "30", "10"]; the table task parsing result is ["JOB1", "TableEnd", "TABLE1", "20240403162023", "2/10", "32", "SUCCESS", "10", "32000", "20240401"]; the block task parsing results are ["JOB1", "SecStart", "TABLE1", "20240403162029", "1/20"], ["JOB1", "SecEnd", "TABLE1", "20240403162034", "1/32", "SUCCESS", "5", "1000", "OK"], ..., ["JOB1", "SecEnd", "TABLE1", "20240403162124", "10/20", "FAILURE", "5", "1000", "OK"].
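
As an illustration of template-based parsing, the sketch below maps the overall task example onto named fields. The field identifiers are hypothetical English renderings of the template entries, and the numeric conversion of the last two fields is an assumption; the source does not state the unit of the total time.

```python
from datetime import datetime

# Field order follows the overall task template; the identifiers are hypothetical.
OVERALL_FIELDS = [
    "software_name", "task_name", "task_description", "task_operation",
    "timestamp", "task_status", "total_time", "tables_processed",
]


def parse_overall(record: list[str]) -> dict:
    """Turn one overall task record into a dictionary keyed by template field."""
    if len(record) != len(OVERALL_FIELDS):
        raise ValueError(f"expected {len(OVERALL_FIELDS)} fields, got {len(record)}")
    row: dict = dict(zip(OVERALL_FIELDS, record))
    # Timestamps in the example use the compact form YYYYMMDDhhmmss.
    row["timestamp"] = datetime.strptime(row["timestamp"], "%Y%m%d%H%M%S")
    row["total_time"] = int(row["total_time"])  # unit not stated in the source
    row["tables_processed"] = int(row["tables_processed"])
    return row


row = parse_overall([
    "DE", "JOB1", "Night Maintenance Deletion", "TaskEnd",
    "20240403161923", "SUCCESS", "30", "10",
])
```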

[0087] Step 4: Topology construction;

[0088] A tree diagram is constructed based on structured data, and the real-time status of nodes is displayed in the tree diagram. This allows for a clear view of automated jobs and the data tables under those jobs, as well as the execution status and statistics of each data table (such as job execution time, number of business entities, number of database tables, etc.).
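
A minimal sketch of such a tree is shown below, with jobs as first-level nodes and data tables as their children. The `Node` class, the input row shape, and the roll-up rule (a parent is "FAILURE" if any child failed, "SUCCESS" if all children succeeded, otherwise "RUNNING") are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    status: str = "UNKNOWN"
    children: dict[str, "Node"] = field(default_factory=dict)

    def child(self, name: str) -> "Node":
        """Return the named child, creating it on first use."""
        return self.children.setdefault(name, Node(name))


def build_tree(rows: list[tuple[str, str, str]]) -> Node:
    """Build a root -> job -> table tree from (job, table, status) rows."""
    root = Node("ROOT")
    for job, table, status in rows:
        root.child(job).child(table).status = status
    return root


def rollup(node: Node) -> str:
    """Propagate child statuses upward (assumed rule, see lead-in)."""
    if node.children:
        statuses = [rollup(c) for c in node.children.values()]
        if "FAILURE" in statuses:
            node.status = "FAILURE"
        elif all(s == "SUCCESS" for s in statuses):
            node.status = "SUCCESS"
        else:
            node.status = "RUNNING"
    return node.status


tree = build_tree([
    ("JOB1", "TABLE1", "SUCCESS"),
    ("JOB1", "TABLE2", "FAILURE"),
    ("JOB2", "TABLE3", "SUCCESS"),
])
rollup(tree)
```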

[0089] Step 5: Health Calculation;

[0090] A health model is constructed to assess how well the current job execution meets expectations. It consists of four key dimensions, each with a pre-defined weight.

[0091] Progress health score P_score: Used to evaluate whether the overall completion progress of automated jobs meets expectations.

[0092] Progress health score = (number of tasks successfully completed / number of tasks expected to be completed at the current moment) × 100.
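
The progress formula translates directly into code. The handling of the two edge cases (nothing expected yet, and more tasks completed than expected) is an assumption, since the source does not address them.

```python
def progress_health(completed_ok: int, expected_by_now: int) -> float:
    """(tasks successfully completed / tasks expected by now) x 100."""
    if expected_by_now <= 0:
        return 100.0  # assumption: nothing is due yet, so progress is on track
    # Assumption: cap at 100 when jobs finish ahead of schedule.
    return min(100.0, completed_ok * 100.0 / expected_by_now)


p_score = progress_health(completed_ok=8, expected_by_now=10)
```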

[0093] Timeliness Dimension Health Score T_score: Used to assess whether a running job is at risk of timeout.

[0094] Timeliness health score = 100 - (weighted deduction for risky jobs);

[0095] Weighted deduction for risky jobs = Σ (timeliness risk coefficient of each job × job weight);

[0096] Timeliness risk coefficient = Max(0, Min(1, (predicted remaining time / remaining available time window)));

[0097] Remaining available time window = Latest scheduled completion time of the job - Current time;

[0098] Quality health score R_score: Used to assess whether there are data quality issues in completed jobs, and the impact of failed jobs.

[0099] Quality health score = 100 - (failed job deduction + quality warning deduction);

[0100] Failed job deduction = (total weight of completed jobs that resulted in failure / total weight of all jobs) × 100.

[0101] Quality warnings are penalized based on factors such as abnormal data volume or abnormal execution time (significantly lower or higher than historical averages).

[0102] Health score = (P_score × W1 + T_score × W2 + R_score × W3)^(1 / (W1 + W2 + W3)), where W1 is the progress weight, W2 is the timeliness weight, and W3 is the risk weight.
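
The overall score can be sketched as below, reading each score and weight pair as a product, which is an interpretation of the source formula rather than a confirmed form. Under this reading, weights that sum to 1 make the exponent equal to 1, so the score reduces to a plain weighted sum. The example weights are hypothetical.

```python
def health_score(p_score: float, t_score: float, r_score: float,
                 w1: float, w2: float, w3: float) -> float:
    """(P*W1 + T*W2 + R*W3) ** (1 / (W1 + W2 + W3)), as interpreted above."""
    total = w1 + w2 + w3
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (p_score * w1 + t_score * w2 + r_score * w3) ** (1.0 / total)


# Hypothetical weights summing to 1, so the result is the weighted sum.
overall = health_score(80.0, 60.0, 100.0, w1=0.5, w2=0.25, w3=0.25)
```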

[0103] Step 6: Determine if an anomaly is detected. If yes, proceed to Step 7; otherwise, proceed to Step 12.

[0104] Step 7: Identification of the root causes of anomalies;

[0105] A dynamic baseline profile is established based on the historical execution status of automated jobs. This baseline primarily includes, but is not limited to, the following categories: performance indicator baselines, which describe the distribution of job execution time during normal historical periods (e.g., average 30 minutes, standard deviation 5 minutes); log indicator baselines, which describe the log sequence templates typically printed during normal operation and the amount of data input/output under normal conditions; and time-series indicator baselines, which take periodicity into account (weekdays/weekends, seasonality) to establish behavioral expectations for different time periods. The root causes of anomalies are then analyzed in depth, in conjunction with the dependencies between jobs.
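
A time-series baseline that accounts for periodicity can be sketched by keeping separate statistics for weekdays and weekends. The two-bucket split, the sample run data, and the three standard deviation rule are assumptions for illustration; a real profile could use finer periods such as hour of day or season.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def bucket(ts: datetime) -> str:
    """Assign a run to the 'weekday' or 'weekend' period."""
    return "weekend" if ts.weekday() >= 5 else "weekday"


def build_periodic_baseline(runs: list[tuple[datetime, float]]) -> dict[str, tuple[float, float]]:
    """Map each period to (mean, sample standard deviation) of run minutes."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for ts, minutes in runs:
        grouped[bucket(ts)].append(minutes)
    # A standard deviation needs at least two samples per period.
    return {b: (mean(v), stdev(v)) for b, v in grouped.items() if len(v) >= 2}


def is_anomalous(ts: datetime, minutes: float,
                 baseline: dict[str, tuple[float, float]], k: float = 3.0) -> bool:
    """True if the run deviates more than k standard deviations for its period."""
    mu, sigma = baseline[bucket(ts)]
    return abs(minutes - mu) > k * sigma


runs = [
    (datetime(2024, 4, 1), 30.0), (datetime(2024, 4, 2), 25.0),
    (datetime(2024, 4, 3), 35.0),                                  # weekdays
    (datetime(2024, 4, 6), 60.0), (datetime(2024, 4, 7), 70.0),    # weekend
]
periodic = build_periodic_baseline(runs)
weekday_alert = is_anomalous(datetime(2024, 4, 4), 60.0, periodic)
weekend_alert = is_anomalous(datetime(2024, 4, 13), 60.0, periodic)
```

The same 60-minute run is flagged on a weekday but not on a weekend, which is the behavior a period-aware baseline is meant to provide.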

[0106] Step 8: Repair strategy matching;

[0107] Step 9: Determine if a repair strategy is matched. If yes, proceed to step 10. If no, generate an alarm message and notify the operation and maintenance terminal.

[0108] Step 10: Automatic Repair;

[0109] Step 11: Feedback on repair results;

[0110] Step 12, End.

[0111] This invention introduces a mechanism for evaluating automated operation anomalies based on health score calculation and for locating the root causes of anomalies based on dynamic baselines. This enables the system to adapt to changes in business cycles and data scale, achieving accurate anomaly identification and risk assessment, and significantly shifting operational actions from passive alarms to proactive protection.

[0112] Simultaneously, the system transforms raw, discrete monitoring data points into topological profiles rich in contextual relationships. This enables the system to understand the dependencies of automated operations, thereby identifying the impact chain of faulty operations and achieving a leap from indicator monitoring to business situation understanding, ultimately enabling precise root cause localization.

[0113] The following is a detailed description with reference to another embodiment.

[0114] Example 2

[0115] The abnormal monitoring device for automated operation provided in this embodiment includes multiple implementation units, each of which corresponds to a specific implementation step in the above embodiment one. The specific implementation method and beneficial effects can be referred to the foregoing method embodiment, and will not be repeated here.

[0116] Figure 3 is a schematic diagram of an optional anomaly monitoring device for automated operations according to an embodiment of the present invention. As shown in Figure 3, the anomaly monitoring device includes an acquisition unit 31, a calculation unit 32, an acquisition unit 33, and an identification unit 34, wherein:

[0117] The acquisition unit 31 is used to collect log data of automated operations and parse the log data to generate structured data.

[0118] Calculation unit 32 is used to calculate the health score of each automated operation based on structured data;

[0119] The acquisition unit 33 is used to identify automated jobs whose health score is less than a preset health threshold and to obtain abnormal jobs.

[0120] The identification unit 34 is used to extract the execution indicator data and dependency data of abnormal jobs, and to identify the root cause of abnormal jobs based on the execution indicator data and dependency data.

[0121] The aforementioned abnormal monitoring device for automated operations collects log data of automated operations through the acquisition unit 31, parses the log data, and generates structured data; the calculation unit 32 calculates the health score of each automated operation based on the structured data; the acquisition unit 33 identifies automated operations with health scores lower than a preset health threshold, thus identifying abnormal operations; and the identification unit 34 extracts the execution indicator data and dependency data of the abnormal operations, and identifies the root cause of the abnormality based on the execution indicator data and dependency data.

[0122] In this embodiment, for automated operations, log data during the execution process is collected in real time, parsed, and structured data is established to construct a health computing system. The health scores of each automated operation are comprehensively evaluated, and a preliminary anomaly assessment is conducted. For the identified abnormal operations, the root causes of the anomalies are identified based on the indicator data and the dependencies between operations. This enables proactive monitoring of automated operations, improves monitoring efficiency, breaks down the data silos in the identification of anomalies in automated operations, improves the accuracy of the root cause monitoring results, and thus solves the technical problem in related technologies where the anomaly monitoring of automated operations relies on manual execution, resulting in low monitoring efficiency.

[0123] Furthermore, the acquisition unit includes: a first query module, used to query the job data of each level through a retrieval script to obtain the hierarchical job data; and a first parsing module, used to parse the hierarchical job data according to the structured data template of each level to obtain structured data.

[0124] Furthermore, the calculation unit includes: a first calculation module, used to extract job execution progress data from structured data and calculate the progress health of automated jobs based on the job execution progress data; a second calculation module, used to extract job execution timeliness data from structured data and calculate the timeliness health of automated jobs based on the job execution timeliness data; a third calculation module, used to extract job quality data from structured data and calculate the quality health of automated jobs based on the job quality data; and a fourth calculation module, used to calculate the health score of automated jobs based on the progress health, the weight value corresponding to the progress health, the timeliness health, the weight value corresponding to the timeliness health, the quality health, and the weight value corresponding to the quality health.

[0125] Furthermore, the anomaly monitoring device for automated operations also includes: a first acquisition module for acquiring historical log data of each automated operation; and a first analysis module for analyzing the historical log data and constructing an indicator baseline profile for each automated operation based on the analysis results, wherein the indicator baseline profile includes at least one of the following: performance indicator baseline, log indicator baseline, and time series indicator baseline.

[0126] Furthermore, the identification unit includes: a first comparison module, used to compare the execution indicator data with the indicator baseline profile, and identify the indicator baseline deviation type of the abnormal operation based on the comparison result; a first identification module, used to identify automated operations with dependencies based on dependency data, obtain dependent operations, and obtain the abnormal situation of dependent operations; and a second identification module, used to identify the abnormal root cause of the abnormal operation based on the indicator baseline deviation type of the abnormal operation and the abnormal situation of the dependent operation.

[0127] Furthermore, the anomaly monitoring device for automated operations also includes: a first matching module, used to match the abnormal operation with an anomaly repair strategy based on the root cause of the anomaly to obtain a matching result; a first repair module, used to repair the abnormal situation of the abnormal operation based on the anomaly repair strategy when the matching result indicates that the abnormal operation and the anomaly repair strategy are successfully matched; and a first generation module, used to generate anomaly alarm information and send it to the operation and maintenance terminal when the matching result indicates that the abnormal operation and the anomaly repair strategy are not matched.

[0128] Furthermore, the anomaly monitoring device for automated operations also includes: a first creation module, used to create entity nodes based on the automated operation and to create a tree diagram based on the structured data corresponding to each entity node, wherein the tree diagram is updated based on the execution status of the automated operation, wherein the execution status includes at least one of the following: execution progress, health score, abnormal status, and root cause of the abnormality.

[0129] The aforementioned abnormal monitoring device for automated operations may also include a processor and a memory. The aforementioned acquisition unit 31, calculation unit 32, acquisition unit 33, identification unit 34, etc., are all stored in the memory as program units, and the processor executes the aforementioned program units stored in the memory to realize the corresponding functions.

[0130] The processor described above contains a kernel, which retrieves the corresponding program units from memory. One or more kernels can be configured, and their parameters can be adjusted to monitor for anomalies in automated tasks.

[0131] The aforementioned memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM, and the memory includes at least one memory chip.

[0132] According to another aspect of the present invention, a computer-readable storage medium is also provided. The computer-readable storage medium includes a stored computer program, wherein, when the computer program runs, it controls the device in which the computer-readable storage medium is located to perform any of the above-described anomaly monitoring methods for automated operations.

[0133] According to another aspect of the present invention, an electronic device is also provided, including one or more processors and a memory, the memory being used to store one or more programs, wherein, when the one or more programs are executed by the one or more processors, they cause the one or more processors to implement any of the above-described anomaly monitoring methods for automated operations.

[0134] According to another aspect of the present invention, a computer program product is also provided, the computer program product including a computer program, wherein, when the computer program is executed by a processor, it implements any of the above-described anomaly monitoring methods for automated operations.

[0135] This application also provides a computer program product, which, when executed on a data processing device, is suitable for executing an initialization program with the following method steps: collecting log data of automated jobs and parsing the log data to generate structured data; calculating the health score of each automated job based on the structured data; identifying automated jobs with health scores less than a preset health threshold to obtain abnormal jobs; extracting execution indicator data and dependency data of abnormal jobs, and identifying the root cause of abnormality of abnormal jobs based on the execution indicator data and dependency data.

[0136] This application also provides a computer program product that, when executed on a data processing device, is suitable for executing an initialization program with the following method steps: querying job data at each level through a retrieval script to obtain hierarchical job data; and parsing the hierarchical job data according to the structured data templates of each level to obtain structured data.

[0137] This application also provides a computer program product, which, when executed on a data processing device, is suitable for executing an initialization program having the following method steps: extracting job execution progress data from structured data and calculating the progress health of the automated job based on the job execution progress data; extracting job execution timeliness data from structured data and calculating the timeliness health of the automated job based on the job execution timeliness data; extracting job quality data from structured data and calculating the quality health of the automated job based on the job quality data; and calculating the health score of the automated job based on the progress health, the weight value corresponding to the progress health, the timeliness health, the weight value corresponding to the timeliness health, the quality health, and the weight value corresponding to the quality health.

[0138] This application also provides a computer program product that, when executed on a data processing device, is suitable for executing an initialization program with the following method steps: acquiring historical log data of each automated job; analyzing the historical log data and constructing an indicator baseline profile for each automated job based on the analysis results, wherein the indicator baseline profile includes at least one of the following: performance indicator baseline, log indicator baseline, and time series indicator baseline.

[0139] This application also provides a computer program product, which, when executed on a data processing device, is suitable for executing an initialization program with the following method steps: comparing execution indicator data with an indicator baseline profile, and identifying the indicator baseline deviation type of abnormal operations based on the comparison results; identifying automated operations with dependencies based on dependency data, obtaining dependent operations, and acquiring the abnormal conditions of dependent operations; and identifying the abnormal root cause of abnormal operations based on the indicator baseline deviation type of abnormal operations and the abnormal conditions of dependent operations.

[0140] This application also provides a computer program product, which, when executed on a data processing device, is suitable for executing an initialization program with the following method steps: matching the abnormal operation with an anomaly repair strategy based on the root cause of the anomaly, and obtaining a matching result; if the matching result indicates that the abnormal operation and the anomaly repair strategy are successfully matched, repairing the abnormal situation of the abnormal operation based on the anomaly repair strategy; or, if the matching result indicates that the abnormal operation and the anomaly repair strategy are not matched, generating an anomaly alarm message and sending the anomaly alarm message to the operation and maintenance terminal.

[0141] This application also provides a computer program product that, when executed on a data processing device, is suitable for executing an initialization program with the following method steps: creating entity nodes based on automated jobs, and creating a tree diagram based on the structured data corresponding to each entity node, wherein the tree diagram is updated based on the execution status of the automated jobs, wherein the execution status includes at least one of the following: execution progress, health score, abnormal status, and root cause of abnormality.

[0142] Figure 4 is a hardware structure block diagram of an electronic device (or mobile device) for performing an anomaly monitoring method for automated operations according to an embodiment of the present invention. As shown in Figure 4, the electronic device may include one or more processors (denoted in Figure 4 by 402a, 402b, ..., 402n; a processor may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 404 for storing data. In addition, it may include: a display, an input/output interface (I/O interface), a universal serial bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a keyboard, a power supply, and/or a camera. Those skilled in the art will understand that the structure shown in Figure 4 is for illustrative purposes only and does not limit the structure of the electronic device described above. For example, the electronic device may include more or fewer components than shown in Figure 4, or have a configuration different from that shown in Figure 4.

[0143] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0144] In the above embodiments of the present invention, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0145] In the several embodiments provided in this application, it should be understood that the disclosed technical content can be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of units can be a logical functional division, and in actual implementation there may be other division methods. For instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.

[0146] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0147] Furthermore, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0148] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, read-only memory (ROM), random access memory (RAM), portable hard drives, magnetic disks, or optical disks.

[0149] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. A method for anomaly monitoring of automated jobs, characterized in that the method comprises: collecting log data of automated jobs and parsing the log data to generate structured data; calculating a health score for each automated job based on the structured data; identifying automated jobs whose health scores are less than a preset health threshold as abnormal jobs; and extracting execution indicator data and dependency data of the abnormal jobs, and identifying the root cause of the anomaly of the abnormal jobs based on the execution indicator data and the dependency data.

2. The method according to claim 1, characterized in that parsing the log data to generate structured data comprises: querying job data at each level through a retrieval script to obtain hierarchical job data; and parsing the hierarchical job data according to the structured data template of each level to obtain the structured data.

3. The method according to claim 1, characterized in that calculating a health score for each automated job based on the structured data comprises: extracting job execution progress data from the structured data, and calculating a progress health score of the automated job based on the job execution progress data; extracting job execution timeliness data from the structured data, and calculating a timeliness health score of the automated job based on the job execution timeliness data; extracting job quality data from the structured data, and calculating a quality health score of the automated job based on the job quality data; and calculating the health score of the automated job based on the progress health score, the weight value corresponding to the progress health score, the timeliness health score, the weight value corresponding to the timeliness health score, the quality health score, and the weight value corresponding to the quality health score.

4. The method according to claim 1, characterized in that, before identifying the root cause of the anomaly of the abnormal jobs based on the execution indicator data and the dependency data, the method further comprises: obtaining historical log data of each automated job; and analyzing the historical log data, and constructing an indicator baseline profile for each automated job based on the analysis results, wherein the indicator baseline profile includes at least one of the following: a performance indicator baseline, a log indicator baseline, and a time-series indicator baseline.

5. The method according to claim 4, characterized in that identifying the root cause of the anomaly of the abnormal job based on the execution indicator data and the dependency data comprises: comparing the execution indicator data with the indicator baseline profile, and identifying the indicator baseline deviation type of the abnormal job based on the comparison results; identifying, based on the dependency data, the automated jobs that have a dependency relationship with the abnormal job to obtain dependent jobs, and obtaining the abnormal conditions of the dependent jobs; and identifying the root cause of the anomaly of the abnormal job based on the indicator baseline deviation type of the abnormal job and the abnormal conditions of the dependent jobs.

6. The method according to claim 1, characterized in that, after identifying the root cause of the anomaly of the abnormal job based on the execution indicator data and the dependency data, the method further comprises: matching an anomaly repair strategy to the abnormal job based on the root cause of the anomaly to obtain a matching result; if the matching result indicates that the abnormal job is successfully matched to an anomaly repair strategy, repairing the anomaly of the abnormal job based on the anomaly repair strategy; or, if the matching result indicates that the abnormal job fails to match an anomaly repair strategy, generating an anomaly alarm message and sending the anomaly alarm message to the operation and maintenance terminal.

7. The method according to claim 2, characterized in that, after generating the structured data, the method further comprises: creating entity nodes based on the automated jobs, and creating a tree diagram based on the structured data corresponding to each entity node, wherein the tree diagram is updated based on the execution status of the automated jobs, and the execution status includes at least one of the following: execution progress, health score, abnormal status, and root cause of the abnormality.

8. An anomaly monitoring device for automated jobs, characterized in that the device comprises: a data acquisition unit, used to collect log data of automated jobs and parse the log data to generate structured data; a calculation unit, used to calculate a health score for each automated job based on the structured data; an acquisition unit, used to identify automated jobs whose health scores are less than a preset health threshold, so as to obtain abnormal jobs; and an identification unit, used to extract execution indicator data and dependency data of the abnormal jobs, and to identify the root cause of the anomaly of the abnormal jobs based on the execution indicator data and the dependency data.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored computer program, wherein, when the computer program is executed, it controls the device on which the computer-readable storage medium is located to perform the anomaly monitoring method for automated operations as described in any one of claims 1 to 7.

10. An electronic device, characterized in that it comprises one or more processors and a memory, the memory being used to store one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the anomaly monitoring method for automated operations as described in any one of claims 1 to 7.
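For illustration only, and without limiting the claims, the weighted health score of claim 3 and the threshold comparison of claim 1 might be sketched in Python as follows. The sketch assumes that the three component scores lie on a common 0 to 100 scale and are combined as a normalized weighted sum; the weight values, the threshold, and the function names are hypothetical and are not specified in the claims.

```python
from typing import Dict, List, Mapping

# Hypothetical weights; the claims only require that each component has a weight.
DEFAULT_WEIGHTS: Mapping[str, float] = {
    "progress": 0.4,
    "timeliness": 0.3,
    "quality": 0.3,
}


def health_score(
    components: Mapping[str, float],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> float:
    """Combine component health scores into one score as a weighted average."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(components[name] * w for name, w in weights.items())
    # Normalizing keeps the result on the same scale as the inputs
    # even when the weights do not sum exactly to 1.
    return weighted / total_weight


def find_abnormal_jobs(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the ids of jobs whose health score is below the preset threshold."""
    return sorted(job_id for job_id, score in scores.items() if score < threshold)
```

With these assumed weights, a job with full progress health but only half timeliness and quality health lands between the two, so a single weak dimension lowers the score without dominating it.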