Method and system for anomaly detection in critical illness diagnosis and treatment data based on a causal graph model

By constructing a causal graph model, analyzing the temporal correlation and conditional mutual information of critical illness diagnosis and treatment data, performing energy allocation and calibration, and dynamically tracking abnormal propagation paths, the problem of misjudgment and missed detection of abnormalities caused by ignoring causal relationships in existing technologies is solved, achieving more efficient disease tracing and early warning.

CN122245830A · Pending · Publication Date: 2026-06-19 · ZHEJIANG YISHAN SMART MEDICAL RES CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: ZHEJIANG YISHAN SMART MEDICAL RES CO LTD
Filing Date: 2026-05-22
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing methods for detecting anomalies in critical illness diagnosis and treatment data ignore the complex causal relationships between physiological variables. As a result, anomaly energy leaks across propagation paths and the inferred propagation direction is biased, making it difficult to identify early signs of disease deterioration and to pinpoint the root cause pathway.

Method used

The method is based on a causal graph model. It constructs an initial causal graph by analyzing the temporal lag correlation and conditional mutual information structure between variables, performs energy allocation and calibration, dynamically tracks the path propagation differences of anomaly energy, identifies and updates the main path of anomaly propagation, and iteratively optimizes the model's anomaly identification capability.

Benefits of technology

By accurately identifying the true causal relationships between diagnostic and treatment indicators, the problems of hidden missed detections and misjudgments of transmission direction have been solved, improving the accuracy of abnormal detection in acute and critical illnesses and the consistency of source tracing, and providing a reliable basis for early clinical warning and precise intervention.

✦ Generated by Eureka AI based on patent content.

Figure CN122245830A_ABST

Abstract

This invention discloses a method and system for anomaly detection in critical illness diagnosis and treatment data based on a causal graph model, belonging to the field of medical information technology. It collects multi-source raw data from critical illness diagnosis and treatment, analyzes the temporal lag correlation and conditional mutual information structure among variables, and constructs an initial causal graph with data features as nodes. Based on the directed edge weights in the graph, it performs anomaly energy allocation and discreteness analysis to achieve path energy calibration. It extracts multi-branch nodes and dynamically tracks the differences in anomaly energy propagation, updating the main path to form a complete anomaly propagation trajectory. Based on the trajectory, it determines the target node and traces the root cause path in reverse, feeding the effective path back to the causal graph model to update structural parameters, continuously iterating and optimizing the anomaly identification capability. This forms a complete process of data modeling, energy calibration, trajectory tracking, root cause tracing, and model self-optimization, significantly improving the accuracy, reliability, and stability of anomaly detection and tracing in critical illnesses.

Description

Technical Field

[0001] This invention relates to the field of medical information technology, and in particular to a method and system for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model.

Background Technology

[0002] During the ICU treatment of critically ill patients, various monitors, ventilators, blood gas analyzers, and hospital information systems continuously generate massive amounts of multi-source heterogeneous data. This includes time-series data of vital signs such as heart rate, blood pressure, and blood oxygen saturation; laboratory test data such as complete blood count, inflammatory markers, and coagulation function; and clinical intervention records such as medication orders, fluid management, and mechanical ventilation parameters. These data collectively constitute a dynamic, high-dimensional dataset reflecting the evolution of the patient's pathophysiological state. How to timely and accurately identify early signs of deterioration and trace the root causes from this high-noise and high-missing-rate data of critical care patients has become a research hotspot and core technological challenge in the field of intelligent critical care.

[0003] However, most existing methods for detecting anomalies in acute and critical illness diagnosis and treatment data rely on univariate threshold detection or on global anomaly scoring based on statistical distributions, neglecting the complex causal relationships between physiological variables. First, when anomaly energy is contributed by multiple upstream causal nodes and propagates downstream through multiple overlapping paths in the causal graph, path contribution energy leakage occurs, making it difficult for the model to identify the abnormal sample and resulting in hidden missed detections. Second, existing causal graph models ignore the selectivity and bias of the true propagation direction. When the disease progresses rapidly, this bias can cause the model to route anomaly energy to secondary or non-pathogenic pathways, delaying the location of the true root cause of the deterioration.

Summary of the Invention

[0004] To address the shortcomings of existing technologies, this invention provides a method and system for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model, thus solving the problems mentioned in the background section.

[0005] To achieve the above objectives, the present invention provides the following technical solution. In a first aspect, an anomaly detection method for critical illness diagnosis and treatment data based on a causal graph model includes:
acquiring multi-source raw data from the diagnosis and treatment of critical illnesses, and constructing an initial causal graph, with each feature in the multi-source raw data as a node, by analyzing the temporal lag correlation and conditional mutual information structure among variables;
performing energy allocation and energy discrepancy analysis based on the connection weights of each directed edge in the initial causal graph to obtain the calibrated path energy;
extracting multi-branch nodes within the initial causal graph and analyzing the path propagation differences of anomaly energy during dynamic tracking, so as to identify and update the initial main path of anomaly propagation and obtain the anomaly propagation trajectory;
constructing target nodes, identifying root cause paths in combination with the anomaly propagation trajectory, and inputting the root cause paths into the causal graph model to complete the iterative optimization of the model's anomaly identification capability.

[0006] In a second aspect, an anomaly detection system for critical illness diagnosis and treatment data based on a causal graph model includes:
a data acquisition and preprocessing module, used to acquire multi-source raw data during the diagnosis and treatment of acute and critical illnesses, and to construct an initial causal graph, with each feature in the multi-source raw data as a node, by analyzing the temporal lag correlation and conditional mutual information structure between variables;
a path analysis module, used to perform energy allocation and energy dispersion analysis based on the connection weights of each directed edge in the initial causal graph, and to obtain the calibrated path energy;
an anomaly detection module, used to extract multi-branch nodes within the initial causal graph and analyze the path propagation differences of anomaly energy during dynamic tracking, so as to identify and update the initial main path of anomaly propagation and obtain the anomaly propagation trajectory;
an anomaly monitoring and optimization module, used to construct target nodes, identify root cause paths in combination with the anomaly propagation trajectory, and input the root cause paths into the causal graph model to complete the iterative optimization of the model's anomaly identification capability.

[0007] The above-described solution of the present invention has at least the following beneficial effects: This method, based on multi-source diagnostic and treatment data, eliminates dimensional differences and data noise through refined data preprocessing. It constructs an initial causal graph by combining time-lag correlation and conditional mutual information structures, accurately uncovering the true causal relationships between diagnostic and treatment indicators. This approach avoids the spurious interference caused by traditional methods that rely solely on correlation analysis, providing structural support for anomaly detection that aligns with clinical pathological patterns. Simultaneously, through energy allocation and calibration mechanisms, it effectively addresses the problem of hidden missed detections caused by significant overall anomalies but dispersed energy along single paths. Combined with dynamic optimization of multi-branch node path transmission resistance and preference coefficients, it avoids the risk of misjudging the direction of anomaly propagation, achieving precise tracking of anomaly propagation trajectories and efficient location of root cause paths. This improves the accuracy and consistency of anomaly detection in acute and critical illnesses, providing a reliable technical basis for early clinical warning and precise intervention.

[0008] By locating abnormal target nodes and root cause paths, and combining cross-validation mechanisms to ensure the effectiveness of the tracing results, the origin and propagation process of abnormalities can be clearly identified, effectively reducing the risk of missed detection and misdiagnosis in acute and critical illnesses, and helping to improve the quality of diagnosis and treatment and patient prognosis. At the same time, by iteratively updating the effective root cause paths to the causal graph model, the model can continuously adapt to the evolution of clinical pathology, continuously improve the ability and accuracy of abnormality identification, and solve the problem of traditional models being rigid and unable to adapt to dynamic changes in the condition.

Attached Figure Description

[0009] Figure 1 is a flowchart of the method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model according to the present invention. Figure 2 is a schematic diagram of the structure of the critical illness diagnosis and treatment data anomaly detection system based on a causal graph model according to the present invention. In the figures, the reference numerals denote the following components: data acquisition and preprocessing module 11, path analysis module 12, anomaly identification module 13, anomaly monitoring and optimization module 14.

Detailed Implementation

[0010] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0011] In the description of this invention, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of the stated features. In the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified.

[0012] In the description of this invention, the term "for example" is used to mean "used as an example, illustration, or description." Any embodiment described as "for example" in this invention is not necessarily to be construed as being more preferred or advantageous than other embodiments. The following description is provided to enable any person skilled in the art to make and use the invention. Details are set forth in the following description for purposes of explanation. It should be understood that those skilled in the art will recognize that the invention can be made without using these specific details. In other instances, well-known structures and processes will not be described in detail to avoid obscuring the description of the invention with unnecessary detail. Therefore, the invention is not intended to be limited to the embodiments shown, but is consistent with the broadest scope of the principles and features disclosed herein.

[0013] Example 1: As shown in Figure 1, this invention provides a method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model, comprising the following steps:
S100: Acquire multi-source raw data during the diagnosis and treatment of acute and critical illnesses, and construct an initial causal graph, with each feature in the multi-source raw data as a node, by analyzing the temporal lag correlation and conditional mutual information structure between variables.
S200: Based on the connection weights of each directed edge in the initial causal graph, perform energy allocation and energy discrepancy analysis to obtain the calibrated path energy.
S300: Extract multi-branch nodes within the initial causal graph and analyze the path propagation differences of anomaly energy during dynamic tracking, so as to identify and update the initial main path of anomaly propagation and obtain the anomaly propagation trajectory.
S400: Construct target nodes, identify root cause paths in combination with the anomaly propagation trajectory, and input the root cause paths into the causal graph model to complete the iterative optimization of the model's anomaly identification capability.

[0014] In S100, this invention constructs an initial causal graph using time-lag correlation and conditional mutual information structures. This overcomes the limitation of traditional methods, which only check whether indicator values are abnormal and ignore the time-series causal relationships between variables. The graph reconstructs the physiological transmission logic between indicators in the diagnosis and treatment of acute and critical illnesses, providing a structural basis consistent with pathological patterns for the analysis of anomaly energy distribution and path propagation. It avoids the interference of pseudo-causal associations that arises from relying on correlation analysis alone, and improves the reliability of anomaly judgment at the source.

[0015] Secondly, S200 performs energy allocation and discreteness analysis based on directed edge connection weights to obtain the calibrated path energy. It specifically targets the defect of path contribution energy leakage: weak anomaly energy scattered across multiple paths is calibrated and reasonably amplified, solving the missed detection of hidden anomalies in which the overall anomaly is significant but the energy on any single path is insufficient. Early deterioration signals and weak abnormal changes that were previously difficult to identify can thus be detected, significantly improving the sensitivity of anomaly detection in acute and critical illnesses. This is especially suitable for critical illness scenarios in which the condition changes rapidly and abnormal manifestations are hidden.

[0016] By extracting multi-branch nodes in S300 and dynamically updating the main path and propagation trajectory based on the differences in abnormal energy propagation, the defect of abnormal propagation direction selection bias is effectively solved. By comprehensively sorting the conduction resistance, preference coefficient and energy intensity of each path under multi-branch nodes, the misjudgment problem caused by judging the propagation direction based on a single rule is avoided. At the same time, dynamic tracking is achieved by combining a time-series sliding window, which can adjust the main path according to the real-time changes in the disease condition, making the abnormal propagation trajectory closer to the real pathological evolution process, and greatly improving the accuracy and stability of path identification.

[0017] Finally, S400 forms a closed loop of detection, tracking, tracing, and iteration by locating target nodes, tracing root cause paths, and feeding effective paths back to the causal graph model for iterative optimization. This mechanism accurately locates the origin of anomalies and the core transmission nodes, providing clear evidence for rapid clinical intervention, and it continuously refines the causal graph structure by introducing effective root cause paths. With continued use, the model gradually improves its anomaly recognition capability, accuracy, and tracing consistency, reducing the probability of misjudgment and missed detection. Overall, the method moves from passive anomaly recognition to active source tracing, and from a static model to dynamic self-optimization. While improving the accuracy of anomaly detection in the diagnosis and treatment of acute and critical illnesses, it also provides technical support for early clinical warning, precise intervention, and disease evolution analysis, addressing key problems in current anomaly detection for such data, including energy leakage, directional bias, model rigidity, and delayed warning.

[0018] Specifically, the initial causal graph construction process in S100 includes the following. The multi-source raw data from the diagnosis and treatment of acute and critical illnesses are preprocessed to obtain a standardized diagnosis and treatment dataset with unified dimensions and continuous time sequence. The preprocessing includes filling missing values, removing outliers, and normalization. The multi-source raw data include vital sign data collected in real time in the intensive care unit, laboratory test data, clinical medical order data, medical record text data, imaging examination data, and the like. Missing values within a time series are filled by linear interpolation from adjacent time points to ensure the continuity of the time-series data. Other missing data are filled, in accordance with clinical diagnosis and treatment guidelines, with the historical mean of patients with the same disease and similar conditions, to avoid the impact of missing data on the subsequent construction of the causal structure.

[0019] All data are normalized to convert indicators with different dimensions and value ranges to the [0,1] interval, eliminating interference caused by differences in dimensions. At the same time, outliers are initially screened, and extreme outliers are removed using the 3σ principle to ensure the reliability of the input data and provide standardized input data for subsequent causal graph structure learning.
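The preprocessing described in [0018] and [0019] can be sketched for a single time series as follows. This is a minimal illustration, not the patent's implementation: the function name `preprocess_series`, the order of the three steps, and the choice to re-fill removed outliers by interpolation (so the series stays continuous) are assumptions. Filling with a historical cohort mean is not shown.

```python
import numpy as np

def preprocess_series(values, n_sigma=3.0):
    """Remove 3-sigma outliers, interpolate gaps, then min-max scale to [0, 1]."""
    x = np.array(values, dtype=float)  # copy so the caller's data is untouched
    idx = np.arange(x.size)

    # 3-sigma rule on the observed (non-missing) values; outliers become gaps.
    observed = ~np.isnan(x)
    mean, std = x[observed].mean(), x[observed].std()
    if std > 0:
        x[observed & (np.abs(x - mean) > n_sigma * std)] = np.nan

    # Linear interpolation from adjacent time points (assumption: removed
    # outliers are re-filled the same way as originally missing values).
    valid = ~np.isnan(x)
    x = np.interp(idx, idx[valid], x[valid])

    # Min-max normalization to the [0, 1] interval.
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)
```

Note that with very short series a single extreme point cannot exceed three standard deviations, so the 3σ screen only becomes effective once the window holds enough samples.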

[0020] Based on the standardized diagnosis and treatment dataset, the temporal lag correlations and conditional mutual information structures among variables are analyzed to obtain effective causal connections. Using the effective causal connections as directed edges and each feature in the standardized dataset as a node, and taking the sign of the lag order into account, the initial causal graph and the connection weight of each directed edge are constructed. Each connection weight is the mutual information of the corresponding directed edge. Obtaining the effective causal connections includes the following steps.
For each variable pair, the centered time-series cross-correlation function is calculated at each lag order. The centered cross-correlation function uses the Pearson correlation coefficient to compute the cross-correlation coefficient lag by lag within a preset lag order range. Traversing all lag orders yields the complete cross-correlation coefficient sequence of the variable pair, which quantifies the strength of the temporal correlation between the variables, locates lag relationships, and provides a numerical basis for proposing candidate causal directions. The lag order range is set by the user according to the sampling frequency and the pathophysiological time scale of the critical illness diagnosis and treatment data. For example, if monitoring data are sampled at 1 Hz and critical illness changes typically unfold on the scale of minutes, the maximum lag order can be set to 60, covering time dependence within 1 minute.
The maximum absolute magnitude within the cross-correlation coefficient sequence is extracted. If it exceeds a preset significance threshold, the variable pair is recorded as a related variable pair. The lag order at the peak of the cross-correlation coefficient sequence is then determined, where the peak is the maximum absolute magnitude within the sequence. If this lag order is not equal to 0, the related variable pair is deemed to have an asynchronous lag dependency and is added to the candidate variable pair set.
Based on the sign of the lag order of each related variable pair in the candidate set, candidate causal directions are marked by the temporal precedence rule to form the directed edge candidate set, where X and Y denote different variables:
If the lag order of the pair (X, Y) is greater than 0, the change of X precedes the change of Y, and the candidate directed edge is marked X→Y.
If the lag order of the pair (X, Y) is less than 0, the change of Y precedes the change of X, and the candidate directed edge is marked Y→X.
Synchronously dependent pairs with a lag order of 0 are eliminated and produce no directed edge.
All candidate directed edges with a definite direction are collected to form the directed edge candidate set of the initial causal graph.
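The lag scan and direction rule of [0020] can be sketched as below. This is an illustrative reading under stated assumptions: the names `lagged_pearson` and `candidate_edge`, the default significance threshold of 0.5, and the convention that a positive lag pairs x[t] with y[t + lag] are all hypothetical choices, not values given in the text.

```python
import numpy as np

def lagged_pearson(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag]; lag may be negative."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if lag > 0:
        a, b = x[:-lag], y[lag:]
    elif lag < 0:
        a, b = x[-lag:], y[:lag]
    else:
        a, b = x, y
    a, b = a - a.mean(), b - b.mean()          # centering
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def candidate_edge(x, y, max_lag=60, significance=0.5, names=("X", "Y")):
    """Return (source, target, lag) for a lagged dependency, else None."""
    lags = range(-max_lag, max_lag + 1)
    coeffs = [lagged_pearson(x, y, k) for k in lags]
    peak = int(np.argmax(np.abs(coeffs)))      # maximum absolute magnitude
    peak_lag, peak_value = lags[peak], coeffs[peak]
    # Drop weak pairs and synchronous (lag 0) pairs: no directed edge.
    if abs(peak_value) <= significance or peak_lag == 0:
        return None
    # Positive lag: the first variable leads; negative lag: the second leads.
    src, dst = names if peak_lag > 0 else names[::-1]
    return (src, dst, peak_lag)
```

For instance, if `y` is a copy of `x` delayed by five samples, `candidate_edge(x, y)` is expected to report an X→Y candidate at lag 5, while `candidate_edge(x, x)` yields no edge because the peak sits at lag 0.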

[0021] The maximum absolute magnitude is the largest absolute value among the correlation coefficients at all lag orders in the cross-correlation coefficient sequence computed for two diagnosis and treatment variables. It is a quantitative indicator of the strongest correlation between the two indicators.
A fixed-length sliding window is set, and the mutual information of each related variable pair in the candidate variable pair set is calculated within each window. The features of the standardized diagnosis and treatment dataset other than the two variables form the condition set, which is used to analyze the conditional dependency structure between the related variables and obtain the conditional mutual information. If the difference between the mutual information and the conditional mutual information exceeds the preset independence threshold, a stable conditional dependency is deemed to exist and the variable pair is retained as an effective causal connection. Otherwise, the variables are deemed merely correlated and approximately conditionally independent given the condition set, and the pair is marked as a pseudo-association pair and removed.
As an example, suppose the current sliding window contains a set of standardized data samples, the candidate variable pair is heart rate X and blood pressure Y, and the condition set S consists of the remaining features: respiratory rate, blood oxygen saturation, and body temperature. Mutual information and conditional mutual information are calculated using Shannon entropy.

[0022] First, the discretized values of X and Y within the window are tallied to obtain the probability distribution p(X) of X, the probability distribution p(Y) of Y, and the joint probability distribution p(X,Y). From these, the information entropy of X is H(X) = 0.85, the information entropy of Y is H(Y) = 0.92, and the joint entropy is H(X,Y) = 1.10. Substituting into the mutual information formula MI(X,Y) = H(X) + H(Y) - H(X,Y) gives MI(X,Y) = 0.85 + 0.92 - 1.10 = 0.67.
Then the condition set S is introduced, and the conditional entropy H(X|S) of X, the conditional entropy H(Y|S) of Y, and the joint conditional entropy H(X,Y|S) are calculated, giving H(X|S) = 0.42, H(Y|S) = 0.48, and H(X,Y|S) = 0.61. Substituting into the conditional mutual information formula MI(X,Y|S) = H(X|S) + H(Y|S) - H(X,Y|S) gives MI(X,Y|S) = 0.42 + 0.48 - 0.61 = 0.29.
Next, the difference between the mutual information and the conditional mutual information is calculated: MI(X,Y) - MI(X,Y|S) = 0.67 - 0.29 = 0.38. This difference is compared with the preset independence threshold of 0.20. Since 0.38 is higher than 0.20, a stable conditional dependency is deemed to exist between X and Y, and the variable pair is retained as an effective causal connection. If the difference were lower than the independence threshold, the pair would be deemed a pseudo-association pair and removed.
By traversing all sliding windows and averaging the per-window results, globally stable estimates of mutual information and conditional mutual information are obtained, completing the screening of effective causal connections.
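The arithmetic of the worked example in [0022] can be checked with a few lines of Python. The function names are hypothetical; the entropy values and the 0.20 threshold are the ones quoted in the text.

```python
def mutual_information(h_x, h_y, h_xy):
    """MI = H(X) + H(Y) - H(X,Y); the same form applies to conditional entropies."""
    return h_x + h_y - h_xy

def is_effective_connection(mi, cmi, independence_threshold=0.20):
    """Decision rule from the text: keep the pair if MI - CMI exceeds the threshold."""
    return (mi - cmi) > independence_threshold

mi = mutual_information(0.85, 0.92, 1.10)    # about 0.67
cmi = mutual_information(0.42, 0.48, 0.61)   # about 0.29
keep = is_effective_connection(mi, cmi)      # difference is about 0.38
```

Because these are floating-point sums, the results match 0.67, 0.29, and 0.38 only up to rounding error, so comparisons against the threshold should not rely on exact equality.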

[0023] The probability distribution of a variable is obtained by discretizing its values within a window into segments and dividing the count in each segment by the total number of points in the window. It describes the uncertainty of the value of a diagnostic indicator. The joint probability distribution is obtained by observing the value combinations of the two variables (X, Y) at the same time points and computing the frequency of each combination. It is used to calculate the joint entropy and describes the uncertainty of the two indicators fluctuating together. Information entropy is obtained with the Shannon entropy formula and measures the degree of fluctuation and disorder of a single diagnostic indicator: the higher the entropy, the more unstable and abnormal the indicator is taken to be. Joint entropy is computed by summing -p(x,y) log p(x,y) over all (X, Y) combinations and measures the overall disorder of the two indicators changing together. Mutual information measures the strength of the association between two indicators: the greater the mutual information, the stronger the relationship. It is used as the connection weight of the directed edges in the causal graph.

[0024] Conditional entropy is the uncertainty remaining in X or Y when the other features are fixed, and it is used to determine whether a correlation is spurious. Joint conditional entropy measures the joint uncertainty of X and Y after controlling for the other features. Conditional mutual information measures the association remaining between X and Y after excluding the interference of the other features. Mutual information is obtained from Shannon entropy in information theory and quantifies the strength of the statistical dependency between two variables. Conditional mutual information is obtained through the standard extension to conditional entropy and quantifies whether two variables still have a statistical relationship of their own when all other features are known. The calculation is carried out over all windows, and the per-window results are averaged to obtain globally stable estimates of mutual information and conditional mutual information. The difference between these two estimates is compared with the independence threshold to determine the effective causal connection candidates.
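The quantities defined in [0023] and [0024] can be estimated from discretized data with simple frequency counts. The sketch below is one possible plug-in estimator, not the patent's implementation. Assumptions: base-2 logarithms, equal-width bins on [0, 1], non-overlapping windows, and a single discrete conditioning column `s`. To condition on several features, zip them into one column of tuples first.

```python
from collections import Counter
import numpy as np

def entropy(*columns):
    """Plug-in Shannon entropy (bits) of the joint distribution of the columns."""
    counts = Counter(zip(*columns))
    p = np.array(list(counts.values()), dtype=float) / len(columns[0])
    return float(-(p * np.log2(p)).sum())

def discretize(values, bins=4):
    """Equal-width binning of values assumed to be normalized to [0, 1]."""
    edges = np.linspace(0.0, 1.0, bins + 1)[1:-1]
    return np.digitize(values, edges).tolist()

def mi(x, y):
    """MI(X,Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def cmi(x, y, s):
    """H(X|S) + H(Y|S) - H(X,Y|S), expanded using H(A|S) = H(A,S) - H(S)."""
    return entropy(x, s) + entropy(y, s) - entropy(x, y, s) - entropy(s)

def windowed_mean(fn, window, *series):
    """Average fn over consecutive non-overlapping windows of the series."""
    n = len(series[0])
    vals = [fn(*(col[i:i + window] for col in series))
            for i in range(0, n - window + 1, window)]
    return float(np.mean(vals))
```

Plug-in estimates are biased when a window holds few samples relative to the number of bins, which is one reason the text averages over many windows before applying the threshold.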

[0025] In S100, by constructing the initial causal graph, this embodiment of the invention further addresses key issues in traditional causal graph construction, such as poor data quality, inaccurate identification of causal associations, and interference from spurious associations. This improves the accuracy and reliability of anomaly detection and tracing at the source. First, in the data preprocessing stage, given the complexity of multi-source raw data from acute and critical illnesses, targeted missing value filling, outlier removal, and normalization are adopted. Linear interpolation from adjacent time points ensures the continuity of the time-series data, and the historical mean of patients with the same disease and similar conditions fills other missing data, avoiding interference from missing data in the construction of the causal structure. At the same time, extreme outliers are eliminated by the 3σ principle and all data are normalized to the [0,1] interval, solving the causal identification bias caused by disordered data and inconsistent dimensions in traditional methods.

[0026] Secondly, in the process of identifying effective causal connections, based on a standardized dataset, the Pearson correlation coefficient algorithm is used to calculate the centered time-series cross-correlation function under different lag orders, resulting in a complete time-series cross-correlation coefficient sequence. This can accurately quantify the time-series correlation strength between variables and locate lag relationships, providing solid numerical support for determining the causal direction. At the same time, by screening for associated variable pairs with the maximum absolute amplitude exceeding the significance threshold and lag order not equal to 0, synchronously correlated pseudo-associated pairs are effectively eliminated, ensuring that all candidate variable pairs have a real asynchronous lag dependency relationship.

[0027] Based on this, the mutual information and conditional mutual information of each pair of related variables are calculated by a fixed-length sliding window. Combined with the Shannon entropy algorithm and the conditional entropy extension method, stable and effective causal connections are further screened out based on whether the difference between mutual information and conditional mutual information is higher than the independence threshold. This completely eliminates pseudo-related pairs that are simply correlated or conditionally independent, ensuring that the constructed causal connections conform to the real physiological transmission logic between critical illness diagnosis and treatment indicators.

[0028] Finally, with effective causal connections as directed edges and each feature as a node, the initial causal graph and the connection weights of its directed edges are constructed in combination with the sign of the lag order. The initial causal graph thus both depicts the temporal causal relationships between the diagnostic and treatment indicators and quantifies the strength of each causal association through its connection weight. This provides a structural basis consistent with pathological laws for the subsequent energy allocation in S200, abnormal propagation path analysis in S300, and root cause tracing in S400. It avoids the downstream energy leakage and direction selection bias caused by the imprecise structure and inaccurate associations of traditional causal graphs, and improves the reliability and accuracy of the entire anomaly detection method at the source, supporting the subsequent identification of abnormal propagation paths and the location of root cause nodes. It also brings the technical solution closer to the clinical reality of critical illness diagnosis and treatment, and adapts it to data that are multi-source, complex, and strongly time-dependent.

[0029] Specifically, S200: Based on the connection weights of each directed edge in the initial causal graph, energy allocation and energy discrepancy analysis are performed to obtain the calibrated path energy. The specific steps include: Based on the standardized diagnosis and treatment dataset and the connection weights of each directed edge in the initial causal graph, the overall sample deviation is calculated from the reconstruction error, and energy is allocated according to the weights to obtain the initial energy of each path and the potential energy leakage samples. In practice, the deviation of each diagnostic data sample from the normal data distribution is calculated as a mean square error, which is used as the total abnormal energy quantifying the overall abnormality of the sample. If the total abnormal energy exceeds the preset energy threshold, the sample as a whole is considered to have reached an abnormal level. The total abnormal energy is then allocated to each causal path in proportion to the initial path weights in the causal graph; that is, the connection weight of each path is multiplied by the total abnormal energy to obtain the initial energy of that path. The initial energy allocation of each path is recorded, and the difference between the initial energy of each path and the preset detection threshold is calculated. It is then determined whether the energy of the multiple paths falls below the detection threshold; if so, the sample is marked as a potential energy leakage sample. Such a sample has a total abnormal energy at an abnormal level, but after allocation by path weight the abnormal energy on each path is below the detection threshold, so a single-path detection mechanism cannot identify it.
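The allocation and leakage check can be sketched as follows. The example assumes the normal distribution is summarized by a per-feature mean vector, that path weights are rescaled to sum to 1 before allocation (the source only says energy is allocated by weight ratio), and that a leakage sample is one whose every path falls below the detection threshold. All names are hypothetical.

```python
def total_anomalous_energy(sample, normal_mean):
    """Mean square error between a sample and the normal reference values."""
    return sum((s - m) ** 2 for s, m in zip(sample, normal_mean)) / len(sample)

def allocate_energy(total_energy, path_weights):
    """Split the total energy across paths in proportion to connection weight."""
    weight_sum = sum(path_weights.values())
    return {path: total_energy * w / weight_sum
            for path, w in path_weights.items()}

def is_potential_leakage(total_energy, path_energy,
                         energy_threshold, detection_threshold):
    """Abnormal overall, yet no single path reaches the detection threshold."""
    overall_abnormal = total_energy > energy_threshold
    all_paths_low = all(e < detection_threshold for e in path_energy.values())
    return overall_abnormal and all_paths_low
```

With a sample `[0.9, 0.1, 0.5]`, a normal mean of `[0.5, 0.5, 0.5]`, and weights `0.5 / 0.3 / 0.2`, the total energy is about 0.107 and the largest path share is about 0.053, so an energy threshold of 0.1 and a detection threshold of 0.06 would flag the sample as potential leakage.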

[0030] The diagnostic data sample consists of all feature values within each sliding window. The normal data distribution refers to the multivariate joint distribution obtained by statistical analysis of historical critical illness diagnosis and treatment data from stable periods without anomaly labels. This distribution is composed of the mean, variance, and covariance structure, or the kernel density estimation results, of the normal sample set. It characterizes the typical value range of each diagnosis and treatment indicator and the correlation between variables in the stable state of the disease, and serves as the reference benchmark for judging whether the current sample deviates from the normal state. For potential energy leakage samples, the degree of energy dispersion is analyzed based on the initial causal graph and the initial energy of each path to obtain the dilution coefficient of each path. The dilution coefficients of the paths are then weighted and summed according to the connection weights to obtain the energy dilution coefficient, which reflects the overall degree to which the abnormal energy of the sample is dispersed and diluted across multiple paths.

[0031] The dilution factor is obtained by multiplying the difference between the detection threshold and the initial energy by the path length. It reflects the gap between the initial energy and the detection threshold of a specific causal path in a potential energy leakage sample. Combined with the path length, it is used to quantify the degree to which the anomalous energy is severely dispersed and weakened. The difference between the detection threshold and the initial energy is the energy gap, which represents how much anomalous energy the path should have carried but is missing due to dilution. The longer the path, the easier it is for the energy to be diluted, and the higher the degree of dilution.
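A minimal sketch of the two quantities follows: the per-path dilution factor (energy gap multiplied by path length) and the sample-level energy dilution coefficient (connection-weighted sum of the per-path factors). The dictionary layout and function names are assumptions for illustration.

```python
def dilution_factor(initial_energy, detection_threshold, path_length):
    """Energy gap to the detection threshold, scaled by path length."""
    return (detection_threshold - initial_energy) * path_length

def energy_dilution_coefficient(path_energy, path_weights, path_lengths,
                                detection_threshold):
    """Connection-weighted sum of the per-path dilution factors."""
    return sum(
        path_weights[p] * dilution_factor(path_energy[p],
                                          detection_threshold,
                                          path_lengths[p])
        for p in path_energy
    )
```

For two paths with initial energies 0.04 and 0.02, weights 0.6 and 0.4, lengths 2 and 3, and a detection threshold of 0.06, the per-path factors are 0.04 and 0.12 and the energy dilution coefficient is 0.6 × 0.04 + 0.4 × 0.12 = 0.072.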

[0032] The value range of the energy dilution coefficient is mapped to a preset coefficient range, and the energy calibration coefficient is obtained by linear mapping. Specifically, all potential energy leakage samples are traversed, the energy dilution coefficient of each sample is calculated, and the minimum and maximum values are taken to obtain the value range of the energy dilution coefficient. Min-max normalization is then applied to the energy dilution coefficient, and the normalized energy dilution coefficient is linearly mapped to a preset coefficient range, such as [1.2, 2.0], to obtain the mapping coefficient. The mapping coefficient is truncated at the upper and lower bounds; for example, if the mapping coefficient is less than 1.2, it is set to 1.2. The truncated mapping coefficient is used as the energy calibration coefficient. This coefficient is the amplification factor obtained from the mapping relationship built on the overall energy dilution coefficient. It is used to numerically amplify the initial energy of each path and is the direct execution parameter for subsequent energy calibration.
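The mapping can be written in a few lines. The sketch below takes the energy dilution coefficients of all leakage samples and returns one calibration coefficient per sample; the default range [1.2, 2.0] is the example range given in the text, while the handling of the degenerate case where all coefficients are equal is an assumption.

```python
def calibration_coefficients(dilution_coeffs, low=1.2, high=2.0):
    """Min-max normalize, map linearly to [low, high], then clip to the bounds."""
    d_min, d_max = min(dilution_coeffs), max(dilution_coeffs)
    span = d_max - d_min
    result = []
    for d in dilution_coeffs:
        # If every sample has the same dilution, fall back to the lower bound (assumption).
        normalized = 0.0 if span == 0 else (d - d_min) / span
        mapped = low + normalized * (high - low)
        result.append(min(max(mapped, low), high))
    return result
```

For dilution coefficients `[0.0, 0.5, 1.0]` this yields approximately `[1.2, 1.6, 2.0]`.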

[0033] The initial energy of each path is multiplied by the corresponding energy calibration coefficient to obtain the calibrated path energy.

[0034] Specifically, the initial energy of each path is multiplied by the corresponding energy calibration coefficient, and then compared with the preset detection threshold to determine whether there is a single path whose energy is not lower than the threshold. If so, the energy calibration of the sample is completed. If there is still a case where the energy of all paths is lower than the threshold, the value of the calibration coefficient is adjusted, for example, by increasing it by 0.1, and the calibration is repeated until at least one path has an energy that reaches or exceeds the threshold.
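The repeat-until-detectable loop can be sketched as below. It assumes one calibration coefficient per sample applied to all of its paths, uses the 0.1 increment from the example in the text, and adds a hypothetical `max_rounds` guard so the loop cannot run indefinitely.

```python
def calibrate_until_detectable(initial_energy, coefficient, detection_threshold,
                               step=0.1, max_rounds=100):
    """Raise the coefficient in fixed steps until some path reaches the threshold."""
    if max(initial_energy.values()) <= 0:
        raise ValueError("no path carries positive energy to amplify")
    for k in range(max_rounds):
        current = coefficient + k * step
        calibrated = {p: e * current for p, e in initial_energy.items()}
        if any(e >= detection_threshold for e in calibrated.values()):
            return calibrated, current
    raise RuntimeError("threshold not reached within max_rounds")
```

Starting from a coefficient of 1.2 with path energies 0.04 and 0.02 and a detection threshold of 0.059, the loop stops at a coefficient of about 1.5, where the stronger path reaches roughly 0.06.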

[0035] In addition, the calibration results are verified by selecting 100-200 known acute and critical illness samples with occult abnormalities, i.e., samples with energy leakage, and running detection on the calibrated path energy. The change in the false negative rate before and after calibration is recorded to confirm that the calibration mechanism effectively addresses the energy leakage problem. The calibrated path energy serves as the input for the subsequent judgment of abnormal propagation direction.

[0036] In this embodiment of the invention, S200 first uses the connection weights of the initial causal graph as a basis, combined with the normal distribution of standardized diagnostic and treatment data as a reference, and determines the total abnormal energy by calculating the mean square error between the sample and the normal distribution, accurately quantifying the overall abnormality. Then, the initial energy is allocated according to the connection weight ratio of each path, and at the same time, potential energy leakage samples are accurately identified, that is, the overall abnormality meets the standard but the energy of a single path does not reach the threshold, avoiding the missed detection of hidden abnormalities caused by energy dispersion.

[0037] For this type of energy leakage sample, a dilution coefficient is obtained by analyzing the degree of energy dispersion. A calibration mechanism is constructed by combining path length and connection weight. The dilution coefficient is transformed into an executable energy calibration coefficient through linear mapping to amplify and calibrate the initial energy. At the same time, multi-window verification is used to ensure the calibration effect. This not only solves the problem that the energy of a single path is too low to be identified, but also avoids misjudgment caused by excessive energy amplification.

[0038] The entire calibration process closely integrates the temporal and correlational characteristics of critical illness diagnosis and treatment data, without deviating from clinical practice. The calibrated pathway energy can accurately reflect the degree of abnormality in each pathway, providing precise quantitative basis for subsequent main pathway identification and abnormality tracing. It effectively makes up for the problems of missed detection and misjudgment caused by energy dispersion and lack of calibration in traditional detection. At the same time, it provides reliable energy data support for subsequent pathway priority ranking and root cause localization, making the entire abnormality detection process more in line with clinical pathological laws and realizing a closed-loop connection from data preprocessing to energy calibration.

[0039] Specifically, S300: Extract the multi-branch nodes within the initial causal graph and analyze the path propagation differences of anomalous energy during dynamic tracing. The specific steps include: Traverse all nodes in the initial causal graph, filter out the multi-branch nodes that have two or more output paths, and determine the set of output paths corresponding to each multi-branch node. A multi-branch node is a node that points to two or more different downstream paths simultaneously. Its output path set is the set of all causal paths that lead directly from the multi-branch node to downstream nodes; it limits the possible directions of anomaly propagation and provides a unified calculation object for quantifying the conduction resistance, calculating the direction preference, and prioritizing paths within the same group. Based on the path length and average connection weight of each output path in the output path set, the combined hindering effect of conduction distance and overall causal association strength on energy propagation is analyzed to obtain the conduction resistance value of each output path. The average connection weight represents the overall transmission strength of the entire causal path. The path length is the total number of nodes contained in a single output path in the output path set of a multi-branch node; it is tied directly to that output path and characterizes the conduction distance and energy attenuation of the path.

[0040] The average connection weight of the path is inverted, i.e., subtracted from 1, so that paths with lower weights receive higher values. The conduction resistance value of the path is then calculated as a weighted sum of the inverted average connection weight and the path length. This value reflects how easily anomalous energy propagates along the path and is used to determine which path the anomalous energy will preferentially follow, thus addressing the direction selection bias problem. A lower resistance value indicates that the path is more likely to become the preferred path for anomalous propagation, providing a quantitative basis for the subsequent judgment of propagation direction preference.
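A possible form of the weighted sum is sketched below. The text does not give the weighting factors or say how the path length is scaled, so the sketch assumes weights `alpha` and `beta` that sum to 1 and a path length normalized by the longest output path of the same multi-branch node, which keeps the resistance in [0, 1]. These choices and the function name are assumptions.

```python
def conduction_resistance(edge_weights, path_length, max_length,
                          alpha=0.5, beta=0.5):
    """Weighted sum of (1 - mean connection weight) and normalized path length."""
    mean_weight = sum(edge_weights) / len(edge_weights)
    inverted_weight = 1.0 - mean_weight          # weaker paths resist more
    normalized_length = path_length / max_length  # longer paths resist more
    return alpha * inverted_weight + beta * normalized_length
```

For a path with edge weights 0.8 and 0.6 and 3 nodes, where the longest sibling path has 4 nodes, the resistance is 0.5 × 0.3 + 0.5 × 0.75 = 0.525.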

[0041] The average connection weight refers to the average of the connection weights of multiple directed edges in the corresponding path, which is used to comprehensively reflect the weight ratio of each path. For all output paths under each multi-branch node, preference coefficients are obtained through negative correlation mapping based on the transmission resistance value. The preference coefficients are then adjusted in conjunction with different time stages of acute and critical illnesses to form an abnormal propagation direction preference matrix. This matrix stores the preference coefficients with multi-branch nodes as rows and output paths as columns, and is used to globally and uniformly describe the direction preference and priority distribution of abnormal propagation.

[0042] The different time stages of acute and severe illness include the initial stage of onset, the disease progression stage, and the treatment and recovery stage. The preference coefficient represents the degree to which abnormal energy tends to choose the corresponding output path for propagation at a multi-branch node, and the preference coefficients of all paths under the same node sum to 1. It determines the path along which the abnormality propagates first, addresses the problem of direction selection bias, and provides a basis for subsequent abnormal source tracing and path ranking. It is obtained as follows: the transmission resistance value is inverted to give the transmission smoothness, i.e., 1 minus the transmission resistance value; the smoothness values of all paths under the same node are summed; and the transmission smoothness of each path is divided by that sum to obtain its preference coefficient. Subsequently, based on clinical pathological transmission patterns, correction coefficients are set for the pathways at each stage. These coefficients can be obtained through the Delphi method with clinical experts, statistical methods using real clinical data, or the analytic hierarchy process. For example, in the early stage of the disease, the pathways are slightly enhanced, with a coefficient of 1.05–1.1; in the disease progression stage, the pathways are significantly enhanced, with a coefficient of 1.08–1.1; and in the stable or recovery stage, the recovery pathways are slightly enhanced while the pathways are inhibited, with a coefficient of 0.9–1.0. For paths that need to be strengthened or weakened at the corresponding stage, the preference coefficient is multiplied by the corresponding correction coefficient so that the abnormal propagation direction is consistent with the clinical pathological evolution law.
Based on the magnitude of the preference coefficients within the anomaly propagation direction preference matrix, the output paths of each multi-branch node are sorted in descending order to obtain a preliminary priority. The preliminary priority is then corrected by combining the calibrated path energy to form an anomaly propagation path priority list. This list will be directly used for subsequent anomaly localization and source tracing analysis. Specifically, when the difference in preference coefficients of different paths under the same multi-branch node is less than a preset threshold, it is difficult to achieve stable and effective priority differentiation by relying solely on propagation tendency. Therefore, calibrated path energy is introduced for secondary correction. Path energy can reflect the actual intensity of anomalies on the corresponding path. The higher the energy, the more significant the anomaly performance of the path. Therefore, the priority of paths with higher energy is increased, so that the ranking of anomaly propagation paths takes into account both propagation tendency and actual anomaly intensity, thereby improving the stability and rationality of priority division.
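The preference calculation and the energy-based secondary correction can be sketched as follows for the output paths of one multi-branch node. The sketch assumes resistance values in [0, 1], leaves the corrected coefficients un-renormalized (the text does not say whether they are rescaled to sum to 1 again), and resolves near-ties by swapping adjacent paths when the lower-ranked one carries more calibrated energy. Function and parameter names are hypothetical.

```python
def preference_coefficients(resistance):
    """Smoothness (1 - resistance) of each path divided by the node's total smoothness."""
    smoothness = {p: 1.0 - r for p, r in resistance.items()}
    total = sum(smoothness.values())
    return {p: s / total for p, s in smoothness.items()}

def apply_stage_correction(preference, correction):
    """Multiply selected paths by their stage correction coefficient (default 1.0)."""
    return {p: c * correction.get(p, 1.0) for p, c in preference.items()}

def rank_paths(preference, calibrated_energy, tie_threshold):
    """Sort by preference, then let energy decide between near-tied neighbours."""
    ranked = sorted(preference, key=preference.get, reverse=True)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(ranked) - 1):
            a, b = ranked[i], ranked[i + 1]
            near_tie = abs(preference[a] - preference[b]) < tie_threshold
            if near_tie and calibrated_energy[b] > calibrated_energy[a]:
                ranked[i], ranked[i + 1] = b, a
                swapped = True
    return ranked
```

With resistances 0.40, 0.42, and 0.90 for paths A, B, and C, the preferences are about 0.469, 0.453, and 0.078. If B carries more calibrated energy than A and the tie threshold is 0.05, B moves ahead of A while C stays last.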

[0043] Identify and update the initial main path of anomaly propagation to obtain the anomaly propagation trajectory, including: Read the priority list of anomaly propagation paths and the calibrated path energy, and determine the initial main path of anomaly propagation by priority traversal; In practice, the highest priority output path in the abnormal propagation path priority list is selected, and it is determined whether its calibrated path energy is lower than the detection threshold. If it is not lower, the corresponding output path is taken as the initial main path. If it is lower, the next output path is selected in order of priority, and the detection threshold judgment is repeated until the initial main path is determined. The node sequence, energy value and preference coefficient of the initial main path are recorded to clarify the starting node and propagation direction, providing a benchmark for subsequent dynamic tracking.
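The priority traversal reduces to a short loop, sketched here under the assumption that the priority list is ordered from highest to lowest priority and that calibrated energies are stored per path; returning `None` when no path qualifies is an added convention, not something the text specifies.

```python
def select_initial_main_path(priority_list, calibrated_energy, detection_threshold):
    """Return the highest-priority path whose calibrated energy is not below the threshold."""
    for path in priority_list:
        if calibrated_energy[path] >= detection_threshold:
            return path
    return None  # no path currently qualifies (assumed convention)
```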

[0044] The initial main path is used to avoid unfounded random selections in multi-branch, multi-optional paths, thus mitigating the problem of biased selection of abnormal propagation direction from the source.

[0045] Based on the initial main path, the abnormal propagation process is dynamically tracked by updating the path energy and preference coefficient window by window, so as to update the abnormal propagation path priority list and the initial main path switching. Specifically, the current standardized diagnosis and treatment data is extracted window by window to update the calibrated path energy of each path in real time, and it is determined whether the current main path energy is still higher than the detection threshold. If the initial main path energy is lower than the threshold, it is determined that the path no longer has the conditions to dominate the propagation of abnormalities. The path with the highest priority that meets the threshold energy is selected from the current path priority list as the new main path, and the main path switching is completed.
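Window-by-window main path switching can be illustrated as follows. The sketch assumes each window supplies a dictionary of calibrated path energies and that the priority list is fixed for the duration of the example; in the method it is itself updated per window. Keeping the current main path when no candidate meets the threshold is an assumption.

```python
def track_main_path(windows, priority_list, detection_threshold, initial_main):
    """Return [(window_index, main_path)] after applying the switching rule per window."""
    main = initial_main
    trajectory = []
    for index, energy in enumerate(windows):
        if energy.get(main, 0.0) < detection_threshold:
            # Current main path lost its energy support: look for a replacement.
            replacement = next(
                (p for p in priority_list
                 if energy.get(p, 0.0) >= detection_threshold),
                None,
            )
            if replacement is not None:
                main = replacement
        trajectory.append((index, main))
    return trajectory
```

The returned list is a simplified stand-in for the anomaly propagation trajectory, recording only which path is the main path in each window.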

[0046] Meanwhile, within each sliding window, newly added nodes and new relationships are identified, the transmission resistance of newly emerging potential paths is calculated, the corresponding preference coefficients are obtained, and they are inserted into the original path priority list for reordering, thereby realizing the dynamic updating of the priority list. This allows the abnormal propagation tracking results to conform to the temporal changes and path evolution patterns of the disease.

[0047] During dynamic tracking, the priority list and sorting of anomaly propagation paths are updated periodically according to the sliding window, and the anomaly propagation trajectory is recorded to provide structured data support for subsequent anomaly localization and source tracing.

[0048] Specifically, in response to path adjustments and priority updates that occur during dynamic tracking, the entire abnormal propagation path is updated periodically, with the update cycle consistent with the sliding window cycle. Each update generates a new list of abnormal propagation paths and a priority ranking. Simultaneously, the complete trajectory of abnormal propagation is recorded in detail, including the node sequence of the propagation path, the energy value of each node, path switching time, and changes in preference coefficients, forming an abnormal propagation trajectory report.

[0049] The trajectory reports are initially organized to extract the nodes and main paths of abnormal propagation, clarify the overall trend of abnormal propagation, and provide complete trajectory data for subsequent anomaly localization and source tracing. This ensures that the dynamic tracking of abnormal propagation can accurately reflect the real pathophysiological process, connect the results of path energy calibration and direction preference modeling, and further optimize the accuracy of anomaly detection.

[0050] In this embodiment of the invention, S300 first accurately selects the multi-branch nodes in the initial causal graph, identifies the key bifurcation points of abnormal propagation, and then analyzes the transmission resistance and connection weight of each path through the system to determine the direction and priority of abnormal propagation, effectively avoiding the problems of missed detection and misjudgment caused by path selection deviation. Through a dynamic tracking mechanism, the energy allocation and priority ranking of the path are updated in real time to ensure the accurate capture of abnormal transmission trajectories. At the same time, the path selection is optimized by combining clinical pathological patterns, so that the judgment of the direction and intensity of abnormal transmission is more in line with the actual development of the disease.

[0051] Compared to traditional detection methods, this step not only overcomes the limitations of single-path judgment but also makes the tracking of anomaly propagation more targeted through a dynamic update mechanism. It effectively avoids problems such as energy dispersion and path selection bias, ensuring the accuracy of subsequent root cause localization and path tracing. At the same time, it provides a clear direction for subsequent model iteration and optimization, realizing the upgrade of anomaly detection from static judgment to dynamic adaptation. It not only ensures the accuracy of anomaly propagation path identification but also improves the rigor and practicality of the overall detection process. It perfectly connects with the energy calibration and path priority setting steps mentioned earlier, forming a complete anomaly detection logic closed loop. It effectively solves the pain points of path judgment bias, missed detection, and misjudgment in traditional detection, making anomaly detection more in line with actual clinical needs.

[0052] Specifically, S400: Construct the target node, identify the root cause path based on the anomaly propagation trajectory, and input the root cause path into the causal graph model to complete the iterative optimization of the model's anomaly identification capability. The specific steps include: Extract all nodes in the anomaly propagation trajectory, use the PageRank graph ranking algorithm to calculate the importance score of each node, and use the importance score as the anomaly contribution. The node with the largest anomaly contribution is selected as the target node; the target node is the key node with the greatest influence in the abnormal propagation chain and plays a core driving role in the deterioration of the condition. For example, for a patient with acute myocardial infarction complicated by cardiogenic shock, step S300 outputs the abnormal propagation trajectory, from which the nodes and directed edges are determined: myocardial ischemia → decreased myocardial contractility, myocardial ischemia → arrhythmia, decreased myocardial contractility → hypotension, arrhythmia → hypotension.
The causal graph structure is as follows: myocardial ischemia has out-links to decreased myocardial contractility and arrhythmia, i.e., out-degree = 2; decreased myocardial contractility has an out-link to hypotension, i.e., out-degree = 1; arrhythmia has an out-link to hypotension, i.e., out-degree = 1; hypotension has no out-links, i.e., out-degree = 0. The damping coefficient is 0.85, there are 4 nodes, and the initial anomaly contribution of each node is 1 / 4 = 0.25, so the base score is (1 − 0.85) / 4 = 0.0375. The first iteration of the PageRank iterative formula proceeds as follows. Hypotension: the in-link nodes are decreased myocardial contractility and arrhythmia; contribution sum = 0.25 / 1 + 0.25 / 1 = 0.5; anomaly contribution = 0.0375 + 0.85 × 0.5 = 0.0375 + 0.425 = 0.4625. Decreased myocardial contractility: the in-link node is myocardial ischemia; contribution sum = 0.25 / 2 = 0.125; anomaly contribution = 0.0375 + 0.85 × 0.125 = 0.0375 + 0.10625 = 0.14375. Arrhythmia: the in-link node is myocardial ischemia; contribution sum = 0.25 / 2 = 0.125; anomaly contribution = 0.0375 + 0.85 × 0.125 = 0.14375. Myocardial ischemia: there are no in-link nodes; anomaly contribution = 0.0375. After the first iteration, the anomaly contributions of myocardial ischemia, decreased myocardial contractility, arrhythmia, and hypotension are 0.0375, 0.14375, 0.14375, and 0.4625, respectively. The iterative calculation is then repeated.
In the second iteration, the anomaly contribution of hypotension is 0.2819, which is again the highest and much higher than that of the other nodes. Therefore, the target node is hypotension. The anomaly contribution indicates the importance and influence of a node within the entire anomaly propagation path and its contribution to the formation of the overall anomaly. The higher the value, the more likely the node is the core trigger point or a key transit node for the occurrence, spread, and deterioration of the anomaly. It is used to distinguish key nodes from secondary nodes and to automatically select, from the anomaly propagation trajectory, the core nodes that play a leading and pivotal role in the transmission and spread of the disease as target nodes, providing a basis for subsequent anomaly tracing, root cause location, and determination of intervention targets. In an abnormal propagation trajectory, a node that is pointed to by multiple upstream abnormal nodes, and whose upstream nodes themselves have a high degree of abnormality, receives a higher anomaly contribution, i.e., a higher PageRank score. In clinical terms, this corresponds to a node that is both a convergence point of multiple pathological processes and a key hub of disease deterioration. Unlike traditional methods that simply count the number of times a node is traversed or accumulate path energy, the PageRank algorithm accounts for the recursive dependency of propagation: the importance of a node depends not only on how many predecessor nodes point to it, but also on the importance of those predecessor nodes themselves. This matches the pathological characteristic of abnormalities amplifying and propagating layer by layer along a causal chain. Therefore, using PageRank to select target nodes can automatically locate core abnormal nodes in complex abnormal propagation trajectories, providing a precise entry point for subsequent root cause tracing.
When using the PageRank graph ranking algorithm to calculate the importance score of each node, the damping coefficient is typically set to 0.85. Based on the target node and combined with the abnormal propagation trajectory, the root cause path of the abnormal propagation is obtained by tracing back level by level along the propagation direction and filtering the path, thus clarifying the origin and propagation process of the abnormality. In practice, starting from the target node, the system traces back upstream to its predecessor nodes along the anomaly propagation trajectory, sequentially reading the calibrated path energy, preference coefficient, and transmission resistance of each predecessor node's path. In each level of backtracking, multiple predecessor paths branching off from the same parent node are compared horizontally. Predecessor paths with energy higher than the anomaly judgment threshold, higher preference coefficients than other paths, and lower transmission resistance are retained to ensure that the traced paths have real anomaly support and conform to anomaly propagation tendencies and transmission patterns, thereby ensuring the accuracy and rationality of the root cause path. At the same time, invalid branches with insufficient energy and no actual anomaly support are eliminated.
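The PageRank arithmetic of the myocardial infarction example in this paragraph can be reproduced with the short sketch below. It follows the update used in the worked example, where a node without out-links (hypotension) simply passes nothing on; standard PageRank implementations often redistribute that mass instead. The shortened node names and the function name are illustrative.

```python
def pagerank_step(scores, out_links, damping=0.85):
    """One PageRank update: base score plus damped contributions from in-link nodes."""
    base = (1.0 - damping) / len(scores)
    updated = {}
    for node in scores:
        incoming = sum(
            scores[source] / len(targets)
            for source, targets in out_links.items()
            if node in targets
        )
        updated[node] = base + damping * incoming
    return updated

# Directed edges of the example trajectory.
out_links = {
    "ischemia": ["contractility", "arrhythmia"],
    "contractility": ["hypotension"],
    "arrhythmia": ["hypotension"],
    "hypotension": [],
}
initial = {node: 1 / len(out_links) for node in out_links}
first = pagerank_step(initial, out_links)
second = pagerank_step(first, out_links)
target_node = max(second, key=second.get)
```

Under these assumptions the first iteration gives 0.0375, 0.14375, 0.14375, and 0.4625, and the second gives about 0.2819 for hypotension, which remains the highest-scoring node.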

[0053] Repeat the upstream backtracking process described above until the starting node with no upstream predecessor node is reached. This starting node is the origin node of the anomaly. Connect the complete path from the origin node to the target node to form the main path of anomaly propagation with temporal consistency and energy support, that is, the root cause path of anomaly propagation. Finally, clarify the complete propagation process of the anomaly from its initial triggering to its spread and deterioration.
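The level-by-level backtracking can be sketched as follows. The example assumes each upstream edge carries its calibrated energy, preference coefficient, and transmission resistance, and that candidates are first filtered by energy and then compared by preference (higher is better) and resistance (lower is better). The data layout, names, and numeric values are hypothetical; the loop also stops when no predecessor passes the energy filter.

```python
def trace_root_cause(target, predecessors, edge_info, energy_threshold):
    """Walk upstream from the target node and return the path origin -> target."""
    path = [target]
    visited = {target}
    node = target
    while True:
        candidates = [
            up for up in predecessors.get(node, [])
            if up not in visited
            and edge_info[(up, node)]["energy"] > energy_threshold
        ]
        if not candidates:
            break  # reached an origin node, or no branch has energy support
        best = max(
            candidates,
            key=lambda up: (edge_info[(up, node)]["preference"],
                            -edge_info[(up, node)]["resistance"]),
        )
        path.append(best)
        visited.add(best)
        node = best
    path.reverse()
    return path

predecessors = {
    "hypotension": ["contractility", "arrhythmia"],
    "contractility": ["ischemia"],
    "arrhythmia": ["ischemia"],
}
edge_info = {  # illustrative values only
    ("contractility", "hypotension"): {"energy": 0.09, "preference": 0.6, "resistance": 0.3},
    ("arrhythmia", "hypotension"): {"energy": 0.04, "preference": 0.4, "resistance": 0.5},
    ("ischemia", "contractility"): {"energy": 0.08, "preference": 1.0, "resistance": 0.2},
    ("ischemia", "arrhythmia"): {"energy": 0.05, "preference": 1.0, "resistance": 0.4},
}
root_cause_path = trace_root_cause("hypotension", predecessors, edge_info, 0.06)
```

With these illustrative values the arrhythmia branch is dropped for insufficient energy, and the traced path runs from ischemia through decreased contractility to hypotension.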

[0054] Each level of backtracking refers to the process of tracing back from the current node to the upstream predecessor node level by level.

[0055] Cross-validation is performed on the root cause path of abnormal propagation to ensure the validity of the source tracing results, providing core evidence for subsequent anomaly detection output and determining the validity of the source tracing. If the source tracing is invalid, the path priority ranking is optimized by adjusting the preference coefficients of different time stages of acute and severe illness until the source tracing is valid.

[0056] Next, the results of tracing the target node and root cause path are compiled into an anomaly tracing report. The report identifies the anomaly target node, root cause path, anomaly propagation trajectory, and the contribution of each node. Cross-validation is used to verify the tracing results: data samples from different time periods and disease types of acute and severe illness are selected, the above process is repeated, and the consistency between the tracing results and the actual clinical diagnosis is compared. For example, if the consistency reaches 85% or higher, the tracing results are considered valid; if the consistency is lower than 85%, the correction coefficients of the time stages are adjusted to change the anomaly propagation direction preference coefficients and optimize the path priority ranking until the tracing results meet the validity standard. The specific adjustment method is as follows: the source tracing results are compared with actual clinical pathways to identify real pathways that were missed and false pathways that were misjudged. If a real pathological pathway is not prioritized, its preference coefficient is too low, so the correction coefficient for that pathway at the corresponding stage is increased, for example, from 1.05 to 1.07-1.09. If a pathway is incorrectly identified as the primary pathway, its preference coefficient is too high, so the correction coefficient for that pathway is decreased, for example, from 1.0 to 0.96-0.98. A gradient parameter optimization method is used, with only minor adjustments in the range of ±0.02 to ±0.05 each time, which avoids abrupt changes. The preference coefficients are recalculated from the adjusted correction coefficients to optimize the pathway priority ranking until the source tracing results meet the validity standard.
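One round of this adjustment might look like the sketch below. The step size is restricted to the 0.02 to 0.05 range mentioned in the text and the 85% consistency criterion is taken from the example; clamping the coefficients to [0.9, 1.1] is an assumption based on the example ranges given earlier, and all names are hypothetical.

```python
def tracing_is_valid(consistent_cases, total_cases, required=0.85):
    """Consistency between tracing results and clinical diagnosis meets the standard."""
    return consistent_cases / total_cases >= required

def adjust_correction(correction, missed_paths, misjudged_paths,
                      step=0.02, low=0.9, high=1.1):
    """Raise coefficients of missed real paths, lower those of misjudged paths."""
    if not 0.02 <= step <= 0.05:
        raise ValueError("step should stay within the 0.02 to 0.05 range")
    updated = dict(correction)
    for path in missed_paths:
        updated[path] = min(high, updated.get(path, 1.0) + step)
    for path in misjudged_paths:
        updated[path] = max(low, updated.get(path, 1.0) - step)
    return updated
```

A caller would repeat the cycle of adjusting, recomputing preference coefficients, re-ranking, and re-validating until `tracing_is_valid` returns `True`.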
The verified anomaly tracing results will serve as the core basis for subsequent anomaly detection outputs, ensuring the accuracy of anomaly location and tracing, connecting all outputs from previous steps, and forming a complete anomaly detection logic chain.

[0057] The root cause path corresponding to the effective source tracing is input into the causal graph model to update the causal graph model structure; Based on the updated causal graph model, the path energy calibration and conduction resistance value calculation of each path are re-executed to obtain the accuracy of the abnormal propagation judgment. If the accuracy exceeds the preset accuracy threshold, the current causal graph structure is saved as the final stable causal graph model. If the accuracy does not exceed the preset accuracy threshold, the root cause path corresponding to the effective source tracing is iteratively updated to the causal graph model until the accuracy exceeds the preset accuracy threshold, at which point the iteration stops.

[0058] In this invention, all parameters are made dimensionless through nondimensionalization, and all thresholds can be obtained by the mean-standard deviation method. The validated abnormal propagation paths, target nodes, and propagation relationships are used as prior knowledge and imported into the initial causal graph structure. The confidence of the existing nodes and edges on a path is strengthened by increasing their connection weights, and new paths and new relationships not yet included in the model are added, so that the causal graph more accurately reflects the real pathological transmission law and provides a more realistic structural basis for subsequent anomaly detection. Using the updated causal graph as input, the path energy calculation, transmission resistance assessment, and preference coefficient generation processes are re-executed. The newly added effective path information is used to optimize the energy allocation, reduce path omissions and direction misjudgments, and enable the model to identify abnormal propagation paths more accurately in the new round of detection, further reducing energy leakage and direction selection bias. The updated model is validated using test samples to obtain the accuracy of the detection results. If the performance meets the preset requirements, the current causal graph structure and related parameters are saved as the final stable model. If deviations remain, new effective paths are iteratively added to the causal graph, and the structure optimization and model inference process is repeated, forming a complete closed loop of detection, source tracing, validation, and model updating.
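The closed loop of paragraphs [0057] and [0058] can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is represented as a dictionary of directed edges, the weight boost of 0.1, the initial weight of 0.5 for new edges, and the `max_rounds` safeguard are hypothetical values, and `evaluate` stands in for the re-execution of energy calibration and resistance calculation on test samples.

```python
def merge_path(graph, path, boost=0.1, new_weight=0.5):
    """Return a copy of the graph with one validated root cause path merged in.

    graph -- dict mapping (source, target) to a connection weight in [0, 1]
    path  -- ordered list of node names along the validated path
    """
    updated = dict(graph)
    for edge in zip(path, path[1:]):
        if edge in updated:
            # Existing edge: strengthen its confidence, capped at 1.0.
            updated[edge] = min(1.0, updated[edge] + boost)
        else:
            # Relationship missing from the model: add it.
            updated[edge] = new_weight
    return updated


def refine(graph, validated_paths, evaluate, threshold, max_rounds=20):
    """Merge validated paths one by one until accuracy exceeds the threshold."""
    for round_no, path in enumerate(validated_paths):
        if round_no >= max_rounds:
            break
        graph = merge_path(graph, path)
        if evaluate(graph) > threshold:
            return graph, True   # stable model reached
    return graph, False          # threshold not reached with the given paths


if __name__ == "__main__":
    start = {("HR", "MAP"): 0.6}
    paths = [["HR", "MAP", "Lactate"]]
    # Toy evaluator: pretends accuracy grows with the number of known edges.
    final, stable = refine(start, paths, lambda g: len(g) / 2, threshold=0.9)
    print(final, stable)
```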

[0059] Accuracy is obtained by comparing the source tracing results output by the model with the actual pathological paths labeled by clinicians, and by dividing the number of correctly predicted samples by the total number of test samples. In this embodiment of the invention, the step first quantifies the anomaly contribution of each node using the PageRank algorithm, accurately selects the target node that plays a core role in the propagation of anomalies, and then traces the root cause path in reverse by combining the anomaly propagation trajectory to ensure the accuracy of root cause location; cross-validation ensures the effectiveness of the source tracing results, and at the same time, the validated root cause path is fed back to the causal graph model to realize the dynamic update of the model structure, forming a complete closed loop of detection, source tracing, verification, and update.
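As an illustration of the two quantities named in paragraph [0059], the sketch below computes accuracy as correct samples over total samples and scores nodes with a plain power-iteration PageRank over the weighted propagation edges. The damping factor of 0.85, the even redistribution of rank from nodes without outgoing edges, and the example node names are assumptions chosen for the sketch; the text only states that the PageRank algorithm is used.

```python
def accuracy(predicted_paths, labeled_paths):
    """Correctly predicted samples divided by the total number of test samples."""
    correct = sum(1 for p, t in zip(predicted_paths, labeled_paths) if p == t)
    return correct / len(labeled_paths)


def pagerank(edges, damping=0.85, iters=100, tol=1e-10):
    """Weighted PageRank by power iteration.

    edges -- dict mapping (source, target) to a positive edge weight
    Returns a dict of node -> score; the scores sum to 1.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_weight = {n: 0.0 for n in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for (src, dst), w in edges.items():
            if out_weight[src] > 0.0:
                new_rank[dst] += damping * rank[src] * w / out_weight[src]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    trajectory = {("HR", "MAP"): 1.0, ("RR", "MAP"): 1.0, ("MAP", "Lactate"): 1.0}
    scores = pagerank(trajectory)
    print(max(scores, key=scores.get), scores)
```

In this toy trajectory the highest score falls on the node where the propagated anomaly converges, which is then the starting point for tracing the root cause path in reverse.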

[0060] Compared to traditional detection methods, this process avoids missed detections and misjudgments caused by root cause localization bias. Furthermore, through gradient parameter fine-tuning and dynamic updates to the causal graph structure, the model continuously adapts to clinical pathological patterns, reducing energy leakage and path misjudgment. Simultaneously, through complete closed-loop iteration, anomaly detection is upgraded from passive identification to proactive early warning and dynamic optimization. This aligns with the actual needs of acute and critical illness diagnosis and treatment, while standardized processes ensure the reliability of tracing results. Anomaly detection not only accurately identifies anomalies and locates root causes but also continuously improves detection efficiency and accuracy through ongoing model iteration, providing clear direction for clinical intervention.

[0061] Embodiment 2: As shown in Figure 2, based on the same inventive concept as the method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model provided in Embodiment 1, this embodiment of the invention also provides a system for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model, comprising: a data acquisition and preprocessing module 11, used to acquire multi-source raw data in the process of diagnosis and treatment of acute and critical illnesses, and to construct an initial causal graph with each feature in the multi-source raw data as a node by analyzing the temporal lag correlation and conditional mutual information structure between variables; a path analysis module 12, used to perform energy allocation and energy discrepancy analysis based on the connection weights of each directed edge in the initial causal graph, and to obtain the calibrated path energy; an anomaly detection module 13, used to extract multi-branch nodes in the initial causal graph and analyze the path propagation differences of anomalous energy during dynamic tracking, so as to identify and update the initial main path of anomalous propagation and obtain the anomalous propagation trajectory; and an anomaly monitoring and optimization module 14, used to construct target nodes, identify root cause paths by combining anomaly propagation trajectories, and input the root cause paths into the causal graph model to complete the iterative optimization of the model's anomaly identification capability.

[0062] Furthermore, the execution steps of the data acquisition and preprocessing module 11 also include: preprocessing the multi-source raw data from the diagnosis and treatment of acute and critical illnesses to obtain a standardized diagnosis and treatment dataset; analyzing, based on the standardized diagnosis and treatment dataset, the temporal lag correlations and conditional mutual information structures among variables to obtain effective causal connections; and, using the effective causal connections as directed edges and taking each feature in the standardized diagnosis and treatment dataset as a node, constructing an initial causal graph and the connection weight of each of its directed edges.

[0063] Furthermore, the execution steps of the data acquisition and preprocessing module 11 also include: Based on the standardized diagnosis and treatment dataset, the time-series cross-correlation coefficient corresponding to each lag order is obtained by calculating the centered time-series cross-correlation function for variable pairs under different lag orders; The maximum absolute magnitude within the time-series cross-correlation coefficient sequence is extracted. If the maximum absolute magnitude exceeds the preset significance threshold, the variable pair is recorded as a pair of related variables. The lag order corresponding to the peak of the time-series cross-correlation of the pair of related variables is determined. If the lag order is not equal to 0, the pair of related variables is determined to have an asynchronous lag dependency and is included in the candidate variable pair set. Based on the sign of the lag order of the related variable pairs within the candidate variable pair set, causal direction candidates are marked by the time sequence rule to form a directed edge candidate set; A fixed-length sliding window is set, and the mutual information between each pair of related variables in the candidate variable pair set is calculated within each sliding window. Features other than the corresponding two variables in the standardized diagnosis and treatment dataset are used as the condition set to analyze the conditional dependency structure between the related variables and obtain the conditional mutual information. If the difference between the mutual information and the conditional mutual information is higher than a preset independence threshold, the corresponding variable pair is retained as a valid causal link.
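The lag analysis stage of paragraph [0063] can be sketched in plain Python as below. It covers only the centered cross-correlation, the significance check, and the direction marking by lag sign; the mutual information stage is not shown. The normalization by the full-series norms, the sign convention (a positive lag means the first variable leads), and the function names are assumptions made for this sketch.

```python
from math import sqrt
from statistics import fmean


def lagged_xcorr(x, y, max_lag):
    """Centered, normalized cross-correlation for lags -max_lag..max_lag.

    A positive lag pairs x[t] with y[t + lag], i.e. x leads y by `lag` steps.
    """
    n = len(x)
    mx, my = fmean(x), fmean(y)
    xc = [v - mx for v in x]
    yc = [v - my for v in y]
    norm = sqrt(sum(v * v for v in xc) * sum(v * v for v in yc))
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            pairs = zip(xc[: n - lag], yc[lag:])
        else:
            pairs = zip(xc[-lag:], yc[: n + lag])
        result[lag] = sum(a * b for a, b in pairs) / norm if norm else 0.0
    return result


def direction_candidate(x, y, max_lag, threshold):
    """Return ('x->y', lag) or ('y->x', lag), or None if no lagged dependency."""
    corr = lagged_xcorr(x, y, max_lag)
    peak_lag = max(corr, key=lambda k: abs(corr[k]))
    if abs(corr[peak_lag]) <= threshold or peak_lag == 0:
        return None  # not significant, or synchronous (lag order 0)
    return ("x->y", peak_lag) if peak_lag > 0 else ("y->x", -peak_lag)


if __name__ == "__main__":
    heart_rate = [0, 0, 1, 0, 0, 0, 0, 0]
    lactate = [0, 0, 0, 0, 1, 0, 0, 0]   # same pulse, two steps later
    print(direction_candidate(heart_rate, lactate, max_lag=3, threshold=0.5))
```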

[0064] Furthermore, the execution steps of the path analysis module 12 also include: Based on the standardized diagnosis and treatment dataset and the connection weights of each directed edge in the initial causal graph, the overall sample deviation is calculated from the reconstruction error, and energy is allocated according to the weights to obtain the initial energy of each path and the potential energy leakage samples. For the potential energy leakage samples, based on the initial causal graph and the initial energy of each path, the degree of energy dispersion is analyzed to obtain the dilution coefficient. The dilution coefficients of each path are then weighted and summed according to the connection weights to obtain the energy dilution coefficient. The range of energy dilution coefficient values is mapped to a preset coefficient range, and the energy calibration coefficient is obtained by linear mapping. The initial energy of each path is multiplied by the corresponding energy calibration coefficient to obtain the calibrated path energy.
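A minimal sketch of the allocation and calibration arithmetic in paragraph [0064] follows. The preset coefficient range of 1.0 to 1.5, the use of per-path dilution values as direct inputs, and the midpoint fallback when all dilution values are equal are assumptions for illustration; how the dilution coefficient itself is derived from the energy dispersion is not reproduced here.

```python
def allocate_energy(total_deviation, weights):
    """Split the overall sample deviation across paths in proportion to weight."""
    total_weight = sum(weights.values())
    return {p: total_deviation * w / total_weight for p, w in weights.items()}


def linear_map(value, src_lo, src_hi, dst_lo, dst_hi):
    """Map value from [src_lo, src_hi] onto [dst_lo, dst_hi], clamped at the ends."""
    if src_hi == src_lo:
        return (dst_lo + dst_hi) / 2  # degenerate range: use the midpoint
    t = (value - src_lo) / (src_hi - src_lo)
    t = min(1.0, max(0.0, t))
    return dst_lo + t * (dst_hi - dst_lo)


def calibrate(initial_energy, dilution, coeff_range=(1.0, 1.5)):
    """Multiply each path's initial energy by its energy calibration coefficient."""
    lo, hi = min(dilution.values()), max(dilution.values())
    return {
        path: energy * linear_map(dilution[path], lo, hi, *coeff_range)
        for path, energy in initial_energy.items()
    }


if __name__ == "__main__":
    energy = allocate_energy(8.0, {"HR->MAP": 3.0, "HR->SpO2": 1.0})
    print(calibrate(energy, {"HR->MAP": 0.2, "HR->SpO2": 0.6}))
```

With this mapping, the path whose energy is most diluted receives the largest calibration coefficient, which compensates for leakage on that path.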

[0065] Furthermore, the execution steps of the anomaly detection module 13 also include: Traverse all nodes in the initial causal graph and filter out multi-branch nodes with two or more output paths, and determine the set of output paths corresponding to each multi-branch node; Based on the average path length and connection weight of each output path in the output path set, the comprehensive hindering effect of path conduction distance and overall causal correlation strength on energy propagation is analyzed, and the conduction resistance value of each output path is obtained. For all output paths under each multi-branch node, the preference coefficient is obtained through negative correlation mapping based on the transmission resistance value. The preference coefficient is then adjusted in combination with different time stages of acute and critical illness to form an abnormal propagation direction preference matrix. Based on the magnitude of the preference coefficients within the anomaly propagation direction preference matrix, the output paths of each multi-branch node are sorted in descending order to obtain preliminary priorities. These preliminary priorities are then adjusted using the calibrated path energy to form an anomaly propagation path priority list.
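The ranking of output paths at a multi-branch node, as described in paragraph [0065], can be illustrated as follows. The concrete formulas are assumptions, since the text only names the inputs and the direction of the relationships: resistance is taken here as average path length divided by connection weight, the negative correlation mapping as 1 / (1 + resistance), the stage adjustment as a multiplicative correction coefficient, and the energy adjustment as a multiplication by the calibrated path energy.

```python
def conduction_resistance(avg_length, weight, eps=1e-9):
    """Longer paths and weaker causal links hinder propagation more (assumed form)."""
    return avg_length / max(weight, eps)


def preference(resistance, stage_coeff=1.0):
    """Negative correlation mapping: higher resistance gives lower preference."""
    return stage_coeff / (1.0 + resistance)


def rank_outputs(paths, stage_coeffs=None, energies=None):
    """Rank the output paths of one multi-branch node.

    paths        -- dict: path name -> (average path length, connection weight)
    stage_coeffs -- optional dict: path name -> stage correction coefficient
    energies     -- optional dict: path name -> calibrated path energy
    Returns (names sorted by descending priority, dict of priority scores).
    """
    stage_coeffs = stage_coeffs or {}
    energies = energies or {}
    scores = {}
    for name, (length, weight) in paths.items():
        pref = preference(conduction_resistance(length, weight),
                          stage_coeffs.get(name, 1.0))
        scores[name] = pref * energies.get(name, 1.0)
    return sorted(scores, key=scores.get, reverse=True), scores


if __name__ == "__main__":
    outputs = {"MAP->Lactate": (1.0, 0.8), "MAP->Urine": (2.0, 0.4)}
    print(rank_outputs(outputs))
    print(rank_outputs(outputs, stage_coeffs={"MAP->Urine": 3.0}))
```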

[0066] Furthermore, the execution steps of the anomaly detection module 13 also include: Read the priority list of anomaly propagation paths and the calibrated path energy, and determine the initial main path of anomaly propagation by priority traversal; Based on the initial main path, the anomaly propagation process is dynamically tracked by updating the path energy and preference coefficient window by window, so as to update the anomaly propagation path priority list and to switch the initial main path. During dynamic tracking, the priority list and ordering of anomaly propagation paths are updated periodically according to the sliding window, and the anomaly propagation trajectory is recorded.
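The window-by-window tracking in paragraph [0066] can be sketched as below. The per-window priority scores are taken as given inputs, and the `switch_margin` parameter (a minimum lead before the main path is switched) is a hypothetical addition to avoid flip-flopping between nearly equal paths; the text itself only states that the list is re-sorted each window and that the main path may switch.

```python
def track_main_path(window_scores, switch_margin=0.0):
    """Follow the main anomaly propagation path across sliding windows.

    window_scores -- list of dicts, one per window: path name -> priority score
    Returns a list of (window index, main path, ranked path list) records,
    which together form the recorded propagation trajectory.
    """
    trajectory = []
    main = None
    for index, scores in enumerate(window_scores):
        # Re-sort the priority list for this window.
        ranked = sorted(scores, key=scores.get, reverse=True)
        best = ranked[0]
        # Switch when there is no valid main path yet, or when the best path
        # leads the current main path by more than the margin.
        if main is None or main not in scores \
                or scores[best] - scores[main] > switch_margin:
            main = best
        trajectory.append((index, main, ranked))
    return trajectory


if __name__ == "__main__":
    windows = [
        {"HR->MAP": 0.9, "HR->SpO2": 0.4},
        {"HR->MAP": 0.6, "HR->SpO2": 0.55},
        {"HR->MAP": 0.3, "HR->SpO2": 0.8},
    ]
    for record in track_main_path(windows, switch_margin=0.1):
        print(record)
```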

[0067] Furthermore, the execution steps of the anomaly monitoring optimization module 14 also include: Extract all nodes in the anomaly propagation trajectory, and use the PageRank graph ranking algorithm to analyze and calculate the importance score of each node, and use the importance score as the anomaly contribution. Select the node with the largest abnormal contribution as the target node; Based on the target node and combined with the abnormal propagation trajectory, the root cause path of abnormal propagation is obtained by tracing back level by level along the propagation direction and filtering the path. Cross-validation is performed on the root cause path of abnormal transmission to determine the effectiveness of source tracing. If source tracing is ineffective, the path priority ranking is optimized by adjusting the preference coefficients of different time stages of acute and severe illness until source tracing is effective.
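Target node selection and level-by-level back-tracing from paragraph [0067] can be illustrated with the sketch below. The anomaly contribution scores are assumed to be already computed, for example by PageRank. The filtering rule used here, which keeps at each level the predecessor with the largest recorded edge energy above `min_energy`, is an assumption, since the text does not specify the path filter.

```python
def trace_root_cause(trajectory, contributions, min_energy=0.0):
    """Select the target node and walk the trajectory backwards to a root.

    trajectory    -- dict mapping (source, target) to recorded edge energy
    contributions -- dict mapping node name to anomaly contribution score
    Returns (target node, root cause path ordered from root to target).
    """
    target = max(contributions, key=contributions.get)
    path = [target]
    visited = {target}
    current = target
    while True:
        # Candidate predecessors one level upstream of the current node.
        preds = [
            (energy, src)
            for (src, dst), energy in trajectory.items()
            if dst == current and src not in visited and energy > min_energy
        ]
        if not preds:
            break  # reached a node with no remaining upstream edge
        _energy, current = max(preds)
        path.append(current)
        visited.add(current)  # guards against cycles in the trajectory
    path.reverse()
    return target, path


if __name__ == "__main__":
    recorded = {("HR", "MAP"): 0.8, ("RR", "MAP"): 0.3, ("MAP", "Lactate"): 0.9}
    scores = {"HR": 0.1, "RR": 0.1, "MAP": 0.3, "Lactate": 0.5}
    print(trace_root_cause(recorded, scores))
```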

[0068] Furthermore, the execution steps of the anomaly monitoring optimization module 14 also include: The root cause path corresponding to the effective source tracing is input into the causal graph model to update the causal graph model structure; Based on the updated causal graph model, the path energy calibration and conduction resistance value calculation of each path are re-executed to obtain the accuracy of the abnormal propagation judgment. If the accuracy exceeds the preset accuracy threshold, the current causal graph structure is saved as the final stable causal graph model. If the accuracy does not exceed the preset accuracy threshold, the root cause path corresponding to the effective source tracing is iteratively updated to the causal graph model until the accuracy exceeds the preset accuracy threshold, at which point the iteration stops.

[0069] It should be noted that the descriptions of each embodiment in the above embodiments have different focuses. For parts that are not described in detail in a certain embodiment, please refer to the relevant descriptions in other embodiments.

[0070] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0071] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and / or block of the flowchart illustrations and / or block diagrams, and combinations of flows and / or blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.

[0072] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.

[0073] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.

[0074] Although preferred embodiments of the invention have been described, those skilled in the art, once they have learned the basic inventive concept, can make other changes and modifications to these embodiments.

[0075] Obviously, those skilled in the art can make various modifications and variations to this invention without departing from its spirit and scope. Therefore, if these modifications and variations fall within the scope of this invention and its equivalents, this invention also intends to include these modifications and variations.

Claims

1. A method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model, characterized in that the method includes: acquiring multi-source raw data from the diagnosis and treatment of critical illnesses, and constructing an initial causal graph with each feature in the multi-source raw data as a node by analyzing the temporal lag correlation and conditional mutual information structure among variables; performing energy allocation and energy discrepancy analysis based on the connection weights of each directed edge in the initial causal graph to obtain the calibrated path energy; extracting multi-branch nodes within the initial causal graph and analyzing the path propagation differences of anomalous energy during dynamic tracking, so as to identify and update the initial main path of anomalous propagation and obtain the anomalous propagation trajectory; and constructing target nodes, identifying root cause paths by combining anomaly propagation trajectories, and inputting the root cause paths into the causal graph model to complete the iterative optimization of the model's anomaly identification capability.

2. The method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model according to claim 1, characterized in that the initial causal graph construction process includes: preprocessing the multi-source raw data from the diagnosis and treatment of acute and critical illnesses to obtain a standardized diagnosis and treatment dataset; analyzing, based on the standardized diagnosis and treatment dataset, the temporal lag correlations and conditional mutual information structures among variables to obtain effective causal connections; and, using the effective causal connections as directed edges and taking each feature in the standardized diagnosis and treatment dataset as a node, constructing an initial causal graph and the connection weight of each of its directed edges.

3. The method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model according to claim 2, characterized in that analyzing, based on the standardized diagnosis and treatment dataset, the temporal lag correlations and conditional mutual information structures among variables to obtain effective causal connections includes: Based on the standardized diagnosis and treatment dataset, the time-series cross-correlation coefficient corresponding to each lag order is obtained by calculating the centered time-series cross-correlation function for variable pairs under different lag orders; The maximum absolute magnitude within the time-series cross-correlation coefficient sequence is extracted. If the maximum absolute magnitude exceeds the preset significance threshold, the variable pair is recorded as a pair of related variables. The lag order corresponding to the peak of the time-series cross-correlation of the pair of related variables is determined. If the lag order is not equal to 0, the pair of related variables is determined to have an asynchronous lag dependency and is included in the candidate variable pair set. Based on the sign of the lag order of the related variable pairs within the candidate variable pair set, causal direction candidates are marked by the time sequence rule to form a directed edge candidate set; A fixed-length sliding window is set, and the mutual information between each pair of related variables in the candidate variable pair set is calculated within each sliding window. Features other than the corresponding two variables in the standardized diagnosis and treatment dataset are used as the condition set to analyze the conditional dependency structure between the related variables and obtain the conditional mutual information. If the difference between the mutual information and the conditional mutual information is higher than a preset independence threshold, the corresponding variable pair is retained as a valid causal link.

4. The method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model according to claim 3, characterized in that performing energy allocation and energy discrepancy analysis based on the connection weights of each directed edge in the initial causal graph to obtain the calibrated path energy includes: Based on the standardized diagnosis and treatment dataset and the connection weights of each directed edge in the initial causal graph, the overall sample deviation is calculated from the reconstruction error, and energy is allocated according to the weights to obtain the initial energy of each path and the potential energy leakage samples. For the potential energy leakage samples, based on the initial causal graph and the initial energy of each path, the degree of energy dispersion is analyzed to obtain the dilution coefficient. The dilution coefficients of each path are then weighted and summed according to the connection weights to obtain the energy dilution coefficient. The range of energy dilution coefficient values is mapped to a preset coefficient range, and the energy calibration coefficient is obtained by linear mapping. The initial energy of each path is multiplied by the corresponding energy calibration coefficient to obtain the calibrated path energy.

5. The method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model according to claim 4, characterized in that, Extract multi-branch nodes from the initial causal graph and analyze the differences in path propagation of anomalous energy during dynamic tracing, including: Traverse all nodes in the initial causal graph and filter out multi-branch nodes with two or more output paths, and determine the set of output paths corresponding to each multi-branch node; Based on the average path length and connection weight of each output path in the output path set, the comprehensive hindering effect of path conduction distance and overall causal correlation strength on energy propagation is analyzed, and the conduction resistance value of each output path is obtained. For all output paths under each multi-branch node, the preference coefficient is obtained through negative correlation mapping based on the transmission resistance value. The preference coefficient is then adjusted in combination with different time stages of acute and severe illness to form an abnormal propagation direction preference matrix. Based on the magnitude of the preference coefficients within the anomaly propagation direction preference matrix, the output paths of each multi-branch node are sorted in descending order to obtain preliminary priorities. These preliminary priorities are then adjusted using the calibrated path energy to form an anomaly propagation path priority list.

6. The method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model according to claim 5, characterized in that identifying and updating the initial main path of anomaly propagation to obtain the anomaly propagation trajectory includes: Read the priority list of anomaly propagation paths and the calibrated path energy, and determine the initial main path of anomaly propagation by priority traversal; Based on the initial main path, the anomaly propagation process is dynamically tracked by updating the path energy and preference coefficient window by window, so as to update the anomaly propagation path priority list and to switch the initial main path. During dynamic tracking, the priority list and ordering of anomaly propagation paths are updated periodically according to the sliding window, and the anomaly propagation trajectory is recorded.

7. The method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model according to claim 6, characterized in that constructing target nodes, identifying root cause paths by combining anomaly propagation trajectories, and inputting the root cause paths into the causal graph model to complete the iterative optimization of the model's anomaly identification capability includes: Extract all nodes in the anomaly propagation trajectory, use the PageRank graph ranking algorithm to calculate the importance score of each node, and use the importance score as the anomaly contribution. Select the node with the largest anomaly contribution as the target node; Based on the target node and combined with the anomaly propagation trajectory, the root cause path of anomaly propagation is obtained by tracing back level by level along the propagation direction and filtering the path. Cross-validation is performed on the root cause path of anomaly propagation to determine the effectiveness of source tracing. If source tracing is ineffective, the path priority ranking is optimized by adjusting the preference coefficients of different time stages of acute and critical illness until source tracing is effective.

8. The method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model according to claim 7, characterized in that constructing target nodes, identifying root cause paths by combining anomaly propagation trajectories, and inputting the root cause paths into the causal graph model to complete the iterative optimization of the model's anomaly identification capability also includes: The root cause path corresponding to the effective source tracing is input into the causal graph model to update the causal graph model structure; Based on the updated causal graph model, the path energy calibration and conduction resistance value calculation of each path are re-executed to obtain the accuracy of the abnormal propagation judgment. If the accuracy exceeds the preset accuracy threshold, the current causal graph structure is saved as the final stable causal graph model. If the accuracy does not exceed the preset accuracy threshold, the root cause path corresponding to the effective source tracing is iteratively updated to the causal graph model until the accuracy exceeds the preset accuracy threshold, at which point the iteration stops.

9. A system for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model, used to implement the method for detecting anomalies in critical illness diagnosis and treatment data based on a causal graph model as described in any one of claims 1 to 8, characterized in that the system comprises: a data acquisition and preprocessing module, used to acquire multi-source raw data during the diagnosis and treatment of acute and critical illnesses, and to construct an initial causal graph with each feature in the multi-source raw data as a node by analyzing the temporal lag correlation and conditional mutual information structure between variables; a path analysis module, used to perform energy allocation and energy dispersion analysis based on the connection weights of each directed edge in the initial causal graph, and to obtain the calibrated path energy; an anomaly detection module, used to extract multi-branch nodes within the initial causal graph and analyze the path propagation differences of anomaly energy during dynamic tracking, so as to identify and update the initial main path of anomaly propagation and obtain the anomaly propagation trajectory; and an anomaly monitoring and optimization module, used to construct target nodes, identify root cause paths by combining anomaly propagation trajectories, and input the root cause paths into the causal graph model to complete the iterative optimization of the model's anomaly identification capability.