A Method for Mining Key Clinical Change Trend Combination Patterns in ICU Patients

By constructing feature association subgraphs and optimizing the model with reinforcement learning, the method explicitly models the combination patterns of key clinical change trends in ICU patients. It addresses the insufficient trend capture and association modeling of existing technologies, improves prediction performance and interpretability, and adapts to risk assessment for different disease types.

CN122245758A · Pending · Publication Date: 2026-06-19 · ZHONGSHAN HOSPITAL FUDAN UNIV +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: ZHONGSHAN HOSPITAL FUDAN UNIV
Filing Date: 2026-03-16
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

When predicting mortality risk in ICU patients, existing technologies struggle to explicitly capture the changing trends of key physiological indicators, do not model the correlations among the trends of multiple features, produce predictions that are insufficiently interpretable, and do not fully account for the heterogeneity of patients' conditions.

Method used

A feature association subgraph is constructed based on normalized pointwise mutual information (NPMI) and used to guide the generation of structured reasoning chains. The model is then optimized with supervised fine-tuning followed by reinforcement learning, so that key clinical change trend combination patterns are explicitly modeled and prediction performance and interpretability improve.

Benefits of technology

It enables dynamic capture of the disease evolution process of ICU patients, improves the stability and interpretability of risk prediction, supports clinical decision-making, and adapts to specific pattern analysis of different disease types.


Abstract

This invention discloses a method for mining key clinical change trend combination patterns in ICU patients, belonging to the field of natural language processing and medical pattern mining technology. Based on time-series electronic medical record data, the method first extracts abnormal trends in single indicators related to mortality risk (such as continuous increases or rapid decreases). Second, it constructs a feature association subgraph based on Normalized Pointwise Mutual Information (NPMI) to characterize strong correlations between multiple trends. Third, it combines the subgraph with the medical record data to guide a large language model to generate structured reasoning chains, and selects high-quality samples based on prediction accuracy. Finally, two-stage training, consisting of supervised fine-tuning followed by reinforcement learning based on Group Relative Policy Optimization (GRPO), improves the model's ability to identify and generalize key trend combinations. The invention thereby moves from "feature discovery" to "clinical pattern discovery," significantly enhancing the accuracy and interpretability of ICU mortality risk prediction.

Description

Technical Field

[0001] This invention belongs to the field of natural language processing technology, specifically to medical pattern mining technology, and more specifically relates to a method for mining key clinical change trend combination patterns in ICU patients.

Background Technology

[0002] With the widespread adoption of hospital information systems, ICUs have accumulated a large amount of high-frequency, long-term electronic medical record data, including vital signs, laboratory tests, medication records, and interventions. This data provides an important foundation for patient prognosis assessment and risk prediction. Current technologies for predicting mortality risk in ICU patients mainly include statistical regression models, traditional machine learning models, and deep learning models.

[0003] However, the above methods still have the following significant shortcomings in actual clinical applications:

[0004] (1) Limited ability to characterize the changing trends of time series features

[0005] The disease progression of ICU patients is highly dynamic, and clinical decisions often rely on the trends of key physiological indicators over a period of time, rather than static values at a single moment. Existing methods mostly employ time aggregation, sliding windows, or simple sequence modeling, which struggle to explicitly capture and express trends with clear clinical semantics, such as "continuous rise," "rapid decline," or "transition from stability to fluctuation."

[0006] (2) Lack of modeling of the correlation between the changing trends of multiple features

[0007] In evidence-based medicine practice, clinicians often review past cases to summarize clinical patterns composed of trends in multiple indicators. For example, the combination of increased heart rate, increased respiratory rate, and decreased systolic blood pressure often indicates deteriorating circulatory function. Existing prediction methods typically perform global statistics or attention weighting on the importance of individual features, failing to explicitly model the combined relationships between trends in multiple features, which can easily confuse the semantics of different clinical patterns.

[0008] For example, in existing technologies, "increased heart rate," "decreased systolic blood pressure," "increased respiratory rate," and "increased blood lactate" may each be assigned a high importance weight, but they cannot distinguish between the following two clinically distinct patterns:

[0009] • Mode 1: Increased heart rate + decreased systolic blood pressure + increased respiratory rate + increased blood lactate = high-risk state;

[0010] • Mode 2: Increased heart rate + decreased systolic blood pressure + increased respiratory rate + stable blood lactate = moderate risk status.

[0011] When any key trend in the associated feature combination changes, the clinical pattern it represents and the corresponding risk implications will change significantly, and existing methods are unable to effectively distinguish such pattern changes.
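The distinction between Mode 1 and Mode 2 can be illustrated with a minimal Python sketch in which the whole trend combination, rather than each trend separately, is the unit that carries risk. The trend labels (`heart_rate:up` and so on), the risk strings, and the `pattern_risk` helper are hypothetical names chosen for illustration and do not come from the patent text.

```python
from typing import Dict, FrozenSet, Iterable

# Illustrative labels only: the two entries mirror Mode 1 and Mode 2 above.
PATTERN_RISK: Dict[FrozenSet[str], str] = {
    frozenset({"heart_rate:up", "systolic_bp:down", "resp_rate:up", "lactate:up"}): "high",
    frozenset({"heart_rate:up", "systolic_bp:down", "resp_rate:up", "lactate:stable"}): "moderate",
}


def pattern_risk(trends: Iterable[str]) -> str:
    """Return the risk level of an exact trend combination, or "unknown"."""
    return PATTERN_RISK.get(frozenset(trends), "unknown")


mode_1 = ["heart_rate:up", "systolic_bp:down", "resp_rate:up", "lactate:up"]
mode_2 = ["heart_rate:up", "systolic_bp:down", "resp_rate:up", "lactate:stable"]
print(pattern_risk(mode_1))  # high
print(pattern_risk(mode_2))  # moderate
```

Because the key is the full combination, changing a single trend (lactate from rising to stable) selects a different pattern, which per-feature importance weights alone cannot express.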

[0012] (3) The predictive results are not interpretable enough and are difficult to support clinical decision-making.

[0013] ICU risk prediction models require high interpretability in clinical applications. Clinicians are more concerned with "which combinations of trends led to the increased risk," rather than just a final risk score. Existing methods typically only output predicted probabilities or global feature importance rankings, making it difficult to interpret the predictive basis in a "clinical paradigm" that aligns with clinical cognition, thus limiting their credibility and usability in real-world clinical settings.

[0014] (4) The high heterogeneity of ICU patients' conditions is not adequately considered.

[0015] ICU patients come from diverse backgrounds, and different disease types exhibit significantly different key indicators and evolution patterns. Real electronic medical records typically reflect both general clinical patterns and disease-specific patterns; for example, patients with sepsis and those with acute respiratory distress syndrome show marked differences in key indicators and their patterns of change. Existing methods often employ a unified feature modeling framework, making it difficult to effectively characterize disease-specific clinical patterns while sharing general patterns.

[0016] Therefore, there is an urgent need for a new technical solution that can automatically discover clinical patterns composed of the changing trends of key clinical features and their combination relationships from time-series electronic medical record data, dynamically capture multiple clinical patterns that appear during the evolution of a patient's condition, and analyze the impact of each clinical pattern on the patient's mortality risk, thereby achieving ICU mortality risk assessment with higher predictive performance and stronger clinical interpretability.

Summary of the Invention

[0017] This invention aims to elevate the process from "feature discovery" to "clinical pattern discovery," making the predictive reasoning process more aligned with the evidence-based thinking of clinicians, and providing reliable support for early warning and precise intervention for critically ill patients.

[0018] To achieve the above objectives, this invention provides a method for mining combination patterns of key clinical changes in ICU patients, characterized by the following steps:

[0019] S1: Based on retrospective time-series electronic medical record data of critically ill patients and corresponding real clinical outcomes, the changes in multidimensional clinical indicators of patients during hospitalization are analyzed, and the trends of key factors related to the patient's mortality risk are extracted. The trends of these key factors are used to characterize the directional changes of clinical indicators in the time dimension, and the trends of key factors affecting mortality risk are summarized accordingly.

[0020] S2: Construct a feature association subgraph based on the statistical correlation strength between the key factor trends described in step S1. The feature association subgraph characterizes the strong correlations between key factor trends to assist in mining clinical change trend combination patterns. Constructing the feature association subgraph includes: calculating the normalized pointwise mutual information (NPMI) between any two key factor trends, traversing the abnormal change trend nodes in a preset order, and, for each node visited in turn, adding the association edge with the strongest correlation strength.

[0021] S3: The patient's time-series electronic medical record data and the feature association subgraph described in step S2 are used as joint inputs to guide the model to generate, in a preset structured format, a reasoning chain based on the combination of key factor trends, and the generated reasoning chains are filtered for quality. Specifically, when the model is provided with the real patient outcome, a reasoning chain is filtered according to whether the outcome predicted from that chain is consistent with the real outcome; when the model is not provided with the real patient outcome, the chain is filtered according to whether the outcome predicted by the model is correct.

[0022] S4: Based on the filtered high-quality reasoning chain samples, supervised fine-tuning is performed on the model. After supervised fine-tuning is complete, reinforcement learning is introduced for further optimization. The reinforcement learning uses the analyzed combinations of key clinical change trends as the reward signal, thereby improving the model's ability to model key clinical change trend combination patterns.

[0023] Furthermore, step S1 specifically includes:

[0024] S11: Based on retrospective time-series electronic medical record data of critically ill patients and corresponding real clinical outcomes, analyze the changes in multidimensional clinical indicators of patients during their stay in the ICU;

[0025] S12: Given the known patient outcome (death or survival), the model is guided to automatically extract abnormal trends related to mortality risk from the time-series electronic medical records as candidate risk factors or protective factors;

[0026] S13: The abnormal trend describes the directional change of a patient's single clinical indicator over time, including continuous increase, continuous decrease, abrupt change, or transition from a stable state to an unstable state. Abnormal trends comprise risk trends, which indicate an increased risk of death, and protective trends, which indicate a reduced risk of death. Risk trends include: a deteriorating trend, in which a clinical indicator continues to change in an adverse direction and shows no clear sign of recovery within the observation period; a persistent abnormal trend, in which a clinical indicator remains within the abnormal range for a long time and does not return to the normal range; a fluctuating trend, in which a clinical indicator moves frequently or substantially between the normal and abnormal ranges, indicating system instability; and a sudden abnormal trend, in which a clinical indicator deviates sharply from the normal range within a short period, exceeding its previous range of variation. Protective trends include: an improving trend, in which a clinical indicator that was originally abnormal continues to recover toward the normal range; and a normal trend, in which a clinical indicator remains within the normal range for a long period without significant abnormal change;

[0027] S14: The extracted trend-based risk factors and protective factors are used to characterize the directional semantic association between the trends of a patient's clinical indicators and the outcome.
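The trend categories of step S13 can be made concrete with a small rule-based Python sketch. This is an assumption-laden illustration only: in the method itself a model is guided to extract the trends, whereas the `classify_trend` function, its label strings, and its decision rules below are hypothetical stand-ins that merely show what each category means for a numeric series with a known normal range.

```python
from typing import List, Sequence


def classify_trend(values: Sequence[float], low: float, high: float) -> str:
    """Assign one coarse trend label to a series, given its normal range [low, high]."""
    # Distance of each reading from the normal range (0.0 means within range).
    dist: List[float] = [max(low - v, v - high, 0.0) for v in values]
    abnormal = [d > 0 for d in dist]
    if not any(abnormal):
        return "normal"

    # Sudden abnormal: all earlier readings were normal and the last step
    # is larger than the whole earlier spread (hypothetical rule).
    if len(values) >= 3 and abnormal[-1] and not any(abnormal[:-1]):
        prior = values[:-1]
        if abs(values[-1] - values[-2]) > max(prior) - min(prior):
            return "sudden_abnormal"

    # Fluctuating: the series crosses the normal/abnormal boundary repeatedly.
    crossings = sum(a != b for a, b in zip(abnormal, abnormal[1:]))
    if crossings >= 2:
        return "fluctuating"

    non_decreasing = all(b >= a for a, b in zip(dist, dist[1:]))
    non_increasing = all(b <= a for a, b in zip(dist, dist[1:]))
    if non_decreasing and dist[-1] > dist[0]:
        return "deteriorating"  # moving steadily away from the normal range
    if non_increasing and dist[-1] < dist[0]:
        return "improving"      # moving steadily back toward the normal range
    if all(abnormal):
        return "persistent_abnormal"
    return "unclassified"


# Heart rate with an assumed normal range of 60-100 beats per minute.
print(classify_trend([88, 96, 104, 112, 121], 60, 100))  # deteriorating
# Blood lactate with an assumed normal range of 0.5-2.0 mmol/L.
print(classify_trend([4.1, 3.2, 2.6, 2.1], 0.5, 2.0))    # improving
```

The first four labels correspond to the risk trends and the last two to the protective trends; the normal ranges in the usage lines are example values, not thresholds stated in the source.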

[0028] Furthermore, step S2 specifically includes:

[0029] S21: Based on the abnormal change trend clinical features extracted in step S1, construct an abnormal change trend sub-graph to characterize the strong correlation between features;

[0030] S22: For any two abnormal trend features, calculate their normalized pointwise mutual information (NPMI) in the retrospective sample to measure the strength of the association between them. The calculation formula is as follows:

[0031] $$\mathrm{NPMI}(x, y) = \frac{\log \dfrac{P(x, y)}{P(x)\,P(y)}}{-\log P(x, y)}$$

[0032] Here, $x$ and $y$ represent two abnormal trend features. NPMI (Normalized Pointwise Mutual Information) is an indicator that measures the strength of statistical association between two events or features. Its value lies in $[-1, 1]$: a larger value indicates a stronger positive correlation between the two events, a value close to 0 indicates a weaker association, and a negative value indicates a negative correlation. Here it measures the association strength of two abnormal trend features appearing simultaneously in retrospective electronic medical record data. $P(x)$ represents the probability that abnormal trend feature $x$ appears in the retrospective electronic medical record data sample, that is, the proportion of patients or time segments in the selected sample or time window in which trend $x$ appears. $P(y)$ represents the probability that abnormal trend feature $y$ appears in the retrospective electronic medical record data sample, that is, the proportion of patients or time segments in the selected sample or time window in which trend $y$ appears. $P(x, y)$ represents the probability that abnormal trend features $x$ and $y$ appear simultaneously in the same patient or within the same time window. $\log$ denotes the logarithm, used to convert probability ratios into a measure of information content.

[0033] S23: During subgraph construction, the abnormal change trend feature nodes are traversed in a preset order, and for the current node the association edge with the largest NPMI value is added;

[0034] S24: When adding related edges, avoid introducing existing duplicate edges to ensure the sparsity and discriminativeness of the subgraph structure;

[0035] S25: The constructed subgraph of abnormal change trends retains only the topological relationships between features, without explicitly including edge weight values, in order to highlight the combination pattern of abnormal change trends itself.
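Steps S22 to S25 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each patient is represented as a set of trend labels, probabilities are estimated as patient-level proportions, and the duplicate-edge rule of S24 is interpreted as "an edge that already exists is not added a second time". The function names `npmi`, `npmi_table`, and `build_subgraph` and the example labels are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, FrozenSet, List, Sequence, Set

Edge = FrozenSet[str]


def npmi(p_x: float, p_y: float, p_xy: float) -> float:
    """Normalized pointwise mutual information of two events, in [-1, 1]."""
    if p_xy <= 0.0:
        return -1.0  # the two trends never co-occur
    if p_xy >= 1.0:
        return 1.0   # assumed convention when both trends always occur
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)


def npmi_table(records: Sequence[Set[str]]) -> Dict[Edge, float]:
    """Estimate NPMI for every trend pair from per-patient sets of trend labels."""
    n = len(records)
    trends = sorted(set().union(*records))
    count = {t: sum(t in r for r in records) for t in trends}
    table: Dict[Edge, float] = {}
    for x, y in combinations(trends, 2):
        joint = sum(x in r and y in r for r in records)
        table[frozenset((x, y))] = npmi(count[x] / n, count[y] / n, joint / n)
    return table


def build_subgraph(order: List[str], table: Dict[Edge, float]) -> Set[Edge]:
    """Visit nodes in the preset order and keep each node's strongest edge.

    Edges are stored without weights (S25); using a set means an edge that
    is already present is not duplicated (S24, as interpreted here).
    """
    edges: Set[Edge] = set()
    for node in order:
        scored = [(table[frozenset((node, other))], other)
                  for other in order
                  if other != node and frozenset((node, other)) in table]
        if scored:
            _, best = max(scored)
            edges.add(frozenset((node, best)))
    return edges


# Hand-written NPMI values, for illustration only.
demo_table = {
    frozenset(("hr_up", "sbp_down")): 0.8,
    frozenset(("rr_up", "lactate_up")): 0.7,
    frozenset(("hr_up", "rr_up")): 0.3,
    frozenset(("hr_up", "lactate_up")): 0.2,
    frozenset(("sbp_down", "rr_up")): 0.1,
    frozenset(("sbp_down", "lactate_up")): 0.4,
}
demo_edges = build_subgraph(["hr_up", "sbp_down", "rr_up", "lactate_up"], demo_table)
```

With the hand-written values above, the traversal keeps two edges, `hr_up`-`sbp_down` and `rr_up`-`lactate_up`, because the strongest partner of the second and fourth nodes is an edge already in the graph.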

[0036] Furthermore, step S3 specifically includes:

[0037] S31: The patient's time-series electronic medical record data and the corresponding abnormal change trend subgraph are provided as joint inputs to the large language model, which generates reasoning chains;

[0038] S32: The model generation process is constrained according to a preset structured format, so that the reasoning chain explicitly describes the clinical pattern composed of multiple abnormal change trends and their strong correlation combinations.

[0039] S33: The model predicts patient outcomes based on the output of the generated reasoning chain and compares the predicted outcomes with the actual outcomes;

[0040] S34: Based on the consistency between the predicted outcome and the actual outcome, evaluate and screen the quality of the generated reasoning chain;

[0041] S35: Using the above method, only the reasoning chains that support a correct judgment of mortality risk are retained as highly credible clinical pattern reasoning samples.
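The screening in steps S33 to S35 amounts to keeping only the chains whose predicted outcome agrees with the recorded outcome. A minimal Python sketch follows; the `ReasoningChain` fields stand in for the "preset structured format", whose exact layout the text does not specify, so every field name and the outcome strings are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ReasoningChain:
    """Hypothetical structured record of one generated reasoning chain."""
    patient_id: str
    trend_combination: Tuple[str, ...]  # trends that form the clinical pattern
    rationale: str                      # free-text reasoning produced by the model
    predicted_outcome: str              # "death" or "survival"


def filter_chains(chains: List[ReasoningChain],
                  actual_outcomes: Dict[str, str]) -> List[ReasoningChain]:
    """Keep chains whose predicted outcome matches the real outcome."""
    return [c for c in chains
            if actual_outcomes.get(c.patient_id) == c.predicted_outcome]


chains = [
    ReasoningChain("p1", ("hr_up", "sbp_down"), "circulatory deterioration", "death"),
    ReasoningChain("p1", ("hr_up",), "isolated tachycardia", "survival"),
    ReasoningChain("p2", ("lactate_down",), "metabolic recovery", "survival"),
]
kept = filter_chains(chains, {"p1": "death", "p2": "survival"})
print([c.rationale for c in kept])  # ['circulatory deterioration', 'metabolic recovery']
```

The retained chains then serve as the supervised fine-tuning samples of step S4.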

[0042] Furthermore, step S4 specifically includes:

[0043] S41: Based on the high-quality reasoning thought chain samples obtained in step S3, supervised fine-tuning training is performed on the large language model to obtain the initial optimized model. The supervised fine-tuning is used to enable the model to learn the correspondence between abnormal change trend combinations and patient outcomes, as well as the ability to follow instructions, and to form the basic model for the subsequent reinforcement learning stage.

[0044] S42: After supervised fine-tuning, a reinforcement learning mechanism based on Group Relative Policy Optimization (GRPO) is introduced to further train the model. For the time-series electronic medical record input $x$ of the same patient, the current model to be optimized generates $G$ different candidate reasoning chains $\{o_1, \dots, o_G\}$, and these $G$ reasoning chains are treated as candidate outputs within the same group;

[0045] S43: During reinforcement learning, for the same electronic medical record input $x$, a patient outcome prediction $\hat{y}_i$ is generated from the $i$-th reasoning chain $o_i$, and the prediction $\hat{y}_i$ is compared with the real patient outcome $y$ to calculate the corresponding base reward value $r_i$, where $r_i$ takes the higher value (for example, $r_i = 1$) when the predicted result matches the actual outcome and the lower value (for example, $r_i = 0$) otherwise;

[0046] S44: Based on the base reward values $r_1, \dots, r_G$ of the $G$ reasoning chains in the same group, the within-group relative advantage term is constructed, and the model parameters are updated using the GRPO objective function. The objective function for optimization is defined as follows:

[0047] $$J(\theta) = \mathbb{E}_{x,\,\{o_i\}_{i=1}^{G}} \left[ \frac{1}{G} \sum_{i=1}^{G} \min\left( \frac{\pi_\theta(o_i \mid x)}{\pi_{\mathrm{ref}}(o_i \mid x)}\, A_i,\ \operatorname{clip}\left( \frac{\pi_\theta(o_i \mid x)}{\pi_{\mathrm{ref}}(o_i \mid x)},\ 1-\varepsilon,\ 1+\varepsilon \right) A_i \right) \right]$$

[0048] where $\theta$ represents the parameters of the large language model to be optimized; $\mathbb{E}$ represents the mathematical expectation; $x$ represents the time-series electronic medical record input for a single ICU patient; $o_i$ represents the $i$-th reasoning chain generated for the given electronic medical record input $x$; $G$ represents the number of reasoning chains generated for the same input; $\pi_\theta(o_i \mid x)$ represents the policy probability of generating reasoning chain $o_i$ given input $x$ under the current model parameters $\theta$; $\pi_{\mathrm{ref}}(o_i \mid x)$ represents the policy probability with which the reference model obtained in the supervised fine-tuning phase generates the reasoning chain under the same input, and is used to constrain the magnitude of the model update; $\varepsilon$ represents the policy clipping threshold, used to limit how far the current policy may deviate from the reference policy; $\operatorname{clip}$ restricts the probability ratio to the interval $[1-\varepsilon,\ 1+\varepsilon]$; and $A_i$ represents the within-group relative advantage of the $i$-th reasoning chain, defined as follows:

[0049] $$A_i = \frac{r_i - \operatorname{mean}(\{r_1, \dots, r_G\})}{\operatorname{std}(\{r_1, \dots, r_G\})}$$

[0050] where $r_i$ represents the base reward value of the $i$-th reasoning chain.
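The group-relative advantage and the clipped objective of steps S43 and S44 can be illustrated numerically in plain Python. This is a sketch under assumptions: the advantage is taken as the reward minus the group mean, divided by the group standard deviation (the usual GRPO form); a small `eps` is added to avoid division by zero; rewards of 1 and 0 and the clipping threshold 0.2 are example values; and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence


def group_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Standardize base rewards within one group of candidate reasoning chains."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def clipped_objective(logp_new: Sequence[float],
                      logp_ref: Sequence[float],
                      advantages: Sequence[float],
                      clip_eps: float = 0.2) -> float:
    """Average clipped surrogate over the group, using sequence log-probabilities."""
    total = 0.0
    for lp_new, lp_ref, adv in zip(logp_new, logp_ref, advantages):
        ratio = math.exp(lp_new - lp_ref)                        # pi_theta / pi_ref
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Four candidate chains for one patient; three predict the outcome correctly.
rewards = [1.0, 0.0, 1.0, 1.0]
adv = group_advantages(rewards)
print([round(a, 3) for a in adv])  # [0.577, -1.732, 0.577, 0.577]
```

Chains that predict correctly receive a positive advantage and the incorrect chain a negative one, so the update raises the probability of the former relative to the latter. When every chain in a group earns the same reward, all advantages are zero and that group contributes no learning signal.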

[0051] After adopting the above strategy, the positive effects of the present invention are:

[0052] (1) From single feature to trend combination pattern modeling: It can automatically summarize clinical experience from retrospective time-series electronic medical record data, and upgrade the traditional importance analysis based on single feature to clinical pattern modeling based on abnormal change trend combination, thus being closer to the evidence-based cognition of clinicians.

[0053] (2) Explicit reasoning guidance based on correlation subgraphs: By constructing strong correlation subgraphs between abnormal change trends, the model is guided to explicitly utilize trend combination relationships during the reasoning process, avoiding semantic confusion between different clinical patterns, thereby improving the stability and interpretability of risk prediction;

[0054] (3) Structured reasoning and impact mechanism revelation: By combining electronic medical records and subgraphs as inputs and distilling high-quality reasoning thought chains, the model can clearly reveal the impact mechanism of key clinical change trends on mortality risk through a structured reasoning process.

[0055] (4) Two-stage optimization and new pattern discovery: In the model training stage, the optimization strategy of "supervised fine-tuning first and then reinforcement learning" is adopted, and the reward design based on Group Relative Policy Optimization (GRPO) is used to encourage the model to discover new key clinical patterns while ensuring the accuracy of prediction.

Description of the Drawings

[0056] Figure 1 is a flowchart illustrating the method for mining key clinical change trend combination patterns in ICU patients according to the present invention.

Detailed Implementation

[0057] To enable those skilled in the art to better understand the present invention and to make the above-mentioned objectives, technical solutions and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to embodiments.

[0058] A patient admitted to the ICU for sepsis was selected as an example, and the time-series electronic medical record data during the patient's hospitalization was analyzed. The electronic medical record data included continuously monitored clinical indicators such as heart rate, respiratory rate, systolic blood pressure, and blood lactate level.

[0059] S1: A retrospective analysis was conducted on the changes in the patient's clinical indicators during the ICU stay. Given the known adverse outcome, the model was guided to extract abnormal trends from the time-series electronic medical records. Specifically, the model identified trends of continuously increasing heart rate, continuously increasing respiratory rate, gradually decreasing systolic blood pressure, and continuously increasing blood lactate levels within the observation time window, and labeled these trends as risk trends.

[0060] S2: Based on the co-occurrence statistics of the above abnormal trends in the retrospective samples, the Normalized Pointwise Mutual Information (NPMI) values between the abnormal trends were calculated. The results showed high NPMI values between increased heart rate and decreased systolic blood pressure, and between increased respiratory rate and increased blood lactate. The abnormal trend nodes were then traversed in a preset order and, for each node, the association edge with the strongest correlation strength was added, thereby constructing an abnormal trend subgraph with "increased heart rate - increased respiratory rate - decreased systolic blood pressure - increased blood lactate" as the core.
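As a worked illustration of the NPMI calculation in this step, the following uses hypothetical counts that are not taken from the patent: a retrospective sample of $N = 1000$ patients in which increased heart rate ($x$) appears in 200 patients, decreased systolic blood pressure ($y$) in 250 patients, and both together in 150 patients. The standard NPMI definition with natural logarithms is assumed.

```latex
\begin{aligned}
P(x) &= \tfrac{200}{1000} = 0.20, \qquad
P(y) = \tfrac{250}{1000} = 0.25, \qquad
P(x, y) = \tfrac{150}{1000} = 0.15 \\[4pt]
\mathrm{NPMI}(x, y)
  &= \frac{\ln \dfrac{P(x, y)}{P(x)\,P(y)}}{-\ln P(x, y)}
   = \frac{\ln \dfrac{0.15}{0.05}}{-\ln 0.15}
   = \frac{\ln 3}{1.897}
   \approx \frac{1.099}{1.897}
   \approx 0.58
\end{aligned}
```

If the two trends were independent, $P(x, y)$ would equal $0.20 \times 0.25 = 0.05$ and the NPMI would be 0, so a value near 0.58 indicates a clearly positive association. The result does not depend on the base of the logarithm, because the base cancels between numerator and denominator.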

[0061] S3: The patient's time-series electronic medical record data and the constructed abnormal trend subgraph are used as joint inputs to the large language model, guiding it to generate a reasoning chain in a preset structured format. The generated reasoning chain explicitly describes the simultaneous occurrence and strong correlation of the above abnormal trends, and further infers that this combination of trends reflects the continuous deterioration of the patient's circulatory system and metabolic state.

[0062] Subsequently, the model outputs patient outcome predictions based on the generated reasoning chain and compares them with actual patient outcomes. Because the predictions match the actual outcomes, the reasoning chain is deemed high-quality and retained as a clinical pattern reasoning sample.

[0063] S4: Based on this high-quality reasoning chain sample, supervised fine-tuning is performed on the large language model so that the model learns the correspondence between the above abnormal trend combination and high mortality risk. After supervised fine-tuning is complete, a reinforcement learning mechanism based on Group Relative Policy Optimization (GRPO) is introduced to generate multiple different candidate reasoning chains for the same patient's electronic medical record input. Reward signals are constructed from the correctness of the predicted outcome corresponding to each reasoning chain and from the novelty and rationality of the key clinical trend combination it mines, thereby further enhancing the model's ability to model the key clinical trend combination pattern "increased heart rate + increased respiratory rate + decreased systolic blood pressure + increased blood lactate".
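The embodiment states that the reward reflects outcome correctness together with the novelty and rationality of the mined trend combination, but it does not say how those terms are measured or weighted. The Python sketch below is therefore one hypothetical instantiation: novelty is taken to mean "the combination is not in a set of already known patterns", rationality to mean "every trend in the combination is linked to another trend in it by an edge of the association subgraph", and the weights 0.2 and 0.3 are arbitrary example values.

```python
from typing import FrozenSet, Iterable, Set


def composite_reward(predicted: str,
                     actual: str,
                     pattern: Iterable[str],
                     known_patterns: Set[FrozenSet[str]],
                     subgraph_edges: Set[FrozenSet[str]],
                     w_novelty: float = 0.2,
                     w_rationality: float = 0.3) -> float:
    """Base reward for a correct outcome plus assumed pattern-quality bonuses."""
    if predicted != actual:
        return 0.0  # pattern bonuses are granted only for correct predictions
    trends = list(dict.fromkeys(pattern))  # de-duplicate, keep order
    novelty = 0.0 if frozenset(trends) in known_patterns else 1.0
    supported = len(trends) > 1 and all(
        any(frozenset((t, u)) in subgraph_edges for u in trends if u != t)
        for t in trends
    )
    rationality = 1.0 if supported else 0.0
    return 1.0 + w_novelty * novelty + w_rationality * rationality


edges = {frozenset(("hr_up", "sbp_down")), frozenset(("rr_up", "lactate_up"))}
pattern = ["hr_up", "sbp_down", "rr_up", "lactate_up"]
reward = composite_reward("death", "death", pattern, set(), edges)
print(round(reward, 2))  # 1.5
```

Gating the bonuses on a correct prediction is a design choice made here so that a chain cannot earn reward for an unusual pattern while mispredicting the outcome; other weightings are equally compatible with the text.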

[0064] As can be seen from the above embodiments, the method of the present invention can elevate the change of a single clinical indicator into a combination pattern of clinical change trends composed of multiple strongly correlated abnormal change trends, and explicitly express the impact of this pattern on the patient's mortality risk in the form of a structured reasoning chain.

[0065] Specific embodiments of the present invention have been described above. However, those skilled in the art will understand that various modifications and substitutions can be made to the specific embodiments of the present invention without departing from the spirit and scope of the invention. All such modifications and substitutions fall within the scope defined by the claims of the present invention.

Claims

1. A method for mining key clinical change trend combination patterns in ICU patients, characterized in that it includes the following steps:

S1: Based on retrospective time-series electronic medical record data of critically ill patients and corresponding real clinical outcomes, the changes in multidimensional clinical indicators of patients during hospitalization are analyzed, and the trends of key factors related to the patient's mortality risk are extracted. The trends of these key factors are used to characterize the directional changes of clinical indicators in the time dimension, and the trends of key factors affecting mortality risk are summarized accordingly.

S2: Construct a feature association subgraph based on the statistical correlation strength between the key factor trends described in step S1. The feature association subgraph characterizes the strong correlations between key factor trends to assist in mining clinical change trend combination patterns. Constructing the feature association subgraph includes: calculating the normalized pointwise mutual information (NPMI) between any two key factor trends, traversing the abnormal change trend nodes in a preset order, and, for each node visited in turn, adding the association edge with the strongest correlation strength.

S3: The patient's time-series electronic medical record data and the feature association subgraph described in step S2 are used as joint inputs to guide the model to generate, in a preset structured format, a reasoning chain based on the combination of key factor trends, and the generated reasoning chains are filtered for quality. Specifically, when the model is provided with the real patient outcome, a reasoning chain is filtered according to whether the outcome predicted from that chain is consistent with the real outcome; when the model is not provided with the real patient outcome, the chain is filtered according to whether the outcome predicted by the model is correct.

S4: Based on the filtered high-quality reasoning chain samples, supervised fine-tuning is performed on the model. After supervised fine-tuning is complete, reinforcement learning is introduced for further optimization. The reinforcement learning uses the analyzed key clinical change trend combinations as the reward signal, thereby improving the model's ability to model key clinical change trend combination patterns.

2. The method for mining key clinical change trend combination patterns in ICU patients according to claim 1, characterized in that step S1 specifically includes:

S11: Based on retrospective time-series electronic medical record data of critically ill patients and corresponding real clinical outcomes, analyze the changes in multidimensional clinical indicators of patients during their stay in the ICU;

S12: Given the known patient outcome (death or survival), the model is guided to automatically extract abnormal trends related to mortality risk from the time-series electronic medical records as candidate risk factors or protective factors;

S13: The abnormal trend describes the directional change of a patient's single clinical indicator over time, including continuous increase, continuous decrease, abrupt change, or transition from a stable state to an unstable state. Abnormal trends comprise risk trends, which indicate an increased risk of death, and protective trends, which indicate a reduced risk of death. Risk trends include: a deteriorating trend, in which a clinical indicator continues to change in an adverse direction and shows no clear sign of recovery within the observation period; a persistent abnormal trend, in which a clinical indicator remains within the abnormal range for a long time and does not return to the normal range; a fluctuating trend, in which a clinical indicator moves frequently or substantially between the normal and abnormal ranges, indicating system instability; and a sudden abnormal trend, in which a clinical indicator deviates sharply from the normal range within a short period, exceeding its previous range of variation. Protective trends include: an improving trend, in which a clinical indicator that was originally abnormal continues to recover toward the normal range; and a normal trend, in which a clinical indicator remains within the normal range for a long period without significant abnormal change;

S14: The extracted trend-based risk factors and protective factors are used to characterize the directional semantic association between the trends of a patient's clinical indicators and the outcome.

3. The method for mining key clinical change trend combination patterns in ICU patients according to claim 1, characterized in that, Step S2 specifically includes: S21: Based on the abnormal change trend clinical features extracted in step S1, construct an abnormal change trend sub-graph to characterize the strong correlation between features; S22: For any two abnormal trend features, calculate their normalized point mutual information (NPMI) in the retrospective sample to measure the strength of the association between them. The calculation formula is as follows: Here, 𝑥 and 𝑦 represent two abnormal trend features; NPMI (Normalized Pointwise Mutual Information) is an indicator used to measure the strength of statistical association between two events or features. Its value ranges from [−1,1], where a larger value indicates a stronger positive correlation between the two events, a value close to 0 indicates a weaker association, and a negative value indicates a negative correlation. It is used to measure the association strength of two abnormal trend features appearing simultaneously in retrospective electronic medical record data. It represents the probability of an abnormal trend characteristic 𝑥 appearing in a retrospective electronic medical record data sample, that is, the proportion of patients or time segments in which trend 𝑥 appears in the selected sample or time window. Indicating abnormal change trend characteristics The probability of occurrence in a retrospective sample of electronic medical record data, i.e., the trend within the selected sample or time window. The proportion of patients or time segments; Indicating abnormal change trend characteristics and The probability of simultaneous occurrence in the same patient or within the same time window; This represents a logarithmic operation, used to convert probability ratios into a measure of information content. 
S23: During subgraph construction, the abnormal change trend feature nodes are traversed in a preset order, and for the current node the associated edge with the largest NPMI value is added;
S24: When adding associated edges, edges that already exist are not added again, so as to ensure the sparsity and discriminativeness of the subgraph structure;
S25: The constructed abnormal change trend subgraph retains only the topological relationships between features and does not explicitly include edge weight values, in order to highlight the combination pattern of abnormal change trends itself.
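
Steps S22 to S25 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the function names `npmi` and `build_trend_subgraph`, the representation of each patient or time window as a set of trend-feature names, and the tie-breaking rule (the first candidate in the preset order wins) are all hypothetical. S24 is read here as "an edge that is already present is simply not added a second time".

```python
import math


def npmi(p_x, p_y, p_xy):
    """Normalized pointwise mutual information from marginal and joint probabilities."""
    if p_xy <= 0.0:
        return -1.0  # limiting value when the two trends never co-occur
    if p_xy >= 1.0:
        return 1.0   # assumed convention: both trends are always present
    return math.log(p_xy / (p_x * p_y)) / (-math.log(p_xy))


def build_trend_subgraph(samples, node_order):
    """Build an unweighted subgraph of abnormal-trend features.

    samples:    list of sets; each set holds the abnormal trend features
                observed in one patient or one time window (assumed layout).
    node_order: preset traversal order of the feature nodes (S23).
    Returns a set of undirected edges, each stored as a frozenset pair.
    """
    n = len(samples)
    prob = {f: sum(f in s for s in samples) / n for f in node_order}
    edges = set()
    for x in node_order:
        if prob[x] == 0.0:
            continue  # feature never observed, nothing to associate
        best, best_score = None, -math.inf
        for y in node_order:
            if y == x or prob[y] == 0.0:
                continue
            p_xy = sum((x in s) and (y in s) for s in samples) / n
            score = npmi(prob[x], prob[y], p_xy)
            if score > best_score:
                best, best_score = y, score
        if best is not None:
            # S24: a set ignores an edge that is already present.
            # S25: only the topology is stored, not the NPMI weight.
            edges.add(frozenset((x, best)))
    return edges


if __name__ == "__main__":
    windows = [
        {"lactate_up", "map_down"},
        {"lactate_up", "map_down"},
        {"hr_up"},
        {"hr_up", "lactate_up"},
    ]
    order = ["lactate_up", "map_down", "hr_up"]
    for edge in build_trend_subgraph(windows, order):
        print(sorted(edge))
```

Because the claim specifies no minimum NPMI value, this sketch connects every node to its best partner even when that NPMI is negative; a practical variant might add a threshold, which would be a design choice beyond what the claim states.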

4. The method for mining key clinical change trend combination patterns in ICU patients according to claim 1, characterized in that step S3 specifically includes:
S31: The patient's time-series electronic medical record data and the corresponding abnormal change trend subgraph are provided to the large model as joint inputs to generate reasoning chains;
S32: The model generation process is constrained according to a preset structured format, so that the reasoning chain explicitly describes the clinical pattern composed of multiple abnormal change trends and their strongly correlated combinations;
S33: The model predicts the patient outcome based on the generated reasoning chain, and the predicted outcome is compared with the actual outcome;
S34: Based on the consistency between the predicted outcome and the actual outcome, the quality of the generated reasoning chains is evaluated and the chains are screened;
S35: Through the above screening, only the reasoning chains that support a correct judgment of mortality risk are retained as highly credible clinical pattern reasoning samples.
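
The screening in steps S33 to S35 amounts to a consistency filter, which can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `ReasoningSample` record, its field names, and the boolean encoding of the outcome are hypothetical, and the reasoning chains are assumed to have been generated already by the large model.

```python
from dataclasses import dataclass


@dataclass
class ReasoningSample:
    patient_id: str        # hypothetical identifier
    chain: str             # structured reasoning chain text produced by the model
    predicted_death: bool  # outcome predicted from the chain (S33)
    actual_death: bool     # real outcome from the retrospective record


def select_credible_chains(samples):
    """Keep only chains whose predicted outcome matches the actual outcome (S34, S35)."""
    return [s for s in samples if s.predicted_death == s.actual_death]


if __name__ == "__main__":
    candidates = [
        ReasoningSample("p1", "chain A", True, True),
        ReasoningSample("p1", "chain B", False, True),
        ReasoningSample("p2", "chain C", False, False),
    ]
    kept = select_credible_chains(candidates)
    print([s.chain for s in kept])
```

The retained samples are what step S4 would consume as supervised fine-tuning data.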

5. The method for mining key clinical change trend combination patterns in ICU patients according to claim 1, characterized in that step S4 specifically includes:
S41: Based on the high-quality reasoning chain samples obtained in step S3, supervised fine-tuning is performed on the large language model to obtain an initial optimized model. Supervised fine-tuning enables the model to learn the correspondence between abnormal change trend combinations and patient outcomes, as well as the ability to follow instructions, and yields the base model for the subsequent reinforcement learning stage;
S42: After supervised fine-tuning, a reinforcement learning mechanism based on Group Relative Policy Optimization (GRPO) is introduced to further train the model: for the time-series electronic medical record input $q$ of the same patient, the current model to be optimized generates $G$ different candidate reasoning chains $\{o_i\}_{i=1}^{G}$, and each reasoning chain is regarded as a candidate output within the same group;
S43: During reinforcement learning, for the same electronic medical record input $q$, a patient outcome prediction $\hat{y}_i$ is generated based on the $i$-th reasoning chain $o_i$, and $\hat{y}_i$ is compared with the real patient outcome $y^{*}$ to calculate the corresponding base reward value $R_i$, where $R_i$ takes one value when the predicted result matches the actual outcome and a different value otherwise;
S44: Based on the base reward values $\{R_i\}_{i=1}^{G}$ of the $G$ reasoning chains in the same group, the within-group relative advantage term is constructed, and the GRPO policy update objective is used to update the model parameters. The optimization objective is defined as follows:

$$J(\theta) = \mathbb{E}_{q,\ \{o_i\}_{i=1}^{G} \sim \pi_\theta(\cdot \mid q)} \left[ \frac{1}{G} \sum_{i=1}^{G} \min\!\left( \frac{\pi_\theta(o_i \mid q)}{\pi_{\mathrm{ref}}(o_i \mid q)}\, A_i,\ \operatorname{clip}\!\left( \frac{\pi_\theta(o_i \mid q)}{\pi_{\mathrm{ref}}(o_i \mid q)},\ 1-\varepsilon,\ 1+\varepsilon \right) A_i \right) \right]$$

where $\theta$ represents the parameters of the large language model to be optimized; $\mathbb{E}$ represents the mathematical expectation; $q$ represents the time-series electronic medical record input of a single ICU patient; $o_i$ represents the $i$-th reasoning chain generated for the given electronic medical record input $q$; $G$ represents the number of reasoning chains generated for the same input; $\pi_\theta(o_i \mid q)$ represents the policy probability of generating reasoning chain $o_i$ given input $q$ under the current model parameters $\theta$; $\pi_{\mathrm{ref}}(o_i \mid q)$ represents the policy probability with which the reference model obtained in the supervised fine-tuning stage generates reasoning chain $o_i$ under the same input, and is used to constrain the magnitude of the model update; $\varepsilon$ represents the policy clipping threshold, used to limit the range of variation of the current policy relative to the reference policy; $\operatorname{clip}$ indicates that the probability ratio is restricted to the interval $[1-\varepsilon,\ 1+\varepsilon]$; $A_i$ represents the within-group relative advantage corresponding to the $i$-th reasoning chain, defined as follows:

$$A_i = \frac{R_i - \operatorname{mean}\!\left(\{R_j\}_{j=1}^{G}\right)}{\operatorname{std}\!\left(\{R_j\}_{j=1}^{G}\right)}$$

where $R_i$ represents the base reward value of the $i$-th reasoning chain.
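
The reward, within-group advantage and clipped objective of steps S43 and S44 can be illustrated numerically with the sketch below. It is a sketch under assumptions rather than the claimed training procedure: the reward values 1.0 and 0.0, the standardization of rewards by the group mean and standard deviation (the commonly published GRPO form), the small constant added to the denominator, and all function names are assumptions for illustration. It uses plain Python on per-chain log-probabilities and performs no gradient update.

```python
import math
from statistics import mean, pstdev


def base_rewards(predictions, actual, match=1.0, mismatch=0.0):
    """Base reward per reasoning chain; the values 1.0 and 0.0 are assumed."""
    return [match if p == actual else mismatch for p in predictions]


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group (assumed mean/std normalization)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every chain receives the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


def clipped_objective(logp_current, logp_ref, advantages, clip_eps=0.2):
    """Group average of min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    total = 0.0
    for lp, lr, adv in zip(logp_current, logp_ref, advantages):
        ratio = math.exp(lp - lr)  # current policy / reference policy
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Four candidate chains for one patient; the real outcome is death (True).
    rewards = base_rewards([True, False, True, True], actual=True)
    advantages = group_relative_advantages(rewards)
    # Hypothetical sequence log-probabilities under the two policies.
    logp_cur = [-12.0, -15.0, -11.5, -13.0]
    logp_ref = [-12.2, -14.6, -11.5, -13.1]
    print(rewards)
    print(advantages)
    print(clipped_objective(logp_cur, logp_ref, advantages))
```

When all chains in a group receive the same reward, every advantage is zero and the group contributes nothing to the objective, which is why groups with mixed correct and incorrect predictions carry the learning signal.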