Disease prediction method and apparatus based on implicit knowledge augmentation, and device and medium

By constructing a disease prediction model based on implicit knowledge enhancement, and utilizing multi-dimensional patient data and similar patients and temporal heterogeneous graphs from medical records, the interpretability and generalization issues of deep learning models in disease prediction are solved, achieving more accurate disease prediction and clearer interpretation.

WO2026129367A1 · PCT designated stage · Publication Date: 2026-06-25 · SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI
Filing Date
2024-12-21
Publication Date
2026-06-25

Abstract

A disease prediction method and apparatus based on implicit knowledge augmentation, and a device and a medium. The method comprises: on the basis of historical visit record representations and static information representations, calculating comprehensive representations of patients, and using the comprehensive representations of the patients to perform training, so as to obtain a backbone network; performing calculation by means of the backbone network to obtain a patient representation matrix, and using cosine similarity to select the representations of K similar patients having the highest similarity to a target patient; on the basis of disease types and relationship types, constructing a temporal visit heterogeneous graph, and using a heterogeneous graph encoding algorithm to encode the temporal visit heterogeneous graph, so as to obtain temporal visit heterogeneous graph representations; and using the comprehensive representations of the patients, the representations of similar patients and the temporal visit heterogeneous graph representations to construct a comprehensive disease representation, training a disease prediction model on the basis of the comprehensive disease representation, and using the trained disease prediction model to output a disease prediction result. The group intelligence of similar patients and the potential complex relationships between diseases are used to enhance patient modeling, thereby improving the accuracy of disease prediction.

Description

Disease prediction method, apparatus, device, and medium based on implicit knowledge enhancement

Technical Field

[0001] This application belongs to the field of medical and health technology, and specifically relates to a disease prediction method, apparatus, device, and medium based on implicit knowledge enhancement.

Background Technology

[0002] In the field of healthcare, using deep learning models for disease prediction has become an important research direction. Compared with traditional machine learning methods, deep learning models can learn complex features and correlations from data, thus making disease predictions more effective. The interpretability of deep learning models is directly related to the accuracy of disease prediction results and the transparency of medical decisions.

[0003] Currently, the application of deep learning models in disease prediction mainly includes traditional disease prediction models, interpretable disease prediction models, knowledge-enhanced disease prediction models, and disease prediction models that integrate similar patients. Traditional disease prediction models typically employ unsupervised deep learning methods, predicting a patient's future health status by analyzing EHR (electronic health record) data. Interpretable disease prediction models usually process EHR data through a two-level neural attention mechanism: one level focuses on the importance of past medical visits, and the other focuses on key diagnoses within each visit. They then model the medical records using an RNN (Recurrent Neural Network), providing an interpretation of the prediction results while maintaining accuracy. Alternatively, they capture contextual information from medical codes through attention mechanisms and RNN structures, use time-aware disease progression functions to simulate how the impact of different diseases on a patient's future medical visits changes over time, and provide interpretation by analyzing the influence of different medical codes on the risk of future medical visits. Knowledge-enhanced disease prediction models treat the disease hierarchy as internal knowledge, enriching disease information with the parent and child nodes in the hierarchy and thereby modeling patient representations more accurately. Disease prediction models that integrate similar patients use statistical or numerical methods to measure similarity between patients. Specifically, they standardize all patient characteristics and then use a distance-based metric to assess similarity.

[0004] In summary, although deep learning technology has made significant progress in the field of healthcare, it still faces the following shortcomings:

[0005] 1) Many current deep learning models are used as black boxes in disease prediction, lacking interpretability. While some models attempt to provide explanations through attention mechanisms, these are not comprehensive enough. Furthermore, most existing models only use patients' discrete medical records, failing to effectively integrate external medical knowledge. Even knowledge-enhancing methods often focus only on the hierarchical structure or direct relationships between diseases, failing to fully consider the more complex and subtle interactions between diseases, thus limiting the model's comprehensive understanding and accurate prediction of patients' health conditions.

[0006] 2) Existing models generally neglect multi-source information such as patient static information, primary diagnosis information, similar patient information, and medical record graphs. They also often treat diseases in medical records as an unordered set, ignoring the complex interactions between diseases. In patient similarity calculations, global information from medical records may be lost, affecting the model's ability to capture the overall patient condition and leading to insufficient patient state modeling, thus impacting the accuracy of disease prediction.

[0007] 3) Existing models have generalization problems in disease prediction: they are often limited to specific diseases and are difficult to extend to multiple-disease situations. This limits the model's applicability in complex medical scenarios and restricts its widespread application and long-term effectiveness in actual clinical environments.

Summary of the Invention

[0008] This application provides a disease prediction method, apparatus, device, and medium based on implicit knowledge enhancement, which aims to at least partially solve one of the aforementioned technical problems in the prior art.

[0009] To address the above problems, this application provides the following technical solution:

[0010] A disease prediction method based on implicit knowledge enhancement includes:

[0011] The system acquires the patient's historical medical records and static information representations, calculates the patient's comprehensive representation based on the historical medical records and static information representations, and trains the backbone network using the patient's comprehensive representation.

[0012] The historical medical record representations of all patients are input into the backbone network, the patient representation matrix is calculated through the backbone network, and the representations of the K patients most similar to the target patient are selected using cosine similarity.

[0013] Disease types and relationships between different disease types are extracted from the historical medical records. A heterogeneous medical record graph is constructed based on the disease types and relationships. The heterogeneous medical record graph is then encoded using a heterogeneous graph encoding algorithm to obtain a representation of the patient's medical record graph.

[0014] A comprehensive disease representation is constructed using the patient comprehensive representation, similar patient representation, and time-series heterogeneous representation of medical visits. A disease prediction model is trained based on the comprehensive disease representation, and the disease prediction result is output using the trained disease prediction model.

[0015] The technical solution adopted in this application embodiment further includes: obtaining the patient's historical medical record representation and static information representation, calculating the patient's comprehensive representation based on the historical medical record and static information representation, and training the backbone network using the patient's comprehensive representation, specifically as follows:

[0016] The static information representation is encoded, and the encoded static information representation is fused with the historical medical record representation to obtain the backbone network;

[0017] The patient's comprehensive representation is calculated based on the historical medical records and static information representation. The patient's comprehensive representation is used to predict the main diagnosis. The backbone network is then optimized and trained using multi-label cross-entropy as the loss function to obtain a well-trained backbone network.

[0018] The technical solution adopted in this application embodiment also includes: the static information representation includes the patient's sex information $sex^{(i)}$, age information $age^{(i)}$, and race information $race^{(i)}$, and the encoding of the static information representation specifically includes:

[0019] The sex information $sex^{(i)}$ is transformed into a binary vector $x_{sex}$ using one-hot encoding;

[0020] The age information $age^{(i)}$ is mapped to the numerical range of 0 to 1 by normalization and denoted $x_{age}$, so as to reflect the differences between age groups;

[0021] The race information $race^{(i)}$ is converted into a $num_r$-dimensional vector $x_{race}$ using one-hot encoding, where $num_r$ is the number of all races in the dataset;

[0022] The encoded $x_{sex}$, $x_{age}$, and $x_{race}$ are concatenated to obtain the patient's static information representation.

[0023] The technical solution adopted in this application embodiment further includes: inputting the historical medical records of all patients into the backbone network, calculating the patient representation matrix through the backbone network, and using cosine similarity to select the K most similar patient representations with the target patient, specifically:

[0024] The patient representation matrix $M_p$ is traversed, and the distance between any two patients $P_i \in M_p$ and $P_j \in M_p$ is calculated using cosine similarity as follows:

$$\mathrm{sim}(P_i, P_j) = \frac{H_i \cdot H_j}{|H_i|\,|H_j|}$$

[0025] where $H_i$ and $H_j$ are the representations of patients $P_i$ and $P_j$, "·" denotes the vector dot product, and |·| denotes the Euclidean norm of the vector;

[0026] Based on the distance calculation results, the similarities among all patients are sorted, the top-k patients $simP^{(i)}$ most similar to the target patient $P_i$ are selected, and a cross-attention mechanism is applied to obtain the weighted representation of each similar patient $P_j \in simP^{(i)}$;

[0027] The representations of all similar patients are weighted by the attention weights to obtain a comprehensive similar patient representation $H_{simP}$.

[0028] The technical solution adopted in this application embodiment further includes: extracting disease types and relationship types between different disease types from the historical medical record representation, and constructing a medical visit time-series heterogeneous graph based on the disease types and relationship types, specifically as follows:

[0029] The disease types include persistent diseases, diseases caused by a previous disease, and sudden-onset diseases. The relationship types between different disease types include co-occurrence, inheritance, and causal relationships. When two diseases $c_i$ and $c_j$ appear in the same medical record, a co-occurrence relationship is considered to exist between $c_i$ and $c_j$; in this case, an edge is constructed between $c_i$ and $c_j$ and its type is defined as "Co-occurring". When a disease $c_i$ appears in both of two consecutive medical records $V_{t-1}$ and $V_t$, $c_i$ is considered to be inherited from the previous medical visit; in this case, a self-loop is constructed for $c_i$ and the type of this edge is defined as "Inherited". When a disease $c_j$ appears in the current medical record $V_t$, did not appear in the previous medical record $V_{t-1}$, but has a co-occurrence relationship with a disease $c_i$ in $V_{t-1}$, $c_j$ is considered to have a causal relationship with $c_i$; in this case, an edge from $c_i$ to $c_j$ is constructed and its type is defined as "Cause";

[0030] Construct a heterogeneous graph G of the time sequence of medical visits based on the relationship types and edge types of different disease types.

[0031] The technical solution adopted in this application embodiment further includes: encoding the medical visit time-series heterogeneous graph using a heterogeneous graph encoding algorithm to obtain a representation of the patient's medical visit time-series heterogeneous graph, specifically as follows:

[0032] A BERT model is pre-trained on the historical medical records of all patients with only one medical visit. The BERT model is then used to initialize the representation of each node in the heterogeneous medical visit time graph G.

[0033] From the heterogeneous time-series graph G initialized with the representation, we obtain subgraphs of patient visits composed of different edge types. Each subgraph is then encoded using a multi-layer graph attention network. After encoding all subgraphs, we integrate the representations of all relation types to obtain the final representation of node i at layer l.

[0034] Global information is extracted from each patient visit subgraph using global average pooling to obtain subgraph representations representing each patient visit subgraph.

[0035] An LSTM is used to process all subgraph representations, yielding the visit temporal heterogeneous graph representation $H_G$,

[0036] where T is the patient's total number of medical visits and the t-th input to the LSTM is the subgraph representation obtained for the t-th visit.

[0037] The technical solution adopted in this application embodiment further includes: after constructing a comprehensive disease representation using the patient comprehensive representation, similar patient representation, and visit time sequence heterogeneous graph representation, training a disease prediction model based on the comprehensive disease representation, and outputting the disease prediction result using the trained disease prediction model, it further includes:

[0038] The historical visit contributions are extracted from the patient's historical medical records, and the key path and the related paths of similar patients are extracted from the visit temporal heterogeneous graph G. The historical visit contributions, the key path, and the related paths of similar patients together serve as the reasoning basis. The reasoning basis is converted into a prompt according to a preset prompt template, and the prompt guides a large language model (LLM) to generate explanatory text for the disease prediction result.
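The conversion of a reasoning basis into a prompt can be illustrated with a minimal sketch. The template wording, the function name `build_prompt`, and the data layout (visit index with a contribution weight, paths as lists of disease codes) are all hypothetical; the application does not disclose its preset template, and no call to any particular LLM is shown.

```python
# Hypothetical prompt template; the actual preset template is not given in the source.
PROMPT_TEMPLATE = (
    "Predicted disease: {prediction}\n"
    "Historical visit contributions: {contributions}\n"
    "Key path in the visit graph: {key_path}\n"
    "Related paths of similar patients: {similar_paths}\n"
    "Explain the prediction using only the evidence above."
)

def build_prompt(prediction, contributions, key_path, similar_paths):
    """Fill the template with the three parts of the reasoning basis."""
    return PROMPT_TEMPLATE.format(
        prediction=prediction,
        # (visit index, contribution weight) pairs
        contributions=", ".join(f"visit {t}: {w:.2f}" for t, w in contributions),
        # a path is an ordered list of disease codes
        key_path=" -> ".join(key_path),
        similar_paths="; ".join(" -> ".join(p) for p in similar_paths),
    )

prompt = build_prompt(
    prediction="d3",
    contributions=[(1, 0.3), (2, 0.7)],
    key_path=["d1", "d2", "d3"],
    similar_paths=[["d2", "d3"]],
)
print(prompt)
```

The resulting string would then be passed to whichever language model is used to produce the explanatory text.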

[0039] Another technical solution adopted in this application embodiment is: a disease prediction device based on implicit knowledge enhancement, comprising:

[0040] Backbone network construction module: used to acquire the patient's historical medical record representation and static information representation, calculate the patient's comprehensive representation based on the historical medical record and static information representation, and use the patient's comprehensive representation to train and obtain the backbone network;

[0041] Similarity filtering module: It is used to input the historical medical records of all patients into the backbone network, calculate the patient representation matrix through the backbone network, and use cosine similarity to filter out the K similar patient representations with the highest similarity to the target patient.

[0042] Heterogeneous graph construction module: used to extract disease types and relationship types between different disease types from the historical medical record representation, construct a medical visit time-series heterogeneous graph based on the disease types and relationship types, and encode the medical visit time-series heterogeneous graph using a heterogeneous graph encoding algorithm to obtain the patient's medical visit time-series heterogeneous graph representation;

[0043] Disease prediction module: used to construct a comprehensive disease representation using the comprehensive patient representation, similar patient representation, and heterogeneous representation of the time sequence of visits, train a disease prediction model based on the comprehensive disease representation, and output disease prediction results using the trained disease prediction model.

[0044] Another technical solution adopted in this application embodiment is: a device, the device including a processor and a memory coupled to the processor, wherein,

[0045] The memory stores program instructions for implementing the disease prediction method based on implicit knowledge enhancement.

[0046] The processor is used to execute the program instructions stored in the memory to control a disease prediction method based on implicit knowledge enhancement.

[0047] Another technical solution adopted in this application embodiment is: a medium storing processor-executable program instructions, the program instructions being used to execute the disease prediction method based on implicit knowledge enhancement.

[0048] Compared to existing technologies, the beneficial effects of the embodiments of this application are as follows: The disease prediction method, apparatus, device, and medium based on implicit knowledge enhancement proposed in the embodiments of this application provide a disease prediction model based on implicit knowledge enhancement. This model integrates multi-dimensional patient data, mines and utilizes implicit knowledge in medical records such as similar patients and the visit temporal heterogeneous graph, enhances patient modeling by leveraging the collective wisdom of similar patients and the potential complex relationships between diseases, and encodes the visit temporal heterogeneous graph through heterogeneous graph encoding and sequence modeling techniques to more accurately capture the complex relationships between diseases and the dynamic evolution of patients' health status. It also combines the historical visit contributions, the key path, and the related paths of similar patients into a task-specific reasoning basis, and uses a large language model to generate context-relevant and domain-specific explanatory text for the prediction results. This not only improves the accuracy of disease prediction but also provides clear explanations for the prediction results, thereby enhancing the model's performance and its practical application value and credibility in the medical field.

Attached Figure Description

[0049] Figure 1 is a flowchart of the disease prediction method based on implicit knowledge enhancement according to an embodiment of this application;

[0050] Figure 2 is a schematic diagram of the backbone network and its pre-training process according to an embodiment of this application;

[0051] Figure 3 is a schematic diagram of the structure of the disease prediction device based on implicit knowledge enhancement according to an embodiment of this application;

[0052] Figure 4 is a schematic diagram of the device structure according to an embodiment of this application;

[0053] Figure 5 is a schematic diagram of the structure of the medium according to an embodiment of this application.

Detailed Implementation

[0054] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of the embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.

[0055] The terms "first," "second," and "third" in this application are for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of that feature. In the description of this application, "multiple" means at least two, such as two, three, etc., unless otherwise explicitly specified. All directional indications (such as up, down, left, right, front, back, etc.) in the embodiments of this application are only used to explain the relative positional relationships and movements between components in a specific orientation (as shown in the figures). If the specific orientation changes, the directional indications also change accordingly. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units not listed, or may optionally include other steps or units inherent to these processes, methods, products, or devices.

[0056] In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor is it a separate or alternative embodiment mutually exclusive with other embodiments. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.

[0057] Specifically, please refer to Figure 1, which is a flowchart of the disease prediction method based on implicit knowledge enhancement according to an embodiment of this application. The disease prediction method based on implicit knowledge enhancement according to an embodiment of this application includes the following steps:

[0058] S100: Obtain the patient's historical medical record representation and static information representation respectively, and after encoding the static information representation, fuse it with the historical medical record representation to obtain the backbone network;

[0059] In this step, the backbone network is designed based on the patient medical record encoding module in the IKxDP disease prediction model, which is based on implicit knowledge enhancement. Specifically, firstly, the patient's historical medical record representation is obtained through the patient medical record encoding module in IKxDP. Secondly, the patient's static information representation is obtained through the MIMIC (Medical Information Mart for Intensive Care, a large public clinical database). After encoding the static information representation, the historical medical record representation and the patient's static information representation are fused to obtain the backbone network, which can be represented as follows:

[0060] where the left-hand side represents the health status of patient i, and the two fused terms are the patient's historical medical record representation $V^{(i)}$ and the patient's static information representation.

[0061] Furthermore, the acquired static information representation includes the patient's sex information $sex^{(i)}$, age information $age^{(i)}$, and race information $race^{(i)}$. The encoding process for each item of individual static information includes:

[0062] The sex information $sex^{(i)}$ is transformed into a binary vector $x_{sex}$ using one-hot encoding;

[0063] The age information $age^{(i)}$ is mapped to the numerical range of 0 to 1 by normalization and denoted $x_{age}$, so as to reflect the differences between age groups;

[0064] The race information $race^{(i)}$ is converted into a $num_r$-dimensional vector $x_{race}$ using one-hot encoding, where $num_r$ is the number of all races in the dataset.

[0065] After encoding all individual static information, the encoded individual static information is concatenated to serve as the patient's static information representation.
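The static information encoding described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `encode_static`, the category lists, and the use of min-max scaling for the age normalization are not taken from the application, which only states that age is normalized into the range 0 to 1.

```python
import numpy as np

def encode_static(sex, age, race, sexes, races, age_min, age_max):
    """Encode one patient's static information as a single vector."""
    # One-hot sex vector x_sex (binary when there are two categories).
    x_sex = np.zeros(len(sexes))
    x_sex[sexes.index(sex)] = 1.0
    # Age mapped into [0, 1]; min-max scaling is an assumed normalization scheme.
    x_age = np.array([(age - age_min) / (age_max - age_min)])
    # One-hot race vector x_race of dimension num_r = len(races).
    x_race = np.zeros(len(races))
    x_race[races.index(race)] = 1.0
    # Concatenate the three encodings into the static information representation.
    return np.concatenate([x_sex, x_age, x_race])

x_static = encode_static(
    sex="F", age=45, race="B",
    sexes=["F", "M"], races=["A", "B", "C"],
    age_min=0, age_max=90,
)
print(x_static)
```

The resulting vector has length 2 + 1 + num_r, which is the form in which it would be fused with the historical medical record representation.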

[0066] S110: Calculate the patient's comprehensive representation based on the patient's historical medical records and static information representation, use the comprehensive representation to predict the main diagnosis, and optimize the backbone network by using multi-label cross-entropy as the loss function to obtain a well-trained backbone network.

[0067] Figure 2 is a schematic diagram of the backbone network and its pre-training process according to an embodiment of this application. In this step, the patient's historical medical record representation and static information representation are first obtained from the MIMIC dataset, and the primary diagnosis of each patient visit is extracted. Based on the historical medical record representation and the static information representation, a comprehensive patient representation $H_p$ describing the patient's health status is calculated. $H_p$ is then used to predict the primary diagnosis, and the backbone network is optimized and trained with multi-label cross-entropy as the loss function. The trained backbone network is saved for use in downstream tasks.
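A minimal sketch of a multi-label cross-entropy loss is given below for orientation. The application names the loss but does not give its formula, so the form used here (an independent sigmoid per diagnosis label, binary cross-entropy averaged over labels and patients) is an assumption, as are the function and variable names.

```python
import numpy as np

def multilabel_cross_entropy(logits, targets, eps=1e-12):
    """Mean binary cross-entropy over all labels (assumed form of the loss)."""
    # Independent sigmoid per label: each diagnosis is its own yes/no decision.
    probs = 1.0 / (1.0 + np.exp(-logits))
    # eps guards against log(0) for saturated probabilities.
    loss = -(targets * np.log(probs + eps)
             + (1.0 - targets) * np.log(1.0 - probs + eps))
    return loss.mean()

# One patient, three candidate primary diagnoses, two of them present.
logits = np.array([[0.0, 0.0, 0.0]])
targets = np.array([[1.0, 0.0, 1.0]])
loss = multilabel_cross_entropy(logits, targets)
print(loss)
```

With all logits at zero every label has probability 0.5, so the loss equals ln 2, which is a convenient sanity check before training.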

[0068] S120: Input the historical medical records of all patients into the trained backbone network, calculate the representation of all patients through the backbone network to obtain the patient representation matrix, and traverse the patient representation matrix. Use cosine similarity to calculate the distance between any two patients, and select the K similar patient representations with the highest similarity to the target patient based on the distance calculation results.

[0069] In this step, once the backbone network has been pre-trained and saved, the representations of all patients can be calculated by loading the backbone network and feeding it the patients' historical medical records, which yields the patient representation matrix $M_p$. The matrix $M_p$ is traversed, and the distance between any two patients is calculated using cosine similarity. Based on the distance calculation results, the K similar patient representations $H_{simP}$ with the highest similarity to the target patient are selected. These representations capture the health status information of other patients similar to the target patient and provide the target patient with predictions and suggestions based on collective intelligence, thereby improving the accuracy of disease prediction for the target patient.

[0070] Specifically, the screening process for similar patient characteristics includes the following steps:

[0071] S121: The patient representation matrix $M_p$ is traversed, and the distance between any two patients $P_i \in M_p$ and $P_j \in M_p$ is calculated using cosine similarity as follows:

$$\mathrm{sim}(P_i, P_j) = \frac{H_i \cdot H_j}{|H_i|\,|H_j|}$$

[0072] where $H_i$ and $H_j$ are the representations of patients $P_i$ and $P_j$, "·" denotes the vector dot product, and |·| denotes the Euclidean norm of the vector.

[0073] S122: The similarities among all patients are sorted based on the distance calculation results, and the top-k patients $simP^{(i)}$ most similar to the target patient $P_i$ are selected. For each selected similar patient $P_j \in simP^{(i)}$, a cross-attention mechanism is applied to obtain its weighted representation,

[0074] where W is a learnable weight matrix.

[0075] S123: The representations of all similar patients are weighted by the attention weights and summed to obtain a comprehensive similar patient representation $H_{simP}$.
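Steps S121 to S123 can be sketched as below. The cosine similarity and top-k selection follow the description directly. The attention scoring, however, is an assumption: the application states only that a cross-attention mechanism with a learnable matrix W is used, so the bilinear score followed by a softmax is a stand-in, and excluding the target patient from its own neighbor list is likewise assumed. All function names are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(M_p):
    """Pairwise cosine similarity between the rows of M_p."""
    norms = np.linalg.norm(M_p, axis=1, keepdims=True)
    unit = M_p / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def top_k_similar(M_p, i, k):
    """Indices of the k patients most similar to patient i."""
    sims = cosine_similarity_matrix(M_p)[i].copy()
    sims[i] = -np.inf  # assumption: the target patient is not its own neighbor
    return np.argsort(-sims)[:k]

def similar_patient_representation(M_p, i, k, W):
    """Attention-weighted sum of the top-k similar patient representations."""
    idx = top_k_similar(M_p, i, k)
    # Assumed bilinear attention score H_j^T W H_i; the source gives no formula.
    scores = M_p[idx] @ W @ M_p[i]
    scores = scores - scores.max()  # numerical stability for the softmax
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ M_p[idx]

M_p = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
neighbours = top_k_similar(M_p, i=0, k=2)
H_simP = similar_patient_representation(M_p, i=0, k=2, W=np.eye(2))
print(neighbours, H_simP)
```

In a trained model W would be learned jointly with the prediction head; the identity matrix is used here only so that the example runs without training.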

[0076] S130: A statistical algorithm is used to extract the disease types and the relationship types between different disease types from the historical medical records of all patients. The visit temporal heterogeneous graph is constructed according to the disease types and the relationship types, and is then encoded using the heterogeneous graph encoding algorithm to obtain the patient's visit temporal heterogeneous graph representation.

[0077] In this step, the embodiments of this application obtain the disease types and relationship types through statistical analysis of the data, thereby adaptively learning and revealing potential connections between diseases from different medical data without relying on external knowledge bases. Specifically, the disease types include persistent diseases, diseases caused by a previous disease, and sudden-onset diseases. A persistent disease is a disease that appears in at least two consecutive medical records; for example, if a disease $c_i$ appears in both medical records $V_t$ and $V_{t-1}$, then $c_i$ is classified as a persistent disease. A disease caused by a previous disease is a disease that appears in the current medical record $V_t$ and has a co-occurrence relationship with a disease in the previous medical record $V_{t-1}$, where the co-occurrence relationship is obtained from the global co-occurrence matrix M, constructed by counting co-occurrences over the medical records.

[0078] In this construction, I{·} is an indicator function: I(i,j) = 1 if and only if diseases i and j appear together in the same medical record, and I(i,j) = 0 otherwise. To ensure that a co-occurrence relationship is significant, this embodiment sets a relationship threshold θ: when the co-occurrence frequency $C_{ij}$ of two diseases exceeds θ, that is, when $C_{ij} > \theta$, the two diseases are determined to have a significant co-occurrence relationship.

[0079] The above methods can accurately identify and quantify the co-occurrence relationships between different diseases, and filter out occasional or low-frequency co-occurrences, thereby better characterizing the relationships between diseases.
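The thresholded co-occurrence test can be illustrated with a short sketch. It assumes that the co-occurrence frequency is a raw count of medical records containing both diseases; the application does not say whether the frequency is a count or a normalized rate. The function name and the toy disease codes are hypothetical.

```python
from collections import Counter
from itertools import combinations

def significant_cooccurrence(records, theta):
    """Return the disease pairs whose co-occurrence count exceeds theta."""
    counts = Counter()
    for record in records:
        # Each unordered pair in a record contributes an indicator value of 1.
        for a, b in combinations(sorted(set(record)), 2):
            counts[(a, b)] += 1
    # Keep only pairs above the relationship threshold theta.
    return {pair for pair, c in counts.items() if c > theta}

records = [
    {"d1", "d2"},
    {"d1", "d2", "d3"},
    {"d2", "d3"},
    {"d1", "d2"},
]
pairs = significant_cooccurrence(records, theta=1)
print(sorted(pairs))
```

Here the pair (d1, d3) occurs only once and is filtered out, which is the intended effect of the threshold on occasional co-occurrences.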

[0080] A sudden-onset disease is a disease that appears only in the current medical record $V_t$, did not appear in the previous medical record $V_{t-1}$, and has no direct co-occurrence relationship with any disease in $V_{t-1}$.

[0081] The relationship types between different disease types include co-occurrence, inheritance, and causal relationships. A co-occurrence relationship exists when two diseases $c_i$ and $c_j$ appear together in the same medical record; in this case, an edge is constructed between $c_i$ and $c_j$ and its type is defined as "Co-occurring". An inheritance relationship exists when a disease $c_i$ appears in both of two consecutive medical records $V_{t-1}$ and $V_t$; $c_i$ is then considered to be inherited from the previous medical visit, a self-loop is constructed for $c_i$, and the type of this edge is defined as "Inherited". A causal relationship exists when a disease $c_j$ appears in the current medical record $V_t$, did not appear in the previous medical record $V_{t-1}$, but has a co-occurrence relationship with a disease $c_i$ in $V_{t-1}$; $c_j$ is then considered to have a causal relationship with $c_i$, an edge from $c_i$ to $c_j$ is constructed, and its type is defined as "Cause".

[0082] Finally, the visit time-series heterogeneous graph G is constructed from the relationship types between the different disease types and the corresponding edge types. This graph not only represents the potentially complex relationships between diseases but can also be used to analyze the dynamic evolution of a patient's health status, providing a new perspective for patient modeling and further enriching the patient representation.
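
The three edge rules can be illustrated with a short sketch. It assumes that each visit is given as a list of disease names in time order and that the significant co-occurrence pairs have already been determined; the function name `build_visit_edges` and the tuple layout of an edge are hypothetical. A disease that is new in a visit and matches no rule (a sudden-onset disease) simply receives no incoming edge here.

```python
def build_visit_edges(visits, cooccurring):
    """Return a set of (source, target, edge_type, visit_index) tuples.

    visits: list of disease lists ordered by time (V_1 ... V_T).
    cooccurring: set of frozenset({a, b}) pairs with a significant co-occurrence.
    """
    edges = set()
    for t, visit in enumerate(visits):
        current = set(visit)
        previous = set(visits[t - 1]) if t > 0 else set()
        # "Co-occurring": two diseases recorded in the same visit.
        for a in current:
            for b in current:
                if a < b and frozenset((a, b)) in cooccurring:
                    edges.add((a, b, "Co-occurring", t))
        for c in current:
            if c in previous:
                # "Inherited": the disease also appeared in the previous visit (self-loop).
                edges.add((c, c, "Inherited", t))
            else:
                # "Cause": new in this visit, but co-occurs with a disease of the previous visit.
                for p in previous:
                    if frozenset((p, c)) in cooccurring:
                        edges.add((p, c, "Cause", t))
    return edges


# Hypothetical two-visit history.
visits = [["diabetes", "hypertension"], ["diabetes", "ckd"]]
cooccurring = {
    frozenset(("diabetes", "hypertension")),
    frozenset(("hypertension", "ckd")),
}
edges = build_visit_edges(visits, cooccurring)
print(sorted(edges))
```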

[0083] Furthermore, because the visit time-series graph G is heterogeneous, traditional graph neural network models are not suitable for encoding it directly. Therefore, embodiments of this application design a heterogeneous graph encoding algorithm based on the characteristics of the task. The specific encoding process includes the following steps:

[0084] S131: A BERT (Bidirectional Encoder Representations from Transformers, a deep learning model based on Transformers) model is pre-trained on the historical medical record representations of all patients who have only one medical record. The BERT model is then used to initialize the representation of each node in the visit time-series heterogeneous graph G.

[0085] Specifically, the representation initialization proceeds as follows. Let the initial representation of node $c_i$ in the visit time-series heterogeneous graph G be $x_i \in \mathbb{R}^{dim}$, where dim is the dimension of the representation. A representation matrix $E_N \in \mathbb{R}^{|N| \times dim}$, where $|N|$ is the number of node types, maps the node type $n_i \in N$ of node $c_i$ to a vector space of fixed dimension. On this basis, the node-type-enhanced representation $x'_i$ of node $c_i$ is obtained as follows:

$x'_i = x_i + E_N[n_i]$ (7)

[0086] where $n_i$ is the node type and $E_N[n_i]$ denotes the representation of node type $n_i$.
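
Formula (7) amounts to a table lookup followed by an element-wise addition. The sketch below illustrates this with plain Python lists; the function name `add_type_embedding` and all numeric values are hypothetical, and in practice $E_N$ would be a learned parameter rather than a fixed table.

```python
def add_type_embedding(x, node_types, type_table):
    """Apply x'_i = x_i + E_N[n_i] to every node."""
    return [
        [xi + ei for xi, ei in zip(x_i, type_table[n_i])]
        for x_i, n_i in zip(x, node_types)
    ]


# Toy type table E_N with |N| = 3 node types and dim = 2 (values are illustrative only).
E_N = [[0.1, 0.0], [0.0, 0.5], [-0.2, 0.2]]
x = [[1.0, 2.0], [3.0, 4.0]]   # initial node representations x_i
node_types = [1, 0]            # node type index n_i of each node
x_prime = add_type_embedding(x, node_types, E_N)
print(x_prime)
```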

[0087] S132: Obtain the patient visit subgraphs composed of different edge types from the patient visit time-series heterogeneous graph G after representation initialization, and encode each patient visit subgraph using a multi-layer graph attention network;

[0088] To model the complex relationships in the visit time-series heterogeneous graph G in depth, this embodiment employs a multi-layer graph attention network for encoding. Specifically, for a relationship type r∈R in the graph G, the patient visit subgraph $G_r$ consisting of all nodes and all edges of relationship type r is first obtained. A separate graph attention network is then used to encode $G_r$: within $G_r$, the previous-layer representations of node i and of its neighboring nodes are used to update the current-layer representation of node i. The specific calculation formula is as follows:

[0089] Here, the two weight matrices in the formula belong to the l-th layer of the graph neural network and perform the linear transformation of node features. The attention coefficient represents the importance of node j to the representation update of node i under relation type r. It can also be regarded as the attention weight of the edge between node j and node i, and it is used to identify and extract critical paths in the visit time-series heterogeneous graph, thereby revealing potential patterns of disease progression. Specifically, the attention coefficient is calculated as follows:

[0090] where $a^{T}$ denotes the learned parameters of the attention mechanism, σ denotes a non-linear activation function such as LeakyReLU, and "||" denotes the feature concatenation operation.
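
As an illustration of the kind of computation described above, the following sketch implements the standard graph attention form: a score LeakyReLU($a^{T}$[W h_i || W h_j]) for each neighbor j, a softmax over the neighbors of node i, and a weighted sum of the transformed neighbor features. This is the widely used textbook formulation and is only assumed to resemble the formulas of this embodiment. In particular, it uses a single weight matrix W, whereas the description above refers to two. All function names and numeric values are hypothetical.

```python
import math


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * v for w, v in zip(row, h)) for row in W]


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def gat_attention(h, neighbors, W, a, i):
    """Attention weights of node i over its neighbors within one relation subgraph G_r."""
    wh_i = matvec(W, h[i])
    scores = []
    for j in neighbors[i]:
        concat = wh_i + matvec(W, h[j])  # list concatenation plays the role of "||"
        scores.append(leaky_relu(sum(p * q for p, q in zip(a, concat))))
    peak = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return {j: e / total for j, e in zip(neighbors[i], exps)}


def gat_update(h, neighbors, W, a, i):
    """New representation of node i: attention-weighted sum of transformed neighbors."""
    alpha = gat_attention(h, neighbors, W, a, i)
    out = [0.0] * len(W)
    for j, weight in alpha.items():
        out = [o + weight * v for o, v in zip(out, matvec(W, h[j]))]
    return out


# Toy subgraph: node 0 has neighbors 1 and 2 (all values are illustrative only).
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1, 2]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, 0.5, 1.0, -1.0]
alpha = gat_attention(h, neighbors, W, a, 0)
h0_new = gat_update(h, neighbors, W, a, 0)
print(alpha, h0_new)
```

The sketch requires every updated node to have at least one neighbor. The returned weights `alpha` correspond to the edge attention weights that the critical path extraction relies on.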

[0091] After the above encoding operation has been performed on all patient visit subgraphs, the representations under all relation types r∈R are integrated to obtain the final representation of node i at layer l:

[0092] S133: Extract global information from each patient visit subgraph using global average pooling to obtain subgraph representations representing each patient visit subgraph.

[0093] In this process, after each patient visit subgraph has been encoded by the multi-layer graph attention network, global information is extracted from each subgraph by global average pooling. Specifically, for the visit subgraph corresponding to a given medical record $V_t$, the representations of all nodes output by the last layer of the graph attention network are averaged, yielding a subgraph representation that represents that patient visit subgraph. The formula for calculating the global information is as follows:
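
Global average pooling over one visit subgraph reduces to a per-dimension mean of the node vectors. A minimal sketch, with the hypothetical function name `global_average_pool` and illustrative values:

```python
def global_average_pool(node_reprs):
    """Average the last-layer node vectors of one visit subgraph into a single vector."""
    if not node_reprs:
        raise ValueError("a visit subgraph needs at least one node")
    dim = len(node_reprs[0])
    count = len(node_reprs)
    return [sum(vec[d] for vec in node_reprs) / count for d in range(dim)]


# Three nodes of the subgraph for one medical record V_t (values are illustrative only).
visit_nodes = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
h_t = global_average_pool(visit_nodes)
print(h_t)
```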

[0094] S134: Process all subgraph representations with an LSTM to obtain the visit time-series heterogeneous graph representation $H_G$:

[0095] where T is the patient's total number of medical visits, and the LSTM processes the subgraph representations obtained for visits t = 1, ..., T. $H_G$ not only captures the complex relationships between diseases within each medical visit, but also captures the dynamic evolution of the patient's health status.

[0096] S140: Construct a comprehensive disease representation from the patient comprehensive representation, the similar patient representation, and the visit time-series heterogeneous graph representation; train a disease prediction model based on the comprehensive disease representation; and output disease prediction results using the trained disease prediction model.

[0097] In this step, the model training strategy is the same as that of IKxDP. This embodiment uses the patient comprehensive representation $H_p$, the similar patient representation $H_{simP}$ and the visit time-series heterogeneous graph representation $H_G$ to construct the comprehensive disease representation $H_{all}$. The comprehensive disease representation is input into an MLP (Multilayer Perceptron) model, and the loss function is calculated from the output of the MLP model to train the disease prediction model. This approach not only integrates the patient's multi-dimensional information and captures the target patient's personalized health status and its dynamic evolution, but also draws on the collective wisdom of similar patients, improving the model's disease prediction capability.
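
The fusion and prediction step can be sketched as follows. Two points are assumptions made only for illustration, since this step does not state them: the comprehensive disease representation is formed here by concatenation, and the loss is a multi-label binary cross-entropy (the loss named for the backbone network elsewhere in this application). The function names, the one-hidden-layer MLP and all parameter values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear(W, b, x):
    """Affine layer: one output per row of W."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def predict(h_p, h_simp, h_g, params):
    """Fuse the three representations and return one probability per disease label."""
    h_all = h_p + h_simp + h_g  # assumption: fusion by concatenation
    hidden = [max(0.0, z) for z in linear(params["W1"], params["b1"], h_all)]  # ReLU
    return [sigmoid(z) for z in linear(params["W2"], params["b2"], hidden)]


def multilabel_bce(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over all disease labels."""
    return -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(probs, labels)
    ) / len(labels)


# Toy one-dimensional representations and hand-picked weights (illustrative only).
params = {
    "W1": [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    "b1": [0.0, 0.0],
    "W2": [[2.0, 0.0], [-2.0, 0.0]],
    "b2": [0.0, 0.0],
}
probs = predict([1.0], [0.5], [-1.0], params)
loss = multilabel_bce(probs, [1, 0])
print(probs, loss)
```

During training, the parameters would be updated to reduce this loss; the update procedure is omitted from the sketch.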

[0098] S150: Extract the historical visit contribution from the patient's historical medical records, and extract the critical paths and the relevant paths of similar patients from the visit time-series heterogeneous graph. The historical visit contribution, the critical paths and the relevant paths of similar patients serve as the reasoning basis. The reasoning basis is converted into a prompt according to a preset prompt template, and the prompt guides a large language model (LLM) to generate explanatory text for the disease prediction results.

[0099] In this step, the extraction process of the reasoning basis is as follows:

[0100] 1) Contribution of historical medical visits: Drawing on the importance analysis of historical medical records in the IKxDP model, the contribution of historical diseases to disease prediction results is calculated and used as part of the reasoning basis.

[0101] 2) Critical path extraction: The critical paths in the visit time-series heterogeneous graph are extracted using attention weights. The larger the attention weight of an edge, the more attention the model pays to that edge during prediction. Specifically, the critical path extraction algorithm is as follows. In the visit time-series heterogeneous graph $G = \{G_1, G_2, ..., G_T\}$, V denotes the set of all nodes and E denotes the set of all edges. Each edge $e_{i,j} \in E$ has a weight $\alpha_{ij}$ that measures the importance of the edge in the whole graph. To quantify the importance of each path in the graph, all nodes in the graph are first collected into a node set $N_G$. Then, for each node $v \in N_G$, the weights of all edges $e_{i,v}$ that terminate at v are summed to obtain the weight score of the path corresponding to node v. The specific calculation formula is as follows:

[0102] where {end(E)} denotes the set of end nodes of all edges in the visit time-series heterogeneous graph. Finally, the paths are sorted in descending order of weight score, and the paths corresponding to the top-k end nodes with the highest scores are selected as critical paths. Extracting critical paths from the visit time-series heterogeneous graph identifies the disease paths that the model focuses on during decision-making, which aids the interpretation of the disease prediction results.
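
The scoring and selection described above can be sketched as follows. The sketch assumes that the attention weights are available as a mapping from (source, target) edges to values, and it interprets "the path corresponding to node v" as the set of edges that terminate at v; both the function name `critical_paths` and this interpretation are assumptions made for illustration.

```python
from collections import defaultdict


def critical_paths(edge_weights, k):
    """Score each end node by its summed incoming attention and keep the top k.

    edge_weights: {(source, target): attention weight alpha}.
    Returns (scores, paths), where paths maps each top-k end node to its incoming edges.
    """
    score = defaultdict(float)
    for (_, target), weight in edge_weights.items():
        score[target] += weight
    # Descending score; ties are broken by node name to keep the result deterministic.
    ranked = sorted(score, key=lambda v: (-score[v], v))
    paths = {v: sorted(e for e in edge_weights if e[1] == v) for v in ranked[:k]}
    return dict(score), paths


# Hypothetical attention weights on the edges of one patient's graph.
edge_weights = {
    ("hypertension", "ckd"): 0.6,
    ("diabetes", "ckd"): 0.3,
    ("diabetes", "diabetes"): 0.5,
    ("diabetes", "hypertension"): 0.1,
}
scores, paths = critical_paths(edge_weights, k=2)
print(paths)
```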

[0103] 3) Relevant path analysis of similar patients: The critical paths of the target patient are compared with the paths of its similar patients. If the same path also exists for a similar patient, it is regarded as a relevant path that can serve as an auxiliary explanation, and it is included in the reasoning basis.

[0104] Ultimately, the historical visit contribution, the critical paths, and the relevant paths of similar patients are combined to form a task-specific reasoning basis. The reasoning basis is then converted into a prompt using a preset prompt template, and the prompt guides the LLM to generate explanatory text for the disease prediction results.
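
The assembly of the reasoning basis into a prompt can be sketched as follows. The template wording, the representation of a path as a (source, edge type, target) tuple, and all function names are hypothetical; the preset prompt template of this application is not reproduced here, and the call to the LLM itself is omitted.

```python
# Hypothetical prompt template; the actual preset template is not given in the text.
PROMPT_TEMPLATE = (
    "Predicted diseases: {prediction}\n"
    "Historical visit contributions: {contributions}\n"
    "Critical paths: {critical}\n"
    "Paths shared with similar patients: {relevant}\n"
    "Explain the prediction using only the evidence above."
)


def relevant_paths(target_paths, similar_patient_paths):
    """Keep the target patient's critical paths that also occur for a similar patient."""
    seen = set()
    for paths in similar_patient_paths:
        seen.update(paths)
    return [p for p in target_paths if p in seen]


def fmt_path(path):
    source, edge_type, target = path
    return f"{source} -[{edge_type}]-> {target}"


def build_prompt(prediction, contributions, critical, relevant):
    """Fill the template with the three parts of the reasoning basis."""
    return PROMPT_TEMPLATE.format(
        prediction=", ".join(prediction),
        contributions="; ".join(f"{d}: {w:.2f}" for d, w in contributions.items()),
        critical="; ".join(fmt_path(p) for p in critical),
        relevant="; ".join(fmt_path(p) for p in relevant) or "none",
    )


# Hypothetical reasoning basis for one target patient.
critical = [("hypertension", "Cause", "ckd"), ("diabetes", "Inherited", "diabetes")]
similar = [[("hypertension", "Cause", "ckd")], [("asthma", "Inherited", "asthma")]]
relevant = relevant_paths(critical, similar)
prompt = build_prompt(["ckd"], {"hypertension": 0.62, "diabetes": 0.38}, critical, relevant)
print(prompt)
```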

[0105] It is understood that the application scope of the embodiments of this application is not limited to disease prediction, but can also be flexibly extended to multiple medical fields such as drug recommendation and personalized treatment plan formulation. It has significant scalability and adaptability, and can effectively integrate patients' multimodal data or medical information. In terms of multimodal data fusion, the embodiments of this application also provide the potential for further optimization. More advanced and flexible fusion strategies such as attention mechanisms or graph neural networks can be introduced, which can dynamically adjust the weight of different modal data in the decision-making process according to their relative importance.

[0106] Based on the above, this application proposes a disease prediction model based on implicit knowledge enhancement. This model integrates multi-dimensional patient data, mines and utilizes implicit knowledge such as similar patients and heterogeneous consultation timelines in medical records, enhances patient modeling by leveraging the collective wisdom of similar patients and the potential complex relationships between diseases, and encodes the heterogeneous consultation timeline using heterogeneous graph encoding and sequence modeling techniques to more accurately capture the complex relationships between diseases and the dynamic evolution of patient health status. It combines historical consultation contributions, critical paths, and relevant paths of similar patients to form task-specific reasoning basis, and uses a large-scale language model to generate context-relevant and domain-specific explanatory text for the prediction results. This not only improves the accuracy of disease prediction but also provides clear explanations for the prediction results, thereby enhancing the model's performance and its practical application value and credibility in the medical field.

[0107] Please refer to Figure 3, which is a schematic diagram of the structure of the disease prediction apparatus based on implicit knowledge enhancement according to an embodiment of this application. The disease prediction apparatus 40 based on implicit knowledge enhancement according to an embodiment of this application includes:

[0108] Backbone network construction module 41: used to obtain the patient's historical medical record representation and static information representation, calculate the patient's comprehensive representation based on the historical medical record and static information representation, and use the patient's comprehensive representation to train and obtain the backbone network;

[0109] Similarity filtering module 42: used to input the historical medical record representations of all patients into the backbone network, calculate the patient representation matrix through the backbone network, and use cosine similarity to filter out the K similar patient representations with the highest similarity to the target patient;

[0110] Heterogeneous graph construction module 43: used to extract disease types and relationship types between different disease types from the historical medical record representation, construct a medical visit time-series heterogeneous graph based on the disease types and relationship types, and encode the medical visit time-series heterogeneous graph using a heterogeneous graph encoding algorithm to obtain the patient's medical visit time-series heterogeneous graph representation.

[0111] Disease prediction module 44: used to construct a comprehensive disease representation from the patient comprehensive representation, the similar patient representation, and the visit time-series heterogeneous graph representation, train a disease prediction model based on the comprehensive disease representation, and output disease prediction results through the trained disease prediction model.

[0112] It should be noted that the information interaction and execution process between the above-mentioned devices / units are based on the same concept as the method embodiments of this application. For details on their specific functions and technical effects, please refer to the method embodiments section, and they will not be repeated here.

[0113] The apparatus provided in this application can be applied to the foregoing method embodiments. For details, please refer to the description of the above method embodiments, which will not be repeated here.

[0114] Please refer to Figure 4, which is a schematic diagram of the device structure according to an embodiment of this application. The device 50 includes:

[0115] Memory 51 storing executable program instructions;

[0116] Processor 52 connected to memory 51;

[0117] The processor 52 is used to call the executable program instructions stored in the memory 51 and perform the following steps: acquire the patient's historical medical record representation and static information representation; calculate the patient's comprehensive representation based on the historical medical record representation and the static information representation; train a backbone network using the patient's comprehensive representation; input the historical medical record representations of all patients into the backbone network; calculate the patient representation matrix through the backbone network; use cosine similarity to select the K patient representations most similar to the target patient; extract the disease types and the relationship types between different disease types from the historical medical record representation; construct a visit time-series heterogeneous graph based on the disease types and relationship types; encode the visit time-series heterogeneous graph using a heterogeneous graph encoding algorithm to obtain the patient's visit time-series heterogeneous graph representation; construct a comprehensive disease representation from the patient's comprehensive representation, the similar patient representations, and the visit time-series heterogeneous graph representation; train a disease prediction model based on the comprehensive disease representation; and output the disease prediction result through the trained disease prediction model.

[0118] The processor 52 can also be referred to as a CPU (Central Processing Unit). The processor 52 may be an integrated circuit chip with signal processing capabilities. The processor 52 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor can be a microprocessor or any conventional processor.

[0119] Please refer to Figure 5, which is a schematic diagram of the structure of the medium in an embodiment of this application. The medium in this embodiment stores program instructions 61 capable of implementing the following steps: acquiring the patient's historical medical record representation and static information representation; calculating the patient's comprehensive representation based on the historical medical record and static information representation; training a backbone network using the patient's comprehensive representation; inputting the historical medical record representations of all patients into the backbone network; calculating the patient representation matrix through the backbone network; and using cosine similarity to select the K most similar patient representations to the target patient; extracting disease types and relationship types between different disease types from the historical medical record representations; constructing a time-series heterogeneous graph of medical visits based on the disease types and relationship types; encoding the time-series heterogeneous graph of medical visits using a heterogeneous graph encoding algorithm to obtain the patient's time-series heterogeneous graph representation; constructing a comprehensive disease representation using the patient's comprehensive representation, similar patient representations, and time-series heterogeneous graph representation; training a disease prediction model based on the comprehensive disease representation; and outputting disease prediction results through the trained disease prediction model.

[0120] The program instructions 61 can be stored in the aforementioned medium in the form of a software product, including several instructions to cause a device (which may be a personal computer, server, or network device, etc.) or processor to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned medium includes various media capable of storing program instructions, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks, or terminal devices such as computers, servers, mobile phones, and tablets. The server can be an independent server or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), and big data and artificial intelligence platforms.

[0121] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the system embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or other forms.

[0122] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated units described above can be implemented in hardware or as software functional units. The above are merely embodiments of this application and do not limit the patent scope of this application. Any equivalent structural or procedural transformations made based on the description and drawings of this application, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of this application.

Claims

1. A disease prediction method based on implicit knowledge enhancement, characterized in that it comprises: acquiring a patient's historical medical record representation and static information representation, calculating the patient's comprehensive representation based on the historical medical record representation and the static information representation, and training with the patient's comprehensive representation to obtain a backbone network; inputting the historical medical record representations of all patients into the backbone network, calculating a patient representation matrix through the backbone network, and using cosine similarity to select the K similar patient representations with the highest similarity to the target patient; extracting disease types and the relationship types between different disease types from the historical medical record representation, constructing a visit time-series heterogeneous graph based on the disease types and relationship types, and encoding the visit time-series heterogeneous graph using a heterogeneous graph encoding algorithm to obtain the patient's visit time-series heterogeneous graph representation; and constructing a comprehensive disease representation from the patient comprehensive representation, the similar patient representations, and the visit time-series heterogeneous graph representation, training a disease prediction model based on the comprehensive disease representation, and outputting a disease prediction result through the trained disease prediction model.

2. The disease prediction method based on implicit knowledge enhancement according to claim 1, characterized in that acquiring the patient's historical medical record representation and static information representation, calculating the patient's comprehensive representation based on the historical medical record representation and the static information representation, and training with the patient's comprehensive representation to obtain the backbone network specifically comprises: encoding the static information representation, and fusing the encoded static information representation with the historical medical record representation to obtain the backbone network; and calculating the patient's comprehensive representation based on the historical medical record representation and the static information representation, using the patient's comprehensive representation to predict the primary diagnosis, and optimizing and training the backbone network with multi-label cross-entropy as the loss function to obtain the trained backbone network.

3. The disease prediction method based on implicit knowledge enhancement according to claim 2, characterized in that the static information representation includes the sex information, age information and race information of the i-th patient, and encoding the static information representation specifically comprises: encoding the sex information $sex^{(i)}$ using one-hot encoding to transform it into a binary vector $x_{sex}$; normalizing the age information to map it to the numerical range of 0 to 1, denoted $x_{age}$, so as to express the differences between age groups; encoding the race information using one-hot encoding to convert it into a $num_r$-dimensional vector $x_{race}$, where $num_r$ is the total number of races in the dataset; and concatenating the encoded $x_{sex}$, $x_{age}$ and $x_{race}$ to obtain the patient's static information representation.

4. The disease prediction method based on implicit knowledge enhancement according to claim 3, characterized in that inputting the historical medical record representations of all patients into the backbone network, calculating the patient representation matrix through the backbone network, and using cosine similarity to select the K similar patient representations with the highest similarity to the target patient specifically comprises: traversing the patient representation matrix $M_p$ and calculating the distance between any two patients $P_i \in M_p$ and $P_j \in M_p$ according to the cosine similarity formula, in which $H_i$ and $H_j$ are the representations of patients $P_i$ and $P_j$, "·" denotes the vector dot product, and $\|\cdot\|$ denotes the Euclidean norm of a vector; sorting the similarities among all patients based on the distance calculation results, and selecting the top-k patients $simP^{(i)}$ most similar to the target patient $P_i$; applying a cross-attention mechanism to obtain the weighted representation of each similar patient $P_j \in simP^{(i)}$; and weighting the representations of all similar patients with the attention weights to obtain the comprehensive similar patient representation $H_{simP}$.

5. The disease prediction method based on implicit knowledge enhancement according to claim 4, characterized in that extracting disease types and the relationship types between different disease types from the historical medical record representation, and constructing the visit time-series heterogeneous graph based on the disease types and relationship types, specifically comprises: the disease types include persistent diseases, diseases caused by a previous disease, and sudden-onset diseases; the relationship types between different disease types include co-occurrence relationships, inheritance relationships, and causal relationships; when two diseases $c_i$ and $c_j$ appear together in the same medical record, a co-occurrence relationship is considered to exist between $c_i$ and $c_j$, and in this case an edge is constructed between $c_i$ and $c_j$ and the type of the edge is defined as "Co-occurring"; when a disease $c_i$ appears in both of two consecutive medical records $V_{t-1}$ and $V_t$, $c_i$ is considered to be inherited from the previous medical visit, and in this case a self-loop is constructed on $c_i$ and the type of the edge is defined as "Inherited"; when a disease $c_j$ appears in the current medical record $V_t$, did not appear in the previous medical record $V_{t-1}$, but has a co-occurrence relationship with a disease $c_i$ in $V_{t-1}$, $c_j$ is considered to have a causal relationship with $c_i$, and in this case an edge from $c_i$ to $c_j$ is constructed and the type of the edge is defined as "Cause"; and the visit time-series heterogeneous graph G is constructed based on the relationship types between different disease types and the edge types.

6. The disease prediction method based on implicit knowledge enhancement according to claim 5, characterized in that encoding the visit time-series heterogeneous graph using the heterogeneous graph encoding algorithm to obtain the patient's visit time-series heterogeneous graph representation specifically comprises: pre-training a BERT model on the historical medical record representations of all patients who have only one medical visit, and using the BERT model to initialize the representation of each node in the visit time-series heterogeneous graph G; obtaining, from the graph G after representation initialization, patient visit subgraphs composed of different edge types, and encoding each patient visit subgraph using a multi-layer graph attention network; after all patient visit subgraphs have been encoded, integrating the representations under all relation types to obtain the final representation of node i at layer l; extracting global information from each patient visit subgraph using global average pooling to obtain a subgraph representation for each patient visit subgraph; and processing all subgraph representations with an LSTM to obtain the visit time-series heterogeneous graph representation $H_G$, where T is the patient's total number of medical visits and the LSTM processes the subgraph representations obtained for visits t = 1, ..., T.

7. The disease prediction method based on implicit knowledge enhancement according to any one of claims 1 to 6, characterized in that constructing the comprehensive disease representation from the patient comprehensive representation, the similar patient representations, and the visit time-series heterogeneous graph representation, training the disease prediction model based on the comprehensive disease representation, and outputting the disease prediction result through the trained disease prediction model further comprises: extracting the historical visit contribution from the patient's historical medical records, and extracting the critical paths and the relevant paths of similar patients from the visit time-series heterogeneous graph G; using the historical visit contribution, the critical paths and the relevant paths of similar patients as the reasoning basis; and converting the reasoning basis into a prompt according to a preset prompt template, the prompt guiding a large language model (LLM) to generate explanatory text for the disease prediction result.

8. A disease prediction device based on implicit knowledge enhancement, characterized in that it comprises: a backbone network construction module, used to acquire the patient's historical medical record representation and static information representation, calculate the patient's comprehensive representation based on the historical medical record representation and the static information representation, and train with the patient's comprehensive representation to obtain the backbone network; a similarity filtering module, used to input the historical medical record representations of all patients into the backbone network, calculate the patient representation matrix through the backbone network, and use cosine similarity to filter out the K similar patient representations with the highest similarity to the target patient; a heterogeneous graph construction module, used to extract disease types and the relationship types between different disease types from the historical medical record representation, construct a visit time-series heterogeneous graph based on the disease types and relationship types, and encode the visit time-series heterogeneous graph using a heterogeneous graph encoding algorithm to obtain the patient's visit time-series heterogeneous graph representation; and a disease prediction module, used to construct a comprehensive disease representation from the patient comprehensive representation, the similar patient representations, and the visit time-series heterogeneous graph representation, train a disease prediction model based on the comprehensive disease representation, and output disease prediction results using the trained disease prediction model.

9. A device, characterized in that the device includes a processor and a memory coupled to the processor, wherein the memory stores program instructions for implementing the disease prediction method based on implicit knowledge enhancement according to any one of claims 1 to 7, and the processor is used to execute the program instructions stored in the memory to perform the disease prediction method based on implicit knowledge enhancement.

10. A medium, characterized in that the medium stores processor-executable program instructions for performing the disease prediction method based on implicit knowledge enhancement according to any one of claims 1 to 7.