A mental disorder identification method based on an invariant-learning robust graph neural network
By constructing a robust graph neural network that integrates functional connectivity graphs, subgraph disentanglement, and environment clustering, the method addresses the heterogeneity of population brain graph data, improving the accuracy and interpretability of mental disorder identification and adapting to heterogeneous clinical data.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NANJING BRAIN HOSPITAL
- Filing Date
- 2026-02-25
- Publication Date
- 2026-06-12
AI Technical Summary
Existing mental disorder classification methods based on population brain graphs suffer from low recognition accuracy and insufficient generalization because of neighborhood shift in the population graph, multimodal feature heterogeneity, and high-dimensional redundancy. They adapt poorly to heterogeneous clinical data and cannot meet the needs of precision medicine.
A robust graph neural network based on invariant learning is adopted. The method constructs functional connectivity graphs, performs subgraph disentanglement and environment clustering, and introduces a penalty term as an invariant regularizer. The classification loss and the constraint loss are optimized jointly, improving the robustness and interpretability of the model.
The method improves the accuracy and interpretability of mental disorder identification, enhances the applicability of the model across cohorts and scenarios, and meets the practical needs of precision medicine.
[Figure: abstract drawing of CN122201710A]
Description
Technical Field
[0001] This invention relates to the fields of data mining, graph representation learning, and machine learning, and specifically to a method for identifying mental disorders based on an invariant-learning robust graph neural network.
Background Technology
[0002] Mental disorders have become a serious public health challenge worldwide, with their incidence rate rising year by year. They not only severely impair patients' cognitive function, emotional regulation, and social adaptation, reducing their quality of life, but also impose a heavy care burden on their families, significantly impacting socioeconomic development and the stability of public health systems. Therefore, accurate identification and early screening of mental illness have become critical issues urgently needing to be addressed in the medical field, attracting widespread attention from academia, clinical institutions, and the general public.
[0003] Currently, the clinical identification of mental illnesses mainly relies on traditional methods such as scale assessments, physician interviews and observations, and medical history collection. These methods are highly dependent on the physician's professional experience and subjective judgment. Furthermore, they are affected by factors such as the strong heterogeneity of patients' clinical symptoms, atypical symptom expression, and subjective biases, making it difficult to achieve accurate early identification of the disease. This often leads to misdiagnosis and missed diagnosis. Moreover, the identification criteria lack objective quantitative indicators, failing to meet the actual clinical needs for accurate disease classification and prognostic assessment.
[0004] In recent years, with the rapid development of non-invasive neuroimaging technology, resting-state functional magnetic resonance imaging (fMRI) data can capture the strength of functional connectivity and the dynamic activity patterns between brain regions, revealing abnormal connectivity in core brain networks such as the default mode network and the executive control network. Brain networks constructed from neuroimaging data can objectively characterize the pathophysiological basis of mental illnesses, providing objective and quantitative evidence for disease identification. This turns mental illness identification into a node classification problem on a population brain graph: individuals are the nodes, and the edges encode similarities in brain features, correlations in clinical indicators, or neuropathological correlations between individuals. Graph learning algorithms are then used to distinguish patients from healthy controls and to identify disease subtypes.
[0005] However, existing technologies still have significant shortcomings and application bottlenecks: First, clinical population brain map data are inherently heterogeneous. Factors such as scanning equipment, age stratification, disease stage, and regional differences in different sample cohorts can lead to a systematic shift in the local neighborhood structure of nodes in the population graphs of the training and test sets. Nodes of the same disease category may form similar connections with individuals of the same type in the training set (homogeneous neighborhood), while in the test set, due to changes in data distribution, they tend to form associations with individuals of different types (heterogeneous neighborhood). This causes the assumption of identical distribution of training and test data, which is relied upon by traditional graph neural networks, to fail, resulting in a sharp decline in the model's generalization performance and making it difficult to adapt to clinical heterogeneous data scenarios. Second, existing graph neural network models are prone to learning spurious associations in the population graph (such as cohort-specific individual connection patterns) rather than the essential disease characteristics of the individuals themselves. This results in insufficient robustness of the model in cross-cohort and cross-scenario applications, making it difficult to directly transform it into a clinically usable identification tool.
[0006] In summary, existing mental illness classification methods based on population brain graphs are severely limited in recognition accuracy, generalization ability, and clinical applicability because they fail to address core issues such as neighborhood shift in the population graph, multimodal feature heterogeneity, and high-dimensional redundancy. There is an urgent need for a robust population-graph-based brain disease classification technology that can adapt to heterogeneous clinical data and meet the practical needs of precision medicine.
[0007] In view of this, there is an urgent need for research on mental disorder identification methods that further improves the accuracy and interpretability of identification.
Summary of the Invention
[0008] To address the aforementioned technical problems, the present invention provides a method for analyzing the results of mental disorder identification based on an invariant-learning robust graph neural network, comprising the following steps:
[0009] Step 1. Given the resting-state functional magnetic resonance imaging (fMRI) images of the samples, construct a functional brain connectivity graph from the image data. This step includes the following sub-steps:
[0010] 1-1. Standardize and preprocess the resting-state fMRI images, and register each individual's brain images to a standard space.
[0011] 1-2. Based on a preset brain parcellation template, extract the mean blood-oxygenation-level-dependent (BOLD) signal of each region of interest.
[0012] 1-3. Compute the functional connectivity strength between brain regions from the mean time series of each region, using correlation analysis to establish the associations between regions. The threshold ratio method (proportional thresholding) then converts the continuous correlation values into a binary connectivity matrix, which serves as the topological representation of the graph, i.e., the adjacency matrix.
[0013] 1-4. Combine the in-degree and out-degree attributes of each brain region within each time window with its mean time-series signal to form the node feature vector. Together with the binary adjacency matrix above, this completes the functional connectivity graph.
[0014] Step 2. Using the functional connectivity graphs constructed in Step 1 as input, construct a population graph, apply subgraph disentanglement, environment clustering, and invariant learning constraints, and assess model performance in out-of-distribution generalization scenarios. This step includes the following sub-steps:
[0015] 2-1. Construct a population graph G = (X, A) from the individuals' functional connectivity graphs, where X represents the individual image features and A denotes the associations between individuals, obtained by computing the similarity of image features between individuals.
[0016] 2-2. A subgraph generator derives invariant and variant subgraphs from the population graph. The graph topology is used to measure the distribution of node influence through a random walk with restart, which is defined by the restart probability, the identity matrix I, and the original adjacency matrix A normalized by the degree matrix D. A subset of edges is then sampled from the original edge set E, with p ∼ Bernoulli, to serve as the edges of the invariant subgraph.
[0017] 2-3. To capture the connection patterns contained in the environment (variant) subgraph, a graph neural network encoder computes node representations on that subgraph. The KMeans algorithm clusters these representations to generate environment labels: each node is assigned to the latent environment of its nearest cluster center. The whole graph is thereby partitioned into pseudo-environments according to differences in local variant topology. The inferred environment labels make it possible to simulate and correct distribution shifts without explicit environment supervision.
[0018] 2-4. Based on the environment labels obtained in 2-3, a penalty term is introduced as an invariant regularizer to constrain the consistency of node representations across environments. The constraint uses the statistical variance M of the empirical risk across the environments; its definition involves the gradient computed during backpropagation and the inferred environment risk from 2-3.
[0019] After the node features are updated, a pooling step is performed to obtain a spatial graph representation with high generalization. Finally, a fully connected layer is used to complete the classification.
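Sub-step 2-1 can be illustrated with a short sketch. Because the similarity formula is not reproduced in this text, the sketch assumes that the feature similarity is the Pearson correlation between two individuals' feature vectors (that is, one minus the correlation distance) and that only positively correlated pairs are connected. The function name `population_graph` and the `min_sim` cutoff are hypothetical.

```python
import numpy as np

def population_graph(X, min_sim=0.0):
    """Build a weighted adjacency matrix A from individual features.

    X: (N, B) array, one feature vector per individual.
    Assumption: similarity = Pearson correlation = 1 - correlation distance.
    """
    X = np.asarray(X, dtype=float)
    sim = np.corrcoef(X)                   # (N, N) correlation between individuals
    np.fill_diagonal(sim, 0.0)             # no self-loops
    A = np.where(sim > min_sim, sim, 0.0)  # keep positively correlated pairs only
    return A

# Three toy individuals: 0 and 1 are similar, 2 is anti-correlated with both.
X_demo = np.array([[1.0, 2.0, 3.0, 4.0],
                   [2.0, 4.0, 6.0, 8.5],
                   [4.0, 3.0, 2.0, 1.0]])
A_demo = population_graph(X_demo)
```

In this toy case only the pair (0, 1) receives an edge. A real pipeline might also combine the similarity with clinical indicators, which the sketch leaves out.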
[0020] This invention addresses the out-of-distribution generalization problem in population-graph-based disease identification, which has received little attention. After the population graph is constructed, a subgraph disentanglement step separates the variant and invariant subgraphs. Environment labels are then inferred by environment clustering, based on the characteristics of the different subgraphs. Next, following the invariant learning paradigm, a penalty term is introduced as an invariant regularizer to encourage the network to capture more causal representations across subgraphs. Finally, by jointly optimizing the classification loss and the constraint loss, the accuracy and interpretability of disease classification are further improved.
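The topological conflict that drives subgraph disentanglement in sub-step 2-2 can be sketched as follows. The sketch assumes the standard personalized PageRank form of a random walk with restart, alpha * (I - (1 - alpha) * A_hat)^-1 with A_hat = D^-1/2 A D^-1/2, and defines a node's conflict as the share of its influence mass that comes from nodes of other categories. Both choices are assumptions made for illustration, not the formulas of the invention, and the sketch treats every node as labeled. All function names are hypothetical.

```python
import numpy as np

def ppr_matrix(A, alpha=0.15):
    """Random walk with restart (personalized PageRank) influence matrix.

    alpha is the restart probability; A_hat is the degree-normalized adjacency.
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    deg[deg == 0] = 1.0                     # guard isolated nodes
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    n = A.shape[0]
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * A_hat)

def node_conflict(ppr, labels):
    """Fraction of each node's influence that involves other-category nodes."""
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    total = (ppr * off_diag).sum(axis=1)
    total[total == 0] = 1.0
    return (ppr * different).sum(axis=1) / total

def edge_conflict(A, conflict):
    """Edge conflict = average of the conflicts of the two end nodes."""
    src, dst = np.nonzero(np.triu(A, k=1))
    return src, dst, 0.5 * (conflict[src] + conflict[dst])
```

On a graph made of two same-category triangles joined by a single bridge edge, the bridge is the edge with the highest conflict, which is the kind of edge the disentanglement step would be expected to move out of the invariant subgraph.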
[0021] The above description is merely an overview of the technical solution of the present invention. In order to better understand the technical means of the present invention and to implement it in accordance with the contents of the specification, and in order to make the above and other objects, features and advantages of the present invention more apparent and understandable, specific embodiments of the present invention are described below.
Brief Description of the Drawings
[0022] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0023] Figure 1 is a flowchart of the method of the present invention.
[0024] Figure 2 is a schematic diagram of the implementation process of the present invention.
[0025] Figure 3 shows the comparative experimental results of the present invention on different evaluation metrics: (a) node classification results in terms of ACC; (b) node classification results in terms of F1; and (c) node classification results in terms of AUC.
Detailed Description
[0026] The technical solution of the present invention will be clearly and completely described below with reference to the embodiments. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0027] In the description of this invention, it should be understood that the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, a feature defined as "first" or "second" may explicitly or implicitly include one or more of the stated features. In the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified.
[0028] The present invention will now be described in detail with reference to specific embodiments and accompanying drawings.
[0029] Method Implementation Examples
[0030] According to embodiments of the present invention, a method for analyzing the results of mental disorder identification based on an invariant-learning robust graph neural network is provided. The method according to embodiments of the present invention includes:
[0031] Step 1. Given the resting-state functional magnetic resonance imaging (fMRI) data of the samples, construct a functional brain connectivity graph from the data. This step includes the following sub-steps:
[0032] 1-1. Standardize and preprocess the resting-state fMRI images of the N samples, and then register each individual's brain images to the standard space.
[0033] 1-2. Based on a preset brain parcellation template, extract the mean blood-oxygenation-level-dependent (BOLD) signal of each region of interest.
[0034] 1-3. Compute the functional connectivity strength between brain regions from the mean time series of each region, using correlation analysis to establish the associations between regions. The threshold ratio method (proportional thresholding) then converts the continuous correlation values into a binary connectivity matrix, which serves as the topological representation of the graph, i.e., the adjacency matrix.
[0035] 1-4. Combine the in-degree and out-degree attributes of each brain region within each time window with its mean time-series signal to form the node feature vector. Together with the binary adjacency matrix above, this completes the functional connectivity graph.
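Sub-steps 1-2 to 1-4 can be sketched in a few lines, assuming the region-of-interest time series have already been extracted. The sketch assumes Pearson correlation as the correlation analysis, thresholds on absolute correlation strength, and uses a single time window. Since the binary matrix is symmetric here, in-degree and out-degree coincide. The function name and the `keep_ratio` parameter are hypothetical.

```python
import numpy as np

def functional_connectivity_graph(ts, keep_ratio=0.2):
    """Build a binary functional connectivity graph for one individual.

    ts: (T, R) array of mean BOLD signals (T time points, R brain regions).
    keep_ratio: fraction of strongest region pairs kept as edges (assumed value).
    Returns (adjacency, node_features).
    """
    ts = np.asarray(ts, dtype=float)
    corr = np.corrcoef(ts.T)                    # (R, R) Pearson correlation
    np.fill_diagonal(corr, 0.0)                 # ignore self-connections
    r = corr.shape[0]
    rows, cols = np.triu_indices(r, k=1)        # each region pair once
    strength = np.abs(corr[rows, cols])
    n_keep = max(1, int(round(keep_ratio * strength.size)))
    top = np.argsort(strength)[::-1][:n_keep]   # indices of the strongest pairs
    adj = np.zeros((r, r), dtype=int)
    adj[rows[top], cols[top]] = 1
    adj = adj + adj.T                           # symmetric binary adjacency
    degree = adj.sum(axis=1)                    # in-degree == out-degree here
    mean_signal = ts.mean(axis=0)               # average time-series signal
    features = np.column_stack([degree, mean_signal])
    return adj, features
```

With several time windows, the same function could be applied per window and the resulting feature columns concatenated. That extension is left out to keep the sketch short.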
[0036] Step 2. Using the functional connectivity graphs constructed in Step 1 as input, construct a population graph, apply subgraph disentanglement, environment clustering, and invariant learning constraints, and assess model performance in out-of-distribution generalization scenarios. This step includes the following sub-steps:
[0037] 2-1. Construct a population graph G = (X, A) from the individual functional connectivity graphs, where X ∈ ℝ^(N×B), N is the number of individuals, and B is the feature dimension of each individual after RFE (recursive feature elimination) evaluation. The associations between individuals are obtained by computing the similarity of their image features and are denoted A ∈ ℝ^(N×N), where a_ij is the weight of the edge between individuals i and j. The connection between two individuals is evaluated through the feature similarity Fea_sim, which is defined from the correlation distance between x_i and x_j, the image features of the i-th and j-th individuals, and uses the means of x_i and x_j; the similarity also indicates whether two individuals are positively correlated. Fea_sim is used both to evaluate the strength of the connections between individuals and to construct the adjacency A of the population graph.
[0038] 2-2. Based on the population graph constructed in 2-1, a subgraph generator divides the whole graph into subgraphs through an edge masking mechanism, producing invariant and variant subgraphs that are used to generate virtual environments, so that the empirical risk can be minimized across different environments. It is assumed that the input graph G contains an invariant subgraph G_I whose neighboring nodes contribute the causal features needed for invariant prediction across environments. The graph topology is used to measure the distribution of node influence through a random walk with restart, defined by the restart probability, the identity matrix I, and the original adjacency matrix A normalized by the degree matrix D. The conflict of a node v is defined as the expected influence on v from labeled nodes of other categories, computed with respect to the set of nodes in the same category as v. A higher value indicates that the node is more strongly affected by the heterogeneity of other-category nodes in its subgraph, and vice versa. The edge conflict is then computed as the average of the conflicts of the two adjacent nodes. Given the edge weights of the invariant subgraph G_I, a subset of edges is sampled from the original edge set E, with p ∼ Bernoulli parameterized by those weights, to form the edges of the invariant subgraph. Because edge operations are non-differentiable, a reinforcement learning policy gradient is used for optimization: deleting or keeping an edge is treated as an action, and m actions are performed on the m edges according to the Bernoulli distribution. The reward function is... , where Y is the label and r is the causal representation that enables the downstream predictor to perform consistently across environments.
[0039] 2-3. To capture the connection patterns contained in the environment (variant) subgraph, a graph neural network encoder computes node representations on that subgraph. The KMeans algorithm clusters these representations to generate environment labels: each node is assigned to the latent environment of its nearest cluster center. The whole graph is thereby partitioned into pseudo-environments according to differences in local variant topology. The inferred environment labels make it possible to simulate and correct distribution shifts without explicit environment supervision.
[0040] 2-4. Based on the environment labels obtained in 2-3, a penalty term is introduced as an invariant regularizer to constrain the consistency of node representations across environments. The constraint uses the statistical variance M of the empirical risk across the environments; its definition involves the gradient computed during backpropagation and the inferred environment risk from 2-3.
[0041] After the node features are updated, a pooling step is performed to obtain a highly generalized individual representation, and finally a fully connected layer is used to complete the classification.
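The policy-gradient treatment of edge masking in sub-step 2-2 can be illustrated with a minimal REINFORCE sketch. It keeps one logit per edge, samples keep/delete actions from a Bernoulli distribution, and moves the logits along the score-function gradient weighted by the reward. The reward function of the invention is not reproduced in this text, so the toy reward below is purely hypothetical, as are the function names, the running-mean baseline, and the hyperparameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_edge_mask(reward_fn, m, steps=2000, lr=0.1, seed=0):
    """Learn per-edge keep probabilities with the REINFORCE estimator.

    reward_fn: maps a 0/1 action vector of length m to a scalar reward.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(m)        # one logit per edge, p = sigmoid(theta)
    baseline = 0.0             # running mean reward, reduces gradient variance
    for _ in range(steps):
        p = sigmoid(theta)
        action = (rng.random(m) < p).astype(float)   # 1 = keep, 0 = delete
        reward = reward_fn(action)
        # Gradient of log Bernoulli(action; sigmoid(theta)) w.r.t. theta
        # is (action - p); scale it by the baseline-corrected reward.
        theta += lr * (reward - baseline) * (action - p)
        baseline += 0.05 * (reward - baseline)
    return sigmoid(theta)

# Hypothetical toy reward: edges 0 and 1 help the predictor, edges 2 and 3 hurt.
good = np.array([1.0, 1.0, 0.0, 0.0])

def toy_reward(action):
    return float(np.sum(action * good) - np.sum(action * (1.0 - good)))

keep_prob = train_edge_mask(toy_reward, m=4)
```

After training, the keep probabilities of the two helpful edges rise above 0.5 and those of the two harmful edges fall below it. In the method described above, the reward would instead depend on the label Y and the causal representation r.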
[0042] This invention addresses the out-of-distribution generalization problem in population-graph-based disease identification, which has received little attention. After the population graph is constructed, topological conflict detection is used to disentangle the subgraphs, separating invariant subgraphs that carry causal relationships from variant subgraphs that carry environment-distinguishing features, which removes the sources of neighborhood shift at the structural level. The variant subgraphs are then used for unsupervised inference of latent environments: KMeans clustering generates virtual environment labels, which addresses the lack of environment labels in clinical scenarios and provides the basis for cross-environment constraints. Next, cross-environment semantic consistency regularization is introduced: by minimizing the variance of the empirical risk across virtual environments, it forces the model to focus on the essential disease features in the invariant subgraphs rather than on environment-dependent neighborhood structures. Finally, by jointly optimizing the classification loss and the constraint loss, the accuracy and interpretability of disease classification are further improved. Three common classification metrics are used to evaluate the brain graph classification model: accuracy (ACC), F1 score (F1), and area under the ROC curve (AUC).
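The environment inference and the variance-based regularizer described above can be sketched together. The sketch assumes the node embeddings of the variant subgraph are already available as an array `Z` (the graph neural network encoder is not shown), uses a plain Lloyd's k-means in place of a library KMeans, and takes the penalty to be the variance of the mean loss per inferred environment. The exact form of the penalty in the invention, including its use of gradients, is not reproduced in this text, so the objective below is an assumption. The names `kmeans_labels`, `invariance_objective`, and `lam` are hypothetical.

```python
import numpy as np

def kmeans_labels(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one pseudo-environment index per node."""
    Z = np.asarray(Z, dtype=float)
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]  # copy of k rows
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)           # nearest cluster center
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def invariance_objective(per_node_loss, env_labels, lam=1.0):
    """Mean environment risk plus lam times the variance of environment risks."""
    per_node_loss = np.asarray(per_node_loss, dtype=float)
    env_labels = np.asarray(env_labels)
    risks = np.array([per_node_loss[env_labels == e].mean()
                      for e in np.unique(env_labels)])
    penalty = risks.var()
    return risks.mean() + lam * penalty, penalty

# Two well-separated groups of embeddings give two pseudo-environments.
Z_demo = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
env_demo = kmeans_labels(Z_demo, k=2)
total_demo, penalty_demo = invariance_objective(
    [1.0, 1.0, 1.0, 3.0, 3.0, 3.0], env_demo, lam=1.0)
```

In the toy case the two environment risks are 1 and 3, so the penalty is 1 and the total objective is 3. A model whose loss is the same in every environment would incur no penalty.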
[0043] Example:
[0044] The following example, using data from Nanjing Brain Hospital, illustrates the mental disorder identification method based on invariant learning robust graph neural networks of this invention.
[0045] Experimental conditions: The experiment was run on a computer with an Intel processor (3.4 GHz), 10 GB of random access memory, and a 64-bit operating system, using the Python 3 programming language.
[0046] The experimental data used in this example were provided by Nanjing Brain Hospital and comprise 124 cases of bipolar disorder, 366 cases of schizophrenia, and 256 healthy controls.
[0047] Figure 3 presents the comparative experimental results on the three-class classification task, with comparisons against the advanced graph-based out-of-distribution generalization methods StableGNN and IGM. According to Figure 3, the method of this invention outperforms these existing methods in terms of accuracy (ACC), F1 score (F1), and area under the ROC curve (AUC).
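For reference, the three metrics can be computed for a three-class task as in the sketch below. It assumes macro averaging for F1 and a one-vs-rest macro average for AUC; the text does not state which averaging scheme was used, so both are assumptions. The function names are hypothetical.

```python
import numpy as np

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

def binary_auc(is_pos, score):
    """AUC as the probability that a positive outranks a negative (ties = 0.5).

    Requires at least one positive and one negative sample.
    """
    pos, neg = score[is_pos], score[~is_pos]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def macro_ovr_auc(y_true, prob):
    """One-vs-rest AUC averaged over classes; prob has shape (N, n_classes)."""
    return float(np.mean([binary_auc(y_true == c, prob[:, c])
                          for c in range(prob.shape[1])]))

# Toy labels: 0 = bipolar disorder, 1 = schizophrenia, 2 = healthy control.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
acc_demo = accuracy(y_true, y_pred)
f1_demo = macro_f1(y_true, y_pred, n_classes=3)
```

Because the three classes in this dataset are imbalanced (124, 366, and 256 cases), macro averaging weights each class equally, while a weighted average would favor the larger classes.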
[0048] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than those shown in the embodiments and may still achieve the desired results.
Claims
1. A method for analyzing the results of mental disorder identification based on an invariant-learning robust graph neural network, characterized in that it includes the following steps: Step 1: given the resting-state functional magnetic resonance imaging (fMRI) images of the samples, construct a functional brain connectivity graph after preprocessing; Step 2: using the functional connectivity graph constructed in Step 1 as input, construct a population graph, apply subgraph disentanglement, environment clustering, and invariant learning constraints, and evaluate model performance in out-of-distribution generalization scenarios.
2. The method for analyzing the results of mental disorder identification based on an invariant-learning robust graph neural network according to claim 1, characterized in that Step 1 specifically includes the following sub-steps: 1-1. Standardize and preprocess the resting-state fMRI images, and register each individual's brain images to a standard space. 1-2. Based on a preset brain parcellation template, extract the mean blood-oxygenation-level-dependent (BOLD) signal of each region of interest. 1-3. Compute the functional connectivity strength between brain regions from the mean time series of each region, using correlation analysis to establish the associations between regions; the threshold ratio method (proportional thresholding) then converts the continuous correlation values into a binary connectivity matrix, which serves as the topological representation of the graph, i.e., the adjacency matrix. 1-4. Combine the in-degree and out-degree attributes of each brain region within each time window with its mean time-series signal to form the node feature vector; together with the binary adjacency matrix above, this completes the functional connectivity graph.
3. The method for analyzing the results of mental disorder identification based on an invariant-learning robust graph neural network according to claim 2, characterized in that Step 2 specifically includes the following sub-steps: 2-1. Construct a population graph G = (X, A) from the individual functional connectivity graphs, where X represents the individual image features and A denotes the associations between individuals, obtained by computing the similarity of image features between individuals. 2-2. A subgraph generator derives invariant and variant subgraphs from the population graph; the graph topology is used to measure the distribution of node influence through a random walk with restart, which is defined by the restart probability, the identity matrix I, and the original adjacency matrix A normalized by the degree matrix D; a subset of edges is sampled from the original edge set E, with p ∼ Bernoulli, to serve as the edges of the invariant subgraph. 2-3. To capture the connection patterns contained in the environment (variant) subgraph, a graph neural network encoder computes node representations on that subgraph; the KMeans algorithm clusters these representations to generate environment labels, assigning each node to the latent environment of its nearest cluster center; the whole graph is thereby partitioned into pseudo-environments according to differences in local variant topology, and the inferred environment labels make it possible to simulate and correct distribution shifts without explicit environment supervision. 2-4. Based on the obtained environment labels, a penalty term is introduced as an invariant regularizer to constrain the consistency of node representations across environments; the constraint uses the statistical variance M of the empirical risk across the environments, and its definition involves the gradient computed during backpropagation and the inferred environment risk from 2-3.
4. The method for analyzing the results of mental disorder identification based on an invariant-learning robust graph neural network according to claim 3, characterized in that, in step 2-1, the associations between individuals are obtained by computing the image feature similarity between individuals and are denoted A ∈ ℝ^(N×N), where a_ij is the weight of the edge between individuals i and j; the connection between two individuals is evaluated through the feature similarity Fea_sim, which is defined from the correlation distance between x_i and x_j, the image features of the i-th and j-th individuals, and uses the means of x_i and x_j; the similarity also indicates whether two individuals are positively correlated; the Fea_sim matrix is used both to evaluate the strength of the connections between individuals and to construct the adjacency A of the population graph.
5. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the results analysis method for mental disorder identification based on an invariant-learning robust graph neural network according to any one of claims 1 to 4.
6. A computer-readable storage medium storing computer instructions, characterized in that, when executed by a processor, the computer instructions implement the results analysis method for mental disorder identification based on an invariant-learning robust graph neural network according to any one of claims 1 to 4.