A brain network representation learning method and system based on multi-view diffusion

By using a multi-view diffusion mechanism to collaboratively mine global functional connectivity and local neural activity within a unified framework, the problem of information silos in existing technologies is solved, and the robustness and accuracy of brain network analysis are improved, especially in the ADNI tri-class classification task.

CN122290907A · Pending · Publication Date: 2026-06-26 · ZHEJIANG GONGSHANG UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
ZHEJIANG GONGSHANG UNIVERSITY
Filing Date
2026-03-13
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing GNN-based brain network analysis methods construct models from a single perspective, making it difficult to comprehensively and accurately capture the global-local collaborative working characteristics of the brain. Furthermore, rs-fMRI data suffers from high dimensionality, inter-individual physiological differences, and noise interference, resulting in low information utilization efficiency and poor model robustness.

Method used

A functional connectivity (FC) view and a local activity (LA) view are constructed simultaneously under a shared adjacency matrix topology. A multi-view diffusion mechanism then collaboratively mines and deeply integrates global functional connectivity and local neural activity information, with information interaction and noise suppression achieved through a cross-attention mechanism.

Benefits of technology

Global functional connectivity and local neural activity are deeply integrated and interact dynamically within a unified framework, which significantly improves the robustness and accuracy of the model; in particular, accuracy improves by 12.14 percentage points on the ADNI three-class classification task.

✦ Generated by Eureka AI based on patent content.

[Figure CN122290907A_ABST]

Abstract

This invention discloses a brain network representation learning method and system based on multi-view diffusion. First, a shared adjacency matrix is constructed from the same resting-state functional magnetic resonance imaging (rs-fMRI) data based on the Pearson correlation coefficient, and a functional connectivity (FC) view and a local activity (LA) view are both built under this shared topology. Second, the commonality strength scores of node features in the FC and LA views are calculated, with singular value decomposition used to identify and discard common noise features across subjects. Then, intra-view topology enhancement and inter-view feature propagation are performed synchronously on a unified graph topology, and information interaction is achieved through a cross-attention mechanism. Finally, the dual-view representations are adaptively fused using an attention mechanism to complete disease diagnosis. This invention solves the problems of incomplete brain functional state representation and insufficient information utilization caused by the reliance on a single perspective in existing brain network analysis methods. It can effectively suppress noise and significantly improve the classification performance of brain diseases in complex scenarios such as class imbalance.

Description

Technical Field

[0001] This invention relates to the field of medical image processing and artificial intelligence, and in particular to a brain network representation learning method and system based on multi-view diffusion.

Background Technology

[0002] With the rapid development of neuroimaging and computational science, brain functional networks constructed based on resting-state functional magnetic resonance imaging (rs-fMRI) have become important tools for studying the intrinsic functional organization of the brain, revealing the pathological mechanisms of neuropsychiatric diseases, and developing objective biomarkers. Graph Neural Networks (GNNs), with their powerful graph structure data modeling capabilities, have shown great potential in the field of brain network analysis, automatically learning discriminative representations from complex brain connectivity patterns.

[0003] However, most existing GNN-based brain network analysis methods suffer from a fundamental limitation: they typically construct brain network models from only a single perspective. Specifically, one type of method focuses on global functional connectivity (FC) patterns, constructing a whole-brain connectivity matrix by calculating the statistical correlations (such as Pearson correlation coefficients) between the blood-oxygen-level-dependent (BOLD) signal time series of different brain regions, and using this as the adjacency matrix or node features of the graph. Another type of method emphasizes local activity (LA) intensity indicators, such as amplitude of low-frequency fluctuation (ALFF) or fractional ALFF (fALFF), to quantify the intensity of spontaneous neural activity within a single brain region. This "information silo" modeling paradigm struggles to comprehensively and accurately capture the global-local collaborative characteristics exhibited by the brain as a complex dynamic system, resulting in incomplete representations of brain functional states and low information utilization efficiency.

[0004] Although a few studies have attempted to combine FC and LA features, their fusion strategies are often too simplistic and crude. For example, they may directly concatenate the two feature vectors before inputting them into the model, or model the two views independently and then fuse the results later. These methods fail to effectively model the potential, dynamic interaction and synergistic mechanisms between FC and LA on a unified graph topology, and fail to fully utilize their complementarity in characterizing brain function.

[0005] Furthermore, the inherent high dimensionality of rs-fMRI data, significant individual physiological differences, and various noises introduced during the scanning process (such as physiological noises like head movement, heartbeat, and respiration, as well as equipment-related system noise) result in a large number of common components in the raw brain network features that are unrelated to specific cognitive tasks or disease states. These common noises can severely interfere with the model's learning process, leading to overfitting and reducing the model's generalization ability and robustness in real-world clinical scenarios.

[0006] Therefore, there is an urgent need for a novel brain network representation learning framework that can:

[0007] 1. Deep Fusion: Within a unified mathematical and computational framework, collaboratively mine and deeply fuse two complementary types of information from the same rs-fMRI data: global functional connectivity and local neural activity.

[0008] 2. Dynamic interaction: Explicitly model the dynamic interaction and information transmission process between the two in the brain network topology;

[0009] 3. Strong robustness: Provide an effective noise suppression mechanism that can identify and filter out common noise across subjects, thereby learning more discriminative and stable brain network representations.

Summary of the Invention

[0010] To address the aforementioned technical problems, this invention proposes a brain network representation learning method and system based on multi-view graph diffusion. This invention collaboratively constructs functional connectivity (FC) and local activity (LA) views from the same rs-fMRI data set. Through an innovative multi-view graph diffusion mechanism, it synchronously performs topology enhancement within views and feature propagation between views on a shared graph topology, ultimately achieving deep information fusion and effective noise suppression, thereby learning more discriminative brain network representations.

[0011] In a first aspect, this invention proposes a brain network representation learning method based on multi-view diffusion, comprising the following steps:

[0012] Multi-view construction: From the same resting-state functional magnetic resonance imaging (rs-fMRI) data, a shared adjacency matrix is constructed based on the Pearson correlation coefficient, and both the FC (functional connectivity) view and the LA (local activity) view are simultaneously constructed under the shared topology;

[0013] Feature purification: Calculate the commonality strength score of the node features of the FC view and the LA view, and discard the top λ feature dimensions with the highest commonality strength scores to obtain the purified FC' view and LA' view;

[0014] Multi-view diffusion: On the purified FC' view and LA' view, topological diffusion within the view and feature propagation between the views are performed simultaneously, and information interaction is achieved through the cross attention mechanism to obtain an enhanced multi-view node representation;

[0015] Fusion classification: The enhanced multi-view node representations are fused through a node-level attention mechanism to generate a comprehensive embedding vector for each brain region. This vector is then processed by global pooling and a multilayer perceptron classifier to output the classification prediction result.

[0016] Secondly, this invention also proposes a brain network representation learning system based on multi-view diffusion, comprising:

[0017] The multi-view construction module is used to construct a shared adjacency matrix based on the Pearson correlation coefficient from the same resting-state functional magnetic resonance imaging (fMRI) data, and simultaneously construct FC and LA views under a shared topology.

[0018] The feature purification module is used to calculate the commonality strength score of the node features of the FC view and the LA view, and discard the top λ feature dimensions with the highest commonality strength scores to obtain the purified FC' view and LA' view.

[0019] The multi-view diffusion module is used to simultaneously perform intra-view topology diffusion and inter-view feature propagation on the purified FC' view and LA' view, and realize information interaction through the cross attention mechanism to obtain an enhanced multi-view node representation;

[0020] The fusion classification module is used to fuse the enhanced multi-view node representations through a node-level attention mechanism to generate a comprehensive embedding vector for each brain region. This vector is then processed by global pooling and a multilayer perceptron classifier to output the classification prediction result.

[0021] Thirdly, the present invention also provides a computer device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above-described brain network representation learning method based on multi-view diffusion.

[0022] Fourthly, the present invention also provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps in the above-described brain network representation learning method based on multi-view diffusion.

[0023] Fifthly, the present invention also provides a computer program product, including a computer program that, when executed by a processor, implements the steps in the above-described brain network representation learning method based on multi-view diffusion.

[0024] The beneficial effects of this invention are:

[0025] This invention applies the multi-view diffusion mechanism to brain network representation learning, breaking down the information silos in existing methods by explicitly modeling the dynamic interactions between functional connectivity and local activity. It fully leverages the advantages of graph diffusion theory in smoothing noise and capturing long-range dependencies, and combines this with an SVD-based feature purification strategy to significantly improve the model's robustness. Experiments on the ADHD-200 and ADNI public benchmark datasets demonstrate that this invention outperforms state-of-the-art methods on multiple evaluation metrics, particularly on the highly challenging imbalanced ADNI three-class classification task, where it improves accuracy by 12.14 percentage points while maintaining minimal variance across runs. This demonstrates the performance and stability of this invention in processing complex, real-world brain imaging data, providing a powerful tool for the precise auxiliary diagnosis of neuropsychiatric disorders.

Brief Description of the Drawings

[0026] Figure 1 is a schematic diagram of the brain network representation learning method based on multi-view diffusion provided in an embodiment of this application;

[0027] Figure 2 is a schematic diagram of the multi-view construction process provided in an embodiment of this application.

Detailed Implementation

[0028] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.

[0029] In one embodiment, as shown in Figure 1, this invention proposes a brain network representation learning method based on multi-view diffusion. First, the BOLD time series of each subject is obtained from an rs-fMRI data source; then, through multi-view construction, an FC view and an LA view with a shared adjacency matrix A are generated; next, SVD is used to denoise the node features of the two views; subsequently, K iterations are performed on the shared topology to achieve intra-view enhancement and inter-view interaction; finally, diagnostic results are output through fusion classification. Specifically, the method includes the following steps:

[0030] S1: Multi-view Construction

[0031] Furthermore, as shown in Figure 2, S1 constructs a functional connectivity (FC) view and a local activity (LA) view from the same set of resting-state functional magnetic resonance imaging (rs-fMRI) data, and forces the FC view and the LA view to share the same adjacency matrix composed of Pearson correlation coefficients, thereby ensuring that they are analyzed under the same brain network topology.

[0032] Specifically, the brain is divided into N regions of interest (ROIs) using a publicly available brain atlas (such as AAL90). For each subject, the Pearson correlation coefficient between every pair of the N ROI time series is calculated, yielding an N×N adjacency matrix A, which is then used to construct the FC view $G^{FC} = (A, X^{FC})$, where the i-th row of $X^{FC}$ is the i-th row of A and represents the connectivity profile of the i-th ROI. Simultaneously, the amplitude of low-frequency fluctuation (ALFF) of each ROI time series is calculated to construct the LA view $G^{LA} = (A, X^{LA})$, where the i-th row of $X^{LA}$ contains the ALFF value of the i-th ROI. The two views share the adjacency matrix A.

[0033] Furthermore, in the multi-view construction step, the node feature of the FC view is the connectivity profile, that is, the corresponding row of the shared adjacency matrix. This row vector fully describes the functional connectivity pattern between the corresponding brain region and all other brain regions in the whole brain. The node feature of the LA view is the amplitude of low-frequency fluctuation (ALFF), which characterizes the intensity of local neural activity or the functional importance of the corresponding brain region.
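The construction in S1 can be illustrated with a minimal NumPy sketch. It is not the applicant's implementation: the function name `build_views`, the repetition time `tr`, and the 0.01 to 0.08 Hz band are assumptions made for illustration, and ALFF is approximated here as the mean FFT amplitude inside that band (published ALFF definitions differ in normalization details).

```python
import numpy as np

def build_views(bold, tr=2.0, band=(0.01, 0.08)):
    """bold: (T, N) array of ROI time series for one subject (hypothetical layout)."""
    # Shared adjacency: Pearson correlation between every pair of ROI time series.
    A = np.corrcoef(bold, rowvar=False)            # (N, N)
    # FC view: the feature of node i is row i of A (its connectivity profile).
    X_fc = A.copy()
    # LA view: ALFF per ROI, approximated as the mean FFT amplitude in the band.
    T = bold.shape[0]
    freqs = np.fft.rfftfreq(T, d=tr)               # frequency of each FFT bin, in Hz
    amp = np.abs(np.fft.rfft(bold - bold.mean(axis=0), axis=0)) / T
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    X_la = amp[in_band].mean(axis=0)[:, None]      # (N, 1)
    return A, X_fc, X_la

# Synthetic example: 200 time points, 90 ROIs (as with an AAL90 parcellation).
rng = np.random.default_rng(0)
A, X_fc, X_la = build_views(rng.standard_normal((200, 90)))
```

Both views are returned together with the same matrix `A`, which mirrors the requirement that the FC and LA views share one topology.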

[0034] S2: Feature Purification

[0035] Furthermore, S2 performs singular value decomposition (SVD) on the node features of the FC view and LA view respectively, calculates the commonality strength score of each feature dimension, and discards the top λ feature dimensions with the highest commonality strength scores to obtain the purified FC' view and LA' view, so as to effectively suppress commonality noise across subjects.

[0036] Specifically, SVD-based feature discarding is performed on the FC view and the LA view separately, using the training set. In the feature purification step, the commonality strength score is calculated as follows. For any view (taking the FC view as an example), the feature dimension is N. For each brain region node n (1 ≤ n ≤ N), the N-dimensional feature vectors of all M training subjects at that node are collected to form an N×M matrix $X_n$. Singular value decomposition is performed on this matrix, $X_n = U_n \Sigma_n V_n^{\top}$. The first column $u_n$ of the left singular matrix $U_n$ is taken, and the absolute value of each of its elements, $s_{n,d} = |u_{n,d}|$ (d = 1, ..., N), is the commonality strength score of the d-th feature dimension for that node. The scores of all N nodes on the d-th feature dimension are averaged to obtain the global average score of that dimension, $\bar{s}_d = \frac{1}{N}\sum_{n=1}^{N} s_{n,d}$. The N feature dimensions are sorted by $\bar{s}_d$ in descending order, and the top λ dimensions are discarded. Exactly the same operation is performed on the LA view. The SVD feature discard mask (i.e., which dimensions to discard) is computed only on the training set and is then applied unchanged to the validation and test sets, yielding the final purified features $\tilde{X}^{FC}$ and $\tilde{X}^{LA}$.
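The scoring and discarding procedure described above can be sketched as follows. The function name `commonality_mask` and the (subjects, nodes, dimensions) array layout are assumptions for illustration; the per-node SVD, the absolute value of the first left singular vector, the averaging over nodes, and the top-λ discard follow the text.

```python
import numpy as np

def commonality_mask(X_train, lam):
    """X_train: (M, N, D) features of M training subjects, N nodes, D feature dims.
    Returns a boolean keep-mask of length D (False marks a discarded dimension)."""
    M, N, D = X_train.shape
    scores = np.zeros((N, D))
    for n in range(N):
        mat = X_train[:, n, :].T                   # (D, M): rows = dims, cols = subjects
        U, _, _ = np.linalg.svd(mat, full_matrices=False)
        scores[n] = np.abs(U[:, 0])                # |first left singular vector|
    mean_score = scores.mean(axis=0)               # average over all nodes
    drop = np.argsort(mean_score)[::-1][:lam]      # top-lam highest-scoring dims
    keep = np.ones(D, dtype=bool)
    keep[drop] = False
    return keep

# Synthetic check: dimension 3 carries a strong offset shared by every subject.
rng = np.random.default_rng(0)
X = 0.1 * rng.standard_normal((20, 6, 8))
X[:, :, 3] += 5.0
keep = commonality_mask(X, lam=1)
X_purified = X[:, :, keep]                         # the same mask is reused on val/test data
```

In this toy data the shared offset dominates the first singular direction at every node, so dimension 3 is the one the mask removes.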

[0037] S3: Multi-view diffusion

[0038] Furthermore, S3 simultaneously performs intra-view topological diffusion (i.e., feature aggregation based on a shared adjacency matrix) and inter-view feature propagation (i.e., information exchange between FC and LA views) on the purified FC' and LA' views, and dynamically adjusts the information flow intensity between views through a cross-attention mechanism. After K iterations, this process yields an enhanced multi-view node representation rich in interactive information.

[0039] Furthermore, topological diffusion within a view and feature propagation between views are achieved through the following discretized iterative formula:

[0040]

[0041]

[0042] Here k is the index of the current diffusion layer (k = 0, 1, ..., K−1), and η is the diffusion step size (satisfying 0 < η < 1 to ensure numerical stability). $A_{ij}$ is an element of the shared adjacency matrix, and $h_i^{FC,(k)}$ and $h_i^{LA,(k)}$ are the node representations of the i-th brain region at the k-th layer in the FC view and the LA view, respectively. $\alpha_i^{LA \to FC}$ and $\alpha_i^{FC \to LA}$ are the inter-view diffusion coefficients calculated by the cross-attention mechanism; they dynamically control the intensity of information flow from the LA view to the FC view and from the FC view to the LA view, respectively.

[0043] Furthermore, the calculation method for the cross-attention mechanism is as follows:

[0044]

[0045]

[0046] Here σ is the sigmoid function, and the three matrices appearing in the formulas are learnable projection matrices.

[0047] Specifically, the purified node features of the FC' and LA' views obtained in S2 are used as the initial node representations $H^{FC,(0)}$ and $H^{LA,(0)}$. The number of diffusion layers is set to K = 4 and the diffusion step size to η = 0.1, and four diffusion updates are performed according to the discretized iterative formula. In each update, the cross-attention mechanism dynamically calculates the inter-view diffusion coefficient α based on the node representations of the current layer, thereby achieving adaptive information interaction.
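The exact update formulas are not reproduced in this text, so the following is only one plausible reading of the description: an explicit Euler-style step of size η that combines neighbourhood averaging over the shared adjacency matrix with a sigmoid-gated pull toward the other view. The row normalization of |A|, the scaled dot-product gate, the use of two projection matrices `W_q` and `W_k` (the text mentions three), and the assumption that both views have already been projected to a common dimension d are all illustrative choices, not the patent's definitions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_gate(h_dst, h_src, W_q, W_k):
    """Per-node gate in (0, 1): sigmoid of a scaled dot product between projected views."""
    q, k = h_dst @ W_q, h_src @ W_k
    return sigmoid((q * k).sum(axis=1, keepdims=True) / np.sqrt(q.shape[1]))

def diffuse(A, H_fc, H_la, W_q, W_k, K=4, eta=0.1):
    # Row-normalise |A| so that every row of P sums to 1.
    P = np.abs(A) / np.abs(A).sum(axis=1, keepdims=True)
    for _ in range(K):
        a_fc = cross_gate(H_fc, H_la, W_q, W_k)    # controls LA -> FC flow
        a_la = cross_gate(H_la, H_fc, W_q, W_k)    # controls FC -> LA flow
        # Intra-view term: move toward the neighbourhood average.
        # Inter-view term: move toward the other view, scaled by the gate.
        new_fc = H_fc + eta * ((P @ H_fc - H_fc) + a_fc * (H_la - H_fc))
        new_la = H_la + eta * ((P @ H_la - H_la) + a_la * (H_fc - H_la))
        H_fc, H_la = new_fc, new_la
    return H_fc, H_la

rng = np.random.default_rng(0)
N, d = 10, 5
A = np.corrcoef(rng.standard_normal((50, N)), rowvar=False)
H_fc0, H_la0 = rng.standard_normal((N, d)), rng.standard_normal((N, d))
W_q, W_k = 0.1 * rng.standard_normal((d, d)), 0.1 * rng.standard_normal((d, d))
H_fc4, H_la4 = diffuse(A, H_fc0, H_la0, W_q, W_k, K=4, eta=0.1)
```

With η = 0.1 each update in this sketch is a convex combination of existing node vectors, which is one way to see why a small step size keeps the iteration numerically stable; the two views also move closer together at every layer.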

[0048] S4: Integration and Classification

[0049] Furthermore, S4 adaptively fuses the enhanced multi-view node representations through a node-level attention mechanism to generate an optimal comprehensive embedding vector for each brain region. Subsequently, a global pooling operation is performed on the comprehensive embeddings of all brain regions to generate a final graph-level representation, which is then input into a multilayer perceptron (MLP) classifier to complete brain disease diagnosis or other downstream prediction tasks.

[0050] Furthermore, after the K diffusion layers of S3, the matrices $H^{FC,(K)} \in \mathbb{R}^{N \times d}$ and $H^{LA,(K)} \in \mathbb{R}^{N \times d}$ are obtained. Here $H^{FC,(K)}$ is the enhanced node representation matrix of the FC view after K layers of multi-view graph diffusion; its i-th row $h_i^{FC,(K)}$ is the final embedding vector of the i-th brain region from the perspective of global functional connectivity. $H^{LA,(K)}$ is the enhanced node representation matrix of the LA view after K layers of multi-view graph diffusion; its i-th row $h_i^{LA,(K)}$ is the final embedding vector of the i-th brain region from the perspective of local neural activity. N is the number of brain regions, and d is the embedding dimension. For each node i, the attention weight $\beta_i$ is calculated and the two view embeddings are fused into $z_i$. Global average pooling over all $z_i$ yields the graph-level representation. Finally, the prediction result is obtained through an MLP classifier. The model is trained end-to-end using cross-entropy loss.

[0051] Furthermore, in the fusion classification step, the attention mechanism is calculated as follows:

[0052]

[0053]

[0054] Here $h_i^{FC,(K)}$ and $h_i^{LA,(K)}$ are the final embedding vectors of the FC view and the LA view at the i-th brain region node, respectively; the three remaining symbols in the formulas are learnable parameters; σ is the sigmoid function; and the weight $\beta_i \in [0, 1]$ reflects the importance of global connectivity patterns relative to local activity intensity when characterizing the functional state of the i-th brain region. Finally, global average pooling is performed over the fused representations $\{z_i\}_{i=1}^{N}$ of all N brain regions to obtain a graph-level representation, which is then fed into an MLP classifier for final prediction.
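Since the attention formulas themselves are missing from this text, the sketch below shows one common form of node-level gated fusion that is consistent with the description: a sigmoid gate $\beta_i$ computed from both view embeddings, a convex combination of the two, global average pooling, and an MLP. The gate parameterization (`w`, `b` on the concatenated embeddings), the single ReLU hidden layer, and all parameter names are assumptions; the text mentions three learnable parameters whose exact roles are not recoverable here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_and_classify(H_fc, H_la, w, b, W1, b1, W2, b2):
    # Node-level gate beta_i in (0, 1), computed from the concatenated view embeddings.
    beta = sigmoid(np.concatenate([H_fc, H_la], axis=1) @ w + b)[:, None]   # (N, 1)
    Z = beta * H_fc + (1.0 - beta) * H_la          # fused embedding per brain region
    g = Z.mean(axis=0)                             # global average pooling -> (d,)
    hidden = np.maximum(0.0, g @ W1 + b1)          # one hidden layer with ReLU
    logits = hidden @ W2 + b2
    p = np.exp(logits - logits.max())              # numerically stable softmax
    return beta.ravel(), p / p.sum()

rng = np.random.default_rng(0)
N, d, hidden_dim, n_classes = 90, 16, 8, 3
H_fc, H_la = rng.standard_normal((N, d)), rng.standard_normal((N, d))
beta, probs = fuse_and_classify(
    H_fc, H_la,
    w=0.1 * rng.standard_normal(2 * d), b=0.0,
    W1=0.1 * rng.standard_normal((d, hidden_dim)), b1=np.zeros(hidden_dim),
    W2=0.1 * rng.standard_normal((hidden_dim, n_classes)), b2=np.zeros(n_classes),
)
```

A value of `beta[i]` near 1 means the fused embedding of region i leans on its connectivity-based representation, and a value near 0 means it leans on its local-activity representation, which is the interpretability property the description attributes to the weight.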

[0055] Ultimately, the multi-view diffusion brain network system provides objective and interpretable imaging biomarkers for the early screening, accurate classification, and individualized diagnosis and treatment of neuropsychiatric diseases (such as attention deficit hyperactivity disorder (ADHD) and Alzheimer's disease (AD)) by analyzing patients' physiological indicators.

[0056] In one exemplary embodiment, a brain network representation learning system based on multi-view diffusion is provided, comprising:

[0057] The multi-view construction module is used to construct a shared adjacency matrix based on the Pearson correlation coefficient from the same resting-state functional magnetic resonance imaging (fMRI) data, and simultaneously construct FC and LA views under a shared topology.

[0058] The feature purification module is used to calculate the commonality strength score of the node features of the FC view and the LA view, and discard the top λ feature dimensions with the highest commonality strength scores to obtain the purified FC' view and LA' view.

[0059] The multi-view diffusion module is used to simultaneously perform intra-view topology diffusion and inter-view feature propagation on the purified FC' view and LA' view, and realize information interaction through the cross attention mechanism to obtain an enhanced multi-view node representation;

[0060] The fusion classification module is used to fuse the enhanced multi-view node representations through a node-level attention mechanism to generate a comprehensive embedding vector for each brain region. This vector is then processed by global pooling and a multilayer perceptron classifier to output the classification prediction result.

[0061] In one exemplary embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, the processor executing the computer program to implement the steps in the above-described brain network representation learning method based on multi-view diffusion.

[0062] In one exemplary embodiment, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the steps in the above-described brain network representation learning method based on multi-view diffusion.

[0063] In one exemplary embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above-described brain network representation learning method based on multi-view diffusion.

[0064] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed in this application can be implemented in electronic hardware or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0065] The above are merely preferred embodiments of the present invention. The scope of protection of the present invention is not limited to the above embodiments. All technical solutions falling within the scope of the present invention's concept are within the scope of protection of the present invention. It should be noted that for those skilled in the art, any improvements and modifications made without departing from the principles of the present invention should be considered within the scope of protection of the present invention.

Claims

1. A brain network representation learning method based on multi-view diffusion, characterized in that it comprises the following steps: multi-view construction: from the same resting-state functional magnetic resonance imaging data, a shared adjacency matrix is constructed based on the Pearson correlation coefficient, and an FC view and an LA view are constructed simultaneously under the shared topology; feature purification: the commonality strength scores of the node features of the FC view and the LA view are calculated, and the top λ feature dimensions with the highest commonality strength scores are discarded to obtain the purified FC' view and LA' view; multi-view diffusion: on the purified FC' view and LA' view, topological diffusion within each view and feature propagation between the views are performed simultaneously, and information interaction is achieved through the cross-attention mechanism to obtain an enhanced multi-view node representation; fusion classification: the enhanced multi-view node representations are fused through a node-level attention mechanism to generate a comprehensive embedding vector for each brain region, and this vector is then processed by global pooling and a multilayer perceptron classifier to output the classification prediction result.

2. The method according to claim 1, characterized in that the node feature of each node in the FC view is its corresponding row vector in the shared adjacency matrix, namely the connectivity profile, which fully describes the functional connectivity pattern between that brain region and the other brain regions; and the node feature of the LA view is the amplitude of low-frequency fluctuation, which characterizes the intensity of local neural activity or the functional importance of the corresponding brain region.

3. The method according to claim 1 or 2, characterized in that, The topological diffusion within the view and the feature propagation between views are achieved through a discretized iterative formula.

4. The method according to claim 3, characterized in that, The cross-attention mechanism dynamically calculates the diffusion coefficient between different views based on the node representation of the current layer, thereby achieving adaptive information interaction between multiple views.

5. The method according to claim 1, characterized in that, for each brain region node, the commonality strength score of the node features in the FC view and the LA view is calculated according to the following steps: feature matrix construction: the N-dimensional feature vectors of all M subjects in the training set at node n are collected to form an N×M feature matrix, where rows correspond to feature dimensions and columns correspond to different subjects; singular value decomposition: singular value decomposition is performed on the feature matrix; commonality strength score extraction: the first column vector of the left singular matrix is taken, and the absolute value of each element in the vector is taken to obtain the commonality strength scores of the feature dimensions at that node.

6. The method according to claim 1 or 5, characterized in that, after the commonality strength scores of the node features in the FC view and the LA view are calculated, discarding the top λ feature dimensions with the highest commonality strength scores includes the following steps: the arithmetic mean of the commonality strength score vectors of all nodes is calculated to obtain the global average commonality strength score vector; the dimensions of the global average commonality strength score vector are arranged in descending order of score, and the λ highest-ranked feature dimensions are discarded.

7. A brain network representation learning system based on multi-view diffusion, characterized in that it comprises: a multi-view construction module, used to construct a shared adjacency matrix based on the Pearson correlation coefficient from the same resting-state functional magnetic resonance imaging data, and to simultaneously construct an FC view and an LA view under the shared topology; a feature purification module, used to calculate the commonality strength scores of the node features of the FC view and the LA view, and to discard the top λ feature dimensions with the highest commonality strength scores to obtain the purified FC' view and LA' view; a multi-view diffusion module, used to simultaneously perform intra-view topology diffusion and inter-view feature propagation on the purified FC' view and LA' view, and to realize information interaction through the cross-attention mechanism to obtain an enhanced multi-view node representation; and a fusion classification module, used to fuse the enhanced multi-view node representations through a node-level attention mechanism to generate a comprehensive embedding vector for each brain region, this vector then being processed by global pooling and a multilayer perceptron classifier to output the classification prediction result.

8. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 6.

9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.

10. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.