A method, system and application for diagnosing a motor cognitive risk syndrome
By combining Transformer with graph attention network, deep fusion of multimodal features and interpretable diagnosis are achieved, solving the problems of incomplete modality coverage and poor interpretability in MCR diagnosis, and improving the accuracy of early identification and the credibility of clinical application.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING TIANTAN HOSPITAL AFFILIATED TO CAPITAL MEDICAL UNIV
- Filing Date
- 2026-03-24
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies have incomplete modality coverage, simple feature fusion methods, and poor model interpretability in the diagnosis of motor cognitive risk syndrome (MCR), making it difficult to achieve early and accurate identification and clinical translation.
By employing the attention mechanism of Transformer to fuse multidimensional behavioral features and modeling brain networks through graph attention networks, combined with gradient weighted analysis, we can achieve cross-modal deep interaction and highly interpretable diagnosis.
It improves the early identification accuracy of MCR, enhances the interpretability and clinical credibility of the model, provides clear explanations of key behavioral features and abnormal brain regions, and is applicable to the assessment of other neurodegenerative diseases.
Figure CN122245771A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of medical artificial intelligence technology, specifically to an intelligent diagnostic method, system, and application for motor cognitive risk syndrome based on multimodal feature fusion.
Background Technology
[0002] Dementia and related neurodegenerative diseases pose a significant health challenge to aging societies worldwide. Research indicates a lengthy prodromal phase before a clinical diagnosis of dementia, and early identification and intervention during this phase are crucial for slowing disease progression. Motor cognitive risk syndrome (MCR), an important prodromal state of dementia, is characterized by the coexistence of slowed gait and subjective cognitive decline, without a definitive dementia diagnosis. Epidemiological data show that individuals with MCR have a significantly higher risk of developing dementia within 3-5 years than the general elderly population; therefore, accurate and early identification of MCR has significant clinical implications.
[0003] Currently, clinical assessment of MCR primarily relies on neuropsychological scales combined with simple gait speed measurements. This method is highly subjective, has limited sensitivity, and struggles to quantify subtle early behavioral changes. Research has confirmed that the occurrence of MCR is closely related to multidimensional behavioral manifestations and changes in brain function. On the one hand, behavioral characteristics such as gait, eye movements, and language exhibit quantifiable abnormalities in the MCR stage; on the other hand, multimodal brain imaging data (such as structural magnetic resonance imaging, functional magnetic resonance imaging, and diffusion tensor imaging) can reveal brain structural atrophy, abnormal functional connectivity, and alterations in brain network topology associated with MCR. These multi-source heterogeneous features collectively reflect the underlying pathological mechanisms of MCR.
[0004] Using artificial intelligence to analyze multimodal medical data has become a research hotspot in the auxiliary diagnosis of neurodegenerative diseases. For example, patent literature (publication number CN107506797A) provides a multimodal imaging Alzheimer's disease classification method based on deep neural networks. This method collects imaging data such as MRI and PET scans, combines them with cerebrospinal fluid biochemical indicators, and classifies them with a deep learning model after feature concatenation. Another existing technology (Guan XJ, Guo T, Zhou C, Gao T, Wu JJ, Han V, Cao S, Wei HJ, Zhang YY, Xuan M, Gu QQ, Huang PY, Liu CL, Pu JL, Zhang BR, Cui F, Xu XJ, Zhang MM. A multiple-tissue-specific magnetic resonance imaging model for diagnosing Parkinson's disease: a brain radiomics study. Neural Regen Res. 2022 Dec;17(12):2743-2749.) involves a Parkinson's disease diagnostic model based on multimodal magnetic resonance radiomics, which extracts features with radiomics methods and builds the diagnostic model with machine learning algorithms such as random forest.
[0005] However, when applying the above-mentioned existing technical solutions to the early diagnosis of MCR, the following limitations still exist: (1) Incomplete modality coverage: Existing solutions mostly rely on single imaging or biochemical modalities and do not systematically integrate multidimensional behavioral features such as gait and eye movement, making it difficult to comprehensively depict the multidimensional clinical manifestations of MCR. (2) Simple feature fusion mechanism: Most of them adopt shallow fusion methods such as feature splicing, which fail to effectively model the deep interaction relationship between features of different modalities and cannot fully explore the complementary information across modalities. (3) Insufficient model expressive ability: Traditional machine learning or single deep networks are difficult to accurately depict the complex structure-function-network relationship between brain regions, which limits their ability to model brain network abnormalities related to MCR. (4) Lack of model interpretability: Existing solutions are mostly "black box" models, lacking effective explanations of the basis for diagnostic decisions, and failing to clarify the mechanism of action of key brain regions and behavioral features, which hinders clinical understanding and application.
[0006] Therefore, there is an urgent need in this field for an MCR intelligent diagnostic solution that can systematically integrate multi-dimensional behavioral and brain imaging features, achieve cross-modal deep interaction and brain network modeling, and has high interpretability, in order to overcome the above-mentioned defects of existing technologies and meet the actual needs of early accurate identification and clinical translation.
Summary of the Invention
[0007] To address the problems of incomplete modality coverage, simplistic feature fusion methods, and poor model interpretability in existing diagnostic methods for motor cognitive risk syndrome (MCR), this invention aims to provide an intelligent diagnostic scheme for MCR that can achieve deep fusion of multimodal behavior and brain imaging features and has high interpretability.
[0008] To achieve the above objectives, the technical solution adopted by the present invention is as follows: A first aspect of the present invention provides a diagnostic method for motor cognitive risk syndrome (MCR), comprising: acquiring at least two different types of behavioral features; fusing the behavioral features using the attention mechanism of Transformer to obtain behavioral fusion features; obtaining brain network data, and constructing a brain network graph based on the brain network data, with brain regions as nodes and the connection relationships between brain regions as edges; modeling the brain network graph using a graph attention network to obtain brain network features; and generating an MCR diagnostic result based on the behavioral fusion features and the brain network features.
[0009] In one possible implementation, the behavioral characteristics include two or more of the following: gait characteristics, eye movement characteristics, language characteristics, writing characteristics, or upper limb movement characteristics.
[0010] In one possible implementation, the fusion using the attention mechanism of Transformer includes: encoding each behavioral feature separately using an intramodal self-attention mechanism, and then performing feature interaction and weighted fusion through a cross-modal cross-attention mechanism.
[0011] In one possible implementation, the cross-modal cross-attention mechanism is based on a multi-head attention mechanism.
[0012] In one possible implementation, the diagnostic method further includes: extracting and fusing features from the brain network data, and using the fused features as the initial features of the nodes; the feature extraction and fusion includes gradient-weighted fusion.
[0013] In one possible implementation, generating an MCR diagnostic result based on the behavioral fusion features and the brain network features includes: concatenating the two or interacting with them through an attention mechanism, and then inputting the result into a classifier to obtain a diagnostic probability.
[0014] In one possible implementation, the classifier is a fully connected neural network classifier.
[0015] In one possible implementation, the diagnostic method further includes an interpretability analysis step: determining key behavioral features based on the attention weights of the Transformer, and / or locating key brain regions based on gradient analysis of the graph attention network.
[0016] A second aspect of the present invention provides a diagnostic system for motor cognitive risk syndrome (MCR), comprising: a behavior feature fusion module, used to fuse at least two different types of behavioral features using the attention mechanism of Transformer to obtain behavioral fusion features; a brain network modeling module, used to construct a brain network graph based on brain network data, with brain regions as nodes and the connections between brain regions as edges, and to model the graph with a graph attention network to obtain brain network features; and a diagnostic output module, used to generate MCR diagnostic results based on the behavioral fusion features and the brain network features.
[0017] A third aspect of the present invention provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the diagnostic method as described in the first aspect of the present invention.
Beneficial effects
[0018] (1) High prediction accuracy: By integrating multidimensional behavioral and brain imaging features and using Transformer and graph attention network to achieve deep interaction modeling, the early recognition accuracy of MCR is significantly improved.
[0019] (2) Strong feature fusion capability: It adopts a cross-modal attention mechanism to realize dynamic and adaptive fusion between behavioral features, which goes beyond simple feature splicing and fully explores the complementarity of multi-source information.
[0020] (3) The model has good interpretability: By analyzing attention weights and network gradients, key behavioral features and abnormal brain regions can be identified, making the diagnostic decision-making process more transparent and enhancing clinical credibility.
[0021] (4) Wide applicability of the solution: The core feature fusion and network modeling methods provided are universal, providing a technical framework for extending to the early assessment of other neurodegenerative diseases.
Attached Figure Description
[0022] Figure 1: This flowchart illustrates the diagnostic method for Motor Cognitive Risk Syndrome (MCR) based on multimodal feature fusion provided in Embodiment 1 of the present invention. The method includes: parallel acquisition and preprocessing of behavioral data and brain imaging data; single-modal feature modeling and cross-modal behavioral feature fusion using a Transformer architecture; gradient-weighted fusion and graph attention network modeling of brain imaging features; and finally, diagnosis and interpretability analysis through a joint model. This flowchart systematically demonstrates the technical solution for achieving deep multimodal feature fusion and interpretable diagnosis by combining Transformer and graph attention networks.
Detailed Implementation
[0023] To enable those skilled in the art to better understand the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings.
[0024] In some of the processes described in the specification, claims, and accompanying drawings of this invention, multiple operations appearing in a specific order are included. However, it should be clearly understood that these operations may not be performed in the order they appear herein, or may be performed in parallel. The operation numbers, such as S101, S102, etc., are merely used to distinguish different operations and do not themselves represent any execution order. Furthermore, these processes may include more or fewer operations, and these operations may be performed sequentially or in parallel.
[0025] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0026] Example 1: A diagnostic method for MCR based on the fusion of multimodal behavioral features and brain imaging features
This embodiment provides a complete MCR diagnostic method, the core process of which is shown in Figure 1. This method integrates multidimensional behavioral features and multimodal brain imaging features to construct a fusion model based on Transformer and Graph Attention Network (GAT), achieving early and accurate identification and interpretable analysis of MCR. The specific steps are as follows:
S101. Multimodal Data Acquisition and Standardized Preprocessing
This step aims to acquire and standardize raw, multi-source data from participants, laying the foundation for subsequent analysis.
[0027] 1. Behavioral data collection and processing
Gait data: Under standardized experimental conditions, high-frame-rate optical motion capture systems or depth cameras placed on both sides of the walking path were used to collect continuous motion sequences of subjects during natural walking. Gait-related features, such as gait speed, stride length, gait cycle, and joint angle changes, were extracted from the continuous image or video sequences using the optical flow method. All gait parameters were Z-score normalized to form the gait feature vector F_g.
[0028] Eye-tracking data: Infrared video eye trackers were used to record the eye movements of subjects while performing visual search tasks (such as identifying a specific object in a complex scene) or in a natural fixation state. Eye-tracking parameters, including fixation point position, saccade amplitude, saccade velocity, and fixation duration, were extracted using the pupil-corneal reflection method. These parameters were Z-score standardized to form the eye-tracking feature vector F_e.
[0029] Language data: In an acoustically controlled laboratory environment, spontaneous speech by subjects was recorded using directional microphones in response to standardized prompts (e.g., "Please describe your activities yesterday in detail"). Fourier transform and related speech signal processing methods were used to extract the following acoustic and linguistic features: Mel-frequency cepstral coefficients (MFCC), fundamental frequency, formant frequencies, speech energy, and rhythmic features. The extracted language features were then Z-score standardized to form the language feature vector F_s.
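The Z-score standardization applied to the gait, eye-tracking, and language features can be sketched as follows. This is a minimal illustration only: it assumes standardization is done per feature across subjects (the source does not say which axis is used), and the function name `zscore_columns` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Z-score each feature (column) across subjects (rows)."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    # A constant column has zero spread; divide by 1 so it maps to 0.0.
    sigmas = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]


# Hypothetical gait measurements per subject:
# [gait speed (m/s), stride length (m), gait cycle (s)]
gait_raw = [
    [1.20, 1.30, 1.05],
    [0.80, 1.00, 1.25],
    [1.00, 1.15, 1.15],
]
F_g = zscore_columns(gait_raw)  # one standardized gait vector per subject
```

After this step every feature has zero mean and unit variance across the cohort, so features measured in different units contribute on a comparable scale.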
[0030] 2. Multimodal brain imaging data acquisition and processing
High-resolution T1-weighted structural imaging (sMRI), resting-state functional magnetic resonance imaging (rs-fMRI), and diffusion tensor imaging (DTI) data were acquired for each subject using a 3.0T magnetic resonance imaging system.
[0031] sMRI processing: Standard software workflows such as FreeSurfer or SPM were used for skull stripping, tissue segmentation, and spatial normalization to the MNI standard space. Gray matter volume and cortical thickness values of multiple brain regions related to cognitive and motor functions, such as the hippocampus, amygdala, entorhinal cortex, and middle temporal gyrus, were automatically extracted to construct the brain structural feature vector F_mri.
[0032] rs-fMRI processing: Preprocessing includes slice-timing correction, head motion correction, spatial normalization, spatial smoothing, linear detrending, and band-pass filtering (typically 0.01-0.08 Hz). Subsequently, time series data for each brain region are extracted based on standard brain atlases (such as AAL-90 or Schaefer-200), and Pearson correlation coefficients between all pairs of brain regions are calculated to construct the whole-brain functional connectivity matrix M_fc.
[0033] DTI processing: Head motion and eddy current corrections were performed using tools such as FSL or MRtrix3, fractional anisotropy (FA) and mean diffusivity (MD) scalar maps were estimated, and the maps were spatially normalized. Deterministic fiber tracking algorithms (such as the FACT algorithm) were used to reconstruct white matter fiber connections between brain regions, generating the structural connectivity matrix M_sc.
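The construction of the functional connectivity matrix M_fc from regional time series, described for rs-fMRI above, can be sketched as pairwise Pearson correlation. The function names and the toy time series are hypothetical; a real pipeline would work on atlas-parcellated signals with hundreds of time points.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def functional_connectivity(region_series):
    """Symmetric region-by-region correlation matrix with a unit diagonal."""
    n = len(region_series)
    return [
        [1.0 if i == j else pearson(region_series[i], region_series[j])
         for j in range(n)]
        for i in range(n)
    ]


# Toy signals for three brain regions (rows) over five time points.
series = [
    [0.1, 0.4, 0.3, 0.8, 0.6],
    [0.2, 0.8, 0.6, 1.6, 1.2],  # scaled copy of region 0, so r is close to 1
    [0.9, 0.5, 0.7, 0.1, 0.4],  # moves against region 0, so r is negative
]
M_fc = functional_connectivity(series)
```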
[0034] S102. Single-modal feature modeling and construction of labeled feature sets
This step aims to deeply refine and standardize the raw features of each modality, constructing a labeled feature set for subsequent fusion processes. Specifically, the preprocessed gait features F_g, eye movement features F_e, language features F_s, and brain structural features F_mri are each fed into the feature annotation module, which constructs the corresponding labeled feature set for each modality.
[0035] 1. Feature Serialization and Embedding: Construct an independent feature sequence for each modality feature. To do this, each feature vector is reshaped into a sequence and learnable modality type embeddings and positional encodings are added to clearly distinguish different modalities and the order of internal feature elements.
[0036] 2. Intramodal Self-Attention Modeling (Weighted Aggregation of Key Features): The aforementioned feature sequences are input into independent Transformer encoder modules. Within each encoder, the core operation is a multi-head self-attention mechanism. In this mechanism, the query (Q), key (K), and value (V) all originate from the feature sequence of the same modality itself. This process generates an attention weight matrix by calculating the correlation between different feature dimensions within the modality, thereby achieving adaptive weighting of the features.
[0037] The technical effect of this step is to highlight the key features that contribute significantly to MCR discrimination in a single modality, while suppressing redundant and noise information, thereby achieving the purification and enhancement of features within the modality.
[0038] 3. Constructing the Annotated Feature Set: After processing by the intramodal self-attention layer and feedforward neural network, a refined, more discriminative deep feature representation is output. These feature representations constitute the labeled feature set for the next stage of cross-modal fusion, denoted as: the MCR gait feature representation H_g, the MCR eye movement feature representation H_e, the MCR language feature representation H_s, and the MCR brain structural feature representation H_mri.
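The three sub-steps of S102 can be illustrated with a minimal single-head sketch in plain Python. It is not the patented implementation: the projection weights are random stand-ins for learned parameters, the way each scalar feature is turned into a token is an assumption, and multi-head splitting, residual connections, layer normalization, and the feedforward network are omitted for brevity. All function names are hypothetical.

```python
import math
import random


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rand_matrix(rows, cols, rng):
    """Random stand-in for a learned parameter matrix."""
    return [[rng.gauss(0, 0.5) for _ in range(cols)] for _ in range(rows)]


def embed(features, value_proj, modality_emb, pos_emb):
    """One token per scalar feature: value * projection + modality + position."""
    return [
        [x * w + m + p for w, m, p in zip(value_proj, modality_emb, pos)]
        for x, pos in zip(features, pos_emb)
    ]


def self_attention(tokens, w_q, w_k, w_v):
    """Scaled dot-product self-attention: Q, K and V all come from `tokens`."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    d = len(q[0])
    scores = matmul(q, list(zip(*k)))  # Q K^T
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
d = 4
F_g = [-1.2, 0.3, 0.9]  # hypothetical Z-scored gait features
tokens = embed(F_g, rand_matrix(1, d, rng)[0], rand_matrix(1, d, rng)[0],
               rand_matrix(len(F_g), d, rng))
H_g, attn = self_attention(tokens, rand_matrix(d, d, rng),
                           rand_matrix(d, d, rng), rand_matrix(d, d, rng))
```

Each row of `attn` sums to 1 and records how strongly one gait feature attends to the others; these are the weights that the later interpretability step inspects.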
[0039] S103. Cross-modal feature fusion based on Transformer
This step is one of the core innovations of this invention, aiming to achieve deep interaction and dynamic fusion between features of different behavioral modalities through the cross-modal attention mechanism of the Transformer architecture.
[0040] 1. Input and Objective: Obtain the labeled feature representations H_g, H_e, and H_s of each single behavioral modality. This invention then introduces a cross-modal attention mechanism to achieve deep information fusion among behavioral features such as gait, eye movement, and language.
[0041] 2. Cross-modal attention computation: In the cross-modal attention process, information interaction is achieved by dynamically combining queries (Q), keys (K), and values (V) from different modalities. Typical fusion methods include: using the gait features H_g as the query (Q) and the eye movement features H_e as the key (K) and value (V) to compute attention; or using the language features H_s as the query (Q) and a combination of gait and eye movement features as the key (K) and value (V).
[0042] The core of this mechanism is to allow one modality (as a query) to actively retrieve and aggregate relevant information from one or more other modalities (as keys and values).
[0043] 3. Dynamic Weighting and Feature Generation: Through the cross-modal attention calculation described above, the system automatically learns the correlation between features of different modalities and dynamically adjusts the contribution weights of each modal feature according to the current diagnostic task. Finally, the outputs of different attention heads are concatenated and integrated through a feedforward neural network to generate a unified, deeply fused multi-behavioral feature representation H_behavior.
[0044] Compared with simple feature splicing, the technical advantages of this fusion method are: (1) It can automatically learn the correlation between modalities and explore the potential association between different behavioral features (such as specific gait patterns and specific eye movement patterns); (2) It can perform dynamic weight adjustment, and the model can adaptively focus on more relevant modalities and information according to the characteristics of the input samples; (3) It can effectively capture early anomalies and capture subtle abnormal patterns that may only appear in multimodal association in the early stage of MCR through deep interaction.
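For reference, the gait-as-query, eye-movement-as-key/value case described in S103 can be written in the standard scaled dot-product form. This assumes the invention uses the usual Transformer multi-head formulation; the projection matrices W_i^Q, W_i^K, W_i^V, W^O, the key dimension d_k, and the number of heads h are notation introduced here and are not named in the source.

```latex
\[
\mathrm{head}_i
  = \mathrm{softmax}\!\left(
      \frac{(H_g W_i^{Q})\,(H_e W_i^{K})^{\top}}{\sqrt{d_k}}
    \right) H_e W_i^{V},
\qquad
\mathrm{MultiHead}(H_g, H_e)
  = \bigl[\,\mathrm{head}_1 \,\Vert\, \cdots \,\Vert\, \mathrm{head}_h\,\bigr]\, W^{O}
\]
```

Each row of the softmax term is a probability distribution over eye movement tokens for one gait token, which is what makes the learned cross-modal weights inspectable. Swapping in H_s as the query and the concatenation of H_g and H_e as the key and value gives the second fusion variant mentioned in the text.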
[0045] S104. Brain imaging feature fusion and abnormal region identification
This step involves interpretable fusion of multimodal brain imaging features and preliminary identification of key brain regions.
[0046] 1. Feature Input and Encoding: Input the brain structural feature vector F_mri, the functional connectivity matrix M_fc, and the structural connectivity matrix M_sc.
[0047] 2. Gradient-based weighted fusion: A gradient weighting mechanism (such as the integrated gradients method or the Grad-CAM principle) is introduced to calculate the gradient of the diagnostic target relative to the brain imaging features during feature fusion, and the features are weighted accordingly. This process fuses sMRI morphological features, fMRI functional connectivity, and DTI structural connectivity features to generate fused brain imaging features with preliminary spatial importance weights, implicitly representing the contribution of each brain region to the diagnosis of MCR.
[0048] 3. Output: The output is the weighted and fused brain image features, which are used for subsequent brain network modeling and provide a basis for the final localization of abnormal brain regions.
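One way to read the gradient-based weighting in S104 is shown below, using integrated gradients on a stand-in scorer. Everything specific here is an assumption made for illustration: the logistic model replaces the real (unspecified) network, the zero baseline and the normalization by absolute attribution are choices of this sketch, and the feature values and weights are invented.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, w, b):
    """Stand-in diagnostic score: a logistic model over imaging features."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def grad_score(x, w, b):
    """Analytic gradient of the logistic score with respect to each input."""
    p = score(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation along the path baseline -> x."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        for i, g in enumerate(grad_score(point, w, b)):
            total[i] += g / steps
    return [(xi - b0) * t for xi, b0, t in zip(x, baseline, total)]


def gradient_weighted(x, attributions):
    """Scale each feature by its normalized absolute attribution."""
    norm = sum(abs(a) for a in attributions) or 1.0
    return [xi * abs(a) / norm for xi, a in zip(x, attributions)]


x = [0.8, -1.5, 0.2]            # hypothetical features for three brain regions
w, b = [0.5, -2.0, 0.1], 0.0    # hypothetical scorer parameters
baseline = [0.0, 0.0, 0.0]
ig = integrated_gradients(x, baseline, w, b)
X_weighted = gradient_weighted(x, ig)
```

A useful sanity check on any integrated-gradients implementation is completeness: the attributions should sum to the difference between the score at the input and the score at the baseline.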
[0049] S105. Construction of a joint diagnostic model based on Transformer and GAT
This step involves constructing a unified diagnostic model to achieve joint modeling of behavioral and brain network features.
[0050] 1. Brain network graph construction and initialization: Using brain regions as nodes and M_fc or M_sc to define edges, construct a brain network graph G. Use the weighted fused brain imaging features output by S104 as the initial features X of each node.
[0051] 2. Graph Attention Network Modeling: Input the graph G and node features X into a graph attention network (GAT). The GAT uses attention coefficients α_ij to dynamically aggregate information from neighboring nodes; after multi-layer propagation, graph pooling yields the global brain network embedding vector H_brain.
[0052] 3. Multimodal feature combination and classification: The behavioral fusion features H_behavior obtained from S103 and the brain network features H_brain obtained in this step are combined by concatenation or cross-attention interaction, achieving a bidirectional mapping between behavior and brain network features. The combined features are then input into a fully connected classifier, and the MCR diagnostic probability P_MCR is output through a softmax function, completing the model construction.
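The three sub-steps of S105 can be sketched end to end on a four-region toy graph. This is a simplified single-head, single-layer illustration with random stand-in weights: the attention coefficient follows the commonly used GAT form α_ij = softmax_j(LeakyReLU(a · [W h_i ‖ W h_j])), which the source does not spell out, and mean pooling, the self-loops, the output class order, and all names and values are assumptions of the sketch.

```python
import math
import random


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def linear(x, weight):
    """Vector (len n_in) times matrix (n_in x n_out)."""
    return [sum(xi * wi for xi, wi in zip(x, col)) for col in zip(*weight)]


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0, 0.5) for _ in range(cols)] for _ in range(rows)]


def gat_layer(X, adj, W, a):
    """Single-head GAT layer; returns node embeddings and alpha_ij per node."""
    H = [linear(x, W) for x in X]
    out, alphas = [], []
    for i, h_i in enumerate(H):
        nbrs = [j for j in range(len(H)) if adj[i][j] or j == i]  # add self-loop
        e = [leaky_relu(sum(ak * v for ak, v in zip(a, h_i + H[j]))) for j in nbrs]
        alpha = softmax(e)
        out.append([sum(al * H[j][k] for al, j in zip(alpha, nbrs))
                    for k in range(len(h_i))])
        alphas.append(dict(zip(nbrs, alpha)))
    return out, alphas


def mean_pool(nodes):
    return [sum(col) / len(nodes) for col in zip(*nodes)]


rng = random.Random(0)
X = rand_matrix(4, 3, rng)             # 4 regions x 3 fused imaging features
adj = [[0, 1, 0, 0], [1, 0, 1, 0],     # edges, e.g. from a thresholded M_fc
       [0, 1, 0, 1], [0, 0, 1, 0]]
nodes, alphas = gat_layer(X, adj, rand_matrix(3, 2, rng), rand_matrix(1, 4, rng)[0])
H_brain = mean_pool(nodes)             # graph-level embedding
H_behavior = [0.4, -0.1]               # hypothetical output of S103
fused = H_behavior + H_brain           # concatenation
probs = softmax(linear(fused, rand_matrix(4, 2, rng)))
P_MCR = probs[1]                       # assumed class order: [non-MCR, MCR]
```

With untrained weights `P_MCR` is meaningless as a diagnosis; the sketch only shows how data flows from node features through attention aggregation and pooling to a class probability.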
[0053] S106. Model Interpretability Analysis
1. Behavioral Feature Tracing: Analyze the attention weights in S102 / S103 to identify key behavioral features. For a specific modality, feature dimensions with high attention weights (such as "gait speed" in gait and "MFCC" in language) are identified as the most important key behavioral markers for MCR discrimination.
[0054] 2. Abnormal Brain Region Localization: Gradient-Weighted Class Activation Mapping (Grad-CAM) was applied to the GAT model of S105. By calculating the gradient of the final diagnostic category score relative to the features of the last layer of GAT nodes, an "importance score" for each brain region node was generated and mapped back to a standard brain template to form a brain region activation heatmap. The darker a region in the heatmap (such as the posterior cingulate cortex of the default mode network, or the sensorimotor cortex), the greater its contribution to the current MCR diagnosis, suggesting that there may be functional or structural abnormalities in these regions.
[0055] 3. Cross-modal association visualization: Visualize the attention weights of the cross-modal attention layers in S103 and S105 to reveal the association patterns between behavior and brain features. For example, a heatmap can be generated to show which eye movement features the model primarily focuses on when "slow gait" is used as a strong query. This provides an intuitive cross-modal association map for understanding the underlying mechanisms of the "motor-cognition" association.
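The behavioral feature tracing in S106 amounts to ranking features by attention weight. A minimal sketch follows, assuming importance is taken as the average attention a feature receives (the column mean of the attention matrix); the source does not specify the aggregation rule, and the matrix values and the function name are invented for illustration.

```python
def rank_features(attention, names, top_k=2):
    """Rank features by the average attention they receive (column mean)."""
    received = [sum(col) / len(attention) for col in zip(*attention)]
    ranked = sorted(zip(names, received), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


gait_names = ["gait speed", "stride length", "gait cycle"]
# Hypothetical intramodal attention matrix: row i = weights used by feature i.
attn = [
    [0.6, 0.3, 0.1],
    [0.5, 0.3, 0.2],
    [0.7, 0.2, 0.1],
]
top = rank_features(attn, gait_names)  # highest-ranked (name, weight) pairs
```

The same function applied to a cross-modal attention matrix (gait queries as rows, eye movement keys as columns) would give the per-feature values plotted in the association heatmap.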
[0056] Example 2: A diagnostic system for MCR based on the fusion of multimodal behavioral features and brain imaging features
This embodiment provides an MCR diagnostic system that implements the method described in Embodiment 1. This system can be deployed on a server, workstation, or cloud platform using a combination of hardware and software, for automated and intelligent assisted diagnosis of motor cognitive risk syndrome.
[0057] The system includes the following modules:
1. Multi-source data acquisition and access module: Used to receive or import raw data from external devices or databases. It includes: a gait data acquisition interface, used to connect to high-frame-rate cameras or motion capture systems to acquire gait video data; an eye-tracking data acquisition interface, used to connect to an eye tracker to acquire eye-tracking trajectory data; a language data acquisition interface, used to connect a microphone and audio acquisition equipment to acquire voice signals; and a medical imaging data interface, used to connect to the Picture Archiving and Communication System (PACS) to obtain raw DICOM data of brain images such as sMRI, fMRI, and DTI.
[0058] 2. Data Processing and Feature Extraction Module: Configured to automatically preprocess and extract features from the raw data input by the multi-source data acquisition and access module. This module is programmed to perform all the functions of step S101 in Embodiment 1, including Z-score normalization of gait, eye movement, and language data, as well as normalized preprocessing of multimodal brain imaging data, brain region segmentation, feature extraction (such as brain region volume and cortical thickness), calculation of the functional connectivity matrix and structural connectivity matrix, and output of the normalized feature vectors F_g, F_e, F_s, F_mri and the connectivity matrices M_fc, M_sc.
[0059] 3. Multimodal Deep Learning Fusion Modeling Module: This is the core analysis unit of the system, including a processor and memory. The memory stores a computer program, which, when executed by the processor, implements the following functional sub-modules:
Single-modal feature modeling submodule: Used to execute step S102 in Example 1. It performs intra-modal self-attention modeling on the input gait, eye movement, language, and brain structure features through multiple built-in Transformer encoders, and outputs the corresponding MCR gait feature representation H_g, MCR eye movement feature representation H_e, MCR language feature representation H_s, and MCR brain structural feature representation H_mri.
[0060] Cross-modal feature fusion submodule: Used to execute step S103 in Example 1. Based on the cross-modal cross-attention mechanism of Transformer, it performs dynamic interaction and fusion of the behavioral feature representations H_g, H_e, and H_s to generate a unified multi-behavior fusion feature representation H_behavior.
[0061] Brain Image Fusion and Network Modeling Submodule: This submodule performs the brain image processing steps S104 and S105 in Example 1. It first applies the integrated gradients method or the Grad-CAM principle to weight and fuse the F_mri, M_fc, and M_sc features; it then initializes the brain network nodes with the fused features and models the constructed brain network graph with a graph attention network (GAT), finally outputting the brain network embedding vector H_brain.
[0062] Joint classification and diagnosis submodule: Used to execute the classification part of step S105 in Example 1. It receives the multi-behavior fusion feature representation H_behavior and the brain network embedding vector H_brain; after concatenation or cross-attention interaction, the result is input into a classifier (such as a fully connected neural network with a softmax layer) to calculate and output the MCR diagnostic probability P_MCR.
[0063] 4. Interpretability Analysis Module: Configured to execute step S106 in Example 1. This module analyzes the internal processes of the multimodal deep learning fusion modeling module, including: extracting and analyzing the attention weights in the unimodal and cross-modal attention sub-modules to identify key behavioral features; applying gradient-weighted class activation mapping (Grad-CAM) to the GAT model in the brain image fusion and network modeling submodule to generate heat maps that identify key abnormal brain regions; and visualizing cross-modal attention weights to generate a map showing the association between behavior and brain features.
[0064] 5. Human-Computer Interaction and Diagnostic Report Generation Module: This module provides a graphical user interface (GUI) for users to upload data, configure parameters, and initiate analysis. It receives the diagnostic probabilities output by the joint classification diagnosis submodule and the analysis results (key feature list, brain region heatmap, association map) generated by the interpretability analysis module, automatically integrating them to generate a structured MCR intelligent diagnostic report. The report includes at least: subject identification, MCR risk assessment results (positive / negative and confidence level), key abnormal behavioral feature hints, key abnormal brain region localization diagrams, and a clinical interpretation summary.
[0065] 6. Data storage and management module: Used to securely store the anonymized feature data of subjects, intermediate model results, final diagnostic reports, and system operation logs, and supports data query, export, and management functions.
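The structured report assembled by the human-computer interaction module could be represented by a small data class like the one below. The field names, the 0.5 decision threshold, and the definition of confidence as the probability of the predicted class are all assumptions made for illustration; the source lists only the items the report must contain.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class MCRReport:
    """Hypothetical container for the report items listed in module 5."""
    subject_id: str
    mcr_probability: float                    # P_MCR from the classifier
    key_behavioral_features: list = field(default_factory=list)
    key_brain_regions: list = field(default_factory=list)
    heatmap_path: str = ""                    # brain region localization diagram
    clinical_summary: str = ""
    threshold: float = 0.5                    # assumed decision threshold

    @property
    def result(self):
        return "positive" if self.mcr_probability >= self.threshold else "negative"

    @property
    def confidence(self):
        # Probability assigned to whichever class was predicted.
        return max(self.mcr_probability, 1.0 - self.mcr_probability)


report = MCRReport(
    subject_id="S-001",
    mcr_probability=0.82,
    key_behavioral_features=["gait speed", "fixation duration"],
    key_brain_regions=["posterior cingulate cortex"],
)
record = asdict(report)  # plain dict, ready for storage by module 6
```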
[0066] Example 3: A computer-readable storage medium. This embodiment provides a computer-readable storage medium, such as a USB flash drive, a portable hard drive, a read-only memory (ROM), a random access memory (RAM), a solid-state drive (SSD), an optical disc, or a cloud storage server, on which a computer program (instructions) is stored.
[0067] When the computer program is loaded and executed by one or more processors (e.g., the central processing unit of a server, workstation, or personal computer), it enables a computing device or system containing the processor to implement all or part of the steps of the “MCR diagnostic method based on the fusion of multimodal behavioral features and brain imaging features” as described in Example 1.
[0068] The computer program contains instructions specifically used to control the computing device to perform the following operations: receive or read multimodal raw data; perform the data preprocessing and feature extraction operations described in S101 of Example 1; execute the algorithm flow of multimodal feature modeling, fusion, network construction, and classification diagnosis described in S102, S103, S104, and S105 of Example 1; perform the interpretability analysis described in S106 of Example 1; and output and save the diagnostic results and related analysis charts.
[0069] Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.
Claims
1. A diagnostic method for motor cognitive risk syndrome (MCR), characterized in that it includes: acquiring at least two different types of behavioral features; fusing the behavioral features using the attention mechanism of Transformer to obtain behavioral fusion features; obtaining brain network data, and constructing a brain network graph based on the brain network data, with brain regions as nodes and the connection relationships between brain regions as edges; modeling the brain network graph using a graph attention network to obtain brain network features; and generating an MCR diagnostic result based on the behavioral fusion features and the brain network features.
2. The diagnostic method according to claim 1, characterized in that the behavioral features include two or more of the following: gait features, eye movement features, language features, writing features, or upper limb movement features.
3. The diagnostic method according to claim 1, characterized in that the fusion using the attention mechanism of Transformer includes: encoding each behavioral feature separately using an intramodal self-attention mechanism, and then performing feature interaction and weighted fusion through a cross-modal cross-attention mechanism.
4. The diagnostic method according to claim 3, characterized in that the cross-modal cross-attention mechanism is based on a multi-head attention mechanism.
5. The diagnostic method according to claim 1, characterized in that it further includes: performing feature extraction and fusion on the brain network data, and using the fused features as the initial features of the nodes; wherein the feature extraction and fusion include gradient-weighted fusion.
6. The diagnostic method according to claim 1, characterized in that generating the MCR diagnostic result based on the behavioral fusion features and the brain network features includes: concatenating the two, or having them interact through an attention mechanism, and then inputting the result into a classifier to obtain a diagnostic probability.
7. The diagnostic method according to claim 6, characterized in that the classifier is a fully connected neural network classifier.
8. The diagnostic method according to claim 1, characterized in that it further includes an interpretability analysis step: determining key behavioral features based on the attention weights of the Transformer, and / or locating key brain regions based on gradient analysis of the graph attention network.
9. A diagnostic system for motor cognitive risk syndrome (MCR), characterized in that it includes: a behavior feature fusion module, used to fuse at least two different types of behavioral features using the attention mechanism of Transformer to obtain behavioral fusion features; a brain network modeling module, used to construct a brain network graph based on brain network data, with brain regions as nodes and the connections between brain regions as edges, and to model the graph using a graph attention network to obtain brain network features; and a diagnostic output module, used to generate an MCR diagnostic result based on the behavioral fusion features and the brain network features.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the diagnostic method as described in any one of claims 1-8.
Citation Information
Patent Citations
Alzheimer's disease classification method based on depth neural network and multi-mode images
CN107506797A