Alzheimer's disease prediction method and system based on a continuous learning neural network model

CN122201753A · Pending · Publication Date: 2026-06-12 · GUANGDONG POLYTECHNIC NORMAL UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
GUANGDONG POLYTECHNIC NORMAL UNIV
Filing Date
2026-02-13
Publication Date
2026-06-12

Abstract

The application provides an Alzheimer's disease prediction method and system based on a continuous learning neural network model. The method obtains multi-omics data of a subject; performs feature encoding on the multi-omics data to obtain hidden layer representations; fuses the hidden layer representations by weighting through a cross-modal multi-scale attention learning module; introduces the multi-router sparse gating expert network module MR-SMoE to perform adaptive computation and task-aware enhancement on the multi-attention fusion features, obtaining a task-aware enhanced feature representation; and feeds that representation to the continuous learning neural network model to obtain an Alzheimer's disease prediction result. The application guides feature learning through multi-omics association prior knowledge and combines cross-modal multi-scale attention fusion with MR-SMoE task-aware enhancement to improve the prediction accuracy of Alzheimer's disease; it adopts a progressive neural network (PNN) architecture with a task-column parameter freezing mechanism to prevent catastrophic forgetting.

Description

Technical Field

[0001] This invention relates to the interdisciplinary field of artificial intelligence and biomedicine, and in particular to a method and system for predicting Alzheimer's disease based on a continuous learning neural network model.

Background Technology

[0002] Alzheimer's disease (AD) is a serious neurodegenerative disease that primarily affects people over 60 years of age. It is characterized by progressive memory impairment and behavioral disturbances, significantly reducing patients' quality of life and imposing a heavy care burden. Because effective treatments are lacking, early detection and intervention are crucial. In recent years, advances in artificial intelligence have promoted the application of multi-omics data to the intelligent diagnosis of AD. However, current methods still face three main limitations. First, most existing methods rely on learned network components (such as self-attention mechanisms) to capture associations between multi-omics data. These methods tend to fall short in both predictive performance and interpretability, especially with small sample sizes. Since clinical cohorts for Alzheimer's disease are typically limited in size, such purely data-driven methods struggle to learn reliable multi-omics association patterns, resulting in unstable performance on independent validation sets and an inability to provide biologically meaningful explanations.

[0003] Second, integrating prior knowledge into learning networks can enhance feature learning, but current methods neglect population distributions and individual differences. For example, brain structural biomarkers (such as hippocampal volume) are affected not only by disease; they also follow a natural age-related trajectory and are significantly modulated by genetic factors (such as ApoE) at the population level. Ignoring these statistical regularities and individual variations leads the model to confuse pathological changes with normal physiological differences, thereby reducing diagnostic specificity.

[0004] Third, the correlation between different tasks is often overlooked. In real clinical scenarios, AD diagnosis involves multiple related tasks (such as distinguishing normal controls, mild cognitive impairment, and Alzheimer's disease, or predicting disease progression), but existing models usually handle each task independently and fail to exploit the knowledge shared between tasks, which limits the model's knowledge transfer ability and its ability to adapt continuously to new cohorts.

Summary of the Invention

[0005] To address the shortcomings of existing technologies, this invention provides an Alzheimer's disease prediction method and system based on a continuous learning neural network model. The invention aims to predict Alzheimer's disease with high accuracy and interpretability while retaining the ability to evolve continuously as new cohorts and tasks are added.

[0006] In a first aspect, the present invention provides an Alzheimer's disease prediction method based on a continuous learning neural network model, comprising the following steps:
S1) Obtain multi-omics data from the subject;
S2) Perform feature encoding on the multi-omics data to obtain hidden layer representations;
S3) Fuse the hidden layer representations by weighting through a cross-modal multi-scale attention learning module to obtain multi-attention fusion features;
S4) Introduce the multi-router sparse gating expert network module MR-SMoE to perform adaptive computation and task-aware enhancement on the multi-attention fusion features, obtaining the task-aware enhanced feature representation;
S5) Feed the task-aware enhanced feature representation to the continuous learning neural network model to obtain the prediction result for Alzheimer's disease.

[0007] Preferably, in step S1), the multi-omics data of the subject includes structural magnetic resonance imaging data, genomic data, and phenotypic data.

[0008] Preferably, in step S2), the structural magnetic resonance imaging data, genomic data, and phenotypic data are respectively input into an independent multilayer perceptron (MLP) for feature encoding to obtain the corresponding hidden layer representation.

[0009] Preferably, in step S3), the hidden layer representations are fused through a cross-modal multi-scale attention learning module to model the dependencies between genomics, phenotype, and structural magnetic resonance imaging; specifically as follows: Convolution operations with three different kernel sizes extract multi-scale patterns from the hidden layer representation of modality $m$ to generate the query vectors $Q_i^{m}$ ($i = 1, 2, 3$), and the key vectors $K$ and value vectors $V$ are extracted from the hidden layer representations of the other two modalities; Cross-modal dependencies are computed using scaled dot-product attention, i.e.: $F = \mathrm{MSA}(Q^{m}, K, V)$; $\mathrm{MSA}(Q^{m}, K, V) = \mathrm{MLP}\left(\mathrm{Concat}(A_1, A_2, A_3)\right)$; $A_i = \mathrm{softmax}\left(\frac{Q_i^{m} K^{\top}}{\sqrt{d_k}}\right) V$; In the formula, $F$ represents the joint feature representation after multi-scale attention fusion; $\mathrm{MSA}(\cdot)$ represents the multi-scale attention function; $A_i$ is the attention output at the $i$-th scale; $\mathrm{MLP}(\cdot)$ is a multilayer perceptron; $\mathrm{Concat}(\cdot)$ indicates concatenation; $Q_i^{m}$ is the multi-scale query vector extracted from modality $m$ by the convolution operations; $K$ and $V$ are the key and value matrices from the other two modalities, respectively; $d_k$ is the dimension of the key vector; $\mathrm{softmax}(\cdot)$ is the activation function that ensures the attention weights are normalized; The joint feature representation $F$ is passed through a multilayer perceptron including fully connected layers to generate the final multi-attention fusion feature $Z$, i.e.: $Z = \sigma\left(\mathrm{FC}(F)\right)$; In the formula, $\mathrm{FC}(\cdot)$ denotes the fully connected layers and $\sigma(\cdot)$ is the activation function.

[0010] Preferably, in step S4), the multi-router sparse gating expert network module MR-SMoE includes expert networks and routers, and dynamically selects and combines the most relevant expert networks based on the input final multi-attention fusion feature $Z$; Each expert network is implemented as a neural network specialized for a particular input pattern, namely: $E_j(Z) = W_j^{\mathrm{down}}\,\sigma\left(W_j^{\mathrm{up}} Z + b_j^{\mathrm{up}}\right) + b_j^{\mathrm{down}},\ j = 1, \dots, N$; In the formula, $E_j(Z)$ is the output of the $j$-th expert network; $\sigma(\cdot)$ is the activation function; $W_j^{\mathrm{up}}$ and $W_j^{\mathrm{down}}$ are the weight matrices of the up-projection and down-projection, respectively; $b_j^{\mathrm{up}}$ and $b_j^{\mathrm{down}}$ are the corresponding bias terms; $N$ represents the total number of expert networks.

[0011] The router dynamically generates a set of routing weights from the input final multi-attention fusion feature $Z$ to determine the contribution of each expert network to the final output, specifically: A linear transformation is applied to $Z$ to obtain the gating score of each expert network, i.e.: $g = W_r Z + b_r$; In the formula, the $j$-th component $g_j$ of $g$ is the gating score of the $j$-th expert network; $W_r$ and $b_r$ are the routing weight matrix and the bias vector, respectively; The expert networks corresponding to the top $k$ gating scores are selected, i.e.: $\mathcal{T} = \mathrm{TopK}(g, k)$; The routing weights of the selected expert networks are then normalized to obtain the final routing weights $w_j$, i.e.: $w_j = \frac{\exp(g_j)}{\sum_{l \in \mathcal{T}} \exp(g_l)},\ j \in \mathcal{T}$;

[0012] In the formula, $\mathcal{T}$ represents the set of expert network indices selected by the Top-K operation, corresponding to the $k$ largest gating scores; Finally, the multi-router sparse gating expert network module MR-SMoE weights the output $E_j(Z)$ of each selected expert network by its final routing weight $w_j$ to obtain the task-aware enhanced feature representation, namely: $\tilde{Z} = \sum_{j \in \mathcal{T}} w_j\, E_j(Z)$; In the formula, $\tilde{Z}$ is the task-aware enhanced feature representation.

[0013] Preferably, in step S5), the continuous learning neural network model adopts a progressive neural network (PNN) architecture. The continuous learning neural network model is trained using a prior knowledge graph. During the training process, a task adaptive regularization mechanism and a dynamic memory replay strategy are used to prevent catastrophic forgetting and gradually integrate new clinical cohort data, thereby achieving continuous learning and knowledge transfer for the current task.

[0014] The continuous learning neural network model assigns an independent column to each new task. The new column has its own independent set of parameters, and lateral connections are established between columns, which effectively prevents catastrophic forgetting and promotes knowledge sharing across tasks. When training the $t$-th task, all parameters in the columns of previous tasks remain fixed, and the gradients with respect to the parameters of every earlier task column $j < t$ are set to zero to prevent catastrophic forgetting.

[0015] In a second aspect, the present invention provides an Alzheimer's disease prediction system based on a continuous learning neural network model, comprising: a data acquisition module, used to acquire multi-omics data from the subject; a feature encoding module, which performs nonlinear mapping on the multi-omics data through independent multilayer perceptrons to generate the corresponding hidden layer representations; a multimodal fusion module, which uses a cross-modal multi-scale attention mechanism to perform weighted fusion of the encoded hidden layer representations and generate multi-attention fusion features; a task-aware enhancement module, which uses the multi-router sparse gating expert network module MR-SMoE to adaptively compute and enhance the multi-attention fusion features, yielding a task-aware enhanced feature representation; and a prediction module, which calls a pre-trained continuous learning neural network model to perform incremental learning on the task-aware enhanced feature representation and output prediction results including Alzheimer's disease status classification, disease progression risk score, and subtype classification.

[0016] In a third aspect, the present invention provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the Alzheimer's disease prediction method.

[0017] The beneficial effects of this invention are as follows: 1. This invention guides feature learning through multi-omics association prior knowledge and combines cross-modal multi-scale attention fusion with MR-SMoE task-aware enhancement, thereby further improving the prediction accuracy of Alzheimer's disease; 2. This invention adopts a PNN architecture and a task-column parameter freezing mechanism to effectively prevent catastrophic forgetting. It can integrate new clinical cohorts and new diagnostic tasks without accessing historical data, thereby improving the clinical adaptability of the model and significantly enhancing its predictive ability for the early pathological stages of Alzheimer's disease.

Attached Figure Description

[0018] Figure 1 is a flowchart of the method of Embodiment 1 of the present invention; Figure 2 is a schematic diagram of the framework of the method of Embodiment 1 of the present invention; Figure 3 is a schematic diagram comparing the prediction results of Embodiment 1 of the present invention with those of existing methods.

Detailed Implementation

[0019] The specific embodiments of the present invention are further described below with reference to the accompanying drawings. As shown in Figure 1, this embodiment provides an Alzheimer's disease prediction method based on a continuous learning neural network model, including the following steps: S1) Obtain multi-omics data from the subject; The multi-omics data of the subject include structural magnetic resonance imaging (sMRI) data, genomic data, and phenotypic data; Specifically, for the structural magnetic resonance imaging data, cortical and subcortical segmentation of the T1-weighted images was performed using FreeSurfer software to extract morphological features of brain regions from 34 anatomical areas, including surface area, thickness, and volume of deep nuclei. The genomic data were obtained using whole-genome microarrays; genotype imputation and quality control were performed using the HRC or 1000 Genomes reference panels to retain the SNP loci that meet the requirements. The phenotypic data include the subject's age and gender.

[0020] S2) Feature encoding is performed on the multi-omics data to obtain hidden layer representations; In this embodiment, the subject's structural magnetic resonance imaging data, genomic data, and phenotypic data are input into independent multilayer perceptrons (MLPs) for feature encoding to obtain the corresponding hidden layer representations, i.e.: $h_g = \mathrm{MLP}_g(x_g)$, $h_p = \mathrm{MLP}_p(x_p)$, $h_s = \mathrm{MLP}_s(x_s)$; In the formula, $x_g$, $x_p$, and $x_s$ represent genomic data, phenotypic data, and structural magnetic resonance imaging data, respectively; $h_g$, $h_p$, and $h_s$ represent the hidden layer representations of genomic data, phenotypic data, and structural magnetic resonance imaging data, respectively.

[0021] S3) The hidden layer representations are fused by weighting through a cross-modal multi-scale attention learning module to obtain multi-attention fusion features; In this embodiment, a cross-modal multi-scale attention learning module is introduced to fuse the hidden layer representations of the three modalities, modeling the dependencies between genomics, phenotype, and structural magnetic resonance imaging and yielding a multi-omics association representation; that is: Convolution operations with three different kernel sizes extract multi-scale patterns from the hidden layer representation $h_m$ of modality $m$ to generate the query vectors $Q_i^{m}$ ($i = 1, 2, 3$), and the key vectors $K$ and value vectors $V$ are extracted from the hidden layer representations of the other two modalities; Cross-modal dependencies are computed using scaled dot-product attention, i.e.: $F = \mathrm{MSA}(Q^{m}, K, V)$; $\mathrm{MSA}(Q^{m}, K, V) = \mathrm{MLP}\left(\mathrm{Concat}(A_1, A_2, A_3)\right)$; $A_i = \mathrm{softmax}\left(\frac{Q_i^{m} K^{\top}}{\sqrt{d_k}}\right) V$; In the formula, $F$ represents the joint feature representation after multi-scale attention fusion; $\mathrm{MSA}(\cdot)$ represents the multi-scale attention function; $A_i$ is the attention output at the $i$-th scale; $\mathrm{MLP}(\cdot)$ is a multilayer perceptron; $\mathrm{Concat}(\cdot)$ indicates concatenation; $Q_i^{m}$ is the multi-scale query vector extracted from modality $m$ by the convolution operations; $K$ and $V$ are the key and value matrices from the other two modalities, respectively; $d_k$ is the dimension of the key vector; $\mathrm{softmax}(\cdot)$ is the activation function that ensures the attention weights are normalized; The joint feature representation $F$ is passed through a multilayer perceptron including fully connected layers to generate the final multi-attention fusion feature $Z$, i.e.: $Z = \sigma\left(\mathrm{FC}(F)\right)$; In the formula, $\mathrm{FC}(\cdot)$ denotes the fully connected layers and $\sigma(\cdot)$ is the activation function.
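The fusion in step S3) can be illustrated with a minimal, dependency-free sketch. It is not the patented implementation: the fixed averaging kernels, the kernel sizes (1, 3, 5), the use of the raw hidden vectors of the other two modalities as both keys and values, and all function and variable names are assumptions made for illustration. The final multilayer perceptron is omitted.

```python
import math

def conv1d_same(x, kernel):
    """1-D convolution (cross-correlation form) with zero padding; output length equals len(x)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query over a list of key/value vectors."""
    d_k = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k) for key in keys]
    weights = softmax(scores)
    out = [sum(w * v[d] for w, v in zip(weights, values))
           for d in range(len(values[0]))]
    return out, weights

def multi_scale_cross_attention(h_query, h_others, kernel_sizes=(1, 3, 5)):
    """Concatenate attention outputs computed at several convolution scales."""
    fused, all_weights = [], []
    for size in kernel_sizes:
        kernel = [1.0 / size] * size       # assumed fixed kernel; learned in a real model
        q = conv1d_same(h_query, kernel)   # query at this scale
        out, w = attention(q, h_others, h_others)  # keys = values = other modalities
        fused.extend(out)
        all_weights.append(w)
    return fused, all_weights

# Hypothetical 4-dimensional hidden representations for the three modalities.
h_smri = [0.2, -0.1, 0.4, 0.3]
h_gene = [0.5, 0.1, -0.2, 0.0]
h_pheno = [-0.3, 0.2, 0.1, 0.6]
fused, weights = multi_scale_cross_attention(h_smri, [h_gene, h_pheno])
```

Here `fused` holds the three per-scale attention outputs concatenated end to end, and each entry of `weights` shows how strongly the sMRI query attends to the genomic and phenotypic representations at that scale.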

[0022] S4) Introduce the multi-router sparse gating expert network module MR-SMoE to perform adaptive computation and task-aware enhancement on the multi-attention fusion features, and obtain the task-aware enhanced feature representation. In this embodiment, the multi-router sparse gating expert network module MR-SMoE includes expert networks and routers, and dynamically selects and combines the most relevant expert networks based on the input final multi-attention fusion feature; Each expert network adopts a lightweight feedforward structure. Its up-projection and down-projection weight matrices are randomly initialized when a new task is introduced, and only the expert parameters corresponding to the current task column are updated during training, while the historical expert parameters are kept frozen.

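[0023] Each expert network is implemented as a neural network specialized for a particular input pattern, namely: $E_j(Z) = W_j^{\mathrm{down}}\,\sigma\left(W_j^{\mathrm{up}} Z + b_j^{\mathrm{up}}\right) + b_j^{\mathrm{down}},\ j = 1, \dots, N$; In the formula, $Z$ is the input final multi-attention fusion feature; $E_j(Z)$ is the output of the $j$-th expert network; $\sigma(\cdot)$ is the activation function; $W_j^{\mathrm{up}}$ and $W_j^{\mathrm{down}}$ are the weight matrices of the up-projection and down-projection, respectively; $b_j^{\mathrm{up}}$ and $b_j^{\mathrm{down}}$ are the corresponding bias terms; $N$ represents the total number of expert networks.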
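A lightweight feedforward expert of this kind (up-projection, activation, down-projection, with random initialization) might look like the following sketch. The ReLU activation, the layer widths, the initialization range, and every name in the code are assumptions for illustration; the source does not specify them.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(weight, bias, x):
    """Dense layer: weight is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def make_expert(d_model, d_hidden, rng):
    """Randomly initialise one expert (assumed uniform initialisation, zero biases)."""
    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "w_up": rand_matrix(d_hidden, d_model), "b_up": [0.0] * d_hidden,
        "w_down": rand_matrix(d_model, d_hidden), "b_down": [0.0] * d_model,
    }

def expert_forward(expert, z):
    """Up-projection, activation, then down-projection back to the input width."""
    hidden = relu(linear(expert["w_up"], expert["b_up"], z))
    return linear(expert["w_down"], expert["b_down"], hidden)

rng = random.Random(0)                        # fixed seed for reproducibility
experts = [make_expert(4, 8, rng) for _ in range(3)]
z = [0.3, -0.2, 0.5, 0.1]                     # hypothetical fused feature
outputs = [expert_forward(e, z) for e in experts]
```

Each expert maps the fused feature back to the same width, so the outputs of different experts can later be combined by a weighted sum.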

[0024] The router dynamically generates a set of routing weights from the input final multi-attention fusion feature $Z$ to determine the contribution of each expert network to the final output, specifically: A linear transformation is applied to $Z$ to obtain the gating score of each expert network, i.e.: $g = W_r Z + b_r$; In the formula, the $j$-th component $g_j$ of $g$ is the gating score of the $j$-th expert network; $W_r$ and $b_r$ are the routing weight matrix and the bias vector, respectively; The expert networks corresponding to the top $k$ gating scores are selected, i.e.: $\mathcal{T} = \mathrm{TopK}(g, k)$; The routing weights of the selected expert networks are then normalized to obtain the final routing weights $w_j$, i.e.: $w_j = \frac{\exp(g_j)}{\sum_{l \in \mathcal{T}} \exp(g_l)},\ j \in \mathcal{T}$;

[0025] In the formula, $\mathcal{T}$ represents the set of expert network indices selected by the Top-K operation, corresponding to the $k$ largest gating scores; The multi-router sparse gating expert network module MR-SMoE weights the output $E_j(Z)$ of each selected expert network by its final routing weight $w_j$ to obtain the task-aware enhanced feature representation, namely: $\tilde{Z} = \sum_{j \in \mathcal{T}} w_j\, E_j(Z)$; In the formula, $\tilde{Z}$ is the task-aware enhanced feature representation.
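The Top-K selection and weighted combination described above can be sketched as follows. The softmax over the selected gate scores is an assumption (the text only says the weights are "normalized"), and the gate scores, expert outputs, and function names are hypothetical values chosen to make the arithmetic easy to follow.

```python
import math

def top_k_route(gate_scores, k):
    """Return {expert_index: weight} for the k largest gate scores.

    Weights are a softmax restricted to the selected experts (assumed normalisation).
    """
    selected = sorted(range(len(gate_scores)),
                      key=lambda j: gate_scores[j], reverse=True)[:k]
    m = max(gate_scores[j] for j in selected)
    exps = {j: math.exp(gate_scores[j] - m) for j in selected}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}

def combine(expert_outputs, routing):
    """Weighted sum of the selected experts' outputs."""
    dim = len(expert_outputs[0])
    return [sum(w * expert_outputs[j][d] for j, w in routing.items())
            for d in range(dim)]

gate_scores = [0.1, 2.0, -1.0, 2.0]           # one score per expert
routing = top_k_route(gate_scores, k=2)       # experts 1 and 3, weight 0.5 each
expert_outputs = [[1.0, 0.0], [0.0, 2.0], [5.0, 5.0], [4.0, 0.0]]
enhanced = combine(expert_outputs, routing)   # [2.0, 1.0]
```

Experts that are not selected contribute nothing to the result, which is what makes the gating sparse: only `k` expert networks need to be evaluated per input.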

[0026] S5) Construct a prior knowledge graph and use the prior knowledge graph to train a continuous learning neural network model. The trained continuous learning neural network model is then used to predict the feature representation after task perception enhancement to obtain the prediction result of Alzheimer's disease.

[0027] In this embodiment, a prior knowledge graph is constructed as follows: Extract morphological features (sMRI), genomic data (SNPs), and phenotypic data (age, sex) of brain regions associated with Alzheimer's disease from large-scale population cohort data. In this embodiment, the data came from the UK Biobank (UKB), which included 46,834 participants aged 45–84 years, including 32 cases of Alzheimer's disease (AD), 20 cases of mild cognitive impairment (MCI), and 254 normal controls (NC). For structural magnetic resonance imaging (sMRI) data, T1-weighted images were used, and cortical segmentation was performed using the FreeSurfer software package. Based on the Desikan–Killiany–Tourville (DKT) template, 204 morphological features of brain regions from 34 anatomical regions were extracted, including surface area, cortical thickness, and volume. A total of 204 features were obtained for each hemisphere, covering the cortex, white matter, and subcortical gray matter regions, ensuring biological interpretability.

[0028] The UKB cohort contains approximately 810,000 SNP sites; after imputation with the Haplotype Reference Consortium (HRC) reference panel, approximately 96 million variant sites are retained. Based on the UKB cohort, genome-wide association studies (GWAS) were performed on the morphological features of 24 brain regions associated with Alzheimer's disease, and PLINK software was used to identify significantly associated SNP loci ($p < 5 \times 10^{-8}$). Based on this, the individual-level polygenic risk score (PRS) is calculated, i.e.: $\mathrm{PRS} = \sum_{i} \beta_i G_i$; In the formula, $\beta_i$ is the effect size of the $i$-th SNP site; $G_i$ is the allele count of the $i$-th SNP locus in the individual; Based on population-level statistical analysis, a quantitative relationship was established between brain region morphological features and the genetic variation PRS, age, and sex, forming a 24×6 multi-omics association matrix, where each row corresponds to a brain region feature and each column corresponds to a covariate (including genetic variation PRS, age, sex, and their age-sex matched mean and variance).
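As a small worked example of the polygenic risk score, the sketch below computes the weighted sum of allele counts for one individual. The effect sizes and allele counts are made-up numbers, and the function name is hypothetical; in practice the effect sizes would come from the GWAS summary statistics.

```python
def polygenic_risk_score(effect_sizes, allele_counts):
    """Sum of effect size times allele count over the selected SNP loci."""
    if len(effect_sizes) != len(allele_counts):
        raise ValueError("effect_sizes and allele_counts must have the same length")
    return sum(beta * count for beta, count in zip(effect_sizes, allele_counts))

effect_sizes = [0.12, -0.05, 0.30]   # hypothetical per-SNP effect sizes
allele_counts = [2, 1, 0]            # copies of the effect allele (0, 1 or 2)
prs = polygenic_risk_score(effect_sizes, allele_counts)  # 0.12*2 - 0.05*1 + 0.30*0
```

For these inputs the score is approximately 0.19; a locus with zero copies of the effect allele contributes nothing regardless of its effect size.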

[0029] In this embodiment, the continuous learning neural network model adopts a progressive neural network (PNN) architecture. The continuous learning neural network model is trained through a prior knowledge graph. During the training process, a task-adaptive regularization mechanism and a dynamic memory replay strategy are used to prevent catastrophic forgetting and gradually integrate new clinical cohort data, thereby achieving continuous learning and knowledge transfer for the current task.

[0030] The continuous learning neural network model assigns an independent column to each new task. The newly added column has its own independent set of parameters, and lateral connections are established between the columns to effectively prevent catastrophic forgetting and promote knowledge sharing across tasks. The overall architecture of the continuous learning neural network model consists of a set of columns, represented as follows: $\mathcal{C} = \{C_1, C_2, \dots, C_T\}$; where $T$ is the number of tasks and each column $C_t$ is specifically designed for learning the $t$-th task.

[0031] When a new task is added, its input comes from the task-aware enhanced feature representation $\tilde{Z}$, which is sent to the adaptive layers of the current task column for processing. Specifically, the outputs of all historical task columns are concatenated and fused through a trainable adaptive layer to form a cross-task knowledge transfer path, namely: $a_t = f^{(l)}\left(h_t^{(l)} \oplus h_1^{(l)} \oplus \dots \oplus h_{t-1}^{(l)}\right)$; In the formula, $a_t$ indicates the adaptive features of the $t$-th task; $\oplus$ represents the vector concatenation operation; $f^{(l)}$ is the adaptive function of the $l$-th adaptive layer; $h_t^{(l)}$ is the output of the current task $t$ at the $l$-th adaptive layer; $h_j^{(l)}$ ($j < t$) is the historical output of the $j$-th task at the $l$-th adaptive layer; A task-specific classification transformation is applied to the adaptive features $a_t$ to obtain the Alzheimer's disease prediction result of the $t$-th task, i.e.: $\hat{y}_t = \sigma\left(\phi_t(a_t)\right)$; In the formula, $\hat{y}_t$ indicates the final output of the $t$-th task; $\phi_t(\cdot)$ denotes the task-specific classification transformation; $\sigma(\cdot)$ is the activation function.
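The lateral connection between task columns can be sketched as below: the current column's activation is concatenated with the stored activations of earlier columns, passed through a trainable adapter, and then classified. The ReLU adapter, the softmax output over three labels (NC, MCI, AD), the weight values, and all names are assumptions for illustration rather than details taken from the source.

```python
import math

def linear(weight, bias, x):
    """Dense layer: weight is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adapted_features(adapter, current_hidden, historical_hiddens):
    """Concatenate the current column's activation with the (frozen) activations
    of earlier columns, then apply the trainable adapter layer."""
    concatenated = list(current_hidden)
    for hidden in historical_hiddens:
        concatenated.extend(hidden)
    return relu(linear(adapter["w"], adapter["b"], concatenated))

def predict(head, features):
    """Task-specific classification head followed by softmax."""
    return softmax(linear(head["w"], head["b"], features))

current_hidden = [0.4, -0.1]                     # column of the new task
historical_hiddens = [[0.2, 0.3], [-0.5, 0.1]]   # two earlier, frozen columns
adapter = {"w": [[0.1] * 6, [0.2] * 6, [-0.1] * 6], "b": [0.0, 0.0, 0.0]}
head = {"w": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
        "b": [0.0, 0.0, 0.0]}

features = adapted_features(adapter, current_hidden, historical_hiddens)
probabilities = predict(head, features)          # one probability per label
```

Only the adapter and the head of the new column would be trained; the historical activations are read but never modified, which is how knowledge is transferred without overwriting earlier tasks.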

[0032] The prediction results for Alzheimer's disease include: Alzheimer's disease status classification: the predicted probability distribution over three labels, namely normal control (NC), mild cognitive impairment (MCI), and Alzheimer's disease (AD); Disease progression risk score: the probability that a subject will progress from the current stage to the next stage within the next 5 years, generated based on a multi-omics dynamic trend model; Subtype classification based on combinations of biomarkers: including the combined discrimination of Aβ positivity and Tau protein pathological status, supporting precision medicine intervention.

[0033] When training the $t$-th task, all parameters in the columns of previous tasks remain fixed, and the gradients with respect to the parameters of every earlier task column $j < t$ are set to zero to prevent catastrophic forgetting, i.e.: $\frac{\partial \mathcal{L}_t}{\partial \theta_j} = 0,\ \forall j < t$; In the formula, $\mathcal{L}_t$ is the loss function of the current task $t$; $\theta_j$ denotes the parameters of the $j$-th task column.
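The freezing rule can be demonstrated with a toy gradient step in which the gradients of all earlier task columns are replaced by zeros before the update. Plain SGD, the learning rate, the parameter values, and the names used here are assumptions for illustration only.

```python
def sgd_step(columns, grads, current_task, lr=0.1):
    """Apply one SGD step, zeroing the gradients of every column before current_task."""
    updated = {}
    for task_id, params in columns.items():
        if task_id < current_task:
            masked = [0.0] * len(params)      # frozen column: gradient set to zero
        else:
            masked = grads[task_id]
        updated[task_id] = [p - lr * g for p, g in zip(params, masked)]
    return updated

columns = {1: [0.5, -0.2], 2: [0.1, 0.3]}     # parameters of task columns 1 and 2
grads = {1: [1.0, 1.0], 2: [0.5, -0.5]}       # raw gradients of the task-2 loss
updated = sgd_step(columns, grads, current_task=2)
```

After the step, column 1 is unchanged even though its raw gradient was non-zero, while column 2 moves against its gradient.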

[0034] Therefore, the continuous learning neural network model described above can quickly adapt to new tasks while retaining knowledge from previous tasks, achieving stable and efficient sequential learning.

[0035] To verify the effectiveness of the method in this embodiment, an experimental evaluation was conducted using a publicly available dataset. The experimental setup was as follows: the data was divided into an 80% training set and a 20% test set, and all results were reported on a separate reserved test set. The evaluation metrics included six indicators: accuracy (ACC), sensitivity (SEN), specificity (SPE), area under the curve (AUC), precision (PRE), and F1 score (F1).
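Five of the six reported metrics follow directly from the binary confusion matrix, as in the sketch below. The labels and predictions are invented for illustration, the function name is hypothetical, and AUC is left out because it requires continuous scores rather than hard predictions.

```python
def binary_metrics(y_true, y_pred):
    """Compute ACC, SEN, SPE, PRE and F1 from binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    sen = ratio(tp, tp + fn)                  # sensitivity (recall)
    pre = ratio(tp, tp + fp)                  # precision
    return {
        "ACC": ratio(tp + tn, len(pairs)),
        "SEN": sen,
        "SPE": ratio(tn, tn + fp),            # specificity
        "PRE": pre,
        "F1": ratio(2 * pre * sen, pre + sen),
    }

y_true = [1, 1, 1, 0, 0, 0, 0, 1]             # hypothetical ground truth
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]             # hypothetical model predictions
metrics = binary_metrics(y_true, y_pred)      # every metric equals 0.75 here
```

In this toy case there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all five metrics come out to 0.75.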

[0036] The method of this embodiment was compared with various classic machine learning models (SVM, LR, RF, XGBoost, LightGBM) and advanced deep learning architectures (Transformer, CNN, TSMixer, Tab-Transformer, TimesNet, FT-Transformer). All models were run with the same input data, random seeds, learning rate (0.001), and number of training epochs (200) to ensure a fair comparison. The experimental results are shown in Figure 3. In the NC vs. MCI task, the method of this embodiment achieves an ACC of 84.38% and an AUC of 87.69%, outperforming the best baseline Tab-Transformer (ACC: 74.85%, AUC: 74.87%). In the MCI vs. AD task, it achieves an ACC of 80.21% and an AUC of 78.54%, outperforming Transformer (ACC: 72.10%, AUC: 76.35%). In the NC vs. AD task, it achieves an ACC of 92.85% and an AUC of 96.43%, surpassing FT-Transformer (ACC: 86.41%, AUC: 95.49%).

[0037] This embodiment demonstrates significant advantages in early-stage identification tasks (NC vs. MCI and MCI vs. AD), improving ACC by 12.84% and 3.67% respectively, and also maintaining a leading position in AUC. This indicates that this embodiment can more effectively capture early pathological changes and has greater potential for clinical application.

[0038] Furthermore, this embodiment demonstrates balanced performance across all metrics, avoiding the severe sensitivity deficiencies of models such as XGBoost (SEN = 28.26%) or LightGBM (SEN = 32.61%). The radar chart in Figure 3 further shows that the present invention consistently remains near the optimal profile in all three tasks, demonstrating a consistent and comprehensive performance advantage, while most baseline models have obvious shortcomings.

[0039] This embodiment has higher classification ability, and can maintain high sensitivity, especially in the low threshold area, making it suitable for accurate screening of early cases in actual clinical scenarios.

[0040] In summary, this embodiment achieves higher accuracy, stronger generalization ability, and more robust performance in multi-task diagnostics, fully verifying its technical feasibility and practical value.

[0041] Example 2. This embodiment provides an Alzheimer's disease prediction system based on a continuous learning neural network model, including: a data acquisition module, used to acquire multi-omics data from the subject; a feature encoding module, which performs nonlinear mapping on the multi-omics data through independent multilayer perceptrons to generate the corresponding hidden layer representations; a multimodal fusion module, which uses a cross-modal multi-scale attention mechanism to perform weighted fusion of the encoded hidden layer representations and generate multi-attention fusion features; a task-aware enhancement module, which uses the multi-router sparse gating expert network module MR-SMoE to adaptively compute and enhance the multi-attention fusion features, yielding a task-aware enhanced feature representation; and a prediction module, which calls a pre-trained continuous learning neural network model to perform incremental learning on the task-aware enhanced feature representation and output prediction results including Alzheimer's disease status classification, disease progression risk score, and subtype classification.

[0042] Example 3. This embodiment provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the Alzheimer's disease prediction method.

[0043] In this embodiment, the memory can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. A processor, coupled to the memory, is used to execute computer programs stored in the memory.

[0044] The computer program includes computer program code, which may be in the form of source code, object code, executable file, or some intermediate form.

[0045] The embodiments and descriptions above are merely illustrative of the principles and preferred embodiments of the present invention. Various changes and modifications may be made to the present invention without departing from its spirit and scope, and all such changes and modifications fall within the scope of the present invention as claimed.

Claims

1. An Alzheimer's disease prediction method based on a continuous learning neural network model, characterized in that it includes the following steps:
S1) Obtain multi-omics data from the subject;
S2) Perform feature encoding on the multi-omics data to obtain hidden layer representations;
S3) Fuse the hidden layer representations by weighting through a cross-modal multi-scale attention learning module to obtain multi-attention fusion features;
S4) Introduce the multi-router sparse gating expert network module MR-SMoE to perform adaptive computation and task-aware enhancement on the multi-attention fusion features, obtaining the task-aware enhanced feature representation;
S5) Feed the task-aware enhanced feature representation to the continuous learning neural network model to obtain the prediction result for Alzheimer's disease.

2. The Alzheimer's disease prediction method based on a continuous learning neural network model according to claim 1, characterized in that: In step S2), the multi-omics data of the subjects includes structural magnetic resonance imaging data, genomic data, and phenotypic data; the corresponding hidden layer representations are obtained by inputting the structural magnetic resonance imaging data, genomic data, and phenotypic data into independent multilayer perceptrons (MLPs) for feature encoding.

3. The Alzheimer's disease prediction method based on a continuous learning neural network model according to claim 2, characterized in that: Step S3) specifically includes: Multi-scale patterns of the hidden layer representations are extracted using convolution operations with three different kernel sizes to generate query vectors, and key and value vectors are extracted from the hidden layer representations of the other two modalities; Cross-modal dependencies are calculated by scaled dot-product attention, resulting in a multi-attention feature map; The multi-attention feature map is passed through a multilayer perceptron including fully connected layers to generate the final multi-attention fusion features.

4. The Alzheimer's disease prediction method based on a continuous learning neural network model according to claim 3, characterized in that: In step S4), the multi-router sparse gating expert network module MR-SMoE includes expert networks and routers, and dynamically selects and combines the most relevant expert networks based on the input final multi-attention fusion features.

5. The Alzheimer's disease prediction method based on a continuous learning neural network model according to claim 4, characterized in that: in step S4), each expert network is implemented as a neural network specifically designed to handle a particular input pattern, i.e.: $E_i(x) = W_i^{\mathrm{down}}\,\sigma\!\left(W_i^{\mathrm{up}} x + b_i^{\mathrm{up}}\right) + b_i^{\mathrm{down}},\quad i = 1, \dots, N$; where $E_i(x)$ is the output of the $i$-th expert network for input $x$; $\sigma$ is the activation function; $W_i^{\mathrm{up}}$ and $W_i^{\mathrm{down}}$ are the weight matrices of the up-projection and down-projection, respectively; $b_i^{\mathrm{up}}$ and $b_i^{\mathrm{down}}$ are the corresponding bias terms; and $N$ is the total number of expert networks.

6. The Alzheimer's disease prediction method based on a continuous learning neural network model according to claim 5, characterized in that: in step S4), the router dynamically generates a set of routing weights from the input final multi-attention fusion feature $x$ to determine the contribution of each expert network to the final output, specifically: a linear transformation is applied to the input final multi-attention fusion feature $x$ to obtain the gating score of each expert network, i.e.: $g = W_r x + b_r$; where $g_i$ is the gating score of the $i$-th expert network, and $W_r$ and $b_r$ are the routing weight matrix and the bias vector, respectively; the $K$ expert networks with the largest gating scores are selected, i.e.: $\mathcal{S} = \operatorname{TopK}\!\left(\{g_i\}_{i=1}^{N}, K\right)$; the routing weights of the selected expert networks are then normalized over $\mathcal{S}$ to obtain the final routing weights $w_i$, $i \in \mathcal{S}$, where $\mathcal{S}$ is the set of expert network indices selected by the Top-K operation, corresponding to the largest gating scores; finally, the multi-router sparse gating expert network module MR-SMoE weights the output $E_i(x)$ of each selected expert network by the final routing weight $w_i$ to obtain the task-aware enhanced feature representation, namely: $z = \sum_{i \in \mathcal{S}} w_i\, E_i(x)$; where $z$ is the task-aware enhanced feature representation.
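The expert and routing steps of claims 5 and 6 can be sketched as below. This is a hedged illustration with a single router: the ReLU activation, the softmax normalization over the selected experts, and all weight values are assumptions, since the claim text only states that the selected routing weights are "normalized". The helper names are hypothetical.

```python
import math

def linear(W, b, x):
    """Return W @ x + b for a matrix W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def expert_forward(expert, x):
    """Up-projection, activation (ReLU assumed), then down-projection."""
    hidden = [max(0.0, a) for a in linear(expert["W_up"], expert["b_up"], x)]
    return linear(expert["W_down"], expert["b_down"], hidden)

def route_top_k(W_r, b_r, x, k):
    """Return {expert_index: routing_weight} for the k largest gating scores."""
    scores = linear(W_r, b_r, x)
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Assumption: normalize with a softmax restricted to the selected experts.
    m = max(scores[i] for i in top)
    exps = {i: math.exp(scores[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(experts, W_r, b_r, x, k):
    """Weighted sum of the selected experts' outputs."""
    weights = route_top_k(W_r, b_r, x, k)
    out = [0.0] * len(experts[0]["b_down"])
    for i, w in weights.items():
        y = expert_forward(experts[i], x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, weights

def make_expert(scale):
    """Toy expert whose output is scale * (x[0] + x[1]) for non-negative inputs."""
    return {"W_up": [[scale, 0.0], [0.0, scale]], "b_up": [0.0, 0.0],
            "W_down": [[1.0, 1.0]], "b_down": [0.0]}

experts = [make_expert(s) for s in (1.0, 2.0, 3.0)]
W_r = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # illustrative router weights
b_r = [0.0, 0.0, 0.0]
x = [1.0, 2.0]
z, weights = moe_forward(experts, W_r, b_r, x, k=2)
```

With these toy values the gating scores are 1, 2, and -3, so experts 0 and 1 are selected and expert 2 contributes nothing to `z`. Only the selected experts are evaluated, which is what makes the gating sparse.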

7. The Alzheimer's disease prediction method based on a continuous learning neural network model according to claim 6, characterized in that: in step S5), the continuous learning neural network model adopts a progressive neural network (PNN) architecture and is trained using a prior knowledge graph; during training, a task-adaptive regularization mechanism and a dynamic memory replay strategy are used to prevent catastrophic forgetting and to gradually integrate new clinical cohort data; the continuous learning neural network model assigns an independent column to each new task, the new column including an independent set of parameters, and establishes lateral connections between columns to prevent catastrophic forgetting and promote knowledge sharing across tasks; when training the $k$-th task, all parameters in the columns of the preceding tasks remain fixed, and the gradients of tasks $1$ through $k-1$ are set to zero to prevent catastrophic forgetting.

8. The Alzheimer's disease prediction method based on a continuous learning neural network model according to claim 7, characterized in that: in step S5), when a new task column is added, its input comes from the task-aware enhanced feature representation; the task-aware enhanced feature representation $z$ is sent to the adaptive layer of the current task column for processing, specifically: the outputs of all historical task columns are concatenated and fused through a trainable adaptive layer to form a cross-task knowledge transfer path, namely: $a_k = f^{(l)}\!\left(\left[\,h_k^{(l)};\ h_1^{(l)};\ \dots;\ h_{k-1}^{(l)}\,\right]\right)$; where $a_k$ denotes the adaptive features of the $k$-th task; $[\,\cdot\,;\,\cdot\,]$ denotes the vector concatenation operation; $f^{(l)}$ is the adaptive function of the $l$-th adaptive layer; $h_k^{(l)}$ is the output of the current task $k$ at the $l$-th adaptive layer; and $h_j^{(l)}$ is the historical output of the $j$-th task at the $l$-th adaptive layer; a task-specific classification transformation is then applied to the adaptive features $a_k$ to obtain the Alzheimer's disease prediction result $y_k$ of the $k$-th task, where $y_k$ denotes the final output of the $k$-th task and $\sigma$ is the activation function used in the transformation.
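The column freezing of claim 7 and the lateral concatenation of claim 8 can be illustrated with the toy sketch below. It is not the claimed model: each column has a single hidden layer, the adaptive function is a tanh layer, the output activation is a sigmoid, the gradients passed to `sgd_step` are placeholder values rather than real backpropagated gradients, and the knowledge graph, regularization, and memory replay are left out. All class and function names are hypothetical.

```python
import math
import random

class Column:
    """One task column with its own parameters (sizes are illustrative)."""
    def __init__(self, in_dim, hid_dim, n_prev, rng):
        cat_dim = hid_dim * (n_prev + 1)  # own features plus one block per earlier column
        self.W = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hid_dim)]
        self.U = [[rng.uniform(-1, 1) for _ in range(cat_dim)] for _ in range(hid_dim)]
        self.v = [rng.uniform(-1, 1) for _ in range(hid_dim)]
        self.frozen = False

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.W]

def forward(columns, k, x):
    """Prediction of column k (0-based) using lateral inputs from columns 0..k-1."""
    col = columns[k]
    concat = col.hidden(x)
    for prev in columns[:k]:
        concat = concat + prev.hidden(x)  # vector concatenation
    # Adaptive layer over the concatenated features (tanh assumed).
    a_k = [math.tanh(sum(u * c for u, c in zip(row, concat))) for row in col.U]
    # Task-specific output (sigmoid assumed).
    return 1.0 / (1.0 + math.exp(-sum(v * a for v, a in zip(col.v, a_k))))

def add_column(columns, in_dim, hid_dim, rng):
    """Freeze every existing column, then append a fresh trainable one."""
    for col in columns:
        col.frozen = True
    columns.append(Column(in_dim, hid_dim, len(columns), rng))

def sgd_step(columns, grads_v, lr=0.1):
    """Update output weights; frozen columns are skipped, i.e. their gradient is zeroed."""
    for col, g in zip(columns, grads_v):
        if col.frozen:
            continue
        col.v = [v - lr * gi for v, gi in zip(col.v, g)]

rng = random.Random(0)
columns = []
add_column(columns, in_dim=3, hid_dim=2, rng=rng)  # task 1
add_column(columns, in_dim=3, hid_dim=2, rng=rng)  # task 2, task 1 is now frozen
x = [0.2, -0.1, 0.4]
y_task1_before = forward(columns, 0, x)
v_task2_before = list(columns[1].v)
sgd_step(columns, [[1.0, 1.0], [1.0, 1.0]])  # placeholder gradients
y_task1_after = forward(columns, 0, x)
```

Because the first column is frozen, its prediction for the same input is identical before and after the update, while the second column's parameters do change. That invariance is the property the claim relies on to avoid catastrophic forgetting.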

9. An Alzheimer's disease prediction system based on a continuous learning neural network model, characterized in that it comprises:
a data acquisition module, used to acquire multi-omics data of a subject;
a feature encoding module, which performs nonlinear mapping on the multi-omics data through independent multilayer perceptrons to generate the corresponding hidden layer representations;
a multimodal fusion module, which uses a cross-modal multi-scale attention mechanism to perform weighted fusion of the encoded hidden layer representations and generate multi-attention fusion features;
a task-aware enhancement module, which uses the multi-router sparse gating expert network module MR-SMoE to perform adaptive computation and enhancement on the multi-attention fusion features, obtaining a task-aware enhanced feature representation;
a prediction module, which invokes a pre-trained continuous learning neural network model to perform incremental learning on the task-aware enhanced feature representation and output prediction results including Alzheimer's disease state classification, a disease progression risk score, and subtype classification.

10. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the Alzheimer's disease prediction method according to any one of claims 1-8.