Disease classification risk prediction apparatus and device
By using a homogeneous-heterogeneous dual-view aggregation framework and leveraging graph neural networks and heterogeneous graph Transformer models, the problems of information redundancy and insufficient modeling of heterogeneous data relationships in multi-omics neural networks for brain disease prediction are solved, achieving more efficient multi-omics data fusion and improved accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHENZHEN UNIV
- Filing Date
- 2026-05-25
- Publication Date
- 2026-06-19
AI Technical Summary
Existing multi-omics neural network methods for predicting and diagnosing brain diseases suffer from problems such as redundant information due to primitive gene data encoding methods, excessively high feature dimensions, low computational efficiency, insufficient biological significance, inadequate modeling of relationships between heterogeneous data, and difficulty in fully integrating information from different data modalities.
A homogeneous-heterogeneous dual-view aggregation framework is adopted. A graph neural network is used to capture the similarity relationship between individuals within a single omics modality, while a heterogeneous graph Transformer model is used to capture the influence relationship between different omics types. A multi-scale comprehensive representation vector is obtained through a contrastive learning module to achieve effective fusion of multi-omics data.
It significantly improves the accuracy and reliability of multi-omics neural networks in predicting brain diseases, effectively capturing local aggregation structures within modalities and global heterogeneous relationships between modalities, and has important clinical application value.
Figure CN122241624A_ABST
Description
Technical Field
[0001] This application relates to the field of data processing technology, and in particular to a disease classification risk prediction device and equipment.
Background Technology
[0002] Multi-omics technology is a systematic research method that integrates data from multiple biological levels, such as genomics, transcriptomics, proteomics, and metabolomics, to comprehensively analyze the complexity of biological systems.
[0003] Existing multi-omics neural network methods face the following main drawbacks in the prediction and diagnosis of brain diseases: 1. The primitive encoding method of gene data leads to information redundancy. Existing methods directly use the original sequence or one-hot encoding, which does not make full use of the information theory properties of gene features, resulting in excessively high feature dimensionality, low computational efficiency, and insufficient biological significance.
[0004] 2. The feature dimensions of different omics data vary greatly and are mismatched. For example, genotype single nucleotide polymorphism (SNP) data often have tens of thousands of dimensions, while brain imaging features have only a few hundred dimensions. This dimensional mismatch makes cross-modal aggregation difficult and makes it hard to fully integrate information from different data modalities.
[0005] 3. Relationships between heterogeneous data are insufficiently modeled, and systematic methods for representing heterogeneous graphs are lacking. Existing methods mainly use simple homogeneous graphs or manually designed relationships, which struggle to capture the complex heterogeneous relationships between multimodal omics data (such as gene-image and image-phenotype relationships), limiting their effectiveness in practical applications.
Summary of the Invention
[0006] This application proposes a disease classification risk prediction device and equipment, which can solve one of the problems existing in the background art.
[0007] To achieve the above objectives, this application adopts the following technical solution. In a first aspect, a disease classification risk prediction device is provided, comprising:
a feature encoding layer, used to obtain and encode the multi-omics data of a tested individual to obtain multi-omics features;
a feature preprocessing layer, used to preprocess the multi-omics features;
a dual-view aggregation layer, including a homogeneous view aggregation module and a heterogeneous view aggregation module, wherein the homogeneous view aggregation module is used to perform, for a homogeneous graph of a single omics type constructed from the omics features, local aggregation within that omics type using a graph neural network, to obtain an intra-omics aggregation representation, and the heterogeneous view aggregation module is used to capture, for a heterogeneous graph of different omics types constructed from the omics features, the influence relationships between different omics types using the attention mechanism of a heterogeneous graph Transformer model, to obtain an inter-omics fusion representation; and
a representation learning layer, including a contrastive learning module and a classification prediction module, wherein the contrastive learning module is used to obtain, for the same tested individual, a multi-scale comprehensive representation vector by contrastive learning between the intra-omics aggregation representation and the inter-omics fusion representation, and the classification prediction module is used to process the multi-scale comprehensive representation vector as input to obtain the prediction result.
[0008] In one possible design of the first aspect, the feature encoding layer includes a feature extraction module and a mutual information evaluation module, wherein the feature extraction module is used to encode omics data into omics features, and the mutual information evaluation module is used to filter omics features based on the mutual information calculation results of the omics features for the preprocessing.
[0009] In one possible design of the first aspect, the homogeneous view aggregation module is further used to: add edges to the homogeneous graph based on a preset meta-path whose start and end nodes are of the same type, wherein each added edge represents the relationship between the start and end nodes of the meta-path.
[0010] In one possible design of the first aspect, the heterogeneous graph Transformer model parameterizes the node types and edge types, so that the attention score between the target node t and the source node s is determined not only by the feature vectors of t and s, but also by the spatial projection correction of the type of the edge connecting them.
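[0011] In one possible design of the first aspect, the inter-omics fusion representation of the target node $t$ is:
$$h_t' = \sum_{s \in N(t)} \alpha_{st}\,\mathrm{Msg}_{st}, \qquad \alpha_{st} = \frac{\exp(a_{st})}{\sum_{j \in N(t)} \exp(a_{jt})},$$
$$a_{st} = \frac{\left(W_{\tau(s)} h_s\right)^{\top} W^{ATT}_{\phi(e)} \left(W_{\tau(t)} h_t\right)}{\sqrt{d}}, \qquad \mathrm{Msg}_{st} = W^{MSG}_{\phi(e)}\, W_{\tau(s)} h_s,$$
where $j$ is a node, $N(t)$ represents all neighboring nodes of node $t$, $\tau(t)$ represents the type of the target node $t$, $\tau(s)$ represents the type of the source node $s$, $\phi(e)$ represents the type of the edge $e$ connecting $s$ and $t$, and $d$ represents the dimension of the key vector. $\alpha_{st}$ is the heterogeneous attention, $W^{MSG}_{\phi(e)}$ is a matrix specific to the edge type, $W_{\tau(t)}$ and $W_{\tau(s)}$ are the projection matrices for the node types of $t$ and $s$, $W^{ATT}_{\phi(e)}$ is the key matrix for capturing the semantics of edge types, and $\mathrm{Msg}_{st}$ is the heterogeneous message passing.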
[0012] In one possible design of the first aspect, obtaining the multi-scale comprehensive representation vector of the tested individual by contrasting the intra-omics aggregation representation with the inter-omics fusion representation specifically comprises: using the intra-omics aggregation representation as the anchor view and the inter-omics fusion representation as the augmented view.
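[0013] In one possible design of the first aspect, the contrastive learning module employs the InfoNCE loss function:
$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{\exp\left(\mathrm{sim}(z_i^{a}, z_i^{b})/\tau\right)}{\sum_{j=1}^{N} \exp\left(\mathrm{sim}(z_i^{a}, z_j^{b})/\tau\right)},$$
where $z_i^{a}$ and $z_i^{b}$ are the anchor-view and augmented-view representations of the $i$-th tested individual, $\mathrm{sim}(\cdot,\cdot)$ is the cosine similarity, the temperature parameter $\tau$ controls how sharply the model discriminates between negative samples, and $N$ is the number of samples.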
[0014] In one possible design approach of the first aspect, the multi-omics data includes at least two of the following: genetic data, brain imaging data, phenotypic data, and environmental data; Preprocessing includes: dimension detection, linear projection, and adaptive normalization.
[0015] In one possible design of the first aspect, the feature coding layer further includes a population sparse coding module, which is used to generate a sparse representation of the gene features for each tested individual as a genomic feature.
[0016] In a second aspect, an electronic device is provided, comprising: a processor, and a memory coupled to the processor, the memory being used to store a computer program; the processor being used to execute the computer program stored in the memory, such that the electronic device performs the following process:
obtaining and encoding the multi-omics data of a tested individual to obtain multi-omics features;
preprocessing the multi-omics features;
for a homogeneous graph of a single omics type constructed from the omics features, using a graph neural network to perform local aggregation within that omics type to obtain an intra-omics aggregation representation;
for a heterogeneous graph of different omics types constructed from the omics features, using the attention mechanism of a heterogeneous graph Transformer model to capture the influence relationships between different omics types to obtain an inter-omics fusion representation; and
for the same tested individual, obtaining a multi-scale comprehensive representation vector by contrasting the intra-omics aggregation representation with the inter-omics fusion representation, and processing the multi-scale comprehensive representation vector as input to obtain the prediction result.
[0017] Beneficial effects: Based on the above technical solution, a homogeneous-heterogeneous dual-view aggregation framework is introduced. The homogeneous graph neural network is used to capture the similarity relationship between individuals within a single omics modality, while the heterogeneous graph Transformer model is used for message passing and feature interaction across multiple omics modalities. This dual-view design is more flexible than a single graph neural network and can simultaneously preserve the local aggregation structure within a modality and the global heterogeneous relationship between modalities. It significantly improves the accuracy and reliability of multi-omics neural networks in predicting brain diseases and has important clinical application value.
Description of the Drawings
[0018] To illustrate the technical solutions in the embodiments of this application more clearly, the drawings used in the description of the embodiments or the related art are briefly introduced below. The drawings described below show only some embodiments of this application; those skilled in the art can obtain other drawings from them without creative effort.
[0019] Figure 1 is a structural diagram of the disease classification risk prediction device provided in Embodiment 1 of this application;
Figure 2 is a flowchart of the disease classification risk prediction method based on a multi-omics neural network model provided in Embodiment 2 of this application;
Figure 3 is a schematic diagram of GCN processing provided in Embodiment 2 of this application;
Figure 4 is a schematic diagram of heterogeneous graph Transformer processing provided in Embodiment 2 of this application;
Figure 5 is a schematic diagram of the heterogeneous graph Transformer structure provided in Embodiment 2 of this application;
Figure 6 is a schematic diagram of the heterogeneous attention mechanism provided in Embodiment 2 of this application.
Detailed Description
[0020] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0021] It should be noted that although functional modules are divided in the device schematic diagram and the logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in a different order than the module division in the device or the order in the flowchart. The terms "first," "second," etc., in the specification and the above-mentioned figures are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
[0022] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.
[0023] Embodiment 1
As shown in Figure 1, this embodiment provides a disease classification risk prediction device, including:
a feature encoding layer 101, used to obtain and encode the multi-omics data of a tested individual to obtain multi-omics features;
a feature preprocessing layer 102, used to preprocess the multi-omics features;
a dual-view aggregation layer 103, including a homogeneous view aggregation module and a heterogeneous view aggregation module, wherein the homogeneous view aggregation module is used to perform, for a homogeneous graph of a single omics type constructed from the omics features, local aggregation within that omics type using a graph neural network (here a graph convolutional network, GCN), to obtain an intra-omics aggregation representation, and the heterogeneous view aggregation module is used to capture, for a heterogeneous graph of different omics types constructed from the omics features, the influence relationships between different omics types using the attention mechanism of the heterogeneous graph Transformer model, to obtain an inter-omics fusion representation; and
a representation learning layer 104, including a contrastive learning module and a classification prediction module, wherein the contrastive learning module is used to obtain, for the same tested individual, a multi-scale comprehensive representation vector by contrastive learning between the intra-omics aggregation representation and the inter-omics fusion representation, and the classification prediction module is used to process the multi-scale comprehensive representation vector as input to obtain the prediction result.
[0024] Specifically, multi-omics data includes at least two of the following: genetic data, brain imaging data, phenotypic data, and environmental data.
[0025] Preprocessing includes: dimension detection, linear projection, and adaptive normalization.
[0026] In one possible implementation, the feature encoding layer includes a feature extraction module and a mutual information evaluation module, wherein the feature extraction module is used to encode omics data and generate omics features, and the mutual information evaluation module is used to filter omics features based on the mutual information calculation results of the omics features for the preprocessing.
[0027] In one possible implementation, the homogeneous view aggregation module is further used to: add edges to the homogeneous graph based on a preset meta-path whose start and end nodes are of the same type, wherein each added edge represents the relationship between the start and end nodes of the meta-path.
[0028] In one possible implementation, the heterogeneous graph Transformer model parameterizes the node types and edge types, so that the attention score between the target node t and the source node s is determined not only by the feature vectors of t and s, but also by the spatial projection correction of the type of the edge connecting them.
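[0029] In one possible implementation, the inter-omics fusion representation of the target node $t$ is:
$$h_t' = \sum_{s \in N(t)} \alpha_{st}\,\mathrm{Msg}_{st}, \qquad \alpha_{st} = \frac{\exp(a_{st})}{\sum_{j \in N(t)} \exp(a_{jt})},$$
$$a_{st} = \frac{\left(W_{\tau(s)} h_s\right)^{\top} W^{ATT}_{\phi(e)} \left(W_{\tau(t)} h_t\right)}{\sqrt{d}}, \qquad \mathrm{Msg}_{st} = W^{MSG}_{\phi(e)}\, W_{\tau(s)} h_s,$$
where $j$ is a node, $N(t)$ represents all neighboring nodes of node $t$, $\tau(t)$ represents the type of the target node $t$, $\tau(s)$ represents the type of the source node $s$, $\phi(e)$ represents the type of the edge $e$ connecting $s$ and $t$, and $d$ represents the dimension of the key vector. $\alpha_{st}$ is the heterogeneous attention, $W^{MSG}_{\phi(e)}$ is a matrix specific to the edge type, $W_{\tau(t)}$ and $W_{\tau(s)}$ are the projection matrices for the node types of $t$ and $s$, $W^{ATT}_{\phi(e)}$ is the key matrix for capturing the semantics of edge types, and $\mathrm{Msg}_{st}$ is the heterogeneous message passing.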
[0030] In one possible implementation, obtaining the multi-scale comprehensive representation vector of the tested individual by contrasting the intra-omics aggregation representation with the inter-omics fusion representation specifically comprises: using the intra-omics aggregation representation as the anchor view and the inter-omics fusion representation as the augmented view.
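[0031] In one possible implementation, the contrastive learning module uses the InfoNCE loss function:
$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{\exp\left(\mathrm{sim}(z_i^{a}, z_i^{b})/\tau\right)}{\sum_{j=1}^{N} \exp\left(\mathrm{sim}(z_i^{a}, z_j^{b})/\tau\right)},$$
where $z_i^{a}$ and $z_i^{b}$ are the anchor-view and augmented-view representations of the $i$-th tested individual, $\mathrm{sim}(\cdot,\cdot)$ is the cosine similarity, the temperature parameter $\tau$ controls how sharply the model discriminates between negative samples, and $N$ is the number of samples.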
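As an illustration of the contrastive objective described above, the following is a minimal NumPy sketch of an InfoNCE loss with cosine similarity, in which the intra-omics representation is the anchor view and the inter-omics representation is the other view. The function name `info_nce`, the default temperature of 0.5, and the toy data are assumptions made for this example and are not taken from the application.

```python
import numpy as np

def info_nce(anchor, augmented, temperature=0.5):
    """InfoNCE loss between two views; row i of each array is the same individual."""
    # L2-normalise rows so that a dot product equals the cosine similarity.
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    b = augmented / np.linalg.norm(augmented, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature                      # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs lie on the diagonal (same individual in both views).
    return float(-np.mean(np.diag(log_prob)))

# Toy data (hypothetical): 8 individuals, 16-dimensional representations.
rng = np.random.default_rng(0)
z_intra = rng.normal(size=(8, 16))                        # anchor view
z_inter = z_intra + 0.01 * rng.normal(size=(8, 16))       # closely matching view
loss_matched = info_nce(z_intra, z_inter)
loss_mismatched = info_nce(z_intra, z_inter[::-1])        # pairs deliberately broken
```

In this toy setting the loss is lower when each individual's two views are paired correctly than when the pairing is broken, which is the behaviour the contrastive module relies on.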
[0032] In one possible implementation, the feature encoding layer further includes a population sparse encoding module, which is used to generate sparse representations of gene features for each tested individual as genomic features.
[0033] Based on the above technical solution, a homogeneous-heterogeneous dual-view aggregation framework is introduced. The homogeneous graph neural network is used to capture the similarity relationship between individuals within a single omics modality, while the heterogeneous graph Transformer model is used for message passing and feature interaction across multiple omics modalities. This dual-view design is more flexible than a single graph neural network and can simultaneously preserve the local aggregation structure within a modality and the global heterogeneous relationship between modalities. It significantly improves the accuracy and reliability of multi-omics neural networks in predicting brain diseases and has important clinical application value.
[0034] Embodiment 2
This embodiment proposes a multi-omics neural network model based on entropy coding and heterogeneous graph aggregation. Its core innovation is to solve the above problems through Shannon entropy-driven population sparse gene coding, a feature dimension alignment mechanism, and a homogeneous-heterogeneous dual-view graph neural network aggregation framework.
[0035] As shown in Figure 2, this scheme proposes a disease classification risk prediction method based on a multi-omics neural network model. The main modules of the model and their functions in disease classification risk prediction are as follows:
(1) Feature encoding layer, including a feature extraction module, a mutual information evaluation module, and a population sparse coding module, wherein:
The feature extraction module is used to process raw multi-omics data into a feature form that can be input into the network, and includes a preprocessing step.
Genetic data is a person's genetic/genomic data, which can be obtained through single nucleotide polymorphism (SNP) detection and PLINK processing.
[0036] Brain imaging data is human neuroanatomical / brain structural data, which can be obtained through magnetic resonance imaging (MRI) scans and FreeSurfer processing.
[0037] Phenotypic data are behavioral psychology / clinical assessment data of a person, which can be measured through psychological scales and clinical assessment tools.
[0038] The feature extraction module extracts gene features from the genetic data using a Shannon entropy representation; extracts seven cortical morphological indicators (volume, cortical thickness, curvature, etc.) from the brain imaging data to obtain brain imaging features; and computes statistics (mean, interquartile range, skewness, kurtosis, Shannon entropy, etc.) on the phenotypic and environmental data to obtain phenotypic and environmental features.
[0039] The mutual information evaluation module evaluates the mutual information of each feature based on Shannon entropy and retains features with high information gain. It is applied to all omics types after the feature extraction module, and serves as a preliminary screening before the features are input into the model.
[0040] For genetic data, the mutual information evaluation module evaluates, based on Shannon entropy, the mutual information of the multi-level window-encoded features, obtaining the information gain ranking and feature contribution score of each gene for disease risk prediction. The evaluation uses the Jensen-Shannon divergence $D_{JS}$ (a symmetrized form of relative entropy), which measures the difference between an individual's gene window distribution and the population reference distribution; functional genes whose information gain exceeds a threshold are screened by summing the weighted mutual information over multiple windows.
[0041] Symbol explanation: the total information gain score of a gene is obtained by weighted summation of the Jensen-Shannon divergences across its windows, and reflects the cross-population variability and information density of the gene.
[0042] The Jensen-Shannon divergence of each window measures the difference between the individual gene distribution and the group reference distribution, with a value range of [0,1].
[0043] The entropy difference of an individual's gene in a window is defined as the individual entropy minus the group reference entropy, quantifying the degree to which the individual deviates from the group.
[0044] The total number of windows of a gene depends on the length of its SNP sequence.
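The gene-level scoring described above can be sketched as follows. The example computes the Jensen-Shannon divergence with base-2 logarithms (so the value lies in [0,1]) and sums it over the windows of one gene. The function names, the uniform default weights, and the toy distributions are assumptions for illustration; the application's exact weighting is not reproduced here.

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p == 0 contribute 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()          # accept counts or probabilities
    m = 0.5 * (p + q)                        # mixture; positive wherever p or q is
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def gene_information_gain(individual_windows, reference_windows, weights=None):
    """Weighted sum of per-window JS divergences (uniform weights are an assumption)."""
    js = np.array([js_divergence(p, q)
                   for p, q in zip(individual_windows, reference_windows)])
    if weights is None:
        weights = np.ones_like(js)
    return float(np.sum(weights * js))

# Toy example (hypothetical): one gene with two windows over three variant states.
individual = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
reference = [[0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
score = gene_information_gain(individual, reference)
```

Here the first window is maximally different from the reference and the second is identical to it, so the two windows contribute 1 and 0 respectively.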
[0045] For brain imaging data, the mutual information evaluation module evaluates, based on Shannon entropy, the mutual information of the cortical morphological region features, obtaining the causal criticality index and cross-modal influence ranking of each brain region in disease risk prediction. The index has three terms: the first two quantify the conditional dependence of the brain region on heterogeneous features, and the third is an entropy-weighted sum of the seven morphological features. Brain regions whose comprehensive score exceeds the median are identified as key causal driving regions.
[0046] Symbol explanation: the causal criticality index of a brain region takes values between 0 and 1 and comprehensively evaluates its causal driving ability by combining conditional mutual information (the first two terms) with the weighted sum of the entropies of the seven morphological features (the third term).
[0047] The conditional mutual information between the brain region features and the heterogeneous representation, given the confounding factors, quantifies the independent contribution of the brain region to the causal representation z'. In the notation, the feature symbol denotes radiomics features, img denotes imaging omics, and v denotes the brain region index.
[0048] The Shannon entropy of a morphological feature of a brain region (one of seven, including volume and thickness) reflects the diversity of that feature within the population. The index f selects the f-th feature dimension of the v-th brain region, which can be one of the classic sMRI features such as gray matter volume or cortical thickness.
[0049] The weighting coefficients of the features are usually taken to be uniform.
[0050] For phenotypic data, the mutual information evaluation module evaluates, based on Shannon entropy, the mutual information of the statistical feature set, obtaining the causal information flow strength of each phenotypic variable as affected by the imaging modality. In this evaluation, the transfer entropy measures the predictive power of lagged phenotypic states for the causal representation; the KL divergence captures pseudo-label consistency; and disease phenotypic biomarkers with high information gain are identified through multi-dimensional comprehensive scoring.
[0051] Symbol explanation: the causal information flow strength of a phenotypic variable consists of three components: the transfer entropy term quantifies causal influence, the KL divergence term captures pseudo-label consistency, and the confounding factor term corrects for bias.
[0052] The transfer entropy measures the ability of the historical state of a phenotypic feature to predict the current causal representation, expressed in bits.
[0053] The Kullback-Leibler divergence compares the pseudo-label distribution with the reference distribution; the smaller its value, the more stable the weak supervision signal.
[0054] The statistical descriptors of a phenotype include, for example, the mean, skewness, and kurtosis.
[0055] The population sparse coding module is specifically designed for genomics data to generate sparse representations of genetic data from different individuals while preserving biological meaning.
[0056] Specifically:
Step 1: Data preprocessing. SNP data are first quality-controlled and preprocessed using the PLINK tool to form individual-level genomic inputs.
[0057] Step 2: Windowing. The SNP sequence of each gene $g$ is segmented in order into windows of length $L$, generating multiple windows. For the $w$-th window of gene $g$, the Shannon entropy is calculated as
$$H_{g,w} = -\sum_{a} p_{g,w}(a)\,\log_2 p_{g,w}(a),$$
where $p_{g,w}(a)$ is the population frequency of variant $a$ within this window across all subjects. This step captures diversity at the population level.
[0058] Step 3: Population sparsification. The entropy difference quantifies how far an individual deviates from the population reference pattern:
$$\Delta H_{i,g,w} = H_{i,g,w} - H^{\mathrm{ref}}_{g,w},$$
where $H_{i,g,w}$ is the entropy of individual $i$ and $H^{\mathrm{ref}}_{g,w}$ is the reference entropy of the corresponding population window (usually a healthy control group or an overall population baseline). The key property is that only information deviating from the population is preserved, which yields a sparse representation.
[0059] Step 4: Feature vector construction. The gene-individual feature vector is obtained by concatenating the entropy differences of all $W_g$ windows:
$$x_{i,g} = \left[\Delta H_{i,g,1},\ \Delta H_{i,g,2},\ \ldots,\ \Delta H_{i,g,W_g}\right].$$
This scheme systematically applies Shannon entropy from information theory to population-level sparse coding of gene data for neural networks. Specifically, a sliding window is used, and the Shannon entropy of the base combinations within each window is calculated from the frequency of occurrence of each base. Compared with traditional one-hot or tensor encoding, this keeps the gene features compact, which benefits neural network computation while preserving as much biologically relevant information as possible. Then, within the same window, the most frequent base in the population is used as the baseline, and the values of the other base combinations are computed relative to it, completing the sparsification. As a result, the dimensionality-reduced gene features are computationally efficient while retaining sufficient discriminative power.
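Steps 2 to 4 above can be sketched in a few lines of NumPy. The sketch assumes that genotypes are coded as integers 0/1/2 per SNP, that windows are non-overlapping, and that the reference entropy of a window is computed by pooling all individuals; these choices, like the function names, are assumptions made for illustration rather than details given in the application.

```python
import numpy as np

def shannon_entropy(codes, n_states=3):
    """Shannon entropy (bits) of the empirical distribution of integer codes."""
    counts = np.bincount(np.ravel(codes), minlength=n_states).astype(float)
    p = counts / counts.sum()
    p = p[p > 0]                               # 0 * log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

def entropy_difference_features(genotypes, window):
    """genotypes: (n_individuals, n_snps) integer array with codes 0/1/2.

    Returns an (n_individuals, n_windows) array whose entries are the
    individual entropy minus the pooled reference entropy of each window.
    """
    n_ind, n_snps = genotypes.shape
    starts = range(0, n_snps - window + 1, window)   # non-overlapping windows
    features = np.zeros((n_ind, len(starts)))
    for w, s in enumerate(starts):
        block = genotypes[:, s:s + window]
        h_ref = shannon_entropy(block)               # pooled over all individuals
        for i in range(n_ind):
            features[i, w] = shannon_entropy(block[i]) - h_ref
    return features

# Toy genotype matrix (hypothetical): 3 individuals, 6 SNPs, window length 3.
genotypes = np.array([[0, 0, 0, 0, 1, 2],
                      [0, 0, 0, 0, 0, 0],
                      [0, 1, 2, 0, 0, 0]])
features = entropy_difference_features(genotypes, window=3)
```

Each row of `features` is the concatenated entropy-difference vector of one individual; individuals with identical genotypes in a window receive identical values for that window.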
[0060] (2) Feature dimension alignment layer, including a dimension detection module, a linear projection module, and an adaptive normalization module, wherein:
The dimension detection module is used to identify dimensional differences between different omics data (genotypes, brain imaging, behavioral features, etc.). Its input is the vector-form data of different dimensions produced by the previous layer, including genotype, brain image, and behavioral features. Because the vectors differ in length (dimension), the module automatically identifies the feature vector with the largest dimension and zero-pads the remaining features to that dimension, bridging the dimensional differences.
[0061] The linear projection module is used to project all data into a unified feature space through learnable linear transformations. It implements the mapping aX + b, where a and b are the linear projection parameters. Because X differs across omics types (in particular in feature dimension), the code must determine which linear layer to apply to each omics type; this is implemented with an if statement.
[0062] The adaptive normalization module is used to handle the distribution differences of data from different modalities and ensure fair fusion.
[0063] As with the linear projection, the adaptive normalization module normalizes each omics type separately. In the code this is a simple norm layer, but different omics types do not share the same norm layer, hence the name adaptive normalization.
[0064] By proposing a systematic feature dimension alignment layer, this application projects diverse omics data with extreme dimensionality differences, such as gene sequences, brain images, and behavioral features, into a unified feature space through adaptive linear projection and distribution normalization. This addresses the modality differences and unfair competition between modalities found in traditional multimodal fusion, allowing gene data of very different dimensionality to be aggregated with other modalities at equal weight, and improving the capture of low-frequency but important biological features.
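The three alignment steps described above (zero-padding to the longest dimension, a per-omics linear projection aX + b, and a per-omics norm layer) can be sketched as follows. The class `OmicsAligner`, the dictionary lookup that replaces the if statement, the layer-norm style normalization, and the random stand-ins for learnable parameters are all assumptions for this example.

```python
import numpy as np

def zero_pad(features, target_dim):
    """Right-pad each row of an (n, d) array with zeros up to target_dim columns."""
    return np.pad(features, ((0, 0), (0, target_dim - features.shape[1])))

class OmicsAligner:
    """One linear projection (aX + b) and one norm layer per omics type."""

    def __init__(self, omics_names, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.a, self.b, self.gamma, self.beta = {}, {}, {}, {}
        for name in omics_names:
            # Random values stand in for parameters that would be learned.
            self.a[name] = rng.normal(scale=in_dim ** -0.5, size=(in_dim, out_dim))
            self.b[name] = np.zeros(out_dim)
            self.gamma[name] = np.ones(out_dim)    # norm-layer scale, not shared
            self.beta[name] = np.zeros(out_dim)    # norm-layer shift, not shared

    def __call__(self, name, x, eps=1e-5):
        h = x @ self.a[name] + self.b[name]        # omics-specific projection
        mean = h.mean(axis=1, keepdims=True)
        std = h.std(axis=1, keepdims=True)
        return self.gamma[name] * (h - mean) / (std + eps) + self.beta[name]

# Toy inputs (hypothetical): 4 individuals with very different feature dimensions.
rng = np.random.default_rng(1)
raw = {"gene": rng.normal(size=(4, 50)),
       "image": rng.normal(size=(4, 12)),
       "phenotype": rng.normal(size=(4, 5))}
longest = max(x.shape[1] for x in raw.values())    # dimension detection
padded = {k: zero_pad(x, longest) for k, x in raw.items()}
aligner = OmicsAligner(raw.keys(), in_dim=longest, out_dim=8)
aligned = {k: aligner(k, x) for k, x in padded.items()}
```

After alignment every omics type has the same shape and a comparable scale, so no modality dominates the later fusion purely because of its dimensionality.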
[0065] (3) Dual-view aggregation layer, including a homogeneous view aggregation module, a heterogeneous view aggregation module, and a fusion aggregation module, wherein:
The homogeneous view aggregation module employs a graph convolutional network (GCN) for local aggregation within each single omics type. To do so, a pre-defined meta-path is first set, such as gene-environment-gene or image-gene-image. Using such a meta-path, a heterogeneous graph can be transformed into a homogeneous graph.
[0066] The specific implementation is as follows. Given a meta-path
$$P = T_1 \xrightarrow{R_1} T_2 \xrightarrow{R_2} \cdots \xrightarrow{R_l} T_{l+1},$$
the adjacency matrix $A_P$ of the resulting homogeneous graph is
$$A_P = A_{T_1 T_2}\, A_{T_2 T_3} \cdots A_{T_l T_{l+1}},$$
where $A_{T_i T_{i+1}}$ is the heterogeneous adjacency matrix (or transition matrix) between node types $T_i$ and $T_{i+1}$.
[0067] $A_P$ is a square matrix with dimensions $|T_1| \times |T_1|$ and represents the homogeneous associations within node type $T_1$.
[0068] Here, $l$ is the number of relations from the heterogeneous graph that the meta-path traverses, and $R$ denotes a relation. It should be noted that R² here is different from the confounding factor.
[0069] Operating steps: First, determine the target node type: select the node type to retain (such as "gene").
[0070] Second, define semantic meta-paths: select paths that can represent specific associations (such as "gene-image-gene").
[0071] Third, extract path instances: search the original graph for all instances that match the path structure.
[0072] Fourth, if there is at least one path between node i and node j that conforms to the meta-path, an edge between i and j is created in the new homogeneous graph.
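The four operating steps above can be sketched with adjacency matrices: multiplying the relation matrices along the meta-path counts path instances, and any positive count becomes an edge. The function name, the choice to drop self-loops, and the toy gene-image matrix are assumptions for this example.

```python
import numpy as np

def metapath_adjacency(relation_matrices, keep_self_loops=False):
    """Homogeneous adjacency induced by a meta-path.

    relation_matrices: list of 0/1 matrices, one per relation on the path,
    ordered from the start node type to the end node type.
    """
    counts = relation_matrices[0]
    for matrix in relation_matrices[1:]:
        counts = counts @ matrix          # entry (i, j) counts path instances
    adjacency = (counts > 0).astype(int)  # at least one instance creates an edge
    if not keep_self_loops:
        np.fill_diagonal(adjacency, 0)
    return adjacency

# Toy heterogeneous graph (hypothetical): 3 genes, 2 imaging regions.
# gene_image[g, r] == 1 means gene g is linked to region r.
gene_image = np.array([[1, 0],
                       [1, 0],
                       [0, 1]])
# Meta-path "gene-image-gene": gene -> image, then image -> gene.
gene_gene = metapath_adjacency([gene_image, gene_image.T])
```

In this toy graph genes 0 and 1 share imaging region 0, so they become neighbours in the gene-only graph, while gene 2 stays isolated.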
[0073] As shown in Figure 3, each omics type has its own homogeneous graph, consisting of nodes and edge weights. These graphs are input into a GCN for aggregation to obtain updated features for the nodes of every omics type, and the features of all omics types are then combined to obtain the aggregation result of the homogeneous view. Before input, each graph consists of a subject's environment, gene, image, and phenotype features (not all four may be present, depending on the dataset). After aggregation the number of nodes is unchanged; only the node features are updated. To construct the homogeneous view from a heterogeneous graph with different node types so that a GCN can be applied, the intermediate nodes of other types lying between two nodes of the same type are ignored. For example, in a gene-brain region-gene connection, the brain region node is ignored, yielding a gene-gene connection. Applying the same operation to all such triples produces, for each omics type, a homogeneous graph containing only nodes of that type, which is then input into the GCN.
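A single aggregation step on one such homogeneous graph can be sketched as below. The example uses the widely used symmetric-normalized GCN propagation rule with added self-loops and a ReLU; whether the application uses exactly this variant is not stated, so the rule, the function name, and the toy inputs are assumptions.

```python
import numpy as np

def gcn_layer(adjacency, features, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])     # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))      # degrees are >= 1
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)

# Toy gene-only graph (hypothetical): genes 0 and 1 are connected, gene 2 is isolated.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 0],
                      [0, 0, 0]], dtype=float)
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])
weight = np.eye(2)                                     # stand-in for a learned matrix
updated = gcn_layer(adjacency, features, weight)
```

The number of nodes is unchanged by the layer; connected genes are pulled towards a shared representation, while the isolated gene keeps its own features.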
[0074] The heterogeneous view aggregation module employs a heterogeneous graph Transformer (HGT) to pass messages across different omics types and capture heterogeneous relationships. As shown in Figures 4 and 5, the principle of HGT is to assign a dedicated attention computation layer to each connection type (such as gene-image or image-phenotype) without pre-setting meta-paths. The heterogeneous view aggregation module constructed in this way can therefore aggregate features across nodes of different omics types. Taking the gene-image connection as an example, a brain region node may be connected to other brain regions, genes, and phenotypes. After input into HGT, an attention score is computed for each connection (connections that are unreasonable according to domain knowledge are removed; for example, brain images do not affect genes). Weighted aggregation is then performed to obtain heterogeneous view aggregation features with the same scale as those of the homogeneous view aggregation.
[0075] The Heterogeneous Graph Transformer (HGT) achieves dynamic modeling of complex semantics in heterogeneous graphs by parameterizing node type and edge type. Its core logic decomposes the traditional self-attention mechanism into an interaction based on meta-relations: the attention score between the target node t and the source node s is determined not only by their feature vectors but also by the edge type connecting them. Spatial projection correction is used to capture the diverse interaction patterns between different entities within a unified architecture. In this embodiment, the target node and source node represent data from different omics, such as genes and images, while the edge type represents their connection relationships.
[0076] Core mechanism: heterogeneous attention, as shown in Figure 5. In the attention formula, j is a node, N(t) represents all neighboring nodes of node t, τ(t) represents the type of the target node t, τ(s) represents the type of the source node s, and d represents the dimension of the key vector. The features of the target node and of the source node are each transformed by a projection matrix specific to the corresponding node type; a key matrix captures the semantics of the edge type; and a further matrix is specific to the edge type.
[0077] Heterogeneous message passing: each message is transformed by a matrix specific to the edge type, which ensures that information from different relationships remains semantically decoupled and distinguishable.
[0078] Target-specific aggregation: the messages are aggregated over N(t), where N(t) represents all neighboring nodes of node t.
[0079] Finally, the weighted heterogeneous messages are aggregated, and the target node representation is updated through residual connections and layer normalization.
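The attention, message passing, and aggregation steps above can be illustrated with a single-head NumPy sketch for one target node. It follows the commonly published HGT formulation and is an assumption about the details: the parameter names (`W_Q`, `W_K`, `W_V`, `W_ATT`, `W_MSG`), the value projection `W_V`, the edge-type labels, and the random toy features are all hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def layer_norm(x, eps=1e-5):
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def hgt_update(h_t, t_type, neighbors, params):
    """Update one target node from its typed neighbors.

    neighbors: list of (h_s, s_type, edge_type) tuples, one per source node.
    params: dict of per-node-type and per-edge-type square matrices.
    """
    d = h_t.shape[0]
    q = params["W_Q"][t_type] @ h_t                   # target-type projection
    scores, messages = [], []
    for h_s, s_type, e_type in neighbors:
        k = params["W_K"][s_type] @ h_s               # source-type projection
        # The edge-type matrix sits between key and query, scaled by sqrt(d).
        scores.append(k @ params["W_ATT"][e_type] @ q / np.sqrt(d))
        # The message is transformed by an edge-type-specific matrix.
        messages.append(params["W_MSG"][e_type] @ (params["W_V"][s_type] @ h_s))
    alpha = softmax(np.array(scores))                 # normalize over N(t)
    aggregated = alpha @ np.array(messages)           # attention-weighted sum
    return layer_norm(h_t + aggregated), alpha        # residual + layer norm

rng = np.random.default_rng(0)
d = 4
node_types = ["gene", "image", "phenotype"]
edge_types = ["gene->image", "phenotype->image"]

def random_matrix():
    return rng.normal(scale=0.5, size=(d, d))

params = {
    "W_Q": {t: random_matrix() for t in node_types},
    "W_K": {t: random_matrix() for t in node_types},
    "W_V": {t: random_matrix() for t in node_types},
    "W_ATT": {e: random_matrix() for e in edge_types},
    "W_MSG": {e: random_matrix() for e in edge_types},
}

image_node = rng.normal(size=d)
neighbors = [
    (rng.normal(size=d), "gene", "gene->image"),
    (rng.normal(size=d), "phenotype", "phenotype->image"),
]
h_new, alpha = hgt_update(image_node, "image", neighbors, params)
print(alpha, h_new.shape)
```

Removing a connection judged unreasonable by domain knowledge corresponds to leaving it out of `neighbors`, so it never receives an attention score.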
[0080] (4) The representation learning layer, which includes the contrastive learning module and the classification prediction module. 1. The contrastive learning module uses the InfoNCE loss function to learn robust individual representations by comparing different views of the same individual, namely the isomorphic view and the heterogeneous view; the isomorphic view plays the role of the original view in traditional contrastive learning, and the heterogeneous view plays the role of the augmented view. The module maps views of different natures into a unified representation space using a dual-channel encoder. Isomorphic path (anchor): takes raw features or single-modal data as input and generates a baseline vector.
[0081] Heterogeneous path (enhancement): takes cross-modal features processed by the heterogeneous graph model (such as HGT) as input and generates an enhancement vector.
[0082] Projection head: both paths pass through a nonlinear mapping that eliminates dimensional differences, making the features comparable on the same hypersphere.
[0083] 2. The InfoNCE loss function: its goal is to maximize the similarity of the isomorphic-heterogeneous pair belonging to the same individual while minimizing the similarity to other samples in the batch. Variable description: the similarity measure is the cosine similarity.
[0084] The temperature parameter controls the model's ability to distinguish negative samples.
[0085] Denominator: contains 1 positive sample pair and N-1 negative sample pairs ($j\neq i$).
[0086] Here, N is the count value, i.e., the number of individuals in the batch.
[0087] 3. Core logic. Positive samples: the isomorphic view and the heterogeneous view of the same individual i.
[0088] Negative samples: the isomorphic view of individual i paired with the heterogeneous views of the other individuals j in the batch.
[0089] Physical meaning: this constraint forces the model to extract essential, robust features that persist across modal differences.
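The contrastive objective described above can be sketched as follows. This assumes the standard InfoNCE form, with the isomorphic view as anchor, the heterogeneous view of the same individual as the positive, and the heterogeneous views of the other individuals in the batch as negatives. The function name `info_nce`, the temperature of 0.5, and averaging over the batch are assumptions for illustration.

```python
import numpy as np

def info_nce(z_homo, z_hetero, temperature=0.5):
    """InfoNCE over a batch of N individuals.

    z_homo:   (N, d) anchor vectors from the isomorphic path.
    z_hetero: (N, d) enhancement vectors from the heterogeneous path.
    Per individual i: -log( exp(sim(i, i)/tau) / sum_j exp(sim(i, j)/tau) ),
    where sim is cosine similarity and j runs over the whole batch.
    """
    a = z_homo / np.linalg.norm(z_homo, axis=1, keepdims=True)
    b = z_hetero / np.linalg.norm(z_hetero, axis=1, keepdims=True)
    logits = a @ b.T / temperature                    # (N, N) cosine / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                # positives sit on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_aligned = info_nce(z, z)                         # views agree per individual
loss_shuffled = info_nce(z, np.roll(z, 1, axis=0))    # views paired with the wrong individual
print(loss_aligned, loss_shuffled)
```

When the two views of each individual agree, the loss is lower than when the heterogeneous views are shifted to the wrong individuals, which is the behavior the constraint is meant to reward.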
[0090] The classification prediction module is used to classify diseases or predict risks based on the learned representations.
[0091] In the overall model architecture, the Linear Support Vector Classifier (LinearSVC) serves as the classification prediction module for downstream tasks, responsible for mapping the high-dimensional representation output by the front-end network (such as the encoder after contrastive learning) to a specific class space.
[0092] The logical structure of this module can be divided into the following three stages. Input stage (feature input): receives the robustness-enhanced individual representation vector output by the contrastive learning module. At this point, the vector is a low-dimensional dense vector with a fixed dimension.
[0093] Linear Transformation Stage: LinearSVC searches for an optimal hyperplane in the feature space that maximizes the margin between sample points of different classes and the hyperplane.
[0094] Output stage (classification output): the class to which a sample belongs is determined by a sign function.
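The three stages can be illustrated for the binary case with a hand-picked hyperplane. This sketch only shows the decision rule; it does not train a LinearSVC, and the weight vector, bias, and sample vectors are hypothetical values chosen for the example.

```python
import numpy as np

def linear_decision(z, w, b):
    """Classify representation vectors by the sign of w.z + b.

    Returns labels in {+1, -1} and the raw scores. A score of exactly 0
    is mapped to +1 here, which is an arbitrary tie-breaking assumption.
    """
    scores = z @ w + b
    return np.where(scores >= 0, 1, -1), scores

# Input stage: fixed-dimension dense vectors (hypothetical values).
z = np.array([[2.0, 0.5],
              [0.2, 1.5],
              [1.0, 1.0]])
# Linear transformation stage: a hyperplane assumed to be already learned.
w = np.array([1.0, -1.0])
b = 0.0
# Output stage: the sign function picks the class.
labels, scores = linear_decision(z, w, b)
print(labels)
```

In practice the hyperplane would be fitted by a max-margin solver on the learned representations rather than set by hand.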
[0095] By introducing a homogeneous-heterogeneous dual-view aggregation framework, a homogeneous GCN is used to capture individual similarity relationships within a single omics modality, while a heterogeneous HGT is used for message passing and feature interaction across multiple omics modalities. This dual-view design is more flexible than a single graph neural network (whether GCN or HGT), and can simultaneously preserve the local aggregation structure within a modality and the global heterogeneous relationships between modalities.
[0096] In summary, this embodiment significantly improves the accuracy and reliability of multi-omics neural networks in predicting brain diseases through an innovative combination of information theory, cross-modal alignment, and graph neural networks, and has important clinical application value.
[0097] This embodiment also proposes a disease classification risk prediction method based on the above-mentioned multi-omics neural network model, including the following steps. Step 1: Perform feature extraction and sparse coding on the input multi-omics data of multiple individuals through the feature encoding layer to obtain a compressed coding vector. In step 1, a population sparsity constraint based on Shannon entropy is adopted to reduce the encoding dimension while retaining as much information as possible.
[0098] Step 2: The gene vectors output in Step 1, as well as other omics feature vectors such as brain images and clinical phenotypes, are dimensionally mapped and normalized through the feature dimension alignment layer, so that all omics data are aligned to a unified dimension space, and an aligned multimodal feature representation is obtained. In step 2, an adaptive scaling factor is used to handle the distribution differences of different modalities, ensuring fair contribution of each modality during aggregation.
[0099] Step 3: Construct a heterogeneous graph structure based on the aligned multimodal features, where nodes are individuals and different types of edges represent similarity relationships under different omics modalities (e.g., gene similarity, phenotypic similarity, metabolic similarity), and dynamically set the edge dropout probability according to the test set samples. In step 3, the heterogeneous graph structure accurately captures the relationships between multiple omics data types and is closer to biological reality than a traditional single isomorphic graph.
[0100] Step 4: Perform message passing on the graph structure within each modality through the isomorphic view aggregation module (GCN) to obtain the intra-modal aggregated representation. In step 4, the GCN is run independently for each omics modality to learn the local neighborhood features of individuals within that modality.
[0101] Step 5: Perform cross-modal message passing on the heterogeneous graph through the heterogeneous view aggregation module (HGT), using the attention mechanism of the heterogeneous graph Transformer to capture the influence relationships between different omics modalities and obtain the inter-modal fusion representation. In step 5, HGT learns the heterogeneous contribution of different omics modalities to the target node through meta-relations and a multi-head attention mechanism, which can significantly improve accuracy compared with homogeneous aggregation.
[0102] Step 6: The outputs of GCN and HGT are weighted and fused through the fusion and aggregation module. The results of intra-modal aggregation and inter-modal aggregation are combined to finally obtain the multi-scale comprehensive representation vector of the individual. In step 6, an adaptive weighting mechanism is used to learn the optimal GCN-HGT fusion ratio during the training phase.
[0103] Step 7: Input the comprehensive representation obtained in Step 6 into the classification prediction module, and obtain the disease classification probability or risk prediction result through a fully connected layer and Softmax activation to complete the test.
[0104] In step 7, interpretable probability outputs are used to support multi-class or multi-label prediction.
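Steps 6 and 7 can be sketched as a gated fusion followed by a Softmax output layer. The description does not give the form of the adaptive weighting, so the scalar sigmoid gate below is an assumption; `fuse`, `predict_proba`, and all numeric values are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse(h_gcn, h_hgt, gate_logit):
    """Blend intra-modal and inter-modal representations.

    gate_logit is a scalar that would be learned during training; the
    sigmoid keeps the GCN share in (0, 1) and the HGT share at 1 minus that.
    """
    g = sigmoid(gate_logit)
    return g * h_gcn + (1.0 - g) * h_hgt

def predict_proba(z, weight, bias):
    """Fully connected layer followed by a row-wise Softmax."""
    logits = z @ weight + bias
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

h_gcn = np.array([[1.0, 3.0]])        # hypothetical GCN output for one individual
h_hgt = np.array([[3.0, 1.0]])        # hypothetical HGT output for the same individual
z = fuse(h_gcn, h_hgt, gate_logit=0.0)  # a zero logit gives an equal blend

rng = np.random.default_rng(0)
weight = rng.normal(size=(2, 3))      # 3 hypothetical disease classes
bias = np.zeros(3)
probs = predict_proba(z, weight, bias)
print(z, probs)
```

Each row of `probs` sums to 1, so it can be read as class probabilities or thresholded for risk prediction.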
[0105] This embodiment also proposes a training method for the above-mentioned multi-omics neural network model, including the following steps: Step 1: Randomly initialize the parameters of each layer of the neural network, set hyperparameters (learning rate, batch size, dropout ratio, etc.), and prepare labeled training and validation sets; In step 1, the Xavier initialization strategy is used to ensure convergence stability.
[0106] Step 2: For each batch in the training set, repeat the following training iterations: (2.1) process the input multi-omics data through the feature encoding layer, and obtain the aligned multimodal feature representation through the feature dimension alignment layer; (2.2) construct the heterogeneous graph structure and randomly discard a proportion of edges according to the preset dropout ratio to obtain two different graph structure views (View A and View B), simulating data augmentation; (2.3) input the two views into the homogeneous view aggregation module (GCN) and the heterogeneous view aggregation module (HGT), respectively, to obtain two different representations; (2.4) fuse the two representations through the fusion and aggregation module to obtain the final individual representation vector z; (2.5) calculate three loss terms: the classification cross-entropy loss (for supervised learning), the InfoNCE contrastive loss (for self-supervised reinforcement), and the L2 regularization term. In step 2, the three loss terms are combined in a weighted manner: $L_{total} = \alpha L_{CE} + \beta L_{contrastive} + \gamma L_{reg}$, where α, β, and γ are adjustable weights.
[0107] Classification cross-entropy loss: $L_{CE} = -\sum_{i=1}^{C} y_i \log \hat{y}_i$, where C is the total number of categories and $y_i$ is the true label after one-hot encoding: $y_i = 1$ if the sample belongs to class i, and 0 otherwise. $\hat{y}_i$ is the predicted probability after the model output layer passes through the Softmax function, $\hat{y}_i = e^{z_i} / \sum_{j=1}^{C} e^{z_j}$, where $z_i$ and $z_j$ are components of the feature z output by the model.
[0108] InfoNCE contrastive loss: $L_{contrastive}$, the InfoNCE loss of the contrastive learning module described above.
[0109] L2 regularization term: defined as the sum of squares of all learnable parameters in the model, $L_{reg} = \sum_{k=1}^{N} \theta_k^2$, where $\theta$ is the set of model parameters and N is the total number of parameters.
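The weighted combination of the three loss terms can be sketched as below. The helper names and the default weights (1.0, 0.5, 1e-4) are assumptions for illustration; the description only states that α, β, and γ are adjustable. The contrastive term is passed in as a number, since it is computed by the contrastive learning module.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean cross-entropy given Softmax probabilities and integer class labels."""
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return -np.mean(np.log(picked + 1e-12))          # small constant avoids log(0)

def l2_penalty(params):
    """Sum of squares over all learnable parameter arrays."""
    return sum(float(np.sum(p ** 2)) for p in params)

def total_loss(l_ce, l_contrastive, l_reg, alpha=1.0, beta=0.5, gamma=1e-4):
    """Weighted sum of classification, contrastive, and regularization terms."""
    return alpha * l_ce + beta * l_contrastive + gamma * l_reg

# Hypothetical batch of 2 samples and 2 classes with uninformative predictions.
probs = np.array([[0.5, 0.5],
                  [0.5, 0.5]])
labels = np.array([0, 1])
l_ce = cross_entropy(probs, labels)                  # close to log(2)

# Hypothetical parameter arrays: 1 + 4 + 9 = 14.
l_reg = l2_penalty([np.array([1.0, 2.0]), np.array([[3.0]])])

loss = total_loss(l_ce, l_contrastive=2.0, l_reg=l_reg)
print(l_ce, l_reg, loss)
```

Raising β puts more emphasis on agreement between the two views, while γ is usually kept small so that regularization does not dominate the supervised signal.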
[0110] Step 3: Calculate the gradient of each layer's parameters using backpropagation, and update the network parameters using the Adam optimizer; In step 3, gradient clipping is used to prevent gradient explosion.
[0111] Step 4: Evaluate the model performance (accuracy, AUC, F1 score, etc.) on the validation set. If the performance no longer improves (the validation loss does not decrease for N consecutive epochs), trigger the early stopping mechanism. In step 4, N is typically set to 10-20 epochs.
[0112] Step 5: When the verification performance meets the preset threshold or the maximum number of iterations is reached, save the optimal network parameters and training is complete; otherwise, return to step 2 to continue training on new batch data.
[0113] In step 5, the saved optimal model is used in the subsequent testing phase.
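The early stopping rule in step 4 and the best-model bookkeeping in step 5 can be sketched as a small helper. The class name `EarlyStopping`, the patience of 3, and the validation losses are hypothetical; the description suggests a patience of 10 to 20 epochs in practice.

```python
class EarlyStopping:
    """Stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch      # this is where parameters would be saved
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Hypothetical validation losses, one per epoch.
val_losses = [0.9, 0.7, 0.6, 0.65, 0.61, 0.62, 0.60]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_epoch, stopper.best_loss)
```

Training halts after three epochs without improvement, and the parameters recorded at the best epoch are the ones that would be carried into the testing phase.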
[0114] This application also provides an electronic device, including: a processor, and a memory coupled to the processor, the memory being used to store a computer program; the processor being used to execute the computer program stored in the memory, so that the electronic device performs the method as described in any of the above embodiments.
[0115] Electronic devices can be computing devices such as desktop computers, laptops, handheld computers, and cloud servers. These electronic devices may include, but are not limited to, processors and memory.
[0116] The processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor can be a microprocessor or any conventional processor. The processor is the control center of the electronic device, connecting various parts of the device via various interfaces and lines.
[0117] The memory can be used to store the computer program, and the processor implements various functions of the electronic device by running or executing the computer program stored in the memory and calling the data stored in the memory.
[0118] The memory may primarily include a program storage area and a data storage area. The program storage area may store the operating system, applications required for at least one function, etc.; the data storage area may store data created based on the use of the electronic device, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as hard disk, memory, plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, at least one disk storage device, flash memory device, or other volatile solid-state storage device.
[0119] This application also provides a storage medium, which is a computer-readable storage medium. The computer program is stored in the computer-readable storage medium, and when executed by a processor, the computer program can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable file, or some intermediate form. The computer-readable medium can include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a portable hard drive, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, etc.
[0120] This application also provides a computer program product, including: a computer program or instructions that, when the computer program or instructions are run on a computer, cause the computer to perform any of the above possible implementation methods.
[0121] The above description is the preferred embodiment of this application. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of this application, and these improvements and modifications are also considered to be within the scope of protection of this application.
Claims
1. A disease classification risk prediction device, characterized in that it comprises: a feature encoding layer, used to obtain and encode the multi-omics data of a tested individual to obtain multi-omics features; a feature preprocessing layer, used to preprocess the multi-omics features; a dual-view aggregation layer, comprising a homogeneous view aggregation module and a heterogeneous view aggregation module, wherein the homogeneous view aggregation module is used to perform local aggregation within a single omics type on a homogeneous graph constructed based on the omics features, using a graph neural network, to obtain an intra-omics aggregation representation, and the heterogeneous view aggregation module is used to capture the influence relationships between different omics types on a heterogeneous graph constructed based on the omics features, using the attention mechanism of a heterogeneous graph Transformer model, to obtain an inter-omics fusion representation; and a representation learning layer, comprising a contrastive learning module and a classification prediction module, wherein the contrastive learning module is used to obtain a multi-scale comprehensive representation vector of the same tested individual by contrastive learning between the intra-omics aggregation representation and the inter-omics fusion representation, and the classification prediction module is used to process the multi-scale comprehensive representation vector as input to obtain the prediction result.
2. The disease classification risk prediction device as described in claim 1, characterized in that, The feature encoding layer includes a feature extraction module and a mutual information evaluation module. The feature extraction module is used to encode omics data and generate omics features. The mutual information evaluation module is used to filter omics features based on the mutual information calculation results of the omics features for the preprocessing.
3. The disease classification risk prediction device as described in claim 1, characterized in that, The isomorphic view aggregation module is further used to: add edges to the isomorphic graph based on a preset meta-path with the same start and end nodes, whereby the added edges represent the relationship between the start and end nodes of the meta-path.
4. The disease classification risk prediction device as described in claim 1, characterized in that, The heterogeneous graph Transformer model is used to: by parameterizing the node type and edge type, the attention score between the target node t and the source node s is determined not only by the feature vectors of the target node t and the source node s, but also by the spatial projection correction of the edge type connecting the target node t and the source node s.
5. The disease classification risk prediction device as described in claim 4, characterized in that the inter-omics fusion representation is obtained by heterogeneous attention and heterogeneous message passing, where j is a node, N(t) represents all neighboring nodes of node t, τ(t) represents the type of the target node t, τ(s) represents the type of the source node s, and d represents the dimension of the key vector; the heterogeneous attention uses projection matrices specific to the node types of t and s and a key matrix for capturing the semantics of edge types; and the heterogeneous message passing uses a matrix specific to the edge type.
6. The disease classification risk prediction device as described in claim 1, characterized in that obtaining the multi-scale comprehensive representation vector of the tested individual by contrastive learning between the intra-omics aggregation representation and the inter-omics fusion representation specifically comprises: using the intra-omics aggregation representation as the anchor view and the inter-omics fusion representation as the enhancement view, and comparing the two to obtain the multi-scale comprehensive representation vector of the tested individual.
7. The disease classification risk prediction device as described in claim 1, characterized in that the contrastive learning module uses the InfoNCE loss function, in which the similarity measure is the cosine similarity, the temperature parameter controls the model's ability to distinguish negative samples, the compared vectors are the inter-omics fusion representation and the intra-omics aggregation representation, and N is the count value.
8. The disease classification risk prediction device as described in claim 2, characterized in that, Multi-omics data includes at least two of the following: genetic data, brain imaging data, phenotypic data, and environmental data; Preprocessing includes: dimension detection, linear projection, and adaptive normalization.
9. The disease classification risk prediction device as described in claim 8, characterized in that, The feature coding layer also includes a population sparse coding module, which is used to generate sparse representations of gene features for each tested individual as genomic features.
10. An electronic device, characterized in that, The electronic device includes: a processor, and a memory coupled to the processor. The memory is used to store computer programs; The processor is configured to execute the computer program stored in the memory, causing the electronic device to perform the following process: The multi-omics data of the tested individuals are obtained and encoded to obtain multi-omics features; Preprocessing of multi-omics features; For isomorphic graphs of a single omics type constructed based on omics features, graph neural networks are used to perform local aggregation within a single omics type to obtain intra-omics aggregation representations. For heterogeneous graphs of different omics types constructed based on omics features, the attention mechanism of the heterogeneous graph Transformer model is used to capture the influence relationship between different omics types to obtain inter-omics fusion representations. Furthermore, for the same tested individual, by comparing the intra-omics aggregation representation and the inter-omics fusion representation, a multi-scale comprehensive representation vector of the tested individual is obtained, and the prediction result is obtained by processing the multi-scale comprehensive representation vector as input.