Cancer subtype identification method and system based on self-attention deep learning

A deep-learning identification technology, applied in the field of biological information, that addresses the problem of existing methods ignoring the relationships between data features and achieves a better clustering result.

Status: Inactive. Publication Date: 2022-04-12
XUZHOU MEDICAL UNIV

AI Technical Summary

Problems solved by technology

Existing methods ignore the data characteristics shared between different omics and the relationships between samples.

Examples

Embodiment 1

[0084] The cancer subtype identification method and system based on self-attention deep learning of the present invention, as shown in Figure 1, specifically includes the following steps:

[0085] Step 1: preprocess each of the four kinds of omics data of the cancer samples separately. For mRNA and miRNA expression data, a logarithmic transformation is performed first to reduce the magnitude of the values. For DNA copy number variation data, the repetitive regions are removed first, and features are then constructed from the correspondence between samples and genomic regions. For DNA methylation data, since each sample has information for many methylation sites, the DNA methylation information is first integrated and the average value for each sample is calculated. Cancer multi-omics data contain missing values to varying degrees; for each omics data set, the sample average is used to fill in the missing data. Finally, normalization processing was per...
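
The expression-data branch of Step 1 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation. The function name `preprocess_expression`, the `log2(x + 1)` form of the logarithmic transformation, filling each feature with its mean across samples, and the z-score normalization are all assumptions. The excerpt is truncated before it says which normalization is used.

```python
import numpy as np


def preprocess_expression(x, pseudocount=1.0):
    """Hypothetical helper: log-transform, impute, and normalize one omics matrix.

    x is a (samples x features) array of non-negative expression values,
    with NaN marking missing entries.
    """
    x = np.asarray(x, dtype=float)

    # Logarithmic transformation to shrink the range of the values.
    # The pseudocount (an assumption) avoids log(0).
    logged = np.log2(x + pseudocount)

    # Fill each missing entry with the mean of that feature over the samples
    # that do have a value (one reading of "sample average"; an assumption).
    feature_mean = np.nanmean(logged, axis=0)
    filled = np.where(np.isnan(logged), feature_mean, logged)

    # Z-score normalization per feature (assumed; the source text is cut off
    # before the normalization method is named).
    mu = filled.mean(axis=0)
    sigma = filled.std(axis=0)
    sigma[sigma == 0] = 1.0  # leave constant features at zero instead of dividing by 0
    return (filled - mu) / sigma


if __name__ == "__main__":
    toy = np.array([[1.0, 3.0],
                    [3.0, np.nan],
                    [7.0, 15.0]])
    print(preprocess_expression(toy))
```

The sketch assumes every feature has at least one observed value. A feature that is missing for all samples would need to be dropped beforehand.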

Abstract

The invention provides a cancer subtype identification method and system based on self-attention deep learning. The method comprises the following steps. First, the multi-omics data of a cancer are preprocessed; a deep learning Dense network then learns the low-dimensional features of each omics, and the features of the different omics are preliminarily integrated by splicing (concatenation). Next, a similarity matrix between samples is constructed using self-attention, and feature fusion is carried out from the matrix weights and the spliced features to obtain the final integrated feature representation. A decoder is used to minimize the error between the fused features and the original omics features, and a discriminator performs adversarial learning on the distribution of the integrated features. Finally, the learned integrated feature distribution is clustered with a Gaussian mixture model to identify cancer subtypes. The method can effectively integrate multiple omics data while adaptively modeling the relationships between samples, so that a better feature representation is learned, a better clustering result is obtained, and cancer subtypes are identified accurately.
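
The splicing and self-attention fusion steps described in the abstract can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions. The scaled dot-product form of the attention, the projection matrices `w_q` and `w_k`, and the function names are hypothetical and are not taken from the patent text. In the method itself the per-omics features would come from the trained Dense networks, not from random arrays.

```python
import numpy as np


def softmax_rows(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def self_attention_fuse(omics_features, w_q, w_k):
    """Hypothetical fusion step.

    omics_features: list of (samples x d_i) low-dimensional feature matrices,
                    one per omics type.
    w_q, w_k:       (sum(d_i) x d_attn) projection matrices (assumed learnable).
    Returns the (samples x samples) similarity matrix and the fused features.
    """
    # Preliminary integration: splice the per-omics features side by side.
    spliced = np.concatenate(omics_features, axis=1)

    # Sample-to-sample similarity via scaled dot-product self-attention
    # (the scaling and projections are assumptions of this sketch).
    queries = spliced @ w_q
    keys = spliced @ w_k
    scores = queries @ keys.T / np.sqrt(queries.shape[1])
    similarity = softmax_rows(scores)

    # Fusion: each sample becomes a similarity-weighted mix of all samples.
    fused = similarity @ spliced
    return similarity, fused


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = [rng.normal(size=(5, 4)), rng.normal(size=(5, 3))]
    w_q = rng.normal(size=(7, 8))
    w_k = rng.normal(size=(7, 8))
    sim, fused = self_attention_fuse(feats, w_q, w_k)
    print(sim.shape, fused.shape)
```

Each row of the similarity matrix sums to one, so every fused sample is a weighted average of the spliced features of all samples. The resulting fused matrix is what a Gaussian mixture model would then cluster in the final step.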

Description

Technical field

[0001] The present invention relates to the field of biological information technology, in particular to a method and system for identifying cancer subtypes based on self-attention deep learning.

Background technique

[0002] The diagnosis, treatment and prognosis evaluation of cancer is one of the most urgent and important research topics in the field of life science and medicine. Studies have shown that cancer is highly heterogeneous: tumors with the same clinical stage or histological morphology can differ greatly in their molecular typing. Molecular typing plays a crucial role in the selection of preoperative treatment options and in patient prognosis, and it is an important basis for individualized therapy, especially endocrine therapy and targeted therapy.

[0003] Early cancer molecular typing research mainly used single omics data. This typing method depends on the type of data used, and the results obtained from different types of omics data are i...

Application Information

IPC(8): G16B40/30, G16H50/20
Inventor: 巩萍, 孙秋文, 程磊, 张志远, 孟军, 葛海涛, 陈洁, 章龙珍
Owner: XUZHOU MEDICAL UNIV