Multi-omics cancer data integrating and analyzing method based on similarity fusion

A similarity fusion and data integration technology, applied in the field of bioinformatics, that addresses problems such as high dimensionality, high noise, and the lack of robust integration methods.

Active Publication Date: 2019-07-09
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

The main problem with network-based methods is that their modeling of network structure and propagation paths is too simple to describe the complex network structure of multi-omics data, and they provide no way to feed the fusion results back to the original omics data.
Therefore, for cancer multi-omics data with high dimensionality, high noise, and varied distributions, efficient, accurate, and robust integration methods for cancer molecular analysis are still lacking.

Embodiment Construction

[0059] The present invention will be further described below in conjunction with specific examples.

[0060] This embodiment evaluates the method of the present invention on the cervical cancer project (CESC) data in the public cancer data set TCGA, with the different patient types serving as evaluation indicators. The outline design of this example is shown in Figure 1, and the implementation process is shown in Figure 2. The specific process of realizing the multi-omics cancer data integration analysis method based on similarity fusion is as follows:

[0061] Step 1. Obtain the gene expression data, methylation data, and miRNA data of the same batch of samples of the cervical cancer project in the TCGA database. In this example, the data of 292 cervical cancer patients were initially collected.
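As an illustration of what "the same batch of samples" implies in practice, the following minimal sketch keeps only the patients that appear in all three omics tables and puts them in a shared order. The function name `align_samples`, the dictionary layout, and the toy patient IDs are assumptions made for this example; the source does not describe how the TCGA files are loaded or matched.

```python
def align_samples(*omics):
    """Keep only the samples present in every omics table, in a shared order.

    Each table is a dict mapping a sample ID to its list of feature values
    (a hypothetical layout chosen for this sketch).
    """
    shared = set(omics[0])
    for table in omics[1:]:
        shared &= set(table)
    order = sorted(shared)
    aligned = [{sample: table[sample] for sample in order} for table in omics]
    return order, aligned


# Toy data: P2 lacks methylation data and P4 appears only in methylation.
expression = {"P1": [2.1, 0.3], "P2": [1.7, 0.9], "P3": [0.2, 1.1]}
methylation = {"P1": [0.8], "P3": [0.1], "P4": [0.5]}
mirna = {"P1": [5.0], "P2": [4.2], "P3": [3.3]}

order, aligned = align_samples(expression, methylation, mirna)
```

After alignment, row `i` of every table refers to the same patient, which is what the later similarity networks rely on.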

[0062] Step 2, data cleaning: process the null values in the data, directly delete samples or features with more than 20% null values, and use...
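The deletion part of this step can be sketched as follows, assuming the 20% threshold refers to the fraction of null values in a sample or feature. The function names, the use of `None` for missing values, and the choice to filter features before samples are assumptions of this sketch. The handling of the remaining null values is truncated in the source and is deliberately left out.

```python
def missing_fraction(values):
    """Fraction of entries in `values` that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def drop_sparse(matrix, max_missing=0.2):
    """Drop features, then samples, whose missing fraction exceeds max_missing.

    `matrix` is a list of rows (samples); each row is a list of feature
    values with None marking a null value. Returns the cleaned matrix plus
    the indices of the kept samples and kept features.
    """
    if not matrix:
        return [], [], []
    n_features = len(matrix[0])
    kept_features = [
        j for j in range(n_features)
        if missing_fraction([row[j] for row in matrix]) <= max_missing
    ]
    reduced = [[row[j] for j in kept_features] for row in matrix]
    kept_samples = [
        i for i, row in enumerate(reduced)
        if row and missing_fraction(row) <= max_missing
    ]
    cleaned = [reduced[i] for i in kept_samples]
    return cleaned, kept_samples, kept_features


# Toy matrix: feature 1 is 60% missing; sample 2 is mostly missing afterwards.
raw = [
    [1.0, None, 0.5, 2.0, 1.0],
    [0.9, None, 0.4, 1.8, 1.2],
    [1.1, 3.0, None, None, 0.8],
    [1.2, None, 0.6, 2.1, 0.9],
    [1.0, 2.5, 0.5, 1.9, 1.1],
]
cleaned, kept_samples, kept_features = drop_sparse(raw)
```

Filtering features first means a sample is judged only on the features that survive; the reverse order is equally plausible and may keep a different subset.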

Abstract

The invention discloses a multi-omics cancer data integration and analysis method based on similarity fusion. The method comprises the steps of calculating local similarity networks, fusing multiple local similarity networks, typing patients according to the global similarity network, and tracing features back to the original data sources according to the global similarity network. Compared with the prior art, the method models the connection paths of the similarity networks in a progressive manner to realize a fusion algorithm for multiple similarity networks; it can describe a more complicated network structure and achieves higher accuracy and stability. The network fusion model is solved quickly through a consensus alternating direction method of multipliers (ADMM). The method uses the integrated global similarity network to type cancer patients, obtains patient types with substantial prognosis differences, and combines a multi-omics feature selection method to screen key target features. The selected features have the potential to become biological markers.
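To make the terms "local similarity network" and "global similarity network" concrete, here is a minimal sketch under loudly stated assumptions. Each local network is built with a Gaussian kernel on Euclidean distance, and the networks are fused by plain element-wise averaging. Both choices are stand-ins chosen for illustration: the abstract does not give the kernel, and the patent's actual fusion model (solved with an alternating multiplier method) is not reproduced here. All function names and the `sigma` parameter are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_network(rows, sigma=1.0):
    """Patient-by-patient similarity matrix for one omics type.

    Uses a Gaussian kernel (an assumption of this sketch).
    """
    n = len(rows)
    return [
        [math.exp(-euclidean(rows[i], rows[j]) ** 2 / (2 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]


def fuse_by_average(networks):
    """Stand-in fusion: element-wise mean of the local networks.

    This is NOT the patent's fusion model, only a placeholder.
    """
    n = len(networks[0])
    return [
        [sum(net[i][j] for net in networks) / len(networks) for j in range(n)]
        for i in range(n)
    ]


def nearest_neighbour(fused, i):
    """Index of the patient most similar to patient i in the fused network."""
    candidates = [j for j in range(len(fused)) if j != i]
    return max(candidates, key=lambda j: fused[i][j])


# Toy data: four patients, two omics types, two obvious groups.
expression = [[0.0, 0.1], [0.1, 0.0], [2.0, 2.1], [2.1, 2.0]]
methylation = [[0.2], [0.25], [0.9], [0.85]]

local_networks = [similarity_network(expression), similarity_network(methylation)]
fused = fuse_by_average(local_networks)
```

In the fused matrix, patients 0 and 1 are each other's nearest neighbours, as are patients 2 and 3. A typing step would cluster on this matrix, for example with spectral clustering.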

Description

Technical Field

[0001] The present invention relates to the technical field of bioinformatics, in particular to a multi-omics cancer data integration analysis method based on similarity fusion.

Background Art

[0002] In current clinical practice, cancer is often classified and treated according to its tissue of origin and pathological features. However, with the development of sequencing technology and human genome research, a large number of studies have shown that the molecular-level pathology of a tumor characterizes its onset and stage of development well. Tumorigenesis and tumor development are often accompanied by genomic variation caused by somatic gene mutations, epigenetic changes, individual differences, and environmental influences. Traditional analysis based on single genome data struggles to capture the heterogeneity of all biological processes and to clearly distinguish the phenotypes. Therefo...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16H50/20, G16B20/00
CPC: G16H50/20
Inventor: 蔡宏民, 徐傲丹
Owner: SOUTH CHINA UNIV OF TECH