Cancer mutation cluster recognition method based on adaptive Gaussian mixture model

An identification method based on a Gaussian mixture model, applied in the field of data mining in bioinformatics

Pending Publication Date: 2020-03-17
HUNAN UNIV

AI Technical Summary

Problems solved by technology

[0004] However, functional mutations that occur within coding sequences and are associated with cancer do not occur randomly.

Examples

Detailed Description of the Embodiments

[0055] The invention is a method for identifying cancer mutation clusters based on an adaptive Gaussian mixture model. Specific embodiments of the present invention are described below. Those skilled in the art should understand that these embodiments only explain the technical principles of the present invention and are not intended to limit its scope of protection.

[0056] Step 1: Download somatic mutation data for 12 cancers from the TCGA database: lung squamous cell carcinoma (LUSC), rectal adenocarcinoma (READ), glioblastoma (GBM), bladder urothelial carcinoma (BLCA), uterine corpus endometrioid carcinoma (UCEC), colon adenocarcinoma (COAD), ovarian serous cystadenocarcinoma (OV), acute myeloid leukemia (LAML), head and neck squamous cell carcinoma (HNSC), lung adenocarcinoma (LUAD), breast invasive carcinoma (BRCA), and kidney renal clear cell carcinoma (KIRC). The required information was screened from the mutati...

Abstract

The invention relates to the field of data mining in bioinformatics, and in particular to a method for identifying cancer mutation clusters based on an adaptive Gaussian mixture model. The method comprises four main steps: (1) preprocessing the somatic mutation data and constructing a background model; (2) initializing the parameters with an improved density peak clustering method; (3) establishing an adaptive Gaussian mixture model and solving it with the EM algorithm; and (4) screening the clustering result according to the number of mutations contained in each mutation cluster. Compared with the prior art, the invention provides a data-driven method that identifies variable-length target regions within a gene, with stronger statistical power and better stability. The method is feasible and effective, performs well in identifying cancer mutation clusters, and is significant for studying the potential mechanisms of cancer onset and progression and for achieving precision medicine.
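
The four steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: it uses a basic density-peak score rather than the improved variant, a plain one-dimensional Gaussian mixture with a fixed number of components rather than the adaptive model, and it omits the background model entirely. All function names, parameter values (cutoff, init_var, min_var, min_mutations) and the mutation positions are hypothetical and chosen only for the example.

```python
import math

def density_peak_centers(xs, k, cutoff):
    """Pick k starting centers with a basic density-peak score (rho * delta)."""
    n = len(xs)
    # rho[i]: number of other points closer than `cutoff` to xs[i]
    rho = [sum(1 for y in xs if abs(x - y) < cutoff) - 1 for x in xs]
    delta = []
    for i in range(n):
        # Distances to points of higher density (ties broken by index)
        higher = [abs(xs[i] - xs[j]) for j in range(n)
                  if rho[j] > rho[i] or (rho[j] == rho[i] and j < i)]
        # The densest point has no such neighbour; use its farthest distance
        delta.append(min(higher) if higher else max(abs(xs[i] - y) for y in xs))
    ranked = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return [xs[i] for i in ranked[:k]]

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def responsibilities(xs, weights, means, variances):
    """E-step: posterior probability of each component for each position."""
    resp = []
    for x in xs:
        p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(p) or 1e-300  # guard against underflow to zero
        resp.append([pj / total for pj in p])
    return resp

def fit_gmm_1d(xs, init_means, n_iter=50, init_var=25.0, min_var=1.0):
    """Plain EM for a 1-D Gaussian mixture with a variance floor."""
    k = len(init_means)
    means = [float(m) for m in init_means]
    variances = [init_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp = responsibilities(xs, weights, means, variances)
        for j in range(k):  # M-step
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)
            weights[j] = nj / len(xs)
    return weights, means, variances

def screen_clusters(xs, weights, means, variances, min_mutations):
    """Keep components whose hard-assigned mutation count reaches the threshold."""
    counts = [0] * len(means)
    for r in responsibilities(xs, weights, means, variances):
        counts[r.index(max(r))] += 1
    return [(m, c) for m, c in zip(means, counts) if c >= min_mutations]

# Made-up mutation positions along one gene (illustrative only)
positions = [98, 99, 100, 100, 101, 102, 103, 249, 251, 498, 499, 500, 501, 502]
centers = density_peak_centers(positions, k=3, cutoff=5)
weights, means, variances = fit_gmm_1d(positions, centers)
kept = screen_clusters(positions, weights, means, variances, min_mutations=3)
print(kept)  # list of (cluster mean, mutation count) pairs that pass screening
```

In this toy input the two dense groups of positions survive the screening step, while the two-mutation group falls below the assumed threshold of three. A real pipeline would need the background model and the adaptive component selection described in the patent, which this sketch does not attempt to reproduce.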

Description

Technical field

[0001] The invention relates to the field of data mining in bioinformatics, and in particular to a method for identifying cancer mutation clusters based on an adaptive Gaussian mixture model.

Background

[0002] Somatic mutations are the genomic aberrations most closely associated with cancer formation and progression. Human cancer samples typically contain hundreds to thousands of somatic mutations, and these vary across organs and tissues. Precisely identifying the mutations that alter the function of protein-coding genes, and understanding the mechanistic and molecular roles underlying cancer development, remain major challenges.

[0003] Currently, millions of distinct somatic mutations have been observed in humans through genome-wide projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC). For large-scale mutational data, many computational methods have been proposed to identify functional mu...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B20/50, G16B50/00, G06K9/62
CPC: G16B20/50, G16B50/00, G06F18/232
Inventor: 卢新国, 王新宇, 李金鑫, 丁莉, 朱正浩
Owner: HUNAN UNIV