Sparse gene expression data analysis method based on truncated power

A gene expression data analysis technology, applied in the field of gene expression data analysis. It addresses the large number of attributes, the low accuracy, and the weak explanatory power of results in existing approaches, and improves the efficiency and accuracy of the analysis.

Active Publication Date: 2015-01-07
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

AI Technical Summary

Problems solved by technology

Gene data have many attributes and few samples, so the high-dimensional data contain a large amount of redundant and interfering information, and clustering them directly yields low accuracy.
Principal component analysis is a classic dimensionality reduction method that maps high-dimensional data to a low-dimensional space, but its results are hard to interpret.
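
The interpretability problem can be made concrete with a small sketch on synthetic data (the data, sizes, and variable names are illustrative assumptions, not taken from the patent). Ordinary PCA, computed here with NumPy's SVD, maps the samples to two dimensions, but every gene receives a non-zero weight in the first principal component, so no small subset of genes can be read off as responsible for it.

```python
import numpy as np

# Synthetic stand-in for expression data: 20 samples, 200 genes (assumed sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 200))

# Centre each gene, then take the SVD; the rows of Vt are the PCA loadings.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Low-dimensional representation: project the samples onto the first two components.
scores = Xc @ Vt[:2].T

# Count how many genes carry a non-zero weight in the first component.
nonzero = np.count_nonzero(Vt[0])
print(scores.shape, nonzero)
```

A sparse method instead forces most of these weights to zero, which is what makes the remaining genes interpretable.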


Examples

Detailed Description of the Embodiments

[0047] The technical scheme of the present invention is described in further detail below with reference to the accompanying drawings:

[0048] The processing flow of the present invention is shown in figure 1. The whole process comprises data preprocessing, feature extraction, and clustering. Because the sparsity of the sparse method must be specified manually, a feedback loop is included to balance clustering accuracy against sparsity. The specific steps are as follows:

[0049] Step 1. Preprocess the gene data set: regularize the data, use principal component analysis to determine the number of principal components, and use local iterative search to determine the cardinality of the principal components;

[0050] Step 2. Perform truncated power sparse dimensionality reduction and feature extraction on the gene data processed in step 1, using the sparse tuning parameters determined there, to reduce the interference of data and...
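
Step 2 names truncated power, but the excerpt does not show the iteration itself. A minimal sketch, assuming the commonly described form (multiply by the covariance matrix, keep the k largest-magnitude entries, renormalise), might look as follows. The function name `truncated_power`, the starting vector, the stopping rule, and the toy matrix are illustrative assumptions and are not taken from the patent.

```python
import numpy as np

def truncated_power(A, k, n_iter=200, tol=1e-10):
    """Approximate the leading k-sparse eigenvector of a symmetric PSD matrix A."""
    n = A.shape[0]
    # Assumed initialisation: unit vector on the feature with the largest variance.
    x = np.zeros(n)
    x[np.argmax(np.diag(A))] = 1.0
    for _ in range(n_iter):
        y = A @ x                               # ordinary power step
        keep = np.argsort(np.abs(y))[-k:]       # indices of the k largest magnitudes
        t = np.zeros(n)
        t[keep] = y[keep]                       # truncation: zero everything else
        norm = np.linalg.norm(t)
        if norm == 0.0:                         # degenerate matrix: stop early
            break
        t /= norm
        converged = np.linalg.norm(t - x) < tol
        x = t
        if converged:
            break
    return x

# Toy covariance: features 0, 1, 2 share one strong common factor,
# features 3, 4, 5 are independent unit-variance noise.
v = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(3.0)
A = 10.0 * np.outer(v, v) + np.eye(6)

x = truncated_power(A, k=3)
support = np.flatnonzero(x)   # the three correlated features: 0, 1, 2
print(support, np.round(x, 3))
```

Here k plays the role of the cardinality that the method tunes: a smaller k gives a sparser, easier-to-read component at the possible cost of explained variance.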

Abstract

The invention discloses a sparse gene expression data analysis method based on truncated power. The method comprises four steps. First, the gene data set is preprocessed: it is regularized, the number of principal components is determined by principal component analysis, and the cardinality of the principal components is determined in combination with local iteration. Second, features are extracted from the gene data processed in the first step, which reduces interference in the data and improves the accuracy of the subsequent clustering. Third, the gene data with extracted features are clustered. Fourth, the clustering result obtained in the third step is compared with the preset clustering accuracy, and the tuning parameters of the sparse dimensionality reduction are adjusted through feedback until the optimal clustering accuracy is reached. The method performs sparse eigenvalue decomposition and applies it to sparse principal component analysis. The resulting principal components have strong explanatory power and are fast to compute, the sparse principal component method can be well verified, and the efficiency and accuracy of gene data analysis are improved.
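
The four steps can be sketched end to end as below. This is a rough illustration under stated assumptions, not the patented procedure: "regularized" is read as per-gene centring and scaling, the number of components is chosen by an assumed 80% cumulative-variance rule, multiple components are obtained by projection deflation, clustering is a tiny two-cluster k-means, the feedback is a plain linear scan over the cardinality k rather than the local iterative search the abstract mentions, and accuracy is measured against known labels. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def tpower(A, k, n_iter=200):
    """Leading k-sparse eigenvector of symmetric PSD A by truncated power iteration."""
    x = np.zeros(A.shape[0])
    x[np.argmax(np.diag(A))] = 1.0
    for _ in range(n_iter):
        y = A @ x
        keep = np.argsort(np.abs(y))[-k:]       # k largest-magnitude entries
        t = np.zeros_like(y)
        t[keep] = y[keep]
        norm = np.linalg.norm(t)
        if norm == 0.0:                         # degenerate matrix: keep current x
            break
        x = t / norm
    return x

def sparse_components(A, m, k):
    """m sparse loadings of cardinality k, using projection deflation (assumed)."""
    A = A.copy()
    eye = np.eye(A.shape[0])
    comps = []
    for _ in range(m):
        x = tpower(A, k)
        comps.append(x)
        P = eye - np.outer(x, x)
        A = P @ A @ P                           # remove the direction just found
    return np.column_stack(comps)

def two_means(Z, n_iter=50):
    """Lloyd's algorithm for two clusters, seeded with the extreme points on axis 0."""
    centers = Z[[np.argmin(Z[:, 0]), np.argmax(Z[:, 0])]].astype(float)
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in (0, 1):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def tune_cardinality(X, truth, target=0.9, var_ratio=0.8):
    # Step 1: normalise each gene; choose m from the PCA spectrum (assumed rule).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    A = Z.T @ Z / len(Z)
    eigvals = np.linalg.eigvalsh(A)[::-1]
    cum = np.cumsum(eigvals) / eigvals.sum()
    m = int(np.searchsorted(cum, var_ratio) + 1)
    best_k, best_acc = 0, 0.0
    for k in range(1, X.shape[1] + 1):
        V = sparse_components(A, m, k)          # Step 2: sparse feature extraction
        labels = two_means(Z @ V)               # Step 3: clustering
        agree = np.mean(labels == truth)
        acc = max(agree, 1.0 - agree)           # cluster ids are arbitrary
        if acc > best_acc:
            best_k, best_acc = k, acc
        if acc >= target:                       # Step 4: stop once the target is met
            break
    return best_k, best_acc, m

# Synthetic data: 40 samples, 30 genes, only the first five genes separate the groups.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 30))
X[:, :5] += 3.0 * truth[:, None]

best_k, best_acc, m = tune_cardinality(X, truth)
print(best_k, best_acc, m)
```

The loop makes the trade-off explicit: it returns the smallest cardinality that reaches the target accuracy, or the best cardinality seen if the target is never reached.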

Description

Technical field

[0001] The invention discloses a sparse gene expression data analysis method based on truncated power, and relates to the technical field of gene expression data analysis.

Background technique

[0002] With the rapid development of biomedicine, the wide application of DNA chips (DNA microarrays) makes it possible to measure gene expression levels quickly. Because the analysis of gene data can be used to identify cancer cells and to predict the probability of a certain disease, it is of great significance to human life, and gene clustering has therefore become a hot research topic.

[0003] The raw gene data have many attributes and few samples. The results of direct clustering analysis are often disturbed by the large amount of redundant data, and the high dimensionality is also a challenge for traditional clustering methods. In order to overcome these shortcomings, different dimensionality reduction principal feature extraction ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/62
CPC: G16B25/00, G16B40/00
Inventor: 沈宁敏, 李静, 周培云
Owner: NANJING UNIV OF AERONAUTICS & ASTRONAUTICS