A Biclustering Method for Gene Expression Data Based on Two-Stage Genetic Computation

A gene expression biclustering technique in the field of data mining and processing. It addresses the large volume, high redundancy, and high dimensionality of gene expression data, and it achieves a wide search range, avoids being limited to local information, and yields high-quality biclusters.

Publication Date: 2017-11-07 (status: Inactive)
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

An analysis of how gene expression data are obtained shows that such data have (1) huge volume; (2) high dimensionality; (3) high noise; and (4) high redundancy. These properties place higher demands on, and pose challenges for, research on analysis algorithms.




Embodiment Construction

[0027] The present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0028] As shown in Figure 1, a biclustering algorithm based on two-stage genetic computation includes the following steps, in order:

[0029] 1) Let the gene expression data matrix be M, with m rows and n columns, so that its size is m×n. Subtract the k-th column from each column of the original data matrix M to obtain the processed matrix M(k), k = 1, 2, ..., n;
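Step 1 can be sketched as follows. This is a minimal illustration in plain Python, assuming M is stored as a list of rows and that k is 0-based (the text counts from 1); the function names `subtract_column` and `all_difference_matrices` are hypothetical and do not come from the patent.

```python
def subtract_column(M, k):
    """Return M(k): every column of M minus column k (k is 0-based here)."""
    return [[value - row[k] for value in row] for row in M]


def all_difference_matrices(M):
    """Return the list [M(0), M(1), ..., M(n-1)] for an m x n matrix M."""
    n = len(M[0])
    return [subtract_column(M, k) for k in range(n)]


# Toy 3 x 3 matrix: rows 0 and 1 differ only by a constant offset of 1.
M = [
    [1.0, 2.0, 4.0],
    [2.0, 3.0, 5.0],
    [0.0, 5.0, 1.0],
]

M0 = subtract_column(M, 0)
for row in M0:
    print(row)
```

In this toy input, rows 0 and 1 differ by a constant offset, so they become identical in M(0). Rows that share such an additive pattern therefore end up with equal values in the columns of M(k), which is what a per-column clustering can then detect.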

[0030] 2) For each column of the processed matrix M(k) except the k-th column, as shown in Figure 2, apply hierarchical clustering with a distance threshold of cof = 0.02 to obtain the biclustering seeds of that column, and then put all the resulting biclustering seeds into a set named Bic_Set; because each biclustering seed corresponds...
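The per-column seeding in step 2 might look like the sketch below. Several points are assumptions rather than details from the text: the linkage is not specified, so single linkage is used (for one-dimensional data, cutting single-linkage clustering at distance cof is the same as sorting the values and splitting wherever consecutive values differ by more than cof); a seed is represented as a hypothetical pair (row indices, [k, j]); and the `min_rows` filter that discards single-row clusters is an illustrative choice.

```python
def cluster_column(values, cof=0.02):
    """Group row indices of a 1-D column by single linkage cut at distance cof.

    Indices are sorted by value; a new cluster starts whenever the gap between
    consecutive sorted values exceeds cof.
    """
    if not values:
        return []
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters, current = [], [order[0]]
    for prev, nxt in zip(order, order[1:]):
        if values[nxt] - values[prev] <= cof:
            current.append(nxt)
        else:
            clusters.append(sorted(current))
            current = [nxt]
    clusters.append(sorted(current))
    return clusters


def seeds_for_matrix(Mk, k, cof=0.02, min_rows=2):
    """Collect seeds from every column of M(k) except column k (0-based)."""
    seeds = []
    for j in range(len(Mk[0])):
        if j == k:
            continue
        column = [row[j] for row in Mk]
        for rows in cluster_column(column, cof):
            if len(rows) >= min_rows:  # assumption: ignore single-row clusters
                seeds.append((rows, [k, j]))
    return seeds


# Toy M(0): rows 0 and 1 nearly agree in columns 1 and 2.
Mk = [
    [0.0, 1.0, 3.0],
    [0.0, 1.01, 3.0],
    [0.0, 5.0, 1.0],
]
bic_set = seeds_for_matrix(Mk, k=0)
print(bic_set)
```

Running the same routine for every k and concatenating the results would give one possible reading of Bic_Set; how the patent actually encodes a seed is not recoverable from the truncated text.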


Abstract

The invention discloses a biclustering method for gene expression data based on two-stage genetic computation. The k-th column is subtracted from each column of the matrix M to obtain the matrix M(k), k = 1, 2, ..., n; hierarchical clustering is performed on each column of M(k) to obtain a set of biclustering seeds; and the corresponding biclusters are obtained through genetic computation. The described algorithm overcomes the limitation of traditional biclustering algorithms based on genetic computation, which can only perform selection for biclustering. By optimizing rows and columns simultaneously, it improves search efficiency and yields better biclustering results.

Description

Technical Field

[0001] The invention relates to the field of data mining and processing, in particular to a biclustering method for gene expression data based on two-stage genetic computation.

Background

[0002] The emergence and development of DNA microarray technology makes it possible to monitor thousands of genes simultaneously and measure the expression levels of their transcribed mRNA. By repeating experiments under multiple conditions (such as different experimental environments, different time points, and different tissue samples), gene expression data from hundreds of experiments can be collected. Each row of the gene expression data matrix represents the expression of one gene under different environmental conditions or at different time points, each column represents the expression of all genes under one condition or sample (such as a tissue, experimental condition, or processing factor), and the data in the matrix represent ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30, G06F19/20
Inventor: 黄庆华, 杨杰, 黄仙海
Owner: SOUTH CHINA UNIV OF TECH