Bi-order genetic calculation-based gene expression data bi-clustering algorithm

A gene expression data biclustering technique, applied in the field of data mining and processing. It addresses the large data volume, high dimensionality, and high redundancy of gene expression data, and achieves a wide search range and high-quality results while overcoming the limitation of local information.

Status: Inactive
Publication Date: 2015-04-29
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

An analysis of how gene expression data are obtained shows that such data have (1) huge volume; (2) high dimensionality; (3) high noise; and (4) high redundancy. These properties place higher demands on, and pose challenges for, research into analysis algorithms.



Embodiment Construction

[0027] The present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0028] As shown in Figure 1, a biclustering algorithm based on two-stage genetic computation includes the following steps, in order:

[0029] 1) Let the gene expression data matrix be M, with m rows and n columns, so that its size is m×n. Subtract the k-th column from each column of the original data matrix M to obtain the processed matrix M(k), k=1,2,...,n;

[0030] 2) For each column of the processed matrix M(k) except the k-th column, as shown in Figure 2, apply hierarchical clustering with a distance threshold of cof=0.02 to obtain that column's biclustering seeds, and then put all the obtained biclustering seeds into a set named Bic_Set; because each of the biclustering seeds correspond t...
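Steps 1) and 2) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not name the linkage criterion, so single linkage is assumed (in one dimension it reduces to splitting the sorted values wherever a gap exceeds cof); the `min_rows` cutoff and the choice to record each seed as a (rows, columns) pair over columns k and j are also assumptions. The function names `subtract_column`, `cluster_1d`, and `biclustering_seeds` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Seed = Tuple[Tuple[int, ...], Tuple[int, ...]]


def subtract_column(M: Matrix, k: int) -> Matrix:
    """Return M(k): every column of M minus column k (0-based index)."""
    return [[value - row[k] for value in row] for row in M]


def cluster_1d(values: List[float], cof: float) -> List[List[int]]:
    """Single-linkage clustering of scalars, cut at distance cof (assumed linkage).

    In one dimension this is equivalent to sorting the values and starting
    a new cluster wherever the gap between neighbours exceeds cof.
    Clusters are returned as lists of original indices.
    """
    if not values:
        return []
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] > cof:
            clusters.append([cur])
        else:
            clusters[-1].append(cur)
    return clusters


def biclustering_seeds(M: Matrix, cof: float = 0.02, min_rows: int = 2) -> List[Seed]:
    """Collect seeds for every k into a set playing the role of Bic_Set."""
    n_cols = len(M[0])
    bic_set = set()
    for k in range(n_cols):
        Mk = subtract_column(M, k)
        for j in range(n_cols):
            if j == k:  # the k-th column of M(k) is all zeros, so skip it
                continue
            column = [row[j] for row in Mk]
            for rows in cluster_1d(column, cof):
                if len(rows) >= min_rows:  # assumed minimum seed size
                    seed = (tuple(sorted(rows)), tuple(sorted((k, j))))
                    bic_set.add(seed)
    return sorted(bic_set)


M = [[1.0, 2.0, 5.0],
     [2.0, 3.0, 9.0],
     [4.0, 5.0, 0.0]]
print(biclustering_seeds(M))  # [((0, 1, 2), (0, 1))]
```

In this toy matrix, columns 0 and 1 differ by exactly 1.0 in every row, so all three rows fall into one cluster after subtraction and form a single seed on those two columns. Using a set removes the duplicate seed that appears once for k=0 and again for k=1.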


Abstract

The invention discloses a gene expression data biclustering algorithm based on bi-order (two-stage) genetic calculation. The k-th column is subtracted from each column of a matrix M to obtain a matrix M(k), where k = 1, 2, ..., n; hierarchical clustering is performed on each column of M(k) to obtain a set of biclustering seeds; and genetic calculation is then performed to obtain the corresponding biclusters. The algorithm overcomes the defect that conventional genetic-calculation-based biclustering algorithms can only perform selection at the level of the bicluster. Because it optimizes rows and columns simultaneously, it improves search efficiency and yields a better biclustering result.

Description

Technical field

[0001] The invention relates to the field of data mining and processing, in particular to a biclustering algorithm of gene expression data based on two-stage genetic calculation.

Background technique

[0002] The emergence and development of DNA microarray technology allows people to simultaneously detect thousands of genes and measure the expression level of their transcribed mRNA. Through repeated experiments under multiple experimental conditions (such as different experimental environments, different time points, and different tissue samples), gene expression data from hundreds of experiments can be collected. The rows of the gene expression data matrix represent the expression of a gene under different environmental conditions or at different time points, and the columns represent the expression of all genes under different conditions or samples (such as tissue, experimental conditions, processing factors, etc.), and the data in the matrix represent The...


Application Information

IPC(8): G06F17/30, G06F19/20
Inventor: 黄庆华, 杨杰, 黄仙海
Owner: SOUTH CHINA UNIV OF TECH