Gene classification method and device

A gene classification method, applied in special data processing applications, instruments, electronic digital data processing, etc. It addresses the poor clustering performance of existing gene classification methods and achieves good clustering performance, strong generalization ability, and strong learning ability.

Active Publication Date: 2018-06-15
HENAN NORMAL UNIV

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide a gene classification method and device that solve the poor clustering performance of existing gene classification methods.

Examples

Embodiment 1

[0112] As shown in Figure 1, a gene classification method of the present invention comprises the following steps:

[0113] Acquire gene expression data. The number of samples in the gene expression data is the first set value, and the number of genes in each sample is the second set value. The genes in the gene expression data are arranged into a matrix, which is the gene expression data matrix.

[0114] Use a locally linear embedding (LLE) algorithm to reduce the dimension of the gene expression data matrix: calculate the linear embedding matrix of the gene expression data matrix and obtain the feature gene subset after dimension reduction. That is, calculate the k nearest neighbors of every sample in the gene expression data matrix, construct a local reconstruction weight matrix from the k nearest neighbors of each sample, and then use the local reconstruction weight matrix to calculate the gene expression data matrix Linea...
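The three LLE steps named in this paragraph (k nearest neighbors, local reconstruction weights, embedding) can be sketched as follows. This is a minimal illustration of textbook LLE under stated assumptions, not the patent's implementation: the function name `lle_embed`, the regularization term `reg`, the default parameter values, and the random demo matrix are all hypothetical.

```python
import numpy as np

def lle_embed(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Textbook LLE sketch: rows of X are samples, columns are genes."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Step 1: k nearest neighbors of every sample (squared Euclidean distance).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)              # a sample is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 2: local reconstruction weights; each row of W sums to 1.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                 # neighbors centered on sample i
        C = Z @ Z.T                           # local Gram matrix
        trace = np.trace(C)
        # Regularize so the linear system is well conditioned (assumed choice).
        C = C + np.eye(n_neighbors) * reg * (trace if trace > 0 else 1.0)
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()

    # Step 3: embedding from the bottom eigenvectors of (I - W)^T (I - W),
    # skipping the first (near-constant) eigenvector.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1], W

# Synthetic stand-in for a gene expression data matrix: 30 samples x 200 genes.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 200))
Y_demo, W_demo = lle_embed(X_demo, n_neighbors=5, n_components=2)
print(Y_demo.shape)
```

The number of neighbors and the target dimension are free parameters here; the source text does not state which values the patent uses.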

Embodiment 2

[0155] Applying the AP clustering algorithm directly to the gene expression data set produces a large number of clusters. To avoid this, the present invention combines the LLE algorithm with an AP clustering algorithm based on the hybrid kernel function. First, the original high-dimensional gene data set is mapped to a low-dimensional space, and the characteristic gene subset is obtained through dimension reduction. Then the characteristic gene subset after dimension reduction is clustered using the AP clustering algorithm based on the hybrid kernel function, and the final clustering result is obtained.

[0156] As shown in Figure 2, the specific steps are as follows:
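The clustering stage can be illustrated with a compact sketch of standard affinity propagation (AP) message passing on a precomputed similarity matrix. It assumes the usual responsibility and availability updates with damping; the function name, the default parameters, the negative squared Euclidean similarity, and the median preference are illustrative assumptions, and the similarity used here is a stand-in rather than the patent's hybrid kernel similarity.

```python
import numpy as np

def affinity_propagation(S, damping=0.5, max_iter=200, convergence_iter=15):
    """Standard AP sketch; S[i, k] is a similarity, S[k, k] is the preference."""
    S = np.array(S, dtype=float)
    n = S.shape[0]
    R = np.zeros((n, n))                      # responsibilities
    A = np.zeros((n, n))                      # availabilities
    rows = np.arange(n)
    prev, stable = None, 0
    for _ in range(max_iter):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')].
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1 - damping) * R_new

        # Availability: a(i,k) = min(0, r(k,k) + sum of positive r(i',k)).
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()       # a(k,k) is not clipped at 0
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new

        # Exemplars are the points with a(k,k) + r(k,k) > 0.
        exemplars = np.flatnonzero(np.diag(A + R) > 0)
        key = tuple(exemplars)
        stable = stable + 1 if (key == prev and len(key) > 0) else 0
        prev = key
        if stable >= convergence_iter:
            break
    if len(exemplars) == 0:
        return exemplars, np.full(n, -1)
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(len(exemplars))
    return exemplars, labels

# Synthetic low-dimensional points standing in for the reduced gene subset.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 0.3, (10, 2)), rng.normal(5.0, 0.3, (10, 2))])
diff = pts[:, None, :] - pts[None, :, :]
S_demo = -np.sum(diff ** 2, axis=2)           # stand-in similarity (assumption)
pref = np.median(S_demo[~np.eye(len(pts), dtype=bool)])
np.fill_diagonal(S_demo, pref)                # shared preference for all points
exemplars_demo, labels_demo = affinity_propagation(S_demo)
print(len(exemplars_demo), labels_demo)
```

In this sketch the diagonal preference is what controls how many clusters emerge, so it is the natural knob to examine when AP returns too many clusters on raw high-dimensional data.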

[0157] Data preprocessing: use the genetic data acquisition system to obtain the original gene expression data set, including the gene expression values of multiple samples and a gene expression data matrix with the sample class labels. The description of the gene exp...

Abstract

The invention relates to a gene classification method and device. An LLE algorithm and an AP clustering algorithm are combined, and a proposed mixed kernel function is used to improve the similarity measurement function. First, the LLE algorithm maps the original high-dimensional gene expression data set to a low-dimensional space to achieve dimension reduction. Second, a new global kernel function, an F-type function, is proposed; the F-type function and a Gaussian kernel function are linearly combined into a new mixed kernel function, which is used to calculate the similarity measurement and construct a new similarity matrix S. Third, the data is clustered with the AP clustering algorithm and the similarity matrix, and the final clustering result is obtained through iteration. Finally, the effectiveness and accuracy of the method are verified through comparison with other clustering methods.
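The linear combination of a Gaussian kernel with a global kernel can be sketched as below. The abstract does not give the formula of the F-type function, so this sketch substitutes a polynomial kernel as a clearly labeled placeholder; the function names, the `weight` and `sigma` parameters, and the demo data are all assumptions for illustration.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Gaussian (local) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def placeholder_global_kernel(X, degree=2):
    """Polynomial kernel used ONLY as a stand-in global kernel.

    This is NOT the patent's F-type function, whose formula is not given
    in the text above.
    """
    return (X @ X.T + 1.0) ** degree

def mixed_kernel_similarity(X, weight=0.5, sigma=1.0,
                            global_kernel=placeholder_global_kernel):
    """Similarity matrix S = weight * Gaussian + (1 - weight) * global kernel."""
    X = np.asarray(X, dtype=float)
    return weight * gaussian_kernel(X, sigma) + (1.0 - weight) * global_kernel(X)

# Synthetic low-dimensional samples (hypothetical data).
rng = np.random.default_rng(1)
X_low = rng.normal(size=(8, 3))
S_mixed = mixed_kernel_similarity(X_low, weight=0.7, sigma=1.0)
print(S_mixed.shape)
```

Passing the global kernel as a callable keeps the sketch open to the actual F-type function once its definition is available; the resulting matrix plays the role of the similarity matrix S that the AP step consumes.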

Description

Technical field

[0001] The invention belongs to the technical field of gene classification, and in particular relates to a gene classification method and device.

Background art

[0002] As the amount of genetic information continues to increase, processing genetic data to obtain useful information has become a difficult problem. Data sets usually contain a large number of irrelevant genes, redundant genes, and so on. How to analyze a massive information base and extract an effective subset of characteristic genes, so as to better select disease-causing genes, has therefore become an important research topic.

[0003] As an effective data analysis method, cluster analysis is widely used in data mining, machine learning, pattern recognition, bioinformatics, and other fields. Cluster analysis mainly groups high-dimensional data into different clusters, so that the distance within the class is as small as possible and the distance bet...

Application Information

IPC(8): G06F19/20, G06F19/24, G06K9/62
CPC: G16B25/00, G16B40/00, G06F18/23
Inventor: 孙林, 刘弱南, 张霄雨, 孟新超, 常宝方, 孟玲玲, 王蓝莹, 陈岁岁, 殷腾宇, 李源
Owner HENAN NORMAL UNIV