Tumor gene identification method based on gene expression profile

This gene expression profile and tumor gene technology, applied to tumor gene identification based on gene expression profiles, addresses problems such as the failure of existing methods and their unsuitability for classification.

Inactive Publication Date: 2011-05-25
CHONGQING UNIV
Cites: 2; Cited by: 8

AI Technical Summary

Problems solved by technology

However, when used to classify gene expression profiles, both the PCA method and the LDA method have obvious deficiencies. First, the total scatter of the samples includes both the intra-class scatter and the inter-class scatter, so the PCA method, which aims at optimal reconstruction, is not suited to classification problems. Second, although LDA can effectively extract discriminative information between classes, it requires the intra-class scatter matrix to be invertible during the calculation, whereas gene expression profile data is very high-dimensional and its intra-class scatter matrix is often singular. Third, both the PCA method and the LDA method are derived under the assumption that the samples obey a multivariate normal distribution. Studies have shown that gene expression profiles do not necessarily obey a normal distribution and are likely to lie on a low-dimensional nonlinear manifold. In this case, the PCA method and the LDA method will probably fail.
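The singularity problem described for LDA can be checked numerically. The following is a minimal sketch, assuming synthetic Gaussian data and toy sizes (40 samples, 200 features) chosen only for illustration; the function name `within_class_scatter` is hypothetical and nothing here comes from the patent's own data.

```python
import numpy as np

def within_class_scatter(X, y):
    """Sum over classes of centered.T @ centered; X has shape (n_samples, n_features)."""
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        centered = Xc - Xc.mean(axis=0)
        Sw += centered.T @ centered
    return Sw

# Toy sizes (assumption): far fewer samples than features, as with expression profiles.
rng = np.random.default_rng(0)
n_samples, n_features = 40, 200
X = rng.normal(size=(n_samples, n_features))
y = np.repeat([0, 1], n_samples // 2)

Sw = within_class_scatter(X, y)
rank = np.linalg.matrix_rank(Sw)

# The rank cannot exceed n_samples - n_classes, so a 200 x 200 matrix built
# from 40 samples in 2 classes is necessarily singular.
print(f"rank(Sw) = {rank}, matrix size = {n_features}")
```

Because the rank is bounded by the number of samples minus the number of classes, inverting the intra-class scatter matrix directly is not possible whenever the feature count exceeds that bound.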


Examples

Embodiment 1

[0044] There are 100 gene samples of suspected prostate tumor genes, and the gene expression profiles of these samples are obtained. Each gene sample has 10,509 gene expression profile features. These 100 gene samples were identified manually with the help of prior knowledge: 62 were tumor genes and 38 were normal genes. Hereinafter, the 100 suspected prostate tumor genes are used as the gene samples to be tested, and tumor gene identification is carried out on them by the method of the present invention to verify its feasibility and recognition rate. The operation process is shown in Figure 1. The specific operation steps are as follows:

[0045] s1) With the aid of prior knowledge, additionally obtain M normal gene samples and M tumor gene samples of prostate genes. In this example, M=20, and each gene sample has 10509 gene expression profile features; these 20 normal genes are...

Embodiment 2

[0085] For the 100 gene samples of suspected prostate tumor genes in Embodiment 1, the method of the present invention is used to identify tumor genes. In step s1), M normal gene samples and M tumor gene samples of prostate genes are additionally obtained according to prior knowledge, each with 10,509 gene expression profile features. M takes the seven values 2, 5, 10, 15, 20, 25 and 30; the other steps are the same as in Embodiment 1, giving seven rounds of tumor gene identification. The seven identification results are then compared with the results of manual identification of the 100 gene samples through prior knowledge, yielding the recognition rate of the method for each value of M, as shown in Figure 3. From Figure 3 it can be seen that when the value of M is greater than or equal to 20, that is, when the n...
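The comparison in this embodiment amounts to computing, for each value of M, the fraction of test samples whose predicted label matches the manual identification. A minimal sketch follows; `recognition_rate`, `sweep_training_sizes` and the `identify` callable are hypothetical names, and the stand-in identifier that labels every sample as tumor is a placeholder, not the patent's method or its reported results.

```python
import numpy as np

M_VALUES = (2, 5, 10, 15, 20, 25, 30)  # the seven values of M named in the text

def recognition_rate(predicted, reference):
    """Fraction of samples whose predicted label equals the reference label."""
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    if predicted.shape != reference.shape:
        raise ValueError("predicted and reference must have the same shape")
    return float(np.mean(predicted == reference))

def sweep_training_sizes(identify, reference, m_values=M_VALUES):
    """Run identify(m) for each m and score it against the reference labels.

    identify is a hypothetical callable returning one label per test sample
    when trained with m normal and m tumor samples.
    """
    return {m: recognition_rate(identify(m), reference) for m in m_values}

# Reference labels shaped like the text: 62 tumor (1) and 38 normal (0) samples.
reference = np.array([1] * 62 + [0] * 38)

# Placeholder identifier (assumption): always predicts "tumor".
rates = sweep_training_sizes(lambda m: np.ones(100, dtype=int), reference)
print(rates)
```

Plotting the resulting dictionary against M would give a curve of the kind the text attributes to Figure 3, once a real identifier is substituted for the placeholder.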

Embodiment 3

[0087] Several tumor gene identification methods that differ from the method of the present invention are as follows:

[0088] Method 1 (PCA+Ker-KNN): Based on the gene expression profile of the gene sample, principal component analysis (PCA for short) is used to reduce the dimensionality of the vector data generated from the gene expression profile, and the kernel-function nearest neighbor classification method of the present invention (Ker-KNN for short) is used to classify the gene samples to be tested, thereby identifying the tumor genes among them;

[0089] Method 2 (LDA+Ker-KNN): Based on the gene expression profile of the gene sample, linear discriminant analysis (LDA for short) is used to reduce the dimensionality of the vector data generated from the gene expression profile, and the kernel-function nearest neighbor classification method of the present invention (Ker-KNN for short) ...
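The dimension reduction step of Method 1 can be sketched with a singular value decomposition. This is a minimal illustration under assumptions: the function name `pca_reduce`, the number of components, and the synthetic data sizes are all hypothetical, and the source does not state how many components the comparison used.

```python
import numpy as np

def pca_reduce(X_train, X_test, n_components):
    """Project both sets onto the top principal directions of the training set."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    W = Vt[:n_components].T
    return (X_train - mean) @ W, (X_test - mean) @ W

# Synthetic stand-in data (assumption): 30 training and 10 test samples, 500 features.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(30, 500))
X_test = rng.normal(size=(10, 500))

Z_train, Z_test = pca_reduce(X_train, X_test, n_components=5)
print(Z_train.shape, Z_test.shape)
```

The test samples are centered with the training mean so that both sets live in the same reduced coordinate system before any classifier is applied.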

Abstract

The invention provides a tumor gene identification method based on gene expression profiles. With the assistance of a computer, the gene expression matrices of gene samples are analyzed by a locality preserving projection method and a kernel-function nearest neighbor classification method. The gene expression matrices are projected into a low-dimensional embedding space to reveal the low-dimensional manifold structure hidden in the high-dimensional gene expression profile data, and the resulting low-dimensional feature matrices are classified by the kernel-function nearest neighbor classification method, so that characteristics not apparent in the low-dimensional feature matrices are brought out, the tumor genes among the gene samples are distinguished, and the tumor genes are identified. The method has a high recognition rate, offers good reference value for the clinical diagnosis of tumor genes, and can be applied to building a tumor gene identification system.
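The two stages named in the abstract can be illustrated with a short sketch. Several choices below are assumptions not taken from the source: a PCA pre-step to keep the eigenproblem well-posed, a k-nearest-neighbor graph with heat-kernel weights, a Gaussian kernel, a 1-nearest-neighbor rule, and all parameter values. The names `lpp_fit`, `gaussian_kernel` and `kernel_nn_predict` are hypothetical, and the data is synthetic.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between rows of A and rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(sq, 0.0)

def lpp_fit(X, n_components=2, n_neighbors=5):
    """Return (mean, projection); the embedding is (x - mean) @ projection."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA pre-step (assumption): drop null directions so Z^T D Z is invertible
    # even when features outnumber samples.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[s > 1e-8 * s[0]].T
    Z = Xc @ P
    # k-nearest-neighbor graph with heat-kernel weights (assumed graph choice).
    n = Z.shape[0]
    sq = pairwise_sq_dists(Z, Z)
    t = sq.mean()
    masked = sq.copy()
    np.fill_diagonal(masked, np.inf)  # a sample is not its own neighbor
    nbrs = np.argsort(masked, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    W = np.zeros((n, n))
    W[rows, nbrs.ravel()] = np.exp(-sq[rows, nbrs.ravel()] / t)
    W = np.maximum(W, W.T)  # symmetrize
    D = np.diag(W.sum(axis=1))
    L = D - W
    # Generalized eigenproblem (Z^T L Z) a = lambda (Z^T D Z) a, solved by
    # whitening with the Cholesky factor; the smallest eigenvalues are kept.
    A = Z.T @ L @ Z
    B = Z.T @ D @ Z
    C = np.linalg.cholesky(B)
    M = np.linalg.solve(C, np.linalg.solve(C, A).T).T
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)
    a = np.linalg.solve(C.T, vecs[:, :n_components])
    return mean, P @ a

def gaussian_kernel(A, B, sigma=1.0):
    return np.exp(-pairwise_sq_dists(A, B) / (2.0 * sigma**2))

def kernel_nn_predict(Y_train, labels, Y_test, kernel=gaussian_kernel):
    """1-NN with the feature-space distance k(x,x) - 2 k(x,y) + k(y,y)."""
    k_test = np.array([kernel(v[None, :], v[None, :])[0, 0] for v in Y_test])
    k_train = np.array([kernel(v[None, :], v[None, :])[0, 0] for v in Y_train])
    d2 = k_test[:, None] + k_train[None, :] - 2.0 * kernel(Y_test, Y_train)
    return labels[np.argmin(d2, axis=1)]

# Synthetic two-class data (assumption): 200 features, class means at -1 and +1.
rng = np.random.default_rng(2)
y_train = np.repeat([0, 1], 20)
y_test = np.repeat([0, 1], 10)
X_train = rng.normal(size=(40, 200)) + (2 * y_train[:, None] - 1)
X_test = rng.normal(size=(20, 200)) + (2 * y_test[:, None] - 1)

mean, proj = lpp_fit(X_train, n_components=2, n_neighbors=5)
Y_train = (X_train - mean) @ proj
Y_test = (X_test - mean) @ proj
pred = kernel_nn_predict(Y_train, y_train, Y_test)
print("toy accuracy:", float(np.mean(pred == y_test)))
```

With a Gaussian kernel the feature-space distance is a monotone function of the Euclidean distance, so the nearest neighbor is the same one a plain Euclidean rule would pick; the `kernel` argument is left open so that another kernel can be substituted.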

Description

Technical field

[0001] The invention relates to the field of computer data processing and gene technology, in particular to a method for identifying tumor genes based on gene expression profiles.

Background art

[0002] Cancer has become one of the major diseases that threaten human life. Early detection and diagnosis of cancer is the key to cancer treatment. Tumor gene analysis at the level of gene expression is an important means of diagnosing cancer in the future, and the judgment and identification of tumor genes at the level of gene expression is the premise and basis of tumor gene analysis, which helps with the early detection and accurate diagnosis of cancer.

[0003] When DNA gene fragments of biological cells are used as gene samples for gene expression analysis, the gene expression profile of the gene sample is usually obtained by gene chip technology, and the gene expression profile is then analyzed and studied. However, because the gene sequence f...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F19/24
Inventor: 黄鸿叶俊勇于攀
Owner: CHONGQING UNIV