Sample clustering and feature recognition method based on integrated non-negative matrix factorization

A non-negative matrix factorization and feature recognition technology, applied in the field of pattern recognition

Active Publication Date: 2020-02-21
QUFU NORMAL UNIV

AI Technical Summary

Problems solved by technology

However, existing methods still have deficiencies. For example, studies have found that real-world data are usually embedded in low-dimensional manifolds within high-dimensional spaces. How to make full use of these low-dimensional features to discover the inherent laws of the observed data and improve the performance of integrated analysis methods still needs further research. In addition, existing algorithms are easily affected by noise and redundant information in multi-omics data; how to make full use of the complementarity and differences of heterogeneous data and improve the robustness of the algorithm also needs further research.



Examples


Embodiment Construction

[0033] With the implementation and completion of large-scale sequencing projects, massive omics data have been generated, which has brought great challenges to researchers' analysis and calculation. Therefore, the development of efficient multi-omics data processing methods has important theoretical significance and application value.

[0034] Because of the limits of experimental conditions, the number of experimental samples is usually only in the dozens to hundreds, whereas sequencing technology can monitor tens of thousands of genes at the same time. Therefore, the primary challenge in analyzing multi-omics data is that the dimensionality of the data features is much higher than the number of samples. In addition, real multi-omics data contain a lot of noise and redundant information, and different types of data from different platforms need to be processed simultaneously, such as count data from sequencing, continuous data from microarrays, and binary data from genetic variation. These are all issues t...



Abstract

The invention discloses a sample clustering and feature recognition method based on integrated non-negative matrix factorization. The method comprises: (1) letting X = {X_1, X_2, ..., X_P} represent multi-view data composed of P different omics data matrices of the same cancer; (2) constructing a diagonal matrix Q; (3) introducing graph regularization and sparse constraints into the integrated non-negative matrix factorization framework to obtain objective functions O_1 and O_2; (4) solving the objective function O_1 to obtain a fusion feature matrix W and coefficient matrices H_I, and solving the objective function O_2 to obtain feature matrices W_I and a fusion sample matrix H; (5) constructing an evaluation vector from the fusion feature matrix W, and identifying common differential features according to that vector; (6) giving a functional interpretation of the identified common differential features using GeneCards; and (7) performing sample clustering analysis according to the fusion sample matrix H. The method can make full use of the complementary and differential information in multiple omics data to identify common differential features, can perform clustering analysis on the sample data provided by the multiple omics data, and provides a computational basis for the integrated study of different types of omics data.
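The factorization and clustering steps in the abstract can be illustrated with a minimal sketch. It is not the patented algorithm: the abstract does not give the objective functions, so the code below assumes a plain joint NMF with standard multiplicative updates, in which every view is factored into its own feature matrix and one shared sample matrix `H`. It leaves out the diagonal matrix Q, the graph regularization term, and the sparse constraint. The function names `joint_nmf` and `cluster_samples`, and the rule of assigning each sample to its largest coefficient, are hypothetical choices made for illustration.

```python
import numpy as np


def joint_nmf(views, k, n_iter=200, eps=1e-9, seed=0):
    """Factor each view X_i (features x samples) as W_i @ H with a shared H.

    Assumption: plain squared-error joint NMF; no Q matrix, graph
    regularization, or sparsity term from the patent is included.
    """
    rng = np.random.default_rng(seed)
    n_samples = views[0].shape[1]
    Ws = [rng.random((X.shape[0], k)) for X in views]
    H = rng.random((k, n_samples))
    losses = []
    for _ in range(n_iter):
        # Update each view-specific feature matrix with H held fixed.
        for i, X in enumerate(views):
            Ws[i] *= (X @ H.T) / (Ws[i] @ (H @ H.T) + eps)
        # Update the shared sample matrix using contributions from all views.
        numer = sum(W.T @ X for W, X in zip(Ws, views))
        denom = sum(W.T @ W for W in Ws) @ H + eps
        H *= numer / denom
        # Record the total reconstruction error after this iteration.
        losses.append(
            sum(np.linalg.norm(X - W @ H) ** 2 for W, X in zip(Ws, views))
        )
    return Ws, H, losses


def cluster_samples(H):
    """Assign each sample (column of H) to its largest latent component."""
    return np.argmax(H, axis=0)


if __name__ == "__main__":
    # Two synthetic non-negative "omics" views measured on the same 12 samples.
    rng = np.random.default_rng(1)
    views = [rng.random((30, 12)), rng.random((20, 12))]
    Ws, H, losses = joint_nmf(views, k=3, n_iter=50)
    print(cluster_samples(H))
```

The variant that yields a single fusion feature matrix with per-view coefficient matrices is symmetric: the views then share the feature-side factor, and each view keeps its own sample-side factor.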

Description

Technical field

[0001] The invention discloses a sample clustering and feature recognition method based on integrated non-negative matrix factorization. It belongs to the field of pattern recognition technology, can integrate and analyze multi-omics data, and provides a methodological basis for the integration of different types of heterogeneous data.

Background technique

[0002] With the development of sequencing technology, bioinformatics is faced with a variety of omics big data analysis tasks. The emergence of massive omics data provides bioinformatics researchers with rich data sources, enabling them to conduct research at different biological levels. Only by effectively processing, analyzing and mining these data can their value be fully realized. Previous studies mostly focused on the analysis of single omics data (such as gene expression profiles), and seldom considered the correlation and differences between different o...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/2321, G06F18/2136, G06F18/253, Y02A90/10
Inventors: 代凌云, 刘金星
Owner: QUFU NORMAL UNIV