
A parallel classification method of RNA sequences based on non-negative matrix factorization

A classification technique based on non-negative matrix factorization, applied to the parallel classification of RNA sequences. It addresses the problem that bioinformatics tools lag behind experimental techniques, and has the effects of improving classification accuracy, improving running efficiency, and shortening the required time.

Active Publication Date: 2022-07-05
GUANGXI UNIV

AI Technical Summary

Problems solved by technology

[0002] Bioinformatics tools for analyzing single-cell RNA-seq data still lag behind experimental techniques.



Examples


Embodiment Construction

[0036] The specific steps of the parallel RNA sequence classification method based on non-negative matrix factorization of the present invention are as follows:

[0037] 1) Convert the RNA data into matrix form:

[0038] Most somatic mutations fall into one of the following categories: single base substitutions, insertions and deletions, rearrangements, and copy number variations (CNVs). Each single base substitution is one of six possible base changes, namely C:G>A:T, C:G>G:C, C:G>T:A, T:A>A:T, T:A>C:G and T:A>G:C. This set can be expanded by including the 5' and 3' bases adjacent to each substitution site, which yields an alphabet A of 96 trinucleotide mutation types. Once A is defined, the counts of mutations found in G different genomes are assembled into a K×G matrix M, where K = |A| is the number of mutation types. A key assumption is that the counts in M are the additive effect of N mutational processes, each defined as a K×1 vector of mutation rates. Each such vector defines a so-called mutation signature. More precisely, muta...
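The construction described in [0038] can be sketched in a few lines of Python. This is an illustrative sketch only, not the patent's implementation: the label format `A[C>A]A`, the function names `mutation_alphabet` and `count_matrix`, and the toy input are all assumptions introduced here. Each substitution class is written from the pyrimidine base of the pair (so C:G>A:T becomes `C>A`), and the 4 × 4 choices of 5' and 3' neighbours give 6 × 16 = 96 types.

```python
from itertools import product

import numpy as np

BASES = "ACGT"
# The six substitution classes, written from the pyrimidine (C or T) of the pair.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def mutation_alphabet():
    """Return the 96 trinucleotide mutation types, e.g. 'A[C>A]A' (label format is assumed)."""
    return [f"{five}[{sub}]{three}"
            for sub in SUBSTITUTIONS
            for five, three in product(BASES, repeat=2)]


def count_matrix(genomes, alphabet):
    """Assemble the K x G count matrix M from per-genome lists of mutation types."""
    index = {mutation: row for row, mutation in enumerate(alphabet)}
    M = np.zeros((len(alphabet), len(genomes)), dtype=np.int64)
    for col, mutations in enumerate(genomes):
        for mutation in mutations:
            M[index[mutation], col] += 1
    return M


alphabet = mutation_alphabet()
# Toy input, made up for illustration: two genomes with a few observed mutations.
toy_genomes = [["A[C>A]A", "A[C>A]A", "T[T>G]G"], ["C[C>T]G"]]
M = count_matrix(toy_genomes, alphabet)
print(len(alphabet), M.shape, M.sum())
```

Here K is 96 and G is 2, so `M` has one row per mutation type and one column per genome, which is the shape the factorization step expects.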



Abstract

The invention discloses a parallel classification method for RNA sequences based on non-negative matrix factorization. The RNA data is first converted into matrix form. Bayesian coefficients are then computed from the original data matrix for different values of K, and these coefficients constrain the choice of K in the non-negative matrix factorization. Finally, the RNA sequences are classified in parallel using non-negative matrix factorization. The method improves the classification accuracy for RNA sequences, and its use of parallel techniques improves the running efficiency of the classification work.
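The pipeline in the abstract (factorize for several values of K, score each K, classify with the chosen factorization) might look roughly like the sketch below. Several parts are assumptions rather than the patent's method: the visible text does not define the "Bayesian coefficients", so a BIC-style score stands in for them; the factorization uses the standard multiplicative update rules for the Frobenius objective; a thread pool stands in for whatever parallel mechanism the invention uses; and the names `nmf`, `score_k`, and `classify` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def nmf(M, k, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative M as W @ H using multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((M.shape[0], k)) + eps
    H = rng.random((k, M.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ M) / (W.T @ W @ H + eps)
        W *= (M @ H.T) / (W @ H @ H.T + eps)
    return W, H


def score_k(M, k):
    """Assumed BIC-style score for rank k (lower is better); not the patent's coefficient."""
    W, H = nmf(M, k)
    n = M.size
    rss = float(np.sum((M - W @ H) ** 2))
    n_params = k * (M.shape[0] + M.shape[1])
    score = n * np.log(rss / n + 1e-12) + n_params * np.log(n)
    return k, score, H


def classify(M, candidate_ks):
    """Score each candidate K concurrently, then label columns by their dominant factor."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda k: score_k(M, k), candidate_ks))
    best_k, _, H = min(results, key=lambda r: r[1])
    return best_k, H.argmax(axis=0)


# Toy expression matrix, made up for illustration: 20 genes x 12 cells, two blocks.
toy = np.full((20, 12), 0.1)
toy[:10, :6] = 5.0
toy[10:, 6:] = 5.0
best_k, labels = classify(toy, [1, 2, 3])
print(best_k, labels)
```

Each column of `H` holds one sample's loadings on the K factors, so taking the largest loading per column is one simple way to turn the factorization into class labels. Because the runs for different K are independent, they can be dispatched concurrently without any coordination beyond collecting the scores.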

Description

Technical field

[0001] The invention belongs to the technical field of bioinformatics, and in particular relates to a parallel classification method for RNA sequences based on non-negative matrix factorization.

Background

[0002] Bioinformatics tools for analyzing single-cell RNA-seq data still lag behind experimental techniques. In recent years, various methods have been developed to detect subpopulations (or subclasses) within a set of cells using single-cell RNA-seq data. These new computational tools demonstrate the importance of understanding heterogeneity in single-cell RNA-seq data. Furthermore, once the subpopulations are identified, it is critical to find the characteristic gene expression signature of each subpopulation (subclass) in order to reveal the underlying biological mechanisms.

[0003] Non-negative Matrix Factorization (NMF), as an effective data dimensionality reduction algorithm, has attracted widespread attention because of its concise i...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G16B30/00, G16B20/10, G16B40/00, G06K9/62
Inventors: 杨晓凯, 钟诚, 黄毅然
Owner: GUANGXI UNIV