Single-cell sequencing data dimension reduction method fusing gene ontology and neural network

A single-cell sequencing and gene ontology technology, applied in character and pattern recognition, instruments, biostatistics, etc. It addresses the problems of mediocre dimensionality reduction results, insufficient use of bioinformatics knowledge, and weak interpretability, and it achieves accelerated training, fast and effective dimensionality reduction, and an improved dimensionality reduction effect.

Active Publication Date: 2020-08-21
NORTHWESTERN POLYTECHNICAL UNIV

AI Technical Summary

Problems solved by technology

[0005] Although the above method achieves some success in reducing the dimensionality of single-cell sequencing data, it has disadvantages: on the one hand, it does not make full use of existing bioinformatics knowledge, and its interpretability is weak. Dimensionality reduction a


Examples


Embodiment Construction

[0029] The present invention is further described below with reference to the accompanying drawings and embodiments. The present invention includes, but is not limited to, the following embodiments.

[0030] Single-cell sequencing data can be viewed as a matrix whose horizontal and vertical axes correspond to cells and genes, respectively. Each entry in the matrix represents the expression level of a gene in a given cell and is generally a real number. As shown in Figure 1, the present invention provides a single-cell sequencing data dimensionality reduction method that integrates gene ontology and a neural network. Its basic implementation process is as follows:

[0031] 1. Data preprocessing

[0032] The raw single-cell sequencing data are generally natural numbers (counts) and are preprocessed as follows.

[0033] (1) Delete genes that are expressed in fewer than 3 cells in the single-cell sequencing data (the gene expression value is 0, 1, 2... whe...
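
Step (1) can be sketched in Python with NumPy. This is a minimal illustration, not the patent's implementation: it assumes the matrix is stored as cells by genes, that a gene counts as "expressed" in a cell when its count is greater than zero, and the function name `filter_rare_genes` and the toy matrix are hypothetical.

```python
import numpy as np

def filter_rare_genes(counts, min_cells=3):
    """Drop genes (columns) expressed in fewer than `min_cells` cells.

    Assumptions (not stated in the source): `counts` is a cells x genes
    array of non-negative counts, and "expressed" means count > 0.
    """
    counts = np.asarray(counts)
    # For each gene, count the cells in which it has a non-zero value.
    cells_per_gene = (counts > 0).sum(axis=0)
    keep = cells_per_gene >= min_cells
    return counts[:, keep], keep

# Toy matrix: 4 cells (rows) x 4 genes (columns).
counts = np.array([
    [0, 2, 1, 0],
    [0, 1, 3, 0],
    [5, 4, 0, 0],
    [0, 1, 2, 1],
])
filtered, keep = filter_rare_genes(counts)
```

Returning the boolean `keep` mask alongside the filtered matrix makes it possible to apply the same gene selection to gene names or annotations later.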


Abstract

The invention provides a single-cell sequencing data dimension reduction method fusing gene ontology and a neural network. The method comprises the following steps: first, gene ontology terms are extracted as deep prior knowledge of biological information; second, Mut-Link constraints among cells are extracted as prior knowledge at the cell level; then, the two kinds of prior knowledge are combined with an auto-encoder model to form the proposed simGOAE model; finally, the simGOAE model is trained on single-cell sequencing data to reduce its dimensionality. The simGOAE model not only adapts to training on large sample data sets, but also mines the biological information of the cells more effectively and achieves a better dimension reduction effect on single-cell sequencing data.
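
The text available here does not say how the gene ontology terms are wired into the auto-encoder. One common way to inject such a prior, shown below purely as an assumption, is to mask the first encoder layer so that each hidden unit stands for one GO term and receives input only from genes annotated to that term. The annotation mask, the layer sizes, and the function name `go_masked_encode` are all hypothetical and are not taken from the simGOAE model.

```python
import numpy as np

rng = np.random.default_rng(0)

def go_masked_encode(x, weights, mask, bias):
    """One encoder layer whose gene-to-term weights are zeroed outside `mask`.

    x:       cells x genes expression matrix
    weights: genes x terms weight matrix
    mask:    genes x terms binary matrix, 1 where a gene is annotated to a term
    """
    return np.tanh(x @ (weights * mask) + bias)

n_genes, n_terms = 5, 2
# Hypothetical annotation: genes 0-1 belong to term 0, genes 2-3 to term 1,
# and gene 4 has no annotation, so it cannot influence either hidden unit.
mask = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [0, 0]], dtype=float)
weights = rng.normal(size=(n_genes, n_terms))
bias = np.zeros(n_terms)

x = rng.random((3, n_genes))                   # 3 cells, 5 genes
z = go_masked_encode(x, weights, mask, bias)   # 3 cells, 2 term activations
```

Because each hidden unit is tied to a named term, its activation can be read as a per-cell score for that term, which is one way a prior of this kind can make the reduced representation easier to interpret.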

Description

Technical field

[0001] The invention belongs to the technical field of biological information processing, and in particular relates to a dimensionality reduction method for single-cell sequencing data that fuses gene ontology and a neural network.

Background

[0002] With the development of high-throughput sequencing technology, scientists have developed single-cell sequencing technology and applied it widely in transcriptomics research. This technique determines the sequence information of individual cells, providing higher resolution of differences between cells and enabling a better understanding of the function of individual cells in their microenvironment. The emergence of single-cell sequencing technology makes it possible to further study cell function and differential expression between cells in the field of bioinformatics. One of the most important applications of single-cell sequencing data (scRNA-seq) is cell clustering, and the clustering results help identify new...


Application Information

IPC(8): G16B40/00, G06K9/62
CPC: G16B40/00, G06F18/23213, G06F18/2135
Inventors: 彭佳杰, 王晓昱, 王余贤, 尚学群
Owner: NORTHWESTERN POLYTECHNICAL UNIV