Method for filling missing values in single-cell RNA sequences based on a generative adversarial network

A missing-value filling technique for single-cell sequencing data, applied in the field of bioinformatics. It addresses the low accuracy of existing missing-data filling methods and improves filling accuracy.

Pending Publication Date: 2022-03-15
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to solve the low accuracy of missing-data filling by existing methods, by proposing a method for filling missing values of single-cell RNA sequences based on a generative adversarial network.

Examples

Specific Embodiment 1

[0042] Embodiment 1: This embodiment is described with reference to Figure 1. The method for filling missing values of single-cell RNA sequences based on a generative adversarial network described in this embodiment comprises the following steps:

[0043] Step 1: Construct a training set based on real RNA sequence data;

[0044] Step 2: Construct a generative adversarial network comprising a generator and a discriminator, wherein the generator is an autoencoder composed of an encoding module and a decoding module;

[0045] Train the constructed generative adversarial network on the training set;

[0046] Step 3: Perform TPM normalization on the RNA sequence data to be filled, preprocess the TPM-normalized result (i.e., gene selection and logarithmic transformation), and pass the preprocessing result to the trained generative adversarial network, which generates the RNA-seq...
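
The network structure named in Step 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the layer sizes, the weight initialization, the single hidden layer, and the scalar "real vs. generated" discriminator output are all assumptions, and training is omitted. The only details taken from the text are that the generator is an autoencoder with an encoding and a decoding module, and (from the abstract) that the decoding layer uses a ReLU activation so that filled values are never negative.

```python
import math
import random


def relu(vector):
    return [max(0.0, x) for x in vector]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vector):
    """Dense layer: `weights` holds one row of input weights per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    # Small random weights and zero biases (an arbitrary choice for this sketch).
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class Generator:
    """Autoencoder: an encoding module followed by a decoding module."""

    def __init__(self, n_genes, n_hidden, rng):
        self.enc = init_layer(n_genes, n_hidden, rng)
        self.dec = init_layer(n_hidden, n_genes, rng)

    def forward(self, cell):
        code = relu(linear(*self.enc, cell))
        # ReLU on the decoding layer keeps every filled value non-negative.
        return relu(linear(*self.dec, code))


class Discriminator:
    """Hypothetical discriminator: scores how 'real' an expression vector looks."""

    def __init__(self, n_genes, n_hidden, rng):
        self.hidden = init_layer(n_genes, n_hidden, rng)
        self.out = init_layer(n_hidden, 1, rng)

    def forward(self, cell):
        h = relu(linear(*self.hidden, cell))
        return sigmoid(linear(*self.out, h)[0])


rng = random.Random(0)
gen = Generator(n_genes=6, n_hidden=3, rng=rng)
disc = Discriminator(n_genes=6, n_hidden=3, rng=rng)

cell = [0.0, 2.3, 0.0, 1.1, 4.0, 0.0]  # one cell's preprocessed expression values
filled = gen.forward(cell)             # generator output, same length as the input
score = disc.forward(filled)           # a value strictly between 0 and 1
```

With untrained weights the output is meaningless; the sketch only shows how data flows through the two modules of the generator and into the discriminator.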

Specific Embodiment 2

[0047] Embodiment 2: This embodiment differs from Embodiment 1 in the specific process of Step 1, which is as follows:

[0048] Step 11: Obtain RNA sequence data from the Usoskin data set as the training set;

[0049] The Usoskin data set was published and shared in Reference 1 (Usoskin, D. et al. Unbiased classification of sensory neuron types by large-scale single-cell RNA sequencing. Nat. Neurosci. 18, 145 (2015)).

[0050] Step 12: For each piece of RNA sequence data obtained, generate RNA sequence data with missing values according to the set missing-value parameters, and label the generated data;

[0051] The label indicates whether each gene locus is a missing value. If it is a missing value, the autoencoder learns towards the target data, so that the prediction for the missing data is as close as possible to the target data, thereby obtaining the trained autoencoder parameters; ...
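
Step 12 can be illustrated with a short sketch. It assumes that the "set missing parameters" reduce to a single missing rate and that missing values are simulated by zeroing randomly chosen non-zero entries; the text does not specify either, so `simulate_dropout` and `missing_rate` are hypothetical names used only for illustration.

```python
import random


def simulate_dropout(expression, missing_rate, rng):
    """Zero out a random subset of the non-zero entries of one cell.

    Returns (corrupted, label), where label[i] is 1 if entry i was
    artificially removed and 0 otherwise. The original vector stays
    untouched and serves as the target data during training.
    """
    corrupted = list(expression)
    label = [0] * len(expression)
    for i, value in enumerate(expression):
        if value > 0 and rng.random() < missing_rate:
            corrupted[i] = 0.0
            label[i] = 1
    return corrupted, label


rng = random.Random(42)
target = [0.0, 3.2, 1.5, 0.0, 7.8, 2.1]
corrupted, label = simulate_dropout(target, missing_rate=0.3, rng=rng)
```

Because the label records exactly which positions were removed, a training loss can be restricted to those positions, which matches the description of the autoencoder learning towards the target data only where a value is missing.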

Specific Embodiment 3

[0056] Embodiment 3: This embodiment differs from Embodiment 1 or Embodiment 2 in that the preprocessing consists of gene selection and logarithmic transformation.

[0057] Other steps and parameters are the same as those in Embodiment 1 or Embodiment 2.
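
The preprocessing chain described in Step 3 and in this embodiment (TPM normalization, then gene selection, then logarithmic transformation) can be sketched as follows. The TPM formula is the standard one; everything else is an assumption made for illustration: the input is taken to be raw read counts with known gene lengths, genes are kept if they are expressed in at least `min_cells` cells, and the transformation is the natural logarithm of the value plus one. The text does not state the selection criterion or the logarithm base.

```python
import math


def tpm_normalize(counts, gene_lengths_kb):
    """Convert one cell's raw read counts to transcripts per million (TPM)."""
    if len(counts) != len(gene_lengths_kb):
        raise ValueError("counts and gene_lengths_kb must have the same length")
    rates = [c / length for c, length in zip(counts, gene_lengths_kb)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(counts)
    return [rate / total * 1e6 for rate in rates]


def select_genes(matrix, min_cells):
    """Return indices of genes expressed (> 0) in at least `min_cells` cells.

    `matrix` has one row per cell and one column per gene.
    """
    n_genes = len(matrix[0])
    return [g for g in range(n_genes)
            if sum(1 for cell in matrix if cell[g] > 0) >= min_cells]


def preprocess(count_matrix, gene_lengths_kb, min_cells=2):
    """TPM-normalize each cell, keep selected genes, then apply log(x + 1)."""
    tpm = [tpm_normalize(cell, gene_lengths_kb) for cell in count_matrix]
    kept = select_genes(tpm, min_cells)
    logged = [[math.log1p(cell[g]) for g in kept] for cell in tpm]
    return logged, kept


count_matrix = [
    [100, 0, 300],  # cell 0
    [0, 0, 60],     # cell 1
    [40, 10, 0],    # cell 2
]
gene_lengths_kb = [2.0, 1.0, 3.0]
logged, kept = preprocess(count_matrix, gene_lengths_kb, min_cells=2)
# Gene 1 is expressed in only one cell, so it is dropped: kept == [0, 2]
```

Adding one before taking the logarithm keeps zero entries at zero, so true zeros and dropout zeros remain distinguishable from expressed genes after the transformation.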

Abstract

The invention discloses a method for filling missing values in single-cell RNA sequences based on a generative adversarial network, and belongs to the field of bioinformatics. The method addresses the low accuracy of existing methods for filling missing data. Missing values of the RNA sequence are filled by a generative adversarial network: in the generator, data filled by DrImpute is introduced as a direction constraint, and a direction constraint term is added to the loss function while an autoencoder generates the missing values, thereby obtaining the missing values of the RNA sequence. A ReLU activation function is applied to the decoding layer so that the filled data contains no negative values. Experiments show that the method significantly improves data filling accuracy. The method can be used to fill missing values in RNA sequences.
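
One way to read the "direction constraint term" is sketched below. The abstract does not give the loss formula, so every detail here is an assumption: the squared-error form of both the reconstruction and direction terms, the non-saturating adversarial term, the restriction to missing positions, and the weights `alpha` and `beta` are hypothetical choices for illustration only.

```python
import math


def generator_loss(predicted, target, label, drimpute_filled, disc_score,
                   alpha=1.0, beta=1.0):
    """Hypothetical generator loss with a direction constraint term.

    predicted       -- decoder output for one cell (non-negative after ReLU)
    target          -- the complete expression values used as training target
    label           -- 1 where the input value is missing, 0 elsewhere
    drimpute_filled -- the same cell as filled by DrImpute
    disc_score      -- discriminator's probability that `predicted` is real
    """
    missing = [i for i, flag in enumerate(label) if flag == 1]
    if missing:
        # Pull predictions at missing positions towards the target data.
        recon = sum((predicted[i] - target[i]) ** 2 for i in missing) / len(missing)
        # Direction constraint: also stay close to the DrImpute-filled values.
        direction = sum((predicted[i] - drimpute_filled[i]) ** 2
                        for i in missing) / len(missing)
    else:
        recon = direction = 0.0
    # Adversarial term: small when the discriminator is fooled.
    adversarial = -math.log(max(disc_score, 1e-12))
    return adversarial + alpha * recon + beta * direction


loss = generator_loss(
    predicted=[1.0, 2.0],
    target=[1.0, 4.0],
    label=[0, 1],
    drimpute_filled=[1.0, 3.0],
    disc_score=0.5,
)
```

In this reading, the DrImpute output acts as a second reference point for each missing entry, so the generator is steered towards a plausible region even when the adversarial signal is weak.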

Description

Technical field

[0001] The invention belongs to the field of bioinformatics, and in particular relates to a method for filling missing values of single-cell RNA sequences based on a generative adversarial network.

Background

[0002] With the development of high-throughput sequencing technology, single-cell RNA sequencing (scRNA-seq) has become a hot topic in genomics in recent years. Compared with earlier bulk RNA sequencing (bulk RNA-seq) techniques, scRNA-seq has a relatively high noise level, largely due to so-called dropout events. The zeros observed in the gene-cell expression matrix of an scRNA-seq data set consist of true zeros and missing-value zeros. True zeros result from a gene not being expressed, while missing-value zeros are caused by dropout events. Dropout events are a special type of missing value that arise from the low RNA input in sequencing experiments and the randomness of gene expression patter...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G16B30/00, G16B40/00, G16B45/00, G06K9/62
CPC: G16B30/00, G16B40/00, G16B45/00, G06F18/213, G06F18/23213
Inventor: 徐丽薛同许寅丛晓红江粤张新玉
Owner: HARBIN ENG UNIV