Single cell transcriptome missing value filling method based on deep hybrid network

A deep hybrid network method for filling missing values in single-cell transcriptome data. It addresses three problems of existing methods: heavy use of computing resources, unreliable interpretation of the filled data, and the lack of a method that works across all single-cell transcriptome data. The stated benefits are improved reliability, reduced resource usage, and guaranteed versatility.

Active Publication Date: 2020-04-03

ZHONGSHAN OPHTHALMIC CENT SUN YAT SEN UNIV

Cites: 6 | Cited by: 10

Problems solved by technology

[0005] Existing single-cell transcriptome missing value filling methods cannot be applied universally to all single-cell transcriptome data, consume huge computing resources, and leave the interpretation of the filled data unreliable. To overcome these technical defects, a single-cell transcriptome missing value filling method based on a deep hybrid network is provided.

Examples

Embodiment 1

[0055] As shown in Figure 1, a single-cell transcriptome missing value filling method based on a deep hybrid network includes the following steps:

[0056] S1: Preprocess the single-cell sequencing data to obtain the expression matrix;

[0057] S2: Standardize the expression matrix to obtain the initial expression matrix;

[0058] S3: Build a hybrid model based on deep learning, consisting of two parts: an autoencoder and a recurrent neural network;

[0059] S4: Input the initial expression matrix into the autoencoder for dimensionality reduction processing to obtain a dimensionality-reduced feature matrix and a reconstructed expression matrix;

[0060] S5: Input the dimensionality-reduced feature matrix into the recurrent neural network, predict the expression values of all genes, and obtain the corresponding predicted expression matrix;

[0061] S6: Use the predicted expression matrix obtained in step S5 as the input of the autoencoder, and repeat steps S4 and S5 until the p...
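The loop formed by steps S4 to S6 can be sketched as follows. This is only an illustration of the data flow, not the patented model: a truncated SVD stands in for the autoencoder, a least-squares map stands in for the recurrent neural network, the matrix is assumed to have cells as rows, and a fixed number of cycles replaces the stopping condition, which is cut off in this excerpt. The names `encode_decode`, `predict_expression`, and `run_cycles` are hypothetical.

```python
import numpy as np

def encode_decode(x, k):
    """Stand-in for the autoencoder (assumption): a rank-k SVD yields a
    reduced feature matrix and a reconstructed expression matrix."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    features = u[:, :k] * s[:k]          # cells x k reduced features
    reconstructed = features @ vt[:k]    # cells x genes reconstruction
    return features, reconstructed

def predict_expression(features, target):
    """Stand-in for the recurrent network (assumption): a least-squares
    map from reduced features to the expression values of all genes."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return np.clip(features @ coef, 0.0, None)  # keep expression non-negative

def run_cycles(initial, k=2, n_cycles=3):
    """Repeat S4 and S5, feeding each prediction back in (S6)."""
    current = np.asarray(initial, dtype=float)
    predictions = []
    for _ in range(n_cycles):            # fixed cycle count is an assumption
        features, _reconstructed = encode_decode(current, k)   # S4
        predicted = predict_expression(features, initial)      # S5
        predictions.append(predicted)
        current = predicted                                    # S6
    return predictions
```

Each cycle yields one predicted expression matrix, so the function returns a list with one entry per cycle.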

Embodiment 2

[0065] More specifically, as shown in Figure 2, step S1 includes the following steps:

[0066] S11: Use an existing library construction method to prepare the cells, then sequence them to obtain sequence data in a file format such as FASTQ;

[0067] S12: Use mapping software, such as TopHat2, to map the sequence data;

[0068] S13: Use data splitting software, such as UMI-tools, to divide the mapped sequence data by cell to obtain per-cell sequence data;

[0069] S14: Use quantification software, such as featureCounts, to quantify the mapped and divided results to obtain a gene × cell expression matrix.
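The shape of the result of step S14 can be illustrated with a small sketch. It assumes the quantification output has already been parsed into `(gene, cell, count)` records; that record format and the function name `build_expression_matrix` are hypothetical and are not the output format of any of the tools named above.

```python
import numpy as np

def build_expression_matrix(records):
    """Assemble a gene x cell count matrix from (gene, cell, count) records."""
    genes = sorted({gene for gene, _, _ in records})
    cells = sorted({cell for _, cell, _ in records})
    gene_idx = {gene: i for i, gene in enumerate(genes)}
    cell_idx = {cell: j for j, cell in enumerate(cells)}
    matrix = np.zeros((len(genes), len(cells)), dtype=np.int64)
    for gene, cell, count in records:
        # Repeated (gene, cell) pairs are summed; missing pairs stay zero.
        matrix[gene_idx[gene], cell_idx[cell]] += count
    return genes, cells, matrix
```

Entries that never appear in the records remain zero, which is where the missing values addressed by the method show up.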

[0070] More specifically, step S2 is as follows:

[0071] The expression matrix is normalized by the library size ls of each cell to eliminate the effect of library size. For the gene expression value vector C_c of cell c, the standardization formula is:

[0072]

[0073] Among them, sf represents the ...
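Library-size normalization in general can be sketched as below. The formula and the definition of sf are not reproduced in this excerpt, so this is not the patent's formula: the sketch assumes a gene × cell matrix and, as a common convention, uses the median library size as the scale factor. The function name `normalize_by_library_size` is hypothetical.

```python
import numpy as np

def normalize_by_library_size(expr, scale_factor=None):
    """Divide each cell (column) by its library size, then rescale."""
    expr = np.asarray(expr, dtype=float)
    library_size = expr.sum(axis=0)               # ls: total counts per cell
    if scale_factor is None:
        # Assumption: median library size; the source's sf is not shown.
        scale_factor = np.median(library_size)
    # Avoid division by zero for cells with no counts.
    safe_size = np.where(library_size > 0, library_size, 1.0)
    return expr / safe_size * scale_factor
```

After this step every non-empty cell has the same total, so differences between cells no longer reflect sequencing depth alone.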

Embodiment 3

[0085] More specifically, as shown in Figure 4, when the hybrid model is applied, the single-cell data is fed into the model by non-blocking, multi-process, block-wise random reading. The specific process is:

[0086] Input the storage address of the single-cell data file, which may be of any type that supports matrix access and block-wise reading;

[0087] According to the storage address, read the dimension information of the single-cell transcriptome matrix stored in the file, including the number of cells and the number of genes, and read in the corresponding cell names and gene names;

[0088] Divide all cells, in order, into multiple data clusters and mark each data cluster with a serial number; all cluster serial numbers together form a serial number pool;

[0089] Create a copy of the serial number pool; each time, randomly draw a certain number of cluster serial numbers from the copy without replacement and extract the corresponding data. If the copy data is extracted, a new...
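The cluster-based random reading described above can be sketched as a generator. The sketch covers only the sampling logic and omits the non-blocking multi-process reading. Because the last paragraph is cut off, it assumes that a fresh copy of the serial number pool is created once the current copy is empty. The names `make_clusters` and `sample_batches` are hypothetical.

```python
import random

def make_clusters(n_cells, cluster_size):
    """Split cell indices 0..n_cells-1, in order, into consecutive clusters."""
    return [range(start, min(start + cluster_size, n_cells))
            for start in range(0, n_cells, cluster_size)]

def sample_batches(n_cells, cluster_size, clusters_per_draw, n_draws, seed=0):
    """Yield lists of cell indices drawn cluster by cluster."""
    rng = random.Random(seed)
    clusters = make_clusters(n_cells, cluster_size)
    pool = list(range(len(clusters)))    # serial number pool
    working = []                         # copy that draws are taken from
    for _ in range(n_draws):
        if not working:
            working = pool.copy()        # assumption: refill when exhausted
        k = min(clusters_per_draw, len(working))
        picked = rng.sample(working, k)  # draw without replacement
        for serial in picked:
            working.remove(serial)
        yield [cell for serial in picked for cell in clusters[serial]]
```

With 10 cells, clusters of 3, and 2 clusters per draw, two draws visit every cell exactly once before the pool copy is renewed.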

Abstract

The invention provides a single-cell transcriptome missing value filling method based on a deep hybrid network. The method comprises the steps of: sequencing and preprocessing single cells to obtain an expression matrix, and standardizing it; constructing a hybrid model based on deep learning, and inputting the standardized expression matrix into the hybrid model for cyclic calculation to obtain a plurality of predicted expression matrices; and calculating the weight of each cycle and taking the weighted average of the predicted expression matrices according to the corresponding weights. The result is the filling output of the hybrid model, which completes the filling of missing values. The filling method uses the ability of a deep neural network to fit complex functions to adapt to the expression distribution of single cells, which ensures that the method applies to various kinds of single-cell transcriptome data. It also retains the scalability of deep learning to data sets with very large numbers of cells, completes the filling of single-cell transcriptome missing values, and significantly improves the reliability of single-cell data interpretation.
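The final weighted-average step can be sketched as follows. How the weight of each cycle is calculated is not given in this excerpt, so the weights are taken as an input here. The function name `weighted_fill` is hypothetical.

```python
import numpy as np

def weighted_fill(predictions, weights):
    """Combine per-cycle predicted expression matrices by weighted average."""
    stacked = np.stack(predictions)      # cycles x rows x columns
    # np.average normalizes by the sum of the weights.
    return np.average(stacked, axis=0, weights=np.asarray(weights, dtype=float))
```

For example, two predictions with weights 1 and 3 contribute one quarter and three quarters of the filled value, respectively.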

Description

Technical field

[0001] The present invention relates to the technical field of single-cell transcriptome missing value filling, and more specifically to a single-cell transcriptome missing value filling method based on a deep hybrid network.

Background technique

[0002] Single-cell transcriptome sequencing has become a major method for studying gene expression at the single-cell level, and has been widely used to study important biological questions such as new cell types, cell differentiation, developmental trajectories, and tumor development. The number of captured cells has grown from a few at first to millions today. However, because of the extremely low RNA content of a single cell, the low efficiency of transcript capture, technical noise, and the high cost of sequencing large numbers of cells, the low sequencing depth of a single cell rarely covers the transcripts it contains, resulting in a large number of gene expression values ...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G16B40/00

CPC: G16B40/00, Y02A90/10

Inventor: 何尧, 谢志, 袁皓

Owner ZHONGSHAN OPHTHALMIC CENT SUN YAT SEN UNIV
