Single-cell ATAC-seq data analysis method

A single-cell ATAC-seq data analysis method, applied in the fields of sequence analysis, bioinformatics, and informatics, that produces reports with varied display formats, improved readability, and a clear hierarchy.

Active Publication Date: 2019-12-06
GUANGZHOU GENE DENOVO BIOTECH

AI Technical Summary

Problems solved by technology

[0004] However, there is currently no unified standard for analyzing single-cell epigenomic sequencing data. Chinese patent publication CN 107368701A discloses a large-scale single-cell ATAC-seq data quality control and ana...


Embodiment

[0029] The single-cell ATAC-seq data analysis process in this embodiment includes the following steps:

[0030] Step S1: Perform data analysis and quality control on the raw sequencing data to obtain high-quality data for subsequent analysis. Cell Ranger is the main tool used to filter and correct erroneous barcodes in the raw sequencing data. Each barcode sequence is compared with the known barcode sequences in the database. Known barcodes that differ from the observed barcode by ≤ 2 bp are identified and scored according to the abundance of the read barcodes and the quality values of the mismatched bases. Barcodes with a score greater than 90% are considered correct.
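The barcode correction described in step S1 can be sketched roughly as follows. This is a minimal illustration, not Cell Ranger's actual implementation: the function names, the `whitelist_counts` mapping (known barcode to observed read count), and the scoring formula (abundance multiplied by the Phred error probability of each mismatched base, then normalized across candidates) are assumptions chosen to match the description above.

```python
def phred_to_error(q):
    """Convert a Phred quality value to a base-call error probability."""
    return 10 ** (-q / 10)


def correct_barcode(observed, quals, whitelist_counts,
                    max_mismatch=2, min_score=0.9):
    """Return the corrected barcode, or None if no confident match exists.

    observed         -- barcode sequence read from the sequencer
    quals            -- per-base Phred quality values for `observed`
    whitelist_counts -- hypothetical dict: known barcode -> read abundance
    """
    # An exact match to a known barcode needs no correction.
    if observed in whitelist_counts:
        return observed

    candidates = {}
    for known, count in whitelist_counts.items():
        if len(known) != len(observed):
            continue
        mismatched = [i for i, (x, y) in enumerate(zip(observed, known))
                      if x != y]
        if len(mismatched) > max_mismatch:
            continue
        # Assumed score: abundance times the probability that every
        # mismatched base is a sequencing error.
        weight = float(count)
        for i in mismatched:
            weight *= phred_to_error(quals[i])
        candidates[known] = weight

    total = sum(candidates.values())
    if total == 0:
        return None
    best = max(candidates, key=candidates.get)
    # Accept only if the best candidate's normalized score exceeds 90%.
    return best if candidates[best] / total > min_score else None
```

Scanning the whole whitelist per barcode keeps the sketch short. A production tool would more likely enumerate the neighbors of the observed barcode instead.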

[0031] Step S2: Alignment analysis. Use cutadapt to identify the reverse-complement sequence of the primer at the end of each read and remove it from the read, then use BWA-MEM to align the trimmed reads to the reference genome, and then use duplication analysis to determine the unique alignme...
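The trimming idea in step S2 can be illustrated with a small pure-Python sketch. It does not use or reproduce cutadapt (which also tolerates mismatches and indels). The function names, the exact-match strategy, and the `min_overlap` parameter are assumptions made for illustration only.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_primer_rc(read, primer, min_overlap=3):
    """Remove the reverse complement of `primer` from the 3' end of `read`.

    Exact matching only (an assumption of this sketch).
    """
    adapter = reverse_complement(primer)

    # Case 1: the full adapter occurs inside the read; drop it and
    # everything after it.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: only a prefix of the adapter is present at the read's end.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]

    # No adapter found: return the read unchanged.
    return read
```

The trimmed reads would then be passed to an aligner such as BWA-MEM, as the step describes.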

Abstract

The invention provides a single-cell ATAC-seq data analysis method comprising the following steps: S1, performing data analysis and quality control on the raw sequencing data; S2, performing alignment analysis; S3, analyzing insert fragments; S4, performing peak analysis on enriched regions; S5, classifying single-cell subgroups; S6, annotating and performing enrichment analysis on peak-related genes; S7, performing TF-motif analysis; S8, analyzing accessibility differences between subgroups; S9, analyzing the genes related to differentially accessible sites by annotating the gene whose transcription start site is closest to the peak region containing each identified differential TF-motif; and the like. The method builds a comprehensive single-cell ATAC-seq analysis workflow with rich analysis content. The results reveal a large amount of biological information, making it easier to mine the biological phenomena and characteristics present at the single-cell level. The analysis process and results are presented visually as HTML, with a clear hierarchy of analysis content and diverse result displays, which improves the readability of the report.
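As a rough illustration of the nearest-gene annotation mentioned for step S9, the sketch below looks up the transcription start site (TSS) closest to a peak. The function name, the `tss_index` structure (chromosome mapped to a position-sorted list of `(position, gene)` tuples), and the choice to measure distance from the peak center are all assumptions of this sketch. The source does not specify how the distance is computed.

```python
from bisect import bisect_left


def nearest_tss_gene(chrom, peak_start, peak_end, tss_index):
    """Return (gene, distance) for the TSS nearest to the peak center.

    tss_index -- hypothetical dict: chromosome -> list of (position, gene)
                 tuples sorted by position.
    Returns None if the chromosome has no annotated TSS.
    """
    sites = tss_index.get(chrom, [])
    if not sites:
        return None

    center = (peak_start + peak_end) // 2
    # Locate the insertion point of the peak center among the sorted TSSs.
    i = bisect_left(sites, (center,))
    # Only the TSS just before and just at/after the center can be nearest.
    neighbours = sites[max(i - 1, 0):i + 1]
    pos, gene = min(neighbours, key=lambda s: abs(s[0] - center))
    return gene, abs(pos - center)
```

For example, with TSSs at positions 100, 500, and 900 on one chromosome, a peak spanning 420 to 480 has its center at 450 and is assigned to the gene at 500.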

Description

Technical field

[0001] The invention belongs to the technical field of bioinformatics, and in particular relates to the bioinformatic analysis of single-cell ATAC-seq data.

Background

[0002] In most genomes, chromatin is tightly coiled in the nucleus and only a small portion is relatively loose. These nucleosome-free regions of naked DNA are called open chromatin, and DNA replication and gene transcription often occur there. ATAC-seq (Assay for Transposase-Accessible Chromatin with high-throughput sequencing) uses Tn5 transposase to cut open chromatin regions and attach sequencing primers for high-throughput sequencing. Bioinformatic analysis then identifies transcription factor binding sites and nucleosome positions, providing an effective method for studying gene regulation, DNA imprinting, and related topics.

[0003] At present, single-cell technology is developing rapidly, and t...

Application Information

IPC(8): G16B30/00
CPC: G16B30/00
Inventor: 夏昊强, 高川, 周煌凯, 张羽, 陶勇, 罗玥, 陈飞钦, 曾川川
Owner: GUANGZHOU GENE DENOVO BIOTECH