A single-cell ATAC-seq data analysis method

A data analysis technique, applied in sequence analysis, bioinformatics, instruments, and related fields, that achieves rich analysis content, varied presentation forms, and clearly structured levels.

Active Publication Date: 2021-06-11
GUANGZHOU GENE DENOVO BIOTECH

AI Technical Summary

Problems solved by technology

[0004] However, there is currently no unified standard for analyzing single-cell epigenomic sequencing data. Chinese patent publication CN 107368701A discloses a large-scale single-cell ATAC-seq data quality control and analysis method, which includes quality control at the sequencing-fragment and multi-cell levels, quality control at the single-cell level, cell clustering, and detection of cell-specific peaks, and which finally provides users with a quality control report document.



Examples


Embodiment

[0029] The single-cell ATAC-seq data analysis process in this embodiment includes the following steps:

[0030] Step S1: Perform data analysis and quality control on the raw sequencing data to obtain high-quality data for subsequent analysis. Cell Ranger software is mainly used to filter and correct erroneous barcodes in the raw sequencing data. Each barcode sequence is compared with the known barcode sequences in the database to find barcodes that differ from a known barcode by ≤ 2 bp. These candidates are scored according to the abundance of the read barcode and the quality values of the mismatched bases, and barcodes with a score greater than 90% are considered correct.
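The barcode correction described in step S1 can be illustrated with a minimal Python sketch. It is not Cell Ranger's implementation: the function names, the toy whitelist, and the exact scoring formula (whitelist abundance as a prior, multiplied by the Phred-derived error probability of each mismatched base, then normalized across candidates) are assumptions chosen to match the description of "abundance plus quality value of the mismatched bases" with a 90% acceptance threshold.

```python
def phred_to_error(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def mismatch_positions(a, b):
    """Return the positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def correct_barcode(observed, quals, whitelist_counts,
                    max_mismatch=2, min_score=0.9):
    """Return the corrected barcode, or None if no candidate is confident.

    whitelist_counts maps each known barcode to its observed abundance
    (hypothetical input; a real run would derive it from the data).
    """
    if observed in whitelist_counts:
        return observed  # exact match needs no correction

    total = sum(whitelist_counts.values())
    weights = {}
    for known, count in whitelist_counts.items():
        positions = mismatch_positions(observed, known)
        if len(positions) > max_mismatch:
            continue
        # Prior from abundance, likelihood from the quality of mismatched bases.
        weight = count / total
        for i in positions:
            weight *= phred_to_error(quals[i])
        weights[known] = weight

    norm = sum(weights.values())
    if norm == 0:
        return None
    best = max(weights, key=weights.get)
    # Accept only if the best candidate holds more than min_score of the mass.
    return best if weights[best] / norm > min_score else None
```

With a whitelist such as `{"AAAAAAAA": 1000, "CCCCCCCC": 10}`, an observed barcode `AAAAAAAT` has a single candidate within 2 mismatches and is corrected to `AAAAAAAA`. An observed barcode that is equally close to two equally abundant known barcodes scores 50% for each and is rejected.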

[0031] Step S2: Alignment analysis. cutadapt is used to identify the reverse-complement sequence of the primer at the end of each read and remove it from the read sequence; BWA-MEM is then used to align the trimmed reads to the reference genome, and duplication analysis is then used to determine the unique alignme...
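The trimming part of step S2 can be sketched in plain Python to show what "removing the reverse-complement of the primer from the read end" means. This is a simplified stand-in for cutadapt, which also tolerates sequencing errors; the sketch uses exact matching only. The primer sequence in the usage note is an illustrative assumption, and `trim_primer_rc` and `min_overlap` are hypothetical names.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_primer_rc(read, primer, min_overlap=3):
    """Remove the reverse complement of `primer` from the 3' end of `read`.

    Exact matching only: a full occurrence is cut from where it starts,
    and a partial occurrence of at least `min_overlap` bases is cut when
    it sits at the very end of the read.
    """
    adapter = reverse_complement(primer)

    # Case 1: the whole adapter is present; drop it and everything after it.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]

    # Case 2: the read ends partway through the adapter; try longest first.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]

    return read  # no adapter found
```

For example, with the assumed primer `AGATGTGTATAAGAGACAG`, the adapter searched for is `CTGTCTCTTATACACATCT`, so a read `ACGTACGTCTGTC` is trimmed to `ACGTACGT`. The trimmed reads would then be passed to the aligner.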



Abstract

The present invention provides a single-cell ATAC-seq data analysis method comprising the following steps: step S1, data analysis and quality control of the raw sequencing data; step S2, alignment analysis; step S3, insert fragment analysis; step S4, peak analysis of enriched regions; step S5, classification of single-cell subgroups; step S6, annotation and enrichment of peak-related genes; step S7, TF-motif analysis; step S8, differential analysis of subgroup accessibility; step S9, analysis of genes related to differentially accessible sites, in which the gene corresponding to the transcription start site closest to the peak region containing each identified differential TF-motif is annotated. The invention constructs a single-cell ATAC-seq data analysis process that is comprehensive and rich in analysis content. The analysis results reveal a large amount of biological information, making it easier to explore the biological phenomena and characteristics hidden at the single-cell level. The analysis process and results are displayed visually in HTML form, with clearly structured analysis content and results presented in various forms, which increases the readability of the report.
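The nearest-TSS gene annotation mentioned in step S9 can be sketched as follows. The source only states that the gene of the transcription start site closest to the peak region is annotated; measuring distance from the peak midpoint, handling a single chromosome, and the names `nearest_tss_gene` and `tss_sorted` are assumptions made for illustration.

```python
from bisect import bisect_left


def nearest_tss_gene(peak_start, peak_end, tss_sorted):
    """Return (gene, signed_distance) for the TSS closest to a peak.

    tss_sorted is a list of (position, gene) tuples for one chromosome,
    sorted by position (hypothetical input format). The distance is
    TSS position minus peak midpoint, so a negative value means the TSS
    lies upstream of the midpoint in genome coordinates.
    """
    if not tss_sorted:
        return None

    midpoint = (peak_start + peak_end) // 2
    positions = [pos for pos, _ in tss_sorted]

    # Binary search: the closest TSS is either just before or at the
    # insertion point of the midpoint.
    i = bisect_left(positions, midpoint)
    candidates = tss_sorted[max(i - 1, 0): i + 1]

    pos, gene = min(candidates, key=lambda item: abs(item[0] - midpoint))
    return gene, pos - midpoint
```

Given `[(1000, "GENE_A"), (5000, "GENE_B"), (9000, "GENE_C")]`, a peak spanning 4200 to 4400 is assigned to `GENE_B`. A full analysis would repeat this per chromosome and could take strand into account.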

Description

Technical field

[0001] The invention belongs to the technical field of bioinformatics, and in particular relates to the biological analysis of single-cell ATAC-seq data.

Background

[0002] It is well known that the chromatin of most genomes is tightly coiled in the nucleus, and only a small portion is relatively loose. These nucleosome-free regions of naked DNA are called open chromatin, and DNA replication and gene transcription often occur in them. ATAC-seq (Assay for Transposase-Accessible Chromatin with high-throughput sequencing) uses Tn5 transposase to cut open chromatin regions and add sequencing primers for high-throughput sequencing. Bioinformatics analysis then identifies transcription factor binding sites and nucleosome positions, providing an effective method for studying gene regulation, DNA imprinting, and related topics.

[0003] At present, single-cell technology is developing rapidly, and t...


Application Information

Patent Type & Authority Patents (China)
IPC(8) G16B30/00
CPC G16B30/00
Inventor 夏昊强高川周煌凯张羽陶勇罗玥陈飞钦曾川川
Owner GUANGZHOU GENE DENOVO BIOTECH