Duplex-seq-based ultralow-frequency mutation site detection analysis method

A mutation site detection and analysis method applied in the field of second-generation high-throughput sequencing. It addresses problems such as ineffective presentation of data, an insufficiently systematic annotation process and related statistics, and the absence of systematic analysis, with the stated effects of increasing diversity, increasing quantity, and improving analysis efficiency.

Active Publication Date: 2017-04-26
SHANGHAI PASSION BIOTECHNOLOGY CO LTD

AI Technical Summary

Problems solved by technology

[0005] 1. Data quality control: Existing Duplex-seq data analysis pipelines do not systematically assess upstream data quality, such as the data duplication rate; the types, counts, and proportions of UMIs; and the R1/R2 balance.
[0006] 2. Differential analysis of UMIs: Existing Duplex-seq data analysis methods have not reported a comprehensive, independent analysis of single-strand-specific UMIs and double-strand complementary UMIs.
[0007] 3. Variant site annotation process: Variant sites called from Duplex-seq data are low-frequency mutation sites, so the parameters that existing methods use for variant detection can be optimized. The existing annotation process is also not systematic enough.
[0008] 4. Readability of results: Existing Duplex-seq data analysis methods output only a few simple chart files and text reports, and much of the data is not presented effectively.
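
The quality metrics named in item 1 can be illustrated with a minimal Python sketch. It is not the patented pipeline: the function name `qc_summary`, the `(umi, mate, sequence)` record layout, and the reading of "R1/R2 balance" as a simple read-count ratio are all assumptions made for illustration.

```python
from collections import Counter


def qc_summary(records):
    """Summarize basic QC metrics for UMI-tagged reads.

    records: iterable of (umi, mate, sequence) tuples, where mate is
    "R1" or "R2". This layout is hypothetical, chosen for the sketch.
    """
    records = list(records)
    total = len(records)
    if total == 0:
        raise ValueError("no records to summarize")

    # Duplication rate: share of reads that repeat an already-seen
    # (mate, sequence) combination.
    distinct = len({(mate, seq) for _, mate, seq in records})

    # UMI types, counts, and proportions.
    umi_counts = Counter(umi for umi, _, _ in records)

    # R1/R2 balance, assumed here to mean the ratio of read counts.
    mates = Counter(mate for _, mate, _ in records)
    ratio = mates["R1"] / mates["R2"] if mates["R2"] else float("inf")

    return {
        "total_reads": total,
        "duplication_rate": 1 - distinct / total,
        "umi_types": len(umi_counts),
        "umi_counts": dict(umi_counts),
        "umi_fraction": {u: n / total for u, n in umi_counts.items()},
        "r1_r2_ratio": ratio,
    }
```

A real pipeline would read these records from FASTQ files and would likely also track per-base quality; the sketch only shows how the listed quantities relate to one another.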



Embodiment Construction

[0074] To achieve the object of the present invention, as shown in Figure 1 and Figure 2, the duplex-seq-based ultra-low-frequency mutation site detection and analysis method comprises the following steps:

[0075] 1) Evaluate the quality of the raw sequencing data and reduce data noise, providing effective data for subsequent analysis;

[0076] 2) Extract the random barcode into the header line of each sequence in the sequence file, to facilitate fast barcode retrieval and consensus sequence creation in later steps;

[0077] 3) Create consensus sequences based on the family barcode and the duplex barcode, excluding mutations introduced during library construction or PCR;

[0078] 4) Construct a double-strand consensus sequence according to the duplex tag, further excluding asymmetric mutation sites in the sequence;

[0079] 5) Perform local quality correction on the aligned data, and detect low-frequency variant sites; ann...
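
Steps 2 and 3 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the tag and spacer lengths, the `|` header separator, the family-size and agreement thresholds, and all function names are hypothetical, and reads within a family are assumed to be already trimmed to equal length.

```python
from collections import Counter, defaultdict

TAG_LEN = 4     # hypothetical barcode length per read (real tags are longer)
SPACER_LEN = 2  # hypothetical fixed spacer that follows the barcode


def move_barcode_to_header(name, r1_seq, r2_seq):
    """Step 2: strip the barcode from each mate and append it to the header."""
    tag = r1_seq[:TAG_LEN] + r2_seq[:TAG_LEN]
    skip = TAG_LEN + SPACER_LEN
    return f"{name}|{tag}", r1_seq[skip:], r2_seq[skip:]


def consensus(seqs, min_size=3, cutoff=0.7):
    """Collapse one barcode family into a single-strand consensus.

    Returns None for families that are too small or of unequal length.
    A position becomes "N" when no base reaches the agreement cutoff.
    """
    if len(seqs) < min_size or len({len(s) for s in seqs}) != 1:
        return None
    out = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(seqs) >= cutoff else "N")
    return "".join(out)


def build_sscs(tagged_reads, min_size=3, cutoff=0.7):
    """Step 3: group reads by the barcode in their header, then collapse."""
    families = defaultdict(list)
    for header, seq in tagged_reads:
        tag = header.rsplit("|", 1)[1]
        families[tag].append(seq)
    result = {}
    for tag, seqs in families.items():
        cons = consensus(seqs, min_size, cutoff)
        if cons is not None:
            result[tag] = cons
    return result
```

Because a PCR or sequencing error usually appears in only a minority of the reads sharing a barcode, the per-position majority vote removes it, which is how the consensus step excludes mutations introduced during library construction or PCR.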


Abstract

The invention discloses a duplex-seq-based ultra-low-frequency mutation site detection and analysis method. The method comprises the following steps: 1) assessing the quality of the raw sequencing data and reducing data noise, providing effective data for subsequent analysis; 2) extracting the random barcode into the header line of each sequence in the sequence file, to facilitate fast barcode retrieval and consensus sequence creation in later steps; 3) creating consensus sequences according to the family barcode and the duplex barcode, excluding mutations introduced during library construction or PCR; 4) constructing a double-strand consensus sequence according to the duplex tag, further excluding asymmetric mutation sites in the sequence; 5) performing local quality correction on the aligned data and detecting low-frequency mutation sites, then annotating the mutation sites at three levels: gene structure, function, and clinical phenotype; and 6) computing statistics on the numbers of SSCS and DCS sequences, the alignment results, and the mutation site information, and outputting visual charts.
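
Step 4 of the abstract, building a double-strand consensus sequence (DCS) from single-strand consensus sequences (SSCS), can be sketched in Python as below. The sketch assumes that the two strands of one molecule carry barcodes whose halves are swapped, and that both SSCS are already in the same orientation and of equal length; these assumptions and the function names are illustrative, not taken from the patent text.

```python
def partner_tag(tag):
    """Return the barcode expected on the complementary strand.

    Assumption: the partner strand carries the two tag halves swapped.
    """
    half = len(tag) // 2
    return tag[half:] + tag[:half]


def duplex_consensus(sscs_a, sscs_b):
    """Keep a base only where both strands agree; otherwise emit "N"."""
    if len(sscs_a) != len(sscs_b):
        raise ValueError("SSCS lengths differ")
    return "".join(a if a == b else "N" for a, b in zip(sscs_a, sscs_b))


def build_dcs(sscs_by_tag):
    """Pair each SSCS with its partner-strand SSCS and collapse the pair.

    Each pair is emitted once, keyed by the lexicographically smaller
    tag. Tags without a partner, or equal to their own partner, are
    skipped.
    """
    dcs = {}
    for tag, seq in sscs_by_tag.items():
        mate = partner_tag(tag)
        if mate in sscs_by_tag and tag < mate:
            dcs[tag] = duplex_consensus(seq, sscs_by_tag[mate])
    return dcs
```

A change seen on only one strand is masked at this stage, which corresponds to excluding asymmetric mutation sites. Counting the entries of the SSCS and DCS dictionaries gives the sequence numbers that step 6 reports.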

Description

Technical field

[0001] The invention relates to a bioinformatics data processing method, in particular to a duplex-seq-based ultra-low-frequency mutation site detection and analysis method. It is mainly used in the field of second-generation high-throughput sequencing to detect and analyze ultra-low-frequency mutation sites in ctDNA based on duplex-seq whole-exome sequencing.

Background

[0002] Next-generation sequencing technology is developing rapidly and is profoundly changing traditional genetics research, giving rise to precision medicine. Compared with traditional experimental techniques, it can detect thousands of genetic mutations at once. Its drawback is that it still has a relatively high error rate (0.1-1%). For the detection of high-frequency genetic mutations, this error is accepta...


Application Information

IPC(8): G06F19/22
CPC: G16B30/00
Inventor: 刘港飚, 朱月艳, 孙子奎
Owner: SHANGHAI PASSION BIOTECHNOLOGY CO LTD