Method and device for correcting high-throughput sequencing data

A method for correcting high-throughput sequencing data, applied in sequence analysis, instrumentation, genomics, and related fields. It addresses PCR duplication and low ctDNA content, both of which affect the accuracy and repeatability of test results, and aims to improve accuracy, repeatability, and consistency as well as the quality and efficiency of detection.

Active Publication Date: 2019-06-21
SHENZHEN HAPLOX BIOTECH

AI Technical Summary

Problems solved by technology

[0005] However, as mentioned earlier, the content of ctDNA is low, and PCR amplification enrichment is required to build a library. This proce

Examples

Embodiment

[0069] This example describes in detail how high-throughput sequencing data are corrected for three technologies: non-UMI sequencing, single-end IndexUMI sequencing, and paired-end InsertUMI sequencing, as follows:

[0070] Method 1: Correction method for high-throughput sequencing data obtained by non-UMI sequencing technology

[0071] 1. Read the result file produced by aligning the sequencing data to the reference genome and sorting the alignments, and read the reference genome sequence file at the same time.

[0072] 2. Parse each read pair or read, and identify the chromosome, start position, and end position of its alignment.

[0073] 3. Divide the read pairs or reads into sets Ai, where i = 1, 2, 3, 4, ..., so that all read pairs or reads in one set share the same start and end positions. In non-UMI sequencing, the read pairs or reads in each Ai come from the same DNA template molecule or from the two strands of the same original DNA molecule.

[0074] 4. ...
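
Steps 1 to 3 above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `AlignedRead` record, its field names, and the `group_by_position` helper are hypothetical, and the reads are assumed to have already been parsed from the sorted alignment file, with each read pair reduced to a single record.

```python
from collections import defaultdict
from typing import NamedTuple


class AlignedRead(NamedTuple):
    # Hypothetical minimal record; a real pipeline would fill these
    # fields from the sorted alignment file read in step 1.
    name: str
    chrom: str   # chromosome the read or read pair aligns to
    start: int   # leftmost aligned position
    end: int     # rightmost aligned position
    seq: str


def group_by_position(reads):
    """Step 3: collect reads that share chromosome, start and end into one set Ai."""
    groups = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.start, read.end)].append(read)
    return dict(groups)


reads = [
    AlignedRead("r1", "chr1", 100, 250, "ACGT"),
    AlignedRead("r2", "chr1", 100, 250, "ACGT"),
    AlignedRead("r3", "chr1", 100, 260, "ACGA"),
    AlignedRead("r4", "chr2", 100, 250, "ACGT"),
]
subsets = group_by_position(reads)
for key, members in subsets.items():
    print(key, [r.name for r in members])
```

Keying on the chromosome as well as the start and end positions is an assumption added here so that reads at identical coordinates on different chromosomes are not merged.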

Abstract

The invention discloses a method and a device for correcting high-throughput sequencing data. The method comprises: aligning the read pairs or reads obtained by sequencing to a reference genome; grouping the read pairs or reads that share the same start and end positions into one subset Ai; comparing, within each subset, the bases of the read pairs or reads at each aligned genome position, and eliminating duplicates and false-positive mutation sites according to a preset mutation threshold; and finally outputting consistent data with high coverage, in which only a single corrected read pair or read is retained for each subset. The method can eliminate a large number of the duplicates and false-positive mutations generated during library construction, hybridization capture, and PCR in high-throughput sequencing. It is suitable for high-depth sequencing applications that are prone to false-positive mutations, such as mutation detection in cancer tissue and liquid biopsy, and lays a foundation for improving detection quality and efficiency.
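
The per-subset correction described in the abstract might look roughly like the sketch below. The excerpt does not say how the preset mutation threshold is defined, so this sketch assumes it is the minimum fraction of reads in a subset that must support a non-reference base. It also assumes all sequences in a subset cover the same interval with no insertions or deletions. The `consensus_read` function and the `min_fraction` parameter are hypothetical names.

```python
from collections import Counter


def consensus_read(seqs, ref, min_fraction=0.8):
    """Collapse one subset Ai into a single corrected sequence.

    seqs: sequences aligned to the same interval, no indels (assumption)
    ref: reference sequence over that interval
    min_fraction: assumed form of the preset mutation threshold
    """
    if not seqs:
        raise ValueError("subset is empty")
    if any(len(s) != len(ref) for s in seqs):
        raise ValueError("all sequences must match the reference length")
    corrected = []
    for i, ref_base in enumerate(ref):
        counts = Counter(s[i] for s in seqs)
        base, support = counts.most_common(1)[0]
        # A non-reference base survives only with enough support;
        # otherwise it is treated as a false positive and reverted.
        if base != ref_base and support / len(seqs) < min_fraction:
            base = ref_base
        corrected.append(base)
    return "".join(corrected)


ref = "ACGT"
well_supported = consensus_read(["ATGT", "ATGT", "ATGT", "ATGA"], ref)
poorly_supported = consensus_read(["ATGT", "ATGT", "ACGT"], ref)
print(well_supported, poorly_supported)
```

In the first call the C-to-T change at the second position is present in every read and is kept, while the single A at the last position is discarded. In the second call the same change is supported by only two of three reads, which falls below the assumed threshold, so the position reverts to the reference base.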

Description

Technical field

[0001] The present application relates to the field of high-throughput sequencing data correction, and in particular to a method and device for correcting high-throughput sequencing data.

Background

[0002] With the development of next-generation sequencing technology, high-depth sequencing has become increasingly widely used in tumor mutation detection and liquid biopsy. In particular, mutation detection based on cell-free DNA in peripheral blood (cfDNA) has become an important auxiliary means for early cancer screening and for the clinical treatment of cancer. Although the content of circulating tumor DNA (ctDNA) in the peripheral blood of cancer patients increases significantly as the tumor progresses, the proportion of ctDNA in most patients is only between 0.5% and 5%. In addition, a large number of errors are introduced during the sequencing process, so detecting tumor-derived somatic mutations remains extremely difficult. [0003...

Application Information

IPC(8): G16B20/00, G16B30/00
Inventor: 周衍庆, 陈亚如, 尤沁, 徐云
Owner SHENZHEN HAPLOX BIOTECH