A method and device for analyzing ctDNA low-frequency mutation sequencing data

A technology for analyzing low-frequency mutation sequencing data, applied in the field of sequencing. It addresses the problems of hard-to-guarantee detection accuracy, large background noise, and limited error correction, and it achieves efficient and accurate clustering, more accurate analysis results, and improved sensitivity and specificity.

Active Publication Date: 2018-10-19
CAPITALBIO GENOMICS

AI Technical Summary

Problems solved by technology

[0003] Next-generation sequencing (NGS) is currently the most widely used sequencing technology, with the advantages of high sequencing depth, high throughput, high accuracy, and good sensitivity. However, applying NGS to the detection of low-frequency mutations in ctDNA still poses technical difficulties. On the one hand, NGS inevitably has sequencing errors, and the single-base error rate is generally between 0.1% and 1%. On the other hand, the replication error rate is about 10^-6 and increases with the number of PCR cycles. These two factors produce large background noise in the sequencing analysis of ctDNA low-frequency mutations. Especially at detection limits of 0.1% and below, it is difficult to distinguish template DNA mutations from sequencing errors and replication errors, so false positive results are likely.
[0004] Two methods are usually adopted to solve these problems. The first is to increase the amount of sequencing data, and thereby the sequencing depth, so as to eliminate sequencing errors. However, sequencing depth does not grow linearly with data volume, and the error correction effect is very limited. The second is the molecular tag method, in which a molecular tag is connected to at least one end of the original DNA molecule. The molecular tag can be a nucleotide sequence composed of random bases, with a length n chosen according to actual needs. In theory, molecular tags can take 4^n different sequences, so each original DNA molecule marked by a molecular tag is unique. In practice, however, the synthesis of molecular tags and the labeling of DNA molecules are still subject to preference (bias): dominant molecular tags inevitably appear, and multiple DNA molecules usually carry the same molecular tag. This makes it harder to identify the original DNA molecule when detecting ctDNA low-frequency mutations, and detection accuracy is difficult to guarantee. The sequencing analysis method for ctDNA low-frequency mutations based on molecular tags therefore still needs to be improved, in order to raise detection accuracy and avoid false positive results.
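
The background-noise problem in [0003] can be illustrated with a small calculation. The sketch below compares the expected number of reads carrying a true 0.1% mutation with the expected number of reads showing the same base purely through sequencing error. The function name, the depth of 10,000, the 0.3% error rate, and the assumption that an error is equally likely to yield any of the three other bases are illustrative choices, not values taken from the patent.

```python
def expected_reads(depth, mutation_freq, seq_error_rate):
    """Return (expected true mutant reads, expected error reads showing the same base)."""
    mutant = depth * mutation_freq
    # Assumption: a sequencing error turns the reference base into each of the
    # 3 alternative bases with equal probability.
    noise = depth * seq_error_rate / 3
    return mutant, noise


# Hypothetical numbers: 10,000x depth, 0.1% mutation, 0.3% single-base error rate
mutant, noise = expected_reads(depth=10_000, mutation_freq=0.001, seq_error_rate=0.003)
print(f"true mutant reads: {mutant:.1f}, error reads: {noise:.1f}")
```

With these assumed inputs the signal and the noise are of the same order, which is why raw read counts alone cannot separate a real 0.1% mutation from sequencing errors.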

Examples

Embodiment 1

[0098] In this embodiment, ctDNA low-frequency mutation sequencing data analysis was performed on samples with 8 known mutation sites and a mutation frequency of 0.1% to 0.13% (tumor standard HD779, from Horizon Discovery).

[0099] In this example, 8 bp random molecular tags were introduced at both ends, and multiplex PCR capture was performed using cfDNA as a template. The library was then constructed, quality-checked, and finally sequenced on a Proton sequencer.
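
As a rough illustration of what 8 bp random tags at both ends imply, the sketch below computes the theoretical number of distinct tags (4^n) and the chance that at least two molecules share a tag. The function names and the uniform-tag assumption are hypothetical; since real tags are biased, as the background section notes, the uniform case is an optimistic bound on collisions.

```python
import math


def tag_space(length, ends=1):
    """Number of distinct random tags of the given length across `ends` read ends."""
    return 4 ** (length * ends)


def collision_probability(n_molecules, n_tags):
    """Probability that at least two molecules share a tag.

    Assumption: every tag is drawn uniformly at random (no synthesis bias).
    """
    if n_molecules > n_tags:
        return 1.0
    # Sum logs of (1 - i / n_tags) to avoid underflow in the product
    log_p_unique = sum(math.log1p(-i / n_tags) for i in range(n_molecules))
    return 1.0 - math.exp(log_p_unique)


print(tag_space(8))          # tags available at one end
print(tag_space(8, ends=2))  # tag pairs available across both ends
print(collision_probability(1000, tag_space(8)))
```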

[0100] The analysis process of low-frequency mutation sequencing data is as follows:

[0101] 1. Data acquisition

[0102] Obtain the raw sequencer output (off-machine data) for the amplicon library constructed with the molecular tags described above; this data includes the reads and the molecular tags.
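
A minimal sketch of how reads and their molecular tags might be separated and grouped is shown below. It assumes, purely for illustration, that each read begins and ends with an 8-base tag and contains nothing else besides the insert; the actual read layout (adapters, primers, tag positions) is not given in this excerpt, and the function names are hypothetical.

```python
from collections import defaultdict


def split_read(read, tag_len=8):
    """Split a read into (5' tag, insert, 3' tag); assumes tags sit at the very ends."""
    if len(read) <= 2 * tag_len:
        raise ValueError("read too short to hold two tags and an insert")
    return read[:tag_len], read[tag_len:-tag_len], read[-tag_len:]


def group_by_tags(reads, tag_len=8):
    """Map each (5' tag, 3' tag) pair to the list of insert sequences carrying it."""
    groups = defaultdict(list)
    for read in reads:
        tag5, insert, tag3 = split_read(read, tag_len)
        groups[(tag5, tag3)].append(insert)
    return dict(groups)


# Toy reads: two copies of one tagged molecule and one read from another
reads = [
    "AAAACCCC" + "GATTACA" + "GGGGTTTT",
    "AAAACCCC" + "GATTACA" + "GGGGTTTT",
    "ACGTACGT" + "GATTTCA" + "TGCATGCA",
]
groups = group_by_tags(reads)
print({tags: len(inserts) for tags, inserts in groups.items()})
```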

[0103] 2. Quality control

[0104] Use the tmap software (Life Technologies) to align the reads to the human reference genome (the hg19 version from the UCSC database), and perform quality control according to the alignment res...

Embodiment 2

[0123] In this example, ctDNA low-frequency mutation sequencing data analysis was performed on samples with 8 known mutation sites and a mutation frequency of 1% to 1.3% (tumor standard HD778, from Horizon Discovery). The other detection procedures in this embodiment are the same as those in Embodiment 1 and are not repeated here.

[0124] Table 2 shows the detection results obtained by this embodiment and by conventional cluster analysis with 1M of data. Here, conventional cluster analysis means obtaining the second label group by density cluster analysis after the first label group has been obtained; the other steps are the same as in Embodiment 1. As the table shows, the detection sensitivity for the 1% standard in this embodiment is 100% and the specificity is 100%; the conventional cluster analysis results are consistent with this.

[0125] Table 2. Comparison of Embodiment 2 and the conventional method

[0126]

[0127]

[0128] Note: M...
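
The sensitivity and specificity figures quoted for Table 2 follow the standard definitions, which can be written out as below. The function names and the counts in the example are hypothetical; only the 8 known mutation sites come from the text, and the number of true-negative sites is an invented placeholder.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real mutation sites that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-mutated sites that were correctly reported as negative."""
    return true_neg / (true_neg + false_pos)


# All 8 known sites detected; 1000 negative sites is an assumed placeholder count
print(f"sensitivity: {sensitivity(true_pos=8, false_neg=0):.0%}")
print(f"specificity: {specificity(true_neg=1000, false_pos=0):.0%}")
```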

Abstract

The invention discloses a method and device for analyzing low-frequency mutation sequencing data. The method and the device identify clustering centers based on the iterative convergence of molecular tag frequencies, thereby achieving efficient and accurate clustering. Analysis and testing show that this clustering strategy gives more accurate results than an ordinary clustering model and can eliminate errors introduced during PCR amplification and sequencing. This solves the background noise problem in the sequencing analysis results and improves the sensitivity and specificity of detection.
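
To give a feel for frequency-driven tag clustering, the sketch below uses a generic greedy scheme: the most frequent unassigned tag becomes a cluster center, nearby tags (within a Hamming distance threshold) are merged into it, and the process repeats until no tags remain. This is not the patented procedure, whose details are not included in this excerpt; the function names, the distance threshold of 1, and the toy tags are all assumptions.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length tags."""
    return sum(x != y for x, y in zip(a, b))


def cluster_tags(tags, max_dist=1):
    """Greedy frequency-ordered clustering; returns {center_tag: total_read_count}."""
    counts = Counter(tags)
    # Most frequent tags first; ties broken alphabetically so results are deterministic
    remaining = sorted(counts, key=lambda t: (-counts[t], t))
    clusters = {}
    while remaining:
        center = remaining[0]  # most frequent tag still unassigned
        members = [t for t in remaining if hamming(center, t) <= max_dist]
        clusters[center] = sum(counts[t] for t in members)
        remaining = [t for t in remaining if t not in members]
    return clusters


# Toy data: two abundant tags, each with one single-mismatch variant
tags = ["ACGTACGT"] * 5 + ["ACGTACGA"] + ["TTTTCCCC"] * 3 + ["TTTTCCCG"]
print(cluster_tags(tags))
```

The single-mismatch variants are absorbed into their abundant neighbors, on the assumption that a rare tag one base away from a frequent one is more likely an amplification or sequencing error than a distinct original molecule.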

Description

Technical field

[0001] The invention belongs to the technical field of sequencing, and in particular relates to a method and device for analyzing ctDNA low-frequency mutation sequencing data.

Background technique

[0002] ctDNA (circulating tumor DNA) is the DNA released by tumor cells into the blood circulation system, and it can accurately reflect the molecular genetic information of the primary tumor tissue. By detecting ctDNA, genetic information such as point mutations, structural variations, and even chromosomal copy number variation can be obtained. Because of tumor heterogeneity, the content of ctDNA in the blood is very small, and in the early stage of cancer the abundance of ctDNA can even fall below 0.1%, so the gene mutation frequency is usually between 0.01% and 1% (low-frequency mutation).

[0003] Next-generation sequencing technology (NGS) is currently the most widely used sequencing technology, which...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F19/24, G06F19/18
CPC: G16B20/00, G16B40/00
Inventors: 糜庆丰, 徐发, 刘宇彬, 吴春求, 李建文, 夏渝东, 黄铨飞, 刘丽菲
Owner: CAPITALBIO GENOMICS