Biological information quality control method and device based on next-generation sequencing and storage medium

A bioinformatics quality control technology, applied to quality control methods, devices and storage media based on next-generation sequencing. It addresses problems such as a large impact on the accuracy of test results, low detection sensitivity for low-frequency mutations, and large sample consumption. Its effects are to avoid errors in subsequent mutation detection results, avoid false positive results, and avoid cost problems.

Active Publication Date: 2019-11-12
深圳裕策生物科技有限公司

AI Technical Summary

Problems solved by technology

The PCR method is highly sensitive and technically mature, but each primer pair can detect only one mutation, so it cannot test many samples and sites at the same time, and its throughput is low.
Sanger sequencing is inexpensive, but it requires a large amount of sample and has low detection sensitivity for low-frequency mutations.
The next-gen



Examples


Embodiment 1

[0063] In this example, the batch sample quality control information of 6 sample pairs (white blood cells + tissue samples) was compared. The comparison results are shown in Table 1. The capture efficiency and insert length of sample DNR1902006SLZ are significantly lower than those of the other samples in the same batch prepared with the same experimental method. Using the batch sample quality control information comparison method of the present invention, sample DNR1902006SLZ can therefore be judged as failing quality control, and sample degradation can be further determined. In the subsequent copy number variation detection, during the sample coverage normalization step, the low capture efficiency led to low coverage of the target region after normalization, resulting in the missing detection of many copy number variations. From the quality control information, it can be judged that these copy number variations cannot give results and are false...
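The batch comparison described above can be sketched as follows. This is a minimal illustration, not the patented procedure: the excerpt does not say how "significantly lower" is decided, so the rule used here (a metric below a fixed fraction of the batch median) and the parameter `min_ratio` are assumptions, and the sample IDs and metric values are invented placeholders, not the data from Table 1.

```python
from statistics import median

def flag_batch_outliers(batch, metrics, min_ratio=0.7):
    """Flag samples whose metric is below min_ratio * batch median.

    The median-ratio rule and min_ratio are assumptions made for this
    sketch; the source only says "significantly lower than the batch".
    Returns {sample_id: [names of the metrics that failed]}.
    """
    flagged = {}
    for metric in metrics:
        batch_median = median(s[metric] for s in batch.values())
        for sample_id, values in batch.items():
            if values[metric] < min_ratio * batch_median:
                flagged.setdefault(sample_id, []).append(metric)
    return flagged

# Hypothetical batch of 6 samples (illustrative values only).
batch = {
    "S1": {"capture_efficiency": 0.62, "insert_length": 185},
    "S2": {"capture_efficiency": 0.60, "insert_length": 190},
    "S3": {"capture_efficiency": 0.65, "insert_length": 180},
    "S4": {"capture_efficiency": 0.58, "insert_length": 178},
    "S5": {"capture_efficiency": 0.63, "insert_length": 188},
    "S6": {"capture_efficiency": 0.31, "insert_length": 110},
}

flagged = flag_batch_outliers(batch, ["capture_efficiency", "insert_length"])
print(flagged)
```

Comparing against the batch median rather than a fixed cutoff keeps the check relative to samples processed the same way, which matches the batch-wise comparison in this embodiment.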

Embodiment 2

[0067] In this embodiment, the samples used are CT1900260XYZAA03 (sample number) and the corresponding white blood cell control sample DN1900260XYZAA03 (sample number). A problem was found in the contamination quality control of this sample. There are 16 homozygous quality control sites in the sequencing data of the control sample, and 8 of them are non-homozygous in the tissue sample; these 8 are regarded as contamination sites. Averaging the mutation frequencies of these 8 non-homozygous sites gives a contamination degree of 24% for this sample, which is greater than the contamination threshold of 1%. The sample is therefore determined to be contaminated. A search of the samples in the same batch then identified the contamination source DN1900852SLZAA01 (sample number), which includes these 8 non-homozygous sites, and after removing all the mutations of the contamination source, the correct mutation detection result of...
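The contamination degree calculation in this embodiment can be sketched as below. It is an illustration under assumptions: the `QCSite` record and its field names are hypothetical, genotype calling (deciding whether a site is homozygous) is assumed to have been done upstream, and the per-site frequencies are invented so that their mean is 24%; only the counts (16 sites, 8 contaminated) and the 1% threshold come from the text.

```python
from dataclasses import dataclass
from statistics import mean

CONTAMINATION_THRESHOLD = 0.01  # 1%, the threshold quoted in the embodiment

@dataclass
class QCSite:
    """Hypothetical record for one quality control site."""
    site_id: str
    control_homozygous: bool   # homozygous in the control sample
    tissue_homozygous: bool    # homozygous in the tissue sample
    tissue_mut_freq: float     # mutation frequency in the tissue sample

def contamination_degree(sites):
    """Mean tissue mutation frequency over contamination sites.

    A contamination site is homozygous in the control sample but
    non-homozygous in the tissue sample. Returns (degree, sites used).
    """
    contaminated = [s for s in sites
                    if s.control_homozygous and not s.tissue_homozygous]
    if not contaminated:
        return 0.0, []
    return mean(s.tissue_mut_freq for s in contaminated), contaminated

# 16 homozygous control sites; 8 are non-homozygous in the tissue sample.
# The frequencies are illustrative placeholders chosen to average 0.24.
freqs = [0.20, 0.22, 0.24, 0.26, 0.28, 0.24, 0.23, 0.25]
sites = [QCSite(f"site{i}", True, False, f) for i, f in enumerate(freqs)]
sites += [QCSite(f"site{i}", True, True, 0.0) for i in range(8, 16)]

degree, contaminated = contamination_degree(sites)
is_contaminated = degree > CONTAMINATION_THRESHOLD
print(len(contaminated), round(degree, 4), is_contaminated)
```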

Embodiment 3

[0069] In this embodiment, the samples used are CT1901812XYZAA01 (sample number) and the corresponding white blood cell control sample DN1901812XYZAA01 (sample number). A problem was found in the contamination quality control of this sample. There are 18 homozygous quality control sites in the sequencing data of the control sample, and 6 of them are non-homozygous in the tissue sample; these 6 are regarded as contamination sites. Averaging the mutation frequencies of these 6 non-homozygous sites gives a contamination degree of 5% for this sample, which is greater than the contamination threshold of 1%. The sample is therefore determined to be contaminated. No sample including these 6 non-homozygous sites could be found in the same batch, so a total of 192 mutations with a mutation frequency of less than 5% in the sample and belonging to the known populat...
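When no contamination source is found, the filter applied in this embodiment can be sketched as follows. The dictionary layout of a mutation (`key`, `freq`), the string variant keys, and the in-memory set standing in for the population database are all assumptions for illustration; the rule itself (remove mutations whose frequency is below the contamination degree and that appear in the population database) is taken from the text.

```python
def filter_without_source(mutations, contamination_degree, population_db):
    """Split mutations into (kept, removed) when no source sample is known.

    A mutation is removed only if BOTH conditions hold:
      * its frequency is strictly below the contamination degree, and
      * its key is present in the population database.
    `population_db` is a plain set here, a stand-in for a real database.
    """
    kept, removed = [], []
    for m in mutations:
        if m["freq"] < contamination_degree and m["key"] in population_db:
            removed.append(m)
        else:
            kept.append(m)
    return kept, removed

# Hypothetical variant keys and frequencies, for illustration only.
population_db = {"varA", "varC", "varD"}
mutations = [
    {"key": "varA", "freq": 0.03},  # low frequency, in database: removed
    {"key": "varB", "freq": 0.03},  # low frequency, not in database: kept
    {"key": "varC", "freq": 0.30},  # in database, high frequency: kept
    {"key": "varD", "freq": 0.05},  # equal to the degree, not below: kept
]

kept, removed = filter_without_source(mutations, 0.05, population_db)
print([m["key"] for m in kept], [m["key"] for m in removed])
```

Requiring both conditions keeps low-frequency mutations that are not common population variants, so the filter targets alleles that plausibly came from another individual.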


Abstract

The invention provides a biological information quality control method and device based on next-generation sequencing, and a storage medium. The method comprises the following steps. Sequencing data are obtained for a tissue sample to be tested and a control sample from the same individual, where the control sample is a sample other than the tissue sample to be tested. The sequencing data are aligned to a reference genome; any locus that is homozygous in the control sample but non-homozygous in the tissue sample to be tested is regarded as contamination, and the contamination degree of the tissue sample is obtained from these loci. It is then judged whether the contamination degree is greater than a contamination threshold. If so, contamination is determined to exist, and a contamination source is searched for in the sequencing data of the several most recent batches. If the contamination source is found, all mutations of the contamination source are removed from the mutation detection results of the tissue sample to be tested. If the contamination source is not found, mutations whose frequency is less than the contamination degree and which belong to a database of high-frequency germline mutations in known populations are removed. The method can judge the quality state of the sample and remove false positive mutations caused by quality problems from the detection results.
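The decision flow in the abstract can be sketched as a single function. This is an illustration under stated assumptions, not the claimed implementation: variants are represented by hypothetical string keys, a contamination source is taken to be any recent sample whose variant set contains all contaminating alleles (the excerpt does not give the exact matching rule), and `recent_samples` and `population_db` are in-memory stand-ins for the recent-batch data and the population database.

```python
def find_contamination_source(contam_alleles, recent_samples):
    """Return the first recent sample carrying every contaminating allele.

    Assumed rule: the source's variant set is a superset of the alleles
    seen at the contamination sites. `recent_samples` should not include
    the tissue sample under test. Returns None if nothing matches.
    """
    for sample_id, variants in recent_samples.items():
        if contam_alleles <= variants:
            return sample_id
    return None

def remove_contaminant_mutations(tissue_mutations, degree, threshold,
                                 contam_alleles, recent_samples,
                                 population_db):
    """Apply the abstract's branches; returns (kept mutations, source id).

    tissue_mutations is a list of (variant_key, frequency) tuples.
    """
    if degree <= threshold:
        return list(tissue_mutations), None       # not contaminated
    source = find_contamination_source(contam_alleles, recent_samples)
    if source is not None:
        # Source found: drop every mutation the source sample carries.
        source_variants = recent_samples[source]
        kept = [m for m in tissue_mutations if m[0] not in source_variants]
    else:
        # No source: drop low-frequency mutations in the population db.
        kept = [m for m in tissue_mutations
                if not (m[1] < degree and m[0] in population_db)]
    return kept, source

# Hypothetical inputs, for illustration only.
recent_samples = {"batch_sample_1": {"v1", "v2"},
                  "batch_sample_2": {"v3", "v4", "v5"}}
tissue_mutations = [("v3", 0.20), ("v5", 0.22), ("v9", 0.40)]

with_source = remove_contaminant_mutations(
    tissue_mutations, 0.24, 0.01, {"v3", "v4"}, recent_samples, {"v5"})
without_source = remove_contaminant_mutations(
    tissue_mutations, 0.24, 0.01, {"v7"}, recent_samples, {"v5"})
print(with_source, without_source)
```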

Description

Technical field

[0001] The invention relates to the technical field of bioinformatics, in particular to a biological information quality control method, device and storage medium based on next-generation sequencing.

Background

[0002] Cancer is one of the most important non-communicable diseases in the world, and it is also a disease with a high mortality rate. In China, nearly 4.3 million people are diagnosed with cancer every year, and more than 2.8 million people die of cancer.

[0003] Anti-tumor targeted drugs and immune checkpoint inhibitors are currently among the more effective means of treating cancer. Most targeted drugs act on point mutations in key genes. The currently recognized potential indicator for evaluating the efficacy of immune checkpoint inhibitors is TMB (Tumor Mutation Burden), and the calculation of TMB is also based on the somatic point mutations in the tumor. It is generally recommended clinically that these drugs be tested for...


Application Information

IPC(8): G16B30/10, C12Q1/6869
CPC: G16B30/10, C12Q1/6869, C12Q2537/165, Y02A90/10
Inventor: 朱嘉麒, 李淼, 王鹏, 杨洁, 何雨鸣
Owner: 深圳裕策生物科技有限公司