Method and device for detecting designated location point of variation

A designated-site variation detection technique, applied to fixed-point mutation detection and to devices for detecting fusion gene mutations. It addresses the problems of increased workload, difficult detection, and lack of effective clinical guidance, and achieves accurate and efficient detection.

Active Publication Date: 2017-06-30
BGI GUANGZHOU MEDICAL LAB CO LTD +1

AI Technical Summary

Problems solved by technology

In this case, the test results contain a large number of mutations of unknown clinical significance, which give clinicians no effective guidance. The detection process requires sequencing cancer tissue and blood cells at the same time, which increases the workload. More importantly, the alignment quality of bases near an INDEL decreases. For example, for complex INDEL mutations in lung cancer such as EGFR c.2238_2248>GC, the GC bases inserted after the deletion may be aligned to different locations, so traditional variant detection methods have great difficulty detecting such variants.



Examples


Embodiment 1

[0063] (1) Construction of reference value model

[0064] 1. Hypothetical basis for the construction of the reference value model

[0065] 1.1. For any site, assume that the corresponding base of the reference genome is $r \in \{A, T, C, G\}$.

[0066] 1.2. For any site, assume that each read covering the site contributes a base $b_i$ with base quality value $q_i$ and corresponding base error rate $e_i = 10^{-q_i/10}$ (the standard Phred relation), for $i = 1, 2, \ldots, d$, where $d$ is the sequencing depth at this site.

[0067] 2. Model establishment

[0068] The data distribution at each site is explained by one of two models:

[0069] Model $M_0$: there is no variation at this site, and bases that differ from the reference genome are caused by systematic errors.

[0070] Variant model: the mutation $r \to m$ at this site really exists with allelic mutation frequency $f$, and bases that are neither $r$ nor $m$ are treated as systematic errors.

[0071] The data distribution of this site can be...
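The source text breaks off before it states the likelihood, so the following is only a minimal sketch of how the two models could be compared. It assumes a widely used per-base formulation that the source does not confirm: a base matching $r$ has probability $f e_i/3 + (1-f)(1-e_i)$, a base matching $m$ has probability $f(1-e_i) + (1-f) e_i/3$, and any other base has probability $e_i/3$. The no-variation model is taken as the special case $f = 0$. The function names and the log-odds score are hypothetical.

```python
import math


def phred_to_error(q):
    """Convert a Phred base quality q into an error probability e = 10^(-q/10)."""
    return 10 ** (-q / 10)


def base_likelihood(b, e, r, m, f):
    """Probability of observing base b (assumed formulation, not from the source).

    r is the reference base, m the mutant base, f the allele frequency,
    e the error rate of this base call.
    """
    if b == r:
        return f * e / 3 + (1 - f) * (1 - e)
    if b == m:
        return f * (1 - e) + (1 - f) * e / 3
    return e / 3  # neither r nor m: treated as a systematic error


def log10_likelihood(bases, quals, r, m, f):
    """Log10 likelihood of all bases at one site under allele frequency f."""
    return sum(
        math.log10(base_likelihood(b, phred_to_error(q), r, m, f))
        for b, q in zip(bases, quals)
    )


def lod_score(bases, quals, r, m):
    """Log odds of the variant model (f = observed fraction of m) versus f = 0."""
    f_hat = sum(1 for b in bases if b == m) / len(bases)
    return (log10_likelihood(bases, quals, r, m, f_hat)
            - log10_likelihood(bases, quals, r, m, 0.0))


if __name__ == "__main__":
    bases = "AAAAAAGGGG"          # toy pileup: 6 reference bases, 4 mutant bases
    quals = [30] * len(bases)     # toy base qualities
    print(lod_score(bases, quals, r="A", m="G"))
```

A positive score favors the variant model; the cutoff used to call a variant would have to come from the reference value model itself, which the truncated text does not give.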

Embodiment 2

[0096] After the raw (off-machine) sequencing data are obtained, taking data from the BGISEQ-100 platform as an example, variation detection generally includes the following parts:

[0097] 1. Processing of known variation information and preprocessing of sequencing data

[0098] 1.1 Convert the variations to be detected into the format recognized by the detection program, and generate a list of variations to be detected.

[0099] 1.2 Align the raw data to the reference genome. The effective sequencing data from the BGISEQ-100 are aligned to the reference genome with the tmap tool to obtain accurate alignment results. The tmap tool is available at https://github.com/iontorrent/TS/tree/master/Analysis/TMAP.

[0100] Sort. Use samtools sort to sort the tmap alignment results (BAM files) by chromosome number and by position on the chromosome, in ascending order.

[0101] The PCR duplicate fragments of the alignment results were re...
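The sorting step in [0100], together with the index construction mentioned later for the same pipeline, can be sketched as a small command builder. This is an illustration only: the file names are hypothetical, the `-o` form of `samtools sort` assumes samtools 1.3 or later, and the tmap and deduplication steps are left out because the source does not give their options.

```python
import shlex


def sort_and_index_commands(aligned_bam, sorted_bam):
    """Build the samtools commands that coordinate-sort and index a BAM file.

    The commands are returned as argument lists and are not executed here.
    """
    sort_cmd = ["samtools", "sort", "-o", sorted_bam, aligned_bam]
    index_cmd = ["samtools", "index", sorted_bam]
    return [sort_cmd, index_cmd]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in sort_and_index_commands("sample.tmap.bam", "sample.sorted.bam"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where samtools is installed.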

Embodiment 3

[0136] In this example, the target region was captured from an FFPE tissue sample of a female patient with adenocarcinoma of the upper left lung and sequenced on the BGISEQ-100 platform. The effective raw data went through tmap alignment, samtools sorting, BamDuplicates deduplication, samtools index construction, detection of mutations at known sites, variant annotation, and report generation, finally yielding the patient's known-site mutation detection report.

[0137] All the parts of the process of the above-mentioned variation detection method are integrated into the software Otype. The operating environment of the software is the Linux operating system. The specific operation steps are as follows:

[0138] Enter the following command line in the Linux operating system computer terminal:

[0139] perl Otype.pl -l sample.list -o outdir -O run.sh generates the corresponding run script.

[0140] sh run.sh runs the script.

[0141] For the meaning of the command lin...


Abstract

The invention discloses a method for detecting variation at a designated site. The method comprises the following steps: determining the designated site of the variation and a reference sequence containing the variation, based on known information about the variation; obtaining sequencing data of the nucleic acid of the sample to be tested, the sequencing data comprising multiple reads; extracting the reads that contain the designated site from the sequencing data to obtain designated reads; extending N bp toward both ends of each designated read, centered on the designated site, to obtain a designated fragment, where 4<=N<=10; comparing the designated fragment with the reference sequence containing the variation to obtain support reads, a support read being a read whose designated fragment matches the reference sequence containing the variation; and counting the support reads and judging from that count whether the variation exists. The method checks whether the sequence features that should appear after the variation are present in the reads and detects the variation at the designated site on that basis. This avoids the decline in alignment quality near the variation site, so the variation can be detected quickly and accurately.
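The counting of support reads described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: reads are plain strings, "matching" is reduced to an exact substring match against a signature built from N flanking bases on each side of the variant allele, and the `min_support` threshold and all function names are hypothetical. The sequences in the example are toy data.

```python
def variant_signature(left_flank, alt, right_flank, n):
    """Build the sequence expected in a read that carries the variation.

    left_flank / right_flank: reference bases on either side of the variation.
    alt: the variant allele (may be empty for a pure deletion).
    n: number of flanking bases kept on each side; the source requires 4 <= N <= 10.
    """
    if not 4 <= n <= 10:
        raise ValueError("n must satisfy 4 <= n <= 10")
    if len(left_flank) < n or len(right_flank) < n:
        raise ValueError("flanks must be at least n bases long")
    return left_flank[-n:] + alt + right_flank[:n]


def count_support_reads(reads, signature):
    """Count reads that contain the variant signature (exact match, an assumption)."""
    return sum(1 for read in reads if signature in read)


def call_variant(reads, signature, min_support=3):
    """Return (support count, decision). min_support is a hypothetical threshold."""
    support = count_support_reads(reads, signature)
    return support, support >= min_support


if __name__ == "__main__":
    sig = variant_signature("ACGTACGTAC", "GC", "TTGACCATGA", n=5)
    reads = ["AACGTACGCTTGACC", "CGTACTTGAC", "GGCGTACGCTTGACAA"]
    print(sig, call_variant(reads, sig, min_support=2))
```

Because the check looks only for the sequence that should exist after the variation, it does not depend on how the aligner placed the bases around the variation site.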

Description

Technical field

[0001] The present invention relates to the field of bioinformatics. Specifically, the present invention relates to a method and device for detecting variation at a designated site. More specifically, the present invention relates to a method for detecting variation at a designated site, a device for detecting variation at a designated site, a method for detecting fusion gene mutation, and a device for detecting fusion gene mutation.

Background

[0002] Cancer is caused by changes in genes. Different cancers and different patients carry different types of gene mutations. Identifying the types of gene mutations in a cancer patient is the basis for individualized treatment and helps clarify the mechanism of cancer.

[0003] At present, ARMS-PCR is the main clinical method for detecting SNVs and INDELs, and FISH is used to detect gene fusions. These two experimental methods are expensive, and the probes are designed for specific m...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/22, G06F19/20
CPC: G16B25/00, G16B30/00
Inventors: 刘继龙, 费凌娜, 刘足, 张纪斌, 邵迪
Owner BGI GUANGZHOU MEDICAL LAB CO LTD