Genotype correction device and method

A genotype correction technology based on a Bayesian model, applicable in genomics, instruments, biological systems and related fields, which addresses sequencing bias and the resulting deviation of the measured mutation frequency.

Active Publication Date: 2019-05-21
CAPITALBIO GENOMICS

AI Technical Summary

Problems solved by technology

However, fragment amplification, sequence alignment, sequencing errors and other factors cause sequencing bias, so the measured mutation frequency deviates from the true value; for example, the mutati...

Examples

Embodiment 1

[0124] Hereditary deafness-related genes: 33 mutation sites in 5 common hereditary deafness genes are detected. PCR primers are designed for each site, and the target region fragments are amplified by multiplex PCR. The genotype of each site is determined from the ratio of the number of mutant bases in the sequencing data at the site to be detected to the sequencing depth, that is, the mutation frequency (allele frequency, abbreviated as AF). In high-throughput sequencing results, sequencing bias caused by fragment amplification, sequence alignment, sequencing errors and other factors means that the measured mutation frequency deviates somewhat from the true value.
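The AF definition in paragraph [0124] can be sketched as follows. This is a minimal illustration only; the function name and the read counts are hypothetical and do not come from the patent.

```python
def allele_frequency(mutant_base_count: int, sequencing_depth: int) -> float:
    """AF = number of mutant bases at a site / sequencing depth at that site."""
    if sequencing_depth <= 0:
        raise ValueError("sequencing depth must be positive")
    if not 0 <= mutant_base_count <= sequencing_depth:
        raise ValueError("mutant base count must lie between 0 and the depth")
    return mutant_base_count / sequencing_depth


# Hypothetical counts: 480 mutant bases observed at a depth of 1000 reads.
example_af = allele_frequency(480, 1000)
print(f"AF = {example_af:.3f}")
```

A true heterozygous site would ideally give an AF of 0.5; the gap between that ideal and a measured value such as the one above is the deviation that the correction is meant to absorb.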

[0125] This application builds a genotype frequency distribution model for each site based on a Bayesian model, and calculates the conditional probability value P(AF|GT) and the posterior probability value P(GT|AF), the genotype of the...
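One way to picture the relationship between P(AF|GT) and P(GT|AF) is the sketch below. It assumes that the per-genotype likelihood P(AF|GT) is estimated with a Gaussian kernel density estimate (the text mentions kernel density estimation in Embodiment 3, but does not give its details). The training AF values, the priors, the bandwidth and all names are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels centred on samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def posterior(af, likelihoods, priors):
    """Bayes' rule: P(GT|AF) = P(AF|GT) * P(GT) / sum over g of P(AF|g) * P(g)."""
    joint = {gt: likelihoods[gt](af) * priors[gt] for gt in priors}
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("AF has zero density under every genotype")
    return {gt: value / total for gt, value in joint.items()}


# Hypothetical AF values observed at one site for samples of known genotype.
training_af = {
    "wild": [0.00, 0.01, 0.02, 0.01, 0.03],
    "het": [0.45, 0.50, 0.48, 0.55, 0.52],
    "hom": [0.97, 0.99, 1.00, 0.98, 0.96],
}
# Hypothetical prior genotype probabilities for the site.
priors = {"wild": 0.90, "het": 0.08, "hom": 0.02}

likelihoods = {gt: gaussian_kde(v, bandwidth=0.05) for gt, v in training_af.items()}
post = posterior(0.47, likelihoods, priors)
print(max(post, key=post.get))
```

Building one density per site and per genotype lets the model reflect site-specific bias, which is the motivation the text gives for modelling each site separately.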

Embodiment 2

[0162] Embodiment 2: Result verification

[0163] The present invention established the thresholds using 730 samples, for which the sensitivity and specificity at every site of every sample were 1. In addition, 645 samples were used to validate the established model thresholds; the validation results are shown in Table 4:

[0164] Table 4 Validation results of the model threshold

[0167] It can be seen from Table 4 that the site sensitivity and specificity of all negative data are 1. This shows that the calculated sensitivity and specificity of each detection site meet the detection requirements of clinical samples and that the detection results are reliable.
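For reference, sensitivity and specificity at a site can be computed with the standard definitions, as in the sketch below. The truth and call vectors are hypothetical and are not the data behind Table 4.

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) for paired boolean truth and call lists.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    A value is None when its denominator is zero.
    """
    if len(truth) != len(calls):
        raise ValueError("truth and calls must have the same length")
    tp = sum(t and c for t, c in zip(truth, calls))
    fn = sum(t and not c for t, c in zip(truth, calls))
    tn = sum(not t and not c for t, c in zip(truth, calls))
    fp = sum(not t and c for t, c in zip(truth, calls))
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical site: True means a variant is present (truth) or was called (calls).
truth = [True, True, False, False, False]
calls = [True, True, False, False, False]
print(sensitivity_specificity(truth, calls))
```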

Embodiment 3

[0169] The results were compared with variant detection results obtained without correction by the Bayesian model combined with kernel density estimation; the comparison is shown in Table 5.

[0170] Table 5

[0174] It can be seen from Table 5 that, without correction by the Bayesian model combined with kernel density estimation, the detected variant genotypes are occasionally abnormal: the wild type is detected as a mutant type, and the heterozygous mutation is detected incorrectly. After correction, the sensitivity and specificity of genotype analysis by high-throughput sequencing can be improved.

[0175] In summary, the present invention provides a device and method for genotype correction. The device establishes a statistical model for the detected data through a Bayesian model, constructs a prior distribution, determines the overall distribution information, and finally calculates the posterior probability value P(GT|AF), through judging the relat...
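The final step, judging the genotype from the posterior probability and its threshold, might look like the sketch below. The exact decision rule is not given in the text available here, so this assumes a simple rule: take the genotype with the highest posterior and accept it only if it reaches that genotype's threshold. The thresholds, the "no_call" label and the function name are hypothetical.

```python
def call_genotype(posteriors, thresholds, no_call="no_call"):
    """Pick the genotype with the highest P(GT|AF) if it reaches its threshold."""
    best = max(posteriors, key=posteriors.get)
    if posteriors[best] >= thresholds[best]:
        return best
    return no_call


# Hypothetical thresholds and posterior values for one site.
thresholds = {"wild": 0.95, "het": 0.95, "hom": 0.95}
confident = {"wild": 0.01, "het": 0.98, "hom": 0.01}
ambiguous = {"wild": 0.55, "het": 0.44, "hom": 0.01}

print(call_genotype(confident, thresholds))
print(call_genotype(ambiguous, thresholds))
```

Returning an explicit no-call for ambiguous sites is a design choice of this sketch; it keeps borderline AF values from being forced into a genotype.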

Abstract

The invention provides a genotype correction device and a genotype correction method. A statistical model is established for the detected data through a Bayesian model, and the prior distribution is constructed. The overall distribution information is determined, and a posterior probability value P(GT|AF) is finally calculated. The genotype of a site is judged according to the relationship between the posterior probability value and its corresponding threshold. The method has good sensitivity and specificity and is suitable for wider application.

Description

Technical field

[0001] The invention belongs to the field of biological information analysis and relates to a genotype correction device and method.

Background technique

[0002] Because the human genome is diploid, the bases from the two homologous chromosomes at a given position in the genome may differ from the base on the reference genome. If the bases from both homologous chromosomes differ from the base on the reference genome, the site is a homozygous mutant; if only one of the two bases differs from the base on the reference genome, it is a heterozygous mutant; if both bases are the same as the base on the reference genome, it is wild type.

[0003] Determination and analysis of genotype based on high-throughput sequencing is one of the main means of determining whether there are variations in individual genes. Currently, the determination of the genotype of each site is based on the ratio of the depth of the mutant base in the s...
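The genotype definitions in paragraph [0002] can be expressed as a short sketch that counts how many of the two alleles differ from the reference base. The function name and the example bases are hypothetical.

```python
def classify_genotype(allele_a: str, allele_b: str, reference_base: str) -> str:
    """Classify a diploid site by how many alleles differ from the reference."""
    differing = sum(base != reference_base for base in (allele_a, allele_b))
    labels = ("wild type", "heterozygous mutant", "homozygous mutant")
    return labels[differing]


# Hypothetical site whose reference base is A.
print(classify_genotype("A", "A", "A"))
print(classify_genotype("A", "G", "A"))
print(classify_genotype("G", "G", "A"))
```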

Application Information

IPC(8): G16B5/20, G16B20/00
Inventor 黄铨飞吕来灰朱鹏远王杨
Owner CAPITALBIO GENOMICS