Single nucleotide polymorphism site identification method based on enzyme-digestion library construction, sequencing, and Bayesian statistics

A technology for single nucleotide polymorphism (SNP) identification based on enzyme-digestion library construction, applied in the field of SNP site identification. It addresses the problems that the accuracy of SNP genotypes cannot be guaranteed, that the error rate of existing statistical methods cannot be determined, and that existing approaches lack statistical significance. Its effects are improved statistical significance, improved accuracy, and reduced cost.

Active Publication Date: 2013-05-22
SHANGHAI MAJORBIO BIO PHARM TECH

AI Technical Summary

Problems solved by technology

This method lacks statistical significance and is strongly affected by external factors such as the total amount of sequencing, so the accuracy of the identified SNP genotypes cannot be guaranteed.
Literature [9] improves t...

Examples

Embodiment

[0069] Specific operating procedure of the embodiment:

[0070] The data obtained by RAD-PE sequencing of the two parents were filtered by sequencing quality value, N content, and the presence of the restriction-site end sequence, and unqualified reads were removed. The resulting effective data statistics are shown in Table 1.
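
The three filtering criteria named in [0070] can be illustrated with a minimal Python sketch. It is not the procedure of the invention: the source gives no thresholds, enzyme, or quality encoding, so the function names, the Phred+33 offset, the cut-offs `min_mean_q` and `max_n_frac`, and the end sequence `"AATTC"` are all hypothetical placeholders. The sketch also assumes that "contains the restriction end sequence" means the read begins with that sequence.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding is assumed)."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def passes_filter(
    seq: str,
    quality: str,
    site: str = "AATTC",       # placeholder end sequence, not from the source
    min_mean_q: float = 20.0,  # hypothetical quality threshold
    max_n_frac: float = 0.05,  # hypothetical limit on the fraction of N bases
) -> bool:
    """Return True if a read meets all three criteria, False otherwise."""
    if not seq or len(seq) != len(quality):
        return False
    # Criterion 1: sequencing quality value.
    if mean_phred(quality) < min_mean_q:
        return False
    # Criterion 2: N content.
    upper = seq.upper()
    if upper.count("N") / len(upper) > max_n_frac:
        return False
    # Criterion 3: the read must start with the restriction end sequence.
    return upper.startswith(site)
```

In practice the thresholds and the end sequence would be chosen from the enzyme and sequencing platform actually used, which the text above does not specify.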

[0071] Table 1: Effective data statistics of gourd RAD sequencing

[0072]
Name           Data used (bp)   Name    Data used (bp)   Name    Data used (bp)
male parent    585,377,540      F2-46   3,800,775        F2-93   1,556,104
female parent  423,794,746      F2-47   2,522,407        F2-94   1,651,259
F2-1           3,114,771        F2-48   4,636,152        F2-95   3,213,147
F2-2           2,302,730        F2-49   3,737,623        F2-96   2,202,354
F2-3           537,822          F2-50   647,499          F2-97   1,956,440
F2-4           1,650,925        F2-51   3,678,334        F2-98   1,112,431
F2-5           2,824,708        F2-52   2,153,996        F2-99   1,086,168
F2-6           579,177          F2-53   7,029,889...

Abstract

The invention discloses a single nucleotide polymorphism (SNP) site identification method based on enzyme-digestion library construction, sequencing, and Bayesian statistics. The method processes RAD (restriction-site-associated DNA) sequencing data, searches for candidate SNPs on RAD sequencing fragments, and assesses the reliability of each SNP with a bioinformatics analysis method based on Bayesian statistics. It can be applied to both model and non-model organisms, which removes the limitation that many species lack reference sequences and reduces sequencing cost. It also addresses the current bottleneck that no reliable statistical method exists for SNP identification from RAD data, so the accuracy of the resulting SNP sites is greatly improved.
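
To make the idea of a Bayesian reliability assessment concrete, the following Python sketch computes posterior probabilities for the three diploid genotypes at a candidate site from allele read counts. It is a generic textbook model, not the specific test of the invention, whose details are not given in this text. The function name `genotype_posteriors`, the uniform prior, the binomial likelihood, and the per-base `error_rate` of 0.01 are all assumptions made for illustration.

```python
from math import comb


def genotype_posteriors(n_ref, n_alt, error_rate=0.01, priors=None):
    """Posterior P(genotype | read counts) for genotypes AA, AB and BB.

    n_ref and n_alt are the read counts supporting allele A and allele B.
    error_rate is an assumed per-base error probability, strictly between 0 and 1.
    """
    if priors is None:
        # Assumed uniform prior over the three genotypes.
        priors = {"AA": 1 / 3, "AB": 1 / 3, "BB": 1 / 3}
    # Probability that a single read shows allele B under each genotype.
    p_alt = {"AA": error_rate, "AB": 0.5, "BB": 1 - error_rate}
    n = n_ref + n_alt
    unnormalized = {}
    for genotype, p in p_alt.items():
        # Binomial likelihood of observing n_alt B-reads out of n reads.
        likelihood = comb(n, n_alt) * p**n_alt * (1 - p) ** n_ref
        unnormalized[genotype] = priors[genotype] * likelihood
    total = sum(unnormalized.values())
    return {g: v / total for g, v in unnormalized.items()}
```

A site would then be reported only when the largest posterior exceeds a chosen confidence level. For example, `genotype_posteriors(10, 10)` favors the heterozygous genotype `AB`, while `genotype_posteriors(20, 0)` favors `AA`.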

Description

Technical field

[0001] The invention relates to a method for identifying single nucleotide polymorphism sites based on enzyme-digestion library construction, sequencing, and Bayesian statistics. Specifically, a dedicated Bayesian statistical test is performed on the single nucleotide polymorphism (SNP) sites obtained by single-end or paired-end sequencing of enzyme-digestion libraries, so as to identify SNP genotypes accurately; the method can provide reliable statistical significance for SNP testing in the absence of a reference genome sequence. The method belongs to the technical field of bioinformatics. It is of great significance for the study of non-model organisms lacking reference sequences and for the accuracy of genotype identification.

Background technique

[0002] A SNP (single nucleotide polymorphism) marker is a variation of a single nucleotide in the genome, and such markers are numerous. There is one ...

Application Information

IPC(8): C12Q1/68
Inventor 陶晔, 钱刚, 郑泽群, 胡秋萍
Owner SHANGHAI MAJORBIO BIO PHARM TECH