Hybrid method for correcting sequencing errors in third-generation sequencing data under heterozygous variation

A technique for correcting sequencing data, applied in sequence analysis, instrumentation, genomics, etc., that addresses problems such as the miscorrection of heterozygous variants

Active Publication Date: 2020-08-25
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the present invention is to provide a hybrid method for correcting sequencing errors in third-generation sequencing data under heterozygous variation, addressing the problems caused by the single voting mechanism and other structural features of existing correction algorithms. Miscorrection of he ...



Examples


Embodiment Construction

[0057] The present invention provides QIHC (QI Heterozygosity Correction), a hybrid method for correcting sequencing errors in third-generation sequencing data under heterozygous variation. The input data are second-generation sequencing data (hereinafter S) and third-generation sequencing data (hereinafter L). The input data are processed with existing alignment and assembly software, the heterozygosity of each gene locus is judged on the Bayesian classifier principle, and the reads in L are corrected according to the result of that judgment. This addresses the low accuracy and ineffectiveness of existing correction algorithms when dealing with heterozygous variants.
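As a rough illustration of what judging heterozygosity with a Bayesian classifier can look like, the sketch below classifies a single biallelic locus from the short-read (S) base counts aligned to it, then picks a correction for a long-read (L) base. It is not the QIHC model: the binomial likelihood, the `error_rate`, the `priors`, the three class labels, and the function names are all assumptions made for this example, and the excerpt does not disclose the patent's actual probability models or correction strategies.

```python
from math import comb

LABELS = ("hom_ref", "het", "hom_alt")  # hypothetical class names


def genotype_posteriors(ref_count, alt_count, error_rate=0.01,
                        priors=(0.90, 0.05, 0.05)):
    """Posterior over LABELS given short-read counts at one locus.

    Assumes each short read shows the alt base independently with
    probability error_rate (hom_ref), 0.5 (het) or 1 - error_rate (hom_alt).
    """
    n = ref_count + alt_count
    p_alt = (error_rate, 0.5, 1.0 - error_rate)
    likelihoods = [
        comb(n, alt_count) * p ** alt_count * (1.0 - p) ** ref_count
        for p in p_alt
    ]
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]


def classify_locus(ref_count, alt_count, **kwargs):
    """Return the label with the highest posterior probability."""
    post = genotype_posteriors(ref_count, alt_count, **kwargs)
    best = max(range(len(LABELS)), key=post.__getitem__)
    return LABELS[best]


def correct_base(long_read_base, ref_base, alt_base, label):
    """Choose a corrected base for one long-read position.

    Homozygous loci are forced to the single supported allele. At a
    heterozygous locus both alleles are legitimate, so a long-read base
    that matches either allele is left untouched.
    """
    if label == "hom_ref":
        return ref_base
    if label == "hom_alt":
        return alt_base
    if long_read_base in (ref_base, alt_base):
        return long_read_base
    return ref_base  # arbitrary fallback chosen for this sketch only


if __name__ == "__main__":
    label = classify_locus(ref_count=10, alt_count=9)
    print(label, correct_base("G", "A", "G", label))
```

The point of the sketch is the difference from a plain majority vote: with 10 reference and 9 alternate short reads, a vote would overwrite every long-read `G` with `A`, whereas a locus classified as heterozygous keeps the `G`.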

[0058] The present invention is based on the following assumptions, which are generally accepted in the academic community:

[0059] 1. According to the existing sequencing technology standards, there may be 5 types of sequencing results for a ...


Abstract

The invention discloses a hybrid method for correcting sequencing errors in third-generation sequencing data under heterozygous variation. The input data are second-generation sequencing data and third-generation sequencing data, which are processed using existing alignment and assembly software. The heterozygosity of gene loci is judged on the Bayesian classifier principle, and the reads in the third-generation sequencing data are corrected according to the result of that judgment, which addresses the low accuracy and ineffectiveness of existing correction algorithms when handling heterozygous variation. The invention takes heterozygous variation into account when correcting sequencing errors: a series of probability models is designed to judge and classify heterozygosity, and a different correction strategy is adopted for each heterozygosity class, so that the miscorrections made by existing methods when they encounter heterozygous variation are avoided.

Description

Technical field

[0001] The invention belongs to the technical field of third-generation sequencing, and in particular relates to a hybrid method for correcting sequencing errors in third-generation sequencing data under heterozygous variation.

Background technique

[0002] Genome sequencing technology, especially single-molecule long-read sequencing, also known as third-generation sequencing (TGS), has revolutionized genomics research. TGS technology not only retains the high-throughput advantage of next-generation sequencing (NGS) technology, but also generates longer reads, up to 10 kbp. TGS has therefore given great impetus to many fields, such as the detection of structural variations, the identification of methylation, and the diagnosis of diseases. Although TGS is in the leading position in terms of read length ...


Application Information

IPC(8): G16B20/20, G16B20/30, G16B30/00
CPC: G16B20/20, G16B20/30, G16B30/00
Inventor: 王嘉寅, 刘佳琦, 赖欣, 萧笑, 张选平, 朱晓燕
Owner: XI AN JIAOTONG UNIV