Sequencing data correction method of semiconductor sequencing platform using reference genome information

The technology applies reference genome information to the correction of sequencing data from semiconductor sequencing platforms, with the effect of improving accuracy.

Active Publication Date: 2018-04-17
HARBIN ENG UNIV
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

In the SFF file, the base length corresponding to a measured voltage value is calculated from the voltage value alone, so the decoded lengths carry a certain error rate.


Examples

Detailed Description of the Embodiments

[0022] The present invention is further described by the following example:

[0023] With reference to Figure 3, the main steps of the present invention include:

[0024] 1. From the original SFF sequencing file generated by the semiconductor sequencing platform, obtain, for each sequencing read, the base type and voltage value detected in each detection cycle, together with the sequence number of that detection cycle.

[0025] In each sequencing run, the type of deoxyribonucleotide added in each detection cycle is fixed. Therefore, from the sequence number of a detection cycle, the base type detected in that cycle can be determined, and the measured voltage value of that cycle can be obtained at the same time.
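The lookup from detection cycle to base type described in [0025] can be sketched as follows. This is a minimal illustration, assuming the nucleotides are flowed in a short repeating order; the order "TACG" and the names `base_for_cycle` and `annotate_flows` are hypothetical and not taken from the patent. In practice the order would be read from the sequencing file rather than hard-coded.

```python
from typing import List, Sequence, Tuple


def base_for_cycle(cycle_index: int, flow_order: str = "TACG") -> str:
    """Return the base type tested in a 0-based detection cycle.

    Assumption: the flow order repeats cyclically. "TACG" is only an
    illustrative default, not a value stated in the source.
    """
    if cycle_index < 0:
        raise ValueError("cycle_index must be non-negative")
    return flow_order[cycle_index % len(flow_order)]


def annotate_flows(
    voltages: Sequence[float], flow_order: str = "TACG"
) -> List[Tuple[int, str, float]]:
    """Pair each measured voltage with its cycle number and base type."""
    return [
        (i, base_for_cycle(i, flow_order), v)
        for i, v in enumerate(voltages)
    ]


if __name__ == "__main__":
    for cycle, base, voltage in annotate_flows([1.02, 0.10, 2.05]):
        print(cycle, base, voltage)
```

Keeping the cycle number alongside the base type and voltage matches the three items that step 1 extracts for each read.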

[0026] In theory, when the tested base has length n, the sequencing platform should output n volts. In practice, however, the output voltage will not be exactly n volts. ...
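For comparison, decoding that relies on the voltage alone can be sketched as rounding each measured value to the nearest non-negative integer. The rounding rule, the "TACG" flow order, the example voltages, and the function names below are all assumptions made for illustration; the source does not specify how the SFF file decodes lengths.

```python
import math
from typing import Sequence


def decode_length(voltage: float) -> int:
    """Voltage-only decoding: nearest non-negative integer (half rounds up)."""
    return max(0, math.floor(voltage + 0.5))


def decode_read(voltages: Sequence[float], flow_order: str = "TACG") -> str:
    """Build a base sequence by repeating each cycle's base decode_length times.

    Assumption: cycle i tests flow_order[i % len(flow_order)].
    """
    bases = []
    for i, v in enumerate(voltages):
        bases.append(flow_order[i % len(flow_order)] * decode_length(v))
    return "".join(bases)


if __name__ == "__main__":
    # 2.46 sits almost halfway between 2 and 3, so a small amount of
    # measurement noise is enough to change the decoded length.
    print(decode_read([1.04, 0.03, 2.46, 0.98]))
```

Values close to a half-integer boundary, such as 2.46 here, are where a voltage-only rule is most likely to call the wrong length.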

Abstract

The invention provides a sequencing data correction method for a semiconductor sequencing platform that uses reference genome information. The method comprises the following steps: (1) using the measured voltage values at positions where the base length decoded from the platform's sequencing data agrees with the length of the corresponding base in the reference genome, calculate the prior probability distribution of the measured voltage value given a known base length; (2) where the decoded base length does not agree with the length of the corresponding base in the reference genome, correct the base length of the sequencing data: for the known measured voltage value, calculate by formula a value S_l for each assumed base length l, and take the l at which S_l is maximal as the base length for that measured voltage value, which completes the correction of the sequencing data. The innovation of the method is that, when decoding the base length from a measured voltage value, reference genome information is introduced in addition to the voltage value itself, so that the sequencing data can be corrected.
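The two steps of the abstract can be sketched roughly as below. The patent's actual formula for the score it maximizes is not reproduced on this page, so this sketch substitutes a Gaussian log-likelihood as a stand-in: it assumes the voltage for each base length is normally distributed, with mean and spread estimated from positions where the decoded length agreed with the reference genome. The function names, the Gaussian assumption, and the sample values are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Tuple

Model = Dict[int, Tuple[float, float]]  # base length -> (mean, std dev)


def fit_voltage_model(samples: Iterable[Tuple[int, float]]) -> Model:
    """Step 1: estimate the voltage distribution for each known base length.

    samples holds (base_length, voltage) pairs taken only from positions
    where the decoded length matched the reference genome.
    """
    by_length = defaultdict(list)
    for length, voltage in samples:
        by_length[length].append(voltage)
    # Floor the spread so a length with identical samples stays usable.
    return {
        length: (mean(vs), max(pstdev(vs), 1e-3))
        for length, vs in by_length.items()
    }


def log_score(voltage: float, length: int, model: Model) -> float:
    """Gaussian log-density of the voltage under an assumed base length.

    Stand-in for the patent's score, which is not shown in the source.
    """
    mu, sigma = model[length]
    return (
        -math.log(sigma * math.sqrt(2.0 * math.pi))
        - (voltage - mu) ** 2 / (2.0 * sigma ** 2)
    )


def correct_length(voltage: float, model: Model) -> int:
    """Step 2: choose the candidate length with the highest score."""
    return max(model, key=lambda length: log_score(voltage, length, model))


if __name__ == "__main__":
    agreed = [
        (1, 0.95), (1, 1.05), (1, 1.00),
        (2, 1.90), (2, 2.10), (2, 2.00),
        (3, 2.70), (3, 2.90), (3, 2.80),
    ]
    model = fit_voltage_model(agreed)
    print(correct_length(2.45, model))
```

In this made-up data the voltages for length 3 cluster around 2.8 rather than 3.0, so a reading of 2.45 is assigned length 3 even though rounding alone would give 2. That is the kind of case where a distribution learned from reference-consistent positions can change the decoded length.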

Description

Technical field

[0001] The invention relates to a method for detecting molecular biological information, specifically a sequencing data correction method for next-generation semiconductor sequencing platforms.

Background

[0002] With the rapid development of biological detection technology, second-generation sequencing platforms such as Illumina's Solexa, Life Sciences' 454, and ABI's SOLiD have gradually been replaced by next-generation sequencing platforms. These include Illumina's MiSeq, NextSeq, and HiSeq series, ABI's Ion Torrent, Ion Proton, and Ion PGM series, and Oxford Nanopore Technologies' MinION, among others. Although next-generation sequencing platforms have made the detection of biological information deeper, cheaper, and more efficient, their differing detection principles mean that the methods for interpreting the raw high-throughput sequencing data must change accordingly.

[0003] Among the newly launched next-generati...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F19/20
CPC: G16B25/00
Inventor: 冯伟兴, 薛丁恺, 赵森, 陈多娇, 贺波
Owner: HARBIN ENG UNIV