Methods and systems for assembling genome sequences

A technique for assembling genome sequences, applied in the field of bioinformatics, that addresses problems such as base-incorporation errors, improves splicing accuracy, and saves time in data processing

Active Publication Date: 2015-11-18
BGI TECH SOLUTIONS

AI Technical Summary

Problems solved by technology

Insertion errors arise when the enzyme sometimes randomly selects bases that are not actually incorporated into the synthetic strand




Embodiment Construction

[0042] The present invention will be described more fully below in conjunction with the accompanying drawings and preferred embodiments. It should be understood that the preferred embodiments described herein are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0043] The relative arrangements of components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the present invention unless otherwise stated. Techniques, methods and devices known to those of ordinary skill in the art may not be discussed in detail, but where appropriate, such techniques, methods and devices should be considered a part of this description.

[0044] Efficient and fast de novo assembly helps to discover structural variations of large fragments, which is of great significance for understanding disease-related genomes and genetic changes of diseases with fusion genes, copy number variations, and larg...


Abstract

The invention provides a method and system for assembling a genomic sequence, wherein the high-precision short-fragment sequence data obtained by the second-generation sequencing technology and the long-fragment sequence data obtained by single-molecule sequencing are combined to perform the assembly of a genomic sequence, and the assembly efficiency and accuracy are improved. The method specifically comprises the steps of sequencing a sample with the second-generation sequencing technology to obtain high-precision short-fragment sequences of the sample; splicing the obtained high-precision short-fragment sequences to obtain the first spliced sequence; sequencing a sample from the same source with the single-molecule sequencing technology to obtain long-fragment sequences of the sample from the same source; splicing the obtained long-fragment sequences to obtain the second spliced sequence; positioning the first spliced sequence to the second spliced sequence; performing local error correction of the long-fragment sequences in the second spliced sequence by use of the high-precision short-fragment sequences in the first spliced sequence to obtain the third spliced sequence.
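The positioning and local error-correction steps described in the abstract can be illustrated with a deliberately simplified sketch. It is not the patented method: it assumes substitution-only errors and ungapped placement, whereas single-molecule reads typically also contain insertions and deletions and would need a gapped aligner. The function names (`best_offset`, `correct_long_contig`) and the `max_mismatch_rate` parameter are hypothetical, introduced only for this illustration.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_offset(long_seq, short_seq):
    """Return (offset, mismatches) for the ungapped placement of short_seq
    on long_seq with the fewest mismatches (first such placement on ties)."""
    best = None
    for off in range(len(long_seq) - len(short_seq) + 1):
        mm = hamming(long_seq[off:off + len(short_seq)], short_seq)
        if best is None or mm < best[1]:
            best = (off, mm)
    return best


def correct_long_contig(long_contig, short_contigs, max_mismatch_rate=0.2):
    """Toy local correction: position each accurate short contig on the
    error-prone long contig, then overwrite the covered region with the
    short contig's bases. Assumption: substitution errors only."""
    corrected = list(long_contig)
    for sc in short_contigs:
        if not sc or len(sc) > len(long_contig):
            continue  # cannot be placed
        off, mm = best_offset(long_contig, sc)
        if mm / len(sc) <= max_mismatch_rate:  # accept confident placements only
            corrected[off:off + len(sc)] = sc
    return "".join(corrected)


# Made-up example data, not taken from the patent.
truth = "ACGTTGCAAGGCTTACGGAT"
noisy_long = "ACGATGCAAGGCTTCCGGAT"      # two substitution errors
short_contigs = [truth[:10], truth[10:]]  # accurate short-read contigs
print(correct_long_contig(noisy_long, short_contigs))
```

In this toy setting the long contig supplies the overall layout and the short contigs supply base-level accuracy, which mirrors the division of labour the abstract describes between the second and first spliced sequences.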

Description

Technical field

[0001] The invention relates to the technical field of bioinformatics, in particular to a method and device for assembling genome sequences.

Background

[0002] Second-generation sequencing technology has greatly promoted the development of bioinformatics, and the genomes of a large number of species have been sequenced. However, current second-generation sequencing technology produces short fragment sequences of only about 100-150 bp. Compared with the huge genome, a read length of only 100-150 bp makes the splicing work extremely difficult. Although many users have obtained a large amount of sequencing data, with sequencing coverage depth reaching dozens or even hundreds of times, they still cannot complete the genome assembly. How to restore the massive small-fragment sequence data obtained by this sequencing to the large-fragment data in the sample poses a great challenge to the s...
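The "coverage depth" mentioned in the background is commonly estimated as the number of reads times the read length divided by the genome size. The sketch below shows that arithmetic; the function names and the 3 Gbp genome size are illustrative assumptions, not values from the patent.

```python
import math


def coverage_depth(num_reads, read_length_bp, genome_size_bp):
    """Average coverage depth: total sequenced bases / genome size."""
    return num_reads * read_length_bp / genome_size_bp


def reads_needed(target_depth, read_length_bp, genome_size_bp):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


# Assumed example: 150 bp reads on a hypothetical 3 Gbp genome.
print(coverage_depth(600_000_000, 150, 3_000_000_000))
print(reads_needed(30, 150, 3_000_000_000))
```

High average depth does not by itself resolve regions longer than a single read, which is why the background notes that assembly can still fail at dozens or hundreds of times coverage.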


Application Information

Patent Type & Authority: Patents (China)
IPC(8): C12Q1/68, C12M1/00
CPC: C12Q1/6869, G16B30/00
Inventor: 詹东亮
Owner: BGI TECH SOLUTIONS