Method and system for graph construction in short sequence assembly

A graph construction technique for short sequence assembly that addresses problems such as the inability to assemble large genomes, achieving high speed and a small memory footprint.

Active Publication Date: 2009-05-13
BGI TECH SOLUTIONS

AI Technical Summary

Problems solved by technology

[0006] The purpose of the embodiments of the present invention is to provide a method for constructing graphs in short se

Examples

Embodiment Construction

[0019] To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it.

[0020] In the embodiment of the present invention, each received sequencing sequence is cut by sliding along it one base at a time, yielding short strings of fixed base length together with the left and right connection relationships of those short strings. The sequence value of each short string, its left and right connection relationships, and the corresponding connection counts are stored as a node of the de Bruijn graph.
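The sliding cut described in [0020] can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function name `build_de_bruijn_nodes`, the dictionary layout of a node, and the choice to record each neighbor as the single base to the left or right of the short string are all assumptions made for this example.

```python
def build_de_bruijn_nodes(reads, k):
    """Cut every read into overlapping short strings of length k.

    Returns a dict mapping each short string to a node (hypothetical layout):
      node["left"][base]  = times `base` was seen immediately before the string
      node["right"][base] = times `base` was seen immediately after the string
    """
    nodes = {}
    for read in reads:
        # Slide one base at a time; a read of length n yields n - k + 1 strings.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            node = nodes.setdefault(kmer, {"left": {}, "right": {}})
            if i > 0:
                # Base to the left of the window: a left connection.
                prev_base = read[i - 1]
                node["left"][prev_base] = node["left"].get(prev_base, 0) + 1
            if i + k < len(read):
                # Base to the right of the window: a right connection.
                next_base = read[i + k]
                node["right"][next_base] = node["right"].get(next_base, 0) + 1
    return nodes


if __name__ == "__main__":
    example = build_de_bruijn_nodes(["ACGTA", "CGTAC"], k=3)
    for kmer, node in sorted(example.items()):
        print(kmer, node)
```

Because each node keeps only its own string plus per-base connection counts, a short string that occurs in many reads still occupies a single entry, which is in line with the small memory footprint the text aims for.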

[0021] Figure 1 shows the implementation flow of the method for constructing a graph in short sequence assembly provided by the embodiment of the present invention, and the d...

Abstract

The invention belongs to the technical field of genetic engineering and provides a method and a system for constructing a graph in short sequence assembly. The method comprises the following steps: receiving sequencing sequences; sliding along each received sequencing sequence one base at a time and cutting it to obtain short strings of fixed base length and the left and right connection relationships of those short strings; and storing the sequence value of each short string, its left and right connection relationships, and the connection counts as a node of a de Bruijn graph. Constructing the graph in this way allows a large genome to be assembled quickly and with a small memory footprint.
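The abstract does not say how the "sequence value" of a short string is represented. One common way to keep such values compact, shown here purely as an assumption and not as the method of the invention, is to pack each base into two bits so that a short string of length k becomes one integer. The names `encode_kmer`, `decode_kmer` and `slide` are hypothetical.

```python
# Assumed 2-bit code for the four bases; any fixed one-to-one mapping would work.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_kmer(kmer):
    """Pack a short string into an integer, two bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_TO_BITS[base]
    return value


def decode_kmer(value, k):
    """Recover the k-base string from its packed integer value."""
    bases = []
    for _ in range(k):
        bases.append(BITS_TO_BASE[value & 3])
        value >>= 2
    return "".join(reversed(bases))


def slide(value, next_base, k):
    """Advance the window by one base without re-encoding the whole string."""
    mask = (1 << (2 * k)) - 1  # keep only the lowest 2*k bits
    return ((value << 2) | BASE_TO_BITS[next_base]) & mask


if __name__ == "__main__":
    v = encode_kmer("ACG")
    print(v, decode_kmer(v, 3))
    print(decode_kmer(slide(v, "T", 3), 3))
```

With this encoding, moving the window one base to the right costs a shift, an OR and a mask, regardless of k, and the value can serve directly as a lookup key for the node.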

Description

Technical field

[0001] The invention belongs to the technical field of genetic engineering, and in particular relates to a method and system for constructing graphs in short sequence assembly.

Background art

[0002] The short sequences produced by new sequencing technologies have two characteristics:

[0003] 1. The sequence length is short;

[0004] 2. The amount of data is large.

[0005] Software commonly used for long sequence assembly, such as phrap, splices sequences based on the overlaps between them; on short sequences the amount of computation is too large for this approach to be of practical value. Emerging short-sequence assembly software that successfully handles short sequences, such as velvet, is based on de Bruijn graphs. However, due to limitations of memory and time, existing short-sequence assembly software can only assemble small prokaryotic genomes, and cannot assemble large genomes, such as eukaryotic genomes, especially mammalian genome data. C...

Application Information

IPC(8): G06F19/00, C12Q1/68, G16B30/20
CPC: G06F19/22, G16B30/00, G16B30/20
Inventor: 李瑞强, 阮珏, 朱红梅, 李松岗, 王俊, 杨焕明, 汪建
Owner BGI TECH SOLUTIONS