Vertex expansion method for variable-length kmer queries based on multi-step bidirectional de Bruijn graphs

A vertex expansion technique for variable-length kmer queries. It addresses the high memory and computing-time cost of existing approaches and their inability to decouple repeated sequences, and it increases contig length while keeping contig quality under control.

Active Publication Date: 2016-09-07
SHENZHEN INST OF ADVANCED TECH

AI Technical Summary

Problems solved by technology

A fixed-length kmer assembly strategy cannot decouple all repeats of approximately kmer length.
Although IDBA iteratively shrinks the De Bruijn graph over a range of kmer lengths, it must decompose, store and process all sequences for each kmer length, which consumes a large amount of memory and computing time.
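
The repeat problem described above can be illustrated with a minimal sketch. It is not the patented method: it assumes a plain single-strand De Bruijn graph whose vertices are (k-1)-mers, and the names `build_de_bruijn` and `bifurcation_vertices` and the toy reads are hypothetical. With a small k, the shared repeat collapses into one path that ends in a fork; with a k long enough to span the repeat, the fork disappears.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    succ = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            succ[kmer[:-1]].add(kmer[1:])
    return succ


def bifurcation_vertices(succ):
    """Return the vertices that have more than one successor (forward forks)."""
    return sorted(v for v, nxt in succ.items() if len(nxt) > 1)


# Toy reads (assumed data): the 7-base repeat "GATTACA" occurs in two contexts.
reads = ["AAGATTACACC", "TTGATTACAGG"]

forks_k4 = bifurcation_vertices(build_de_bruijn(reads, 4))  # k shorter than the repeat
forks_k9 = bifurcation_vertices(build_de_bruijn(reads, 9))  # k spans the repeat

print(forks_k4)
print(forks_k9)
```

Rebuilding the whole graph at k = 9 removes the fork here, but doing so for every kmer length is exactly the cost attributed to IDBA above.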

Embodiment Construction

[0075] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

[0076] In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict with each other.

[0077] The purpose of the present invention is to design a bifurcation vertex expansion method based on variable-length kmer queries, which allows the De Bruijn graph to keep shrinking and the contigs to keep growing without introducing errors that would lower contig quality and accuracy.

[0078] The invention provides a vertex expansion method bas...

Abstract

The invention relates to the technical field of gene sequencing and provides a vertex expansion method for variable-length kmer queries based on a multi-step bidirectional De Bruijn graph. The method comprises the following steps: A) reading a sequencing data source file and constructing a multi-step bidirectional De Bruijn graph; B) constructing and counting the variable-length kmers at the bifurcation vertices of the multi-step bidirectional De Bruijn graph; C) expanding the vertices based on variable-length kmer queries in the multi-step bidirectional De Bruijn graph. The method selects only some bifurcation vertices, constructs a small number of variable-length kmers for them, and then decouples those bifurcation vertices directionally. Because it does not construct a De Bruijn graph for each kmer length, it resolves repeats whose overall length is less than the sequence length conveniently and quickly, and it maximizes contig length and quality.
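
Steps B and C of the abstract can be sketched roughly as follows. This is an illustrative simplification, not the claimed procedure: it assumes a single strand, treats a "variable-length kmer" as the vertex sequence plus one flanking base on each side, and uses hypothetical names (`count_spanning_kmers`, `decouple`, `min_support`) and toy reads. The idea shown is that counting a few longer kmers at one bifurcation vertex is enough to pair its incoming and outgoing edges, without rebuilding the graph at a larger k.

```python
from collections import Counter


def count_spanning_kmers(reads, vertex):
    """Count read substrings of the form x + vertex + y, keyed by (x, y)."""
    counts = Counter()
    span = len(vertex) + 2  # the kmer length depends on the vertex length
    for read in reads:
        for i in range(len(read) - span + 1):
            window = read[i:i + span]
            if window[1:-1] == vertex:
                counts[(window[0], window[-1])] += 1
    return counts


def decouple(counts, min_support=1):
    """Keep (left, right) pairs that are supported and unambiguous on both sides."""
    supported = [pair for pair, n in counts.items() if n >= min_support]
    lefts = Counter(left for left, _ in supported)
    rights = Counter(right for _, right in supported)
    return sorted(
        (left, right)
        for left, right in supported
        if lefts[left] == 1 and rights[right] == 1
    )


# Toy reads (assumed data): the vertex "GATTACA" is a repeat with two contexts.
reads = ["AAGATTACACC", "TTGATTACAGG"]
counts = count_spanning_kmers(reads, "GATTACA")
pairs = decouple(counts)
print(pairs)
```

Each returned pair says which incoming base continues into which outgoing base, so the repeat vertex could be copied once per pair. A real implementation would also need to handle reverse complements and sequencing errors, which this sketch ignores.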

Description

【Technical field】

[0001] The invention relates to the technical field of gene sequencing, in particular to a vertex expansion method for variable-length kmer queries based on a multi-step bidirectional De Bruijn graph.

【Background art】

[0002] Gene sequence analysis is centered on algorithms and mathematical models. Its research covers many areas, including storage and retrieval of genetic data, sequence alignment, sequencing and assembly, gene prediction, biological evolution and phylogenetic analysis, protein structure prediction, RNA structure prediction, molecular design and drug design, metabolic network analysis, gene chips, and DNA computing. The close integration of biotechnology and computer information processing technology has sped up the processing of bioinformatics data, so that biological meaning can be interpreted as accurately as possible in the shortest possible time, and has accelerated the development of bioinformatics. At present,...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F19/22
Inventor: 孟金涛, 张慧琳, 彭丰斌, 魏彦杰, 冯圣中
Owner: SHENZHEN INST OF ADVANCED TECH