A bidirectional edge extension method for variable-length kmer queries based on multi-step bidirectional de Bruijn graphs

A bidirectional edge extension method applied in special data processing applications and electrical digital data processing. It addresses the large memory and computing-time consumption of existing assembly strategies and their inability to resolve repeated sequences, and it aims to increase contig length while keeping contig quality under control.

Active Publication Date: 2017-04-05
SHENZHEN INST OF ADVANCED TECH
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

An assembly strategy based on fixed-length kmers cannot resolve all repeats whose length is approximately the kmer length.
Although IDBA can iteratively shrink the De Bruijn graph across a range of kmer lengths, it must decompose, store, and process all the sequences again for each kmer length, which consumes a large amount of memory and computing time.
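
The repeat problem described above can be illustrated with a minimal sketch. It assumes a simplified single-strand De Bruijn graph (no reverse complements, so it is not the bidirectional graph of the invention), and the function name `branching_nodes` and the toy sequence are hypothetical. The sequence contains the repeat GCTT twice; with a short kmer the graph forks at the repeat, while a longer kmer spans it.

```python
from collections import defaultdict


def branching_nodes(seq, k):
    """Return the (k-1)-mer nodes that have more than one successor.

    Simplified, single-strand De Bruijn graph: each kmer of `seq`
    is an edge from its (k-1)-prefix to its (k-1)-suffix.
    """
    successors = defaultdict(set)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        successors[kmer[:-1]].add(kmer[1:])
    return sorted(node for node, nxt in successors.items() if len(nxt) > 1)


# Toy sequence (an assumption for illustration): AA GCTT AC GCTT GG
toy = "AAGCTTACGCTTGG"

print(branching_nodes(toy, 4))  # fork inside the repeat
print(branching_nodes(toy, 6))  # kmer is longer than the repeat, no fork
```

With k = 4 the node CTT has two successors, so the path through the repeat is ambiguous. With k = 6 every node has a single successor. Rebuilding the whole graph for each k, as the iterative strategy does, is what makes the cost grow.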

Examples

Embodiment Construction

[0079] In order to make the objectives, technical solutions and advantages of the present invention clearer, the following further describes the present invention in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

[0080] In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict with each other.

[0081] The purpose of the present invention is to provide a method for extending forked bidirectional edges based on variable-length kmer queries, so that the De Bruijn graph keeps shrinking and contigs keep growing without introducing errors that would lower contig quality and accuracy.

[0082] The invention provides a bidirectional edge expansion method based on v...

Abstract

The invention relates to the technical field of gene sequencing and provides a bidirectional edge extension method for variable-length kmer queries based on a multi-step bidirectional De Bruijn graph. The method comprises the following steps: A) reading a sequencing data source file and constructing a multi-step bidirectional De Bruijn graph; B) counting the forked bidirectional edges in the multi-step bidirectional De Bruijn graph; C) extending the bidirectional edges in the multi-step bidirectional De Bruijn graph based on variable-length kmer queries. Variable-length kmer combinations are constructed only for forked edges that already exist in the De Bruijn graph; the number of times each combination occurs in the input sequences is queried; and the bidirectional edges to merge are selected according to these counts. Because the best-supported combination of bidirectional edges is selected, the chance of merging edges incorrectly is minimized, contig length increases markedly, and the loss of contig quality is kept to a minimum.
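
Steps B and C can be sketched roughly as follows. This is one possible reading of the abstract, not the patented implementation: it works on a single strand, takes the forked node and its incoming and outgoing edge sequences as given, and the names `count_occurrences`, `choose_edge_pairs`, the `min_support` threshold, and the greedy selection order are all assumptions made for illustration.

```python
from itertools import product


def count_occurrences(reads, query):
    """Count occurrences of `query` across all reads (overlaps included)."""
    total = 0
    for read in reads:
        start = read.find(query)
        while start != -1:
            total += 1
            start = read.find(query, start + 1)
    return total


def choose_edge_pairs(reads, in_edges, node, out_edges, min_support=1):
    """Pair incoming and outgoing edges at a forked node by read support.

    For each (in, out) combination a variable-length kmer
    in + node + out is built and looked up in the reads.
    Greedy choice and `min_support` are illustrative assumptions.
    """
    support = {}
    for left, right in product(in_edges, out_edges):
        query = left + node + right  # spans the whole fork
        support[(left, right)] = count_occurrences(reads, query)

    chosen, used_left, used_right = [], set(), set()
    # Highest support first; each edge is merged at most once.
    for (left, right), n in sorted(support.items(), key=lambda kv: -kv[1]):
        if n >= min_support and left not in used_left and right not in used_right:
            chosen.append((left, right, n))
            used_left.add(left)
            used_right.add(right)
    return chosen


# Hypothetical reads covering two copies of the repeat GCTT.
reads = ["AAGCTTAC", "TAAGCTTACG", "CGGCTTGG", "ACGGCTTGGA", "CGGCTTGGT"]
pairs = choose_edge_pairs(reads, ["AA", "CG"], "GCTT", ["AC", "GG"])
print(pairs)  # [('CG', 'GG', 3), ('AA', 'AC', 2)]
```

Only the four combinations around the fork are queried, rather than re-decomposing every read for a new kmer length. The unsupported combinations (AA with GG, CG with AC) have a count of zero and are never merged.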

Description

[0001] Technical Field

[0002] The present invention relates to the technical field of gene sequencing, and in particular to a bidirectional edge extension method for variable-length kmer queries based on a multi-step bidirectional De Bruijn graph.

[0003] Background Art

[0004] Gene sequence analysis is based on algorithms and mathematical models. Its research covers many areas, including storage and retrieval of genetic data, sequence alignment, sequencing and assembly, gene prediction, biological evolution and phylogenetic analysis, protein structure prediction, RNA structure prediction, molecular design and drug design, metabolic network analysis, gene chips, and DNA computing. The close integration of biotechnology and computer information processing technology has sped up the processing of biological data, enabling the interpretation of biological significance as accurately as possible in the shortest possible time and accelerating ...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F19/22
Inventor: 孟金涛, 张慧琳, 彭丰斌, 魏彦杰, 冯圣中
Owner: SHENZHEN INST OF ADVANCED TECH