Method and system for fast processing genome short sequence mapping

A short-sequence mapping method, applied in the field of genetic engineering, that addresses the low efficiency and long processing time of existing approaches, improving efficiency and shortening processing time.

Active Publication Date: 2010-06-23
BGI TECH SOLUTIONS

AI Technical Summary

Problems solved by technology

Existing short sequence comparison software can map short sequences to contigs within two mismatches, but when handling the alignment between contigs and short sequences its processing time is long and its efficiency is low, so it cannot adequately meet the needs of short sequence assembly.

Embodiment Construction

[0016] To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it.

[0017] In the embodiment of the present invention, the sequencing sequences are sorted according to the base values of short strings of a preset length, and the contig is cut base by base into short strings of the same preset length. The base value of each short string cut from the contig is then used, in turn, to search the sorted sequencing sequences for the corresponding sequencing sequence, and a mapping relationship is established.
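The sort-then-search idea in paragraph [0017] can be sketched as follows. This is a minimal illustration, not the patented implementation: the 2-bit encoding used as the "base value", the choice of each sequence's first `k` bases as its sort key, the exact-match verification, and the names `base_value` and `map_reads_to_contig` are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

# Assumed encoding: the source only speaks of a "base value" per short string.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def base_value(short_string):
    """Encode a string of bases as a base-4 integer (hypothetical helper)."""
    value = 0
    for base in short_string:
        value = value * 4 + BASE_CODE[base]  # raises KeyError on N or other symbols
    return value


def map_reads_to_contig(reads, contig, k):
    """Return {read index: [contig start positions]} for exact matches only."""
    # Step 1: sort the sequencing sequences by the base value of their first k bases.
    keyed = sorted(
        (base_value(read[:k]), idx) for idx, read in enumerate(reads) if len(read) >= k
    )
    keys = [key for key, _ in keyed]

    mapping = {}
    # Step 2: cut the contig base by base into short strings of length k.
    for pos in range(len(contig) - k + 1):
        value = base_value(contig[pos:pos + k])
        # Step 3: binary-search the sorted keys for sequences sharing this base value.
        lo = bisect_left(keys, value)
        hi = bisect_right(keys, value)
        for _, idx in keyed[lo:hi]:
            read = reads[idx]
            # Verify the whole sequence, not just its first k bases.
            if contig[pos:pos + len(read)] == read:
                mapping.setdefault(idx, []).append(pos)
    return mapping


if __name__ == "__main__":
    print(map_reads_to_contig(["ACGT", "GTAC", "TTTT"], "AACGTACG", 3))
```

Sorting once lets every contig position be resolved with a logarithmic-time lookup instead of a scan over all sequencing sequences. Mismatch tolerance is not modelled here, since the visible text does not describe how the method handles mismatches.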

[0018] Figure 1: The implementation flow of the rapid processing method for genome short sequence mapping provided by the embodiment of the present invention is s...

Abstract

Applicable to the technical field of genetic engineering, the invention provides a method and a system for fast processing of genome short sequence mapping. The method comprises the following steps: sorting the sequencing sequences according to the base values of short strings of a preset length; cutting the contig base by base into short strings of the preset length; searching the sorted sequencing sequences for the corresponding sequencing sequence according to the base value of each short string cut from the contig; and establishing a mapping relationship. In this way, short sequence mapping suited to short sequence assembly is achieved with a short processing time and high processing efficiency.

Description

Technical field

[0001] The invention belongs to the technical field of genetic engineering, and in particular relates to a rapid processing method and system for genome short sequence mapping.

Background art

[0002] Assembling the short sequences of large genomes is constrained by memory. To reduce the memory used in building de Bruijn graphs, assembly software may avoid recording the correspondence between sequencing sequences and contigs (sequence fragments) in memory, and instead map the sequencing sequences back to the contigs only after contig assembly is complete. Existing short sequence alignment is mostly implemented in software and falls into two main categories: one uses a combined index structure of fixed-length short strings (k-mers), and the other uses a suffix-tree-like index structure. Existing short sequence comparison software can map short sequences to contigs within two mismatches, but when dealing with the alignment betwe...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F19/00, C12Q1/68
Inventors: Li Ruiqiang (李瑞强), Zhu Hongmei (朱红梅), Wang Jun (王俊), Yang Huanming (杨焕明), Wang Jian (汪建)
Owner: BGI TECH SOLUTIONS