Method and device for matching character strings

A character string matching technology, applied in the field of string matching, that addresses problems such as low matching efficiency.

Status: Inactive
Publication Date: 2013-02-13
BEIJING INST OF GENOMICS CHINESE ACAD OF SCI CHINA NAT CENT FOR BIOINFORMATION

AI Technical Summary

Problems solved by technology

The comparison process is less efficient



Embodiment Construction

[0034] In the embodiment of the present invention, the programmable logic device performs matching character by character, based on the BWT search algorithm. Each character in a character string to be matched carries the identifier of that string, and the coordinates obtained for the current character are merged with the coordinates previously obtained for characters of the same string. When the coordinates of all the characters in a character string to be matched have been merged, that string is determined to have successfully matched the target character string. In this way, the programmable logic device can process the matching of multiple characters within the matching cycle of one character, and no matching process has to wait until all the characters of one character string have been matched before the next process starts, which significantly improves matching efficiency.


Abstract

The invention discloses a method for matching character strings that improves matching efficiency. The method comprises the following steps: allocating a different identifier to each of a plurality of acquired character strings to be matched, so that every character in a character string to be matched corresponds to the identifier of that string; according to the Burrows-Wheeler Transform (BWT) search algorithm, matching every character of every character string to be matched against a target character string in BWT space, taking the coordinates of the previous character as the initial position, and acquiring the coordinates of the current character when that character is matched successfully; according to the identifier of the character string to be matched, merging the currently acquired coordinates into the matching route of the corresponding character string to be matched; and, when the coordinates of all the characters in a character string to be matched have been merged, determining that the character string to be matched has been successfully matched with the target character string. The invention also discloses a device for implementing the method.
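The steps in the abstract can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the patented hardware design: the helper names (`build_index`, `step`, `match_interleaved`) are hypothetical, the "coordinates" of a character are modelled as a half-open interval of rows in the sorted-suffix order of the target, the identifier is simply the position of a string in the input list, and a round-robin queue stands in for the programmable logic device handling one character per cycle.

```python
from collections import Counter, deque


def build_index(target: str):
    """Build a tiny BWT-based index (C table and Occ counts) for target + '$'."""
    s = target + "$"                                   # '$' is an assumed end marker
    sa = sorted(range(len(s)), key=lambda i: s[i:])    # naive suffix sort
    last = [s[i - 1] for i in sa]                      # BWT column (s[-1] is '$')
    counts = Counter(s)
    c_table, total = {}, 0
    for ch in sorted(counts):                          # C[ch] = number of smaller chars
        c_table[ch] = total
        total += counts[ch]
    occ = {ch: [0] * (len(s) + 1) for ch in counts}    # occ[ch][i] = count in last[:i]
    for i, ch in enumerate(last):
        for k in occ:
            occ[k][i + 1] = occ[k][i] + (1 if k == ch else 0)
    return c_table, occ, len(s)


def step(index, interval, ch):
    """One backward-search step: narrow the previous interval by character ch."""
    c_table, occ, _ = index
    if ch not in c_table:
        return None
    lo, hi = interval
    lo, hi = c_table[ch] + occ[ch][lo], c_table[ch] + occ[ch][hi]
    return (lo, hi) if lo < hi else None


def match_interleaved(target: str, patterns: list[str]):
    """Match several strings one character at a time, interleaved by identifier."""
    index = build_index(target)
    n = index[2]
    routes = {pid: [] for pid in range(len(patterns))}  # identifier -> matching route
    matched = {pid: False for pid in range(len(patterns))}
    # One work item = one character tagged with its string's identifier.
    # BWT search consumes a string from its last character to its first.
    # Empty strings are skipped and stay unmatched (a choice made for this sketch).
    work = deque((pid, len(p) - 1) for pid, p in enumerate(patterns) if p)
    while work:
        pid, pos = work.popleft()
        prev = routes[pid][-1] if routes[pid] else (0, n)  # previous coordinates
        cur = step(index, prev, patterns[pid][pos])
        if cur is None:
            continue                                    # this string cannot match
        routes[pid].append(cur)                         # merge into the route
        if pos == 0:
            matched[pid] = True                         # all characters merged
        else:
            work.append((pid, pos - 1))                 # next character, later turn
    return matched, routes


if __name__ == "__main__":
    result, routes = match_interleaved("GATTACA", ["TTA", "ACA", "GGG", "CAT"])
    print(result)
    print(routes[0])
```

Because every work item carries its identifier, characters from different strings can be handled in any order; only the route for that identifier is needed to continue, which is the property the abstract relies on.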

Description

Technical field

[0001] The invention relates to the technical fields of electronics and computers, in particular to a character string matching method and device.

Background

[0002] The Burrows-Wheeler Transform (BWT) is also known as block-sorting compression. This data compression technique compresses the index of the complete human genome sequence to less than 2 GB (a size that current mainstream desktops and even laptops can handle). Therefore, databases currently usually store the complete human genome sequence as a BWT-format index. The BWA (Burrows-Wheeler Alignment) algorithm can compare short fragment sequences (called reads) with the BWT-compressed reference genome sequence and finally locate each short fragment in the reference genome.

[0003] The comparison process mainly includes, see figure 1 Sho...
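To make the block-sorting transform mentioned in [0002] concrete, here is a minimal Python sketch of the transform and its inverse. It uses the naive sorted-rotations construction, which is fine for short strings but is not how genome-scale indexes are built; the function names and the `$` end marker are assumptions made for this illustration.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Return the Burrows-Wheeler Transform of text using sorted rotations."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel                       # the end marker makes the transform invertible
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)   # last column of the sorted table


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]                           # drop the end marker


if __name__ == "__main__":
    transformed = bwt("GATTACA")
    print(transformed)
    print(inverse_bwt(transformed))
```

The transform tends to group identical characters together, which is what makes the result compressible, while the sorted order of rotations is what a BWT search exploits to locate a read in the reference.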


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventors: 凌少平, 吕雪梅
Owner: BEIJING INST OF GENOMICS CHINESE ACAD OF SCI CHINA NAT CENT FOR BIOINFORMATION