Method for searching matched character string

A string-matching technology, applied to matching and searching for strings in text, that addresses the high time consumption and low efficiency of existing string matching and searching, thereby reducing the time consumed and improving efficiency.

Active Publication Date: 2011-05-18
ALLWINNER TECH CO LTD
3 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

In practical applications, the most common situation is that the last character of the string is not equal to the text character aligned with it, so performing the above operation after each jump of the string consumes a lot of time and makes string matching and searching inefficient.
[0010] In addition, when using the BM alg



Examples

Example 1

[0051] As mentioned above, when searching for a string match, the length of the text and the length of the string to be matched can be determined first. When the length of the text is less than or equal to 100k, this embodiment is preferred.

[0052] In this embodiment, the string to be matched and searched is "match", and its length is 5 characters. The string jump principle of this embodiment is consistent with the existing BM algorithm: if the text character aligned with the last character of the string does not appear in the string, the string jumps backward by a distance equal to the length of the string; if that character does appear in the string, the string jumps backward so that the two identical characters in the string and the text are aligned.
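The jump rule in [0052] can be illustrated with a small lookup table. The following is a minimal sketch, not the patent's own implementation: the function names `build_jump_table` and `jump_distance` are hypothetical, and the distance 0 for "h" (the character is already aligned, so no jump is needed) is one reading of the rule rather than something the text states.

```python
def build_jump_table(pattern: str) -> dict:
    """Map each character of the pattern to the jump distance that aligns it
    with the text character currently under the pattern's last position.

    Assumption: when a character occurs more than once, the rightmost
    occurrence is used, so the jump is the smallest safe one.
    """
    m = len(pattern)
    # Later (rightmost) occurrences overwrite earlier ones in the dict.
    return {ch: m - 1 - i for i, ch in enumerate(pattern)}


def jump_distance(table: dict, pattern_length: int, ch: str) -> int:
    """Characters that do not occur in the pattern jump by the full pattern length."""
    return table.get(ch, pattern_length)


table = build_jump_table("match")
print(table)                         # {'m': 4, 'a': 3, 't': 2, 'c': 1, 'h': 0}
print(jump_distance(table, 5, "x"))  # 5
```

A dictionary keeps the sketch short; a fixed-size array indexed by character code would serve the same purpose.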

[0053] See Figure 3. Assume that the line labeled "1" (hereinafter referred to as line 1, and so on) is the text, line 2 is the string, and the last character of the s...

Example 2

[0081] This embodiment is roughly the same as the first embodiment. The difference is that an extended character string identical to the original character string is appended after the original character string to form a comparison character string, and the comparison character string is used to match the characters of the text and to jump.

[0082] When applying this method, a jump table needs to be created, and the jump table is a two-dimensional array. As shown in Figure 8, the first row of the table holds the character aligned with the last character of the original string, the first column holds the character aligned with the last character of the extended string, and the data in the table is the number of character positions to jump. Therefore, if the character aligned with the last character of the original string is "t", and the character aligned with the last character of the extended string is "m", the string jumps 2 positions, and so on.
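The two-dimensional table can be sketched as follows. The figure with the actual table is not reproduced in this text, so the combination rule here is an assumption: if the first character already forces a jump shorter than the string length, that jump is used; otherwise the string jumps past the first character and the second character decides the rest. The rule is checked only against the one entry the text states ("t" and "m" give 2). The names `single_jump` and `build_pair_table`, and the sample alphabet, are hypothetical.

```python
def single_jump(pattern: str, ch: str) -> int:
    """Jump distance for one text character under the pattern's last position."""
    m = len(pattern)
    idx = pattern.rfind(ch)          # rightmost occurrence, -1 if absent
    return m if idx < 0 else m - 1 - idx


def build_pair_table(pattern: str, alphabet: str) -> dict:
    """Assumed rule for the table keyed by (c1, c2), where c1 is aligned with
    the last character of the original string and c2 with the last character
    of the extended string."""
    m = len(pattern)
    table = {}
    for c1 in alphabet:
        d1 = single_jump(pattern, c1)
        for c2 in alphabet:
            if d1 < m:
                # c1 occurs in the pattern: its own jump is the limit.
                table[(c1, c2)] = d1
            else:
                # c1 is absent: skip past it, then let c2 decide the rest.
                table[(c1, c2)] = m + single_jump(pattern, c2)
    return table


pair_table = build_pair_table("match", "matchx")
print(pair_table[("t", "m")])  # 2, the entry stated in the text
print(pair_table[("x", "t")])  # 7
print(pair_table[("x", "x")])  # 10
```

Under this assumed rule the extended string only helps when the first character is absent from the original string, since the jump can then exceed the string length.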

[0083] See Pi...

Example 3

[0093] When the length of the text is greater than the text length threshold and the string length is also greater than the string length threshold (for example, 8 characters), the text character is likely to be the same as the last character of the string, so the extended string may not be able to serve its purpose. It is therefore necessary to query two characters of the text at the same time and judge whether they are the same as the last two characters of the string in order to jump.

[0094] In this embodiment, a jump table also needs to be established. Assume that the string to be matched and searched is "onlymatch", which has 9 characters in total. As shown in Figure 10, the first row of the table represents the character aligned with the penultimate character of the string, the first column represents the character aligned with the last character of the string, and the data in the table is the jump distanc...
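A two-character jump distance of the kind described in [0094] can be computed as in the sketch below. It is a minimal illustration under stated assumptions, not the table from the figure: the function name `pair_jump` is hypothetical, and the rule used is "the smallest jump after which both text characters agree with the string", with distance 0 meaning that the last two characters already match.

```python
def pair_jump(pattern: str, a: str, b: str) -> int:
    """Jump distance when text characters a and b are aligned with the
    penultimate and last characters of the pattern.

    Assumes len(pattern) >= 2.
    """
    m = len(pattern)
    # Try the smallest jump first so that no possible match is skipped.
    for s in range(m - 1):
        if pattern[m - 2 - s] == a and pattern[m - 1 - s] == b:
            return s
    # Only b can still overlap the pattern, at its first character.
    if pattern[0] == b:
        return m - 1
    # Neither character can overlap: jump by the full length.
    return m


print(pair_jump("onlymatch", "c", "h"))  # 0
print(pair_jump("onlymatch", "a", "t"))  # 2
print(pair_jump("onlymatch", "x", "o"))  # 8
print(pair_jump("onlymatch", "x", "x"))  # 9
```

Evaluating `pair_jump` for every character pair of interest would fill a two-dimensional table like the one the paragraph describes.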

Abstract

The invention provides a method for searching for a matched character string, comprising the following steps. Table establishing step: according to the character string to be searched, determining the jumping distance of the character string when any character among a preset number of characters is encountered during the search, and establishing a jumping table. Aligning step: aligning the first character of the character string with the first character of the text. Jumping step: finding the identification character in the text that is aligned with the character at a set position counted from the end of the character string, looking up the jumping table according to the identification character, determining the jumping distance of the character string, and jumping the character string backward; the jumping step is repeated a preset number of times. Comparing step: starting from the last character of the character string, judging whether each character of the character string is the same as the text character aligned with it; if yes, comparing the next character until the matched character string is found; if not, performing the jumping step. With this method, the matched character string can be found quickly, with a short search time and high efficiency.
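The steps in the abstract can be sketched as a single search loop. This is a simplified illustration, not the claimed method: it assumes one identification character (the text character aligned with the last character of the string), it repeats the jumping step until the table returns 0 rather than a preset number of times, and the jump taken after a failed comparison is borrowed from the Horspool variant of the BM algorithm, which the text does not specify. The name `find_match` is hypothetical.

```python
def find_match(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Table establishing step: distance 0 means "last character matches".
    table = {ch: m - 1 - i for i, ch in enumerate(pattern)}
    # Assumed jump after a failed comparison (not given in the text):
    # realign with an earlier occurrence of the last character, if any.
    retry = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    end = m - 1  # Aligning step: text index under the pattern's last character.
    while end < n:
        # Jumping step: look up the identification character.
        distance = table.get(text[end], m)
        if distance > 0:
            end += distance
            continue
        # Comparing step: compare from the last character toward the first.
        start = end - m + 1
        k = m - 1
        while k >= 0 and text[start + k] == pattern[k]:
            k -= 1
        if k < 0:
            return start
        end += retry.get(text[end], m)
    return -1


print(find_match("a mismatch is not a match", "match"))  # 5
print(find_match("no hit here", "match"))                # -1
```

Because most alignments fail on the identification character alone, the loop usually jumps without entering the comparison, which is the source of the speed-up the abstract describes.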

Description

Technical field

[0001] The invention relates to the technical field of data recognition and search, and in particular to a method for character string matching and searching in text.

Background art

[0002] With the popularization of computers, e-readers, MP4 players and other electronic equipment, electronic text files are widely used in file editing, web page design and other areas. In the process of editing electronic text, it is often necessary to look up a specific character or string in the text. For a single-character search, the character to be searched is usually compared with each character in the text one by one to find the corresponding character. For a string search, there are many different methods, including searching from back to front, searching from front to back, and searching from both ends at the same time. The most commonly used search method at present is the BM (Boyer-Moore) algorithm.

[0003] It should be noted that the "characters" mentioned in this...

Application Information

IPC(8): G06F17/30
Inventor: 陈翔
Owner: ALLWINNER TECH CO LTD