Filter-based fast character string matching method

A character string matching method, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc., that shortens matching time and achieves good matching performance

Inactive Publication Date: 2012-10-24
SOUTH CHINA UNIV OF TECH
Cites: 0; cited by: 18

AI Technical Summary

Problems solved by technology

In practical applications, many problems use the edit distance model or one of its variants.
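As an illustration of the edit distance model mentioned above, the following is a minimal sketch of the standard Levenshtein distance (insertions, deletions, and substitutions each cost 1). The function name edit_distance and the unit costs are assumptions made for this example; the source does not say which variant of the model the invention uses.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca from a
                cur[j - 1] + 1,            # insert cb into a
                prev[j - 1] + (ca != cb),  # substitute, or free match
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("approximate", "aproximate"))
```

Two strings are usually said to match approximately when this distance does not exceed a chosen threshold k.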



Examples


Embodiment example 1

[0025] Implementation Case 1: Mobile SMS Classification

[0026] In recent years, China has accelerated the pace of social informatization and digitalization, and people have to deal with more and more information every day. To notify customers of relevant information as quickly as possible, application service providers send SMS messages, such as e-commerce product recommendations, online banking feedback, deduction notices, and user chat messages. But as people deal with more and more businesses, this information becomes cluttered and difficult to manage. We need a convenient, flexible, and effective way to classify text messages and avoid the hassle of opening the inbox directly to sort through all kinds of messages. Applying the present invention, "a filter-based fast matching method for character strings", makes it possible to classify short messages using templates. When ...

Embodiment example 2

[0033] Implementation Case 2: Diary Document Fuzzy Search

[0034] With the rapid development of computer technology and the quickening pace of life, people now prefer to do their writing on computers rather than keeping diaries and completing work documents with pen and paper. There are three main reasons: 1) computer technology makes it easier and faster to complete text work; 2) computer diaries are more entertaining, for example allowing a mood to be edited; 3) compared with completing a diary or work report at the end of the day, instant recording is more in line with people's needs. With the development of mobile phone hardware, handwriting recognition, and speech recognition technology, we can record daily information more conveniently and directly. Through recognition technology, this multimedia information is converted into text for storage.

[0035] For these documents that are relatively random and scattered, and ...


Abstract

The invention discloses a filter-based fast character string matching method. The pattern string is first preprocessed: its prefix P0 is cut into (k+s) pattern string blocks of length h, each block is lengthened by (k+q-1) characters, and the lengthened blocks are recorded as Q1, Q2, ..., Q(k+s). Then, starting from the initial position of the text string, q characters are read at every interval of h and used as text string indexes, recorded as d1, d2, ..., d(n/h). A matching array B[d, j] is created: if a text string index di belongs to Qj, then B[di, j] = 1. The number of matches between (k+s) consecutive text string indexes and the pattern string blocks is calculated, and finally the approximate matches are detected. The method combines a dynamic programming algorithm with a filter algorithm and adds new filter strategies, which shortens the average time of approximate matching and greatly improves matching performance.
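The steps in the abstract can be sketched roughly as follows. This is an illustrative reading under stated assumptions, not the patented implementation: it assumes k is the number of allowed differences, s is a threshold parameter, h = floor((m - k - q + 1) / (k + s)) for a pattern of length m, a window of (k+s) consecutive text indexes is a candidate when at least s of them fall in their corresponding blocks, and candidates are verified with a standard dynamic programming scan. The function names, the defaults s=1 and q=2, and the bounds of the verification region are all hypothetical choices for this example, and the "new filter strategies" the abstract mentions are not reproduced.

```python
def sellers_ends(pattern, text, k):
    """Dynamic programming scan: end offsets (exclusive) in text where
    some substring is within k edits of pattern."""
    m = len(pattern)
    col = list(range(m + 1))  # column for the empty text prefix
    ends = []
    for pos, ch in enumerate(text, start=1):
        prev_diag, col[0] = col[0], 0  # a match may start anywhere
        for i in range(1, m + 1):
            cur = min(col[i] + 1,                          # extra text char
                      col[i - 1] + 1,                      # missing pattern char
                      prev_diag + (pattern[i - 1] != ch))  # substitute or match
            prev_diag, col[i] = col[i], cur
        if col[m] <= k:
            ends.append(pos)
    return ends


def filtered_search(pattern, text, k, s=1, q=2):
    """Filter with sampled q-grams, then verify candidates with DP."""
    m, n = len(pattern), len(text)
    width = k + s
    h = (m - k - q + 1) // width  # assumed block length / sampling step
    if h < q:
        raise ValueError("pattern too short for these k, s, q")
    # Q_j: block j of the prefix, lengthened by k+q-1, stored as q-gram sets.
    blocks = [{pattern[i:i + q] for i in range(j * h, (j + 1) * h + k)}
              for j in range(width)]
    # d_i: q characters read from the text every h positions.
    samples = [text[i:i + q] for i in range(0, n - q + 1, h)]
    ends = set()
    for w in range(len(samples) - width + 1):
        # Sum of B[d, j] over one window of k+s consecutive indexes.
        hits = sum(samples[w + j] in blocks[j] for j in range(width))
        if hits >= s:  # assumed threshold
            # Region that can hold any occurrence containing this window.
            lo = max(0, (w + width - 1) * h + q - (m + k))
            hi = min(n, w * h + m + k)
            ends.update(lo + e for e in sellers_ends(pattern, text[lo:hi], k))
    return sorted(ends)


if __name__ == "__main__":
    text = "fast aproximate matching"
    for end in filtered_search("approximate", text, k=1):
        print(end, repr(text[:end]))
```

The point of the design is that the dynamic programming step, which is the expensive part, runs only on the short text regions that pass the counting filter rather than on the whole text.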

Description

Technical field

[0001] The invention relates to the technical field of character string matching, in particular to a filter-based fast character string matching method, and belongs to the fields of information retrieval and computational biology.

Background

[0002] The problem of string matching can be defined as finding a pattern with certain properties in a given sequence of symbols. The simplest example is finding a given string in a given sequence of characters. "Approximate match" generally means that some subtle differences are allowed between the pattern and the text string. "Match" generally means "approximate match". String matching is one of the oldest and most widely studied problems in computer science, and its applications can be found everywhere. In recent years, academic interest in string matching has grown, especially in the rapidly growing fields of information retrieval and computational biology. At the same time, ...


Application Information

Patent type and authority: Applications (China)
IPC(8): G06F17/30
Inventors: 李拥军, 邹少聪, 林浩, 黄格仕, 谢豪
Owner: SOUTH CHINA UNIV OF TECH