A rapid fuzzy matching algorithm for strings in mass audio data

A fuzzy matching technique for character strings, applied in digital data processing, special data processing applications, calculations, etc. It addresses problems such as the high requirements of neural-network approaches, and achieves faster matching with a reduced amount of matching computation.

CN106528599A (Inactive) · Publication Date: 2017-03-22 · 深圳凡豆信息科技有限公司

Embodiment Construction

[0032] The present invention is further described below with reference to the accompanying drawings:

[0033] As shown in Figure 1, the main process of the present invention is as follows. First, the label and text data in the database are read, and the stored data are trained on to obtain the mapping relationship D1 from characters to label strings, the mapping relationship D2 from label strings to text, and the mapping relationship D3 from text to the number of labels. The description text X input by the user, with a length of L characters, is then obtained, and the character set X(l) (l = 1, 2, 3, ..., L) is extracted from the input search text. Using the mapping relationship D1 from each character X(l) to label strings, the relevant label set is filtered out of the keyword set; fuzzy matching is performed between the filtered label set and the input text X, and the score of each matching result is saved. The useless-word dictionary and the negative-word dictionary are then consulted to further...
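The three mappings and the character-based label filter described in [0033] can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the database is assumed to be a list of (label, text) pairs, and the names `build_indexes`, `filter_labels`, and `min_hits` are hypothetical. Plain Python dictionaries stand in for the hash index.

```python
from collections import defaultdict


def build_indexes(records):
    """Build D1, D2, D3 from (label, text) pairs.

    The (label, text) record shape is an assumption made for this sketch.
    """
    d1 = defaultdict(set)  # D1: character -> labels containing that character
    d2 = defaultdict(set)  # D2: label -> texts tagged with that label
    d3 = defaultdict(int)  # D3: text -> number of distinct labels on that text
    for label, text in records:
        for ch in label:
            d1[ch].add(label)
        if text not in d2[label]:
            d2[label].add(text)
            d3[text] += 1
    return d1, d2, d3


def filter_labels(query, d1, min_hits=1):
    """Keep labels that share at least `min_hits` distinct characters with the query."""
    hits = defaultdict(int)
    for ch in set(query):  # each character X(l) of the input text, deduplicated
        for label in d1.get(ch, ()):
            hits[label] += 1
    return {label for label, n in hits.items() if n >= min_hits}


# Hypothetical sample data, for illustration only.
records = [
    ("儿歌", "小星星"),
    ("儿歌", "两只老虎"),
    ("故事", "小红帽"),
    ("摇篮曲", "小星星"),
]
d1, d2, d3 = build_indexes(records)
candidates = filter_labels("我想听儿歌", d1)
```

Only the labels in `candidates` would then go on to the more expensive fuzzy matching step, which is where the reduction in matching computation comes from.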

Abstract

The invention provides a rapid fuzzy matching algorithm for strings. First, the texts in a database are preprocessed to obtain a statistical model, and an index is built by hashing. The input text is a relatively short string. The algorithm traverses all Chinese characters in it, activates the positions of the corresponding characters in a finite complete character set, and maps the activation state of that character set onto each tag in order to filter the tags. The few tags that survive filtering are matched against the text, using the dynamic time warping (DTW) algorithm for approximate string matching. The algorithm also scores and sorts the candidates by their degree of approximation and returns the search results. The efficient tag filtering method greatly increases the computational efficiency of string matching, and matching of the input text is fuzzy, so good matching performance is maintained for imprecise language.
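The approximate matching and ranking step of the abstract can be sketched with a textbook DTW over characters. This is a minimal sketch under assumptions the abstract does not state: a 0/1 local cost between characters, and a score normalised by the longer string's length. The names `dtw_distance`, `rank_labels`, and `top_k` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two strings.

    Local cost is 0 for equal characters and 1 otherwise (an assumption).
    """
    if not a or not b:
        return float("inf")
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # only the empty/empty corner is free
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else 1.0
            # Extend the cheapest of the three allowed warping moves.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def rank_labels(query, labels, top_k=3):
    """Score each candidate label against the query and return the best `top_k`."""
    scored = []
    for label in labels:
        if not query or not label:
            score = 0.0
        else:
            # The DTW distance never exceeds the longer length, so score is in [0, 1].
            score = 1.0 - dtw_distance(query, label) / max(len(query), len(label))
        scored.append((label, score))
    # Highest score first; ties broken by label for a deterministic order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical candidates, e.g. the labels left after tag filtering.
ranking = rank_labels("儿歌大全", ["儿歌", "故事", "摇篮曲"])
```

Because DTW lets one character align with several, a repeated or stretched character costs nothing under this local cost, which is one way such a matcher tolerates imprecise input.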

Description

Technical field

[0001] The invention relates to a fast fuzzy matching algorithm for character strings in massive audio data, and belongs to the field of natural language processing.

Background art

[0002] The string matching problem is a search problem: given a symbol sequence (called a pattern) or a set of symbol sequences (called a pattern set), find its occurrences in a given symbol sequence (called a text) under a certain matching condition. This problem is one of the basic problems of computer science. It is widely used in fields involving text and symbol processing, and it is a key problem in important areas such as network security, information retrieval, and computational biology. With the emergence of network security issues, massive information retrieval, and the rapid development of computational biology, existing string matching algorithms can no longer meet applications' needs for matching performance, and there is an urgent ...

Application Information

Publication: CN106528599A (22 Mar 2017)
IPC: G06F17/30
CPC: G06F16/686; G06F16/90344
Inventors: 田学红; 朱晓明