Method and system for query error correction

A query error correction method and system that addresses problems such as the lack of error correction for query statements, high computational complexity, and long query statements, achieving the effect of saving processing time.

Active Publication Date: 2013-07-10
INST OF COMPUTING TECH CHINESE ACAD OF SCI
4 Cites, 29 Cited by

AI Technical Summary

Problems solved by technology

In the above method, every possible combination must be calculated, so the computational complexity is high.
[0005] In addition, long Chinese query sentences often appear in search engines (for example, in information retrieval systems such as question answering systems, ...


Examples

Embodiment Construction

[0041] The present invention will be described below in conjunction with the accompanying drawings and specific embodiments.

[0042] Figure 1 shows an embodiment of a query error correction method, including steps 100-106 (106'). This embodiment assumes that the forward and reverse dictionary trees and the forward and reverse language models have already been constructed.

[0043] In another embodiment, the method also includes a preprocessing step: constructing forward and reverse dictionary trees containing Chinese, English and symbols (here, "symbols" refers to punctuation marks); and constructing forward and reverse language models containing Chinese, English and symbols.
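The excerpt does not say what kind of language model is built. As a rough illustration only, the sketch below assumes a character-level bigram model with add-one smoothing; the reverse model is trained on the same sentences read right to left. The class name `BigramModel`, its parameters, and the tiny corpus are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter


class BigramModel:
    """Character bigram model (an assumption; the source names no model type)."""

    def __init__(self, corpus, reverse=False):
        self.reverse = reverse
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in corpus:
            chars = self._order(sentence)
            self.unigrams.update(chars)
            self.bigrams.update(zip(chars, chars[1:]))
        # Guard against an empty corpus so the smoothing denominator is never 0.
        self.vocab_size = max(len(self.unigrams), 1)

    def _order(self, sentence):
        # Chinese characters, English letters and punctuation are all
        # treated uniformly as single characters.
        chars = list(sentence)
        return chars[::-1] if self.reverse else chars

    def log_prob(self, sentence):
        """Sum of add-one smoothed log bigram probabilities."""
        chars = self._order(sentence)
        total = 0.0
        for a, b in zip(chars, chars[1:]):
            numerator = self.bigrams[(a, b)] + 1
            denominator = self.unigrams[a] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


# Hypothetical toy corpus; a real system would train on query logs or similar.
corpus = ["查询纠错", "查询系统"]
forward_lm = BigramModel(corpus)
reverse_lm = BigramModel(corpus, reverse=True)
```

With such a pair of models, a candidate can be scored left to right by `forward_lm` and right to left by `reverse_lm`, which is one way the two correction directions could each use their own model.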

[0044] To build a dictionary tree, a thesaurus file must first be obtained. The thesaurus file may consist of a large number of Chinese words, mixed Chinese-English words, and English words. In one embodiment, by converting Chinese words into...
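A minimal sketch of forward and reverse dictionary trees (tries) built from a word list follows. It assumes plain character-level keys; the conversion of Chinese words mentioned above is truncated in this excerpt and is not reproduced. The names `Trie`, `prefix_lengths`, `build_tries`, and the sample words are hypothetical.

```python
END = ""  # sentinel key marking the end of a stored word; never a real character


class Trie:
    """Dictionary tree over characters, stored as nested dicts."""

    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True

    def prefix_lengths(self, seq):
        """Return the lengths of all stored words that are prefixes of seq."""
        node, lengths = self.root, []
        for i, ch in enumerate(seq, 1):
            node = node.get(ch)
            if node is None:
                break
            if END in node:
                lengths.append(i)
        return lengths


def build_tries(words):
    """Build a forward trie and a reverse trie (words stored right to left)."""
    forward, reverse = Trie(), Trie()
    for word in words:
        forward.insert(word)
        reverse.insert(word[::-1])
    return forward, reverse


# Hypothetical thesaurus entries: Chinese, English, and mixed words.
words = ["查询", "查询纠错", "纠错", "search", "K歌"]
forward_trie, reverse_trie = build_tries(words)

# Forward lookup scans left to right from the start of the text.
forward_hits = forward_trie.prefix_lengths("查询纠错方法")
# Reverse lookup scans right to left, so the text is reversed first.
reverse_hits = reverse_trie.prefix_lengths("查询纠错"[::-1])
```

Storing reversed words in the second trie lets the same lookup routine find words that end at a given position, which is what a right-to-left pass over the query needs.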

Abstract

The invention provides a method and a system for query error correction. The method includes: converting a query sentence into a character sequence and judging whether the length of the character sequence is greater than a preset threshold delta; for a character sequence whose length is greater than delta, performing forward and reverse error correction simultaneously until the number of characters processed by both directions (the overlap) reaches a threshold M, yielding a forward candidate sentence item set and a reverse candidate sentence item set; and splicing each pair of candidate sentences in which the last M characters of the forward candidate are identical to the first M characters of the reverse candidate, the spliced candidate sentence items forming the error correction candidate item set. The method supports query sentences that mix Chinese and English characters and allows forward and reverse error correction to run in parallel on long queries; this parallel processing preserves accuracy while reducing the processing time of query error correction.
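The length check and the splicing step described in the abstract can be sketched as below. This covers only those two steps: how the forward and reverse candidate sets are generated is not shown in this excerpt, so the candidate lists here are made-up inputs. The function names and the values chosen for `delta` and `m` are hypothetical.

```python
def to_char_sequence(query):
    """Split a query into characters (Chinese, English, and symbols alike)."""
    return list(query)


def needs_bidirectional(chars, delta):
    """True when the sequence is longer than the preset threshold delta."""
    return len(chars) > delta


def splice_candidates(forward_set, reverse_set, m):
    """Join forward and reverse candidates that agree on an m-character overlap.

    A forward candidate f and a reverse candidate r are spliced when the
    last m characters of f equal the first m characters of r; the overlap
    is kept once in the result.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    # Index reverse candidates by their first m characters for fast matching.
    by_head = {}
    for r in reverse_set:
        if len(r) >= m:
            by_head.setdefault(r[:m], []).append(r)
    spliced = []
    for f in forward_set:
        if len(f) < m:
            continue
        for r in by_head.get(f[-m:], []):
            spliced.append(f + r[m:])
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(spliced))


# Made-up candidate sets for illustration, with an overlap of m = 2.
forward_candidates = ["查询纠错方", "查寻纠错方"]
reverse_candidates = ["错方法和系统", "措方法和系统"]
result = splice_candidates(forward_candidates, reverse_candidates, 2)
long_enough = needs_bidirectional(to_char_sequence("查询纠错方法和系统"), 5)
```

In this toy input, the reverse candidate beginning with "措方" matches no forward candidate and is dropped, so requiring agreement on the overlap acts as a consistency filter between the two directions.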

Description

Technical field

[0001] The invention relates to natural language processing technology, and in particular to a query error correction method and system.

Background

[0002] Query error correction usually means that the search engine back end checks the correctness of the original query submitted by the user and corrects any spelling errors, ambiguity, or polysemy in it, so as to obtain a query that is as correct as possible and present it to the user, thereby improving the user's search experience. According to statistics, about 10%-15% of queries in English search engines contain misspellings, while Chinese search engines see more spelling mistakes, and of more types. Across an information retrieval system, the number of misspellings in queries can be even greater. Because the query statement will directly affect the reliability and accuracy of the results returned by the information retriev...


Application Information

IPC(8): G06F17/30
Inventor 程学旗, 熊锦华, 颛悦, 程舒扬, 廖华明, 王元卓
Owner INST OF COMPUTING TECH CHINESE ACAD OF SCI