Input error correction method and apparatus

An input error correction method and apparatus, applied in the fields of natural language processing and machine learning, addressing the problem that search engines and intelligent question answering systems may fail to handle incorrectly entered words correctly or effectively, so that users cannot obtain the information they need.

Inactive Publication Date: 2017-03-22
SHANGHAI XIAOI ROBOT TECH CO LTD

AI Technical Summary

Problems solved by technology

However, when the user enters wrong words, mainly including homophones, near-homophones, visually similar characters, pinyin, and extra or omitted characters, the above search engines or intelligent question answering systems may not be able to handle such words correctly or effectively, which prevents users from obtaining the information they need.



Examples


Example 1

[0087] In the first embodiment of the present invention, an input error correction method, as shown in Figure 1, includes the following steps:

[0088] Step S101: determine whether the input word string is full pinyin; if so, execute step S102; otherwise, execute step S103.

[0089] Step S102: perform pinyin error correction on the full pinyin of the word string, reverse-look-up the corrected full pinyin to obtain Chinese characters as the first error correction result, and end the process.

[0090] Step S103: perform word segmentation on the word string. If the segmentation result contains more than one segment, execute step S104; if it contains exactly one segment, take the word string input by the user as the second error correction result and end the process.

[0091] Step S104, converting the character string i...
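The branching in steps S101 to S103 can be sketched as follows. This is a minimal illustration, not the patented implementation: the full-pinyin test (ASCII letters only) is an assumption, and `segment`, `pinyin_to_hanzi` and `multi_token_pipeline` are hypothetical callbacks standing in for the segmenter, the S102 correction, and the S104 processing that is truncated in the text above.

```python
def is_full_pinyin(text):
    # Assumption: "full pinyin" means the input is ASCII letters only
    # (spaces allowed). The source does not spell out the exact test.
    stripped = text.replace(" ", "")
    return bool(stripped) and stripped.isascii() and stripped.isalpha()


def correct_input(text, segment, pinyin_to_hanzi, multi_token_pipeline):
    """Dispatch an input string along the S101-S103 branches.

    All three callbacks are hypothetical placeholders supplied by the caller.
    """
    if is_full_pinyin(text):                    # S101
        return pinyin_to_hanzi(text)            # S102: first error correction result
    tokens = segment(text)                      # S103
    if len(tokens) <= 1:
        return text                             # single segment: input returned unchanged
    return multi_token_pipeline(text, tokens)   # S104 onward (not detailed here)
```

Passing the later stages in as callbacks keeps the control flow testable on its own, independent of any particular segmenter or pinyin table.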

Example 2

[0096] In the second embodiment of the present invention, an input error correction method, as shown in Figure 2, includes the following steps:

[0097] Step S201: pre-establish a word list, a pinyin reverse look-up table and a word frequency table.

[0098] Specifically, step S201 includes:

[0099] providing a training corpus;

[0100] segmenting the training corpus to obtain a word list;

[0101] generating the pinyin reverse look-up table from the word list using a pinyin reverse look-up table generation tool, and obtaining the word frequency table by counting over the word list.

[0102] In this embodiment, establishing the word list, pinyin reverse look-up table and word frequency table makes full use of the word information in the training corpus, so the method can be quickly adapted to correcting custom words in different domains.
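One way to picture the three tables from step S201 is the sketch below. It assumes the corpus has already been segmented into words, and it uses a tiny hand-written character-to-pinyin map (`CHAR_PINYIN`, hypothetical) in place of the generation tool, which the text does not name.

```python
from collections import Counter, defaultdict

# Hypothetical character-to-pinyin map covering only the toy corpus below;
# a real system would use a complete pronunciation dictionary.
CHAR_PINYIN = {"例": "li", "栗": "li", "子": "zi", "上": "shang", "海": "hai"}


def build_tables(segmented_corpus):
    """Build (word_list, pinyin_reverse_table, word_freq).

    segmented_corpus: iterable of sentences, each a list of words
    (assumption: segmentation was done beforehand).
    """
    word_freq = Counter(w for sentence in segmented_corpus for w in sentence)
    word_list = sorted(word_freq)
    reverse = defaultdict(list)
    for word in word_list:
        key = "".join(CHAR_PINYIN[ch] for ch in word)  # full pinyin of the word
        reverse[key].append(word)                      # homophones share one key
    return word_list, dict(reverse), word_freq


corpus = [["上海", "例子"], ["例子", "栗子"]]
word_list, reverse_table, word_freq = build_tables(corpus)
```

Here the homophones 例子 and 栗子 both land under the key `lizi`, which is why a frequency table is useful for choosing between them.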

[0103] Step S202,...

Example 3

[0126] In the third embodiment of the present invention, an input error correction method, as shown in Figure 3, includes the following steps:

[0127] Step S201: pre-establish a word list, a pinyin reverse look-up table and a word frequency table.

[0128] Specifically, step S201 includes:

[0129] providing a training corpus;

[0130] segmenting the training corpus to obtain a word list;

[0131] generating the pinyin reverse look-up table from the word list using a pinyin reverse look-up table generation tool, and obtaining the word frequency table by counting over the word list.

[0132] Step S202: determine whether the input word string is full pinyin; if so, execute step S203; otherwise, execute step S204.

[0133] Step S203, perform pinyin error correction processing on the full pinyin of the word character string according to the word list, pinyin reverse lookup table and word frequency table, reverse check th...


Abstract

The invention provides an input error correction method and apparatus. The method comprises the steps of judging whether an input word character string is full pinyin or not; if yes, performing pinyin error correction processing on the full pinyin of the word character string, performing a reverse query on the full pinyin subjected to the error correction to obtain Chinese characters, and obtaining a first error correction result; or otherwise, performing word segmentation processing on the word character string, converting the word character string into the full pinyin, performing the pinyin error correction processing on the full pinyin obtained by conversion, performing a reverse query on the full pinyin subjected to the error correction to obtain the Chinese characters, and obtaining a second error correction result. According to the method and the apparatus, a similarity calculation method is skillfully applied to similarity calculation of pinyin characters and similarity calculation of Chinese characters; and by applying the method and the apparatus to Chinese search engines and intelligent question-answer systems, the accuracy of query and question-answer of information input for words in the Chinese search engines and the intelligent question-answer systems can be remarkably improved.
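The abstract mentions a similarity calculation over pinyin strings but does not say which one. The sketch below assumes a normalized Levenshtein similarity, with word frequency as the tie-breaker, purely to illustrate how pinyin correction followed by a reverse query could work. The function names, the `threshold` default and the sample tables are all hypothetical.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Map edit distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def correct_pinyin(typed, reverse_table, word_freq, threshold=0.7):
    """Return the best Chinese word for a possibly misspelled pinyin string.

    Candidates are ranked by similarity first, then by word frequency.
    Returns None when no table key reaches the (assumed) threshold.
    """
    best = None
    for key, words in reverse_table.items():
        score = similarity(typed, key)
        if score < threshold:
            continue
        for word in words:
            rank = (score, word_freq.get(word, 0))
            if best is None or rank > best[0]:
                best = (rank, word)
    return best[1] if best else None


reverse_table = {"shanghai": ["上海"], "lizi": ["例子", "栗子"]}
word_freq = {"上海": 5, "例子": 3, "栗子": 1}
```

With these toy tables, a transposed input such as `shanghia` still resolves to 上海, and the exact key `lizi` resolves to the more frequent homophone.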

Description

Technical field

[0001] The invention relates to the technical fields of natural language processing and machine learning, and in particular to an input error correction method and apparatus.

Background

[0002] At present, users often query information through Chinese search engines or intelligent question answering systems, and a large share of these queries are entered as words. Both the Chinese search engines represented by Baidu and the intelligent question answering systems represented by the Xiaoi robot can respond to the Chinese words entered by users. However, when the user enters wrong words, mainly including homophones, near-homophones, visually similar characters, pinyin, and extra or omitted characters, these search engines or intelligent question answering systems may not be able to handle such words correctly or effectively, which prevents users from obtaining the information they need. For example, the original word is: Get chestnuts ...


Application Information

IPC(8): G06F3/023, G06F17/27
CPC: G06F3/0233, G06F40/232, G06F40/284
Inventors: 陈培华, 朱频频, 陈成才
Owner: SHANGHAI XIAOI ROBOT TECH CO LTD