Text error detection method and device

A text error detection technology, applied in the field of text processing, that addresses wrong characters degrading the accuracy of semantic understanding or intent classification, and the low accuracy and poor applicability of existing error detection methods

Pending Publication Date: 2020-06-09
BEIJING DIDI INFINITY TECH & DEV
Cites: 5 | Cited by: 0

AI Technical Summary

Problems solved by technology

At present, text obtained through manual handwriting, input-method typing, or speech recognition inevitably contains wrong characters. These wrong characters make semantic understanding and intent classification much harder and seriously reduce their accuracy, which in turn degrades the service quality of intelligent services.
Some text error detection methods exist in the prior art, but they suffer from low detection accuracy or poor applicability. For example, some methods are applicable only to certain texts and have very low accuracy on other texts.

Examples

Embodiment 1

[0154] This embodiment provides a text error detection method that uses a corpus storing correct text to detect the wrong characters in the text to be detected (hereinafter, the target error characters). Compared with the prior art, it can effectively improve the detection accuracy and adaptability of text error detection. Specifically, as shown in Figure 1, the text error detection method of this embodiment includes:

[0155] S110. Obtain information about a client that generates the text to be detected.

[0156] Here, the client information includes the category and the identifier of the client that generates the text to be detected, the category and the identifier of the client associated with that client, and other such information.

[0157] S120. Select a corpus that matches the c...
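Steps S110 and S120 can be illustrated with a minimal sketch. It assumes the corpora are keyed by client category; the `ClientInfo` fields, the `CORPORA` registry, the example categories, and the fallback corpus are all hypothetical names introduced for illustration and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ClientInfo:
    """Hypothetical container for the client information gathered in S110."""
    category: str                        # category of the client that produced the text
    client_id: str                       # identifier of that client
    peer_category: Optional[str] = None  # category of an associated client, if any
    peer_id: Optional[str] = None        # identifier of an associated client, if any


# Hypothetical registry: client category -> corpus of known-correct sentences.
CORPORA: Dict[str, List[str]] = {
    "driver": ["i have arrived at the pickup point"],
    "passenger": ["please wait for me at the gate"],
}
DEFAULT_CORPUS: List[str] = ["thank you for your help"]


def select_corpus(info: ClientInfo,
                  corpora: Dict[str, List[str]] = CORPORA,
                  default: List[str] = DEFAULT_CORPUS) -> List[str]:
    """S120 (sketch): pick the corpus matching the client's category.

    Falling back to a general corpus for unknown categories is an
    assumption of this sketch, not something the source states.
    """
    return corpora.get(info.category, default)
```

Keying on the category alone keeps the sketch short; a fuller version could also consult the associated client's category or the identifiers.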

Embodiment 2

[0172] This embodiment provides a text error detection method. Building on the previous embodiment, it proposes a specific implementation of screening target error characters from the target suspected words. As shown in Figure 2, the text error detection method of this embodiment includes the following steps:

[0173] S210. Based on the corpus storing the correct text, screen suspected wrong words and suspected wrong characters from the text to be detected.

[0174] S220. Obtain, from the text to be detected, the word to which each suspected wrong character belongs, and from the obtained words select those that are also suspected wrong words, to obtain the target suspected words.

[0175] S230. Based on the probability of each target suspected word appearing in the current position of the text to be detected, screen target wrong words from the target suspected words.

[0176] Here, the target wrong vocabulary is obtained by rationally...
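Steps S220 and S230 can be sketched as follows. The source does not say here how the probability of a word "appearing in the current position" is computed, so this sketch substitutes a simple previous-word conditional probability estimated from corpus counts; that choice, the function names, and the `threshold` value are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def target_suspected_words(tokens: List[str],
                           suspect_chars: Set[str],
                           suspect_words: Set[str]) -> List[Tuple[int, str]]:
    """S220 (sketch): keep words that contain a suspected character
    and are themselves suspected words. Returns (position, word) pairs."""
    return [(i, w) for i, w in enumerate(tokens)
            if w in suspect_words and any(c in suspect_chars for c in w)]


def positional_probability(tokens: List[str], i: int,
                           bigram_counts: Dict[Tuple[str, str], int],
                           unigram_counts: Dict[str, int]) -> float:
    """Assumed stand-in for the positional probability:
    P(tokens[i] | tokens[i-1]), or the unigram frequency at position 0."""
    if i == 0:
        total = sum(unigram_counts.values())
        return unigram_counts.get(tokens[0], 0) / total if total else 0.0
    prev = tokens[i - 1]
    denom = unigram_counts.get(prev, 0)
    return bigram_counts.get((prev, tokens[i]), 0) / denom if denom else 0.0


def target_wrong_words(tokens: List[str],
                       candidates: List[Tuple[int, str]],
                       bigram_counts: Dict[Tuple[str, str], int],
                       unigram_counts: Dict[str, int],
                       threshold: float = 0.05) -> List[Tuple[int, str]]:
    """S230 (sketch): a candidate is a target wrong word when its
    positional probability falls below the (hypothetical) threshold."""
    return [(i, w) for i, w in candidates
            if positional_probability(tokens, i, bigram_counts, unigram_counts) < threshold]
```

For example, with corpus counts in which "please wait" is frequent and "please weight" never occurs, the token "weight" in "please weight for me" would survive both filters, while "wait" in the correct sentence would be dropped at S230.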

Embodiment 3

[0190] This embodiment provides a text error detection method. On the basis of any of the above embodiments, it proposes a specific implementation of screening suspected wrong words and suspected wrong characters from the text to be detected. As shown in Figure 3, the text error detection method of this embodiment includes:

[0191] S310. Acquire the corpus and the text to be detected.

[0192] S320. Based on the co-occurrence probability of every two characters and the co-occurrence probability of every two words in the corpus, screen suspected wrong words and suspected wrong characters from the text to be detected.

[0193] Here, because the corpus stores correct text, the co-occurrence probabilities of every two characters and of every two words in the corpus can be used to calculate the co-occurrence probability of every two characters or every two words in the text to be detected. The probability of occ...
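The character-level part of S320 can be sketched as follows. The sketch assumes the co-occurrence probability of two adjacent characters is estimated as count(ab) / count(a) over the corpus and that pairs below a fixed threshold are flagged; both the estimator and the `threshold` value are assumptions, since the source text is truncated before giving details. The same functions apply to word pairs if lists of tokens are passed instead of strings.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def cooccurrence_counts(corpus: Iterable[Sequence[str]]) -> Tuple[Counter, Counter]:
    """Count adjacent pairs and single units over a corpus of correct text."""
    pair_counts: Counter = Counter()
    unit_counts: Counter = Counter()
    for sentence in corpus:
        unit_counts.update(sentence)
        pair_counts.update(zip(sentence, sentence[1:]))
    return pair_counts, unit_counts


def cooccurrence_probability(a: str, b: str,
                             pair_counts: Counter,
                             unit_counts: Counter) -> float:
    """Assumed estimator: how often `b` directly follows `a` in the corpus."""
    return pair_counts[(a, b)] / unit_counts[a] if unit_counts[a] else 0.0


def suspected_positions(text: Sequence[str],
                        pair_counts: Counter,
                        unit_counts: Counter,
                        threshold: float = 0.1) -> List[int]:
    """Return the positions of both members of every adjacent pair whose
    co-occurrence probability is below the (hypothetical) threshold."""
    suspects = set()
    for i, (a, b) in enumerate(zip(text, text[1:])):
        if cooccurrence_probability(a, b, pair_counts, unit_counts) < threshold:
            suspects.update((i, i + 1))
    return sorted(suspects)
```

Because a low-probability pair implicates both of its members, this sketch deliberately over-flags; the later steps (S220 and S230 in Embodiment 2) are what narrow the suspects down.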

Abstract

The embodiments of the invention provide a text error detection method and device. The method first obtains information about the client that generates the text to be detected and selects a corpus matching that client information. It then uses the selected corpus to preliminarily screen the text to be detected and obtain target suspected words. Finally, it screens the target suspected words, based on the probability that each target suspected word appears at its current position in the text to be detected, to obtain the final target error characters. Because the corpus is selected based on the client information and error detection is performed with the selected corpus, the detection becomes more targeted, more accurate, and more efficient. In addition, screening the target error characters from the target suspected words based on the probability that each target suspected word appears at its current position in the text to be detected effectively improves the accuracy of text error detection.

Description

Technical field

[0001] The present application relates to the technical field of text processing, and in particular to a text error detection method and device.

Background

[0002] With the development of science and technology, intelligent service scenarios need to perform operations such as semantic understanding and intent classification on the dialogue text of a user or customer service agent, and then act on the semantics or intent obtained. At present, text obtained through manual handwriting, input-method typing, or speech recognition inevitably contains wrong characters. These wrong characters make semantic understanding and intent classification much harder and seriously reduce their accuracy, which in turn degrades the service quality of intelligent services.

[0003] There are some ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F40/232, G06F40/289
Inventor: 张占秋, 李帅, 王伟玮, 王杰
Owner: BEIJING DIDI INFINITY TECH & DEV