Language disease correction recommendation method and system

A method and technology for recommending corrections to language errors, applied in special data processing applications, instruments, electronic digital data processing, etc. It can solve problems such as a focus on error detection alone, missing-type language errors, and the inability to directly provide modification suggestions.

Active Publication Date: 2019-05-24
IFLYTEK CO LTD
Cites: 4; Cited by: 14

AI Technical Summary

Problems solved by technology

Texts in phonetic (alphabetic) scripts such as English often contain spelling errors, for which string matching and similar techniques can provide correction suggestions or even correct the errors directly. Ideographic texts such as Chinese, however, mostly use characters as the basic unit, so in terms of probability the input characters themselves almost never contain spelling errors. Chinese language problems are mainly selective language problems (typos, improper collocations, input content that does not match the input intention, etc.) and missing language problems (omitted words and characters). As a result, common text editing software usually only marks suspected wrong words in texts such as Chinese; that is, it focuses only on error detection and cannot directly provide modification suggestions.

Examples

Embodiment 1

[0137] In this example, as shown in Figure 4, determining the correction candidate words according to the co-occurrence words and the corresponding accurate mutual information scores specifically includes:

[0138] Step S320, according to the preset first score threshold, determine the high-score co-occurrence words among the co-occurrence words of a single adjacent word;

[0139] This step needs little elaboration: a standard for what counts as a high score is defined, and the high-score co-occurrence words are screened out from all co-occurrence words. Because the screening is performed per adjacent word, the high-score co-occurrence words screened out for the different adjacent words can then be combined by intersection or by union. For example, suppose two adjacent words A and B have been determined through the previous steps, each with two high-score co-occurrence words. Example 1: the high-scoring co-occurrence words of A are α (0.91) and β (0.88), and the high-scoring co-occurrence words of B are β (0.8) and γ (0....
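The screening in Step S320 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the threshold value 0.8 is an assumption (the source does not state one), the names `high_score_words` and `screen_candidates` are hypothetical, and the paired scores are taken from the figures quoted for α, β, and γ in Embodiment 2, read here as (score relative to A, score relative to B).

```python
from typing import Dict, Set

# Scores keyed by adjacent word, then by co-occurrence word.
# "alpha", "beta", "gamma" stand for the α, β, γ of the text.
scores: Dict[str, Dict[str, float]] = {
    "A": {"alpha": 0.91, "beta": 0.88, "gamma": 0.6},
    "B": {"alpha": 0.3, "beta": 0.8, "gamma": 0.95},
}

# Assumed value: the source only says the threshold is "preset".
FIRST_SCORE_THRESHOLD = 0.8


def high_score_words(word_scores: Dict[str, float], threshold: float) -> Set[str]:
    """Return the co-occurrence words of one adjacent word that reach the threshold."""
    return {word for word, score in word_scores.items() if score >= threshold}


def screen_candidates(
    all_scores: Dict[str, Dict[str, float]],
    threshold: float,
    mode: str = "intersection",
) -> Set[str]:
    """Screen per adjacent word, then combine the results by intersection or union."""
    per_adjacent = [high_score_words(ws, threshold) for ws in all_scores.values()]
    if not per_adjacent:
        return set()
    if mode == "intersection":
        return set.intersection(*per_adjacent)
    if mode == "union":
        return set.union(*per_adjacent)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(screen_candidates(scores, FIRST_SCORE_THRESHOLD, "intersection"))
    print(screen_candidates(scores, FIRST_SCORE_THRESHOLD, "union"))
```

Under these assumed inputs, A keeps α and β and B keeps β and γ, so the intersection contains only β while the union contains α, β, and γ. Intersection favors words that fit every adjacent word; union keeps more candidates at the cost of precision.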

Embodiment 2

[0145] Example 2, as shown in Figure 5:

[0146] Step S3201: fuse, one co-occurrence word at a time, the exact mutual information scores of that co-occurrence word relative to each adjacent word, to obtain the fusion score of each co-occurrence word;

[0147] In this embodiment, the exact mutual information scores of each co-occurrence word relative to all adjacent words are obtained one by one, taking the co-occurrence word as the unit. Using the above example: α (0.91 and 0.3), β (0.88 and 0.8), γ (0.6 and 0.95), δ (0.4 and 0.85)... Note that this embodiment does not consider whether a score is a high score. Instead, the exact mutual information scores of all co-occurrence words relative to the adjacent words are listed and fused, so cases such as ε (0.25 and 0.45) and θ (0.98 and 0.1) are also included.

[0148] As for the origin of the fusion score, reference can be made to the foregoing "Embodiment 1", and details will not be repeated here.
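The per-word fusion in Step S3201 can be sketched as below. The fusion rule itself is not visible in this excerpt, so the arithmetic mean used here is purely an assumption, and `fuse_scores` and `rank_by_fusion` are hypothetical names. The score pairs are the ones quoted in paragraph [0147].

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# Exact mutual information scores of each co-occurrence word relative to
# the adjacent words, as quoted in paragraph [0147].
pair_scores: Dict[str, Tuple[float, ...]] = {
    "alpha": (0.91, 0.3),
    "beta": (0.88, 0.8),
    "gamma": (0.6, 0.95),
    "delta": (0.4, 0.85),
    "epsilon": (0.25, 0.45),
    "theta": (0.98, 0.1),
}

Fuse = Callable[[Sequence[float]], float]


def fuse_scores(
    scores: Dict[str, Sequence[float]], fuse: Fuse = mean
) -> Dict[str, float]:
    """Fuse the scores of each co-occurrence word into a single value.

    The default (arithmetic mean) is an assumption, not the patent's rule.
    """
    return {word: fuse(values) for word, values in scores.items()}


def rank_by_fusion(
    scores: Dict[str, Sequence[float]], fuse: Fuse = mean
) -> List[Tuple[str, float]]:
    """Return (word, fusion score) pairs sorted from highest to lowest score."""
    fused = fuse_scores(scores, fuse)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for word, score in rank_by_fusion(pair_scores):
        print(f"{word}: {score:.3f}")
```

With the assumed mean, β ranks first because it scores well against both adjacent words, while θ drops despite its single 0.98. Passing `fuse=max` instead would put θ first, which shows how much the candidate ranking depends on the fusion rule chosen.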

[0149] Step S3202. The co-o...

Abstract

The invention discloses a language disease correction recommendation method and system. The method comprises the steps of: identifying a language disease target in a to-be-tested text and determining language disease information, which comprises a language disease position and a language disease type; obtaining correction candidate words according to the context content of the language disease target and/or the character attributes of the language disease target; and generating a language disease correction recommendation list from the correction candidate words. Compared with the prior art, error detection and error correction can be combined, and reliable reference suggestions are provided for language disease correction.

Description

Technical field

[0001] The present invention relates to the field of natural language processing, in particular to a method and system for language error correction recommendation.

Background technique

[0002] During text input (by handwriting or through a man-machine interface), the input text often contains language errors for various reasons, such as grammatical errors and unclear semantics caused by typos, improper collocations, and incomplete components. Correcting language problems and recommending fixes usually requires two stages: error detection and error correction.

[0003] Existing dichotomy error detection technologies mainly rely on dictionaries or statistical information to construct simple rules for identification, for example through the following process:

[0004] 1) Dictionary construction, using manually compiled literary dictionaries or counting the frequency of bigram word strings or trigram word st...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
CPC: Y02A90/10
Inventor: 宋巍, 付瑞吉, 王士进, 胡国平, 秦兵, 刘挺
Owner: IFLYTEK CO LTD