A Chinese text automatic error correction method and device

An automatic error correction technology for text, applied in neural learning methods, natural language data processing, instruments, etc. It addresses problems such as single error correction scenarios, heavy reliance on manual experience, and limited generalization to data from new scenarios, so as to improve actual error correction performance, cover a more comprehensive range of errors, and improve time efficiency and overall accuracy.

Active Publication Date: 2022-06-03
中电云计算技术有限公司

AI Technical Summary

Problems solved by technology

This method has the following disadvantages: 1) constructing the confused word set requires a large amount of erroneous data from real scenarios; 2) it also requires a lot of manual experience; 3) the confused word set depends on the data domain, and it is difficult to exhaust all error possibilities, so the ability to generalize to data from new scenarios is very limited.
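To make the generalization limit concrete, the confused-word-set approach can be pictured as a lookup table from known wrong forms to their corrections. The following is a minimal sketch under that assumption, not the method of any cited work; the table entries and the function name `correct_with_confusion_set` are hypothetical.

```python
# Hypothetical confused word set: each known wrong form maps to its correction.
# In practice such a table is built by hand from errors seen in real data.
CONFUSION_SET = {
    "因该": "应该",
    "在见": "再见",
}


def correct_with_confusion_set(text, confusion_set=CONFUSION_SET):
    """Replace every known wrong form; return the corrected text and the edits made."""
    edits = []
    for wrong, right in confusion_set.items():
        start = text.find(wrong)
        while start != -1:
            edits.append((start, wrong, right))
            text = text[:start] + right + text[start + len(wrong):]
            # Continue searching after the replacement just written.
            start = text.find(wrong, start + len(right))
    return text, edits


seen_text, seen_edits = correct_with_confusion_set("我们因该出发")
unseen_text, unseen_edits = correct_with_confusion_set("我以经到了")
```

An error that is in the table ("因该") is fixed, while an error that was never collected ("以经", a misspelling of "已经") passes through unchanged, which is the limitation listed in point 3.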
[0005] The second idea is to apply deep neural network models to Chinese error correction tasks and to try Chinese error correction methods in different scenarios. The disadvantage is that each method covers only a single error correction scenario, and there is still considerable room for improvement in error correction accuracy.




Embodiment Construction

[0050] The embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0051] It should be noted that the following embodiments and the features in the embodiments can be combined with each other without conflict; and all other embodiments obtained by those of ordinary skill in the art without creative work, based on the embodiments in the present disclosure, fall within the protection scope of the present disclosure.

[0052] It is noted that various aspects of embodiments within the scope of the appended claims are described below. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is illustrative only. Based on this disclosure, those skilled in the art should appreciate that an aspect described herein may be implemented independently of any other aspects and that two or more of these ...



Abstract

The present invention provides a method and device for automatic error correction of Chinese text. The method comprises: performing shallow error correction on the text to be corrected to obtain a first sentence sequence; performing deep neural network model correction on the first sentence sequence to obtain a fifth sentence sequence; post-processing the fifth sentence sequence to obtain a corrected sample; and outputting the corrected sample and error information. The device comprises a shallow error correction module, a deep neural network model correction module, a post-processing module, and an integrated output module. The deep neural network model correction module is composed of an equal-length sequence error correction unit, a word redundancy error correction unit, a word missing error correction unit, a language model judgment unit, and a three-model fusion unit; the post-processing module is composed of a place name error detection unit and a sensitive word error detection unit. The invention enables automatic generation of data sets and correction by deep neural network models, covers a more comprehensive range of Chinese errors, and achieves higher error correction efficiency.
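The staged flow in the abstract (shallow correction, deep model correction, post-processing, then output of the corrected sample with error information) can be sketched as a chain of functions. This is only an illustrative sketch: every stage body below is a hypothetical string-replacement stand-in, not the patent's neural models or detection units, and the names `run_pipeline`, `CorrectionResult`, and `STAGES` are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# One error record: (stage name, text before the stage, text after the stage).
ErrorInfo = Tuple[str, str, str]


@dataclass
class CorrectionResult:
    corrected: str
    errors: List[ErrorInfo] = field(default_factory=list)


def shallow_correction(text: str) -> str:
    # Stand-in for shallow correction: collapse a doubled full stop.
    return text.replace("。。", "。")


def deep_model_correction(text: str) -> str:
    # Stand-in for the deep neural network stage; a real system would run
    # trained models here instead of a fixed replacement.
    return text.replace("因该", "应该")


def post_processing(text: str) -> str:
    # Stand-in for place-name checking in post-processing.
    return text.replace("背京", "北京")


STAGES: List[Tuple[str, Callable[[str], str]]] = [
    ("shallow", shallow_correction),
    ("deep_model", deep_model_correction),
    ("post_processing", post_processing),
]


def run_pipeline(text: str, stages=STAGES) -> CorrectionResult:
    """Run the stages in order and record which stage changed the text."""
    result = CorrectionResult(corrected=text)
    for name, stage in stages:
        after = stage(result.corrected)
        if after != result.corrected:
            result.errors.append((name, result.corrected, after))
        result.corrected = after
    return result


result = run_pipeline("我们因该去背京。。")
```

Keeping each stage as a plain function from text to text lets the output step report both the corrected sample (`result.corrected`) and per-stage error information (`result.errors`) without the stages knowing about each other.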

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and in particular, to a method and device for automatic error correction of Chinese text.

Background technique

[0002] Internet data dissemination is an important feature of the current big data era. In the process of Internet data dissemination, there are a large number of electronic documents. The content quality of electronic documents will not only affect the reading experience of readers, but also affect the public influence of the author. Among them, the correct expression of Chinese is a key factor to improve the quality of the content of the article. How to identify Chinese errors in electronic document data has become a time-consuming and labor-intensive task that must be done. On the one hand, due to the richness and variety of Chinese expressions, there are many kinds of errors. On the other hand, with the development of artificial intelligence technology, speech r...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/126, G06F40/232, G06F40/30, G06N3/04, G06N3/08
CPC: G06F40/126, G06F40/232, G06F40/30, G06N3/08, G06N3/048
Inventors: 陈波, 龚承启, 谢旭阳, 吴庆北
Owner: 中电云计算技术有限公司