Chinese text automatic error correction method and device

An automatic text error correction technology, applied in neural learning methods, natural language data processing, instruments, etc. It addresses the limited ability to generalize to data from new scenarios, the restriction to a single error correction scenario, and the heavy reliance on manual experience, thereby improving time efficiency and overall accuracy, improving the actual error correction effect, and broadening the error correction range.

Active Publication Date: 2022-04-19
中电云计算技术有限公司

AI Technical Summary

Problems solved by technology

This method, which relies on a confused word set, has the following disadvantages: 1) constructing the confused word set requires a large amount of erroneous data from real scenarios; 2) it also requires a lot of manual experience; 3) the confused word set depends on the data domain, and it is difficult to exhaust all possible errors, so the ability to generalize to data from new scenarios is very limited.
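The generalization limit described above can be illustrated with a minimal sketch of confused-word-set correction. The word pairs and the function name `correct_with_confusion_set` are hypothetical and chosen only for illustration; the patent excerpt does not specify how such a set is stored or applied.

```python
from typing import Dict, List, Tuple

# Hypothetical confused word set: wrong form -> intended form.
# Real sets are built from large amounts of observed error data.
CONFUSION_SET: Dict[str, str] = {
    "帐号": "账号",
    "登陆": "登录",
}


def correct_with_confusion_set(
    text: str, confusion: Dict[str, str]
) -> Tuple[str, List[Tuple[int, str, str]]]:
    """Replace every known wrong form and report (position, wrong, right)."""
    errors: List[Tuple[int, str, str]] = []
    for wrong, right in confusion.items():
        start = text.find(wrong)
        while start != -1:
            errors.append((start, wrong, right))
            start = text.find(wrong, start + len(wrong))
    errors.sort()  # report errors in reading order

    corrected = text
    for wrong, right in confusion.items():
        corrected = corrected.replace(wrong, right)
    return corrected, errors


# A sentence whose errors are in the set is fixed.
fixed, found = correct_with_confusion_set("请登陆您的帐号", CONFUSION_SET)

# A sentence with wording not listed in the set passes through unchanged,
# whether or not it contains an error: the lookup cannot generalize.
unseen, missed = correct_with_confusion_set("请登入您的帐户", CONFUSION_SET)
```

Because the lookup only knows the pairs it was given, coverage grows only as fast as the set is extended by hand, which is the limitation listed in item 3.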
[0005] The second idea is to apply deep neural network models to Chinese error correction tasks and to try Chinese error correction methods in different scenarios. Its disadvantages are that each method covers only a single error correction scenario and that there is still considerable room for improvement in error correction accuracy.

Method used



Examples


Embodiment Construction

[0050] Embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0051] It should be noted that, where no conflict arises, the following embodiments and the features in the embodiments can be combined with each other. Moreover, all other embodiments obtained by those of ordinary skill in the art without creative work, based on the embodiments in the present disclosure, fall within the protection scope of the present disclosure.

[0052] It is noted that the following describes various aspects of the embodiments that are within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is illustrative only. Based on the present disclosure, one skilled in the art should appreciate that an aspect described herein may be implemented independently of any other aspects and that two or more of t...



Abstract

The invention provides a Chinese text automatic error correction method and device. The method comprises the following steps: performing shallow error correction on a to-be-corrected text to obtain a first sentence sequence; performing deep neural network model correction on the first sentence sequence to obtain a fifth sentence sequence; post-processing the fifth sentence sequence to obtain a corrected sample; and outputting the corrected sample and the error information. The device comprises a shallow error correction module, a deep neural network model correction module, a post-processing module and an integrated output module. The deep neural network model correction module is composed of an equal-length sequence error correction unit, a word redundancy error correction unit, a word missing error correction unit, a language model judgment unit and a three-model fusion unit. The post-processing module is composed of a place name error detection unit and a sensitive word error detection unit. The method enables automatic data set generation and deep neural network model correction, covers a more comprehensive range of Chinese errors, and achieves higher error correction efficiency.
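The staged flow in the abstract can be sketched as a small pipeline. This is only an illustration under stated assumptions: the class names, the stage signatures and the toy stage functions below are hypothetical, and simple string rules stand in for the shallow correction, the neural models and the post-processing detectors, whose internals are not given in this excerpt.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A stage maps a sentence to a (possibly) corrected sentence. Assumed interface.
Stage = Callable[[str], str]


@dataclass
class ErrorInfo:
    stage: str   # which stage changed the text
    before: str  # text entering the stage
    after: str   # text leaving the stage


@dataclass
class CorrectionPipeline:
    shallow: Stage       # stands in for the shallow error correction module
    deep_model: Stage    # stands in for the deep neural network correction module
    post_process: Stage  # stands in for the post-processing module

    def run(self, text: str) -> Tuple[str, List[ErrorInfo]]:
        """Apply the stages in order; return the corrected text and a change log."""
        errors: List[ErrorInfo] = []
        current = text
        for name, stage in (
            ("shallow", self.shallow),
            ("deep_model", self.deep_model),
            ("post_process", self.post_process),
        ):
            updated = stage(current)
            if updated != current:
                errors.append(ErrorInfo(name, current, updated))
            current = updated
        return current, errors


# Toy placeholder stages (NOT the patent's models).
def toy_shallow(text: str) -> str:
    return text.replace("。。", "。")  # collapse a doubled full stop


def toy_deep_model(text: str) -> str:
    return text.replace("登陆", "登录")  # stand-in for model-based correction


def toy_post_process(text: str) -> str:
    return text  # stand-in for place name / sensitive word checks


pipeline = CorrectionPipeline(toy_shallow, toy_deep_model, toy_post_process)
corrected, error_info = pipeline.run("请登陆您的账号。。")
```

Keeping each stage behind the same callable interface means a rule-based stage can later be swapped for a trained model without changing how the corrected text and error information are collected.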

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a Chinese text automatic error correction method and device.

Background technique

[0002] Internet data dissemination is an important feature of the current era of big data, and it involves a large number of electronic documents. The content quality of electronic documents affects not only the reading experience of readers but also the public influence of authors. Correct Chinese expression is the key factor in improving the quality of article content, so identifying Chinese errors in electronic document data has become necessary but time-consuming and laborious work. On the one hand, due to the variety of Chinese expressions, there are many types of errors. On the other hand, with the development of artificial intelligence technology, speech recognition and OCR recognition are likely to...

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/126, G06F40/232, G06F40/30, G06N3/04, G06N3/08
CPC: G06F40/126, G06F40/232, G06F40/30, G06N3/08, G06N3/048
Inventor: 陈波, 龚承启, 谢旭阳, 吴庆北
Owner: 中电云计算技术有限公司