Chinese text error correction system, method, device and computer readable storage medium

A Chinese text error correction technology, applied in computing, computing models, machine learning and related fields. It addresses the backwardness of existing Chinese spelling check and error correction methods, which hinders downstream NLP tasks such as emotion recognition and text classification.

Pending Publication Date: 2020-09-08
民生科技有限责任公司

AI Technical Summary

Problems solved by technology

Current Chinese spelling check and error correction methods are relatively outdated, which hinders downstream NLP tasks such as emotion recognition and text classification.



Embodiment Construction

[0051] In order to better understand the technical solutions of the present invention, the embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0052] It should be clear that the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0053] Terms used in the embodiments of the present invention are only for the purpose of describing specific embodiments, and are not intended to limit the present invention. As used in the embodiments of the present invention and the appended claims, the singular forms "a", "said" and "the" are also intended to include the plural forms unless the context clearly indicates otherwise.

[0054] The invention provides a Chinese text err...


Abstract

The invention provides a Chinese text error correction system, method, device and computer-readable storage medium. The method uses several machine learning techniques to check and correct multiple kinds of errors in Chinese text, turning disfluent text into fluent Chinese suitable for reading, and corrects characters in a sentence that were replaced by characters similar in shape or identical in pronunciation. The position of a wrong character is located by perplexity (confusion degree); a confusion set and a language model are used to select a suitable replacement for the wrong character; and a scoring method finally selects and returns the correct Chinese expression. The method adopts multi-threaded processing: input short texts are divided into two batches handled by two processes started together, which doubles the speed. Under concurrent conditions, the current processing efficiency of Chinese spelling check and correction is 500 QPS.
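The three steps named in the abstract (locating the wrong character by confusion degree, i.e. perplexity, proposing replacements from a confusion set, and choosing the result by score) can be illustrated with a minimal sketch. The abstract does not specify the language model, so everything here is an assumption made for illustration: the character bigram model with add-one smoothing, the four-sentence toy corpus, the confusion-set entries, and the function names `perplexity`, `suspect_position` and `correct`.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real system would train on a large text collection.
CORPUS = ["我今天很高兴", "今天天气很好", "我很高兴见到你", "天气很好我很高兴"]

# Hypothetical confusion set: characters similar in shape or pronunciation.
CONFUSION = {"心": ["兴", "新"], "气": ["汽", "器"], "汽": ["气"]}


def train_bigrams(corpus):
    """Count character bigrams and their left contexts, with sentence markers."""
    contexts, bigrams = Counter(), Counter()
    for sent in corpus:
        chars = ["<s>"] + list(sent) + ["</s>"]
        contexts.update(chars[:-1])
        bigrams.update(zip(chars, chars[1:]))
    vocab = {c for s in corpus for c in s} | {"</s>"}
    return contexts, bigrams, len(vocab)


CONTEXTS, BIGRAMS, VOCAB_SIZE = train_bigrams(CORPUS)


def log_prob(a, b):
    """Add-one smoothed log P(b | a); unseen pairs get a small nonzero value."""
    return math.log((BIGRAMS[(a, b)] + 1) / (CONTEXTS[a] + VOCAB_SIZE))


def perplexity(sent):
    """Sentence perplexity under the bigram model (lower means more fluent)."""
    chars = ["<s>"] + list(sent) + ["</s>"]
    total = sum(log_prob(a, b) for a, b in zip(chars, chars[1:]))
    return math.exp(-total / (len(chars) - 1))


def suspect_position(sent):
    """Index of the character whose two neighbouring bigrams are least likely."""
    chars = ["<s>"] + list(sent) + ["</s>"]
    scores = [
        log_prob(chars[i - 1], chars[i]) + log_prob(chars[i], chars[i + 1])
        for i in range(1, len(chars) - 1)
    ]
    return min(range(len(scores)), key=scores.__getitem__)


def correct(sent):
    """Try confusion-set replacements at the suspect position; keep the best."""
    pos = suspect_position(sent)
    candidates = [sent] + [
        sent[:pos] + c + sent[pos + 1:] for c in CONFUSION.get(sent[pos], [])
    ]
    # The original comes first, so it is kept when no candidate scores better.
    return min(candidates, key=perplexity)


print(correct("我今天很高心"))
print(correct("今天天汽很好"))
```

The sketch corrects only one position per sentence and scores whole sentences by perplexity. A production system would handle several errors per sentence and use a much stronger language model, but the flow of detect, propose, score stays the same.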

Description

Technical field

[0001] The invention relates to the technical field of computer word processing, in particular to a Chinese text error correction system, method, device and computer-readable storage medium based on a machine learning model.

Background technique

[0002] Although Chinese is the most widely used language in the world, its development in the field of machine learning still faces many limitations. Because of the complexity of Chinese phonetics, character forms, grammar, etc., both Chinese spelling check and error correction are in great demand, whether the text comes from manual input or from machine recognition.

[0003] At the same time, because Chinese is a non-alphabetic script, its NLP processing differs in many ways from that of the many alphabetic scripts, of which English is the leading example. The main difference is that there is no space between words in written Chinese text, so Chinese word segmentation technology is the first difficulty encountered in pro...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/232, G06N20/00
CPC: G06F40/232, G06N20/00
Inventor: 李振张刚鲍东岳尹正张雨枫刘昊霖陈厚霖傅佳美
Owner: 民生科技有限责任公司