Method for correcting wrongly written characters in Chinese spelling based on CNN-LSTM (convolutional neural network and long short-term memory)

A technology for Chinese character typos, applied in the field of computer natural language processing. It addresses problems such as the inability to correct typos and improves correction accuracy.

Active Publication Date: 2018-05-04
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

The existing model can detect typos but cannot correct them, which is its main flaw.



Examples


Embodiment 1

[0026] As shown in Figure 1, a CNN-LSTM-based method for correcting Chinese spelling typos includes the following steps:

[0027] A: The encoding part. The encoding part mainly encodes the input sentence and filters out typos. It comprises the following steps:

[0028] A1: The input sentence is first converted into a matrix using pre-trained word2vec (word2vector) Chinese word vectors. A window of fixed width is then opened over the matrix, and only the information inside the window is encoded.
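Step A1 can be illustrated with a minimal sketch. The names `embed_sentence`, `windows`, and `toy_vectors` are hypothetical, the random vectors are a stand-in for real pre-trained word2vec vectors, and the character-level lookup, the zero vector for unknown characters, and the window width of 3 are assumptions that the text does not specify.

```python
import random

def embed_sentence(sentence, vectors, dim):
    """Map each character to its vector; unknown characters get a zero vector (assumption)."""
    zero = [0.0] * dim
    return [vectors.get(ch, zero) for ch in sentence]

def windows(matrix, width):
    """Yield every contiguous block of `width` rows, one window per start position."""
    for start in range(len(matrix) - width + 1):
        yield matrix[start:start + width]

# Toy stand-in for pre-trained vectors: random values, NOT real word2vec output.
rng = random.Random(0)
DIM = 4
toy_vectors = {ch: [rng.uniform(-1, 1) for _ in range(DIM)] for ch in "今天天气很好"}

sentence_matrix = embed_sentence("今天天气很好", toy_vectors, DIM)  # 6 rows of DIM values
window_list = list(windows(sentence_matrix, width=3))              # width 3 is an assumption

print(len(sentence_matrix), len(window_list))
```

Each window is a small sub-matrix of the sentence matrix, so later layers only ever see the characters inside it.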

[0029] A2: The encoding part comprises two different convolutional neural network (CNN) convolution kernels. One detects whether the Chinese characters in the window contain typos; its width and height equal the window size and the word-vector size, respectively, and its output passes through a nonlinear sigmoid function. The other encodes the Chinese character information ...
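The two kernels in step A2 might look roughly like the following sketch. Only the detection kernel's shape (window size by word-vector size) and its sigmoid output come from the text. The function names, the random weights, the encoding kernel's shape, and the number of encoding filters (`N_FEATURES`) are all assumptions, because the description of the encoding kernel is truncated in this excerpt.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def full_window_conv(window, kernel, bias=0.0):
    """Apply a kernel exactly as large as the window: the result is a single scalar."""
    return bias + sum(
        w * k
        for row_w, row_k in zip(window, kernel)
        for w, k in zip(row_w, row_k)
    )

def detect_typo_score(window, detect_kernel, bias=0.0):
    """Detection kernel followed by sigmoid, giving a score strictly between 0 and 1."""
    return sigmoid(full_window_conv(window, detect_kernel, bias))

def encode_window(window, encode_kernels):
    """One feature per encoding filter (filter shape and count are assumptions)."""
    return [full_window_conv(window, k) for k in encode_kernels]

rng = random.Random(0)
WIDTH, DIM, N_FEATURES = 3, 4, 5  # hypothetical sizes

def rand_matrix():
    return [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(WIDTH)]

window = rand_matrix()            # stands in for one embedded window from step A1
detect_kernel = rand_matrix()     # untrained weights, for illustration only
encode_kernels = [rand_matrix() for _ in range(N_FEATURES)]

score = detect_typo_score(window, detect_kernel)
features = encode_window(window, encode_kernels)
print(round(score, 3), len(features))
```

Because the detection kernel covers the whole window, it produces exactly one score per window, which the sigmoid maps into the range (0, 1).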


Abstract

The invention provides a method for correcting wrongly written characters in Chinese spelling based on CNN-LSTM (convolutional neural network and long short-term memory). The method corrects mistakes mainly by using the context of the text: whether each Chinese character is correct is judged from its context, and a wrongly written character is corrected according to that context. A random mistake-correction training mode is adopted in model training, which improves correction accuracy.

Description

Technical field

[0001] The present invention relates to the field of computer natural language processing methods, and more specifically to a CNN-LSTM-based method for correcting Chinese spelling typos.

Background technique

[0002] With the rapid development of China's economy, China's influence in the world is growing, and more and more foreigners are beginning to learn Chinese. They often make spelling mistakes while learning a foreign language, such as writing "today's weather is very good" as "the weather is very good today". Such a mistake usually affects one or a few characters in a sentence without introducing any grammatical error. In a computer-assisted teaching system, the computer therefore needs to find and correct this kind of spelling error automatically.

[0003] The previous traditional method of correcting typos of Chinese characters is mainly to replace the Chinese characters in the sentence with the candidate Chinese characters in the typo table (t...


Application Information

IPC(8): G06F3/023
CPC: G06F3/0233
Inventor: 张晋斌, 潘嵘
Owner: SUN YAT SEN UNIV