Chinese grammar error correction method based on weakened grammar error feature representation

The method applies to Chinese grammatical error correction in the Internet technology field and addresses problems such as the poor performance of existing models on grammatical error correction tasks.

Active Publication Date: 2020-10-13
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

The feature representation extracted from the text to be corrected contains feature information about its grammatical errors. As a result, the prior-art Transformer neural network model is affected by this grammatical error information and performs poorly on grammatical error correction tasks.

Examples

Embodiment 1

[0088] Figures 3 and 4 show the Chinese grammatical error correction method based on weakened grammatical error feature representation provided by the present invention. Specifically, the method includes:

[0089] (1) Divide the Chinese grammatical error correction corpus into the text data to be corrected and the corresponding correct text data;

[0090] (2) Map the Chinese characters of the text to be corrected and of the correct text to vector representations using the same dictionary, so that each input text is converted into a numerical matrix whose columns are the character vectors, concatenated in order;

[0091] In this embodiment, each character is mapped to a 512-dimensional vector. This step is implemented with a mapping dictionary that maps each character to a dense vector representation. First, a mapping dictionary from the characters in the corpus text to character vectors is established, and each characte...
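Steps (1) and (2) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function names, the `<unk>` entry, the random initialization, and the toy sentence pair are all assumptions made for the example. Only the shared dictionary, the 512-dimensional character vectors, and the column-wise matrix layout come from the text above.

```python
import random

EMBED_DIM = 512  # per-character vector size stated in this embodiment


def split_corpus(pairs):
    """Step (1): separate (to-be-corrected, correct) pairs into two parallel lists."""
    sources = [src for src, _ in pairs]
    targets = [tgt for _, tgt in pairs]
    return sources, targets


def build_vocab(texts):
    """Give every distinct character an integer id (one dictionary for both sides).

    The "<unk>" entry for unseen characters is an assumption of this sketch.
    """
    vocab = {"<unk>": 0}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def init_embeddings(vocab, dim=EMBED_DIM, seed=0):
    """Create one dense vector per dictionary entry (random values, for illustration)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(len(vocab))]


def text_to_matrix(text, vocab, table):
    """Step (2): return a dim x len(text) matrix whose j-th column is character j's vector."""
    vectors = [table[vocab.get(ch, 0)] for ch in text]
    return [list(row) for row in zip(*vectors)]  # transpose rows into columns


# Toy pair invented for this example; it is not taken from the patent's corpus.
pairs = [("他是学生们", "他是学生")]
sources, targets = split_corpus(pairs)
vocab = build_vocab(sources + targets)   # same dictionary for both texts
table = init_embeddings(vocab)
matrix = text_to_matrix(sources[0], vocab, table)
print(len(matrix), len(matrix[0]))
```

In a trained model the vectors would be learned parameters rather than fixed random values; the lookup and the matrix layout stay the same.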

Abstract

The invention provides a Chinese grammar error correction method based on weakened grammar error feature representation. Building on a Transformer neural network used for the Chinese grammar error correction task, the encoder extracts a character feature representation and a context feature representation, and a weakening factor is learned for each character in the text to be corrected. The weakening factor combines the character feature representation and the context feature representation extracted by the encoder through a joint equation. In the feature representation of the text to be corrected extracted by the encoder, the feature information of grammar errors is thereby suppressed, which weakens the negative influence of that information on the Chinese grammar error correction model and improves the performance of the Transformer-based sequence-to-sequence neural network model on the Chinese grammar error correction task.
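The abstract does not reproduce the joint equation, so the sketch below uses a generic sigmoid gate purely as a stand-in to show how a per-character factor could blend the two representations. The gate form, the function names, and the parameters `weights` and `bias` are assumptions of this example and should not be read as the patent's actual formula.

```python
import math


def sigmoid(x):
    """Squash a real-valued score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def weakening_factor(char_vec, ctx_vec, weights, bias):
    """Hypothetical per-character factor computed from both representations.

    `weights` has one entry per element of the concatenation [char_vec; ctx_vec].
    """
    score = sum(w * v for w, v in zip(weights, char_vec + ctx_vec)) + bias
    return sigmoid(score)


def combine(char_vec, ctx_vec, factor):
    """Assumed joint form: a convex blend of character and context features.

    A small factor leans on the context features, reducing the weight of a
    character's own (possibly erroneous) features.
    """
    return [factor * c + (1.0 - factor) * x for c, x in zip(char_vec, ctx_vec)]


# Tiny 2-dimensional example with made-up numbers.
char_vec = [1.0, 0.0]
ctx_vec = [0.0, 1.0]
factor = weakening_factor(char_vec, ctx_vec, weights=[0.0, 0.0, 0.0, 0.0], bias=0.0)
blended = combine(char_vec, ctx_vec, factor)
print(factor, blended)
```

With all weights and the bias set to zero the gate is exactly 0.5, so the blend is the midpoint of the two vectors; in a real model the gate parameters would be learned jointly with the encoder.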

Description

Technical field

[0001] The invention relates to the technical field of the Internet, in particular to a method for correcting Chinese grammar errors based on a Transformer neural network.

Background art

[0002] Chinese is one of the oldest and most complex languages in the world. With the continuous development of China, more and more foreigners are learning Chinese as a second language. Automatic Chinese grammatical error correction can replace traditional manual correction, which is time-consuming and labor-intensive, and can improve the efficiency with which foreigners learn Chinese. At the same time, Chinese grammatical error correction can serve as an auxiliary task in natural language processing to improve the quality and plausibility of the text produced in generation tasks. Therefore, the task of Chinese grammatical error correction has attracted widespread attention in both academia and industry in recent years. [0...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/253, G06F40/129, G06N3/04, G06N3/08
CPC: G06F40/253, G06F40/129, G06N3/084, G06N3/047, G06N3/045
Inventor: 李思, 梁景贵, 陆树栋, 李明正, 孙忆南
Owner: BEIJING UNIV OF POSTS & TELECOMM