Dynamic mask training method in Chinese automatic grammar error correction

A training method and masking technique, applied to the dynamic mask training of neural network models. It addresses the high cost in manpower and material resources of obtaining labeled data and the resulting limits on neural network model performance, with the effect of enhancing generalization ability and increasing the richness of the training samples.

Active Publication Date: 2020-04-24
PEKING UNIV
4 Cites, 10 Cited by

AI Technical Summary

Problems solved by technology

Existing supervised methods usually rely on labeled data, and obtaining high-quality labeled data requires substantial manpower and material resources.
In practice, however, the limited amount of data available for automatic grammatical error correction, especially for Chinese, severely limits the performance of neural network models, so current Chinese grammatical error correction models generally perform poorly.

Embodiment Construction

[0023] The present invention is further described below by way of example.

[0024] Referring to figure 2, assume a parallel corpus S of annotated sentence pairs, each consisting of a sentence X containing grammatical errors and the corresponding corrected sentence Y. In the t-th round of training the model Θ, let the training set of the current round be S^(t). For each sentence pair (X, Y) in S, the dynamic noise-adding module selects the current replacement method f. If the replacement mode is one of the four modes blank, random, word frequency, or homonym, the replacement method used when adding noise is fixed to that mode; otherwise (the mixed mode), the current replacement method is chosen at random from among these modes. Applying the replacement method f to X yields a noised sentence pair. The source sentence after dynamic masking is given by the following formula:

[0025]

[0026] where m is the length of the source sentence X.

[0027] The i-th word of the masked source sentence is given by:

[0028]

[0029] where p is a random value sampled from...
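
The per-word replacement described in [0024] to [0029] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the formulas are not reproduced above, so the threshold `delta`, the uniform sampling of p, the `<blank>` token, the homonym lookup table, and all function names are assumptions made for the example.

```python
import random
from collections import Counter

BLANK = "<blank>"  # hypothetical placeholder token for the "blank" mode


def make_replacers(vocab_counts, homonyms, rng):
    """Build one replacement function per mode named in the text.

    vocab_counts: mapping word -> corpus frequency (assumed input).
    homonyms: mapping word -> list of homonym candidates (assumed input).
    """
    words = list(vocab_counts)
    weights = [vocab_counts[w] for w in words]
    return {
        "blank": lambda w: BLANK,
        # uniform draw from the vocabulary
        "random": lambda w: rng.choice(words),
        # draw proportional to word frequency
        "word_frequency": lambda w: rng.choices(words, weights=weights, k=1)[0],
        # fall back to the word itself when no homonym is known
        "homonym": lambda w: rng.choice(homonyms.get(w, [w])),
    }


def dynamic_mask(source, mode, replacers, rng, delta=0.3):
    """Return a noised copy of `source` (a list of words).

    `delta` is an assumed replacement probability; the source text
    is truncated before it states how p is sampled or compared.
    """
    if mode == "mixed":
        # mixed mode: pick one of the four methods at random for this sentence
        f = replacers[rng.choice(sorted(replacers))]
    else:
        f = replacers[mode]
    # p is drawn per word; here, uniformly from [0, 1) (assumption)
    return [f(w) if rng.random() < delta else w for w in source]


rng = random.Random(0)
counts = Counter({"我": 5, "是": 3, "学生": 1})
replacers = make_replacers(counts, {"是": ["事", "市"]}, rng)
noised = dynamic_mask(["我", "是", "学生"], "mixed", replacers, rng)
```

The noised sentence always has the same length m as the source, because each word is either kept or replaced one for one.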

Abstract

The invention provides a dynamic mask training method for Chinese automatic grammar error correction and belongs to the field of natural language processing. The method introduces multiple noise-adding modes based on word replacement and provides a mixed noise-adding mode to make better use of existing annotated data, improving the generalization ability and robustness of the model. A dynamic mask mechanism avoids the drawback of a static mask mechanism, which generates the same sample repeatedly, and thereby further improves grammar error correction. After the erroneous source-side sentence has passed through the dynamic mask, it is paired with the original correct target-side sentence to form a new training sample, and a word-level sequence-to-sequence model is trained on these samples. By introducing varied noise through the different noise-adding modes, the method improves the generalization ability of the neural network model, alleviates data scarcity in Chinese grammar error correction, and improves the training of the Chinese automatic grammar error correction model.
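
The contrast between static and dynamic masking can be illustrated with a small sketch. It is an assumption-laden example rather than the patented procedure: it uses only a blank-style replacement, and the `<blank>` token, the probability `delta`, and the function names are hypothetical.

```python
import random

BLANK = "<blank>"  # hypothetical placeholder token


def blank_mask(source, rng, delta=0.3):
    # Replace each word with BLANK with assumed probability delta.
    return [BLANK if rng.random() < delta else w for w in source]


def static_epochs(pairs, epochs, rng):
    # Static masking: noise is added once, so every epoch sees the same samples.
    noised = [(blank_mask(x, rng), y) for x, y in pairs]
    return [noised for _ in range(epochs)]


def dynamic_epochs(pairs, epochs, rng):
    # Dynamic masking: noise is re-sampled in every epoch, so the model
    # sees a fresh noised source paired with the unchanged target.
    return [[(blank_mask(x, rng), y) for x, y in pairs] for _ in range(epochs)]


sentence = [f"w{i}" for i in range(20)]
pairs = [(sentence, sentence)]  # toy (source, target) pair
static = static_epochs(pairs, 10, random.Random(0))
dynamic = dynamic_epochs(pairs, 10, random.Random(0))
```

In both variants the target sentence Y is left untouched; only the source side is noised.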

Description

Technical field

[0001] The invention belongs to the field of natural language processing, and in particular relates to a dynamic masking method for training a neural network model in Chinese automatic grammar error correction.

Background

[0002] Automatic grammatical error correction has a wide range of applications, such as foreign language learning and document proofreading. In a grammatical error correction system, the user inputs a natural language sentence that may contain errors, and the system outputs the corrected sentence.

[0003] If sentences containing errors are regarded as the source language and the corrected sentences as the target language, then grammatical error correction can be regarded as a translation process. The process of converting an erroneous sentence (language fragment) into a correct sentence (fragment) by the system is to encode the information in the (wrong) source sentence thro...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/253, G06N3/08
CPC: G06N3/08
Inventor: 王厚峰, 赵泽伟
Owner: PEKING UNIV