A Chinese grammatical error detection method based on a word vector with text information added

A technology for detecting grammatical errors with the help of text information, applied in the field of information processing. It addresses two problems of existing methods: they do not make good use of Chinese vocabulary, and they ignore polysemy.

Active Publication Date: 2018-12-11
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

At present, most of the existing detection methods do not make good use of the inform


Embodiment Construction

[0013] Next, embodiments of the present invention will be described in more detail.

[0014] Figure 1 is a network structure diagram of the error detection method provided by the present invention, which includes the following steps:

[0015] Step S1: text words are vectorized;

[0016] Step S2: a recurrent neural network forms the text information related to each word vector;

[0017] Step S3: the text matrix is reconstructed;

[0018] Step S4: a recurrent neural network extracts context information;

[0019] Step S5: a feedforward neural network calculates the error score of each word;

[0020] Step S6: the error scores are used to infer the error location.
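The six steps above can be read as a single forward pass. The following is a minimal sketch under stated assumptions, not the patented implementation: the plain tanh recurrent cell, the concatenation used for the reconstruction in S3, the sigmoid scoring layer, the 0.5 threshold in S6, and all names (`simple_rnn`, `detect_errors`, `init_params`) and layer sizes are hypothetical choices made only for illustration. The weights are random and untrained, so the scores carry no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

def simple_rnn(inputs, w_in, w_rec):
    """Run a plain tanh RNN over the rows of `inputs`; return every hidden state."""
    hidden = np.zeros(w_rec.shape[0])
    states = []
    for x in inputs:
        hidden = np.tanh(x @ w_in + hidden @ w_rec)
        states.append(hidden)
    return np.stack(states)

def init_params(vocab_size, dim=8, hidden=6):
    """Random, untrained weights (ASSUMPTION: sizes chosen only for the demo)."""
    def mat(*shape):
        return rng.normal(scale=0.1, size=shape)
    return {
        "embeddings": mat(vocab_size, dim),
        "w_in1": mat(dim, hidden), "w_rec1": mat(hidden, hidden),
        "w_in2": mat(dim + hidden, hidden), "w_rec2": mat(hidden, hidden),
        "w_out": mat(hidden), "b_out": 0.0,
    }

def detect_errors(word_ids, params, threshold=0.5):
    # S1: look up one vector per word number to form the text matrix.
    text_matrix = params["embeddings"][word_ids]
    # S2: a recurrent pass yields text information for each word position.
    text_info = simple_rnn(text_matrix, params["w_in1"], params["w_rec1"])
    # S3: rebuild the text matrix (ASSUMPTION: by concatenating S1 and S2 outputs).
    rebuilt = np.concatenate([text_matrix, text_info], axis=1)
    # S4: a second recurrent pass extracts context information.
    context = simple_rnn(rebuilt, params["w_in2"], params["w_rec2"])
    # S5: a feedforward layer with a sigmoid gives one error score per word.
    logits = context @ params["w_out"] + params["b_out"]
    scores = 1.0 / (1.0 + np.exp(-logits))
    # S6: ASSUMPTION: positions scoring above the threshold are flagged as errors.
    return scores, np.flatnonzero(scores > threshold)

params = init_params(vocab_size=10)
scores, flagged = detect_errors(np.array([1, 4, 2, 7]), params)
print(scores.shape, flagged)
```

The sketch returns one score per input word and the indices of the flagged words, which mirrors the per-word scoring of S5 and the location inference of S6.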

[0021] Each step will be described in detail below:

[0022] Step S1: Vectorization of text words. The present invention first builds a mapping table from words to word vector numbers, and uses it to map each word in the text to its corresponding word number. It then initializes the word vector matrix; each row in the word vector ma...
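As an illustration of the word-to-number mapping in Step S1, here is a minimal sketch. It assumes that a word's number is used as a row index into the word vector matrix, that unseen words share a reserved `<unk>` entry, and that the matrix is randomly initialized; none of these details is confirmed by the text above, and the helper names `build_vocab` and `vectorize` are hypothetical.

```python
import numpy as np

def build_vocab(sentences, unk_token="<unk>"):
    """Map each distinct word to an integer number; 0 is reserved for unknown words."""
    vocab = {unk_token: 0}
    for sentence in sentences:
        for word in sentence:
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def vectorize(sentence, vocab, embeddings, unk_token="<unk>"):
    """Turn a segmented sentence into a text matrix with one row per word."""
    ids = [vocab.get(word, vocab[unk_token]) for word in sentence]
    return embeddings[ids]

# Toy, pre-segmented corpus (ASSUMPTION: word segmentation happens upstream).
corpus = [["我", "喜欢", "学习", "中文"], ["他", "学习", "语法"]]
vocab = build_vocab(corpus)

# ASSUMPTION: random initialization and an 8-dimensional vector size.
rng = np.random.default_rng(0)
embeddings = rng.normal(scale=0.1, size=(len(vocab), 8))

text_matrix = vectorize(["我", "学习", "语法"], vocab, embeddings)
print(text_matrix.shape)
```

A dictionary lookup followed by array indexing keeps the mapping table and the word vector matrix separate, so the matrix can later be updated by training without touching the table.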


Abstract

The invention discloses a Chinese grammatical error detection method and device based on word vectors with text information added, belonging to the field of information processing. The method proceeds as follows: first, the words of the input text are vectorized to form a text matrix; next, a recurrent neural network forms the text information related to each word vector; the text matrix is then reconstructed; a recurrent neural network extracts the context information; a feedforward neural network calculates the error score of each word; finally, the error scores are used to infer the error location. By combining word vectors with text information, the method improves Chinese grammatical error detection and has considerable practical value.

Description

Technical field

[0001] The invention relates to the field of information processing, in particular to a neural network-based Chinese grammatical error detection method.

Background

[0002] With the rapid development of China, more and more foreigners are starting to learn Chinese, so the task of Chinese grammatical error detection has attracted growing attention. The purpose of the task is to judge whether text written by non-native speakers of Chinese contains grammatical errors, and to give an error message.

[0003] Most current grammatical error detection models use sequence labeling: the model marks the erroneous words in the text and gives an error message. Commonly used statistical learning methods for Chinese grammatical error diagnosis include n-gram models, and machine learning methods include recurrent neural networks. However, these networks require more hand-crafted features to achieve be...


Application Information

IPC(8): G06F17/27, G06N3/04
CPC: G06F40/253, G06F40/284, G06N3/044
Inventors: 赵建博, 李思, 李明正, 徐雅静
Owner BEIJING UNIV OF POSTS & TELECOMM