GRU (Gated Recurrent Unit)-CRF (Conditional Random Fields) conference name recognition method based on language model

A GRU-CRF and language-model technology, applied in the field of named entity recognition combining a GRU with a conditional random field, to achieve an improved recognition effect and a more reasonable and effective label sequence.

Active Publication Date: 2018-08-10
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0011] The present invention mainly aims to solve the problem of named entity recognition in a specific field for which only a small amount of labeled corpus is available.



Examples

  • Experimental program
  • Comparison scheme
  • Effect test

Embodiment Construction

[0025] In order to make the purpose, technical solution and characteristics of the present invention clearer, the specific implementation of the method is further described below.

[0026] Both the recognition model and the language model of the present invention use a GRU, and the recognition model combines the GRU with a CRF. Compared with other methods, the advantages of the present invention are:

[0027] As a variant of the recurrent neural network, the GRU retains the advantages of recurrent networks and is well suited to processing sequential data such as natural language. At the same time, the GRU has fewer parameters than the LSTM, so it is theoretically more computationally efficient and requires relatively less training data.

[0028] The GRU can automatically learn low-level features and high-level concepts without tedious manual work such as feature engineering or the encoding of domain knowledge; it is an end-to-end recognition method, as sketched below.

[0029] Named entity r...
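The following is a minimal, hypothetical sketch of the GRU-CRF recognition model described in paragraphs [0026]-[0028]. It assumes PyTorch and the third-party pytorch-crf package; the vocabulary size, tag set, layer widths and randomly initialized embeddings are illustrative placeholders, whereas the patent feeds the tagger with word vectors taken from its pretrained GRU language model.

```python
# Minimal sketch of a GRU-CRF sequence labeler (not the authors' exact code).
# Assumptions: PyTorch plus the third-party pytorch-crf package
# (`pip install pytorch-crf`); sizes below are placeholders.
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party CRF layer


class GRUCRFTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # A GRU cell has 3 gate/candidate weight groups versus 4 for an LSTM,
        # which is why it has fewer parameters for the same hidden size.
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        self.emission = nn.Linear(2 * hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, tokens, tags=None, mask=None):
        feats, _ = self.gru(self.embed(tokens))
        emissions = self.emission(feats)          # per-token tag scores
        if tags is not None:
            # Training: negative log-likelihood of the gold tag sequence.
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        # Inference: Viterbi decoding of the most likely tag sequence.
        return self.crf.decode(emissions, mask=mask)


# Tiny usage example on dummy data.
model = GRUCRFTagger(vocab_size=5000, num_tags=5)
tokens = torch.randint(0, 5000, (2, 7))            # 2 sentences of length 7
tags = torch.randint(0, 5, (2, 7))
mask = torch.ones(2, 7, dtype=torch.bool)
loss = model(tokens, tags, mask)                    # scalar training loss
predicted = model(tokens, mask=mask)                # lists of tag indices
```

In this sketch a bidirectional GRU encodes each sentence into emission scores, and the CRF layer selects the most reasonable label sequence by Viterbi decoding rather than labeling each token independently.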



Abstract

The invention discloses a GRU (Gated Recurrent Unit)-CRF (Conditional Random Fields) conference name recognition method based on a language model. The method comprises two parts: one part is a GRU-based language model, and the other part is a GRU-CRF-based recognition model. An end-to-end recognition model that requires neither feature engineering nor domain knowledge is obtained by training the labeling model GRU-CRF on labeled supervision data. Unsupervised training is performed on the LM (Language Model) with a large amount of unlabeled data, and the word vectors obtained from the unsupervised LM are used as the input of the GRU-CRF, so that the supervised training effect and the generalization ability of the recognition model are improved; it therefore becomes possible to train a named entity recognition model with a relatively good effect from only a small amount of labeled corpus. Experimental results show that the LM-GRU-CRF method achieves the best effect on a self-built corpus, and that the method can also be used to improve model performance on other named entity recognition tasks that lack labeled corpora.
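As a companion to the abstract, the sketch below illustrates, under stated assumptions, the unsupervised language-model step: a GRU language model is trained on unlabeled text with a next-word objective, and its learned word vectors are then reused as the input representation of the GRU-CRF tagger. PyTorch, the dummy batch, and all hyperparameters are assumptions made for illustration and are not taken from the patent.

```python
# Minimal sketch of the unsupervised language-model step (assumptions, not
# the authors' code): a GRU language model trained on unlabeled token ids
# with a next-word objective, whose embedding table is later reused as the
# word-vector input of the GRU-CRF tagger.
import torch
import torch.nn as nn


class GRULanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        hidden, _ = self.gru(self.embed(tokens))
        return self.out(hidden)                    # next-token logits


vocab_size = 5000
lm = GRULanguageModel(vocab_size)
optimizer = torch.optim.Adam(lm.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One unsupervised training step on a dummy batch of unlabeled token ids.
batch = torch.randint(0, vocab_size, (8, 20))      # 8 sequences of length 20
inputs, targets = batch[:, :-1], batch[:, 1:]      # predict each next token
logits = lm(inputs)
loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()

# After pretraining on a large unlabeled corpus, the learned word vectors can
# be copied into the tagger's embedding layer (and optionally frozen), which
# is how the language model is used to improve the supervised GRU-CRF training.
pretrained_vectors = lm.embed.weight.detach().clone()
```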

Description

Technical field

[0001] The invention belongs to the field of named entity recognition and deep learning, and is a named entity recognition method based on the combination of a language model (Language Model, LM), a GRU (Gated Recurrent Unit) and conditional random fields (Conditional Random Fields, CRF). The conference names recognized here are named entities in a specific field for which only a small amount of labeled corpus is available. The present invention mainly aims to solve the problem of named entity recognition under the condition that only a small amount of labeled corpus is available.

Background technique

[0002] Named entity recognition is a key task in natural language processing. It was first introduced at the MUC conference in 1995. Its purpose is to identify specific types of entity names and meaningful quantitative phrases in text, covering three categories: entity names, time expressions and numbers, which can be further subdivided into seven sub-categories: person name, place name,...


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F17/27
CPC: G06F40/295
Inventor: 王洁, 张瑞东
Owner: BEIJING UNIV OF TECH