
Small sample named entity recognition method based on data enhancement and active learning

A named entity recognition and active learning technology, applied in fields such as electrical digital data processing, natural language data processing, and instruments. It addresses the slow improvement or decline of the F1 value and the inability to adjust the query strategy, and it achieves a higher F1 value, a better recognition effect, and low complexity.

Active Publication Date: 2021-09-07
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

However, a query strategy is not universal: it must be chosen in advance and cannot be adjusted during training.
A query strategy that improves the F1 value of the named entity recognition model on one data set may yield no improvement, or even a decline, on a different data set.
In addition, active learning relies on multiple rounds of training, and the early rounds usually have little labeled data, so the named entity recognition model is limited by the labeled data and its F1 value improves slowly.
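
To make the notion of a query strategy concrete, here is a minimal sketch of one common choice from the active learning literature, least confidence. The source does not say which query strategies it evaluates, so the strategy itself, the function names `least_confidence` and `query`, and the toy probabilities are all assumptions for illustration.

```python
from typing import Dict, List


def least_confidence(token_probs: List[float]) -> float:
    """Uncertainty of one sentence: 1 minus the product of the
    per-token probabilities of the model's best tag (assumed input)."""
    confidence = 1.0
    for p in token_probs:
        confidence *= p
    return 1.0 - confidence


def query(unlabeled: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the ids of the `budget` most uncertain sentences."""
    ranked = sorted(
        unlabeled,
        key=lambda sid: least_confidence(unlabeled[sid]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical pool: sentence id -> best-tag probability for each token.
pool = {"s1": [0.99, 0.98], "s2": [0.6, 0.7], "s3": [0.9, 0.5]}
selected = query(pool, budget=2)
print(selected)
```

The strategy is fixed before training starts, which is the limitation described above: a scoring function that ranks sentences well on one data set may rank them poorly on another.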

Embodiment Construction

[0065] As shown in Figure 2, the present invention comprises the following steps:

[0066] In the first step, construct an active named entity recognition system combined with data enhancement. As shown in Figure 1, the system consists of an active learning module, a data labeling module, a data enhancement module, a named entity recognition module, a test data pool T, an unlabeled data pool U, a labeled data pool L, and an enhanced data pool A. The active learning module, the data labeling module, and the data enhancement module train the named entity recognition model in the named entity recognition module over multiple cycles, and label and enhance the data. In each cycle, the parameters of the named entity recognition model are sent to the named entity recognition module.
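
The multi-cycle interaction between the modules and the pools U, L, and A can be sketched roughly as follows. This is only an illustrative skeleton: the excerpt does not specify the order of operations inside a cycle, how sentences are selected, or whether the model is trained on L and A together, so all of those choices, along with the names `Pools`, `run_cycles`, `select`, `annotate`, `augment`, and `train`, are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Sentence = Tuple[str, ...]                   # tokens of one sentence
Labeled = Tuple[Sentence, Tuple[str, ...]]   # tokens plus one tag per token


@dataclass
class Pools:
    U: List[Sentence]                               # unlabeled data pool
    L: List[Labeled] = field(default_factory=list)  # labeled data pool
    A: List[Labeled] = field(default_factory=list)  # enhanced data pool


def run_cycles(pools: Pools, select: Callable, annotate: Callable,
               augment: Callable, train: Callable,
               rounds: int, budget: int):
    """Run the assumed select -> label -> enhance -> train cycle."""
    model = None
    for _ in range(rounds):
        if not pools.U:
            break
        chosen = select(model, pools.U, budget)     # active learning module
        for sentence in chosen:
            pools.U.remove(sentence)
        newly = [annotate(s) for s in chosen]       # data labeling module
        pools.L.extend(newly)
        for item in newly:                          # data enhancement module
            pools.A.extend(augment(item))
        model = train(pools.L + pools.A)            # NER module (assumed L + A)
    return model


# Toy stand-ins so the skeleton runs; none of these come from the source.
pools = Pools(U=[("Alice",), ("Paris",), ("hello",)])
model = run_cycles(
    pools,
    select=lambda m, unlabeled, k: unlabeled[:k],
    annotate=lambda s: (s, tuple("O" for _ in s)),
    augment=lambda item: [(tuple(t.lower() for t in item[0]), item[1])],
    train=lambda data: len(data),   # the "model" is just the training set size
    rounds=2,
    budget=1,
)
print(len(pools.U), len(pools.L), len(pools.A), model)
```

Passing the modules in as callables keeps the loop independent of any particular query strategy or enhancement technique, which matches the claim that the method works with various query strategies.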

[0067] The active learning module is connected to the data labeling module, the named entity recognition module, the labeled data pool L, and the enhanced data pool A. The active learning module obtains...

Abstract

The invention discloses a small-sample named entity recognition method based on data enhancement and active learning. It aims to improve the F1 value with which an active learning method recognizes unlabeled data in its early rounds, and it is effective for various query strategies. The technical scheme is as follows. First, an active named entity recognition system combined with data enhancement is constructed, and the data set required for training the named entity recognition model is prepared. Next, the active learning module, the data labeling module, and the data enhancement module train the named entity recognition model in the named entity recognition module over multiple cycles, and label and enhance the data. Finally, the trained named entity recognition module performs named entity recognition on the text in the test data pool T to obtain a predicted tag sequence. The method quickly improves named entity recognition when little labeled data takes part in early training and, compared with the original named entity recognition method under active learning, improves the F1 value for every query strategy tested.
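
Since the abstract measures success by the F1 value of the predicted tag sequence, a small sketch of an entity-level F1 computation may help. It assumes BIO-style tags (`B-PER`, `I-PER`, `O`) and exact-match scoring of entity spans; the excerpt does not state the tagging scheme or the exact scoring convention, and the names `extract_entities` and `f1_score` are hypothetical.

```python
from typing import List, Set, Tuple


def extract_entities(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Collect (start, end, label) spans from a BIO tag sequence.
    An I- tag that does not continue the current entity is treated as O."""
    entities = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):   # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            entities.add((start, i, label))
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return entities


def f1_score(gold: List[str], pred: List[str]) -> float:
    """Harmonic mean of precision and recall over exactly matching spans."""
    gold_spans = extract_entities(gold)
    pred_spans = extract_entities(pred)
    true_pos = len(gold_spans & pred_spans)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_spans)
    recall = true_pos / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
score = f1_score(gold, pred)
print(round(score, 3))
```

In this toy case the prediction finds one of the two gold entities with no false positives, so precision is 1, recall is 0.5, and F1 is 2/3.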

Description

Technical field

[0001] The invention relates to the field of named entity recognition, in particular to a small-sample named entity recognition method based on data enhancement and active learning.

Background technique

[0002] Natural language refers to languages such as Chinese, English, Spanish, French, and German that people use in daily life; such languages play an important role in human communication. Natural language arises naturally with the development of human society, rather than being deliberately created by humans. Natural language processing uses the computing power of computers to process the form, sound, and meaning of human natural language, that is, to input, output, identify, analyze, understand, and generate this information. Realizing information exchange between humans and machines, or between machines, is an important issue of common concern in the world of artificial intellig...

Application Information

IPC(8): G06F40/295, G06N3/04
CPC: G06F40/295, G06N3/044
Inventor: 黄震, 李青青, 窦勇, 胡彪, 金持, 潘衡岳, 汪昌健
Owner: NAT UNIV OF DEFENSE TECH