
A Small Sample Named Entity Recognition Method Based on Data Augmentation and Active Learning

A named entity recognition and active learning technology, applied in electrical digital data processing, natural language data processing, instruments, and related fields. It addresses the problems that the query strategy cannot be adjusted during training and that the F1 value increases slowly or even declines, so as to improve the recognition F1 value and the recognition effect with low complexity.

Active Publication Date: 2022-02-15
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

However, a query strategy is not universal: it must be determined in advance and cannot be adjusted during training.
A query strategy that improves the F1 value of the trained named entity recognition model on one data set may yield no improvement, or even a decline, on another data set.
In addition, active learning relies on multiple rounds of training, so the early rounds usually have little labeled data. The named entity recognition model is limited by that labeled data, and its F1 value improves slowly.
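To make the term "query strategy" concrete, the sketch below shows one common strategy, least-confidence sampling, which picks the sentences the current model is least sure about. The source does not say which query strategies the invention uses, so the strategy itself, the function names `least_confidence` and `select_batch`, and the probability values are all assumptions chosen for illustration.

```python
from typing import Dict, List


def least_confidence(token_probs: List[float]) -> float:
    """Uncertainty score: 1 minus the product of per-token max probabilities."""
    confidence = 1.0
    for p in token_probs:
        confidence *= p
    return 1.0 - confidence


def select_batch(pool: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k sentence ids with the highest uncertainty score."""
    ranked = sorted(pool, key=lambda sid: least_confidence(pool[sid]), reverse=True)
    return ranked[:k]


# Hypothetical per-token confidences for three unlabeled sentences.
unlabeled = {
    "s1": [0.9, 0.9],
    "s2": [0.5, 0.6],
    "s3": [0.99, 0.98],
}
picked = select_batch(unlabeled, k=2)
print(picked)
```

Because the scoring function is fixed before training starts, a strategy like this cannot adapt if it turns out to suit one data set and not another, which is the limitation described above.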



Embodiment Construction

[0065] As shown in Figure 2, the present invention comprises the following steps:

[0066] The first step is to build an active named entity recognition system combined with data augmentation. As shown in Figure 1, the system consists of an active learning module, a data labeling module, a data enhancement module, a named entity recognition module, a test data pool T, an unlabeled data pool U, a labeled data pool L, and an enhanced data pool A. The active learning module, the data labeling module, and the data enhancement module train the named entity recognition model in the named entity recognition module over multiple rounds, labeling and enhancing the data as they go. Parameters are sent to the named entity recognition module.
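The components listed in [0066] can be pictured as four data pools plus one training cycle. The following is a minimal sketch under stated assumptions, not the patented implementation: the `Pools` class, the `run_round` function, and the stand-in callables for the four modules are hypothetical, and the choice to augment only the newly labeled sentences in each round is an assumption the visible text does not confirm.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Sentence = List[str]
Labeled = Tuple[Sentence, List[str]]  # tokens and their tags


@dataclass
class Pools:
    T: List[Labeled] = field(default_factory=list)   # test data pool
    U: List[Sentence] = field(default_factory=list)  # unlabeled data pool
    L: List[Labeled] = field(default_factory=list)   # labeled data pool
    A: List[Labeled] = field(default_factory=list)   # enhanced data pool


def run_round(pools: Pools,
              select: Callable[[List[Sentence], int], List[Sentence]],
              annotate: Callable[[Sentence], Labeled],
              augment: Callable[[Labeled], List[Labeled]],
              train: Callable[[List[Labeled]], object],
              batch_size: int) -> object:
    """One cycle: select from U, label into L, augment into A, train on L + A."""
    chosen = select(pools.U, batch_size)
    pools.U = [s for s in pools.U if s not in chosen]
    newly_labeled = [annotate(s) for s in chosen]
    pools.L.extend(newly_labeled)
    for example in newly_labeled:  # assumption: only new labels are augmented
        pools.A.extend(augment(example))
    return train(pools.L + pools.A)


# Stand-in modules (all hypothetical) so the cycle can be run end to end.
pools = Pools(U=[["Alice", "sings"], ["Bob", "runs"], ["Carol", "reads"]])
model = run_round(
    pools,
    select=lambda U, k: U[:k],                                 # active learning module
    annotate=lambda s: (s, ["O"] * len(s)),                    # data labeling module
    augment=lambda ex: [([t.lower() for t in ex[0]], ex[1])],  # data enhancement module
    train=lambda data: len(data),                              # NER module, returns a dummy "model"
    batch_size=2,
)
print(len(pools.U), len(pools.L), len(pools.A), model)
```

Calling `run_round` repeatedly until U is empty or a labeling budget is spent would give the multi-round cycle; training on both L and A is what lets early rounds see more examples than were manually labeled.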

[0067] The active learning module is connected with the data labeling module, the named entity recognition module, the labeled data pool L, and the enhanced data pool A. The active learning module obtains the evaluation...


Abstract

The invention discloses a small-sample named entity recognition method based on data enhancement and active learning. The purpose is to improve the F1 value that active learning achieves in its early rounds, in a way that is effective for various query strategies. The technical solution is to first construct an active named entity recognition system combined with data enhancement, then prepare the data set required for training the named entity recognition model. The active learning module, the data labeling module, and the data enhancement module train the named entity recognition model in the named entity recognition module over multiple rounds while labeling and enhancing the data. The trained named entity recognition module then performs named entity recognition on the text in the test data pool T and obtains the predicted label sequence. The invention rapidly improves named entity recognition in the early stage, when little labeled data participates in training, so that the F1 values of various query strategies all improve compared with the original named entity recognition method under active learning.
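Since the abstract measures success by the F1 value of the predicted label sequence, a short sketch of entity-level F1 may help. It assumes BIO tags (for example `B-PER`, `I-PER`, `O`) and exact span matching; the source does not state the tagging scheme or the exact evaluation protocol, and the function names `extract_entities` and `f1_score` are hypothetical.

```python
from typing import List, Set, Tuple


def extract_entities(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Collect (start, end, type) spans from a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and start is not None and tag[2:] == etype
        if not inside:
            if start is not None:
                spans.add((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans


def f1_score(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: harmonic mean of precision and recall over exact spans."""
    g, p = extract_entities(gold), extract_entities(pred)
    tp = len(g & p)
    if tp == 0:
        return 0.0
    precision = tp / len(p)
    recall = tp / len(g)
    return 2 * precision * recall / (precision + recall)


gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
score = f1_score(gold, pred)
print(round(score, 3))
```

This scores a single sentence; a corpus-level figure would pool the true positives, predicted spans, and gold spans over all sentences in the test data pool T before computing precision and recall.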

Description

Technical field

[0001] The invention relates to the field of named entity recognition, in particular to a small-sample named entity recognition method based on data enhancement and active learning.

Background

[0002] Natural language refers to Chinese, English, Spanish, French, German, and the other languages that people use every day, which play an important role in human communication. Natural language arises naturally with the development of human society, rather than being specially created by human beings. Natural language processing uses the computing power of computers to process the form, sound, meaning, and other information of human natural language, that is, to input, output, identify, analyze, understand, and generate this information. The realization of information exchange between humans and machines or between machines and machines is an important issue of common concern in the global arti...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/295, G06N3/04
CPC: G06F40/295, G06N3/044
Inventors: 黄震, 李青青, 窦勇, 胡彪, 金持, 潘衡岳, 汪昌健
Owner: NAT UNIV OF DEFENSE TECH