Named entity identification method based on adversarial training

A named entity recognition technology based on adversarial training samples, applied in fields such as special data processing applications, instrumentation, and semantic tool creation. It addresses problems such as model performance loss, a limited number of domain texts, and large data scale, and it achieves enriched training samples and improved generalization ability and robustness.

Pending Publication Date: 2020-05-22
NO 15 INST OF CHINA ELECTRONICS TECH GRP

AI Technical Summary

Problems solved by technology

[0007] Existing domain-specific named entity recognition models combine Bi-LSTM with a CRF model, but their ability to extract features is not strong enough. Bi-LSTM simply models the sequence from left to right and from right to left and then splices the two hidden states together. The disadvantage is that each direction can use only the preceding context or the following context, and cannot use both at the same time.
Moreover, the number of texts in a specific field is limited, so there is no large amount of data with which to improve the performance of the model.
[0008] Since the BERT model emerged, it has gradually been applied in various fields, but it has not yet been applied in specific fields. In addition, the words produced by BERT and its successor RoBERTa are independent of each other, and fine-tuning brings disadvantages such as loss of model performance; the scale of data is large, and the accuracy of the model cannot be improved.

Embodiment Construction

[0026] The present invention will be described in detail below with reference to the accompanying drawings and examples.

[0027] The present invention provides a named entity recognition method based on adversarial training. As shown in Figure 4, the specific process is as follows:

[0028] Step 1. The present invention introduces the RoBERTa model into the judicial field. First, each judicial-field text is segmented and input into RoBERTa in the form of characters, and the self-attention mechanism assigns different weights to different words. That is, assuming that the input matrix is X and the largest word embedding vector is 512, X is combined with the different weight matrices W_q, W_k, and W_v, and the self-attention matrix Z is finally obtained through softmax. The multi-head mechanism yields multiple representation subspaces of the attention layer, and finally the different matrices Z are spliced together, and e...
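The self-attention computation in Step 1 can be sketched roughly as follows. This is a minimal NumPy illustration, not the patent's implementation: the function names, the toy dimensions, the random weights, and the scaling by the square root of the key dimension (taken from the standard Transformer formulation) are all assumptions, and whatever follows the splicing step in the truncated text is not modeled.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # Project the input matrix X with the three weight matrices.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    # Assumption: scaled dot-product scores, as in the standard Transformer.
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    # Z is the attention-weighted combination of the value vectors.
    return weights @ V, weights

def multi_head_attention(X, heads):
    # Each head has its own (W_q, W_k, W_v); the per-head Z matrices are
    # spliced (concatenated) along the feature axis.
    Z_list = [self_attention(X, *head)[0] for head in heads]
    return np.concatenate(Z_list, axis=-1)

# Toy sizes chosen only for illustration (hypothetical values).
rng = np.random.default_rng(0)
seq_len, d_model, d_head, n_heads = 6, 16, 4, 4
X = rng.normal(size=(seq_len, d_model))
heads = [
    tuple(rng.normal(size=(d_model, d_head)) for _ in range(3))
    for _ in range(n_heads)
]
Z = multi_head_attention(X, heads)
print(Z.shape)
```

Each row of the attention weights sums to 1, so every output position is a weighted average of the value vectors for all characters in the input.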

Abstract

The invention discloses a named entity recognition method based on adversarial training. The method comprises the following steps: obtaining relevance features among judicial-field characters through RoBERTa model training and Bi-LSTM training; splicing the two relevance features together; and predicting a training sample with a conditional random field model to obtain a prediction result. With this method, external character vectors and word vectors of different dimensions can be introduced and combined with mixed character-word vectors of judicial-field text of different dimensions, and adversarial perturbation is applied to the mixed vectors in the judicial-field text, which improves the accuracy of model recognition.
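The text available here does not say how the adversarial perturbation is computed. As one common possibility, the sketch below uses the fast gradient method (an L2-normalized gradient step on the embeddings); this choice is an assumption, not something the abstract confirms. The toy logistic classifier is a hypothetical stand-in for the RoBERTa, Bi-LSTM, and CRF stack, and all names and sizes are illustrative.

```python
import numpy as np

def fgm_perturbation(grad, epsilon=1.0):
    # Fast-gradient-method step: r_adv = epsilon * g / ||g||_2.
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    return epsilon * grad / norm

def loss_and_grad(emb, w, label):
    # Toy stand-in model: mean-pool the embeddings, apply a linear layer,
    # and score with binary cross-entropy.
    seq_len = emb.shape[0]
    logit = emb.mean(axis=0) @ w
    prob = 1.0 / (1.0 + np.exp(-logit))
    loss = -(label * np.log(prob) + (1 - label) * np.log(1 - prob))
    # d(loss)/d(logit) = prob - label; d(logit)/d(emb[i]) = w / seq_len.
    grad = np.tile((prob - label) * w / seq_len, (seq_len, 1))
    return loss, grad

# Hypothetical mixed character-word embeddings for one short sentence.
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))
w = rng.normal(size=8)
label = 1

clean_loss, grad = loss_and_grad(emb, w, label)
r_adv = fgm_perturbation(grad, epsilon=0.5)
adv_loss, _ = loss_and_grad(emb + r_adv, w, label)
print(clean_loss, adv_loss)
```

In adversarial training of this kind, the model is typically updated on both the clean and the perturbed embeddings, so the perturbed copies act as additional training samples.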

Description

Technical field

[0001] The invention belongs to the technical field of named entity recognition, and in particular relates to a named entity recognition method based on adversarial training.

Background

[0002] Named entity recognition has been widely used in various fields, and different fields have optimized it for their own needs. Traditional named entity recognition requires a great deal of manual work to extract features for a specific field and uses probabilistic graphical models for recognition. With the rise of deep learning in recent years, various fields have used deep learning methods to explore named entity recognition extensively. There has already been much exploration and practice in the fields of finance, medical treatment, and law, which has reduced labor costs considerably and improved accuracy. How to use this information is particularly critical. In the use of named entity recognition technology, entities with specif...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06F16/36, G06F16/35
CPC: G06F16/367, G06F16/355
Inventor: 袁超逸, 刘忠麟, 王立才, 张起闻, 罗琪彬, 郝韫宏, 李孟书
Owner: NO 15 INST OF CHINA ELECTRONICS TECH GRP