
Bidirectional GRU relation extraction data processing method and system, terminal and medium

A relation extraction and data processing technology, applied in the fields of electrical digital data processing, natural language data processing, digital data protection, etc. It addresses the problems of errors introduced into the model by external tools and of increased model computation, and it achieves advanced performance while saving computational overhead and avoiding error accumulation and error propagation.

Active Publication Date: 2021-05-14
HUBEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

However, in the task of relation classification, models with an attention mechanism do not make full use of the expression-related information in the data set, even though this information provides useful cues for the task of entity classification. In addition, the shortest dependency path (SDP), part-of-speech (POS) tags, hypernyms, synonyms and other features are all linguistic features generated by related NLP tools. Relying on these tools exposes the model to their errors and greatly increases the computational effort of the model.
[0005] From the above analysis, the problems and defects of the existing technology are as follows. In the task of relation classification, existing models with an attention mechanism do not make full use of the expression-related information in the data set. Existing models that rely on other processing tools are affected by the errors those tools generate, which greatly increases the computational workload and the calculation time of the model. Traditional word vector models cannot accurately represent the many polysemous words in sentences. Finally, in the network layer that extracts text information, existing models use LSTM networks with too many parameters, which increases the risk of overfitting to a certain extent and increases the amount of calculation, resulting in longer calculation time.
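The parameter argument in [0005] can be illustrated with a rough count. The sketch below assumes the textbook formulations, in which an LSTM layer has four gate-like weight blocks and a GRU layer has three, each with an input matrix, a recurrent matrix and one bias vector. The function names and the hidden size of 256 are hypothetical; the input size of 512 matches the word vector dimension mentioned in Embodiment 1. Framework implementations may count biases slightly differently.

```python
def recurrent_layer_params(input_dim, hidden_dim, num_blocks):
    """Count weights for a recurrent layer made of `num_blocks` gate blocks.

    Each block holds an input matrix (hidden x input), a recurrent matrix
    (hidden x hidden) and one bias vector (hidden). This is the textbook
    layout, not the exact count of any particular framework.
    """
    per_block = hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim
    return num_blocks * per_block


def lstm_params(input_dim, hidden_dim):
    # Input, forget and output gates plus the cell candidate: 4 blocks.
    return recurrent_layer_params(input_dim, hidden_dim, 4)


def gru_params(input_dim, hidden_dim):
    # Update and reset gates plus the candidate state: 3 blocks.
    return recurrent_layer_params(input_dim, hidden_dim, 3)


if __name__ == "__main__":
    d, h = 512, 256  # 512 from the source; 256 is an assumed hidden size
    print("LSTM:", lstm_params(d, h))
    print("GRU: ", gru_params(d, h))
    print("GRU / LSTM ratio:", gru_params(d, h) / lstm_params(d, h))
```

Under these assumptions a GRU layer carries three quarters of the parameters of an LSTM layer of the same size, which is consistent with the motivation given for replacing LSTM with GRU.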


Examples


Embodiment 1

[0124] The purpose of the present invention is to provide an efficient and accurate deep learning relation extraction method based on keyword attention. The method is tested on the SemEval-2010 Task 8 data set, a benchmark data set in the field of relation extraction. First, the method processes the data set to obtain a sentence dictionary and an entity relationship dictionary, calculates the relative position scalar between each word and the two entity words, and converts these scalars into position feature vectors through a position embedding matrix. Next, the ELMo (Embeddings from Language Models) pre-trained model converts the corpus, processed with the NLTK package, into 512-dimensional word vectors. These are input into the multi-head attention mechanism, which gives more weight to the words that carry relational expressions in the sentence and denoises unrelated words. Then, the result is input to the Bi-GRU network layer, where the input is context-encoded,...
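The position feature step in [0124] can be sketched as follows. This is a minimal illustration, not the patented implementation: the example sentence, the entity indices, the clipping distance, the embedding size and all function names are assumptions, and the embedding table is filled with random values where a trained model would learn them.

```python
import random


def relative_positions(num_tokens, entity_index):
    """Signed offset of every token from one entity token."""
    return [i - entity_index for i in range(num_tokens)]


def build_position_table(max_distance, dim, seed=0):
    """One embedding row per offset in [-max_distance, max_distance]."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(2 * max_distance + 1)]


def embed_positions(offsets, table, max_distance):
    """Clip each offset, shift it to a non-negative row index, look it up."""
    rows = []
    for offset in offsets:
        clipped = max(-max_distance, min(max_distance, offset))
        rows.append(table[clipped + max_distance])
    return rows


if __name__ == "__main__":
    # Hypothetical sentence with entity 1 = "fire" and entity 2 = "explosion".
    tokens = "The fire was caused by the explosion".split()
    e1_index, e2_index = 1, 6
    max_distance, dim = 4, 5  # assumed hyperparameters

    table = build_position_table(max_distance, dim)
    offsets_e1 = relative_positions(len(tokens), e1_index)
    offsets_e2 = relative_positions(len(tokens), e2_index)

    # Each token gets one position vector per entity; here they are concatenated.
    features = [a + b for a, b in zip(
        embed_positions(offsets_e1, table, max_distance),
        embed_positions(offsets_e2, table, max_distance))]
    print(offsets_e1, offsets_e2, len(features), len(features[0]))
```

Clipping long distances to a fixed range keeps the embedding table small. Whether the method clips, and how it combines the two position vectors, is not stated in the visible text.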

Embodiment 2

[0181] The experiments of the present invention are run in a Python 3.7 environment with PyCharm 2020.2.2 (Professional Edition). The main packages are TensorFlow 2.5.0-dev20201127, CUDA v11.1, cuDNN v8.0.4 and PyTorch v1.7.

[0182] 1. Data sources and evaluation criteria

[0183] The experiments of the present invention are evaluated on the SemEval-2010 Task 8 dataset, which is a widely used benchmark dataset in the field of relation extraction (see Figure 6). The dataset has 19 relationship types: 9 directional relationships, each counted in both directions, plus Other. The relationships are Cause-Effect, Instrument-Agency, Product-Producer, Content-Container, Entity-Origin, Entity-Destination, Component-Whole, Member-Collection, Message-Topic and Other. The data set consists of 10717 sentences, including 8000 training samples and 2717 test samples. The proportion of each label in the training set and test set is shown in Table 1 and Table 2.
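The count of 19 relationship types follows from the 9 directional relationships taken in both directions plus the undirected Other class. A short sketch makes the arithmetic explicit; the `(e1,e2)` / `(e2,e1)` suffix is assumed here as the direction notation, and the function name is hypothetical.

```python
BASE_RELATIONS = [
    "Cause-Effect", "Instrument-Agency", "Product-Producer",
    "Content-Container", "Entity-Origin", "Entity-Destination",
    "Component-Whole", "Member-Collection", "Message-Topic",
]


def build_label_set(base_relations):
    """Expand each directional relation into both directions and add Other."""
    labels = []
    for relation in base_relations:
        labels.append(f"{relation}(e1,e2)")
        labels.append(f"{relation}(e2,e1)")
    labels.append("Other")  # Other has no direction
    return labels


if __name__ == "__main__":
    labels = build_label_set(BASE_RELATIONS)
    label_to_id = {label: i for i, label in enumerate(labels)}
    print(len(labels))   # 9 * 2 + 1 classes
    print(8000 + 2717)   # training + test sentences
```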

[0184] Table 1 Proportion of various da...



Abstract

The invention belongs to the technical field of relation extraction, and discloses a bidirectional GRU relation extraction data processing method and system, a terminal and a medium. The method comprises the following steps: preprocessing the benchmark data set SemEval-2010 Task 8; performing word vectorization on the corpus through an ELMo pre-trained model; performing preliminary denoising on the word vectors through a multi-head attention mechanism; encoding the word vectors with a Bi-GRU network to obtain hidden layer vectors containing the context information of the sentence; passing the hidden layer vectors as input to a keyword attention layer, where the attention weight is calculated by combining the output of the hidden layer with the entity pair relative position feature and the entity hidden similarity feature; and inputting the hidden layer vectors processed by the attention mechanism into a classification layer to obtain the final relation extraction result. Experimental results show that the model provided by the invention achieves state-of-the-art performance without any other NLP tools.
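The Bi-GRU encoding step in the abstract can be sketched in plain Python. This is an illustration under stated assumptions rather than the model of the invention: it uses the standard GRU equations with the update convention h = (1 - z) * n + z * h_prev, random untrained weights, toy dimensions, and concatenation of the forward and backward states, none of which is confirmed by the visible text. All class and function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One (W, U, b) triple per block: update z, reset r, candidate n.
        self.params = {
            gate: (random_matrix(hidden_dim, input_dim, rng),
                   random_matrix(hidden_dim, hidden_dim, rng),
                   [0.0] * hidden_dim)
            for gate in ("z", "r", "n")
        }

    def _affine(self, gate, x, h):
        w, u, b = self.params[gate]
        return [a + c + d for a, c, d in zip(matvec(w, x), matvec(u, h), b)]

    def step(self, x, h_prev):
        z = [sigmoid(v) for v in self._affine("z", x, h_prev)]
        r = [sigmoid(v) for v in self._affine("r", x, h_prev)]
        gated = [ri * hi for ri, hi in zip(r, h_prev)]
        n = [math.tanh(v) for v in self._affine("n", x, gated)]
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h_prev)]


def run_gru(cell, inputs):
    """Run one direction and return the hidden state at every time step."""
    h = [0.0] * cell.hidden_dim
    states = []
    for x in inputs:
        h = cell.step(x, h)
        states.append(h)
    return states


def bi_gru(forward_cell, backward_cell, inputs):
    """Encode left-to-right and right-to-left, then concatenate per token."""
    forward_states = run_gru(forward_cell, inputs)
    backward_states = run_gru(backward_cell, inputs[::-1])[::-1]
    return [f + b for f, b in zip(forward_states, backward_states)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "sentence": 5 tokens with 4-dimensional vectors (real input is 512-d).
    sentence = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
    hidden = bi_gru(GRUCell(4, 3, rng), GRUCell(4, 3, rng), sentence)
    print(len(hidden), len(hidden[0]))
```

Each token ends up with a hidden vector that reflects both its left and its right context, which is the input the keyword attention layer then reweights.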

Description

Technical field

[0001] The invention belongs to the technical field of relation extraction, and in particular relates to a bidirectional GRU relation extraction data processing method, system, terminal and medium.

Background art

[0002] At present, relation extraction occupies an important position in the field of natural language processing. It is a core task and an indispensable link in natural language processing applications such as question answering systems, information extraction, and knowledge graphs. Relation extraction has also been a research hotspot in recent years. The task of relation extraction is to predict the relation type and direction between two labeled entities in text.

[0003] Relation extraction methods based on deep learning mainly use CNN and RNN networks to obtain context information in sentences. Zeng et al. proposed a model using a deep convolutional neural network to extract features in sentences. Zhang et al. proposed t...


Application Information

IPC(8): G06F40/30, G06F21/60, G06F40/242, G06F40/289, G06K9/62, G06N3/04, G06N3/08
CPC G06F40/30, G06F40/289, G06F40/242, G06F21/602, G06N3/049, G06N3/08, G06N3/047, G06N3/045, G06F18/22, G06F18/2415
Inventor 陈建峡, 陈煜, 张杰, 刘畅, 刘琦
Owner HUBEI UNIV OF TECH