Bidirectional GRU relation extraction data processing method and system, terminal and medium

A relation extraction data processing technology, applied in fields such as electrical digital data processing, natural language data processing, and digital data protection. It addresses the errors and the extra computation that external tools introduce into a model, and it achieves advanced performance, saves computational overhead, and avoids error accumulation and error propagation.

Active Publication Date: 2021-05-14

HUBEI UNIV OF TECH



AI Technical Summary

Problems solved by technology

In the relation classification task, however, existing attention-based models do not make full use of the information in the data set that relates to how a relation is expressed, even though this information provides useful cues for classification. In addition, features such as the shortest dependency path (SDP), part-of-speech (POS) tags, hypernyms, and synonyms are linguistic features generated by external NLP tools. Relying on them exposes the model to the tools' errors and greatly increases the model's computational cost.

[0005] In summary, the problems and defects of the existing technology are as follows. In the relation classification task, existing attention-based models do not make full use of the information in the data set that relates to how a relation is expressed. Existing models also rely on external processing tools, so they inherit the errors those tools produce, and both their computational workload and their running time increase greatly. Traditional word vector models cannot accurately represent the many polysemous words that occur in sentences. Finally, in the network layer that extracts text information, existing models use LSTM networks with a large number of parameters, which raises the risk of overfitting to some extent and increases the amount of computation, resulting in longer running time.
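The parameter argument can be made concrete with a rough count. The sketch below uses the textbook formulations, in which an LSTM has four gate blocks and a GRU has three, each with an input matrix, a recurrent matrix, and one bias vector. The 512-dimensional input matches the ELMo word vectors described in this document; the hidden size of 256 is an assumption chosen only for illustration, and framework implementations may add extra bias terms.

```python
def recurrent_params(input_dim: int, hidden_dim: int, gate_blocks: int) -> int:
    """Trainable weights in one recurrent layer.

    Each gate block holds an input matrix W (hidden x input),
    a recurrent matrix U (hidden x hidden), and a bias vector b (hidden).
    """
    per_block = input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim
    return gate_blocks * per_block


def lstm_params(input_dim: int, hidden_dim: int) -> int:
    # Input gate, forget gate, output gate, and cell candidate.
    return recurrent_params(input_dim, hidden_dim, gate_blocks=4)


def gru_params(input_dim: int, hidden_dim: int) -> int:
    # Update gate, reset gate, and hidden candidate.
    return recurrent_params(input_dim, hidden_dim, gate_blocks=3)


INPUT_DIM = 512    # ELMo word vector size stated in the document
HIDDEN_DIM = 256   # ASSUMPTION: illustrative hidden size, not from the source

# A bidirectional layer runs two independent cells, so the count doubles.
bi_lstm = 2 * lstm_params(INPUT_DIM, HIDDEN_DIM)
bi_gru = 2 * gru_params(INPUT_DIM, HIDDEN_DIM)

print(f"Bi-LSTM parameters: {bi_lstm}")
print(f"Bi-GRU parameters:  {bi_gru}")
print(f"GRU / LSTM ratio:   {bi_gru / bi_lstm:.2f}")
```

Under these assumptions the GRU layer carries three quarters of the LSTM layer's weights regardless of the sizes chosen, because the two differ only in the number of gate blocks.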


Examples


Embodiment 1

[0124] The purpose of the present invention is to provide an efficient and accurate deep learning relation extraction method based on keyword attention. The method is tested on the SemEval-2010 Task 8 data set, a benchmark data set in the field of relation extraction. First, the method processes the data set to obtain a sentence dictionary and an entity relation dictionary, computes the scalar relative position between each word and the two entity words, and converts these scalars into position feature vectors through a position embedding matrix. Next, the ELMo (Embeddings from Language Models) pre-trained model converts the corpus, after processing with the NLTK package, into 512-dimensional word vectors. These are input to a multi-head attention mechanism, which gives more weight to the words that express the relation and denoises unrelated words. The result is then input to the Bi-GRU network layer, where the input is context-encoded,...
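The relative position step described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the clipping distance, the embedding size, the random initialization, the single table shared by both entities, and the example sentence are all assumptions, and every function name is hypothetical.

```python
import random


def relative_positions(num_tokens: int, entity_index: int) -> list[int]:
    """Signed distance from every token to the entity token."""
    return [i - entity_index for i in range(num_tokens)]


def position_ids(rel: list[int], max_dist: int) -> list[int]:
    """Clip distances to [-max_dist, max_dist] and shift them to row ids >= 0."""
    return [min(max(r, -max_dist), max_dist) + max_dist for r in rel]


def make_position_embedding(max_dist: int, dim: int, seed: int = 0) -> list[list[float]]:
    """ASSUMPTION: a randomly initialized table; in training it would be learned."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(2 * max_dist + 1)]


def position_features(tokens, e1_index, e2_index, table, max_dist):
    """Concatenate the embeddings of each token's distance to both entities."""
    ids1 = position_ids(relative_positions(len(tokens), e1_index), max_dist)
    ids2 = position_ids(relative_positions(len(tokens), e2_index), max_dist)
    return [table[a] + table[b] for a, b in zip(ids1, ids2)]


# Hypothetical example sentence; the entities are "burst" and "pressure".
tokens = "The burst has been caused by water hammer pressure".split()
E1, E2 = 1, 8
MAX_DIST, DIM = 5, 4   # ASSUMPTION: illustrative sizes

table = make_position_embedding(MAX_DIST, DIM)
features = position_features(tokens, E1, E2, table, MAX_DIST)

print(relative_positions(len(tokens), E1))
print(relative_positions(len(tokens), E2))
print(len(features), len(features[0]))
```

Clipping keeps the table small and makes all far-away tokens share one row, which is a common choice but not one the source states.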

Embodiment 2

[0181] The experiments of the present invention were run in a TensorFlow environment under Python 3.7, using PyCharm 2020.2.2 (Professional Edition). The main packages are TensorFlow 2.5.0-dev20201127, CUDA v11.1, cuDNN v8.0.4, and PyTorch v1.7.

[0182] 1. Data sources and evaluation criteria

[0183] The experiments of the present invention are evaluated on the SemEval-2010 Task 8 data set, a widely used benchmark data set in the field of relation extraction (see Figure 6). The data set has 19 relation types: nine directional relations, each counted in both directions, plus the undirected class Other. The nine directional relations are Cause-Effect, Instrument-Agency, Product-Producer, Content-Container, Entity-Origin, Entity-Destination, Component-Whole, Member-Collection, and Message-Topic. The data set consists of 10717 sentences, including 8000 training samples and 2717 test samples. The proportion of each label in the training set and the test set is shown in Table 1 and Table 2.
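The count of 19 relation types follows from the nine directional relations taken in both directions plus Other. The short sketch below builds that label set and checks the split sizes quoted above. The `(e1,e2)` and `(e2,e1)` suffixes are an assumed way of writing the direction, and the variable names are hypothetical.

```python
DIRECTED_RELATIONS = [
    "Cause-Effect",
    "Instrument-Agency",
    "Product-Producer",
    "Content-Container",
    "Entity-Origin",
    "Entity-Destination",
    "Component-Whole",
    "Member-Collection",
    "Message-Topic",
]


def build_labels(directed: list[str]) -> list[str]:
    """Expand each directional relation into two labels, then append Other."""
    labels = []
    for name in directed:
        labels.append(f"{name}(e1,e2)")  # ASSUMPTION: direction suffix convention
        labels.append(f"{name}(e2,e1)")
    labels.append("Other")
    return labels


LABELS = build_labels(DIRECTED_RELATIONS)
LABEL_TO_ID = {label: i for i, label in enumerate(LABELS)}

TRAIN_SIZE, TEST_SIZE = 8000, 2717  # split sizes stated in the document

print(len(LABELS))             # 9 * 2 + 1 relation types
print(TRAIN_SIZE + TEST_SIZE)  # total number of sentences
```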

[0184] Table 1 Proportion of various da...


Abstract

The invention belongs to the technical field of relation extraction and discloses a bidirectional GRU relation extraction data processing method and system, a terminal, and a medium. The method comprises the following steps: preprocessing the benchmark data set SemEval-2010 Task 8; converting the corpus into word vectors with the ELMo pre-trained model; performing preliminary denoising of the word vectors with a multi-head attention mechanism; encoding the word vectors with a Bi-GRU network to obtain hidden layer vectors that contain the context information of the sentence; passing the hidden layer vectors as input to a keyword attention layer, where the attention weights are calculated by combining the hidden layer output with the entity pair relative position feature and the entity hidden similarity feature; and inputting the hidden layer vectors processed by the attention mechanism into a classification layer to obtain the final relation extraction result. Experimental results show that the model provided by the invention achieves state-of-the-art performance without any other NLP tools.
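The Bi-GRU encoding step can be illustrated with a small forward pass in plain Python. This is a sketch under stated assumptions, not the patent's model: the weights are random and untrained, the dimensions are tiny, the class and function names are hypothetical, and the update rule uses one of the two common GRU conventions, h' = (1 - z) * h + z * candidate.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Textbook GRU cell with update (z), reset (r), and candidate (h) blocks."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrh"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrh"}
        self.b = {g: [0.0] * hidden_dim for g in "zrh"}

    def _pre(self, gate, x, h):
        wx, uh = matvec(self.W[gate], x), matvec(self.U[gate], h)
        return [a + c + d for a, c, d in zip(wx, uh, self.b[gate])]

    def step(self, x, h):
        z = [sigmoid(v) for v in self._pre("z", x, h)]
        r = [sigmoid(v) for v in self._pre("r", x, h)]
        gated = [ri * hi for ri, hi in zip(r, h)]        # reset gate applied to h
        cand = [math.tanh(v) for v in self._pre("h", x, gated)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run(cell, inputs):
    """Unroll one direction from a zero state; return the state at every step."""
    h, states = [0.0] * cell.hidden_dim, []
    for x in inputs:
        h = cell.step(x, h)
        states.append(h)
    return states


def bi_gru(forward_cell, backward_cell, inputs):
    """Concatenate left-to-right and right-to-left states per token."""
    fwd = run(forward_cell, inputs)
    bwd = run(backward_cell, inputs[::-1])[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


rng = random.Random(0)
INPUT_DIM, HIDDEN_DIM, SEQ_LEN = 4, 3, 5   # ASSUMPTION: toy sizes
sentence = [[rng.uniform(-1, 1) for _ in range(INPUT_DIM)] for _ in range(SEQ_LEN)]
encoded = bi_gru(GRUCell(INPUT_DIM, HIDDEN_DIM, rng),
                 GRUCell(INPUT_DIM, HIDDEN_DIM, rng), sentence)
print(len(encoded), len(encoded[0]))
```

Each output vector has twice the hidden size because it joins the state that has read the words to the left with the state that has read the words to the right, which is what gives every position access to the full sentence context.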

Description

Technical field

[0001] The invention belongs to the technical field of relation extraction, and in particular relates to a bidirectional GRU relation extraction data processing method, system, terminal and medium.

Background

[0002] At present, relation extraction occupies an important position in the field of natural language processing. It is a core task and an indispensable part of natural language processing applications such as question answering systems, information extraction, and knowledge graphs. Relation extraction has also been a research hotspot in recent years. The task of relation extraction is to predict the relation type and direction between two labeled entities in text.

[0003] Relation extraction methods based on deep learning mainly use CNN and RNN networks to obtain the context information in sentences. Zeng et al. proposed a model that uses a deep convolutional neural network to extract features from sentences. Zhang et al. proposed t...


Application Information


IPC(8): G06F40/30, G06F21/60, G06F40/242, G06F40/289, G06K9/62, G06N3/04, G06N3/08

CPC: G06F40/30, G06F40/289, G06F40/242, G06F21/602, G06N3/049, G06N3/08, G06N3/047, G06N3/045, G06F18/22, G06F18/2415

Inventor: 陈建峡, 陈煜, 张杰, 刘畅, 刘琦

Owner: HUBEI UNIV OF TECH
