Relation extraction method based on Bi-LSTM input information enhancement

A technology of input information and relation extraction, applied in special data processing applications, instruments, electrical digital data processing, etc., to achieve the effect of improving robustness

Active Publication Date: 2018-08-17
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to solve the problem of text relation extraction by proposing a relation extraction method based on Bi-LSTM input information enhancement.

Examples

Embodiment

[0079] In the first step, the dataset is annotated using an indeterminate-label strategy. Each word in the sentence is given one label, and each label consists of three parts: an entity part, a number part, and a relation part. The entity part uses "E" to indicate an entity and "N" to indicate a non-entity. The number part uses "1" to indicate the first entity and "2" to indicate the second entity. The relation part uses the abbreviation of the relation type, such as "ED" or "CE". The label "E0-R0" indicates that a word is an entity whose relation type is "None". Non-entity words are labeled "N-X".
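The labeling scheme in [0079] can be illustrated with a minimal Python sketch. The source only gives "E0-R0" and "N-X" as complete labels, so the joined form `E<number>-<relation>` for related entities (for example "E1-CE"), the function name `tag_sentence`, and the half-open token spans are all assumptions made for illustration, not the patent's exact format.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # half-open token range [start, end); an assumed convention


def tag_sentence(
    tokens: List[str],
    entity1: Span,
    entity2: Span,
    relation: Optional[str] = None,
) -> List[str]:
    """Assign one label per token following the three-part scheme.

    Non-entity words get "N-X". Entity words get "E0-R0" when the pair has
    no relation ("None"). Otherwise they get an assumed label of the form
    "E<number>-<relation>", e.g. "E1-CE" for the first entity.
    """
    labels = ["N-X"] * len(tokens)
    for number, (start, end) in ((1, entity1), (2, entity2)):
        for i in range(start, end):
            labels[i] = f"E{number}-{relation}" if relation else "E0-R0"
    return labels


tokens = "The fire was caused by lightning".split()
related = tag_sentence(tokens, entity1=(1, 2), entity2=(5, 6), relation="CE")
unrelated = tag_sentence(tokens, entity1=(1, 2), entity2=(5, 6))
```

With these inputs, only the words "fire" and "lightning" carry entity labels; every other word is tagged "N-X", which is how the scheme keeps non-entity words out of the relation decision.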

[0080] The second step is to apply redundant coding technology to character-level morphological coding of each word in the sentence to generate a 108-dimensional word coding vector v_b. The procedure is as follows:

[0081] First, use redundant coding technology to encode each character into a 9-dimensional character vector. The specific implementat...
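The dimensions given in [0080] and [0081] (9 per character, 108 per word) suggest how the word coding vector could be assembled. The sketch below is only an illustration of that shape. The limit of 12 characters per word is an assumption derived from 108 / 9, the zero padding and truncation are assumptions, and `placeholder_char_code` is a hypothetical stand-in that is not the patent's redundant coding, whose details are not available in this excerpt.

```python
from typing import Callable, List

CHAR_DIM = 9                            # per-character vector size, from [0081]
MAX_CHARS = 12                          # assumption: 108 / 9 characters per word
WORD_FORM_DIM = CHAR_DIM * MAX_CHARS    # 108, from [0080]


def placeholder_char_code(ch: str) -> List[float]:
    """Hypothetical stand-in encoder: the 9 low-order bits of the code point.

    This is NOT the redundant coding described in the patent; it only
    supplies a 9-dimensional vector so the assembly step can run.
    """
    code_point = ord(ch)
    return [float((code_point >> bit) & 1) for bit in range(CHAR_DIM)]


def word_form_vector(
    word: str,
    encode_char: Callable[[str], List[float]] = placeholder_char_code,
) -> List[float]:
    """Concatenate per-character codes, then zero-pad or truncate to 108 dims."""
    vector: List[float] = []
    for ch in word[:MAX_CHARS]:         # assumed: longer words are truncated
        code = encode_char(ch)
        if len(code) != CHAR_DIM:
            raise ValueError("character code must be 9-dimensional")
        vector.extend(code)
    vector.extend([0.0] * (WORD_FORM_DIM - len(vector)))  # assumed zero padding
    return vector


v_b = word_form_vector("lightning")
```

Passing the real character encoder as `encode_char` would leave the assembly logic unchanged, which is why the encoder is kept as a parameter here.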

Abstract

The invention provides a relation extraction method based on Bi-LSTM input information enhancement and belongs to the field of computer artificial intelligence and natural language processing. The method comprises the following steps: the dataset is annotated using an indeterminate-label strategy; a redundant coding technique is used to encode each word at the character level, generating a word form encoding vector; the word form encoding vector and a word embedding vector are spliced to generate a word vector that captures both word form and word meaning information; a Bi-LSTM with input information enhancement is used as the model encoding layer, the word vector is input to the encoding layer, and an encoding vector is output; the encoding vector is input into a decoding layer to obtain a decoding vector; a three-layer NN is applied to extract the entity label, the relation type, and the entity number information from the decoding vector; finally, the gradient is calculated, the weights are updated, and the model is trained by maximizing the objective function. The method improves the robustness of the system, reduces the interference caused by non-entity words, and effectively increases the precision and recall of relation extraction.
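The splicing step named in the abstract amounts to concatenating, for each word, the word form encoding vector with the word embedding vector. A minimal Python sketch follows. The function names, the order of concatenation (word form first), and the embedding size of 50 are assumptions for illustration and are not stated in the source.

```python
from typing import List, Sequence


def splice_word_vector(v_b: Sequence[float], v_w: Sequence[float]) -> List[float]:
    """Concatenate a word form vector v_b with a word embedding vector v_w.

    The order (v_b first, then v_w) is an assumption; the source only says
    the two vectors are spliced.
    """
    return list(v_b) + list(v_w)


def splice_sentence(
    form_vectors: Sequence[Sequence[float]],
    embedding_vectors: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Build the encoder input sequence: one spliced vector per word."""
    if len(form_vectors) != len(embedding_vectors):
        raise ValueError("need one word form vector and one embedding per word")
    return [splice_word_vector(b, w) for b, w in zip(form_vectors, embedding_vectors)]


# Toy values: 108-dimensional word form vectors, hypothetical 50-dimensional embeddings.
forms = [[0.0] * 108, [1.0] * 108]
embeds = [[0.5] * 50, [0.25] * 50]
inputs = splice_sentence(forms, embeds)
```

Each resulting vector has 108 plus the embedding size as its dimension, and the sequence of such vectors is what would be fed to the encoding layer.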

Description

Technical field

[0001] The invention relates to a text relation extraction method, in particular to an improved text relation extraction method based on a bidirectional long short-term memory neural network (Bi-LSTM), and belongs to the field of computer artificial intelligence and natural language processing.

Background art

[0002] In natural language processing, relation extraction is an important research topic in information extraction and a key step in automatically building knowledge graphs. It is of great help to information retrieval, text classification, automatic question answering, machine translation, and other natural language processing tasks. Relation extraction aims to convert unstructured and semi-structured information in documents into structured information by extracting the entity pairs in the text and the semantic relationship between them, that is, by assigning predefined relation types to entity pairs in the text...

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/367, G06F40/284
Inventor: 黄河燕, 雷鸣, 冯冲
Owner: BEIJING INSTITUTE OF TECHNOLOGY