A text relation extraction method that combines multi-level information extraction and noise reduction

A relation extraction and information extraction technology, applied in relational databases, neural learning methods, instruments, etc. It can solve problems such as a low F1 value, and has the effects of reducing the impact of noise, easing the difficulty of identifying evidence sentences, and improving evaluation metrics.

Active Publication Date: 2022-08-05
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to address the low F1 value caused by the multi-entity, multi-label setting in text relation extraction, and to propose a text relation extraction method that integrates multi-level information extraction and noise reduction.
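
For reference, the F1 value mentioned here is commonly computed at the micro level over predicted relation triples. The sketch below assumes that predictions and gold labels are sets of (head, relation, tail) tuples. The function name `micro_f1`, the entity names, and the relation names are illustrative placeholders and do not come from the patent.

```python
def micro_f1(predicted, gold):
    """Micro-averaged precision, recall and F1 over sets of (head, relation, tail) triples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: the pair (entity_A, entity_B) carries two gold relations,
# which is the multi-label case the text describes.
gold = {
    ("entity_A", "relation_1", "entity_B"),
    ("entity_A", "relation_2", "entity_B"),
    ("entity_A", "relation_3", "entity_C"),
}
predicted = {
    ("entity_A", "relation_1", "entity_B"),   # correct
    ("entity_A", "relation_3", "entity_C"),   # correct
    ("entity_C", "relation_1", "entity_B"),   # spurious
}
precision, recall, f1 = micro_f1(predicted, gold)
```

A model that predicts only one relation per entity pair misses `relation_2` here, which lowers recall and therefore F1.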

Examples

Embodiment 1

[0065] The specific process of the text relation extraction method that integrates multi-level information extraction and noise reduction is shown in Figure 1. This embodiment describes the flow and the overall framework of the method of the present invention, shown in Figure 1 and Figure 2, respectively. In a specific implementation, the method can be applied to extract triples from text data and to update a knowledge graph with them. Text relation extraction matters because existing structured knowledge accounts for only a small proportion of all knowledge, while real-world knowledge usually exists as text and is still growing rapidly. Constructing structured knowledge manually requires a great deal of time and money, and manual methods struggle to keep pace with the growth of knowledge.
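
As a rough illustration of the application described in this paragraph, the sketch below adds extracted triples to a small in-memory knowledge graph. The nested-dictionary storage layout, the function name `update_knowledge_graph`, and the sample triples are assumptions made for this example; the patent excerpt does not specify how the knowledge graph is stored or updated.

```python
from collections import defaultdict


def update_knowledge_graph(graph, triples):
    """Add (head, relation, tail) triples to the graph and return the ones that were new."""
    added = []
    for head, relation, tail in triples:
        if tail not in graph[head][relation]:
            graph[head][relation].add(tail)
            added.append((head, relation, tail))
    return added


# graph[head][relation] is the set of tail entities linked to head by relation.
graph = defaultdict(lambda: defaultdict(set))

# Hypothetical output of a relation extractor, including one duplicate triple.
extracted = [
    ("entity_A", "located_in", "entity_B"),
    ("entity_A", "located_in", "entity_B"),
    ("entity_C", "part_of", "entity_A"),
]
added = update_knowledge_graph(graph, extracted)
```

Storing tails in a set means that a triple extracted again from another text does not create a duplicate fact.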

[0066] The data used in this example comes from the DocRED datase...

Abstract

The invention relates to a text relation extraction method integrating multi-level information extraction and noise reduction, and belongs to the technical field of computer natural language processing. The method includes: 1. using BERT as the encoder to vectorize the text and extracting the hidden-layer vectors of mentions, entities, sentences, and the text as a whole; 2. integrating multi-level information to address the multi-instance, multi-label problem, where the multi-level information comprises mention-level, entity-level, sentence-level, and article-level information; 3. using mention position information to roughly extract the evidence sentences, and then applying a noise reduction method to capture the important relational features of those sentences. When extracting the relations of entity pairs in a text, the method thus addresses both the multi-instance, multi-label problem and the difficulty of identifying evidence sentences. Experiments show that the method significantly improves the F1 evaluation metric.
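
The steps in the abstract can be sketched on toy data as follows. This is a minimal illustration under several assumptions, not the patented implementation: small hand-written vectors stand in for BERT hidden states, mean pooling is assumed for every level, and a sentence counts as a rough evidence (proof) sentence if it contains a mention of either entity. All function names are hypothetical. The noise reduction step is not shown because the excerpt does not describe it.

```python
def mean_pool(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def sentence_of(token_index, sentence_spans):
    """Index of the sentence whose [start, end) token span contains token_index."""
    for sentence_id, (start, end) in enumerate(sentence_spans):
        if start <= token_index < end:
            return sentence_id
    raise ValueError(f"token {token_index} is outside every sentence")


def multi_level_features(hidden, sentence_spans, head_mentions, tail_mentions):
    """Pool token vectors into mention, entity, sentence and document vectors."""
    def mention_vectors(mentions):
        return [mean_pool(hidden[start:end]) for start, end in mentions]

    head_vecs = mention_vectors(head_mentions)
    tail_vecs = mention_vectors(tail_mentions)
    return {
        "mention": (head_vecs, tail_vecs),
        # Assumption: an entity vector is the mean of its mention vectors.
        "entity": (mean_pool(head_vecs), mean_pool(tail_vecs)),
        "sentence": [mean_pool(hidden[start:end]) for start, end in sentence_spans],
        "document": mean_pool(hidden),
    }


def rough_evidence_sentences(sentence_spans, head_mentions, tail_mentions):
    """Sorted ids of sentences containing a mention of the head or the tail entity."""
    sentence_ids = {
        sentence_of(start, sentence_spans)
        for start, _ in head_mentions + tail_mentions
    }
    return sorted(sentence_ids)


# Toy document: 6 tokens with 2-dimensional stand-in hidden states, 3 sentences.
hidden = [[1.0, 0.0], [3.0, 2.0], [0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [2.0, 4.0]]
sentence_spans = [(0, 2), (2, 4), (4, 6)]
head_mentions = [(0, 1), (4, 5)]  # head entity is mentioned in sentences 0 and 2
tail_mentions = [(1, 2)]          # tail entity is mentioned in sentence 0

features = multi_level_features(hidden, sentence_spans, head_mentions, tail_mentions)
evidence = rough_evidence_sentences(sentence_spans, head_mentions, tail_mentions)
```

In a real system the `hidden` list would come from the encoder's last hidden layer, and the selected sentence vectors would be passed on to the noise reduction step together with the other levels.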

Description

Technical field

[0001] The invention relates to a text relation extraction method integrating multi-level information extraction and noise reduction, and belongs to the technical field of natural language processing within computer artificial intelligence.

Background

[0002] With the rapid development of Internet technology, a large amount of unstructured data floods computer networks. This data contains rich economic, cultural, military, political, and other information, and is characterized by rapid growth, complex content, and heavy noise. Traditional manual methods can hardly extract information from such a large volume of Internet data in a short period of time. This gap motivates and drives the development of relation extraction techniques. The purpose of relation extraction is to extract the relations between entities from massive unstructured text and store them in a structured form. This task benefits many applications, such as question an...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F40/279, G06F16/28, G06N3/04, G06N3/08
CPC: G06F40/279, G06F16/288, G06N3/08, G06N3/045
Inventor: 黄河燕, 袁长森, 冯冲, 李正君
Owner: BEIJING INSTITUTE OF TECHNOLOGY