Method for extracting chapter relation by fusing multi-level information extraction and noise reduction

A relation extraction and information extraction technology, applied in relational databases, neural learning methods, instruments, and related fields. It addresses problems such as low F1 scores, and its effects include reducing the impact of noise, improving evaluation metrics, and easing the identification of evidence sentences.

Active Publication Date: 2021-09-24
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to solve the low F1 score that arises in document-level (chapter) relation extraction when multiple entities map to multiple relation labels. To that end, it proposes a document-level relation extraction method that fuses multi-level information extraction and noise reduction.
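As a rough illustration of the F1 metric named above, the following sketch scores predicted (head, tail, relation) triples against gold triples, so an entity pair with several relation labels contributes several items. Micro-averaging over triples is an assumption here, since the text does not say how F1 is computed, and `micro_f1` and the entity and relation identifiers are hypothetical names.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over (head, tail, relation) triples.

    Assumption: each triple is scored independently, so one entity
    pair carrying two relations counts as two separate items.
    """
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the pair (e1, e2) holds two relations (multi-label).
gold = {("e1", "e2", "r1"), ("e1", "e2", "r2"), ("e1", "e3", "r1")}
pred = {("e1", "e2", "r1"), ("e1", "e3", "r2")}

print(round(micro_f1(gold, pred), 3))  # 0.4
```

Here the model recovers only one of the two labels for the pair (e1, e2), which lowers recall to 1/3 and shows how missed labels on multi-label pairs pull F1 down.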


Examples

Embodiment 1

[0065] The specific process of the document-level (chapter) relation extraction method that fuses multi-level information extraction and noise reduction is shown in Figure 1. This embodiment describes the flow and the overall framework of the method of the present invention, shown in Figure 1 and Figure 2 respectively. In a specific implementation, the method can be applied to extract triple information from document-level data and to update the knowledge in a knowledge graph. Document-level relation extraction matters because existing structured knowledge accounts for only a small proportion of all knowledge, while real-world knowledge usually exists in the form of documents and is still growing rapidly. Constructing structured knowledge manually requires a great deal of time and money, and manual methods struggle to keep up with the rate at which knowledge grows.

[0066] The data used in this example comes from...

Abstract

The invention relates to a document-level (chapter) relation extraction method that fuses multi-level information extraction and noise reduction, and belongs to the technical field of computer natural language processing. The method comprises three steps. Step 1: BERT is used as the encoder to produce a vectorized representation of the document, from which hidden-layer vectors for mentions, entities, sentences, and the document are extracted. Step 2: a multi-level information fusion method is provided to address the multi-instance, multi-label problem, where the multi-level information comprises mention-level, entity-level, sentence-level, and document-level information. Step 3: evidence (proof) sentences are first roughly extracted using the position information of the mentions, and a noise reduction method then captures the important relation features of those evidence sentences. When extracting relations for entity pairs in a document, the method thereby accounts for and addresses problems such as multiple instances and multiple labels in the document and the difficulty of identifying evidence sentences. Experiments show that the method significantly improves the F1 evaluation metric.
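The steps above can be sketched in a few lines, under stated assumptions. Plain Python lists stand in for the BERT hidden-layer vectors, mean pooling and concatenation are assumed as the pooling and fusion operations (the abstract does not specify either), and a sentence counts as a candidate evidence sentence if it contains a mention of the head or tail entity. The noise reduction step is not described in enough detail to sketch, so it is omitted. All function names and the data layout are hypothetical.

```python
def mean_pool(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def candidate_evidence_sentences(sentences, mentions, head, tail):
    """Rough extraction (assumed rule): indices of sentences that contain
    at least one mention of the head or the tail entity."""
    spans = mentions[head] + mentions[tail]
    return [
        i for i, (start, end) in enumerate(sentences)
        if any(start <= m_start and m_end <= end for m_start, m_end in spans)
    ]


def multi_level_features(token_vecs, sentences, mentions, head, tail):
    """Build mention-, entity-, sentence- and document-level vectors."""
    # Mention level: pool the token vectors inside each mention span.
    mention = {
        ent: [mean_pool(token_vecs[s:e]) for s, e in spans]
        for ent, spans in mentions.items()
    }
    # Entity level: pool all mention vectors belonging to one entity.
    entity = {ent: mean_pool(vecs) for ent, vecs in mention.items()}
    # Sentence level: pool the tokens of the candidate evidence sentences,
    # falling back to the whole document if no sentence qualifies.
    evidence = candidate_evidence_sentences(sentences, mentions, head, tail)
    evidence_tokens = [
        vec for i in evidence for vec in token_vecs[slice(*sentences[i])]
    ]
    sentence = mean_pool(evidence_tokens or token_vecs)
    # Document level: pool every token vector.
    document = mean_pool(token_vecs)
    # Fusion by concatenation (an assumption, not taken from the source).
    fused = entity[head] + entity[tail] + sentence + document
    return {"mention": mention, "entity": entity, "sentence": sentence,
            "document": document, "evidence": evidence, "fused": fused}


# Toy document: 6 tokens with 2-dimensional vectors, 3 sentences.
token_vecs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0],
              [0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
sentences = [(0, 2), (2, 4), (4, 6)]              # token spans, end exclusive
mentions = {"A": [(0, 1), (4, 5)], "B": [(5, 6)]}  # entity -> mention spans

features = multi_level_features(token_vecs, sentences, mentions, "A", "B")
print(features["evidence"])  # [0, 2]
```

In this toy input, entity A has two mentions, so its entity-level vector is the mean of two mention vectors, and the middle sentence is dropped from the sentence-level pool because it mentions neither entity. A real system would feed the fused vector to a multi-label classifier over relation types.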

Description

Technical field

[0001] The invention relates to a document-level (chapter) relation extraction method that fuses multi-level information extraction and noise reduction, and belongs to the technical field of computer artificial intelligence and natural language processing.

Background art

[0002] With the rapid development of Internet technology, computer networks are flooded with large amounts of unstructured data containing a wealth of economic, cultural, military, political, and other information. This data grows rapidly, is complex, and is very noisy. Traditional manual methods cannot extract information from such large volumes of Internet data in a short time, and this gap has motivated the development of relation extraction techniques. The purpose of relation extraction is to extract the relations between entities from massive unstructured text and store them in a structured form. This task benefits numerous applic...

Application Information

IPC(8): G06F40/279, G06F16/28, G06N3/04, G06N3/08
CPC: G06F40/279, G06F16/288, G06N3/08, G06N3/045
Inventor: 黄河燕, 袁长森, 冯冲, 李正君
Owner: BEIJING INSTITUTE OF TECHNOLOGY