Relation extraction method in combination with clause-level remote supervision and semi-supervised ensemble learning

A technology combining remote supervision and ensemble learning, applied in the field of information extraction

Active Publication Date: 2017-01-04
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

[0007] To address noise data and underused negative example data in relation extraction, the present invention provides a relation extraction method combining clause-level remote supervision and semi-supervised ensemble learning, which both removes noise data and makes full use of negative example data.



Examples


Embodiment Construction

[0054] In order to describe the present invention more specifically, the technical solutions of the present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0055] Figure 1 shows a flow chart of the relation extraction method of the present invention, which combines clause-level remote supervision and semi-supervised ensemble learning. The method is divided into two stages: data processing and model training.

[0056] Data processing stage

[0057] The specific steps of data processing are as follows:

[0058] Step a-1: align the relational triples in the knowledge base K to the corpus D through remote supervision, and construct a relation instance set $Q = \{ q_n \mid q_n = (s_m, e_i, r_k, e_j),\ s_m \in D \}$.

[0059] If a sentence $s_m$ contains both entities $e_i$ and $e_j$, and the knowledge base K contains a relational triple $(e_i, r_k, e_j)$, then $(s_m, e_i, r_k, e_j)$ is a positive relation instance...
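The alignment rule in step a-1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `RelationInstance` type, the `align` function, the sample knowledge base and the sample corpus are all hypothetical, and plain substring matching stands in for whatever entity recognition the method actually uses.

```python
from typing import NamedTuple


class RelationInstance(NamedTuple):
    """One element q_n = (s_m, e_i, r_k, e_j) of the instance set Q."""
    sentence: str
    head: str
    relation: str
    tail: str


def align(knowledge_base, corpus):
    """Collect positive relation instances by aligning triples to sentences.

    A sentence yields an instance when it mentions both entities of a
    triple (e_i, r_k, e_j) found in the knowledge base.
    """
    instances = []
    for sentence in corpus:
        for head, relation, tail in knowledge_base:
            # Assumption: naive substring test instead of real entity linking.
            if head in sentence and tail in sentence:
                instances.append(
                    RelationInstance(sentence, head, relation, tail)
                )
    return instances


# Hypothetical knowledge base K and corpus D.
K = [("Zhejiang University", "located_in", "Hangzhou")]
D = [
    "Zhejiang University is a university in Hangzhou.",
    "Hangzhou is the capital of Zhejiang province.",
]

Q = align(K, D)
for q in Q:
    print(q)
```

Only the first sentence mentions both entities, so it is the only one aligned. A sentence can mention both entities without expressing the relation, which is the kind of noise the method sets out to remove.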


Abstract

The invention discloses a relation extraction method combining clause-level remote supervision and semi-supervised ensemble learning. The method comprises the following steps: (1) aligning the relation triples in a knowledge base to a corpus through remote supervision, and constructing a relation instance set; (2) removing noise data from the relation instance set by using clause identification based on syntactic analysis; (3) extracting morphological features of the relation instances, converting them into distributed representation vectors, and constructing a feature data set; and (4) selecting all positive example data and a small part of the negative example data in the feature data set to form a labeled data set, stripping the labels from the remaining negative example data to form an unlabeled data set, and training a relation classifier with a semi-supervised ensemble learning algorithm. The method performs relation extraction by combining clause identification, remote supervision and semi-supervised ensemble learning, and has broad application prospects in automatic question-answering system construction, massive information processing, automatic knowledge base construction, search engines, domain-specific text mining and similar fields.
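The data split described in step 4 can be illustrated with a short sketch. It covers only the construction of the labeled and unlabeled sets, not the semi-supervised ensemble learning algorithm itself. The function name, the `"NA"` negative label, the `num_labeled_negatives` parameter and the toy feature vectors are assumptions made for illustration and do not come from the patent.

```python
import random

NEGATIVE = "NA"  # assumed label marking a negative example (no relation)


def split_labeled_unlabeled(examples, num_labeled_negatives, seed=0):
    """Split (feature_vector, label) pairs as described in step 4.

    All positive examples and a small random sample of negatives keep
    their labels; the remaining negatives have their labels removed.
    """
    positives = [ex for ex in examples if ex[1] != NEGATIVE]
    negatives = [ex for ex in examples if ex[1] == NEGATIVE]
    # Shuffle with a fixed seed so the sampled negatives are reproducible.
    random.Random(seed).shuffle(negatives)
    labeled = positives + negatives[:num_labeled_negatives]
    # Keep only the feature vectors for the unlabeled set.
    unlabeled = [features for features, _ in negatives[num_labeled_negatives:]]
    return labeled, unlabeled


# Toy feature data set: 3 positive and 10 negative examples.
examples = [([0.1 * i, 0.2 * i], "located_in") for i in range(3)]
examples += [([0.3 * i, 0.4 * i], NEGATIVE) for i in range(10)]

labeled, unlabeled = split_labeled_unlabeled(examples, num_labeled_negatives=2)
print(len(labeled), len(unlabeled))
```

Passing an explicit count of labeled negatives, rather than a fraction, keeps the sketch free of rounding questions; how small the "small part" should be is not stated in the abstract.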

Description

Technical field

[0001] The invention relates to the field of information extraction, in particular to a relation extraction method combining clause-level remote supervision and semi-supervised ensemble learning.

Background

[0002] Information extraction refers to the process of extracting information such as entities, events, and relationships from a piece of text, forming structured data and storing it in a database for users to query and use. Relation extraction is a key part of information extraction; it aims to extract the semantic relationships between entities. Relation extraction technology has broad application prospects in automatic question-answering system construction, massive information processing, automatic knowledge base construction, search engines, and domain-specific text mining.

[0003] Traditional relation extraction research generally adopts supervised machine learning methods, which regard relation extraction as a classificatio...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/288, G06F16/313, G06F16/355
Inventors: 陈岭 (Chen Ling), 余小康 (Yu Xiaokang)
Owner: ZHEJIANG UNIV