Remotely-supervised Dual-Attention relation classification method and system

A relation classification technology based on remote supervision, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses problems such as the influence of noise on model training, weak performance, and high cost, with the effects of reducing noisy data, avoiding error propagation, and improving accuracy.

Active Publication Date: 2018-11-16
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT

AI Technical Summary

Problems solved by technology

The current mainstream relation extraction methods are relation classification methods based on neural network learning, which face three major problems: difficulty in representing and mining semantic features, error propagation caused by manual labeling, and the impact of noise on model training.
Although some improved convolutional network models can model information over larger spans by stacking structures such as K-segment max pooling (for example, the three-segment pooling used in PCNNs, Piecewise CNNs), max pooling is relatively costly and performs relatively poorly compared with Bi-LSTM when extracting semantic features with long-range dependencies, such as those in long texts.
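
To make the three-segment pooling mentioned above concrete, here is a minimal sketch of piecewise max pooling in plain Python. It assumes the per-token convolution features are already computed and that the sentence is split at the two entity positions; the function name, the argument names, and the exact segment boundaries are illustrative assumptions, not details taken from this patent.

```python
def piecewise_max_pool(features, head_pos, tail_pos):
    """Max-pool per-token feature vectors over three segments.

    features: list of per-token feature vectors (lists of floats).
    head_pos, tail_pos: token indices of the two entities (assumed inputs).
    Segment boundaries (each entity closes its segment) are an assumption.
    """
    first, second = sorted((head_pos, tail_pos))
    segments = [
        features[: first + 1],            # start of sentence up to first entity
        features[first + 1 : second + 1],  # between the entities, up to second
        features[second + 1 :],           # remainder of the sentence
    ]
    dim = len(features[0])
    pooled = []
    for segment in segments:
        if segment:
            # Element-wise maximum over the tokens in this segment.
            pooled.extend(max(tok[d] for tok in segment) for d in range(dim))
        else:
            # Empty segment (entity at sentence end): pad with zeros.
            pooled.extend([0.0] * dim)
    return pooled
```

Each segment keeps only its strongest activation per feature dimension, so word order inside a segment is discarded. That loss is one way to read the weakness on long-range dependencies described above.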

Examples

Embodiment 1

[0054] As shown in Figure 1, an embodiment of the present invention provides a remotely-supervised Dual-Attention relation classification method, including:

[0055] Aligning the entity pairs in the knowledge base to the news corpus through remote supervision, and constructing an entity-pair sentence set;

[0056] Performing word-level vector encoding of each sentence with a Bi-LSTM model based on a word-level attention mechanism, to obtain the semantic feature encoding vector of the sentence;

[0057] Encoding and denoising the semantic features of the sentences with a Bi-LSTM model based on a sentence-level attention mechanism, to obtain the sentence-set feature encoding vector;

[0058] Packaging the sentence-set feature encoding vector together with the entity-pair translation vector, and performing relation classification of the entity pair on the resulting package feature.
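
The last three steps above can be sketched roughly as follows. This is a simplified illustration, not the patented implementation. It assumes the Bi-LSTM hidden states for every word are already available, uses a plain dot-product attention with a learned query vector, and treats the entity-pair translation vector as the tail embedding minus the head embedding (a TransE-style assumption). All function and parameter names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors, query):
    """Weighted sum of vectors; weights = softmax(tanh(v) . query)."""
    weights = softmax([dot([math.tanh(x) for x in v], query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def classify_bag(bag_hidden_states, word_query, sent_query,
                 head_vec, tail_vec, rel_weights):
    """bag_hidden_states: one list of per-word hidden vectors per sentence
    (assumed to come from a Bi-LSTM that is not shown here)."""
    # Word-level attention: one semantic feature vector per sentence.
    sentence_vecs = [attention_pool(h, word_query)[0] for h in bag_hidden_states]
    # Sentence-level attention: low weights suppress noisy sentences.
    set_vec, sent_weights = attention_pool(sentence_vecs, sent_query)
    # Entity-pair translation vector, assumed here to be tail - head.
    translation = [t - h for h, t in zip(head_vec, tail_vec)]
    # "Package" feature: concatenation of the two vectors.
    bag_feature = set_vec + translation
    # Linear scoring per relation followed by softmax.
    probs = softmax([dot(w, bag_feature) for w in rel_weights])
    return probs, sent_weights
```

The returned `sent_weights` show how much each sentence in the set contributed, which is where the denoising effect of the sentence-level attention would be visible after training.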

[0059] Figure 2 shows a specific flow chart of the embodiment of the pres...

Embodiment 2

[0081] Based on the same inventive concept, the present invention also provides a remotely-supervised Dual-Attention relation classification system, including:

[0082] A construction module, used to align entity pairs in the knowledge base to the news corpus through remote supervision and to construct an entity-pair sentence set;

[0083] A first vector module, used to perform word-level vector encoding of each sentence with a Bi-LSTM model based on a word-level attention mechanism, to obtain the semantic feature encoding vector of the sentence;

[0084] A second vector module, used to encode and denoise the semantic features of the sentences with a Bi-LSTM model based on a sentence-level attention mechanism, to obtain the sentence-set feature encoding vector;

[0085] A relation classification module, used to package the sentence-set feature encoding vector and the entity-pair translation vector, and to perform entity-pair relation classification on ...

Embodiment 3

[0105] In the knowledge base WikiData, the entity pair ("Jack Ma", "Alibaba") is known to have the corresponding relation set "founder", "CEO", and so on. Several sentences containing the entity pair "Jack Ma" and "Alibaba" are collected from Internet data; four example sentences in which the two entities co-occur are given below.

[0106] Sentence 1: “Female executives are Alibaba’s secret sauce, founder Jack Ma says.”

[0107] Sentence 2: "At a conference hosted by All Things D last week, Alibaba CEO Jack Ma said that he was interested in Yahoo."

[0108] Sentence 3: “Internet entrepreneur Jack Ma started a Chinese version of the Yellow Pages that was Alibaba’s precursor in Hangzhou, China.”

[0109] Sentence 4: "Alibaba has brought more small U.S. businesses onto the company's sites, but this is the first time Ma has discussed specific targets."

[0110] Sentences 3 and 4 do not express the predefined relationship of the knowledge base. One o...
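
The alignment step behind this example can be sketched as below. This is a minimal illustration under stated assumptions, not the method of the patent. Entity mentions are found by whole-word string matching, and the alias "Ma" for "Jack Ma" is a hypothetical addition so that Sentence 4 is matched. A real system would more likely use entity linking. The function names and the data layout are also assumptions.

```python
import re


def mentions(sentence, aliases):
    """True if any alias occurs in the sentence as a whole word."""
    return any(re.search(r"\b" + re.escape(a) + r"\b", sentence) for a in aliases)


def build_bags(kb_triples, aliases, sentences):
    """Group sentences by the knowledge-base entity pair they mention."""
    bags = {}
    for head, relation, tail in kb_triples:
        bag = bags.setdefault((head, tail), {"relations": set(), "sentences": []})
        bag["relations"].add(relation)
    for (head, tail), bag in bags.items():
        bag["sentences"] = [
            s for s in sentences
            if mentions(s, aliases.get(head, [head]))
            and mentions(s, aliases.get(tail, [tail]))
        ]
    return bags


kb_triples = [("Jack Ma", "founder", "Alibaba"), ("Jack Ma", "CEO", "Alibaba")]
aliases = {"Jack Ma": ["Jack Ma", "Ma"]}  # hypothetical alias table
sentences = [
    "Female executives are Alibaba's secret sauce, founder Jack Ma says.",
    "At a conference hosted by All Things D last week, Alibaba CEO Jack Ma "
    "said that he was interested in Yahoo.",
    "Internet entrepreneur Jack Ma started a Chinese version of the Yellow "
    "Pages that was Alibaba's precursor in Hangzhou, China.",
    "Alibaba has brought more small U.S. businesses onto the company's sites, "
    "but this is the first time Ma has discussed specific targets.",
]
bags = build_bags(kb_triples, aliases, sentences)
```

With these inputs all four sentences land in the set for ("Jack Ma", "Alibaba"), including Sentences 3 and 4. Remote supervision therefore labels them with "founder" and "CEO" even though they express neither, which is the kind of noise the sentence-level attention is meant to down-weight.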

Abstract

The invention relates to a remotely-supervised Dual-Attention relation classification method and system. The method comprises the following steps: aligning entity pairs in a knowledge base to a news corpus through remote supervision, and constructing an entity-pair sentence set; performing word-level vector encoding of each sentence through a Bi-LSTM model based on a word-level attention mechanism to obtain a semantic feature encoding vector of the sentence; encoding and denoising the semantic features of the sentences through a Bi-LSTM model based on a sentence-level attention mechanism to obtain a sentence-set feature encoding vector; and packaging the sentence-set feature encoding vector with the entity-pair translation vector, and performing relation classification of the entity pair on the resulting package feature. The technical scheme provided by the invention reduces noisy data in model training and avoids manual data annotation and the error propagation it causes. Entity alignment is performed using open-domain text and a large-scale knowledge base, which effectively addresses the problem of annotated data scale in relation extraction.

Description

Technical field

[0001] The invention belongs to the field of relation classification, and in particular relates to a remotely-supervised Dual-Attention relation classification method and system.

Background technique

[0002] With the development of Internet technology, the amount of text information on the World Wide Web has grown rapidly, and technology for automatically extracting knowledge from text has attracted increasing attention and become a current research hot spot. The current mainstream relation extraction methods are relation classification methods based on neural network learning, which face three major problems: difficulty in representing and mining semantic features, error propagation caused by manual annotation, and the impact of noise on model training. At present, among the relation classification methods based on neural network learning, the relation classification methods that achieve the best results a...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F40/295
Inventor: 贺敏, 毛乾任, 王丽宏, 李晨
Owner: NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT