Chinese relationship extraction method

A Chinese relation extraction technology, applied in fields such as neural learning methods, instruments, and biological neural network models, that can solve problems such as polysemous word ambiguity and word segmentation ambiguity.

Active Publication Date: 2019-10-15
SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0007] To solve the problems of word segmentation ambiguity and polysemous word ambiguity in Chinese relation extraction in the prior art, the present invention provides a Chinese relation extraction method.



Examples


Embodiment 1

[0069] As shown in Figure 1, the present invention provides a Chinese relation extraction method comprising the following steps:

[0070] S1: Data preprocessing: pre-train multi-granularity information on the text of the input data, so as to extract distributed vectors at three levels in the text: characters, words, and word senses;

[0071] S2: Feature encoding: using a bidirectional long short-term memory (LSTM) network as the basic structure, obtain the hidden state vectors of the characters and of the words from the distributed vectors at the three levels of characters, words, and word senses, and then obtain the final character-level hidden state vector;

[0072] S3: Relation classification: learn the final character-level hidden state vector, and use a character-level attention mechanism to fuse the character-level hidden state vectors into a sentence-level hidden state vector.
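
The fusion described in step S3 can be illustrated with a minimal sketch. The scoring function below (a dot product between tanh of each hidden state and a query vector) is a common attention formulation and is an assumption here, not the form stated in the patent; the names `attention_pool`, `query`, and the toy values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Fuse per-character hidden states into one sentence-level vector.

    hidden_states: list of M vectors (lists of d floats), one per character.
    query: d-dimensional vector; a hypothetical stand-in for a learned parameter.
    """
    # Score each character: dot product of tanh(h_i) with the query (assumed form).
    scores = [
        sum(math.tanh(h_j) * q_j for h_j, q_j in zip(h, query))
        for h in hidden_states
    ]
    weights = softmax(scores)
    dim = len(query)
    # Sentence vector = attention-weighted sum of the character-level states.
    sentence = [
        sum(w * h[j] for w, h in zip(weights, hidden_states))
        for j in range(dim)
    ]
    return sentence, weights


# Toy input: three characters with 2-dimensional hidden states.
hidden = [[0.1, 0.3], [0.9, -0.2], [0.4, 0.4]]
query = [1.0, 0.5]
sentence_vec, attn = attention_pool(hidden, query)
```

In a full model the sentence-level vector would then be passed to a classifier over the relation labels; that stage is omitted because the source text does not describe it here.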

[0073] The data preprocessing step is to perform multi-granularity information pre-tr...

Embodiment 2

[0089] As shown in Figure 2, this embodiment adopts the Chinese relation extraction method provided by the present invention. The task is defined as follows: given a sentence s and two specified entities in the sentence, determine what relationship holds between the two entities in s. For example, given the sentence "All Rhododendrons in the Darwin Institute" and the specified entities "Darwin" and "Rhododendron", the goal is to determine the relationship between "Darwin" and "Rhododendron" in the sentence.
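
As an illustration of this task definition, one input instance can be represented as a sentence plus two specified entities. The sketch below is an assumption about a convenient data layout, not part of the patent: the class name `RelationInstance`, its field names, and the first-occurrence span lookup are all hypothetical, and the English rendering of the example sentence is used because that is what the text provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationInstance:
    """One relation extraction input: a sentence and two marked entities."""

    sentence: str
    head: str  # first specified entity
    tail: str  # second specified entity

    def entity_spans(self):
        """Return (start, end) character offsets for head and tail.

        Uses the first occurrence of each entity string, which is a
        simplifying assumption of this sketch.
        """
        spans = []
        for entity in (self.head, self.tail):
            start = self.sentence.find(entity)
            if start < 0:
                raise ValueError(f"entity {entity!r} not found in sentence")
            spans.append((start, start + len(entity)))
        return tuple(spans)


example = RelationInstance(
    sentence="All Rhododendrons in the Darwin Institute",
    head="Darwin",
    tail="Rhododendron",
)
head_span, tail_span = example.entity_spans()
```

A model for this task would take such an instance as input and output one relation label; the label set itself is not given in this excerpt.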

[0090] Step 1. Data preprocessing:

[0091] 1.1 Character-level representation

[0092] For a given input sequence $s = \{c_1, \ldots, c_M\}$ of $M$ characters, the present invention uses the word2vec method to map each character $c_i$ (the $i$-th character) to a character vector in $\mathbb{R}^{d_c}$, where $\mathbb{R}$ is the real number space and $d_c$ is the dimension of the character vector. In addition to character vectors, this technology als...
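
The character-to-vector mapping can be sketched as a simple table lookup. This is a minimal illustration only: the randomly initialised table stands in for vectors that would come from word2vec pre-training (not shown), and the function names, the dimension `d_c = 4`, and the input string are hypothetical.

```python
import random


def build_char_table(sentences, dim, seed=0):
    """Assign each distinct character a dim-dimensional vector.

    Random values are placeholders for pre-trained word2vec vectors.
    """
    rng = random.Random(seed)
    table = {}
    for sentence in sentences:
        for ch in sentence:
            if ch not in table:
                table[ch] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return table


def embed(sentence, table):
    """Map a sequence of M characters to a list of M character vectors."""
    return [table[ch] for ch in sentence]


d_c = 4            # dimension of each character vector (hypothetical value)
s = "研究研究所"     # hypothetical five-character input sequence
table = build_char_table([s], d_c)
vectors = embed(s, table)
```

Each position in the sequence receives one vector of length `d_c`, and repeated characters share the same vector, which is why character vectors alone cannot distinguish different senses of the same character.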


Abstract

The invention provides a Chinese relation extraction method comprising the following steps: S1, data preprocessing: performing pre-training of multi-granularity information on the text of the input data to extract distributed vectors at three levels in the text: characters, words, and word senses; S2, feature encoding: taking a bidirectional long short-term memory network as the basic framework, obtaining hidden state vectors of the characters and hidden state vectors of the words from the distributed vectors at the three levels, and then obtaining the final character-level hidden state vectors; and S3, relation classification: learning the final character-level hidden state vectors, and fusing them into a sentence-level hidden state vector using a character-level attention mechanism. The method effectively addresses word segmentation ambiguity and polysemy ambiguity, greatly improves the model's performance on the relation extraction task, and improves the accuracy and robustness of Chinese relation extraction.

Description

Technical field

[0001] The invention relates to the field of computer application technology, in particular to a Chinese relation extraction method.

Background

[0002] Natural language processing is a sub-discipline of artificial intelligence and an intersection of computer science and computational linguistics. Relation extraction is one of its basic tasks: given a sentence and marked entities (usually nouns), it aims to accurately identify the relationship between those entities. Relation extraction technology can be used to build large-scale knowledge graphs. Knowledge graphs are semantic networks composed of concepts, entities, entity attributes, and entity relationships, and are structured representations of the real world. The construction of large-scale knowledge graphs can provide comprehensive and structured external knowledge for artificial intelligence systems, thereby developing more power...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/049, G06N3/08, G06F40/211, G06F40/295, G06F18/2414
Inventors: 丁宁, 李自然, 郑海涛, 刘知远, 沈颖
Owner: SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV