Entity relationship classification method of unstructured text based on WordNet and IDF

An entity relationship classification technique for unstructured text, applicable to unstructured text data retrieval, text database clustering/classification, and neural learning methods. It addresses problems such as noisy training data and data labeling errors.

Pending Publication Date: 2020-05-22
SHANGHAI UNIV +1

AI Technical Summary

Problems solved by technology

[0005] In summary, current relation extraction methods under distant supervision mainly address one problem: the strong assumptions introduced when annotated training sets are generated automatically cause a large number of relations to be labeled incorrectly, which leaves the training data very noisy.
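
The labeling noise described above can be illustrated with a small sketch. It assumes the usual distant supervision heuristic (any sentence mentioning both entities of a known fact is labeled with that fact's relation); the knowledge base, the sentences, and the function name `distant_label` are hypothetical and are not taken from the patent.

```python
# Toy knowledge base of (head, tail) -> relation facts (hypothetical data).
KB = {("Steve Jobs", "Apple"): "founder_of"}


def distant_label(sentences, kb):
    """Label each sentence that mentions both entities of a KB fact
    with that fact's relation, regardless of what the sentence says."""
    labeled = []
    for sent in sentences:
        for (head, tail), relation in kb.items():
            if head in sent and tail in sent:
                labeled.append((sent, head, tail, relation))
    return labeled


sentences = [
    "Steve Jobs co-founded Apple in 1976.",   # really expresses founder_of
    "Steve Jobs returned to Apple in 1997.",  # does not, but is labeled anyway
    "Tim Cook leads Apple.",                  # only one entity, not labeled
]

for sent, head, tail, relation in distant_label(sentences, KB):
    print(relation, "<-", sent)
```

The second sentence receives the `founder_of` label even though it does not express that relation, which is the kind of incorrect label the strong assumption produces.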

Examples

Embodiment Construction

[0064] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0065] The present invention proposes an entity relationship classification method for unstructured text based on WordNet and IDF. It uses external and internal information vectors to semantically expand entities and sentences, then uses a segmented convolutional neural network to extract fixed-length semantic feature vectors. These vectors are used to train a classifier, which can finally classify the entity relationships in unstructured text. The basic features of the present invention mainly include the following aspects: first, the external information of the entity, its internal information, and sentence structure information are added to the vector after the sentence is vectorized; second, the convolutional ...
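
As a rough illustration of how a segmented network can yield a fixed-length feature vector from sentences of different lengths, the sketch below applies piecewise max pooling to an already computed convolution output. The text above is truncated and does not describe the pooling, so this assumes the common scheme of splitting the sentence into three segments at the two entity positions; the function name `piecewise_max_pool` and the zero default for an empty segment are assumptions.

```python
def piecewise_max_pool(feature_map, head_pos, tail_pos):
    """Max-pool each filter over three segments split at the entity positions.

    feature_map: list of rows, one row per token position,
                 each row holding one value per convolution filter.
    Returns a flat list of length 3 * num_filters.
    """
    first, second = sorted((head_pos, tail_pos))
    segments = [
        feature_map[: first + 1],            # start .. first entity
        feature_map[first + 1 : second + 1],  # after first .. second entity
        feature_map[second + 1 :],           # after second entity .. end
    ]
    num_filters = len(feature_map[0])
    pooled = []
    for segment in segments:
        for f in range(num_filters):
            # An empty segment contributes 0.0 (assumption for this sketch).
            pooled.append(max((row[f] for row in segment), default=0.0))
    return pooled


# Five token positions, two filters, entities at positions 1 and 3.
feature_map = [[0.1, 0.5], [0.4, 0.2], [0.3, 0.9], [0.8, 0.1], [0.2, 0.6]]
print(piecewise_max_pool(feature_map, head_pos=1, tail_pos=3))
```

The output length depends only on the number of filters, not on the sentence length, which is what lets the vector feed a classifier with a fixed input size.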

Abstract

The invention discloses an entity relationship classification method of an unstructured text based on WordNet and IDF. The method comprises the following specific steps:

(1) obtaining a text training set and performing preprocessing to obtain sentence matrix representation;
(2) expanding external semantic information of sentences by utilizing WordNet;
(3) expanding internal semantic information of the sentences by utilizing IDF;
(4) calculating position information of words in the sentences, and updating matrix representation of the sentences;
(5) inputting a sentence matrix in the step (4) into a segmented convolutional neural network to obtain feature vectors of the sentences;
(6) inputting the feature vectors into a classifier, and calculating a loss function;
(7) if the accuracy of the current round of training is improved by more than 0.1% compared with the accuracy of the previous round or reaches an upper limit of iteration, finishing the training of the classification method; otherwise, updating hyper-parameters in the step (5), and continuing the training process.

According to the method, semantic features of entities and relations can be accurately expressed, so that the problem of overlarge noise of training set data is relieved, and the classification accuracy is improved.
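
Steps (3) and (4) can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract gives neither the IDF formula nor the position encoding, so the sketch assumes the common form idf(w) = log(N / df(w)) over the training sentences and signed token offsets to the two entities. The function names `idf_weights` and `relative_positions` and the toy corpus are hypothetical.

```python
import math


def idf_weights(tokenized_sentences):
    """Return idf(w) = log(N / df(w)), treating each sentence as a document."""
    n = len(tokenized_sentences)
    doc_freq = {}
    for tokens in tokenized_sentences:
        for word in set(tokens):  # count each word once per sentence
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n / count) for word, count in doc_freq.items()}


def relative_positions(tokens, head_idx, tail_idx):
    """Signed offset of every token to the head and tail entity tokens."""
    return [(i - head_idx, i - tail_idx) for i in range(len(tokens))]


corpus = [
    ["jobs", "founded", "apple"],
    ["jobs", "left", "apple"],
]
idf = idf_weights(corpus)
print(idf["jobs"], idf["founded"])
print(relative_positions(corpus[0], head_idx=0, tail_idx=2))
```

Words that occur in every sentence get a weight of zero, while rarer words such as "founded" get a larger weight; how these weights and offsets are merged into the sentence matrix is not specified in the abstract and is left out here.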

Description

Technical field

[0001] The invention relates to the technical field of text mining and deep learning. It is an entity relationship classification method for unstructured text, based on WordNet and IDF, under distant supervision. It can be applied to specific areas such as building knowledge graphs, developing question answering systems, and information retrieval systems.

Background technique

[0002] Entity relationship extraction is one of the most important sub-topics in the field of information extraction. Building on entity recognition, it extracts predefined semantic relationships between entities from unstructured text. According to the degree of dependence on labeled data, entity relationship extraction methods can be divided into supervised relationship extraction, semi-supervised relationship extraction, unsupervised relationship extraction, and distant supervision relationship extraction.

[0003] Supervised relation extraction treats the relation extraction...

Application Information

IPC(8): G06F16/35, G06F16/36, G06F40/30, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/35, G06F16/367, G06N3/08, G06N3/045, G06F18/24
Inventor 陈雪乐金雄骆祥峰黄敬王鹏
Owner SHANGHAI UNIV