
Word2vec-based remotely supervised non-taxonomic relation extraction method and system

A non-taxonomic relation extraction method, applied in the field of word2vec-based remotely supervised non-taxonomic relation extraction, that addresses the low accuracy of extracted non-taxonomic relations and achieves high accuracy while avoiding error accumulation.

Inactive
Publication Date: 2017-09-08
CHINA AGRI UNIV
4 Cites, 24 Cited by

AI Technical Summary

Problems solved by technology

The accuracy of extracted non-taxonomic relations is much lower than that of ordinary taxonomic (classification) relations.



Examples


Embodiment Construction

[0022] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained from them by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0023] Referring to Figure 1, this embodiment discloses a word2vec-based remotely supervised non-taxonomic relation extraction method, including:

[0024] S1. Crawl unstructured vegetable-domain text data from Internet encyclopedias and large vegetable websites as the corpus, then perform preprocessing and data alignment on the corpus in turn to obtain pr...
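The text of step S1 is truncated here, so the exact alignment procedure is not visible. As a rough illustration only, the sketch below assumes that "data alignment" means the usual remote (distant) supervision heuristic: a sentence that mentions both entities of a known relation triple is labeled with that relation. The `SEED_TRIPLES` table, the relation names, and the `align` helper are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Tuple

# Hypothetical seed triples: (head entity, tail entity) -> relation label.
# Illustrative values only; they are not taken from the patent.
SEED_TRIPLES: Dict[Tuple[str, str], str] = {
    ("tomato", "late blight"): "suffers_from",
    ("cucumber", "greenhouse"): "grown_in",
}


def align(sentences: List[str]) -> List[Tuple[str, str, str, str]]:
    """Label each sentence that mentions both entities of a seed triple."""
    labeled = []
    for sentence in sentences:
        text = sentence.lower()
        for (head, tail), relation in SEED_TRIPLES.items():
            # Assumed heuristic: plain substring match on both entity names.
            if head in text and tail in text:
                labeled.append((sentence, head, tail, relation))
    return labeled


corpus = [
    "Tomato plants are often damaged by late blight in wet seasons.",
    "Cucumber is commonly cultivated in a greenhouse.",
    "Late blight spreads quickly in humid weather.",
]

for row in align(corpus):
    print(row)
```

Labels produced this way are noisy, because a sentence can mention both entities without expressing the relation. That noise is the motivation for filtering the corpus before training.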


Abstract

The invention discloses a word2vec-based remotely supervised non-taxonomic relation extraction method and system that can extract non-taxonomic relations in the vegetable domain. The method comprises: crawling unstructured vegetable-domain text from an online encyclopedia and large vegetable websites as the corpus, and preprocessing the corpus in sequence to obtain a primary training corpus; training a word2vec model on the primary training corpus and using it to obtain a vector for each sentence; aggregating the primary training corpus by non-taxonomic relation type and, for the aggregated data of each relation, extracting a common sentence pattern and an uncommon sentence pattern; selecting two sentence vectors, one matching each pattern, as the initial centers for k-means clustering, clustering all sentence vectors, and keeping the cluster that matches the common sentence pattern to obtain a higher-quality training corpus; and training a convolutional neural network model on the higher-quality training corpus and extracting the non-taxonomic relations through a fully connected softmax layer.
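The clustering step can be pictured with a small pure-Python sketch. It rests on several assumptions that the abstract does not state: sentence vectors are formed by averaging word vectors, the toy `EMBEDDINGS` table stands in for a trained word2vec model, and the helper names `sentence_vector` and `seeded_kmeans` are hypothetical. Only the idea of seeding k-means with one sentence per pattern and keeping the cluster of the common pattern is taken from the text.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def sentence_vector(tokens: Sequence[str], embeddings: Dict[str, Vector]) -> Vector:
    """Average the vectors of known tokens (an assumed pooling choice)."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        raise ValueError("no token of the sentence has an embedding")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seeded_kmeans(points: List[Vector], seeds: List[Vector], max_iter: int = 100) -> List[int]:
    """k-means whose initial centers are supplied by the caller."""
    centers = [list(s) for s in seeds]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                dim = len(members[0])
                centers[k] = [sum(m[i] for m in members) / len(members) for i in range(dim)]
    return labels


# Toy 2-D embeddings standing in for a trained word2vec model (assumption).
EMBEDDINGS: Dict[str, Vector] = {
    "tomato": [1.0, 0.0], "suffers": [0.9, 0.1], "blight": [0.8, 0.2],
    "weather": [0.0, 1.0], "humid": [0.1, 0.9], "spreads": [0.2, 0.8],
}
sentences = [
    ["tomato", "suffers", "blight"],            # assumed to match the common pattern
    ["tomato", "blight"],
    ["blight", "spreads", "humid", "weather"],  # assumed to match the uncommon pattern
    ["humid", "weather"],
]
vectors = [sentence_vector(s, EMBEDDINGS) for s in sentences]

# Seed with one sentence per pattern; cluster 0 is the common-pattern cluster.
labels = seeded_kmeans(vectors, seeds=[vectors[0], vectors[2]])
high_quality = [s for s, lab in zip(sentences, labels) if lab == 0]
print(labels, high_quality)
```

Seeding the centers ties each cluster index to a known pattern, so the cluster to keep can be chosen without inspecting the clusters afterwards.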

Description

Technical field

[0001] The invention relates to the field of weakly supervised classification, in particular to a word2vec-based remotely supervised non-taxonomic relation extraction method and system.

Background

[0002] At present, research on ontology-like knowledge graphs in the agricultural field is still in its infancy, and there are relatively few literature reports on non-taxonomic relations (relations other than the hypernym-hyponym classification relation). Some literature does address the learning of non-taxonomic relations in the fields of ancient agriculture and tea science, such as "Research on Semi-automatic Construction and Retrieval of Domain Ontology" by He Lin and "Research on Modeling of Ontology Learning for Vegetable Field" by Xu Jicheng, but these works all use the most basic association rule method to discover related concept pairs. Not...


Application Information

IPC(8): G06F17/30
CPC: G06F16/35, G06F16/367
Inventor: 赵明, 杜会芳, 董翠翠, 陈瑛
Owner: CHINA AGRI UNIV