Method capable of combining word vector with bootstrap learning for obtaining and organizing domain entity hyponymy

A word-vector and domain-entity technique, applied in natural language data processing, special data processing applications, instruments, etc. It addresses the problems of low extraction efficiency and high corpus dependence, improving accuracy and making extraction easier.

Active Publication Date: 2017-12-12
KUNMING UNIV OF SCI & TECH
3 Cites, 27 Cited by

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a method for acquiring and organizing the hyponymy relations of domain entities by combining word vectors with bootstrap learning. It addresses the heavy corpus dependence and low extraction efficiency of traditional hyponymy relation extraction methods.

Examples

Embodiment 1

[0056] Embodiment 1: As shown in Figures 1-3, a method for acquiring and organizing the hyponymy relations of domain entities by combining word vectors with bootstrap learning. The specific steps of the method are as follows:

[0057] Step1. First, use bootstrap learning to obtain candidate hyponymy relation instances from tourism-domain text;

[0058] Step1.1. First, write a crawler program to crawl tourism-domain text from travel websites and encyclopedia entries;

[0059] Because webpage structures differ, the positions and tags to be crawled differ from site to site, and no ready-made program exists, so a separate program must be written for each crawling task. The corpus should cover travel webpages on different themes as comprehensively as possible, for example Baidu Encyclopedia entries and travel webpage information.
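The point about per-site tags can be illustrated with a minimal sketch. It assumes each site is described by a rule naming the tag (and optionally the class) that holds the body text. The site keys, tag names, and class names in `SITE_RULES` are hypothetical and do not come from the patent. Fetching is left out so the example runs offline, and only the extraction step is shown, using Python's standard `html.parser`.

```python
from html.parser import HTMLParser

# Hypothetical per-site rules: which tag (and optional class) holds body text.
SITE_RULES = {
    "encyclopedia": ("div", "para"),
    "travel_site": ("p", None),
}


class TagTextExtractor(HTMLParser):
    """Collect the text found inside elements that match one site rule."""

    def __init__(self, tag, css_class=None):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0    # > 0 while inside a matching element
        self.chunks = []  # one list of text fragments per matching element

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Track nested elements of the same tag so the end tag is matched.
            if tag == self.tag:
                self.depth += 1
            return
        if tag == self.tag:
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class is None or self.css_class in classes:
                self.depth = 1
                self.chunks.append([])

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks[-1].append(data)


def extract_text(html, site):
    """Return whitespace-normalized text blocks for the given site rule."""
    tag, css_class = SITE_RULES[site]
    parser = TagTextExtractor(tag, css_class)
    parser.feed(html)
    parser.close()
    blocks = (" ".join("".join(parts).split()) for parts in parser.chunks)
    return [text for text in blocks if text]


page = ('<div class="nav">Home</div>'
        '<div class="para">Lijiang is an <b>ancient town</b> in Yunnan.</div>')
print(extract_text(page, "encyclopedia"))
```

Keeping the site-specific part in a small rule table means a new source only needs a new rule, while the parsing code stays the same. Real pages with unclosed tags or scripts would need more careful handling than this sketch provides.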

[0060] Step1.2, the pre...
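The bootstrap learning named in Step1 can be sketched roughly as follows. This is a generic pattern-bootstrapping loop, not the patent's exact procedure: it starts from seed (hyponym, hypernym) pairs, induces the connector strings that appear between them, applies those connectors to find new candidate pairs, and repeats until nothing new is found. The English toy corpus, the seed pair, and the function names are illustrative assumptions.

```python
import re


def induce_patterns(sentences, pairs, max_len=20):
    """Collect short connector strings found between a known hyponym and hypernym."""
    patterns = set()
    for hypo, hyper in pairs:
        rx = re.compile(re.escape(hypo) + r"(.+?)" + re.escape(hyper))
        for sentence in sentences:
            m = rx.search(sentence)
            # Skip long connectors, which are unlikely to be reusable patterns.
            if m and len(m.group(1)) <= max_len:
                patterns.add(m.group(1))
    return patterns


def apply_patterns(sentences, patterns):
    """Extract candidate (hyponym, hypernym) pairs around each connector."""
    pairs = set()
    for pattern in patterns:
        rx = re.compile(r"(\w+)" + re.escape(pattern) + r"(\w+)")
        for sentence in sentences:
            pairs.update(rx.findall(sentence))
    return pairs


def bootstrap(sentences, seeds, max_iters=5):
    """Alternate pattern induction and pair extraction until no new pairs appear."""
    pairs = set(seeds)
    for _ in range(max_iters):
        patterns = induce_patterns(sentences, pairs)
        new_pairs = apply_patterns(sentences, patterns) - pairs
        if not new_pairs:
            break
        pairs |= new_pairs
    return pairs


# Toy tourism sentences (made up for illustration).
corpus = [
    "Erhai is a lake near Dali",
    "Lijiang is a town in Yunnan",
    "Lijiang , a town with old streets",
    "Cangshan , a mountain west of Dali",
]
candidates = bootstrap(corpus, {("Erhai", "lake")})
print(sorted(candidates))
```

The output pairs are only candidates. A practical system would normally score patterns and pairs to limit semantic drift, which this sketch omits.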

Abstract

The invention relates to a method for acquiring and organizing domain entity hyponymy by combining word vectors with bootstrap learning, and belongs to the technical fields of natural language processing and machine learning. First, candidate hyponymy entities are obtained from tourism-domain text by bootstrap learning. These candidates are then used to manually construct a tourism-domain knowledge base, and the hyponymy entities are organized into hierarchical relations by means of a mapping matrix. The method extracts hyponymy relations effectively and provides strong support for tasks such as information extraction, information retrieval, and machine translation. Compared with existing identification methods, it improves accuracy, recall, and F value, and therefore has research value.
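The abstract does not say how the mapping matrix is obtained. One common formulation, assumed here, learns a matrix M so that M applied to a hyponym's word vector lands near its hypernym's word vector, and then attaches a new entity to the candidate hypernym closest to its projection. The sketch below fits M by plain gradient descent on squared error. The 2-dimensional vectors, entity names, learning rate, and epoch count are made-up illustration values.

```python
import math


def project(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def train_mapping(pairs, dim, lr=0.1, epochs=200):
    """Fit M so that project(M, x) is close to y for each (x, y) training pair."""
    M = [[0.0] * dim for _ in range(dim)]
    for _ in range(epochs):
        for x, y in pairs:
            pred = project(M, x)
            for i in range(dim):
                err = pred[i] - y[i]
                for j in range(dim):
                    # Gradient of the squared error with respect to M[i][j].
                    M[i][j] -= lr * 2 * err * x[j]
    return M


def nearest_hypernym(M, x, candidates):
    """Return the candidate name whose vector is closest to the projection of x."""
    projected = project(M, x)
    return min(candidates, key=lambda name: math.dist(projected, candidates[name]))


# Toy 2-D "word vectors" (assumed values, not real embeddings).
vectors = {
    "Erhai": [1.0, 0.0], "lake": [1.0, 1.0],
    "Lijiang": [0.0, 1.0], "town": [-1.0, 1.0],
    "Dianchi": [0.9, 0.1],
}
training = [(vectors["Erhai"], vectors["lake"]),
            (vectors["Lijiang"], vectors["town"])]
M = train_mapping(training, dim=2)
hypernyms = {"lake": vectors["lake"], "town": vectors["town"]}
parent = nearest_hypernym(M, vectors["Dianchi"], hypernyms)
print(parent)
```

Applying `nearest_hypernym` to every entity yields parent links from which a hierarchy can be assembled. A distance threshold could be added to reject entities that have no plausible hypernym among the candidates.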

Description

Technical field

[0001] The invention relates to a method for acquiring and organizing the hyponymy relations of domain entities by combining word vectors with bootstrap learning, and belongs to the technical fields of natural language processing and machine learning.

Background technique

[0002] Hyponymy is a basic semantic relation that is often used in the construction and verification of ontologies, knowledge bases, and dictionaries. From the perspective of technical implementation, acquiring hyponymy relations provides important support for acquiring other information. It can be used to check the correctness of ontologies, knowledge bases, and dictionaries, and to expand and improve them. It also yields semantic information about noun phrases, especially unregistered words, from which further semantic relations between concepts can be derived. Generally speaking, the acquisition of hyponymy relation is a basic and key problem in knowl...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/9535, G06F40/289, G06F40/30
Inventor: 郭剑毅, 马晓军, 余正涛, 陈玮, 张志坤
Owner: KUNMING UNIV OF SCI & TECH