Vocabulary mining method and apparatus

A vocabulary mining technology, applied in the field of data mining, that addresses the low efficiency and high labor cost of manual mining.

Active Publication Date: 2018-10-09
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

Manual mining is inefficient, requires the people doing the mining to have domain knowledge, and incurs a high labor cost.

Examples

Embodiment Construction

[0028] The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of this application.

[0029] The embodiments of this application provide an automatic vocabulary mining solution that can be used to mine hypernym pairs and is implemented on a server. The server hardware may be a processing device such as a computer or a notebook. Before the vocabulary mining method is introduced, the hardware structure of the server is described first. As shown in Figure 1, the server can include:...

Abstract

The invention discloses a vocabulary mining method and apparatus. The method comprises: determining an entity word set and a candidate hypernym set contained in a corpus sentence; combining the words of the two sets in pairs to obtain candidate word pairs; determining word vectors for the entity word and the candidate hypernym in each candidate word pair; and determining, according to the word vectors, whether each candidate word pair is a vocabulary mining result, for example whether it is a hypernym pair. The corpus does not need to be sorted manually, and hypernym pairs are mined automatically by machine learning, which greatly improves mining efficiency and reduces mining cost.
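
The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the lexicons, the two-dimensional word vectors, and the function names are all hypothetical, and toy_classifier (a cosine-similarity threshold) is only a placeholder for the trained machine-learning model, which the abstract does not specify.

```python
from itertools import product
from math import sqrt


def candidate_pairs(tokens, entity_lexicon, hypernym_lexicon):
    """Pair every entity word in the sentence with every candidate hypernym."""
    entities = [t for t in tokens if t in entity_lexicon]
    hypernyms = [t for t in tokens if t in hypernym_lexicon]
    return [(e, h) for e, h in product(entities, hypernyms) if e != h]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def toy_classifier(entity_vec, hypernym_vec, threshold=0.5):
    """Placeholder decision rule (assumption): stands in for a trained model."""
    return cosine(entity_vec, hypernym_vec) >= threshold


def mine_hypernym_pairs(tokens, entity_lexicon, hypernym_lexicon, vectors, classify):
    """Keep the candidate pairs that the classifier accepts."""
    results = []
    for entity, hypernym in candidate_pairs(tokens, entity_lexicon, hypernym_lexicon):
        if entity not in vectors or hypernym not in vectors:
            continue  # no word vector available for one of the words
        if classify(vectors[entity], vectors[hypernym]):
            results.append((entity, hypernym))
    return results


# Hypothetical toy data, hand-picked for illustration only.
TOKENS = ["the", "tiger", "is", "an", "animal", "living", "in", "the", "forest"]
ENTITY_LEXICON = {"tiger"}
HYPERNYM_LEXICON = {"animal", "forest"}
VECTORS = {"tiger": (1.0, 0.0), "animal": (1.0, 1.0), "forest": (0.0, 1.0)}

PAIRS = mine_hypernym_pairs(
    TOKENS, ENTITY_LEXICON, HYPERNYM_LEXICON, VECTORS, toy_classifier
)
print(PAIRS)  # [('tiger', 'animal')]
```

Passing the classifier in as a parameter keeps the pair-generation steps separate from the decision step, so a trained model could replace the placeholder without changing the rest of the sketch.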

Description

Technical field

[0001] The present application relates to the technical field of data mining, and more specifically to a vocabulary mining method and device.

Background

[0002] A hypernym is defined as follows: if an entity word A and a word B stand in a hyponymy relationship, and entity word A is a hyponym of word B, then word B is a hypernym of entity word A. For example, "animal" is a hypernym of "tiger". The word pair formed by an entity word A and a word B in such a hyponymy relationship is called a hypernym pair. For example, "tiger, animal" constitutes a hypernym pair.

[0003] Mining hypernym pairs from a large corpus helps with text analysis and other tasks. Existing hypernym pair mining methods generally rely on manual semantic analysis of the corpus to determine hypernym pairs. Obviously, manual mining is inefficient and requires the people doing the mining to have certain domain knowledge, res...

Application Information

IPC(8): G06F17/27, G06N3/08
CPC: G06N3/084, G06F2216/03, G06F40/284
Inventor: 李潇, 张锋, 王策
Owner: TENCENT TECH (SHENZHEN) CO LTD