Word embedding deep learning method based on knowledge graph

A word embedding deep learning technology based on a knowledge graph, applied in the field of machine learning for word embedding. It addresses the lack of semantic association between word vectors, the failure to consider semantic information among training data, and the difficulty of capturing deep feature representations of words, thereby improving generalization ability and accuracy while remaining simple to implement.

Active Publication Date: 2018-02-23
TONGJI UNIV
3 Cites, 30 Cited by

AI Technical Summary

Problems solved by technology

[0006] We found that although the word2vec method can learn the one-dimensional vector representation of words to a certain extent, it is difficult to capture the deep feature representation of words because its tr

Method used




Embodiment Construction

[0022] The technical solution of the present invention is described below with reference to the drawings.

[0023] To remedy the defects of the above-mentioned existing methods, the purpose of the present invention is to provide a word embedding deep learning method with high accuracy, strong generalization ability, and simple, easy implementation. The technical framework is shown in Figure 1.

[0024] The invention consists of two stages: training sample set construction and word embedding deep learning.

[0025] The first stage (training sample set construction) comprises two steps: dividing the entity relations of the knowledge graph, and generating the training sample set.

[0026] In step 1, the present invention first takes the knowledge graph as input, computes the information degree of every entity in it, sorts the entities by information degree in descending or ascending order, and then divides...
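Step 1 above can be sketched as follows. The excerpt does not define the exact information-degree formula, so the sketch uses the number of triples an entity participates in as a stand-in measure, and the group count is an illustrative assumption:

```python
from collections import defaultdict

def information_degree(graph):
    """Count how many (head, relation, tail) triples each entity
    participates in -- a stand-in for the patent's information degree,
    whose exact formula is not given in this excerpt."""
    degree = defaultdict(int)
    for head, _, tail in graph:
        degree[head] += 1
        degree[tail] += 1
    return dict(degree)

def divide_entities(graph, num_groups=2):
    """Sort entities by information degree in descending order and split
    them into roughly equal groups, mirroring step 1 of the
    training-sample-set construction stage."""
    degree = information_degree(graph)
    ranked = sorted(degree, key=degree.get, reverse=True)
    size = -(-len(ranked) // num_groups)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

# Hypothetical toy knowledge graph for illustration.
triples = [("Tongji", "locatedIn", "Shanghai"),
           ("Shanghai", "partOf", "China"),
           ("Tongji", "typeOf", "University")]
groups = divide_entities(triples, num_groups=2)
```

Sorting before dividing ensures each group holds entities of comparable informativeness, which is what makes the subsequent per-group sample generation meaningful.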



Abstract

The invention discloses a word embedding deep learning method based on a knowledge graph. In the training sample set construction stage, entity relations in the knowledge graph are first divided according to their semantic intensity, and training samples with different path lengths are then generated from the divided entity relation groups. In the word embedding deep learning stage, a three-task deep neural network structure is constructed from a word2vec encoder, a convolutional neural network, a gated recurrent unit (GRU) network, a softmax classifier, a logistic regression unit, and related components; its parameters are then optimized iteratively, taking the training sample set generated in the previous stage as input. After training is complete, the word embedding encoder composed of the word2vec encoder and the convolutional neural network is retained. Compared with the prior art, the method offers high word embedding accuracy, strong generalization ability, and easy implementation, and can be effectively applied to fields such as big data analysis, electronic commerce, intelligent transportation, medical health, and intelligent manufacturing.
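The components listed in the abstract can be wired together roughly as below. This is a minimal PyTorch sketch of such a structure, not the patented configuration: all layer sizes, the kernel width, and the exact wiring between the embedding, convolutional, and GRU layers are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ThreeTaskNet(nn.Module):
    """Sketch of the abstract's structure: a word2vec-style embedding
    encoder feeding a convolutional layer, whose features go through a
    GRU, topped by a softmax classifier and a logistic-regression head.
    Sizes and wiring are assumptions for illustration."""

    def __init__(self, vocab_size, embed_dim=128, conv_channels=64,
                 hidden_dim=64, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)        # word2vec-style encoder
        self.conv = nn.Conv1d(embed_dim, conv_channels,
                              kernel_size=3, padding=1)         # convolutional feature extractor
        self.gru = nn.GRU(conv_channels, hidden_dim, batch_first=True)
        self.softmax_head = nn.Linear(hidden_dim, num_classes)  # softmax classifier (logits)
        self.logistic_head = nn.Linear(hidden_dim, 1)           # logistic-regression head

    def forward(self, token_ids):
        x = self.embed(token_ids)          # (batch, seq, embed_dim)
        x = self.conv(x.transpose(1, 2))   # (batch, channels, seq)
        x = torch.relu(x).transpose(1, 2)  # back to (batch, seq, channels)
        _, h = self.gru(x)                 # final hidden state: (1, batch, hidden)
        h = h.squeeze(0)
        class_logits = self.softmax_head(h)            # pair with CrossEntropyLoss
        relation_score = torch.sigmoid(self.logistic_head(h))
        return class_logits, relation_score

net = ThreeTaskNet(vocab_size=1000)
logits, score = net(torch.randint(0, 1000, (2, 7)))
```

After training, only `embed` and `conv` would be kept as the word embedding encoder, as the abstract describes; the recurrent and classifier layers serve only to drive the multi-task training signal.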

Description

Technical field

[0001] The present invention relates to the field of computer application technology, and in particular to a machine learning method for word embedding.

Background technique

[0002] Word embedding is an important and widely used technique that converts words of text into one-dimensional numerical vectors that a machine can process. The length of the vector can be set flexibly as needed.

[0003] In the early days of word embedding technology, researchers proposed and used the one-hot method to convert words into one-dimensional vectors. The length of the vector equals the size of the vocabulary; all elements are 0 except for a single component with the value 1, which identifies the current word. For example, "Xiao Ming is a child star" is segmented into "Xiao Ming | is | a | child star". This sentence then contains four words in total, so the code for "Xiao Ming" is "0001", the code for "is" is "0010", the code for "a"...
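The one-hot scheme of [0003] can be reproduced in a few lines. Note the excerpt writes codes as bit strings whose set bit position depends on the word's index; the list representation below places the 1 at the word's vocabulary index, which is the more common convention:

```python
def one_hot(vocab):
    """Build the one-hot codes described in [0003]: each word maps to a
    vector as long as the vocabulary, with exactly one component set
    to 1 marking the current word."""
    return {word: [1 if i == j else 0 for j in range(len(vocab))]
            for i, word in enumerate(vocab)}

# The four-word segmented sentence from the background section.
codes = one_hot(["Xiao Ming", "is", "a", "child star"])
```

The sparsity visible here is exactly the weakness the background section goes on to discuss: such vectors grow with vocabulary size and carry no semantic association between words.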

Claims


Application Information

IPC(8): G06F17/30, G06N3/04, G06N3/08
CPC: G06F16/355, G06F16/367, G06N3/08, G06N3/045
Inventor: 黄震华
Owner: TONGJI UNIV