Word embedding deep learning method based on knowledge graph

A word embedding deep learning technology based on a knowledge graph, applied in the field of machine learning for word embedding. It addresses the lack of semantic association between word vectors, the failure to consider semantic information among training data, and the difficulty of capturing deep feature representations of words, thereby improving generalization ability and accuracy while remaining simple to implement.

Active Publication Date: 2018-02-23
TONGJI UNIV
3 Cites, 30 Cited by

AI Technical Summary

Problems solved by technology

[0006] We found that although the word2vec method can learn the one-dimensional vector representation of words to a certain extent, it is difficult to capture the deep feature representation of words because its tr

Method used




Embodiment Construction

[0022] The technical solution of the present invention is described below with reference to the drawings.

[0023] To remedy the defects of the above-mentioned existing methods, the purpose of the present invention is to provide a word embedding deep learning method with high accuracy, strong generalization ability, and simple, easy implementation. The technical framework is shown in Figure 1.

[0024] The invention consists of two stages: training sample set construction and word embedding deep learning.

[0025] The first stage (training sample set construction) comprises two steps: dividing the entity relations of the knowledge graph, and generating the training sample set.

[0026] In step 1, the present invention first takes the knowledge graph as input, computes the information degree of every entity in it, sorts the entities by information degree in descending or ascending order, and then divides...
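Step 1 above can be sketched as follows. The excerpt does not define the exact information-degree formula, so the sketch uses the number of triples an entity participates in as a stand-in measure, and the group count is an illustrative assumption:

```python
from collections import defaultdict

def information_degree(graph):
    """Count how many (head, relation, tail) triples each entity
    participates in -- a stand-in for the patent's information degree,
    whose exact formula is not given in this excerpt."""
    degree = defaultdict(int)
    for head, _, tail in graph:
        degree[head] += 1
        degree[tail] += 1
    return dict(degree)

def divide_entities(graph, num_groups=2):
    """Sort entities by information degree in descending order and split
    them into roughly equal groups, mirroring step 1 of the
    training-sample-set construction stage."""
    degree = information_degree(graph)
    ranked = sorted(degree, key=degree.get, reverse=True)
    size = -(-len(ranked) // num_groups)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

# Hypothetical toy knowledge graph for illustration.
triples = [("Tongji", "locatedIn", "Shanghai"),
           ("Shanghai", "partOf", "China"),
           ("Tongji", "typeOf", "University")]
groups = divide_entities(triples, num_groups=2)
```

Sorting before dividing ensures each group holds entities of comparable informativeness, which is what makes the subsequent per-group sample generation meaningful.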



Abstract

The invention discloses a word embedding deep learning method based on a knowledge graph. In the training sample set construction stage, entity relations in the knowledge graph are first divided according to their semantic intensity, and training samples with different path lengths are then generated from the divided entity relation groups. In the word embedding deep learning stage, a three-task deep neural network structure is constructed from a word2vec encoder, a convolutional neural network, a gated recurrent unit (GRU) network, a softmax classifier, a logistic regression unit, and related components; its parameters are then optimized iteratively, taking the training sample set generated in the previous stage as input. After training is complete, the word embedding encoder composed of the word2vec encoder and the convolutional neural network is retained. Compared with the prior art, the method offers high word embedding accuracy, strong generalization ability, and easy implementation, and can be effectively applied to fields such as big data analysis, electronic commerce, intelligent transportation, medical health, and intelligent manufacturing.
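The components listed in the abstract can be wired together roughly as below. This is a minimal PyTorch sketch of such a structure, not the patented configuration: all layer sizes, the kernel width, and the exact wiring between the embedding, convolutional, and GRU layers are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ThreeTaskNet(nn.Module):
    """Sketch of the abstract's structure: a word2vec-style embedding
    encoder feeding a convolutional layer, whose features go through a
    GRU, topped by a softmax classifier and a logistic-regression head.
    Sizes and wiring are assumptions for illustration."""

    def __init__(self, vocab_size, embed_dim=128, conv_channels=64,
                 hidden_dim=64, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)        # word2vec-style encoder
        self.conv = nn.Conv1d(embed_dim, conv_channels,
                              kernel_size=3, padding=1)         # convolutional feature extractor
        self.gru = nn.GRU(conv_channels, hidden_dim, batch_first=True)
        self.softmax_head = nn.Linear(hidden_dim, num_classes)  # softmax classifier (logits)
        self.logistic_head = nn.Linear(hidden_dim, 1)           # logistic-regression head

    def forward(self, token_ids):
        x = self.embed(token_ids)          # (batch, seq, embed_dim)
        x = self.conv(x.transpose(1, 2))   # (batch, channels, seq)
        x = torch.relu(x).transpose(1, 2)  # back to (batch, seq, channels)
        _, h = self.gru(x)                 # final hidden state: (1, batch, hidden)
        h = h.squeeze(0)
        class_logits = self.softmax_head(h)            # pair with CrossEntropyLoss
        relation_score = torch.sigmoid(self.logistic_head(h))
        return class_logits, relation_score

net = ThreeTaskNet(vocab_size=1000)
logits, score = net(torch.randint(0, 1000, (2, 7)))
```

After training, only `embed` and `conv` would be kept as the word embedding encoder, as the abstract describes; the recurrent and classifier layers serve only to drive the multi-task training signal.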

Description

Technical field

[0001] The present invention relates to the field of computer application technology, and in particular to a machine learning method for word embedding.

Background technique

[0002] Word embedding is an important and widely used technique that converts words of text into one-dimensional numerical vectors that a machine can process. The length of the vector can be set flexibly as needed.

[0003] In the early days of word embedding technology, researchers proposed and used the one-hot method to convert words into one-dimensional vectors. The length of the vector equals the size of the vocabulary; all elements are 0 except for a single component with the value 1, which identifies the current word. For example, "Xiao Ming is a child star" is segmented into "Xiao Ming | is | a | child star". This sentence then contains four words in total, so the code for "Xiao Ming" is "0001", the code for "is" is "0010", the code for "a"...
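The one-hot scheme of [0003] can be reproduced in a few lines. Note the excerpt writes codes as bit strings whose set bit position depends on the word's index; the list representation below places the 1 at the word's vocabulary index, which is the more common convention:

```python
def one_hot(vocab):
    """Build the one-hot codes described in [0003]: each word maps to a
    vector as long as the vocabulary, with exactly one component set
    to 1 marking the current word."""
    return {word: [1 if i == j else 0 for j in range(len(vocab))]
            for i, word in enumerate(vocab)}

# The four-word segmented sentence from the background section.
codes = one_hot(["Xiao Ming", "is", "a", "child star"])
```

The sparsity visible here is exactly the weakness the background section goes on to discuss: such vectors grow with vocabulary size and carry no semantic association between words.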

Claims


Application Information

IPC(8): G06F17/30, G06N3/04, G06N3/08
CPC: G06F16/355, G06F16/367, G06N3/08, G06N3/045
Inventor: 黄震华
Owner: TONGJI UNIV