A word vector incremental learning method, device and electronic equipment

Incremental learning for word vectors, applied in the field of machine learning, addresses the problem that word vectors must be re-learned when new words are added, and reduces cost and time consumption.

Active Publication Date: 2020-01-24
GUANGZHOU LIZHI NETWORK TECH CO LTD

AI Technical Summary

Problems solved by technology

[0006] The embodiments of the present invention propose a word vector incremental learning method, device, and electronic equipment to solve the problem that word vectors must be re-learned whenever new words are added.

Embodiment Construction

[0077] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0078] As shown in Figure 1, following the principle of word2vec, only one of its training methods is used here: word vectors are trained with the negative sampling method, which includes the following steps:

[0079] 1. If the word vector table does not exist, initialize it randomly, so that each word wordi (i = 1, 2, 3, ..., N) corresponds to an N-dimensional array (N is a positive integer) that serves as its vector (i.e., its word vector).

[0080] 2. Obtain the training corpus prepared in advance, in which each row represents a time series.

[0081] 3. Use a sliding window to collect each central word and its surrounding context (that is, the other words in the sliding window apart from the central word) to...
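
The first three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `init_table` and `window_pairs`, the dictionary layout of the table, the initialization range, and the toy corpus are all hypothetical. Because step 3 is truncated in the source, the sketch stops at collecting (central word, context) pairs and does not guess what is done with them.

```python
import random

def init_table(vocab, dim, table=None, seed=0):
    """Step 1: create a random word-vector table only if none exists yet."""
    if table is not None:
        return table  # an existing table is reused untouched
    rng = random.Random(seed)
    # Small random values; the exact range is an assumption of this sketch.
    return {w: [rng.uniform(-0.5, 0.5) / dim for _ in range(dim)] for w in vocab}

def window_pairs(rows, window=2):
    """Steps 2-3: slide a window over each corpus row and yield
    (central word, list of the other words inside the window)."""
    for row in rows:
        for i, center in enumerate(row):
            lo = max(0, i - window)
            hi = min(len(row), i + window + 1)
            context = [row[j] for j in range(lo, hi) if j != i]
            yield center, context

# Toy corpus (hypothetical): each row is one time series of tokens.
corpus = [["a", "b", "c", "d"], ["b", "d"]]
vocab = sorted({w for row in corpus for w in row})
table = init_table(vocab, dim=8)
pairs = list(window_pairs(corpus, window=1))
```

With `window=1`, the second pair collected from the first row is `("b", ["a", "c"])`: the central word `b` together with its immediate neighbours.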

Abstract

The embodiments of the invention provide an incremental learning method and device for word embeddings, and electronic equipment. The method comprises: acquiring new words; constructing an incremental word embedding model for the new words on the basis of a trained base word embedding model; acquiring a training corpus; and training the incremental word embedding model on the training corpus to obtain the word embeddings. The method draws on the characteristics of transfer learning to learn word embeddings incrementally, so the embeddings need not be re-learned from scratch, which greatly reduces time and resource consumption.
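
One way to picture the incremental model is sketched below. It is an illustration under loud assumptions, not the claimed method: the abstract does not say how the base model is reused, so freezing the trained base vectors and updating only the rows of new words is an assumption made here in the spirit of transfer learning. The separate input and output tables follow common word2vec practice, and the names `extend_table` and `sgns_step`, the learning rate, and the toy vectors are hypothetical.

```python
import math
import random

def extend_table(base, new_words, dim, seed=0):
    """Copy a trained base table and add random rows for unseen words only."""
    rng = random.Random(seed)
    table = {w: list(v) for w, v in base.items()}
    added = set()
    for w in new_words:
        if w not in table:
            table[w] = [rng.uniform(-0.5, 0.5) / dim for _ in range(dim)]
            added.add(w)
    return table, added

def sigmoid(x):
    x = max(-30.0, min(30.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))

def sgns_step(in_vecs, out_vecs, trainable, center, context, negatives, lr=0.05):
    """One negative-sampling gradient step.
    ASSUMPTION: only words in `trainable` (the new words) are updated;
    vectors inherited from the base model stay frozen."""
    v = in_vecs[center]
    grad_v = [0.0] * len(v)
    for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = out_vecs[word]
        g = sigmoid(sum(a * b for a, b in zip(v, u))) - label
        for k in range(len(v)):
            grad_v[k] += g * u[k]       # accumulate with the old u[k]
            if word in trainable:
                u[k] -= lr * g * v[k]   # update output vector of a new word
    if center in trainable:
        for k in range(len(v)):
            v[k] -= lr * grad_v[k]      # update input vector of a new word

# Toy base model (hypothetical values) and one new word "c".
base_in = {"a": [0.1, 0.2], "b": [0.3, -0.1]}
base_out = {"a": [0.0, 0.1], "b": [0.2, 0.2]}
in_vecs, added = extend_table(base_in, ["b", "c"], dim=2, seed=1)
out_vecs, _ = extend_table(base_out, ["b", "c"], dim=2, seed=2)
sgns_step(in_vecs, out_vecs, added, center="c", context="a", negatives=["b"])
```

In this sketch only the row for `c` moves during training, so the cost of each step does not depend on re-fitting the vectors that were already learned.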

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to a word vector incremental learning method, device, and electronic equipment.

Background technique

[0002] The word2vec algorithm is a fast synonym algorithm. It was originally used to turn a word into an embedding (vector), mapping a large number of discrete id categories to points in an N-dimensional space with tens to hundreds of dimensions.

[0003] The word2vec algorithm is also used to train word vectors on sequences that have semantic relevance.

[0004] For example, in a recommendation scenario, the records a user browses in each session, or records separated by relatively short time intervals, can be regarded as a continuous time series, so the word2vec algorithm can be applied.

[0005] At present, as new words are added, the word vectors usually need to be expanded, and the word vectors then have to be re-learned.

Contents of the inven...
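
The recommendation scenario in paragraph [0004] can be illustrated with a small sketch that turns browsing records into corpus rows. Everything specific here is an assumption for illustration: the `(user, timestamp, item)` record layout, the function name `sessionize`, and the 1800-second gap threshold are hypothetical and do not come from the patent.

```python
from collections import defaultdict

def sessionize(events, max_gap=1800):
    """Split each user's (user, timestamp, item) events into item sequences,
    starting a new sequence when two consecutive events are more than
    max_gap seconds apart. Each returned row is one time series."""
    by_user = defaultdict(list)
    for user, ts, item in events:
        by_user[user].append((ts, item))
    rows = []
    for user in sorted(by_user):
        last_ts = None
        for ts, item in sorted(by_user[user]):
            if last_ts is None or ts - last_ts > max_gap:
                rows.append([])  # gap too large: open a new session
            rows[-1].append(item)
            last_ts = ts
    return rows

# Toy browsing log (hypothetical): user u1 returns after a long pause.
events = [("u1", 0, "x"), ("u1", 60, "y"), ("u1", 5000, "z"), ("u2", 10, "x")]
rows = sessionize(events)
```

Each resulting row plays the role of a sentence, and each item id plays the role of a word, which is what lets a word-vector method be applied to browsing data.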

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/242, G06F40/284, G06F16/36
CPC: G06F40/242, G06F40/284
Inventor: 庄正中
Owner: GUANGZHOU LIZHI NETWORK TECH CO LTD