Method and device for generating word vectors

A technique based on word vectors and center vectors, applied in fields such as instruments, computing, and electrical digital data processing. It addresses problems such as the inability to generate a word vector for a word missing from the training set, long training time, and a large amount of computation.

Inactive Publication Date: 2017-05-24
NUBIA TECHNOLOGY CO LTD
5 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

However, when a word is missing from the training set, the word vector for that word cannot be generated. Consequently, when a word vector must be generated for a new word, it is necessary to add the tra...

Examples

Embodiment 1

[0060] Based on the foregoing embodiments, an embodiment of the present invention provides a method for generating word vectors. The method is applied to a device for generating word vectors, which can be set in the above-mentioned terminal. The functions implemented by the method can be realized by the processor in the terminal calling program code, and the program code can be stored in a computer storage medium. The terminal therefore includes at least a processor and a storage medium.

[0061] This embodiment provides a method for generating word vectors. Figure 3 is a schematic flow chart of the method for generating word vectors in Embodiment 1 of the present invention. As shown in Figure 3, the method includes:

[0062] S301: Perform word2vec processing on the acquired first training word segmentation set to obtain a word vector of each training word segme...

Embodiment 2

[0084] Based on the foregoing embodiments, this embodiment provides a method for generating word vectors, which can be applied to a terminal. The functions implemented by the method can be realized by the processor in the terminal calling program code, and the program code can be stored in a computer storage medium. The terminal therefore includes at least a processor and a storage medium.

[0085] On the basis of Embodiment 1 above, Figure 4 is an optional schematic flow chart of the method for generating word vectors in Embodiment 2 of the present invention. As shown in Figure 4, S305 may include:

[0086] S401: From the second training word segmentation set, select the training word segments that meet the preset conditions;

[0087] Specifically, there are many ways to select training word segments that meet the preset conditions from the second training word segment set. Here, in order to select...
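
The preset conditions themselves are cut off in this excerpt, so the following is only an illustrative sketch of S401. It assumes, hypothetically, that a training word segment qualifies when it also appears in the first training word segmentation set and occurs at least `min_count` times in the second set. The function name, the parameter names, and both conditions are assumptions, not the rule stated in the patent.

```python
from collections import Counter


def select_training_words(second_set, first_vocabulary, new_word, min_count=2):
    """Pick words from the second set that satisfy the assumed preset conditions.

    Assumed conditions (hypothetical, not taken from the source):
      1. the word is not the new word itself,
      2. the word also exists in the first training set's vocabulary,
      3. the word occurs at least `min_count` times in the second set.
    """
    counts = Counter(second_set)
    return sorted(
        word
        for word, count in counts.items()
        if word != new_word and word in first_vocabulary and count >= min_count
    )


# Toy data: "nubia" plays the role of the new word.
second_set = ["nubia", "phone", "screen", "phone", "camera", "screen", "phone"]
first_vocabulary = {"phone", "screen", "battery"}
selected = select_training_words(second_set, first_vocabulary, "nubia")
print(selected)
```

Restricting the selection to words known to both sets is one natural choice, because only those words have a vector in each of the two trained models.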

Embodiment 3

[0124] Based on the aforementioned method embodiments, this embodiment provides a device for generating word vectors, which can be set in a terminal. The first processing module, receiving module, acquiring module, second processing module and determining module in the device can all be realized by the processor in the terminal, and can of course also be realized by a specific logic circuit. In a specific embodiment, the processor can be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), etc.

[0125] This embodiment provides a device for generating word vectors. Figure 6 is a schematic structural diagram of the word vector generation device in Embodiment 3 of the present invention. As shown in Figure 6, the terminal includes a first processing module 61, a receiving module 62, an acquiring module 63, a second processing module 64 and a determining module 65,

[0...
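
As a rough sketch of how the five modules could cooperate, the dataclass below wires them together as plain callables. The role given to each module is inferred from its name and from the steps listed in the abstract; the class name `WordVectorDevice`, the `run` method, and the callable signatures are hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass
from typing import Callable

Vectors = dict[str, list[float]]


@dataclass
class WordVectorDevice:
    # Each field stands in for one module of the device (numbers as in Figure 6).
    first_processing: Callable[[list[str]], Vectors]   # module 61: vectors for the first set
    receiving: Callable[[], str]                        # module 62: receives the new word
    acquiring: Callable[[str], list[str]]               # module 63: builds the second set
    second_processing: Callable[[list[str]], Vectors]  # module 64: vectors for the second set
    determining: Callable[[str, Vectors, Vectors], list[float]]  # module 65

    def run(self, first_set: list[str]) -> list[float]:
        """Run the modules in the order the method steps are described."""
        first_vectors = self.first_processing(first_set)
        new_word = self.receiving()
        second_set = self.acquiring(new_word)
        second_vectors = self.second_processing(second_set)
        return self.determining(new_word, first_vectors, second_vectors)


# Stub modules, for illustration only; real modules would call a word2vec trainer.
device = WordVectorDevice(
    first_processing=lambda words: {w: [1.0] for w in words},
    receiving=lambda: "nubia",
    acquiring=lambda word: [word, "phone"],
    second_processing=lambda words: {w: [2.0] for w in words},
    determining=lambda word, first, second: second[word],
)
result = device.run(["phone"])
```

Passing the modules in as callables mirrors the statement that each one may be realized either by the processor or by a dedicated logic circuit: the wiring stays the same whichever implementation sits behind it.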

Abstract

The embodiment of the invention discloses a method for generating word vectors. The method comprises: performing word2vec processing on an acquired first training word segmentation set to obtain the word vector of each training word segment in the first set; receiving a new word segment; obtaining training text for the new word segment, segmenting that text to obtain the training word segments of the new word segment, and forming a second training word segmentation set from the new word segment and its training word segments; performing word2vec processing on the second set to obtain the word vector of each training word segment in the second set; and determining the word vector of the new word segment, as added to the first training word segmentation set, according to the word vector of each training word segment in the first set and the word vector of each training word segment in the second set. The embodiment of the invention further discloses a device for generating word vectors. The method and device aim to avoid the computation of retraining on the full training set when a new word segment is added, and to shorten the time needed to generate word vectors when a new word segment is added.
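
The abstract does not say how the final vector is determined from the two sets of vectors. The sketch below shows one possible reading, under loudly stated assumptions: it takes the words shared by both sets, computes their center (mean) vector in each set, and carries the new word's offset from the second-set center over to the first-set center. The function names, the use of a mean as the center vector, and the translation-only alignment are all assumptions; two separately trained word2vec spaces can also differ by rotation, which this simplification ignores.

```python
from statistics import fmean

Vector = list[float]


def center(vectors: list[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [fmean(components) for components in zip(*vectors)]


def place_new_word(
    new_word: str,
    first_vectors: dict[str, Vector],
    second_vectors: dict[str, Vector],
) -> Vector:
    """Estimate the new word's vector in the first set's space (assumed rule)."""
    if new_word not in second_vectors:
        raise KeyError(f"{new_word!r} has no vector in the second set")
    # Words that have a vector in both sets act as anchors.
    shared = [w for w in second_vectors if w in first_vectors and w != new_word]
    if not shared:
        raise ValueError("no shared training words to anchor the new word")
    center_first = center([first_vectors[w] for w in shared])
    center_second = center([second_vectors[w] for w in shared])
    # Offset of the new word from the anchors' center in the second space.
    offset = [a - b for a, b in zip(second_vectors[new_word], center_second)]
    # Apply the same offset around the anchors' center in the first space.
    return [c + o for c, o in zip(center_first, offset)]


# Toy 2-dimensional vectors standing in for word2vec output.
first = {"phone": [1.0, 0.0], "screen": [3.0, 2.0]}
second = {"phone": [0.0, 0.0], "screen": [2.0, 2.0], "nubia": [1.0, 3.0]}
estimate = place_new_word("nubia", first, second)
```

The point of any such rule is that only the small second set needs training when a new word arrives; the vectors of the first set are reused unchanged.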

Description

Technical field

[0001] The present invention relates to the technical field of the Internet, in particular to a method and device for generating word vectors.

Background art

[0002] With the development of machine learning technology, researchers hope to apply machine learning algorithms to language models to advance language model research and applications. Word2Vec is one such technology. Using Word2Vec, words or phrases can be converted into vectors of a specified dimension, called word vectors. In this way, the processing of text content can be simplified to vector operations in a vector space, and the similarity of vectors in that space can be calculated to represent the semantic similarity of words.

[0003] However, when the existing Word2Vec algorithm trains a model, it reads in all the training set data at one time, and after a long period of training, a model is obtained, which collects all...
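
The background's remark that vector similarity can stand in for semantic similarity can be made concrete with a small example. The source does not name a similarity measure; cosine similarity is assumed here because it is a common choice for word vectors, and the function name and toy vectors are illustrative only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors: same direction scores near 1, orthogonal vectors score 0.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Values near 1 indicate vectors pointing in nearly the same direction, which is how two words would be judged semantically close under this measure.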

Application Information

IPC(8): G06F17/27
CPC: G06F40/289
Inventors: 姬晨, 王凯, 张淑燕
Owner: NUBIA TECHNOLOGY CO LTD