
Word vector generation method and related equipment

A word vector generation technology, applied in the field of natural language processing, that addresses problems such as existing methods not considering the importance of different Chinese characters, and improves the expressive ability of word vectors

Active Publication Date: 2020-05-26
BEIJING GRIDSUM TECH CO LTD

AI Technical Summary

Problems solved by technology

Existing word vector generation methods do not consider the importance of the different Chinese characters within a word, and still need to be improved.



Embodiment Construction

[0067] The embodiment of the present invention provides a method for generating word vectors and related equipment. Through an attention mechanism, the importance of each character within a word is incorporated when converting character vectors into word vectors, and the weighted average of the character vectors is combined with the original word vector to obtain the final word vector, which can effectively improve the expressive ability of the word vector.

[0068] The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of the present invention and the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments described herein can be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprising" and "having", as...


Abstract

The embodiment of the invention provides a word vector generation method and related equipment. Through an attention mechanism, the method incorporates the importance of each character within a word when converting character vectors into word vectors, and combines the weighted average of the character vectors with the original word vector to obtain the final word vector, which can effectively improve the expressive ability of the word vector. The method comprises: obtaining a target word, where the target word is the word for which a word vector is to be generated; determining, through a training model, an initial word vector of the target word, a character vector of each character in the target word, and a global variable; determining the weight of each character in the target word from the global variable and the character vector of each character; and determining a target word vector of the target word according to the character vector of each character, the weight of each character in the target word, and the initial word vector of the target word.
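The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the weights are computed or how the weighted average is combined with the initial word vector, so two assumptions are made here: the weight of each character is a softmax over the dot product between its character vector and the global variable, and the combination is an element-wise sum. The function names and toy vectors are hypothetical and are not taken from the patent.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def target_word_vector(initial_word_vec, char_vecs, global_vec):
    """Combine an initial word vector with attention-weighted character vectors.

    Assumption (not stated in the source): each character's weight is
    softmax(char_vec . global_vec).
    Assumption (not stated in the source): the weighted average is combined
    with the initial word vector by element-wise addition.
    """
    weights = softmax([dot(c, global_vec) for c in char_vecs])
    dim = len(initial_word_vec)
    weighted_avg = [
        sum(w * c[i] for w, c in zip(weights, char_vecs)) for i in range(dim)
    ]
    target = [a + b for a, b in zip(initial_word_vec, weighted_avg)]
    return weights, target


# Toy values standing in for what a training model would supply.
initial = [0.5, -0.2, 0.1]                  # initial word vector
chars = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # one vector per character
u = [2.0, 0.0, 0.0]                         # global variable

weights, vec = target_word_vector(initial, chars, u)
print(weights)
print(vec)
```

In this toy case the first character aligns more closely with the global variable, so it receives the larger weight and contributes more to the final vector.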

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a method for generating word vectors and related equipment.

Background

[0002] Text is a carrier of information and plays an important role in the development of society. For computers to handle natural language problems, discrete text must first be represented numerically. The simplest way is to use One-hot Representation to convert each word into a |V|-dimensional vector, where |V| is the size of the vocabulary: the position corresponding to the word's index is 1, and all other positions are 0. In 2003, Yoshua Bengio et al. first applied neural networks to language models and proposed using Distributed Representation instead of the traditional One-hot Representation to represent word vectors, making word vectors not only computable but also meaningful. In 2013, Mikolov et al. proposed the Continuous Bag of Wo...
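The One-hot Representation described in the background can be sketched in a few lines of Python. The vocabulary and the helper name `one_hot` are hypothetical, chosen only for illustration.

```python
def one_hot(word, vocabulary):
    """Return a |V|-dimensional vector with 1 at the word's index, 0 elsewhere."""
    index = vocabulary.index(word)  # raises ValueError if the word is unknown
    vec = [0] * len(vocabulary)
    vec[index] = 1
    return vec


vocab = ["natural", "language", "processing"]  # toy vocabulary, |V| = 3
print(one_hot("language", vocab))  # [0, 1, 0]
```

Any two distinct one-hot vectors are orthogonal, so this encoding carries no notion of similarity between words, which is the limitation that distributed representations address.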


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/284
CPC: Y02D10/00
Inventor: 陈华杰
Owner: BEIJING GRIDSUM TECH CO LTD