Word vector configuration method and device, storage medium and electronic device

A word vector configuration method, applied in the field of neural networks, that addresses the loss of accuracy in training tasks caused by randomly assigned word vectors, with the effect of reducing time consumption and improving accuracy.
CN110413990A · Pending · Publication Date: 2019-11-05 · PING AN TECH (SHENZHEN) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
PING AN TECH (SHENZHEN) CO LTD
Publication Date
2019-11-05


Abstract

The invention provides a word vector configuration method and device, a storage medium, and an electronic device. The method comprises: determining a first word for which an initial word vector is to be configured; judging whether the first word is in a word vector dictionary, where the word vector dictionary stores a one-to-one correspondence between multiple words and multiple word vectors; if the first word is not in the word vector dictionary, decomposing the first word into strokes to obtain a stroke sequence; calculating the similarity between the stroke sequence of each word in the word vector dictionary and the stroke sequence of the first word; and determining the word vector of the word whose stroke sequence is most similar to that of the first word, and configuring that word vector as the initial word vector of the first word. The method and device solve the technical problem in the related art that the accuracy of subsequent training tasks is reduced when word vectors for unregistered words are configured by random assignment.
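The steps listed in the abstract can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the stroke table `STROKES` is a tiny hand-written example using a five-category stroke code, the helper names are hypothetical, and `difflib.SequenceMatcher` stands in for the similarity measure, which the abstract does not name.

```python
from difflib import SequenceMatcher

# Illustrative stroke table (assumption): 1=horizontal, 2=vertical,
# 3=left-falling, 4=right-falling or dot, 5=turning.
STROKES = {"大": "134", "木": "1234", "林": "12341234", "森": "123412341234"}


def stroke_sequence(word):
    """Concatenate the stroke codes of every character in the word."""
    return "".join(STROKES[ch] for ch in word)


def stroke_similarity(a, b):
    """Stand-in similarity in [0, 1]; the source does not specify a measure."""
    return SequenceMatcher(None, a, b).ratio()


def initial_vector(word, word_vectors):
    """Return the dictionary vector if present, else borrow the vector of
    the dictionary word whose stroke sequence is most similar."""
    if word in word_vectors:
        return list(word_vectors[word])
    target = stroke_sequence(word)
    nearest = max(
        word_vectors,
        key=lambda w: stroke_similarity(stroke_sequence(w), target),
    )
    # Copy so later training of the new word does not alter the donor vector.
    return list(word_vectors[nearest])


# Hypothetical two-dimensional vectors, for illustration only.
word_vectors = {"大": [0.9, 0.1], "木": [0.2, 0.7], "林": [0.3, 0.8]}
vec = initial_vector("森", word_vectors)
```

In this toy setup the unregistered word "森" shares its longest stroke run with "林", so it starts from that word's vector instead of a random one. A real system would need a full stroke dictionary and would likely index the stroke sequences rather than scan every dictionary entry.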

Description

Technical field

[0001] The present invention relates to the field of neural networks, in particular to a word vector configuration method, device, storage medium, and electronic device.

Background

[0002] When processing text data, the most basic steps are usually word segmentation and word vector training (for example, using the word2vec method), after which subsequent tasks such as text comparison and classification are performed on the word vectors. In practice, the text to be processed often contains new words (unregistered words) that are not in the word vector dictionary. The usual approach is to assign word vectors to unregistered words at random. However, a randomly assigned word vector does not use the semantic information of the new word, which reduces the accuracy of subsequent tasks.
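For contrast, the random-assignment baseline described in this paragraph might look like the sketch below. The function name, the vector dimension, and the uniform range are assumptions chosen for illustration; the source does not say how the random vectors are drawn.

```python
import random


def embed_tokens(tokens, word_vectors, dim, rng):
    """Look up each token; unregistered words get a random vector (baseline).

    Note: new entries are added to word_vectors in place.
    """
    vectors = []
    for tok in tokens:
        if tok not in word_vectors:
            # Random initialisation carries no information about the word itself.
            word_vectors[tok] = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
        vectors.append(word_vectors[tok])
    return vectors


rng = random.Random(0)  # fixed seed so the sketch is repeatable
word_vectors = {"文本": [0.1, 0.4], "分类": [0.3, -0.2]}  # hypothetical vectors
vectors = embed_tokens(["文本", "分类", "新词"], word_vectors, dim=2, rng=rng)
```

Here "新词" is not in the dictionary, so its vector is unrelated to any registered word, which is the weakness the paragraph points out.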

[0003] Aiming at the above-mentioned problems existing in relate...
