Two-stage semantic word vector generation method

A two-stage technique for generating semantic word vectors. It addresses problems such as large space requirements, quality degradation, and data sparseness, and yields higher-quality word vectors.

CN111027595A | Active | Publication Date: 2020-04-17 | UNIV OF ELECTRONICS SCI & TECH OF CHINA

Embodiment Construction

[0026] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention.

[0027] The two-stage semantic word vector generation method proposed by the present invention is divided into three stages comprising five steps. The first stage is text matrixization; the second stage comprises two steps, feature extractor construction and semantic recognition; the third stage comprises two steps, neural language model construction and semantic word vector generation.

[0028] Step 1: Text Matrixization

[0029] Select the clauses $s_i$ containing the polysemous word $w$ from the obtained text, forming a set $D_w = \{s_1, s_2, s_3, \dots\}$ (that is, the set of clauses containing the ambiguous word); this clause $s_i$ and the polysemous word $w$ in the sense item category $c$ of this ...
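The clause selection described in this step might be sketched as follows. This is a minimal illustration, not the patented procedure: the function name `collect_clauses`, the punctuation set used to split clauses, and the plain substring match are all assumptions made for the example.

```python
import re

# Assumption: clauses end at common Chinese or English punctuation marks.
CLAUSE_DELIMITERS = r"[,.;!?，。；！？]"


def collect_clauses(text, w):
    """Return D_w, the list of clauses in `text` that contain the word `w`.

    Hypothetical helper: a plain substring test stands in for whatever
    word matching the original method uses.
    """
    clauses = [c.strip() for c in re.split(CLAUSE_DELIMITERS, text)]
    return [c for c in clauses if c and w in c]


text = "I sat by the river bank. The bank approved my loan, and I left."
D_w = collect_clauses(text, "bank")
print(D_w)  # ['I sat by the river bank', 'The bank approved my loan']
```

Each clause in `D_w` would then be paired with a sense label before being turned into a matrix; that labeling is not shown here.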

Abstract

The invention provides a two-stage semantic word vector generation method comprising five steps: text matrixization; feature extractor construction; semantic recognition; neural language model construction; and semantic word vector generation. The method uses multiple neural networks to generate a separate word vector for each sense of a polysemous word, overcoming the defect of traditional word-level embedding, in which a polysemous word corresponds to only one word vector, while keeping the size of the corpus used within an acceptable range. It also combines a convolutional neural network (CNN) with a support vector machine (SVM): the CNN contributes its feature extraction capability and the SVM its generalization and robustness, so that word sense recognition improves and the generated semantic word vectors are of higher quality.
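To illustrate the idea of one vector per sense rather than one vector per word, a minimal sketch in Python follows. The class name `SenseEmbeddings`, the integer sense ids, and the random initialisation are hypothetical; in the method described above the vectors would come from the neural language model, which this sketch does not implement.

```python
import random


class SenseEmbeddings:
    """Toy lookup table holding one vector per (word, sense) pair.

    Assumption: senses are identified by small integers. The vectors are
    random placeholders, not trained embeddings.
    """

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}

    def vector(self, word, sense=0):
        key = (word, sense)
        if key not in self.table:
            # Create the placeholder vector the first time a pair is seen.
            self.table[key] = [
                self.rng.uniform(-0.5, 0.5) for _ in range(self.dim)
            ]
        return self.table[key]


emb = SenseEmbeddings(dim=4)
v_river = emb.vector("bank", sense=0)  # e.g. "river bank"
v_money = emb.vector("bank", sense=1)  # e.g. "savings bank"
print(len(emb.table))  # 2
```

A conventional word-level table would be keyed by `word` alone, so both uses of "bank" would share a single vector.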

Description

Technical field

[0001] The invention belongs to the field of neural networks, and in particular relates to a two-stage semantic word vector generation method.

Background technique

[0002] Word representation is one of the key issues in natural language processing. Whether the word representation method is appropriate directly affects the modeling methods of tasks such as syntactic analysis, semantic representation and text understanding, and also affects the accuracy and robustness of application systems such as information retrieval and question answering systems.

[0003] At present, the representation strategies of Chinese words can be summarized into three types: traditional 0-1 representation, distributed representation based on latent semantic information and distributed representation based on neural network language model. The traditional 0-1 representation has two problems: on the one hand, the 0-1 representation causes data sparsity, which makes the word vectors ...
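The sparsity of the 0-1 (one-hot) representation can be seen in a small Python sketch. The four-word vocabulary and the helper names `one_hot` and `dot` are assumptions chosen for illustration. Each vector has as many entries as the vocabulary has words, with a single nonzero entry, and it is a well-known property of this encoding that any two distinct words have a dot product of zero.

```python
def one_hot(word, vocab):
    """Return the 0-1 vector for `word`: all zeros except at its index."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy vocabulary; a real one has tens of thousands of words.
vocab = ["bank", "river", "loan", "money"]
bank = one_hot("bank", vocab)
loan = one_hot("loan", vocab)

print(bank)             # [1, 0, 0, 0]
print(dot(bank, loan))  # 0
print(dot(bank, bank))  # 1
```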

Application Information

Publication: CN111027595A (17 Apr 2020)
IPC: G06K9/62; G06N3/04; G06N3/08; G06F40/30
CPC: G06N3/08; G06N3/048; G06N3/045; G06F18/2411; G06F18/214
Inventors: 桂盛霖; 刘一飞