Word definition generation method based on recurrent neural network and latent variable structure

A technique combining a recurrent neural network with latent variables, applied in the field of natural language processing, which addresses problems such as word polysemy and produces high-quality, easy-to-understand definitions.

Active Publication Date: 2019-08-02
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

On the basis of a recurrent neural network, the present invention uses a variational autoencoder (VAE) to model definitions. By combining latent-variable features, it extracts the sense of a defined word from its context information and generates a definition for that sense, making up for the shortcoming of existing methods that cannot take context into account, and thereby solving the problem of polysemy.
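As a rough illustration of the idea of extracting a word's sense from its context with a recurrent network, the following is a minimal sketch, assuming a bidirectional LSTM encoder and mean pooling; all module names, sizes, and wiring are assumptions for demonstration, not the patent's exact design.

```python
# Illustrative sketch only: encode the context of a defined word with a
# recurrent network so that the sense active in that context can be summarised.
import torch
import torch.nn as nn

class ContextSemanticExtractor(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional LSTM reads the sentence containing the defined word.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)

    def forward(self, context_ids):
        # context_ids: (batch, seq_len) token ids of the usage context
        embedded = self.embedding(context_ids)
        outputs, _ = self.encoder(embedded)      # (batch, seq_len, 2*hidden)
        # Mean-pool over time as a simple summary of the context meaning.
        return outputs.mean(dim=1)               # (batch, 2*hidden)
```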

Method used




Embodiment Construction

[0070] Figure 2 is a schematic diagram of the model structure of the word definition generation method based on a recurrent neural network and a latent variable structure according to the present invention. This embodiment includes a context semantic extractor, a definition variational autoencoder, and a definition generation decoder.
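To make the three named components concrete, here is a hedged sketch of how a context semantic extractor, a definition variational autoencoder, and a definition generation decoder could be wired together. The dimensions, pooling, teacher forcing, and conditioning scheme are illustrative assumptions, not taken from the patent text.

```python
# Hedged sketch of the three components: context extractor, definition VAE,
# and definition decoder. Sizes and wiring are assumptions for illustration.
import torch
import torch.nn as nn

class DefinitionVAE(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=256, latent_dim=64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # Context semantic extractor: BiLSTM over the usage context.
        self.context_encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                       bidirectional=True)
        # Definition VAE: encode the reference definition into a latent variable.
        self.definition_encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        # Definition generation decoder: LSTM conditioned on context + latent.
        self.decoder = nn.LSTM(emb_dim + 2 * hidden_dim + latent_dim,
                               hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, context_ids, definition_ids):
        # Summarise the context of the defined word.
        ctx_out, _ = self.context_encoder(self.embedding(context_ids))
        ctx_vec = ctx_out.mean(dim=1)                         # (batch, 2*hidden)
        # Encode the reference definition into a Gaussian latent variable.
        _, (h_def, _) = self.definition_encoder(self.embedding(definition_ids))
        mu, logvar = self.to_mu(h_def[-1]), self.to_logvar(h_def[-1])
        z = self.reparameterize(mu, logvar)
        # Decode the definition token by token (teacher forcing shown here).
        dec_in = self.embedding(definition_ids[:, :-1])
        cond = torch.cat([ctx_vec, z], dim=-1)
        cond = cond.unsqueeze(1).expand(-1, dec_in.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([dec_in, cond], dim=-1))
        logits = self.out(dec_out)                            # next-token scores
        return logits, mu, logvar
```

At inference time the definition encoder would be bypassed and the latent variable sampled from the prior, so the decoder generates a definition from the context alone; that split is the usual VAE design choice and is assumed here rather than quoted from the patent.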

[0071] Some basic concepts involved in the present invention and their interrelationships are as follows:

[0072] 1. Vocabulary: the set of all words included in the dictionary, that is, all defined words;

[0073] 2. Initial vocabulary: the 70,000 most frequent words in the WikiText-103 dataset, with special symbols removed and only English words retained (see the sketch after this list);

[0074] 3. Basic corpus: a collection of language material that has actually occurred, organized into a corpus so that, when defining words, material and data evidence can be drawn from it. The corpus described in the pres...
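As referenced in item 2 above, the initial vocabulary is built by frequency counting over WikiText-103. The following is a minimal sketch under stated assumptions: the file path, whitespace tokenisation, and the alphabetic filter are illustrative choices, not details given in the patent.

```python
# Illustrative sketch of building the initial vocabulary: take the 70,000 most
# frequent tokens from WikiText-103, drop special symbols, keep English words.
from collections import Counter

def build_initial_vocab(wikitext_path="wiki.train.tokens", size=70_000):
    counts = Counter()
    with open(wikitext_path, encoding="utf-8") as f:
        for line in f:
            counts.update(line.split())
    # Keep only purely alphabetic ASCII tokens (removes punctuation, numbers,
    # and special symbols), then take the `size` most frequent words.
    english_only = ((w, c) for w, c in counts.items()
                    if w.isascii() and w.isalpha())
    return [w for w, _ in sorted(english_only, key=lambda x: -x[1])[:size]]
```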



Abstract

The invention relates to a word definition generation method based on a recurrent neural network and a latent variable structure in the field of natural language processing. On the basis of a recurrent neural network, a variational autoencoder (VAE) is used to model definitions, latent variable features are combined, and the sense of a defined word is extracted from its context information to generate its definition. The method specifically comprises the steps of: establishing and organizing a basic corpus; selecting a synonym set of the defined words and expanding the basic corpus to form a final corpus; carrying out expansion and reconstruction of the word vectors of the defined words; constructing a structure model based on the recurrent neural network and the latent variable; training the latent variable structure model based on the recurrent neural network; and inputting the words to be defined and their context information into the trained model to realize semantic definition of those words in a specific context, thereby handling polysemy.
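The training step listed in the abstract pairs a reconstruction objective with a latent-variable regulariser. Below is a minimal sketch of the standard VAE (ELBO-style) loss such a model would typically be trained with; the function name, KL weight, and padding handling are assumptions for illustration, not the patent's specified objective.

```python
# Hedged sketch of a typical training objective for a definition VAE: the model
# reconstructs reference definitions while keeping the latent posterior close
# to a standard Gaussian prior.
import torch
import torch.nn.functional as F

def vae_definition_loss(logits, target_ids, mu, logvar, kl_weight=1.0, pad_id=0):
    # logits: (batch, T, vocab) scores for the next definition token
    # target_ids: (batch, T) gold definition tokens (shifted by one position)
    recon = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                            target_ids.reshape(-1), ignore_index=pad_id)
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian posterior.
    kl = -0.5 * torch.mean(
        torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1))
    return recon + kl_weight * kl
```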

Description

technical field

[0001] The invention relates to a word definition generation method based on a recurrent neural network and a latent variable structure, belonging to the field of natural language processing.

Background technique

[0002] An English learners' dictionary is a reference book specially designed for learners whose mother tongue is not English, intended to help learners understand and use English correctly. At present, the definitions of words in most English learners' dictionaries suffer from problems such as circular definitions and overly difficult defining words, which are not conducive to users' understanding.

[0003] The main task of definition generation is to automatically generate natural language definitions of words, thereby reducing the time and cost of compiling dictionaries manually; it involves linguistics, natural language processing, artificial intelligence and many other fields.

[0004] Word vectors, that is, distributed word representations, use low-dim...
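As a brief illustration of the distributed word representations mentioned in [0004], the snippet below maps words to low-dimensional dense vectors. The vectors here are randomly initialised for demonstration; in practice pretrained embeddings such as word2vec or GloVe would be loaded, and the toy vocabulary and sizes are assumptions.

```python
# Minimal illustration of word vectors: each word maps to a dense vector, and
# similarity between vectors can reflect relatedness once vectors are trained.
import torch
import torch.nn as nn

vocab = {"<pad>": 0, "bank": 1, "river": 2, "money": 3}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

ids = torch.tensor([vocab["bank"], vocab["river"]])
vectors = embedding(ids)                                   # (2, 8) dense vectors
similarity = torch.cosine_similarity(vectors[0], vectors[1], dim=0)
print(vectors.shape, similarity.item())
```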

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F16/36, G06K9/62, G06N3/04
CPC: G06F16/36, G06N3/045, G06F18/214
Inventor: 杜永萍, 张海同, 王辰成
Owner: BEIJING UNIV OF TECH