A title generation method based on a variational neural network topic model

A neural network and topic model technique, applied in fields such as special data processing applications, instruments, and electrical digital data processing, that addresses problems such as insufficient representation of textual information

Inactive Publication Date: 2018-12-11
BEIJING INSTITUTE OF TECHNOLOGY
6 Cites, 34 Cited by

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to solve the defect that the existing method lacks more textual inf

Examples

Embodiment 1

[0154] This embodiment describes the concrete implementation process of the present invention, as shown in Figure 1.

[0155] As Figure 1 shows, the title generation method based on a variational neural network topic model proceeds as follows:

[0156] Step A: preprocessing. In this embodiment, the corpus is segmented into words and stop words are removed;

[0157] Word segmentation is performed with the PTB tokenizer, and stop words are removed with the nltk toolkit.
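Step A can be illustrated with a minimal sketch. It is not the embodiment's code: a simple regular expression stands in for the PTB tokenizer, and a small hand-written set stands in for the nltk stop-word list, so that the example runs without external packages. The names `STOP_WORDS`, `TOKEN_PATTERN`, and `preprocess` are hypothetical.

```python
import re

# Assumption: a tiny illustrative stop-word set. The embodiment uses the
# nltk stop-word list, which is much larger.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "is", "are", "and", "for", "on"}

# Assumption: a simple regex stands in for the PTB tokenizer named in the text.
# It matches words (with an optional apostrophe suffix) or single punctuation marks.
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:'[a-z]+)?|[^\sa-z0-9]")


def preprocess(text):
    """Lowercase `text`, split it into tokens, and drop stop words and punctuation."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [
        tok for tok in tokens
        if tok not in STOP_WORDS and any(ch.isalnum() for ch in tok)
    ]


if __name__ == "__main__":
    print(preprocess("The model generates a title for the document."))
```

In a real pipeline the two stand-ins would be swapped for the tools the embodiment names; the surrounding control flow would likely stay the same.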

[0158] Step B: learn the document vectors with the PV algorithm and the word vectors with the word2vec algorithm;

[0159] The document vectors (PV) and the word vectors (word2vec) are learned in parallel. Specifically, in this embodiment:

[0160] Use the PV algorit...
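The parallel structure of Step B can be sketched as two independent jobs submitted to a worker pool. This only illustrates the scheduling: the two learner functions are deliberately simple placeholders (bag-of-words counts instead of PV, co-occurrence counts instead of word2vec), and every function name here is hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def learn_doc_vectors(docs, vocab):
    # Placeholder for the PV algorithm (assumption): one bag-of-words
    # count vector per document, ordered by `vocab`.
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[word] for word in vocab])
    return vectors


def learn_word_vectors(docs, vocab, window=1):
    # Placeholder for word2vec (assumption): co-occurrence counts of each
    # word with its neighbours inside a symmetric window.
    index = {word: i for i, word in enumerate(vocab)}
    vectors = {word: [0] * len(vocab) for word in vocab}
    for doc in docs:
        for i, word in enumerate(doc):
            lo, hi = max(0, i - window), min(len(doc), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][index[doc[j]]] += 1
    return vectors


def learn_representations(docs):
    """Run both learners concurrently, since neither depends on the other."""
    vocab = sorted({word for doc in docs for word in doc})
    with ThreadPoolExecutor(max_workers=2) as pool:
        doc_future = pool.submit(learn_doc_vectors, docs, vocab)
        word_future = pool.submit(learn_word_vectors, docs, vocab)
        return doc_future.result(), word_future.result()


if __name__ == "__main__":
    corpus = [["topic", "model", "title"], ["title", "generation"]]
    doc_vecs, word_vecs = learn_representations(corpus)
    print(doc_vecs)
    print(word_vecs["title"])
```

The two jobs share only read-only inputs (the tokenized corpus and the vocabulary), which is what makes running them side by side safe in this sketch.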

Abstract

The invention discloses a title generation method based on a variational neural network topic model, belonging to the technical field of natural language processing. The method automatically learns a latent document-topic distribution vector with a variational autoencoder (VAE) and uses an attention mechanism to combine that vector with the document representation vector learned by a multi-layer neural network, so that the document's semantics are expressed comprehensively and deeply at both the topic level and the global level, yielding a high-quality title generation model. The multi-layer encoder learns more comprehensive information about the document, which improves the model's ability to summarize the main idea of the full text. The latent topic distribution vector learned by the VAE represents the document content at the abstract level of topics. The attention mechanism combines the latent topic distribution vector with the document information learned by the multi-layer encoder, bringing together deep semantic representation and context information to construct a high-quality title generation model.
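The ingredients named in the abstract can be illustrated with a small, dependency-free sketch. It is not the patented model: the reparameterized sampling, the softmax over the latent vector, dot-product attention, and the concatenation in `combine` are all assumptions chosen as common defaults, and every function name is hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sample_topic_vector(mu, log_var, rng):
    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1).
    # Assumption: a softmax turns z into a distribution over topics.
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]
    return softmax(z)


def attend(query, states):
    # Assumption: dot-product attention. Each encoder state is weighted by
    # its similarity to `query`, and the weighted states are summed.
    scores = [sum(q * s for q, s in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * state[d] for w, state in zip(weights, states)) for d in range(dim)]


def combine(topic, context):
    # Assumption: plain concatenation of the topic vector and the attention
    # context. The abstract does not say how the two are merged.
    return topic + context


if __name__ == "__main__":
    rng = random.Random(0)
    topic = sample_topic_vector([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], rng)
    context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    print(combine(topic, context))
```

In a trained model, `mu` and `log_var` would come from the VAE encoder and the states from the multi-layer document encoder; here they are fixed toy values.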

Description

Technical field

[0001] The invention relates to a title generation method based on a variational neural network topic model, and belongs to the technical field of natural language processing.

Background

[0002] People now receive a large amount of information through various channels every day, and only a small part of it is useful to them. A machine learning model that could digest large amounts of information in compressed form, understand a document, and extract the useful information in it, so as to automatically generate accurate titles for long texts, would save people a great deal of reading time. Title generation, as the name suggests, aims to generate titles from large amounts of information. Generating titles from long texts is the main difficulty, and it becomes harder as the length of the text increases. Title generation is an important task in the field of natural language processing, which helps machi...

Application Information

IPC(8): G06F17/27
CPC: G06F40/258
Inventors: GAO Yang (高扬), HUANG Heyan (黄河燕), GUO Yidi (郭一迪)
Owner: BEIJING INSTITUTE OF TECHNOLOGY