Method and device for generating multi-document summarization

A multi-document summarization technology, applied in the field of data processing, that addresses the poor quality of automatically generated summaries and improves summarization performance.

Active Publication Date: 2018-11-02
HUAWEI TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The embodiment of the present application provides a method and device for generating multi-document summaries, which address the poor quality of the summaries produced by existing automatic multi-document summarization technology.




Embodiment Construction

[0030] The embodiment of the present application provides a method for generating multi-document summaries. The basic principle is as follows. First, the documents are divided into n sentences, and each sentence is represented by an input bag-of-words vector; the input bag-of-words vectors of the n sentences form the input bag-of-words vector space. Next, each sentence, represented by its input bag-of-words vector, is fed into a variational autoencoder model for unsupervised training, which yields an encoding hidden-layer vector and a latent semantic vector for each sentence. The encoding hidden-layer vectors of the n sentences form the encoding hidden-layer vector space, and the latent semantic vectors of the n sentences form the latent semantic vector space. Then m latent semantic vectors are collected from the latent semantic vector space, and from these m latent semantic vectors, m decoding hidden-layer vectors and m output bag-of-words vectors are obtained, up...
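The first two steps described above (building bag-of-words vectors, then encoding each sentence into a hidden-layer vector and a latent semantic vector) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the application: the sentence splitter, the tokenizer, the layer sizes, and every function name are hypothetical, and the encoder weights are random and untrained, so no unsupervised training is actually performed here.

```python
import math
import random
import re


def split_sentences(documents):
    """Split every document into sentences after ., ! or ? (a naive rule)."""
    sentences = []
    for doc in documents:
        for part in re.split(r"(?<=[.!?])\s+", doc.strip()):
            if part:
                sentences.append(part)
    return sentences


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def build_vocabulary(sentences):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for s in sentences for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(sentence, vocab):
    """Count how often each vocabulary word occurs in the sentence."""
    vec = [0] * len(vocab)
    for tok in tokenize(sentence):
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(x, weights, bias):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def encode(x, params, rng):
    """One VAE-style encoder pass: hidden vector, then z = mu + sigma * eps."""
    hidden = [math.tanh(v) for v in linear(x, params["W_h"], params["b_h"])]
    mu = linear(hidden, params["W_mu"], params["b_mu"])
    log_var = linear(hidden, params["W_lv"], params["b_lv"])
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
         for m, lv in zip(mu, log_var)]
    return hidden, z


# Toy input: two documents, three sentences in total.
documents = ["The cat sat. The dog ran!", "A cat ran."]
sentences = split_sentences(documents)
vocab = build_vocabulary(sentences)
vectors = [bag_of_words(s, vocab) for s in sentences]

# Assumed sizes: 4 hidden units, 2 latent dimensions.
rng = random.Random(0)
params = {
    "W_h": random_matrix(4, len(vocab), rng), "b_h": [0.0] * 4,
    "W_mu": random_matrix(2, 4, rng), "b_mu": [0.0] * 2,
    "W_lv": random_matrix(2, 4, rng), "b_lv": [0.0] * 2,
}
encoded = [encode(v, params, rng) for v in vectors]
hidden_space = [h for h, _ in encoded]  # encoding hidden-layer vector space
latent_space = [z for _, z in encoded]  # latent semantic vector space
```

With the toy input, the sorted vocabulary is a, cat, dog, ran, sat, the, so the first sentence maps to the count vector [0, 1, 0, 0, 1, 1]. The hidden and latent values themselves carry no meaning until the weights are trained.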

Abstract

The embodiment of the invention discloses a method and a device for generating a multi-document summary. It relates to the field of data processing and addresses the poor quality of the summaries produced by existing automatic multi-document summarization technology. The method comprises the following steps: dividing multiple documents into n sentences; generating an input bag-of-words vector for each sentence; performing unsupervised training on each sentence represented by its input bag-of-words vector to obtain an encoding hidden-layer vector and a latent semantic vector for each sentence; collecting m latent semantic vectors; obtaining m decoding hidden-layer vectors and m output bag-of-words vectors from the m latent semantic vectors; updating the m decoding hidden-layer vectors and the m output bag-of-words vectors; estimating the importance of each sentence; obtaining the importance and redundancy of the verb phrases of each sentence and the importance and redundancy of the noun phrases of each sentence; and generating the summary of the multiple documents according to the importance and redundancy of all noun phrases and of all verb phrases. The embodiment of the invention is used in the process of generating a multi-document summary.
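The last step, choosing phrases by weighing importance against redundancy, can be illustrated with a simple greedy selection. This is a sketch under stated assumptions rather than the procedure of the application: the abstract does not say how the final selection is optimized, so a greedy loop is swapped in here; word-overlap (Jaccard) similarity stands in for the redundancy measure; the importance scores are made-up numbers; and the function names and the word budget are hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity between two phrases, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_phrases(phrases, importance, max_words, redundancy_weight=1.0):
    """Greedily pick phrases with high importance and low redundancy.

    At each step the phrase with the best (importance - weight * redundancy)
    score that still fits the word budget is added; redundancy is the highest
    similarity to any phrase already selected.
    """
    selected = []
    remaining = list(range(len(phrases)))
    used = 0
    while remaining:
        best, best_score = None, None
        for i in remaining:
            length = len(phrases[i].split())
            if used + length > max_words:
                continue  # does not fit in the remaining budget
            redundancy = max(
                (jaccard(phrases[i], phrases[j]) for j in selected),
                default=0.0,
            )
            score = importance[i] - redundancy_weight * redundancy
            if best is None or score > best_score:
                best, best_score = i, score
        if best is None or best_score <= 0:
            break  # nothing fits, or nothing adds net value
        selected.append(best)
        remaining.remove(best)
        used += len(phrases[best].split())
    # Return the chosen phrases in their original order.
    return [phrases[i] for i in sorted(selected)]


# Hypothetical candidate phrases with made-up importance scores.
phrases = [
    "the government announced new measures",
    "the government announced measures",
    "markets fell sharply",
    "officials declined to comment",
]
importance = [0.9, 0.8, 0.7, 0.2]
summary = select_phrases(phrases, importance, max_words=12)
```

In this toy run the second phrase is skipped even though its importance is high, because it overlaps heavily with the first phrase, which was already selected; the less important fourth phrase is kept instead because it adds new content within the budget.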

Description

Technical field

[0001] The embodiments of the present application relate to the field of data processing, and in particular to a method and device for generating multi-document summaries.

Background

[0002] In the era of information explosion, people facing massive amounts of information increasingly need fast and effective means of processing it. News reading, as one channel for obtaining information, occupies a considerable part of people's lives. However, the sheer volume and redundancy of news make reading very inconvenient. Multi-Document Summarization (MDS) technology automatically generates a short summary, within a word limit, for multiple documents on one topic. The summary describes the main content of the topic as fully as possible for users to read, thereby improving the efficiency of reading and acquiring information.

[0003] From the summary generation method, it can be divided into th...


Application Information

IPC(8): G06F17/30
Inventor: 李丕绩, 吕正东, 李航
Owner: HUAWEI TECH CO LTD