A method and device for generating multi-document summaries

A multi-document summarization technique, applied in the field of data processing, that addresses the poor performance of existing summary generation.

Active Publication Date: 2021-06-22
HUAWEI TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The embodiments of the present application provide a method and device for generating multi-document summaries, which address the poor performance of existing automatic multi-document summarization techniques when generating summaries.



Embodiment Construction

[0030] The embodiments of the present application provide a method for generating multi-document summaries. The basic principle is as follows. First, multiple documents are divided into n sentences, and each sentence is represented by an input bag-of-words vector; the input bag-of-words vectors of the n sentences form the input bag-of-words vector space. Then, each sentence represented by its input bag-of-words vector is fed to a variational autoencoder model for unsupervised training, which yields an encoding hidden-layer vector and a latent semantic vector for each sentence. The encoding hidden-layer vectors of the n sentences form the encoding hidden-layer vector space, and the latent semantic vectors of the n sentences form the latent semantic vector space. Next, m latent semantic vectors are collected from the latent semantic vector space, and m decoding hidden-layer vectors and m output bag-of-words vectors are obtained from those m latent semantic vectors, up...
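The first step of paragraph [0030], dividing documents into n sentences and representing each as an input bag-of-words vector, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the application: the punctuation-based sentence splitter, the lowercase alphanumeric tokenizer, the raw-count weighting, and all function names (`split_sentences`, `tokenize`, `build_vocabulary`, `bag_of_words`) are hypothetical choices made for the example.

```python
import re
from collections import Counter


def split_sentences(documents):
    """Split each document after '.', '!' or '?' (a naive rule, assumed here)."""
    sentences = []
    for doc in documents:
        for s in re.split(r"(?<=[.!?])\s+", doc.strip()):
            if s:
                sentences.append(s)
    return sentences


def tokenize(sentence):
    """Lowercase the sentence and keep alphanumeric runs as tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def build_vocabulary(sentences):
    """Map every distinct token to a fixed column index (sorted for stability)."""
    vocab = sorted({tok for s in sentences for tok in tokenize(s)})
    return {word: i for i, word in enumerate(vocab)}


def bag_of_words(sentence, vocab):
    """Return the raw term-count vector of one sentence over the vocabulary."""
    counts = Counter(tokenize(sentence))
    vec = [0] * len(vocab)
    for word, c in counts.items():
        if word in vocab:
            vec[vocab[word]] = c
    return vec


documents = [
    "The storm hit the coast. Thousands lost power.",
    "Power was restored after the storm.",
]
sentences = split_sentences(documents)          # the n sentences
vocab = build_vocabulary(sentences)
X = [bag_of_words(s, vocab) for s in sentences]  # n input bag-of-words vectors
```

The rows of `X` together play the role of the input bag-of-words vector space; each row would then be passed to the variational autoencoder described above.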



Abstract

The embodiments of the present application disclose a method and device for generating multi-document summaries, which relate to the field of data processing and address the poor performance of existing automatic multi-document summarization techniques when generating summaries. The specific scheme is: divide multiple documents into n sentences; generate input bag-of-words vectors; perform unsupervised training on each sentence represented by its input bag-of-words vector to obtain the encoding hidden-layer vector and the latent semantic vector of each sentence; collect m latent semantic vectors; obtain m decoding hidden-layer vectors and m output bag-of-words vectors from the m latent semantic vectors and update them; estimate the importance of each sentence; obtain the importance and redundancy of the verb phrases and of the noun phrases in each sentence; and generate the summary of the multiple documents according to the importance and redundancy of all noun phrases and all verb phrases. The embodiments of the present application are used in the process of generating multi-document summaries.
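The final step of the abstract, building a summary from phrase importance and redundancy, can be illustrated with a small sketch. The abstract does not say how the two quantities are combined, so the greedy selection below is a swapped-in technique chosen for illustration only. The importance scores, the Jaccard word overlap used as a redundancy measure, the `penalty` weight, the word `budget`, and the names `jaccard` and `select_phrases` are all assumptions.

```python
def jaccard(a, b):
    """Word-set overlap between two token lists (assumed redundancy measure)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_phrases(phrases, budget, penalty=1.0):
    """Greedily pick phrases that are important and not redundant.

    phrases: list of (text, importance) pairs; importance values are hypothetical.
    budget:  maximum total number of words in the summary.
    Returns the selected phrase texts in the order they were picked.
    """
    selected = []
    used = 0
    remaining = list(phrases)
    while remaining:
        best, best_score = None, 0.0
        for text, importance in remaining:
            words = text.lower().split()
            if used + len(words) > budget:
                continue  # phrase would exceed the word limit
            # Redundancy: highest overlap with anything already selected.
            redundancy = max(
                (jaccard(words, s.lower().split()) for s in selected),
                default=0.0,
            )
            score = importance - penalty * redundancy
            if score > best_score:
                best, best_score = (text, importance), score
        if best is None:
            break  # nothing left that fits and still has a positive score
        selected.append(best[0])
        used += len(best[0].split())
        remaining.remove(best)
    return selected


candidates = [
    ("the storm hit the coast", 0.9),
    ("a storm hit the coast", 0.75),   # near-duplicate of the first phrase
    ("thousands lost power", 0.7),
    ("officials spoke", 0.1),
]
summary = select_phrases(candidates, budget=20)
```

In this example the near-duplicate phrase is left out because its overlap with an already selected phrase outweighs its importance, while the less important but novel phrases are kept.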

Description

Technical field

[0001] The embodiments of the present application relate to the field of data processing, and in particular to a method and device for generating multi-document summaries.

Background

[0002] In the era of information explosion, people facing massive amounts of information increasingly need fast and effective means of processing it. News reading, as one channel for obtaining information, occupies a considerable part of people's lives. However, the volume and redundancy of news make reading very inconvenient. Multi-Document Summarization (MDS) technology automatically generates a short, word-limited summary for multiple documents on one topic. The summary describes the main content of the topic as fully as possible for the reader, thereby improving the efficiency of reading and acquiring information.

[0003] In terms of how summaries are generated, methods can be divided into th...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F16/34, G06F16/35
Inventor: 李丕绩, 吕正东, 李航
Owner: HUAWEI TECH CO LTD