Multi-document abstract generation method and device, and terminal

A technology for generating abstracts from multiple documents, applied in the field of multi-document abstract generation, which addresses the problem of redundant information in generated abstracts.

Active Publication Date: 2018-12-07
XFUSION DIGITAL TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] The present application provides a method, device and terminal for generating multi-document summaries, to solve the problem that document summaries generated in the prior art contain a relatively large amount of redundant information.



Examples


Detailed description of the embodiments

[0034] Figure 1 shows a schematic structural diagram of a device for generating multi-document abstracts provided by an embodiment of the present invention. As shown in Figure 1, the device includes: a data processing module 101, an importance estimation module 102 connected to the data processing module 101, and a summary generation module 103.

[0035] The data processing module 101 is used to split each of the multiple candidate documents about the same event to be summarized into candidate sentences, obtaining a candidate sentence set D. It then generates a dictionary of size V from all the words in those candidate documents. Finally, it represents each candidate sentence as a V-dimensional vector x_j (j = 1, ..., N, where N is the maximum number of candidate sentences in the candidate sentence set D), and each candidate sentence represented by a V-dimensional vector is input to the importance e...
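The preprocessing described in paragraph [0035] can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (split_sentences, tokenize, build_candidates), the naive punctuation-based sentence splitter, the lowercase tokenizer, and the choice of word counts as vector entries are all assumptions made for illustration. The source states only that each candidate sentence is represented by a V-dimensional vector over a dictionary of size V.

```python
import re
from collections import Counter


def split_sentences(document):
    """Naively split a document into sentences at ., ! or ? (an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", document)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Lowercase a sentence and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def build_candidates(documents):
    """Return (candidate sentence set D, dictionary, one V-dim vector per sentence)."""
    # Candidate sentence set D: every sentence from every candidate document.
    sentences = [s for doc in documents for s in split_sentences(doc)]

    # Dictionary of size V built from all words in the candidate documents.
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}

    # Each sentence becomes a V-dimensional vector x_j.
    # Entries are word counts here; this weighting is an assumption.
    vectors = []
    for s in sentences:
        vec = [0] * len(vocab)
        for word, count in Counter(tokenize(s)).items():
            vec[index[word]] = count
        vectors.append(vec)
    return sentences, vocab, vectors


if __name__ == "__main__":
    docs = [
        "The quake hit the city. Rescue teams arrived.",
        "Rescue teams searched the city.",
    ]
    D, dictionary, X = build_candidates(docs)
    print(len(D), "candidate sentences; V =", len(dictionary))
    for sentence, x in zip(D, X):
        print(x, sentence)
```

In this sketch the vectors would then be passed to whatever component estimates sentence importance; that hand-off is outside what the example covers.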


Abstract

Embodiments of the invention provide a multi-document abstract generation method and device, and a terminal, which relate to the field of data processing and aim to solve the prior-art problem that generated document abstracts contain a relatively large amount of redundant information. The method comprises: acquiring a candidate sentence set, wherein the candidate sentence set comprises the candidate sentences contained in each of a plurality of candidate documents about the same event; training on each candidate sentence in the candidate sentence set by using a cascaded attention mechanism in a preset network model together with an unsupervised learning model, to obtain the importance of each candidate sentence, wherein the importance of a candidate sentence corresponds to the modulus (norm) of a row vector in the cascaded attention mechanism matrix output by the preset network model; selecting, according to the importance of each candidate sentence, phrases that meet preset conditions from the candidate sentence set to serve as an abstract phrase set; and combining the abstract phrase set into abstract sentences in a preset combination mode to obtain the abstract of the plurality of candidate documents.
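The scoring step in the abstract, where a sentence's importance corresponds to the modulus of a row vector in a matrix, can be sketched as follows. This is an illustration under stated assumptions only: the matrix below is hand-written rather than produced by any network, the Euclidean norm is assumed as the modulus, and the names row_norms and rank_sentences are hypothetical. The phrase selection under "preset conditions" and the "preset combination mode" are not specified in this excerpt and are not modeled.

```python
import math


def row_norms(matrix):
    """Euclidean norm of each row; one importance score per candidate sentence."""
    return [math.sqrt(sum(v * v for v in row)) for row in matrix]


def rank_sentences(sentences, attention_matrix):
    """Pair each sentence with its row norm and sort from most to least important."""
    scores = row_norms(attention_matrix)
    return sorted(zip(sentences, scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical stand-in for a matrix output by a model: one row per sentence.
    sentences = ["sentence A", "sentence B", "sentence C"]
    attention = [
        [3.0, 4.0],
        [0.0, 1.0],
        [6.0, 8.0],
    ]
    for sentence, score in rank_sentences(sentences, attention):
        print(f"{score:5.1f}  {sentence}")
```

A larger row norm is read here as higher importance, which matches the correspondence stated in the abstract; how the scores feed into phrase selection is left open.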

Description

Technical field

[0001] The embodiments of the present invention relate to the field of data processing, and in particular to a method, device and terminal for generating multi-document abstracts.

Background

[0002] Automatic multi-document summarization (MDS) technology takes multiple candidate documents on the same topic (for example, a news event) as input and, by analyzing and processing them, automatically generates a summary text of a specified length as required. The summary describes the central idea of the news event as fully as possible, so that the important information about the event can be extracted quickly and concisely.

[0003] In the prior art, one method for abstract generation is as follows: use a deep neural network model to train on a corpus to obtain word vector representations of feature words; obtain a candidate sentence set according to preset query words in the corpus; obtain the word vecto...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27, G06N3/04, G06N3/08, G06F40/20
CPC: G06N3/08, G06F40/258, G06F40/30, G06F40/20, G06N3/044, G06F16/345, G06N3/088, G06N3/045, G06N20/00, G06F40/211, G06F40/289, G06F17/16
Inventors: 李丕绩, 吕正东, 李航
Owner: XFUSION DIGITAL TECH CO LTD