Multi-text abstract generation method, device, server and storage medium

A multi-text abstract technology, applied to multi-text abstract generation methods, devices, servers, and storage media, that addresses problems such as unsmooth and unnatural abstract content and poor overall abstract quality.

Active Publication Date: 2021-09-14
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The embodiments of the present invention provide a method, device, server, and storage medium for generating a multi-text abstract. They address a problem in the prior art: abstracts generated with traditional extractive multi-document summarization algorithms read as unsmooth and unnatural, which lowers the overall quality of the abstract.



Examples


Embodiment 1

[0029] Figure 1 is a flow chart of a method for generating a multi-text abstract provided in Embodiment 1 of the present invention. This embodiment applies whenever a multi-text abstract needs to be generated. The method can be executed by a corresponding multi-text abstract generating device, which can be implemented in software and/or hardware and can be configured on a server.

[0030] As shown in Figure 1, the multi-text abstract generation method provided in this embodiment of the present invention may include:

[0031] S110. Determine a summary sentence set corresponding to the target text set from the sentences of each text in the target text set.

[0032] The target text set includes at least two texts. To generate a high-quality summary for the target text set, the summary must cover enough of the important information provided by each text; that is, the summary of the target text set is compos...

Embodiment 2

[0042] Figure 2 is a schematic flowchart of a method for generating a multi-text abstract provided in Embodiment 2 of the present invention. This embodiment is optimized on the basis of the above embodiments. As shown in Figure 2, the multi-text abstract generation method provided in this embodiment of the present invention may include:

[0043] S210. Text preprocessing.

[0044] To ensure that the text data used for generating summaries is clean, each text in the target text set must be preprocessed to filter out the useless data it contains. In addition, overly long input text reduces the operating efficiency of the model and yields poor summaries, so long texts also need to be preprocessed. Exemplarily, text preprocessing may include the following processing operations:

[0045] (1) Use regular-expression matching to filter webpage links out of the target text; for example, match a string through regular ex...
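
The link-filtering operation can be illustrated with a minimal sketch. The pattern `URL_PATTERN` and the helper name `strip_links` are hypothetical and are not taken from the patent, whose own regular expression is elided above; the sketch simply assumes that a link starts with `http://`, `https://`, or `www.` and runs to the next whitespace character.

```python
import re

# Assumed pattern (not from the patent): a link starts with http://, https://,
# or www. and continues until the next whitespace character.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def strip_links(text: str) -> str:
    """Remove webpage links and collapse the whitespace they leave behind."""
    without_links = URL_PATTERN.sub(" ", text)
    return re.sub(r"\s+", " ", without_links).strip()


if __name__ == "__main__":
    print(strip_links("See https://example.com/a?b=1 for details."))
```

Because the pattern runs to the next whitespace, punctuation glued to the end of a link is removed along with it; a production filter would likely need a stricter pattern.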

Embodiment 3

[0054] Figure 3 is a schematic flowchart of a method for generating a multi-text abstract provided by Embodiment 3 of the present invention. This embodiment is optimized on the basis of the above embodiments. As shown in Figure 3, the multi-text abstract generation method provided in this embodiment of the present invention may include:

[0055] S310. Calculate the importance score of each sentence of each text in the target text set based on the graph ranking model, where the target text set includes at least two texts.

[0056] In this embodiment, the importance score of each sentence can be calculated with TextRank, a graph-ranking model. In the TextRank model, each sentence is regarded as a node in a graph. If two sentences are similar, the two corresponding nodes are connected by an undirected weighted edge whose weight is the similarity. The most important sentences calculated by the Pag...
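
The graph construction described in [0056] can be sketched as follows. This is an illustration under stated assumptions rather than the patent's implementation: the similarity measure (Jaccard word overlap), the damping factor of 0.85, the fixed iteration count, and the names `tokenize`, `similarity`, and `textrank` are all choices made here, since the source text does not specify them.

```python
import re


def tokenize(sentence):
    """Lowercase word tokens; a stand-in for real word segmentation."""
    return re.findall(r"\w+", sentence.lower())


def similarity(a, b):
    """Jaccard word overlap (an assumed measure, not given in the source)."""
    words_a, words_b = set(tokenize(a)), set(tokenize(b))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over a sentence-similarity graph."""
    n = len(sentences)
    # Undirected weighted edges: weight[i][j] == weight[j][i] == similarity.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = similarity(sentences[i], sentences[j])
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to edge weight.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if weights[j][i] > 0
            )
            updated.append((1 - damping) + damping * rank)
        scores = updated
    return scores


if __name__ == "__main__":
    pooled = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "stock prices fell sharply today",
    ]
    for sentence, score in zip(pooled, textrank(pooled)):
        print(f"{score:.3f}  {sentence}")
```

In this sketch the sentences of all texts in the target set are pooled into one list before scoring; a sentence that shares no words with any other receives only the baseline score of `1 - damping`.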


Abstract

The embodiments of the invention disclose a method, device, server, and storage medium for generating a multi-text abstract. The method includes: determining a summary sentence set corresponding to the target text set from the sentences of each text in the target text set; sorting the sentences; determining, according to the sorting result, the sequence of the summary sentences in the summary sentence set; and assembling the summary of the target text set according to that sequence. In the embodiments of the present invention, the abstract sentences are ordered by time, which makes the overall abstract smoother, more reasonable, and more natural, and improves its overall quality.
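
The ordering and assembly steps of the abstract can be sketched as below. The excerpt does not say which time information drives the sort, so this sketch assumes that each source text carries a publication timestamp and that sentences are ordered first by that timestamp and then by their position within the text. The `Sentence` class, its fields, and `assemble_summary` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sentence:
    text: str
    published: datetime  # assumed: publication time of the source text
    position: int        # assumed: index of the sentence within its text


def assemble_summary(all_sentences, summary_sentences):
    """Order the chosen summary sentences by the global time-based ranking."""
    # Sort every sentence of every text by (publication time, position).
    ordered = sorted(all_sentences, key=lambda s: (s.published, s.position))
    rank = {sentence: index for index, sentence in enumerate(ordered)}
    # Summary sentences inherit their order from the global ranking.
    chosen = sorted(set(summary_sentences), key=rank.__getitem__)
    return " ".join(sentence.text for sentence in chosen)


if __name__ == "__main__":
    a = Sentence("Quake hits city.", datetime(2018, 1, 1), 0)
    b = Sentence("Rescue teams arrive.", datetime(2018, 1, 2), 0)
    c = Sentence("Aftershocks reported.", datetime(2018, 1, 1), 1)
    print(assemble_summary([b, a, c], [b, c]))
```

Every summary sentence is expected to appear in `all_sentences`; a sentence missing from the global ranking raises a `KeyError` in this sketch.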

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of the Internet, and in particular to a method, device, server, and storage medium for generating a multi-text abstract.

Background

[0002] By definition, multi-text summarization extracts the main information of multiple texts on the same topic into a summary at a given compression ratio. From an application point of view, on the one hand, a search engine often returns thousands of web pages for a single topic, and it is of great significance to condense these pages into a unified summary that reflects their main information. On the other hand, a single news outlet may publish a series of reports on the same event, or several outlets may report on it at the same time; extracting these highly relevant texts into a brief summary with strong coverage is equally important.

[000...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/34, G06F40/211, G06F40/30
CPC: G06F40/211, G06F40/30
Inventor: 李法远, 陈思姣, 罗雨
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD