Abstract generation method and device, server and storage medium

An abstract generation technique, applicable to abstract generation methods, devices, servers, and storage media, that addresses the problem of low-quality abstracts and improves the coverage of important information.

Active Publication Date: 2019-04-19
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The embodiment of the present invention provides a summary generation method, device, server and storage medium to solve the technical problem in the pr


Examples

Embodiment 1

[0026] Figure 1 is a flowchart of a summary generation method provided by Embodiment 1 of the present invention. This embodiment is applicable to situations such as summary generation for news information in the communication field and event summary generation for event graphs. The method can be executed by a corresponding summary generation device, which can be implemented in software and/or hardware and can be configured on a server.

[0027] As shown in Figure 1, the abstract generation method provided in the embodiment of the present invention may include:

[0028] S110. Segment the target text to obtain a sentence set.

[0029] Here, the target text is the text for which an abstract is to be generated. Because the abstract of the target text is composed of important sentences from the text, the target text must first be segmented into sentences. For example, sentence segmentation may be performed according to text paragraphs or common sentence terminators (for example: ".!?", etc.), and the t...
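
The segmentation in S110 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `split_sentences` and the full-width terminators "。！？" are assumptions added for illustration (the text above names only ".!?" as examples), and treating each line break as a paragraph boundary is likewise an assumption.

```python
import re
from typing import List

# The source lists ".!?" as example terminators; the full-width forms are an
# assumption added here so that Chinese text is also handled.
_TERMINATORS = ".!?。！？"
_SENTENCE_RE = re.compile(
    rf"[^{re.escape(_TERMINATORS)}]+[{re.escape(_TERMINATORS)}]*"
)


def split_sentences(text: str) -> List[str]:
    """Split text by paragraph (line break), then by sentence terminator."""
    sentences = []
    for paragraph in text.splitlines():
        for match in _SENTENCE_RE.finditer(paragraph):
            sentence = match.group().strip()
            if sentence:  # skip fragments that are only whitespace
                sentences.append(sentence)
    return sentences
```

A splitter this simple also breaks on periods inside decimals and abbreviations (for example "3.14"), so a production system would likely need additional rules.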

Embodiment 2

[0038] Figure 2 is a schematic flowchart of a method for generating an abstract provided in Embodiment 2 of the present invention. This embodiment is optimized on the basis of the above embodiments. As shown in Figure 2, the abstract generation method provided in the embodiment of the present invention may include:

[0039] S210. Preprocess the target text.

[0040] To ensure that the text data used to generate the summary is clean, the target text needs to be preprocessed before it is segmented, so as to filter out the useless data it contains. In addition, overly long input text reduces the operating efficiency of the model, and summaries generated from overly long text are of poor quality, so overly long text also needs to be preprocessed. For example, text preprocessing may include:

[0041] (1) Use regular expressions to match and filter out webpage links in the target text, for example, match a string through regular expressions...
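
As a rough illustration of the link filtering in step (1), the following Python sketch removes web links with a regular expression. The pattern `_URL_RE` is hypothetical, since the pattern used in the source is not shown here. The length cap `MAX_CHARS` and the plain character truncation are also assumptions: the text says only that overly long input should be preprocessed, not how.

```python
import re

# Hypothetical URL pattern; the actual pattern in the source is elided.
_URL_RE = re.compile(r"(?:https?://|www\.)[^\s<>\"']+", re.IGNORECASE)

# Hypothetical cap on input length, in characters (not from the source).
MAX_CHARS = 2000


def preprocess(text: str, max_chars: int = MAX_CHARS) -> str:
    """Remove web links, collapse leftover spaces, and cap the length."""
    without_links = _URL_RE.sub(" ", text)
    # Collapse runs of spaces and tabs only, so paragraph breaks survive.
    collapsed = re.sub(r"[ \t]+", " ", without_links).strip()
    return collapsed[:max_chars]
```

For example, `preprocess("Read more at https://example.com/a?b=1 today")` returns `"Read more at today"`. Truncating at a fixed character count can cut a sentence in half; cutting at a sentence boundary would be a natural refinement.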

Embodiment 3

[0049] Figure 3 is a schematic flowchart of a summary model training method provided in Embodiment 3 of the present invention. The summary model is a recurrent neural network model used to predict whether each sentence of the text is a summary sentence, as described in any embodiment of the present invention. As shown in Figure 3, the summary model training method provided in the embodiment of the present invention may include:

[0050] S310. Obtain a sample data set used for training, and a topic corresponding to each sample data, and label abstract sentences and non-abstract sentences in each sample data.

[0051] Before the summary model is trained, the training data needs to be prepared, including obtaining the sample data set for training and the topic corresponding to each sample data. Since many summaries in the training data set are written manually, the summaries themselves are not included in the text. Therefore, the embodiment of the present invention can u...

Abstract

The embodiment of the invention discloses an abstract generation method and device, a server, and a storage medium. The method comprises the following steps: performing sentence segmentation on a target text to obtain a sentence set; obtaining a target topic corresponding to the target text, and using a pre-trained abstract model, in combination with the target topic, to predict for each sentence in the sentence set the probability that it is an abstract sentence; and selecting a plurality of abstract sentences from the sentence set according to the probability values, and forming an abstract of the target text from those sentences. Because the topic of the text is taken into account when the abstract is generated, the resulting abstract is more accurate and more relevant to the topic, its coverage of important information is improved, and diversified abstracts can be generated for different topics.
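
The final selection step can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented procedure: the function name `build_abstract`, the `top_k` cutoff, and restoring the original sentence order are all hypothetical choices, and the probability list stands in for the output of the pre-trained abstract model, which is not reproduced here.

```python
from typing import Sequence


def build_abstract(
    sentences: Sequence[str],
    probabilities: Sequence[float],
    top_k: int = 3,  # hypothetical cutoff; the source does not give a number
) -> str:
    """Pick the top_k most probable sentences and join them in text order."""
    if len(sentences) != len(probabilities):
        raise ValueError("each sentence needs exactly one probability")
    # Rank sentence indices from highest to lowest probability.
    ranked = sorted(
        range(len(sentences)), key=lambda i: probabilities[i], reverse=True
    )
    # Assumption: present the chosen sentences in their original order.
    chosen = sorted(ranked[:top_k])
    return " ".join(sentences[i] for i in chosen)
```

For example, with sentences `["A.", "B.", "C.", "D."]`, probabilities `[0.2, 0.9, 0.1, 0.7]`, and `top_k=2`, the result is `"B. D."`. A probability threshold could be used in place of a fixed `top_k`.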

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of the Internet, and in particular, to a method, device, server and storage medium for generating an abstract.

Background

[0002] Information overload is currently severe, with massive numbers of news articles generated every day. A summary model extracts and compresses the key information in an article and expresses it concisely, so that people can obtain information and knowledge more simply and quickly. According to the relationship between the abstract and the original text, abstracts can be divided into extractive abstracts and generative abstracts.

[0003] There are two main types of traditional extractive summarization models: graph-based ranking models and machine learning-based models. However, the graph-based ranking model only considers the global information of the current article, ignores the his...

Application Information

IPC(8): G06F16/34, G06F17/27, G06N3/04, G06N3/08
CPC: G06N3/08, G06F40/216, G06F40/289, G06N3/044
Inventor: 李法远, 陈思姣, 罗雨
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD