Multi-document abstract generation method and system

A multi-document abstract technology applied in the fields of natural language processing and deep learning. It addresses the problems that word vectors alone are not sufficient for the keyword extraction task and that the topic information of the documents is ignored, and it achieves high accuracy and good readability.

CN110334188A (Inactive) · Publication Date: 2019-10-15 · COMMUNICATION UNIVERSITY OF CHINA

Examples

Embodiment Construction

[0091] In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0092] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0093] In order to make the technical solutions and advantages in the examples of the present application clearer, the exemplary embodiments of the present application will be further described in detail below...


Abstract

The invention provides a multi-document abstract generation method comprising the following steps: S1, determining a topic, acquiring a plurality of documents related to the topic, and constructing a first corpus; S2, constructing an HLDA topic model for the topic and obtaining sub-topics; S3, calculating importance scores of the clauses; S4, calculating the importance of the sub-topics; and S5, extracting abstract sentences. The method adds news features and improves the HLDA topic importance calculation so that reasonable sentence scores are obtained. In addition, on top of the traditional abstract sorting step, it uses inter-sentence information as one of the bases for ordering sentences, so that the abstract sentences finally obtained are more accurate and read more smoothly.
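The flow of steps S3 to S5 can be illustrated with a minimal Python sketch. It is not the patented method: the HLDA model of S2 is replaced by a hypothetical precomputed list `topic_of` that assigns each sentence to a sub-topic, each clause is treated as a whole sentence, the importance score is a plain word-frequency average with no news features, and the final ordering simply restores source order instead of using inter-sentence information. All function names are assumptions introduced for illustration.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase word tokenizer (assumes space-delimited text)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sentence_scores(sentences):
    """S3 stand-in: mean corpus frequency of the words in each sentence."""
    tf = Counter(tok for s in sentences for tok in tokenize(s))
    scores = []
    for s in sentences:
        toks = tokenize(s)
        scores.append(sum(tf[t] for t in toks) / len(toks) if toks else 0.0)
    return scores


def subtopic_importance(scores, topic_of):
    """S4 stand-in: a sub-topic's importance is the sum of its sentence scores."""
    importance = defaultdict(float)
    for score, topic in zip(scores, topic_of):
        importance[topic] += score
    return dict(importance)


def extract_summary(sentences, topic_of, max_sentences=2):
    """S5 stand-in: best sentence per sub-topic, most important sub-topics first."""
    scores = sentence_scores(sentences)
    importance = subtopic_importance(scores, topic_of)
    best = {}  # sub-topic -> index of its highest-scoring sentence
    for i, topic in enumerate(topic_of):
        if topic not in best or scores[i] > scores[best[topic]]:
            best[topic] = i
    ranked = sorted(best, key=lambda t: importance[t], reverse=True)
    chosen = sorted(best[t] for t in ranked[:max_sentences])  # source order
    return [sentences[i] for i in chosen]


# Toy corpus; the sub-topic labels are hand-assigned placeholders for S2 output.
sentences = [
    "The storm hit the coast on Monday",
    "The storm flooded the coast and the city",
    "Officials opened shelters near the coast",
    "Donations arrived later",
]
topic_of = [0, 0, 1, 2]
print(extract_summary(sentences, topic_of, max_sentences=2))
```

Selecting one sentence per sub-topic is one simple way to limit redundancy across documents; the actual method may combine the scores differently.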

Description

Technical field

[0001] The invention relates to the technical fields of natural language processing and deep learning, and in particular to a method and system for generating multi-document abstracts.

Background technique

[0002] In recent years, massive data has brought great convenience to people, but it has also brought great challenges to data analysis and search. In the context of big data, how to quickly obtain the required key information from massive data has become an urgent problem.

[0003] For example, a hot news topic is covered by a large number of related documents on the Internet. The content of these web documents is often repetitive and similar, so readers need a lot of time and effort to find the information they want. Multi-document summary technology uses machine learning, graph models, topic models and other techniques to obtain the content of multiple documents related to the topic, automatically e...

Application Information

Patent Timeline: 15 Oct 2019, Publication, CN110334188A
IPC: G06F16/33; G06F16/34; G06F17/27
CPC: G06F16/3344; G06F16/345; G06F40/211; G06F40/289
Inventors: 李樱; 胡诚成