Method for reversely generating abstract based on key sentences and keywords

A technique based on keywords and key sentences, applied to the reverse generation of abstracts, that addresses the problems of repeated words in the abstract, abstracts that do not summarize the original text well, and sparse corpus data, and achieves the effect of easy determination.

Pending Publication Date: 2019-07-05
CHINACCS INFORMATION IND

AI Technical Summary

Problems solved by technology

[0005] Aiming at the technical problems that the corpus data is sparse, the attention model cannot accurately locate the source of the abstract, the abstract cannot highly summarize the original



Examples


Embodiment 1

[0055] Referring to Figure 1 and Figure 2, the present invention provides a method for reversely generating an abstract based on key sentences and keywords, comprising the following steps:

[0056] S1. Generate documents from the acquired corpus.

[0057] S2. Use the tf-idf algorithm and the textrank algorithm to extract 30 keywords and 2 key sentences from the original text, respectively.
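The key-sentence half of S2 can be illustrated with a minimal TextRank sketch. The source names the algorithm but does not give its details here, so the specifics are assumptions: whitespace tokenisation (Chinese text would first need a word segmenter), the word-overlap sentence similarity from the original TextRank formulation, a damping factor of 0.85, a fixed number of iterations, and the hypothetical function name `textrank_sentences`.

```python
import math


def textrank_sentences(sentences, top_k=2, damping=0.85, iterations=50):
    """Return the top_k highest-ranked sentences, in their original order."""
    # Assumption: whitespace tokenisation stands in for real word segmentation.
    tokens = [set(s.lower().split()) for s in sentences]
    n = len(sentences)

    # Edge weight: shared words, normalised by the log of the sentence lengths.
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or not tokens[i] or not tokens[j]:
                continue
            overlap = len(tokens[i] & tokens[j])
            denom = math.log(len(tokens[i])) + math.log(len(tokens[j]))
            if overlap and denom > 0:
                sim[i][j] = overlap / denom

    # Total outgoing weight of each sentence, used to normalise its votes.
    out_sum = [sum(row) for row in sim]

    # Iterate the weighted PageRank update for a fixed number of rounds.
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            rank = sum(
                sim[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) + damping * rank)
        scores = updated

    # Pick the best-scoring sentences, then restore document order.
    best = sorted(range(n), key=lambda i: (-scores[i], i))[:top_k]
    return [sentences[i] for i in sorted(best)]


sentences = [
    "text summarization condenses a long document",
    "the attention model locates important text in the document",
    "bananas are yellow",
    "a summarization model reads the document text",
]
key_sentences = textrank_sentences(sentences, top_k=2)
```

A sentence that shares no words with the others receives only the baseline score, so it is not selected when better-connected sentences exist.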

[0058] The specific steps of using the tf-idf algorithm are:

[0059] First calculate the frequency of each word in the document, then calculate the inverse document frequency of each word. Multiply each word's term frequency by its inverse document frequency to obtain the weight of each word, and take the 30 words with the highest weights as keywords. The expressions are:

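[0060] $$\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$

[0061] $$\mathrm{idf}_{i} = \log \frac{|D|}{|\{ j : t_i \in d_j \}|}$$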

[0062] where $n_{i,j}$ is the number of occurrences of the word $t_i$ in file $d_j$, and the denominator is the sum of the occurrences of all words in file $d_j$; where |D| is the t...
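The weighting described in [0059] can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the patent's implementation: documents are assumed to be already tokenised, the inverse document frequency uses the common form log(number of documents / number of documents containing the word), and the function name `tfidf_keywords` is hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=30):
    """Return the top_k (word, weight) pairs for docs[doc_index].

    docs is a list of token lists; tokenisation (for Chinese, word
    segmentation) is assumed to have been done beforehand.
    """
    counts = Counter(docs[doc_index])
    total = sum(counts.values())

    # Document frequency: in how many documents each word appears.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    n_docs = len(docs)
    weights = {}
    for word, n in counts.items():
        tf = n / total                           # term frequency in this document
        idf = math.log(n_docs / doc_freq[word])  # assumed common idf form
        weights[word] = tf * idf

    # Highest weight first; ties are broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
keywords = tfidf_keywords(docs, 0, top_k=2)
```

In this toy corpus "the" appears in every document, so its weight is zero, while "mat" appears only in the first document and ranks highest there. With `top_k=30` the same function would return the 30 keywords mentioned in S2.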



Abstract

The invention relates to a method for reversely generating an abstract based on keywords and key sentences. The method comprises the following steps: constructing training and test data sets of keywords and key sentences; building a sequence-to-sequence network; encoding the keywords and key sentences; using an attention model to locate and select the content to be extracted, taking it from the original text if the abstract information appears there and otherwise from a dictionary; feeding the located and selected content into a reverse decoder; and finally obtaining the text abstract through a duplicate-checking module. The beneficial effects of the invention are that extracting the abstract on the basis of keywords and key sentences reduces redundant information in the document, improves the ability to identify important information in the original text, and yields an abstract that generalizes the original text well, conforms to Chinese grammar, reads more smoothly, and better matches the meaning of the text.

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a method for reversely generating summaries based on key sentences and keywords.

Background technique

[0002] In the field of natural language processing, text summary generation occupies an important position; it is mainly used in news information services, automatic document indexing, information retrieval, search engines, and so on. Text summarization is mainly divided into extractive and generative approaches. With the emergence of the attention model, text summarization has developed rapidly, and the current mainstream method is the text summarization algorithm based on the sequence-to-sequence model. This algorithm uses deep learning techniques so that the summaries generated after training are close to the standard summaries. The disadvantage of this model is that it is prone to repeated words, has poor ability to identify and process redundant infor...


Application Information

IPC(8): G06F16/34
Inventor: 舒泓新, 蔡晓东, 蒋鹏, 马新成
Owner: CHINACCS INFORMATION IND