Text abstract generation method based on sentence association attention mechanism

A sentence-association attention technology, applied to biological neural network models, special data-processing applications, instruments, and similar fields, addressing problems such as poor sentence coherence, slow abstract generation, and high information redundancy.

Active Publication Date: 2019-10-18
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a text summarization method based on a sentence-associated attention mechanism, which is used to solve the problems of existing summarization methods, such as poor sentence coherence, high information redundancy, and slow progress in abstract generation.


Drawings

The application includes three drawings illustrating the text abstract generation method based on the sentence association attention mechanism.

Examples


Embodiment 1

[0050] Embodiment 1: As shown in Figures 1-2, the text summary generation method based on the sentence association attention mechanism proceeds through the following specific steps:

[0051] Step 1. More than 220,000 news documents were collected and organized as experimental data. The data are divided into three parts: a training set, a validation set, and a test set. The training set contains more than 200,000 Chinese news articles; the validation set and the test set each contain more than 10,000 items, covering news events from recent years.
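A minimal sketch of such a split, assuming the corpus is held as a list of (article, reference summary) pairs; the function name, the exact set sizes, and the fixed random seed are illustrative and not specified by the patent text.

```python
import random

def split_corpus(documents, n_valid=10000, n_test=10000, seed=42):
    """Split the collected news corpus into train / validation / test sets.

    `documents` is assumed to be a list of (article, reference_summary) pairs;
    the sizes roughly follow Step 1 (~200k training items, ~10k each for
    validation and test).
    """
    rng = random.Random(seed)
    shuffled = documents[:]
    rng.shuffle(shuffled)
    test = shuffled[:n_test]
    valid = shuffled[n_test:n_test + n_valid]
    train = shuffled[n_test + n_valid:]
    return train, valid, test
```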

[0052] Step 2. Before performing the summarization task, the documents are preprocessed, including sentence segmentation, word segmentation, and removal of stop words. The preprocessing parameters are set as follows: 100-dimensional word vectors pre-trained with word2vec are used to initialize the embeddings and are allowed to be updated during training, and the hidden state dimensions of the encoder and decoder ...
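A minimal preprocessing sketch under stated assumptions: jieba (for Chinese word segmentation), gensim (for loading the pre-trained word2vec vectors), and the simple sentence-splitting rule are choices made here for illustration and are not named in the patent text.

```python
import re
import jieba
import numpy as np
from gensim.models import KeyedVectors

def preprocess(document, stopwords):
    """Split a document into sentences, segment words, and drop stop words."""
    sentences = re.split(r"[。！？]", document)  # sentence segmentation on Chinese end punctuation
    tokenized = []
    for sent in sentences:
        words = [w for w in jieba.cut(sent.strip()) if w and w not in stopwords]
        if words:
            tokenized.append(words)
    return tokenized

def build_embedding_matrix(vocab, w2v_path, dim=100):
    """Initialize a 100-dimensional embedding matrix from pre-trained word2vec.

    Out-of-vocabulary words get a small random vector; the matrix is meant to
    remain trainable, i.e. updated during training as described in Step 2.
    """
    w2v = KeyedVectors.load_word2vec_format(w2v_path, binary=False)
    matrix = np.random.uniform(-0.05, 0.05, (len(vocab), dim)).astype("float32")
    for word, idx in vocab.items():
        if word in w2v:
            matrix[idx] = w2v[word]
    return matrix
```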



Abstract

The invention relates to a text abstract generation method based on a sentence association attention mechanism, and belongs to the technical field of natural language processing. The method comprises the following steps: first, encode the document with a hierarchical bidirectional long short-term memory (Bi-LSTM) network to obtain sentence semantic vectors; then, analyze the association relationships among sentences with a gating network to realize sentence-level importance and redundancy evaluation; finally, apply a decoding algorithm based on the sentence association attention mechanism to generate the abstract. When the neural abstract generation framework is constructed, sentence relevance analysis is fused in, improving the model's ability to evaluate sentence importance and redundancy in the original text. The method effectively improves the performance of generative summarization and achieves relatively good results on the current ROUGE evaluation metric.
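A rough PyTorch sketch of the pipeline the abstract describes: a word-level and a sentence-level Bi-LSTM encoder, a gating network over sentence vectors, and gate-reweighted attention at decoding time. The layer sizes, the exact form of the gate, and the attention score used here are assumptions for illustration, not the patented design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_lstm = nn.LSTM(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        self.sent_lstm = nn.LSTM(2 * hid_dim, hid_dim, bidirectional=True, batch_first=True)
        self.gate = nn.Linear(4 * hid_dim, 1)  # relates each sentence to the whole document

    def forward(self, docs):
        # docs: (batch, n_sents, n_words) tensor of word ids
        b, s, w = docs.size()
        words, _ = self.word_lstm(self.embed(docs.view(b * s, w)))
        sent_vecs = words.mean(dim=1).view(b, s, -1)        # sentence semantic vectors
        sent_states, _ = self.sent_lstm(sent_vecs)           # document-aware sentence states
        doc_vec = sent_states.mean(dim=1, keepdim=True).expand_as(sent_states)
        gates = torch.sigmoid(self.gate(torch.cat([sent_states, doc_vec], dim=-1)))
        return sent_states, gates                             # gates ~ importance / redundancy scores

def sentence_association_attention(dec_state, sent_states, gates):
    # Attention over sentences, rescaled by the gate scores so that important,
    # non-redundant sentences receive more attention mass during decoding.
    scores = torch.bmm(sent_states, dec_state.unsqueeze(-1)).squeeze(-1)
    weights = F.softmax(scores, dim=-1) * gates.squeeze(-1)
    weights = weights / (weights.sum(dim=-1, keepdim=True) + 1e-8)
    return torch.bmm(weights.unsqueeze(1), sent_states).squeeze(1)
```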

Description

Technical field

[0001] The invention relates to a text summarization method based on a sentence-associated attention mechanism, and belongs to the technical field of natural language processing.

Background technique

[0002] A text summary is a brief description of the content of a text: it condenses the article into refined language that expresses the most important information of the original. Through the abstract, users can grasp the gist of the original text, which helps to relieve problems such as information overload and difficulty of analysis. Current research on text summarization falls into two categories, extractive and generative. Extractive summarization usually estimates the importance of the sentences in the original text according to certain rules and selects high-scoring, semantically non-repetitive sentences to form the summary, whereas generative summarization is based on the premise of understanding the semantics...
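As a toy illustration of the generic extractive approach described in [0002] (not the invention itself): score each sentence, then greedily select high-scoring sentences while skipping ones too similar to those already chosen. The word-overlap scoring rule and the thresholds are assumptions made only for this sketch.

```python
from collections import Counter

def extractive_summary(sentences, max_sents=3, redundancy_threshold=0.5):
    """`sentences` is assumed to be a list of token lists for one document."""
    doc_counts = Counter(w for sent in sentences for w in sent)
    # importance score: average document-level frequency of the sentence's words
    scores = [sum(doc_counts[w] for w in sent) / max(len(sent), 1) for sent in sentences]
    picked = []
    for idx in sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True):
        candidate = set(sentences[idx])
        overlaps = (
            len(candidate & set(sentences[j])) / max(len(candidate), 1)
            for j in picked
        )
        if any(o > redundancy_threshold for o in overlaps):
            continue  # too similar to an already selected sentence
        picked.append(idx)
        if len(picked) == max_sents:
            break
    return [sentences[i] for i in sorted(picked)]
```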


Application Information

IPC(8): G06F17/27, G06N3/04
CPC: G06N3/049, G06F40/211, G06F40/30
Inventors: 郭军军, 赵瑶, 余正涛, 黄于欣, 吴瑾娟, 朱恩昌, 相艳
Owner KUNMING UNIV OF SCI & TECH