Method for Generating Blog Text Summaries Based on Deep Learning

A deep-learning-based summarization technique, applied in the field of blog text summary generation. It addresses the limited application of abstractive summarization to Chinese text and has wide application prospects.

Active Publication Date: 2021-02-12
SUZHOU INST FOR ADVANCED STUDY USTC
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] Natural-language text summarization falls into two main approaches. The first is extractive: summaries built from rules and statistics, an approach proven in a large number of practical applications. The second is abstractive: summaries generated by deep learning models. Abstractive summarization improved greatly in 2014, moving from mechanical text summarization to comprehensible summary generation, and is currently implemented with an encoder-decoder framework that embeds recurrent neural networks. Its application to Chinese text, however, has so far been limited.

Examples

Embodiment

[0060] A method for generating Chinese blog summaries based on deep learning comprises the following specific steps:

[0061] Step 1. Blog training data crawling and sorting

[0062] The blog training data is crawled from popular blogs on the CSDN website. The blogs obtained are diverse in content, but all are highly specialized texts. The crawled data also has defects: some blogs are too short, and some contain no text at all, only videos and pictures. Such blogs are discarded.
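The discard rule above can be sketched as a simple length filter. This is a minimal illustration, not the patent's implementation: the `CrawledBlog` record, the function names, and the `MIN_TEXT_CHARS` threshold are all hypothetical, since the source gives no cutoff for "too short".

```python
from dataclasses import dataclass

# Hypothetical threshold: the source does not say how short is "too short".
MIN_TEXT_CHARS = 200


@dataclass
class CrawledBlog:
    title: str
    text: str  # visible text only; videos and pictures contribute nothing


def is_usable(blog, min_chars=MIN_TEXT_CHARS):
    """Keep a blog only if it has enough body text to train on."""
    body = "".join(blog.text.split())  # ignore whitespace when measuring
    return len(body) >= min_chars


def filter_blogs(blogs, min_chars=MIN_TEXT_CHARS):
    """Drop blogs that are empty (media only) or too short."""
    return [b for b in blogs if is_usable(b, min_chars)]
```

A blog that contains only videos and pictures yields an empty `text` after extraction, so the same length check covers both defects named in the paragraph.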

[0063] Use find and get_text in BeautifulSoup to obtain the final blog text, and select the text content of the web page tag of category article_description as the actual blog summary. If a blog has no abstract, the title of the expert blog is combined with the highest-weighted sentence selected by TextRank, and the result is used as the blog's actual abstract during training.
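The fallback for blogs without an abstract might look like the sketch below. It is an assumption-laden illustration: the tokenizer (Latin words plus single CJK characters) stands in for a real Chinese word segmenter, the similarity measure is the one from the original TextRank paper, joining title and sentence with a space is a guess at what "combined" means, and all function names are hypothetical.

```python
import math
import re


def split_sentences(text):
    """Split on Chinese and Western sentence-ending punctuation."""
    parts = re.findall(r"[^。！？!?\n]+[。！？!?]?", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Stand-in tokenizer: Latin words and individual CJK characters."""
    return re.findall(r"[a-z0-9_]+|[\u4e00-\u9fff]", sentence.lower())


def similarity(a, b):
    """TextRank sentence similarity: shared tokens over summed log lengths."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids a zero denominator, since log(1) == 0
    return len(set(a) & set(b)) / (math.log(len(a)) + math.log(len(b)))


def textrank_scores(sentences, damping=0.85, iterations=50):
    """Score sentences by iterating weighted PageRank on the similarity graph."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def reference_summary(title, text, description=None):
    """Use the page's own description if present, else title + top sentence."""
    if description and description.strip():
        return description.strip()
    sentences = split_sentences(text)
    if not sentences:
        return title
    scores = textrank_scores(sentences)
    best = max(range(len(sentences)), key=scores.__getitem__)
    return f"{title} {sentences[best]}"
```

Here `description` plays the role of the article_description text; when it is missing or blank, the sentence most similar to the rest of the blog is chosen.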

[0064] The textRank method is a text...

Abstract

The invention discloses a method for generating blog text summaries based on deep learning, comprising the following steps: crawling blog data; preprocessing the crawled blog data and selecting blog text data; converting the selected blog text data into vector matrix data according to a Chinese word-vector dictionary; building a deep learning encoder-decoder model, training the model's encoder and decoder separately, and using the model once training is complete; and repeating steps S01-S03 to obtain the data to be summarized, which is passed through the trained model to generate a predicted summary. The present invention automatically generates blog text summaries based on the encoder-decoder deep learning framework and can also capture deeper semantic links within blogs. The generated summary directly displays the main content of the current blog, and the method has wide application prospects.
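The conversion of text into vector matrix data through a word-vector dictionary can be sketched as a plain lookup. This is only an illustration under assumptions the abstract does not state: the fixed length `max_len`, zero vectors for unknown words and padding, and the function name `text_to_matrix` are all hypothetical, and the input is assumed to be already segmented into words.

```python
def text_to_matrix(tokens, word_vectors, max_len, dim):
    """Map tokens to a fixed-size max_len x dim matrix via dictionary lookup."""
    zero = [0.0] * dim  # used for out-of-dictionary words and for padding
    # Truncate to max_len, then look each token up in the dictionary.
    rows = [list(word_vectors.get(tok, zero)) for tok in tokens[:max_len]]
    # Pad short texts so every blog yields a matrix of the same shape.
    rows.extend(list(zero) for _ in range(max_len - len(rows)))
    return rows


# Toy two-dimensional dictionary; real word vectors have far more dimensions.
toy_vectors = {"深度": [0.1, 0.2], "学习": [0.3, 0.4]}
matrix = text_to_matrix(["深度", "学习", "博客"], toy_vectors, max_len=4, dim=2)
```

A fixed shape is a common choice because batched training of an encoder-decoder model needs inputs of equal size; whether the patent pads or truncates this way is not stated in the abstract.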

Description

Technical field

[0001] The present invention relates to a method for generating text abstracts, in particular to a method for generating blog text abstracts based on deep learning.

Background technique

[0002] Natural Language Processing (NLP) is a particularly important part of artificial intelligence. It includes multiple sub-tasks such as text classification, sentiment analysis, machine translation, and reading comprehension. Almost every sub-task is an important professional research field in its own right, and the sub-tasks are independent yet interrelated.

[0003] Deep learning is a new type of end-to-end learning method proposed in recent years. On ordinary processing tasks such as classification, its results may be almost the same as those of ordinary neural networks, but in computing over high-dimensional data and extracting features, deep learning fits the data with deep networks and shows its powerful computational capability. At present, deep learning has been applied to many fields-image process...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/34, G06F16/33, G06F16/35, G06N3/08
CPC: G06N3/08, G06F16/3335, G06F16/345, G06F16/35
Inventor: 杨威, 周叶子, 黄刘生
Owner: SUZHOU INST FOR ADVANCED STUDY USTC