A text summarization method based on a BERT pre-trained model

A text summarization technology based on a pre-trained model, applied in neural learning methods, biological neural network models, instruments, etc. It addresses obstacles in the knowledge acquisition process and improves the quality, accuracy and fluency of the generated text.

Active Publication Date: 2022-05-06
CHONGQING UNIV OF POSTS & TELECOMM +1

AI Technical Summary

Problems solved by technology

Information overload on the Internet creates a huge barrier to the knowledge acquisition process.


Embodiment Construction

[0061] The technical solutions in the embodiments of the present invention will be described clearly and in detail below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some of the embodiments of the invention.

[0062] The technical solution by which the present invention solves the above technical problems is as follows:

[0063] In this embodiment, a method for generating abstracts based on the BERT pre-trained model is performed in the following steps.

[0064] Step 1: Preprocess the text data set: remove special characters, convert emoticons into words, and replace dates, hyperlink URLs, numbers, and English text with tags;

[0065] (1) Special characters: remove special characters, mainly punctuation marks and commonly used stop words such as particles and transition words, for example 「, ", ¥, ... and the words "ah", "hey", and "and";

[0066] (2) Convert the label content in brackets into words, such as [happy],...
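
The preprocessing in Step 1 might be sketched as follows. This is a minimal illustration, not the patent's implementation: the tag names (`TAG_URL`, `TAG_DATE`, `TAG_NUMBER`, `TAG_EN`), the regular expressions, the character and stop-word lists, and the order of operations are all assumptions, since the source only names the categories of items to remove or replace.

```python
import re

# Hypothetical tag names; the source only says these items are replaced by tags.
TAGS = {"url": "TAG_URL", "date": "TAG_DATE", "number": "TAG_NUMBER", "en": "TAG_EN"}

# One alternation so each span is tagged exactly once. Earlier branches win at
# a given position, so a URL is not split into English and number tags.
TOKEN_RE = re.compile(
    r"(?P<url>https?://[A-Za-z0-9./?=&%_#:-]+)"
    r"|(?P<date>\d{4}年\d{1,2}月\d{1,2}日|\d{4}-\d{1,2}-\d{1,2})"
    r"|(?P<number>\d+(?:\.\d+)?)"
    r"|(?P<en>[A-Za-z]+)"
)

# Matches a bracketed label such as [开心] and captures its content.
BRACKET_LABEL_RE = re.compile(r"\[([^\[\]]+)\]")

# Illustrative subsets only; the full lists are not given in the source.
SPECIAL_CHARS = "「」\"“”¥…"
STOP_WORDS = ("啊", "嘿")


def preprocess(text: str) -> str:
    """Apply the Step 1 clean-up to one short text (assumed ordering)."""
    # (2) Turn bracketed labels into plain words.
    text = BRACKET_LABEL_RE.sub(r"\1", text)
    # Replace URLs, dates, numbers and English words with tags in one pass.
    text = TOKEN_RE.sub(lambda m: TAGS[m.lastgroup], text)
    # (1) Drop special characters, then the listed stop words (naive substring removal).
    text = text.translate({ord(c): None for c in SPECIAL_CHARS})
    for word in STOP_WORDS:
        text = text.replace(word, "")
    return text
```

For example, `preprocess("看了3篇BERT论文")` returns `"看了TAG_NUMBER篇TAG_EN论文"`. Tagging runs before special-character removal in this sketch so that URL punctuation is still intact when the URL pattern is applied. That ordering is a design choice of the sketch rather than something the source specifies.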

Abstract

The invention claims a method for generating a text summary based on a BERT pre-trained model. The method includes: preprocessing a Chinese short-text data set; sending the data into the BERT pre-trained model for training, using BERT's bidirectional encoding to better capture global information; inputting the source text to be summarized into the BERT model, which uses the trained parameters to obtain the best word vectors; sending the resulting high-quality word vectors to an improved LeakGAN model; and training on the text in the improved LeakGAN to finally obtain the summary output. The invention enables the generator to produce more accurate summaries and improves their accuracy and fluency.

Description

Technical field

[0001] The invention belongs to the field of natural language processing text generation, and relates to a method for generating abstracts based on a BERT pre-training model.

Background technique

[0002] With the progress of the times and the development of information technology, the Internet has become an increasingly important social, entertainment and even work platform in human life, and it is the main channel for people to obtain various knowledge resources. The Internet has increasingly become an essential part of people's lives and has penetrated into every aspect of life.

[0003] However, while the Internet provides convenient and fast services for human beings, it also brings about the inevitable problem of information overload. With the rapid increase in the amount of information data, the form of information is also showing a trend of diversification, mainly including text, sound, image and so on. As the most basic form of information on the I...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/284, G06F40/30, G06F40/253, G06N3/04, G06N3/08
CPC: G06F40/284, G06F40/30, G06F40/253, G06N3/08, G06N3/047, G06N3/044, G06N3/045
Inventor: 文凯, 周玲玉, 杨航, 王宗文
Owner: CHONGQING UNIV OF POSTS & TELECOMM