Gibbs-restricted text abstract generation method using a pre-trained model

A pre-training and Gibbs-sampling technology, applied in neural learning methods, biological neural network models, unstructured text data retrieval, etc. Its effects include unified prediction, improved generation quality, and reduced training bias.

Active Publication Date: 2021-09-17
成都崇瑚信息技术有限公司

AI Technical Summary

Problems solved by technology

Although the emergence of deep learning has driven the development of text summarization, generated text summaries often suffer from problems such as lack of semantics, repeated generation, out-of-vocabulary (unregistered) words, polysemy, poor readability, and difficulty of evaluation, all of which call for further and urgent research.



Examples


Embodiment 1

[0047] As shown in Figure 1, this embodiment provides a method for generating a Gibbs-restricted text summary using a pre-trained model. The method uses a Trans-BLSTM model to train and generate the text summary: starting from the Transformer, the FFN (feed-forward network) layer of the encoder is replaced by a Bi-LSTM connected to a Linear layer, while the decoder part remains unchanged. A minimal sketch of such an encoder layer is given below; the training process of the Trans-BLSTM model then proceeds through the following steps.
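The patent publishes no reference code, so the following is only a sketch of the layer described above, assuming PyTorch and illustrative dimensions (d_model=768 to match BERT-base, 8 attention heads); the class and parameter names are our own, not from the source.

```python
import torch
import torch.nn as nn

class TransBLSTMEncoderLayer(nn.Module):
    """Transformer encoder layer with the position-wise FFN replaced by a
    Bi-LSTM + Linear projection, as paragraph [0047] describes (sketch)."""

    def __init__(self, d_model=768, n_heads=8, lstm_hidden=384, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads,
                                               dropout=dropout, batch_first=True)
        # Bi-LSTM + Linear stand in for the usual FFN sub-layer.
        self.bilstm = nn.LSTM(d_model, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.linear = nn.Linear(2 * lstm_hidden, d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        # Multi-head self-attention sub-layer with residual + LayerNorm.
        attn_out, _ = self.self_attn(x, x, x, key_padding_mask=key_padding_mask)
        x = self.norm1(x + self.dropout(attn_out))
        # Bi-LSTM sub-layer (replacing the FFN), also residual + LayerNorm.
        lstm_out, _ = self.bilstm(x)
        return self.norm2(x + self.dropout(self.linear(lstm_out)))
```

The residual-plus-LayerNorm wiring mirrors the standard Transformer encoder, so only the FFN sub-layer changes, consistent with the statement that the decoder remains unchanged.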

[0048] (1) First use the pre-trained language model BERT to perform word vectorization on the source sequence of the text, x = {x₁, x₂, …, xₙ}, adding relative position encoding at the same time to obtain the Word Embedding of the text (see the embedding sketch after these steps);

[0049] (2) In the encoder stage, use the multi-head attention mechanism and Bi-LSTM to extract features, train and fine-tune the model, and obtain the output of the encoder;

[0050] (3) Using the same word-embedding method as for the source sequence at the encoder end, add relative position encoding to obtain the target seque...
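For step (1) above, a minimal sketch using the HuggingFace transformers library. The checkpoint name (bert-base-chinese) and the learned position table are assumptions: the patent names BERT and a relative position encoding but gives no formula, so the position term here is only a placeholder.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # checkpoint assumed
bert = BertModel.from_pretrained("bert-base-chinese")
pos_table = nn.Embedding(512, 768)  # placeholder for the relative position encoding

def embed_source(text):
    """Word Embedding of a source text: BERT token vectors + position term."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        token_vecs = bert(**enc).last_hidden_state   # (1, n, 768)
    positions = torch.arange(token_vecs.size(1))
    return token_vecs + pos_table(positions)         # (1, n, 768)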



Abstract

The invention relates to the technical field of text abstracts, in particular to a Gibbs-limited text abstract generation method using a pre-trained model. The method uses a model to train and generate a text abstract, and the training comprises the following steps: (1) perform word vectorization on the text source sequence and add relative position encoding to obtain the Word Embedding; (2) extract features using the attention mechanism and Bi-LSTM, train and fine-tune the model, and obtain the output of the encoder; (3) add relative position encoding to obtain the target-sequence Word Embedding; (4) keep the parameters of the decoder end consistent with those of the Transformer; (5) input the Attention matrix into a fully connected layer and then compute a probability representation over the vocabulary; (6) fuse an LDA model into the decoder end for keyword extraction, and extract and generate the abstract in combination with the Gibbs sampling algorithm. With this method, text abstracts can be generated with better quality.
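Step (6) is where the Gibbs constraint enters: an LDA topic model is fused into the decoder end and keywords are extracted via Gibbs sampling. The patent does not publish its sampler, so below is a minimal collapsed Gibbs sampler for LDA; every hyperparameter (n_topics, alpha, beta, n_iters) and helper name is illustrative.

```python
import numpy as np

def lda_gibbs(docs, vocab_size, n_topics=5, alpha=0.1, beta=0.01, n_iters=200):
    """Collapsed Gibbs sampling for LDA. docs: list of lists of word ids.
    Returns the topic-word count matrix after sampling."""
    rng = np.random.default_rng(0)
    n_dt = np.zeros((len(docs), n_topics))   # doc-topic counts
    n_tw = np.zeros((n_topics, vocab_size))  # topic-word counts
    n_t = np.zeros(n_topics)                 # tokens per topic
    z = []                                   # topic assignment of each token
    for d, doc in enumerate(docs):
        zd = rng.integers(n_topics, size=len(doc))
        z.append(zd)
        for w, t in zip(doc, zd):
            n_dt[d, t] += 1; n_tw[t, w] += 1; n_t[t] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                  # remove the current assignment
                n_dt[d, t] -= 1; n_tw[t, w] -= 1; n_t[t] -= 1
                # Full conditional: p(z=t) proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta)
                p = (n_dt[d] + alpha) * (n_tw[:, w] + beta) \
                    / (n_t + vocab_size * beta)
                t = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = t                  # resample and restore counts
                n_dt[d, t] += 1; n_tw[t, w] += 1; n_t[t] += 1
    return n_tw

def top_keywords(n_tw, id2word, k=5):
    """Top-k words per topic: candidate keywords for constraining the summary."""
    return [[id2word[i] for i in np.argsort(-row)[:k]] for row in n_tw]
```

The top-ranked words per topic would then serve as the keyword constraints that guide abstract generation at the decoder end, as the abstract above describes.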

Description

Technical field

[0001] The invention relates to the technical field of text summarization, in particular to a method for generating a Gibbs-restricted text summary using a pre-trained model.

Background technique

[0002] Against the background of today's highly developed networks, hundreds of millions of data flows are generated on the Internet every day, and the overwhelming flow of information fills our lives; extracting the information we need from this flow is very important. Since the mobile Internet entered a stage of high-speed development in 2012, the amount of text information has grown exponentially and explosively. This huge volume of text makes people spend a great deal of time browsing the Internet, which greatly increases the user's reading cost and the cost of obtaining important information. Quickly extracting the key information in text data from an excess of information has become an urgent need across industries. Text summarization is a brief ...


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F16/34; G06F40/216; G06F40/289; G06K9/62; G06N3/04; G06N3/08
CPC: G06F16/345; G06F40/216; G06F40/289; G06N3/08; G06N3/044; G06F18/2132; Y02D10/00
Inventors: 纪禄平, 杨凡, 陈香
Owner: 成都崇瑚信息技术有限公司