Generative abstract generation method based on image-text fusion

An abstractive summarization technology, applied in the field of generative abstract generation based on image-text fusion, which can solve the problem of missing key entities in generative abstracts and improve the quality and readability of the generated summaries.

Active Publication Date: 2020-01-17
INST OF INFORMATION ENG CAS
Cites 5 · Cited by 9

AI Technical Summary

Problems solved by technology

[0005] The proposal of this application can solve the problem of missing key entities in existing generative summaries, thereby improving the quality and readability of the generated summaries.




Embodiment Construction

[0048] In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings.

[0049] This embodiment uses the multi-modal sentence summarization dataset MMSS, which consists of (text, abstract, image) triples (X, Y, I). The texts and abstracts come from the Gigaword dataset, which is widely used to evaluate summarization systems, and the images are retrieved through search engines. After manual screening, the final (X, Y, I) triple dataset contains 66,000 samples in the training set and 2,000 samples each in the validation set and test set.
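The triple structure and split sizes above can be sketched as follows; the class and field names are illustrative assumptions, not from the patent.

```python
from dataclasses import dataclass

# Hypothetical container for one MMSS sample (X, Y, I)
@dataclass
class MMSSample:
    text: str        # X: source sentence from Gigaword
    summary: str     # Y: reference abstract
    image_path: str  # I: image retrieved for the text

def split_dataset(samples):
    """Split into train/validation/test with the sizes reported for
    MMSS: 66,000 / 2,000 / 2,000 (assumes samples are pre-shuffled)."""
    train = samples[:66000]
    valid = samples[66000:68000]
    test = samples[68000:70000]
    return train, valid, test

# Example with placeholder samples
data = [MMSSample(f"text {i}", f"summary {i}", f"img_{i}.jpg")
        for i in range(70000)]
tr, va, te = split_dataset(data)
print(len(tr), len(va), len(te))  # 66000 2000 2000
```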

[0050] Step 1, preprocessing the dataset.

[0051] In step 1.1, the text, abstract, and image in the given original dataset are placed in one-to-one correspondence, forming the triples (X, Y, I).

[0052] In step 1.2, special characters, emoticons, and full-width characters are removed, such as "¥", "30...
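A minimal sketch of the cleanup in step 1.2, assuming NFKC normalization for full-width characters and simple Unicode ranges for emoticons; the exact character set handled by the patent is not specified, so the patterns below are illustrative.

```python
import re
import unicodedata

def preprocess(text: str) -> str:
    """Normalize full-width characters to half-width, then strip
    emoji and currency/special symbols (illustrative choices)."""
    # NFKC folds full-width forms (e.g. "３０", "：") to ASCII
    text = unicodedata.normalize("NFKC", text)
    # Drop emoji / pictographs (common Unicode emoji blocks)
    text = re.sub(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]", "", text)
    # Drop currency signs such as "¥"
    text = re.sub(r"[¥$€£]", "", text)
    # Collapse whitespace left behind by removals
    return re.sub(r"\s+", " ", text).strip()

print(preprocess("Price：￥３０ 😀 ok"))  # → "Price:30 ok"
```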



Abstract

The invention discloses a generative abstract generation method based on image-text fusion. The method comprises the following steps: 1) dividing a given text data set into a training set, a verification set and a test set, wherein each sample in the text data set is a triple (X, I, Y), X is a text, I is an image corresponding to the text X, and Y is an abstract of the text X; 2) performing entity feature extraction on the images of the text data set, and expressing the extracted entity features as image feature vectors with the same dimension as the text; 3) training the generative abstract model by using the training set and the image feature vectors corresponding to the training set; and 4) inputting a text and a corresponding image, generating an image feature vector of the image, and inputting the text and the image feature vector corresponding to the text into the trained generative abstract model to obtain an abstract corresponding to the text. The abstract generated by the method can effectively adjust the weight of entities in the text, and the out-of-vocabulary (unregistered word) problem is alleviated to a certain extent.
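Step 2 above maps extracted entity features into vectors with the same dimension as the text embeddings. A minimal sketch of such a projection, assuming detector features of size 2048 and text embeddings of size 512 (both sizes are illustrative assumptions, not from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

IMG_DIM, TXT_DIM = 2048, 512           # illustrative dimensions
W = rng.normal(scale=0.02, size=(IMG_DIM, TXT_DIM))  # learned projection
b = np.zeros(TXT_DIM)

def project_entities(entity_feats: np.ndarray) -> np.ndarray:
    """Map (num_entities, IMG_DIM) detector features to
    (num_entities, TXT_DIM) vectors matching the text dimension."""
    return entity_feats @ W + b

feats = rng.normal(size=(5, IMG_DIM))  # e.g. 5 detected entities
vecs = project_entities(feats)
print(vecs.shape)  # (5, 512)
```

In practice W and b would be trained jointly with the summarization model rather than fixed at random.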

Description

technical field

[0001] The invention belongs to the field of artificial intelligence technology, and relates to a method for generating a generative abstract based on image-text fusion.

Background technique

[0002] Existing generative summarization methods are mainly based on the deep learning seq2seq framework and the attention mechanism. The seq2seq framework is mainly composed of an encoder and a decoder, both implemented by neural networks; the neural network can be a recurrent neural network (RNN) or a convolutional neural network (CNN). The specific process is as follows: the encoder encodes the input original text into a context vector, which is a representation of the original text; the decoder is then responsible for extracting important information from this vector and generating the text summary. The attention mechanism solves the information-loss bottleneck caused by compressing a long sequence into a fixed-length vector...
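The attention step described above can be sketched numerically: at each decoding step, the decoder state scores every encoder state, and the context vector is the softmax-weighted sum. Dimensions and the dot-product scoring function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d = 6, 8                            # source length, hidden size
enc_states = rng.normal(size=(T, d))   # encoder outputs h_1..h_T
dec_state = rng.normal(size=(d,))      # current decoder state s_t

scores = enc_states @ dec_state        # dot-product attention scores
weights = np.exp(scores - scores.max())
weights /= weights.sum()               # softmax over source positions
context = weights @ enc_states         # context vector fed to the decoder

print(weights.shape, context.shape)    # (6,) (8,)
```

Because the context is recomputed at every step, the decoder is not forced to rely on a single fixed-length encoding of the whole source.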

Claims


Application Information

Patent Timeline
Patent Type & Authority: Application (China)
IPC (8): G06F16/34, G06F16/35, G06F16/36, G06T11/60
CPC: G06F16/345, G06F16/35, G06F16/367, G06T11/60
Inventors: 曹亚男, 徐灏, 尚燕敏, 刘燕兵, 谭建龙, 郭莉
Owner: INST OF INFORMATION ENG CAS