Image-text data fusion method and system based on attention mechanism

An image-text data fusion technology, applied in the fields of editing/combining figures or text, computer components, and character and pattern recognition, that addresses the problems of unsatisfactory fusion quality and limited application scope.

Active Publication Date: 2019-05-21
WUHAN UNIV

AI Technical Summary

Problems solved by technology

[0009] The purpose of the present invention is to provide an image-text data fusion method based on the attention mechanism in view



Examples


Embodiment 1

[0069] Embodiment 1: Referring to Figure 1, an image-text data fusion method based on an attention mechanism. For image and text data, an attention-based image-text data fusion network is constructed by combining word vectors, position encoding, and feature maps produced by convolution kernels. A complete training network is then built according to the specific task, a usable data fusion network is obtained through training, and the fusion of image and text data is thereby realized. The specific steps are as follows:

[0070] Step S1, collecting image and text datasets under specific tasks;

[0071] Step S2, preprocessing the collected image and text data sets as a training set;

[0072] Step S3, constructing an image-text data fusion network based on an attention mechanism;

[0073] Step S4, constructing an output network according to the task, and connecting it to the data fusion network to form a training network;

[0074] Step S5, using the...
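The fusion idea behind step S3 can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented network: the source names word vectors, position encoding, and convolutional feature maps but does not give formulas here, so the sinusoidal position code, the scaled dot-product attention, the residual sum, and the function names `positional_encoding` and `fuse` are all hypothetical choices. Image regions stand in for flattened feature-map locations.

```python
import math

def positional_encoding(num_positions, dim):
    """Sinusoidal position codes (an assumed choice; the source only says 'position encoding')."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fuse(word_vectors, region_features):
    """Each word attends over the image regions; returns one fused vector per word."""
    dim = len(word_vectors[0])
    pos = positional_encoding(len(word_vectors), dim)
    # Word vector plus position code: word order is modelled without recurrence.
    queries = [[w + p for w, p in zip(wv, pv)] for wv, pv in zip(word_vectors, pos)]
    fused = []
    for q in queries:
        # Scaled dot-product attention weights over the image regions.
        weights = softmax([dot(q, r) / math.sqrt(dim) for r in region_features])
        context = [sum(w * r[j] for w, r in zip(weights, region_features))
                   for j in range(dim)]
        # Assumed fusion rule: add the attended image context to the text query.
        fused.append([qj + cj for qj, cj in zip(q, context)])
    return fused

# Toy inputs: 3 words and 2 image regions, all 4-dimensional (made-up values).
words = [[0.1, 0.2, 0.3, 0.4], [0.0, 1.0, 0.0, 1.0], [0.5, 0.5, 0.5, 0.5]]
regions = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
fused = fuse(words, regions)
print(len(fused), len(fused[0]))  # 3 4
```

Because every word's query is computed independently of the others, the loop over `queries` could run in parallel, which is the property the position-encoding design is meant to provide.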

Embodiment 2

[0098] Embodiment 2: This attention-based image-text data fusion method takes image retrieval as the specific task and uses the network designed in Figure 3 as the training network; the data fusion network is shown in Figure 2. Following Figure 1, the steps of the method in this embodiment are as follows:

[0099] S1. The well-known Flickr30k dataset is selected as the task-specific dataset. It contains 31,000 images, each with 5 different text annotations. An image together with one of its text annotations is treated as the task input, and the task output is 1, indicating that the image and the text annotation match.

[0100] S2. Preprocess the collected image and text datasets: subtract the mean from the image data and perform word segmentation on the text annotations. The preprocessed dataset is used as the training set.

[0101] S3. Build an image-text data fusion network based...
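Steps S1 and S2 of this embodiment can be illustrated with a small preprocessing sketch. It assumes a per-image mean (the source does not say whether the mean is taken per image or over the whole dataset) and a simple regex tokenizer as a stand-in for word segmentation; the function names and the toy data are hypothetical. With 31,000 images and 5 annotations each, the same pairing would yield 155,000 matching pairs.

```python
import re

def demean_image(pixels):
    """Subtract the image's own mean from every pixel (per-image mean is an assumption)."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return [[v - mean for v in row] for row in pixels]

def tokenize(caption):
    """Lowercase the caption and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", caption.lower())

def build_pairs(dataset):
    """Turn {image_id: (pixels, captions)} into (image_id, image, tokens, label) samples.

    Every image is paired with each of its own captions and labelled 1 (match).
    """
    pairs = []
    for image_id, (pixels, captions) in dataset.items():
        centered = demean_image(pixels)
        for caption in captions:
            pairs.append((image_id, centered, tokenize(caption), 1))
    return pairs

# Toy stand-in for the dataset: one 2x2 grayscale image with two captions.
toy = {"img_0": ([[0, 2], [4, 6]], ["Two dogs run.", "A dog on grass"])}
pairs = build_pairs(toy)
print(len(pairs))   # 2
print(pairs[0][2])  # ['two', 'dogs', 'run']
```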


Abstract

The invention discloses an image-text data fusion method and system based on an attention mechanism. The method comprises the following steps: first, an attention-based image-text data fusion network is constructed; a complete training network is then built for a specific task and trained on a training set; finally, the image and text data to be fused are input into the trained data fusion network to achieve data fusion. Compared with the prior art, the method has the following outstanding characteristics and advantages: first, position encoding is introduced in place of a recurrent neural network to model text context, so the data fusion network is more parallelizable and the model trains faster; second, because the image and text are fused at the semantic level, the fused data is of higher quality and more usable; third, the data fusion network can be trained through multiple tasks, which makes it more robust.

Description

Technical field

[0001] The invention relates to a data fusion method for images and text. Specifically, an image-text data fusion network is constructed based on an attention mechanism, a complete training network is then built according to a specific task and trained on a training set, and finally the image and text data to be fused are input into the trained data fusion network to obtain the fused data. It is an image-text data fusion method based on the attention mechanism.

Background

[0002] In recent years, the rapid development of sensor technology and computer technology has greatly advanced research on data fusion, and the application of data fusion technology has expanded rapidly from military to civilian use. At present, data fusion technology has achieved results in many civilian fields. These fields mainly include robotics and intelligent instrument systems, intelligent manufacturing s...


Application Information

IPC(8): G06T11/60, G06K9/62
Inventor: 刘进, 郭峻材, 沈晨凯, 崔晓晖, 储玮, 周平义, 余啸, 付忠旺
Owner WUHAN UNIV