
Image text mutual retrieval method based on bidirectional attention

A technology involving attention mechanisms and text, applied in the field of image processing, achieving the effect of accurate feature construction.

Active Publication Date: 2019-11-29
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to overcome the shortcomings of the prior art described above by solving the problem of mutual retrieval between natural images and electronic texts that carry the same semantic information, and to propose an image-text mutual retrieval method based on bidirectional attention.




Detailed Description of the Embodiments

[0045] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0046] Referring to figure 1, the steps of the present invention are described in further detail.

[0047] Step 1: generate the training set and the test set.

[0048] A total of 25,000 image-text pairs are arbitrarily selected from the Flickr30k dataset; 15,000 pairs constitute the training set and the remaining 10,000 pairs constitute the test set.
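As an illustration only (the patent does not publish code), the split in step 1 can be sketched in Python as follows, assuming the Flickr30k pairs have already been loaded into a hypothetical list all_pairs of (image_path, caption) tuples:

    import random

    # Sketch of step 1: arbitrarily select 25,000 image-text pairs from
    # Flickr30k, 15,000 for the training set and 10,000 for the test set.
    # `all_pairs` is a hypothetical list of (image_path, caption) tuples.
    def make_splits(all_pairs, n_total=25000, n_train=15000, seed=0):
        rng = random.Random(seed)
        selected = rng.sample(all_pairs, n_total)      # arbitrary selection
        return selected[:n_train], selected[n_train:]  # train, test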

[0049] Step 2: use a neural network to extract the features of each image-text pair.

[0050] Build a 14-layer neural network, then set and train the parameters of each layer.

[0051] The structure of the neural network is as follows: first convolutional layer → first pooling layer → second convolutional layer → second pooling layer → third convolutional layer → third pooling layer → fourth convolutional layer → fourth pooling layer → fifth convolutional layer → fifth p...
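The layer list above is truncated in the source. As a hedged sketch of the visible convolution-pooling prefix (in PyTorch, with channel counts and kernel sizes that are illustrative assumptions rather than values taken from the patent):

    import torch.nn as nn

    # One convolution + pooling block, repeated five times per the
    # structure listed above.
    def conv_pool(in_ch, out_ch):
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

    # Visible prefix of the 14-layer network; the source truncates after
    # the fifth block, so the remaining layers are omitted here.
    image_encoder = nn.Sequential(
        conv_pool(3, 64),     # first convolutional + pooling layer
        conv_pool(64, 128),   # second convolutional + pooling layer
        conv_pool(128, 256),  # third convolutional + pooling layer
        conv_pool(256, 512),  # fourth convolutional + pooling layer
        conv_pool(512, 512),  # fifth convolutional + pooling layer
    )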



Abstract

The invention discloses an image-text mutual retrieval method based on bidirectional attention that can be used for mutual retrieval of electronic texts and natural images. Natural image features and electronic text features are first extracted with a deep neural network. A bidirectional attention module then reconstructs these preliminarily extracted features so that the reconstructed features contain richer semantic information. By using the bidirectional attention module to improve the traditional feature extraction process, the method obtains high-order features that accommodate more image and text semantic information and thereby achieves image-text mutual retrieval.
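As a rough illustration of what such a bidirectional attention module computes (a generic sketch, not the patent's exact formulation; the shapes, the softmax attention, and the residual fusion are all assumptions):

    import torch
    import torch.nn.functional as F

    def bidirectional_attention(img, txt):
        # img: (n_regions, d) image features; txt: (n_words, d) text features
        sim = img @ txt.t()                        # region-word affinity scores
        img_ctx = F.softmax(sim, dim=1) @ txt      # each region attends over words
        txt_ctx = F.softmax(sim.t(), dim=1) @ img  # each word attends over regions
        # Residual combination stands in for the "reconstruction" step.
        return img + img_ctx, txt + txt_ctx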

Description

Technical Field

[0001] The invention belongs to the technical field of image processing, and further relates to an image-text mutual retrieval method based on bidirectional attention at the intersection of natural language processing and computer vision. The invention can be used to mine the deep connection between two different modalities, natural images and electronic texts: it extracts natural image features and text features, uses the extracted features to calculate the matching probability between a natural image and an electronic text, and thereby realizes mutual retrieval between the two modalities.

Background Technique

[0002] There are currently two approaches to image-text mutual retrieval: building a similarity learning network, or building a feature extraction network. The similarity-learning approach uses a similarity learning network to learn the similarity of two types of data ...
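As a hedged sketch of the retrieval step described in [0001], the snippet below scores a query feature against candidate features from the other modality and normalizes the scores into matching probabilities. Cosine similarity is used as a common stand-in; the excerpt does not specify the actual scoring function:

    import torch
    import torch.nn.functional as F

    def retrieve(query_feat, candidate_feats):
        # query_feat: (d,) feature of the query (image or text);
        # candidate_feats: (n, d) features of the other modality.
        q = F.normalize(query_feat, dim=0)
        c = F.normalize(candidate_feats, dim=1)
        scores = c @ q                    # cosine similarity per candidate
        probs = F.softmax(scores, dim=0)  # normalized matching probabilities
        ranking = torch.argsort(scores, descending=True)
        return ranking, probs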


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F16/483; G06K9/62; G06N3/04; G06N3/08
CPC: G06F16/483; G06N3/08; G06N3/047; G06N3/045; G06F18/22; G06F16/5866; G06V20/35; G06V10/454; G06V10/82; G06V30/19173; G06F16/44; G06F18/214
Inventors: 刘静, 石雨佳
Owner: XIDIAN UNIV