Multi-step self-attention cross-media retrieval method and system based on limited text space

An attention-based cross-media retrieval technology, applied in the fields of computer vision and information retrieval. It addresses three problems: the uncertainty of image and text encoding, the lack of consideration of object interaction information, and the inability to fix the focus information of images and texts. Its stated effects are reduced interference, fast training, and good experimental results.

Active Publication Date: 2019-05-21
PEKING UNIV SHENZHEN GRADUATE SCHOOL

AI Technical Summary

Problems solved by technology

However, most existing attention-based methods consider only the object-level information shared between images and text, and ignore the interaction information between objects.
[0004] The second sub-problem is how to find a suitable isomorphic feature space.
If an additive or product (multiplicative) self-attention mechanism is used in a cross-media retrieval algorithm, the focus information of images and texts cannot be fixed. This makes the image and text encoding uncertain, which reduces the practical value of the algorithm.
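For readers unfamiliar with the two mechanisms named above, the following is a minimal sketch of how additive and product attention scores are commonly computed. It is not taken from the patent: the function names, the projection matrices `W_q` and `W_k`, and the vector `v` are all assumptions introduced for illustration.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in M]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def additive_scores(query, keys, W_q, W_k, v):
    """Additive attention: score_i = v . tanh(W_q q + W_k k_i).
    W_q, W_k and v are hypothetical learned parameters."""
    q_proj = matvec(W_q, query)
    scores = []
    for k in keys:
        k_proj = matvec(W_k, k)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(dot(v, hidden))
    return scores

def product_scores(query, keys):
    """Product attention: scaled dot product between query and each key."""
    scale = math.sqrt(len(query))
    return [dot(query, k) / scale for k in keys]

# Toy example: one query attending over two keys.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
additive_weights = softmax(additive_scores(query, keys, identity, identity, [1.0, 1.0]))
product_weights = softmax(product_scores(query, keys))
```

In both variants the attention weights depend on the query and on learned parameters, so the region or word that receives the most weight can change from input to input.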



Embodiment Construction

[0050] The invention is further described below through embodiments, with reference to the accompanying drawings, without limiting the scope of the invention in any way.

[0051] The invention provides a multi-step self-attention cross-media retrieval method based on a limited text space. It comprises a feature extraction network, a feature mapping network, and a similarity measurement network. First, the feature extraction network extracts global features, regional feature sets, and associated features of images and texts. Second, these features are sent to the feature mapping network, which uses a multi-step self-attention mechanism to extract as much object-level shared information between images and texts as possible. However, object-level shared information alone does not account for the interaction between different objects. As shown in Figure 1, for two different image-text pairs, the object-level shared information between images and texts is similar, such as “m...
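The excerpt does not give the exact formulation of the multi-step self-attention mechanism, so the following is only a rough sketch of the general idea: attend over a regional feature set several times and keep the focused vector from each step. The function name `multi_step_attention`, the use of the global feature as the initial context, and the additive context update are assumptions, not the patent's definition.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_step_attention(global_feat, regions, steps=2):
    """Return one attended vector per step.

    global_feat: global feature vector, used here as the initial context
                 (an assumption for this sketch).
    regions:     regional feature set, a list of vectors of the same size.
    """
    dim = len(global_feat)
    scale = math.sqrt(dim)
    context = list(global_feat)
    attended = []
    for _ in range(steps):
        # Score every region against the current context and normalize.
        weights = softmax([dot(context, r) / scale for r in regions])
        # The focused feature is the attention-weighted sum of the regions.
        focus = [sum(w * r[j] for w, r in zip(weights, regions)) for j in range(dim)]
        attended.append(focus)
        # Hypothetical update: fold the focus back into the context so the
        # next step is conditioned on what has already been attended to.
        context = [c + f for c, f in zip(context, focus)]
    return attended

# Toy example: two one-hot "regions" and a global feature aligned with the first.
steps_out = multi_step_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], steps=3)
```

The same routine could be applied to word-level text features, giving one image vector and one text vector per step for the similarity measurement network to compare.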



Abstract

The invention discloses a multi-step self-attention cross-media retrieval method and retrieval system based on a limited text space. The method comprises the following steps: constructing a limited text space with a relatively fixed vocabulary, and converting the unrestricted text space into the limited text space; extracting image features and text features of the limited text space through a feature extraction network, wherein the features comprise global features, a regional feature set, and associated features; sending the extracted features into a feature mapping network, and extracting object-level shared information between the image and the text through a multi-step self-attention mechanism; and collecting the useful information at each step through a similarity measurement network to measure the similarity between the image and the text, and calculating a triplet loss function. Multi-step self-attention cross-media retrieval based on the limited text space is thereby realized. Introducing the multi-step self-attention mechanism and the associated features greatly improves the cross-media retrieval recall rate.
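The abstract does not spell out the similarity measure or the exact loss, so the sketch below only illustrates one common arrangement: sum a cosine similarity over the attention steps, then apply a bidirectional hinge-style triplet loss. The function names, the cosine measure, the `margin` value of 0.2, and the two-direction form are assumptions for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def stepwise_similarity(image_steps, text_steps):
    """Sum the per-step similarities between image and text vectors."""
    return sum(cosine(v, u) for v, u in zip(image_steps, text_steps))

def triplet_loss(pos_sim, neg_text_sim, neg_image_sim, margin=0.2):
    """Hinge-style triplet loss in both retrieval directions.

    pos_sim:       similarity of the matching image-text pair
    neg_text_sim:  similarity of the image with a non-matching text
    neg_image_sim: similarity of the text with a non-matching image
    """
    image_to_text = max(0.0, margin - pos_sim + neg_text_sim)
    text_to_image = max(0.0, margin - pos_sim + neg_image_sim)
    return image_to_text + text_to_image

# Toy example: identical per-step vectors give the maximum similarity per step.
sim = stepwise_similarity([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
loss_easy = triplet_loss(1.0, 0.0, 0.0)
loss_hard = triplet_loss(0.5, 0.5, 0.4)
```

The loss is zero once the matching pair beats both negatives by at least the margin, which is what pushes matching pairs toward the top of the ranking during training.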

Description

Technical field

[0001] The invention relates to the technical field of computer vision and information retrieval, and in particular to a multi-step self-attention cross-media retrieval method and system based on a limited text space.

Background

[0002] In recent years, with the rapid development of information technology, multimedia data on the Internet has become increasingly abundant, and multimedia data of different modalities (text, image, audio, video, etc.) can express similar content. To meet users' growing need for multimedia retrieval, the cross-media retrieval task was proposed: find an isomorphic semantic space (public space, text space, or image space) in which the similarity between the underlying heterogeneous multimedia data can be measured directly. More precisely, the core problem of cross-media retrieval can be divided into two sub-problems.

[0003] The first sub-problem is how to learn effective low-level fe...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/435, G06N3/04, G06N3/08
CPC: G06F16/435, G06N3/04, G06N3/08
Inventor: 王文敏, 余政
Owner: PEKING UNIV SHENZHEN GRADUATE SCHOOL