Fine-grained image-text retrieval method and system based on Transformer model

A fine-grained, Transformer-model-based technology, applied in unstructured text data retrieval, digital data information retrieval, biological neural network models, and related fields. It addresses problems such as unsatisfactory retrieval accuracy, achieving excellent retrieval results with the effect of improved performance and high retrieval precision.

Pending Publication Date: 2022-07-22
浙大宁波理工学院

AI Technical Summary

Problems solved by technology

The current retrieval method based on deep learning can effectively rea…



Examples


Example Embodiment

[0075] Example

[0076] As shown in Figure 1, a fine-grained image-text retrieval method based on the Transformer model includes the following steps:

[0077] S01: Obtain the region vector group of the target regions of the image and the word vector group of the text;

[0078] S02: Use a trained self-attention Transformer model to optimize each target region of the image, so that each target region obtains effective information from the surrounding target regions; likewise, use a trained self-attention Transformer model to determine the meaning of the current target word from the semantic context of the text;
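The excerpt does not give implementation details for the self-attention step; as a minimal sketch, assuming single-head scaled dot-product attention with toy random projection weights (the shapes and initialization here are illustrative, not the patent's), the idea of each region vector absorbing information from the others can be written as:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (n, d) stack of region (or word) vectors.
    Each output row is a weighted mix of all rows of V, which is how
    one target region can absorb context from the surrounding regions.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])           # (n, n) pairwise affinities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
    return weights @ V                               # context-enriched vectors

rng = np.random.default_rng(0)
n, d = 5, 8                                          # 5 regions, 8-dim features (toy sizes)
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                                     # (5, 8): one enriched vector per region
```

The same function applies unchanged to the word vector group of the text, which is why the step uses one self-attention model per modality.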

[0079] S03: Use a trained mutual-attention Transformer model to process cross-modal information and exchange information between the image and the text, so that the region vector group incorporates key information and the word vector group incorporates detail information;
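The mutual-attention step is likewise not spelled out in the excerpt; a common reading (an assumption here, not the patent's stated design) is cross-attention, where the queries come from one modality and the keys/values from the other, run in both directions:

```python
import numpy as np

def cross_attention(Q_in, KV_in, Wq, Wk, Wv):
    """Queries from one modality attend over keys/values from the other."""
    Q, K, V = Q_in @ Wq, KV_in @ Wk, KV_in @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ V

rng = np.random.default_rng(1)
d = 8
regions = rng.standard_normal((5, d))    # image-side region vectors (toy data)
words = rng.standard_normal((7, d))      # text-side word vectors (toy data)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

# Image attends to text, and text attends to image, so each modality's
# vectors end up carrying information from the other.
regions_ctx = cross_attention(regions, words, Wq, Wk, Wv)   # (5, 8)
words_ctx = cross_attention(words, regions, Wq, Wk, Wv)     # (7, 8)
print(regions_ctx.shape, words_ctx.shape)
```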

[0080] S04: Calculate the similarity between each finally obtained region vector and each word vector, obtain a fine-grained semantic similarity matrix of the input image and the input text, and derive the retrieval result from it.
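The excerpt does not say which similarity measure or aggregation the method uses; as a hedged sketch, assuming cosine similarity for the fine-grained matrix and a max-then-mean aggregation (an illustrative choice, not taken from the patent) to turn the matrix into a single retrieval score:

```python
import numpy as np

def similarity_matrix(regions, words):
    """Cosine similarity between every region vector and every word vector."""
    R = regions / np.linalg.norm(regions, axis=1, keepdims=True)
    W = words / np.linalg.norm(words, axis=1, keepdims=True)
    return R @ W.T                       # (n_regions, n_words) fine-grained matrix

rng = np.random.default_rng(2)
regions = rng.standard_normal((5, 8))    # toy region vectors
words = rng.standard_normal((7, 8))      # toy word vectors
S = similarity_matrix(regions, words)

# Illustrative aggregation: for each word take its best-matching region,
# then average over words to score the image-text pair.
score = S.max(axis=0).mean()
print(S.shape)                           # (5, 7)
```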



Abstract

The invention discloses a fine-grained image-text retrieval method based on a Transformer model, comprising the following steps: acquiring a region vector group for the target regions of an image and a word vector group for a text; optimizing each target region of the image with a trained self-attention Transformer model, so that each target region obtains effective information from the surrounding target regions; determining the meaning of the current target word from the semantic context of the text with a trained self-attention Transformer model; performing cross-modal information processing with a trained mutual-attention Transformer model, so that the region vector group incorporates key information and the word vector group incorporates detail information; and calculating the similarity between each finally obtained region vector and each word vector, obtaining a fine-grained semantic similarity matrix of the input image and the input text, and deriving the retrieval result. Retrieval performance and accuracy are thereby improved.

Description

Technical field [0001] The invention belongs to the technical field of image-text retrieval, and in particular relates to a fine-grained image-text retrieval method and system based on a Transformer model. Background technique [0002] With the development of Internet technology, various applications and web pages generate large numbers of pictures and texts every day, and these pictures and texts may be related to each other. [0003] In practical applications, images corresponding to a text can be retrieved with cross-modal retrieval algorithms. In the related art, a cross-modal retrieval algorithm mainly extracts the image features of all pictures in a picture library and the text features of the text, determines the similarity between each picture and the text from these features, and then retrieves from the picture library the image with the highest similarity to the text. […]
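The related-art pipeline described above (one global feature per item, ranked by similarity) can be sketched as follows; the feature extractors are omitted and replaced with toy vectors, and cosine similarity is an assumed choice:

```python
import numpy as np

def retrieve(text_vec, image_vecs):
    """Related-art baseline: rank gallery images by cosine similarity to the text."""
    t = text_vec / np.linalg.norm(text_vec)
    I = image_vecs / np.linalg.norm(image_vecs, axis=1, keepdims=True)
    sims = I @ t                          # one score per gallery image
    return int(np.argmax(sims)), sims     # index of the best match, all scores

rng = np.random.default_rng(3)
gallery = rng.standard_normal((10, 16))               # 10 images, toy global features
query = gallery[4] + 0.05 * rng.standard_normal(16)   # text feature near image 4
best, sims = retrieve(query, gallery)
print(best)                               # 4: the lightly noised copy is recovered
```

This global-matching scheme is what the fine-grained, region-and-word-level method of the invention improves upon.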

Claims


Application Information

IPC(8): G06F16/532, G06F16/583, G06F16/33, G06N3/04, G06V10/25, G06V10/82
CPC: G06F16/532, G06F16/583, G06F16/3334, G06F16/3344, G06N3/047, G06N3/048, G06N3/044, G06N3/045
Inventor: 张百灵, 潘正新, 武芳宇
Owner: 浙大宁波理工学院