Secondary retrieval-based method and apparatus for cross-modal image and text retrieval, device, and medium

By using a secondary retrieval method, which acquires and fuses image features, the problem of insufficient image-text interaction in feature-based retrieval is solved, achieving higher retrieval accuracy and efficiency.

WO2026124054A1 · PCT designated stage · Publication Date: 2026-06-18 · SHENZHEN INTELLIFUSION TECHNOLOGIES CO LTD +2

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: SHENZHEN INTELLIFUSION TECHNOLOGIES CO LTD
Filing Date: 2025-11-05
Publication Date: 2026-06-18

Application Information

Patent Timeline

05 Nov 2025: Application
18 Jun 2026: Publication

WO2026124054A1

IPC: G06F16/583

AI Tagging

Application Domain

Digital data information retrieval
Special data processing applications

AI Technical Summary

Technical Problem

Existing feature-based cross-modal image-text retrieval methods lack interaction between image and text, which results in low retrieval accuracy. The problem to be solved is how to introduce image-text interaction into feature-based retrieval so that retrieval accuracy improves.

Method used

The secondary-retrieval method first obtains a first retrieval feature and uses it to query the database, which returns N first image features. These N image features are then fused with the first retrieval feature to produce a second retrieval feature. Finally, the second retrieval feature is used to query the same database again; this second query is where the image-text interaction fusion takes effect.
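The query, fuse, and re-query steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not specify the similarity measure or the fusion operator, so cosine similarity and a weighted average (`alpha`) are stand-ins, and every name here (`query`, `fuse`, `secondary_retrieval`, the toy database) is hypothetical. The first retrieval feature is assumed to be already embedded in the same space as the image features.

```python
import math

def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def query(database, feature, n):
    """Return the n (image_id, image_feature) pairs most similar to feature."""
    ranked = sorted(database.items(),
                    key=lambda item: cosine(feature, item[1]),
                    reverse=True)
    return ranked[:n]

def fuse(first_feature, image_features, alpha=0.5):
    """Hypothetical fusion: blend the query with the mean retrieved feature.

    The source only says the features are fused; the weighted average
    used here is an assumption chosen for simplicity.
    """
    dim = len(first_feature)
    count = len(image_features)
    mean = [sum(f[i] for f in image_features) / count for i in range(dim)]
    q, m = normalize(first_feature), normalize(mean)
    return normalize([alpha * q[i] + (1 - alpha) * m[i] for i in range(dim)])

def secondary_retrieval(database, first_feature, n=2, alpha=0.5):
    """Run both retrievals; return the target id and the first-stage ids."""
    # First retrieval: N first image features.
    first_hits = query(database, first_feature, n)
    # Fusion: the second retrieval feature.
    second_feature = fuse(first_feature, [f for _, f in first_hits], alpha)
    # Second retrieval against the same database.
    target_id, _ = query(database, second_feature, 1)[0]
    return target_id, [image_id for image_id, _ in first_hits]

# Toy database mapping image ids to image features (made-up values).
database = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.8, 0.6, 0.0],
    "img_c": [0.0, 0.0, 1.0],
}
target, first_ids = secondary_retrieval(database, [0.9, 0.1, 0.0], n=2)
print(target, first_ids)
```

Both queries run against the same feature index, so the second stage costs one extra nearest-neighbour lookup plus the fusion step.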

Benefits of technology

By employing a two-stage retrieval method, the impact of differences between image and text modalities is reduced, retrieval accuracy is improved, and high efficiency is maintained.

✦ Generated by Eureka AI based on patent content.

Smart Images

Figure CN2025132684_18062026_PF_FP_ABST

Patent Text Reader

Abstract

The present application relates to the technical field of image retrieval, and in particular to a secondary retrieval-based method and apparatus for cross-modal image and text retrieval, a device, and a medium. The method comprises: obtaining a first retrieval feature and using it to perform feature querying on a database to be queried, to obtain N first image features, the database to be queried storing a mapping relationship between the image features of at least one image and the corresponding image; fusing the N first image features with the first retrieval feature, to obtain a fusion result as a second retrieval feature; and performing feature querying on the database to be queried by using the second retrieval feature, to obtain a retrieval target image. Performing two retrievals improves retrieval accuracy. In the second retrieval, interactive image-text fusion is performed between the images returned by the first retrieval and the retrieval requirement. Although image and text are different modalities, the fusion reduces the impact of the modality difference, thereby improving the final retrieval accuracy.
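The three steps of the abstract can also be written compactly. The notation is an assumption made for illustration, since the source names no symbols: q_1 is the first retrieval feature, D is the database mapping each image feature v_j to its image I_j, s is an unspecified similarity function, and Fuse is the unspecified fusion operator. The last line shows a single best match, although the source does not say how many target images are returned.

```latex
\begin{aligned}
\mathcal{D} &= \{(v_j, I_j)\}_{j=1}^{M} \\
\{v_{(1)}, \dots, v_{(N)}\} &= \operatorname{TopN}_{v_j \in \mathcal{D}} \; s(q_1, v_j) \\
q_2 &= \operatorname{Fuse}\bigl(q_1, v_{(1)}, \dots, v_{(N)}\bigr) \\
I^{*} &= I_{j^{*}}, \qquad j^{*} = \arg\max_{j} \; s(q_2, v_j)
\end{aligned}
```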
