Secondary retrieval-based method and apparatus for cross-modal image and text retrieval, device, and medium
A secondary (two-stage) retrieval method that acquires image features and fuses them with the retrieval features addresses the insufficient image-text interaction of feature-based retrieval, improving retrieval accuracy while maintaining efficiency.
Patent Information
- Authority / Receiving Office: WO · WO
- Patent Type: Applications
- Current Assignee / Owner: SHENZHEN INTELLIFUSION TECHNOLOGIES CO LTD
- Filing Date: 2025-11-05
- Publication Date: 2026-06-18
AI Technical Summary
Existing feature-based cross-modal image-text retrieval methods lack interaction between image and text, which results in low retrieval accuracy. The problem to be solved is how to introduce image-text interaction into feature-based retrieval so that accuracy improves.
The secondary-retrieval method first obtains the first retrieval features and uses them to query the database, yielding N first image features. These image features are then fused with the first retrieval features to produce the second retrieval features. Finally, the second retrieval features are used for a second feature query, so that the fusion step supplies the image-text interaction.
The two-stage retrieval method reduces the impact of the differences between the image and text modalities, improving retrieval accuracy while maintaining high efficiency.
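The two-stage flow summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the summary does not say how similarity is measured or how features are fused, so cosine similarity, the weighted-average fusion, the `alpha` weight, and all function names (`cosine_top_k`, `fuse`, `secondary_retrieval`) are assumptions made purely for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_top_k(query, database, k):
    """Return the indices of the k database features most similar to the query.

    Cosine similarity is an assumption; the source only says "feature query".
    """
    q = normalize(query)
    scored = []
    for idx, feat in enumerate(database):
        f = normalize(feat)
        scored.append((sum(a * b for a, b in zip(q, f)), idx))
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scored[:k]]


def fuse(query, neighbors, alpha=0.5):
    """Fuse the first retrieval feature with the N first image features.

    Hypothetical fusion rule: weighted average of the normalized query and
    the normalized mean of the retrieved image features.
    """
    if not neighbors:
        return normalize(query)
    dim = len(query)
    mean = [sum(n[d] for n in neighbors) / len(neighbors) for d in range(dim)]
    q, m = normalize(query), normalize(mean)
    return normalize([alpha * q[d] + (1 - alpha) * m[d] for d in range(dim)])


def secondary_retrieval(query, database, n=2, k=1, alpha=0.5):
    """First query -> fuse with the top-n image features -> second query."""
    first_ids = cosine_top_k(query, database, n)
    second_query = fuse(query, [database[i] for i in first_ids], alpha)
    return cosine_top_k(second_query, database, k)


# Toy 2-D image features and a text-derived retrieval feature.
image_db = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
text_feature = [1.0, 0.2]
result = secondary_retrieval(text_feature, image_db, n=2, k=1)
```

In this sketch both stages remain plain vector lookups, which is one way to read the claim that efficiency is maintained: the image-text interaction happens only in the fusion of a small number N of retrieved features, not in a pairwise pass over the whole database.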