Semi-paired multi-modal data hash coding method
Hash coding and multi-modal technology, applied in the field of cross-modal retrieval, that addresses problems such as the limited retrieval accuracy of hash coding and limited nonlinear fitting ability.
Examples
Embodiment 1
[0078] Multimodal hash coding represents paired real-valued vectors from different modalities with a shared set of binary vectors, which enables cross-modal retrieval. For example, images captured from social networks and their text tags are paired. With multimodal hash coding, images can be retrieved using text tags, and text tags can be retrieved using images. Semi-paired means that only part of the pairing information of the multimodal data is known, while fully paired means that all items in the multimodal data are in one-to-one correspondence. For example, pictures and their accompanying text in WeChat Moments usually correspond one to one, so such data is fully paired multimodal data. As another example, for pictures and text obtained directly from web pages, the picture and the text describing its content are sometimes not adjacent because of typesetting, so it cannot be determined in advance from the obtained data which words descr...
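The retrieval idea in paragraph [0078] can be illustrated with a minimal sketch. It assumes that both modalities have already been mapped to binary codes of the same length and that candidates are ranked by Hamming distance, a common choice for hash-based retrieval that this excerpt does not itself specify. The function names `hamming` and `retrieve`, and the toy 8-bit codes, are invented for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def retrieve(query_code, database, top_k=1):
    """Return the top_k database keys ranked by Hamming distance to the query."""
    ranked = sorted(database, key=lambda name: hamming(query_code, database[name]))
    return ranked[:top_k]


# Toy 8-bit codes, invented for illustration only (not taken from the patent).
image_codes = {
    "img_cat": (1, 0, 1, 1, 0, 0, 1, 0),
    "img_car": (0, 1, 0, 0, 1, 1, 0, 1),
}
text_codes = {
    "tag_cat": (1, 0, 1, 1, 0, 0, 1, 1),
    "tag_car": (0, 1, 0, 0, 1, 0, 0, 1),
}

# Text-to-image retrieval: a text tag's code is the query.
print(retrieve(text_codes["tag_cat"], image_codes))  # ['img_cat']
# Image-to-text retrieval: an image's code is the query.
print(retrieve(image_codes["img_car"], text_codes))  # ['tag_car']
```

Because both modalities share one code space, the same ranking function serves both retrieval directions.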
Embodiment 2
[0173] The purpose of this embodiment is to provide a computer system.
[0174] A computer system comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, performs the following steps:
[0175] Obtaining the image information matrix and the text information matrix of semi-paired multimodal data;
[0176] Constructing a first neural network that maps images to the text space and a second neural network that maps text to the image space, and selecting encoding layers in the first neural network and the second neural network, respectively;
[0177] Establishing an objective function using the encoding layers;
[0178] Training the neural networks according to the objective function to obtain a hash coding matrix of the semi-paired multimodal data.
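The steps above can be sketched in rough form as follows. This is only an illustration under stated assumptions: the excerpt does not give the network architecture or the objective function, so the single tanh encoding layer, the squared-error loss over the known pairs, the weight `lam`, and all names (`make_layer`, `forward`, `objective`, `hash_code`) are hypothetical. The training loop itself is omitted; the sketch shows only the forward pass, the evaluation of an assumed objective, and the binarization of the encoding layer.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weight matrix with n_out rows and n_in columns (biases omitted)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, vec):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def forward(net, vec):
    """Return (encoding-layer activations, output mapped into the other modality)."""
    enc = [math.tanh(z) for z in linear(net["enc"], vec)]
    return enc, linear(net["dec"], enc)


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(img_net, txt_net, pairs, lam=1.0):
    """ASSUMED loss on known pairs: cross-modal mapping error plus encoding agreement."""
    total = 0.0
    for img, txt in pairs:
        h_img, txt_hat = forward(img_net, img)  # image -> text space
        h_txt, img_hat = forward(txt_net, txt)  # text -> image space
        total += sq_dist(txt_hat, txt) + sq_dist(img_hat, img)
        total += lam * sq_dist(h_img, h_txt)
    return total / len(pairs)


def hash_code(net, vec):
    """Binarize the encoding layer by sign to obtain a +1/-1 hash code."""
    enc, _ = forward(net, vec)
    return [1 if a >= 0 else -1 for a in enc]


rng = random.Random(0)
IMG_DIM, TXT_DIM, BITS = 6, 4, 8  # toy sizes, chosen arbitrarily
img_net = {"enc": make_layer(IMG_DIM, BITS, rng), "dec": make_layer(BITS, TXT_DIM, rng)}
txt_net = {"enc": make_layer(TXT_DIM, BITS, rng), "dec": make_layer(BITS, IMG_DIM, rng)}

images = [[rng.random() for _ in range(IMG_DIM)] for _ in range(5)]
texts = [[rng.random() for _ in range(TXT_DIM)] for _ in range(5)]
paired = list(zip(images[:3], texts[:3]))  # semi-paired: only 3 of 5 pairings known

loss = objective(img_net, txt_net, paired)
codes = [hash_code(img_net, x) for x in images]  # rows of a hash coding matrix
print(loss, codes[0])
```

In a full implementation the weights would be updated to minimize the objective before the codes are read out, and the unpaired samples would also have to enter the objective in whatever way the method prescribes; neither is shown here.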
Embodiment 3
[0180] The purpose of this embodiment is to provide a computer-readable storage medium.
[0181] A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, performs the following steps:
[0182] Obtaining the image information matrix and the text information matrix of semi-paired multimodal data;
[0183] Constructing a first neural network that maps images to the text space and a second neural network that maps text to the image space, and selecting encoding layers in the first neural network and the second neural network, respectively;
[0184] Establishing an objective function using the encoding layers;
[0185] Training the neural networks according to the objective function to obtain a hash coding matrix of the semi-paired multimodal data.



