
Cross-modal information retrieval method based on semantic fusion

A cross-modal information retrieval technology, applied in the field of information retrieval, addressing the problem that data of different modalities lie in different feature spaces and are therefore difficult to represent and measure directly.

Pending Publication Date: 2021-10-22
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

However, data of different modalities have different feature spaces, and the "semantic gap" problem makes it difficult to directly represent and measure data of different modalities.



Examples


Embodiment 1

[0090] Referring to Figure 1, as an embodiment of the present invention, a cross-modal information retrieval method based on semantic fusion is provided, including:

[0091] S1: Collect raw data and preprocess it. It should be noted:

[0092] The raw data include the original images, audio and tactile signals. The resolution of each original image is adjusted to 224×224×3; the audio is converted into a discrete digital signal; and the tactile signal or discrete digital signal is preprocessed as a new signal.
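The image branch of this preprocessing step amounts to resizing every image to a fixed 224×224×3 shape. As a minimal sketch (assuming numpy; `resize_image` is a hypothetical stand-in for a library resizer such as Pillow's `Image.resize`, using nearest-neighbour sampling for self-containment):

```python
import numpy as np

def resize_image(img: np.ndarray, size=(224, 224)) -> np.ndarray:
    """Nearest-neighbour resize of an H×W×3 image to the given size.

    A stand-in for a proper library resizer; it maps each output pixel
    back to the nearest source pixel by integer index arithmetic.
    """
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # source row per output row
    cols = np.arange(size[1]) * w // size[1]   # source column per output column
    return img[rows[:, None], cols]            # broadcasts to (size[0], size[1], 3)

img = np.zeros((480, 640, 3), dtype=np.uint8)
print(resize_image(img).shape)  # (224, 224, 3)
```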

[0093] Preprocessing includes,

[0094] (1) Pre-emphasis:

[0095] Let the new signal be x(n), 0 ≤ n ≤ N−1. Applying the pre-emphasis filter to x(n) yields the pre-emphasized signal y(n):

[0096] y(n) = x(n) − α·x(n−1), 1 ≤ n ≤ N−1

[0097] where α denotes the pre-emphasis filter coefficient, N is the signal length, and the sampling frequency of x(n) is f_s;
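The pre-emphasis step above is a first-order high-pass filter; a minimal numpy sketch (the function name and the default α = 0.97 are illustrative assumptions, not values fixed by the patent):

```python
import numpy as np

def pre_emphasis(x: np.ndarray, alpha: float = 0.97) -> np.ndarray:
    """Apply the first-order pre-emphasis filter y(n) = x(n) - alpha * x(n-1)."""
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]                      # first sample has no predecessor
    y[1:] = x[1:] - alpha * x[:-1]   # attenuates low frequencies, boosts high ones
    return y

signal = np.array([1.0, 1.0, 1.0, 1.0])
print(pre_emphasis(signal))  # [1.   0.03 0.03 0.03]
```

A constant signal is almost entirely suppressed after the first sample, which illustrates why pre-emphasis flattens the spectral tilt of speech-like signals before framing.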

[0098] (2) Framing:

[0099] Denote the frame size FRAME_SIZE as N_sz; the frame step size FRAME_STRIDE is reco...
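Although the framing paragraph is truncated here, the standard operation it describes, splitting the pre-emphasized signal into overlapping frames of size N_sz advanced by the stride, can be sketched as follows (a numpy sketch under that assumption; zero-padding the tail is a common convention, not stated in the source):

```python
import numpy as np

def frame_signal(y: np.ndarray, frame_size: int, frame_stride: int) -> np.ndarray:
    """Split a 1-D signal into overlapping frames, zero-padding the tail.

    Returns an array of shape (num_frames, frame_size).
    """
    n = len(y)
    num_frames = 1 + max(0, int(np.ceil((n - frame_size) / frame_stride)))
    pad_len = (num_frames - 1) * frame_stride + frame_size - n
    padded = np.append(y, np.zeros(pad_len))
    # Index matrix: row i selects samples [i*stride, i*stride + frame_size).
    idx = (np.arange(frame_size)[None, :] +
           frame_stride * np.arange(num_frames)[:, None])
    return padded[idx]

frames = frame_signal(np.arange(10.0), frame_size=4, frame_stride=2)
print(frames.shape)  # (4, 4)
```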

Embodiment 2

[0154] Referring to Figures 2-11, the second embodiment of the present invention differs from the first embodiment in that it provides a verification test of the cross-modal information retrieval method based on semantic fusion. To verify the technical effect of this method, this embodiment carries out a comparative test between traditional technical schemes and the method of the present invention, and compares the test results by means of scientific demonstration so as to verify the real effect of the method.

[0155] Traditional technical solutions: the six traditional methods CCA, KCCA, ICA, PCA, AE and VAE have low retrieval accuracy when dealing with cross-modal retrieval problems involving three modalities. To verify that the present method achieves higher retrieval accuracy, this embodiment compares the MAP values of the traditional CCA, KCCA, ICA, PCA, AE and VAE methods with that of the present method. The larger the MAP value,...



Abstract

The invention discloses a cross-modal information retrieval method based on semantic fusion, comprising the steps of: collecting raw data and preprocessing it; performing feature extraction and model training on the preprocessed data to obtain features for the different modalities; inputting the different modal features into the same network for semantic fusion to obtain a semantic fusion network model; and performing retrieval based on the semantic fusion network model and the query-set samples to complete cross-modal information retrieval. The method goes beyond traditional cross-modal retrieval between the two modalities of images and text, and realizes cross-modal information retrieval across three modalities: images, audio and tactile signals. Moreover, the preprocessing method for the tactile signal renders the original one-dimensional sequence signal as a two-dimensional visualization, so that it can be semantically associated with the original images for the purpose of retrieval.

Description

Technical Field

[0001] The invention relates to the technical field of information retrieval, and in particular to a semantic fusion-based cross-modal information retrieval method.

Background Technique

[0002] In recent years, with the rapid development of the Internet industry, technologies such as big data, cloud computing and artificial intelligence have continued to rise, generating massive amounts of data of different types, such as audio, video, text and images. People are no longer satisfied with single-modality retrieval, such as retrieving images with images or text with text. Therefore, cross-modal retrieval has become a research hotspot. Different from traditional unimodal retrieval, the query samples and retrieval samples used in cross-modal retrieval belong to different modalities. However, data of different modalities have different feature spaces, and the "semantic gap" problem makes it difficult to directly represent and measure data of ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F16/903, G06F16/904, G06K9/62
CPC: G06F16/90335, G06F16/904, G06F18/251, G06F18/253, Y02D10/00
Inventors: 周亮, 徐建博, 冶占远, 魏昕
Owner: NANJING UNIV OF POSTS & TELECOMM