Cross-media information analysis and retrieval method

A cross-media information analysis technique, applied in the field of multimedia data retrieval, that addresses problems such as the semantic gap, low retrieval efficiency, and the incomparability of features across modalities.

Status: Inactive. Publication date: 2012-09-26

CHANGZHOU HIGH TECH RES INST OF NANJING UNIV

Cites: 3. Cited by: 13.

AI Technical Summary

Problems solved by technology

Research on multimedia documents faces several major problems:

(1) Semantic gap. Multimedia data in different modalities is usually unstructured or semi-structured, and the low-level features of different modalities differ in dimensionality and attributes, so they differ in structure and are not directly comparable. There is also a semantic gap between low-level features and high-level semantics, which greatly increases the difficulty of cross-modal retrieval.

(2) Weak correlation between modalities.

(3) Low retrieval efficiency.

(4) Subjective annotation. Text annotations vary from user to user and are strongly affected by subjective factors, so they are inconsistent and inaccurate, which reduces retrieval efficiency.

For example, a 128-dimensional visual feature vector and a 21-dimensional auditory feature vector may both express the same semantic concept, such as a picture of an explosion and the sound of an explosion, but it is difficult for a computer to measure how closely the two are related at the semantic level.


Examples

Embodiment 1

[0087] Assume there are 10,000 images and 10,000 audio clips in 20 categories, with 500 images and 500 audio clips per category. Twenty multimedia documents can then be constructed, each containing 1,000 multimedia objects (500 images and 500 audio clips). First, extract the SIFT features of all images and represent each image as a set of 128-dimensional visual feature vectors; then extract the MFCC features of all audio clips and represent each clip as a set of 21-dimensional auditory feature vectors. Construct the multimedia documents by classifying the image-audio database, which yields a training set D = {D_1, ..., D_c, ..., D_N}, where each multimedia document D_c is the collection of images and audio clips of category c. With the multimedia documents known, the MC-PLSA model is used to learn the multimedia documents and their features. The MC-PLSA model learns its parameters mainly with the expectation-maximization (EM) algorithm and an asymmetric learning method. The user inputs the mu...
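The document construction described in this embodiment can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_documents`, the `(object_id, modality, category)` tuple layout, and the synthetic identifiers are all assumptions made for the example, and feature extraction (SIFT, MFCC) is left out.

```python
from collections import defaultdict


def build_documents(objects):
    """Group (object_id, modality, category) triples into one document per category.

    Hypothetical helper: each document holds the image IDs and audio IDs
    that share a category, mirroring the multimedia document D_c.
    """
    documents = defaultdict(lambda: {"image": [], "audio": []})
    for object_id, modality, category in objects:
        documents[category][modality].append(object_id)
    return dict(documents)


# Synthetic stand-in for the database: 20 categories, each with
# 500 images and 500 audio clips (10,000 objects per modality).
objects = [
    (f"{modality}-{category}-{i}", modality, category)
    for category in range(20)
    for modality in ("image", "audio")
    for i in range(500)
]

documents = build_documents(objects)
print(len(documents))                                  # number of multimedia documents
print(sum(len(d["image"]) for d in documents.values()))  # total images
```

Grouping by category keeps the arithmetic of the embodiment visible: 20 documents, each with 500 images plus 500 audio clips.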

Embodiment 2

[0089] The user submits an audio clip of a tiger's call and asks for the top 20 image results. The retrieval process is as follows. When the user submits the audio clip as the retrieval example, the system first uses the learned model, based on the clip's audio features, to find the topic probability distribution of the clip in the multimedia semantic space. Then the cosine of the angle between the topic probability distribution of each multimedia object in the database and that of the query example is taken as the similarity, forming four correlation matrices C_IA, C_AI, C_II, and C_AA. Next, the values of the four correlation matrices are updated according to the propagation model; that is, the similarity values between all multimedia objects in the database and the query example are updated. The average similarity between the query example and the multimedia documents of each category is also calculated, and th...
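The initial similarity step in this embodiment, cosine similarity between topic probability distributions followed by ranking, can be sketched as below. The topic vectors, object records, and function names are hypothetical values chosen for illustration, and the propagation update is omitted because its form is not given in the text above.

```python
import math


def cosine(p, q):
    """Cosine of the angle between two topic probability distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rank(query_topics, database, top_k=20, modality=None):
    """Return the top_k (similarity, object) pairs, optionally for one modality."""
    scored = [
        (cosine(query_topics, obj["topics"]), obj)
        for obj in database
        if modality is None or obj["modality"] == modality
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def mean_similarity_by_category(query_topics, database):
    """Average similarity between the query and the objects of each category."""
    sums, counts = {}, {}
    for obj in database:
        cat = obj["category"]
        sums[cat] = sums.get(cat, 0.0) + cosine(query_topics, obj["topics"])
        counts[cat] = counts.get(cat, 0) + 1
    return {cat: sums[cat] / counts[cat] for cat in sums}


# Hypothetical topic distributions; in the method they would come from the learned model.
database = [
    {"id": "img-tiger", "modality": "image", "category": "tiger", "topics": [0.7, 0.2, 0.1]},
    {"id": "img-bird", "modality": "image", "category": "bird", "topics": [0.1, 0.8, 0.1]},
    {"id": "aud-tiger", "modality": "audio", "category": "tiger", "topics": [0.9, 0.05, 0.05]},
]
query = [0.8, 0.1, 0.1]  # assumed topic distribution of the tiger-call audio clip

top_images = rank(query, database, top_k=1, modality="image")
category_means = mean_similarity_by_category(query, database)
best_category = max(category_means, key=category_means.get)
```

Because both images and audio clips are mapped to distributions over the same topics, a 128-dimensional visual descriptor and a 21-dimensional auditory descriptor become comparable even though their raw features are not.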

Abstract

The invention provides a cross-media information analysis and retrieval method comprising the following steps: performing semantic integration on multimodal information; extending the probabilistic latent semantic analysis model to obtain a multilayer continuous probabilistic latent semantic analysis model that handles continuous feature vectors; learning the multilayer continuous probabilistic latent semantic analysis model with an asymmetric learning method, and calculating the visual feature vector distribution of the images, the auditory feature vector distribution of the audio, and the topic probability distribution; having the user submit a training set and a test media object that serves as the retrieval example, and calculating the initial intra-modal and inter-modal similarity values of the images and audio with respect to the retrieval example; constructing a propagation model and updating the intra-modal and inter-modal similarity values according to it; and performing a secondary retrieval according to the updated similarity values.
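For orientation, the base model that the method extends can be sketched with a plain discrete probabilistic latent semantic analysis (PLSA) fit by EM. This is explicitly not the patent's multilayer continuous model: it uses word counts rather than continuous feature vectors, omits the asymmetric learning step, and every name and the toy count matrix are assumptions made for the example.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit plain discrete PLSA with EM.

    counts[d][w] is how often word w occurs in document d
    (each document is assumed to have at least one nonzero count).
    Returns (p_z_d, p_w_z): P(z|d) per document and P(w|z) per topic.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        return [x / total for x in row]

    # Random positive initialization; every row sums to 1.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(post)
                for z in range(n_topics):
                    resp = n * post[z] / total
                    new_z_d[d][z] += resp
                    new_w_z[z][w] += resp
        # M-step: renormalize expected counts (tiny smoothing avoids exact zeros).
        p_z_d = [normalize([x + 1e-12 for x in row]) for row in new_z_d]
        p_w_z = [normalize([x + 1e-12 for x in row]) for row in new_w_z]
    return p_z_d, p_w_z


# Toy count matrix: documents 0-1 use words 0-1, documents 2-3 use words 2-3.
counts = [[5, 4, 0, 0], [4, 5, 0, 0], [0, 0, 5, 4], [0, 0, 4, 5]]
p_z_d, p_w_z = plsa(counts, n_topics=2)
```

The rows of `p_z_d` play the role of the topic probability distributions used later for similarity; the method described above replaces the discrete word distributions with distributions over continuous feature vectors.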

Description

Technical field

[0001] The invention relates to the field of multimedia information data retrieval, and in particular to a cross-media information analysis and retrieval method.

Background

[0002] Multimedia is a composite of text, images, audio, video, and so on, and these different types of multimedia data express rich and varied semantics. Research on multimedia documents faces several major problems: (1) Multimedia data in different modalities is usually unstructured or semi-structured, and the low-level features of different modalities differ in dimensionality and attributes, so they differ in structure and are not directly comparable. There is also a semantic gap between low-level features and high-level semantics, which greatly increases the difficulty of cross-modal retrieval. (2) The correlation between modalities is weak. Different types of multimedia data jointly express a variety ...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F17/30

Inventor: 路通林婉霞

Owner: CHANGZHOU HIGH TECH RES INST OF NANJING UNIV
