Cross-media information analysis and retrieval method

A cross-media information analysis technique, applied in the field of multimedia information retrieval, that addresses problems such as the semantic gap, low retrieval efficiency, and the incomparability of features from different modalities.

Inactive Publication Date: 2012-09-26
CHANGZHOU HIGH TECH RES INST OF NANJING UNIV

AI Technical Summary

Problems solved by technology

Research on multimedia documents faces several major problems:
(1) Multimedia data in different modalities is usually unstructured or semi-structured, and the low-level features of different modalities differ in dimensionality and attributes, so they are structurally different and not directly comparable. There is also a semantic gap between low-level features and high-level semantics, which makes retrieval across modalities considerably harder.
(2) The correlation between modalities is weak.
(3) Retrieval efficiency is low.
(4) Text annotations vary from user to user and are strongly affected by subjective factors, so they are inconsistent and inaccurate, which degrades retrieval.
For example, a 128-dimensional visual feature vector and a 21-dimensional auditory feature vector may express similar semantic concepts, such as a picture of an explosion and the sound of an explosion, yet it is difficult for a computer to measure how related the two are at the semantic level.

Examples

Embodiment 1

[0087] Assume there are 10,000 images and 10,000 audio clips in 20 categories, with 500 images and 500 audio clips per category. Then 20 multimedia documents can be constructed, each containing 1,000 multimedia objects (500 images and 500 audio clips). First, extract the SIFT features of all images and represent each image as a set of 128-dimensional visual feature vectors; then extract the MFCC features of all audio clips and represent each clip as a set of 21-dimensional auditory feature vectors. Next, construct the multimedia documents by classifying the image-audio database, which yields a training set D = {D_1, ..., D_c, ..., D_N}, where each multimedia document D_c is the collection of images and audio clips of category c. With the multimedia documents known, the MC-PLSA model is used to learn the multimedia documents and their features. The MC-PLSA model learns its parameters mainly with the expectation-maximization (EM) algorithm and an asymmetric learning method. The user inputs the mu...
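
The grouping step in this embodiment can be sketched as follows. This is only an illustration under stated assumptions: the record layout (object id, modality, category) and the function name build_multimedia_documents are hypothetical, the object ids are synthetic placeholders, and feature extraction (SIFT, MFCC) and MC-PLSA training are not shown.

```python
from collections import defaultdict


def build_multimedia_documents(objects):
    """Group (object_id, modality, category) records into one document per category.

    Hypothetical helper: the patent text only says that each multimedia
    document D_c collects the images and audio clips of category c.
    """
    docs = defaultdict(lambda: {"image": [], "audio": []})
    for object_id, modality, category in objects:
        docs[category][modality].append(object_id)
    return dict(docs)


# Synthetic stand-in for the database described in Embodiment 1:
# 20 categories, each with 500 images and 500 audio clips.
NUM_CATEGORIES = 20
PER_CATEGORY = 500

objects = []
for c in range(NUM_CATEGORIES):
    for i in range(PER_CATEGORY):
        objects.append((f"img_{c}_{i}", "image", c))
        objects.append((f"aud_{c}_{i}", "audio", c))

training_set = build_multimedia_documents(objects)

# Number of multimedia documents, and objects in one document.
print(len(training_set))
print(len(training_set[0]["image"]) + len(training_set[0]["audio"]))
```

With these counts the sketch yields 20 documents of 1,000 objects each, matching the figures stated in the embodiment.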

Embodiment 2

[0089] The user submits an audio clip of a tiger's roar and asks for the top 20 images. The retrieval process is as follows. When the user submits the audio clip as a retrieval example, the system first uses the learned model and the clip's audio features to find the topic probability distribution of the clip in the multimedia semantic space. Then, taking the cosine of the angle between topic probability distributions as the similarity between the query example and all multimedia objects in the database, it forms four correlation matrices C_IA, C_AI, C_II, and C_AA. Next, the values of the four correlation matrices are updated according to the propagation model, that is, the similarity values between all multimedia objects in the database and the query example are updated. The average similarity between the query example and the multimedia documents of each category is then calculated, and th...
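
The initial similarity step can be sketched as below, assuming each object has already been mapped to a topic probability distribution. All function names and the toy topic vectors are hypothetical, and reading the four matrices as image-image, image-audio, audio-image, and audio-audio similarities is an interpretation of the text. The propagation update is omitted because this excerpt does not specify it.

```python
import math


def cosine(p, q):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def similarity_matrix(rows, cols):
    """Pairwise cosine similarities between two lists of topic distributions."""
    return [[cosine(r, c) for c in cols] for r in rows]


def initial_matrices(image_topics, audio_topics):
    """Assumed reading of C_II, C_IA, C_AI, C_AA as intra- and inter-modal similarities."""
    return {
        "C_II": similarity_matrix(image_topics, image_topics),
        "C_IA": similarity_matrix(image_topics, audio_topics),
        "C_AI": similarity_matrix(audio_topics, image_topics),
        "C_AA": similarity_matrix(audio_topics, audio_topics),
    }


def rank_images(query_topics, image_topics, top_k=20):
    """Return indices of the top_k images most similar to the query, before propagation."""
    scores = [(cosine(query_topics, t), idx) for idx, t in enumerate(image_topics)]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scores[:top_k]]


# Toy topic distributions over three topics (made-up values for illustration).
image_topics = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.7, 0.2, 0.1]]
audio_topics = [[0.9, 0.05, 0.05], [0.1, 0.1, 0.8]]

matrices = initial_matrices(image_topics, audio_topics)
query = audio_topics[0]  # an audio clip used as the retrieval example
print(rank_images(query, image_topics, top_k=2))
```

Because both modalities are compared in the shared topic space, the 128-dimensional and 21-dimensional raw features never have to be compared directly.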

Abstract

The invention provides a cross-media information analysis and retrieval method comprising the following steps: performing semantic integration on multimodal information; extending the probabilistic latent semantic analysis (PLSA) model to obtain a multilayer continuous PLSA model that handles continuous feature vectors; learning the multilayer continuous PLSA model with an asymmetric learning method, and computing the visual feature vector distribution of the images, the auditory feature vector distribution of the audio, and the topic probability distribution; having the user submit a training set and a test media object that serves as the retrieval example, and computing the initial intra-modal and inter-modal similarity values of images and audio for the retrieval example; constructing a propagation model and updating the intra-modal and inter-modal similarity values according to it; and performing a second retrieval pass using the updated similarity values.
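
For background, the standard discrete PLSA model that the abstract names as its starting point can be written as below, together with its usual EM updates. This is the textbook formulation over documents d, words w, topics z, and counts n(d, w); it is not the multilayer continuous model of the invention, whose equations are not given in this excerpt.

```latex
P(d, w) = P(d) \sum_{z} P(z \mid d)\, P(w \mid z)

\text{E-step:}\quad
P(z \mid d, w) = \frac{P(z \mid d)\, P(w \mid z)}{\sum_{z'} P(z' \mid d)\, P(w \mid z')}

\text{M-step:}\quad
P(w \mid z) = \frac{\sum_{d} n(d, w)\, P(z \mid d, w)}{\sum_{d} \sum_{w'} n(d, w')\, P(z \mid d, w')},
\qquad
P(z \mid d) = \frac{\sum_{w} n(d, w)\, P(z \mid d, w)}{\sum_{w} n(d, w)}
```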

Description

Technical field

[0001] The invention relates to the field of multimedia information data retrieval, in particular to a cross-media information analysis and retrieval method.

Background

[0002] Multimedia is a composite of text, images, audio, video, and so on, and these different types of multimedia data express rich semantics. Research on multimedia documents faces several major problems: (1) Multimedia data in different modalities is usually unstructured or semi-structured, and the low-level features of different modalities differ in dimensionality and attributes, so they are structurally different and not directly comparable. There is also a semantic gap between low-level features and high-level semantics, which makes retrieval across modalities considerably harder. (2) The correlation between modalities is weak. Different types of multimedia data jointly express a variety ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 路通, 林婉霞
Owner: CHANGZHOU HIGH TECH RES INST OF NANJING UNIV