Cross-modal subject correlation modeling method based on deep learning

Deep learning and correlation technology, applied in the field of cross-media correlation learning, addresses the problem that the heterogeneity of images and texts is not well considered, and achieves high accuracy, improved efficiency and strong adaptability.

Active Publication Date: 2016-07-13
FUDAN UNIV
Cites: 4; Cited by: 89

AI Technical Summary

Problems solved by technology

However, such a direct matching method does not adequately consider the heterogeneity of images and texts, so learning their corre

Examples

Embodiment Construction

[0083] The cross-modal relevance calculation method for social images of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0084] (1) Collecting data objects

[0085] Collect data objects, obtain images and image annotation data, and filter out annotations that appear infrequently or are useless across the entire data set. The obtained data set generally contains a lot of noise, so it should be properly processed and filtered before it is used for feature extraction. The obtained images are all in a uniform JPG format, so no conversion is required. The text annotations of the images, however, contain many meaningless words, such as words combined with numbers that carry no meaning. Some images have as many as dozens of annotations. For the annotations to describe the main information of an image well, the useless and meaningless ones should be discarded...
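The filtering described in [0085] can be illustrated with a minimal sketch. The patent text shown here does not give concrete rules, so everything below is an assumption made for illustration: the function names `is_meaningful` and `filter_annotations`, the threshold `MIN_COUNT`, and the rule that a tag is "meaningless" when it contains anything other than letters.

```python
import re
from collections import Counter

# Hypothetical threshold: tags that occur in fewer images than this are dropped.
MIN_COUNT = 2

def is_meaningful(tag):
    """Assumed rule: keep purely alphabetic tags, drop words mixed with digits."""
    return re.fullmatch(r"[a-z]+", tag) is not None

def filter_annotations(image_tags, min_count=MIN_COUNT):
    """image_tags maps an image id to its raw tag list; returns a cleaned mapping."""
    # Lowercase every tag and drop the ones that look meaningless.
    normalized = {}
    for image_id, tags in image_tags.items():
        lowered = [t.strip().lower() for t in tags]
        normalized[image_id] = [t for t in lowered if is_meaningful(t)]
    # Count in how many images each remaining tag appears.
    doc_freq = Counter(tag for tags in normalized.values() for tag in set(tags))
    # Keep sufficiently frequent tags, preserving order and removing duplicates.
    cleaned = {}
    for image_id, tags in normalized.items():
        kept = []
        for tag in tags:
            if doc_freq[tag] >= min_count and tag not in kept:
                kept.append(tag)
        cleaned[image_id] = kept
    return cleaned

raw = {
    "img1.jpg": ["Beach", "sunset", "dsc0042", "sea"],
    "img2.jpg": ["beach", "sea", "x100", "holiday"],
    "img3.jpg": ["sunset", "beach", "nikon2"],
}
cleaned = filter_annotations(raw)
print(cleaned)
```

In this toy data set, tags such as "dsc0042" are rejected by the letter-only rule and "holiday" is rejected for appearing in a single image. A real pipeline would tune both rules to the data set at hand.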

Abstract

The invention belongs to the technical field of cross-media correlation learning, and particularly relates to a cross-modal subject correlation modeling method based on deep learning. The method includes two main algorithms: multi-modal document representation based on deep vocabularies, and correlation subject model construction incorporating cross-modal subject correlation learning. Deep learning is used to construct deep semantic vocabularies and deep visual vocabularies, which describe the semantic description part and the image part of a multi-modal document, respectively. Based on this multi-modal document representation, a cross-modal correlation subject model is constructed to model the whole multi-modal document set, thereby describing the generation process of multi-modal documents and the correlation between different modalities. The method is highly accurate and adaptable. By taking multi-modal semantic information into account over large-scale multi-modal documents (text plus image), it is significant for efficient cross-media information retrieval, can improve retrieval relevance and user experience, and has great application value in the field of cross-media information retrieval.
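As a rough illustration of the vocabulary-based representation mentioned in the abstract, the sketch below clusters feature vectors into a small vocabulary and represents each item as a histogram over that vocabulary. The abstract does not state how the vocabularies are built, so the use of k-means here is an assumption, as are the function names `nearest`, `kmeans` and `histogram` and the toy 2-D features, which stand in for features that a trained deep network would produce.

```python
import math

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centers (an assumption)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster received no points
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def histogram(features, centers):
    """Count how many features of one item fall on each vocabulary entry."""
    counts = [0] * len(centers)
    for f in features:
        counts[nearest(f, centers)] += 1
    return counts

# Toy 2-D stand-ins for deep features of two items.
doc_a = [(0.0, 0.1), (5.0, 5.1), (0.1, 0.0)]
doc_b = [(5.1, 5.0), (4.9, 5.0), (0.0, 0.0)]

vocabulary = kmeans(doc_a + doc_b, k=2)
hist_a = histogram(doc_a, vocabulary)
hist_b = histogram(doc_b, vocabulary)
print(hist_a, hist_b)
```

The resulting histograms are the kind of bag-of-words input a topic model could consume. The same construction could be applied separately to text features and to image features to obtain one vocabulary per modality.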

Description

Technical field

[0001] The invention belongs to the technical field of cross-media correlation learning, and in particular relates to a cross-modal image-text topic correlation learning method based on deep learning.

Background technique

[0002] With the development of Internet technology and the maturity of Web 2.0, a large number of multimodal documents have accumulated on the Internet. How to analyze and process the complex structure of these multimodal documents, so as to provide theoretical support for practical applications such as cross-media retrieval, has become a very important research hotspot. Generally speaking, a multimodal document exists as the co-occurrence of multiple modalities. For example, many web images are accompanied by user-defined image descriptions or annotations, and some web documents contain illustrations. However, although these multi-modal data are often related to each other, due to the problem of semantic gap, t...

Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/355, G06F16/94, G06F18/23213
Inventor: 张玥杰, 程勇, 刘志鑫, 金城, 张涛
Owner: FUDAN UNIV