A multimodal data representation learning method and system

A data representation learning technology, applied in the information field, that addresses problems such as missing data, large data volumes, and high computational cost.

Active Publication Date: 2019-09-06

GUANGDONG UNIV OF TECH

2 Cites, 0 Cited by

Problems solved by technology

[0005] In view of this, the present invention provides a multimodal data representation learning method and system. It addresses the inability of existing technical solutions to handle, at the same time, the heterogeneity, large data volume, missing data, and high computational cost that arise when processing multimodal data.

Examples

Embodiment 1

[0064] The multimodal data representation learning method disclosed in Embodiment 1 of the present invention is applied to a multimodal data representation learning system. As shown in the flow chart of Figure 1, the method includes:

[0065] S101. Receive target multimodal data, and acquire each modality corresponding to the target multimodal data and a feature representation of each modality;

[0066] When step S101 is executed, each modality corresponding to the target multimodal data sent by the social media data collection device is obtained, and the feature representation of each modality is then obtained from that modality.

[0067] S102. Obtain a data representation and a dictionary representation that fuse the multimodal features, according to the target multimodal data, the feature representation, and a preset graph random walk model;

[0068] S103. Acc...
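
The excerpt does not spell out the preset graph random walk model used in S102. As a rough illustration only, the sketch below shows one common way to fuse modalities with a random walk: build a similarity graph per modality, row-normalise each into a transition matrix, mix them with weights, and run a random walk with restart. The function names, the cosine similarity, the mixing weights, and the restart probability are all assumptions made for this example, not details taken from the patent.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def transition_matrix(features):
    """Row-normalised similarity graph over the samples of ONE modality.

    Assumption: non-negative cosine similarity, no self-loops; a sample with
    no similar neighbour falls back to a uniform row.
    """
    n = len(features)
    rows = []
    for i in range(n):
        row = [max(cosine(features[i], features[j]), 0.0) if i != j else 0.0
               for j in range(n)]
        total = sum(row)
        rows.append([w / total for w in row] if total > 0 else [1.0 / n] * n)
    return rows


def fuse(transitions, weights):
    """Weighted mix of per-modality transition matrices (weights should sum to 1)."""
    n = len(transitions[0])
    return [[sum(w * t[i][j] for w, t in zip(weights, transitions))
             for j in range(n)] for i in range(n)]


def random_walk_with_restart(P, start, restart=0.15, iters=100):
    """Relevance of every sample to `start` under the fused graph P."""
    n = len(P)
    e = [1.0 if i == start else 0.0 for i in range(n)]
    r = e[:]
    for _ in range(iters):
        r = [restart * e[j] + (1 - restart) * sum(r[i] * P[i][j] for i in range(n))
             for j in range(n)]
    return r


if __name__ == "__main__":
    # Toy data: three samples, each with a visual and a text feature vector.
    visual = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    text = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 0.0, 1.0]]
    P = fuse([transition_matrix(visual), transition_matrix(text)], [0.5, 0.5])
    print(random_walk_with_restart(P, start=0))
```

Each row of the returned vector can be read as a fused, cross-modal relevance profile of one sample. How the patent derives its data and dictionary representations from the walk is not stated in this excerpt.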

Embodiment 2

[0071] Based on the multimodal data representation learning method disclosed in Embodiment 1 of the present invention, step S101 shown in Figure 1 receives the target multimodal data and acquires each modality corresponding to the target multimodal data and a feature representation of each modality. The specific execution process of this step, shown in Figure 2, includes:

[0072] S201. Receive target multimodal data, acquire each modality corresponding to the target multimodal data, and extract original features of each modality;

[0073] When step S201 is executed, the target multimodal data is received, each modality corresponding to the target multimodal data is obtained, and the original features of each modality are extracted. The original features include visual features, text features, and the features of each layer of a deep learning neural network.

[0074] S202. Obtain the missing features of e...

Embodiment 3

[0100] Based on the multimodal data representation learning method disclosed in Embodiment 2 of the present invention, step S301 shown in Figure 3 selects dictionary atoms according to the target multimodal data and extracts the feature representations corresponding to the dictionary atoms from the feature representation, to obtain the modality dictionary of each modality. The specific execution process of this step, shown in Figure 5, includes:

[0101] S501, judging whether the target multimodal data has a label;

[0102] S502, if not, select any one of the feature representations as a single modality, cluster the target multimodal data corresponding to that single modality with a preset center-based clustering algorithm, and select the target multimodal data within a second preset range of each cluster center point as dictionary atoms;

[0103] Optionally, the preset central clustering algorithm includes: K-Means clustering al...
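
Step S502 can be sketched as follows for the unlabeled case, assuming K-Means as the center-based clustering algorithm and a Euclidean radius as the "second preset range". The function names, the `radius` parameter, and the deterministic first-k initialisation are hypothetical choices made for this illustration; the patent excerpt does not specify them.

```python
import math


def dist(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kmeans(points, k, iters=20):
    """Plain K-Means. Assumption: centers start at the first k points."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def select_dictionary_atoms(points, k, radius):
    """Indices of samples lying within `radius` of any cluster center.

    `points` holds the single-modality features of the unlabeled samples;
    `radius` stands in for the patent's "second preset range".
    """
    centers = kmeans(points, k)
    return [i for i, p in enumerate(points)
            if any(dist(p, c) <= radius for c in centers)]


if __name__ == "__main__":
    features = [[0.0, 0.0], [5.0, 5.0], [0.1, 0.0],
                [0.0, 0.1], [5.1, 5.0], [2.0, 2.0]]
    print(select_dictionary_atoms(features, k=2, radius=1.0))
```

Samples far from every center, such as the last point in the toy data, are left out of the dictionary, which keeps the atom set small and representative of dense regions.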

Abstract

The invention discloses a multimodal data representation learning method and system. The method comprises the steps of receiving target multimodal data, and acquiring each modality corresponding to the target multimodal data and a feature representation of each modality; acquiring a data representation and a dictionary representation fusing multimodal features according to the target multimodal data, the feature representation and a preset graph random walk model; and acquiring a low-dimensional discrimination representation optimal solution and a dictionary optimal representation based on a preset data reconstruction model, the data representation and the dictionary representation and storing the low-dimensional discrimination representation optimal solution and the dictionary optimal representation in a database. The multimodal data representation learning method disclosed by the invention simultaneously solves the problems of heterogeneity, large data size, data missing and large computational cost existing when the multimodal data are processed.

Description

Technical field

[0001] The invention relates to the field of information technology, in particular to a multimodal data representation learning method and system.

Background art

[0002] With the rapid popularization of the Internet, social media sites continue to grow, and people can more conveniently generate or share multimedia content on them, so social media platforms store a large number of events composed of multimodal data. In practical applications, the content of a single event on a social media platform may be published or shared by multiple users, and the information is scattered because users differ in geographical distribution, sharing time, modal form, or angle of description. Therefore, multimodal data is characterized by heterogeneity, large data volume, missing data, and high computational cost, which brings challenges to the processing of multimodal data...

Application Information

Patent Type & Authority: Patents (China)

IPC: IPC(8): G06F16/2458

Inventor: 刘文印, 杨振国, 李青

Owner: GUANGDONG UNIV OF TECH
