
Multi-modal data processing method and device

A data processing device and data processing technology, applied in the field of artificial intelligence, addresses the problems that data security is hard to guarantee, model training efficiency is low, and data providers' data may leak, and achieves the effects of ensuring data security and improving efficiency.

Active Publication Date: 2021-01-22
北京爱数智慧科技有限公司

AI Technical Summary

Problems solved by technology

[0003] However, in the course of implementing this application, the inventors found at least the following problems: in the prior art, a multi-modal data processing model can only be trained on data from a single data provider. Training on data from several data providers leads to data leakage between the providers, so data security is difficult to guarantee, and model training efficiency is low.



Examples


Embodiment 1

[0047] Refer to Figure 1, which shows a schematic flowchart of a multi-modal data processing method provided by an embodiment of the present application. The multi-modal data processing method includes:

[0048] S101: The terminal acquires multimodal data.

[0049] Specifically, the terminal is a data provider. There may be multiple different terminals, each with its own terminal id that identifies it.

[0050] Specifically, the types of multimodal data may include: at least two of voice modality data, image modality data and text modality data.

[0051] Furthermore, the multimodal data provided by different terminals may have different compositions. For example, the multimodal data provided by the first terminal includes voice modality data and image modality data, while the multimodal data provided by the second terminal includes text modality data and image modality data.
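
The composition rule in [0050] and [0051] can be illustrated with a minimal Python sketch. The `TerminalBatch` class, its field names, and the terminal ids are hypothetical and do not appear in the application; the only constraint taken from the text is that a terminal supplies at least two of the three named modalities.

```python
from dataclasses import dataclass, field

# The three modality types named in [0050].
MODALITIES = {"voice", "image", "text"}


@dataclass
class TerminalBatch:
    """Hypothetical container for the multimodal data one terminal provides."""
    terminal_id: str
    data: dict = field(default_factory=dict)  # modality name -> raw payload

    def __post_init__(self):
        unknown = set(self.data) - MODALITIES
        if unknown:
            raise ValueError(f"unknown modalities: {sorted(unknown)}")
        if len(self.data) < 2:
            raise ValueError("multimodal data needs at least two modalities")

    def composition(self):
        """Return the sorted list of modalities this terminal supplies."""
        return sorted(self.data)


# Two terminals with different compositions, as in the example of [0051].
first = TerminalBatch("terminal-1", {"voice": b"...", "image": b"..."})
second = TerminalBatch("terminal-2", {"text": "hello", "image": b"..."})
```

Keeping the terminal id next to the payload mirrors the text, where the id is what distinguishes one data provider from another.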

[0052] There is no need for the data...

Embodiment 2

[0073] Refer to Figure 3, which shows a schematic flowchart of another multimodal data processing method provided by an embodiment of the present application. The multimodal data processing method includes:

[0074] S301: The terminal acquires multimodal data;

[0075] S302: The terminal performs feature extraction on the multimodal data through a feature extraction algorithm to obtain data features of the multimodal data;

[0076] S303: The terminal converts the data features by using a first conversion algorithm to obtain the first data features, where the first conversion algorithm is used to map the multimodal data to a specific space;

[0077] S304: The terminal encrypts the data label of the multimodal data in a way that preserves its mathematical properties;

[0078] Specifically, encryption that does not compromise mathematical properties is homomorphic encryption. With homomorphic encryption, a calculation between ciphertexts is equivalent t...
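
The homomorphic property mentioned in [0078] can be illustrated with a toy example. The application does not name a specific scheme, so the sketch below assumes the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. The function names and the tiny primes are illustrative assumptions only; primes this small offer no security.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two distinct primes (assumed scheme)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid for g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt integer m (0 <= m < n) under public key n with generator n + 1."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_ciphertexts(n, c1, c2):
    """Multiply ciphertexts, which adds the underlying plaintexts modulo n."""
    return (c1 * c2) % (n * n)


n, private_key = keygen(1789, 1861)
# Two encrypted labels are combined without decrypting either of them.
c_sum = add_ciphertexts(n, encrypt(n, 3), encrypt(n, 4))
recovered = decrypt(n, private_key, c_sum)
```

Here `recovered` equals the sum of the two plaintext labels, even though the party that called `add_ciphertexts` never saw either label in the clear.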

Embodiment 3

[0085] Refer to Figure 4, which shows a schematic structural diagram of a multi-modal data processing device provided by an embodiment of the present application. The multi-modal data processing device 40 includes:

[0086] An acquisition module 401, configured for the terminal to acquire multimodal data;

[0087] An extraction module 402, configured for the terminal to perform feature extraction on the multimodal data through a feature extraction algorithm, so as to obtain data features of the multimodal data;

[0088] A first conversion module 403, configured for the terminal to convert the data features through a first conversion algorithm to obtain the first data features, wherein the first conversion algorithm is used to map the multimodal data to a specific space;

[0089] A transmission module 404, configured for the terminal to transmit the data features, data labels and terminal id of the multimodal data to the server;

[0090] The second conversion module 405 is u...



Abstract

The invention discloses a multi-modal data processing method and device. In the method, a terminal acquires multi-modal data; the terminal performs feature extraction on the multi-modal data through a feature extraction algorithm to obtain data features of the multi-modal data; the terminal converts the data features through a first conversion algorithm to obtain first data features, the first conversion algorithm being used to map the multi-modal data to a specific space; the terminal transmits the data features of the multi-modal data, the data label and the terminal id to the server; the server converts the first data features through a second conversion algorithm corresponding to the terminal id to obtain second data features, the second conversion algorithm being used to map data in different specific spaces to the same space; and the server performs multi-modal representation learning with the second data features as input and the data label as output, so as to train a multi-modal representation learning algorithm.
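
The data flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: both conversion algorithms are modelled as random linear maps, feature extraction is a pass-through placeholder, and the names `Terminal`, `Server`, `register`, `prepare`, `to_shared_space` and `SHARED_DIM` are hypothetical. The training step itself is omitted.

```python
import random

rng = random.Random(0)
SHARED_DIM = 4  # assumed size of the common space on the server


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols):
    """Stand-in for a conversion algorithm: a random linear map."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


class Terminal:
    def __init__(self, terminal_id, feature_dim, private_dim):
        self.terminal_id = terminal_id
        # First conversion: maps features into this terminal's own space.
        self.first_conversion = random_matrix(private_dim, feature_dim)

    def extract_features(self, raw):
        # Placeholder: the raw data is assumed to be numeric already.
        return [float(x) for x in raw]

    def prepare(self, raw, label):
        first = matvec(self.first_conversion, self.extract_features(raw))
        return {"terminal_id": self.terminal_id, "features": first, "label": label}


class Server:
    def __init__(self):
        self.second_conversion = {}  # terminal id -> map into the shared space

    def register(self, terminal_id, private_dim):
        self.second_conversion[terminal_id] = random_matrix(SHARED_DIM, private_dim)

    def to_shared_space(self, message):
        matrix = self.second_conversion[message["terminal_id"]]
        return matvec(matrix, message["features"]), message["label"]


t1 = Terminal("terminal-1", feature_dim=3, private_dim=5)
t2 = Terminal("terminal-2", feature_dim=6, private_dim=2)
server = Server()
server.register("terminal-1", private_dim=5)
server.register("terminal-2", private_dim=2)

shared1, label1 = server.to_shared_space(t1.prepare([1, 2, 3], label=0))
shared2, label2 = server.to_shared_space(t2.prepare([4, 5, 6, 7, 8, 9], label=1))
```

Although the two terminals use different feature and private-space dimensions, `shared1` and `shared2` have the same length, so a single model on the server could take both as input. The server only receives converted features, never the raw data.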

Description

Technical field

[0001] This application belongs to the technical field of artificial intelligence, and specifically relates to a multimodal data processing method and device.

Background art

[0002] Multimodal learning has been one of the hotspots of artificial intelligence since 2010. A modality is a fixed type of information source. For example, voice information is one modality, image information is another modality, and text information is a third modality. Modality can also be defined very broadly. For example, two different languages can be regarded as two modalities, and even data sets collected in two different situations can be considered two modalities. In this context, multimodal learning is understood in contrast to unimodal learning. The well-known speech recognition, image recognition, fingerprint recognition, etc. are all applications of unimodal learning, where the input information is of a single type; if the i...


Application Information

IPC(8): G06F16/25, G06F21/60
CPC: G06F21/602, G06F16/258
Inventor: 张晴晴, 张雪璐, 贾艳明, 曹艳丽
Owner: 北京爱数智慧科技有限公司