Cross-modal processing method and device, electronic equipment and computer storage medium

A cross-modal processing method and device, applied in computer parts, computing, image data processing, and related fields. It addresses the problems that existing approaches do not establish a semantic association between the text and visual modalities, fail to capture enough semantic information, and therefore train poorly; the stated effect is an improved training result.

Pending Publication Date: 2020-07-28
BEIJING BAIDU NETCOM SCI & TECH CO LTD
Cites: 9; Cited by: 16

AI Technical Summary

Problems solved by technology

[0003] However, current multi-modal processing methods cannot capture enough semantic information during model training and do not establish a semantic association between the text and visual modalities, so the trained model performs poorly.

Examples

Embodiment Construction

[0036] The following describes exemplary embodiments of the present application with reference to the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be regarded as merely exemplary. Therefore, those of ordinary skill in the art should realize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted in the following description.

[0037] The cross-modal processing method, device, electronic device, and computer storage medium of the embodiments of the present application are described below with reference to the accompanying drawings.

[0038] Figure 1 is a schematic flow chart of a cross-modal processing method provided by an embodiment of this application. Here, modal is a term...

Abstract

The invention discloses a cross-modal processing method and device, electronic equipment, and a computer storage medium, and relates to the technical field of natural language processing. In the specific implementation scheme, the method comprises: acquiring a sample set that includes a plurality of corpora and a plurality of images; generating multiple training samples, each of which is a combination of at least one corpus and at least one corresponding image; training a semantic model with the training samples so that it learns to produce a semantic vector for a combination of corpus and image; and using the trained semantic model to carry out cross-modal processing between corpora and images. Because corpora are combined with their corresponding images for training, the semantic model learns the semantic association between a corpus and its corresponding image, which improves the training effect. This addresses the problem in the prior art that, in multi-modal processing, each modality is trained independently, the semantic association between modalities is severed, and the resulting model performs poorly.
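The sample-generation step described in the abstract can be illustrated with a small sketch. This is not the applicant's implementation: the `TrainingSample` class, the `alignment` mapping from images to their corresponding corpora, and the `[CLS]`, `[SEP]`, and `[IMG:...]` markers are all hypothetical names chosen for illustration. The sketch only shows how corpora and corresponding images might be combined into a single model input; training the semantic model itself is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class TrainingSample:
    """Hypothetical container: at least one corpus combined with at least one image."""
    corpora: Tuple[str, ...]
    image_ids: Tuple[str, ...]


def generate_training_samples(
    corpora: Dict[str, str],          # corpus id -> text
    images: Sequence[str],            # image ids in the sample set
    alignment: Dict[str, List[str]],  # image id -> ids of its corresponding corpora (assumed given)
) -> List[TrainingSample]:
    """Combine each image with the corpora that correspond to it."""
    samples = []
    for image_id in images:
        texts = tuple(corpora[c] for c in alignment.get(image_id, []))
        if texts:  # an image with no corresponding corpus yields no sample
            samples.append(TrainingSample(texts, (image_id,)))
    return samples


def to_model_input(sample: TrainingSample) -> List[str]:
    """Flatten one sample into a single sequence covering both modalities.

    The marker tokens are illustrative assumptions, not taken from the patent.
    """
    tokens = ["[CLS]"]
    for text in sample.corpora:
        tokens += text.lower().split() + ["[SEP]"]
    tokens += [f"[IMG:{image_id}]" for image_id in sample.image_ids]
    return tokens


# Toy sample set: two described images and one image without a corpus.
corpora = {"c1": "A dog runs on grass", "c2": "Two cats sleep"}
images = ["img_dog", "img_cat", "img_unlabeled"]
alignment = {"img_dog": ["c1"], "img_cat": ["c2"]}

samples = generate_training_samples(corpora, images, alignment)
model_inputs = [to_model_input(s) for s in samples]
```

Feeding both modalities through one input sequence is what would let a single model see a corpus and its image together, in contrast to the independent per-modality training that the abstract criticizes.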

Description

Technical field

[0001] This application relates to the field of computer technology, in particular to the field of natural language processing technology, and specifically to a cross-modal processing method, device, electronic equipment, and computer storage medium.

Background

[0002] The world we live in is a multi-modal world, and content of different modalities such as text and vision fills our lives. With the rapid development of artificial intelligence technology, the demands on multi-modal processing, such as visual-language multi-modal processing, are getting higher and higher.

[0003] However, the current multi-modal processing method cannot capture enough semantic information during model training, and at the same time it does not establish a semantic association between the text and visual modalities, which makes the training effect of the model poor.

Summary of the invention

[0004] A method, device, electronic device and computer sto...

Application Information

IPC(8): G06K9/62, G06K9/34, G06K9/20, G06K9/32, G06K9/46, G06F40/30, G06N3/04, G06N3/08, G06V10/764
CPC: G06N3/08, G06V10/22, G06V10/25, G06V10/267, G06V10/462, G06N3/045, G06F18/22, G06F16/583, G06F40/30, G06F40/284, G06V20/20, G06V2201/10, G06V10/82, G06V10/764, G06V10/7747, G06F18/2413, G06T7/73, G06T2207/20081, G06V30/274, G06F18/214
Inventors: 牛国成, 何伯磊, 肖欣延
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD