Multi-modal scene recognition method based on deep learning

A scene recognition and deep learning technology, applied in the fields of pattern recognition and artificial intelligence, which solves problems such as overly complex implementations and achieves the effect of improving accuracy and providing a more convenient scene recognition method.

Active Publication Date: 2019-07-23
NANJING UNIV OF POSTS & TELECOMM


Problems solved by technology

The results obtained by feature-fusion methods are more objective, but the actual implementation is too complicated.


Embodiment Construction

[0035] The present invention provides a new multi-modal scene recognition method based on deep learning, addressing the inaccurate results and high complexity of existing scene recognition methods. It extracts the feature information of the image and text modalities and fuses the multi-modal feature information to improve the accuracy of scene recognition.

[0036] Further, the deep learning-based multi-modal scene recognition method of the present invention includes the following steps.

[0037] S1. Use the jieba word segmentation tool to perform word segmentation on short texts.
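A minimal sketch of step S1, assuming the jieba library (结巴, whose literal translation "stammer" appears in this machine-translated text); the example sentence is illustrative, not taken from the patent:

```python
import jieba  # Chinese word segmentation tool named in step S1

short_text = "今天在公园里拍了一张风景照"  # hypothetical input short text
tokens = jieba.lcut(short_text)          # segment into a list of words
print(tokens)  # e.g. ['今天', '在', '公园', '里', '拍', '了', ...]
```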

[0038] S2. Input a group of pictures, the short-text word segmentation results, and the corresponding labels into their respective convolutional neural networks for training.
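A minimal sketch of the pairing implied by step S2: each training sample couples one picture, the segmented short text, and the scene label shared by both modalities. The class and field names are assumptions for illustration, not the patent's data format:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MultiModalSample:
    image_path: str    # one picture of the group
    tokens: List[str]  # jieba segmentation of the accompanying short text
    label: int         # scene class index shared by both modalities

# hypothetical sample: a park scene described by its photo and short text
sample = MultiModalSample("scenes/park_0001.jpg", ["公园", "风景"], label=3)
```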

[0039] S3. Train a short text classification model. This specifically includes the following steps:

[0040] S31. During the training process of the short text classification model, quantify the word segmentation results of the input short text and inp...
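A minimal sketch of the short-text branch of steps S3/S31 in PyTorch, assuming a TextCNN-style design in which segmented words are quantified as embedding indices before entering the convolutional network; the vocabulary size, layer dimensions, and class count are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Short text classifier: embedding -> 1-D convolution -> pooling -> FC."""
    def __init__(self, vocab_size=5000, embed_dim=128, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, 64, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(64, num_classes)  # fully connected output layer

    def forward(self, token_ids):                    # (batch, seq_len) int ids
        x = self.embed(token_ids).transpose(1, 2)    # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x))                 # convolve over word positions
        x = self.pool(x).squeeze(-1)                 # (batch, 64)
        return self.fc(x)                            # logits per scene class

logits = TextCNN()(torch.randint(0, 5000, (2, 16)))  # toy batch of two texts
```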



Abstract

The invention discloses a multi-modal scene recognition method based on deep learning. The multi-modal scene recognition method comprises the following steps: S1, carrying out word segmentation processing on a short text; S2, inputting a group of pictures, short-text segmented words, and corresponding labels into respective convolutional neural networks for training; S3, training a short text classification model; S4, training a picture classification model; S5, respectively calculating the cross entropies of the fully connected layer outputs in S3 and S4 against a standard classification result, calculating an average Euclidean distance which serves as a loss value, then feeding the loss value back to the respective convolutional neural networks, and finally obtaining a complete multi-modal scene recognition model; S6, adding the text and image prediction result vectors to obtain a final classification result; and S7, respectively inputting the short text and the image to be recognized into the trained multi-modal scene recognition model and performing scene recognition. The invention provides a multi-modal scene searching mode, and more accurate and convenient scene recognition is provided for users.
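Steps S5 and S6 could be sketched as below in PyTorch. Since this excerpt does not fully specify how the two cross entropies and the average Euclidean distance are combined into the loss value, the averaged cross-entropy here is a stand-in, and all names are hypothetical:

```python
import torch
import torch.nn.functional as F

def fused_step(text_logits, image_logits, labels):
    # S5: cross entropy of each branch's fully connected output against the
    # standard classification result, averaged into a single loss value that
    # is fed back to the respective convolutional networks
    loss = (F.cross_entropy(text_logits, labels)
            + F.cross_entropy(image_logits, labels)) / 2
    # S6: add the text and image prediction result vectors, then take the
    # highest-scoring class as the final classification result
    fused = text_logits.softmax(-1) + image_logits.softmax(-1)
    return loss, fused.argmax(-1)
```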

Description

Technical field

[0001] The invention relates to a multi-modal scene recognition method, in particular to a deep-learning-based multi-modal scene recognition method, and belongs to the fields of artificial intelligence and pattern recognition.

Background technique

[0002] Deep learning is a brand-new field of machine learning whose purpose is to bring machine learning closer to human intelligence. The convolutional neural network is a representative algorithm of deep learning. It has the characteristics of a simple structure, strong adaptability, few training parameters, and many connections; therefore, this network has been widely used in the fields of image processing and pattern recognition for many years.

[0003] Specifically, the convolutional neural network is a hierarchical model whose input is the raw data. Through a series of operations such as convolution operations, pooling operations, and nonlinear activation functions, the high-level semantic information is layer...
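As a minimal sketch of the layered pipeline described in [0003], assuming PyTorch: convolution, nonlinear activation, and pooling distill the raw input into high-level semantic information that a fully connected layer maps to classes. Every layer size is an illustrative assumption:

```python
import torch
import torch.nn as nn

image_cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolution operation
    nn.ReLU(),                                   # nonlinear activation function
    nn.MaxPool2d(2),                             # pooling operation
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),                 # high-level semantics -> classes
)

logits = image_cnn(torch.randn(1, 3, 32, 32))    # toy 32x32 RGB input
```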


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06K9/62; G06N3/04
CPC: G06N3/045; G06F18/214; G06F18/24
Inventor: 吴家皋, 刘源, 孙璨, 郑剑刚
Owner: NANJING UNIV OF POSTS & TELECOMM