A multi-modal action recognition method based on a deep neural network

A deep-neural-network action-recognition technology, applied in the field of multi-modal action recognition, which addresses problems such as the loss of temporal information and achieves the effects of improving recognition accuracy and reducing computation time.

Inactive Publication Date: 2019-03-12
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

Although this method improves on single-stream approaches by explicitly capturing local temporal motion, mid- and long-term temporal information is still lost from the learned features, since the video-level prediction is obtained by averaging the prediction scores of sampled clips.
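A toy sketch (an assumed, minimal setup, not the patent's code) makes the loss concrete: because the clip scores are averaged, any reordering of the clips yields exactly the same video-level prediction, so the order in which the motion unfolds cannot influence the result.

```python
import numpy as np

# Per-clip class scores for three sampled clips of one video (toy values).
clip_scores = np.array([[0.7, 0.3],   # clip 1
                        [0.2, 0.8],   # clip 2
                        [0.6, 0.4]])  # clip 3

video_pred = clip_scores.mean(axis=0)            # video-level prediction
shuffled = clip_scores[[2, 0, 1]].mean(axis=0)   # same clips, different order
assert np.allclose(video_pred, shuffled)         # temporal order is invisible
```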


Examples


Embodiment

[0039] As shown in Figure 1, this embodiment discloses a multi-modal action recognition method based on a deep neural network.

[0040] The deep neural network used in this embodiment has three branches at its lower layers: a convolutional neural network for extracting temporal features, a convolutional neural network for extracting spatial features, and a fully connected network for processing skeleton path-integral features. At the higher layers, the three branches are merged into one through feature fusion, and the class label of the video action is predicted with a softmax activation function. In the image branch, a pooling structure based on the attention mechanism is introduced; without changing the existing network structure, it helps the network focus on features that are conducive to recognizing the action, thereby reducing the interference of irrelevant features and improving the accuracy of the existing network structure. …
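A minimal PyTorch sketch of this three-branch layout follows. The backbone depths, feature widths, and the exact form of the attention pooling are illustrative assumptions (the patent does not fix them in this passage), and `AttentionPool2d` and `ThreeBranchNet` are hypothetical names introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPool2d(nn.Module):
    """Attention-based pooling: a 1x1 conv scores each spatial location,
    softmax over locations gives weights for a weighted sum of features."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):                       # x: (B, C, H, W)
        w = self.score(x)                       # (B, 1, H, W)
        w = F.softmax(w.flatten(2), dim=-1)     # weights over H*W locations
        return (x.flatten(2) * w).sum(dim=-1)   # (B, C)

class ThreeBranchNet(nn.Module):
    def __init__(self, sig_dim, num_classes, feat=128):
        super().__init__()
        # spatial branch: RGB frames (3 channels)
        self.rgb = nn.Sequential(nn.Conv2d(3, feat, 3, 2, 1), nn.ReLU())
        # temporal branch: optical-flow fields (2 channels per frame pair)
        self.flow = nn.Sequential(nn.Conv2d(2, feat, 3, 2, 1), nn.ReLU())
        self.rgb_pool = AttentionPool2d(feat)   # attention pooling, image branch
        # skeleton branch: fully connected net on path-integral features
        self.skel = nn.Sequential(nn.Linear(sig_dim, feat), nn.ReLU())
        self.classifier = nn.Linear(3 * feat, num_classes)

    def forward(self, rgb, flow, sig):
        f_rgb = self.rgb_pool(self.rgb(rgb))            # (B, feat)
        f_flow = self.flow(flow).mean(dim=(2, 3))       # plain average pooling
        f_skel = self.skel(sig)
        fused = torch.cat([f_rgb, f_flow, f_skel], 1)   # high-level fusion
        return F.log_softmax(self.classifier(fused), dim=-1)

# Usage: ThreeBranchNet(sig_dim=1190, num_classes=60)(rgb, flow, sig)
# returns per-class log-probabilities for each video in the batch.
```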



Abstract

The invention discloses a multi-modal action recognition method based on a deep neural network. The method comprehensively utilizes multi-modal information such as video images, optical flow maps, and the human skeleton. The specific steps are as follows: first, a series of preprocessing and compression operations are performed on the video; optical flow maps are then obtained from adjacent video frames; the human skeleton is extracted from the video frames using a pose estimation algorithm, and the path-integral features of the skeleton sequence are computed. The resulting optical flow maps, skeleton path-integral features, and original video images are fed into a deep neural network with a multi-branch structure, which learns an abstract spatio-temporal representation of human motion and predicts the correct action category. In addition, a pooling layer based on the attention mechanism is attached to the video image branch, which enhances the abstract features closely related to the final classification result and suppresses irrelevant interference. The invention comprehensively utilizes multi-modal information and has the advantages of strong robustness and a high recognition rate.
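The two less standard preprocessing steps above are computing optical flow from adjacent frames and turning the skeleton sequence into path-integral features. Below is a minimal sketch, assuming OpenCV's Farneback flow for the former and a level-2 truncated path signature for the latter; the joint count, coordinate layout, and signature depth are illustrative assumptions, and whether the patent's "path integral characteristics" are exactly the truncated signature is itself an assumption.

```python
import cv2
import numpy as np

def optical_flow(prev_gray, next_gray):
    """Dense optical flow between two adjacent grayscale frames (Farneback)."""
    return cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)  # (H, W, 2)

def signature_level2(path):
    """Truncated path signature (levels 1-2) of a skeleton trajectory.
    path: (T, d) array, one flattened joint-coordinate vector per frame."""
    inc = np.diff(path, axis=0)                     # per-step increments (T-1, d)
    s1 = inc.sum(axis=0)                            # level 1: total displacement
    run = np.vstack([np.zeros(path.shape[1]),       # cumulative increments
                     np.cumsum(inc, axis=0)[:-1]])  # accrued before each step
    # Level 2 for a piecewise-linear path (Chen's identity):
    # S[i, j] = sum_k ( run_k[i] * inc_k[j] + 0.5 * inc_k[i] * inc_k[j] )
    s2 = run.T @ inc + 0.5 * (inc.T @ inc)
    return np.concatenate([s1, s2.ravel()])         # length d + d*d

# e.g. 30 frames of 17 joints in 2-D, flattened to d = 34 coordinates
feats = signature_level2(np.random.randn(30, 34))
```

The signature is the standard path-integral feature for sequences: level 1 captures net displacement, while level 2 captures ordered pairwise interactions, so reversing a motion changes the feature even when the set of poses is identical.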

Description

Technical field

[0001] The invention relates to the technical field of image processing, and in particular to a multi-modal action recognition method based on a deep neural network.

Background

[0002] Action recognition has recently become a very popular research direction. By recognizing human body actions in videos, it can serve as a new interactive input to processing devices and can be widely applied in everyday settings such as games and movies. The task of action recognition is to identify different actions from video clips, where the action may run through the entire video. It is a natural extension of the image classification task: image recognition is performed on multiple frames of the video, and the final action prediction is then computed from the per-frame results.

[0003] Traditional video action recognition techniques often rely on hand-designed feature extractors to extract the spatio-temporal features of actions. With the advent of deep learning, …


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00, G06N3/04
CPC: G06V40/25, G06V40/20, G06V40/28, G06V20/42, G06V20/46, G06N3/045
Inventors: 许泽珊, 余卫宇
Owner: SOUTH CHINA UNIV OF TECH