Video human body behavior recognition method and system based on multi-mode double-flow 3D network

A recognition method using multi-modal technology, applied in character and pattern recognition, biological neural network models, instruments, and related fields. It addresses problems such as the difficulty of obtaining high-level cues and achieves the effect of eliminating background interference.

Status: Inactive; Publication Date: 2020-01-17
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

Therefore, CNN-based methods tend to predict behaviors from scenes and objects, which makes them more susceptible to cluttered backgrounds.
Second, although 3D CNNs have the significant advantage of learning spatio-temporal features simultaneously, it is u...

Method used



Examples


Embodiment 1

[0029] In one or more embodiments, a video human behavior recognition method based on a multi-modal dual-stream 3D network is disclosed, as shown in Figure 1, comprising the following steps:

[0030] (1) For each action sample, a synchronized pair of RGB and depth videos is collected.

[0031] (2) A depth dynamic image sequence (DDIS) is generated from the depth video (see the sketch following these steps);

[0032] (3) A pose evaluation map sequence (PEMS) is generated from the RGB video (also illustrated in the sketch following these steps);

[0033] (4) The depth dynamic image sequence and the pose evaluation map sequence are each fed into a 3D convolutional neural network to obtain their respective classification results;

[0034] (5) The two classification results are fused to produce the final behavior recognition result.
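Steps (2) and (3) each turn a raw video into an image-like sequence that a 3D CNN can consume. The text above does not spell out the exact construction, so the following Python sketch only illustrates one common way to build such sequences: a depth dynamic image computed by approximate rank pooling over a sliding window of depth frames, and pose evaluation maps rendered as Gaussian heatmaps from 2D keypoints assumed to come from an off-the-shelf pose estimator. The window size, map resolution, and Gaussian width are illustrative assumptions, not values taken from the embodiments.

```python
# Hypothetical helpers for steps (2) and (3): DDIS via approximate rank pooling,
# PEMS via keypoint heatmaps.  All shapes and hyper-parameters are assumptions.
import numpy as np

def dynamic_image(frames: np.ndarray) -> np.ndarray:
    """frames: (T, H, W) depth frames -> one (H, W) dynamic image."""
    T = frames.shape[0]
    harmonic = np.concatenate([[0.0], np.cumsum(1.0 / np.arange(1, T + 1))])  # H_0..H_T
    t = np.arange(1, T + 1)
    # Approximate rank-pooling coefficients (Bilen et al.-style weighting).
    alpha = 2.0 * (T - t + 1) - (T + 1) * (harmonic[T] - harmonic[t - 1])
    di = np.tensordot(alpha, frames.astype(np.float64), axes=(0, 0))
    di = (di - di.min()) / (np.ptp(di) + 1e-8) * 255.0   # rescale to image range
    return di.astype(np.uint8)

def ddis(depth_video: np.ndarray, window: int = 10, stride: int = 5) -> np.ndarray:
    """Slide a window over the depth video to build the dynamic-image sequence."""
    starts = range(0, len(depth_video) - window + 1, stride)
    return np.stack([dynamic_image(depth_video[s:s + window]) for s in starts])

def pose_map(keypoints: np.ndarray, h: int = 56, w: int = 56, sigma: float = 2.0) -> np.ndarray:
    """keypoints: (J, 2) normalized (x, y) joints -> one (h, w) pose evaluation map."""
    ys, xs = np.mgrid[0:h, 0:w]
    heat = np.zeros((h, w), dtype=np.float32)
    for x, y in keypoints:                      # one Gaussian peak per joint
        cx, cy = x * (w - 1), y * (h - 1)
        heat = np.maximum(heat, np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2)))
    return heat

def pems(keypoint_seq: np.ndarray) -> np.ndarray:
    """keypoint_seq: (T, J, 2) per-frame joints -> (T, h, w) pose-map sequence."""
    return np.stack([pose_map(kp) for kp in keypoint_seq])
```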

[0035] In this embodiment, the depth dynamic image sequence (DDIS) and the pose evaluation map sequence (PEMS) serve as the data inputs of the two 3D CNNs in the framework, forming the DDIS stream and the PEMS stream. Th...
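The sketch below shows how the two streams and the fusion of steps (4) and (5) can be wired together, assuming PyTorch: each modality gets its own 3D CNN, and the final label comes from averaging the two softmax score vectors. The tiny backbone, clip shapes, class count, and equal fusion weights are placeholders, not the network described in the embodiments.

```python
# A toy dual-stream setup with late score fusion; backbone and shapes are assumptions.
import torch
import torch.nn as nn

class Tiny3DCNN(nn.Module):
    """Stand-in 3D CNN; the patent's actual backbone is not reproduced here."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):                        # x: (N, C, T, H, W)
        return self.classifier(self.features(x).flatten(1))

num_classes = 10                                 # assumed number of behavior classes
ddis_net = Tiny3DCNN(in_channels=1, num_classes=num_classes)   # DDIS stream
pems_net = Tiny3DCNN(in_channels=1, num_classes=num_classes)   # PEMS stream

ddis_clip = torch.randn(2, 1, 16, 56, 56)        # dummy (batch, C, T, H, W) clips
pems_clip = torch.randn(2, 1, 16, 56, 56)

# Step (4): per-stream class scores; step (5): late fusion of softmax scores.
p_ddis = ddis_net(ddis_clip).softmax(dim=1)
p_pems = pems_net(pems_clip).softmax(dim=1)
prediction = (0.5 * p_ddis + 0.5 * p_pems).argmax(dim=1)
```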



Abstract

The invention discloses a video human body behavior recognition method and system based on a multi-mode double-flow 3D network. The method comprises the steps of: generating a depth dynamic image sequence (DDIS) from the depth video; generating a pose evaluation map sequence (PEMS) from the RGB video; inputting the depth dynamic image sequence and the pose evaluation map sequence into respective 3D convolutional neural networks, constructing a DDIS stream and a PEMS stream, and obtaining their respective classification results; and fusing the obtained classification results to produce the final behavior recognition result. The method has the beneficial effects that, by modeling the local spatio-temporal structure information of the video, the DDIS describes the human body motion and the contours of interactive objects in long-term behavior videos well, while the PEMS clearly captures changes in human body posture and eliminates interference from cluttered backgrounds. The multi-mode dual-stream 3D network architecture effectively models the global spatio-temporal dynamics of behavior videos across different data modalities and achieves excellent recognition performance.

Description

Technical field
[0001] The invention relates to the technical field of human behavior recognition, in particular to a video human behavior recognition method and system based on a multi-modal dual-stream 3D network.
Background technique
[0002] The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.
[0003] Video-based human action recognition has attracted increasing attention in the field of computer vision in recent years due to its wide range of applications, such as intelligent surveillance, video retrieval, and elderly care. Compared with image classification, video action recognition is a more challenging task because of the high-dimensional nature of videos, while the temporal structure between consecutive frames also provides important additional information. Therefore, spatio-temporal feature learning is of great significance for video-based action recognition. Spatial...


Application Information

IPC (IPC-8): G06K 9/00; G06N 3/04
CPC: G06V 40/20; G06V 20/41; G06V 20/46; G06N 3/045
Inventor: 马昕 (Ma Xin), 武寒波 (Wu Hanbo), 宋锐 (Song Rui), 荣学文 (Rong Xuewen), 田国会 (Tian Guohui), 李贻斌 (Li Yibin)
Owner: SHANDONG UNIV