Human gesture recognizing method based on depth convolution condition random field

A human action recognition technique based on conditional random fields, applied in character and pattern recognition, computer components, instruments, etc. It addresses problems such as the inability to model the spatio-temporal transformation of image sequence data, the inability to model nonlinear relationships, and the inability to predict ...

Active Publication Date: 2015-11-25

NANKAI UNIV

Cites: 4; Cited by: 29

Problems solved by technology

[0004] The conditional random field method in the prior art relies mainly on manually designed and extracted features during sequence learning, yet in practical applications it is impossible to predict which manually designed features will perform well

Conditional random field methods cannot model the spatiotemporal transformation of image sequence data well, especially when the original input nodes are high-dimensional nonlinear data

Nonlinear methods based on improved conditional random fields, such as conditional random fields with kernel functions, can only obtain shallow features and cannot model complex nonlinear relationships between data

In addition, the conditional random field method cannot automatically and adaptively learn the characteristics of the data for different scenarios.

Examples

Embodiment 1

[0029] 101: Alternately apply spatial convolution layers and sub-sampling layers to obtain a spatial feature representation of the input image sequence;

[0030] 102: Apply a temporal convolution to that spatial feature representation to obtain a further feature representation of the image sequence;

[0031] 103: Construct and optimize a deep conditional random field recognition model based on the spatio-temporal convolutional network;

[0032] 104: Run the forward computation of the optimized deep conditional random field recognition model on the video sequence to be predicted, obtaining the action category label of each frame in that sequence.

[0033] The deep conditional random field recognition model in step 103 includes:

[0034] A state function, which captures the relationship between the nonlinearly transformed image data in the sequence and the category labels; ...
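The per-frame labeling of step 104 can be illustrated with standard Viterbi decoding for a linear-chain conditional random field. This is only a minimal sketch, not the procedure stated in the patent: it assumes the network has already produced an array of state scores (one row per frame, one column per action label) and that a pairwise label-to-label transition score matrix is available. The transition matrix is an assumption, since the text above is cut off before the remaining model components are listed. The names `viterbi_decode`, `state_scores`, and `transition_scores` are hypothetical.

```python
import numpy as np

def viterbi_decode(state_scores, transition_scores):
    """Return the highest-scoring label sequence of a linear-chain CRF.

    state_scores:      (T, L) array, score of label l at frame t (assumed
                       to come from the nonlinear state function).
    transition_scores: (L, L) array, entry [i, j] scores moving from
                       label i to label j (assumed, not from the source).
    """
    state_scores = np.asarray(state_scores, dtype=float)
    transition_scores = np.asarray(transition_scores, dtype=float)
    num_frames, num_labels = state_scores.shape

    best = state_scores[0].copy()              # best path score ending in each label
    backptr = np.zeros((num_frames, num_labels), dtype=int)

    for t in range(1, num_frames):
        # cand[i, j]: best score ending in label i at t-1, then moving to j
        cand = best[:, None] + transition_scores
        backptr[t] = cand.argmax(axis=0)
        best = cand.max(axis=0) + state_scores[t]

    # Trace the best path backwards from the final frame
    path = [int(best.argmax())]
    for t in range(num_frames - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

# Toy example: 3 frames, 2 action labels
scores = [[2.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
smooth = [[1.0, -1.0], [-1.0, 1.0]]            # rewards staying in the same label
labels = viterbi_decode(scores, smooth)
```

With the transition matrix that rewards staying in the same label, the middle frame is pulled toward its neighbours' label; with an all-zero transition matrix each frame is labeled independently from its own state scores.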

Embodiment 2

[0043] The scheme of Embodiment 1 is described in detail below with reference to calculation formulas, examples, and Figure 2. The entire spatio-temporal convolutional network has two different operations, spatial convolution and temporal convolution, which are described in detail below:

[0044] 201: Alternately apply spatial convolution layers and sub-sampling layers to obtain a spatial feature representation of the input image sequence;

[0045] The spatial convolutional network consists mainly of alternating spatial convolution layers and sub-sampling layers. A spatial convolution layer detects features in the input image, and a sub-sampling layer performs a local averaging or local maximum operation to reduce the image resolution and improve feature robustness. The main operations of the spatial convolution layer are expressed as follows:

[0046] The spatial convolution operation is to perform a convolution operation on th...
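The operations named in this embodiment can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's formulas, which are truncated above: the spatial convolution is written as a "valid" cross-correlation (as is common in convolutional network code), the nonlinearity is assumed to be tanh, sub-sampling uses non-overlapping blocks, and the temporal convolution is reduced to a weighted sum over consecutive frames. All function names and parameters are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one kernel over a single-channel image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def subsample(feature_map, size=2, mode="max"):
    """Local maximum or local average over non-overlapping size x size blocks."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

def spatial_layer(image, kernel, bias=0.0, size=2, mode="max"):
    """One convolution layer (tanh is an assumption) followed by sub-sampling."""
    return subsample(np.tanh(conv2d_valid(image, kernel) + bias), size, mode)

def temporal_conv(frame_features, taps):
    """Weighted sum over len(taps) consecutive frames of a (T, D) sequence."""
    taps = np.asarray(taps, dtype=float)
    k = len(taps)
    t_out = frame_features.shape[0] - k + 1
    return np.stack([
        np.tensordot(taps, frame_features[t:t + k], axes=(0, 0))
        for t in range(t_out)
    ])

# Toy input: a 6x6 "image" and a 3x3 averaging-style kernel
img = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.ones((3, 3))
conv = conv2d_valid(img, kernel)          # shape (4, 4)
pooled_max = subsample(conv, 2, "max")    # shape (2, 2)
pooled_avg = subsample(conv, 2, "mean")   # shape (2, 2)

# Toy per-frame features: 5 frames, 2 features each
seq = np.arange(10, dtype=float).reshape(5, 2)
temporal = temporal_conv(seq, [1.0, 1.0, 1.0])  # shape (3, 2)
```

Note that this "valid" temporal filtering shortens the sequence by `len(taps) - 1` frames; a model that must label every frame would need padding or a similar adjustment, which the truncated text does not specify.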

Embodiment 3

[0096] The feasibility of this method is verified below by specific experiments. The present invention uses two types of datasets to verify the proposed algorithm. One is a segmented action dataset, in which each video contains only one action; the other is an unsegmented dataset, in which each video contains multiple actions. The two datasets and the experimental results are described below.

[0097] See Figure 3. The segmented Weizmann dataset is one of the commonly used standard datasets in action recognition tasks. This dataset contains 83 videos recorded by 9 individuals. There are 9 types of movements, namely running, walking, jumping jacks, jumping with two legs forward, jumping with both legs in place, bowing, waving with both hands, waving with one hand, and sliding. This method performs background clipping for each frame and centers the action. After preliminary processing, the size of the image is 103×129, and there are still a lot of blank areas on the ed...

Abstract

The invention discloses a human action recognition method based on a deep convolutional conditional random field. The method comprises the following steps: alternately applying spatial convolution layers and sub-sampling layers to obtain a spatial feature representation of the input image sequence; applying a temporal convolution to that spatial feature representation to obtain a further feature representation of the image sequence; building and optimizing a deep conditional random field recognition model based on the spatio-temporal convolutional network; and running the forward computation of the optimized deep conditional random field recognition model on a video sequence to be predicted, to obtain the action category label of each frame in that sequence. According to the embodiments of the invention, the method can model the spatio-temporal changes of image sequence data and therefore achieves a good human action recognition effect.

Description

Technical field

[0001] The invention relates to the field of human action recognition, in particular to a human action recognition method based on a deep convolutional conditional random field.

Background art

[0002] Human action recognition has become an important problem in computer vision and artificial intelligence. Because the frames of an action are temporally dependent, the task naturally forms a sequence labeling problem. By recognizing action sequences and the meanings that different action image sequences represent, human behavior can be analyzed in video surveillance, human-computer interaction, and other scenarios.

[0003] In the process of realizing the present invention, the inventor found that the prior art has at least the following disadvantages and deficiencies:

[0004] The conditional random field method in the prior art mainly adopts the artificially designed feature...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/00

CPC: G06V40/23

Inventors: 刘杰, 刘才华, 黄亚楼, 于芳

Owner: NANKAI UNIV
