
Video action classification and recognition method based on double-flow collaborative network

A dual-stream collaborative network technology for classification and recognition, applied in character and pattern recognition, instruments, computer parts, etc.; it addresses problems such as the loss of key information, the lack of information flow between separately processed streams, and the inability to perform end-to-end processing.

Active Publication Date: 2020-04-28
CHENGDU KOALA URAN TECH CO LTD

AI Technical Summary

Problems solved by technology

[0009] To solve the problems of the prior art, in which key information may be lost, video frames and optical flow fields are processed separately so that information does not flow between them, and end-to-end processing cannot be performed, the present invention proposes a video action classification and recognition method based on a dual-stream collaborative network. By constructing a connection unit that allows heterogeneous spatial features and time domain features to interact, complementation and flow of information between the two streams are realized, and end-to-end reasoning and learning become possible.



Examples


Embodiment 1

[0052] A video action classification and recognition method based on a dual-stream collaborative network, described with reference to figures 1, 2 and 3. First, a convolutional network extracts spatial sequence features from the video frames and time domain sequence features from the video optical flow field, where d is the dimension of each feature. A connection unit is then constructed to allow the heterogeneous spatial sequence features and time domain sequence features to exchange information. Next, a shared unit is constructed to perform sequential feature aggregation on the fused spatial sequence features and the fused time domain sequence features, yielding an aggregated spatial feature Z_f and an aggregated time domain feature ...
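The patent's exact connection-unit formulas were published as images and are not recoverable from this page, but the described behavior (heterogeneous spatial and temporal sequence features exchanging information) can be sketched as a cross-stream residual fusion. The projection matrices `W_fo` and `W_of` and the residual form are assumptions for illustration, not the patented formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 8, 16  # sequence length and feature dimension (illustrative values)

# Spatial sequence features (from video frames) and time domain sequence
# features (from the optical flow field), one d-dim vector per time step.
F = rng.standard_normal((T, d))  # spatial stream
O = rng.standard_normal((T, d))  # temporal stream

# Hypothetical connection unit: each stream receives a learned projection
# of the other stream as a residual, so heterogeneous features interact.
W_fo = rng.standard_normal((d, d)) * 0.1  # temporal -> spatial projection
W_of = rng.standard_normal((d, d)) * 0.1  # spatial -> temporal projection

F_fused = F + O @ W_fo  # spatial features enriched with temporal information
O_fused = O + F @ W_of  # temporal features enriched with spatial information

print(F_fused.shape, O_fused.shape)  # (8, 16) (8, 16)
```

Because the exchange is a residual addition, both fused streams keep their original shape and can be fed onward to the shared unit unchanged.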

Embodiment 2

[0062] On the basis of Embodiment 1 above, and with reference to figures 1 and 2, to better realize the present invention: a shared unit is constructed to perform sequential feature aggregation on the fused spatial sequence features and the fused time domain sequence features. The fused spatial sequence features are aggregated into the spatial feature Z_f, and the fused time domain sequence features are aggregated into the time domain feature Z_o. The time domain feature Z_o and the spatial feature Z_f are regularized simultaneously and then input to a shared weight layer, from which the time domain feature classification score and the spatial feature classification score are extracted. Finally, the time domain and spatial feature classification scores are fused into a predicted spatio-temporal feature classification score vector used for actual video action recognition; the predicted spatio-temporal feature classifi...
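The shared-unit pipeline above can be sketched end to end in a few lines. The regularization step is assumed here to be L2 normalization, and the score fusion is assumed to be an equal-weight average; the patent does not disclose these specifics on this page:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_classes = 16, 5  # feature dimension and number of action classes (illustrative)

# Stand-ins for the aggregated features produced by the shared unit.
Z_f = rng.standard_normal(d)  # aggregated spatial feature
Z_o = rng.standard_normal(d)  # aggregated time domain feature

def l2_normalize(z):
    # The "regularization" step, sketched here as L2 normalization (an assumption).
    return z / np.linalg.norm(z)

# A single weight layer shared by both streams, as described in the embodiment.
W_shared = rng.standard_normal((d, n_classes)) * 0.1

s_f = l2_normalize(Z_f) @ W_shared  # spatial feature classification score
s_o = l2_normalize(Z_o) @ W_shared  # time domain feature classification score

# Fuse the two score vectors into the predicted spatio-temporal score vector;
# equal-weight averaging is an assumption.
s = (s_f + s_o) / 2.0
predicted_class = int(np.argmax(s))
print(s.shape, predicted_class)
```

Sharing one weight layer across both streams forces the normalized spatial and temporal embeddings into a common score space, which is what makes the final score-level fusion meaningful.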

Embodiment 3

[0065] On the basis of either of Embodiments 1-2 above, to better realize the present invention: a sample set is selected for training to generate a classifier model containing correct spatio-temporal feature classification scores for action classification; a combination of a cross-entropy loss function, a heterogeneous triplet pair loss function, and a discriminative embedding-limit loss function is used as the training loss.

[0066] Working principle: a sample set is selected for pre-training the classifier model. Introducing the combination of the cross-entropy loss, the heterogeneous triplet pair loss, and the discriminative embedding-limit loss as the training loss makes the pre-trained classifier model more realistic and reliable, and makes the resulting classes more tightly clustered.
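Two of the three loss terms can be sketched with standard formulations: softmax cross-entropy on the fused scores, and a margin-based triplet loss where, for the "heterogeneous" variant, anchor and positive/negative embeddings come from different streams. The margin, the 0.5 weight, and the omission of the discriminative embedding-limit term are all assumptions; the patent's exact definitions are not given on this page:

```python
import numpy as np

def cross_entropy(scores, label):
    # Softmax cross-entropy for a single score vector and integer label.
    z = scores - scores.max()  # stabilize before exponentiation
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def triplet_loss(anchor, positive, negative, margin=0.3):
    # Hinge-style triplet loss: pull the positive closer than the negative
    # by at least `margin` (margin value is an assumption).
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

rng = np.random.default_rng(2)
scores = np.array([2.0, 0.5, -1.0])              # fused class scores for one clip
label = 0                                         # ground-truth action class
z_f = rng.standard_normal(16)                     # spatial embedding (anchor)
z_o_same = z_f + 0.05 * rng.standard_normal(16)   # temporal embedding, same clip
z_o_other = rng.standard_normal(16)               # temporal embedding, other class

# Combined training loss; the 0.5 weight is illustrative, not from the patent.
loss = cross_entropy(scores, label) + 0.5 * triplet_loss(z_f, z_o_same, z_o_other)
print(round(loss, 4))
```

The cross-entropy term drives correct classification, while the cross-stream triplet term is what encourages the tighter clustering the working principle describes.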

[0067] Other parts of this embodiment are the same as those of either of Embodiments 1-2 above, so details are not repeated here.



Abstract

The invention relates to a video category recognition method based on a dual-stream collaborative network. The method comprises the following steps. First, information interaction is carried out between heterogeneous spatial domain features and time domain features, fusing the heterogeneous time domain and spatial domain features. A complementary time domain and spatial domain part is then extracted from the fused spatio-temporal features and fused back into the originally extracted time domain and spatial domain features; all features with the complementary part fused in form the spatial domain sequence features and time domain sequence features. Sequential feature aggregation is performed on the spatial domain sequence features and the time domain sequence features to obtain aggregated spatial domain features and aggregated time domain features. Finally, a classifier model is pre-trained to perform test classification on the video to be identified. The invention realizes complementation between the information flowing in from different modes, thereby achieving a more accurate action recognition effect.

Description

Technical field [0001] The invention belongs to the technical field of video action classification and recognition, and in particular relates to a video action classification and recognition method based on a dual-stream collaborative network. Background technique [0002] With the popularity of smart phones, public surveillance, portable cameras and other devices, short video data has grown rapidly because it is easy to acquire. Action recognition based on short videos not only has important academic value, but can also support commercial applications such as intelligent security and user recommendation. The dual-stream network has long been the most widely used and effective framework in the field of action recognition, but most existing dual-stream action recognition solutions focus on how to design structures for integrating the features of different streams; the stream networks are trained separately and cannot perform end-to-end reasoning. [0...

Claims


Application Information

IPC(8): G06K9/00, G06K9/62
CPC: G06V20/46, G06F18/253, G06F18/214
Inventor 徐行, 张静然, 沈复民, 贾可, 申恒涛
Owner CHENGDU KOALA URAN TECH CO LTD