[0024] Embodiment 1: As shown in Figure 1, Figure 2, Figure 3 and Figure 4, the violent video recognition method based on attention-mechanism dual-modal task learning comprises the following steps:
[0025] Step 1: Add an attention mechanism module to the spatial-stream deep neural network to capture the interdependencies among the violence features of static frame images, forming the attention weights;
[0026] Step 2: Add an attention mechanism module to the temporal-stream deep neural network to capture the interdependencies among the violence features of the optical flow sequence, forming the attention weights;
[0027] Step 3: Extract the feature information of the violent video from single frame images, and establish a violent video recognition model based on single frame images;
[0028] Step 4: Extract the feature information of the violent video from the motion optical flow, and establish a violent video recognition model based on motion optical flow;
[0029] Step 5: Spatio-temporal feature fusion. Using the average fusion method, the scores of the violent video recognition model based on single frame images and the scores of the violent video recognition model based on motion optical flow are fused to give the final violence classification score.
[0030] Specifically, the steps of adding the attention mechanism module to the spatial-stream deep neural network are as follows:
[0031] Step 11: Construct a deep neural network for capturing violence attention relationships on the spatial stream. Using the TSN network as the backbone, the attention mechanism module GCNet is embedded at the conv_bn_3c, conv_bn_4e and conv_bn_5b layers of the network to complete the deep neural network for capturing violence attention relationships on the spatial stream;
[0032] Step 12: Learn the attention relationship weights. Train the deep neural network constructed in Step 11 on the violent video sample data set to obtain the spatial-stream violence attention relationship weights;
[0033] Step 13: Form the attention features. Fuse the original features with the spatial-stream violence attention relationship weights learned in Step 12 by element-wise addition, obtaining spatial-stream features with attention interdependencies. A sketch of such an attention block is given below.
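The patent gives no code, but the GCNet module it names follows a published design; the PyTorch sketch below shows the kind of global context block that would be embedded at layers such as conv_bn_3c, ending with the element-wise addition described in Step 13. The class name GlobalContextBlock, the reduction ratio of 16, and the channel handling are illustrative assumptions, not details taken from the text.

```python
# A minimal GCNet-style global context block (sketch, not the patented code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalContextBlock(nn.Module):
    """Global context pooling -> channel transform -> element-wise addition
    with the original feature map (the fusion named in Step 13)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.context_mask = nn.Conv2d(channels, 1, kernel_size=1)  # attention weights
        hidden = max(channels // reduction, 1)
        self.transform = nn.Sequential(                            # channel transform
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.LayerNorm([hidden, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # 1. Context modeling: softmax attention over all spatial positions.
        mask = F.softmax(self.context_mask(x).view(n, 1, h * w), dim=2)
        context = torch.bmm(x.view(n, c, h * w), mask.transpose(1, 2))
        context = context.view(n, c, 1, 1)                         # (N, C, 1, 1)
        # 2. Transform the context, then 3. fuse by broadcast element-wise addition.
        return x + self.transform(context)
```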
[0034] Specifically, the steps of adding the attention mechanism module to the temporal-stream deep neural network are as follows:
[0035] Step 21: Construct a deep neural network for capturing violence attention relationships on the temporal stream. Using the TSN network as the backbone, the attention mechanism module GCNet is embedded at the conv_bn_3c, conv_bn_4e and conv_bn_5b layers of the network to complete the deep neural network for capturing violence attention relationships on the temporal stream;
[0036] Step 22: Learn the attention relationship weights. Train the deep neural network constructed in Step 21 on the violent video sample data set to obtain the temporal-stream violence attention relationship weights;
[0037] Step 23: Form the attention features. Fuse the original features with the temporal-stream violence attention relationship weights learned in Step 22 by element-wise addition, obtaining temporal-stream features with attention interdependencies.
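The temporal stream reuses the spatial-stream construction but consumes stacked optical flow rather than an RGB image. A common TSN practice, assumed here rather than stated in the patent, is to rebuild the backbone's first convolution for 2L flow channels (e.g. five x/y flow pairs, ten channels) and initialise it from the averaged pretrained RGB kernels:

```python
# Sketch of adapting a pretrained RGB first conv for stacked-flow input
# (TSN cross-modality initialisation; assumed practice, not patent text).
import torch
import torch.nn as nn

def adapt_first_conv(conv_rgb: nn.Conv2d, flow_channels: int = 10) -> nn.Conv2d:
    conv_flow = nn.Conv2d(flow_channels, conv_rgb.out_channels,
                          kernel_size=conv_rgb.kernel_size,
                          stride=conv_rgb.stride,
                          padding=conv_rgb.padding,
                          bias=conv_rgb.bias is not None)
    with torch.no_grad():
        # Average the RGB kernels over the input-channel axis and replicate
        # the mean kernel across all flow channels.
        mean_kernel = conv_rgb.weight.mean(dim=1, keepdim=True)
        conv_flow.weight.copy_(mean_kernel.repeat(1, flow_channels, 1, 1))
        if conv_rgb.bias is not None:
            conv_flow.bias.copy_(conv_rgb.bias)
    return conv_flow
```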
[0038] Specifically, the steps of extracting the feature information of a violent video from single frame images are as follows:
[0039] Step 31: Construct a deep neural network for single-frame image classification with attention relationships. Combine the TSN network with the attention mechanism module GCNet to complete the deep neural network for single-frame image classification with attention relationships;
[0040] Step 32: Train the deep neural network for single-frame image classification with attention relationships from Step 31 on the violent video sample data set to obtain the single-frame image classification model;
[0041] Step 33: Use the single-frame image classification model obtained in Step 32 to output prediction scores for the violent video sample data.
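A minimal sketch of Step 33, assuming a two-class (non-violent/violent) softmax output and TSN's segmental consensus by averaging; spatial_model is a placeholder for the network trained in Step 32:

```python
# Producing a violence prediction score from the trained spatial model (sketch).
import torch

@torch.no_grad()
def predict_scores(spatial_model, frames: torch.Tensor) -> torch.Tensor:
    """frames: (num_segments, 3, H, W) frames sampled from one video.
    Returns class probabilities, e.g. [p_nonviolent, p_violent]."""
    spatial_model.eval()
    logits = spatial_model(frames)        # (num_segments, num_classes)
    consensus = logits.mean(dim=0)        # TSN segmental consensus (average)
    return torch.softmax(consensus, dim=0)
```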
[0042] Specifically, the steps of extracting the feature information of a violent video from the motion optical flow are as follows:
[0043] Step 41: Construct a deep neural network for motion optical flow classification with attention relationships. Combine the TSN network with the attention mechanism module GCNet to complete the deep neural network for motion optical flow classification with attention relationships;
[0044] Step 42: Train the deep neural network for motion optical flow classification with attention relationships from Step 41 on the violent video sample data set to obtain the motion optical flow classification model;
[0045] Step 43: Use the motion optical flow classification model obtained in Step 42 to output prediction scores for the violent video sample data.
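The patent does not name an optical flow algorithm. TSN pipelines often use TV-L1; the sketch below substitutes OpenCV's dense Farneback method as a dependency-light illustration of turning consecutive frames into the x/y flow maps consumed by the temporal stream:

```python
# Building a stacked optical-flow input from consecutive frames (sketch).
import cv2
import numpy as np

def flow_stack(frames: list, length: int = 5) -> np.ndarray:
    """frames: list of >= length+1 grayscale uint8 frames of shape (H, W).
    Returns a (2*length, H, W) stack of x/y flow fields."""
    maps = []
    for prev, nxt in zip(frames[:length], frames[1:length + 1]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        maps.extend([flow[..., 0], flow[..., 1]])   # x then y component
    return np.stack(maps, axis=0)
```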
[0046] Specifically, spatio-temporal feature fusion includes the following steps:
[0047] Step 51: Obtain the violence prediction scores under the two modal networks. First, obtain the single-frame image prediction score from the spatial-stream network and the motion optical flow prediction score from the temporal-stream network;
[0048] Step 52: Perform late fusion of the spatio-temporal features. After Step 51, the violence prediction scores of the two modalities are averaged to give the final violence prediction score.
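A minimal sketch of the average fusion in Steps 51-52; the two-class probability vectors and the equal stream weights are assumptions consistent with the "average fusion method" named in Step 5:

```python
# Late average fusion of the two streams' prediction scores (sketch).
import numpy as np

def fuse_scores(spatial_scores: np.ndarray, temporal_scores: np.ndarray) -> np.ndarray:
    """Both inputs are per-class probability vectors from the two streams."""
    return (spatial_scores + temporal_scores) / 2.0

# Example: spatial says 0.70 violent, temporal says 0.90 -> fused 0.80.
fused = fuse_scores(np.array([0.30, 0.70]), np.array([0.10, 0.90]))
label = "violent" if fused.argmax() == 1 else "non-violent"
```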
[0049] Figure 1 is a flowchart of attention-based dual-modal task learning. Following the flow sequence, each step of the algorithm is implemented as follows:
[0050] Read the video stream;
[0051] The system first obtains video stream data. The video data source may be a video file collected in advance.
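A minimal sketch of reading a pre-collected video file with OpenCV; the path handling is illustrative:

```python
# Reading all frames of a video file into memory (sketch).
import cv2

def read_frames(path: str) -> list:
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:                  # end of stream or read error
            break
        frames.append(frame)        # BGR frame as a NumPy array
    cap.release()
    return frames
```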
[0052] Extract features with attention relationship weights;
[0053] Extract single frame images from the video and feed them into the single-frame image feature extraction network based on the TSN+GCNet model to extract features with attention relationship weights;
[0054] Extract the motion optical flow from the video and feed the optical flow information into the motion optical flow feature extraction network based on the TSN+GCNet model to extract features with attention relationship weights; the sparse frame sampling used by TSN is sketched below.
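TSN samples a video sparsely: it is divided into equal-length segments and one frame (or one short flow stack) is drawn from each. The sketch below assumes this standard TSN sampling; the default of three segments is an assumption, not a figure from the patent:

```python
# Sparse TSN-style segment sampling (sketch; assumes num_frames >= num_segments).
import random

def sample_segment_indices(num_frames: int, num_segments: int = 3) -> list:
    seg_len = num_frames // num_segments
    # One random index per equal-length segment (use the midpoint at test time).
    return [i * seg_len + random.randrange(seg_len) for i in range(num_segments)]
```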
[0055] Fuse the spatio-temporal features;
[0056] The two kinds of feature information obtained in the feature extraction step are used to train two network models, one under each of the spatial and temporal features;
[0057] Each of the two models then gives its own violent video prediction score;
[0058] The prediction scores given by the two models are averaged and fused, and the classification result of the violent video is output.