
Video target detection method based on attention mechanism

A target detection technology based on an attention mechanism, applied in the field of computer vision, which addresses the high computational cost of optical flow, the susceptibility to error propagation, and the difficulty of achieving fast detection in existing methods.

Active Publication Date: 2019-09-27
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0003] Current video target detection frameworks fall into three main types. The first treats video frames as independent images and applies an image target detection algorithm to each frame; this ignores temporal information, detects each frame in isolation, and yields unsatisfactory results. The second combines target detection with target tracking, post-processing the detection results in order to track the target; since tracking accuracy depends on the detections, errors propagate easily. The third performs detection only on a few key frames and then uses optical flow together with the key-frame features to generate features for the remaining frames; although this exploits temporal information, computing optical flow is very expensive, making fast detection difficult.

Method used



Examples


Embodiment 1

[0024] As shown in Figure 1, this embodiment provides a video target detection method based on an attention mechanism, comprising the following steps:

[0025] Step S1: input the video frame image at the current time point into a MobileNet network to extract a candidate feature map;

[0026] Step S2: set a temporal feature fusion window over the past time period adjacent to the current time point. For each video frame to be fused in the window, compute the variance of its image Laplacian and normalize these variances to serve as the fusion weight of each frame. Compute the weighted sum of the candidate feature maps of all frames in the window according to these weights to obtain the temporal features required by the current frame, then concatenate the candidate features of the current frame with the temporal features along the channel dimension to obtain a feature map to be detected that incorporates temporal information;
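The fusion in step S2 can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: the function names, the explicit 3x3 Laplacian kernel, and the (C, H, W) feature-map layout are our assumptions.

```python
import numpy as np

# Standard 3x3 discrete Laplacian kernel (an assumption; the patent
# does not specify which Laplacian operator is used).
LAPLACIAN = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=np.float64)

def laplacian_variance(img):
    """Variance of the Laplacian response of a grayscale image.

    A common sharpness measure: blurry frames give low variance,
    sharp frames give high variance.
    """
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    # Valid-mode 3x3 convolution, unrolled over the kernel taps.
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out.var()

def fuse_features(window_imgs, window_feats, current_feat):
    """Weight each past frame's candidate feature map by its normalized
    Laplacian variance, sum them into a temporal feature, and concatenate
    with the current frame's features along the channel axis.

    window_imgs:  list of (H, W) grayscale frames in the fusion window
    window_feats: list of (C, h, w) candidate feature maps, same order
    current_feat: (C, h, w) candidate feature map of the current frame
    """
    variances = np.array([laplacian_variance(f) for f in window_imgs])
    weights = variances / variances.sum()  # normalize so weights sum to 1
    temporal = sum(w * f for w, f in zip(weights, window_feats))
    # Channel-wise concatenation -> (2C, h, w) feature map to be detected.
    return np.concatenate([current_feat, temporal], axis=0)
```

Note that sharper (higher-variance) frames contribute more to the temporal feature, which is the stated motivation: down-weighting low-quality (e.g. motion-blurred) frames in the window.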

[0027] Step S3, u...



Abstract

The invention relates to a video target detection method based on an attention mechanism, in the field of computer vision. The method comprises the following steps: S1, extracting a candidate feature map of the current frame; S2, setting a fusion window over the past time period, calculating the Laplacian variance of each frame in the window, normalizing the variances to serve as per-frame weights, computing the weighted sum of the candidate feature maps of all frames in the window to obtain a temporal feature, and concatenating the candidate feature map of the current frame with the temporal feature to obtain a feature map to be detected; S3, extracting feature maps of additional scales from the feature map to be detected using convolutional layers; and S4, predicting target categories and positions on the feature maps of different scales using convolutional layers. The proposed fusion method assigns different weights to frame features of different quality within the past time period, so that temporal information is fused more fully and the performance of the detection model is improved.
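The fusion described in step S2 can be written compactly as follows (the notation is ours; the patent states the procedure only in prose):

```latex
% K past frames in the fusion window; I_i the i-th frame image,
% F_i its candidate feature map, F_t the current frame's features.
w_i = \frac{\mathrm{Var}\!\left(\nabla^2 I_i\right)}
           {\sum_{j=1}^{K} \mathrm{Var}\!\left(\nabla^2 I_j\right)},
\qquad
F_{\mathrm{temp}} = \sum_{i=1}^{K} w_i F_i,
\qquad
F_{\mathrm{det}} = \left[\, F_t \,;\, F_{\mathrm{temp}} \,\right]
```

Here $\nabla^2 I_i$ denotes the image Laplacian of frame $i$, and $[\,\cdot\,;\,\cdot\,]$ denotes concatenation along the channel dimension.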

Description

technical field
[0001] The invention relates to computer vision, deep learning, and video target detection technology.
Background technique
[0002] Image object detection methods based on deep learning have made great progress in the past five years, for example the RCNN series, SSD, and YOLO series networks. However, in fields such as video surveillance and vehicle-assisted driving, video-based target detection is in wider demand. Because videos suffer from motion blur, occlusion, and diverse changes in morphology and illumination, applying image object detection alone to video frames does not yield good detection results. Adjacent frames in a video exhibit temporal continuity and spatial similarity, and target positions are correlated across frames. Exploiting this temporal information to improve the performance of video target detection is therefore essential.
[0003] The current video t...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00; G06N3/04
CPC: G06V20/46; G06V20/41; G06N3/045
Inventor: 李建强, 白骏, 刘雅琦
Owner: BEIJING UNIV OF TECH