
Video target detection method based on attention mechanism

A target detection technology based on an attention mechanism, applied in the field of computer vision, which addresses the high computational cost of optical flow, the susceptibility to error propagation, and the difficulty of achieving fast detection in existing methods.

Active Publication Date: 2019-09-27
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0003] Current video target detection frameworks fall into three main types. The first treats video frames as independent images and applies an image target detection algorithm to each frame; this ignores temporal information, detects each frame in isolation, and yields unsatisfactory results. The second combines target detection with target tracking, post-processing the detection results in order to track the target; since tracking accuracy depends on the detections, errors propagate easily. The third performs detection only on a few key frames and then uses optical flow together with the key-frame features to generate features for the remaining frames; although this exploits temporal information, computing optical flow is very expensive, making fast detection difficult.

Method used



Examples


Embodiment 1

[0024] As shown in Figure 1, this embodiment provides a video target detection method based on an attention mechanism, comprising the following steps:

[0025] Step S1: input the video frame image at the current time point into a MobileNet network to extract a candidate feature map;

[0026] Step S2: set a temporal feature fusion window over the past time period adjacent to the current time point. For each video frame to be fused in the window, compute the variance of its image Laplacian and normalize these variances to serve as the fusion weight of each frame. Compute the weighted sum of the candidate feature maps of all frames in the window according to these weights to obtain the temporal features required by the current frame, then concatenate the candidate features of the current frame with the temporal features along the channel dimension to obtain a feature map to be detected that incorporates temporal information;
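The fusion in step S2 can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: the function names, the explicit 3x3 Laplacian kernel, and the (C, H, W) feature-map layout are our assumptions.

```python
import numpy as np

# Standard 3x3 discrete Laplacian kernel (an assumption; the patent
# does not specify which Laplacian operator is used).
LAPLACIAN = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=np.float64)

def laplacian_variance(img):
    """Variance of the Laplacian response of a grayscale image.

    A common sharpness measure: blurry frames give low variance,
    sharp frames give high variance.
    """
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    # Valid-mode 3x3 convolution, unrolled over the kernel taps.
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out.var()

def fuse_features(window_imgs, window_feats, current_feat):
    """Weight each past frame's candidate feature map by its normalized
    Laplacian variance, sum them into a temporal feature, and concatenate
    with the current frame's features along the channel axis.

    window_imgs:  list of (H, W) grayscale frames in the fusion window
    window_feats: list of (C, h, w) candidate feature maps, same order
    current_feat: (C, h, w) candidate feature map of the current frame
    """
    variances = np.array([laplacian_variance(f) for f in window_imgs])
    weights = variances / variances.sum()  # normalize so weights sum to 1
    temporal = sum(w * f for w, f in zip(weights, window_feats))
    # Channel-wise concatenation -> (2C, h, w) feature map to be detected.
    return np.concatenate([current_feat, temporal], axis=0)
```

Note that sharper (higher-variance) frames contribute more to the temporal feature, which is the stated motivation: down-weighting low-quality (e.g. motion-blurred) frames in the window.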

[0027] Step S3, u...



Abstract

The invention relates to a video target detection method based on an attention mechanism, in the field of computer vision. The method comprises the following steps: S1, extracting a candidate feature map of the current frame; S2, setting a fusion window over the past time period, calculating the Laplacian variance of each frame in the window, normalizing the variances to serve as per-frame weights, computing the weighted sum of the candidate feature maps of all frames in the window to obtain a temporal feature, and concatenating the candidate feature map of the current frame with the temporal feature to obtain a feature map to be detected; S3, extracting feature maps of additional scales from the feature map to be detected using convolutional layers; and S4, predicting target categories and positions on the feature maps of different scales using convolutional layers. The proposed fusion method assigns different weights to frame features of different quality within the past time period, so that temporal information is fused more fully and the performance of the detection model is improved.
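The fusion described in step S2 can be written compactly as follows (the notation is ours; the patent states the procedure only in prose):

```latex
% K past frames in the fusion window; I_i the i-th frame image,
% F_i its candidate feature map, F_t the current frame's features.
w_i = \frac{\mathrm{Var}\!\left(\nabla^2 I_i\right)}
           {\sum_{j=1}^{K} \mathrm{Var}\!\left(\nabla^2 I_j\right)},
\qquad
F_{\mathrm{temp}} = \sum_{i=1}^{K} w_i F_i,
\qquad
F_{\mathrm{det}} = \left[\, F_t \,;\, F_{\mathrm{temp}} \,\right]
```

Here $\nabla^2 I_i$ denotes the image Laplacian of frame $i$, and $[\,\cdot\,;\,\cdot\,]$ denotes concatenation along the channel dimension.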

Description

technical field
[0001] The invention relates to computer vision, deep learning, and video target detection technology.
Background technique
[0002] Image object detection methods based on deep learning have made great progress in the past five years, for example the RCNN series, SSD, and YOLO series networks. However, in fields such as video surveillance and vehicle-assisted driving, video-based target detection is in wider demand. Because videos suffer from motion blur, occlusion, and diverse changes in morphology and illumination, applying image object detection alone to video frames does not yield good detection results. Adjacent frames in a video exhibit temporal continuity and spatial similarity, and target positions are correlated across frames. Exploiting this temporal information to improve the performance of video target detection is therefore essential.
[0003] The current video t...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00; G06N3/04
CPC: G06V20/46; G06V20/41; G06N3/045
Inventor: 李建强, 白骏, 刘雅琦
Owner: BEIJING UNIV OF TECH