Video target detection method based on convolutional gating recurrent neural unit

A convolutional gated recurrent neural unit and target detection technology, applied in the field of image processing, that addresses low detection accuracy and insufficient feature accuracy, achieving higher detection accuracy and improved feature quality.

Active Publication Date: 2019-07-02
XIDIAN UNIV
3 Cites, 38 Cited by

AI Technical Summary

Problems solved by technology

However, because the features obtained through optical flow estimation are not accurate enough, the detection accuracy of this method is slightly lower than that of direct single-frame detection with R-FCN.

Examples

Embodiment 1

[0029] Video object detection requires correct object recognition and bounding-box prediction for every frame of a video. Compared with object detection in still images, detection in video adds a temporal relationship and faces difficulties rarely seen in image data. Single-frame detection methods cannot make full use of the temporal relationship and adapt poorly to conditions specific to video data, such as motion blur, defocus, occlusion, and unusual poses. The T-CNN family of methods imposes temporal consistency constraints, but its steps are complicated and it cannot be trained end to end. The DFF family of methods makes full use of the redundancy between consecutive frames, but does not exploit the information shared between those frames to improve the quality of feature extraction. Aiming at the shortcomings of the above methods, the present invention introduces a circular gating convolutional ne...

Embodiment 2

[0050] The video target detection method based on the convolutional gated recurrent neural unit is the same as in Embodiment 1. Estimating the reference frame features from the current frame features, as described in step (4), specifically comprises the following steps:

[0051] 4.1) Each reference frame $K_{t-n/2} \sim K_{t+n/2}$ is concatenated with the current frame $I_t$ along the channel dimension and used as the input of the optical flow learning network. The output of the network is expressed as $S_i = M(K_i, I_t)$, where $i$ ranges over the times $t-n/2 \sim t+n/2$, $S_i$ is the output of the optical flow learning network at the $i$-th moment, $M$ denotes the optical flow learning network, $K_i$ is the $i$-th reference frame, and $I_t$ is the current frame.

[0052] In this embodiment, a FlowNet fully trained on the Flying Chairs dataset is used as the optical flow learning network. The output of the network is 1/4 the size of the original image, which n...
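
The channel-wise concatenation in step 4.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `build_flow_inputs` and `placeholder_flow_network` are hypothetical, the frame sizes are arbitrary, and the placeholder stands in for the trained FlowNet by returning a zero two-channel field at 1/4 resolution (the two-channel layout is an assumption; the 1/4 scale comes from the text).

```python
import numpy as np

def build_flow_inputs(reference_frames, current_frame):
    """Concatenate each reference frame K_i with the current frame I_t along channels.

    reference_frames: array of shape (n_refs, C, H, W)
    current_frame:    array of shape (C, H, W)
    returns:          array of shape (n_refs, 2*C, H, W), K_i channels first
    """
    n_refs = reference_frames.shape[0]
    # Repeat I_t once per reference frame so every (K_i, I_t) pair can be stacked.
    current = np.broadcast_to(current_frame, (n_refs,) + current_frame.shape)
    return np.concatenate([reference_frames, current], axis=1)

def placeholder_flow_network(pair):
    """Hypothetical stand-in for M: a zero 2-channel field at 1/4 resolution."""
    _, height, width = pair.shape
    return np.zeros((2, height // 4, width // 4), dtype=pair.dtype)

rng = np.random.default_rng(0)
refs = rng.random((4, 3, 64, 96)).astype(np.float32)  # K_{t-n/2} .. K_{t+n/2}, n = 4
cur = rng.random((3, 64, 96)).astype(np.float32)      # I_t

pairs = build_flow_inputs(refs, cur)
flows = np.stack([placeholder_flow_network(p) for p in pairs])  # S_i = M(K_i, I_t)
print(pairs.shape, flows.shape)
```

Stacking the pairs into one batch lets a real flow network process all reference frames in a single forward pass.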

Embodiment 3

[0060] The video target detection method based on the convolutional gated recurrent neural unit is the same as in Embodiments 1-2. The temporal context feature learning based on the convolutional gated recurrent neural unit in step (5) comprises the following detailed steps:

[0061] 5.1) The estimated reference frame features $E_{t-n/2} \sim E_{t+n/2}$ obtained by steps (1)~(4) in claim 1 and the current frame feature $F_t$ are arranged in temporal order and used as the input of the convolutional gated recurrent neural unit, denoted $H$;

[0062] 5.2) The specific calculation formula for the forward propagation of the convolutional gated recurrent neural unit is as follows:

[0063] $z_t = \sigma(W_z * H_t + U_z * M_{t-1})$,

[0064] $r_t = \sigma(W_r * H_t + U_r * M_{t-1})$,

[0065]

[0066]

[0067] where $H_t$ is the input feature map of the convolutional gated recurrent neural unit at the current moment, and $M_{t-1}$ is the feature map with memory learned by the convolutional gated recurrent neural u...
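
One step of a convolutional gated recurrent unit can be sketched in NumPy as below. The update gate z_t and reset gate r_t follow formulas [0063] and [0064]. The candidate memory and the final update are not recoverable from this page, so the sketch assumes the standard GRU form with tanh; that part, the parameter names `W` and `U`, the 3x3 kernels, the zero initial memory, and the helper names are all assumptions rather than the patent's definition.

```python
import numpy as np

def conv2d_same(x, w):
    """Zero-padded 'same' 2D cross-correlation.

    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k. Returns (C_out, H, W).
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    height, width = x.shape[1:]
    out = np.zeros((c_out, height, width))
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + height, dx:dx + width]
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def conv_gru_step(h_t, m_prev, p):
    """One recurrent step; returns the new memory M_t and both gates."""
    # Gates as given in [0063] and [0064].
    z = sigmoid(conv2d_same(h_t, p["W_z"]) + conv2d_same(m_prev, p["U_z"]))
    r = sigmoid(conv2d_same(h_t, p["W_r"]) + conv2d_same(m_prev, p["U_r"]))
    # ASSUMPTION: standard GRU candidate and update, not taken from the source.
    cand = np.tanh(conv2d_same(h_t, p["W"]) + conv2d_same(r * m_prev, p["U"]))
    m_t = (1.0 - z) * m_prev + z * cand
    return m_t, z, r

rng = np.random.default_rng(0)
channels, height, width = 4, 8, 8
params = {name: 0.1 * rng.standard_normal((channels, channels, 3, 3))
          for name in ("W_z", "U_z", "W_r", "U_r", "W", "U")}
sequence = rng.standard_normal((5, channels, height, width))  # H, in temporal order
memory = np.zeros((channels, height, width))                  # assumed initial memory
for h_t in sequence:
    memory, z_t, r_t = conv_gru_step(h_t, memory, params)
print(memory.shape)
```

Because the gates are computed with convolutions instead of matrix products, the memory keeps the spatial layout of the feature maps, which is what a downstream detection head needs.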

Abstract

The invention discloses a video target detection method based on a convolutional gated recurrent neural unit. By using the temporal context information of video data, it addresses the tedious steps and low detection precision of the prior art. The method comprises the following steps: processing the data set and pre-training the network; selecting reference frames and estimating the reference frame features from the current frame features; learning temporal context features with the convolutional gated recurrent neural unit; performing weighted fusion of the temporally related features; extracting target candidate boxes; performing target classification and position regression; training to obtain the video target detection network model; and verifying the model's effect. The invention introduces a feature propagation scheme in which the current frame estimates the reference frames, establishing a temporal relationship between the current frame features and the reference frame features. The convolutional gated recurrent neural unit gives the current frame access to reference frame information, and weighted fusion enhances the feature quality of the current frame. The method improves detection precision and reduces complexity at low time cost, and can be used for video target detection.
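
The weighted fusion step named in the abstract could look roughly like the sketch below. The page does not say how the weights are obtained, so everything here is an assumption for illustration: the function name `weighted_fusion`, the per-position scores, and the softmax normalisation over the time axis are hypothetical choices, not the patent's method.

```python
import numpy as np

def weighted_fusion(features, scores):
    """Fuse per-frame feature maps with per-position weights.

    features: (T, C, H, W) temporally related feature maps
    scores:   (T, H, W) unnormalised importance scores (hypothetical input)
    returns:  fused map (C, H, W) and the normalised weights (T, H, W)
    """
    # Softmax over the time axis so the weights at each position sum to 1.
    shifted = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    fused = (weights[:, None] * features).sum(axis=0)
    return fused, weights

rng = np.random.default_rng(0)
features = rng.standard_normal((5, 4, 8, 8))
scores = rng.standard_normal((5, 8, 8))
fused, weights = weighted_fusion(features, scores)
print(fused.shape, weights.shape)
```

With equal scores the fusion reduces to a plain average over frames, which makes a convenient sanity check.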

Description

Technical field

[0001] The invention belongs to the technical field of image processing and relates to video target detection, in particular to a video target detection method based on a convolutional gated recurrent neural unit, which can be used to locate and identify common objects in surveillance videos and online videos.

Background

[0002] With the rapid development and application of deep learning, convolutional neural networks in particular have made great progress in image classification, recognition, and segmentation. Technologies such as license plate recognition and face recognition are now widely used in daily life. This progress has benefited from the rapid development of computer hardware and the easy availability of massive data. Since Li Feifei and others proposed the ImageNet dataset and challenge in 2012, the performance of basic classification networks has improved rapidly. A...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06N3/04, G06N3/08
CPC: G06N3/08, G06V20/41, G06V20/46, G06N3/048, G06N3/044, G06N3/045
Inventor: 韩红, 李阳, 岳欣, 张照宇, 陈军如, 高鑫磊, 范迎春, 支涛
Owner: XIDIAN UNIV