Network live video feature extraction method in complex scene based on joint attention ResNeSt

A live webcast video feature extraction technology, applied in image communication, selective content distribution, electrical components, etc. It addresses the problem that existing methods find it difficult to effectively learn spatio-temporal context information, which affects accuracy, and achieves the effects of saving computing resources, enhancing effective feature extraction, and producing features with good discrimination.

Active Publication Date: 2021-04-13
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

When the video scene is relatively simple, the number of people is small, and object edges are clear, existing deep learning networks can achieve good performance; however, when there are many types of scenes and the number of people is uncertain...

Method used




Example Embodiment

[0030] In accordance with the above description, a specific implementation process is given below, although the scope of protection of this patent is not limited to this implementation. The specific workflow of the present invention is as follows:

[0031] The video data used in the present invention comes from several online video platforms, and key frames are extracted from the downloaded live videos. In the experiments, key frames are sampled at 5 FPS, and a single segment of 16 consecutive frames is taken to represent each video; each frame is resized to 224 × 224 pixels. The video frame data is fed into the parallel feature pyramid to obtain feature maps at different scales; then, through the computation of the joint attention mechanism, the attention weight allocation over the multi-scale features is obtained; finally, a ResNeSt module is constructed by combining convolution and pooling operations, and by stacking ResNeSt modules a ResNeSt-50 feature extraction...
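As a concrete reading of the sampling step just described (key frames at 5 FPS, one segment of 16 consecutive frames, resized to 224 × 224), here is a minimal Python sketch using OpenCV. The function `sample_clip` and its defaults are hypothetical illustrations, not taken from the patent.

```python
import cv2
import numpy as np

def sample_clip(video_path, sample_fps=5, clip_len=16, size=(224, 224)):
    """Sample frames at `sample_fps`, keep one clip of `clip_len` consecutive
    sampled frames, and resize each frame to `size`."""
    cap = cv2.VideoCapture(video_path)
    native_fps = cap.get(cv2.CAP_PROP_FPS) or sample_fps  # fall back if unknown
    step = max(int(round(native_fps / sample_fps)), 1)    # keep every step-th frame
    frames, idx = [], 0
    while len(frames) < clip_len:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(cv2.resize(frame, size))
        idx += 1
    cap.release()
    # Shape: (clip_len, 224, 224, 3); a real pipeline would also normalize.
    return np.stack(frames) if len(frames) == clip_len else None
```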



Abstract

The invention relates to a network live video feature extraction method in complex scenes based on joint attention ResNeSt. First, key frame extraction is performed on a network live video to obtain the video's key frame data. To exploit the multi-scale features of video frames, a parallel path is designed according to the multi-scale structure of a feature pyramid network: the parallel path is constructed from bottom to top, and information is exchanged between the parallel path and the original main path through transverse and oblique connections, both of which are convolution operations. Considering that live-streaming pictures mostly center on a human subject mixed with a large amount of redundant information, space-channel joint attention is introduced to focus on the features of the picture's main subject. Finally, a ResNeSt feature extraction module is constructed by combining the parallel feature pyramid fused with joint attention with a convolution layer and a pooling layer, and feature extraction of network live video in complex scenes is realized by stacking multiple such modules.
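The abstract specifies that the transverse and oblique connections are convolution operations but not their exact form. The PyTorch sketch below is one plausible reading, assuming a 1 × 1 convolution for the same-scale transverse connection and a stride-2 3 × 3 convolution for the oblique connection from the level below; the class and parameter names are hypothetical.

```python
import torch
import torch.nn as nn

class ParallelPyramidStage(nn.Module):
    """One level of the bottom-up parallel path: fuses the main path's
    same-scale feature (transverse connection) with the parallel path's
    feature from the level below (oblique connection, downsampled)."""
    def __init__(self, main_ch, lower_ch, out_ch):
        super().__init__()
        self.transverse = nn.Conv2d(main_ch, out_ch, kernel_size=1)
        self.oblique = nn.Conv2d(lower_ch, out_ch, kernel_size=3,
                                 stride=2, padding=1)

    def forward(self, main_feat, lower_feat):
        # main_feat: (B, main_ch, H, W); lower_feat: (B, lower_ch, 2H, 2W)
        return self.transverse(main_feat) + self.oblique(lower_feat)
```

The stride-2 oblique convolution halves the spatial resolution of the lower level so it can be summed with the same-scale transverse output, matching the bottom-up construction of the parallel path.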

Description

Technical field

[0001] The present invention takes network live video in complex scenes as its research object and extracts features of live video through joint attention and a ResNeSt network, thereby forming an efficient feature expression of live video. First, a parallel feature pyramid performs feature convolution on the key frames of the video; during the feature pyramid's convolution process, the video's low-level visual information and high-level semantic information are obtained by introducing a joint attention mechanism; finally, these are combined with the split-attention residual network (Residual Networks with Split-Attention, ResNeSt) to form an efficient feature expression for live webcast videos.

Background technique

[0002] With the advent of the Internet self-media era, more and more people share their lives on the Internet in the form of live videos, and the number of live videos on the Internet is increasing geometrically. Webcasting has a strong abil...
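The split-attention residual network (ResNeSt) named in [0001] is published work; below is a simplified PyTorch sketch of a split-attention block with cardinality 1 and radix 2, showing how radix branches are fused by a softmax over the radix dimension. It is an illustration under these assumptions, not the patent's implementation, and all names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitAttention(nn.Module):
    """Simplified split-attention block: the input is convolved into `radix`
    branches, and a learned softmax over the radix dimension decides how the
    branches are fused."""
    def __init__(self, channels, radix=2, reduction=4):
        super().__init__()
        assert channels % radix == 0
        self.radix = radix
        inter = max(channels // reduction, 32)
        # One grouped conv produces all radix branches at once.
        self.conv = nn.Conv2d(channels, channels * radix, kernel_size=3,
                              padding=1, groups=radix)
        self.fc1 = nn.Conv2d(channels, inter, kernel_size=1)
        self.fc2 = nn.Conv2d(inter, channels * radix, kernel_size=1)

    def forward(self, x):
        b, c = x.shape[:2]
        splits = self.conv(x).view(b, self.radix, c, *x.shape[2:])
        gap = splits.sum(dim=1).mean(dim=(2, 3), keepdim=True)  # global context
        attn = self.fc2(F.relu(self.fc1(gap))).view(b, self.radix, c, 1, 1)
        attn = F.softmax(attn, dim=1)          # r-softmax across radix branches
        return (attn * splits).sum(dim=1)      # attention-weighted fusion
```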


Application Information

IPC(8): H04N21/2187; H04N21/234; H04N21/44
CPC: H04N21/2187; H04N21/23418; H04N21/44008
Inventors: 张菁 (Zhang Jing), 康俊鹏 (Kang Junpeng), 张广朋 (Zhang Guangpeng), 卓力 (Zhuo Li)
Owner: BEIJING UNIV OF TECH