Network live video feature extraction method in complex scene based on joint attention ResNeSt

A live webcast video feature extraction technology, applied in image communication, selective content distribution, electrical components, etc. It addresses the problem that existing methods find it difficult to effectively learn spatio-temporal context information, which affects accuracy, and achieves the effects of saving computing resources, enhancing effective feature extraction, and producing features with good discrimination.

Active Publication Date: 2021-04-13
BEIJING UNIV OF TECH

AI Technical Summary

Problems solved by technology

When the video scene is relatively simple, the number of people is small, and object edges are clear, existing deep learning networks can achieve good performance; however, when there are many types of scenes and the number of people is uncertain...

Method used




Example Embodiment

[0030] In accordance with the above description, a specific implementation process is given below, although the scope of protection of this patent is not limited to this implementation. The specific workflow of the present invention is as follows:

[0031] The video data used in the present invention comes from several online video platforms, and key frames are extracted from the downloaded live videos. In the experiments, key frames are sampled at 5 FPS, and a single segment of 16 consecutive frames is taken to represent each video; each frame is resized to 224 × 224 pixels. The video frame data is fed into the parallel feature pyramid to obtain feature maps at different scales; then, through the computation of the joint attention mechanism, the attention weight allocation over the multi-scale features is obtained; finally, a ResNeSt module is constructed by combining convolution and pooling operations, and by stacking ResNeSt modules a ResNeSt-50 feature extraction...
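As a concrete reading of the sampling step just described (key frames at 5 FPS, one segment of 16 consecutive frames, resized to 224 × 224), here is a minimal Python sketch using OpenCV. The function `sample_clip` and its defaults are hypothetical illustrations, not taken from the patent.

```python
import cv2
import numpy as np

def sample_clip(video_path, sample_fps=5, clip_len=16, size=(224, 224)):
    """Sample frames at `sample_fps`, keep one clip of `clip_len` consecutive
    sampled frames, and resize each frame to `size`."""
    cap = cv2.VideoCapture(video_path)
    native_fps = cap.get(cv2.CAP_PROP_FPS) or sample_fps  # fall back if unknown
    step = max(int(round(native_fps / sample_fps)), 1)    # keep every step-th frame
    frames, idx = [], 0
    while len(frames) < clip_len:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(cv2.resize(frame, size))
        idx += 1
    cap.release()
    # Shape: (clip_len, 224, 224, 3); a real pipeline would also normalize.
    return np.stack(frames) if len(frames) == clip_len else None
```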



Abstract

The invention relates to a network live video feature extraction method in complex scenes based on joint attention ResNeSt. First, key frame extraction is performed on a network live video to obtain the video's key frame data. To exploit the multi-scale features of video frames, a parallel path is designed according to the multi-scale structure of a feature pyramid network: the parallel path is constructed from bottom to top, and information is exchanged between the parallel path and the original main path through transverse and oblique connections, both of which are convolution operations. Considering that live-streaming pictures mostly center on a human subject mixed with a large amount of redundant information, space-channel joint attention is introduced to focus on the features of the picture's main subject. Finally, a ResNeSt feature extraction module is constructed by combining the parallel feature pyramid fused with joint attention with a convolution layer and a pooling layer, and feature extraction of network live video in complex scenes is realized by stacking multiple such modules.
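The abstract specifies that the transverse and oblique connections are convolution operations but not their exact form. The PyTorch sketch below is one plausible reading, assuming a 1 × 1 convolution for the same-scale transverse connection and a stride-2 3 × 3 convolution for the oblique connection from the level below; the class and parameter names are hypothetical.

```python
import torch
import torch.nn as nn

class ParallelPyramidStage(nn.Module):
    """One level of the bottom-up parallel path: fuses the main path's
    same-scale feature (transverse connection) with the parallel path's
    feature from the level below (oblique connection, downsampled)."""
    def __init__(self, main_ch, lower_ch, out_ch):
        super().__init__()
        self.transverse = nn.Conv2d(main_ch, out_ch, kernel_size=1)
        self.oblique = nn.Conv2d(lower_ch, out_ch, kernel_size=3,
                                 stride=2, padding=1)

    def forward(self, main_feat, lower_feat):
        # main_feat: (B, main_ch, H, W); lower_feat: (B, lower_ch, 2H, 2W)
        return self.transverse(main_feat) + self.oblique(lower_feat)
```

The stride-2 oblique convolution halves the spatial resolution of the lower level so it can be summed with the same-scale transverse output, matching the bottom-up construction of the parallel path.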

Description

Technical field

[0001] The present invention takes network live video in complex scenes as its research object and extracts features of live video through joint attention and a ResNeSt network, thereby forming an efficient feature expression of live video. First, a parallel feature pyramid performs feature convolution on the key frames of the video; during the feature pyramid's convolution process, the video's low-level visual information and high-level semantic information are obtained by introducing a joint attention mechanism; finally, these are combined with the split-attention residual network (Residual Networks with Split-Attention, ResNeSt) to form an efficient feature expression for live webcast videos.

Background technique

[0002] With the advent of the Internet self-media era, more and more people share their lives on the Internet in the form of live videos, and the number of live videos on the Internet is increasing geometrically. Webcasting has a strong abil...
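The split-attention residual network (ResNeSt) named in [0001] is published work; below is a simplified PyTorch sketch of a split-attention block with cardinality 1 and radix 2, showing how radix branches are fused by a softmax over the radix dimension. It is an illustration under these assumptions, not the patent's implementation, and all names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitAttention(nn.Module):
    """Simplified split-attention block: the input is convolved into `radix`
    branches, and a learned softmax over the radix dimension decides how the
    branches are fused."""
    def __init__(self, channels, radix=2, reduction=4):
        super().__init__()
        assert channels % radix == 0
        self.radix = radix
        inter = max(channels // reduction, 32)
        # One grouped conv produces all radix branches at once.
        self.conv = nn.Conv2d(channels, channels * radix, kernel_size=3,
                              padding=1, groups=radix)
        self.fc1 = nn.Conv2d(channels, inter, kernel_size=1)
        self.fc2 = nn.Conv2d(inter, channels * radix, kernel_size=1)

    def forward(self, x):
        b, c = x.shape[:2]
        splits = self.conv(x).view(b, self.radix, c, *x.shape[2:])
        gap = splits.sum(dim=1).mean(dim=(2, 3), keepdim=True)  # global context
        attn = self.fc2(F.relu(self.fc1(gap))).view(b, self.radix, c, 1, 1)
        attn = F.softmax(attn, dim=1)          # r-softmax across radix branches
        return (attn * splits).sum(dim=1)      # attention-weighted fusion
```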


Application Information

IPC(8): H04N21/2187; H04N21/234; H04N21/44
CPC: H04N21/2187; H04N21/23418; H04N21/44008
Inventors: 张菁 (Zhang Jing), 康俊鹏 (Kang Junpeng), 张广朋 (Zhang Guangpeng), 卓力 (Zhuo Li)
Owner: BEIJING UNIV OF TECH