Video scene classification method, device, equipment and storage medium

A video scene classification technology, applied in the field of computer vision, that addresses the difficulty of classifying scenes in video and achieves a fast recognition rate, improved accuracy, and support for personalized viewing needs.

Active Publication Date: 2022-06-24
BEIJING BYTEDANCE NETWORK TECH CO LTD

AI Technical Summary

Problems solved by technology

Scenes in a video change rapidly, which makes it difficult to classify the scenes in the video.


Examples

Embodiment 1

[0025] Figure 1 is a flowchart of a video scene classification method provided in Embodiment 1 of the present disclosure. This embodiment is applicable to scene classification of a video frame sequence in a video stream. The method can be performed by a video scene classification device, which can be composed of hardware and/or software and integrated into electronic equipment. The method specifically includes the following steps:

[0026] S110. Extract a plurality of video frames to be processed from the video frame sequence.

[0027] The video frame sequence refers to continuous video frames within a period of time in the video stream, for example, continuous video frames within a time period of 5 seconds or 8 seconds, and the video frame sequence includes a plurality of video frames.

[0028] Optionally, when extracting a plurality of video frames to be processed, the extraction may be performed continuously or discontinuously in the video frame sequenc...
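Step S110 can be illustrated with a minimal sketch that picks frame indices inside one video frame sequence, either continuously or discontinuously. The function name `sample_frame_indices`, its parameters, and the even-spacing rule for discontinuous extraction are assumptions made for illustration; the source does not specify how the frames are chosen.

```python
def sample_frame_indices(start, fps, duration_s, num_frames, contiguous=False):
    """Return indices of the frames to process within one video frame sequence.

    The sequence is assumed to cover `duration_s` seconds of video beginning
    at frame index `start`, recorded at `fps` frames per second.
    """
    seq_len = int(round(fps * duration_s))
    if num_frames < 1 or num_frames > seq_len:
        raise ValueError("num_frames must be between 1 and the sequence length")
    if contiguous:
        # Continuous extraction: adjacent frames from the start of the sequence.
        return [start + i for i in range(num_frames)]
    # Discontinuous extraction (assumed rule): spread picks evenly over the sequence.
    step = seq_len / num_frames
    return [start + int(i * step) for i in range(num_frames)]


# Example: a 5-second sequence at 25 fps, five evenly spaced frames.
indices = sample_frame_indices(start=0, fps=25, duration_s=5, num_frames=5)
```

Even spacing is only one possible discontinuous strategy; random or keyframe-based selection would fit the same interface.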

Embodiment 2

[0051] In each optional implementation of the foregoing embodiment, the video frames to be processed may be extracted from any video frame sequence of the video stream and then classified by scene. However, a video stream contains complex content, and there is no guarantee that the to-be-processed video frames in every video frame sequence belong to one of the preset scene categories. For this reason, this embodiment first locks onto a video frame sequence according to the shooting angle of view, and then performs scene classification on the video frames in that sequence.

[0052] Figure 2 is a flowchart of a video scene classification method provided in Embodiment 2 of the present disclosure. This embodiment can be combined with each optional solution in one or more of the foregoing embodiments, and specifically includes the following steps:

[0053] S210. Extract at least one to-be-identified vide...

Embodiment approach

[0061] Embodiment 1: Input each of the at least one to-be-recognized video frames into a first image recognition model to obtain the shooting angle of view of each to-be-recognized video frame, as output by the first image recognition model.

[0062] In this embodiment, the first image recognition model directly recognizes the shooting angle of view of the video frame to be recognized. Accordingly, when training the first image recognition model, the training inputs are video frame samples shot from a long-distance perspective paired with long-distance perspective labels, and video frame samples shot from a short-distance perspective paired with short-distance perspective labels.
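The idea of locking onto a video frame sequence by shooting angle of view can be sketched as follows. This is a hedged illustration only: `predict_view` stands in for the first image recognition model, and the locking rule (keep a sequence when its most common predicted view equals `target_view` and covers at least `min_share` of the frames) is an assumption, since the source text does not state the rule here.

```python
from collections import Counter


def dominant_view(frames, predict_view):
    """Return the most common predicted view and its share of the frames."""
    views = [predict_view(frame) for frame in frames]
    view, count = Counter(views).most_common(1)[0]
    return view, count / len(views)


def lock_sequences(sequences, predict_view, target_view="short", min_share=0.5):
    """Keep only the sequences whose dominant view matches `target_view`.

    `predict_view` is a hypothetical callable mapping one frame to a label
    such as "long" or "short" (the two perspectives named in the text).
    """
    locked = []
    for frames in sequences:
        if not frames:
            continue  # nothing to recognize in an empty sequence
        view, share = dominant_view(frames, predict_view)
        if view == target_view and share >= min_share:
            locked.append(frames)
    return locked


# Stub recognizer for demonstration: each frame already carries its view label.
def stub_predict_view(frame):
    return frame["view"]
```

Only the locked sequences would then be passed on to scene classification, which avoids classifying sequences unlikely to contain a preset scene category.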

[0063] Embodiment 2: Input each of the at least one to-be-recognized video frames into the first image recognition model to obtain the display area of the target object in each to-be-recognized video frame, as output by the first image recognition model....


Abstract

The embodiments of the present disclosure disclose a video scene classification method, device, equipment and storage medium. The method includes: extracting a plurality of video frames to be processed from a video frame sequence; and inputting the plurality of video frames to be processed into a scene classification model to obtain the scene category, output by the scene classification model, that corresponds to the plurality of video frames to be processed. The scene classification model includes an aggregation model, a classifier and multiple feature extraction models: each feature extraction model extracts the image features of an input video frame to be processed, the aggregation model aggregates the image features of the multiple video frames to be processed, and the classifier classifies the aggregated features to obtain the corresponding scene category. Embodiments of the present disclosure can implement scene classification in videos.
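The three-part structure described in the abstract (feature extraction models, an aggregation model, a classifier) can be sketched in plain Python. Everything concrete here is an assumption for illustration: frames are lists of floats, one extractor is paired with each frame, aggregation is an element-wise mean, and the classifier is a linear scorer. The source does not specify any of these choices, and a real system would presumably use trained neural networks.

```python
def mean_aggregate(per_frame_features):
    """Aggregate per-frame feature vectors by element-wise mean (assumed rule)."""
    n = len(per_frame_features)
    return [sum(column) / n for column in zip(*per_frame_features)]


def make_linear_classifier(weights, labels):
    """Build a toy classifier: one weight row per label, highest score wins."""
    def classify(features):
        scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
        return labels[scores.index(max(scores))]
    return classify


def classify_scene(frames, extractors, aggregate, classify):
    """Run extract -> aggregate -> classify over the frames to be processed."""
    if len(frames) != len(extractors):
        raise ValueError("this sketch pairs one feature extractor with each frame")
    per_frame = [extract(frame) for extract, frame in zip(extractors, frames)]
    return classify(aggregate(per_frame))


def toy_extractor(frame):
    """Hypothetical features: mean intensity and intensity range of a frame."""
    return [sum(frame) / len(frame), max(frame) - min(frame)]


# Example wiring with two frames and two (identical) toy extractors.
frames = [[0.9, 0.8, 1.0], [0.7, 0.9, 0.8]]
classifier = make_linear_classifier([[1.0, 0.0], [0.0, 1.0]], ["shooting", "free kick"])
scene = classify_scene(frames, [toy_extractor, toy_extractor], mean_aggregate, classifier)
```

Keeping the aggregation step separate from the extractors means the classifier sees one fixed-size feature vector regardless of how many frames were sampled.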

Description

Technical field

[0001] Embodiments of the present disclosure relate to computer vision technologies, and in particular to a video scene classification method, apparatus, device, and storage medium.

Background

[0002] With the development of Internet technology, videos can be captured by cameras and sent to smart terminals over the network. People can watch videos from all over the world on smart terminals, such as sports videos, road videos, and game videos.

[0003] Exciting videos are more attractive to audiences, and whether a video is exciting depends on the scenes in it. For example, in a football game video, scenes such as shots on goal, penalty kicks, and free kicks are popular with audiences. However, the scenes in a video change rapidly, making it difficult to classify the scenes in the video.

Summary of the invention

[0004] Embodiments of the present disclosure provide a video scene classification method, apparatus, device, an...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06V20/40, G06V10/764, G06V10/80, G06V10/82, G06K9/62
CPC: G06V20/41, G06V20/46, G06F18/24
Inventor: 李根, 许世坤, 朱延东, 王长虎
Owner BEIJING BYTEDANCE NETWORK TECH CO LTD