Video summary generation network based on Transformer and deep reinforcement learning
A reinforcement learning and video summarization technique, applied in the field of computer vision, that addresses the problem of generated summaries lacking temporal coherence
Embodiment Construction
[0047] In order to enable those skilled in the art to better understand the present invention, the technical solutions of the present invention are further described below with reference to the accompanying drawings and embodiments.
[0048] As shown in Figure 1, the video summary generation network based on Transformer and deep reinforcement learning includes three parts: encoding, decoding, and optimization;
[0049] The encoding part extracts deep features from the video frames with GoogLeNet and feeds the feature vectors into the Transformer encoder. Positional encoding is applied first, and the result is passed to the self-attention layer. The self-attention output then goes through a residual connection and layer normalization, followed by a feedforward neural network and a second residual connection and layer normalization;
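The order of operations in the encoder described in [0049] can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the sinusoidal form of the positional encoding, the single attention head, the ReLU feedforward network, the layer normalization without learnable gain and bias, and all names (`encoder_layer`, `init_params`, the weight keys) are assumptions made for illustration. Random vectors stand in for the GoogLeNet frame features.

```python
import numpy as np

def positional_encoding(num_frames, d_model):
    """Sinusoidal positional encoding (assumed form; the source only names the step)."""
    pos = np.arange(num_frames)[:, None]
    idx = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (idx // 2)) / d_model)
    pe = np.zeros((num_frames, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])  # even dimensions use sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])  # odd dimensions use cosine
    return pe

def layer_norm(x, eps=1e-5):
    """Normalize each frame vector to zero mean and unit variance (no learned scale)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over the frame sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

def encoder_layer(features, params):
    """Positional encoding, self-attention, add and norm, feedforward, add and norm."""
    x = features + positional_encoding(*features.shape)
    attn = self_attention(x, params["w_q"], params["w_k"], params["w_v"])
    x = layer_norm(x + attn)                        # first residual connection
    hidden = np.maximum(0.0, x @ params["w_1"] + params["b_1"])
    ff = hidden @ params["w_2"] + params["b_2"]
    return layer_norm(x + ff)                       # second residual connection

def init_params(d_model, d_ff, seed=0):
    """Hypothetical random weights; a trained network would learn these."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(d_model)
    return {
        "w_q": rng.normal(0.0, scale, (d_model, d_model)),
        "w_k": rng.normal(0.0, scale, (d_model, d_model)),
        "w_v": rng.normal(0.0, scale, (d_model, d_model)),
        "w_1": rng.normal(0.0, scale, (d_model, d_ff)),
        "b_1": np.zeros(d_ff),
        "w_2": rng.normal(0.0, 1.0 / np.sqrt(d_ff), (d_ff, d_model)),
        "b_2": np.zeros(d_model),
    }

# Stand-in for GoogLeNet features: 8 frames, 16 dimensions each (sizes are arbitrary).
frame_features = np.random.default_rng(1).normal(size=(8, 16))
encoded = encoder_layer(frame_features, init_params(d_model=16, d_ff=32))
```

The output keeps one vector per frame, so `encoded` has the same shape as `frame_features`, and each vector now carries context from every other frame in the sequence.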
[0050] After deep features are extracted from the video frames through GoogLeNet, assuming there are M frames in total, th...