Video temporal grounding method based on intra- and inter-modal multilinear pooling
A multilinear pooling video technology, applied in character and pattern recognition, biological neural network models, special data processing applications, etc. It addresses problems such as heavy dependence on temporal modeling and sensitivity to temporal information.
Embodiment Construction
[0085] The detailed parameters of the present invention are described further below.
[0086] As shown in Figures 1 and 2, the present invention provides a video temporal grounding method based on intra- and inter-modal multilinear pooling.
[0087] The core of the present invention is an intra- and inter-modal multilinear pooling model (IIM), which addresses the effective fusion of multimodal representations. Its superiority is verified on the cross-modal deep learning task of video temporal grounding. This is the first method to model the features within each modality while also modeling the interaction between the video and natural-language modalities. The resulting fused features therefore capture not only the correlation between modalities but also an in-depth understanding of, and interaction within, each modality. Building on the strong performance of the IIM model, the present invention further proposes a gen...
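The idea of combining intra-modal and inter-modal interactions can be illustrated with a minimal sketch. The excerpt does not give the IIM formulation, so the code below substitutes a generic factorized (low-rank) bilinear pooling operator as a stand-in and should not be read as the patented model. All names (`factorized_bilinear`, `fuse_sketch`, `out_dim`, `k`) are hypothetical, the projection matrices are random rather than learned, and plain Python lists are used only to keep the example self-contained.

```python
import math
import random


def project(vec, mat):
    """Multiply a length-d vector by a d x m matrix given as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def factorized_bilinear(x, y, U, V, k):
    """Low-rank bilinear pooling of x and y (assumed stand-in, not the IIM model)."""
    # Project both inputs to m = out_dim * k dimensions and multiply elementwise.
    joint = [a * b for a, b in zip(project(x, U), project(y, V))]
    # Sum-pool each group of k consecutive factors into one output dimension.
    pooled = [sum(joint[i:i + k]) for i in range(0, len(joint), k)]
    # Signed square root followed by L2 normalisation.
    pooled = [math.copysign(math.sqrt(abs(p)), p) for p in pooled]
    norm = math.sqrt(sum(p * p for p in pooled))
    return [p / norm for p in pooled] if norm > 0 else pooled


def random_matrix(rows, cols, rng):
    """Random Gaussian matrix; a trained model would learn these weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def fuse_sketch(video, text, out_dim=4, k=2, seed=0):
    """Concatenate two intra-modal terms and one inter-modal term (hypothetical layout)."""
    rng = random.Random(seed)
    dv, dt, m = len(video), len(text), out_dim * k
    # Intra-modal: each modality interacts with itself.
    intra_v = factorized_bilinear(
        video, video, random_matrix(dv, m, rng), random_matrix(dv, m, rng), k)
    intra_t = factorized_bilinear(
        text, text, random_matrix(dt, m, rng), random_matrix(dt, m, rng), k)
    # Inter-modal: video interacts with text.
    inter = factorized_bilinear(
        video, text, random_matrix(dv, m, rng), random_matrix(dt, m, rng), k)
    return intra_v + intra_t + inter


# Toy inputs: a 3-dimensional video feature and a 2-dimensional sentence feature.
fused = fuse_sketch([0.5, -1.0, 2.0], [1.0, 0.25])
print(len(fused))
```

In this sketch the fused vector has `3 * out_dim` entries: one block per intra-modal term and one for the inter-modal term. Concatenation is only one possible way to combine the three terms and is chosen here for simplicity.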


