Method for solving problem-based video clip extraction task by using cross-model interaction network

A technique relating video clips to questions, applied to problem-based video clip extraction tasks, that addresses the failure of earlier methods to make comprehensive use of the various kinds of effective information.

Active Publication Date: 2019-10-25
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to solve the problems in the prior art. In order to overcome the problem of only focusing on one aspect of the video clip extraction task in the prior art

Examples

Embodiment

[0126] The present invention was evaluated on the ActivityCaption data set and the TACoS data set. To evaluate the algorithm objectively, six evaluation criteria are used on the selected test set: R@1 IoU=0.3, R@1 IoU=0.5, R@1 IoU=0.7, R@5 IoU=0.3, R@5 IoU=0.5, and R@5 IoU=0.7. For IoU thresholds of 0.3, 0.5, and 0.7, these criteria report the percentage of test cases in which the best 1 or the best 5 selected predefined candidate segments include a segment whose IoU is larger than the threshold. Following the steps described in the specific embodiment, the experimental results are shown in Table 1 and Table 2, where this method is denoted as CMIN.
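
The six criteria above can be illustrated with a minimal sketch. It assumes that each query has a ranked list of candidate segments and one ground-truth segment, both given as (start, end) times, and that a query counts as a hit when any of its top-n candidates has an IoU larger than the threshold. The names temporal_iou and recall_at_n and the sample data are hypothetical and are not taken from the patent.

```python
from typing import Sequence, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds; an assumed representation


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(ranked: Sequence[Sequence[Segment]],
                truths: Sequence[Segment],
                n: int,
                iou_threshold: float) -> float:
    """Percentage of queries with a top-n candidate whose IoU exceeds the threshold."""
    hits = 0
    for candidates, truth in zip(ranked, truths):
        # Only the n highest-ranked candidates are considered for each query.
        if any(temporal_iou(c, truth) > iou_threshold for c in candidates[:n]):
            hits += 1
    return 100.0 * hits / len(truths)


# Hypothetical data: two queries, each with two ranked candidate segments.
ranked = [[(0.0, 10.0), (5.0, 15.0)], [(20.0, 30.0), (0.0, 8.0)]]
truths = [(4.0, 14.0), (1.0, 9.0)]

r1_iou03 = recall_at_n(ranked, truths, n=1, iou_threshold=0.3)
r1_iou05 = recall_at_n(ranked, truths, n=1, iou_threshold=0.5)
r2_iou07 = recall_at_n(ranked, truths, n=2, iou_threshold=0.7)
```

In this toy data only the first query has a top-1 candidate with IoU above 0.3, so r1_iou03 is 50.0, while widening the selection to the top 2 candidates recovers both queries even at the 0.7 threshold.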

Abstract

The invention discloses a method for solving a problem-based video clip extraction task by using a cross-model interaction network. The method mainly comprises the following steps: 1) for problem statements and video frames, acquiring cross-model semantic expressions of the video frames by using a semantic image convolutional network, a multi-head self-attention module and a multi-step cross-model interaction module; and 2) for the obtained cross-model semantic expressions of the video frames, calculating a loss function and training the model, then performing problem-based clip extraction on the video with the trained cross-model interaction network. Compared with common video clip extraction solutions, the method makes comprehensive use of various kinds of effective information. Compared with traditional methods, it achieves better results on the problem-based video clip extraction task.

Description

Technical field

[0001] The invention relates to the problem-based video segment extraction task, and in particular to a method for solving this task by using a cross-model interaction network.

Background technique

[0002] At present, video clip extraction has become an important service, but existing services do not perform well.

[0003] Existing technologies mainly focus on only one aspect of the video clip extraction task, such as problem description learning, video content modeling, or cross-model expression mixing, and thus ignore the comprehensive use of various kinds of effective information to improve the accuracy of video clip extraction. To overcome this shortcoming, this method uses a cross-model interaction network to solve the problem-based video segment extraction task.

[0004] The present invention will use a semantic image convolutional network to capture the grammatical structure in t...

Application Information

IPC(8): G06F16/78, G06N3/04, G06F17/27
CPC: G06F16/7867, G06F40/30, G06N3/045
Inventor: 赵洲, 孟令涛, 张竹, 陈漠沙, 仇伟
Owner: ZHEJIANG UNIV