Attention regression-based video time sequence sentence positioning method and device

A temporal sentence positioning method and device in the field of computer vision, addressing prior-art problems such as ignoring key information in the sentence, low positioning accuracy, and blocking the interaction between video segments and their context.

Publication Date: 2018-10-12 (Inactive)
Applicant: TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0003] The methods adopted in the prior art have the following defects: scanning the video to generate candidate segments has a high computational cost and cannot handle long videos, so the scalability of these video temporal sentence positioning methods is poor; separating each candidate segment from the global video for independent processing blocks the interaction between the specific video content and the video context, which is crucial for sentence positioning, so the accuracy of these methods is not high; and these methods directly use a general long short-term memory network to extract sentence features, ignoring the information in the sentence that is key to temporal localization, so they do not sufficiently mine the sentence information.



Examples


Embodiment Construction

[0026] Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0027] The method and device for locating video time series sentences based on attention regression according to the embodiments of the present invention will be described below with reference to the accompanying drawings.

[0028] Figure 1 is a flow chart of the attention regression-based video temporal sentence positioning method according to an embodiment of the present invention. As shown in Figure 1, the method includes...

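The enumeration of steps is truncated in the excerpt above. Based on the abstract, the first step encodes the video clips and the sentence with bidirectional long short-term memory networks over three-dimensional convolutional (C3D) video features and GloVe word vectors. The following is a minimal sketch of that encoding step only; the use of PyTorch, the module name ClipAndSentenceEncoder, and the feature dimensions are illustrative assumptions, not details taken from the patent.

```python
import torch
import torch.nn as nn

class ClipAndSentenceEncoder(nn.Module):
    """Bidirectional LSTM encoders over pre-extracted C3D clip features
    and GloVe word vectors (dimensions are illustrative assumptions)."""

    def __init__(self, video_dim=4096, word_dim=300, hidden_dim=256):
        super().__init__()
        # One BiLSTM per modality; each produces 2 * hidden_dim features per step.
        self.video_rnn = nn.LSTM(video_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.sent_rnn = nn.LSTM(word_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, clip_feats, word_vecs):
        # clip_feats: (batch, num_clips, video_dim)  pre-extracted C3D features
        # word_vecs:  (batch, num_words, word_dim)   pre-looked-up GloVe vectors
        video_states, _ = self.video_rnn(clip_feats)   # (batch, num_clips, 2*hidden_dim)
        sent_states, _ = self.sent_rnn(word_vecs)      # (batch, num_words, 2*hidden_dim)
        return video_states, sent_states
```

Under these assumptions, clip features of shape (batch, num_clips, 4096) and GloVe vectors of shape (batch, num_words, 300) yield per-clip and per-word states that the later attention stage would consume.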


Abstract

The invention discloses an attention regression-based video temporal sentence positioning method and device. The method comprises the following steps: encoding the video clips and the sentence with bidirectional long short-term memory networks, on the basis of a three-dimensional convolutional neural network and GloVe word vectors, to characterize the contents of the video clips and the sentence; building a symmetric association between the video and the sentence through a multimodal attention mechanism according to those contents, so as to acquire attention weight vectors and attention-weighted features of the video and the sentence; and outputting the video temporal sentence positioning result through an attention-weight-based regression mechanism or an attention-weighted-feature regression mechanism according to the attention weight vectors or attention-weighted features of the video and the sentence. The method preserves contextual information in the video and the sentence and improves the efficiency of the sentence positioning process, thereby improving positioning speed, accuracy, and robustness.
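To make the attention and regression stages of the abstract concrete, the sketch below shows one plausible reading: sentence-guided attention over the encoded clip states produces an attention weight vector and an attention-weighted video feature, and a small regressor maps either of them to a (start, end) prediction. The layer shapes, the mean-pooling of the sentence states, and the two linear regression heads are assumptions for illustration, not the patent's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionRegressionHead(nn.Module):
    """Sentence-guided attention over video clip states, followed by
    regression of the temporal boundaries (shapes and heads are assumed)."""

    def __init__(self, feat_dim=512, num_clips=128):
        super().__init__()
        self.score = nn.Linear(2 * feat_dim, 1)          # attention score per clip
        self.reg_from_weights = nn.Linear(num_clips, 2)  # attention-weight-based regression
        self.reg_from_feature = nn.Linear(feat_dim, 2)   # attention-weighted-feature regression

    def forward(self, video_states, sent_states, use_weights=True):
        # video_states: (batch, num_clips, feat_dim); sent_states: (batch, num_words, feat_dim)
        sent_vec = sent_states.mean(dim=1)                             # pooled sentence feature
        query = sent_vec.unsqueeze(1).expand_as(video_states)          # broadcast over clips
        scores = self.score(torch.cat([video_states, query], dim=-1))  # (batch, num_clips, 1)
        weights = F.softmax(scores.squeeze(-1), dim=-1)                # attention weight vector
        weighted = (weights.unsqueeze(-1) * video_states).sum(dim=1)   # attention-weighted feature
        if use_weights:
            return self.reg_from_weights(weights)   # (batch, 2) -> predicted (start, end)
        return self.reg_from_feature(weighted)
```

Regressing directly from the attention weight vector, as the abstract's attention-weight-based branch suggests, localizes from where attention concentrates along the timeline; regressing from the attention-weighted feature uses the pooled content instead.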

Description

Technical Field

[0001] The invention relates to the technical field of computer vision, and in particular to a method and device for attention regression-based video temporal sentence positioning.

Background Technique

[0002] In the prior art, video temporal sentence positioning methods mainly include the following: constructing a unified representation space between the video and the sentence, scanning the video to generate several candidate video segments for positioning, and projecting the sentence and each candidate segment into the unified representation space for comparison and localization; or scanning the video to generate several candidate video segments for positioning, fusing the visual features of each candidate segment with the text features of the sentence to generate multimodal features, and performing temporal regression on the basis of the multimodal features to generate a time deviation value between the candidate video segment and the video segme...
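For contrast with the attention-regression approach, the sketch below illustrates the candidate-scanning strategy described in this background: sliding windows generate candidate segments, each candidate and the sentence are projected into a shared representation space, and the best-matching candidate is returned. The window sizes, projection layers, and cosine-similarity ranking are illustrative assumptions, not details from the cited prior art.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def generate_candidates(num_clips, window_sizes=(8, 16, 32), stride=4):
    """Sliding-window candidate segments as (start, end) clip indices."""
    spans = []
    for w in window_sizes:
        for s in range(0, max(num_clips - w, 0) + 1, stride):
            spans.append((s, s + w))
    return spans

class UnifiedSpaceMatcher(nn.Module):
    """Projects candidate-segment features and the sentence feature into a
    shared space and ranks candidates by cosine similarity (assumed setup)."""

    def __init__(self, video_dim=4096, sent_dim=512, common_dim=256):
        super().__init__()
        self.video_proj = nn.Linear(video_dim, common_dim)
        self.sent_proj = nn.Linear(sent_dim, common_dim)

    def forward(self, clip_feats, sent_feat, spans):
        # clip_feats: (num_clips, video_dim); sent_feat: (sent_dim,)
        seg_feats = torch.stack([clip_feats[s:e].mean(dim=0) for s, e in spans])
        sims = F.cosine_similarity(self.video_proj(seg_feats),
                                   self.sent_proj(sent_feat).unsqueeze(0), dim=-1)
        best = int(sims.argmax())
        return spans[best], sims  # best-matching segment and all scores
```

Scoring every window independently is what makes this strategy expensive on long videos and cuts each candidate off from its surrounding context, which are exactly the defects listed in paragraph [0003].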

Claims


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06F17/30G06N3/04G06N3/08
CPCG06N3/049G06N3/084
Inventor 朱文武袁艺天
Owner TSINGHUA UNIV