Attention regression-based video time sequence sentence positioning method and device

A temporal sentence positioning method and device in the field of computer vision, addressing prior-art problems such as ignoring key information in the sentence, low positioning accuracy, and blocking the interaction between video segments and their context.

Publication Date: 2018-10-12 (Inactive)
Applicant: TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0003] The methods adopted in the prior art have the following defects: scanning the video to generate candidate segments has a high computational cost and cannot handle long videos, so the scalability of these video temporal sentence positioning methods is poor; separating each candidate segment from the global video for independent processing blocks the interaction between the specific video content and the video context, which is crucial for sentence positioning, so the accuracy of these methods is not high; and these methods directly use a general long short-term memory network to extract sentence features, ignoring the information in the sentence that is key to temporal localization, so they do not sufficiently mine the sentence information.



Examples


Embodiment Construction

[0026] Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0027] The method and device for locating video time series sentences based on attention regression according to the embodiments of the present invention will be described below with reference to the accompanying drawings.

[0028] Figure 1 is a flow chart of the attention regression-based video temporal sentence positioning method according to an embodiment of the present invention. As shown in Figure 1, the method includes...

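The enumeration of steps is truncated in the excerpt above. Based on the abstract, the first step encodes the video clips and the sentence with bidirectional long short-term memory networks over three-dimensional convolutional (C3D) video features and GloVe word vectors. The following is a minimal sketch of that encoding step only; the use of PyTorch, the module name ClipAndSentenceEncoder, and the feature dimensions are illustrative assumptions, not details taken from the patent.

```python
import torch
import torch.nn as nn

class ClipAndSentenceEncoder(nn.Module):
    """Bidirectional LSTM encoders over pre-extracted C3D clip features
    and GloVe word vectors (dimensions are illustrative assumptions)."""

    def __init__(self, video_dim=4096, word_dim=300, hidden_dim=256):
        super().__init__()
        # One BiLSTM per modality; each produces 2 * hidden_dim features per step.
        self.video_rnn = nn.LSTM(video_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.sent_rnn = nn.LSTM(word_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, clip_feats, word_vecs):
        # clip_feats: (batch, num_clips, video_dim)  pre-extracted C3D features
        # word_vecs:  (batch, num_words, word_dim)   pre-looked-up GloVe vectors
        video_states, _ = self.video_rnn(clip_feats)   # (batch, num_clips, 2*hidden_dim)
        sent_states, _ = self.sent_rnn(word_vecs)      # (batch, num_words, 2*hidden_dim)
        return video_states, sent_states
```

Under these assumptions, clip features of shape (batch, num_clips, 4096) and GloVe vectors of shape (batch, num_words, 300) yield per-clip and per-word states that the later attention stage would consume.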


Abstract

The invention discloses an attention regression-based video temporal sentence positioning method and device. The method comprises the following steps: encoding the video clips and the sentence with bidirectional long short-term memory networks, on the basis of a three-dimensional convolutional neural network and GloVe word vectors, to characterize the contents of the video clips and the sentence; building a symmetric association between the video and the sentence through a multimodal attention mechanism according to those contents, so as to acquire attention weight vectors and attention-weighted features of the video and the sentence; and outputting the video temporal sentence positioning result through an attention-weight-based regression mechanism or an attention-weighted-feature regression mechanism according to the attention weight vectors or attention-weighted features of the video and the sentence. The method preserves contextual information in the video and the sentence and improves the efficiency of the sentence positioning process, thereby improving positioning speed, accuracy, and robustness.
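To make the attention and regression stages of the abstract concrete, the sketch below shows one plausible reading: sentence-guided attention over the encoded clip states produces an attention weight vector and an attention-weighted video feature, and a small regressor maps either of them to a (start, end) prediction. The layer shapes, the mean-pooling of the sentence states, and the two linear regression heads are assumptions for illustration, not the patent's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionRegressionHead(nn.Module):
    """Sentence-guided attention over video clip states, followed by
    regression of the temporal boundaries (shapes and heads are assumed)."""

    def __init__(self, feat_dim=512, num_clips=128):
        super().__init__()
        self.score = nn.Linear(2 * feat_dim, 1)          # attention score per clip
        self.reg_from_weights = nn.Linear(num_clips, 2)  # attention-weight-based regression
        self.reg_from_feature = nn.Linear(feat_dim, 2)   # attention-weighted-feature regression

    def forward(self, video_states, sent_states, use_weights=True):
        # video_states: (batch, num_clips, feat_dim); sent_states: (batch, num_words, feat_dim)
        sent_vec = sent_states.mean(dim=1)                             # pooled sentence feature
        query = sent_vec.unsqueeze(1).expand_as(video_states)          # broadcast over clips
        scores = self.score(torch.cat([video_states, query], dim=-1))  # (batch, num_clips, 1)
        weights = F.softmax(scores.squeeze(-1), dim=-1)                # attention weight vector
        weighted = (weights.unsqueeze(-1) * video_states).sum(dim=1)   # attention-weighted feature
        if use_weights:
            return self.reg_from_weights(weights)   # (batch, 2) -> predicted (start, end)
        return self.reg_from_feature(weighted)
```

Regressing directly from the attention weight vector, as the abstract's attention-weight-based branch suggests, localizes from where attention concentrates along the timeline; regressing from the attention-weighted feature uses the pooled content instead.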

Description

Technical Field

[0001] The invention relates to the technical field of computer vision, and in particular to a method and device for attention regression-based video temporal sentence positioning.

Background Technique

[0002] In the prior art, video temporal sentence positioning methods mainly include the following: constructing a unified representation space between the video and the sentence, scanning the video to generate several candidate video segments for positioning, and projecting the sentence and each candidate segment into the unified representation space for comparison and localization; or scanning the video to generate several candidate video segments for positioning, fusing the visual features of each candidate segment with the text features of the sentence to generate multimodal features, and performing temporal regression on the basis of the multimodal features to generate a time deviation value between the candidate video segment and the video segme...
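For contrast with the attention-regression approach, the sketch below illustrates the candidate-scanning strategy described in this background: sliding windows generate candidate segments, each candidate and the sentence are projected into a shared representation space, and the best-matching candidate is returned. The window sizes, projection layers, and cosine-similarity ranking are illustrative assumptions, not details from the cited prior art.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def generate_candidates(num_clips, window_sizes=(8, 16, 32), stride=4):
    """Sliding-window candidate segments as (start, end) clip indices."""
    spans = []
    for w in window_sizes:
        for s in range(0, max(num_clips - w, 0) + 1, stride):
            spans.append((s, s + w))
    return spans

class UnifiedSpaceMatcher(nn.Module):
    """Projects candidate-segment features and the sentence feature into a
    shared space and ranks candidates by cosine similarity (assumed setup)."""

    def __init__(self, video_dim=4096, sent_dim=512, common_dim=256):
        super().__init__()
        self.video_proj = nn.Linear(video_dim, common_dim)
        self.sent_proj = nn.Linear(sent_dim, common_dim)

    def forward(self, clip_feats, sent_feat, spans):
        # clip_feats: (num_clips, video_dim); sent_feat: (sent_dim,)
        seg_feats = torch.stack([clip_feats[s:e].mean(dim=0) for s, e in spans])
        sims = F.cosine_similarity(self.video_proj(seg_feats),
                                   self.sent_proj(sent_feat).unsqueeze(0), dim=-1)
        best = int(sims.argmax())
        return spans[best], sims  # best-matching segment and all scores
```

Scoring every window independently is what makes this strategy expensive on long videos and cuts each candidate off from its surrounding context, which are exactly the defects listed in paragraph [0003].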

Claims


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06F17/30G06N3/04G06N3/08
CPCG06N3/049G06N3/084
Inventor 朱文武袁艺天
Owner TSINGHUA UNIV