
Language-description-guided video temporal localization method and system

A language-description-guided temporal localization technology in the field of machine vision and deep learning, addressing the problem that video-level annotations do not specify concrete temporal boundaries, which reduces the accuracy of boundary prediction.

Pending Publication Date: 2020-12-01
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

However, this weakly supervised paradigm has only video-level language annotations without accounting for their corresponding concrete temporal boundaries, resulting in less accurate boundary prediction.

Detailed Description of the Embodiments

[0100] The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0101] It should be understood that the step numbers used herein are only for convenience of description, and are not intended to limit the execution order of the steps.

[0102] It should be understood that the terminology used in the description of the present invention is for the purpose of describing particular embodiments only and is not intended to limit the present invention. As used in this specification and the appended claims, the singular forms "a", "an"...

Abstract

The invention provides a language-description-guided video temporal localization method and system. The method comprises the following steps: receiving a video query information group; obtaining a target video clip from the video to be queried; extracting text feature information from the query text; extracting target feature information from the target video clip; calculating a target loss value corresponding to the target video clip; when the target loss value is not within a preset loss value set, calculating an action parameter; and adjusting the boundary positions of the target video clip in the video to be queried according to the action parameter, then returning to the step of extracting target feature information from the target video clip. According to the language-description-guided weakly supervised video temporal localization method provided by the invention, the temporal boundary is adaptively optimized by means of a reinforcement learning paradigm within a boundary-adaptive optimization framework, the cross-modal semantic gap is reduced, and a more accurate result is obtained.
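
Below is a minimal, runnable sketch of the boundary-refinement loop outlined in this abstract. The encoders and the policy network are simple stand-ins (random linear layers), and the names text_encoder, clip_encoder, policy_net and localize are assumptions for illustration; the sketch only shows the control flow of extracting features, computing a cross-modal loss, and adjusting the clip boundaries until the loss falls within an accepted range, not the networks claimed in the patent.

```python
# Sketch only: stand-in modules illustrating the iterative boundary adjustment.
import torch
import torch.nn as nn

FEAT_DIM = 64
text_encoder = nn.Linear(300, FEAT_DIM)   # stand-in for the text feature extractor
clip_encoder = nn.Linear(512, FEAT_DIM)   # stand-in for the clip feature extractor
policy_net = nn.Linear(2 * FEAT_DIM, 2)   # outputs an action (delta_start, delta_end)


def localize(video_feats, query_vec, loss_threshold=0.05, max_steps=20):
    """Return normalized (start, end) boundaries for the clip matching the query.

    video_feats: (T, 512) per-frame features; query_vec: (300,) sentence embedding.
    """
    n_frames = video_feats.shape[0]
    start, end = 0, n_frames                       # initial guess: the whole video
    text_feat = text_encoder(query_vec)            # text feature information

    for _ in range(max_steps):
        clip_feat = clip_encoder(video_feats[start:end].mean(dim=0))  # target features
        loss = torch.nn.functional.mse_loss(clip_feat, text_feat)     # cross-modal loss
        if loss.item() < loss_threshold:           # loss falls in the accepted set: stop
            break
        # Otherwise compute an action parameter and shift the temporal boundaries.
        action = policy_net(torch.cat([clip_feat, text_feat]))
        start = int(max(0, min(n_frames - 2, start + action[0].item() * n_frames)))
        end = int(max(start + 1, min(n_frames, end + action[1].item() * n_frames)))

    return start / n_frames, end / n_frames


# Usage with random tensors standing in for real video features and a sentence embedding.
print(localize(torch.randn(200, 512), torch.randn(300)))
```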

Description

Technical field

[0001] The present invention relates to the technical field of machine vision and deep learning, and in particular to a language-description-guided video temporal localization method and system.

Background

[0002] Video temporal localization is a recently proposed key task with potential applications in human-computer interaction and cross-media analysis. Its goal is to temporally localize a moment in a given video according to a provided text description: given an input sentence, the model locates in the video the temporal segment (starting frame and ending frame) that matches the meaning of that sentence.

[0003] Existing techniques use a fully supervised approach to learn the mapping between video segments and their corresponding language descriptions. However, obtaining such fine-grained annotations is a daunting task requiring substantial human effort, which becomes a critical bottleneck as the task evolves towards larger scale and more complex scenes...
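
To make the task definition above concrete, the following is an illustrative sketch of its input and output; the type names and the example frame numbers are assumptions for illustration, not taken from the patent.

```python
# Hypothetical input/output types for language-guided video temporal localization.
from dataclasses import dataclass

@dataclass
class TemporalQuery:
    video_path: str      # the video to be queried
    sentence: str        # natural-language description of the moment to find

@dataclass
class TemporalSegment:
    start_frame: int     # first frame of the located segment
    end_frame: int       # last frame of the located segment

# Example (hypothetical values): the model maps a sentence to a frame range.
query = TemporalQuery("demo.mp4", "a person opens the door and walks in")
prediction = TemporalSegment(start_frame=120, end_frame=310)
```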

Claims

Application Information

IPC(8): G06F16/73, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/73, G06N3/08, G06N3/045, G06F18/25
Inventors: 李冠彬 (Li Guanbin), 许晓倩 (Xu Xiaoqian), 吴捷 (Wu Jie), 毛明志 (Mao Mingzhi)
Owner: SUN YAT SEN UNIV