Video-paragraph retrieval method and system based on local-whole graph reasoning network

A local-whole graph technique applied in the field of cross-modal retrieval. It addresses problems such as the performance degradation of directly encoding long sequences, and is stated to improve retrieval results and to integrate interaction information more comprehensively.

Active Publication Date: 2021-09-17

HANGZHOU YIWISE INTELLIGENT TECH CO LTD

Cites: 9; Cited by: 0

Problems solved by technology

[0006] The present invention aims to solve problems in the prior art. To overcome the performance degradation caused by directly encoding long sequences, and to learn finer-grained information, the present invention decomposes video and text into four levels: the overall level, the segment level, the action level and the object level.

Embodiment Construction

[0030] The present invention will be further described below in conjunction with the accompanying drawings.

[0031] The present invention comprises four main steps. First, the video and the text (paragraph) are preprocessed. Second, the given video and text are each encoded with a local-whole graph reasoning network to obtain the final video features and text features. Third, the similarity between the video features and the text features is computed using cosine similarity. Finally, retrieval is performed according to the similarity results. Within the local-whole graph reasoning network, the present invention decomposes video and text into four-layer semantic structures, constructs a local graph and an overall graph, and then uses a graph convolutional network to perform graph reasoning.
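The similarity and retrieval steps described in [0031] can be illustrated with a minimal Python sketch. It assumes the encoder has already produced one fixed-length feature vector per video and per paragraph. The function names `cosine_similarity` and `rank_videos`, the video identifiers and the toy vectors are hypothetical and are not taken from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector is treated as matching nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_videos(text_feature, video_features):
    """Return (video_id, score) pairs sorted from most to least similar."""
    scored = [
        (video_id, cosine_similarity(text_feature, feature))
        for video_id, feature in video_features.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy features standing in for the encoder outputs (hypothetical values).
video_features = {
    "video_a": [1.0, 0.0, 0.0],
    "video_b": [0.0, 1.0, 0.0],
    "video_c": [1.0, 1.0, 0.0],
}
paragraph_feature = [2.0, 0.0, 0.0]

ranking = rank_videos(paragraph_feature, video_features)
for video_id, score in ranking:
    print(f"{video_id}: {score:.4f}")
```

Because cosine similarity ignores vector length, the paragraph vector `[2.0, 0.0, 0.0]` matches `video_a` exactly even though their magnitudes differ.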

[0032] The video-paragraph retrieval process of the present invention is shown schematically in Figure 1. It mainly includes t...

Abstract

The invention discloses a video-paragraph retrieval method and system based on a local-whole graph reasoning network, which belongs to the field of cross-modal retrieval and mainly includes the following steps: 1) Preprocess the video and the text (paragraph). 2) Encode the given video and text separately with the local-whole graph reasoning network to obtain the final video features and text features. 3) Compute the similarity between the video features and the text features using cosine similarity. 4) Retrieve according to the similarity results. Unlike traditional video-paragraph retrieval methods, the present invention decomposes video and text into four-layer semantic structures, constructs local graphs and overall graphs, and then uses graph convolutional networks to perform graph reasoning. The results obtained in video-paragraph retrieval are stated to be better than those of traditional methods.
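The abstract does not give the exact form of the graph reasoning step. As a rough illustration only, the sketch below implements one widely used graph convolution layer, which adds self-loops, applies symmetric degree normalization, multiplies by a weight matrix and then applies ReLU. The choice of this layer form, the function names, the three-node toy graph and the identity weight matrix are all assumptions, not details from the patent.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(normalized_adjacency @ features @ weight)."""
    propagated = matmul(normalize_adjacency(adj), features)
    transformed = matmul(propagated, weight)
    return [[max(0.0, value) for value in row] for row in transformed]


# Hypothetical toy graph: node 0 (e.g. a segment) linked to nodes 1 and 2.
adjacency = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
node_features = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
]
identity_weight = [
    [1.0, 0.0],
    [0.0, 1.0],
]

updated = gcn_layer(adjacency, node_features, identity_weight)
for row in updated:
    print([round(value, 4) for value in row])
```

After one layer, each node's feature vector is a weighted mix of its own features and those of its neighbours, which is the sense in which information is passed along graph edges.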

Description

Technical field

[0001] The invention relates to the field of cross-modal retrieval, in particular to a video-paragraph retrieval method and system based on a local-whole graph reasoning network.

Background technique

[0002] As a cross-modal retrieval task between videos and paragraphs, video-paragraph retrieval is an important task that has attracted the attention of many researchers.

[0003] This task spans the two fields of computer vision and natural language processing. It requires the system to encode both video and text, compute the similarity based on the encodings, and then perform retrieval. At present, video-paragraph retrieval is still a novel task, and research on it is not yet mature.

[0004] Existing video-paragraph retrieval methods either directly encode the entire video and the entire paragraph, or directly encode only multiple segments of the video and paragraph. However, such encoding meth...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06F16/783, G06F40/30, G06N5/04

CPC: G06N5/04, G06F16/7844, G06F40/30

Inventor: 张鹏程

Owner: HANGZHOU YIWISE INTELLIGENT TECH CO LTD
