Video-paragraph retrieval method and system based on local-overall graph inference network
A local-graph and video-segment technique, applied in the field of cross-modal retrieval, that addresses problems such as the performance degradation caused by directly encoding long sequences, and that captures more comprehensive interaction information between video and text.
Embodiment Construction
[0030] The present invention will be further described below in conjunction with the accompanying drawings.
[0031] The present invention comprises four main parts. First, the video and the text (paragraph) are preprocessed. Second, a local-overall graph inference network encodes the given video and text separately to obtain the final video features and text features. Third, cosine similarity is used to measure the similarity between the video features and the text features. Finally, retrieval is performed according to the similarity results. Within the local-overall graph inference network, the present invention decomposes the video and the text each into a four-layer semantic structure, constructs the local graph and the overall graph for each modality, and then applies a graph convolutional network to perform graph inference on these graphs.
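The graph inference and similarity steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`gcn_layer`, `cosine_similarity_matrix`, `rank_videos`) are hypothetical. The symmetric-normalized propagation rule with self-loops is one common graph convolution formulation, assumed here because this excerpt does not give the exact rule. The feature arrays stand in for the encoder outputs.

```python
import numpy as np


def gcn_layer(node_feats, adjacency, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    Assumed formulation; the excerpt does not state the exact propagation rule.
    """
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    deg = a_hat.sum(axis=1)                         # >= 1 for non-negative A
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ node_feats @ weight, 0.0)


def cosine_similarity_matrix(video_feats, text_feats, eps=1e-8):
    """Return a (num_texts, num_videos) matrix of cosine similarities."""
    v = video_feats / (np.linalg.norm(video_feats, axis=1, keepdims=True) + eps)
    t = text_feats / (np.linalg.norm(text_feats, axis=1, keepdims=True) + eps)
    return t @ v.T


def rank_videos(video_feats, text_feats):
    """For each paragraph, return video indices sorted from most to least similar."""
    sims = cosine_similarity_matrix(video_feats, text_feats)
    return np.argsort(-sims, axis=1)


if __name__ == "__main__":
    # Toy stand-ins for the final video and paragraph features.
    videos = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    paragraphs = np.array([[0.0, 2.0], [3.0, 3.0]])
    print(rank_videos(videos, paragraphs))
```

In this sketch the retrieval direction is paragraph-to-video. Transposing the similarity matrix gives video-to-paragraph retrieval with the same features.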
[0032] Referring to Figure 1, a schematic diagram of the video-paragraph retrieval process of the present invention...