Method for video question answering using a hierarchical spatio-temporal attention encoder-decoder network
An encoder-decoder and attention technique, applied in fields such as instruments, computer components, and special-purpose data processing, that addresses problems such as the inability of existing methods to make good use of the sequential relationships in video.
Examples
Embodiment
[0077] The present invention is verified experimentally on a self-constructed data set. The data set contains 201,068 GIF clips and 287,933 text descriptions, and question-answer pairs are generated from the video descriptions. The verification experiment covers four types of questions, concerning the objects, numbers, colors, and locations in the video. The constructed video question answering data set is then preprocessed as follows:
[0078] 1) Take 25 frames from each video and resize each frame to 224×224, then use VGGNet to obtain a 4096-dimensional feature representation of each frame. For each frame, the present invention selects 3 regions as candidate regions.
[0079] 2) For questions and answers, the present invention uses a pre-trained word2vec model to extract the semantic representations of the questions and answer...
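The frame selection in step 1) can be illustrated with a minimal Python sketch. The text only states that 25 frames are taken per video; the evenly spaced sampling used here, the function name `sample_frame_indices`, and the handling of clips shorter than 25 frames are assumptions made for illustration, not the patent's stated procedure. Resizing to 224×224 and VGGNet feature extraction are not shown.

```python
from typing import List

NUM_FRAMES = 25  # frames kept per video, as stated in step 1)


def sample_frame_indices(total_frames: int, num_samples: int = NUM_FRAMES) -> List[int]:
    """Return num_samples frame indices spread evenly over a clip.

    Assumption: uniform spacing. The source does not say how the 25 frames
    are chosen. Clips shorter than num_samples repeat some frames so that
    every video yields a sequence of the same length.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    # Split the clip into num_samples equal-width bins and take the frame
    # at the centre of each bin, clamped to the last valid index.
    return [
        min(total_frames - 1, int((i + 0.5) * total_frames / num_samples))
        for i in range(num_samples)
    ]


if __name__ == "__main__":
    # A hypothetical 100-frame GIF: one index from each block of 4 frames.
    print(sample_frame_indices(100))
```

Each selected frame would then be resized and passed through the feature extractor, giving one 4096-dimensional vector per frame and a fixed-length sequence of 25 vectors per video.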