Natural language video clip retrieval method based on multi-agent boundary perception network
A multi-agent boundary perception network technology, applied in digital data information retrieval, special data processing applications, instruments, etc. It addresses the inability to match query text with video clips, the lack of temporal inference ability over video, and limitations on retrieval accuracy.
Embodiment Construction
[0050] The present invention is described in detail below with reference to the accompanying drawings and specific embodiments. This embodiment is implemented on the basis of the technical solution of the present invention, and a detailed implementation and specific operating procedure are given; however, the scope of protection of the present invention is not limited to the following embodiments.
[0051] The invention provides a natural language video clip retrieval method based on a multi-agent boundary perception network, which retrieves the corresponding target segment from a given video according to a single sentence of natural language description. The method decomposes the task into two sub-tasks, start point retrieval and end point retrieval, and uses boundary-aware agents (a start point agent and an end point agent) to iteratively adjust the temporal boundaries in multiple directions and at multiple scales, so that the retrieval result approaches the target seg...
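For illustration only, the iterative boundary adjustment described in [0051] can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the action set `ACTIONS`, the scales in `SCALES`, the initial window, and the function names are all hypothetical, and `oracle_policy` is a stand-in that looks at the known target boundary, whereas the agents in the invention would decide from the video and the query sentence.

```python
# Hypothetical scales (in seconds) by which a boundary may be shifted.
SCALES = (8.0, 4.0, 1.0)
# Hypothetical action set: stay, or move left/right at each scale.
ACTIONS = [0.0] + [sign * s for s in SCALES for sign in (-1.0, 1.0)]


def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def temporal_iou(a, b):
    """Intersection over union of two (start, end) time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def oracle_policy(boundary, target_boundary):
    """Stand-in for a learned agent (assumption): choose the shift that
    brings this boundary closest to the known target boundary."""
    return min(ACTIONS, key=lambda a: abs(boundary + a - target_boundary))


def retrieve(duration, target, max_steps=20):
    """Iteratively adjust start and end with two independent agents."""
    # Arbitrary initial window covering the middle half of the video.
    start, end = 0.25 * duration, 0.75 * duration
    for _ in range(max_steps):
        d_start = oracle_policy(start, target[0])  # start point agent
        d_end = oracle_policy(end, target[1])      # end point agent
        if d_start == 0.0 and d_end == 0.0:
            break  # both agents choose to stop
        # Keep the window inside the video and at least 1 second long.
        start = clamp(start + d_start, 0.0, end - 1.0)
        end = clamp(end + d_end, start + 1.0, duration)
    return start, end


if __name__ == "__main__":
    target = (12.0, 31.0)
    result = retrieve(60.0, target)
    print(result, temporal_iou(result, target))
```

Each agent moves only its own boundary, and coarse shifts are taken before fine ones, which reflects the multi-direction, multi-scale adjustment in the paragraph above. Replacing `oracle_policy` with a learned policy is where the actual method would differ from this sketch.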