Cross-modal retrieval method for querying video from complex text based on semantic tree enhancement

The semantic-tree-enhanced cross-modal technique, applied in the field of cross-modal retrieval, addresses the problems of information loss, poor video retrieval performance, and ineffective handling of complex text queries.

Active Publication Date: 2020-11-06
ZHEJIANG GONGSHANG UNIVERSITY

AI Technical Summary

Problems solved by technology

However, this kind of method has the following shortcomings:
1. It is usually not very effective for complex text queries, because it is difficult to fully describe the semantic content of a complex text query with a handful of visual concepts, which results in information loss; the semantic content of a complex text query is not merely an aggregation of extracted concepts.
2. How to effectively train a concept classifier and select related co...



Examples


Embodiment Construction

[0067] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0068] In order to solve the problem of cross-modal retrieval from complex text query to video, the present invention proposes a cross-modal retrieval method from complex text query to video based on semantic tree enhancement. The specific steps are as follows:

[0069] (1) Use a feature extraction method to extract the features of the complex text query statement and obtain the leaf node features of the complex text query statement.

[0070] (1-1) Given a complex text query statement Q of length N, the complex text query statement Q can be expressed as:

[0071] $Q = \{w_1, w_2, \ldots, w_N\}$

[0072] where $w_1$ represents the first word in the complex text query statement. First, one-hot encoding is used to encode each word in the complex text query statement, yielding the one-hot encoding vector sequence $\{w'_1, w'_2, \ldots, w'_N\}$, where w′ ...
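The one-hot encoding in step (1-1) can be illustrated with a minimal sketch. The function name `one_hot_encode`, the whitespace tokenizer, and the vocabulary built from the query itself are assumptions made for illustration only; the source does not specify the vocabulary or the tokenization it uses.

```python
def one_hot_encode(query, vocab=None):
    """Encode each word of a query as a one-hot vector.

    Assumption: words are split on whitespace and lowercased. If no
    vocabulary is given, a hypothetical one is built from the query itself.
    """
    words = query.lower().split()
    if vocab is None:
        vocab = sorted(set(words))
    index = {word: i for i, word in enumerate(vocab)}

    vectors = []
    for word in words:
        vec = [0] * len(vocab)   # all zeros ...
        vec[index[word]] = 1     # ... except the position of this word
        vectors.append(vec)
    return vocab, vectors


# Example query of length N = 5 (hypothetical).
vocab, vectors = one_hot_encode("a dog chases a ball")
```

Each vector has exactly one non-zero entry, and repeated words (here "a") map to the same vector. In practice such vectors are typically projected through an embedding matrix before further processing.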


Abstract

The invention discloses a cross-modal retrieval method for querying video from complex text based on semantic tree enhancement. For a complex text query statement, the words of the statement are converted into leaf node representations, the relationships between child nodes are mined, the two child nodes with the highest dependency are combined, the semantic tree structure of the query statement is constructed recursively, and a query representation based on semantic tree enhancement is obtained. To encode candidate videos, preliminary video features are obtained through a CNN, temporal dependence and semantic correlation between the videos are captured through a GRU and a self-attention mechanism module, and a robust video feature representation is obtained. The complex text query representation and the video feature representation are mapped into a common space, and the matching relationship between them is learned automatically, thereby realizing cross-modal retrieval from complex text query to video. The information components in the complex text query statement can be explained, the user's intention can be better understood, and retrieval performance is improved to a great extent.
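The recursive tree construction described in the abstract can be sketched as a greedy bottom-up merge. This is only an illustration under stated assumptions: candidate pairs are restricted to adjacent nodes, the dependency score is a plain dot product, and the parent feature is an element-wise mean. The names `build_tree`, `dot`, and `mean_compose` are hypothetical; the source does not state how the patent scores dependency or composes nodes, and a learned scorer and composition function would take their place.

```python
def dot(u, v):
    # Placeholder dependency score (assumption): dot product of features.
    return sum(a * b for a, b in zip(u, v))


def mean_compose(u, v):
    # Placeholder composition (assumption): element-wise mean of children.
    return [(a + b) / 2 for a, b in zip(u, v)]


def build_tree(words, feats, score=dot, compose=mean_compose):
    """Greedily merge the adjacent pair with the highest score until
    a single root remains. Returns (nested_tuple_tree, root_feature)."""
    nodes = list(zip(words, feats))  # each node is (subtree, feature)
    while len(nodes) > 1:
        # Score every adjacent pair of current nodes.
        scores = [score(nodes[i][1], nodes[i + 1][1])
                  for i in range(len(nodes) - 1)]
        k = max(range(len(scores)), key=scores.__getitem__)
        left, right = nodes[k], nodes[k + 1]
        parent = ((left[0], right[0]), compose(left[1], right[1]))
        nodes[k:k + 2] = [parent]  # replace the pair with its parent
    return nodes[0]


# Hypothetical leaf features for a three-word query.
words = ["a", "dog", "runs"]
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
tree, root_feat = build_tree(words, feats)
```

With these toy features, "a" and "dog" score highest and are merged first, after which the resulting node is merged with "runs". The root feature then serves as the tree-enhanced query representation in this sketch.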

Description

Technical field

[0001] The invention relates to the field of cross-modal retrieval from text query to video, in particular to a cross-modal retrieval method from complex text query to video based on semantic tree enhancement.

Background technique

[0002] With the exponential growth of user-generated videos on the Internet, uploading videos in daily life and searching for videos of interest have become indispensable activities in people's daily life. The cross-modal retrieval method from text query to video is one of the techniques to obtain interesting videos. Early cross-modal retrieval methods from text query to video are mainly based on text keywords, and have been extensively researched and developed. But such methods only allow the user to input several keywords as queries. With the further improvement of people's demand for Internet video search capabilities, keyword-based queries are difficult to fully express users' search intentions, thereby affecting search expe...


Application Information

IPC (8): G06F16/33, G06F16/783, G06F40/30, G06N3/04
CPC: G06F16/3344, G06F16/783, G06F40/30, G06N3/044, G06N3/045
Inventor: 董建锋, 彭敬伟, 杨勋, 郑琪, 王勋
Owner: ZHEJIANG GONGSHANG UNIVERSITY