Audio and video mutual retrieval method based on user click behaviors

An audio and video retrieval technology driven by user click behavior, applied in the field of data retrieval. It addresses the poor results and narrow matching basis of existing methods and improves interpretability.

Active Publication Date: 2019-06-21

SOUTH CHINA UNIV OF TECH

Cites: 5. Cited by: 12.


Problems solved by technology

Although this existing method is highly interpretable, it matches using only one or a few classification results, so the basis for matching is narrow and the results are often poor.

Embodiment

[0043] As shown in Figure 1, a method for mutual retrieval of audio and video based on user click behavior includes the following steps:

[0044] S1. Preprocessing the input audio and video data to obtain a spectrogram of the audio data and key frames of the video data;

[0045] Step S1 is implemented as follows. The input audio data is first rendered as a spectrogram, which is then scaled horizontally to form a two-dimensional image I_a of 128×128 pixels. For the input video data, the frame averaging method is used to extract 128 key frames as the key frame sequence S_f = [f_1, f_2, ..., f_n] of the input video. Each picture in the key frame sequence is then uniformly scaled to a two-dimensional image of 128×128 pixels;
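
The preprocessing in step S1 can be illustrated with a minimal sketch. It assumes that "frame averaging" means picking frames at evenly spaced positions, and it uses a plain nearest-neighbour resize on both axes as a stand-in for the scaling step; the patent text specifies neither. The names `key_frame_indices` and `resize_nearest` are hypothetical, and the toy inputs are made up for illustration.

```python
def key_frame_indices(total_frames, n_key=128):
    """Pick n_key evenly spaced frame indices (assumed reading of 'frame averaging')."""
    if total_frames <= 0:
        raise ValueError("video has no frames")
    # Take the centre of each of n_key equal-length segments of the video.
    return [min(total_frames - 1, int((i + 0.5) * total_frames / n_key))
            for i in range(n_key)]


def resize_nearest(image, out_h=128, out_w=128):
    """Nearest-neighbour resize of a 2-D list of pixel values (assumed scaling method)."""
    in_h, in_w = len(image), len(image[0])
    return [[image[min(in_h - 1, r * in_h // out_h)][min(in_w - 1, c * in_w // out_w)]
             for c in range(out_w)]
            for r in range(out_h)]


# Toy inputs: a 64 x 300 "spectrogram" and a 1280-frame video.
spectrogram = [[(r + c) % 256 for c in range(300)] for r in range(64)]
image_a = resize_nearest(spectrogram)   # plays the role of I_a
indices = key_frame_indices(1280)       # positions of the key frames in S_f
print(len(image_a), len(image_a[0]), len(indices))
```

If the video has fewer frames than `n_key`, this sketch repeats indices rather than failing; a real pipeline might pad or interpolate instead.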

[0046] S2. Send the preprocessed audio data to an encoder composed of a deep convolutional neural network based on an attention mechanism. Obtain the representation vector and attention weight dist...
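
The excerpt above is cut off before it describes the attention mechanism, so the following is only a generic sketch of soft attention pooling, not the patent's encoder. It assumes a set of feature vectors (for example, regions of a convolutional feature map) is scored against a query vector, the scores are normalized with a softmax to give the attention weight distribution, and the representation vector is the weighted sum. The names `softmax`, `attention_pool` and `query` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Return (representation vector, attention weights) for a list of feature vectors.

    Scoring by dot product with a query vector is an assumption made for this sketch.
    """
    scores = [sum(v_d * q_d for v_d, q_d in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


# Toy features: three 2-dimensional vectors and a query favouring the first axis.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
representation, attention = attention_pool(features, query=[2.0, 0.0])
print(representation, attention)
```

The returned weights are what an interpretability step could display, since they show which input regions contributed most to the representation.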

Abstract

The invention discloses an audio and video mutual retrieval method based on user click behaviors. The method comprises the following steps: preprocessing the input audio and video data; sending the preprocessed audio data into a deep convolutional neural network to obtain an audio representation vector and an attention weight distribution; sending the preprocessed video key frames into a deep convolutional neural network to obtain key frame representation vectors, and sequentially sending these vectors into a time-series processing network based on an attention mechanism to obtain a representation vector of the video and an attention weight distribution; calculating the similarity of the audio and video representation vectors and sorting the audio and video according to the similarity; annotating according to the attention weight distribution to provide an explainable basis for the sorting; calculating the loss function from user click behavior and training the model by backpropagation; and retrieving and matching audio and video in the media library based on the trained model. Given a video or an audio clip, the method can retrieve the matching audio or video from the media library.
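
The similarity, sorting and click-based loss steps in the abstract can be sketched as follows. The abstract does not say which similarity measure or loss function is used, so this example assumes cosine similarity and a binary cross-entropy over click labels (1 for clicked, 0 for not clicked). The names `cosine`, `rank_by_similarity` and `click_loss` are hypothetical, and the vectors are toy values.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rank_by_similarity(query_vec, candidates):
    """Sort candidate (name, vector) pairs by similarity to the query, best first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def click_loss(similarities, clicks):
    """Mean binary cross-entropy between sigmoid(similarity) and click labels.

    Using cross-entropy here is an assumption; the source only says the loss
    is calculated from user click behavior.
    """
    total = 0.0
    for s, y in zip(similarities, clicks):
        p = 1.0 / (1.0 + math.exp(-s))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(similarities)


# Toy example: one video vector queried against three audio vectors.
video_vec = [1.0, 0.0]
audio_library = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranked = rank_by_similarity(video_vec, audio_library)
loss = click_loss([score for _, score in ranked], clicks=[1, 0, 0])
print(ranked, loss)
```

In a trained system the loss would be minimized by backpropagation through the encoders; this sketch only evaluates it on fixed vectors.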

Description

Technical field

[0001] The invention relates to data retrieval technology, in particular to a method for mutual retrieval of audio and video based on user click behavior.

Background art

[0002] With the rapid development of the Internet industry, especially the mobile Internet industry, a large amount of audio and video content has been produced, and how to handle this content has become an urgent problem. In the past, finding matching audio for a video, or matching video for audio, and editing the two into a complete work was usually needed only by practitioners in the audio and video industry. Professionals can rely on their exposure to, and professional understanding of, a large amount of audio and video content to combine audio and video organically. But even so, people's memory for audio and video content is still limited, and the size of the audio and video candidate library is limited by human memory, so it is difficult t...

Application Information

IPC(8): G06F16/783, G06F16/78, G06F16/732, G06N3/04, G06N3/08

Inventor 张炯衍, 彭新一, 刘孜文

Owner SOUTH CHINA UNIV OF TECH
