Rapid audio searching method based on GPU (Graphic Processing Unit)


Active Publication Date: 2012-09-05

成都川哈工机器人及智能装备产业技术研究院有限公司


AI Technical Summary

Problems solved by technology

[0003] To address the slow retrieval speed of existing content-based audio retrieval methods, the present invention proposes a fast GPU-based audio retrieval method.


Examples


Specific Embodiment 1

[0019] Specific Embodiment 1: This embodiment comprises the following steps:

[0020] Step 1: Initialization: determine whether feature information of audio segments is present in the graphics processing unit (GPU).

[0021] If not, proceed to Step 2 to preprocess the audio stream data.

[0022] If so, proceed to Step 3 to perform vector sliding matching on the feature information of the audio segments.

[0023] Step 2: Preprocessing: the central processing unit (CPU) divides the audio stream data input into the audio retrieval system into audio segments, extracts features from each audio segment, groups the feature information of the audio segments, and then transfers the feature information of each group in sequence to the texture memory of the GPU.

[0024] Step 3: Audio segment vector sliding matching: the vector sliding matching module in the texture memory of the GPU utilizes the segment vect...
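The CPU-side preprocessing described in Step 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the segment length, hop size, and group size are hypothetical values, `extract_features` is a hypothetical placeholder rather than the feature extraction the patent uses, and the transfer to GPU texture memory is not shown.

```python
from typing import List, Sequence


def split_into_segments(samples: Sequence[float], segment_len: int, hop: int) -> List[List[float]]:
    """Cut the audio stream into fixed-length segments, advancing by `hop` samples."""
    if segment_len <= 0 or hop <= 0:
        raise ValueError("segment_len and hop must be positive")
    last_start = len(samples) - segment_len
    return [list(samples[s:s + segment_len]) for s in range(0, last_start + 1, hop)]


def extract_features(segment: Sequence[float]) -> List[float]:
    """Hypothetical placeholder feature: mean value and mean energy of the segment."""
    n = len(segment)
    mean = sum(segment) / n
    energy = sum(s * s for s in segment) / n
    return [mean, energy]


def group_features(features: List[List[float]], group_size: int) -> List[List[List[float]]]:
    """Collect per-segment features into groups that would be uploaded one after another."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return [features[i:i + group_size] for i in range(0, len(features), group_size)]


def preprocess(samples: Sequence[float], segment_len: int = 4, hop: int = 2,
               group_size: int = 3) -> List[List[List[float]]]:
    """Segment the stream, extract features per segment, then group the features."""
    segments = split_into_segments(samples, segment_len, hop)
    features = [extract_features(seg) for seg in segments]
    return group_features(features, group_size)
```

In this sketch each group returned by `preprocess` stands in for one batch of feature information that the CPU would hand to the GPU in sequence.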

Specific Embodiment 2

[0032] Specific Embodiment 2: This embodiment differs from Specific Embodiment 1 in that the feature information of an audio segment includes Mel cepstral coefficients and their differential features, together with segment vector features, where the segment vector features are obtained by reducing the dimensionality of the Mel cepstral coefficients and their differential features. The other steps are the same as in Specific Embodiment 1.
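As a rough illustration of the two kinds of features named here, the Python sketch below derives differential (delta) features from a list of per-frame coefficient vectors and then collapses them into a single segment vector. The excerpt does not state how the dimensionality reduction is performed, so averaging over frames is purely an assumption, as are the function names `delta` and `segment_vector`.

```python
from typing import List, Sequence


def delta(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """First-order difference between consecutive frames; the first frame gets zeros."""
    if not frames:
        raise ValueError("at least one frame is required")
    deltas = [[0.0] * len(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        deltas.append([c - p for p, c in zip(prev, cur)])
    return deltas


def segment_vector(frames: Sequence[Sequence[float]]) -> List[float]:
    """Assumed reduction: average each coefficient and each delta over all frames."""
    deltas = delta(frames)
    # Append the delta values to the static coefficients of every frame.
    full = [list(f) + d for f, d in zip(frames, deltas)]
    n = len(full)
    return [sum(column) / n for column in zip(*full)]
```

With this assumed reduction, a segment of any number of frames maps to one vector whose length is twice the number of coefficients per frame, which is what makes a cheap first-pass comparison possible.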

Specific Embodiment 3

[0033] Specific Embodiment 3: This embodiment differs from Specific Embodiment 1 or 2 in that the Mel cepstral coefficient feature matrix matching module and the vector sliding matching module are obtained as follows:

[0034] Step A: The central processing unit (CPU) establishes an original audio library according to the function and scale of the audio retrieval system, and performs feature extraction on each audio file in the library to obtain two kinds of feature information: Mel cepstral coefficients with their differential features, and segment vector features. This feature information is used to build a reference template library.

[0035] The Mel cepstral coefficients and their differential features are calculated by first converting the time-domain signal into a frequency-domain signal with the Fast Fourier Transform (FFT), and then the pair of the frequency-d...
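The first stage mentioned in this paragraph, converting one frame from the time domain to the frequency domain, can be illustrated with a small recursive radix-2 FFT in Python. This is a teaching sketch only: it assumes the frame length is a power of two, the names `fft` and `magnitude_spectrum` are illustrative, and it covers only the FFT stage, not the later stages of the calculation, which are cut off in the text above.

```python
import cmath
from typing import List, Sequence


def fft(x: Sequence[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; the input length must be a power of two."""
    n = len(x)
    if n == 0:
        raise ValueError("input must not be empty")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def magnitude_spectrum(frame: Sequence[float]) -> List[float]:
    """Magnitude of each frequency bin for one time-domain frame."""
    return [abs(c) for c in fft(frame)]
```

A production system would normally rely on an optimized FFT library rather than a hand-written routine like this one.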


Abstract

The invention relates to a rapid audio search method based on a GPU (Graphics Processing Unit). It belongs to the field of search methods and addresses the low search speed of traditional content-based audio search methods. The method comprises the following steps: 1, determining whether feature information of audio segments exists in the GPU; if not, proceeding to step 2; if so, proceeding to step 3; 2, preprocessing; 3, sliding matching of audio segment vectors; 4, determining whether the vector sliding matching result exceeds a set first threshold; if not, returning to step 1; if so, proceeding to step 5; 5, matching the Mel cepstral coefficient feature matrix of the audio segment; and 6, determining whether the Mel cepstral coefficient feature matrix matching result exceeds a set second threshold; if not, returning to step 1; if so, the match is deemed successful and the matched result is processed accordingly. The method greatly improves audio search speed while maintaining search accuracy.
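The two-threshold cascade in steps 3 to 6 can be sketched on the CPU in Python as shown below. Several things here are assumptions made for illustration: the abstract does not name a similarity measure, so cosine similarity is swapped in; the function names and thresholds are hypothetical; and each feature matrix is stored as a list of per-frame vectors rather than in GPU texture memory.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Vector]  # one inner vector per frame


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity (an assumed measure); returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def matrix_similarity(a: Matrix, b: Matrix) -> float:
    """Assumed fine-grained score: mean frame-by-frame cosine similarity."""
    scores = [cosine(fa, fb) for fa, fb in zip(a, b)]
    return sum(scores) / len(scores) if scores else 0.0


def cascade_search(stream_vecs: Sequence[Vector], stream_mats: Sequence[Matrix],
                   query_vec: Vector, query_mat: Matrix,
                   first_threshold: float, second_threshold: float) -> List[int]:
    """Slide over stream positions; run the costly matrix match only after the vector match passes."""
    matches = []
    for pos, vec in enumerate(stream_vecs):
        # Steps 3-4: coarse segment-vector match against the first threshold.
        if cosine(vec, query_vec) <= first_threshold:
            continue
        # Steps 5-6: fine feature-matrix match against the second threshold.
        if matrix_similarity(stream_mats[pos], query_mat) > second_threshold:
            matches.append(pos)
    return matches
```

The point of the cascade is that most positions are rejected by the cheap vector comparison, so the more expensive matrix comparison runs only on the few candidates that pass the first threshold.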

Description

Technical field

[0001] The invention relates to a retrieval method, in particular to a method that improves retrieval speed by using a GPU to compute the core steps of content-based audio retrieval.

Background

[0002] Content-based audio retrieval mainly studies how to use physical features such as audio amplitude and frequency spectrum, auditory features such as loudness, pitch, and timbre, and semantic features such as words and melodies to search audio information. Usually, Mel-frequency cepstral coefficients (MFCC) and their differential features are used, so a continuous audio data segment can be regarded as a matrix of floating-point numbers, in which the number of rows is the MFCC feature dimension and the number of columns is the number of frames in the audio data segment. To improve retrieval speed, some audio retrieval algorithms reduce the dimensionality of the MFCC feature matrix into a one-dimens...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06F17/30

Inventors: 韩纪庆, 杜艳斌

Owner: 成都川哈工机器人及智能装备产业技术研究院有限公司
