
Sound scene recognition method based on label amplification and multi-spectrum fusion

A sound scene recognition method, in the field of scene recognition, that uses spectrogram processing. It addresses the problem that prior approaches do not consider clustering and extracting super-category labels, and achieves fast training convergence, improved recognition performance, and greater system robustness.

Active Publication Date: 2018-12-04
SOUTH CHINA NORMAL UNIVERSITY


Problems solved by technology

Similarly, Document 4 assumes that hierarchical labels already exist and does not consider how to cluster and extract super-category labels.
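One way to derive super-category labels without assuming a pre-existing hierarchy, in the spirit of the label amplification idea, is to cluster per-class feature summaries. The sketch below is a hypothetical illustration with toy data and a plain k-means; the patent does not specify the clustering algorithm, feature dimensions, or number of super-categories used here.

```python
# Hypothetical sketch: derive super-category labels by clustering
# class-mean feature vectors. Toy data and parameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy per-class mean feature vectors for 6 fine-grained scene classes.
class_means = rng.normal(size=(6, 8))

def kmeans(x, k, iters=20, seed=0):
    """Plain k-means; returns a cluster index per row of x."""
    r = np.random.default_rng(seed)
    centers = x[r.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every center, then nearest center.
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=-1)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = x[assign == j].mean(axis=0)
    return assign

# Each fine-grained class inherits the super-category of its cluster.
super_labels = kmeans(class_means, k=2)
print(super_labels.shape)  # one super-category id per fine class
```

In a real pipeline the class means would come from deep features of the basic classification model rather than random data.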



Examples


Embodiment

[0031] As shown in Figure 1, in this embodiment, a sound scene recognition method based on label amplification and multi-spectrum fusion includes the following steps:

[0032] Step S1: The data set used in this embodiment comprises the Development and Evaluation file sets of the DCASE2017 sound scene recognition task. 90% of the Development set is used as the training part Tr, the remaining 10% as the validation part V1, and the Evaluation set as the test part Te. Every audio file in these sets is 10 seconds long. Without loss of generality, this embodiment uses only two spectrogram formats to describe the implementation steps: the STFT spectrogram and the CQT spectrogram.
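The 90/10 split in Step S1 can be sketched as follows. The file names here are placeholders, not the actual DCASE2017 file list, and the random seed is an illustrative choice.

```python
# Minimal sketch of the Step S1 split. File names are hypothetical;
# DCASE2017 ships its own fold lists, which could be used instead.
import random

development_files = [f"dev_{i:04d}.wav" for i in range(100)]  # placeholders
random.seed(42)
files = development_files[:]
random.shuffle(files)

cut = int(0.9 * len(files))
Tr = files[:cut]   # training part (90%)
V1 = files[cut:]   # validation part (10%)
# Te would be the separate Evaluation file set.
print(len(Tr), len(V1))
```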

[0033] Step S2: Take the audio files one by one from Tr; after framing, windowing, short-time Fourier transform, and related operations, obtain the STFT time-frequency values, and organize the time-...
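The framing, windowing, and short-time Fourier transform of Step S2 can be sketched with numpy alone. The frame length, hop size, and sample rate below are illustrative assumptions; the patent does not state its STFT parameters.

```python
# Numpy-only sketch of Step S2: frame, window, and STFT an audio signal
# into a magnitude spectrogram. Frame/hop/sample-rate values are assumed.
import numpy as np

def stft_magnitude(signal, frame_len=1024, hop=512):
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft yields frame_len // 2 + 1 frequency bins per frame.
    return np.abs(np.fft.rfft(frames, axis=1)).T  # (freq_bins, n_frames)

# 10 s of audio at an assumed 44.1 kHz sample rate, as a toy 440 Hz tone.
sr = 44100
t = np.arange(10 * sr) / sr
audio = np.sin(2 * np.pi * 440.0 * t)
spec = stft_magnitude(audio)
print(spec.shape)
```

A CQT spectrogram (the second format used in this embodiment) would replace the fixed-resolution FFT with constant-Q filters, e.g. via a dedicated audio library.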



Abstract

The invention discloses a sound scene recognition method based on label amplification and multi-spectrum fusion. The method comprises the following steps: generating multiple spectrograms from the sound scene data using different signal processing techniques; for each spectrogram type, training a deep convolutional neural network as a basic classification model; using the label amplification technique to assign a super-category label to each sample, upgrading the original network into a multi-task learning model with the constructed hierarchical labels, and thereby optimizing the performance of the basic classification model; and extracting sample features with the improved basic model, concatenating the multiple deep features of each sound scene file, and reducing their dimensionality to obtain global features. The global features corresponding to the different spectrograms are fused, and an SVM classifier is trained as the final classification model. By adopting multi-spectrum feature fusion, the method effectively improves recognition performance; the proposed label amplification and model improvement scheme effectively optimizes the basic classifiers and can be extended to other applications.
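The fusion stage described in the abstract can be sketched as follows: concatenate the per-spectrogram deep features of a file and reduce the dimensionality of the result before classification. The feature sizes and random "deep features" are toy assumptions, and PCA-by-SVD stands in for the patent's unspecified dimensionality reduction; the final SVM training is omitted.

```python
# Hedged sketch of multi-spectrum feature fusion. Feature dimensions,
# the random features, and the PCA reduction are all assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_files = 50

# Hypothetical deep features from the STFT-based and CQT-based CNNs.
stft_feats = rng.normal(size=(n_files, 128))
cqt_feats = rng.normal(size=(n_files, 128))

# Splice the per-spectrogram features into one vector per file.
fused = np.concatenate([stft_feats, cqt_feats], axis=1)  # (50, 256)

def pca_reduce(x, k):
    """Project rows of x onto the top-k principal components."""
    xc = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt[:k].T

global_feats = pca_reduce(fused, k=32)  # inputs to the final classifier
print(global_feats.shape)
```

The resulting global features would then be fed to an SVM (e.g. `sklearn.svm.SVC`) as the final classification model.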

Description

Technical field

[0001] The invention belongs to the technical field of scene recognition, and in particular relates to a sound scene recognition method based on label amplification and multi-spectrum fusion.

Background technique

[0002] Sound scene recognition analyzes audio data to determine the attributes, functions, and uses of the environment in which a machine is located. Sound scene recognition based on convolutional neural networks has become one of the most effective methods in this field. Since sound scene data sets are labeled according to the function of the place, inter-class similarity is a prominent problem; for example, libraries and self-study classrooms are easily mistaken for each other. On the other hand, data that are inherently similar in acoustic features are indiscriminately treated as different categories during network training because of their different functional purposes, which hinders the network mode...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G10L25/30, G10L25/51, G10L25/18, G06K9/62, G06N3/04
CPC: G10L25/18, G10L25/30, G10L25/51, G06N3/045, G06F18/2411, G06F18/214
Inventors: 郑伟平, 刑晓涛, 莫振尧
Owner: SOUTH CHINA NORMAL UNIVERSITY