Audio multi-label classification method based on deep learning

A deep learning, multi-label technology, applied in the field of multi-label classification

Inactive Publication Date: 2021-03-26

HUNAN UNIV


Problems solved by technology

[0005] The invention discloses an audio multi-label classification method based on deep learning, which solves the problem of automatically classifying complex environmental sounds in the presence of noise interference.


Embodiment Construction

[0035] The hardware environment of the present invention is mainly a server with a GeForce RTX 2080 Ti GPU. The software is implemented on Ubuntu 16.04 in the Python programming language, using the TensorFlow deep learning framework. The experimental data come from the FSDKaggle2019 data set on the Kaggle platform, which consists of two parts: the Freesound Dataset (FSD) and the Yahoo Flickr Creative Commons 100M dataset (YFCC). FSD is based on AudioSet, and YFCC is a set of audio tracks from Flickr videos. The entire data set contains 80 class labels, such as applause, cows, and rain. The implementation is divided into five parts: data preprocessing, audio feature extraction, model construction and training, model evaluation, and audio label classification. The details are as follows:
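Because a clip may carry several of the 80 class labels at once, the training targets are usually stored as multi-hot vectors. The following is a minimal sketch of that encoding, not the patent's own code. It assumes labels arrive as comma-separated strings, and the function names and the three-class vocabulary (taken from the examples named above) are illustrative only.

```python
def build_label_index(label_names):
    """Map each class name to a fixed column position (alphabetical order)."""
    return {name: i for i, name in enumerate(sorted(label_names))}


def encode_multi_hot(label_string, label_index):
    """Turn a string such as 'rain,applause' into a 0/1 vector, one slot per class."""
    vector = [0] * len(label_index)
    for name in label_string.split(","):
        name = name.strip()
        if name:  # skip empty fragments, e.g. from an empty label string
            vector[label_index[name]] = 1
    return vector


# Illustrative three-class vocabulary (assumption); the real data set has 80 classes.
LABELS = ["applause", "cows", "rain"]
label_index = build_label_index(LABELS)

target = encode_multi_hot("rain,applause", label_index)
print(target)  # [1, 0, 1]
```

With the full label set, each clip would map to an 80-element vector, and a clip with no recognised label would map to all zeros.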

[0036] 1. Data preprocessing

[0037] Since the original audio data set contains noise interference, this patent ...


Abstract

The invention relates to the field of audio tagging for environmental sound recognition, and in particular to a deep learning-based multi-label classification method for noisy audio. Data preprocessing applies the RNNoise algorithm to denoise the data set. Audio feature extraction performs a short-time Fourier transform on each audio clip, converts the result into MFCC feature data, and feeds the MFCC features into a VGGish network to obtain a 128-dimensional high-level feature embedding. Model construction uses a CNN together with an RNN: the CNN exploits the two-dimensional structure of the input data to process voice data, and the RNN exploits the correlation between labels to predict the labels in order. Model training tracks the loss function value and the classification error and updates the model parameters until a model with relatively high accuracy is obtained. Model evaluation defines the evaluation indexes and calculates the average precision. Audio multi-label classification loads the trained model and outputs the predicted label probabilities. The process is shown in Figure 1.
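The abstract names average precision as the evaluation index but does not give a formula. The sketch below shows one common definition, assumed here for illustration: for each clip, labels are ranked by predicted probability and the precision is averaged over the ranks at which a true label appears. The function names are hypothetical.

```python
def average_precision(y_true, y_score):
    """Average precision for one clip.

    y_true  : list of 0/1 ground-truth flags, one per label
    y_score : list of predicted probabilities, one per label
    """
    # Rank label positions from highest to lowest predicted score.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this true label's rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(truths, scores):
    """Mean of the per-clip average precision over a batch of clips."""
    values = [average_precision(t, s) for t, s in zip(truths, scores)]
    return sum(values) / len(values)


# Two hypothetical clips with four labels each.
truths = [[1, 0, 1, 0], [0, 1, 0, 0]]
scores = [[0.9, 0.8, 0.7, 0.1], [0.2, 0.6, 0.3, 0.1]]
print(mean_average_precision(truths, scores))
```

In the first clip the true labels sit at ranks 1 and 3, giving (1/1 + 2/3) / 2; in the second the single true label is ranked first, giving 1.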

Description

Technical field

[0001] The invention relates to the field of audio tagging for environmental sound recognition, and in particular to a deep learning-based multi-label classification method for noisy audio. Specifically, the extracted audio features are used as the input of a neural network, which is trained to obtain a high-accuracy model that then performs label classification.

Background technique

[0002] In recent years, deep learning has been widely used in speech recognition, image classification, autonomous driving, and other fields, and environmental sound classification is a problem with wide application in real life. Research on this problem has gradually become a hotspot.

[0003] Traditional single-label classification addresses the case where an example belongs to only one category. However, in real life, due to the complexity and polysemy of the objective object itself, there is often no absolute single-label c...
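The contrast between single-label and multi-label prediction can be illustrated with a small sketch. It uses a generic baseline (softmax with argmax versus independent sigmoids with a threshold) and is not the patent's method, which predicts labels in order with an RNN. The logits, the 0.5 threshold, and the function names are assumptions.

```python
import math


def softmax(logits):
    """Normalise scores into probabilities that sum to 1 (single-label view)."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Independent per-label probability (multi-label view)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_single_label(logits):
    """Return the index of the single most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=lambda i: probs[i])


def predict_multi_label(logits, threshold=0.5):
    """Return every class index whose sigmoid probability reaches the threshold."""
    return [i for i, x in enumerate(logits) if sigmoid(x) >= threshold]


# Hypothetical scores for the classes applause, cows, rain.
logits = [2.0, -1.0, 1.5]
print(predict_single_label(logits))  # 0
print(predict_multi_label(logits))   # [0, 2]
```

The single-label rule must discard "rain" even though its score is high, whereas the multi-label rule can report both sounds for the same clip.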


Application Information


IPC (8): G06F16/65, G06F16/683, G06K9/62, G06N3/04, G06N3/08

CPC: G06F16/65, G06F16/683, G06N3/08, G06N3/047, G06N3/045, G06F18/241, G06F18/2415

Inventor: 陈浩, 马文, 钟雄虎

Owner: HUNAN UNIV
