Audio classification method based on dual data enhancement strategy

An audio classification method, applied in the field of information processing, that addresses the problems of insufficient data, reduced audio classification accuracy, and poor recognition accuracy, with the effect of improving accuracy, generalization ability, and training accuracy.

Active Publication Date: 2020-02-18
WUHAN UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

However, because of the particular nature of spectrograms, traditional image data enhancement methods such as rotation, flipping, and zooming cannot be applied to them, and the lack of a data enhancement step for the spectrogram also reduces audio classification accuracy.
The current mainstream approach is to enhance the audio data directly through methods such as rotation, tuning, and noise addition. For deep learning, however, the resulting data are often still insufficient, especially for datasets with few samples but a large number of label categories, which leads to poor final recognition accuracy on such datasets.

Examples

Embodiment Construction

[0046] In order to further understand the present invention, the preferred embodiments of the present invention are described below in conjunction with examples, but it should be understood that these descriptions are only to further illustrate the features and advantages of the present invention, rather than limiting the claims of the present invention.

[0047] An embodiment of the present invention provides an audio classification method based on a dual data enhancement strategy. The basic flow of the method is shown in Figure 1, and the specific implementation process is as follows:

[0048] Step 1: Data Preprocessing

[0049] Preprocess the original audio data and convert it into a unified standard file for storage. The specific operations are as follows:

[0050] (1) Audio clipping: Remove 1 s of data from the head of each large original audio file and 1 s of data from the tail, so as to avoid the lack of audio information at the beginning and end of the ...
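The clipping operation in (1) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `trim_head_tail`, its parameters, and the choice to raise an error for clips that are too short are all assumptions, and the audio is assumed to be already decoded into a one-dimensional sequence of samples.

```python
from typing import Sequence


def trim_head_tail(samples: Sequence, sample_rate: int, trim_seconds: float = 1.0):
    """Drop `trim_seconds` of audio from both the head and the tail of a clip.

    `samples` is any sliceable 1-D sequence (a list here; an array type with
    the same slicing behaviour would also work). All names are hypothetical.
    """
    # Number of samples that correspond to the trimmed duration.
    n_trim = int(round(trim_seconds * sample_rate))
    # Assumption: refuse clips that would be empty after trimming both ends.
    if len(samples) <= 2 * n_trim:
        raise ValueError("clip is too short to trim from both ends")
    return samples[n_trim:len(samples) - n_trim]


# Toy example: 5 seconds of "audio" at 2 samples per second.
clip = list(range(10))
trimmed = trim_head_tail(clip, sample_rate=2)
```

With a real recording, `sample_rate` would be the file's sampling rate (for example 44100), so 1 s corresponds to that many samples at each end.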

Abstract

The invention discloses an audio classification method based on a dual data enhancement strategy. The method comprises the following steps: first, primary audio data enhancement is performed on an original audio file using four traditional audio enhancement methods, namely rotation, tone tuning, tone modification, and noise adding; second, the audio data produced by the primary enhancement is converted into a spectrogram; third, secondary data enhancement is performed on the spectrogram data through random mean-value replacement; fourth, the data produced by the dual data enhancement is input to an Inception_Resnet_V2 neural network for training, which completes feature extraction; finally, an xgboost classifier is trained on the extracted high-level features to complete the final audio classification. The method adds data enhancement on the spectrogram on top of traditional audio enhancement. The dual data enhancement directly improves training precision, the accuracy rate improves by 3% in testing, and the method can be widely applied to environmental sound classification and voiceprint recognition tasks.
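The secondary enhancement step, random replacement with the mean value, might look roughly like the sketch below. The abstract only names the technique, so everything concrete here is an assumption made for illustration: that a single rectangular block is replaced, that the block position is drawn uniformly, that the mean is taken over the whole spectrogram, and the function name `replace_random_block_with_mean` and its parameters. The spectrogram is represented as a plain list of rows to keep the example dependency-free.

```python
import random
from typing import List, Optional

Spectrogram = List[List[float]]  # rows = frequency bins, columns = time frames


def replace_random_block_with_mean(
    spec: Spectrogram,
    block_rows: int,
    block_cols: int,
    rng: Optional[random.Random] = None,
) -> Spectrogram:
    """Return a copy of `spec` with one random block set to the global mean.

    Assumptions (not from the source): one rectangular block, uniform
    position, mean computed over the entire spectrogram.
    """
    if rng is None:
        rng = random.Random()
    n_rows, n_cols = len(spec), len(spec[0])
    if not (0 < block_rows <= n_rows and 0 < block_cols <= n_cols):
        raise ValueError("block must fit inside the spectrogram")
    # Mean of every cell in the spectrogram.
    mean = sum(sum(row) for row in spec) / (n_rows * n_cols)
    # Top-left corner of the block, chosen so the block stays in bounds.
    top = rng.randrange(n_rows - block_rows + 1)
    left = rng.randrange(n_cols - block_cols + 1)
    out = [row[:] for row in spec]  # copy so the input is left untouched
    for r in range(top, top + block_rows):
        for c in range(left, left + block_cols):
            out[r][c] = mean
    return out


# Toy 3x4 "spectrogram"; a seeded generator makes the augmentation repeatable.
spec = [[float(r * 4 + c) for c in range(4)] for r in range(3)]
augmented = replace_random_block_with_mean(spec, 2, 2, random.Random(0))
```

Unlike rotation or flipping, this kind of replacement leaves the time and frequency axes of the spectrogram in place, which is consistent with the motivation given above for a spectrogram-specific enhancement.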

Description

Technical field

[0001] The invention relates to the technical field of information processing, in particular to an audio classification method based on a dual data enhancement strategy.

Background technique

[0002] Analyzing audio can yield a wealth of information about people's activities, communications, surrounding conditions, and more. Generally, audio classification is divided into two steps. The first is to extract features from the different audio clips to be classified, usually hand-crafted features such as log-Mel features, matrix decomposition, dictionary learning, wavelet-based features, and Mel-frequency cepstral coefficients. The second is to analyze the extracted audio features and then train a classifier for recognition. The audio features are very important: poorly performing features lead directly to poor classification results downstream.

[0003] Traditional features are not expressive enough to represent raw au...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06, G10L15/16, G10L19/02, G10L21/02
CPC: G10L15/16, G10L15/063, G10L21/02, G10L19/02
Inventor: 张晓龙, 周迅, 边小勇, 李波, 何新宇, 甘浩旻
Owner: WUHAN UNIV OF SCI & TECH