Method for identifying sound scenes based on CNN (convolutional neural network) and random forest classification

The technology combines random forest classification with a convolutional neural network and is applied in speech recognition, character and pattern recognition, speech analysis, and related fields. It addresses problems such as difficult model training, recognition performance that depends on segmentation settings, and increased model complexity, and achieves an improved recognition rate, lower computing-resource and training-time requirements, and a simple CNN structure.

Status: Inactive
Publication date: 2018-06-29
Applicant: FUZHOU UNIV
Cites: 6; Cited by: 43

AI Technical Summary

Problems solved by technology

[0004] However, related CNN-based methods have the following disadvantages: 1) the recognition performance depends on the chosen segment length, and the variation in recognition rate caused by different lengths makes the CNN model unstable, and the ...


Examples


Detailed Description of Embodiments

[0046] The technical solution of the present invention is described in detail below with reference to the accompanying drawings.

[0047] In the sound scene recognition method based on a convolutional neural network and random forest classification of the present invention, the sound scene is first passed through a Mel filter to generate a Mel energy spectrum and its segment sample set; then the segment sample set is used to train the CNN in two stages, and the feature output of the fully connected layer is truncated to obtain the CNN features of the segment sample set; finally, a random forest classifies the CNN features of the segment sample set to obtain the final recognition result.
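
As a concrete illustration of this pipeline, the sketch below trains a small CNN on Mel-spectrogram segments, truncates the network at its fully connected layer to obtain segment-level CNN features, and fits a random forest on those features. This is a minimal sketch only: the layer sizes, the 15-class label space, and the single-stage training loop are assumptions made for illustration (the patent's two-stage training scheme is not detailed in this excerpt), using TensorFlow/Keras and scikit-learn.

```python
# Hypothetical sketch of the pipeline: train a small CNN on Mel-spectrogram
# segments, reuse its fully connected layer as a feature extractor, and
# classify the extracted features with a random forest.
import numpy as np
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier

def build_cnn(input_shape, n_classes):
    """A simple CNN; the exact layer sizes are assumptions, not from the patent."""
    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling2D(2)(x)
    x = tf.keras.layers.Conv2D(64, 3, activation="relu")(x)
    x = tf.keras.layers.MaxPooling2D(2)(x)
    x = tf.keras.layers.Flatten()(x)
    features = tf.keras.layers.Dense(128, activation="relu", name="fc_features")(x)
    outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(features)
    return tf.keras.Model(inputs, outputs)

# Placeholder data for illustration:
# X_train: (n_segments, n_mels, n_frames, 1) Mel energy spectrum segments
# y_train: integer scene label per segment
X_train = np.random.rand(256, 64, 64, 1).astype("float32")
y_train = np.random.randint(0, 15, size=256)

cnn = build_cnn(X_train.shape[1:], n_classes=15)
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
cnn.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)

# Truncate the network at the fully connected layer to obtain CNN features.
feature_extractor = tf.keras.Model(cnn.input, cnn.get_layer("fc_features").output)
cnn_features = feature_extractor.predict(X_train, verbose=0)

# Classify the CNN features of the segment sample set with a random forest.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(cnn_features, y_train)
```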

[0048] The sound scene is passed through the Mel filter to generate the Mel energy spectrum and its segment sample set; that is, a Mel energy spectrum is extracted from scene sound samples of various lengths and sampled in slices, so that Mel energy spectrum segments of the same size are obtained as...
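
A minimal sketch of this segment-generation step is given below, assuming librosa provides the Mel filter bank; the number of Mel bands, the segment length, and the hop between slices are illustrative placeholders, not values taken from the patent.

```python
# Hypothetical sketch of segment-set generation: extract a Mel energy spectrum
# from a scene recording and slice it into fixed-size segments of equal shape.
import librosa
import numpy as np

def mel_segments(path, n_mels=64, seg_frames=64, hop_frames=32):
    # Load the scene sound sample (any length) at its native sampling rate.
    y, sr = librosa.load(path, sr=None)
    # Mel energy spectrum, converted to dB: shape (n_mels, n_frames).
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)
    # Slice into overlapping segments of identical size.
    segments = []
    for start in range(0, log_mel.shape[1] - seg_frames + 1, hop_frames):
        segments.append(log_mel[:, start:start + seg_frames])
    # Shape (n_segments, n_mels, seg_frames, 1), ready as CNN input.
    return np.stack(segments)[..., np.newaxis]

# segs = mel_segments("scene.wav")  # "scene.wav" is a placeholder path
```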



Abstract

The invention relates to a method for identifying sound scenes based on a CNN (convolutional neural network) and random forest classification. The method comprises the following steps: generating a Mel energy spectrum and a segment sample set of the sound scene through a Mel filter; using the segment sample set to perform two-stage training of the CNN and truncating the feature output of the fully connected layer to obtain the CNN features of the segment sample set; and finally using a random forest to classify the CNN features of the segment sample set to obtain the final identification result. Experimental results show that the identification rate of the method on the IEEE DCASE2016 sound scene evaluation data set is better than that of the MFCC-GMM (Mel frequency cepstrum coefficient - Gaussian mixture model) method and also better than that of existing related identification methods.
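
The last step of this pipeline, classifying segment-level CNN features with a random forest and combining the segment decisions into one scene-level label, could look like the sketch below. The majority-vote pooling and the placeholder feature dimensions are assumptions made for illustration; this excerpt does not specify how segment results are combined into the final identification result.

```python
# Hypothetical sketch of the final classification step: a random forest labels
# each segment's CNN feature vector, and the segment votes are pooled into a
# single scene-level label (majority vote is an assumption, not from the patent).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def predict_scene(forest: RandomForestClassifier, segment_features: np.ndarray) -> int:
    """Return the scene label receiving the most segment-level votes."""
    votes = forest.predict(segment_features)            # one label per segment
    labels, counts = np.unique(votes, return_counts=True)
    return int(labels[np.argmax(counts)])

# Example with placeholder data: 128-dimensional CNN features, 15 scene classes.
rng = np.random.default_rng(0)
train_feats, train_labels = rng.random((200, 128)), rng.integers(0, 15, 200)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(train_feats, train_labels)
scene_label = predict_scene(forest, rng.random((10, 128)))  # 10 segments of one scene
```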

Description

Technical field

[0001] The invention relates to a sound scene recognition method based on a convolutional neural network and random forest classification.

Background technique

[0002] Sound scene recognition realizes perception of the sound scene by analyzing the audio signal. As one of the key links in the analysis of environmental information, it has a wide range of applications in scene recognition and in the recognition and separation of foreground and background sounds. In recent years, there has been related research on using sound scene recognition to improve a terminal's autonomous perception of its scene [1][2][3]. For example, a mobile phone can detect the sound of its surroundings and automatically mute itself in a meeting scene, or increase call and ringtone volume in a noisy outdoor environment; an automatic driving system can analyze the scene through the surrounding environmental sound to support safe driving.

[0003] For the recognition of sound s...

Claims


Application Information

IPC (8): G10L15/16, G10L15/06, G10L25/21, G10L25/24, G10L25/30, G10L25/45, G06K9/62
CPC: G10L15/063, G10L15/16, G10L25/21, G10L25/24, G10L25/30, G10L25/45, G06F18/241, G06F18/24323
Inventors: 李应, 李俊华
Owner: FUZHOU UNIV