Environment sound identification method and system based on convolutional neural network

An environmental sound recognition technology based on convolutional neural networks, applied in biological neural network models, speech recognition, neural architectures, etc., which addresses problems such as limited application scope, poor robustness, and inconvenient feature extraction.

Active Publication Date: 2018-12-21
SHANGHAI UNIV

AI Technical Summary

Problems solved by technology

[0003] The existing sound event recognition methods based on convolutional neural networks and cochlear spectrograms, sound scene recognition methods based on convolutional neural networks and random forests, and envir


Examples

Embodiment Construction

[0019] This embodiment is tested on three public datasets: ESC-10, ESC-50 and UrbanSound8K. As shown in Figure 1, the method includes:

[0020] Step 1) Data augmentation: because the three public datasets contain few sound samples, this embodiment applies time stretching and pitch shifting to expand the training set and improve the generalization performance of the model.

[0021] Time stretching speeds up or slows down the sound without changing its pitch, producing new samples.

[0022] Pitch shifting raises or lowers the pitch without changing the duration of the sound, producing new samples.
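The two augmentations in Step 1 can be sketched with NumPy alone. This is a minimal illustration, not the patent's implementation: it assumes a basic phase vocoder for time stretching, and it builds pitch shifting from time stretching followed by linear-interpolation resampling. The function names, the frame size `n_fft`, and the hop size `hop` are hypothetical choices, and the input is assumed to contain at least `n_fft + hop` samples.

```python
import numpy as np

def stft(y, n_fft=1024, hop=256):
    """Short-time Fourier transform with a Hann window; shape (n_frames, n_fft//2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=1024, hop=256):
    """Inverse STFT by windowed overlap-add, normalized by the summed squared window."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros((spec.shape[0] - 1) * hop + n_fft)
    norm = np.zeros_like(out)
    for i in range(spec.shape[0]):
        out[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    # Skip normalization where the window sum is tiny (signal edges).
    return out / np.where(norm > 1e-3, norm, 1.0)

def time_stretch(y, rate, n_fft=1024, hop=256):
    """Phase-vocoder time stretch: rate > 1 speeds up, rate < 1 slows down; pitch is kept."""
    spec = stft(y, n_fft, hop)
    steps = np.arange(0, spec.shape[0] - 1, rate)              # fractional frame positions
    omega = 2 * np.pi * hop * np.arange(spec.shape[1]) / n_fft  # expected phase advance per hop
    phase = np.angle(spec[0])
    out = np.empty((len(steps), spec.shape[1]), dtype=complex)
    for t, step in enumerate(steps):
        i = int(step)
        frac = step - i
        # Interpolate magnitudes between neighbouring frames, keep accumulated phase.
        mag = (1 - frac) * np.abs(spec[i]) + frac * np.abs(spec[i + 1])
        out[t] = mag * np.exp(1j * phase)
        # Measured deviation from the expected phase advance, wrapped to [-pi, pi].
        dphi = np.angle(spec[i + 1]) - np.angle(spec[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase += omega + dphi
    return istft(out, n_fft, hop)

def pitch_shift(y, n_steps, n_fft=1024, hop=256):
    """Shift pitch by n_steps semitones while returning a signal of the original length."""
    factor = 2.0 ** (n_steps / 12.0)
    stretched = time_stretch(y, 1.0 / factor, n_fft, hop)       # change duration, keep pitch
    n_out = int(len(stretched) / factor)
    positions = np.arange(n_out) * factor                       # read 'factor' times faster
    shifted = np.interp(positions, np.arange(len(stretched)), stretched)
    out = np.zeros(len(y))                                      # zero-pad or trim to len(y)
    n = min(len(y), n_out)
    out[:n] = shifted[:n]
    return out
```

Each original clip could then yield several training samples, for example `time_stretch(clip, 0.8)`, `time_stretch(clip, 1.2)`, `pitch_shift(clip, -2)` and `pitch_shift(clip, 2)`; these particular rates and semitone offsets are illustrative and are not taken from the patent.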

[0023] Step 2) Feature extraction: apply the FFT to obtain the amplitude spectrum of the sound, square it to obtain the energy spectrum, and then use a Mel filter bank to convert the energy spectrum to the Mel frequency representation t...
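The part of Step 2 that is visible here (FFT, amplitude spectrum, squared energy spectrum, Mel filter bank) can be sketched as follows. This is a hedged illustration under stated assumptions: the frame length `n_fft`, hop size `hop`, number of Mel bands `n_mels`, the Hann window, and the common `2595 * log10(1 + f / 700)` Mel formula are all choices made for the example, not values given in the source. Because the paragraph is truncated, any later processing is not shown.

```python
import numpy as np

def hz_to_mel(f):
    """Convert frequency in Hz to the Mel scale (assumed formula)."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(sr, n_fft, n_mels):
    """Triangular filters equally spaced on the Mel scale; shape (n_mels, n_fft//2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, mid, hi = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mel_energy_spectrogram(y, sr, n_fft=1024, hop=512, n_mels=60):
    """Frame the signal, then FFT -> amplitude -> energy -> Mel bands; shape (n_frames, n_mels)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    amplitude = np.abs(np.fft.rfft(frames, axis=1))   # amplitude spectrum
    energy = amplitude ** 2                           # energy spectrum
    return energy @ mel_filter_bank(sr, n_fft, n_mels).T
```

For a one-second clip at 22050 Hz with these example settings, `mel_energy_spectrogram(clip, 22050)` returns an array of 42 frames by 60 Mel bands.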

Abstract

The invention relates to an environmental sound recognition method and system based on a convolutional neural network. Mel energy spectrum features extracted from audio are mixed to construct a sample database, the sample database is used to train a convolutional neural network model, and the trained network is then used to recognize environmental sounds. The method achieves the best or near-best results on three public sound datasets: ESC-10, ESC-50 and UrbanSound8K.
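The abstract does not say how the Mel energy spectrum features are mixed. One common way to mix training samples is a convex combination of two feature arrays and their one-hot labels with a randomly drawn weight. The sketch below assumes that scheme purely for illustration; the function name `mix_features`, the Beta-distributed weight, and the parameter `alpha` are hypothetical and are not confirmed by the source.

```python
import numpy as np

def mix_features(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two feature arrays and their one-hot labels with a random weight lam.

    Assumption: lam is drawn from Beta(alpha, alpha); the source does not state this.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2    # mixed feature map
    y = lam * y1 + (1.0 - lam) * y2    # mixed (soft) label
    return x, y, lam
```

Under this assumption, the mixed label stays a valid probability vector, so the network can be trained on it with an ordinary cross-entropy loss against soft targets.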

Description

Technical field

[0001] The present invention relates to the field of audio processing, in particular to a convolutional neural network-based environmental sound recognition method and system.

Background art

[0002] Environmental sound recognition is an important area of audio information research, with great application potential in security monitoring, medical monitoring, smart homes and scene analysis. Compared with speech, environmental sound is noise-like and has a wide frequency spectrum, which makes it more challenging to recognize.

[0003] The existing sound event recognition methods based on convolutional neural networks and cochlear spectrograms, sound scene recognition methods based on convolutional neural networks and random forests, and environmental sound recognition methods based on time-frequency domain statistical feature extraction all have limi...

Application Information

IPC(8): G10L15/08, G10L15/02, G10L15/06, G06N3/04
CPC: G10L15/02, G10L15/063, G10L15/08, G06N3/045
Inventors: 张智超, 徐树公, 曹姗, 张舜卿
Owner SHANGHAI UNIV