Environment sound identification method and system based on convolutional neural network

A convolutional neural network-based environmental sound recognition technology, applied in fields such as biological neural network models, speech recognition, and neural architectures. It addresses problems such as limited application scope, poor robustness, and inconvenient feature extraction.

Active Publication Date: 2018-12-21

SHANGHAI UNIV

Cites: 5. Cited by: 39.

Problems solved by technology

[0003] The existing sound event recognition methods based on convolutional neural networks and cochlear spectrograms, sound scene recognition methods based on convolutional neural networks and random forests, and environmental sound recognition methods based on time-frequency domain statistical feature extraction all suffer from disadvantages such as limited scope of application, inconvenient feature extraction, and poor robustness to noise.

Examples

Embodiment Construction

[0019] This embodiment is tested on three public datasets: ESC-10, ESC-50 and UrbanSound8K. As shown in Figure 1, it includes:

[0020] Step 1) Data augmentation: the three public datasets contain few sound samples, so this embodiment uses time stretching and pitch shifting to expand the training set and improve the generalization performance of the model.

[0021] Time stretching speeds up or slows down a sound without changing its pitch, yielding new samples.

[0022] Pitch shifting raises or lowers the pitch of a sound without changing its duration, yielding new samples.
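
The two augmentations in Step 1 (time extension, i.e. time stretching, and pitch conversion, i.e. pitch shifting) can be sketched as follows. The patent text here does not say which algorithm it uses, so this is only an illustration under stated assumptions: a plain overlap-add (OLA) time stretch, which is crude and can introduce audible artifacts, and a pitch shift built from that stretch followed by linear-interpolation resampling. The function names, the frame length of 256 samples, and the 50% overlap are all hypothetical choices, not values from the source.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def time_stretch(x, rate, frame=256):
    """Overlap-add time stretch (assumed algorithm, not from the patent).

    rate > 1 makes the sound shorter (faster); rate < 1 makes it longer.
    Frames are read every hop_in samples and written every hop_out samples,
    so the local waveform (and thus the pitch) is roughly preserved.
    """
    hop_out = frame // 2              # synthesis hop: 50% overlap
    hop_in = hop_out * rate           # analysis hop, may be fractional
    win = hann(frame)
    n_frames = max(1, int((len(x) - frame) / hop_in) + 1)
    out_len = (n_frames - 1) * hop_out + frame
    out = [0.0] * out_len
    norm = [0.0] * out_len            # summed window weights, for normalization
    for k in range(n_frames):
        src = int(round(k * hop_in))
        dst = k * hop_out
        for i in range(frame):
            if src + i < len(x):
                out[dst + i] += x[src + i] * win[i]
            norm[dst + i] += win[i]
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]

def resample_linear(x, n_out):
    """Resample x to exactly n_out samples by linear interpolation."""
    if n_out == 1:
        return [x[0]]
    scale = (len(x) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * scale
        lo = min(int(pos), len(x) - 1)
        hi = min(lo + 1, len(x) - 1)
        frac = pos - lo
        out.append(x[lo] * (1.0 - frac) + x[hi] * frac)
    return out

def pitch_shift(x, semitones, frame=256):
    """Shift pitch by stretching in time, then resampling back to len(x)."""
    factor = 2.0 ** (semitones / 12.0)
    stretched = time_stretch(x, 1.0 / factor, frame)
    return resample_linear(stretched, len(x))

# Example: build augmented copies of a one-second 440 Hz tone at 8 kHz.
sr = 8000
tone = [math.sin(2.0 * math.pi * 440.0 * n / sr) for n in range(sr)]
augmented = [time_stretch(tone, 0.8), time_stretch(tone, 1.25),
             pitch_shift(tone, 2), pitch_shift(tone, -2)]
```

In practice a phase vocoder or a waveform-similarity method would usually give cleaner results than plain OLA; the sketch only shows how the two operations relate, with duration changing in one and staying fixed in the other.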

[0023] Step 2) Feature extraction: apply the FFT to obtain the amplitude spectrum of the sound, square it to obtain the energy spectrum, and then use a Mel filter bank to convert the energy spectrum to the Mel frequency representation t...
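
The part of Step 2 that is visible above (FFT, squared amplitude, Mel filter bank) can be sketched for a single frame as follows. This is a minimal illustration, not the patent's implementation: the recursive radix-2 FFT, the triangular filter shape, the common `2595 * log10(1 + f / 700)` Mel formula, and every parameter value (8 kHz sampling rate, 512-point FFT, 20 Mel bands) are assumptions. The sketch stops at the Mel band energies because the source text is truncated after that point.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    tw = [cmath.exp(-2j * math.pi * k / n) * odd[k] for k in range(n // 2)]
    return ([even[k] + tw[k] for k in range(n // 2)] +
            [even[k] - tw[k] for k in range(n // 2)])

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters with edges equally spaced on the Mel scale."""
    mel_max = hz_to_mel(sr / 2.0)
    edges = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sr / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_mels):
        lo, ctr, hi = edges[m], edges[m + 1], edges[m + 2]
        # Rising slope up to the centre, falling slope after it, zero outside.
        bank.append([max(0.0, min((f - lo) / (ctr - lo), (hi - f) / (hi - ctr)))
                     for f in bin_hz])
    return bank

def mel_energy_frame(frame, bank):
    """FFT -> amplitude spectrum -> squared (energy) -> Mel band energies."""
    spectrum = fft(frame)
    half = len(frame) // 2 + 1                 # keep non-negative frequencies
    energy = [abs(c) ** 2 for c in spectrum[:half]]
    return [sum(w * e for w, e in zip(row, energy)) for row in bank]

# Example: Mel energies of one 512-sample frame of a 1 kHz tone.
sr, n_fft, n_mels = 8000, 512, 20
bank = mel_filterbank(n_mels, n_fft, sr)
frame = [math.sin(2.0 * math.pi * 1000.0 * n / sr) for n in range(n_fft)]
mel_energies = mel_energy_frame(frame, bank)
```

A full feature extractor would slide this over overlapping frames of each clip to build a two-dimensional Mel energy map; frame length and hop size are not given in the text shown here.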

Abstract

The invention relates to an environmental sound recognition method and system based on a convolutional neural network. Mel energy spectrum features extracted from audio are mixed to construct a sample database, the sample database is used to train a convolutional neural network model, and the trained convolutional neural network is then used to recognize environmental sounds. The method achieves the best or near-best results on three public sound datasets: ESC-10, ESC-50 and UrbanSound8K.

Description

Technical field

[0001] The present invention relates to technology in the field of audio processing, and in particular to an environmental sound recognition method and system based on a convolutional neural network.

Background

[0002] In audio information research, environmental sound recognition is an important field with great application potential in security monitoring, medical monitoring, smart homes, and scene analysis. Compared with speech, environmental sound is noise-like and has a wide frequency spectrum, which makes its recognition more challenging.

[0003] The existing sound event recognition methods based on convolutional neural networks and cochlear spectrograms, sound scene recognition methods based on convolutional neural networks and random forests, and environmental sound recognition methods based on time-frequency domain statistical feature extraction all have limi...

Application Information

IPC(8): G10L15/08, G10L15/02, G10L15/06, G06N3/04

CPC: G10L15/02, G10L15/063, G10L15/08, G06N3/045

Inventors: 张智超, 徐树公, 曹姗, 张舜卿

Owner: SHANGHAI UNIV
