Short-time and long-time feature modeling fusion-based environmental sound recognition method and device

A technique that combines long-time and short-time features, applied in speech recognition, speech analysis, instruments, and related fields. It addresses the insufficient use of available information by existing algorithms and improves recognition results.

Inactive Publication Date: 2016-06-08
INST OF AUTOMATION CHINESE ACAD OF SCI +2

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to solve the situation of insufficient i



Embodiment Construction

[0031] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0032] To make full use of the audio information at every time scale during environmental sound recognition, the present invention proposes a cascade fusion model based on the short-term and long-term features of the audio. The process uses a GMM and an SVM, each modeling a different set of features. The GMM is built on the short-term features of the audio. The input to the SVM classifier comprises the long-term features and the probability scores of the GMM. In this two-stage framework, a confidence measure is first introduced to retain the correct classification results of the first stage; at the same time, the probability scores of the GMM are used as part of the SVM input, so that the short-term d...
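The two-stage decision described in [0032] might be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the softmax normalization of the GMM log-likelihoods, the top-two margin used as the confidence measure, and the threshold value are all assumptions, since the visible text does not specify them. The second-stage classifier is passed in as a callable standing in for a trained SVM.

```python
import math

def softmax(log_likelihoods):
    """Turn per-class GMM log-likelihoods into normalized scores (assumed normalization)."""
    peak = max(log_likelihoods)
    exps = [math.exp(v - peak) for v in log_likelihoods]
    total = sum(exps)
    return [e / total for e in exps]

def confidence(scores):
    """Hypothetical confidence: margin between the two best class scores."""
    ranked = sorted(scores, reverse=True)
    return ranked[0] - ranked[1]

def fuse(long_term_features, scores):
    """Second-stage input: long-term features followed by the GMM scores."""
    return list(long_term_features) + list(scores)

def cascade_classify(gmm_log_likelihoods, long_term_features, second_stage, threshold=0.5):
    """Keep the first-stage label when confident, otherwise defer to the second stage."""
    scores = softmax(gmm_log_likelihoods)
    best = max(range(len(scores)), key=scores.__getitem__)
    if confidence(scores) >= threshold:
        return best, "stage1"
    # Low confidence: re-classify using long-term features plus the GMM scores.
    return second_stage(fuse(long_term_features, scores)), "stage2"
```

In this sketch a clear first-stage winner is returned directly, while a near tie between two classes is handed to the second stage together with the fused feature vector.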


Abstract

The invention discloses an environmental sound recognition method and device based on the fusion of short-time and long-time feature modeling. A cascaded model fusion method is adopted so that both short-time and long-time information is used throughout the recognition process. The method comprises two stages. In the first stage, sliding windows are pre-classified using a Gaussian mixture model (GMM) trained on short-time features, and a confidence judgment is applied to the GMM classification results: a result with high confidence is adopted directly as the final classification result, while a result with low confidence is re-classified based on long-time features. In the second stage, the confusion matrix of the GMM classification results is analyzed to find the classes that are easily confused, a support vector machine (SVM) classification model is trained between those classes, and the SVM performs the re-classification. In the second-stage modeling, the GMM probability scores are appended to the long-time features, and the two together form the input of the SVM.
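The confusion-matrix analysis of the second stage could be sketched as below. This is only an illustration under stated assumptions: the function name `confusable_pairs`, the mutual confusion rate used as the criterion, and the `min_rate` threshold are hypothetical, because the abstract does not say how easily confused classes are selected.

```python
def confusable_pairs(confusion, min_rate=0.1):
    """Return class pairs (i, j), i < j, whose mutual confusion rate reaches min_rate.

    confusion[i][j] counts samples of true class i that the GMM labeled as class j.
    The rate is the number of samples swapped between i and j, divided by the
    total number of samples of both classes (an assumed criterion).
    """
    pairs = []
    n = len(confusion)
    for i in range(n):
        for j in range(i + 1, n):
            total = sum(confusion[i]) + sum(confusion[j])
            if total == 0:
                continue  # neither class has samples, nothing to measure
            rate = (confusion[i][j] + confusion[j][i]) / total
            if rate >= min_rate:
                pairs.append((i, j))
    return pairs

# Example: classes 0 and 1 are often mistaken for each other, class 2 is not.
gmm_confusion = [
    [8, 2, 0],
    [3, 7, 0],
    [0, 0, 10],
]
pairs_to_retrain = confusable_pairs(gmm_confusion)
```

Each returned pair would then get its own SVM, trained on the long-time features with the GMM probability scores appended.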

Description

Technical field

[0001] The invention relates to the field of environmental sound recognition, in particular to the acoustic modeling of environmental sound.

Background technique

[0002] In recent years, non-speech perception has gradually become a research hotspot. Non-speech environmental sounds can also convey useful information: human activities in a specific environment usually produce a variety of acoustic events, such as applause, laughter, footsteps, gunshots, explosions, and glass shattering. By analyzing and processing these environmental sounds, people's activities and the corresponding environmental conditions can be learned effectively.

[0003] For environmental sound recognition, researchers have tried various methods. Since both tasks deal with sound processing, environmental sound recognition first drew on the GMM (Gaussian Mixture Model) / HMM (Hidden Markov Model) technology in the fie...


Application Information

IPC(8): G10L15/14, G10L15/04
CPC: G10L15/04, G10L15/14
Inventor: 刘文举, 胡鹏飞, 张邯平, 高鹏, 董理科, 刘晓飞, 乔利玮, 王桐
Owner INST OF AUTOMATION CHINESE ACAD OF SCI