Audio Type Detection Method Based on Bipolar Modeling of Pure Speech and Background Noise

A technology for modeling background noise and pure speech, applied in speech analysis, speech recognition, instruments, and related fields, that addresses problems such as a large amount of computation.

Active Publication Date: 2019-07-16
SOUTH CHINA UNIV OF TECH
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

GMM-SVM is computationally expensive, so applications that require real-time processing place performance demands on the equipment.


Examples

Embodiment

[0038] Figure 1 is a flow chart of the generation of the bipolar background noise and pure speech models and of classifier training in the present invention. The method comprises the following steps:

[0039] (1) Construction of the pure speech and pure background noise models: using sufficient and suitable training data, train a pure speech model GMM_s with N Gaussian mixture components and a background noise model GMM_n with M Gaussian mixture components.
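For reference, the two models in step (1) can be written in the standard Gaussian mixture form. The symbols below (weights w, means \mu, covariances \Sigma, feature vector x) are notation assumed for this sketch and do not appear in the source; GMM_s denotes the pure speech model with N components and GMM_n the background noise model with M components.

```latex
p(x \mid \mathrm{GMM}_s) = \sum_{i=1}^{N} w^{s}_{i}\, \mathcal{N}\!\left(x;\, \mu^{s}_{i}, \Sigma^{s}_{i}\right),
\qquad \sum_{i=1}^{N} w^{s}_{i} = 1

p(x \mid \mathrm{GMM}_n) = \sum_{j=1}^{M} w^{n}_{j}\, \mathcal{N}\!\left(x;\, \mu^{n}_{j}, \Sigma^{n}_{j}\right),
\qquad \sum_{j=1}^{M} w^{n}_{j} = 1
```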

[0040] In this embodiment, the pure speech model has 256 Gaussian mixture components. The GMM is built from pure speech by as many speakers as possible, with varied language content: there are at least 20 speakers, and the male-to-female ratio is kept as balanced as possible. The language content should also be diverse and, for completeness, should cover all basic phonetic units.

[0041] The background noise model has 512 Gaussian mixture components and uses as many backgrou...


Abstract

The present invention provides an audio type detection method based on bipolar modeling of pure speech and background noise, comprising the steps of: S1, constructing a pure speech GMM and a pure background noise GMM; S2, judging from a distance measure whether each Gaussian mixture component lies in the feature-overlap space; S3, removing the Gaussian mixture components that lie in the feature-overlap space and rebuilding the pure speech statistical model and the pure background noise statistical model; S4, computing the probabilities of an audio sample under the new pure speech and pure background noise statistical models, together with the estimated signal-to-noise ratio of the sample; S5, constructing a feature vector from the computed probabilities and the estimated signal-to-noise ratio and using an SVM model to classify the sample as pure speech, background noise, or noisy speech. The invention can effectively distinguish pure speech, pure background noise, and noisy speech while reducing the computational cost of GMM-SVM.
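Steps S2 to S5 can be illustrated with a minimal sketch in plain Python. Several things here are assumptions, not details from the source: components are stored as hypothetical `(weight, means, variances)` tuples with diagonal covariances; the distance is a symmetric KL divergence; "rebuilding" a model is approximated by renormalizing the weights of the surviving components; and the signal-to-noise ratio is passed in as `snr_db` from some external estimator. The SVM itself is omitted; the returned vector is what would be handed to it.

```python
import math

def component_distance(a, b):
    """Symmetric KL divergence between two diagonal Gaussians (assumed measure)."""
    _, mu_a, var_a = a
    _, mu_b, var_b = b
    d = 0.0
    for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b):
        d += 0.5 * (va / vb + vb / va - 2.0)
        d += 0.5 * (ma - mb) ** 2 * (1.0 / va + 1.0 / vb)
    return d

def prune_overlap(speech, noise, threshold):
    """S2-S3: drop components whose nearest component in the other model
    is closer than `threshold`, then renormalize the remaining weights."""
    def keep(own, other):
        kept = [c for c in own
                if min(component_distance(c, o) for o in other) >= threshold]
        if not kept:
            raise ValueError("threshold removed every component")
        total = sum(w for w, _, _ in kept)
        return [(w / total, mu, var) for w, mu, var in kept]
    return keep(speech, noise), keep(noise, speech)

def log_likelihood(gmm, frames):
    """S4: average per-frame log-likelihood under a diagonal GMM."""
    total = 0.0
    for x in frames:
        logs = []
        for w, mu, var in gmm:
            lp = math.log(w)
            for xi, m, v in zip(x, mu, var):
                lp -= 0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            logs.append(lp)
        peak = max(logs)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total / len(frames)

def feature_vector(speech, noise, frames, snr_db):
    """S5 input: [speech log-likelihood, noise log-likelihood, estimated SNR]."""
    return [log_likelihood(speech, frames), log_likelihood(noise, frames), snr_db]

# Toy one-dimensional models; the second speech component and the first
# noise component nearly coincide, so both fall in the overlap space.
toy_speech = [(0.5, [0.0], [1.0]), (0.5, [5.0], [1.0])]
toy_noise = [(0.5, [5.2], [1.0]), (0.5, [10.0], [1.0])]
pruned_speech, pruned_noise = prune_overlap(toy_speech, toy_noise, threshold=1.0)
toy_features = feature_vector(pruned_speech, pruned_noise, [[0.1], [-0.2]], snr_db=20.0)
```

Pruning shrinks both models, so each frame is scored against fewer components than in the original GMMs, which is where the reduction in computation would come from in this sketch.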

Description

Technical field

[0001] The invention relates to the technical field of speech signal processing, and in particular to an audio type detection method based on bipolar modeling of pure speech and background noise.

Background art

[0002] Audio type detection refers to distinguishing audio types by the characteristic features of each type. The most commonly used techniques at present are the GMM (Gaussian Mixture Model) and the HMM (Hidden Markov Model). In recent years, the combined GMM-SVM method has appeared: the GMM is used to construct supervectors as features, and an SVM (Support Vector Machine) performs soft classification, with good results. The computational cost of GMM-SVM is determined by the number of Gaussian mixture components of the GMM, the supervector dimension used, and the SVM kernel function. The supervector dimension is generally the feature dimension or the number of Gaussian mixture components used by the GMM, and t...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/06, G10L15/08, G10L15/14, G10L15/20, G10L21/0216, G10L21/0264
Inventor: 贺前华, 李洪滔, 蔡梓文
Owner: SOUTH CHINA UNIV OF TECH