Audio Type Detection Method Based on Bipolar Modeling of Pure Speech and Background Noise

A technique for modeling background noise and pure speech, applicable to speech analysis, speech recognition, instruments, and related fields, which addresses problems such as a large amount of computation.

Active Publication Date: 2019-07-16

SOUTH CHINA UNIV OF TECH

6 Cites, 0 Cited by

Problems solved by technology

GMM-SVM is computationally expensive, so applications that require real-time processing place performance demands on the equipment.


Examples

Embodiment

[0038] Figure 1 is a flow chart of the generation of the bipolar background noise and pure speech models and of classifier training in the present invention. The method comprises the following steps:

[0039] (1) Construction of the pure speech and pure background noise models: using sufficient, suitable training data, train a pure speech model GMM_s with N Gaussian mixture components and a background noise model GMM_n with M Gaussian mixture components.
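As an illustration of how a frame is scored against two such models, the following is a minimal sketch, not the patent's implementation. It assumes diagonal covariance matrices, which the text does not specify, and uses toy two-component models in place of trained ones. The function names and all parameter values are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, weights, means, variances):
    """Log p(x) under a GMM, using log-sum-exp for numerical stability."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# Toy stand-ins for GMM_s (N = 2) and GMM_n (M = 2) over 2-D features.
# Each model is (weights, means, variances); the values are made up.
gmm_s = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
gmm_n = ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]])

frame = [0.2, -0.1]  # hypothetical feature vector for one audio frame
ll_speech = gmm_log_likelihood(frame, *gmm_s)
ll_noise = gmm_log_likelihood(frame, *gmm_n)
```

Because the toy frame lies near the speech means, ll_speech exceeds ll_noise. The work per frame grows with the number of components, which is why N and M matter for computation.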

[0040] In this embodiment, the pure speech model has 256 Gaussian mixture components. The GMM is built from pure speech covering as many speakers and as much varied language content as possible: there are at least 20 speakers, with the male-to-female ratio kept as balanced as possible. The language content should also be diverse and, for completeness, should contain all basic phonetic units.

[0041] The background noise model has 512 Gaussian mixture components, and uses as many backgrou...


Abstract

The present invention provides an audio type detection method based on bipolar modeling of pure speech and background noise, comprising the steps of: S1, constructing a pure speech GMM and a pure background noise GMM; S2, using a distance measure to judge whether each Gaussian mixture component lies in the feature-overlap space; S3, removing the Gaussian mixture components that lie in the feature-overlap space and rebuilding the pure speech statistical model and the pure background noise statistical model; S4, calculating the probabilities of the audio sample under the new pure speech and pure background noise statistical models, and the estimated signal-to-noise ratio of the audio sample; S5, constructing a feature vector from the calculated probabilities and the estimated signal-to-noise ratio and using an SVM model to classify the sample as pure speech, background noise, or noisy speech. The invention can effectively distinguish pure speech, pure background noise, and noisy speech while reducing the computational cost of GMM-SVM.
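The distance test and the pruning step S3 can be sketched as follows. This is a hedged illustration only. The abstract does not name the distance measure or the decision rule, so the sketch assumes Euclidean distance between component means and a fixed threshold, both of which are hypothetical choices. The function names and the toy models are also made up.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two mean vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def prune_overlap(model, other_means, threshold):
    """Drop components whose mean lies within `threshold` of any mean in
    the other model, then renormalize the remaining weights to sum to 1."""
    kept = [
        (w, m, v)
        for w, m, v in zip(*model)
        if min(euclidean(m, o) for o in other_means) >= threshold
    ]
    if not kept:
        raise ValueError("threshold removed every component")
    total = sum(w for w, _, _ in kept)
    weights = [w / total for w, _, _ in kept]
    means = [m for _, m, _ in kept]
    variances = [v for _, _, v in kept]
    return weights, means, variances

# Toy models as (weights, means, variances); one speech component and
# one noise component sit close together in feature space.
speech = ([0.4, 0.6], [[0.0, 0.0], [3.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]])
noise = ([0.5, 0.5], [[3.2, 3.1], [8.0, 8.0]], [[1.0, 1.0], [1.0, 1.0]])

# Both models are pruned against the ORIGINAL means of the other model.
new_speech = prune_overlap(speech, noise[1], threshold=1.0)
new_noise = prune_overlap(noise, speech[1], threshold=1.0)
```

In this toy case each model loses its overlapping component and keeps one, so the rebuilt models describe only the regions of feature space that are distinctive for speech or for noise.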

Description

Technical field

[0001] The invention relates to the technical field of speech signal processing, in particular to an audio type detection method based on bipolar modeling of pure speech and background noise.

Background art

[0002] Audio type detection distinguishes audio types by the characteristic features of each type. At present, the most commonly used techniques are the GMM (Gaussian Mixture Model) and the HMM (Hidden Markov Model). In recent years, the combined GMM-SVM method has appeared: it uses the GMM to construct supervectors as features and an SVM (Support Vector Machine) for soft classification, and it has achieved good results. The computational cost of GMM-SVM is determined by the number of Gaussian mixture components in the GMM, the supervector dimension, and the SVM kernel function. The supervector dimension is generally the feature dimension or the number of Gaussian mixture components used by the GMM, and t...
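As a rough illustration of how the number of Gaussian mixture components drives scoring cost, the sketch below counts one multiply-add per frame, per component, per feature dimension. This cost model is an assumption made for illustration, not a figure from the patent. The frame count, the feature dimension, and the post-pruning component counts are all hypothetical.

```python
def gmm_eval_cost(num_frames, num_components, feature_dim):
    """Rough multiply-add count for scoring frames against a diagonal GMM."""
    return num_frames * num_components * feature_dim

# 256 and 512 are the component counts given in the embodiment;
# 100 frames and 39-dimensional features are hypothetical.
full_cost = gmm_eval_cost(100, 256, 39) + gmm_eval_cost(100, 512, 39)

# Hypothetical component counts after removing overlapping components.
pruned_cost = gmm_eval_cost(100, 200, 39) + gmm_eval_cost(100, 400, 39)

print(full_cost, pruned_cost)  # prints "2995200 2340000"
```

Under this model the cost is linear in the total number of components, so any component removed from either model reduces the per-frame work proportionally.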


Application Information


Patent Type & Authority: Patents (China)

IPC (8): G10L15/06, G10L15/08, G10L15/14, G10L15/20, G10L21/0216, G10L21/0264

Inventor: 贺前华, 李洪滔, 蔡梓文

Owner: SOUTH CHINA UNIV OF TECH
