A method of double tone detection in a continuous speech stream
A detection method for voice streams, applied in voice analysis, voice recognition, instruments, etc. It addresses the problem that system detection accuracy does not meet expectations, improving system robustness and detection performance and reducing false alarms.
Embodiment
[0056] 1. Structure and state space of the HMM
[0057] Speech and double tone are each modeled by a chain of three states. For each state chain, a GMM (Gaussian Mixture Model) describes the acoustic mapping from state to observation. For speech, a GMM of 256 Gaussians describes its acoustic variation; for double tone, a GMM of 64 Gaussians is used. To control jumps of the state chain between speech and double tone, a penalty term is introduced. Adjusting this penalty term trades off system detection accuracy against recall.
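The role of the penalty term can be illustrated with a minimal Viterbi decoding sketch over two three-state chains. This is not the patent's implementation: the transition topology (self-loop, step forward, re-entry from the last state), the function names, and the assumption that per-frame state log-likelihoods are already available (for example from the GMMs) are all illustrative choices. The second class is labeled `double_tone` after the title.

```python
NEG_INF = float("-inf")
N_PER_CHAIN = 3                       # three states per class, as in the text
CLASSES = ("speech", "double_tone")   # label names are illustrative


def build_transitions(penalty):
    """Log-domain transition scores for two 3-state left-to-right chains.

    Assumed topology: self-loop, step to the next state, and from the last
    state of a chain either re-enter the same chain (score 0) or enter the
    other chain (score -penalty).
    """
    n = N_PER_CHAIN * len(CLASSES)
    trans = [[NEG_INF] * n for _ in range(n)]
    for c in range(len(CLASSES)):
        base = c * N_PER_CHAIN
        for s in range(N_PER_CHAIN):
            i = base + s
            trans[i][i] = 0.0                  # stay in the state
            if s + 1 < N_PER_CHAIN:
                trans[i][i + 1] = 0.0          # advance within the chain
        last = base + N_PER_CHAIN - 1
        trans[last][base] = 0.0                # re-enter the same class
        for other in range(len(CLASSES)):
            if other != c:
                trans[last][other * N_PER_CHAIN] = -penalty  # switch class
    return trans


def viterbi_labels(loglik, penalty):
    """Return one class label per frame.

    loglik[t][j] is the log-likelihood of frame t under state j
    (states 0-2: speech, states 3-5: double tone).
    """
    trans = build_transitions(penalty)
    n = len(trans)
    starts = {c * N_PER_CHAIN for c in range(len(CLASSES))}
    score = [loglik[0][j] if j in starts else NEG_INF for j in range(n)]
    back = []
    for t in range(1, len(loglik)):
        prev, score, ptr = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + trans[i][j])
            score.append(prev[best] + trans[best][j] + loglik[t][j])
            ptr.append(best)
        back.append(ptr)
    # Backtrace from the best final state.
    state = max(range(n), key=lambda j: score[j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [CLASSES[s // N_PER_CHAIN] for s in path]
```

In this sketch a larger `penalty` makes class switches rarer, so short spurious detections are suppressed at the cost of possibly missing brief events; a smaller value has the opposite effect.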
[0058] 2. Feature form
[0059] Features differ in robustness and expressive power across scales. Feature parameters are computed at four scales and denoted MLpR1, MLpR2, MLpR3 and MLpR4.
[0060] MLpR1 is computed by conventional short-time Fourier analysis, with a frame length of 20 ms and a frame shift of 10 ms; the FFT adopts 1024 ...
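The framing step of the short-time Fourier analysis can be sketched as follows. Only the 20 ms frame length and 10 ms frame shift come from the text. The 8 kHz sampling rate, the Hamming window, and the reading of "1024" as the FFT size in points are assumptions, since the excerpt is cut off at that point. How MLpR1 is derived from the resulting spectra is not shown in the excerpt, so the sketch stops at the per-frame power spectrum.

```python
import cmath
import math

SAMPLE_RATE = 8000   # assumption: not stated in the text
FRAME_MS = 20        # frame length from the text
SHIFT_MS = 10        # frame shift from the text
N_FFT = 1024         # assumption: "1024" read as the FFT size in points


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectra(signal):
    """Split the signal into overlapping frames; return one power spectrum per frame."""
    frame_len = SAMPLE_RATE * FRAME_MS // 1000   # 160 samples at 8 kHz
    shift = SAMPLE_RATE * SHIFT_MS // 1000       # 80 samples at 8 kHz
    # Hamming window (assumption; the text does not name a window).
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, shift):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        padded = frame + [0.0] * (N_FFT - frame_len)   # zero-pad to the FFT size
        bins = fft(padded)[: N_FFT // 2 + 1]           # keep non-negative frequencies
        spectra.append([abs(b) ** 2 for b in bins])
    return spectra
```

Under these assumptions each frame holds 160 samples, consecutive frames overlap by half, and each spectrum has 513 bins spaced 7.8125 Hz apart.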