Frequency-domain blind deconvolution method for voice signal

A frequency-domain blind deconvolution technology for speech signals, applied in speech analysis, speech recognition, instruments, etc. It addresses problems such as low robustness, slow algorithm search speed, and large cross-interference between separated signals.

Inactive Publication Date: 2012-10-31
HEFEI UNIV OF TECH
0 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

[0006] 1) Most existing algorithms are derived under restrictive conditions; their separation effect is not ideal, the cross-interference between the separated signals is large, and their robustness is low.
[0007] 2) In the process of human-comput

Examples

Embodiment Construction

[0047] Referring to Figure 1, the frequency-domain blind deconvolution method for voice signals transforms the time-domain convolutively mixed voice signal into the frequency domain for blind separation. It comprises the following steps:

[0048] 1) Adaptively frame the original audio file; at a sampling frequency of 16 kHz, the frame length is 16 ms and the frame shift is 2 ms;

[0049] 2) Apply a Fourier transform to each frame, which turns the convolutive mixing model into a linear mixing model. The convolutive mixing model can be expressed as

[0050] x(t) = H ⊗ s(t)    (1), where ⊗ denotes convolution

[0051] The short-time Fourier transform of the signal can be expressed as

[0052] X ( ω , t ...
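Steps 1) and 2) above can be sketched in Python with NumPy. This is a minimal illustration, not the patented implementation: it uses fixed-length framing rather than the adaptive framing the text mentions, and the function names `frame_signal` and `stft`, as well as the choice of a Hann window, are assumptions introduced here. At 16 kHz, a 16 ms frame is 256 samples and a 2 ms shift is 32 samples.

```python
import numpy as np

def frame_signal(x, fs=16000, frame_ms=16, shift_ms=2):
    """Cut a 1-D signal into overlapping frames (fixed framing, an assumption)."""
    frame_len = fs * frame_ms // 1000   # 256 samples at 16 kHz
    shift = fs * shift_ms // 1000       # 32 samples at 16 kHz
    n_frames = 1 + (len(x) - frame_len) // shift
    # Row i holds samples [i*shift, i*shift + frame_len)
    idx = np.arange(frame_len)[None, :] + shift * np.arange(n_frames)[:, None]
    return x[idx]

def stft(x, fs=16000, frame_ms=16, shift_ms=2):
    """Windowed Fourier transform of each frame; rows are frames, columns are frequency bins."""
    frames = frame_signal(x, fs, frame_ms, shift_ms)
    window = np.hanning(frames.shape[1])  # Hann window, chosen for illustration
    return np.fft.rfft(frames * window, axis=1)
```

Each column of the returned array is the time series of one frequency bin. Under the usual narrowband assumption, the mixing in each bin is treated as instantaneous, so a separate linear separation problem can be posed per bin.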

Abstract

The invention discloses a frequency-domain blind deconvolution method for voice signals, in which a time-domain convolutively mixed voice signal is transformed into the frequency domain and then blindly separated. Exploiting the short-time stationarity of the voice signal, a windowed Fourier transform converts the time-domain convolutive mixture into a frequency-domain linear instantaneous mixture model. After pre-processing such as filtering and whitening in the frequency domain, segmented blind separation of the voice signal is performed by approximate joint diagonalization of correlation matrices at different time delays. After the ambiguity of the blind separation is resolved, the separated signals are recombined segment by segment in the time domain via the inverse Fourier transform. The method achieves a good separation effect for 2×2 mixed voice signals recorded in real time, and can effectively improve the voice recognition accuracy of a human-computer interaction system in an environment with interfering speech from other people.
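The whitening step and the correlation matrices at different time delays can be sketched for a single frequency bin as follows. This is a hedged illustration under stated assumptions: the function names `whiten` and `delayed_correlations` and the delay set `(1, 2, 3, 4)` are hypothetical, and the approximate joint diagonalization itself is not implemented here. The input `X` is assumed to be a complex array of shape (channels, frames) holding one bin of the mixture.

```python
import numpy as np

def whiten(X):
    """Whiten one frequency bin so its zero-lag covariance becomes the identity."""
    X = X - X.mean(axis=1, keepdims=True)        # remove the mean of each channel
    R0 = X @ X.conj().T / X.shape[1]             # zero-lag covariance (Hermitian)
    eigval, eigvec = np.linalg.eigh(R0)          # real eigenvalues, unitary eigenvectors
    W = np.diag(eigval ** -0.5) @ eigvec.conj().T  # whitening matrix
    return W @ X, W

def delayed_correlations(Z, delays=(1, 2, 3, 4)):
    """Correlation matrices of the whitened data at several frame delays."""
    T = Z.shape[1]
    mats = []
    for tau in delays:
        # Correlate frame t with frame t - tau, averaged over the valid frames
        mats.append(Z[:, tau:] @ Z[:, :T - tau].conj().T / (T - tau))
    return mats
```

After whitening, the remaining unmixing matrix is unitary, so the separation reduces to finding one unitary matrix that makes all the delayed correlation matrices as close to diagonal as possible. That search is the joint diagonalization step described in the abstract.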

Description

Technical field

[0001] The invention belongs to the field of voice signal extraction and recognition in multimedia information processing, and in particular relates to a frequency-domain blind deconvolution method for voice signals, which can be applied in human-computer interaction scenarios to improve the recognition rate.

Background

[0002] After more than 60 years of development, automatic speech recognition has reached a recognition rate above 95% in noise-free and interference-free environments. In practical application environments, however, especially when two or more speakers talk at the same time, the recognition rate drops sharply, which greatly limits the application of this technology in Human-Machine Interaction (HMI). The human auditory system can pick out the information of interest in a noisy environment, but it is difficult for a robot in a human-computer interaction environment to have this ability. Blind...

Application Information

IPC(8): G10L15/18
Inventor: 丁志中, 黄玉雷, 戴礼荣, 陈小平
Owner HEFEI UNIV OF TECH