Bi-ear time delay estimating method based on frequency division and improved generalized cross correlation

A time delay estimation technique based on generalized cross-correlation, applied in the field of sound source localization. It addresses problems such as increased sound source localization errors, decreased sound clarity, and obstacles to the widespread use of human-computer voice interaction technology.

Active Publication Date: 2017-12-15
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Speech collected in a closed home environment often carries interference: noise from the surrounding environment, room reverberation, and other sound sources. The existence of these interferences reduc



Embodiment Construction

[0085] The technical solutions in the embodiments of the present invention will be described clearly and in detail below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some of the embodiments of the invention.

[0086] The technical solution by which the present invention solves the above technical problems is as follows:

[0087] Reverberation affects different frequency components of speech differently, so applying the same processing to every frequency component of the sound source signal causes localization errors. To address this, a binaural time delay estimation algorithm based on frequency division and improved generalized cross-correlation is proposed. To avoid processing every frequency component of speech identically, the reverberant speech is divided into frequency components using the frequency division characteristics of the Gammatone filter bank, and independe...


Abstract

The invention provides a binaural time delay estimation method for reverberant environments, based on frequency division and improved generalized cross-correlation, and relates to the field of sound source localization. A Gammatone filter bank, which effectively simulates the characteristics of the basilar membrane of the human ear, divides the binaural speech signals into frequency sub-bands. Each sub-band signal undergoes dereverberation by cepstral pre-filtering and is then transformed back to the time domain. A generalized cross-correlation operation is performed on each pair of left-ear and right-ear sub-band signals, using an improved phase transform weighting function. The cross-correlation values of all sub-bands are summed, and the interaural time difference corresponding to the maximum of the summed cross-correlation is taken as the estimate. Compared with the conventional generalized cross-correlation delay estimation method, the method estimates the time delay more accurately in a reverberant environment and gives the sound source localization system better robustness.
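
The sub-band pipeline in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: it assumes NumPy, uses a 4th-order FIR Gammatone impulse response with ERB-based bandwidth, picks hypothetical center frequencies, and substitutes the standard PHAT weighting for the improved weighting function, whose form is not given in this text. The cepstral pre-filtering step is omitted. All function names are illustrative.

```python
import numpy as np


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def gammatone_ir(fc, fs, order=4, duration=0.025):
    """FIR approximation of a Gammatone impulse response centered at fc."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))  # unit-energy normalization


def gcc_phat(x, y, max_lag):
    """GCC with standard PHAT weighting (stand-in for the improved weight).

    Returns correlation values for lags -max_lag..max_lag; a positive lag
    means y is delayed relative to x.
    """
    n = len(x) + len(y)  # zero-pad so circular correlation equals linear
    X = np.fft.rfft(x, n)
    Y = np.fft.rfft(y, n)
    cross = np.conj(X) * Y
    cross = cross / (np.abs(cross) + 1e-12)  # keep phase, discard magnitude
    cc = np.fft.irfft(cross, n)
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))


def estimate_itd(left, right, fs, center_freqs, max_lag):
    """Sum sub-band GCC curves and return (lag in samples, delay in seconds)."""
    total = np.zeros(2 * max_lag + 1)
    for fc in center_freqs:
        g = gammatone_ir(fc, fs)
        # The cepstral dereverberation step would act on each sub-band here.
        total += gcc_phat(np.convolve(left, g), np.convolve(right, g), max_lag)
    lag = int(np.argmax(total)) - max_lag
    return lag, lag / fs


# Synthetic check: the right channel is the left channel delayed by 5 samples.
rng = np.random.default_rng(0)
fs = 16000
src = rng.standard_normal(4096)
delay = 5
left = np.concatenate((src, np.zeros(delay)))
right = np.concatenate((np.zeros(delay), src))
freqs = np.geomspace(200.0, 4000.0, 8)  # hypothetical center frequencies
lag, tau = estimate_itd(left, right, fs, freqs, max_lag=16)
print(lag, tau)
```

The synthetic signal contains no reverberation, so the check only exercises the plumbing (filter bank, per-band correlation, summation, peak picking). It says nothing about the accuracy gain claimed for reverberant rooms.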

Description

Technical field

[0001] The invention belongs to the field of sound source localization, and in particular relates to a binaural time delay estimation method based on frequency division and improved generalized cross-correlation.

Background technique

[0002] With the progress of human society, people place ever higher demands on the human-computer interaction performance of machines. What human-computer interaction really needs is tighter coupling between humans and machines or computers, and comprehensive, intuitive communication, rather than merely a better-designed surface for the interactive interface. Increasing communication between humans and machines requires localization and tracking of sound sources. Automatic camera tracking for video and audio applications, microphone array beamforming for noise and reverberation suppression, long-distance speech recognition and robotic audio systems are sources of speech An example...


Application Information

IPC(8): G01S5/22, G10L21/0208
CPC: G01S5/22, G10L21/0208, G10L2021/02082, G10L2021/02087
Inventor: 胡章芳, 乐聪聪, 罗元, 张毅, 刘宇
Owner: CHONGQING UNIV OF POSTS & TELECOMM