Sound source direction positioning method and device, voice equipment and voice system

A sound source direction positioning technology, applied in the voice field, that addresses problems such as poor anti-interference ability and poor robustness of sound source positioning algorithms.

Pending Publication Date: 2020-04-17

ALIBABA GRP HLDG LTD

Cites: 6. Cited by: 17.

Problems solved by technology

[0004] However, the TDOA positioning algorithm performs integral summation over all frequency bands equally. Since reverberation and noise inevitably exist in real speech environments, this integral summation does not distinguish between target and non-target signals, so the sound source localization process is disturbed by reverberation and noise. The result is poor robustness of the sound source localization algorithm, low localization accuracy, and poor anti-interference ability.

Examples

Embodiment 1

[0094] Embodiment 1 of the present invention provides a method for locating the direction of a sound source. Its flow is shown in Figure 1, and a block diagram of its implementation is shown in Figure 2. The method includes the following steps:

[0095] S11. Perform beamforming filtering in a specified direction on the audio signal collected by the microphone array to obtain output signals in each specified direction, wherein the specified direction is determined according to the center point of the area of the microphone array.

[0096] In this step, the microphones in the microphone array collect audio signals synchronously, and a Fourier transform is applied to the collected audio signals to obtain frequency-domain signals. For each specified direction, the signals collected by the microphones are filtered by a beamforming filter to enhance the audio signal from that direction and suppress th...
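The beamforming step can be illustrated with a minimal sketch. It assumes a frequency-domain delay-and-sum beamformer, which the text does not specify, and it assumes that the weight of each specified direction is the normalized energy of that direction's output. The function names `delay_and_sum` and `direction_weights` and the data layout are hypothetical, chosen only for illustration, and the code uses only the Python standard library.

```python
import cmath
import math


def delay_and_sum(spectra, freqs_hz, delays_s):
    """Steer one frame toward a direction (assumed delay-and-sum design).

    spectra[m][k] is frequency bin k of microphone m, freqs_hz[k] is the
    frequency of bin k, and delays_s[m] is the arrival delay (seconds) at
    microphone m for a source in the steered direction.
    """
    num_mics = len(spectra)
    output = []
    for k, freq in enumerate(freqs_hz):
        # Undo each microphone's delay so that sound arriving from the
        # steered direction adds coherently; other directions add less.
        acc = sum(
            spectra[m][k] * cmath.exp(2j * math.pi * freq * delays_s[m])
            for m in range(num_mics)
        )
        output.append(acc / num_mics)
    return output


def direction_weights(spectra, freqs_hz, delays_per_direction):
    """Normalized output energy per specified direction (assumed weighting)."""
    energies = [
        sum(abs(y) ** 2 for y in delay_and_sum(spectra, freqs_hz, delays))
        for delays in delays_per_direction
    ]
    total = sum(energies)
    if total == 0:
        # Silent frame: fall back to equal weights.
        return [1.0 / len(energies)] * len(energies)
    return [e / total for e in energies]
```

With two microphones and a source whose wavefront reaches the second microphone 1 ms late, steering with delays `[0.0, 1e-3]` should yield a larger weight than steering with the mismatched delays `[0.0, -1e-3]`.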

Embodiment 2

[0117] Embodiment 2 of the present invention provides a specific implementation of the above-mentioned sound source direction localization method, in which the sound source localization calculation is based on the GCC-PHAT algorithm. The flow is shown in Figure 3 and includes the following steps:

[0118] S21. Perform frequency domain conversion on the audio signals synchronously collected by the microphones in the microphone array to obtain corresponding frequency domain audio signals.

[0119] In this embodiment, a circular microphone array is taken as an example. As shown in Figure 4, starting from the 0-degree direction, M microphones are placed along a circle of radius r.
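The circular geometry can be written down as a short sketch. It assumes the M microphones are evenly spaced (the text only says they are placed along the circumference starting at 0 degrees), a far-field source, and a sound speed of 343 m/s. The names `circular_array_positions` and `pair_tdoa` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def circular_array_positions(num_mics, radius):
    """(x, y) of M microphones on a circle of radius r, the first at 0 degrees.

    Even angular spacing is an assumption made for this sketch.
    """
    return [
        (
            radius * math.cos(2 * math.pi * m / num_mics),
            radius * math.sin(2 * math.pi * m / num_mics),
        )
        for m in range(num_mics)
    ]


def pair_tdoa(pos_a, pos_b, azimuth_deg):
    """Far-field arrival-time difference t_a - t_b in seconds.

    A negative value means microphone a hears the wavefront first.
    """
    theta = math.radians(azimuth_deg)
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector toward the source
    # The microphone with the larger projection onto the source direction
    # is closer to the source and therefore receives the wavefront earlier.
    proj_a = pos_a[0] * ux + pos_a[1] * uy
    proj_b = pos_b[0] * ux + pos_b[1] * uy
    return (proj_b - proj_a) / SPEED_OF_SOUND
```

For M = 4 and r = 0.05 m, microphones 0 and 2 sit on opposite ends of the 0-degree axis, so a source at 0 degrees gives the largest delay magnitude for that pair (2r divided by the sound speed), while a source at 90 degrees gives a delay of essentially zero.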

[0120] The microphones in the microphone array collect audio signals synchronously, obtain and output time-domain audio signals, perform short-time Fourier transform on the time-domain audio signals, and obtain frequency-domain audio...
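Because this embodiment builds on GCC-PHAT, a minimal sketch of the standard GCC-PHAT delay estimate for one microphone pair and one frame may help. It is the textbook form, not the patent's weighted variant. It uses a naive DFT from the Python standard library to stay self-contained (a real system would use an FFT), and the names `dft`, `gcc_phat` and `estimate_delay` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform, adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def gcc_phat(sig_a, sig_b):
    """PHAT-weighted circular cross-correlation; index is the lag modulo n."""
    n = len(sig_a)
    spec_a, spec_b = dft(sig_a), dft(sig_b)
    cross = [a * b.conjugate() for a, b in zip(spec_a, spec_b)]
    # PHAT weighting: keep only the phase of the cross-spectrum.
    phat = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    # Inverse DFT back to the lag domain.
    return [
        sum(phat[k] * cmath.exp(2j * math.pi * k * lag / n) for k in range(n)).real / n
        for lag in range(n)
    ]


def estimate_delay(sig_a, sig_b):
    """Delay of sig_a relative to sig_b in samples (positive: a lags b)."""
    corr = gcc_phat(sig_a, sig_b)
    n = len(corr)
    best = max(range(n), key=lambda lag: corr[lag])
    # Lags above n/2 wrap around and represent negative delays.
    return best if best <= n // 2 else best - n
```

Dividing by the magnitude of the cross-spectrum gives every frequency bin equal weight, which is the equal treatment of all bands that the weighting described in this document is meant to improve on.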

Embodiment 3

[0145] Embodiment 3 of the present invention provides another specific implementation of the sound source direction localization method. The difference from Embodiment 2 is that, after the cross-correlation is calculated for each pair of microphones, the cross-correlations are weighted and summed to obtain a comprehensive cross-correlation across the multiple microphones, and positioning is based on this comprehensive cross-correlation. This method is an improved weighted SRP-PHAT algorithm, which can perform sound source localization for an array of multiple microphones. The flow is shown in Figure 5 and includes the following steps:

[0146] S31. Perform frequency domain conversion on the audio signals synchronously collected by the microphones in the microphone array to obtain corresponding frequency domain audio signals.

[0147] For specific steps, refer to the description of step S21, which will not be repeated here.

[0148] S32. Using pre-constructe...
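The weighted summation described for this embodiment can be sketched as follows. This is an illustration under assumptions, not the patent's exact procedure: the pairwise PHAT cross-correlations are taken as already evaluated at each candidate direction, the specified-direction areas are assumed to be equal sectors centered on evenly spaced specified directions, and the names `sector_of` and `weighted_srp` are hypothetical.

```python
def sector_of(azimuth_deg, num_sectors):
    """Index of the equal-width sector centered on k * (360 / num_sectors).

    Equal sectors centered on the specified directions are an assumption.
    """
    width = 360.0 / num_sectors
    index = int(((azimuth_deg + width / 2.0) % 360.0) // width)
    return min(index, num_sectors - 1)  # guard against float rounding at 360


def weighted_srp(pair_corr, candidate_azimuths_deg, sector_weights):
    """Pick the candidate direction with the largest weighted summed correlation.

    pair_corr[p][c] is the PHAT cross-correlation of microphone pair p,
    evaluated at the delay expected for candidate direction c.
    """
    scores = []
    for c, azimuth in enumerate(candidate_azimuths_deg):
        weight = sector_weights[sector_of(azimuth, len(sector_weights))]
        # Sum over all microphone pairs, scaled by the weight of the
        # specified-direction area that contains this candidate.
        scores.append(weight * sum(corr[c] for corr in pair_corr))
    best = max(range(len(scores)), key=scores.__getitem__)
    return candidate_azimuths_deg[best], scores
```

In a case where the unweighted sum peaks at a reflection from 90 degrees, a large beamformer-derived weight for the 0-degree sector can move the maximum back to 0 degrees, which is the kind of interference suppression the weighting is intended to provide.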

Abstract

The invention discloses a sound source direction positioning method, device and equipment. The method comprises: performing beamforming filtering in each specified direction on the audio signals acquired by a microphone array to obtain an output signal for each specified direction; determining the weight of each specified direction according to the signal energy of that direction's output signal; and, using a selected time delay estimation algorithm, determining the cross-correlation of the audio signals collected by each pair of microphones in the microphone array at an imaginary sound source position, then taking the imaginary sound source position whose cross-correlation meets a set requirement as the real sound source position. When the cross-correlation is determined, the corresponding weight is selected for the weighted calculation according to the specified-direction area in which the imaginary sound source position is located. The method improves the robustness of sound source positioning, achieves high positioning accuracy, and has strong anti-interference capability.

Description

Technical field

[0001] The present invention relates to the technical field of speech, in particular to a sound source direction localization method and device, speech equipment and system.

Background

[0002] With the development of artificial intelligence, intelligent voice devices such as smart speakers, intelligent conference pickup equipment and intelligent robots are more and more widely used in daily life. Intelligent voice devices capture the voices of surrounding speakers, and to obtain a clear voice it is necessary to locate the direction of the sound. An intelligent voice device equipped with a ring microphone can process the audio signal received by the ring microphone through a built-in sound source localization algorithm, so as to track the orientation of the speaker relative to the microphone while the speaker is talking, that is, the orientation of the sound source.

[0003] Currently commonly used DOA Estimation algorithms...

Application Information

IPC(8): G01S5/20

CPC: G01S5/20

Inventor: 黄伟隆

Owner: ALIBABA GRP HLDG LTD
