An end-to-end sound source localization method and system based on multi-task learning

A multi-task learning and sound source localization technology, applied to end-to-end sound source localization methods and systems, that addresses problems such as poor localization performance and insufficient robustness.

Active Publication Date: 2020-11-20
PEKING UNIV
14 Cites, 0 Cited by
AI Technical Summary

Problems solved by technology

Existing neural-network methods are convenient to model: they rely on iterative learning over a large amount of data to train a classifier. However, they require hand-crafted features to be designed and screened in advance, and it is difficult for them to learn a general mapping (that is, a classifier) from features to sound source positions. As a result, their practical localization performance is relatively poor and their robustness is insufficient.

Examples

Embodiment Construction

[0059] The preferred embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Figure 1 shows the basic block diagram of the end-to-end sound source localization algorithm based on multi-task learning proposed by the present invention. The method comprises the following steps: calculating the delays, inputting the time-domain signals, compensating the delays, extracting features with CNNs, recovering the signals with a DNN, calculating the channel-to-channel coherence, and estimating the target sound source location. Each step is implemented as follows:

[0060] 1. Calculating the delay

[0061] In the scanning method, the positions to be scanned and the geometry of the microphone array are both known, so the time delays can be obtained directly from the known scanning positions and microphone positions, which is also know...
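The delay calculation described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes free-field propagation at a fixed speed of sound (343 m/s), and the names `propagation_delays` and `delays_in_samples` are hypothetical helpers introduced only for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not stated in the source


def propagation_delays(scan_positions, mic_positions, c=SPEED_OF_SOUND):
    """Return delays in seconds with shape (num_scan_positions, num_mics).

    Each entry is the straight-line distance from one scanned position
    to one microphone, divided by the speed of sound.
    """
    scan = np.asarray(scan_positions, dtype=float)  # (P, 3)
    mics = np.asarray(mic_positions, dtype=float)   # (M, 3)
    # Broadcast to (P, M, 3), then take the Euclidean norm over the last axis.
    dist = np.linalg.norm(scan[:, None, :] - mics[None, :, :], axis=-1)
    return dist / c


def delays_in_samples(delays, sample_rate):
    """Round delays in seconds to the nearest whole number of samples."""
    return np.rint(np.asarray(delays) * sample_rate).astype(int)


# Two microphones 10 cm apart and one scanned position 1 m along the x axis.
mics = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
scan = [[1.0, 0.0, 0.0]]
tau = propagation_delays(scan, mics)        # seconds, shape (1, 2)
tau_samples = delays_in_samples(tau, 16000)  # whole samples at 16 kHz
```

Because the scanning grid and array geometry are fixed, a table like `tau` only needs to be computed once and can be reused for every frame.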

Abstract

The invention discloses an end-to-end sound source positioning method and system based on multi-task learning. The method comprises the following steps: 1) for each sound source position to be scanned, calculating the delay of transmitting a sound signal from the sound source position to each microphone position; 2) performing corresponding delay compensation on the multi-channel frame-level time-domain signal acquired by each microphone when the microphone array scans each time according to the delay; 3) inputting each time-domain signal after delay compensation into a corresponding CNN model for feature extraction, and inputting the feature into a deep neural network; 4) estimating a multi-channel sound source signal of each scanning position by the deep neural network according to the feature extracted by each CNN model; and 5) for each scanning position, calculating the cross-correlation coefficient sum of the multi-channel sound source signals corresponding to the scanning position, and selecting the position with the maximum correlation coefficient sum as the sound source position. According to the invention, appropriate characteristics can be automatically extracted, a multi-task learning mechanism is introduced, and the positioning performance of the model is improved.
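Steps 2 and 5 of the abstract (delay compensation, then choosing the scanned position with the largest correlation coefficient sum) can be illustrated with the rough sketch below. Several things here are assumptions and not taken from the patent: the CNN and DNN of steps 3 and 4 are replaced by an identity placeholder (`recover`), delays are whole samples, and the "correlation coefficient sum" is taken to be the sum of pairwise Pearson coefficients. All function names are hypothetical.

```python
import numpy as np


def compensate(frames, delays_samples):
    """Advance each channel by its delay relative to the smallest delay.

    frames: array of shape (num_mics, num_samples).
    A source at the scanned position ends up time-aligned across channels.
    """
    rel = np.asarray(delays_samples) - np.min(delays_samples)
    n = frames.shape[1] - rel.max()  # common length after shifting
    return np.stack([ch[d:d + n] for ch, d in zip(frames, rel)])


def correlation_sum(signals):
    """Sum of Pearson correlation coefficients over all channel pairs."""
    r = np.corrcoef(signals)
    upper = np.triu_indices(r.shape[0], k=1)  # each pair counted once
    return r[upper].sum()


def localize(frames, candidate_delays, recover=lambda x: x):
    """Return (index of best scanned position, score per position).

    `recover` stands in for the learned CNN + DNN signal estimator;
    the default identity function is only a placeholder.
    """
    scores = np.array([
        correlation_sum(recover(compensate(frames, d)))
        for d in candidate_delays
    ])
    return int(np.argmax(scores)), scores


# Synthetic check: white noise reaching three microphones with known delays.
rng = np.random.default_rng(0)
src = rng.standard_normal(532)
true_delays = np.array([3, 7, 12])
frames = np.stack([src[20 - d: 20 - d + 512] for d in true_delays])
candidates = np.array([[0, 0, 0], [3, 7, 12], [5, 1, 9]])
best, scores = localize(frames, candidates)
```

In this synthetic case the second candidate matches the true delays, so its three channel pairs are perfectly correlated and it receives the highest score. In the described method the compensated signals would pass through the trained networks before the correlation is computed.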

Description

Technical field

[0001] The invention belongs to the technical field of array signal processing and relates to microphone arrays and sound source localization methods, in particular to an end-to-end sound source localization method and system based on multi-task learning.

Background

[0002] With the development of artificial intelligence technology, machine hearing has attracted widespread attention, and many related technologies and research fields have emerged one after another. Sound source localization is a basic and important technology in a machine hearing system. In essence, it imitates the function of human ears: it collects sound signals with a microphone array and then determines the position of the sound-emitting object. Sound source localization can be applied independently in many fields, such as video conferencing and vehicle whistle recognition, and can also provide basic location information for many techno...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G01S5/22
Inventors: 曲天书, 吴玺宏, 黄炎坤
Owner: PEKING UNIV