UNET structure-based microphone array voice source positioning method

A microphone array sound source localization technology, applied in fields such as neural learning methods, direction finders using ultrasonic/sonic/infrasonic waves, and neural architectures. It addresses reverberation and noise interference, avoiding their impact and achieving high robustness and strong feature learning ability.

Active Publication Date: 2021-01-26
南京南大电子智慧型服务机器人研究院有限公司 +2
Cites: 8 · Cited by: 1

AI Technical Summary

Problems solved by technology

In a large number of practical application scenarios, there is not only reverberation, but also noise interference. Most current methods cannot maintain high accuracy and robustness in such complex environments.

Embodiment Construction

[0039] The invention is further described below with reference to the accompanying drawings and specific embodiments. These embodiments only illustrate the invention and are not intended to limit its scope. After reading this disclosure, modifications of equivalent form made by those skilled in the art all fall within the scope defined by the appended claims of the present application.

[0040] A method for localizing speech sources with a microphone array based on the UNET structure, as shown in Figures 1 and 2, is suitable for high-interference, high-reverberation environments and can be applied to arrays of different shapes. It includes the following steps:

[0041] 1. Generate training samples, obtain time-frequency domain signals, and obtain power envelopes.

[0042] Arrange speech or interference sound sources in the simulation room, u...

Abstract

The invention discloses a UNET structure-based microphone array speech source localization method comprising the following steps: (1) generating training samples, obtaining time-frequency domain signals, and obtaining power envelopes; (2) for each time-frequency point of the time-frequency domain signal, calculating the corresponding speech energy proportion and direct-path speech energy proportion; (3) training a neural network with a multi-task UNET structure using the samples generated in step (1); (4) using the trained multi-task UNET network to predict the direct-sound speech energy proportion at each time-frequency point of the noisy signal under test; and (5) applying a localization method to the time-frequency points with a high direct-sound speech energy proportion to obtain the localization result. In high-reverberation, high-interference environments, the method effectively eliminates the influence of interference and reverberation and yields results with relatively high accuracy and robustness.
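Steps (2) and (5) can be illustrated with a minimal NumPy sketch. The excerpt does not give the exact formulas, so the definitions here are assumptions: the simulated signal is taken to be available as separate direct-path speech, reverberant tail, and interference components, and each proportion is a ratio of per-bin powers with cross terms ignored. The function names (`stft`, `energy_proportions`, `select_bins`) and the 0.5 threshold are hypothetical, not taken from the patent.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def energy_proportions(direct, reverb_tail, interference, eps=1e-12):
    """Per-bin speech and direct-path speech energy proportions.

    Assumed definitions (not from the source): powers of the three
    components are added, ignoring cross terms, to form the total.
    """
    p_direct = np.abs(stft(direct)) ** 2
    p_tail = np.abs(stft(reverb_tail)) ** 2
    p_interf = np.abs(stft(interference)) ** 2
    total = p_direct + p_tail + p_interf + eps
    speech_ratio = (p_direct + p_tail) / total
    direct_ratio = p_direct / total
    return speech_ratio, direct_ratio

def select_bins(direct_ratio, threshold=0.5):
    """Boolean mask of time-frequency points dominated by direct-path speech."""
    return direct_ratio > threshold

# Synthetic stand-ins for the simulated components (random noise, not speech).
rng = np.random.default_rng(0)
direct = rng.standard_normal(4096)
reverb_tail = 0.3 * rng.standard_normal(4096)
interference = 0.5 * rng.standard_normal(4096)

speech_ratio, direct_ratio = energy_proportions(direct, reverb_tail, interference)
selected = select_bins(direct_ratio, threshold=0.5)
```

In training, ratios like these would serve as per-bin targets for the network. At test time the network's predicted direct-path ratio would take the place of `direct_ratio`, since the separate components are not observable in a real recording.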

Description

Technical field

[0001] The invention relates to a speech source localization method based on a multi-task UNET structure that uses a microphone array in high-interference, high-reverberation environments, and belongs to the technical field of speech signal processing.

Background technique

[0002] The purpose of Speech Source Localization (SSL) is to estimate the angle at which the speech signal arrives at the microphone array (Direction-of-Arrival, DOA). Sound source localization, or DOA estimation, of speech signals using a microphone array is an important and active topic in acoustic signal processing. It plays a key role in sound capture in many application scenarios, such as human-computer voice interaction on smart devices, camera tracking, and intelligent monitoring. The difficulty is that the speech signal is a broadband, non-stationary random process, and a noise floor, reverberation, and other interfering sound sources are also present. ...
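To make DOA estimation concrete, here is a minimal sketch of one common approach, not necessarily the localization method used in the patent: a two-microphone, far-field, PHAT-normalized cross-spectrum search over candidate angles, with a per-bin weight so that only trusted time-frequency points contribute. The function name `estimate_doa`, the 5 cm spacing, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def estimate_doa(X1, X2, mask, fs, n_fft, mic_distance, c=343.0, n_angles=181):
    """Mask-weighted steered search for a two-microphone array.

    X1, X2: STFTs of shape (frames, bins); mask: per-bin weights in [0, 1].
    Convention: a positive angle means microphone 2 receives the wavefront
    later, with delay mic_distance * sin(angle) / c (far-field assumption).
    """
    freqs = np.arange(X1.shape[1]) * fs / n_fft
    cross = X1 * np.conj(X2)
    cross = cross / (np.abs(cross) + 1e-12)      # PHAT: keep phase only
    weighted = (mask * cross).sum(axis=0)        # sum over frames -> (bins,)
    angles = np.linspace(-90.0, 90.0, n_angles)
    delays = mic_distance * np.sin(np.deg2rad(angles)) / c
    # Undo the phase each candidate delay would cause, then sum over frequency.
    steering = np.exp(-2j * np.pi * delays[:, None] * freqs[None, :])
    response = np.real(steering @ weighted)
    return float(angles[np.argmax(response)])

# Synthetic check: a source at 30 degrees, with half the bins corrupted.
fs, n_fft, d = 16000, 512, 0.05
rng = np.random.default_rng(0)
shape = (20, n_fft // 2 + 1)
X1 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
freqs = np.arange(shape[1]) * fs / n_fft
true_delay = d * np.sin(np.deg2rad(30.0)) / 343.0
X2 = X1 * np.exp(-2j * np.pi * freqs * true_delay)

# Unreliable bins get unrelated content at microphone 2 and zero weight.
reliable = rng.random(shape) > 0.5
noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
X2 = np.where(reliable, X2, noise)
mask = reliable.astype(float)

est = estimate_doa(X1, X2, mask, fs, n_fft, d)
```

The mask is what ties this to the problem described above: bins dominated by reverberation or interference carry misleading phase, so giving them zero weight keeps them from biasing the angle estimate.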

Application Information

IPC(8): G01S3/80, G06N3/04, G06N3/08
CPC: G01S3/80, G06N3/08, G06N3/045
Inventor: 王浩, 卢晶, 刘晓峻, 狄敏, 姚志强
Owner 南京南大电子智慧型服务机器人研究院有限公司