Target sound source locking and extracting method

A target sound source technology, applied in the field of target sound source locking and extraction, that addresses three problems: noise reduction algorithms cannot distinguish the target sound source, blind separation cannot lock onto the target sound source, and large amounts of pure interference are left in the output.

Pending Publication Date: 2021-01-22
上海声瀚信息科技有限公司

AI Technical Summary

Problems solved by technology

When an interfering sound source has acoustic characteristics very similar to those of the human voice, a noise reduction algorithm cannot distinguish it from the target sound source.
A simple multi-channel BSS (blind source separation) algorithm can handle multiple sound sources, but because the separation is blind it cannot lock onto the target sound source. In a real voice interaction environment, reverberation and other signal distortion effects also leave a large number of pure interference segments in the output.

Embodiment Construction

[0097] The present invention is further described below with reference to the accompanying drawings and embodiments.

[0098] The overall execution flow of the system is shown in Figure 1, a block diagram of target sound source speech extraction.

[0099] The specific operation process is described in detail as follows:

[0100] 1. Apply sliding-window independent vector analysis based on an auxiliary function (AUX-IVA).

[0101] Step 1) Perform a short-time Fourier transform on the mixed signals x_m(n), 1 ≤ m ≤ M, which are collected by M microphones and contain M source signals, to obtain the frequency-domain representation x(ω,τ), where ω and τ are the frequency and time indexes respectively and the total number of frequency bands is K. Initialize the blind separation matrix W(ω,τ);
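
Step 1 can be illustrated with a minimal sketch in Python with NumPy. The function names, the Hann window, the frame length, the hop size, and the identity initialization of the separation matrix are all assumptions made for illustration; the source text does not specify them.

```python
import numpy as np

def stft_multichannel(x, frame_len=512, hop=256):
    """Short-time Fourier transform of an (M, N) multichannel signal.

    Returns a complex array of shape (K, T, M), where K = frame_len // 2 + 1
    frequency bands and T is the number of frames. Window and sizes are
    illustrative assumptions, not values taken from the patent.
    """
    x = np.asarray(x, dtype=float)
    M, N = x.shape
    if N < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    T = 1 + (N - frame_len) // hop
    K = frame_len // 2 + 1
    X = np.empty((K, T, M), dtype=complex)
    for t in range(T):
        seg = x[:, t * hop:t * hop + frame_len] * window  # (M, frame_len)
        X[:, t, :] = np.fft.rfft(seg, axis=1).T           # (K, M)
    return X

def init_demixing(K, M):
    """One M x M separation matrix per frequency band, set to identity
    (an assumed starting point)."""
    return np.tile(np.eye(M, dtype=complex), (K, 1, 1))
```

For example, two channels of 2048 samples with these defaults give 257 frequency bands and 7 frames, and `init_demixing(257, 2)` yields one 2 x 2 identity matrix per band.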

[0102] Step 2) Accumulate L_b frames of the frequency-domain representation X(ω,τ) of the mixed signal, and separate them using W(ω,τ) to obtain an estimated signal Y((ω,...
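
The separation in Step 2 amounts to a per-band matrix product between the separation matrix and each accumulated frame. The sketch below shows only that product, in Python with NumPy. The function name, the array layout, and the use of a single matrix per band for the whole block are assumptions, and the auxiliary-function update of W is not shown because the source text is truncated at this point.

```python
import numpy as np

def separate_block(W, X_block):
    """Apply per-band separation matrices to a block of frames.

    W:       (K, M, M) complex, one separation matrix per frequency band.
    X_block: (K, L_b, M) complex, L_b accumulated frames of the mixture.
    Returns  (K, L_b, M) complex, where Y[k, t] = W[k] @ X_block[k, t].
    """
    return np.einsum("kij,ktj->kti", W, X_block)
```

With identity matrices the output equals the input, and with a matrix that swaps the two rows the output channels are exchanged, which is a quick sanity check of the index convention.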

Abstract

The invention discloses a method for locking and extracting a target sound source. Multiple sound sources are first extracted with the multi-channel BSS algorithm AUX-IVA (independent vector analysis based on an auxiliary function), and the target source is then selected by adaptive target sound source locking based on SNR (signal-to-noise ratio) and feedback from the speech recognition system. On this basis, segments containing only interference are identified using the spatial information that the separation stage provides for the different sound sources, and are eliminated. Finally, DNN-based VAD (voice activity detection) endpoint detection extracts the valid speech segments of the target sound source; feedback from the speech recognition system also acts on the VAD algorithm. Because the AUX-IVA algorithm is time-window based, the overall algorithm converges in few iterations and is lightweight, and spatial information is further used to separate and extract the interference source and the target source.

Description

Technical field

[0001] The invention relates to a method for locking and extracting a target sound source, in particular to a method for locking and extracting a target sound source based on multi-channel sound source separation and endpoint detection in an environment with strong non-stationary interference.

Background

[0002] As a new generation of human-computer interaction, voice is increasingly used in embedded devices such as in-vehicle head units and household appliances, and is becoming part of people's daily life. However, the environments in which these voice-interactive embedded devices operate usually contain non-stationary interference sources. Noise reduction algorithms have developed rapidly in recent years and increasingly use DNN (deep neural network) modeling. However, when the acoustic characteristics of the interference source and the target sound source have strong commonalities, the workflow of traditional speech enhanceme...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L21/0272, G10L21/0208, G10L25/18, G10L25/21, G10L25/30, G10L25/78, G10L25/87, G10L15/20
CPC: G10L15/20, G10L21/0208, G10L21/0272, G10L25/18, G10L25/21, G10L25/30, G10L25/78, G10L25/87
Inventor 叶剑豪瞿虎林周伟林
Owner 上海声瀚信息科技有限公司