Binaural speech separation method based on LSTM (Long Short Term Memory) network

An LSTM (long short-term memory) based speech separation technology, applied in speech analysis and related fields, that addresses performance degradation under high noise and strong reverberation.

Active Publication Date: 2020-01-24

SOUTHEAST UNIV



Problems solved by technology

[0010] Purpose of the invention: To address the sharp performance drop of earlier binaural speech separation algorithms under high noise and strong reverberation, the present invention proposes a binaural speech separation method based on a long short-term memory (LSTM) network, in which the LSTM network is trained on feature parameters from multiple environments.

Embodiment Construction

[0067] As shown in Figure 1, the binaural speech separation method based on the LSTM network provided by this embodiment includes the following steps:

[0068] Step 1. Convolve two different monophonic speech signals from the training speech with the head-related impulse responses (HRIRs) of different azimuth angles to generate two single-source binaural training signals at different azimuths. The single-source signals are calculated as follows, where * denotes convolution:

[0069] $s_{1,L}(n) = s_1(n) * h_{1,L}, \quad s_{2,L}(n) = s_2(n) * h_{2,L}$

[0070] $s_{1,R}(n) = s_1(n) * h_{1,R}, \quad s_{2,R}(n) = s_2(n) * h_{2,R}$

[0071] Here, $s_1(n)$ and $s_2(n)$ are two different monophonic speech signals; $s_{1,L}(n)$ and $s_{1,R}(n)$ are the left-ear and right-ear speech signals of the single sound source at azimuth 1; $h_{1,L}$ and $h_{1,R}$ are the left-ear and right-ear HRIRs for azimuth 1; $s_{2,L}(n)$ and $s_{2,R}(n)$ are the left-ear and right-ear speech signals of the single sound source cor...
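The convolution in Step 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names `convolve` and `binaural_from_mono` and the toy two-tap HRIRs are hypothetical, and a practical system would use measured HRIRs and an optimized convolution routine.

```python
def convolve(signal, impulse_response):
    """Full linear convolution: y[n] = sum_k signal[k] * impulse_response[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_from_mono(mono, hrir_left, hrir_right):
    """Return the (left ear, right ear) signals for one source at one azimuth."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy data (hypothetical, for illustration only).
s1 = [1.0, 0.5, 0.25]   # monophonic speech samples s_1(n)
h1_L = [1.0, 0.0]       # left-ear HRIR for azimuth 1: direct path
h1_R = [0.0, 0.8]       # right-ear HRIR: one-sample delay, attenuated

s1_L, s1_R = binaural_from_mono(s1, h1_L, h1_R)
print(s1_L)
print(s1_R)
```

In this toy case the right-ear signal is a delayed, quieter copy of the left-ear signal, which is the kind of interaural difference the later feature extraction relies on.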

Abstract

The invention discloses a binaural speech separation method based on an LSTM (long short-term memory) network. The ITD (interaural time difference), the IID (interaural intensity difference), and the CCF (cross-correlation function) of each time-frequency unit of a training binaural speech signal are extracted as spatial features for separation. The spatial features of the current frame and of the 5 preceding and 5 following frames of time-frequency units in the same subband are used as the input of a bidirectional LSTM network, which is trained to obtain an LSTM-based separation model. At the test stage, the spatial features of the current frame and of the 5 preceding and 5 following frames of time-frequency units of a test binaural speech signal are fed to the trained bidirectional LSTM network to estimate the mask value of the target speech in the current time-frequency unit, and speech separation is then performed according to the mask value. The separation results show that, compared with a method based on a deep neural network, the proposed binaural speech separation method based on the LSTM network markedly improves the subjective evaluation index and generalizes well.
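The feature pipeline described in the abstract can be sketched as follows. This is a hedged illustration only: the excerpt does not give the patent's exact formulas, so the sketch assumes common conventions (a normalized CCF, the ITD taken as the lag at the CCF peak, the IID taken as a log energy ratio in dB). The function names, the edge-frame repetition at signal boundaries, and the per-unit multiplicative masking are all assumptions made for illustration.

```python
import math


def cross_correlation(left, right, max_lag):
    """Normalized CCF of one time-frequency unit for lags -max_lag..max_lag."""
    energy = sum(x * x for x in left) * sum(x * x for x in right)
    norm = math.sqrt(energy) or 1.0  # avoid division by zero on silence
    n = len(left)
    ccf = []
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        ccf.append(acc / norm)
    return ccf


def spatial_features(left, right, max_lag):
    """Return [ITD in samples, IID in dB] followed by the CCF values."""
    ccf = cross_correlation(left, right, max_lag)
    itd = ccf.index(max(ccf)) - max_lag  # lag at the CCF peak (assumed definition)
    e_left = sum(x * x for x in left) + 1e-12
    e_right = sum(x * x for x in right) + 1e-12
    iid = 10.0 * math.log10(e_left / e_right)  # assumed definition
    return [float(itd), iid] + ccf


def context_window(frames, t, context=5):
    """Features of frames t-context..t+context in one subband.

    Edge frames are repeated at the boundaries (an assumption).
    """
    last = len(frames) - 1
    return [frames[min(max(k, 0), last)] for k in range(t - context, t + context + 1)]


def apply_mask(unit_values, mask):
    """Scale each time-frequency unit by its estimated mask value."""
    return [v * m for v, m in zip(unit_values, mask)]


# Toy unit: the right channel is the left channel delayed by 2 samples at half amplitude.
left = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0]
feats = spatial_features(left, right, max_lag=3)

# Toy subband with 8 frames of one-dimensional features.
frames = [[float(i)] for i in range(8)]
window = context_window(frames, t=2)
print(feats[:2], len(window))
```

The 11-frame window returned by `context_window` is what would be passed, as a short sequence, to a bidirectional LSTM that outputs the mask value for the center frame; the network itself is omitted here because the excerpt does not specify its layer sizes or training setup.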

Description

Technical field

[0001] The invention relates to speech separation algorithms, in particular to a binaural speech separation method based on a long short-term memory (LSTM) network.

Background

[0002] Speech separation is an important research direction in speech signal processing and has a wide range of applications. For example, in teleconferencing systems, speech separation can extract the sound source of interest from multiple speakers, which improves the efficiency of teleconferencing; as a pre-processing step for speech recognition, it can improve speech quality and help raise recognition accuracy; in hearing aids, it can provide hearing-impaired users with a more prominent target sound source and more effective speech information.

[0003] Speech separation technology involves a wide range of fields, including but not limited to acoustics, digital signal processing, information communicat...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L21/0272, G10L25/30

CPC: G10L21/0272, G10L25/30

Inventor: 周琳, 陆思源, 钟秋月, 庄琰

Owner: SOUTHEAST UNIV
