Speech separation method, system, and storage medium based on a super-Gaussian prior speech model and deep learning

A speech separation and deep learning technology, applied in speech analysis, instruments, etc. It addresses problems such as degraded performance under non-stationary noise, speech signals polluted by interfering noise, and poor performance on untrained data. The effects are robust enhancement performance, mitigation of weak generalization ability, and suppression of non-stationary noise signals.

Inactive Publication Date: 2019-05-17

HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL

AI Technical Summary

Problems solved by technology

[0002] Applications such as automatic speech recognition, human-machine dialogue, and hearing aids face great challenges because speech signals are often polluted by interfering noise from the surroundings.

The performance of existing traditional speech enhancement techniques drops sharply under non-stationary noise and low signal-to-noise ratios.

Although the recently emerging speech enhancement technology based on deep learning can suppress non-stationary noise well, the performance of such algorithms is highly dependent on training data, and they perform poorly on unlearned or untrained data.

Embodiment Construction

[0019] The invention discloses a speech separation method based on a super-Gaussian prior speech model and deep learning, which not only suppresses non-stationary noise well, but also shows good generalization performance for untrained data.

[0020] The invention implements a robust speech enhancement method by combining traditional statistical models with deep learning techniques. The method consists of four parts: a speech gain function based on the super-Gaussian speech model assumption; a neural network that estimates the power spectrum of the clean speech signal; estimation of the noise power spectrum; and calculation of the a priori signal-to-noise ratio and the gain function.

[0021] We first introduce the signal model. We consider the additive signal model y(n)=x(n)+d(n), where y(n) is the noisy speech signal, and x(n) and d(n) represent the clean speech signal and the noise signal, respectively. The relationship in the time-frequency domain is obtained by using the s...
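The additive model in [0021] can be illustrated numerically. The sketch below is not taken from the patent: the source sentence about the time-frequency relationship is truncated, so the use of a per-frame discrete Fourier transform here is an assumption, and the random arrays are stand-ins for real speech and noise. It only shows that an additive time-domain model stays additive under a linear transform.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in signals (assumption: random data instead of real recordings).
x = rng.standard_normal(256)         # clean speech frame x(n)
d = 0.1 * rng.standard_normal(256)   # noise frame d(n)
y = x + d                            # additive model y(n) = x(n) + d(n)

# Assumption: a single-frame real FFT stands in for the transform
# whose name is cut off in the source text.
Y, X, D = np.fft.rfft(y), np.fft.rfft(x), np.fft.rfft(d)

# The transform is linear, so the spectra add in the same way.
additive_in_frequency = np.allclose(Y, X + D)
print(additive_in_frequency)
```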

Abstract

The invention provides a speech separation method and system based on a super-Gaussian prior speech model and deep learning, and a storage medium. The speech separation method comprises the steps of: using a clean speech power spectral density estimate and a noise power spectral density estimate to obtain the a priori signal-to-noise ratio in a gain function; substituting the a priori signal-to-noise ratio into the gain function to obtain the value of the gain function; multiplying the value of the gain function by the noisy speech spectrum to obtain an estimate of the clean speech amplitude spectrum; and using overlap-add to obtain the recovered speech signal. By combining a conventional statistical model with deep learning, the method, system, and storage medium not only effectively suppress non-stationary noise signals but also address the weak generalization caused by deep learning's heavy dependence on training data. This combination makes the enhancement performance robust across various noise environments and signal-to-noise ratios.
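The steps listed in the abstract can be sketched as a short processing loop. This is a minimal illustration under stated assumptions, not the patented method: the super-Gaussian gain function is not given in the visible text, so a Wiener gain is swapped in as a stand-in, and the function name `enhance`, the frame length, the hop size, and the square-root Hann window are all hypothetical choices. The power spectral density estimates are passed in as arrays rather than produced by a neural network or noise tracker.

```python
import numpy as np

def enhance(noisy, clean_psd, noise_psd, frame_len=256, hop=128):
    """Gain-based enhancement with overlap-add resynthesis (illustrative).

    clean_psd, noise_psd: arrays of shape (n_frames, frame_len // 2 + 1).
    """
    # Periodic sqrt-Hann window: applied at analysis and synthesis, the
    # squared windows sum to 1 at 50% overlap.
    window = np.sqrt(np.hanning(frame_len + 1)[:-1])
    n_frames = 1 + (len(noisy) - frame_len) // hop
    out = np.zeros(len(noisy))
    for l in range(n_frames):
        start = l * hop
        spec = np.fft.rfft(noisy[start:start + frame_len] * window)
        # Step 1: a priori SNR from the two PSD estimates.
        xi = clean_psd[l] / np.maximum(noise_psd[l], 1e-12)
        # Step 2: gain value. ASSUMPTION: Wiener gain replaces the
        # super-Gaussian gain function, which the source does not show.
        gain = xi / (1.0 + xi)
        # Step 3: gain times noisy spectrum (the noisy phase is kept).
        est = gain * spec
        # Step 4: overlap-add back into the time domain.
        out[start:start + frame_len] += np.fft.irfft(est, n=frame_len) * window
    return out

# Usage with made-up PSD estimates: 7 frames of 129 bins for 1024 samples.
rng = np.random.default_rng(1)
noisy = rng.standard_normal(1024)
recovered = enhance(noisy, np.ones((7, 129)), np.ones((7, 129)))
```

With equal clean and noise PSD estimates, as in the usage lines, the a priori SNR is 1 in every bin and the stand-in gain halves each spectral component. Outside the fully overlapped interior, the first and last half-frames are attenuated further by the window.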

Description

Technical field

[0001] The invention relates to the technical field of speech processing, in particular to a speech separation method, system, and storage medium based on a super-Gaussian prior speech model and deep learning.

Background

[0002] Since speech signals are often polluted by interfering noise from the surroundings, applications such as automatic speech recognition, human-machine dialogue, and hearing aids face great challenges. The performance of existing traditional speech enhancement technology drops sharply under non-stationary noise and low signal-to-noise ratios. Although the recently emerging deep learning-based speech enhancement technology can suppress non-stationary noise well, the performance of such algorithms is highly dependent on training data, and they perform poorly on unlearned or untrained data.

Summary of the invention

[0003] The invention provides a voice separation method based on a super-Gaussian prior voice m...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L21/0216, G10L21/0264, G10L21/0308, G10L25/30

CPC: G10L21/0216, G10L21/0264, G10L21/0308, G10L25/30

Inventor: 张啟权, 王明江, 陆云, 韩宇菲, 张禄

Owner: HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL
