A high-performance speech enhancement method based on deep learning

A deep learning-based speech enhancement technology, applicable to speech analysis, instruments, and related fields, that addresses problems such as pre-trained weights ignoring the global behavior of the cost function.

Active Publication Date: 2021-08-03

TIANJIN UNIV


Problems solved by technology

[0012] Each RBM is pre-trained using the above contrastive divergence algorithm to obtain the initial weights of the DBN. However, the initial weights obtained by pre-training may reflect only a local optimum of the cost function and ignore its global behavior. To make up for this shortcoming, the parameters of the whole network must be fine-tuned.
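RBM pre-training of the kind referred to above is usually carried out with one-step contrastive divergence (CD-1). The following is a minimal NumPy sketch of CD-1 for a Bernoulli RBM, not the patent's implementation: the function name `cd1_step`, the toy data, the layer sizes, the learning rate, and the number of epochs are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr, rng):
    """One CD-1 update on a batch v0 of shape (batch, n_visible)."""
    # Positive phase: hidden probabilities and a binary sample given the data.
    h0_prob = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(v0.dtype)
    # Negative phase: one Gibbs step down to the visible layer and up again.
    v1_prob = sigmoid(h0 @ W.T + b_vis)
    h1_prob = sigmoid(v1_prob @ W + b_hid)
    # Approximate gradient: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
    b_vis = b_vis + lr * (v0 - v1_prob).mean(axis=0)
    b_hid = b_hid + lr * (h0_prob - h1_prob).mean(axis=0)
    recon_error = float(np.mean((v0 - v1_prob) ** 2))
    return W, b_vis, b_hid, recon_error

# Toy setup (hypothetical sizes and data, for illustration only).
rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 4
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b_vis = np.zeros(n_visible)
b_hid = np.zeros(n_hidden)
data = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 10, dtype=float)

errors = []
for epoch in range(500):
    W, b_vis, b_hid, err = cd1_step(data, W, b_vis, b_hid, lr=0.1, rng=rng)
    errors.append(err)
```

In a DBN, the weights learned this way for each stacked RBM would serve only as the starting point; supervised fine-tuning of the whole network then adjusts them jointly.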


Embodiment Construction

[0046] The deep learning-based high-performance speech enhancement method of the present invention is described in detail below with reference to the embodiments and the accompanying drawings.

[0047] As shown in Figure 4, the deep learning-based high-performance speech enhancement method of the present invention comprises the following steps:

[0048] 1) Preprocess the audio PCM coded signal: frame and window the audio PCM coded signal, and divide the original data set into a training set and a test set according to a set ratio. The original data set consists of 720 clean utterances from the TIMIT corpus mixed with noise from the NOISEX92 noise library.

[0049] 2) Use the golden section method to determine the number of nodes in the hidden layer of the DBN, including:

[0050] Let the initial value range of the number of DBN hidden-layer nodes be $[x_1, x_2]$, and calculate the two golden section points within the initial value rang...
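The two golden section points of an interval can be computed as below. The source text is cut off after this step, so the search loop that follows is a generic integer golden-section search, not necessarily the patent's procedure. The cost function `toy_cost` and the interval bounds are hypothetical stand-ins for a validation error measured on a DBN with a given number of hidden nodes.

```python
import math

GOLDEN = (math.sqrt(5) - 1) / 2  # about 0.618

def golden_section_points(x1, x2):
    """Return the two interior golden section points of [x1, x2] as integers."""
    width = x2 - x1
    lower = x1 + (1 - GOLDEN) * width
    upper = x1 + GOLDEN * width
    return round(lower), round(upper)

def golden_section_search(cost, x1, x2):
    """Generic integer golden-section search for a unimodal cost on [x1, x2]."""
    while x2 - x1 > 2:
        a, b = golden_section_points(x1, x2)
        if a == b:
            b = a + 1  # keep the two probe points distinct on narrow intervals
        if cost(a) <= cost(b):
            x2 = b  # the minimum lies in [x1, b]
        else:
            x1 = a  # the minimum lies in [a, x2]
    # At most three candidates remain; pick the best one directly.
    return min(range(x1, x2 + 1), key=cost)

def toy_cost(n_hidden):
    # Hypothetical unimodal cost standing in for a measured validation error.
    return (n_hidden - 137) ** 2

points = golden_section_points(100, 1000)
best = golden_section_search(toy_cost, 50, 500)
```

Each iteration discards the part of the interval that cannot contain the minimum of a unimodal cost, so far fewer candidate network sizes need to be trained than with an exhaustive sweep.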


Abstract

A high-performance speech enhancement method based on deep learning, comprising: preprocessing the audio PCM coded signal by framing and windowing it, and dividing the original data set into a training set and a test set according to a set ratio; using the golden section method to determine the number of DBN hidden-layer nodes; using the training set to pre-train the weight and bias parameters of the DBN; using the training set to fine-tune the weight and bias parameters of the DBN; extracting DBN features, that is, using the fine-tuned DBN weight and bias parameters to extract the DBN training-set features and the DBN test-set features respectively; using the extracted DBN training-set features to train the supervised learning speech separation system; and using the extracted DBN test-set features as the input of the trained supervised learning speech separation system, whose output is the estimated target label, from which the final enhanced speech is obtained through speech waveform synthesis. The invention substantially improves the speech evaluation metrics.
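The preprocessing step (framing, windowing, and splitting the data set by a set ratio) can be sketched as follows. This is an illustration under assumptions, not the patent's code: the Hamming window, the 16 kHz sampling rate, the 20 ms frame with 10 ms hop, and the 0.8 training ratio are all chosen here for the example, and the function names are hypothetical.

```python
import numpy as np

def frame_and_window(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames and apply a Hamming window."""
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    starts = hop_len * np.arange(n_frames)
    # Row i of `index` holds the sample positions belonging to frame i.
    index = starts[:, None] + np.arange(frame_len)[None, :]
    return signal[index] * np.hamming(frame_len)

def split_dataset(items, train_ratio, seed=0):
    """Shuffle items and split them into training and test lists."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(items))
    n_train = int(round(train_ratio * len(items)))
    train = [items[i] for i in order[:n_train]]
    test = [items[i] for i in order[n_train:]]
    return train, test

# One second of synthetic 16-bit PCM at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
pcm = rng.integers(-32768, 32767, size=16000, dtype=np.int16)
samples = pcm.astype(np.float64) / 32768.0  # scale to roughly [-1, 1)

frames = frame_and_window(samples, frame_len=320, hop_len=160)

# 720 utterance identifiers split with an assumed ratio of 0.8.
utterance_ids = list(range(720))
train_set, test_set = split_dataset(utterance_ids, train_ratio=0.8)
```

With these assumed settings, one second of audio yields 99 frames of 320 samples each, and the 720 mixtures are divided into 576 training and 144 test items.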

Description

Technical field

[0001] The invention relates to a speech enhancement method, and in particular to a high-performance speech enhancement method based on deep learning.

Background technique

[0002] 1. The working principle of the supervised learning speech separation baseline system

[0003] Computational auditory scene analysis, as used in a supervised learning speech separation system, is a typical speech separation method. It is based on the perceptual principles of auditory scene analysis and usually uses the ideal ratio mask as the training target for noise suppression. The ideal ratio mask is a time-frequency mask constructed from the premixed speech and noise, defined as follows:

[0004] $$\mathrm{IRM}(t, f) = \left( \frac{S^2(t, f)}{S^2(t, f) + N^2(t, f)} \right)^{\beta}$$

[0005] Here, $S^2(t, f)$ and $N^2(t, f)$ represent the energy of speech and noise in the time-frequency unit, respectively; the time-frequency unit is the representation obtained after the signal passes through the gammatone filter bank and the sub-band signals are divided into frames. β is an adjust...
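As an illustration of the ideal ratio mask, the sketch below uses its commonly cited form, the ratio of speech energy to total energy in each time-frequency unit raised to the power β. The default `beta=0.5`, the small `eps` guard against division by zero, and the toy energy values are assumptions made for this example and are not taken from the patent.

```python
import numpy as np

def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5, eps=1e-12):
    """Compute the ideal ratio mask per time-frequency unit.

    speech_energy and noise_energy are arrays of S^2(t, f) and N^2(t, f)
    with identical shapes; eps avoids division by zero in silent units.
    """
    speech_energy = np.asarray(speech_energy, dtype=np.float64)
    noise_energy = np.asarray(noise_energy, dtype=np.float64)
    return (speech_energy / (speech_energy + noise_energy + eps)) ** beta

# Toy energies for a 2 x 2 grid of time-frequency units (hypothetical values).
S2 = np.array([[9.0, 1.0], [0.0, 4.0]])
N2 = np.array([[16.0, 3.0], [5.0, 0.0]])
mask = ideal_ratio_mask(S2, N2)
```

The mask is close to 1 where speech dominates a unit and close to 0 where noise dominates, which is what makes it usable as a regression target for noise suppression.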


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L21/02, G10L21/0264, G10L21/0272, G10L21/0308

CPC: G10L21/02, G10L21/0264, G10L21/0272, G10L21/0308

Inventors: 张涛 (Zhang Tao), 任相赢 (Ren Xiangying)

Owner: TIANJIN UNIV
