High-performance speech enhancement method based on deep learning

A speech enhancement technology based on deep learning, applied in speech analysis, instruments, and related fields, that addresses the problem of pre-trained weights ignoring the global behavior of the cost function.

Active Publication Date: 2018-10-02
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0012] Each RBM is pre-trained using the contrastive divergence algorithm described above to obtain the initial weights of the DBN. However, the initial weights obtained by pre-training may reflect only a local optimum of the cost function and ignore its global behavior. To compensate for this shortcoming, the parameters of the whole network must be fine-tuned.
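The pre-training step referred to in [0012] can be illustrated with a single contrastive divergence (CD-1) update for one RBM. This is a minimal sketch, not the patent's implementation: it assumes binary (Bernoulli) visible and hidden units, and the function name `cd1_step`, the learning rate, and the array shapes are hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd1_step(v0, W, b_vis, b_hid, lr=0.1, rng=None):
    """One CD-1 update for a Bernoulli RBM (assumed unit type).

    v0:    mini-batch of visible vectors, shape (batch, n_visible)
    W:     weights, shape (n_visible, n_hidden)
    b_vis: visible biases, shape (n_visible,)
    b_hid: hidden biases, shape (n_hidden,)
    Returns updated copies of (W, b_vis, b_hid).
    """
    rng = np.random.default_rng() if rng is None else rng

    # Positive phase: hidden probabilities and a binary sample given the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(v0.dtype)

    # Negative phase: one Gibbs step back to the visible layer and up again.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)

    # Gradient estimate: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W_new = W + lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_vis_new = b_vis + lr * (v0 - p_v1).mean(axis=0)
    b_hid_new = b_hid + lr * (p_h0 - p_h1).mean(axis=0)
    return W_new, b_vis_new, b_hid_new
```

Stacking RBMs trained this way, layer by layer, gives the initial DBN weights; the subsequent fine-tuning of the whole network is a separate supervised or reconstruction-driven pass that this sketch does not cover.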

Examples

Embodiment Construction

[0046] The deep-learning-based high-performance speech enhancement method of the present invention is described in detail below with reference to the embodiments and the accompanying drawings.

[0047] As shown in Figure 4, the deep-learning-based high-performance speech enhancement method of the present invention comprises the following steps:

[0048] 1) Preprocessing the audio PCM coded signal: the audio PCM coded signal is framed and windowed, and the original data set is divided into a training set and a test set according to a set ratio. The original data set consists of 720 clean utterances from the TIMIT corpus mixed with noise from the NOISEX92 noise library.
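The preprocessing in step 1) can be sketched as follows. The source does not state the frame length, hop size, window type, or split ratio, so the values below (320-sample frames, 160-sample hop, Hamming window, 80/20 split) are assumptions chosen only for illustration, as are the function names.

```python
import numpy as np


def frame_and_window(signal, frame_len=320, hop=160):
    """Split a 1-D PCM signal into overlapping frames and apply a Hamming window.

    frame_len and hop are assumed values (20 ms / 10 ms at 16 kHz).
    Returns an array of shape (n_frames, frame_len).
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Index matrix: row i selects samples [i*hop, i*hop + frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)


def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle items and split them into training and test lists by a set ratio."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(items))
    n_train = int(round(train_ratio * len(items)))
    train = [items[i] for i in order[:n_train]]
    test = [items[i] for i in order[n_train:]]
    return train, test
```

With the assumed 80/20 ratio, the 720 mixed utterances would yield 576 training items and 144 test items; any other set ratio works the same way.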

[0049] 2) Determining the number of nodes in the hidden layer of the DBN by the golden section method, including:

[0050] Let the initial value range of the number of nodes in the hidden layer of the DBN be [x_1, x_2], and calculate the two golden section points within the initial value rang...
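The two golden section points of a range [x_1, x_2] can be computed as in the sketch below. The source text is truncated before it states how the points are compared, so the search loop is a generic golden section search and the `cost` callable (for example, a validation error obtained by training a DBN with that many hidden nodes) is a hypothetical stand-in, not the patent's stated criterion.

```python
import math

# Golden ratio conjugate, approximately 0.618.
GOLDEN = (math.sqrt(5) - 1) / 2


def golden_section_points(x1, x2):
    """Return the two golden section points of [x1, x2], rounded to node counts."""
    span = x2 - x1
    lower = x1 + (1 - GOLDEN) * span   # about 0.382 of the way along the range
    upper = x1 + GOLDEN * span         # about 0.618 of the way along the range
    return round(lower), round(upper)


def golden_section_search(cost, x1, x2, min_width=2):
    """Generic golden section search for a node count that minimizes cost.

    cost is a hypothetical callable mapping an integer node count to a score;
    it is assumed to be roughly unimodal on [x1, x2].
    """
    a, b = float(x1), float(x2)
    while b - a > min_width:
        c = a + (1 - GOLDEN) * (b - a)
        d = a + GOLDEN * (b - a)
        if cost(round(c)) < cost(round(d)):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    return round((a + b) / 2)
```

Each iteration shrinks the range by a factor of about 0.618, so the number of DBN trainings needed grows only logarithmically with the width of the initial range.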

Abstract

The invention relates to a high-performance speech enhancement method based on deep learning, which comprises the steps of: preprocessing an audio PCM coded signal by framing and windowing it, and at the same time dividing the original data set into a training set and a test set according to a set ratio; determining the number of DBN hidden-layer nodes by the golden section method; pre-training the weights and bias parameters of the DBN using the training set; fine-tuning the weights and bias parameters of the DBN using the training set; extracting DBN features, namely extracting the features of the training set and of the test set using the DBN weights and bias parameters obtained by fine-tuning; training a supervised learning speech separation system using the extracted training-set features; and feeding the extracted test-set features into the trained supervised learning speech separation system, whose output is the estimated target label, from which the final enhanced speech is obtained through speech waveform synthesis. The method greatly improves speech evaluation metrics.

Description

Technical field

[0001] The invention relates to a speech enhancement method, and in particular to a high-performance speech enhancement method based on deep learning.

Background technique

[0002] 1. The working principle of the supervised learning speech separation baseline system

[0003] Computational auditory scene analysis, as used in a supervised learning speech separation system, is a typical speech separation method. It is based on the perceptual principles of auditory scene analysis and usually uses the ideal ratio mask as the training target for noise suppression. The ideal ratio mask is a time-frequency mask constructed from the premixed speech and noise, defined as follows:

[0004] [equation not reproduced in this extract]

[0005] where S²(t, f) and N²(t, f) denote the energy of speech and noise, respectively, in a time-frequency unit; a time-frequency unit is the representation obtained after the signal passes through the gammatone filter bank and each sub-band signal is divided into frames. β is an adjust...
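The equation at [0004] is missing from this extract. The form commonly given in the literature for the ideal ratio mask is IRM(t, f) = (S²(t, f) / (S²(t, f) + N²(t, f)))^β, and the sketch below assumes that form; it may differ from the patent's exact expression. The default β = 0.5 and the small `eps` guard against division by zero are likewise assumptions.

```python
import numpy as np


def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5, eps=1e-12):
    """Compute an ideal ratio mask per time-frequency unit.

    speech_energy, noise_energy: arrays of S^2(t, f) and N^2(t, f), same shape.
    beta: exponent applied to the energy ratio (0.5 is an assumed default).
    eps: assumed guard so that silent units do not divide by zero.
    """
    s = np.asarray(speech_energy, dtype=np.float64)
    n = np.asarray(noise_energy, dtype=np.float64)
    return (s / (s + n + eps)) ** beta
```

Under this form the mask lies between 0 and 1: units dominated by speech approach 1 and units dominated by noise approach 0, which is what makes it usable as a training target for noise suppression.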

Application Information

IPC(8): G10L21/02, G10L21/0264, G10L21/0272, G10L21/0308
CPC: G10L21/02, G10L21/0264, G10L21/0272, G10L21/0308
Inventors: Zhang Tao (张涛), Ren Xiangying (任相赢)
Owner: TIANJIN UNIV