A high-performance speech enhancement method based on deep learning

A deep-learning-based speech enhancement technology, applied in speech analysis, instruments, and related fields, that addresses problems such as pre-trained weights reflecting only a local optimum and ignoring the global picture.

Active Publication Date: 2021-08-03
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0012] Each RBM is pre-trained using the above contrastive divergence algorithm to obtain the initial weights of the DBN. However, the initial weights obtained by pre-training may reflect only a local optimum of the cost function and ignore its global behavior. To make up for this shortcoming, the parameters of the whole network must then be fine-tuned.
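The per-RBM pre-training mentioned in [0012] can be illustrated with a short sketch. It assumes the algorithm is contrastive divergence with a single Gibbs step (CD-1) on a Bernoulli RBM; the function names, layer sizes, learning rate, and epoch count are illustrative assumptions and are not taken from the patent.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for each hidden unit j; W is n_visible x n_hidden."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """P(v_i = 1 | h) for each visible unit i."""
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def sample(probs, rng):
    """Draw a binary state for each unit from its activation probability."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step on a single vector v0; updates W, b, c in place."""
    ph0 = hidden_probs(v0, W, c)      # positive phase
    h0 = sample(ph0, rng)
    v1 = visible_probs(h0, W, b)      # reconstruction (probabilities)
    ph1 = hidden_probs(v1, W, c)      # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    # Squared reconstruction error, useful for monitoring only.
    return sum((a - r) ** 2 for a, r in zip(v0, v1))


def pretrain_rbm(data, n_hidden, epochs=50, lr=0.1, seed=0):
    """Pre-train one RBM; returns weights, biases, and per-epoch error."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
         for _ in range(n_visible)]
    b = [0.0] * n_visible
    c = [0.0] * n_hidden
    errors = []
    for _ in range(epochs):
        errors.append(sum(cd1_update(v, W, b, c, lr, rng) for v in data))
    return W, b, c, errors
```

In a stacked DBN, the hidden activations of one pre-trained RBM would serve as the training data for the next RBM, and the resulting weights would initialise the network before fine-tuning.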

Examples

Detailed Description of the Embodiments

[0046] The deep-learning-based high-performance speech enhancement method of the present invention is described in detail below with reference to the embodiments and the accompanying drawings.

[0047] As shown in Figure 4, the deep-learning-based high-performance speech enhancement method of the present invention comprises the following steps:

[0048] 1) Preprocess the audio PCM-coded signal: frame and window the audio PCM-coded signal, and divide the original data set into a training set and a test set according to a set ratio. The original data set consists of 720 clean speech utterances from the TIMIT corpus mixed with noise from the NOISEX-92 noise library.
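The preprocessing in step 1) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the Hamming window, the frame length and hop, the 0.8 split ratio, and all function names are assumptions, since the text only specifies framing, windowing, and a "set ratio".

```python
import math
import random


def hamming(n):
    """Hamming window of length n (assumed; the source does not name the window)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1))
            for k in range(n)]


def frame_and_window(samples, frame_len, hop):
    """Split PCM samples into overlapping frames and window each frame."""
    win = hamming(frame_len)
    frames = []
    # Trailing samples that do not fill a whole frame are dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, win)])
    return frames


def split_dataset(items, train_ratio, seed=0):
    """Shuffle items and split them into (train, test) by train_ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_ratio * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Example: 20 ms frames with a 10 ms hop at an assumed 16 kHz sampling rate.
frames = frame_and_window([0.0] * 16000, frame_len=320, hop=160)
train_ids, test_ids = split_dataset(range(720), train_ratio=0.8)
```

Splitting at the utterance level, as here, keeps frames from the same utterance out of both sets at once.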

[0049] 2) Use the golden-section method to determine the number of nodes in the DBN hidden layer, as follows:

[0050] Let the initial value range of the number of DBN hidden-layer nodes be $[x_1, x_2]$, and calculate the two golden-section points within the initial value range...
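Because the description of step 2) is cut off here, the following is only a sketch of a standard golden-section search over an integer range, not the patent's exact procedure. The callback `error_for` is hypothetical (for example, the validation error of a DBN trained with `n` hidden nodes), and the search assumes that this error is unimodal on $[x_1, x_2]$.

```python
import math

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618


def golden_section_nodes(error_for, x1, x2):
    """Return the integer in [x1, x2] that minimises error_for.

    error_for is a hypothetical callback and is assumed to be unimodal
    on the interval; results are cached so no node count is evaluated twice.
    """
    cache = {}

    def f(n):
        if n not in cache:
            cache[n] = error_for(n)
        return cache[n]

    lo, hi = x1, x2
    while hi - lo > 2:
        a = int(round(hi - GOLDEN * (hi - lo)))  # lower golden-section point
        b = int(round(lo + GOLDEN * (hi - lo)))  # upper golden-section point
        if a == b:
            b = a + 1  # keep the two probe points distinct on short intervals
        if f(a) <= f(b):
            hi = b  # the minimum lies in [lo, b]
        else:
            lo = a  # the minimum lies in [a, hi]
    # At most three candidates remain; pick the best one directly.
    return min(range(lo, hi + 1), key=f)


# Toy stand-in for a real training run: error is smallest at 137 nodes.
best = golden_section_nodes(lambda n: (n - 137) ** 2, 10, 500)
```

Each iteration shrinks the interval by roughly a factor of 0.618, so far fewer networks need to be trained than with an exhaustive sweep over every candidate node count.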

Abstract

A high-performance speech enhancement method based on deep learning, comprising: preprocessing the audio PCM-coded signal by framing and windowing it, and dividing the original data set into a training set and a test set according to a set ratio; using the golden-section method to determine the number of DBN hidden-layer nodes; using the training set to pre-train the weight and bias parameters of the DBN; using the training set to fine-tune the weight and bias parameters of the DBN; extracting DBN features, that is, using the fine-tuned DBN weight and bias parameters to extract the DBN training-set features and the DBN test-set features respectively; using the extracted DBN training-set features to train the supervised-learning speech separation system; and feeding the extracted DBN test-set features into the trained supervised-learning speech separation system, whose output is the estimated target label, from which the final enhanced speech is obtained through speech waveform synthesis. The invention substantially improves the speech evaluation metrics.
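The feature-extraction step in the abstract can be sketched as a forward pass through the trained layers. This is an assumption-laden illustration: it takes the "DBN features" to be the sigmoid activations of the top hidden layer, and the names `layer_forward` and `extract_dbn_features` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(x, W, c):
    """Activations of one hidden layer; W is n_in x n_out, c has n_out biases."""
    return [sigmoid(c[j] + sum(x[i] * W[i][j] for i in range(len(x))))
            for j in range(len(c))]


def extract_dbn_features(inputs, layers):
    """Propagate each input vector through every (W, c) pair in order.

    Returns the top-layer activations, taken here as the DBN features.
    """
    features = []
    for x in inputs:
        for W, c in layers:
            x = layer_forward(x, W, c)
        features.append(x)
    return features
```

The same function would be applied to the training set and to the test set with identical fine-tuned parameters, so both feature sets live in the same space before they reach the speech separation system.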

Description

Technical field

[0001] The invention relates to a speech enhancement method, and in particular to a high-performance speech enhancement method based on deep learning.

Background technique

[0002] 1. The working principle of the supervised learning speech separation baseline system

[0003] Supervised-learning speech separation based on computational auditory scene analysis is a typical speech separation method. It builds on the perceptual principles of auditory scene analysis and usually uses the ideal ratio mask (IRM) as the training target for noise suppression. The ideal ratio mask is a time-frequency mask constructed from the premixed speech and noise, defined as follows:

[0004] $\mathrm{IRM}(t, f) = \left( \dfrac{S^2(t, f)}{S^2(t, f) + N^2(t, f)} \right)^{\beta}$

[0005] Here $S^2(t, f)$ and $N^2(t, f)$ denote the energy of the speech and the noise in a time-frequency unit, respectively, where a time-frequency unit is the representation obtained after the signal passes through the gammatone filter bank and each sub-band signal is divided into frames. β is an adjust...
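As a concrete illustration of the training target, the sketch below computes an ideal ratio mask from per-unit speech and noise energies. It assumes the commonly used form of the mask, the ratio of speech energy to total energy raised to the power β; the default `beta=0.5`, the `eps` guard, and the function name are assumptions rather than values stated in the text.

```python
def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5, eps=1e-12):
    """IRM per time-frequency unit: (S^2 / (S^2 + N^2)) ** beta.

    speech_energy and noise_energy are 2-D lists (frames x channels) holding
    the non-negative energies S^2(t, f) and N^2(t, f). beta=0.5 is an assumed,
    commonly used value; eps only guards against division by zero.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        mask.append([(s / (s + n + eps)) ** beta
                     for s, n in zip(s_row, n_row)])
    return mask


# Two frames, two filter-bank channels of made-up energies.
irm = ideal_ratio_mask([[1.0, 0.0], [3.0, 1.0]],
                       [[1.0, 1.0], [1.0, 0.0]])
```

Mask values lie between 0 (noise-dominated unit) and 1 (speech-dominated unit), which is what makes the mask usable as a bounded regression target.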

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L21/02, G10L21/0264, G10L21/0272, G10L21/0308
CPC: G10L21/02, G10L21/0264, G10L21/0272, G10L21/0308
Inventor: 张涛, 任相赢
Owner: TIANJIN UNIV