
Phase-dependent shared deep convolutional neural network speech enhancement method

A deep convolutional neural network technology, applied in speech analysis, instruments, etc. It addresses two problems of existing methods: they do not verify the noise reduction result, and they ignore speech phase information, whose inaccuracy degrades the noise reduction effect. Its effects are an expanded training data set and assured speech quality.

Inactive Publication Date: 2020-04-28
ZHEJIANG UNIV
3 Cites, 16 Cited by

AI Technical Summary

Problems solved by technology

[0007] However, most current deep-learning-based methods design a network model, operate in the time-frequency domain, use the amplitude spectrum and power spectrum as training data, and perform noise reduction directly in a single pass, ignoring both verification of the noise reduction result and the phase information of the speech. Many recent studies have shown that phase plays a vital role in restoring speech signals; especially under severe environmental noise and low signal-to-noise ratio, inaccurate phase information strongly degrades the noise reduction effect.



Examples


Embodiment Construction

[0058] To present the purpose, technical solution, and advantages of the invention more clearly, the invention is described in further detail below with reference to the accompanying drawings and specific implementation examples.

[0059] As shown in Figure 1, a phase-dependent shared deep convolutional neural network speech enhancement method includes:

[0060] Step 1: use the short-time Fourier transform to analyze the noisy speech data and the clean speech data in the time-frequency domain, obtaining for each the dual-channel time-frequency spectrum features, which consist of the real-part spectrum and the imaginary-part spectrum. Training samples are constructed by taking the dual-channel features of the noisy speech data as input and the dual-channel features of the clean speech data as supervised labels.
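Step 1 can be illustrated with a minimal sketch. The frame length, hop size, and Hann window below are assumptions chosen for illustration, since this excerpt does not state the STFT parameters, and the function names are hypothetical rather than taken from the patent.

```python
import numpy as np

def stft(signal, frame_len=512, hop=256):
    """Short-time Fourier transform with a Hann window.

    frame_len and hop are illustrative assumptions, not values from the patent.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided spectrum: shape (n_frames, frame_len // 2 + 1)
    return np.fft.rfft(frames, axis=1)

def dual_channel_features(signal, frame_len=512, hop=256):
    """Stack the real-part and imaginary-part spectra as two channels."""
    spec = stft(signal, frame_len, hop)
    return np.stack([spec.real, spec.imag])  # shape (2, n_frames, n_bins)

def make_training_pair(noisy, clean, frame_len=512, hop=256):
    """Noisy features are the network input; clean features are the label."""
    return (dual_channel_features(noisy, frame_len, hop),
            dual_channel_features(clean, frame_len, hop))
```

Because the real and imaginary parts are kept, the phase is carried implicitly in the features instead of being discarded, as it would be with an amplitude or power spectrum.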

[0061] Select 500 clean speech utterances, each between 3 and 10 s in duration, select 3...


Abstract

The invention discloses a phase-dependent shared deep convolutional neural network speech enhancement method comprising the following steps: performing time-frequency domain analysis on noisy speech data and clean speech data using the short-time Fourier transform to obtain the dual-channel time-frequency spectrum features of each; taking the dual-channel time-frequency spectrum features as training samples; building a shared deep convolutional neural network and training it with the training samples; for noisy speech data to be enhanced, obtaining its dual-channel time-frequency spectrum features, inputting them into the shared deep convolutional neural network model, and computing and outputting the predicted dual-channel time-frequency spectrum features; and processing the enhanced dual-channel time-frequency spectrum features with the inverse short-time Fourier transform and the overlap-add method to obtain an enhanced speech signal. The method can effectively suppress noise interference in speech signals and improve their quality.
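The inference path in the abstract (features in, predicted features out, then inverse transform with overlap-add) might be sketched as follows. This is a sketch under stated assumptions: the periodic Hann window, frame length, and hop size are illustrative, the window-sum normalization is one common way to implement overlap-add, and `model` is a hypothetical placeholder callable standing in for the trained shared network, which is not specified in this excerpt.

```python
import numpy as np

def hann(frame_len):
    """Periodic Hann window (an assumed choice, not stated in the patent)."""
    n = np.arange(frame_len)
    return 0.5 - 0.5 * np.cos(2 * np.pi * n / frame_len)

def stft(signal, frame_len=512, hop=256):
    """Windowed frames followed by a one-sided FFT."""
    window = hann(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.fft.rfft(frames, axis=1)

def istft_overlap_add(features, frame_len=512, hop=256):
    """Rebuild a waveform from (real, imag) channels by overlap-add."""
    real, imag = features
    frames = np.fft.irfft(real + 1j * imag, n=frame_len, axis=1)
    n_frames = frames.shape[0]
    length = frame_len + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    window = hann(frame_len)
    for i in range(n_frames):
        start = i * hop
        out[start : start + frame_len] += frames[i]   # add overlapping frames
        norm[start : start + frame_len] += window     # accumulated window weight
    # Divide out the summed analysis window; guard samples with zero weight.
    return out / np.maximum(norm, 1e-8)

def enhance(noisy, model, frame_len=512, hop=256):
    """model: hypothetical callable mapping (2, frames, bins) to the same shape."""
    spec = stft(noisy, frame_len, hop)
    features = np.stack([spec.real, spec.imag])
    predicted = model(features)
    return istft_overlap_add(predicted, frame_len, hop)
```

With an identity function in place of `model`, `enhance` returns the input waveform up to numerical error away from the signal edges, which is a convenient sanity check of the transform pair before a trained network is plugged in.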

Description

Technical field

[0001] The invention relates to the field of digital speech signal processing, and in particular to a phase-dependent shared deep convolutional neural network speech enhancement method.

Background

[0002] Language is one of the most important means of human communication, and communicating by voice makes life simpler and more efficient. With the development of mobile communication and Internet technology, voice technology is applied in many fields such as telephone calls, smart speakers, voice recognition, and smart security. Because environmental noise degrades speech, voice technology performs poorly in some products. Speech enhancement has therefore become an important step in improving voice quality and intelligibility and in solving product performance problems in real environments.

[0003] Speech enhancement technology received widespread attention in the 20th century, and many solutions have ...


Application Information

IPC(8): G10L21/0224, G10L21/0232, G10L21/0264, G10L25/30
CPC: G10L21/0224, G10L21/0232, G10L21/0264, G10L25/30
Inventors: 王曰海, 李斌, 李东洋, 胡冰
Owner ZHEJIANG UNIV