Multi-target voice enhancement method based on SCNN (Stacked Convolutional Neural Network) and TCNN (Temporal Convolutional Neural Network) joint estimation

A joint-estimation speech enhancement technology, applicable to speech analysis, speech recognition, and related instruments, which addresses the problem of unsatisfactory speech enhancement performance

Active Publication Date: 2020-03-06
BEIJING UNIV OF TECH


Problems solved by technology

[0006] The purpose of the invention is to propose a brand-new multi-objective speech enhancement algorithm for the unsat...



Examples

Specific embodiments

[0034] As shown in Figure 1, the present invention provides a new speech enhancement method based on multi-target learning, comprising the following steps:

[0035] Step 1: apply framing and windowing to the input signal to obtain its time-frequency representation;

[0036] (1) First, time-frequency decomposition is performed on the input signal;

[0037] The speech signal is a typical time-varying signal. Time-frequency decomposition focuses on the time-varying spectral characteristics of the components of a real speech signal, decomposing the one-dimensional speech signal into a two-dimensional time-frequency representation. The aim is to reveal which frequency components a speech signal contains and how each component varies with time.

[0038] First, the original speech signal y(p) is preprocessed as in Equation (1): the signal is divided into frames, and each frame is smoothed by a Hamming window...
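The framing-and-windowing front end described in steps [0035]-[0038] can be sketched as follows. This is a minimal NumPy illustration, not the patent's exact procedure: the frame length of 512 samples, hop of 256, and 16 kHz sampling rate are assumed values, since Equation (1) itself is truncated in the source text.

```python
import numpy as np

def frame_and_window(y, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping frames and apply a Hamming window.

    frame_len and hop are illustrative choices, not the patent's parameters.
    """
    n_frames = 1 + (len(y) - frame_len) // hop
    win = np.hamming(frame_len)
    frames = np.stack([y[i * hop : i * hop + frame_len] for i in range(n_frames)])
    return frames * win

def log_power_spectra(frames, eps=1e-12):
    """Per-frame log-power spectrum (LPS) via the real FFT of each windowed frame."""
    spec = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spec) ** 2 + eps)

# Example: 1 s of white noise at 16 kHz -> 2-D time-frequency representation
y = np.random.randn(16000)
lps = log_power_spectra(frame_and_window(y))
print(lps.shape)  # (n_frames, frame_len // 2 + 1)
```

The resulting LPS matrix is the two-dimensional time-frequency representation that paragraph [0037] describes: one axis indexes frames (time), the other FFT bins (frequency).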



Abstract

The invention provides a multi-target voice enhancement method based on SCNN (Stacked Convolutional Neural Network) and TCNN (Temporal Convolutional Neural Network) joint estimation. On the basis of a SCNN and a TCNN, a new stacked and temporal convolutional neural network (STCNN) is provided; a log-power spectra (LPS) is used as the main characteristic and input into the SCNN, so that high-level abstract characteristics are extracted; then, a power-function-compressed Mel-frequency cepstral coefficient (PC-MFCC), which better accords with the auditory characteristics of human ears, is provided; the TCNN takes the high-level abstract characteristics extracted by the SCNN and the PC-MFCC as input; then, sequence modelling is carried out; furthermore, joint estimation of the clean LPS, PC-MFCC and an ideal ratio mask (IRM) is carried out; finally, in the enhancement stage, different voice characteristics have complementarity in the voice synthesis process; an IRM-based post-processing method is provided; and enhanced speech is synthesized by adaptively adjusting the weights of the estimated LPS and IRM through voice presence information.
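Two of the quantities named in the abstract, the PC-MFCC feature and the ideal ratio mask, can be sketched as below. This is a hedged illustration only: the compression exponent alpha = 1/15, the 40-band Mel filterbank, and the 13 retained cepstral coefficients are assumed values, since the patent text does not state its actual parameters here. The key idea shown is replacing the logarithm in the standard MFCC pipeline with a power-function compression x**alpha.

```python
import numpy as np

def mel_filterbank(n_filters=40, n_fft=512, sr=16000):
    """Triangular Mel filterbank mapping n_fft//2+1 FFT bins to n_filters bands."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = inv_mel(np.linspace(0.0, mel(sr / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising slope
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling slope
    return fb

def dct2(x):
    """Unnormalized DCT-II along the last axis (sufficient for a sketch)."""
    N = x.shape[-1]
    n = np.arange(N)
    basis = np.cos(np.pi / N * (n[None, :] + 0.5) * n[:, None])  # basis[k, n]
    return x @ basis.T

def pc_mfcc(power_spec, fb, alpha=1.0 / 15.0, n_ceps=13):
    """PC-MFCC sketch: power-law compression (x**alpha) replaces the usual log.

    alpha and n_ceps are assumed values, not taken from the patent.
    """
    mel_energy = power_spec @ fb.T                        # (frames, n_filters)
    compressed = np.maximum(mel_energy, 1e-12) ** alpha   # power-function compression
    return dct2(compressed)[:, :n_ceps]

def ideal_ratio_mask(clean_ps, noise_ps):
    """IRM: per-bin square-root ratio of clean power to total power, in [0, 1]."""
    return np.sqrt(clean_ps / (clean_ps + noise_ps + 1e-12))

# Toy usage with random power spectra (61 frames, 257 FFT bins)
ps = np.abs(np.random.randn(61, 257)) ** 2
fb = mel_filterbank()
print(pc_mfcc(ps, fb).shape)  # (61, 13)
irm = ideal_ratio_mask(ps, np.abs(np.random.randn(61, 257)) ** 2)
```

Because the IRM lies in [0, 1], it can serve directly as a per-bin weight when blending mask-based and LPS-based estimates at synthesis time, which is the role the abstract's IRM-based post-processing plays.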

Description

Technical field:

[0001] The invention belongs to the technical field of speech signal processing, and relates to speech recognition and speech enhancement in mobile speech communication, which are key speech signal processing technologies.

Background technique:

[0002] The purpose of speech enhancement is to remove background noise from noisy speech and improve its quality and intelligibility. Single-channel speech enhancement technology is widely used in many fields of speech signal processing, including mobile speech communication, speech recognition, and digital hearing aids. However, at present, the performance of speech enhancement systems in these fields in real acoustic environments is not always satisfactory. Traditional speech enhancement techniques, such as spectral subtraction, Wiener filtering, minimum mean square error, statistical models, and wavelet transforms, are unsupervised speech enhancement methods that have been extensively studied in the past...

Claims


Application Information

IPC(8): G10L15/20; G10L15/02; G10L15/06; G10L21/0216; G10L21/0264; G10L25/03; G10L25/24; G10L25/30
CPC: G10L15/20; G10L15/02; G10L15/063; G10L21/0216; G10L21/0264; G10L25/03; G10L25/24; G10L25/30
Inventor: 李如玮, 孙晓月, 李涛, 赵丰年
Owner BEIJING UNIV OF TECH