Voice enhancing method based on generative adversarial network

A speech enhancement technology, applied in speech analysis, instruments, etc., that addresses problems such as slow convergence and unstable training, and achieves the effects of reducing feature distribution differences, removing noise, and improving speech clarity and quality.

Active Publication Date: 2019-11-08
珠海亿智电子科技有限公司

AI Technical Summary

Problems solved by technology

The network model is nearly 10 times smaller than RNN-family models, but it converges slowly and is unstable during training.

Examples

Embodiment

[0040] The present invention will be further described below in combination with specific embodiments.

[0041] As shown in Figure 1, the present invention proposes a speech enhancement method based on a generative adversarial network (GAN). The method uses progressive training: the signal-to-noise ratio of the data and the number of network layers are increased together, so that the difficult speech denoising task is decomposed into several simpler denoising tasks and the model can more easily reconstruct the distribution of clean speech. In addition, discriminator-based feature matching is combined with the traditional feature mapping method, which reduces the difference between the feature distribution of the enhanced speech and that of the clean speech. Finally, the network is jointly optimized and trained using the GAN objective function to minimize the loss between the generator and the discrimi...
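
The progressive scheme described in [0041] can be illustrated with a small scheduling sketch. It is not the patent's implementation: the starting SNR, the step sizes, the layer counts, and the names `Stage`, `progressive_schedule`, `mix_at_snr` and `measured_snr_db` are all hypothetical, chosen only to show the SNR and the layer count growing together from stage to stage. The mixing helper uses the standard definition of SNR in decibels, 10·log10(P_speech / P_noise).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    index: int            # 1-based stage number
    num_layers: int       # layers trained at this stage (hypothetical)
    target_snr_db: float  # SNR of this stage's training data (hypothetical)


def progressive_schedule(start_snr_db, snr_step_db, start_layers, layer_step, num_stages):
    """Build stages in which the SNR and the layer count increase together."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    return [
        Stage(i + 1, start_layers + i * layer_step, start_snr_db + i * snr_step_db)
        for i in range(num_stages)
    ]


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale the noise so that clean + noise has the requested SNR in dB."""
    scale = math.sqrt(power(clean) / (power(noise) * 10 ** (snr_db / 10)))
    return [c + scale * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, mixture):
    """SNR in dB of a mixture, given the clean reference signal."""
    residual = [m - c for c, m in zip(clean, mixture)]
    return 10 * math.log10(power(clean) / power(residual))
```

Calling `progressive_schedule(0.0, 10.0, 2, 2, 3)` yields three stages with 2, 4 and 6 layers at 0, 10 and 20 dB; `mix_at_snr` could then prepare data at each stage's SNR. How the actual method chooses these values is not stated in the visible text.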

Abstract

The invention discloses a speech enhancement method based on a generative adversarial network (GAN). The method comprises the following steps: 1, a progressive training scheme is adopted to reconstruct the distribution of clean speech; 2, a discriminator-based feature matching strategy is adopted to optimize the enhancement performance of the generator; and 3, data covering multiple noise types are used for training to produce the generative adversarial network. The discriminator-based feature matching method is combined with a traditional feature mapping method, which effectively reduces the difference between the feature distribution of the enhanced speech and that of the clean speech. In addition, a GAN objective function is used to jointly optimize the network so that the loss between the generator and the discriminator is minimized.
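
One way to picture how the three loss terms named in the abstract might be combined for the generator is sketched below. This is an illustration under stated assumptions, not the patent's formulation: the L1 form of the feature matching term, the mean squared error form of the feature mapping term, the least-squares form of the adversarial term, the weights `w_match` and `w_map`, and all function names are hypothetical choices. Features are plain Python lists so the sketch runs without any framework.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def feature_matching_loss(disc_feats_clean, disc_feats_enhanced):
    """Average per-layer L1 distance between discriminator activations
    computed on clean speech and on enhanced speech."""
    if len(disc_feats_clean) != len(disc_feats_enhanced):
        raise ValueError("both inputs must cover the same discriminator layers")
    per_layer = [
        mean_abs_diff(c, e) for c, e in zip(disc_feats_clean, disc_feats_enhanced)
    ]
    return sum(per_layer) / len(per_layer)


def feature_mapping_loss(clean, enhanced):
    """Mean squared error between clean and enhanced speech features
    (assumed here as the 'traditional' regression-style mapping loss)."""
    if len(clean) != len(enhanced):
        raise ValueError("feature vectors must have the same length")
    return sum((c - e) ** 2 for c, e in zip(clean, enhanced)) / len(clean)


def generator_loss(disc_score_enhanced, disc_feats_clean, disc_feats_enhanced,
                   clean, enhanced, w_match=1.0, w_map=1.0):
    """Weighted sum of adversarial, feature matching and feature mapping terms."""
    # Least-squares adversarial term: an assumption, not taken from the source.
    adversarial = 0.5 * (disc_score_enhanced - 1.0) ** 2
    matching = feature_matching_loss(disc_feats_clean, disc_feats_enhanced)
    mapping = feature_mapping_loss(clean, enhanced)
    return adversarial + w_match * matching + w_map * mapping
```

When the enhanced features equal the clean ones and the discriminator scores the enhanced speech as 1.0, every term is zero. The matching term penalizes mismatch in the discriminator's internal view of the signal, while the mapping term penalizes mismatch in the signal features themselves.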

Description

Technical field

[0001] The invention relates to single-channel speech enhancement in the field of speech signal processing, and in particular to a speech enhancement method based on a generative adversarial network.

Background

[0002] In recent years, great breakthroughs have been made in automatic speech recognition (ASR) and speaker recognition, but under natural environmental conditions speech signals are corrupted by noise to varying degrees. In severe cases the speech is completely submerged in the noise, making the original content impossible to make out. Recognition systems therefore need the support of speech enhancement technology, which removes the noise component and provides high-quality, intelligible audio data for speech recognition tasks. As a front-end preprocessing stage, speech enhancement thus plays a vital role in noisy environments. The present invention is to study how to solve the chal...

Application Information

IPC(8): G10L21/0208, G10L21/0216, G10L25/03, G10L25/30
CPC: G10L21/0208, G10L21/0216, G10L25/03, G10L25/30
Inventor: 殷绪成, 赵力, 杨春
Owner: 珠海亿智电子科技有限公司