
Complex-domain speech enhancement method, system, and medium based on a generative adversarial network

A speech enhancement technology operating in the complex domain, applied in speech analysis, instruments, etc. The generator is trained until the discriminator cannot judge whether a sample is real or generated, which improves speech recognition accuracy, resolves phase mismatch, and improves the perceived quality of the enhanced speech.

Active Publication Date: 2020-01-31
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

The generator and the discriminator are trained against each other. The generator tries to produce samples that follow the real data distribution so that the discriminator judges them as real, while the discriminator tries to separate real samples from generated ones. This game continues until a Nash equilibrium is reached, at which point the generated samples are very close to real samples and the discriminator cannot judge whether a given sample is real or generated.
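The adversarial game described above can be illustrated with a minimal sketch. It assumes the standard cross-entropy GAN losses with a non-saturating generator loss; the page does not state which loss the patent actually uses, and the discriminator outputs below are made-up numbers chosen only for illustration.

```python
import numpy as np

def bce(pred, target, eps=1e-12):
    """Binary cross-entropy averaged over the batch."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def discriminator_loss(d_real, d_fake):
    # The discriminator wants to output 1 for real samples and 0 for generated ones.
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    # The generator wants the discriminator to output 1 for generated samples
    # (non-saturating form, an assumption of this sketch).
    return bce(d_fake, np.ones_like(d_fake))

# Hypothetical discriminator outputs early in training: real and fake are easy to tell apart.
d_real_early = np.array([0.90, 0.95, 0.85])
d_fake_early = np.array([0.10, 0.05, 0.20])

# Hypothetical outputs at equilibrium: the discriminator can only guess (0.5 everywhere).
d_equilibrium = np.full(3, 0.5)

loss_d_early = discriminator_loss(d_real_early, d_fake_early)
loss_g_early = generator_loss(d_fake_early)
loss_d_eq = discriminator_loss(d_equilibrium, d_equilibrium)
loss_g_eq = generator_loss(d_equilibrium)

print(f"early:       D loss = {loss_d_early:.3f}, G loss = {loss_g_early:.3f}")
print(f"equilibrium: D loss = {loss_d_eq:.3f}, G loss = {loss_g_eq:.3f}")
```

Under these assumptions, the discriminator loss at equilibrium equals 2 ln 2 and the generator loss equals ln 2, which is the point where the discriminator's output carries no information about whether a sample is real or generated.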



Examples


Detailed Description of the Embodiments

[0035] As shown in Figure 1, the complex-domain speech enhancement method based on a generative adversarial network in this embodiment comprises the following steps:

[0036] 1) Obtain the noisy speech;

[0037] 2) Apply the Fourier transform to the noisy speech and represent the result in Cartesian coordinates to obtain the noisy real spectrum R and imaginary spectrum I;

[0038] 3) Input the noisy real spectrum R and imaginary spectrum I into the generator of the pre-trained generative adversarial network. The generator's encoder (Encoder) encodes the input IR, composed of the real spectrum R and the imaginary spectrum I, into a high-level semantic feature Encoder_IR; the generator's self-attention layer (self-attention) turns Encoder_IR into a feature S_IR that carries global information; through the generator's decoder (Decoder), the feature S_IR the real spectrum and the...
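Steps 2) and 3) can be sketched as follows. This is a hedged illustration, not the patent's implementation: the frame length, hop size, and feature width are arbitrary choices, the learned encoder is replaced by a simple per-frame concatenation of R and I, the projection matrices are random rather than trained, and scaled dot-product attention is assumed because the excerpt does not specify the form of the self-attention layer.

```python
import numpy as np

def stft(signal, frame_len=256, hop=128):
    """Short-time Fourier transform with a Hann window; returns (frames, bins) complex values."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def self_attention(features, w_q, w_k, w_v):
    """Scaled dot-product self-attention over the time axis.

    features: (frames, channels). Every output frame is a weighted sum over
    all frames, which is how the layer gathers global information.
    """
    q, k, v = features @ w_q, features @ w_k, features @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over frames
    return weights @ v, weights

rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
# Synthetic "noisy speech": a 440 Hz tone plus white noise (illustrative only).
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * rng.standard_normal(sample_rate)

# Step 2: Cartesian representation of the spectrum.
spec = stft(noisy)
R, I = spec.real, spec.imag
IR = np.stack([R, I], axis=0)                 # two-channel input: (2, frames, bins)

# Step 3 (stand-in): hypothetical encoder output, one feature vector per frame.
encoder_ir = np.concatenate([R, I], axis=1)   # (frames, 2 * bins)
d_model = 32
w_q, w_k, w_v = (rng.standard_normal((encoder_ir.shape[1], d_model)) * 0.01
                 for _ in range(3))
s_ir, attn = self_attention(encoder_ir, w_q, w_k, w_v)

print(IR.shape, encoder_ir.shape, s_ir.shape, attn.shape)
```

Each row of `attn` sums to 1, so every frame of `s_ir` mixes information from the whole utterance rather than from a local neighbourhood only.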


Abstract

The invention discloses a complex-domain speech enhancement method, system, and medium based on a generative adversarial network. The method comprises the following steps: a noisy speech signal is acquired; the speech is Fourier-transformed and expressed in Cartesian coordinates to obtain a noisy real spectrum and imaginary spectrum; the noisy real and imaginary spectra are input into the pre-trained generator of the generative adversarial network to obtain the denoised real and imaginary spectra of the clean speech; and the clean speech is synthesized from these real and imaginary spectra by the inverse Fourier transform. The method removes noise from a speech signal more effectively to generate clean speech, addresses the difficulty of predicting phase, improves the auditory quality of the enhanced speech, and increases the accuracy of speech recognition systems in noisy environments.
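The end-to-end pipeline in the abstract can be sketched as a round trip through the real and imaginary spectra. This is a minimal sketch under stated assumptions: `placeholder_generator` is a hypothetical stand-in that returns its input unchanged (the trained generator is not available here), and the frame length, hop size, and periodic Hann window are illustrative choices. It shows that the pair (R, I) is enough to resynthesize the waveform, whereas discarding the phase is not.

```python
import numpy as np

FRAME, HOP = 256, 128
# Periodic Hann window: copies shifted by HOP = FRAME / 2 sum exactly to 1.
WINDOW = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(FRAME) / FRAME)

def stft(x):
    """Windowed STFT; pads so that every sample is covered by two frames."""
    x = np.pad(x, (HOP, HOP + (-len(x)) % HOP))
    n_frames = 1 + (len(x) - FRAME) // HOP
    frames = np.stack([x[i * HOP : i * HOP + FRAME] * WINDOW for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, length):
    """Inverse STFT by overlap-add, trimmed back to the original length."""
    frames = np.fft.irfft(spec, n=FRAME, axis=1)
    out = np.zeros(HOP * (len(frames) - 1) + FRAME)
    for i, frame in enumerate(frames):
        out[i * HOP : i * HOP + FRAME] += frame
    return out[HOP : HOP + length]

def placeholder_generator(R, I):
    # Hypothetical stand-in for the pre-trained generator: identity mapping.
    return R, I

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * rng.standard_normal(t.size)

spec = stft(noisy)                                     # Fourier transform
R_hat, I_hat = placeholder_generator(spec.real, spec.imag)
enhanced = istft(R_hat + 1j * I_hat, len(noisy))       # inverse Fourier transform
roundtrip_error = np.max(np.abs(enhanced - noisy))

# For contrast: keep only the magnitude and discard the phase.
magnitude_only = istft(np.abs(spec).astype(complex), len(noisy))
phase_free_error = np.max(np.abs(magnitude_only - noisy))

print(f"round-trip error from (R, I): {roundtrip_error:.2e}")
print(f"error when phase is dropped:  {phase_free_error:.2e}")
```

Because the phase is carried implicitly by R and I, a generator that predicts both spectra does not need a separate phase estimate before the inverse transform.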

Description

Technical field

[0001] The present invention relates to speech noise reduction and enhancement technology based on a generative adversarial network, in particular to a method, system, and medium for speech enhancement in the complex domain based on a generative adversarial network. The speech signal is enhanced to facilitate related downstream tasks such as speech recognition.

Background

[0002] Speech enhancement (SE) refers to removing the noise z from the noisy speech y so as to recover the clean speech x, that is, x = y - z. Removing noise from a mixed speech signal is one of the most challenging tasks in speech signal processing. Traditional speech enhancement algorithms include spectral subtraction, the subspace method, and Wiener filtering. In recent years, deep learning-based speech enhancement techniques have greatly improved the quality of denoised speech.

[0003] In a general speech signal processing method, the speec...
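The additive model x = y - z can be checked numerically. The sketch below uses random signals as illustrative stand-ins for speech and noise. Because the Fourier transform is linear, the relation holds separately for the real and imaginary spectra, but it does not hold for magnitude spectra.

```python
import numpy as np

rng = np.random.default_rng(1)
clean = rng.standard_normal(512)          # stand-in for clean speech x
noise = 0.5 * rng.standard_normal(512)    # stand-in for noise z
noisy = clean + noise                     # noisy speech y = x + z

X, Z, Y = np.fft.rfft(clean), np.fft.rfft(noise), np.fft.rfft(noisy)

# Linearity: subtraction works component-wise on real and imaginary parts.
real_additive = np.allclose(Y.real - Z.real, X.real)
imag_additive = np.allclose(Y.imag - Z.imag, X.imag)

# Magnitudes do not subtract this way, since |Y| depends on the relative phase of X and Z.
magnitude_additive = np.allclose(np.abs(Y) - np.abs(Z), np.abs(X))

print(real_additive, imag_additive, magnitude_additive)
```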


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L21/0208, G10L25/30
CPC: G10L21/0208, G10L25/30
Inventor: 刘刚, 陈志广, 肖侬
Owner: SUN YAT SEN UNIV