Complex field speech enhancement method and system based on generative adversarial network and medium

A speech enhancement technology operating in the complex domain, applied in speech analysis, instruments, etc. It aims to resolve phase mismatch, improve the auditory quality of the enhanced speech, and improve speech recognition accuracy.

Active Publication Date: 2020-01-31

SUN YAT SEN UNIV

11 Cites, 19 Cited by

Problems solved by technology

The generator and the discriminator are trained against each other. The generator tries to produce samples that follow the real distribution so that the discriminator judges them to be real, while the discriminator tries to separate real samples from generated ones. This game continues until a Nash equilibrium is reached, at which point the generated samples are very close to real samples and the discriminator can no longer tell whether a sample is real or generated.
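The equilibrium described above can be illustrated with a toy calculation. The sketch below assumes the standard GAN objective, for which the best possible discriminator against a fixed generator is D*(x) = p_real(x) / (p_real(x) + p_gen(x)); the source does not say which loss the patent actually uses, and the three-outcome distributions are hypothetical values chosen only for illustration.

```python
def optimal_discriminator(p_real, p_gen):
    """Return D*(x) = p_real(x) / (p_real(x) + p_gen(x)) for each outcome x.

    Assumes the standard GAN objective; this is an illustration,
    not the loss specified by the patent.
    """
    return [pr / (pr + pg) for pr, pg in zip(p_real, p_gen)]


# Hypothetical toy distributions over three outcomes (not from the patent).
p_real = [0.5, 0.25, 0.25]
p_gen_early = [0.125, 0.125, 0.75]   # generator still far from the real data
p_gen_converged = [0.5, 0.25, 0.25]  # generator matches the real distribution

d_early = optimal_discriminator(p_real, p_gen_early)
d_converged = optimal_discriminator(p_real, p_gen_converged)

print(d_early)      # scores differ per outcome: real and generated are separable
print(d_converged)  # every score is 0.5: the discriminator can only guess
```

Once the generated distribution matches the real one, every score collapses to 0.5, which is the sense in which the discriminator "cannot judge" a sample.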

Embodiment Construction

[0035] As shown in Figure 1, the complex-domain speech enhancement method based on a generative adversarial network in this embodiment comprises the following steps:

[0036] 1) Obtain noisy speech;

[0037] 2) Apply a Fourier transform to the speech and use the Cartesian-coordinate representation to obtain the noisy real spectrum R and imaginary spectrum I;

[0038] 3) Input the noisy real spectrum R and imaginary spectrum I into the generator of the pre-trained generative adversarial network. The generator's encoder (Encoder) encodes the input IR, composed of the real spectrum R and the imaginary spectrum I, into the high-level semantic feature Encoder_IR; the generator's self-attention layer (self-attention) then maps Encoder_IR to the feature S_IR, which carries global information; finally, the generator's decoder (Decoder) decodes the feature S_IR into the real spectrum and the...
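To make "a feature with global information" concrete, here is a minimal sketch of scaled dot-product self-attention over a short sequence of feature vectors. It is an assumption-laden illustration: the query, key and value projections are taken to be the identity, the three input vectors are hypothetical, and the patent's actual self-attention layer (whose description is truncated above) may be structured differently.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    A real layer would learn the projections; identity maps are an
    assumption made here to keep the sketch short.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in features:
        # Similarity of this position to every position in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum over all positions: each output sees the whole input.
        outputs.append([sum(w * value[c] for w, value in zip(row, features))
                        for c in range(dim)])
    return outputs, weights


# Hypothetical encoder output: three positions, two channels each.
encoder_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention_weights = self_attention(encoder_features)
```

Every row of `attention_weights` is strictly positive and sums to 1, so each output vector mixes information from all positions rather than from a local neighbourhood only.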

Abstract

The invention discloses a complex-domain speech enhancement method, system and medium based on a generative adversarial network. The method comprises the following steps: noisy speech is acquired; the speech is Fourier-transformed and expressed in Cartesian coordinates to obtain a noisy real spectrum and imaginary spectrum; the noisy real and imaginary spectra are input into the pre-trained generator of the generative adversarial network to obtain the denoised real and imaginary spectra of the clean speech; and the clean speech is reconstructed from these real and imaginary spectra by an inverse Fourier transform. The method removes noise from the speech signal more effectively, addresses the difficulty of predicting phase, improves the auditory quality of the enhanced speech, and increases the recognition accuracy of speech recognition systems in noisy environments.
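The data flow in the abstract can be sketched end to end as follows. This is a simplified illustration under stated assumptions: it uses a plain single-frame DFT written with the standard library instead of whatever framed transform the patent uses, and `identity_generator` is a hypothetical stand-in for the trained generator that simply returns its input unchanged.

```python
import cmath
import math


def dft(frame):
    """Discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT; returns the real part of the reconstructed frame."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def enhance(noisy_frame, generator):
    """Transform, split into Cartesian parts, apply generator, invert."""
    spectrum = dft(noisy_frame)
    real = [c.real for c in spectrum]   # real spectrum R
    imag = [c.imag for c in spectrum]   # imaginary spectrum I
    clean_real, clean_imag = generator(real, imag)
    clean_spectrum = [complex(r, i) for r, i in zip(clean_real, clean_imag)]
    return idft(clean_spectrum)


def identity_generator(real, imag):
    """Hypothetical placeholder for the trained generator (does nothing)."""
    return real, imag


# Hypothetical 8-sample frame, used only to exercise the round trip.
noisy_frame = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
restored = enhance(noisy_frame, identity_generator)
```

Because the generator works on the real and imaginary parts directly, the phase is carried implicitly by the pair (R, I) and never has to be estimated as a separate quantity. With the identity placeholder, `restored` reproduces `noisy_frame` up to floating-point error.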

Description

Technical field

[0001] The present invention relates to speech noise reduction and enhancement technology based on generative adversarial networks, and in particular to a complex-domain speech enhancement method, system and medium based on a generative adversarial network. The speech signal is enhanced to support downstream tasks such as speech recognition.

Background

[0002] Speech enhancement (SE) refers to removing the noise z from the noisy speech y so as to recover the clean speech x, that is, x = y - z. Removing noise from a mixed speech signal is one of the most challenging tasks in speech signal processing. Traditional speech enhancement algorithms include spectral subtraction, the subspace method and Wiener filtering. In recent years, deep-learning-based speech enhancement techniques have greatly improved the quality of denoised speech.

[0003] In a general speech signal processing method, the speec...
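The additive model x = y - z carries over to the frequency domain because the Fourier transform is linear. Writing Y, X and Z for the spectra of y, x and z at frequency bin k, and θ_X, θ_Z for the phases of X and Z (symbols introduced here for illustration, not taken from the patent), the relation holds separately for the real and imaginary parts but not for the magnitudes:

```latex
\begin{aligned}
Y(k) &= X(k) + Z(k), \\
\operatorname{Re} Y(k) &= \operatorname{Re} X(k) + \operatorname{Re} Z(k), \qquad
\operatorname{Im} Y(k) = \operatorname{Im} X(k) + \operatorname{Im} Z(k), \\
|Y(k)|^2 &= |X(k)|^2 + |Z(k)|^2
  + 2\,|X(k)|\,|Z(k)|\cos\bigl(\theta_X(k) - \theta_Z(k)\bigr).
\end{aligned}
```

The cross term in the last line depends on the phase difference between speech and noise, which is one way to see why a magnitude-only method has to handle phase separately, whereas a method working on the real and imaginary spectra does not.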

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L21/0208, G10L25/30

CPC: G10L21/0208, G10L25/30

Inventors: 刘刚, 陈志广, 肖侬

Owner: SUN YAT SEN UNIV
