End-to-end Blind Speech Enhancement with Atrous Causal Convolutional Generative Adversarial Networks

A network-based convolutional technology, applied in speech analysis, instruments, and related fields, that addresses poor auditory perception and the lack of high-frequency components in bone-conducted speech. The listed effects are increased computing cost, easy reconstruction, and good enhancement performance.

Active Publication Date: 2021-12-24

TIANJIN UNIV


Problems solved by technology

[0007] To overcome the deficiencies of the prior art, the present invention proposes an end-to-end bone-conducted speech enhancement algorithm based on an atrous (dilated) causal convolutional generative adversarial network. The proposed system is trained end to end (i.e., waveform in, waveform out): training yields the best network model parameters, and the trained model is then used to enhance bone-conducted speech. This addresses the lack of high-frequency components in bone-conducted speech, its poor auditory perception, and the difficulty of communicating against a strong noise background.


Embodiment Construction

[0034] The technical solution that realizes the object of the present invention is a generative adversarial network structure based on atrous causal convolution for end-to-end speech enhancement. Unlike most existing enhancement methods, it directly uses the raw audio sample points of bone-conducted speech as input data and clean air-conducted raw audio as the training target, and it constructs and trains an atrous causal convolutional generative adversarial enhancement network.
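The waveform-in, waveform-out pairing described above can be sketched as follows. This is a minimal illustration, not the patent's data pipeline: the function name `paired_frames`, the frame length, and the hop size are assumptions introduced here, and the two recordings are assumed to be time-aligned and equally long.

```python
def paired_frames(bone, air, frame_len, hop):
    """Slice two time-aligned waveforms into (input, target) frame pairs.

    bone: raw bone-conducted samples (network input)
    air:  clean air-conducted samples (training target)
    frame_len, hop: hypothetical framing parameters, in samples
    """
    if len(bone) != len(air):
        raise ValueError("bone- and air-conducted recordings must be time-aligned")
    pairs = []
    # Step through the recording and cut the same window from both signals.
    for start in range(0, len(bone) - frame_len + 1, hop):
        stop = start + frame_len
        pairs.append((bone[start:stop], air[start:stop]))
    return pairs


# Toy waveforms standing in for real recordings.
bone = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
air = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
pairs = paired_frames(bone, air, frame_len=4, hop=2)
```

No spectral features are computed at any point: each pair consists of raw sample values only, which is what "end to end" means in this context.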

[0035] The atrous causal convolutional generative adversarial network comprises a generator and a discriminator. The generator uses atrous causal convolution to perform deep semantic feature extraction and feature transformation on the network's input data, and outputs enhanced samples. The discriminator takes as input the original audio data and the enhanced speech samples produced by the generator, and uses the convolution layers in the discriminator to e...
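As a rough illustration of the operation the generator is built on, the sketch below implements a single-channel atrous (dilated) causal convolution in plain Python. The kernel weights, dilation values, and helper names are assumptions chosen for the example; the layer configuration of the patented network is not given in this excerpt.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zero, so y[t] never depends
    on any input later than x[t] (the causal property).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Input samples seen by one output of a stack of dilated causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
# With weights [1, 1] and dilation 2, each output is x[t] + x[t - 2].
y = dilated_causal_conv1d(x, weights=[1.0, 1.0], dilation=2)

# Changing only the last input sample must leave earlier outputs untouched.
x_changed = x[:-1] + [100.0]
y_changed = dilated_causal_conv1d(x_changed, weights=[1.0, 1.0], dilation=2)
```

Stacking such layers with growing dilations widens the context quickly: with a kernel size of 2 and hypothetical dilations 1, 2, 4, 8, `receptive_field(2, [1, 2, 4, 8])` covers 16 input samples using only four layers.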


Abstract

The invention relates to the fields of artificial intelligence and medical rehabilitation equipment. To provide an end-to-end bone-conducted speech enhancement method and to address the lack of high-frequency components in bone-conducted speech, its poor auditory perception, and communication against a strong noise background, the invention discloses an end-to-end blind enhancement method for bone-conducted speech based on an atrous causal convolutional generative adversarial network. The method takes the raw audio sample points of bone-conducted speech as input data and clean air-conducted raw audio as the training target, and trains the atrous causal convolutional generative adversarial enhancement network on the bone-conducted speech input. The network comprises a generator and a discriminator. The generator uses atrous causal convolution to output enhanced samples; the discriminator takes as input the original audio data and the enhanced speech samples produced by the generator, extracts deep nonlinear features with its convolution layers, and thereby judges the deep similarity of the samples. The invention is mainly used in the design and manufacture of bone-conducted speech enhancement equipment.
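The generator and discriminator roles can be made concrete with a small sketch of an adversarial objective. The abstract does not state which loss the invention uses, so the standard binary cross-entropy GAN loss with the non-saturating generator term is assumed here purely for illustration; the function names are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss for the discriminator (assumed form, not from the patent).

    d_real: discriminator score in (0, 1) for clean air-conducted audio
    d_fake: discriminator score in (0, 1) for the generator's enhanced output
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: small when the discriminator is fooled."""
    return -math.log(d_fake)


# An undecided discriminator scores both inputs at 0.5.
undecided = discriminator_loss(0.5, 0.5)
# A confident, correct discriminator has a lower loss.
confident = discriminator_loss(0.9, 0.1)
```

Under this assumed objective the two networks pull in opposite directions: the discriminator's loss falls as it separates clean from enhanced audio, while the generator's loss falls as its output is scored closer to clean audio.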

Description

Technical field

[0001] The invention relates to the field of artificial intelligence, in particular to a training method and system for an end-to-end speech enhancement model. Specifically, it relates to an end-to-end blind enhancement method for bone-conducted speech using a generative adversarial network with dilated causal convolutions.

Background

[0002] A bone conduction microphone (BCM) differs from a traditional air conduction microphone (ACM) in that the sound collected by a BCM is not transmitted through the air. Instead, a highly sensitive vibration sensor picks up the vibrations of human bone or tissue, which are then converted into audio signals. Its advantage is that it shields noise at the sound source and so prevents environmental noise at both ends of the communication system from being transmitted in the first place. Therefore, even in a strong noise environment, useful signals can be clearly...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L21/02, G10L25/30

CPC: G10L21/02, G10L25/30

Inventor: 魏建国, 胡宏周, 何宇清, 路文焕

Owner: TIANJIN UNIV
