A speech enhancement method and system

A speech enhancement technology using speech characteristics, applied in speech analysis, instruments, and related fields. It addresses the problems of poor perceived speech quality and low intelligibility and achieves a good speech enhancement effect.

Active Publication Date: 2022-04-29

SHENZHEN UNIV

Cites: 7; Cited by: 0


AI Technical Summary

Problems solved by technology

[0003] Therefore, the technical problem to be solved by the present invention is to overcome the defects of poor perceived speech quality and low intelligibility caused by an unreasonable balance between speech distortion and residual noise in prior-art speech enhancement methods, by providing a speech enhancement method and system that adjusts speech distortion and residual noise.


Examples


Embodiment 1

[0035] This embodiment of the present invention provides a speech enhancement method that can be applied to scenarios such as cochlear implants, hearing aids, human-computer interaction systems, and speech communication. As shown in Figure 1, the method includes the following steps:

[0036] Step S1: Build a speech enhancement network model comprising three sub-networks. The first neural network is a shared part: together with the second neural network it forms the time-frequency mask prediction module, and together with the third neural network it forms the adaptive weight prediction module.

[0037] In this embodiment, the constructed neural network model includes two parallel modules. The adaptive weight prediction module judges the signal-to-noise ratio from the input features and uses the resulting weight to adjust the proportion of speech distortion and residual noise, and pred...
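The three-network layout of step S1 can be sketched roughly as follows. This is a minimal illustration only, not the patent's implementation: the single dense layer per network, the tanh and sigmoid activations, the random initial values, and the names `EnhancementModel`, `n_bins`, `n_latent`, `w_min`, and `w_max` are all assumptions introduced here.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class EnhancementModel:
    """Shared first network feeding a mask branch and a weight branch."""

    def __init__(self, n_bins, n_latent, w_min=0.0, w_max=1.0, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(n_bins, n_latent, rng)      # first network
        self.mask_head = init_layer(n_latent, n_bins, rng)   # second network
        self.weight_head = init_layer(n_latent, 1, rng)      # third network
        self.w_min, self.w_max = w_min, w_max                # assumed weight range

    def forward(self, features):
        # The shared network produces the intermediate latent variable.
        latent = [math.tanh(v) for v in dense(features, *self.shared)]
        # Mask branch: one value in (0, 1) per frequency bin.
        mask = [sigmoid(v) for v in dense(latent, *self.mask_head)]
        # Weight branch: one scalar squashed into [w_min, w_max].
        raw = sigmoid(dense(latent, *self.weight_head)[0])
        weight = self.w_min + (self.w_max - self.w_min) * raw
        return mask, weight


model = EnhancementModel(n_bins=8, n_latent=4, w_min=0.2, w_max=0.8)
mask, weight = model.forward([0.5] * 8)
print(len(mask), weight)
```

Both branches read the same latent vector, so the shared network is trained by gradients from both outputs. Bounding the scalar weight to a configurable range is one simple way to restrict the adaptive weight to a task-specific interval.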

Embodiment 2

[0051] This embodiment of the present invention provides a speech enhancement system. As shown in Figure 4, the system includes:

[0052] Model construction module 1, used for constructing a speech enhancement network model comprising three sub-networks. The first neural network is a shared part: together with the second neural network it forms the time-frequency mask prediction module, and together with the third neural network it forms the adaptive weight prediction module. This module executes the method described in step S1 of Embodiment 1, which is not repeated here.

[0053] Model training module 2, used for inputting the speech features of a noisy speech signal into the network model. The first neural network generates an intermediate latent variable from the input speech features, and the intermediate latent variable simultaneously serves as the second neural network and the ...

Embodiment 3

[0057] This embodiment of the present invention provides a computer device. As shown in Figure 5, the device may include a processor 51 and a memory 52, which may be connected via a bus or in other ways. Figure 5 takes connection via a bus as an example.

[0058] As a non-transitory computer-readable storage medium, the memory 52 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions / modules corresponding to the embodiments of the present invention. By running the non-transitory software programs, instructions, and modules stored in the memory 52, the processor 51 executes its various functional applications and data processing, that is, it implements the speech enhancement method of the method embodiment above.

[0059] The memory 52 may include a program storage area and a data storage area, wherein the program storage area may store an ...


Abstract

The invention discloses a speech enhancement method and system. The constructed speech enhancement network model includes two parallel modules: the adaptive weight prediction module judges the signal-to-noise ratio from the input features and adjusts the proportion of speech distortion and residual noise through a weight; the time-frequency mask prediction module estimates, from the input features, the time-frequency mask used to suppress noise. By training the network, the proportion of speech distortion and residual noise in the enhanced speech can be adaptively adjusted according to the signal-to-noise ratio, and the trained network model can be used for actual noise reduction tasks to obtain an enhanced speech signal. The invention uses a neural network to adaptively adjust the speech distortion and residual noise in the enhanced speech to obtain a better speech enhancement effect, and different adaptive weight ranges can be trained for different task requirements to obtain a speech enhancement algorithm better suited to the task at hand.
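One way to picture how a weight trades speech distortion against residual noise is a weighted sum of two energy terms for a single frame. The sketch below is an assumption made for illustration: the abstract does not give the loss, so the squared-error definitions, the convex combination, and the function names `speech_distortion`, `residual_noise`, and `weighted_loss` are hypothetical.

```python
def speech_distortion(mask, clean):
    """Energy removed from the clean speech by the mask (assumed definition)."""
    return sum((m * s - s) ** 2 for m, s in zip(mask, clean))


def residual_noise(mask, noise):
    """Noise energy that survives the mask (assumed definition)."""
    return sum((m * n) ** 2 for m, n in zip(mask, noise))


def weighted_loss(mask, clean, noise, weight):
    """Convex combination of the two terms; `weight` lies in [0, 1]."""
    return (weight * speech_distortion(mask, clean)
            + (1.0 - weight) * residual_noise(mask, noise))


# Hypothetical magnitudes for three frequency bins of one frame.
clean = [1.0, 0.5, 0.0]
noise = [0.2, 0.2, 0.4]
mask = [0.9, 0.6, 0.1]

for weight in (0.0, 0.5, 1.0):
    print(weight, weighted_loss(mask, clean, noise, weight))
```

Under this formulation a weight near 1 penalises distortion of the speech component most, while a weight near 0 penalises leftover noise most. A weight predicted from the input features could therefore shift the balance with the signal-to-noise ratio.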

Description

Technical field

[0001] The invention relates to the technical field of speech enhancement, in particular to a speech enhancement method and system.

Background

[0002] Speech is one of the most convenient and fastest ways for humans to communicate and transmit information. Background noise is everywhere, so human ears and microphones actually receive speech signals corrupted by noise. Noise can seriously affect human speech perception and the performance of speech products (e.g., hearing aids, automatic speech recognition systems, voice communications). Speech enhancement is a technique that removes or suppresses noise in noisy speech and is widely used in front-end processing for various speech-related tasks. In actual processing, speech enhancement algorithms inevitably introduce speech distortion and residual noise. Although deep learning has achieved very significant results in speech enhancement, most deep learning-based methods only consider the...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L21/0208, G10L25/30

CPC: G10L21/0208, G10L25/30

Inventor: 康迂勇, 郑能恒

Owner: SHENZHEN UNIV
