Speech enhancement method and system

A speech enhancement technology based on speech features, applied in speech analysis, instruments, and related fields, that addresses poor perceptual speech quality and low intelligibility and achieves a better speech enhancement effect.

Active Publication Date: 2021-03-30
SHENZHEN UNIV
Cites: 7; Cited by: 10

AI Technical Summary

Problems solved by technology

[0003] The technical problem to be solved by the present invention is to overcome the poor perceptual speech quality and low intelligibility that result from an unreasonable balance between speech distortion and residual noise in prior-art speech enhancement methods, by providing a speech enhancement method and system that adjusts speech distortion and residual noise.

Method used

Examples

Embodiment 1

[0035] The embodiment of the present invention provides a speech enhancement method that can be applied to scenarios such as cochlear implants, hearing aids, human-computer interaction systems, and speech communication. As shown in Figure 1, the method includes the following steps:

[0036] Step S1: Build a speech enhancement network model comprising three sub-neural networks. The first neural network is a shared part: together with the second neural network it forms the time-frequency mask prediction module, and together with the third neural network it forms the adaptive weight prediction module.
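The structure in step S1 can be sketched as a shared first network whose output feeds two heads. This is a minimal illustration, not the patented model: the class name `EnhancementNet`, the layer sizes, the single dense layer per sub-network, the tanh and sigmoid activations, and the random (untrained) weights are all assumptions chosen to keep the example short.

```python
import math
import random


def sigmoid(z):
    """Logistic function, used here to keep outputs inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, bias, activation):
    """Apply one fully connected layer to the feature vector x."""
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def init_layer(rng, n_in, n_out):
    """Random weights for illustration only; a real model would be trained."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class EnhancementNet:
    """Hypothetical three-sub-network model with a shared first network."""

    def __init__(self, n_features, n_hidden, n_bins, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, n_features, n_hidden)  # first network
        self.mask_head = init_layer(rng, n_hidden, n_bins)   # second network
        self.weight_head = init_layer(rng, n_hidden, 1)      # third network

    def forward(self, features):
        # The shared output is computed once and reused by both heads.
        latent = dense(features, *self.shared, math.tanh)
        mask = dense(latent, *self.mask_head, sigmoid)        # one value per bin
        alpha = dense(latent, *self.weight_head, sigmoid)[0]  # single weight
        return mask, alpha


net = EnhancementNet(n_features=8, n_hidden=16, n_bins=8)
mask, alpha = net.forward([0.1 * i for i in range(8)])
print(len(mask), alpha)
```

Sharing the first network means both heads see the same internal representation of the noisy input, so the weight head adds only a small number of extra parameters.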

[0037] In the embodiment of the present invention, the constructed neural network model includes two parallel modules. The adaptive weight prediction module judges the signal-to-noise ratio from the input features and adjusts the proportion of speech distortion and residual noise through the weight, and predicting ...
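One way to picture how a weight can shift the balance between speech distortion and residual noise is a loss with two terms. The excerpt does not give the patent's loss function, so the form below is an assumption: `alpha` scales a distortion term (how much the mask changes the clean speech) and `1 - alpha` scales a residual-noise term (how much noise the mask lets through). The function name `weighted_loss` and the toy magnitudes are hypothetical.

```python
def weighted_loss(mask, clean_mag, noise_mag, alpha):
    """Trade off speech distortion against residual noise with weight alpha.

    Assumed form (not taken from the patent text):
        alpha * mean((mask * S - S)^2) + (1 - alpha) * mean((mask * N)^2)
    where S and N are clean-speech and noise magnitudes per frequency bin.
    """
    n = len(mask)
    distortion = sum((m * s - s) ** 2 for m, s in zip(mask, clean_mag)) / n
    residual = sum((m * v) ** 2 for m, v in zip(mask, noise_mag)) / n
    total = alpha * distortion + (1.0 - alpha) * residual
    return total, distortion, residual


# Toy magnitudes for a single frame with four frequency bins.
clean = [1.0, 0.8, 0.1, 0.0]
noise = [0.2, 0.2, 0.2, 0.2]
mask = [0.9, 0.8, 0.3, 0.1]

for alpha in (0.2, 0.8):
    total, distortion, residual = weighted_loss(mask, clean, noise, alpha)
    print(f"alpha={alpha}: total={total:.6f}")
```

Under this assumed form, a larger `alpha` penalizes distortion more heavily and so favors preserving speech, while a smaller `alpha` favors removing noise.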

Embodiment 2

[0051] An embodiment of the present invention provides a speech enhancement system which, as shown in Figure 4, includes:

[0052] Model construction module 1 is used to construct a speech enhancement network model comprising three sub-neural networks. The first neural network is a shared part: together with the second neural network it forms the time-frequency mask prediction module, and together with the third neural network it forms the adaptive weight prediction module. This module executes the method described in step S1 of Embodiment 1, which is not repeated here.

[0053] Model training module 2 is used to input the speech features of a noisy speech signal into the network model. The first neural network generates an intermediate latent variable from the input speech features, and the intermediate latent variable simultaneously serves as the second neural network and the ...

Embodiment 3

[0057] An embodiment of the present invention provides a computer device. As shown in Figure 5, the device may include a processor 51 and a memory 52, which may be connected via a bus or in other ways; Figure 5 takes connection via a bus as an example.

[0058] As a non-transitory computer-readable storage medium, the memory 52 can be used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the embodiments of the present invention. By running the non-transitory software programs, instructions and modules stored in the memory 52, the processor 51 executes the various functional applications and data processing of the processor, that is, implements the speech enhancement method of Embodiment 1 above.

[0059] The memory 52 may include a program storage area and a data storage area, wherein the program storage area may store an ...

Abstract

The invention discloses a speech enhancement method and system. The constructed speech enhancement network model comprises two parallel modules: an adaptive weight prediction module judges the signal-to-noise ratio from the input features and uses the weight to adjust the proportion of speech distortion and residual noise, and a time-frequency mask prediction module estimates a noise-suppressing time-frequency mask from the input features. By training the network, the proportion of speech distortion and residual noise in the enhanced speech can be adjusted adaptively according to the signal-to-noise ratio, and the trained network model is then used for actual noise reduction tasks to obtain an enhanced speech signal. The method uses a neural network to adaptively adjust the speech distortion and residual noise in the enhanced speech and thereby obtains a better speech enhancement effect. Different adaptive weight ranges can be trained for different task requirements, yielding a speech enhancement algorithm better suited to the task at hand.
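At inference time, an estimated time-frequency mask is typically applied by scaling each bin of the noisy spectrum. The sketch below assumes a real-valued mask in [0, 1] multiplied onto complex STFT bins, which keeps the noisy phase unchanged; the abstract does not state how the patent applies the mask, and the function name `apply_mask` and the toy values are hypothetical.

```python
def apply_mask(noisy_spectrum, mask):
    """Scale each time-frequency bin of the noisy spectrum by its mask value.

    noisy_spectrum: list of frames, each a list of complex STFT bins.
    mask: same shape, real values in [0, 1]; the noisy phase is left as-is.
    """
    return [[m * x for m, x in zip(mask_frame, frame)]
            for mask_frame, frame in zip(mask, noisy_spectrum)]


# Two frames with two frequency bins each (toy values).
noisy = [[1 + 1j, 0.5j], [2 + 0j, -0.3 + 0.4j]]
mask = [[1.0, 0.0], [0.5, 0.2]]

enhanced = apply_mask(noisy, mask)
print(enhanced)
```

A mask value near 1 passes a bin through almost untouched, and a value near 0 suppresses it; an inverse STFT of the result (not shown) would give the enhanced waveform.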

Description

Technical field

[0001] The invention relates to the technical field of speech enhancement, in particular to a speech enhancement method and system.

Background technique

[0002] Speech is one of the most convenient and fastest ways for humans to communicate and transmit information. Background noise is everywhere, so human ears and microphones actually receive speech signals corrupted by noise. Noise can seriously affect human speech perception and the performance of speech products (e.g. hearing aids, automatic speech recognition systems, voice communications). Speech enhancement is a technique that removes or suppresses noise in noisy speech and is widely used as front-end processing for various speech-related tasks. In practice, speech enhancement algorithms inevitably introduce speech distortion and residual noise. Although deep learning has achieved very significant results in speech enhancement, most deep learning-based methods only consider the...

Claims

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L21/0208, G10L25/30
CPC: G10L21/0208, G10L25/30
Inventor: 康迂勇, 郑能恒
Owner: SHENZHEN UNIV