
A WaveNet-Based Bone Conduction Speech Enhancement Waveform Generation Method

A waveform generation and speech enhancement technology, applied in the field of bone conduction. It addresses problems such as difficult modeling, unsatisfactory conversion results, and degraded speech quality, and it provides effective spectrum expansion and improved speech quality.

Active Publication Date: 2022-02-18
ARMY ENG UNIV OF PLA

AI Technical Summary

Problems solved by technology

However, because the excitation signal is closely tied to vocal cord movement and friction noise, its characteristics show no obvious regularity, so it is difficult to model and the conversion results are not ideal.
Speech waveform synthesis based on the inverse short-time Fourier transform (STFT) has a problem of its own. Because the STFT analyzes the signal in overlapping frames, signal information is partially shared between adjacent frames, which introduces a specific correlation among the STFT coefficients. If the speech waveform is synthesized from the enhanced amplitude spectrum and the original bone conduction speech phase, the amplitude spectrum and the phase spectrum no longer satisfy this constraint. Because of this mismatch, the synthesized speech is unlikely to be optimal even if the amplitude spectrum is optimal.
In bone conduction speech enhancement, bone conduction speech and air conduction speech are highly similar in the low-frequency range, so speech synthesized with the original bone conduction phase is of acceptable quality, but the loss of sound quality caused by the mismatched phase spectrum is still noticeable.
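
The STFT mismatch described above can be checked numerically. The following is a minimal sketch, not the patent's procedure: the frame length, hop size, Hann window, and the two synthetic test signals are all assumptions chosen for illustration. It pairs the amplitude spectrum of a wideband signal with the phase of a narrowband one, resynthesizes, and measures how far the re-analyzed amplitude drifts from the amplitude that was requested.

```python
import numpy as np

N_FFT, HOP = 256, 64                      # assumed frame length and hop (75% overlap)
WINDOW = np.hanning(N_FFT + 1)[:-1]       # periodic Hann window

def stft(x):
    """Windowed, overlapping frames followed by a real FFT per frame."""
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP:i * HOP + N_FFT] * WINDOW for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec):
    """Weighted overlap-add synthesis (least-squares inverse of stft)."""
    frames = np.fft.irfft(spec, n=N_FFT, axis=1)
    length = N_FFT + HOP * (len(frames) - 1)
    out, norm = np.zeros(length), np.zeros(length)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + N_FFT] += frame * WINDOW
        norm[i * HOP:i * HOP + N_FFT] += WINDOW ** 2
    return out / np.maximum(norm, 1e-12)

def inconsistency(spec):
    """Relative change in amplitude after one synthesis/analysis round trip."""
    mag = np.abs(spec)
    return np.linalg.norm(np.abs(stft(istft(spec))) - mag) / np.linalg.norm(mag)

rng = np.random.default_rng(0)
fs = 16000                                # assumed sampling rate in Hz
t = np.arange(4096) / fs
# Stand-in for band-limited speech: a low tone plus faint noise.
narrow = np.sin(2 * np.pi * 300 * t) + 0.01 * rng.standard_normal(t.size)
# Stand-in for the enhanced target: same signal with added high-frequency content.
wide = narrow + 0.5 * np.sin(2 * np.pi * 3000 * t)

S_narrow, S_wide = stft(narrow), stft(wide)
# Target amplitude combined with the phase of the narrowband signal.
hybrid = np.abs(S_wide) * np.exp(1j * np.angle(S_narrow))

err_consistent = inconsistency(S_wide)
err_hybrid = inconsistency(hybrid)
print(f"true STFT: {err_consistent:.2e}, amplitude + foreign phase: {err_hybrid:.2e}")
```

A spectrogram taken from a real signal survives the round trip essentially unchanged, while the hybrid one does not, which is the loss the text attributes to the mismatched phase spectrum.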



Examples


Embodiment

[0100] The following is a detailed introduction to WaveNet-based waveform generation:

[0101] WaveNet is a fully probabilistic autoregressive generative model. Through a specially constructed deep convolutional neural network, it models speech directly at the waveform level. It usually requires additional conditioning inputs to guide the generation of speech waveforms with specific properties.

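[0102] Let the speech waveform sequence be $\mathbf{x} = \{x_1, \ldots, x_T\}$. Its joint probability distribution under the conditional feature $\lambda$ can then be expressed as the product of the following conditional probabilities:

[0103]

$$p(\mathbf{x} \mid \lambda) = \prod_{t=1}^{T} p(x_t \mid x_1, \ldots, x_{t-1}, \lambda) \tag{1}$$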
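
The factorization into conditional probabilities described above can be illustrated with a toy model. This is a sketch under stated assumptions: the four-level alphabet `K`, the function `conditional`, and the scalar condition `lam` are hypothetical stand-ins, not the network used in the patent. It only shows that summing per-sample log-conditionals gives a valid joint distribution over whole sequences.

```python
import itertools
import numpy as np

K = 4  # hypothetical number of quantization levels per sample

def conditional(history, lam):
    """Toy stand-in for p(x_t | x_1..x_{t-1}, lam): a softmax over K levels.

    The preferred level is the previous sample shifted by the condition lam.
    """
    last = history[-1] if history else 0
    logits = -0.5 * (np.arange(K) - last - lam) ** 2
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def log_joint(x, lam):
    """log p(x | lam) as the sum of log-conditionals (the log of the product)."""
    total = 0.0
    for t in range(len(x)):
        total += np.log(conditional(list(x[:t]), lam)[x[t]])
    return total

def sample(length, lam, rng):
    """Generate one sequence sample by sample, feeding each draw back in."""
    x = []
    for _ in range(length):
        x.append(int(rng.choice(K, p=conditional(x, lam))))
    return x

# Summing the joint probability over every length-3 sequence should give 1.
total_prob = sum(np.exp(log_joint(seq, 1.0))
                 for seq in itertools.product(range(K), repeat=3))
generated = sample(8, 1.0, np.random.default_rng(0))
print(total_prob, generated)
```

Because each conditional is normalized, the product is normalized as well, so training can maximize the per-sample log-likelihood and generation can proceed one sample at a time.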

[0104] Following the PixelCNN approach, WaveNet computes the probability distribution of formula (1) by stacking carefully designed convolutional layers, and it uses deep residual networks and parameterized skip connections to build a deeper network structure and achieve fast convergence of the model.
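
As a rough illustration of stacked convolutional layers with residual and skip connections, the sketch below builds a small stack in NumPy. It assumes the dilated causal convolutions and gated activations of the published WaveNet architecture, which this excerpt does not spell out; the channel count, dilation pattern, and random weights are likewise hypothetical.

```python
import numpy as np

def causal_conv(x, w, dilation):
    """Kernel-size-2 causal convolution.

    x: (T, C_in); w: (2, C_in, C_out). Output at time t uses only
    x[t - dilation] and x[t], never a future sample.
    """
    pad = np.zeros((dilation, x.shape[1]))
    past = np.concatenate([pad, x], axis=0)[:x.shape[0]]
    return past @ w[0] + x @ w[1]

def residual_block(x, p, dilation):
    """Gated activation, then a residual output and a skip output."""
    filt = np.tanh(causal_conv(x, p["filter"], dilation))
    gate = 1.0 / (1.0 + np.exp(-causal_conv(x, p["gate"], dilation)))
    z = filt * gate
    return x + z @ p["res"], z @ p["skip"]

def init_block(rng, channels, scale=0.3):
    """Random weights for one block (illustrative only, not trained)."""
    return {
        "filter": scale * rng.standard_normal((2, channels, channels)),
        "gate": scale * rng.standard_normal((2, channels, channels)),
        "res": scale * rng.standard_normal((channels, channels)),
        "skip": scale * rng.standard_normal((channels, channels)),
    }

def stack(x, blocks, dilations):
    """Run the blocks in sequence and sum their skip outputs."""
    skip_sum = np.zeros_like(x)
    for p, d in zip(blocks, dilations):
        x, skip = residual_block(x, p, d)
        skip_sum = skip_sum + skip
    return skip_sum

rng = np.random.default_rng(0)
channels, dilations = 8, [1, 2, 4, 8]
blocks = [init_block(rng, channels) for _ in dilations]

x = rng.standard_normal((64, channels))
x_future = x.copy()
x_future[20:] += 1.0                      # perturb only samples from t = 20 onward

y = stack(x, blocks, dilations)
y_future = stack(x_future, blocks, dilations)
receptive_field = 1 + sum(dilations)      # kernel size 2, so each layer adds its dilation
print(receptive_field, np.allclose(y[:20], y_future[:20]))
```

The outputs before the perturbed position are unchanged, which is the causality property an autoregressive model needs, and the residual path `x + ...` is what lets many such blocks be stacked.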

[0105] The method of the present ...


Abstract

The invention discloses a WaveNet-based waveform generation method for bone conduction speech enhancement. Building on BLSTM-based enhancement of the bone conduction speech amplitude spectrum, the method uses a WaveNet model to generate high-quality speech. First, a BLSTM model and a WaveNet model are built, with a cross-sampling-rate upsampling module introduced into the WaveNet model, and the two models are trained separately. Then the low-sampling-rate bone conduction speech amplitude spectrum to be enhanced is fed into the trained BLSTM model. The resulting enhanced amplitude spectrum is combined with the bone conduction speech phase information and fed into the trained WaveNet model to obtain the enhanced speech waveform at a high sampling rate. The invention makes effective use of the bone conduction speech phase information and provides spectrum expansion: it directly generates an enhanced high-sampling-rate speech waveform from the enhanced bone conduction amplitude spectrum and the bone conduction phase information, thereby significantly improving the quality of bone conduction speech.
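
The data flow in the abstract implies that frame-level features computed at the low sampling rate must be stretched to one conditioning vector per output sample at the high sampling rate. The sketch below shows only that bookkeeping. Everything specific in it is an assumption: the function names, the 8 kHz and 16 kHz rates, the hop of 64 samples, the use of raw phase as a feature, and plain frame repetition in place of whatever upsampling module the patent actually trains.

```python
import numpy as np

def build_conditioning(enhanced_mag, bc_phase):
    """Concatenate per-frame amplitude and phase features: (frames, 2 * bins)."""
    if enhanced_mag.shape != bc_phase.shape:
        raise ValueError("amplitude and phase must have the same shape")
    return np.concatenate([enhanced_mag, bc_phase], axis=1)

def upsample_conditioning(features, hop_in, rate_in, rate_out):
    """Repeat each frame so every output sample has a conditioning vector.

    hop_in is the frame hop in samples at rate_in; the same time span covers
    hop_in * rate_out / rate_in samples at rate_out.
    """
    if (hop_in * rate_out) % rate_in != 0:
        raise ValueError("hop does not map to a whole number of output samples")
    samples_per_frame = hop_in * rate_out // rate_in
    return np.repeat(features, samples_per_frame, axis=0)

rng = np.random.default_rng(0)
frames, bins = 10, 65                                # hypothetical spectrogram size
enhanced_mag = np.abs(rng.standard_normal((frames, bins)))   # stand-in for BLSTM output
bc_phase = rng.uniform(-np.pi, np.pi, (frames, bins))        # stand-in for measured phase

cond = build_conditioning(enhanced_mag, bc_phase)
cond_up = upsample_conditioning(cond, hop_in=64, rate_in=8000, rate_out=16000)
print(cond.shape, cond_up.shape)
```

With these assumed numbers each frame spans 128 output samples, so 10 frames condition 1280 waveform samples; a learned upsampling layer would replace the repetition with a smoother interpolation.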

Description

Technical field

[0001] The invention relates to the field of bone conduction technology, in particular to a WaveNet-based bone conduction speech enhancement waveform generation method.

Background technique

[0002] A bone conduction microphone picks up the speech signal, called bone conduction speech, from the vibration of the skull, larynx and other body parts. Because this transmission channel shields the signal from ambient noise, bone conduction speech has much stronger noise immunity than the air conduction speech picked up by a traditional air conduction microphone, and it has broad application prospects in military and civilian fields. However, owing to the low-pass nature of signal conduction through the human body, the high-frequency components of bone conduction speech are severely attenuated, with frequency content usually below 2.5 kHz, and the electro-acoustic signals generated by vibration do not pass through the sound "tuning" areas such as the or...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L21/0232, G10L21/0332, G10L25/18, G10L25/27
CPC: G10L21/0232, G10L21/0332, G10L25/18, G10L25/27
Inventor: 张雄伟, 郑昌艳, 杨吉斌, 曹铁勇, 李莉, 孙蒙
Owner: ARMY ENG UNIV OF PLA