Speech signal processing method and device, terminal and storage medium

A speech signal processing technology, applied to speech signal processing methods and devices, terminals, and storage media. It addresses the loss of naturalness in vocoder-synthesized speech, the reduced quality and intelligibility of narrowband speech signals, and the difficulty of waveform-level comparison, and it improves speech quality and intelligibility.

Active Publication Date: 2019-10-11

SOUTH UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

Cites: 10; Cited by: 8


Problems solved by technology

[0004] However, the vocoder takes advantage of the fact that the human ear is not sensitive to the phase of the speech signal: when analyzing and synthesizing the speech signal, it requires only the amplitude spectrum of the signal. The speech signal synthesized by the vocoder therefore differs from the original speech signal in waveform, which makes waveform-level comparison difficult, and the quality and intelligibility of speech synthesized by vocoders can only be measured by subjective scoring metrics.

In addition, the vocoder transmits only the model parameters. This gives better frequency band compression, but it also seriously degrades the naturalness of the speech signal.

Especially when a single-channel vocoder is used, the synthesized narrowband speech signal loses a lot of detail, which reduces the quality and intelligibility of the narrowband speech signal.
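The point in [0004] about waveform comparison can be illustrated with a small sketch. It is not taken from the patent: the helper names (`dft`, `idft`, `shift_phase`), the test tones, and the 90-degree phase offset are all assumptions chosen for illustration, and a naive DFT is used only to keep the example dependency-free. It shows that two signals can share an amplitude spectrum while differing sample by sample.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform; fine for one short frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def shift_phase(x, delta):
    """Add a constant phase offset to every positive-frequency bin of x.

    The magnitudes are left untouched; the mirrored negative-frequency bins
    are set to the complex conjugate so the result stays real-valued.
    """
    n = len(x)
    spec = dft(x)
    for k in range(1, (n + 1) // 2):
        spec[k] *= cmath.exp(1j * delta)
        spec[n - k] = spec[k].conjugate()
    return idft(spec)


# Hypothetical test frame: two sinusoids, 64 samples.
n = 64
original = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 7 * t / n)
            for t in range(n)]
shifted = shift_phase(original, math.pi / 2)

# Largest difference between the two amplitude spectra (close to zero).
magnitude_gap = max(abs(abs(a) - abs(b)) for a, b in zip(dft(original), dft(shifted)))
# Largest sample-by-sample difference between the two waveforms (large).
waveform_gap = max(abs(a - b) for a, b in zip(original, shifted))
```

Here `magnitude_gap` is at the level of floating-point rounding while `waveform_gap` is larger than the amplitude of the original tones, which is why a sample-wise error measure says little about perceived quality when only the amplitude spectrum is preserved.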


Examples


Embodiment 1

[0043] Figure 1 is a flow chart of the voice signal processing method provided in Embodiment 1 of the present invention. This embodiment is applicable to performing bandwidth restoration on a narrowband voice signal output by a vocoder. The method can be performed by a voice signal processing device, which can be implemented in software and/or hardware and integrated into terminals such as smartphones, tablet computers, personal computers (PCs), and learning machines.

[0044] As shown in Figure 1, the speech signal processing method provided in Embodiment 1 of the present invention may include:

[0045] S101. Obtain a compressed narrowband voice signal;

[0046] Specifically, a voice signal is the form in which voice is transmitted between electronic devices. Transmitting a speech signal first requires compressing and encoding the speech to remove the redundancy in the unprocessed original speech signal and reduce the tran...

Embodiment 2

[0057] Figure 2 is a flowchart of the voice signal processing method provided by Embodiment 2 of the present invention. This embodiment further refines the above technical solution. As shown in Figure 2, the method specifically includes:

[0058] S201. Input the original voice signal into a vocoder for compression to obtain a compressed narrowband voice signal.

[0059] Specifically, the original speech signal is an unprocessed speech signal. A vocoder is a coder and decoder that analyzes and synthesizes speech. It is also called a speech analysis and synthesis system or a speech band compression system. It is mainly used for signal band compression, speech storage communication and secure communication. The original speech signal is input into the vocoder, and the vocoder performs frequency band compression on it, and the output speech signal is the compressed narrowband speech signal.

[0060] Optionally, a channel vocoder may be used to compress...

Embodiment 3

[0076] Figure 3 is a flow chart of the method for preprocessing a narrowband speech signal provided by Embodiment 3 of the present invention. This embodiment builds on the above embodiment and further refines the preprocessing of the narrowband speech signal. As shown in Figure 3, the method specifically includes:

[0077] S301. Perform pre-emphasis on the narrowband speech signal to obtain a pre-emphasized narrowband speech signal.

[0078] Specifically, pre-emphasis is a signal processing method that compensates the high-frequency components of an input signal at the sending end. The voice signal is heavily degraded during transmission, so to obtain a better voice signal waveform at the receiving end, the degraded signal must be compensated. The idea of pre-emphasis is to boost the high-frequency components of the voice signal at the start of the transmission line to compensate for excessive attenuation of high frequency component...
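As an illustration of step S301, pre-emphasis in speech processing is commonly implemented as a first-order filter, y[n] = x[n] - alpha * x[n-1]. The sketch below assumes that form; the function name `pre_emphasis` and the coefficient 0.97 are assumptions, since the excerpt does not state which filter or coefficient the patent uses.

```python
def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], keeping the first sample unchanged.

    alpha is an assumed, commonly used value; the source text does not give one.
    """
    if not samples:
        return []
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - alpha * prev)
    return out


# A constant (low-frequency) signal is strongly attenuated after the first sample,
# while a rapidly alternating (high-frequency) signal is boosted.
flat = pre_emphasis([1.0, 1.0, 1.0, 1.0])
alternating = pre_emphasis([1.0, -1.0, 1.0, -1.0])
```

With these inputs, the constant signal drops to roughly 0.03 per sample and the alternating one grows to roughly 1.97 in magnitude, which is the high-frequency boost the paragraph describes.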


Abstract

The embodiments of the invention disclose a speech signal processing method and device, a terminal, and a storage medium. The speech signal processing method includes the following steps: a compressed narrowband speech signal is obtained; frequency domain characteristics of the narrowband speech signal are extracted; the frequency domain characteristics of the narrowband speech signal are input into a trained deep denoising autoencoder neural network model for nonlinear fitting, yielding frequency domain characteristics of a full frequency band speech signal; and the frequency domain characteristics of the full frequency band speech signal are converted into a power spectrum of the full frequency band speech signal, on which an inverse Fourier transform is performed using the phase information of the corresponding narrowband signal to obtain the full frequency band speech signal. With this method, device, terminal, and storage medium, the compressed narrowband speech signal undergoes bandwidth recovery through the deep denoising autoencoder neural network model, which improves the quality and the intelligibility of the speech signal.
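The last step of the abstract, combining a power spectrum with the phase of the narrowband signal and inverting it, can be sketched as follows. This is only a sketch under stated assumptions: the function names are hypothetical, a naive DFT stands in for an FFT to avoid dependencies, and the trained network is omitted entirely, so the "full-band" power spectrum here is simply the power spectrum of the input frame.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def frame_from_power_and_phase(power, phase):
    """Rebuild a time-domain frame from a power spectrum and a borrowed phase.

    Each bin's magnitude is sqrt(power); the phase is taken from another
    signal (in the abstract, the corresponding narrowband signal).
    """
    spec = [math.sqrt(p) * cmath.exp(1j * ph) for p, ph in zip(power, phase)]
    return idft(spec)


# Hypothetical narrowband frame of 32 samples.
n = 32
frame = [math.sin(2 * math.pi * 5 * t / n) + 0.3 * math.cos(2 * math.pi * 11 * t / n)
         for t in range(n)]
narrow_spec = dft(frame)
narrow_phase = [cmath.phase(c) for c in narrow_spec]

# Stand-in for the network output: reuse the frame's own power spectrum.
predicted_power = [abs(c) ** 2 for c in narrow_spec]

rebuilt = frame_from_power_and_phase(predicted_power, narrow_phase)
round_trip_error = max(abs(a - b) for a, b in zip(frame, rebuilt))
```

Because the stand-in power spectrum matches the input, `rebuilt` reproduces `frame` up to rounding. In the described method the power spectrum would instead come from the autoencoder's predicted full-band features.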

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of voice processing, and in particular to a voice signal processing method, device, terminal, and storage medium.

Background

[0002] Voice signals are one of the important ways for human beings to communicate. With the rapid development of science and technology, voice signals increasingly need to be transmitted between mobile phones and computers. During transmission, the voice signal needs to be compressed and encoded to remove its redundancy and reduce the transmission bit rate or storage space, so compression of the voice signal is particularly important.

[0003] The vocoder first appeared at Bell Laboratories in the United States and is mainly used for signal frequency band compression, voice storage communication, and secure communication. Channel vocoders are widely used to compress and encode speech signals. It first extrac...


Application Information


IPC(8): G10L19/16, G10L25/24, G10L25/30

CPC: G10L19/16, G10L25/24, G10L25/30

Inventors: 陈霏, 叶富强

Owner: SOUTH UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA
