Voice processing method and device based on generative adversarial network

A speech processing technology based on generative adversarial networks, applied in biological neural network models, speech analysis, neural learning methods, etc. It addresses the problem that existing methods do not make full use of the strong correlation between adjacent speech states, which leads to poor frequency band extension and poor packet loss compensation.

Active Publication Date: 2019-11-12

SHENZHEN UNIV



Problems solved by technology

[0006] The main purpose of the present invention is to propose a speech processing method and device based on a generative adversarial network, to solve the problem that mathematical models in the prior art do not make full use of the strong correlation between adjacent speech states when performing spectrum expansion or packet loss compensation on speech, which makes frequency band extension and packet loss compensation ineffective.


Examples


Embodiment 1

[0026] As shown in Figure 1, the embodiment of the present invention provides a speech processing method based on a generative adversarial network. The method is used to obtain a speech processing system composed of a packet loss compensation model and a frequency band extension model; this system processes the original speech to overcome packet loss in the original speech or a frequency band that is too narrow. In the embodiment of the present invention, the method includes, but is not limited to, the following steps:

[0027] S101. Acquire voice training samples. The voice training samples include N groups of complete voice samples with their corresponding packet loss voice samples, and K groups of wideband voice samples with their corresponding narrowband voice samples, where N and K are positive integers.
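The paired layout of the packet loss half of these training samples can be sketched as follows. This is a minimal illustration, not the patent's procedure: the text available here does not say how the packet loss samples are produced, so the sketch assumes that a lost packet is a whole frame replaced with zeros. The names `drop_packets`, `make_packet_loss_pairs`, `frame_len`, and `loss_rate` are hypothetical.

```python
import random
from typing import List, Tuple

Signal = List[float]


def drop_packets(samples: Signal, frame_len: int, loss_rate: float,
                 rng: random.Random) -> Tuple[Signal, List[bool]]:
    """Zero out whole frames to imitate packet loss (an assumed loss model)."""
    lossy = list(samples)
    lost_flags = []
    for start in range(0, len(samples), frame_len):
        lost = rng.random() < loss_rate
        lost_flags.append(lost)
        if lost:
            end = min(start + frame_len, len(samples))
            lossy[start:end] = [0.0] * (end - start)
    return lossy, lost_flags


def make_packet_loss_pairs(complete_samples: List[Signal], frame_len: int = 4,
                           loss_rate: float = 0.3,
                           seed: int = 0) -> List[Tuple[Signal, Signal]]:
    """Return N (complete, packet-loss) pairs, one per complete sample."""
    rng = random.Random(seed)
    return [(c, drop_packets(c, frame_len, loss_rate, rng)[0])
            for c in complete_samples]
```

The K wideband/narrowband pairs would be held in the same pairwise form, with the narrowband member derived from (or recorded alongside) its wideband counterpart.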

[0028] In the above step S101, the voice training samples are voice d...

Embodiment 2

[0075] As shown in Figure 2, the embodiment of the present invention also provides a speech processing device 20 based on a generative adversarial network, including but not limited to the following modules:

[0076] The training sample acquisition module 21 is used to obtain voice training samples. The voice training samples include N groups of complete voice samples with their corresponding packet loss voice samples, and K groups of wideband voice samples with their corresponding narrowband voice samples, where N and K are positive integers;

[0077] The voice processing system training module 22 is used to put the voice training samples into the generative adversarial network, and to perform packet loss compensation model training based on the packet loss voice samples and complete voice samples, and frequency band extension model training based on the wideband voice samples and narrowband voice samples, to obtain a speech pro...
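For orientation, the adversarial training that this module performs can be illustrated with the standard generative adversarial network losses. The text available here does not state which loss the invention uses, so the sketch below assumes the common binary cross-entropy form with the non-saturating generator loss. Here `d_real` stands for discriminator outputs (probabilities) on real targets such as complete or wideband samples, and `d_fake` for its outputs on generator results produced from packet loss or narrowband inputs; both names and both functions are hypothetical.

```python
import math
from typing import Sequence

EPS = 1e-12  # guards against log(0)


def discriminator_loss(d_real: Sequence[float], d_fake: Sequence[float]) -> float:
    """Standard GAN discriminator loss: -(E[log D(x)] + E[log(1 - D(G(z)))])."""
    real_term = sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake: Sequence[float]) -> float:
    """Non-saturating generator loss: -E[log D(G(z))]."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)
```

Under this assumed objective, the discriminator loss falls as it separates real from generated speech, and the generator loss falls as its output is scored closer to real.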


Abstract

The invention is applicable to the technical field of voice communication, and provides a voice processing method and device based on a generative adversarial network. The method comprises the following steps: acquiring voice training samples, wherein the voice training samples include N groups of complete voice samples with their corresponding packet loss voice samples, and K groups of wideband voice samples with their corresponding narrowband voice samples; putting the voice training samples into the generative adversarial network to carry out packet loss compensation model training based on the packet loss voice samples and the complete voice samples, and frequency band extension model training based on the wideband voice samples and the narrowband voice samples, thereby obtaining a voice processing system composed of a packet loss compensation model and a frequency band extension model; and processing an original voice through the voice processing system to obtain an enhanced voice after packet loss compensation or frequency band extension. The voice processing method and device can improve the efficiency of packet loss compensation for packet loss voice and the performance of frequency band extension for narrowband voice.
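The composition described in the abstract, one system holding two trained models and applying one of them to the original voice, might be organized as in the sketch below. It is only a structural illustration: the class `SpeechProcessingSystem`, its field names, and the `task` strings are hypothetical, and the two models are represented as plain callables rather than trained generators.

```python
from dataclasses import dataclass
from typing import Callable, List

Signal = List[float]


@dataclass
class SpeechProcessingSystem:
    """Holds the two trained models (here: arbitrary callables)."""
    packet_loss_model: Callable[[Signal], Signal]
    band_extension_model: Callable[[Signal], Signal]

    def process(self, speech: Signal, task: str) -> Signal:
        """Return enhanced speech using the model that matches the task."""
        if task == "packet_loss":
            return self.packet_loss_model(speech)
        if task == "band_extension":
            return self.band_extension_model(speech)
        raise ValueError(f"unknown task: {task!r}")
```

A caller would construct the system once from the two trained models and then call `process(speech, "packet_loss")` or `process(speech, "band_extension")` depending on the defect in the incoming speech.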

Description

Technical field

[0001] The invention relates to the technical field of voice communication, in particular to a voice processing method and device based on a generative adversarial network.

Background

[0002] In modern society, communication has become an important part of people's lives, and the means of communication have gradually developed from fixed telephones to mobile phones and Internet phones, which greatly facilitates daily life. However, the different characteristics of mobile phones and Internet phones give each relative advantages and disadvantages in different situations.

[0003] Most mobile phone networks are narrowband voice communication systems. The transmission bandwidth of the voice signal is only 3.1 kHz, and the frequency range is between 300 and 3400 Hz. Although this narrowband voice signal saves communication bandwidth, it reduces voice quality. For VoIP, the IP network is usually used for real-time voice transmission. When ...


Application Information


IPC(8): G10L21/038, G10L21/0388, G10L19/005, G06N3/08, G06N3/04

CPC: G10L21/038, G10L21/0388, G10L19/005, G06N3/08, G06N3/045

Inventor: 郑能恒, 史裕鹏, 容韦聪, 康迂勇

Owner SHENZHEN UNIV
