Speech synthesis method, system and device and storage medium

A speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc., that addresses the problem of low synthesis efficiency: it preserves the quality of speech synthesis while improving its efficiency.

Active Publication Date: 2021-09-17
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

However, the existing LPCNet has a certain degree of computational redundancy, and its synthesis efficiency is low.



Examples

Detailed description of embodiments

[0047] The embodiment of the present application also provides a speech synthesis method based on a neural network combined with linear predictive coding. As shown in Figure 2a, the method includes:

[0048] 21a. Acquire acoustic features of the text to be synthesized on multiple channels, where different channels correspond to different acoustic frequency bands.

[0049] 22a. Use a neural network combined with linear predictive coding to make predictions from the acoustic features on the multiple channels, obtaining linear prediction parameters and nonlinear residuals on the multiple channels.

[0050] 23a. Perform speech synthesis according to the linear prediction parameters and nonlinear residuals on multiple channels, to obtain synthesized speech corresponding to the text to be synthesized.
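Step 23a can be illustrated with a minimal sketch, assuming each channel is reconstructed as a linear prediction from its own past samples plus that channel's residual. The function names `lpc_synthesize` and `synthesize_multichannel` are hypothetical, the coefficients and residuals are passed in directly rather than produced by a neural network, and the sample-wise sum used to merge the channels is only a placeholder: the excerpt does not say how the frequency bands are recombined.

```python
from typing import List, Sequence


def lpc_synthesize(coeffs: Sequence[float], residual: Sequence[float]) -> List[float]:
    """Reconstruct one channel: s[t] = sum_k coeffs[k] * s[t-1-k] + residual[t]."""
    out: List[float] = []
    for t, e in enumerate(residual):
        pred = 0.0
        for k, a in enumerate(coeffs):
            idx = t - 1 - k
            if idx >= 0:  # samples before the start are treated as zero
                pred += a * out[idx]
        out.append(pred + e)
    return out


def synthesize_multichannel(
    channel_coeffs: Sequence[Sequence[float]],
    channel_residuals: Sequence[Sequence[float]],
) -> List[float]:
    """Synthesize every channel, then merge them (placeholder: sample-wise sum)."""
    bands = [
        lpc_synthesize(a, e) for a, e in zip(channel_coeffs, channel_residuals)
    ]
    # Assumption: a real system would likely use a band synthesis filter here.
    return [sum(samples) for samples in zip(*bands)]
```

In this sketch each channel keeps its own prediction coefficients and residual, which mirrors the per-channel outputs of step 22a; only the merge step is invented for illustration.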

[0051] In this embodiment, a multi-channel speech synthesis method is adopted, which can improve speech synthesis efficiency compared with a single-channel speech synthesis method; further, the lin...

Abstract

The embodiment of the invention provides a speech synthesis method, system, device and storage medium. The embodiment provides a multi-channel linear prediction network vocoder that supports multi-channel input: acoustic features of a text to be synthesized are acquired on multiple channels, and the multi-channel linear prediction network vocoder synthesizes the speech signal corresponding to that text. Basing the synthesis on linear prediction ensures speech synthesis quality, while the use of multiple channels improves speech synthesis efficiency.

Description

Technical field

[0001] The present application relates to the technical field of speech signal processing, and in particular to a speech synthesis method, system, device and storage medium.

Background

[0002] Speech synthesis, also known as text-to-speech (TTS) technology, generates artificial speech through mechanical and electronic methods. In the process of speech synthesis, the front-end and middle-end are responsible for predicting compressed features of the speech from the text, such as Mel-Frequency Cepstral Coefficients (MFCC); the final stage is completed by a vocoder.

[0003] The Linear Predictive Coding Net (LPCNet) vocoder is a variant of WaveRNN that combines a Recurrent Neural Network (RNN) with linear prediction. By combining deep learning with digital signal processing, it greatly improves the quality of speech synthesis, so it is widely used in speech synthesis sys...
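As background on the linear prediction that LPCNet builds on, the following is a minimal sketch of how a signal can be split into a linearly predicted part and a residual. The function name `lpc_residual` is hypothetical and the coefficients are supplied by hand; this is not the LPCNet implementation, which derives its coefficients from acoustic features and models the residual with a neural network.

```python
from typing import List, Sequence


def lpc_residual(signal: Sequence[float], coeffs: Sequence[float]) -> List[float]:
    """Return e[t] = s[t] - sum_k coeffs[k] * s[t-1-k] for each sample."""
    residual: List[float] = []
    for t, s in enumerate(signal):
        # Predict the current sample from past samples; samples before the
        # start of the signal are treated as zero.
        pred = sum(
            a * signal[t - 1 - k]
            for k, a in enumerate(coeffs)
            if t - 1 - k >= 0
        )
        residual.append(s - pred)
    return residual
```

When the prediction fits the signal well, the residual is small, so whatever models it only has to account for the part of the signal that linear prediction does not explain.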

Application Information

IPC(8): G10L13/02, G10L13/04, G10L13/08, G10L19/04, G10L19/16, G10L25/03, G10L25/30
CPC: G10L13/02, G10L13/04, G10L13/08, G10L19/16, G10L19/04, G10L25/03, G10L25/30
Inventor: 杨辰雨, 雷鸣
Owner: ALIBABA GRP HLDG LTD