Speech synthesis method for generating new tone

A speech synthesis and timbre technology, applicable to speech synthesis, speech analysis, instruments, and related fields. It addresses the high cost, long cycle, and complicated procedure of building a pronunciation library, and avoids those complex procedures.

Active Publication Date: 2019-11-15
BEIJING UNISOUND INFORMATION TECH
Cites: 11; Cited by: 3

AI Technical Summary

Problems solved by technology

[0004] The present invention provides a speech synthesis method for generating new timbres, which solves the problems of the complex procedure, long cycle, and high cost of customizing a new speaker's voice library.



Examples


Embodiment Construction

[0052] The preferred embodiments of the present invention will be described below in conjunction with the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0053] Figure 1 is a flowchart of a speech synthesis method for generating new timbres in an embodiment of the present invention. As shown in Figure 1, the speech synthesis method for generating new timbres provided by the present invention comprises:

[0054] S101. Train a deep neural network on data from multiple voice libraries to form a first synthesis model;

[0055] Specifically, to overcome the defects of existing speech synthesis methods for generating new timbres, this embodiment first selects voice library data already recorded by multiple existing speakers, and mixes the voice library data of these speakers when training the model T...
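The data-mixing step described above can be illustrated with a minimal sketch. It only shows how recordings from several speakers might be pooled into one training set for the first synthesis model; it does not train a neural network. The function name `pool_voice_libraries`, the speaker labels, the utterance identifiers, and the choice to keep a speaker tag and shuffle the pooled set are all assumptions made for illustration, not details taken from the patent.

```python
import random
from typing import Dict, List, Tuple


def pool_voice_libraries(
    libraries: Dict[str, List[str]], seed: int = 0
) -> List[Tuple[str, str]]:
    """Mix utterances from several speakers into one shuffled training set.

    `libraries` maps a (hypothetical) speaker label to that speaker's
    utterance identifiers. Keeping the speaker label on each item is an
    assumption of this sketch.
    """
    pooled = [
        (speaker, utterance)
        for speaker, utterances in libraries.items()
        for utterance in utterances
    ]
    # Shuffle with a fixed seed so the mixed order is reproducible.
    random.Random(seed).shuffle(pooled)
    return pooled


# Hypothetical voice libraries from three existing speakers.
libraries = {
    "speaker_a": ["a_001", "a_002"],
    "speaker_b": ["b_001", "b_002"],
    "speaker_c": ["c_001"],
}
training_set = pool_voice_libraries(libraries)
```

The pooled set would then be fed to whatever training routine builds the first model; that routine is outside the scope of this sketch.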



Abstract

The invention discloses a speech synthesis method for generating a new timbre. The method comprises the following steps: training a deep neural network on data from a plurality of voice libraries to form a first synthesis model; further training the first synthesis model on each voice library separately to form a plurality of second synthesis models, one per voice library; inferring a first output parameter with the first synthesis model; inferring, with the plurality of second synthesis models, a plurality of second output parameters, one per second synthesis model, to form a second output parameter group; performing weighted superposition on the second output parameter group to form acoustic parameters; and reconstructing the acoustic parameters with a vocoder to form synthesized speech. The method can synthesize speech with a new timbre without building a new voice library. The timbre of the synthesized speech can be flexibly adjusted through the synthesis models corresponding to the existing speakers' voice library data, the synthesis efficiency does not change noticeably, and the problems of the complicated procedure, long cycle, and high cost of building a new speaker's voice library are solved.
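The weighted superposition step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes each second synthesis model outputs the same number of frames with the same number of acoustic parameters per frame, and it normalizes the weights to sum to 1, which the abstract does not specify. The name `weighted_superposition` and the sample values are hypothetical, and the role of the first output parameter is not modeled here.

```python
from typing import List, Sequence

Frames = List[List[float]]  # one row of acoustic parameters per frame


def weighted_superposition(
    param_group: Sequence[Frames], weights: Sequence[float]
) -> Frames:
    """Blend per-speaker acoustic parameters frame by frame.

    `param_group` holds one parameter matrix per second synthesis model.
    Weights are normalized to sum to 1 (an assumption of this sketch).
    """
    if not param_group or len(param_group) != len(weights):
        raise ValueError("need exactly one weight per parameter matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]

    n_frames = len(param_group[0])
    if any(len(frames) != n_frames for frames in param_group):
        raise ValueError("all models must produce the same number of frames")

    blended: Frames = []
    for t in range(n_frames):
        dim = len(param_group[0][t])
        if any(len(frames[t]) != dim for frames in param_group):
            raise ValueError("parameter dimension differs between models")
        blended.append(
            [
                sum(w * frames[t][d] for w, frames in zip(norm, param_group))
                for d in range(dim)
            ]
        )
    return blended


# Hypothetical outputs of two second synthesis models (2 frames, 2 parameters).
speaker_a = [[1.0, 2.0], [3.0, 4.0]]
speaker_b = [[3.0, 6.0], [5.0, 8.0]]
new_timbre = weighted_superposition([speaker_a, speaker_b], [3.0, 1.0])
```

Changing the weights shifts the blended parameters toward one speaker or another, which is how the timbre would be adjusted before the vocoder reconstructs the waveform.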

Description

Technical field

[0001] The invention relates to the field of speech synthesis, and in particular to a speech synthesis method for generating new timbres.

Background

[0002] Speech synthesis, also known as text-to-speech (TTS) technology, converts arbitrary text into standard, fluent speech in real time. It draws on many disciplines and technologies, including acoustics, linguistics, digital signal processing, and computer science. The main problem it solves is how to convert text information into audible sound.

[0003] As speech synthesis technology develops, users increasingly demand diverse and differentiated timbres in synthesized voices. The existing way to generate a new timbre is generally to customize a new speaker's voice library. However, the process of customizing a new speaker library is relatively complicated, and there are problems ...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G10L13/02, G10L13/033, G10L13/04
CPC: G10L13/02, G10L13/033, G10L13/04
Inventor: 孙见青
Owner: BEIJING UNISOUND INFORMATION TECH