Highly expressive speech synthesis method and device

A highly expressive speech synthesis technology, applied in fields such as speech synthesis, speech analysis, and instruments, addresses problems such as the difficulty of distinguishing fine contextual differences, low naturalness, and dull synthesized speech, and improves the naturalness of the synthesized speech.

Active Publication Date: 2017-10-13
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

Because a speech synthesis sound library usually contains only a few thousand to tens of thousands of sentences, the leaf-node models of the acoustic model decision tree can, in order to avoid over-training, represent only statistically significant acoustic parameters and can hardly distinguish the differences introduced by detailed context. The resulting synthesized speech is flat and unnatural.




Embodiment Construction

[0023] The present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, but not to limit the present invention. In addition, it should be noted that, for the convenience of description, only parts related to the present invention are shown in the drawings but not all content.

[0024] Figure 2 shows a first embodiment of the invention.

[0025] Figure 2 is a flow chart of the highly expressive speech synthesis method provided by the first embodiment of the present invention. Referring to Figure 2, the highly expressive speech synthesis method includes:

[0026] S210. Process and analyze the input text to obtain the phoneme sequence corresponding to the input text and the context of the state contained in the phoneme sequence.
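
As a rough illustration of step S210, the following Python sketch turns text into a phoneme sequence and attaches a context to every state. The toy lexicon, the three-states-per-phoneme layout, and the `StateContext` fields are assumptions made for this example only; the source does not specify how the front end or the context is defined.

```python
from typing import List, NamedTuple

# Hypothetical toy lexicon; a real front end would rely on a pronunciation
# dictionary or a grapheme-to-phoneme model.
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# Assumed number of HMM states per phoneme (not stated in the source).
STATES_PER_PHONEME = 3


class StateContext(NamedTuple):
    """Illustrative context of one state: neighbouring phonemes and position."""
    prev_phoneme: str
    phoneme: str
    next_phoneme: str
    state_index: int


def text_to_phonemes(text: str) -> List[str]:
    """Look up every word of the input text in the toy lexicon."""
    phonemes: List[str] = []
    for word in text.lower().split():
        phonemes.extend(TOY_LEXICON[word])
    return phonemes


def state_contexts(phonemes: List[str]) -> List[StateContext]:
    """Expand each phoneme into its states and record a simple context."""
    contexts: List[StateContext] = []
    for i, phoneme in enumerate(phonemes):
        # "sil" (silence) pads the context at the utterance boundaries.
        prev_phoneme = phonemes[i - 1] if i > 0 else "sil"
        next_phoneme = phonemes[i + 1] if i + 1 < len(phonemes) else "sil"
        for state_index in range(STATES_PER_PHONEME):
            contexts.append(
                StateContext(prev_phoneme, phoneme, next_phoneme, state_index)
            )
    return contexts


phonemes = text_to_phonemes("hello world")
contexts = state_contexts(phonemes)
```

Each entry of `contexts` is the kind of key a decision tree could use to look up the acoustic model for that state; production systems typically use much richer contexts, such as prosodic position or tone.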

[0027] The task of the speech synthesis system is to synthesize the spe...


Abstract

The embodiments of the invention disclose a highly expressive speech synthesis method and device. The method includes: processing and analyzing the input text to obtain the phoneme sequence corresponding to the input text and the context of each state contained in the phoneme sequence; according to the context of the state and based on the Viterbi algorithm, selecting one Gaussian acoustic model from the Gaussian mixture acoustic model corresponding to the state, which contains at least two Gaussian acoustic models, as the Gaussian acoustic model for the synthesized speech; generating acoustic parameters according to the selected Gaussian acoustic model; and synthesizing speech according to the generated acoustic parameters, which includes either using a vocoder to synthesize speech from the acoustic parameters or using the acoustic parameters to guide unit selection of acoustic segments to generate speech. The method and device provided by the embodiments improve the naturalness of synthesized speech.
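
The selection step in the abstract can be sketched in Python as a Viterbi search that picks one Gaussian component per state. This is a minimal sketch under stated assumptions: the parameters are one-dimensional, the node score is the log mixture weight, and the transition score is a squared-difference penalty between consecutive means. The abstract names only the Viterbi algorithm, so this scoring and the function name `select_components` are hypothetical and not the patent's actual cost.

```python
import math
from typing import List


def select_components(
    weights: List[List[float]],
    means: List[List[float]],
    smooth: float = 1.0,
) -> List[int]:
    """Pick one Gaussian component per state with a Viterbi search.

    weights[t][k] and means[t][k] describe component k of state t.
    The scoring (log weight minus a smoothness penalty) is an assumption.
    """
    # Initial scores: log mixture weight of each component of the first state.
    score = [math.log(w) for w in weights[0]]
    backpointers: List[List[int]] = []

    for t in range(1, len(weights)):
        new_score: List[float] = []
        pointers: List[int] = []
        for k, w in enumerate(weights[t]):
            # Score of reaching component k from each previous component j.
            candidates = [
                score[j] - smooth * (means[t][k] - means[t - 1][j]) ** 2
                for j in range(len(score))
            ]
            best_j = max(range(len(candidates)), key=candidates.__getitem__)
            new_score.append(candidates[best_j] + math.log(w))
            pointers.append(best_j)
        score = new_score
        backpointers.append(pointers)

    # Backtrack from the best final component to recover the whole path.
    k = max(range(len(score)), key=score.__getitem__)
    path = [k]
    for pointers in reversed(backpointers):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


# Three states, two candidate Gaussians each (illustrative numbers only).
weights = [[0.7, 0.3], [0.5, 0.5], [0.4, 0.6]]
means = [[0.0, 5.0], [0.1, 5.2], [0.0, 5.1]]

path = select_components(weights, means, smooth=1.0)
# Means of the selected components, used here as the generated parameters.
trajectory = [means[t][k] for t, k in enumerate(path)]
```

With `smooth=0.0` the search reduces to picking the highest-weight component of each state independently; a positive `smooth` favours a consistent sequence of components across neighbouring states.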

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of text-to-speech conversion, and in particular to a highly expressive speech synthesis method and device.

Background

[0002] Speech is the most habitual and natural means of human-machine communication. The technology of converting text input into speech output is called text-to-speech (TTS) or speech synthesis. It draws on many fields, such as acoustics, linguistics, digital signal processing, and multimedia technology, and is a cutting-edge technology in the field of Chinese information processing.

[0003] Since most acoustic parameters of synthesized speech obey a Gaussian distribution, a Gaussian acoustic model is used to generate the acoustic parameters of the synthesized speech, from which the synthesized speech is then generated. The signal flow of the speech synthesis system based on the Gaussian acoustic model provided by the prior art (Figure 1) is sho...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L13/027, G10L13/08
Inventors: 李秀林, 贾磊, 康永国
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD