Voice synthesis method and device

A speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc., that can address problems such as difficulty in learning fundamental frequency trends, poor rhythm in synthesized speech, and insufficient expressiveness.

Active Publication Date: 2016-04-27
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0003] In traditional speech synthesis systems, fundamental frequency modeling uses the multi-space probability distribution hidden Markov model (MSD-HMM) method, which works well at the state level, the sound...

Example Embodiment

[0025] The embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar modules, or modules with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are used only to explain the present invention and should not be understood as limiting it. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents falling within the spirit and scope of the appended claims.

[0026] Figure 1 is a schematic flowchart of a speech synthesis method proposed in an embodiment of the present invention. This embodiment takes the synthesis process as an example. Referring to Figure 1, the method includes:

[0027] S11: Perform text feature extraction on the text to be synthesized to obtain context f...

Abstract

The invention discloses a speech synthesis method and device. The method comprises: performing text feature extraction on the text to be synthesized to obtain context feature information; obtaining a pre-generated model, where the model is trained on the context feature information of training samples and on converted acoustic parameters, and the converted acoustic parameters include fundamental frequency parameters at a plurality of prosodic levels; determining, according to the model, the model output parameters corresponding to the context feature information, where the model output parameters include the fundamental frequency parameters at the plurality of prosodic levels; performing fundamental frequency reconstruction on those multi-level fundamental frequency parameters; and synthesizing speech from the reconstructed fundamental frequency together with the other parameters in the model output. The method can improve the expressiveness of the synthesized speech.
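The abstract names the decomposition of the fundamental frequency into several prosodic levels and its later reconstruction, but this excerpt does not say how either step is computed. The following is a minimal sketch under stated assumptions: a cascade of moving averages stands in for the unspecified decomposition, reconstruction is taken to be a plain sum of the per-level components, and the names `moving_average`, `decompose_f0`, `reconstruct_f0` and the window widths are hypothetical, not taken from the patent.

```python
import math
from typing import List, Sequence


def moving_average(x: Sequence[float], width: int) -> List[float]:
    """Centered moving average; the window is clipped at the contour edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def decompose_f0(log_f0: Sequence[float],
                 widths: Sequence[int] = (3, 9, 27)) -> List[List[float]]:
    """Split a log-F0 contour into len(widths) + 1 components, finest first.

    ASSUMPTION: smoothing at growing window widths is only a stand-in for
    the patent's (unspecified) multi-level conversion.
    """
    levels = []
    residual = list(log_f0)
    for w in widths:
        smooth = moving_average(residual, w)
        # Detail removed by this smoothing pass becomes one level.
        levels.append([r - s for r, s in zip(residual, smooth)])
        residual = smooth
    levels.append(residual)  # slowest trend, e.g. sentence-level movement
    return levels


def reconstruct_f0(levels: Sequence[Sequence[float]]) -> List[float]:
    """Rebuild the contour by summing the per-level components frame by frame."""
    return [sum(frame) for frame in zip(*levels)]


# Synthetic contour: a slow upward drift plus a faster oscillation.
contour = [5.0 + 0.002 * i + 0.1 * math.sin(i / 3.0) for i in range(50)]
levels = decompose_f0(contour)
rebuilt = reconstruct_f0(levels)
```

Because each level is the difference between successive smoothing passes, the sum telescopes back to the input contour. In the method described by the abstract, the per-level parameters would instead come from the trained model's output for the given context features.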

Description

Technical field

[0001] The invention relates to the technical field of speech synthesis, in particular to a speech synthesis method and device.

Background

[0002] People are no longer satisfied with clarity and intelligibility alone in synthesized speech; they also expect it to sound natural and expressive. In natural speech, the fundamental frequency is the main factor affecting naturalness and expressiveness, so the accuracy of fundamental frequency modeling directly affects the naturalness and expressiveness of synthesized speech.

[0003] In traditional speech synthesis systems, fundamental frequency modeling uses the multi-space probability distribution hidden Markov model (MSD-HMM) method, which works well at the state level and the sound level. However, it is difficult for this method to learn higher-level fundamental frequency trends, such as those of words, phrases or sentences, which make...

Application Information

IPC(8): G10L13/02, G10L13/033, G10L13/047, G10L13/10
Inventors: 盖于涛, 康永国, 张少飞
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD