Speech synthesis method and apparatus, and device for speech synthesis

A speech synthesis technology, applicable to speech synthesis, speech analysis, speech recognition, and related fields. It addresses problems such as noise in synthesized speech, with the effect of improving listening quality and sound quality, reducing noise, and improving consistency.

Active Publication Date: 2018-07-31
BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

AI Technical Summary

Problems solved by technology

[0006] In view of the above problems, embodiments of the present invention provide a speech synthesis method, a speech synthesis apparatus, and a device for speech synthesis that overcome, or at least partially solve, those problems. They address the noise in synthesized speech caused by incorrect voiced/unvoiced decisions, and can thereby improve the listening quality and sound quality of the synthesized speech.



Examples


Detailed description of the embodiments

[0074] Embodiments of the present invention can be applied to HMM-based speech synthesis. Figure 1 shows a flow chart of an HMM-based speech synthesis method of the present invention, which may specifically include a training phase and a synthesis phase.

[0075] In the training phase, training recordings can be obtained from a recording database, and parameter extraction can be performed on them to obtain the corresponding acoustic parameters. The acoustic parameters can include at least one of spectral parameters, fundamental frequency parameters, and duration parameters. The training recordings can also be labeled; optionally, labeling information can be generated from the training recordings and their corresponding text. The labeling information indicates from which moment to which moment in the training recording a given modeling unit occurs, what the modeling u...
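The labeling information described in [0075] can be pictured as a list of time-stamped segments, each naming a modeling unit. The following is a minimal sketch only: the `LabelSegment` type, the `units_per_frame` helper, the millisecond timestamps, and the 5 ms frame shift are all illustrative assumptions, not details taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabelSegment:
    """Hypothetical label entry: a modeling unit and the time span it covers."""
    start_ms: int
    end_ms: int
    unit: str


def units_per_frame(segments: List[LabelSegment], frame_shift_ms: int = 5) -> List[str]:
    """Expand time-stamped labels into one modeling-unit name per analysis frame.

    The 5 ms frame shift is an assumed value chosen for illustration.
    """
    frames: List[str] = []
    for seg in segments:
        first = seg.start_ms // frame_shift_ms
        last = seg.end_ms // frame_shift_ms
        frames.extend([seg.unit] * (last - first))
    return frames


labels = [LabelSegment(0, 50, "sil"), LabelSegment(50, 120, "a")]
frame_units = units_per_frame(labels)
```

With frame-level unit names like these, each frame's extracted acoustic parameters can be associated with the modeling unit it belongs to during training.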

Embodiment 1

[0082] Referring to Figure 3, which shows a flow chart of the steps of Embodiment 1 of a speech synthesis method of the present invention, the method embodiment may specifically include the following steps:

[0083] Step 301: Receive the text to be synthesized;

[0084] Step 302: During speech synthesis of the text to be synthesized, determine, according to spectral parameters, the voicing (voiced or unvoiced) of the states or frames corresponding to the text to be synthesized, so as to obtain a corresponding voicing determination result;

[0085] Step 303: Obtain the synthesized speech corresponding to the text to be synthesized according to the voicing determination result.
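Steps 301 to 303 can be sketched roughly as follows. The excerpt does not say how voicing is derived from the spectral parameters, so the band-energy heuristic below is a stand-in assumption, not the patent's method; the function names, the `split` index, and the ratio threshold are likewise hypothetical. The sketch only illustrates the flow: spectral parameters give a per-frame voicing decision, and that decision controls which frames keep a fundamental frequency.

```python
from typing import List, Sequence, Tuple


def voicing_from_spectrum(frame: Sequence[float], split: int,
                          ratio_threshold: float = 1.0) -> bool:
    """Assumed heuristic: treat a frame as voiced when its low-band energy
    exceeds ratio_threshold times its high-band energy."""
    low = sum(x * x for x in frame[:split])
    high = sum(x * x for x in frame[split:])
    return low > ratio_threshold * high


def apply_voicing(f0: Sequence[float], voiced: Sequence[bool]) -> List[float]:
    """Keep F0 on voiced frames and set it to 0.0 on unvoiced frames."""
    return [f if v else 0.0 for f, v in zip(f0, voiced)]


def decide_and_mask(spectra: Sequence[Sequence[float]], f0: Sequence[float],
                    split: int = 2) -> Tuple[List[bool], List[float]]:
    """Step 302 (voicing decision) followed by the F0 masking used in step 303."""
    voiced = [voicing_from_spectrum(s, split) for s in spectra]
    return voiced, apply_voicing(f0, voiced)


# Toy magnitude spectra: the first frame is low-band heavy, the second high-band heavy.
toy_spectra = [[1.0, 0.8, 0.1, 0.1], [0.1, 0.1, 0.9, 1.0]]
toy_f0 = [120.0, 115.0]
voiced_flags, masked_f0 = decide_and_mask(toy_spectra, toy_f0)
```

The point of the sketch is that the voicing decision is taken from the spectral stream rather than from the fundamental frequency stream, so a misjudged F0 value on an unvoiced frame is discarded instead of producing buzz in the output.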

[0086] In the embodiment of the present invention, the text to be synthesized refers to text that needs to be converted into speech. In practical applications, speech synthesis of the text to be synthesized can be performed following the synthesis-stage processing flow of Figure 1 ...

Embodiment 2

[0096] Referring to Figure 4, which shows a flow chart of the steps of Embodiment 2 of a speech synthesis method of the present invention, the method embodiment may specifically include the following steps:

[0097] Step 401: Receive the text to be synthesized;

[0098] Step 402: During speech synthesis of the text to be synthesized, obtain, according to the HMM model, the target spectrum leaf node that matches the state corresponding to the text to be synthesized; the HMM model may include a decision tree, the decision tree may include a spectrum decision tree, and the spectrum decision tree may include spectrum leaf nodes;

[0099] Step 403, according to the voicing probability of the target spectral leaf node, determine the voicing of the corresponding state of the text to be synthesized;

[0100] Step 404: Obtain the synthesized speech corresponding to the to-be-synthesized text according to the voicing determination result.
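Steps 402 and 403 can be illustrated with a toy spectrum decision tree whose leaves carry a voicing probability. This is a sketch under stated assumptions: the `SpectrumLeaf` and `TreeNode` types, the context dictionary, the single "is vowel" question, the probability values, and the 0.5 threshold are all hypothetical and are not specified in the excerpt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class SpectrumLeaf:
    """Hypothetical spectrum leaf node: spectral mean plus a voicing probability."""
    mean: List[float]
    voicing_probability: float


@dataclass
class TreeNode:
    """Hypothetical internal node: a yes/no question about the state's context."""
    question: Callable[[Dict[str, str]], bool]
    yes: Union["TreeNode", SpectrumLeaf]
    no: Union["TreeNode", SpectrumLeaf]


def find_leaf(node: Union[TreeNode, SpectrumLeaf], context: Dict[str, str]) -> SpectrumLeaf:
    """Step 402: descend the spectrum decision tree to the matching leaf."""
    while isinstance(node, TreeNode):
        node = node.yes if node.question(context) else node.no
    return node


def is_voiced(leaf: SpectrumLeaf, threshold: float = 0.5) -> bool:
    """Step 403: threshold the leaf's voicing probability (0.5 is an assumed value)."""
    return leaf.voicing_probability >= threshold


vowel_leaf = SpectrumLeaf(mean=[0.9, 0.2], voicing_probability=0.93)
fricative_leaf = SpectrumLeaf(mean=[0.1, 0.8], voicing_probability=0.08)
tree = TreeNode(question=lambda ctx: ctx["phone"] in {"a", "e", "i", "o", "u"},
                yes=vowel_leaf, no=fricative_leaf)

voiced_a = is_voiced(find_leaf(tree, {"phone": "a"}))
voiced_s = is_voiced(find_leaf(tree, {"phone": "s"}))
```

Storing the voicing probability on the spectrum leaf means the voicing decision for a state follows the same context clustering as its spectral parameters, which is the coupling that step 403 relies on.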

[0101] compared to im...


Abstract

An embodiment of the present invention provides a speech synthesis method and apparatus, and a device for speech synthesis. The method specifically includes: receiving a text to be synthesized; during speech synthesis of the text to be synthesized, determining, according to spectral parameters, the voicing of the states or frames corresponding to the text to be synthesized, so as to obtain a corresponding voicing determination result; and obtaining the synthesized speech corresponding to the text to be synthesized according to the voicing determination result. According to the embodiment of the present invention, the problem of noise in synthesized speech caused by incorrect voiced/unvoiced decisions can be solved, and the listening quality and sound quality of the synthesized speech can thus be improved.

Description

Technical field

[0001] The invention relates to the technical field of speech synthesis, and in particular to a speech synthesis method and apparatus, and a device for speech synthesis.

Background

[0002] Speech synthesis technology, also known as text-to-speech (TTS) technology, converts text into speech. It gives computers the ability to speak as freely as humans, making communication between users and machines more comfortable and natural.

[0003] At present, speech synthesis based on the Hidden Markov Model (HMM), known as HTS (HMM-based Speech Synthesis System), is widely valued and applied. The basic idea of HTS is to parametrically decompose the speech signal and establish an HMM model for each acoustic parameter; during synthesis, the trained HMM models are used to predict the acoustic parameters of the text to be synthesized, and these acoustic parameters are input to the parameter s...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/02, G10L15/14, G10L25/93
CPC: G10L13/02, G10L15/142, G10L25/93
Inventor: 孟凡博
Owner: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD