Speech synthesis method, device and equipment

A speech synthesis technology applied in the computer field. It addresses monotonous, flat rhythm and low voice quality, simplifies the speech synthesis steps, and avoids over-smoothing.

Active Publication Date: 2019-03-08

BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Cites: 4; Cited by: 0


AI Technical Summary

Problems solved by technology

HTS speech synthesis generates dynamic parameters from differential dynamic features. This over-smooths the generated feature parameter sequence, which leads to low sound quality and flat rhythm in the synthesized speech.
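To make the term "differential dynamic features" concrete, the following minimal sketch computes first-order dynamic (delta) features from a one-dimensional static parameter track. The window coefficients (-0.5, 0, 0.5), the edge handling, and the function name `delta_features` are assumptions chosen for illustration; the patent text shown here does not specify them. The sketch only shows what such features are, not how over-smoothing arises during parameter generation.

```python
def delta_features(static, window=(-0.5, 0.0, 0.5)):
    """First-order dynamic (delta) features of a 1-D static parameter track.

    The window is an assumed, commonly used choice; edges are handled by
    repeating the first and last frame (also an assumption).
    """
    padded = [static[0]] + list(static) + [static[-1]]
    return [
        window[0] * padded[t] + window[1] * padded[t + 1] + window[2] * padded[t + 2]
        for t in range(len(static))
    ]


# Hypothetical static track (for example, one spectral coefficient over 5 frames).
track = [1.0, 2.0, 4.0, 4.0, 3.0]
deltas = delta_features(track)
```

Each delta value is half the difference between the next and the previous frame, so it describes the local slope of the static track.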

Examples

[0039] As shown in Figure 1, the speech synthesis method mainly includes the following steps:

[0040] S100. Obtain context information of the text to be processed;

[0041] Obtaining the context information of the text to be processed means obtaining the context information of the text that is to be synthesized into speech. The obtained context information includes, but is not limited to, the consonants, tones, and pauses of each word in the text to be processed.

[0042] In contrast to the prior art, in which context information can only be processed at the state level, the context information acquired in the embodiment of the present application can be at either the phone level or a state level smaller than the phone level. At the phone level, the smallest unit of the acquired context information is the initial or final of a pinyin syllable; at the state level smaller than the phone level, the smallest unit of the acquired context information is the sub-segment o...
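As an illustration of phone-level context information, the sketch below splits a tone-numbered pinyin syllable into its initial and final and attaches tone and pause flags to each unit. The names `UnitContext`, `split_syllable`, and `syllable_contexts`, the tone-digit notation (for example `guo2`), and the treatment of `y` and `w` as initials are all assumptions made for this example; the patent text shown here does not define a concrete data format.

```python
from dataclasses import dataclass

# Two-letter initials come first so that "zh" is matched before "z".
# Treating "y" and "w" as initials is a simplifying assumption.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


@dataclass
class UnitContext:
    unit: str          # the initial or final itself
    kind: str          # "initial" or "final"
    tone: int          # tone number of the syllable, 0 if not marked
    pause_after: bool  # whether a pause follows this unit


def split_syllable(syllable):
    """Split e.g. 'zhong1' into ('zh', 'ong', 1)."""
    tone = int(syllable[-1]) if syllable[-1].isdigit() else 0
    body = syllable.rstrip("0123456789")
    for initial in INITIALS:
        if body.startswith(initial):
            return initial, body[len(initial):], tone
    return "", body, tone  # syllable with no initial, e.g. 'an4'


def syllable_contexts(syllable, pause_after=False):
    """Build phone-level context entries for one syllable."""
    initial, final, tone = split_syllable(syllable)
    units = []
    if initial:
        units.append(UnitContext(initial, "initial", tone, False))
    units.append(UnitContext(final, "final", tone, pause_after))
    return units


contexts = syllable_contexts("guo2", pause_after=True)
```

A state-level representation would subdivide each of these units further; that step is not sketched here because the source text is truncated at that point.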

Abstract

The invention provides a speech synthesis method, apparatus and equipment. The method comprises: obtaining context information of a text to be processed; determining a speech duration from the context information by using a duration prediction model, where the duration prediction model is obtained by training a deep neural network; determining spectral and fundamental frequency feature parameters from the context information and the speech duration by using a spectral and fundamental frequency prediction model; and obtaining synthesized speech from the spectral and fundamental frequency feature parameters. The method can provide natural, smooth speech with high sound quality.
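The data flow described in the abstract can be sketched as a chain of three stages: duration prediction, spectral and fundamental frequency prediction, and waveform generation. In the minimal sketch below, all function names and the toy stand-in models are hypothetical, and the frame-expansion step (repeating each unit's context once per predicted frame) is an assumption about how context and duration might be combined; the abstract does not state it.

```python
def expand_to_frames(contexts, durations):
    """Repeat each unit's context once per frame and tag the relative position."""
    frames = []
    for ctx, n_frames in zip(contexts, durations):
        for i in range(n_frames):
            frames.append({**ctx, "position": i / n_frames})
    return frames


def synthesize(contexts, duration_model, acoustic_model, vocoder):
    """Context -> durations -> spectrum and F0 parameters -> synthesized output."""
    durations = duration_model(contexts)            # frames per unit
    frames = expand_to_frames(contexts, durations)  # assumed combination step
    features = acoustic_model(frames)               # spectrum and F0 per frame
    return vocoder(features)


# Toy stand-ins so the sketch runs; real models would be trained networks.
def toy_duration_model(contexts):
    return [3 if c["pause_after"] else 2 for c in contexts]


def toy_acoustic_model(frames):
    return [{"spectrum": [0.0], "f0": 100.0 + 10.0 * f["tone"]} for f in frames]


def toy_vocoder(features):
    # Returns the F0 track instead of audio samples, purely for illustration.
    return [f["f0"] for f in features]


contexts = [
    {"unit": "zh", "tone": 1, "pause_after": False},
    {"unit": "ong", "tone": 1, "pause_after": True},
]
output = synthesize(contexts, toy_duration_model, toy_acoustic_model, toy_vocoder)
```

Passing the models in as callables keeps the pipeline structure separate from any particular network architecture, which the abstract leaves open apart from stating that the duration model is trained as a deep neural network.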

Description

Technical field

[0001] The invention relates to the field of computers, and in particular to a speech synthesis method, device and equipment.

Background

[0002] Speech synthesis is a technology that generates artificial speech by mechanical and electronic means. It converts text, whether generated by the computer itself or input externally, into intelligible and fluent speech output. Current speech synthesis mostly adopts parametric synthesis (hereinafter referred to as HTS) based on the Hidden Markov Model (hereinafter referred to as HMM). In the training phase, HTS performs decision tree clustering and HMM modeling on the training data to obtain clustered HMMs and decision trees. In the synthesis phase, the decision trees are applied to the context information of the text to be synthesized to obtain the corresponding acoustic information, that is, ...
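The synthesis-phase lookup described in the background, in which a decision tree maps context information to acoustic information, can be illustrated with a small sketch. The tree layout, the questions, and the `f0_mean` values below are invented for illustration only and are not taken from HTS or from the patent.

```python
def lookup_leaf(tree, context):
    """Walk a binary decision tree of yes/no context questions down to a leaf."""
    node = tree
    while "leaf" not in node:
        question = node["question"]  # a predicate over the context dict
        node = node["yes"] if question(context) else node["no"]
    return node["leaf"]


# Hypothetical tree: each leaf stands in for clustered acoustic parameters.
toy_tree = {
    "question": lambda c: c["tone"] == 4,
    "yes": {"leaf": {"f0_mean": 180.0}},
    "no": {
        "question": lambda c: c["pause_after"],
        "yes": {"leaf": {"f0_mean": 150.0}},
        "no": {"leaf": {"f0_mean": 200.0}},
    },
}

acoustic_info = lookup_leaf(toy_tree, {"tone": 2, "pause_after": True})
```

Because many different contexts end at the same leaf, they share one set of parameters.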

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L13/02, G10L15/16, G10L25/30

Inventor: 康永国, 李威, 贾磊, 盖于涛, 邹赛赛

Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
