Personalized multi-acoustic model training method, speech synthesis method and device

An acoustic model and speech synthesis technology, applied in the field of speech, that addresses unstable timbre, low acoustic model accuracy, and unnatural synthesized speech, thereby improving the user experience and meeting the demand for personalized speech.

Active Publication Date: 2017-03-22
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

Because the two speakers produce the original speech from different texts, and the same syllable is pronounced noticeably differently in different sentence contexts, mapping the same sound across different sentences from different speakers tends to make the trained personalized acoustic model inaccurate, which results in unnatural synthesized speech.
[0010] For the second method, because a decision tree is a shallow model with limited descriptive power, the generated personalized acoustic model is not very accurate, especially when little user voice data is available. The predicted output parameters may therefore be incoherent, causing jumps in the synthesized voice, unstable timbre, and ultimately unnatural speech.


Examples

Embodiment Construction

[0039] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0040] The following describes the personalized multi-acoustic model training method for speech synthesis, speech synthesis method and device according to the embodiments of the present invention with reference to the accompanying drawings.

[0041] FIG. 1 is a flowchart of a training method for a personalized multi-acoustic model for speech synthesis according to an embodiment of the present invention.

[0042] As shown in FIG. 1, the training method of the personalized multi-acoustic model for speech synthesis includ...


Abstract

A training method for multiple personalized acoustic models, and a voice synthesis method and device, for voice synthesis. The method comprises: training a reference acoustic model, based on first acoustic feature data of training voice data and first text annotation data corresponding to the training voice data (S11); acquiring voice data of a target user (S12); training a first target user acoustic model according to the reference acoustic model and the voice data (S13); generating second acoustic feature data of the first text annotation data, according to the first target user acoustic model and the first text annotation data (S14); and training a second target user acoustic model, based on the first text annotation data and the second acoustic feature data (S15).
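The five steps S11 to S15 can be sketched as follows. This is only an illustration under strong assumptions, not the patented implementation: both "acoustic models" are stood in for by plain linear (ridge) regressions from text annotation vectors to acoustic feature vectors, the adaptation in S13 is modeled as a regression regularized toward the reference weights, and every name (`fit_linear`, `train_personalized_models`, `adapt_reg`, `X_user`) is hypothetical. The sketch also assumes that the target user's voice data arrives with its own text annotations.

```python
import numpy as np

def fit_linear(X, Y, W_prior=None, reg=1e-3):
    """Ridge regression; if W_prior is given, the weights are pulled toward it.

    Minimizes ||X W - Y||^2 + reg * ||W - W_prior||^2 in closed form.
    """
    d = X.shape[1]
    if W_prior is None:
        W_prior = np.zeros((d, Y.shape[1]))
    A = X.T @ X + reg * np.eye(d)
    B = X.T @ Y + reg * W_prior
    return np.linalg.solve(A, B)

def train_personalized_models(X_ref, Y_ref, X_user, Y_user, adapt_reg=10.0):
    # S11: train the reference model on the first text annotation data (X_ref)
    # and the first acoustic feature data (Y_ref) of the training voice data.
    W_ref = fit_linear(X_ref, Y_ref)

    # S12: the target user's voice data is passed in as (X_user, Y_user);
    # having annotations X_user for it is an assumption of this sketch.

    # S13: first target user model, adapted from the reference model.
    W_user1 = fit_linear(X_user, Y_user, W_prior=W_ref, reg=adapt_reg)

    # S14: second acoustic feature data, generated for the first text
    # annotation data by the first target user model.
    Y_second = X_ref @ W_user1

    # S15: second target user model, trained on the first text annotation
    # data and the generated second acoustic feature data.
    W_user2 = fit_linear(X_ref, Y_second)
    return W_ref, W_user1, W_user2

# Toy usage with synthetic data (dimensions chosen arbitrarily).
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(200, 4))
Y_ref = X_ref @ rng.normal(size=(4, 3))
X_user = rng.normal(size=(10, 4))
Y_user = X_user @ rng.normal(size=(4, 3))
W_ref, W_user1, W_user2 = train_personalized_models(X_ref, Y_ref, X_user, Y_user)
```

The point of S14 and S15 in this toy setting is that the second model is trained on the full reference annotation set rather than on the small user recording alone, because the first user model supplies user-specific targets for every reference sentence.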

Description

Technical field

[0001] The invention relates to the technical field of speech, in particular to a training method for a personalized multi-acoustic model for speech synthesis, a speech synthesis method and a device.

Background

[0002] Speech synthesis, also known as text-to-speech (TTS) technology, converts text information into speech and reads it aloud. It involves multiple disciplines such as acoustics, linguistics, digital signal processing, and computer science, and is a cutting-edge technology in the field of Chinese information processing. The main problem to be solved is how to convert text information into audible sound information.

[0003] In a speech synthesis system, the process of converting text information into sound information is as follows: first, the input text needs to be processed, including preprocessing, word segmentation, part-of-speech tagging, polyphone prediction, prosodic level prediction, etc., and t...
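The text-processing stages named in [0003] can be pictured as a chain of annotation passes over one utterance. The sketch below only shows that ordering. Every stage body is a trivial placeholder and every name (`Utterance`, `run_front_end`, the stage functions) is hypothetical. Real Chinese word segmentation, polyphone disambiguation and prosody prediction require trained models and are not attempted here.

```python
from dataclasses import dataclass, field

@dataclass
class Utterance:
    text: str
    words: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)

def preprocess(u):
    # Placeholder normalization: collapse runs of whitespace.
    u.text = " ".join(u.text.split())
    return u

def segment_words(u):
    # Placeholder segmentation: split on spaces (not valid for Chinese text).
    u.words = u.text.split(" ") if u.text else []
    return u

def tag_part_of_speech(u):
    # Placeholder tagger: every word gets an unknown tag.
    u.annotations["pos"] = ["UNK"] * len(u.words)
    return u

def predict_polyphones(u):
    # Placeholder: use the lowercased word as its "pronunciation".
    u.annotations["pronunciation"] = [w.lower() for w in u.words]
    return u

def predict_prosody(u):
    # Placeholder: mark a phrase break only after the final word.
    n = len(u.words)
    u.annotations["break_after"] = [i == n - 1 for i in range(n)]
    return u

# Stage order follows the list in [0003].
FRONT_END = [preprocess, segment_words, tag_part_of_speech,
             predict_polyphones, predict_prosody]

def run_front_end(text):
    u = Utterance(text)
    for stage in FRONT_END:
        u = stage(u)
    return u
```

Keeping each stage as a function from `Utterance` to `Utterance` makes it possible to swap a placeholder for a trained component without touching the others.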

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L13/02, G10L15/02, G10L15/183
CPC: G10L13/02, G10L13/10, G10L15/02, G10L15/183, G10L13/08, G10L15/04, G10L15/063, G10L15/142, G10L15/1807, G10L2015/0631
Inventor: 李秀林
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD