Chinese and English mixed speech synthesis method and device
A Chinese-English mixed speech synthesis technology, applicable to speech synthesis, speech analysis, instruments, etc. It addresses the difficulty of synthesizing speech from Chinese-English mixed text and achieves high speech synthesis quality.
Examples
Embodiment 1
[0041] Referring to Figure 1, a Chinese-English mixed speech synthesis method comprises a training phase and an inference phase. The training phase includes the following steps:
[0042] S11. Obtain multi-speaker Chinese and English speech training data, and extract acoustic features from the speech to obtain a training data set;
[0043] Optionally, the English speech synthesis data can come from public data sets such as LJSpeech and VCTK, and the Chinese speech synthesis data can come from the female voice database of Biaobei Company and a self-recorded voice database covering more than 20 speakers.
[0044] It should be understood that the Chinese and English speech training data include: Chinese speech data with the corresponding Chinese text, English speech data with the corresponding English text, and Chinese-English mixed speech data with the corresponding Chinese-English mixed text. The extracted acoustic features include, but are not limited to, Mel spectral features.
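As background for the Mel spectral features mentioned in [0044], the following is a minimal sketch of the mel frequency scale that such features are built on. It assumes the common HTK-style formula; the function names, the 80-band count and the 0 to 8000 Hz range are illustrative assumptions and are not taken from the patent.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies (Hz) of n_mels bands equally spaced on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


# Hypothetical settings: 80 mel bands between 0 Hz and 8000 Hz.
centers = mel_center_frequencies(n_mels=80, f_min=0.0, f_max=8000.0)
```

The bands are equally spaced in mels, so in Hz they are packed densely at low frequencies and spread out at high frequencies. A full Mel spectrogram would additionally apply triangular filters around these centers to a short-time power spectrum.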
[0045] S12. Standardize the English ...
Embodiment 2
[0068] A Chinese-English mixed speech synthesis device, comprising:
[0069] A text processing module, used to normalize the Chinese and English texts and convert them into a unified Pinyin phoneme representation;
[0070] Optionally, the text processing module processes the Chinese and English portions of the mixed text separately. For English text, it standardizes the text and removes illegal characters; converts the text to ASCII; converts the letters to lowercase; expands abbreviations; and uses the CMU pronunciation dictionary to convert each English word into CMU pronunciation phonemes. If a word is not among the keys of the CMU dictionary, the sentence text and the corresponding speech are removed from the training data. A mapping dictionary between CMU pronunciation phonemes and Pinyin phonemes is created, and the CMU pronunciation phonemes are converted into Pinyin phonemes through this mapping dictionary. For Chinese text, it standardizes the text, screens out illegal characte...
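The English branch of the text processing described in [0070] can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the three dictionaries are tiny hypothetical stand-ins (a real system would load the full CMU pronouncing dictionary), and in particular the CMU-to-Pinyin phoneme mapping shown here is an invented placeholder, since the source does not list the actual mapping.

```python
import re
import unicodedata
from typing import Dict, List, Optional

# Hypothetical toy tables for illustration only.
ABBREVIATIONS: Dict[str, str] = {"mr.": "mister", "dr.": "doctor"}
CMU_DICT: Dict[str, List[str]] = {
    "hello": ["HH", "AH0", "L", "OW1"],
    "world": ["W", "ER1", "L", "D"],
}
# Placeholder mapping; the patent's real CMU-to-Pinyin table is not given.
CMU_TO_PINYIN: Dict[str, str] = {
    "HH": "h", "AH": "e", "L": "l", "OW": "ou",
    "W": "w", "ER": "er", "D": "d",
}


def normalize_english(text: str) -> str:
    """Convert to ASCII, lowercase, expand abbreviations, drop illegal characters."""
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    text = text.lower()
    # Expand abbreviations before punctuation is stripped, so "dr." still matches.
    text = " ".join(ABBREVIATIONS.get(tok, tok) for tok in text.split())
    # Treat everything except letters, apostrophes and spaces as illegal.
    text = re.sub(r"[^a-z' ]", " ", text)
    return " ".join(text.split())


def english_to_pinyin_phonemes(text: str) -> Optional[List[str]]:
    """Return Pinyin-style phonemes, or None if the sentence should be dropped."""
    phonemes: List[str] = []
    for word in normalize_english(text).split():
        if word not in CMU_DICT:
            # Out-of-dictionary word: signal that the whole sentence is discarded.
            return None
        for cmu in CMU_DICT[word]:
            base = cmu.rstrip("012")  # strip the CMU stress digit
            phonemes.append(CMU_TO_PINYIN[base])
    return phonemes
```

Returning None for an out-of-dictionary word mirrors the rule that such a sentence, together with its speech, is removed from the training data; a caller would filter those pairs out before training.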