Chinese and English mixed speech synthesis method and device, electronic equipment and storage medium

A Chinese-English mixed speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses three problems: the model being influenced by the speaker, inconsistent phonemes being fed into the model, and flawed synthesis results.

Pending Publication Date: 2021-09-10

携程科技(上海)有限公司

Problems solved by technology

Although Chinese and English are fed to the model as the same kind of characters, the phonemes fed to the model are not unified: when the two kinds of speech produce the same sound, that sound is represented by different phonemes. As a result, the model is affected by the speaker and the synthesized speech is flawed.

Examples

[0032] Figure 1 shows the main steps of the Chinese-English mixed speech synthesis method in one embodiment. Referring to Figure 1, the method includes:

Step S110: normalize (regularize) the initial text, which contains Chinese text and English text, converting the Chinese text into pinyin with tones and the English text into words.

Step S120: align the regularized text with the corresponding initial audio to obtain an aligned text with pause rhythm.

Step S130: perform phoneme conversion on the aligned text, converting the pinyin and the words in the aligned text into the corresponding Carnegie Mellon University (CMU) phonemes.

Step S140: convert each CMU phoneme into a phoneme vector and input it into the acoustic model to obtain the Mel spectrum features corresponding to the initial text.

Step S150: input the Mel spectrum features into the vocoder to synthesize the target audio.
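The text-side steps S110 and S130 can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the three toy lookup tables, the function names `normalize` and `to_cmu`, and in particular the choice to carry the pinyin tone as a separate marker such as `T3`. The English entry follows the style of the CMU Pronouncing Dictionary. Step S120 (alignment with audio) is skipped because it needs recorded speech.

```python
import re

# Hypothetical toy tables, for illustration only. A real system would need a
# full Chinese lexicon, a complete pinyin-to-CMU mapping, and an English
# pronunciation dictionary.
HANZI_TO_PINYIN = {"你": "ni3", "好": "hao3"}
PINYIN_TO_CMU = {"ni": ["N", "IY"], "hao": ["HH", "AW"]}
WORD_TO_CMU = {"hello": ["HH", "AH0", "L", "OW1"]}


def normalize(text):
    """S110 (sketch): Chinese characters -> toned pinyin, English -> lowercase words."""
    tokens = []
    # Match either a run of Latin letters or a single CJK character.
    for piece in re.findall(r"[A-Za-z]+|[\u4e00-\u9fff]", text):
        if piece.isascii():
            tokens.append(piece.lower())
        else:
            tokens.append(HANZI_TO_PINYIN[piece])  # KeyError if not in the toy lexicon
    return tokens


def to_cmu(tokens):
    """S130 (sketch): map toned pinyin and English words to CMU-style phonemes."""
    phonemes = []
    for tok in tokens:
        if tok[-1].isdigit():
            # Toned pinyin such as "ni3": split the syllable from its tone digit.
            base, tone = tok[:-1], tok[-1]
            phonemes.extend(PINYIN_TO_CMU[base])
            phonemes.append("T" + tone)  # assumed tone marker, not from the patent
        else:
            phonemes.extend(WORD_TO_CMU[tok])
    return phonemes


tokens = normalize("你好 Hello")
phonemes = to_cmu(tokens)
print(tokens)
print(phonemes)
```

In this sketch the Chinese syllable "hao" and the English word "hello" both start with the same symbol `HH`, which is the kind of sharing the unified phoneme set is meant to achieve.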

[0033]The above speech synthesis meth...

Abstract

The invention relates to the technical field of language processing, and provides a Chinese and English mixed speech synthesis method and device, electronic equipment and a storage medium. The speech synthesis method comprises the following steps: regularizing an initial text containing a Chinese text and an English text, converting the Chinese text into pinyin with tones, and converting the English text into words; aligning the regularized text with the corresponding initial audio to obtain an aligned text with a pause rhythm; carrying out phoneme conversion on the aligned text, and respectively converting pinyin and words in the aligned text into corresponding CMU phonemes; converting each CMU phoneme into a phoneme vector, inputting the phoneme vector into an acoustic model, and obtaining a Mel spectrum feature corresponding to the initial text; and inputting the Mel spectrum features into a vocoder to synthesize a target audio. By converting Chinese and English into unified CMU phonemes, Chinese and English pronunciations are mapped to the same pronunciation space, and the synthesis effect of the Chinese and English mixed speech is effectively improved.
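The idea of mapping both languages into one pronunciation space can be sketched as a shared phoneme-to-vector lookup. This is a minimal illustration under stated assumptions, not the patent's acoustic model: the helper names, the vector dimension, and the randomly initialized table are all hypothetical, and in a trained system the vectors would be learned together with the acoustic model.

```python
import random


def build_vocab(sequences):
    """Assign one integer ID per distinct phoneme symbol across all sequences."""
    symbols = sorted({p for seq in sequences for p in seq})
    return {sym: idx for idx, sym in enumerate(symbols)}


def make_embedding_table(vocab_size, dim, seed=0):
    """Stand-in for a learned embedding table: one random vector per phoneme ID."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(phonemes, vocab, table):
    """Look up the vector for each phoneme symbol."""
    return [table[vocab[p]] for p in phonemes]


# Hypothetical phoneme sequences: one derived from pinyin, one from an English word.
zh_phonemes = ["HH", "AW"]
en_phonemes = ["HH", "AH0", "L", "OW1"]

vocab = build_vocab([zh_phonemes, en_phonemes])
table = make_embedding_table(len(vocab), dim=4)

zh_vectors = embed(zh_phonemes, vocab, table)
en_vectors = embed(en_phonemes, vocab, table)

# The shared symbol "HH" resolves to the same vector for both languages.
print(zh_vectors[0] == en_vectors[0])
```

Because both sequences draw on a single vocabulary, a sound shared by Chinese and English is presented to the model as one input rather than two language-specific ones.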

Description

Technical field

[0001] The invention relates to the technical field of language processing, in particular to a Chinese-English mixed speech synthesis method, device, electronic equipment and storage medium.

Background

[0002] Large online travel service companies have a large number of users who need service. Combining speech synthesis with speech recognition, dialogue management, natural language understanding, and natural language generation to build an outbound-call robot saves labor costs and serves users efficiently. Within such a system, the quality of the synthesized speech broadcast plays a vital role in giving users a better service experience.

[0003] With the continuous expansion of the tourism business, a large number of overseas services and overseas users need to be connected, and a large amount of mixed Chinese and English information needs to be broadcast. Based on this, a Chinese-English mixed speech synthesis model (hereinaf...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L13/02, G10L13/08, G10L13/04, G10L25/24, G10L25/30

CPC: G10L13/02, G10L13/08, G10L13/086, G10L13/04, G10L25/24, G10L25/30, G10L2013/083

Inventor: 陈子浩, 罗超, 周明康, 邹宇, 李巍, 严丽

Owner: 携程科技(上海)有限公司
