
English speech synthesis method and system, electronic equipment and storage medium

An English speech synthesis method, applied in the field of speech synthesis, that addresses problems such as large information loss and insufficiently clear and natural synthesized sound, and achieves the effect of ensuring synthesis quality and real-time performance.

Pending Publication Date: 2020-09-25
CTRIP COMP TECH SHANGHAI

AI Technical Summary

Problems solved by technology

The information loss of the parametric synthesis method is large, and the synthesized sound is not clear or natural enough.



Examples


Embodiment 1

[0045] This embodiment provides a method for synthesizing English speech which, as shown in Figure 1, includes the following steps:

[0046] Step S101, converting the target English text into a corresponding text vector.

[0047] In an optional implementation, the method also includes preprocessing the target English text before step S101. In one example, the target English text is normalized, for instance by removing garbled characters or non-standard symbols. In another example, Chinese symbols in the target English text are replaced with the corresponding English symbols. In another example, numbers in the target English text are converted into English words according to the scene in which they appear. For the same number "205", if the scene is a room number, the corresponding English words are "two, zero, five"; if the scene is an amount of money, the corresponding English words are "two hundred and five".
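The symbol replacement and scene-dependent number conversion described above can be illustrated with a minimal Python sketch. The function names, the scene labels `"room"` and `"money"`, the punctuation table, and the 0 to 999 range limit are all assumptions made for illustration; the patent text does not specify how these steps are implemented.

```python
# Hypothetical table of Chinese (full-width) symbols and their English equivalents.
CN_TO_EN = {"，": ",", "。": ".", "！": "!", "？": "?", "：": ":", "；": ";"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def replace_chinese_symbols(text: str) -> str:
    """Replace each known Chinese symbol with its English counterpart."""
    return "".join(CN_TO_EN.get(ch, ch) for ch in text)


def cardinal_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 as a cardinal number."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0 to 999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" and " + cardinal_to_words(rest) if rest else "")


def verbalize_number(number: str, scene: str) -> str:
    """Convert a digit string into English words for the given scene."""
    if scene == "room":      # read digit by digit
        return ", ".join(ONES[int(d)] for d in number)
    if scene == "money":     # read as a cardinal amount
        return cardinal_to_words(int(number))
    raise ValueError(f"unknown scene: {scene}")


room_words = verbalize_number("205", "room")    # "two, zero, five"
money_words = verbalize_number("205", "money")  # "two hundred and five"
```

How the scene is detected in running text is left open here; in this sketch the caller supplies it explicitly.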

[004...

Embodiment 2

[0066] This embodiment provides an English speech synthesis system 40. As shown in Figure 4, it includes a text processing module 41, a feature extraction module 42, a prediction module 43 and a vocoder 44.

[0067] The text processing module 41 is used to convert the target English text into corresponding text vectors.

[0068] In an optional implementation, the text processing module 41 is also used to preprocess the target English text. In one example, the target English text is normalized. In another example, Chinese symbols in the target English text are replaced with the corresponding English symbols. In another example, numbers in the target English text are converted into English words according to the scene in which they appear.

[0069] The feature extraction module 42 is used to extract the parameters of the template audio corresponding to the target sentence pattern, and convert the parameters into corresponding parameter vectors; whe...
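As a rough illustration of how the four components of synthesis system 40 might be wired together, the following Python sketch treats each module as a pluggable callable. The class name `SynthesisSystem`, the type aliases, the call signatures, and the stub lambdas are hypothetical; in particular, passing both the text vectors and the parameter vector to the prediction module is an assumption, and the stubs stand in for real models only to make the data flow visible.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


@dataclass
class SynthesisSystem:
    """Hypothetical wiring of modules 41 to 44; interfaces are assumed."""
    text_processing: Callable[[str], List[Vector]]                 # module 41
    feature_extraction: Callable[[bytes], Vector]                  # module 42
    prediction: Callable[[List[Vector], Vector], List[Vector]]     # module 43
    vocoder: Callable[[List[Vector]], bytes]                       # vocoder 44

    def synthesize(self, text: str, template_audio: bytes) -> bytes:
        text_vectors = self.text_processing(text)
        parameter_vector = self.feature_extraction(template_audio)
        acoustic_features = self.prediction(text_vectors, parameter_vector)
        return self.vocoder(acoustic_features)


# Stub modules with made-up behaviour, used only to exercise the pipeline.
system = SynthesisSystem(
    text_processing=lambda text: [[float(ord(c))] for c in text],
    feature_extraction=lambda audio: [0.5],
    prediction=lambda tv, pv: [[sum(v) + sum(pv)] for v in tv],
    vocoder=lambda feats: bytes(len(feats)),  # one placeholder byte per frame
)
audio = system.synthesize("Hi", b"")
```

Keeping each module behind a plain callable means any one of them could be swapped for a trained model without touching the others.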

Embodiment 3

[0077] Figure 5 is a schematic structural diagram of an electronic device provided in this embodiment. The electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor; the processor implements the English speech synthesis method of Embodiment 1 when executing the program. The electronic device 3 shown in Figure 5 is only an example, and should not impose any limitation on the functions and scope of application of the embodiments of the present invention.

[0078] The electronic device 3 may take the form of a general-purpose computing device; for example, it may be a server device. Components of the electronic device 3 may include, but are not limited to: the at least one processor 4 mentioned above, the at least one memory 5 mentioned above, and a bus 6 connecting different system components (including the memory 5 and the processor 4).

[0079] The bus 6 includes a data bus, an address bus and a control bus.

[0080] The memory 5 ...


Abstract

The invention discloses an English speech synthesis method and system, electronic equipment and a storage medium. The English speech synthesis method comprises the following steps: converting a target English text into a corresponding text vector; extracting parameters of a template audio corresponding to the target sentence pattern and converting the parameters into a corresponding parameter vector, wherein the parameters represent intonation characteristics of the template audio; splicing the text vector and the parameter vector, inputting the result into an acoustic model, and obtaining the corresponding acoustic features through prediction; and converting the acoustic features into playable audio. By synthesizing the audio from the English text together with the parameters of the template audio, the invention enables a machine to speak English with the intonation of the corresponding sentence pattern, while ensuring the quality and real-time performance of speech synthesis.
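The splicing step in the abstract can be pictured with a small Python sketch. It assumes that the same parameter vector is appended to every text vector before the result is passed to the acoustic model; the abstract does not state the exact splicing scheme, so this is one plausible reading. The function name `splice` and all numeric values are made up.

```python
from typing import List, Sequence


def splice(text_vectors: Sequence[Sequence[float]],
           parameter_vector: Sequence[float]) -> List[List[float]]:
    """Append the parameter vector to each text vector (assumed scheme)."""
    return [list(tv) + list(parameter_vector) for tv in text_vectors]


# Made-up example: three text vectors of width 2, one intonation vector of width 2.
text_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
parameter_vector = [1.0, 0.0]  # hypothetical encoding of intonation features

spliced = splice(text_vectors, parameter_vector)
# Each row now carries both the text content and the intonation condition.
```

Under this assumption the acoustic model sees the intonation information at every position of the input sequence.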

Description

Technical field

[0001] The invention relates to the technical field of speech synthesis, in particular to an English speech synthesis method and system, electronic equipment and a storage medium.

Background art

[0002] Existing speech synthesis techniques are mainly divided into the splicing (concatenative) method and the parametric method. The splicing method pre-records a large amount of speech, then selects the recordings of the required basic units according to the text to be synthesized and splices them together into the output speech. Although the voice quality of the splicing method is high, the amount of audio data that must be recorded is huge and the cost is very high. The parametric method uses a statistical model to generate speech parameters at every moment, such as fundamental frequency and formant frequency, and then converts these parameters into sound through a vocoder. However, the information loss of the parametric synthesis m...


Application Information

IPC(8): G10L13/08, G10L13/04
CPC: G10L13/08, G10L13/04, Y02D10/00
Inventor 周明康, 罗超, 吉聪睿, 李巍, 胡泓
Owner CTRIP COMP TECH SHANGHAI