Speech synthesis method and device, equipment and storage medium

A speech synthesis technology, applicable to speech synthesis, speech analysis, instruments, and related fields, that addresses the limited emotional expression of synthesized speech and the inability to control speaking style independently.

Pending Publication Date: 2021-05-11
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0003] The main purpose of this application is to provide a speech synthesis method, device, computer equipment and computer-readable storage medium, aiming ...


Examples

Embodiment Construction

[0026] The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, fall within the scope of protection of this application.

[0027] The flowcharts shown in the drawings are only illustrative. They need not include every element and operation/step, nor must the steps be performed in the order described. For example, some operations/steps may be decomposed, combined, or partly combined, so the actual order of execution may vary with the situation.

[0028] Embodiments of the present application provide a speech synthesis method, device, computer equipment, and co...

Abstract

The invention relates to the technical field of artificial intelligence and discloses a speech synthesis method and device, computer equipment, and a computer-readable storage medium. The method comprises: obtaining a text to be processed and a speech style audio to be synthesized, and inputting both into a preset speech synthesis model; encoding the speech style audio with the multi-reference encoder to obtain style embedding vector information; encoding the text with the text encoder to obtain text encoding vector information; concatenating the style embedding vector information and the text encoding vector information through the fully connected layer to generate a Mel spectrogram; and performing feature extraction on the Mel spectrogram through the output layer to output the target audio for the text. This enables control over the speaking style of the synthesized speech and synthesis of speech with richer emotional expression.
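The data flow in the abstract can be illustrated with a toy sketch in plain Python. It is not the patented model: the mean-pooling style encoder, the character embedding table, the layer sizes, and all function names (encode_style, encode_text, synthesize_mel, make_params) are assumptions chosen only to show a style vector being concatenated onto each text encoding step and projected by a fully connected layer into Mel frames. The weights are random and untrained, and the final output layer that turns the Mel spectrogram into audio is left out.

```python
import math
import random


def dense(x, weights, bias):
    """One fully connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_dense(rng, n_in, n_out):
    """Random, untrained weights for an n_in -> n_out dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def encode_style(ref_frames, weights, bias):
    """Hypothetical stand-in for the multi-reference encoder:
    mean-pool the reference frames over time, then project to a style vector."""
    n = len(ref_frames)
    pooled = [sum(col) / n for col in zip(*ref_frames)]
    return [math.tanh(v) for v in dense(pooled, weights, bias)]


def encode_text(text, table):
    """Hypothetical stand-in for the text encoder: one vector per character."""
    return [table[ord(ch) % len(table)] for ch in text]


def synthesize_mel(text, ref_frames, params):
    """Concatenate the style vector onto every text step, then map each
    concatenated vector to one Mel frame with a fully connected layer."""
    style = encode_style(ref_frames, *params["style"])
    text_enc = encode_text(text, params["embed"])
    return [dense(step + style, *params["fc"]) for step in text_enc]


def make_params(seed=0, n_mels=8, style_dim=4, text_dim=6, vocab=64):
    """Assumed toy dimensions; a real model would learn these parameters."""
    rng = random.Random(seed)
    return {
        "style": init_dense(rng, n_mels, style_dim),
        "embed": [[rng.uniform(-1, 1) for _ in range(text_dim)] for _ in range(vocab)],
        "fc": init_dense(rng, text_dim + style_dim, n_mels),
    }


params = make_params()
# Reference "style audio" as 5 frames of 8 Mel bins (made-up values).
reference = [[0.1 * (i + j) for j in range(8)] for i in range(5)]
mel = synthesize_mel("hello", reference, params)
print(len(mel), len(mel[0]))  # one Mel frame per character, 8 bins per frame
```

Because the style vector is computed separately from the text encoding, swapping the reference frames changes the predicted Mel frames without touching the text input, which is the kind of independent style control the abstract describes.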

Description

Technical field

[0001] The present application relates to the technical field of speech semantics, and in particular to a speech synthesis method, device, computer equipment and computer-readable storage medium.

Background

[0002] Speech synthesis must consider not only the clarity and fluency of the synthesized speech but also its prosodic information, so that the synthesized speech has rich emotional expression. Beyond the smoothness of a sentence, a synthesis system should also be able to vary the speaker's emotional state, using the model to learn the style information of a reference audio so that the result is comparable to a human voice. In current prosodic model construction, the common approach is to classify all speaking styles into one expression; the speaking styles cannot be separated, so they cannot be controlled separately, and the emotional expressi...

Application Information

IPC(8): G10L13/047, G10L13/10, G10L25/18, G10L25/30
CPC: G10L13/10, G10L13/047, G10L25/30, G10L25/18
Inventor: 孙奥兰, 王健宗, 程宁
Owner: PING AN TECH (SHENZHEN) CO LTD