Method and device for synthesizing emotional speech of specific speaker under extremely low resources

A speech synthesis technology for emotional speech of a specific speaker, applied in the field of computer equipment and storage media, which can solve the problem of high cost.

Pending Publication Date: 2020-09-04

SHENGZHI INFORMATION TECH NANJING CO LTD

Problems solved by technology

The advantage of the neural network synthesis method is that the processing is simple and the synthesized speech sounds very natural. The disadvantage is that a large data set is required to train the model, and labeling emotional data in particular is too costly.

It can be seen that traditional emotional speech synthesis solutions often have limitations and high costs.

Embodiment Construction

[0031] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0032] Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The occurrences of this phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is understood explicitly and implicitly by those skilled in the art that the embodiments described herein can be combined with other embodiment...

Abstract

The invention discloses a method and a device for synthesizing emotional speech of a specific speaker under extremely low resources, together with computer equipment and a storage medium. The method comprises the following steps: acquiring a training text and the audio corresponding to the training text; converting the training text into a phoneme sequence and embedding a slot with an emotion vector to obtain initial training data; inputting the initial training data into a deep learning model and training to obtain a base model; acquiring a specific text and a specific audio; converting the specific text into a corresponding phoneme sequence and embedding a slot with an emotion vector into the phoneme sequence to obtain specific training data; inputting the specific training data into the base model and training to obtain a speech synthesis model; converting a to-be-synthesized text into a phoneme sequence to obtain a to-be-synthesized phoneme sequence; filling the to-be-synthesized phoneme sequence into the emotion slot to obtain synthetic input data; and inputting the synthetic input data into the speech synthesis model to obtain speech audio with a specific emotion. This reduces the cost of obtaining speech audio with a specific emotion and improves the flexibility of the emotional speech synthesis scheme.
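The data preparation shared by the three stages in the abstract (base training, speaker-specific training, and synthesis) can be illustrated with a minimal sketch. It is not the patent's implementation: the emotion inventory `EMOTIONS`, the toy lexicon `TOY_LEXICON`, the one-hot encoding of the emotion vector, and the names `text_to_phonemes`, `emotion_vector`, `ModelInput`, and `build_input` are all assumptions made for illustration, since the source does not specify how the slot is represented.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical emotion inventory; the source does not list the emotions used.
EMOTIONS = ["neutral", "happy", "sad", "angry"]

# Toy word-to-phoneme lexicon, purely illustrative. A real system would use
# a full grapheme-to-phoneme front end.
TOY_LEXICON: Dict[str, List[str]] = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def text_to_phonemes(text: str) -> List[str]:
    """Convert text to a flat phoneme sequence using the toy lexicon."""
    phonemes: List[str] = []
    for word in text.lower().split():
        # Unknown words map to a single placeholder token.
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes


def emotion_vector(emotion: str) -> List[float]:
    """Encode an emotion label as a one-hot vector (an assumed encoding)."""
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    return [1.0 if e == emotion else 0.0 for e in EMOTIONS]


@dataclass
class ModelInput:
    """Phoneme sequence paired with its emotion slot."""
    phonemes: List[str]
    emotion_slot: List[float]


def build_input(text: str, emotion: str) -> ModelInput:
    """Build one model input: phoneme sequence plus a filled emotion slot."""
    return ModelInput(text_to_phonemes(text), emotion_vector(emotion))


if __name__ == "__main__":
    sample = build_input("hello world", "happy")
    print(sample.phonemes)
    print(sample.emotion_slot)
```

Under these assumptions, the same `build_input` helper would prepare the large base corpus, the small speaker-specific corpus, and the text to be synthesized, so the model sees an identically shaped emotion slot at every stage.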

Description

Technical field

[0001] The present invention relates to the technical field of speech signal processing, and in particular to a method, device, computer equipment and storage medium for synthesizing emotional speech of a specific speaker under extremely low resources.

Background art

[0002] Speech synthesis technology gives computers (or various terminal devices) the ability to speak like humans, and is a typical interdisciplinary subject. TTS (text-to-speech) technology belongs to speech synthesis; it converts text information generated by the computer itself or input externally into intelligible and fluent speech output. Emotional speech synthesis is a research field that has emerged only in the past ten years. Compared with traditional speech synthesis, emotional speech synthesis takes into account the speaker's emotional state and speaking style, making the synthesized speech more intelligent and humanized, with more Wide...

Application Information

IPC(8): G10L13/08, G10L13/04, G10L25/63

CPC: G10L13/08, G10L25/63

Inventor: 袁熹

Owner: SHENGZHI INFORMATION TECH NANJING CO LTD
