Method for synthesizing emotional speech by utilizing transfer learning under low resources

A speech synthesis and transfer learning technique, applicable to speech synthesis, speech analysis, and related fields, that addresses the high cost of acquiring large training data sets and the cases where such data sets cannot be obtained at all.

Pending Publication Date: 2020-11-17

TIANJIN UNIV

Cites: 0 | Cited by: 1

AI Technical Summary
Problems solved by technology

Emotional speech synthesis trained on a large amount of data has reached an acceptable level, but in some special cases a large training set cannot be obtained, or can be obtained only at relatively high cost.

Embodiment Construction

[0045] The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

[0046] This embodiment provides a method for emotional speech synthesis using transfer learning under low resources. In practice, this embodiment uses two data sets: EMOV-DB and LJSpeech-1.1. EMOV-DB is the low-resource emotional speech synthesis dataset; its text is based on the CMU Arctic database. It includes recordings of four speakers, two men and two women, and its emotion types are neutral, sleepy, angry, disgusted, and amused. LJSpeech-1.1 is a single-speaker, emotionally neutral speech synthesis dataset containing 13,100 short audio clips of one speaker reading from 7 non-fiction books. Tran...
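As a rough illustration of how a small single-speaker emotional subset might be carved out of a four-speaker corpus like the one described above, the sketch below groups one speaker's utterances by emotion. The `Utterance` record, the speaker ids, the file paths, and the exact label strings are all hypothetical; the source does not describe the corpus layout or any loading code.

```python
from dataclasses import dataclass

# Emotion categories of the low-resource corpus; the label strings are illustrative.
EMOTIONS = ("neutral", "sleepy", "angry", "disgusted", "amused")


@dataclass(frozen=True)
class Utterance:
    speaker: str   # hypothetical speaker id, e.g. "spk1"
    emotion: str   # one of EMOTIONS
    text: str      # transcript
    wav_path: str  # hypothetical path to the audio file


def select_single_speaker(utterances, speaker):
    """Group one speaker's utterances by emotion label."""
    by_emotion = {emotion: [] for emotion in EMOTIONS}
    for utt in utterances:
        if utt.speaker != speaker:
            continue
        if utt.emotion not in by_emotion:
            raise ValueError(f"unknown emotion label: {utt.emotion!r}")
        by_emotion[utt.emotion].append(utt)
    return by_emotion


# Toy manifest; real entries would come from the corpus metadata.
manifest = [
    Utterance("spk1", "angry", "Hello there.", "spk1/angry_0001.wav"),
    Utterance("spk1", "neutral", "Hello there.", "spk1/neutral_0001.wav"),
    Utterance("spk2", "amused", "Hello there.", "spk2/amused_0001.wav"),
]
subset = select_single_speaker(manifest, "spk1")
counts = {emotion: len(items) for emotion, items in subset.items()}
print(counts)
```

Counting utterances per emotion in this way makes it easy to see how little data each emotion category contributes for a single speaker, which is the low-resource condition the method targets.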

Abstract

The invention discloses a method for synthesizing emotional speech using transfer learning under low resources, comprising the following steps:
1. Pre-training the emotion vector: a speech emotion recognition model is trained on the EMOV-DB data set. This model is obtained by further processing the style-vector extraction part of the GST + Tacotron2 model, the baseline method for stylized end-to-end speech synthesis.
2. Pre-training the speech synthesis model: the basic Tacotron2 model is pre-trained on the LJSpeech-1.1 data set.
3. Transfer learning training: for the basic Tacotron2 model, the intermediate result obtained in step 1 is connected to the output of the encoder, and transfer learning training is carried out.
By combining pre-training with transfer learning, the method makes full use of a small amount of emotional data from a single speaker and, on the basis of a unified emotional speech synthesis model, produces synthesized speech that reaches a certain quality level and has a clear emotional tendency.
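The third step can be pictured with a minimal sketch, under the assumption that "connecting" the emotion vector to the encoder result means concatenating it to every encoder time step along the feature axis. The function name `concat_emotion` and the toy dimensions are illustrative only; the patent text shown here does not specify the operation or the sizes.

```python
def concat_emotion(encoder_outputs, emotion_vector):
    """Append the same emotion vector to every encoder time step.

    encoder_outputs: list of T frames, each a list of D floats.
    emotion_vector:  list of E floats (one vector per utterance).
    Returns a list of T frames, each of length D + E.
    """
    return [list(frame) + list(emotion_vector) for frame in encoder_outputs]


# Toy sizes for illustration; real encoder frames are much wider.
encoder_outputs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # T=2, D=3
emotion_vector = [1.0, 0.0]                            # E=2

conditioned = concat_emotion(encoder_outputs, emotion_vector)
print(len(conditioned), len(conditioned[0]))  # 2 5
```

Under this reading, the decoder's attention sees frames that carry both the text encoding and the utterance-level emotion information, and any layer that consumes encoder frames must accept the widened feature size.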

Description

Technical field

[0001] The invention relates to the field of speech synthesis, in particular to a method for emotional speech synthesis that applies transfer learning to existing data under low-resource conditions.

Background

[0002] In recent years, the field of end-to-end speech synthesis has developed rapidly. When models are trained on large data sets, the quality and clarity of synthesized speech have improved greatly. Emotional speech synthesis trained on a large amount of data has reached an acceptable level, but in some special cases a large training data set cannot be obtained, or can be obtained only at relatively high cost.

Summary of the invention

[0003] The purpose of the present invention is to overcome the deficiencies of the prior art and to provide a method for emotional speech synthesis using transfer learning under low resources. The methods of learning and model pre-train...
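One common way to carry a pre-trained model over to a modified architecture is to copy every parameter whose name and shape still match and leave the rest freshly initialized. The sketch below illustrates that general pattern with plain nested lists; it is an assumption for illustration, not a procedure stated in the source, and `warm_start` and the parameter names are hypothetical.

```python
def warm_start(pretrained, target):
    """Copy pretrained parameters into target where name and shape match.

    Both arguments map parameter names to nested lists of floats.
    Returns the merged parameters and the names left at their fresh init.
    """
    def shape(value):
        dims = []
        while isinstance(value, list):
            dims.append(len(value))
            value = value[0] if value else None
        return tuple(dims)

    merged, skipped = {}, []
    for name, fresh in target.items():
        old = pretrained.get(name)
        if old is not None and shape(old) == shape(fresh):
            merged[name] = old      # reuse the pre-trained weights
        else:
            merged[name] = fresh    # new or resized layer keeps its init
            skipped.append(name)
    return merged, skipped


# Hypothetical parameters: the decoder input was widened in the target model.
pretrained = {
    "encoder.weight": [[1.0, 2.0], [3.0, 4.0]],
    "decoder.weight": [[5.0, 6.0]],
}
target = {
    "encoder.weight": [[0.0, 0.0], [0.0, 0.0]],
    "decoder.weight": [[0.0, 0.0, 0.0, 0.0]],
}
merged, skipped = warm_start(pretrained, target)
print(skipped)  # ['decoder.weight']
```

In this toy case the encoder weights are reused unchanged, while the resized decoder layer starts from its fresh initialization and would be learned during the transfer learning stage.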

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L13/02, G10L13/08, G10L25/63

CPC: G10L13/02, G10L13/08, G10L25/63, Y02D10/00

Inventor: 王龙标, 徐杰, 党建武, 贡诚

Owner TIANJIN UNIV
