Real-time voice conversion method under conditions of minimal amount of training data

A real-time voice conversion technology, applicable to speech synthesis, speech analysis, speech recognition, etc.

Status: Inactive
Publication Date: 2010-06-23

NANJING UNIV OF POSTS & TELECOMM

Cites: 0; Cited by: 46

Problems solved by technology

[0005] At present, no research, either internationally or domestically, has addressed how to perform voice conversion when training data is scarce; the content of this invention is the first in this field.

Embodiment Construction

[0096] The structure of the disclosed voice conversion system is shown in Figure 1. Horizontally, the system can be divided into two main parts: the training phase and the conversion phase. In the training phase, source and target voice data are collected and analyzed, feature parameters are extracted, and conversion rules are learned and saved. In the conversion phase, the new source voice data to be converted is likewise collected and analyzed and its parameters are extracted; the conversion rules obtained in the training phase are then applied to it, and finally all the converted parameters are synthesized into speech by the speech synthesis module. Generally speaking, the training phase is a non-real-time phase, that is, an offline mode, while the conversion phase is a real-time phase, that is, an online mode. From a vertical perspective, the system can be divided into four major steps: signal analysis and synthesis, parameter selection and extraction, parameter alignment...
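The split between an offline training phase and an online conversion phase can be sketched as follows. This is only an illustrative skeleton: the class name `VoiceConversionSketch`, its methods, and the placeholder conversion rule (per-dimension mean and standard deviation matching) are assumptions made for the example and are not the method described in the patent. The frames are assumed to be already extracted and time-aligned feature vectors.

```python
from statistics import fmean, pstdev


class VoiceConversionSketch:
    """Hypothetical two-phase skeleton: offline training, online conversion."""

    def __init__(self):
        self.rule = None  # conversion rule learned and saved during training

    def train(self, source_frames, target_frames):
        """Offline phase: learn a conversion rule from aligned feature frames.

        Placeholder rule (an assumption, not the patent's method): match the
        per-dimension mean and standard deviation of the source to the target.
        """
        rule = []
        for d in range(len(source_frames[0])):
            src = [frame[d] for frame in source_frames]
            tgt = [frame[d] for frame in target_frames]
            src_std = pstdev(src) or 1.0  # guard against division by zero
            rule.append((fmean(src), src_std, fmean(tgt), pstdev(tgt)))
        self.rule = rule
        return rule

    def convert_frame(self, frame):
        """Online phase: apply the saved rule to one new source frame."""
        if self.rule is None:
            raise RuntimeError("train() must be called before convert_frame()")
        return [
            tgt_mean + tgt_std * (x - src_mean) / src_std
            for x, (src_mean, src_std, tgt_mean, tgt_std) in zip(frame, self.rule)
        ]


# Usage: train once offline, then convert incoming frames one at a time.
vc = VoiceConversionSketch()
vc.train([[0.0], [2.0]], [[10.0], [14.0]])
converted = vc.convert_frame([2.0])
```

Keeping all learning inside `train` and making `convert_frame` a cheap per-frame operation mirrors the offline/online division described above; signal analysis and speech synthesis are deliberately left out of the sketch.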

Abstract

The invention provides a real-time voice conversion method under conditions of a minimal amount of training data. The method uses ensemble learning theory to fit a Gaussian mixture model (GMM) to the collected data and designs a mapping function under the minimum mean square error criterion. The method solves the problem that a standard GMM easily over-fits when the amount of data is very small, and increases the robustness of the voice conversion algorithm to the amount of data. At the same time, estimating the GMM parameters with this method has lower computational complexity than the standard GMM, so the method is suitable for real-time voice conversion.
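As a rough illustration of a GMM mapping function under the minimum mean square error criterion, the sketch below implements the widely used joint-density GMM regression for scalar features: the converted value is the posterior-weighted sum of each component's conditional mean of the target given the source. It is not taken from the patent text; the ensemble-learning parameter estimation is not shown, and the function names and the component dictionary layout (`weight`, `mu_x`, `mu_y`, `var_x`, `cov_xy`) are assumptions made for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mmse_convert(x, components):
    """MMSE estimate of the target feature y given a source feature x.

    Each component is a dict with hypothetical keys:
      weight : mixture weight
      mu_x   : source mean,   mu_y   : target mean
      var_x  : source variance, cov_xy : source/target covariance
    """
    # Unnormalized posterior of each component given x.
    scores = [c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components]
    total = sum(scores)
    if total == 0.0:
        # Numerical underflow far from every component: fall back to the priors.
        scores = [c["weight"] for c in components]
        total = sum(scores)

    converted = 0.0
    for score, c in zip(scores, components):
        posterior = score / total
        # Conditional mean of y given x under this component.
        conditional_mean = c["mu_y"] + c["cov_xy"] / c["var_x"] * (x - c["mu_x"])
        converted += posterior * conditional_mean
    return converted


# Usage with made-up parameters (not values from the patent).
single = [{"weight": 1.0, "mu_x": 0.0, "mu_y": 1.0, "var_x": 2.0, "cov_xy": 1.0}]
y = mmse_convert(2.0, single)
```

With a single component the mapping reduces to ordinary linear regression; with several components the posteriors blend the local linear maps, which is what makes the parameter count, and hence the risk of over-fitting on small data sets, grow.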

Description

Technical field

[0001] The present invention relates to voice conversion (VC) technology, in particular to a real-time voice conversion method under conditions of a very small amount of training data. It is a voice conversion scheme based on a statistical analysis model, intended for text-to-speech conversion systems and robot vocalization systems, and belongs to the technical field of signal processing, especially speech signal processing.

Background technique

[0002] The field of knowledge involved in this patent is called voice conversion technology, a research branch of speech signal processing that has emerged in recent years. It covers the core technologies of speaker recognition and speech synthesis and combines them to achieve a unified goal. That is, while keeping the semantic content unchanged, by changing the voice personality characteristics of a specific speaker (called the source speaker), what he (or she) said is considered ...

Application Information

IPC(8): G10L13/02, G10L15/06

Inventors: 徐宁 (Xu Ning), 杨震 (Yang Zhen)

Owner: NANJING UNIV OF POSTS & TELECOMM
