Voice conversion system based on hidden Gaussian random field

A speech conversion technology based on random fields, applied in speech analysis, speech recognition, instruments, and related fields. It addresses the curse of dimensionality and the resulting system instability, and offers strong nonlinear mapping ability and good overall system performance.

Active Publication Date: 2014-10-08

CHANGZHOU INST OF TECH

3 Cites, 15 Cited by

Problems solved by technology

Speech parameters are generally high-dimensional vectors, so when training data is relatively scarce, traditional speech conversion methods are prone to the "curse of dimensionality", which makes the system unstable.
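
One common symptom of this problem in Gaussian-based models can be illustrated with a short sketch. It is not taken from the patent: the function name `sample_covariance_rank`, the frame counts, and the 40-dimensional feature size are assumptions chosen for illustration. When there are fewer frames than feature dimensions, the sample covariance matrix is rank-deficient and cannot be inverted, which is one way a covariance-based model becomes unstable.

```python
import numpy as np

def sample_covariance_rank(n_frames, dim, seed=0):
    """Rank of the sample covariance of n_frames random dim-dimensional vectors.

    Hypothetical helper for illustration only; the random vectors stand in
    for speech feature frames.
    """
    rng = np.random.default_rng(seed)
    feats = rng.standard_normal((n_frames, dim))   # one frame per row
    cov = np.cov(feats, rowvar=False)              # dim x dim covariance
    return int(np.linalg.matrix_rank(cov))

# Scarce data: 10 frames of 40-dimensional features.
# The rank can be at most n_frames - 1 = 9, so the 40x40 matrix is singular.
print(sample_covariance_rank(10, 40))

# Plentiful data: 200 frames of the same dimension.
print(sample_covariance_rank(200, 40))
```

With only 10 frames the covariance has rank at most 9 out of 40, so any step that needs its inverse or determinant fails or has to be regularized.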

Embodiment Construction

[0041] The present invention will be further described below in conjunction with the accompanying drawings.

[0042] As shown in Figure 1, a speech conversion system based on a hidden Gaussian random field includes a speech analysis module, a speech synthesis module, a speech parameter preprocessing module, and a speech parameter conversion mapping module. The speech analysis module and the speech synthesis module decompose and reconstruct the original speech signal; the intermediate parameters involved in the decomposition and reconstruction are called feature parameters. The speech parameter preprocessing module organizes and filters the feature parameters of speakers A and B to obtain feature parameter sets that are synchronized in time. The speech parameter conversion mapping module captures the mapping relationship between the two feature parameter sets A and B, so as to obtain the mapping rule.
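
The preprocessing step described above, which produces time-synchronized feature sets for speakers A and B, can be sketched as follows. The excerpt does not say which alignment algorithm the patent uses; this sketch assumes dynamic time warping (DTW), a common choice for aligning parallel utterances. The function name `dtw_align` and the toy one-dimensional features are hypothetical.

```python
import numpy as np

def dtw_align(feats_a, feats_b):
    """Align two feature sequences with dynamic time warping (assumed method).

    feats_a: array of shape (n, d), frames of speaker A
    feats_b: array of shape (m, d), frames of speaker B
    Returns a list of (i, j) pairs matching frame i of A to frame j of B.
    """
    n, m = len(feats_a), len(feats_b)
    # Euclidean distance between every frame of A and every frame of B.
    dist = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=2)

    # cost[i, j] = minimum accumulated distance aligning A[:i] with B[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path

# Toy example: speaker B holds the first sound for one extra frame.
a = np.array([[0.0], [1.0], [2.0]])
b = np.array([[0.0], [0.0], [1.0], [2.0]])
pairs = dtw_align(a, b)
aligned_a = a[[i for i, _ in pairs]]   # time-synchronized frames of A
aligned_b = b[[j for _, j in pairs]]   # time-synchronized frames of B
```

The aligned pairs are what a conversion mapping module would train on: each row of `aligned_a` has a corresponding row of `aligned_b` at the same index.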

[0043] The speech analysis module in...

Abstract

The invention discloses a voice conversion system based on a hidden Gaussian random field. The voice conversion system comprises a voice analysis module, a voice synthesis module, a voice parameter pre-processing module and a voice parameter conversion mapping module. The voice analysis module and the voice synthesis module decompose and reconstruct the original voice signals. The voice parameter pre-processing module sorts and screens the feature parameters of speaker A and speaker B to obtain feature parameter sets that are synchronized in time. The voice parameter conversion mapping module captures the mapping relation between feature parameter set A and feature parameter set B to obtain a mapping rule. The core technology of the system is built around Gaussian random field theory: a novel hidden Gaussian random field model is derived by changing the structure of the basic Gaussian random field, and the system can achieve good results even when data is scarce.

Description

Technical field

[0001] The present invention relates to a speech signal processing system that changes the voice characteristics of a speaker A so that the speech sounds like the voice of another speaker B. This technology is called speech conversion.

Background

[0002] As an important branch of speech signal processing, speech conversion technology aims to change the voice characteristics of any speaker so that the speech sounds like the voice of another designated target speaker. This technology has important application value: for example, it can be used at the output stage of a text-to-speech converter so that the machine can produce a variety of voices, and it can also be used in film and entertainment dubbing, confidentiality and security, and other fields. At present, the more mature speech conversion methods are generally built on Gaussian mixture models. This type of method can model and analyze speech data from the perspective of probability distribution, and has the...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/02, G10L15/14, G10L25/12, G10L25/93

Inventors: 鲍静益, 徐宁

Owner: CHANGZHOU INST OF TECH
