Voice conversion system based on hidden Gaussian random field

A speech conversion technology based on random fields, applied in speech analysis, speech recognition, instruments, etc. It addresses the curse of dimensionality and the resulting system instability, and offers strong nonlinear mapping ability and good system performance.

Active Publication Date: 2014-10-08
CHANGZHOU INST OF TECH
Cites: 3; Cited by: 15

AI Technical Summary

Problems solved by technology

However, since speech parameters are generally high-dimensional vectors, traditional speech conversion methods are prone to the "curse of dimensionality" when training data is relatively scarce, which makes the system unstable.

Detailed Description of the Embodiments

[0041] The present invention will be further described below in conjunction with the accompanying drawings.

[0042] As shown in Figure 1, a speech conversion system based on a hidden Gaussian random field includes a speech analysis module, a speech synthesis module, a speech parameter preprocessing module, and a speech parameter conversion mapping module. The speech analysis module and the speech synthesis module decompose and reconstruct the original speech signal; the intermediate parameters involved in this decomposition and reconstruction are called feature parameters. The speech parameter preprocessing module organizes and filters the feature parameters of speakers A and B to obtain feature parameter sets that are synchronized in time. The speech parameter conversion mapping module captures the mapping relationship between the feature parameter sets of A and B, so as to obtain the mapping rule.
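The text above does not say how the preprocessing module synchronizes the two speakers' feature parameters in time. A common choice in voice conversion is dynamic time warping (DTW); the following is a minimal sketch under that assumption, not the method claimed in the patent. The function name `dtw_align` and the toy one-dimensional features are hypothetical and chosen only for illustration.

```python
import math


def dtw_align(feats_a, feats_b):
    """Pair frames of two feature sequences with dynamic time warping.

    Each sequence is a list of frames; each frame is a list of floats.
    Returns a list of (i, j) pairs meaning feats_a[i] is matched to feats_b[j].
    NOTE: DTW is an assumed alignment technique, not taken from the source.
    """
    n, m = len(feats_a), len(feats_b)

    def frame_dist(u, v):
        # Euclidean distance between two feature frames.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(feats_a[i - 1], feats_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )

    # Backtrack from the last frames to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance in A only
            (cost[i][j - 1], i, j - 1),          # advance in B only
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return path


# Toy example: speaker B holds the first sound one frame longer than A.
speaker_a = [[0.0], [1.0], [2.0]]
speaker_b = [[0.0], [0.0], [1.0], [2.0]]
pairs = dtw_align(speaker_a, speaker_b)
aligned_a = [speaker_a[i] for i, _ in pairs]
aligned_b = [speaker_b[j] for _, j in pairs]
```

After alignment, `aligned_a` and `aligned_b` have the same length, so each frame of speaker A has a counterpart frame of speaker B. Paired frames of this kind are what a conversion mapping module would need as training data.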

[0043] The speech analysis module in...

Abstract

The invention discloses a voice conversion system based on a hidden Gaussian random field. The system comprises a voice analysis module, a voice synthesis module, a voice parameter preprocessing module and a voice parameter conversion mapping module. The voice analysis module and the voice synthesis module decompose and reconstruct the original voice signals. The voice parameter preprocessing module sorts and screens the feature parameters of speaker A and speaker B to obtain feature parameter sets that are synchronized in time. The voice parameter conversion mapping module captures the mapping relationship between feature parameter set A and feature parameter set B to obtain a mapping rule. The core technology of the system is built around Gaussian random field theory: a novel hidden Gaussian random field model is obtained by changing the structure of the basic Gaussian random field, and the system can achieve good results even when data is scarce.

Description

Technical field

[0001] The present invention relates to a speech signal processing system that changes the voice characteristics of a speaker A so that the speech sounds like the voice of another speaker B. This technology is called speech conversion.

Background art

[0002] As an important branch of speech signal processing, speech conversion technology aims to change the voice characteristics of any speaker so that the speech sounds like the voice of another designated target speaker. The technology has important application value: for example, it can be used at the output stage of a text-to-speech system so that the machine can produce a variety of vivid voices, and it can also be used in film and entertainment dubbing, confidentiality and security, and other fields. At present, the more mature speech conversion methods are generally built on Gaussian mixture models. This type of method can model and analyze speech data from the perspective of probability distributions, and has the...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/02, G10L15/14, G10L25/12, G10L25/93
Inventors: 鲍静益, 徐宁
Owner: CHANGZHOU INST OF TECH