Voice conversion method of united frequency-spectrum modeling based on restricted Boltzmann machine

Applies restricted Boltzmann machines and joint spectrum modeling to speech synthesis, speech analysis, speech recognition, and related fields, and addresses problems such as over-smoothing.

Active Publication Date: 2013-11-27
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0008] The technical solution of the present invention: to mitigate the over-smoothing problem in existing voice conversion methods, a voice conversion method based on join...


Examples

Example Embodiment

[0112] In the present invention, the specific voice conversion process derived from the proposed idea of joint spectrum modeling based on a restricted Boltzmann machine is shown in Figure 1. Unlike the GMM-based conversion method, which describes each acoustic subspace with a single Gaussian, the present invention describes each acoustic subspace with a restricted Boltzmann machine (RBM) model. In the RBM training module, RBMs of different structures can be adopted depending on the specific form of the RBM model, such as a Gaussian-Bernoulli RBM or a Gaussian-Gaussian RBM.
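The idea of giving each acoustic subspace its own model can be illustrated with a small sketch. The text shown here does not say how frames are assigned to subspaces, so the code below assumes a hard assignment of each frame to the GMM component with the highest posterior. The function names, the scalar features, and the toy component parameters are all hypothetical, and real joint spectral features would be multi-dimensional.

```python
import math
from collections import defaultdict


def log_weighted_gaussian(x, weight, mean, var):
    """Log of weight * N(x; mean, var) for a scalar feature."""
    return (math.log(weight)
            - 0.5 * math.log(2.0 * math.pi * var)
            - 0.5 * (x - mean) ** 2 / var)


def partition_frames(frames, components):
    """Group frame indices by the component with the highest posterior.

    Assumption: hard assignment. The posterior argmax equals the argmax of
    weight * likelihood because the normaliser is shared by all components.
    """
    groups = defaultdict(list)
    for idx, x in enumerate(frames):
        scores = [log_weighted_gaussian(x, w, m, s2) for (w, m, s2) in components]
        best = max(range(len(components)), key=lambda k: scores[k])
        groups[best].append(idx)
    return dict(groups)


# Toy data: five scalar "frames" and three (weight, mean, variance) components.
frames = [-2.1, -1.9, 0.1, 2.0, 2.2]
components = [(0.3, -2.0, 0.5), (0.4, 0.0, 0.5), (0.3, 2.0, 0.5)]
subspaces = partition_frames(frames, components)
```

Under this assumption, the frames collected in each group would form the training set for that subspace's RBM.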

[0113] A restricted Boltzmann machine (see R. Salakhutdinov, "Learning deep generative models," Ph.D. dissertation, University of Toronto, 2009) is an undirected graphical model with a two-layer structure that describes the interdependencies among a set of random variables. It consists of a set of visible random variables $v = [v_1, v_2, \ldots, v_V]^T$ A no...
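As a rough illustration of the two-layer structure, the sketch below evaluates the energy of a Gaussian-Bernoulli RBM and the activation probability of each hidden unit. The paragraph above is cut off before any formula, so this uses one common parameterisation with unit visible variance, which is an assumption. The names `gb_rbm_energy`, `hidden_activation_probs`, `W`, `a`, and `b`, and the toy values, are illustrative only.

```python
import math


def gb_rbm_energy(v, h, W, a, b):
    """Energy E(v, h) of a Gaussian-Bernoulli RBM, assuming unit visible variance.

    E(v, h) = 0.5 * sum_i (v_i - a_i)^2 - sum_j b_j h_j - sum_ij v_i W_ij h_j
    """
    quadratic = 0.5 * sum((vi - ai) ** 2 for vi, ai in zip(v, a))
    hidden_bias = sum(bj * hj for bj, hj in zip(b, h))
    interaction = sum(v[i] * W[i][j] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    return quadratic - hidden_bias - interaction


def hidden_activation_probs(v, W, b):
    """p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i W_ij) for every hidden unit j."""
    probs = []
    for j in range(len(b)):
        pre_activation = b[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        probs.append(1.0 / (1.0 + math.exp(-pre_activation)))
    return probs


# Toy model: 2 visible units, 2 hidden units (values are arbitrary).
v = [0.5, -1.0]
W = [[0.2, -0.4], [0.1, 0.3]]
a = [0.0, 0.0]
b = [0.1, -0.2]
probs = hidden_activation_probs(v, W, b)
```

Because the graph has no connections within a layer, the hidden units are conditionally independent given the visible units, which is why each probability can be computed separately.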

Abstract

Disclosed is a voice conversion method of united frequency-spectrum modeling based on a restricted Boltzmann machine. The method comprises the following steps: extracting spectral envelope features of the speech, extracting high-level spectral features of the speech, performing dynamic time warping, training a GMM, partitioning the joint spectral envelope features into acoustic subspaces, training a Gaussian-Bernoulli RBM or a Gaussian-Gaussian RBM, converting the spectrum, and synthesizing the converted speech. The method improves the precision of spectrum modeling and improves the sound quality and naturalness of the converted speech.
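Of the steps listed in the abstract, dynamic time warping is the one that pairs source and target frames so that joint features can be formed. The following is a minimal sketch, assuming Euclidean frame distance and the classic three-way step pattern. The abstract does not specify either choice, and `dtw_align` and the toy sequences are hypothetical.

```python
def dtw_align(src, tgt):
    """Return the (source_index, target_index) pairs on a minimum-cost DTW path."""
    def dist(x, y):
        return sum((p - q) ** 2 for p, q in zip(x, y)) ** 0.5

    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning src[:i] with tgt[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(src[i - 1], tgt[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the final cell to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


# Toy one-dimensional "spectral" frames; the target holds its first frame longer.
src = [[0.0], [1.0], [2.0]]
tgt = [[0.0], [0.0], [1.0], [2.0]]
path = dtw_align(src, tgt)

# Joint features: each aligned source frame concatenated with its target frame.
joint = [src[i] + tgt[j] for i, j in path]
```

The concatenated vectors in `joint` stand in for the joint spectral envelope features that the later GMM and RBM training steps operate on.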

Description

Technical field

[0001] The invention relates to a voice conversion method in speech synthesis, in particular to a voice conversion method based on joint spectrum modeling with a restricted Boltzmann machine (RBM).

Background art

[0002] The purpose of voice conversion (also known as speaker conversion) is to transform the speech of one speaker (the source speaker) so that it sounds like another speaker (the target speaker) while keeping the semantic content of the speech unchanged. Currently, joint spectrum modeling based on the Gaussian mixture model (GMM) (see Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998) is the mainstream method for voice conversion. The main principle of this method is to use multiple Gaussian distributions to fit the joint probability distribution of the source and target spectral features according to the maximum l...
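To make the GMM-based baseline concrete, the sketch below converts one scalar source feature with a joint-density GMM. The paragraph above is truncated before it describes the conversion step, so the widely used posterior-weighted conditional-mean mapping is assumed here. The function names, the dictionary keys, and the scalar features are hypothetical simplifications of what would be multi-dimensional spectral vectors.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_convert(x, components):
    """Map a source feature x to the target domain with a joint-density GMM.

    Each component is a dict with keys weight, mu_x, mu_y, var_x, cov_yx.
    Assumed mapping: sum_m p(m | x) * (mu_y + cov_yx / var_x * (x - mu_x)).
    """
    likelihoods = [c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"])
                   for c in components]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]
    return sum(p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
               for p, c in zip(posteriors, components))


# Toy joint GMM with two components (parameter values are arbitrary).
toy_gmm = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": -2.0, "var_x": 1.0, "cov_yx": 0.0},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 2.0, "var_x": 1.0, "cov_yx": 0.0},
]
converted = gmm_convert(0.0, toy_gmm)
```

In this toy case the source value sits midway between the two components, so the output is an equal blend of the two target means. This shows how a posterior-weighted mapping of this kind averages over components.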


Application Information

IPC(8): G10L13/033, G10L15/06
Inventor: 刘利娟, 陈凌辉, 凌震华, 戴礼荣
Owner: UNIV OF SCI & TECH OF CHINA