A Restricted Boltzmann Machine-Based Approach to Sound Transformation Based on Joint Spectral Modeling

A voice conversion technique based on joint spectral modeling, applicable to speech synthesis, speech analysis, and speech recognition, that addresses problems such as over-smoothing.

Active Publication Date: 2016-02-03
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0008] Technical solution of the present invention: To mitigate the over-smoothing problem in existing voice conversion methods, a voice conversion method based on joint spectral modeling with restricted Boltzmann machines is provided. It improves the accuracy of spectral modeling and the sound quality and naturalness of the converted speech.




Embodiment Construction

[0112] In the present invention, following the idea of joint spectral modeling based on restricted Boltzmann machines, the specific voice conversion process is as shown in Figure 1. Whereas the GMM-based conversion method describes each acoustic subspace with a single Gaussian, the present invention describes each subspace with a restricted Boltzmann machine (RBM) model. In the RBM training module, RBMs with different structures can be used according to the specific form of the RBM model, such as the Gaussian-Bernoulli RBM or the Gaussian-Gaussian RBM.

[0113] A restricted Boltzmann machine (see R. Salakhutdinov, "Learning deep generative models," Ph.D. dissertation, University of Toronto, 2009) is a two-layer undirected graphical model used to describe the mutual dependence among a set of random variables. It consists of a set of visible random variable nodes $v = [v_1, v_2, \ldots, v_V]^T$ and a set of hidden random variable...
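The excerpt is cut off before the model's energy function is given, so the following is only a minimal sketch of a Gaussian-Bernoulli RBM in a common textbook form. It assumes unit-variance Gaussian visible units and binary hidden units, and it uses one step of contrastive divergence (CD-1) for training, which is a widely used procedure but is not confirmed by the excerpt as the patent's own. The class name `GaussianBernoulliRBM` and all parameter names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GaussianBernoulliRBM:
    """Illustrative RBM: unit-variance Gaussian visible units, binary hidden units."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.a = np.zeros(n_visible)  # visible biases
        self.b = np.zeros(n_hidden)   # hidden biases

    def energy(self, v, h):
        # E(v, h) = 0.5 * ||v - a||^2 - b^T h - v^T W h   (single v and h vectors)
        return 0.5 * np.sum((v - self.a) ** 2) - self.b @ h - v @ self.W @ h

    def hidden_prob(self, v):
        # P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i W_ij); v may be (V,) or (N, V)
        return sigmoid(self.b + v @ self.W)

    def visible_mean(self, h):
        # E[v | h] = a + W h; the conditional is Gaussian with identity covariance
        return self.a + h @ self.W.T

    def cd1_step(self, v0, lr=0.01):
        """One CD-1 parameter update on a batch v0 of shape (N, V).

        Returns the mean squared reconstruction error of the batch.
        """
        ph0 = self.hidden_prob(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
        v1 = self.visible_mean(h0)                              # mean-field reconstruction
        ph1 = self.hidden_prob(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.a += lr * (v0 - v1).mean(axis=0)
        self.b += lr * (ph0 - ph1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))
```

In a joint spectral model, the visible vector would hold the concatenated source and target spectral features of one aligned frame pair, so that the hidden units capture dependencies between the two speakers' spectra.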


Abstract

Disclosed is a voice conversion method based on joint spectral modeling with a restricted Boltzmann machine. The method comprises the following steps: extracting speech spectral envelope features, extracting high-level speech spectral features, performing dynamic time warping, training a GMM, dividing the joint spectral envelope features into acoustic subspaces, training a Gaussian-Bernoulli RBM or a Gaussian-Gaussian RBM, converting the spectrum, and synthesizing the converted speech. The method improves the accuracy of spectral modeling and the sound quality and naturalness of the converted speech.
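The dynamic time warping step in this pipeline aligns source and target frames so that joint feature vectors can be formed. The abstract does not specify the distance measure or step pattern, so the sketch below assumes Euclidean frame distances and the basic three-way step (diagonal, vertical, horizontal). The function names `dtw_align` and `joint_features` are hypothetical.

```python
import numpy as np


def dtw_align(src, tgt):
    """Align src (N, D) with tgt (M, D); return the warping path as (i, j) pairs."""
    n, m = len(src), len(tgt)
    # Pairwise Euclidean distances between every source and target frame
    dist = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=2)

    # cost[i, j] = minimal accumulated distance aligning src[:i] with tgt[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )

    # Backtrack from the end, preferring the diagonal step on ties
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = (cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
        k = int(np.argmin(steps))
        if k == 0:
            i, j = i - 1, j - 1
        elif k == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path


def joint_features(src, tgt):
    """Stack aligned frame pairs into joint vectors of shape (len(path), 2 * D)."""
    return np.array([np.concatenate([src[i], tgt[j]]) for i, j in dtw_align(src, tgt)])
```

For example, aligning the one-dimensional sequences `[0, 1, 2]` and `[0, 1, 1, 2]` maps source frame 1 to both target frames 1 and 2, which yields four joint vectors.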

Description

Technical field

[0001] The invention relates to a voice conversion method in speech synthesis, in particular to a voice conversion method based on joint spectral modeling with a restricted Boltzmann machine (RBM).

Background art

[0002] The purpose of voice conversion (also known as speaker conversion) is to transform the speech of one speaker (the source speaker) so that it sounds like another speaker (the target speaker) while keeping the semantics of the speech unchanged. Currently, joint spectral modeling based on the Gaussian mixture model (GMM) (see Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998) is the mainstream method for voice conversion. The main principle of this method is to use multiple Gaussian distributions to fit the joint spectral feature probability distribution of the source and target according to the maximum likelihood criterion during th...
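The description is truncated before the conversion step, so the following is a sketch of the conditional-mean conversion that is commonly used with a joint-density GMM, not necessarily the exact formulation in the patent. It assumes the GMM has already been trained on joint source-target vectors, and it takes the per-component source means, target means, source covariances, and target-source cross-covariances as inputs. The function name `gmm_convert` and its parameter names are hypothetical.

```python
import numpy as np


def gmm_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Convert one source frame x (D,) using a joint-density GMM with M components.

    weights: (M,); mu_x, mu_y: (M, D); cov_xx, cov_yx: (M, D, D).
    Returns sum_m P(m | x) * (mu_y[m] + cov_yx[m] @ inv(cov_xx[m]) @ (x - mu_x[m])).
    """
    n_comp, dim = mu_x.shape

    # Log posterior of each component given x (up to a shared constant)
    log_post = np.empty(n_comp)
    for m in range(n_comp):
        diff = x - mu_x[m]
        _, logdet = np.linalg.slogdet(cov_xx[m])
        maha = diff @ np.linalg.solve(cov_xx[m], diff)
        log_post[m] = np.log(weights[m]) - 0.5 * (logdet + maha + dim * np.log(2 * np.pi))

    # Normalize in a numerically stable way
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    # Posterior-weighted sum of per-component linear regressions
    y = np.zeros(dim)
    for m in range(n_comp):
        y += post[m] * (mu_y[m] + cov_yx[m] @ np.linalg.solve(cov_xx[m], x - mu_x[m]))
    return y
```

With a single component, zero source mean, target mean of 1, identity source covariance, and cross-covariance of 0.5 times the identity, the source frame `[2, 2]` converts to `[2, 2]`, since 1 + 0.5 * 2 = 2 in each dimension.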


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L13/033, G10L15/06
Inventors: 刘利娟, 陈凌辉, 凌震华, 戴礼荣
Owner: UNIV OF SCI & TECH OF CHINA