A Restricted Boltzmann Machine-Based Approach to Sound Transformation Based on Joint Spectral Modeling

A voice conversion technology based on joint spectral modeling, applicable to speech synthesis, speech analysis, and speech recognition, that addresses problems such as over-smoothing.

Active Publication Date: 2016-02-03

UNIV OF SCI & TECH OF CHINA

Cites: 4, Cited by: 0


Problems solved by technology

[0008] The technical solution of the present invention: to mitigate the over-smoothing problem of existing voice conversion methods, a voice conversion method based on joint spectral modeling with restricted Boltzmann machines is provided. It improves the accuracy of spectral modeling and the sound quality and naturalness of the converted speech.


Embodiment Construction

[0112] In the present invention, the specific process of voice conversion, following the idea of joint spectral modeling based on restricted Boltzmann machines, is shown in Figure 1. Whereas the GMM-based conversion method describes each acoustic subspace with a single Gaussian, the present invention describes each acoustic subspace with a restricted Boltzmann machine (RBM) model. In the RBM training module, RBMs with different structures can be used depending on the specific form of the RBM model, such as a Gaussian-Bernoulli RBM or a Gaussian-Gaussian RBM.

[0113] A restricted Boltzmann machine (see R. Salakhutdinov, "Learning deep generative models," Ph.D. dissertation, University of Toronto, 2009) is a two-layer undirected graphical model that describes the mutual dependence among a set of random variables. It consists of a set of visible random variable nodes $v = [v_1, v_2, \ldots, v_V]^T$ and a set of hidden random variable...
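As a rough illustration of the two-layer structure described here, the following sketch implements a Gaussian-Bernoulli RBM with real-valued visible units and binary hidden units. It is not the patent's formulation, which is cut off in this listing. The unit-variance energy function, the parameter names `a`, `b`, and `W`, and the function names are all assumptions chosen for illustration.

```python
import math


def energy(v, h, a, b, W):
    """Energy of an assumed unit-variance Gaussian-Bernoulli RBM.

    v: visible vector (real-valued), h: hidden vector (0/1),
    a: visible biases, b: hidden biases, W[i][j]: weight between v[i] and h[j].
    """
    quadratic = 0.5 * sum((vi - ai) ** 2 for vi, ai in zip(v, a))
    hidden_bias = sum(bj * hj for bj, hj in zip(b, h))
    interaction = sum(
        v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h))
    )
    return quadratic - hidden_bias - interaction


def hidden_probs(v, b, W):
    """p(h_j = 1 | v): hidden units are conditionally independent given v."""
    probs = []
    for j in range(len(b)):
        activation = b[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        probs.append(1.0 / (1.0 + math.exp(-activation)))
    return probs


def visible_mean(h, a, W):
    """Mean of p(v | h), a unit-variance Gaussian centered at a + W h."""
    return [
        a[i] + sum(W[i][j] * h[j] for j in range(len(h))) for i in range(len(a))
    ]
```

Because there are no connections within a layer, both conditionals factorize over units. This is what makes block Gibbs sampling between the two layers practical during training.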


Abstract

Disclosed is a voice conversion method based on joint spectral modeling with a restricted Boltzmann machine. The method comprises the following steps: extracting speech spectral envelope features, extracting high-level speech spectral features, performing dynamic time warping, training a GMM, dividing the joint spectral envelope features into acoustic subspaces, training a Gaussian-Bernoulli RBM or a Gaussian-Gaussian RBM, converting the spectrum, and synthesizing the converted speech. The method improves the accuracy of spectral modeling and the sound quality and naturalness of the converted speech.
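The dynamic time warping step in this pipeline aligns source and target frames so that joint feature vectors can be formed. A minimal sketch of textbook DTW follows. It is not taken from the patent: the squared Euclidean frame distance, the three-way step pattern, and the names `dtw_align` and `joint_features` are assumptions.

```python
def dtw_align(src, tgt):
    """Return the DTW alignment path as (source_index, target_index) pairs.

    src, tgt: lists of feature frames (each a list of floats).
    """
    n, m = len(src), len(tgt)
    inf = float("inf")

    def dist(x, y):
        return sum((p - q) ** 2 for p, q in zip(x, y))

    # cost[i][j] = minimal accumulated distance aligning src[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
            cost[i][j] = dist(src[i - 1], tgt[j - 1]) + best_prev

    # Backtrack from the end of both sequences to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


def joint_features(src, tgt, path):
    """Concatenate each aligned source/target frame pair into one joint vector."""
    return [src[i] + tgt[j] for i, j in path]
```

For example, `dtw_align([[0.0], [1.0], [2.0]], [[0.0], [1.0], [1.0], [2.0]])` maps the middle source frame to both repeated target frames. The resulting joint vectors would then serve as training data for the GMM and RBM stages.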

Description

Technical field

[0001] The invention relates to a voice conversion method in speech synthesis, in particular to a voice conversion method based on joint spectral modeling with a restricted Boltzmann machine (RBM).

Background

[0002] The purpose of voice conversion (also known as speaker conversion) is to transform the speech of one speaker (the source speaker) so that it sounds like another speaker (the target speaker) while keeping the semantic content of the speech unchanged. Currently, joint spectral modeling based on the Gaussian mixture model (GMM) (see Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998) is the mainstream method for voice conversion. The main principle of this method is to use multiple Gaussian distributions to fit the joint probability distribution of the source and target spectral features according to the maximum likelihood criterion during th...
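The paragraph above is cut off in this listing, so the sketch below does not reproduce the patent's description. Instead it shows the commonly used conversion rule for a joint-density GMM: the expected target feature given the source feature. It is restricted to scalar features for brevity, and the component field names and the function `gmm_convert` are assumptions.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_convert(x, components):
    """Map a scalar source feature x to the conditional mean E[y | x].

    components: list of dicts, one per Gaussian of the joint (x, y) model,
    with keys "weight", "mu_x", "mu_y", "var_x", and "cov_xy".
    """
    # Weighted likelihood of x under each component's source marginal.
    likes = [c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components]
    total = sum(likes)

    y = 0.0
    for like, c in zip(likes, components):
        posterior = like / total  # p(component | x)
        # Per-component linear regression from the joint Gaussian.
        y += posterior * (c["mu_y"] + c["cov_xy"] / c["var_x"] * (x - c["mu_x"]))
    return y
```

The output is a posterior-weighted average of per-component linear regressions. With real spectral features, `x` and `y` would be vectors and the variances would be covariance matrices.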


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L13/033, G10L15/06

Inventors: 刘利娟, 陈凌辉, 凌震华, 戴礼荣

Owner: UNIV OF SCI & TECH OF CHINA
