Voice conversion method based on convolutive nonnegative matrix factorization

A technology combining non-negative matrix factorization and voice conversion, applied in speech analysis, speech recognition, speech synthesis, etc., that addresses problems such as limited applicability.

Status: Inactive. Publication Date: 2012-01-04

PLA UNIV OF SCI & TECH



Problems solved by technology

Although this non-uniqueness can be understood as a different representation of the feature space, it limits the method's application in voice conversion.

Embodiment Construction

[0035] With reference to Figure 1, the voice conversion method of the present invention, based on convolutive non-negative matrix factorization, proceeds as follows:

[0036] Training phase: The transformation model is trained with the training data.

[0037] The first step is time alignment and parameter decomposition of training speech data:

[0038] (1) Time alignment of the voice data, as shown in Figure 2. First, the source speaker's voice and the target speaker's voice in the training data set are analyzed with the STRAIGHT model to obtain the pitch period at each sampling point of both signals, that is, their pitch period envelopes:

[0039]

[0040]

[0041] Here the two length symbols denote the number of sampling points contained in the source speaker's speech and in the target speaker's speech, respectively.

[0042] The pitch period here is expressed as a number of sampling points, with any fractional value rounded to an integer. Since the unvoiced s...
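As a rough illustration of the representation described in [0042], the sketch below converts a pitch frequency contour into pitch periods counted in sampling points. The function name, the 16 kHz sampling rate, and the convention of marking unvoiced samples with 0 are assumptions made for this example only; the source text is cut off before it explains how unvoiced segments are handled, and the STRAIGHT analysis itself is not reproduced here.

```python
def pitch_period_in_samples(f0_hz, sample_rate_hz):
    """Convert pitch frequencies (Hz) to pitch periods in sampling points.

    Hypothetical helper: entries with f0 <= 0 are treated as unvoiced and
    mapped to 0 (an assumption, not taken from the patent text).
    """
    periods = []
    for f0 in f0_hz:
        if f0 > 0:
            # Period in samples = sampling rate / pitch frequency,
            # rounded to an integer as described in [0042].
            periods.append(int(round(sample_rate_hz / f0)))
        else:
            periods.append(0)
    return periods


# Example contour: three voiced values and one unvoiced value (0).
example_f0 = [100.0, 0.0, 250.0, 123.0]
example_periods = pitch_period_in_samples(example_f0, 16000)
print(example_periods)  # [160, 0, 64, 130]
```

Keeping the period as an integer count of samples makes it directly usable as an index offset when pitch marks are placed along the waveform.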


Abstract

The invention discloses a voice conversion method based on convolutive nonnegative matrix factorization. The method comprises two steps. (1) Training a transformation model on training data: performing time alignment and parameter decomposition of the training voice data, analyzing the STRAIGHT spectrum with convolutive nonnegative matrix factorization, and analyzing the pitch frequency of the source voice and the target voice. (2) Converting newly input voice with the trained model: decomposing the source voice data A[c] to be converted into parameters with the STRAIGHT model, converting the vocal tract spectrum parameters by convolutive nonnegative matrix factorization, converting the pitch frequency using the mean and variance obtained in the training phase, and synthesizing the converted voice from the converted STRAIGHT spectrum S[Bc], the converted pitch frequency f[Bc], and the original aperiodic component ap[Ac]. The invention improves the training effect of voice conversion and the quality of the converted voice.
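The pitch frequency conversion mentioned in the abstract can be sketched as a mean and variance normalization. The version below is only an illustration under stated assumptions: it works on linear pitch frequency values (many systems apply the same transform to log pitch frequency instead), it treats 0 as an unvoiced marker, and the function names f0_stats and convert_f0 are hypothetical. The abstract does not give the exact formula, so this should not be read as the patent's own procedure.

```python
import statistics


def f0_stats(f0_track):
    """Mean and standard deviation of the voiced (f0 > 0) values.

    Assumption: unvoiced frames are marked with 0 and excluded.
    """
    voiced = [f for f in f0_track if f > 0]
    return statistics.mean(voiced), statistics.pstdev(voiced)


def convert_f0(f0_track, src_stats, tgt_stats):
    """Map source pitch values onto the target speaker's mean and spread.

    Voiced values are shifted by the source mean, scaled by the ratio of
    standard deviations, and re-centered on the target mean. Unvoiced
    frames (0) are passed through unchanged.
    """
    mu_a, sd_a = src_stats
    mu_b, sd_b = tgt_stats
    scale = sd_b / sd_a
    return [mu_b + scale * (f - mu_a) if f > 0 else 0.0 for f in f0_track]


# Training phase (toy data): collect statistics for each speaker.
src_stats = f0_stats([100.0, 0.0, 120.0, 140.0])
tgt_stats = f0_stats([180.0, 220.0, 260.0, 0.0])

# Conversion phase: apply the statistics to a new source contour.
converted = convert_f0([120.0, 0.0, 130.0], src_stats, tgt_stats)
print(converted)
```

In this toy data the target spread is twice the source spread, so a source value 10 Hz above the source mean lands 20 Hz above the target mean.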

Description

Technical field

[0001] The invention belongs to the technical field of speech signal processing, and in particular relates to a voice conversion method based on convolutive non-negative matrix factorization.

Background

[0002] Voice conversion is a technology that changes the personal characteristic information in a source speaker's voice signal so that it carries the personal characteristic information of the target speaker's voice. Voice conversion has broad application prospects in personalized human-computer interaction, military operations, information security, and multimedia entertainment. For example, combining voice conversion with a speech synthesis system enables personalized speech synthesis; voice conversion can forge the voice of an enemy commander to send false information or orders and disrupt the enemy's combat command; and voice conversion can reproduce the speeches of historical figures.

[0003] Voice Conversion / ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L15/06, G10L19/02, G10L13/02

Inventor: 张雄伟, 孙健, 曹铁勇, 孙新建, 黄建军, 杨吉斌, 邹霞, 贾冲

Owner: PLA UNIV OF SCI & TECH
