Voice conversion method and system for non-parallel data

A voice conversion technology for non-parallel data, applied in the field of voice conversion methods and systems. It addresses two problems: parallel data is difficult to obtain, and existing non-parallel methods convert poorly. The stated benefit is that conversion quality is ensured.

Pending Publication Date: 2020-11-20

BEIJING UNISOUND INFORMATION TECH +1

Problems solved by technology

[0003] Most existing speech conversion methods require the two speakers to have parallel data, that is, speech whose corresponding text content is the same. The main disadvantage of these methods is that parallel data is difficult to obtain. Other methods require only non-parallel data, but their main drawback is that they do not convert well.
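To make the notion of parallel data concrete, the following is a minimal sketch that pairs utterances from two speakers only when their transcripts match. The function names, the `(transcript, audio_path)` tuple layout, and the whitespace/case normalization are assumptions made for illustration; none of them come from the patent.

```python
from typing import Dict, List, Tuple

Utterance = Tuple[str, str]  # (transcript, audio_path); hypothetical layout


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different transcripts match."""
    return " ".join(text.lower().split())


def parallel_pairs(source: List[Utterance],
                   target: List[Utterance]) -> List[Tuple[str, str]]:
    """Return (source_audio, target_audio) pairs whose transcripts are the same."""
    by_text: Dict[str, str] = {normalize(t): audio for t, audio in target}
    return [(audio, by_text[normalize(t)])
            for t, audio in source
            if normalize(t) in by_text]


source = [("Hello world", "s1.wav"), ("Good morning", "s2.wav")]
target = [("hello  world", "t1.wav"), ("See you", "t2.wav")]
pairs = parallel_pairs(source, target)
```

Only the first source utterance finds a partner here. With real recordings the overlap is often empty, which is the data-collection difficulty the paragraph describes.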

Embodiment Construction

[0051] The preferred embodiments of the present invention will be described below in conjunction with the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0052] An embodiment of the present invention provides a voice conversion method for non-parallel data. As shown in Figure 1, the method performs the following steps:

[0053] Step 1: Train the speech synthesis model of the target speaker using data from a large-scale speech synthesis corpus, drawn from speakers other than the source speaker and the target speaker, wherein the corpus data includes paired text and speech data;

[0054] Step 2: Based on the text corresponding to the speech data of the source speaker, generate parallel data corresponding to the target speaker according to the speech synthesis model of the target speaker, wherein the paral...

Abstract

The invention provides a voice conversion method and system for non-parallel data. The method comprises the steps of: training a target speaker speech synthesis model using large-scale speech synthesis corpus data from speakers other than the source speaker and the target speaker; based on the text corresponding to the voice data of the source speaker, generating parallel data corresponding to the target speaker with the target speaker speech synthesis model; training a spectrum parameter conversion model using the voice data of the source speaker and the parallel data; and, based on the voice data of the source speaker, generating the converted voice of the target speaker with the spectrum parameter conversion model and the target speaker speech synthesis model. In this method, high-quality parallel data is generated with the target speaker speech synthesis model, a spectrum parameter conversion model is then trained on that parallel data, and voice conversion is carried out with the spectrum parameter conversion model and the target speaker speech synthesis model, so that conversion quality is ensured.
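The data flow of the four steps in the abstract can be sketched with toy stand-ins. Everything below is hypothetical: `ToyTTS`, the one-number-per-character "spectral frames", and the linear conversion model are placeholders chosen so the example runs without any speech toolkit, and they are not the models described in the patent. The toy also assumes source and target frames are already time-aligned, which real recordings generally are not.

```python
from dataclasses import dataclass
from typing import List, Tuple

Frames = List[float]  # toy stand-in for a sequence of spectrum parameters


@dataclass
class ToyTTS:
    """Hypothetical stand-in for the target speaker speech synthesis model."""
    offset: float

    def synthesize(self, text: str) -> Frames:
        # One fake frame per character, shifted by a speaker-specific offset.
        return [ord(ch) / 100.0 + self.offset for ch in text]


def train_tts(corpus: List[Tuple[str, Frames]]) -> ToyTTS:
    """Step 1 (toy): estimate the speaker offset from paired text/speech data."""
    diffs = [f - ord(ch) / 100.0
             for text, frames in corpus
             for ch, f in zip(text, frames)]
    return ToyTTS(offset=sum(diffs) / len(diffs))


def fit_linear(xs: Frames, ys: Frames) -> Tuple[float, float]:
    """Step 3 (toy): least-squares fit of y = a * x + b over aligned frames."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


def run_pipeline(tts_corpus, source_data, new_source_frames):
    tts = train_tts(tts_corpus)                      # step 1
    xs: Frames = []
    ys: Frames = []
    for text, src_frames in source_data:             # step 2: generate parallel data
        xs.extend(src_frames)
        ys.extend(tts.synthesize(text))
    a, b = fit_linear(xs, ys)                        # step 3
    # Step 4 (toy): apply the conversion model. The abstract says the synthesis
    # model is also used here; how is not given in this excerpt, so it is omitted.
    return a, b, [a * x + b for x in new_source_frames]


def src_voice(text: str) -> Frames:
    """Hypothetical source speaker: a scaled and shifted version of the same frames."""
    return [0.5 * ord(ch) / 100.0 + 0.2 for ch in text]


tts_corpus = [("hello", [ord(ch) / 100.0 + 1.0 for ch in "hello"])]
source_data = [(t, src_voice(t)) for t in ["voice", "data"]]
a, b, converted = run_pipeline(tts_corpus, source_data, src_voice("new"))
```

Because the synthesized target frames share the source text, the training pairs in step 3 are parallel by construction, which is the point of the approach. In this toy setup the fitted mapping recovers the relationship between the two fake speakers, so `converted` matches what `ToyTTS` would produce for the same text.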

Description

Technical field

[0001] The invention relates to the technical field of voice conversion, in particular to a voice conversion method and system for non-parallel data.

Background art

[0002] Speech conversion is a technique for modifying a source speaker's speech signal so that it has the speech characteristics of a target speaker while the linguistic content stays unchanged. The main tasks of speech conversion are extracting and converting the characteristic parameters that represent the speaker's individuality, and then reconstructing speech from the converted parameters. This process must ensure both the clarity of the converted speech and its similarity to the target speaker's speech characteristics.

[0003] In the existing speech conversion technology, most of the methods require two speakers to have parallel data (the text content corresponding to the speech is consistent). The main disadvantage of this meth...

Application Information

IPC(8): G10L13/02, G10L25/27, G10L25/69, G10L17/04, G10L17/08

CPC: G10L13/02, G10L25/27, G10L25/69, G10L17/04, G10L17/08

Inventor: 孙见青

Owner: BEIJING UNISOUND INFORMATION TECH
