
Voice conversion method and system for non-parallel data

The invention belongs to the field of voice conversion for non-parallel data. It addresses two problems in existing approaches: parallel data is difficult to obtain, and methods that use only non-parallel data convert poorly. It aims to ensure conversion quality.

Pending Publication Date: 2020-11-20
BEIJING UNISOUND INFORMATION TECH +1

AI Technical Summary

Problems solved by technology

[0003] Most existing speech conversion methods require parallel data from the two speakers, that is, recordings whose text content is identical. The main disadvantage of these methods is that parallel data is difficult to obtain. Other methods require only non-parallel data, but their main drawback is poor conversion quality.
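The distinction drawn in [0003] can be made concrete with a minimal sketch. It treats two speakers' corpora as parallel when their transcripts match after simple normalization. The function names `normalize` and `is_parallel` and the sample sentences are hypothetical and do not come from the patent.

```python
from collections import Counter
from typing import Iterable


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def is_parallel(texts_a: Iterable[str], texts_b: Iterable[str]) -> bool:
    """Return True when both speakers recorded the same set of sentences."""
    return Counter(map(normalize, texts_a)) == Counter(map(normalize, texts_b))


# Hypothetical transcripts for three speakers.
speaker_a = ["Good morning", "Open the door"]
speaker_b = ["open the door", "good  morning"]
speaker_c = ["Turn on the light", "Open the door"]

print(is_parallel(speaker_a, speaker_b))  # same sentences, different order
print(is_parallel(speaker_a, speaker_c))  # different sentences
```

In practice two speakers rarely record identical text, which is the difficulty the patent sets out to avoid.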



Examples


Embodiment Construction

[0051] The preferred embodiments of the present invention will be described below in conjunction with the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0052] An embodiment of the present invention provides a voice conversion method for non-parallel data. As shown in Figure 1, the method performs the following steps:

[0053] Step 1: train the speech synthesis model of the target speaker using data from a large-scale speech synthesis corpus recorded by speakers other than the source speaker and the target speaker, wherein the corpus data includes paired text and speech data;

[0054] Step 2: Based on the text corresponding to the speech data of the source speaker, generate parallel data corresponding to the target speaker according to the speech synthesis model of the target speaker, wherein the paral...


Abstract

The invention provides a voice conversion method and system for non-parallel data. The method comprises: training a target-speaker speech synthesis model using data from a large-scale speech synthesis corpus that excludes the source speaker and the target speaker; generating, from the text corresponding to the source speaker's speech data, parallel data for the target speaker using the target-speaker speech synthesis model; training a spectrum parameter conversion model using the source speaker's speech data and the parallel data; and generating, from the source speaker's speech data, the converted speech of the target speaker using the spectrum parameter conversion model and the target-speaker speech synthesis model. The method thus synthesizes high-quality parallel data with the target-speaker speech synthesis model, trains the spectrum parameter conversion model on that parallel data, and performs voice conversion with both models, so that conversion quality is ensured.
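The data flow described in the abstract can be illustrated with a toy sketch. It is not the patented implementation: spectral parameters are replaced by one float per character, the trained target-speaker synthesis model is replaced by a fixed stand-in function, the conversion model is a least-squares line, frames are assumed to be already time-aligned, and final waveform generation is omitted. All names (`toy_source_speech`, `toy_target_tts`, `build_parallel_data`, `fit_conversion_model`) are hypothetical.

```python
from typing import Callable, List, Tuple

Frames = List[float]  # toy stand-in for per-frame spectral parameters


def toy_source_speech(text: str) -> Frames:
    """Assumption: pretend the source speaker produces one frame per character."""
    return [float(ord(c) % 32) for c in text]


def toy_target_tts(text: str) -> Frames:
    """Assumption: fixed stand-in for the trained target-speaker synthesis model."""
    return [2.0 * (ord(c) % 32) + 1.0 for c in text]


def build_parallel_data(
    source_corpus: List[Tuple[Frames, str]],
    synthesize: Callable[[str], Frames],
) -> List[Tuple[Frames, Frames]]:
    """Pair each source utterance with target speech synthesized from its text."""
    return [(speech, synthesize(text)) for speech, text in source_corpus]


def fit_conversion_model(
    pairs: List[Tuple[Frames, Frames]],
) -> Callable[[Frames], Frames]:
    """Toy conversion model: least-squares fit of y = a * x + b over all frames."""
    xs = [x for src, _ in pairs for x in src]
    ys = [y for _, tgt in pairs for y in tgt]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return lambda frames: [a * x + b for x in frames]


# Non-parallel setting: only the source speaker's speech and its text are given.
source_corpus = [(toy_source_speech(t), t) for t in ("hello", "voice")]
parallel = build_parallel_data(source_corpus, toy_target_tts)
convert = fit_conversion_model(parallel)

# Convert an unseen source utterance into the target speaker's feature space.
converted = convert(toy_source_speech("new"))
print(converted)
```

The point of the sketch is the ordering: the synthesis model manufactures the missing target-side half of each training pair, so the conversion model can be trained as if parallel recordings existed.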

Description

Technical field

[0001] The invention relates to the technical field of voice conversion, in particular to a voice conversion method and system for non-parallel data.

Background technique

[0002] Speech conversion is a technique for modifying a source speaker's speech signal to match a target speaker's speech signal, so that it has the speech characteristics of the target speaker while keeping the speech information unchanged. The main task of speech conversion includes extracting and converting the characteristic parameters representing the personality of the speaker, and then reconstructing the converted parameters into speech. This process must not only ensure the clarity of the converted speech, but also ensure the similarity of the converted speech features.

[0003] In the existing speech conversion technology, most of the methods require two speakers to have parallel data (the text content corresponding to the speech is consistent). The main disadvantage of this meth...


Application Information

IPC(8): G10L13/02, G10L25/27, G10L25/69, G10L17/04, G10L17/08
CPC: G10L13/02, G10L25/27, G10L25/69, G10L17/04, G10L17/08
Inventor: 孙见青
Owner: BEIJING UNISOUND INFORMATION TECH