Asymmetrical voice conversion method based on deep neural network feature mapping

A deep-neural-network feature-mapping technology applied in the field of asymmetric voice conversion, addressing problems such as the large amount of training data required, unsatisfactory converted voice quality, and deterioration of system performance.

Active Publication Date: 2014-01-22
BYZORO NETWORK LTD

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to overcome the problems of prior-art voice conversion systems, which not only strictly limit the sentences users may utter but also require a large amount of training data, while the converted voice quality remains unsatisfactory. The technical solution provided by the present invention is an asymmetric voice conversion method based on deep neural network feature mapping. In real environments, voice conversion systems face a sharp deterioration in performance when the data are asymmetric and scarce; the present invention therefore integrates these two relatively independent issues into a unified theoretical framework. At the same time, a deep neural network is used to perform unsupervised training on the original data and to extract the high-order statistical feature information it contains, after which supervised prediction training is carried out, ultimately improving the generalization performance of the voice conversion system in real environments.
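For illustration only, the following is a minimal sketch of the kind of unsupervised pre-training referred to above: stacked autoencoders are trained greedily, layer by layer, on un-annotated spectral frames so that the learned weights capture high-order statistics of the speech features and can later serve as a starting point for the supervised prediction training. The use of autoencoders in place of restricted Boltzmann machines, the layer sizes, the feature dimension, and all hyperparameters are assumptions for this sketch and are not taken from the patent.

```python
# Greedy layer-wise unsupervised pre-training on un-annotated spectral frames.
# Assumed setup (not from the patent): 24-dim spectral features, two hidden layers.
import torch
import torch.nn as nn

def pretrain_layers(frames, layer_sizes, epochs=20, lr=1e-3):
    """frames: (N, feat_dim) tensor of spectral features from any speaker.
    Returns a list of pre-trained nn.Linear layers (the encoder stack)."""
    encoders = []
    x = frames
    in_dim = frames.shape[1]
    for out_dim in layer_sizes:
        enc = nn.Linear(in_dim, out_dim)
        dec = nn.Linear(out_dim, in_dim)
        opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=lr)
        for _ in range(epochs):
            opt.zero_grad()
            h = torch.tanh(enc(x))           # encode the frames
            x_hat = dec(h)                   # reconstruct them
            loss = nn.functional.mse_loss(x_hat, x)
            loss.backward()
            opt.step()
        encoders.append(enc)
        x = torch.tanh(enc(x)).detach()      # feed the hidden codes to the next layer
        in_dim = out_dim
    return encoders

# Example: pre-train on 10k random stand-in frames (real use: pooled, unlabeled speech).
if __name__ == "__main__":
    frames = torch.randn(10000, 24)
    stack = pretrain_layers(frames, layer_sizes=[128, 64])
    print([tuple(layer.weight.shape) for layer in stack])
```

The returned encoder stack can then be reused as the initial weights of the supervised mapping network, sketched later in this document.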




Embodiment Construction

[0043] The present invention will be described in further detail below with reference to the accompanying drawings and examples.

[0044] In order to deal effectively with the problems of asymmetric data and data scarcity in real environments, the present invention designs the following data acquisition and integration scheme for the subsequent operations. For most applications, the collection of the target speaker's voice data is relatively passive, so it is more difficult and often leaves the amount of data insufficient; by contrast, the source speaker's voice data can be collected actively, so it is relatively easy to obtain and the amount of data is comparatively ample. For this reason, on the basis of the existing source speech data, the source speaker is asked to record a small amount of additional speech whose semantic content matches the collected speech of the target speaker, to serve as a reference (that is, the source speaker incrementally records a small amount of parallel data)…
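As a rough illustration of how such a small reference set might be turned into frame-level training pairs, the sketch below aligns each source reference utterance to the semantically identical target utterance with a simple dynamic time warping over spectral frames. The patent text shown here does not prescribe this alignment code; the plain-NumPy DTW, the Euclidean frame distance, and the feature representation are all assumptions.

```python
# Align a small set of semantically identical source/target utterances frame by frame,
# so the scarce target data can still supply (source_frame, target_frame) training pairs.
import numpy as np

def dtw_path(X, Y):
    """X: (n, d) source frames, Y: (m, d) target frames. Returns a list of (i, j) pairs."""
    n, m = len(X), len(Y)
    dist = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)   # (n, m) frame distances
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = dist[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack the optimal warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def make_pairs(src_utts, tgt_utts):
    """Build frame-level training pairs from the few parallel reference utterances."""
    pairs = [(src[i], tgt[j]) for src, tgt in zip(src_utts, tgt_utts)
             for i, j in dtw_path(src, tgt)]
    xs, ys = zip(*pairs)
    return np.stack(xs), np.stack(ys)
```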



Abstract

The invention discloses an asymmetrical voice conversion method based on deep neural network feature mapping, and belongs to the technical field of voice conversion. The method is designed for asymmetric source and target voice data and comprises the following steps: first, probability modeling is performed using the pre-training capability of a deep network, and high-order statistical features are extracted from the voice signal to provide a candidate space of preferred network coefficients; second, incremental learning is performed with a small amount of asymmetric data, and the network weight coefficients are corrected by optimizing the back-propagated error, so as to realize the mapping of feature parameters. In this method, the optimized network coefficient structure is taken as the initial parameter values of a deep feed-forward prediction network, and the network structure parameters are further back-propagated and optimized during incremental learning on the small amount of asymmetric data, thereby realizing the mapping of the speaker's individual feature parameters.
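A minimal sketch of the second stage described in the abstract, assuming the pre-trained encoder stack and the aligned frame pairs from the earlier sketches: the pre-trained layers initialize a feed-forward prediction network, whose weights are then corrected by back-propagation on the small parallel set so that source frames are mapped to target frames. Layer sizes, the added output layer, and the training schedule are illustrative assumptions, not values from the patent.

```python
# Second stage (illustrative): use the pre-trained layers as the initial weights of a
# feed-forward prediction network, then back-propagate on the small aligned parallel set
# so the network maps source spectral frames to target spectral frames.
import torch
import torch.nn as nn

def build_mapping_net(pretrained_encoders, out_dim):
    """Stack the pre-trained encoder layers and add a randomly initialised output layer."""
    layers = []
    for enc in pretrained_encoders:
        layers += [enc, nn.Tanh()]
    layers.append(nn.Linear(pretrained_encoders[-1].out_features, out_dim))
    return nn.Sequential(*layers)

def finetune(net, src_frames, tgt_frames, epochs=200, lr=1e-4):
    """Incremental supervised learning on the (small) aligned source/target frame pairs."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(src_frames), tgt_frames)
        loss.backward()
        opt.step()
    return net

# Converting a new source utterance is then a single forward pass:
#   converted = net(torch.as_tensor(new_source_frames, dtype=torch.float32))
```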

Description

Technical Field

[0001] The invention belongs to the technical field of speech conversion, and in particular relates to an asymmetric speech conversion method based on deep neural network feature mapping.

Background Technique

[0002] Speech conversion technology, simply put, transforms the voice of one speaker (called the source) by some means so that it sounds like another speaker (called the target). Speech conversion is an interdisciplinary branch whose content not only involves knowledge from phonetics, semantics, and psychoacoustics, but also covers all aspects of speech signal processing, such as speech analysis and synthesis, speaker recognition, and speech coding and enhancement.

[0003] The ultimate goal of speech conversion is to provide instant speech services that automatically and quickly adapt to any speaker. Such a system requires little or no user training and can function well for all users and various conditio…


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L25/30, G10L25/51
Inventors: 鲍静益, 徐宁
Owner: BYZORO NETWORK LTD