System and method for training cloned tone and rhythm based on Bottleneck features

A Bottleneck-feature training system and method, applied in speech synthesis, speech analysis, and speech recognition. It addresses production delays that cannot keep pace with market demand and high labor costs, shortening the production cycle and reducing the number of samples needed for cloning.

Active Publication Date: 2020-05-29

NANJING SILICON INTELLIGENCE TECH CO LTD

Cites: 11; Cited by: 7

Problems solved by technology

[0002] With the rapid development of the telephone robot business market, the rapid increase in the volume of intelligent voice services has made customized speech synthesis (TTS) services very difficult to deliver. One customized TTS service requires nearly 10,000 real recording samples. The production cycle, from sample collection, data labeling, and data preprocessing through model training to service provision, takes nearly one month and incurs substantial labor costs. This delay cannot keep pace with market demand.

Embodiment Construction

[0033] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0034] As shown in Figure 1, the present invention provides a system for training cloned timbre and rhythm based on Bottleneck features, including:

[0035] (1) The data collection module, which collects the corpus for the speech recognition module (ASR Model), the base corpus for the prosody module (TTB Model), the corpus for the multi-speaker acoustic model (Multi-speaker Acoustic Model), and the clone corpus (the target user's audio and corresponding text);

[0036] (2) The acoustic feature extraction module, which extracts linear predictive coding features (LPC Feature) and Mel-frequency cepstral coefficients (MFCC) as acoustic fe...
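
The LPC features named in item (2) can be illustrated with a minimal sketch. It uses the textbook autocorrelation method with the Levinson-Durbin recursion, which is one standard way to estimate LPC coefficients; the excerpt does not say which estimator the patent uses. The function names, the Hamming window, and the frame length, hop, and order parameters are all assumptions made for illustration. MFCC extraction is omitted here.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate LPC coefficients for one frame (autocorrelation method).

    Returns (a, err) where a[0] == 1 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    frame = np.asarray(frame, dtype=np.float64)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: keep remaining coefficients at zero
        # Reflection coefficient for step i of the Levinson-Durbin recursion.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_features(signal, frame_len, hop, order):
    """Split a signal into Hamming-windowed frames and return one LPC vector per frame.

    Hypothetical helper: the framing scheme is an assumption, not taken from the patent.
    """
    signal = np.asarray(signal, dtype=np.float64)
    window = np.hamming(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array(
        [lpc_coefficients(signal[s:s + frame_len] * window, order)[0][1:] for s in starts]
    )
```

Each row of the array returned by `lpc_features` is the coefficient vector for one frame; in a pipeline like the one described, such rows would typically be combined with the MFCC vector of the same frame.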

Abstract

The invention relates to the technical field of voice synthesis, voice recognition and voice cloning. It provides a voice cloning scheme based on Bottleneck features (language features of audio) that combines voice synthesis, voice recognition and transfer learning, and comprises a training system and a training method. A TTS service with high naturalness and similarity is produced from a small number of samples, so that the service carries the target user's characteristics while avoiding the large sample size, long production period and high labor cost of conventional voice synthesis. The training system comprises a data acquisition module, an acoustic feature extraction module, a voice recognition module, a rhythm module, a multi-person voice acoustic module and a voice synthesis module. The invention further provides a training method based on the system, comprising the steps of training corpus preparation, acoustic feature extraction, training and fine-tuning of all modules, and speech synthesis.
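
To make the term concrete: a bottleneck feature conventionally means the activations of a deliberately narrow hidden layer in a network trained for a speech recognition task. The sketch below only shows how such activations are read out of a forward pass. The class name, layer sizes, tanh activations, and random untrained weights are all hypothetical; the abstract does not describe the patent's actual network.

```python
import numpy as np


class BottleneckNet:
    """Toy feed-forward net: input -> hidden -> narrow bottleneck -> hidden -> class posteriors.

    Weights are random and untrained; this illustrates feature read-out only.
    """

    def __init__(self, in_dim, hidden_dim, bottleneck_dim, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        dims = [in_dim, hidden_dim, bottleneck_dim, hidden_dim, n_classes]
        self.weights = [
            rng.standard_normal((a, b)) * np.sqrt(1.0 / a)
            for a, b in zip(dims[:-1], dims[1:])
        ]
        self.biases = [np.zeros(b) for b in dims[1:]]
        self.bottleneck_index = 1  # the second layer's output is the bottleneck

    def forward(self, x, return_bottleneck=False):
        h = np.asarray(x, dtype=np.float64)
        bottleneck = None
        last = len(self.weights) - 1
        for i, (w, b) in enumerate(zip(self.weights, self.biases)):
            h = h @ w + b
            if i < last:
                h = np.tanh(h)  # hidden layers use tanh; the output layer stays linear
            if i == self.bottleneck_index:
                bottleneck = h
        # Softmax over the output layer gives per-frame class posteriors.
        z = h - h.max(axis=-1, keepdims=True)
        p = np.exp(z)
        p /= p.sum(axis=-1, keepdims=True)
        return (p, bottleneck) if return_bottleneck else p


# Usage: 100 frames of 39-dimensional acoustic features (sizes are hypothetical).
frames = np.random.default_rng(1).standard_normal((100, 39))
net = BottleneckNet(in_dim=39, hidden_dim=64, bottleneck_dim=16, n_classes=40)
posteriors, bottleneck_features = net.forward(frames, return_bottleneck=True)
```

In a trained recognizer, the per-frame bottleneck vectors would carry mostly linguistic content, which is why they are a plausible input for a downstream acoustic model that supplies the target speaker's timbre.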

Description

Technical field

[0001] The invention relates to speech synthesis technology (TTS), speech recognition technology (ASR), and voice cloning technology, and belongs to the intelligent speech area of artificial intelligence.

Background

[0002] With the rapid development of the telephone robot business market, the rapid increase in the volume of intelligent voice services has made customized speech synthesis (TTS) services very difficult to deliver. One customized TTS service requires nearly 10,000 real recording samples. The production cycle, from sample collection, data labeling, and data preprocessing through model training to service provision, takes nearly one month and incurs substantial labor costs. This delay cannot keep pace with market demand. Currently, TTS mainly includes two technical solutions: staged speech synthesis and end-to-end speech synthesis. The purpose of timbre and rhythm cloning is to synthesize ...

Application Information

IPC(8): G10L13/02, G10L15/02, G10L15/06, G10L15/16, G10L25/03, G10L25/24, G10L25/30, G10L25/12

CPC: G10L13/02, G10L15/02, G10L15/063, G10L15/16, G10L25/03, G10L25/12, G10L25/24, G10L25/30

Inventors: 司马华鹏 (Sima Huapeng), 龚雪飞 (Gong Xuefei)

Owner: NANJING SILICON INTELLIGENCE TECH CO LTD
