Speech data amplification method and system for speech synthesis

A speech synthesis and speech data technology, applied in speech synthesis, speech analysis, speech recognition and related fields. It addresses the poor performance, distortion and large speech error of speech synthesis systems, with the aim of optimizing system capabilities and improving the speech synthesis effect.

Pending Publication Date: 2022-07-01
AISPEECH CO LTD

AI Technical Summary

Problems solved by technology

[0007] The invention aims to solve at least the following problem in the prior art: conventional speech amplification introduces distortion, and the amplified speech deviates considerably from speech in real environments, so a speech synthesis system trained on it performs poorly.



Examples


Embodiment approach

[0041] As an embodiment, the speech synthesis system consists of a text processing module, an acoustic model and a vocoder;

[0042] The training method of the speech synthesis system includes:

[0043] inputting the training text into the text processing module to obtain the optimized text;

[0044] inputting the training style representation and the optimized text into the acoustic model, and inputting the output of the acoustic model into the vocoder to obtain simulated synthesized speech data;

[0045] using a trained reference speech synthesis system as the teacher model of the speech synthesis system, and training the speech synthesis system on the reference synthesized speech data output by the teacher model and the simulated synthesized speech data, until the simulated synthesized speech data approximates the reference synthesized speech data.
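The teacher-student training in [0045] can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: the student is reduced to a single-parameter function standing in for the acoustic model and vocoder, the teacher is an arbitrary callable, and mean squared error with plain gradient descent is used as the closeness measure, which the source does not specify. The names `distill`, `weight`, `lr`, `tolerance` and `max_steps` are hypothetical.

```python
from typing import Callable, Sequence


def distill(
    teacher: Callable[[float], float],
    inputs: Sequence[float],
    weight: float = 0.0,
    lr: float = 0.05,
    tolerance: float = 1e-6,
    max_steps: int = 10_000,
) -> float:
    """Fit a one-parameter student (y = weight * x) to the teacher's outputs.

    The student is a toy stand-in for the speech synthesis system; the loss
    (mean squared error) is an assumption, not taken from the source.
    """
    # Reference outputs produced by the already trained teacher model.
    targets = [teacher(x) for x in inputs]
    for _ in range(max_steps):
        # Simulated outputs produced by the student being trained.
        outputs = [weight * x for x in inputs]
        loss = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(inputs)
        if loss < tolerance:
            break  # student output approximates the teacher output
        grad = sum(
            2 * (o - t) * x for o, t, x in zip(outputs, targets, inputs)
        ) / len(inputs)
        weight -= lr * grad
    return weight


# Toy usage: the "teacher" doubles its input; the student should learn weight ~ 2.
learned_weight = distill(lambda x: 2.0 * x, [1.0, 2.0, 3.0])
```

In a real system the student and teacher would be neural networks and the comparison would be made on acoustic features or waveforms; the loop structure (generate reference output, generate simulated output, reduce the gap until it is small) is the only part this sketch is meant to convey.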

[0046] Figure 3 shows the structure of the speech synthesis system. During the ini...



Abstract

An embodiment of the invention provides a speech data amplification method for speech synthesis. The method comprises the following steps: inputting collected real speech from multiple speakers into a style extraction system to obtain a style representation of each speaker; taking text data and the style representation of each speaker as the input of a speech synthesis system for end-to-end modeling, and outputting synthesized speech data that carries the style representation; and using an audio identification system to select, as amplified speech data, the synthesized speech data that it judges to come from a real scene. An embodiment of the invention also provides a speech data amplification system for speech synthesis. According to the embodiments, a large amount of multi-speaker, multi-language and multi-scene real data, rather than studio-recorded high-quality data, is used for speech synthesis modeling, which enriches the modeling capability of the speech synthesis system; a style extraction module is introduced to strengthen the system's ability to model real-scene data; and the overall speech synthesis effect of the system is improved.
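The three steps in the abstract (style extraction, style-conditioned synthesis, filtering by an audio identification system) can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the patented implementation: the three systems are passed in as plain callables, audio and style are represented as lists of floats, and every name (`amplify_speech_data`, `extract_style`, `synthesize`, `is_real_scene`, and the toy stand-ins) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Audio = List[float]  # assumed representation: raw samples
Style = List[float]  # assumed representation: a style vector


@dataclass
class SynthesizedUtterance:
    speaker: str
    text: str
    audio: Audio


def amplify_speech_data(
    real_speech: Dict[str, Sequence[Audio]],
    texts: Sequence[str],
    extract_style: Callable[[Sequence[Audio]], Style],
    synthesize: Callable[[str, Style], Audio],
    is_real_scene: Callable[[Audio], bool],
) -> List[SynthesizedUtterance]:
    # Step 1: one style representation per speaker from their real recordings.
    styles = {spk: extract_style(clips) for spk, clips in real_speech.items()}
    amplified: List[SynthesizedUtterance] = []
    for spk, style in styles.items():
        for text in texts:
            # Step 2: synthesize speech conditioned on text and speaker style.
            audio = synthesize(text, style)
            # Step 3: keep only audio the identification system accepts as real-scene.
            if is_real_scene(audio):
                amplified.append(SynthesizedUtterance(spk, text, audio))
    return amplified


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_extract_style(clips: Sequence[Audio]) -> Style:
    samples = [s for clip in clips for s in clip]
    return [sum(abs(s) for s in samples) / len(samples)]


def toy_synthesize(text: str, style: Style) -> Audio:
    return [style[0]] * len(text)


def toy_is_real_scene(audio: Audio) -> bool:
    return sum(audio) / len(audio) >= 0.1


result = amplify_speech_data(
    {"spk_a": [[0.5, -0.5]], "spk_b": [[0.01, -0.01]]},
    ["hello", "hi"],
    toy_extract_style,
    toy_synthesize,
    toy_is_real_scene,
)
```

With these toy stand-ins, utterances synthesized in the style of `spk_b` are rejected by the filter, so only the `spk_a` utterances end up in the amplified set.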

Description

Technical field

[0001] The invention relates to the field of intelligent speech, in particular to a speech data amplification method, system, electronic device and storage medium for speech synthesis.

Background technique

[0002] Speech synthesis, also known as Text to Speech technology, can convert any text information into standard and fluent pronunciation in real time, just like putting an artificial mouth on a machine. It involves many disciplines such as acoustics, linguistics, digital signal processing, computer science, etc. It is a cutting-edge technology in the field of Chinese information processing. The so-called "making machines talk like humans" is essentially different from traditional playback devices (systems). Traditional sound playback devices (systems), such as tape recorders, achieve "making the machine talk" by pre-recording sound and then playing it back. This method has great limitations in terms of content, storage, transmission, convenience, and t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/02, G10L13/08, G10L15/06, G10L17/02, G10L25/51
CPC: G10L13/02, G10L13/08, G10L15/063, G10L17/02, G10L25/51, G10L2013/021, G10L2015/0635
Inventor: 薛少飞
Owner: AISPEECH CO LTD