
Text information-based waveform concatenation voice synthesizing method

A waveform concatenation and text information technology, applied in speech synthesis, speech analysis, instruments, etc. It can solve problems such as the unsatisfactory stability of synthesized speech, poor prosodic performance, and the failure to consider the influence of text information, so as to enhance real-time performance, achieve high naturalness, and reduce the number of...

Inactive Publication Date: 2015-04-29
中科极限元(杭州)智能科技股份有限公司
6 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

However, although this method can synthesize waveforms that are relatively close to the original speech, it is limited by the size of the corpus, and the stability of the synthesized speech is not ideal: if the sound library is too large, synthesis is slow and cannot run in real time; if it is too small, the synthesized speech is unstable. This greatly degrades the listening experience.
Moreover, existing concatenative synthesis systems do not consider the influence of text information on primitives when calculating the cost, so the prosody of the synthesized speech is poor.




Embodiment Construction

[0035] Figure 1 shows the flow chart of the text information-based waveform concatenation speech synthesis method. The method includes the following steps:

[0036] Step S1: Extract the acoustic parameters and text parameters of all primitives in the original audio through sound segment segmentation, and train the weight prediction model and the duration prediction model on the extracted parameters;

[0037] The model training module trains on the text parameters and acoustic parameters of the primitives extracted from the training text and the corresponding audio, and obtains the duration prediction model and the weight prediction model required to calculate the target cost in the hierarchical pre-selection;
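As a rough illustration of how predicted weights could enter a target cost, the sketch below uses a weighted sum of absolute per-feature differences between a target primitive and a candidate primitive. The excerpt does not give the cost formula, so this form, the function name `target_cost`, and the two example features are all assumptions.

```python
from typing import Sequence


def target_cost(target: Sequence[float],
                candidate: Sequence[float],
                weights: Sequence[float]) -> float:
    """Weighted sum of absolute per-feature differences.

    Assumed form for illustration only; the source does not state
    the actual target-cost formula.
    """
    if not (len(target) == len(candidate) == len(weights)):
        raise ValueError("feature and weight vectors must have equal length")
    return sum(w * abs(t - c) for w, t, c in zip(weights, target, candidate))


# Hypothetical features: [duration in seconds, mean pitch in hundreds of Hz].
# The weights stand in for the output of a weight prediction model.
cost = target_cost([0.20, 1.8], [0.25, 2.0], [2.0, 1.0])
print(round(cost, 6))
```

A larger weight on a feature makes mismatches in that feature more expensive, which is one way text-dependent weights could steer unit selection toward better prosody.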

[0038] As shown in Figure 2, training the duration prediction model includes the following steps:

[0039] Step S11: Perform segment segmentation (primary segmentation) on the original sound bank, and segment it into the mini...



Abstract

The invention discloses a text information-based waveform concatenation voice synthesizing method. The method comprises the following steps: extracting acoustic parameters and text parameters of all primitives from the original audio through sound segment segmentation, and training a duration prediction model and a weight prediction model on the extracted parameters; performing layered pre-selection, in which a primary pre-selection over the primitives in the corpus uses the target primitive analyzed from the text and the duration predicted by the duration prediction model to obtain candidate primitives; calculating the target cost from the target primitive, the candidate primitives, and the weight information predicted by the weight prediction model; calculating the degree of fit between two adjacent primitives to obtain the concatenation cost; and running a Viterbi search over the target cost and the concatenation cost to obtain the least-cost path, which yields the optimal primitives, with the synthesized voice obtained through smooth concatenation. The method improves voice synthesizing efficiency, enhances the real-time performance of concatenation synthesis, and improves the prosodic features of the synthesized voice.
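The least-cost path search described in the abstract can be sketched as a standard Viterbi pass over candidate primitives. This is a minimal sketch, not the patented implementation: the function name `viterbi_select`, the use of plain floats as stand-ins for primitives, and the absolute-difference concatenation cost are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

U = TypeVar("U")


def viterbi_select(candidates: Sequence[Sequence[U]],
                   target_costs: Sequence[Sequence[float]],
                   join_cost: Callable[[U, U], float]) -> List[U]:
    """Pick one candidate per position, minimizing the total of
    target costs plus concatenation costs between neighbors."""
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    best = list(target_costs[0])      # best cumulative cost ending in each candidate
    back: List[List[int]] = []        # back-pointers for positions 1..n-1

    for t in range(1, len(candidates)):
        new_best, pointers = [], []
        for j, unit in enumerate(candidates[t]):
            # Cost of reaching this unit from each candidate at the previous position.
            costs = [best[i] + join_cost(prev, unit)
                     for i, prev in enumerate(candidates[t - 1])]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min] + target_costs[t][j])
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best)), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[t][k] for t, k in enumerate(path)]


# Hypothetical data: each candidate is a single number (e.g. a boundary pitch value),
# and the concatenation cost is the absolute difference between neighbors.
candidates = [[1.0, 5.0], [2.0, 6.0], [5.5, 2.5]]
target_costs = [[0.0, 0.1], [0.5, 0.0], [0.0, 0.3]]
selected = viterbi_select(candidates, target_costs, lambda a, b: abs(a - b))
print(selected)
```

Because pre-selection shrinks each candidate list before the search, the inner loop over candidate pairs stays small, which is consistent with the efficiency gain the abstract claims.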

Description

Technical field

[0001] The invention relates to a waveform concatenation speech synthesis method, in particular to a text information-based waveform concatenation speech synthesis method, and belongs to the field of intelligent information processing.

Background

[0002] Speech is one of the main means of human-computer interaction, and the main purpose of speech synthesis is to enable computers to generate clear, highly natural continuous speech. There are two main methods of speech synthesis. Early research mainly used parametric speech synthesis, of which the most common form is Hidden Markov Model-based parametric speech synthesis. As a specific implementation of statistical acoustic modeling, this method performs Hidden Markov modeling on the acoustic parameters of speech, reconstructs the trajectory of the acoustic parameters through parameter generation algorithms, and finally invokes the speech synthesizer to generate speech waveform...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/02, G10L13/08
Inventor: 徐明星
Owner: 中科极限元(杭州)智能科技股份有限公司