
Front end design-based speech synthesis method

A speech synthesis and phoneme technology, applied in speech synthesis, speech analysis, instruments and the like, which can solve the problems of data dependence and uncontrollable synthesis effect.

Publication Date: 2019-01-29 (Inactive)
SICHUAN CHANGHONG ELECTRIC CO LTD

AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to provide a speech synthesis method based on front-end design, so as to solve the problems of data dependence and uncontrollable synthesis effect of current speech synthesis methods.


Examples


[0029] The speech synthesis method based on front-end design of the present invention comprises the following steps (a high-level code sketch of the pipeline is given after step 6):

[0030] Step 1, preprocessing the Chinese text data;

[0031] Step 2, extracting the relevant linguistic features of the Chinese text;

[0032] Step 3, extracting at least two acoustic features of the audio file;

[0033] Step 4, training the duration model and the acoustic model according to the linguistic features and the acoustic features;

[0034] Step 5, after processing the Chinese text to be synthesized as in steps 1 and 2, calling the duration model obtained in step 4 to obtain the duration information corresponding to the text, and then combining the linguistic features and the duration information as the input of the acoustic model to obtain the corresponding acoustic features;

[0035] Step 6, using a vocoder to synthesize the acoustic features obtained in step 5 into the corresponding audio data.
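
Read together, steps 1 to 6 describe a front-end-driven, statistical-parametric pipeline: the front end produces linguistic features, a duration model predicts per-phoneme timing, an acoustic model maps linguistic features plus durations to acoustic feature frames, and a vocoder renders those frames as audio. The sketch below is a minimal illustration of how such stages could be wired together; the names (front_end, DurationModel, AcousticModel, vocoder) and the toy model internals are assumptions for illustration, not the patent's implementation.

```python
# Minimal sketch of the six-step pipeline described above. All names and the
# toy model internals are illustrative assumptions, not the patent's implementation.
from dataclasses import dataclass
from typing import List


@dataclass
class PhonemeFeature:
    """One phoneme plus simple positional context (a stand-in for richer linguistic features)."""
    phoneme: str   # e.g. "g" or "ua1"
    index: int     # position of the phoneme in the utterance
    total: int     # number of phonemes in the utterance


def front_end(pinyin_phonemes: str) -> List[PhonemeFeature]:
    """Steps 1-2: text preprocessing and linguistic feature extraction.
    Reduced here to a whitespace split over already-converted phonemes."""
    phones = pinyin_phonemes.split()
    return [PhonemeFeature(p, i, len(phones)) for i, p in enumerate(phones)]


class DurationModel:
    """Steps 4-5: predicts how many acoustic frames each phoneme lasts.
    A trained statistical model would go here; this toy returns a constant."""

    def predict(self, feats: List[PhonemeFeature]) -> List[int]:
        return [10 for _ in feats]  # 10 frames per phoneme


class AcousticModel:
    """Steps 4-5: maps linguistic features plus durations to per-frame acoustic
    features (e.g. spectral envelope and F0). This toy emits zero vectors."""

    def predict(self, feats: List[PhonemeFeature], durations: List[int],
                dim: int = 60) -> List[List[float]]:
        frames: List[List[float]] = []
        for _feat, dur in zip(feats, durations):
            frames.extend([0.0] * dim for _ in range(dur))
        return frames


def vocoder(frames: List[List[float]], samples_per_frame: int = 80) -> List[float]:
    """Step 6: a real vocoder would turn acoustic frames into waveform samples;
    this toy just emits silence of the matching length (80 samples = 5 ms at 16 kHz)."""
    return [0.0] * (len(frames) * samples_per_frame)


if __name__ == "__main__":
    feats = front_end("n i3 h ao3")                    # pre-split toned pinyin phonemes
    durs = DurationModel().predict(feats)              # step 5: durations first
    acoustic = AcousticModel().predict(feats, durs)    # step 5: then acoustic features
    audio = vocoder(acoustic)                          # step 6: waveform
    print(len(feats), "phonemes ->", len(audio), "samples")
```

In the patent's step 4, the duration and acoustic models would be trained on the aligned linguistic and acoustic features from steps 2 and 3; the toy predict methods above merely stand in for such trained models.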

Embodiment

[0037] The embodiment of the present invention, a speech synthesis method based on front-end design, specifically comprises the following steps:

[0038] (1) Data processor, which preprocesses Chinese text data.

[0039] Special characters and numbers in the Chinese text are preprocessed, that is, expanded into their spoken-word equivalents: for example, "0.1%" is parsed into its spoken Chinese form, "2018" is parsed into its Chinese reading (which may differ depending on whether it denotes a year or a count, as in "2018 times"), and so on. The parsed text is then converted into pinyin with tones. The Chinese text corpus needs to cover all Chinese pinyin syllables.
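
As a rough illustration of this normalization step, the sketch below expands percentages and digit strings into spoken Chinese words and then converts the result into pinyin with tone numbers. The digit-by-digit reading rule and the use of the third-party pypinyin package are assumptions made for illustration; the patent does not name a particular converter or rule set.

```python
# Illustrative normalization of numbers and special characters, followed by
# conversion to toned pinyin. The digit-by-digit reading rule and the use of
# the third-party pypinyin package are assumptions, not named in the patent.
import re
from typing import List

from pypinyin import Style, lazy_pinyin  # pip install pypinyin

READINGS = dict(zip("0123456789.", "零一二三四五六七八九点"))


def read_digits(num: str) -> str:
    """Read a number character by character, e.g. '2018' -> '二零一八'."""
    return "".join(READINGS[ch] for ch in num)


def normalize(text: str) -> str:
    """Expand percentages and digit strings into spoken Chinese words."""
    # 'X%' is read as '百分之' + X, e.g. '0.1%' -> '百分之零点一'
    text = re.sub(r"(\d+(?:\.\d+)?)%",
                  lambda m: "百分之" + read_digits(m.group(1)), text)
    # Remaining digit runs are read digit by digit; a fuller system would
    # choose between year, cardinal, phone-number, etc. readings.
    return re.sub(r"\d+(?:\.\d+)?", lambda m: read_digits(m.group(0)), text)


def to_toned_pinyin(text: str) -> List[str]:
    """Normalize, then convert to pinyin with trailing tone numbers, e.g. ['ni3', 'hao3']."""
    return lazy_pinyin(normalize(text), style=Style.TONE3)


if __name__ == "__main__":
    sample = "占0.1%，共2018次"
    print(normalize(sample))
    print(to_toned_pinyin(sample))
```

A production front end would additionally choose between reading styles, for example reading a year digit by digit but a count as a full cardinal, as the "2018" examples above suggest.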

[0040] (2) Linguistic feature generator, which extracts linguistic features related to Chinese text.

[0041] a) The pinyin with tones obtained in step (1) is split into the corresponding phonemes according to a custom dictionary, which also sets conversion rules for special syllables. Some of the pinyin splitting rules are as follows (a code sketch of this split is given after the rule list):

[0042]
a1 → a1
gua1 → g ua1
na1 → n a1
sui1 → s uei1
ai1 → ai1
guai1 → g uai1
nai1 → n ai1
sun1 → ...
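
A minimal sketch of this dictionary-driven split is shown below, assuming each toned syllable divides into an optional initial plus a tone-carrying final. The short initial list and the special-syllable overrides are illustrative stand-ins for the custom dictionary, not the patent's full rule set.

```python
# Illustrative pinyin-to-phoneme split driven by small rule tables. These tables
# are stand-ins for the patent's custom dictionary, not its full rule set.
import re
from typing import List

# Pinyin initials, two-letter ones first so 'zh'/'ch'/'sh' match before 'z'/'c'/'s'.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

# Special syllables whose surface final differs from the underlying phoneme,
# e.g. the 'ui' in 'sui' actually stands for 'uei'.
SPECIAL_FINALS = {"ui": "uei", "iu": "iou", "un": "uen"}


def split_pinyin(syllable: str) -> List[str]:
    """Split one toned syllable (e.g. 'gua1') into phonemes (['g', 'ua1'])."""
    m = re.fullmatch(r"([a-z]+)([1-5])", syllable)
    if not m:
        raise ValueError(f"expected toned pinyin like 'gua1', got {syllable!r}")
    body, tone = m.groups()

    initial = next((i for i in INITIALS
                    if body.startswith(i) and len(body) > len(i)), "")
    final = SPECIAL_FINALS.get(body[len(initial):], body[len(initial):])
    return [final + tone] if not initial else [initial, final + tone]


if __name__ == "__main__":
    for s in ["a1", "gua1", "na1", "sui1", "ai1", "guai1", "nai1"]:
        print(s, "->", " ".join(split_pinyin(s)))
```

Running the example prints, among others, "sui1 -> s uei1", matching the special-syllable rule listed above.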


Abstract

The invention provides a front end design-based speech synthesis method and belongs to the technical field of speech synthesis. The front end design-based speech synthesis method of the invention solves the problems of data dependence and uncontrollable synthesis effect of current speech synthesis methods. According to the technical scheme of the invention, the method includes the following steps: step 1, Chinese text data are preprocessed; step 2, linguistic features related to the Chinese text are extracted; step 3, at least two acoustic features of an audio file are extracted; step 4, a duration model and an acoustic model are trained according to the linguistic features and the acoustic features; step 5, the duration model obtained in step 4 is called to obtain duration information corresponding to the Chinese text that requires synthesis and has been processed in steps 1 and 2, and the linguistic features and the duration information are adopted as the input of the acoustic model, so that the corresponding acoustic features are obtained; and step 6, the acoustic features obtained in step 5 are synthesized into corresponding audio data with a vocoder.

Description

Technical field

[0001] The invention relates to speech synthesis technology, and in particular to a speech synthesis method based on front-end design.

Background technique

[0002] Speech synthesis is the technology of producing intelligible and fluent speech through mechanical and electronic means. With the rapid development of artificial intelligence, existing speech synthesis technology has gradually shifted from traditional hidden Markov model (HMM)-based feature extraction algorithms to deep learning: high-quality audio data from a large number of speakers is recorded, a neural network model is trained on it to obtain a speech synthesis model, and audio data is synthesized end-to-end. Such methods can synthesize high-quality speech, but they are highly data-dependent and their synthesis effect is uncontrollable.

Contents of the invention

[0003] The purpose of the present invention is to provide a speech synthesis method based on front-end design, so as to solve the problems of data dependence and uncontrollable synthesis effect of current speech synthesis methods.


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L13/02, G10L13/08, G10L25/03
CPC: G10L13/02, G10L13/08, G10L25/03, G10L2013/083
Inventor: 王昆
Owner: SICHUAN CHANGHONG ELECTRIC CO LTD