Non-periodic component syllable model building and speech synthesizing method and device

An aperiodic component modeling technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses the poor spectral coherence of aperiodic components, the large amount of model data, and the low quality of synthesized audio.

Inactive Publication Date: 2015-01-14
CHINA MOBILE COMM GRP CO LTD

AI Technical Summary

Problems solved by technology

[0017] The embodiment of the present invention provides a method and device for establishing an aperiodic component syllable model and speech synthesis, which are used to solve the problem of the large amount of data


Examples


Embodiment 1

[0069] Figure 1 is a schematic flow chart of a method for establishing a non-periodic component syllable model in Embodiment 1 of the present invention. The method includes:

[0070] Step 101: Obtain the original speech waveform files from the speech database.

[0071] Specifically, in step 101, the speech database includes a large number of original speech waveform files and the annotation files corresponding to them, for example files in WAV format and their corresponding file identifiers (i.e., labels).

[0072] The annotation files and the original speech waveform files are in one-to-one correspondence; that is, each original speech waveform file corresponds to a unique annotation file.
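
The one-to-one correspondence described above can be checked mechanically before any model is built. The following is a minimal sketch, not the patent's procedure: it assumes that a waveform file and its annotation file share a base name, and the `.lab` extension and the function name `pair_waveforms_with_annotations` are hypothetical choices made for illustration.

```python
from pathlib import PurePath


def pair_waveforms_with_annotations(filenames, wav_ext=".wav", label_ext=".lab"):
    """Map each waveform file to the single annotation file sharing its base name.

    Assumption (not from the source): files are matched by base name, and
    annotation files use the hypothetical extension given by label_ext.
    """
    wavs, labels = {}, {}
    for name in filenames:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix == wav_ext:
            target = wavs
        elif suffix == label_ext:
            target = labels
        else:
            continue  # ignore unrelated files
        if path.stem in target:
            raise ValueError(f"duplicate entry for {path.stem!r}")
        target[path.stem] = name
    if wavs.keys() != labels.keys():
        unmatched = sorted(wavs.keys() ^ labels.keys())
        raise ValueError(f"files without a one-to-one partner: {unmatched}")
    return {wavs[stem]: labels[stem] for stem in sorted(wavs)}


if __name__ == "__main__":
    print(pair_waveforms_with_annotations(["a.wav", "a.lab", "b.lab", "b.wav"]))
```

Raising an error on an unmatched or duplicated file, rather than silently skipping it, keeps a broken database from producing a partially trained model.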

[0073] Before preparing to build the aperiodic component syllable model, a large number of original speech waveform files are obtained from the speech database, and after analysis and processing, the required language parameter model, th...

Embodiment 2

[0123] Figure 2 is a schematic flowchart of a speech synthesis method based on an aperiodic component syllable model in Embodiment 2 of the present invention. Embodiment 2 is implemented on the basis of Embodiment 1. The method includes:

[0124] Step 201: Use a text analysis device to convert the acquired text information to be synthesized into an original speech waveform file, and obtain the annotation file of that original speech waveform file from the converted file.

[0125] Specifically, in step 201, after the text information to be synthesized into speech is acquired, a text analysis device is used to convert it into an original speech waveform file, and the annotation file of that original speech waveform file is obtained from the converted file.

[0126] Step 202: According to the correspondi...

Embodiment 3

[0136] Figure 3 is a schematic structural diagram of an aperiodic component syllable model building device in Embodiment 3 of the present invention. Embodiment 3 belongs to the same inventive concept as Embodiment 1 and Embodiment 2. The device includes an aperiodic component representative value determination module 11, an aperiodic component spectrum fitting curve generation module 12, and an aperiodic component syllable model building module 13, wherein:

[0137] The aperiodic component representative value determination module 11 is used to decompose the original speech waveform file in the speech database and obtain the aperiodic component spectrum information, fundamental frequency information and vocal tract spectrum information of each syllable in the original speech waveform file; and, according to at least one piece of preset frequency band information divided for each frame of the syllable and t...
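
The idea of one representative value per frequency band per frame can be illustrated as follows. The text is cut off before it states how the representative value is computed, so this sketch uses the arithmetic mean of the bins in each band purely as a stand-in. The names `band_representative_values` and `syllable_band_tracks`, and the representation of a band as a `(start, stop)` pair of bin indices, are assumptions for illustration.

```python
def band_representative_values(frame_spectrum, band_edges):
    """Return one value per band for a single frame.

    Assumption: the representative value is the mean of the aperiodic
    component spectrum bins inside the band; the source does not confirm this.
    band_edges is a list of (start, stop) bin index pairs, stop exclusive.
    """
    values = []
    for start, stop in band_edges:
        bins = frame_spectrum[start:stop]
        if not bins:
            raise ValueError(f"band ({start}, {stop}) selects no bins")
        values.append(sum(bins) / len(bins))
    return values


def syllable_band_tracks(frames, band_edges):
    """Collect per-frame band values into one trajectory per band."""
    per_frame = [band_representative_values(frame, band_edges) for frame in frames]
    # Transpose: rows become bands, columns become frames of the syllable.
    return [list(track) for track in zip(*per_frame)]


if __name__ == "__main__":
    frames = [[1.0, 3.0, 5.0, 7.0], [2.0, 4.0, 6.0, 8.0]]  # two frames, four bins
    print(syllable_band_tracks(frames, [(0, 2), (2, 4)]))
```

Arranging the output as one trajectory per band is a convenience: each trajectory is the sequence over the syllable's frames on which a per-band curve would be fitted.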



Abstract

The invention discloses a method and device for building a non-periodic component syllable model and for speech synthesis. In the method, each frame of each syllable in an original speech waveform file is divided into frequency bands, and a non-periodic component representative value is obtained for each band. From these values, a non-periodic component spectrum fitting curve of each syllable on each selected frequency band is obtained by a discrete cosine transform method, and a non-periodic component syllable model is generated that contains the fitting curves of all syllables of the original speech waveform file on the different frequency bands. The data in the syllable model, originally (number of frequency bands) × (number of syllable frames) values, is thereby converted into one fitting curve per frequency band, which reduces the scale of the speech model and saves system resources. Because a fitting curve is built for each syllable, the continuity between the frames of the syllable is fully considered, the original tone quality of the syllable is preserved by the fitting curves, and the quality of the synthesized speech is improved.
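
The reduction from per-frame values to a per-band fitting curve can be sketched with a truncated discrete cosine transform. This is an illustrative sketch under stated assumptions, not the patent's implementation: the abstract names the DCT but gives no normalization or coefficient count, so the scaling below, the number of retained coefficients `num_coeffs`, and the function names `dct_fit` and `dct_curve` are all hypothetical.

```python
import math


def dct_fit(track, num_coeffs):
    """Fit one band trajectory with the first num_coeffs DCT-II coefficients.

    Scaling is chosen (as an assumption) so that coefficient 0 equals the
    mean of the track and dct_curve inverts the transform exactly when
    num_coeffs equals the number of frames.
    """
    n = len(track)
    if not 1 <= num_coeffs <= n:
        raise ValueError("num_coeffs must be between 1 and the frame count")
    coeffs = []
    for k in range(num_coeffs):
        weight = (1.0 if k == 0 else 2.0) / n
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(track))
        coeffs.append(weight * total)
    return coeffs


def dct_curve(coeffs, num_frames):
    """Evaluate the fitted curve at the centre of each of num_frames frames."""
    return [sum(c * math.cos(math.pi * (i + 0.5) * k / num_frames)
                for k, c in enumerate(coeffs))
            for i in range(num_frames)]


if __name__ == "__main__":
    track = [0.20, 0.35, 0.50, 0.55, 0.50, 0.30]  # one band, six frames (made-up values)
    coeffs = dct_fit(track, 3)                     # keep three coefficients
    print(coeffs)
    print(dct_curve(coeffs, len(track)))           # smoothed trajectory
```

With a fixed number of coefficients per band, the storage per syllable no longer grows with the syllable's frame count, and the cosine basis yields a smooth curve across frames rather than independent per-frame values.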

Description

Technical field

[0001] The invention relates to the technical field of speech processing, in particular to a method and device for establishing a non-periodic component syllable model and for speech synthesis.

Background art

[0002] Speech synthesis technology refers to the technology of generating artificial speech by mechanical and electronic means. An example is TTS (Text To Speech) technology, which converts text information into speech information and plays the converted speech through a playback device.

[0003] The premise of speech synthesis is the analysis of speech information. Speech analysis methods include direct waveform analysis and speech parametric analysis. At present, the more common method is speech parametric analysis, which refers to a method for analyzing the extracted voice parameters, w...


Application Information

IPC(8): G10L13/02, G10L13/04
Inventor: 王朝民, 刘琨, 焦伟
Owner: CHINA MOBILE COMM GRP CO LTD