Phonetic segmentation method and device for speech synthesis

A phone segmentation technology for speech synthesis, applicable to speech synthesis, speech analysis, and related fields. It addresses cross-phone segmentation errors and the resulting degradation of speech synthesis system performance, improving segmentation accuracy and acoustic model reliability so that synthesized speech is smoother and more natural.

Active Publication Date: 2016-02-17
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

More seriously, it is likely to cause cross-phone segmentation errors, that is, the speech segment segmented for a cert

Examples

Embodiment Construction

[0019] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0020] The phone segmentation method and device for speech synthesis according to the embodiments of the present invention will be described below with reference to the accompanying drawings.

[0021] Figure 2 is a flowchart of a phone segmentation method for speech synthesis according to an embodiment of the present invention.

[0022] As shown in Figure 2, the phone segmentation method for speech synthesis may include:

[0023] S1. Obtain the corpus text and convert the corpus text into a pinyin sequence.
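Step S1 can be illustrated with a minimal sketch that maps text to toned pinyin syllables and then splits each syllable into an initial and a final, the phone units mentioned in the background section. Everything here is an assumption made for illustration: `TOY_LEXICON` is a hypothetical four-character dictionary, the function names are invented, and treating `y` and `w` as initials is a simplification. The source does not specify how the conversion is actually carried out.

```python
# Hypothetical toy lexicon (assumption): a real system would use a full
# pronunciation dictionary with polyphone disambiguation.
TOY_LEXICON = {"你": "ni3", "好": "hao3", "世": "shi4", "界": "jie4"}

# Mandarin initials; two-letter initials come first so that "zh", "ch", "sh"
# are matched before "z", "c", "s". Treating "y" and "w" as initials is a
# simplification made for this sketch.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def text_to_pinyin(text):
    """Look up each character; characters missing from the lexicon are skipped."""
    return [TOY_LEXICON[ch] for ch in text if ch in TOY_LEXICON]


def split_syllable(syllable):
    """Split a toned pinyin syllable into (initial, final); initial may be ''."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable


def pinyin_to_phones(pinyin_seq):
    """Flatten a pinyin sequence into a phone sequence of initials and finals."""
    phones = []
    for syllable in pinyin_seq:
        initial, final = split_syllable(syllable)
        if initial:
            phones.append(initial)
        phones.append(final)
    return phones


if __name__ == "__main__":
    print(pinyin_to_phones(text_to_pinyin("你好世界")))
```

Keeping the tone digit attached to the final is one common convention; a system that models tone separately would strip it at this point instead.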

[0...

Abstract

The invention discloses a phonetic segmentation method and device for speech synthesis. The method comprises: acquiring corpus text and converting it into a pinyin sequence, wherein the pinyin sequence comprises a plurality of phones, each having multiple states; segmenting the speech data corresponding to the corpus text into multiple speech frames and acquiring the acoustic features of the speech frames; for each state, clustering the speech frames according to the acoustic features and generating multiple nodes corresponding to the state; and calculating an optimal path corresponding to the pinyin sequence based on a dynamic programming algorithm and a two-dimensional state network, then segmenting the pinyin sequence according to the optimal path. The method and device of the embodiments improve the accuracy of pinyin sequence segmentation, which improves the reliability of the speech synthesis acoustic model and makes the speech produced by text-to-speech conversion smoother and more natural.
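The dynamic programming step can be sketched in a simplified form. The code below is a generic Viterbi-style forced alignment over a frame-by-state cost grid, not the patent's exact procedure: it omits the per-state clustering into multiple nodes, and the cost matrix, the stay-or-advance transition rule, and the function names are assumptions made for illustration. `cost[t][s]` stands for how poorly frame `t` fits state `s`, for example a distance between the frame's acoustic features and a state model.

```python
import math


def align_frames_to_states(cost):
    """Return the state index assigned to each frame on the cheapest path.

    The path starts in state 0, ends in the last state, and at each frame
    either stays in the current state or advances by exactly one state.
    """
    num_frames, num_states = len(cost), len(cost[0])
    if num_frames < num_states:
        raise ValueError("need at least one frame per state")

    best = [[math.inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]
    best[0][0] = cost[0][0]

    for t in range(1, num_frames):
        for s in range(num_states):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else math.inf
            if advance < stay:
                best[t][s] = advance + cost[t][s]
                back[t][s] = s - 1
            else:
                best[t][s] = stay + cost[t][s]
                back[t][s] = s

    # Trace back from the final state at the last frame.
    path = [num_states - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def state_boundaries(path):
    """Frame indices at which the assigned state changes."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


if __name__ == "__main__":
    # Five frames, two states: the first three frames fit state 0 best.
    toy_cost = [[0, 5], [0, 5], [0, 5], [5, 0], [5, 0]]
    path = align_frames_to_states(toy_cost)
    print(path, state_boundaries(path))
```

Phone boundaries follow from the state boundaries once each state index is mapped back to the phone that owns it; multiplying a boundary frame index by the frame shift gives its time in the recording.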

Description

Technical Field

[0001] The invention relates to the technical field of text-to-speech conversion, in particular to a phonetic segmentation method and device for speech synthesis.

Background

[0002] Speech synthesis, also known as text-to-speech technology, converts text information into speech and reads it aloud. The main indicators for evaluating the performance of a speech synthesis system are intelligibility and fluency. Existing speech synthesis systems are basically mature in terms of intelligibility, but their fluency still falls short of real human pronunciation. To synthesize more fluent and natural speech, phones (such as initials and finals) must be segmented with high accuracy. If a phone is segmented incorrectly, the established acoustic model may be unreliable, which in turn may result in wrong speech fragments being obtained when synthesizing speec...

Application Information

IPC(8): G10L13/08
CPC: G10L13/08
Inventors: 张辉, 李秀林
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD