Quantitative F0 pattern generation device and method, and model learning device and method for generating F0 pattern

A technology for model generation and pattern-generating devices, applied in speech synthesis, speech analysis, instruments, and similar fields, which addresses problems such as the difficulty of obtaining model parameters.

Status: Inactive
Publication Date: 2016-04-06
NAT INST OF INFORMATION & COMM TECH

AI Technical Summary

Problems solved by technology

In the Fujisaki model, it is difficult to automatically obtain the model parameters from the F0 contours observed in a sound corpus.
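
For context, the Fujisaki model is usually written as a base value in the log-F0 domain plus superposed phrase and accent responses. The following is a minimal Python sketch of that forward model only. The constants `alpha`, `beta`, and `gamma` and the command lists are illustrative assumptions, not values taken from this patent. The sketch shows why the inverse problem is hard: the contour depends on the number, timing, and amplitude of discrete commands, none of which are directly observable.

```python
import math

def phrase_response(t, alpha=3.0):
    """Phrase impulse response: alpha^2 * t * exp(-alpha * t) for t >= 0, else 0."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent_response(t, beta=20.0, gamma=0.9):
    """Accent step response, clipped at the ceiling gamma; 0 for t < 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def fujisaki_log_f0(t, base_hz, phrase_cmds, accent_cmds):
    """ln F0(t) = ln Fb + sum of phrase responses + sum of accent responses."""
    value = math.log(base_hz)
    for onset, amp in phrase_cmds:
        value += amp * phrase_response(t - onset)
    for onset, offset, amp in accent_cmds:
        value += amp * (accent_response(t - onset) - accent_response(t - offset))
    return value

# Hypothetical commands: one phrase command and one accent command.
phrase_cmds = [(0.0, 0.5)]        # (onset time in s, amplitude)
accent_cmds = [(0.3, 0.6, 0.4)]   # (onset in s, offset in s, amplitude)

# Sample the contour every 10 ms for 1.5 s and convert back to Hz.
contour_hz = [
    math.exp(fujisaki_log_f0(i * 0.01, 100.0, phrase_cmds, accent_cmds))
    for i in range(150)
]
```

Estimating `phrase_cmds` and `accent_cmds` from a noisy, partly unvoiced observed contour is the step the text describes as difficult to automate.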



Examples


First Embodiment

[0132]

[0133] Referring to Figure 7, the F0 contour synthesizing unit 359 according to the first embodiment includes: a parameter estimation unit 366 which, given the prosodic word boundaries, takes the continuous F0 contour 132 obtained by smoothing and making continuous the observed F0 contour 130 observed from the plurality of sound signals in the sound corpus, and estimates, according to the principle described above, the target points specifying the phrase component P and the target parameters specifying the tone component A; an F0 contour fitting unit 368 which synthesizes the phrase component P and the tone component A estimated by the parameter estimation unit 366 to generate a fitted F0 contour that fits the continuous F0 contour; an HMM learning unit 369 which performs HMM learning using the fitted F0 contour, in the same manner as in the prior art; and an HMM storage device 370 which stores the learned HMM parameters. The process of synthesizing th...
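
The two data-flow steps in this paragraph (making the observed contour continuous, then building a fitted contour as the sum of a phrase component and a tone component) can be sketched in Python as follows. This is only an illustration under stated assumptions: unvoiced frames are marked with 0, gaps are filled by linear interpolation, smoothing is a moving average, and each component is a piecewise-linear trajectory through hypothetical target points in Hz. The patent's actual estimation principle is not reproduced in this excerpt, so the target points below are supplied by hand rather than estimated.

```python
def make_continuous(f0, window=3):
    """Fill unvoiced frames (value 0) by linear interpolation, then smooth."""
    voiced = [i for i, v in enumerate(f0) if v > 0]
    if not voiced:
        raise ValueError("no voiced frames")
    filled = []
    for i, v in enumerate(f0):
        if v > 0:
            filled.append(float(v))
            continue
        left = max((j for j in voiced if j < i), default=None)
        right = min((j for j in voiced if j > i), default=None)
        if left is None:            # leading gap: hold first voiced value
            filled.append(float(f0[right]))
        elif right is None:         # trailing gap: hold last voiced value
            filled.append(float(f0[left]))
        else:
            w = (i - left) / (right - left)
            filled.append((1 - w) * f0[left] + w * f0[right])
    half = window // 2
    smoothed = []
    for i in range(len(filled)):
        seg = filled[max(0, i - half): i + half + 1]
        smoothed.append(sum(seg) / len(seg))
    return smoothed

def from_targets(targets, n_frames):
    """Piecewise-linear trajectory through (frame, value) target points.

    Assumes distinct frame indices; values are held flat outside the range.
    """
    targets = sorted(targets)
    out = []
    for i in range(n_frames):
        if i <= targets[0][0]:
            out.append(targets[0][1])
        elif i >= targets[-1][0]:
            out.append(targets[-1][1])
        else:
            for (x0, y0), (x1, y1) in zip(targets, targets[1:]):
                if x0 <= i <= x1:
                    out.append(y0 + (y1 - y0) * (i - x0) / (x1 - x0))
                    break
    return out

# Hypothetical observed contour in Hz; 0 marks an unvoiced frame.
observed = [0, 120, 124, 0, 0, 130, 126, 0]
continuous = make_continuous(observed)

# Hand-picked (not estimated) target points for the two components.
phrase = from_targets([(0, 110.0), (7, 100.0)], len(observed))
tone = from_targets([(0, 0.0), (4, 25.0), (7, 10.0)], len(observed))
fitted = [p + a for p, a in zip(phrase, tone)]

# Fit quality: root-mean-square error against the continuous contour.
rmse = (sum((f - c) ** 2 for f, c in zip(fitted, continuous)) / len(fitted)) ** 0.5
```

In a real system, the target points would be chosen to minimize an error such as `rmse`, and the fitted contour (not the raw observation) would serve as HMM training data.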

Second Embodiment

[0156] In the first embodiment, the phrase component P and the tone component A are represented by target points, and the F0 contour is fitted by combining these components. However, the idea of using target points is not limited to the first embodiment. In the second embodiment, the observed F0 contour is separated into a phrase component P, a tone component A, and a micro-prosodic component M by the method described above, and HMM learning is performed separately on the time-varying contour of each component. When generating F0, the learned HMMs are used to obtain the time-varying contours of the phrase component P, the tone component A, and the micro-prosodic component M, and these contours are then synthesized to estimate the F0 contour.
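
The separate-then-recombine structure of the second embodiment can be sketched in Python as below. The decomposition here is an assumption made purely for illustration: a long moving average stands in for the phrase component, a shorter moving average of the remainder stands in for the tone component, and the residual is treated as micro-prosody. The patent's own separation method is not included in this excerpt, and the per-component HMM learning step is omitted.

```python
def moving_average(x, window):
    """Centered moving average with shrinking windows at the edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        seg = x[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out

def decompose(f0, phrase_window=21, tone_window=5):
    """Split a continuous F0 contour into phrase, tone, and micro-prosody parts.

    Assumed stand-in: slow trend = phrase, medium-scale movement = tone,
    remaining fast fluctuation = micro-prosody.
    """
    phrase = moving_average(f0, phrase_window)
    detrended = [v - p for v, p in zip(f0, phrase)]
    tone = moving_average(detrended, tone_window)
    micro = [d - a for d, a in zip(detrended, tone)]
    return phrase, tone, micro

def synthesize(phrase, tone, micro):
    """Recombine the three component contours into a single F0 contour."""
    return [p + a + m for p, a, m in zip(phrase, tone, micro)]

# Hypothetical continuous F0 contour in Hz.
f0 = [120.0 + 10.0 * ((i % 8) / 8.0) - 0.3 * i + (1.0 if i % 2 else -1.0)
      for i in range(40)]

phrase, tone, micro = decompose(f0)
rebuilt = synthesize(phrase, tone, micro)
```

Because the micro-prosody part is defined as the residual, summing the three components reproduces the input exactly. In the embodiment, each component would instead be generated from its own learned HMM before the summation.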

[0157]

[0158] Referring to Figure 9, the sound synthesis system 270 according to this embodiment includes: a model learning unit 280 which learns an HMM used for sound synthesis; and a sound synthesis unit 282 which uses the HMM lear...


Abstract

The aim is to provide an F0 pattern synthesizer using a statistical model in which the correspondence between linguistic information and the F0 pattern is clear while accuracy is maintained. An HMM learning device includes: a parameter estimation unit which represents an F0 pattern (133) fitted to a continuous F0 pattern (132) as the sum of a phrase component and an accent component, and estimates the target points of these components; and HMM learning means for learning an HMM (139) using the fitted F0 pattern as learning data. Alternatively, the continuous F0 pattern (132) may be separated into an accent component (134), a phrase component (136), and a micro-prosody component (138) so that individual HMMs (140, 142, 144) can be learned. An F0 pattern is then obtained by generating the accent component, the phrase component, and the micro-prosody component individually from the HMMs (140, 142, 144) and synthesizing the components using the results of text analysis.

Description

Technical Field

[0001] The present invention relates to sound synthesis technology, and in particular to the synthesis of fundamental frequency contours during sound synthesis.

Background Art

[0002] The time-varying contour of the fundamental frequency of a voice (hereinafter referred to as the "F0 contour") is useful for clarifying sentence divisions, expressing accent positions, and distinguishing words. The F0 contour also plays a large role in conveying non-verbal information such as the emotion accompanying an utterance, and it strongly affects the naturalness of the utterance. In particular, to make the focus of an utterance clear and thereby clarify the structure of the sentence, the sentence must be uttered with appropriate intonation. If the F0 contour is inappropriate, the intelligibility of the synthesized sound is impaired. Therefore, in speech synthesis, how to synthesize a de...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/10
CPC: G10L13/027, G10L13/10, G10L13/086, G10L21/0364, G10L25/18
Inventors: 倪晋富, 志贺芳则
Owner: NAT INST OF INFORMATION & COMM TECH