Synthesizing method of personalized singing voice

A speech synthesis method, applied in speech synthesis, speech analysis, instruments, etc., that addresses the high difficulty of recording singing libraries, the difficulty of controlling singing, and the lack of applied research, and that improves scalability and entertainment value.

Active Publication Date: 2008-11-19
IFLYTEK CO LTD
Cites: 0; cited by: 57

AI Technical Summary

Problems solved by technology

[0002] In recent years, speech synthesis technology has made great progress. Because the sound quality and naturalness of synthesized speech are now good, users place further demands on synthesis systems, such as diversified speech synthesis covering multiple speakers, multiple pronunciation styles, and multiple languages. As a result, model adaptation techniques built on trainable speech synthesis have become more and more widely used. Model adaptation achieves good results when synthesizing reading-style speech, but there has not been enough applied research on synthesizing singing-style speech.
[0003] Research institutions have also studied the synthesis of singing-style speech. The main approach borrows from reading-style speech synthesis: first record a large-scale singing-style library, then use a trainable speech synthesis method to synthesize singing-style speech. This approach can produce fairly natural and realistic output, but because singing is difficult to control, recording a singing-style library is much harder than recording a reading-style library. Moreover, synthesizing the singing voice of another speaker requires recording another singing-style library for that speaker. For most ordinary people, recording such a large-scale singing-style library is practically impossible.

Examples

Embodiment Construction

[0022] See the attached Figures 1 and 2.

[0023] 1. Trainable speech synthesis. This invention is based on a trainable speech synthesis method. During the training phase, the method uses hidden Markov models (HMMs) to model three kinds of parameters of the speech signal: fundamental frequency, duration, and line spectral frequency (LSF) coefficients. All models are trained on a speech library, which generally contains about 1000 sentences (1.5 to 2 hours of recordings), and training yields HMMs for the three kinds of parameters. In the synthesis stage, text analysis of the input text produces context-related attributes; according to these attributes, decisions are made on the clustering decision trees for duration, fundamental frequency, and spectral parameters respectively, and the corresponding model sequence is obt...
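The decision-tree step described in paragraph [0023] can be illustrated with a small sketch. The paragraph is truncated in this listing, so the code below is not the patent's procedure; it only shows the general idea of walking a clustering decision tree with yes/no questions about context attributes until a leaf model is reached. The attribute names (`phone_type`, `position`), the questions, and the model identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

# A context is a set of attributes produced by text analysis.
# The attribute names used here are illustrative assumptions.
Context = Dict[str, str]


@dataclass
class Leaf:
    model_id: str  # hypothetical identifier of a clustered HMM state model


@dataclass
class Node:
    question: Callable[[Context], bool]  # yes/no question about the context
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def select_model(tree: Union[Node, Leaf], context: Context) -> str:
    """Walk the tree from the root, answering one question per node."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if node.question(context) else node.no
    return node.model_id


def model_sequence(tree: Union[Node, Leaf], contexts: List[Context]) -> List[str]:
    """Select one leaf model per unit of the input, in order."""
    return [select_model(tree, c) for c in contexts]


# Toy duration tree: two questions, three leaves (all names are made up).
duration_tree = Node(
    question=lambda c: c["phone_type"] == "final",
    yes=Node(
        question=lambda c: c["position"] == "phrase_end",
        yes=Leaf("dur_final_long"),
        no=Leaf("dur_final_mid"),
    ),
    no=Leaf("dur_initial"),
)
```

In a system of the kind the paragraph describes, one such tree would exist for each of duration, fundamental frequency, and spectral parameters, and each would be queried with the same context attributes.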

Abstract

The invention relates to a personalized singing voice synthesis method comprising the following steps: building a model of the line spectral frequency (LSF) coefficients of speech and obtaining the associated decision tree model through training; recording the reading-style speech of a specific user to obtain a personalized LSF coefficient model for that user; obtaining the context-related attribute set of the lyrics of a numbered musical score, and predicting the spectral parameters and the durations of the initials and finals corresponding to the lyrics according to the decision tree model and the personalized LSF coefficient model; building fundamental frequency data according to the numbered musical score and combining it with the duration and spectral parameters to obtain the synthesis parameters; and inputting these parameters into a parametric speech synthesis vocoder to synthesize the personalized singing voice. The method can synthesize singing-style speech by adjusting only a few prosodic parameters, and needs only a small recorded reading-style library.
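The step of building fundamental frequency data from the notes can be sketched as follows. This is a minimal illustration, not the patent's algorithm: it assumes the notes are written as scale degrees 1 to 7 of a major scale (0 for a rest), that pitch follows twelve-tone equal temperament, and that the tonic frequency, beat length, and 5 ms frame shift are free parameters chosen here for the example. Function names are hypothetical.

```python
from typing import List, Tuple

# Semitone offsets of major-scale degrees 1..7 above the tonic.
MAJOR_SCALE_SEMITONES = {1: 0, 2: 2, 3: 4, 4: 5, 5: 7, 6: 9, 7: 11}

# A note is (degree, octave_shift, length_in_beats); degree 0 is a rest.
Note = Tuple[int, int, float]


def note_to_hz(degree: int, octave: int = 0, tonic_hz: float = 261.63) -> float:
    """Convert a scale degree to a frequency in Hz (equal temperament).

    tonic_hz defaults to roughly middle C, which is an assumption of
    this sketch. A rest returns 0.0, treated here as unvoiced.
    """
    if degree == 0:
        return 0.0
    semitones = MAJOR_SCALE_SEMITONES[degree] + 12 * octave
    return tonic_hz * 2.0 ** (semitones / 12.0)


def build_f0_contour(
    notes: List[Note],
    tonic_hz: float = 261.63,
    beat_seconds: float = 0.5,   # assumed tempo: 120 beats per minute
    frame_shift: float = 0.005,  # assumed 5 ms analysis frame shift
) -> List[float]:
    """Return one F0 value per frame, flat within each note."""
    contour: List[float] = []
    for degree, octave, beats in notes:
        n_frames = round(beats * beat_seconds / frame_shift)
        contour.extend([note_to_hz(degree, octave, tonic_hz)] * n_frames)
    return contour
```

A frame-level contour like this would then have to be aligned with the predicted initial and final durations and paired with the spectral parameters before being passed to a vocoder; how the abstract's method performs that alignment is not specified in this listing, so it is left out of the sketch.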

Description

Technical field

[0001] The invention relates to a speech synthesis method, in particular to a method for synthesizing singing-style speech of a target speaker using only a small amount of reading-style training data.

Background art

[0002] In recent years, speech synthesis technology has made great progress. Because the sound quality and naturalness of synthesized speech are now good, users place further demands on synthesis systems, such as diversified speech synthesis covering multiple speakers, multiple pronunciation styles, and multiple languages. As a result, model adaptation techniques built on trainable speech synthesis have become more and more widely used. Model adaptation achieves good results when synthesizing reading-style speech, but there has not been enough applied research on synthesizing singing-style speech.

[0003] In addition, in order to synthesize speech with singing style, r...

Application Information

IPC(8): G10L13/02, G10L13/04, G10L13/08, G10L13/10
Inventors: 王玉平, 江源, 凌震华, 胡国平, 胡郁, 刘庆峰, 王仁华
Owner: IFLYTEK CO LTD