Synthesizing method of personalized singing voice

A speech synthesis method, applicable to speech synthesis, speech analysis, instruments, and related fields, that addresses the high difficulty of recording, the difficulty of controlling singing, and the lack of applied research, and that improves scalability and entertainment value.

Active Publication Date: 2008-11-19

IFLYTEK CO LTD

0 Cites, 57 Cited by

Problems solved by technology

[0002] In recent years, speech synthesis technology has made great progress. Because the sound quality and naturalness of synthesized speech are now good, people place more demands on synthesis systems, such as diversified speech synthesis covering multiple speakers, multiple pronunciation styles, and multiple languages. As a result, model adaptation technology, developed on the basis of trainable speech synthesis, has become more and more widely used. Model adaptation achieves good results when synthesizing reading-style speech, but there has not been enough applied research on speech synthesis for the singing style.

[0003] Research institutions have also studied how to synthesize speech with a singing style. The main approach borrows from reading-style speech synthesis: first record a large-scale singing-style corpus, then use a trainable speech synthesis method to synthesize singing-style speech. This approach can produce fairly natural and realistic synthetic speech. However, because singing is hard to control, recording a singing-style corpus is much more difficult than recording a reading-style corpus. Moreover, synthesizing the singing voice of another speaker requires recording another singing-style corpus for that speaker. For most ordinary people, recording such a large-scale singing-style corpus is basically impossible.

Embodiment Construction

[0022] See the attached Figures 1 and 2.

[0023] 1. Trainable speech synthesis. This invention is based on a trainable speech synthesis method. During the training phase, the method uses hidden Markov models (HMMs) to model three kinds of speech-signal parameters: fundamental frequency, duration, and line spectral frequency (LSF) coefficients. All models are trained on a speech corpus, generally about 1,000 sentences (1.5 to 2 hours of recordings), which yields HMMs for the three kinds of parameters. In the synthesis stage, text analysis of the input text produces context-dependent attributes. Based on these attributes, decisions are made on the clustering decision trees for duration, fundamental frequency, and spectral parameters respectively, and the corresponding model sequence is obt...
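The decision-tree step described in [0023] can be pictured with a small sketch. This is an illustration under assumptions, not the patent's implementation: the attribute names (`phone`, `position`), the questions, and the leaf identifiers are all hypothetical, and each leaf simply stands in for a clustered HMM distribution. It shows only how context-dependent attributes select one model per unit to form a model sequence.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

# Hypothetical context representation: attribute name -> value.
Context = Dict[str, str]


@dataclass
class Leaf:
    model_id: str  # stands in for a clustered HMM distribution


@dataclass
class Node:
    question: Callable[[Context], bool]  # yes/no question about the context
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def select_model(tree: Union[Node, Leaf], context: Context) -> str:
    """Answer questions from the root down until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.question(context) else tree.no
    return tree.model_id


def model_sequence(tree: Union[Node, Leaf], contexts: List[Context]) -> List[str]:
    """Pick one model per unit, in order, to form the model sequence."""
    return [select_model(tree, c) for c in contexts]


# Toy duration tree; the questions and leaf names are made up for illustration.
VOWELS = {"a", "o", "e", "i", "u"}
DURATION_TREE = Node(
    question=lambda c: c["phone"] in VOWELS,
    yes=Node(
        question=lambda c: c["position"] == "phrase_final",
        yes=Leaf("dur_vowel_final"),
        no=Leaf("dur_vowel_medial"),
    ),
    no=Leaf("dur_consonant"),
)

contexts = [
    {"phone": "n", "position": "phrase_initial"},
    {"phone": "i", "position": "phrase_medial"},
    {"phone": "a", "position": "phrase_final"},
]
print(model_sequence(DURATION_TREE, contexts))
```

In a real system there would be separate trees for duration, fundamental frequency, and spectral parameters, each built by clustering during training rather than written by hand.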

Abstract

The invention relates to a personalized singing voice synthesis method comprising the following steps: building a model of the line spectral frequency (LSF) coefficients of speech and obtaining the associated decision tree model through training; recording reading-style speech from a specific user to obtain that user's personalized LSF coefficient model; obtaining the set of context-dependent attributes of the lyrics in the numerical notes, and predicting the frequency (spectral) parameters and the durations of the initial consonants and vowels of the lyrics according to the decision tree model and the personalized LSF coefficient model; building fundamental frequency data from the numerical notes and combining it with the duration and frequency parameters to obtain the synthesis parameters; and feeding these parameters into a parametric speech synthesis vocoder to synthesize the personalized singing voice. The method can synthesize singing-style speech by adjusting only a few prosodic parameters, and it requires recording only a small reading-style corpus.
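As a rough illustration of the step that builds fundamental frequency data from the notes, the sketch below turns a note list into a frame-level F0 contour. Everything specific here is an assumption rather than something the abstract states: notes are given as MIDI note numbers with a length in beats, pitch follows 12-tone equal temperament with A4 = 440 Hz, the frame shift is 5 ms, and rests are marked with an F0 of 0. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed note format: (MIDI note number or None for a rest, length in beats).
Note = Tuple[Optional[int], float]


def midi_to_hz(note: int) -> float:
    """Equal-temperament pitch with A4 (MIDI 69) tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def build_f0_contour(
    notes: List[Note],
    tempo_bpm: float = 120.0,      # assumed tempo
    frame_shift_s: float = 0.005,  # assumed 5 ms frame shift
) -> List[float]:
    """Return one F0 value per frame; 0.0 marks a rest (unvoiced frame)."""
    seconds_per_beat = 60.0 / tempo_bpm
    contour: List[float] = []
    for midi_note, beats in notes:
        n_frames = round(beats * seconds_per_beat / frame_shift_s)
        f0 = 0.0 if midi_note is None else midi_to_hz(midi_note)
        contour.extend([f0] * n_frames)
    return contour


# One beat of A4, one beat of rest, two beats of A5.
melody: List[Note] = [(69, 1.0), (None, 1.0), (81, 2.0)]
f0 = build_f0_contour(melody)
print(len(f0), f0[0], f0[-1])
```

A flat per-note contour like this is only a starting point. In the method described, the note-derived F0 is then combined with the predicted durations and spectral parameters before being passed to the vocoder.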

Description

Technical field

[0001] The invention relates to a speech synthesis method, and in particular to a method for synthesizing the singing-style speech of a target speaker using only a small amount of reading-style training data.

Background art

[0002] In recent years, speech synthesis technology has made great progress. Because the sound quality and naturalness of synthesized speech are now good, people place more demands on synthesis systems, such as diversified speech synthesis covering multiple speakers, multiple pronunciation styles, and multiple languages. As a result, model adaptation technology, developed on the basis of trainable speech synthesis, has become more and more widely used. Model adaptation achieves good results when synthesizing reading-style speech, but there has not been enough applied research on speech synthesis for the singing style.

[0003] In addition, in order to synthesize speech with a singing style, r...

Application Information

IPC(8): G10L13/02, G10L13/04, G10L13/08, G10L13/10

Inventor: 王玉平, 江源, 凌震华, 胡国平, 胡郁, 刘庆峰, 王仁华

Owner: IFLYTEK CO LTD
