A method, apparatus and device for synthesizing speech and music

A speech and music synthesis technology, applied in the field of speech synthesis, that addresses problems such as unsatisfactory speech stream quality and the inability to meet users' needs for speech and music synthesis.

Inactive Publication Date: 2016-03-09

BEIJING QIYI CENTURY SCI & TECH CO LTD

Cites: 7; Cited by: 2

Problems solved by technology

[0008] However, the above processing mechanism still essentially just joins the waveform data of each isolated user voice to the music waveform data; the user's voice and the music are merely "superimposed" rather than "fused". The quality of the resulting speech stream is therefore still not ideal, and it cannot meet users' growing and changing needs for speech and music synthesis. For example, a user who wants to combine their own recorded speech with music into a piece of rap music cannot do so with the existing technology.

Examples

Embodiment 1

[0108] Referring to Figure 1, which shows a flow chart of the steps of Embodiment 1 of a method for synthesizing speech and music provided by an embodiment of the present invention, the method may specifically include the following steps:

[0109] Step 101, obtain input voice data and background music data;

[0110] In the embodiment of the present invention, voice data can be understood as speech produced by a person without any required regularity, whose pitch and speaking rate may vary freely; background music data can be understood as music data formed by a combination of tones with a certain rhythm and regularity. The "background music data" referred to in the embodiment of the present invention is essentially "music data". The word "background" only emphasizes its role as the background in the synthesis with the voice data and does not imply any particular technical characteristic.

[0111] In a specific implementation, the voice data can be the voice data transmitted by ...

Embodiment 2

[0176] Referring to Figure 4, which shows a flow chart of the steps of Embodiment 2 of a method for synthesizing speech and music provided by an embodiment of the present invention, the method may specifically include the following steps:

[0177] Step 201, obtain input voice data and background music data;

[0178] Step 202, identify, from the voice data, the one or more single characters or words that make it up, and obtain the pitch and duration of each;

[0179] Step 203, obtain the pitch and duration of the background music data;

[0180] Step 204, according to the pitch and duration of the background music data, apply speed-change and/or pitch-shift processing to the pitch and duration of the one or more single characters or words;

[0181] Step 205, perform special effect processing on the voice data after the speed-change and/or pitch-shift processing, the special effect processing including: echo special effect processing, and/or, T-Pain special effec...
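Steps 202 to 204 can be illustrated with a minimal sketch of how a speed-change factor and a pitch-shift amount might be derived for each recognized character or word. The excerpt does not state how speech units are matched to the music, so the one-to-one pairing of units with notes is an assumption, and the names `Unit`, `semitone_shift`, `speed_factor` and `plan_adjustments` are hypothetical. The sketch only computes the adjustment parameters; it does not resample or pitch-shift any audio.

```python
import math
from dataclasses import dataclass


@dataclass
class Unit:
    """A recognized character or word with its measured pitch and duration."""
    text: str
    pitch_hz: float    # estimated fundamental frequency of the unit
    duration_s: float  # spoken length of the unit in seconds


def semitone_shift(source_hz: float, target_hz: float) -> float:
    """Signed number of semitones needed to move source_hz to target_hz."""
    return 12.0 * math.log2(target_hz / source_hz)


def speed_factor(source_s: float, target_s: float) -> float:
    """Playback-rate multiplier: above 1 shortens the unit, below 1 lengthens it."""
    return source_s / target_s


def plan_adjustments(units, note_pitches_hz, note_durations_s):
    """Pair each unit with one note (assumed mapping) and compute its adjustments.

    Returns a list of (text, semitones, speed) tuples.
    """
    plan = []
    for unit, hz, dur in zip(units, note_pitches_hz, note_durations_s):
        plan.append((
            unit.text,
            semitone_shift(unit.pitch_hz, hz),
            speed_factor(unit.duration_s, dur),
        ))
    return plan


if __name__ == "__main__":
    units = [Unit("hello", 220.0, 0.50), Unit("world", 196.0, 0.40)]
    for text, semitones, speed in plan_adjustments(
        units, [440.0, 392.0], [0.25, 0.40]
    ):
        print(f"{text}: shift {semitones:+.2f} semitones, speed x{speed:.2f}")
```

The resulting semitone and speed values would then be passed to whatever time-stretching and pitch-shifting routine an implementation uses; that choice is outside what this excerpt describes.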

Abstract

The embodiment of the invention provides a method, apparatus and device for synthesizing voice and music. The method comprises: obtaining input voice data and background music data; identifying, from the voice data, the one or more single characters or words that make it up, and obtaining the pitch and duration of each; obtaining the pitch and duration of the background music data; applying speed-change processing and/or pitch-shift processing to the pitch and duration of the characters or words according to the pitch and duration of the background music data; and synthesizing the processed voice data with the background music data to form a new audio file. With this method, apparatus and device, independent voice data and music data can be fused together and otherwise monotonous voice data is given a musical quality, so that the quality of the whole voice stream is improved and users' growing and changing requirements for voice and music synthesis are satisfied.
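The final step in the abstract, combining the processed voice data with the background music data, can be sketched as a plain additive mix. The abstract does not say how the two signals are combined, so additive mixing with clipping is an assumption, and the function name `mix` and its gain parameters are hypothetical. Samples are assumed to be mono floats in the range -1.0 to 1.0 at the same sample rate.

```python
def mix(voice, music, voice_gain=1.0, music_gain=0.5):
    """Sum two mono sample sequences and clip the result to [-1.0, 1.0].

    The shorter input is treated as silence once it runs out.
    """
    length = max(len(voice), len(music))
    out = []
    for i in range(length):
        v = voice[i] if i < len(voice) else 0.0
        m = music[i] if i < len(music) else 0.0
        sample = voice_gain * v + music_gain * m
        out.append(max(-1.0, min(1.0, sample)))  # hard clip to avoid overflow
    return out


if __name__ == "__main__":
    processed_voice = [0.5, 0.5]
    background_music = [1.0, 1.0, 1.0]
    print(mix(processed_voice, background_music))
```

The mix is applied only after the voice has been adjusted to the music's pitch and duration, which is what separates this from the simple superimposition the background section criticizes.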

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of speech synthesis, and in particular, relate to a method for synthesizing speech and music, an apparatus for synthesizing speech and music, and a device.

Background technique

[0002] With the development of electronic technology, users often have the requirement of synthesizing voice and music. Waveform splicing technology (or called waveform synthesis technology) is commonly used in the prior art. The development of waveform splicing technology is inseparable from the development of speech encoding and decoding technology, among which the development of LPC technology (linear predictive coding technology) has had a huge impact on waveform splicing technology. LPC technology is essentially a time waveform encoding technology, the purpose of which is to reduce the transmission rate of time domain signals. The advantage of the LPC technique is that it is simple and intuitive. The sy...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L13/033, G10L13/08, G10L25/87, G10L25/90

Inventor: 蒋金峰

Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD
