Singing separation method using melody extraction and speech synthesis technology

A singing voice separation method based on melody extraction and speech synthesis. It is applied in speech synthesis, speech analysis, instruments, etc., and addresses the poor separation quality of existing singing voice separation.

Active Publication Date: 2019-12-20

HANGZHOU DIANZI UNIV

5 Cites, 12 Cited by


Problems solved by technology

[0004] Current singing voice separation is of poor quality, and non-vocal parts of the audio are separated out along with the voice. The purpose of the present invention is to provide an efficient, accurate, and clear separation method that makes up for these shortcomings of traditional singing voice separation methods.


Embodiment Construction

[0098] The method provided by the present invention will be further explained below in conjunction with the accompanying drawings.

[0099] The method performs singing separation using melody extraction and speech synthesis technology, in the field of music information retrieval. Unlike traditional separation methods, it converts the separation problem into an extraction and synthesis problem, which effectively avoids accompaniment interference. Compared with previous methods, the separated voice is clearer and contains less accompaniment noise, and in most cases the degree of separation is better than that of traditional methods. Using OSCC instead of the traditional MFCC as the audio feature in main melody extraction is better suited to music melody extraction, and musical characteristics can be obtained during feature extraction. Using multiple speaker sound banks to generate an average model after HMM training can have a better effect on speaker monoc...
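The text does not expand "OSCC" or say how it is computed. As a rough illustration only, assuming it refers to cepstral features taken on an octave-based frequency scale rather than the mel scale used by MFCC, the sketch below compares how the two scales space musical octaves. The function names and the 55 Hz reference are hypothetical choices for this example, and the mel formula is the common 2595·log10(1 + f/700) variant.

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Common mel-scale mapping (one widely used variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def hz_to_octave(f_hz: float, ref_hz: float = 55.0) -> float:
    """Octaves above a reference frequency (ref_hz is an assumed value)."""
    return math.log2(f_hz / ref_hz)

# Four pitches, each one octave above the previous (A2, A3, A4, A5).
pitches = [110.0, 220.0, 440.0, 880.0]

# Step size between neighbouring octaves on each scale.
mel_steps = [hz_to_mel(b) - hz_to_mel(a) for a, b in zip(pitches, pitches[1:])]
octave_steps = [hz_to_octave(b) - hz_to_octave(a) for a, b in zip(pitches, pitches[1:])]

print("mel steps:   ", [round(s, 1) for s in mel_steps])
print("octave steps:", octave_steps)
```

On the octave scale every doubling of frequency is the same step, so notes an octave apart are spaced evenly, while on the mel scale the step grows with frequency. This is one plausible reading of why an octave-based feature might suit melody extraction; the patent's own reasoning is not shown in this excerpt.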


Abstract

The invention discloses a singing separation method using melody extraction and speech synthesis technology. The method comprises the following steps: performing a main melody extraction operation on a given input audio to obtain the main melody and the note durations of a song; establishing a speaker speech library and performing HMM (Hidden Markov Model) training to obtain a feature model of the speaker speech library; extracting pronunciation characteristics of the input audio and completing speaker timbre conversion; and outputting speech from a given lyric text through TTS (Text-to-Speech) technology, then synthesizing it in combination with the extracted melody, note durations, and other information to obtain the final human voice audio. The human voice is simulated through synthetic speech, and compared with traditional singing separation methods, the accompaniment information remaining after separation can be greatly reduced.
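To make the first step more concrete, here is a minimal sketch of how a frame-level pitch track could be turned into a melody with note durations. It is not the patent's extraction algorithm, which this excerpt does not describe. It assumes a pitch tracker has already produced one fundamental-frequency value per frame (0 for unvoiced frames). `Note`, `hz_to_midi`, and `segment_notes` are hypothetical names for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Note:
    midi: int          # MIDI note number (69 = A4 = 440 Hz)
    start_s: float     # onset time in seconds
    duration_s: float  # note duration in seconds

def hz_to_midi(f_hz: float) -> int:
    """Round a frequency in Hz to the nearest equal-tempered MIDI note."""
    return round(69 + 12 * math.log2(f_hz / 440.0))

def segment_notes(f0_track: List[float], hop_s: float) -> List[Note]:
    """Group consecutive frames with the same rounded pitch into notes.

    f0_track holds one value per frame; 0 marks an unvoiced frame.
    hop_s is the time between frames in seconds.
    """
    notes: List[Note] = []
    current: Optional[int] = None  # pitch of the note being built, if any
    start = 0                      # frame index where that note began
    # A trailing unvoiced sentinel frame closes the final note.
    for i, f in enumerate(list(f0_track) + [0.0]):
        midi = hz_to_midi(f) if f > 0 else None
        if midi != current:
            if current is not None:
                notes.append(Note(current, start * hop_s, (i - start) * hop_s))
            current, start = midi, i
    return notes

# Example: three frames of A4, a pause, then two frames of C5, 10 ms hop.
melody = segment_notes([0, 440.0, 440.0, 440.0, 0, 0, 523.25, 523.25], hop_s=0.01)
for note in melody:
    print(note)
```

A list of notes like this carries the pitch and duration information that the final step would combine with TTS output. Rounding to the nearest semitone is a simplification; a real system would likely smooth the pitch track before segmenting.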

Description

Technical field

[0001] The invention relates to the field of music information retrieval, in particular to a singing voice separation method using melody extraction and speech synthesis technology.

Background technique

[0002] Vocal separation is a technique for separating the vocal portion of a song from the instrumental accompaniment portion. Using singing voice separation, a complex song audio can be separated into pure human voice audio. It is a front-end technology for applications such as lyrics alignment, vocal recognition, and high-level semantic analysis of songs. For multi-channel songs, the traditional singing voice separation method mostly extracts the voice based on differences in the spatial position of the audio signal. It takes advantage of the fact that song producers place the human voice track in the center channel during production, and combines the left and right channel audio of the song to calculate the audio in...
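The description of the traditional approach is cut off above, so the following is only a generic sketch of the center-channel idea, not necessarily the calculation the patent goes on to describe. It assumes the vocal is mixed identically into the left and right channels and uses plain mid/side arithmetic. `mid_side` and the sample values are hypothetical.

```python
from typing import List, Tuple

def mid_side(left: List[float], right: List[float]) -> Tuple[List[float], List[float]]:
    """Split a stereo signal into mid (L+R)/2 and side (L-R)/2 components."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

# Toy signals: the vocal is center-panned, the instruments are not.
vocal = [1.0, -2.0, 0.5]
inst_left = [0.5, 0.25, -1.0]
inst_right = [0.25, -0.5, 0.0]

left = [v + a for v, a in zip(vocal, inst_left)]
right = [v + a for v, a in zip(vocal, inst_right)]

mid, side = mid_side(left, right)
print("side (vocal cancelled):", side)
print("mid (vocal plus leftover accompaniment):", mid)
```

The side signal contains no vocal at all, but the mid signal still carries half of every instrument along with the voice. That residual accompaniment is the kind of interference the invention aims to avoid by synthesizing the voice instead of filtering it out of the mix.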


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L25/48, G10L25/30, G10L25/24, G10L13/08

CPC: G10L25/48, G10L25/24, G10L25/30, G10L13/08

Inventor: 郑杰文, 鄢腊梅, 张祥泉, 蒋小花, 郭渝慧, 袁友伟

Owner: HANGZHOU DIANZI UNIV
