A Singing Voice Separation Method Using Melody Extraction and Speech Synthesis Technology

A singing voice separation technology based on speech synthesis, applicable to speech synthesis, speech analysis, instruments, and related fields. It addresses the poor separation quality of singing voice and achieves less interference and a high degree of separation from the accompaniment.

Active Publication Date: 2022-03-01

HANGZHOU DIANZI UNIV



Problems solved by technology

[0004] The purpose of the present invention is to provide an efficient, accurate, and clear separation method that makes up for the shortcomings of traditional singing voice separation methods, namely the poor quality of the separated singing voice and the incomplete removal of the non-vocal parts of the audio.


Embodiment Construction

[0098] The method provided by the present invention will be further explained below in conjunction with the accompanying drawings.

[0099] Singing voice separation using melody extraction and speech synthesis technology belongs to the field of music information retrieval. Compared with traditional separation methods, it converts the separation problem into an extraction and synthesis problem, which effectively avoids accompaniment interference. Compared with previous methods, the separated voice is clearer and contains less accompaniment noise, and in most cases the degree of separation is better than that of traditional methods. Using OSCC instead of the traditional MFCC as the audio feature in main melody extraction is better suited to musical melody extraction, because the musical characteristics of the audio can be obtained during feature extraction. Using multiple speaker sound banks to generate an average model after HMM training can have a better effect on speaker monoc...
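The preference for an octave-based feature over a mel-based one can be illustrated with a small sketch. It assumes that OSCC refers to cepstral coefficients computed on an octave (log2) frequency scale, which the text above does not spell out, and it only compares the two frequency scales rather than computing any cepstral coefficients. The function names, the 55 Hz reference, and the 110 Hz starting pitch are hypothetical choices made for illustration.

```python
import math

def hz_to_mel(freq_hz):
    """Widely used mel-scale formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def hz_to_octave(freq_hz, ref_hz=55.0):
    """Octaves above a reference frequency (ref_hz is an arbitrary assumption)."""
    return math.log2(freq_hz / ref_hz)

def octave_step_widths(scale, start_hz=110.0, octaves=4):
    """Width of each successive musical octave (f -> 2f) measured on `scale`."""
    freqs = [start_hz * 2 ** k for k in range(octaves + 1)]
    return [scale(hi) - scale(lo) for lo, hi in zip(freqs, freqs[1:])]

# Every musical octave has the same width on the octave scale,
# while the same octaves grow wider and wider on the mel scale.
octave_widths = octave_step_widths(hz_to_octave)
mel_widths = octave_step_widths(hz_to_mel)
```

On the octave scale a melody transposed by an octave keeps the same spacing between its notes, which is one plausible reading of why an octave-based feature suits melody extraction.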


Abstract

The invention discloses a singing voice separation method using melody extraction and speech synthesis technology. The method performs main melody extraction on the given input audio to obtain the main melody and note durations of the song; builds a speaker voice library and performs HMM training to obtain a feature model of the speaker's sound library; extracts pronunciation features from the input audio and completes the speaker timbre conversion; and uses TTS technology to generate speech from the given lyrics text, combining it with the extracted melody and note durations to synthesize the final vocal audio. The human voice is thus simulated by synthesized speech. Compared with traditional singing voice separation methods, the accompaniment information remaining after separation can be greatly reduced.
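As a rough illustration of what "main melody and note durations" could look like as data, the sketch below groups a frame-wise pitch contour into notes. It is not the patent's extraction algorithm: the input contour, the 10 ms hop, the rounding to the nearest semitone, and the names `hz_to_midi` and `contour_to_notes` are all assumptions made for this example.

```python
import math

def hz_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency (A4 = 440 Hz = MIDI 69)."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))

def contour_to_notes(f0_hz, hop_seconds=0.01):
    """Group a frame-wise pitch contour into (midi_note, start_s, duration_s).

    Frames with f0 <= 0 are treated as unvoiced and end the current note.
    hop_seconds is the assumed time step between frames.
    """
    notes = []
    current = None  # MIDI note of the run in progress, or None when unvoiced
    start = 0       # frame index where the current run began
    # A trailing unvoiced sentinel frame flushes the last note.
    for i, f0 in enumerate(list(f0_hz) + [0.0]):
        pitch = hz_to_midi(f0) if f0 > 0 else None
        if pitch != current:
            if current is not None:
                notes.append((current, start * hop_seconds, (i - start) * hop_seconds))
            current, start = pitch, i
    return notes

# Hypothetical contour: 50 frames of A4, a 10-frame rest, 30 frames of C5.
contour = [440.0] * 50 + [0.0] * 10 + [523.25] * 30
notes = contour_to_notes(contour)
```

A note list of this kind is the sort of melody and duration information that a synthesis stage could use to time and pitch the speech generated from the lyrics.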

Description

Technical field

[0001] The invention relates to the field of music information retrieval, and in particular to a singing voice separation method using melody extraction and speech synthesis technology.

Background technique

[0002] Singing voice separation is a technique for separating the vocal part of a song from the instrumental accompaniment. With singing voice separation, a complex song recording can be reduced to pure vocal audio, which makes it a front-end technology for applications such as lyrics alignment, vocal recognition, and high-level semantic analysis of songs. For multi-channel songs, traditional singing voice separation methods mostly rely on differences in the spatial position of the audio signals. They take advantage of the fact that the song producer places the vocal track in the center channel during production, and combine the left- and right-channel audio of the song to calculate the audio in...
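The channel-based idea described in the background can be sketched with a simple mid/side split. This is a generic illustration of combining left and right channels, not necessarily the exact calculation the truncated text goes on to describe. The signals, the sample rate, and the name `mid_side` are hypothetical.

```python
import math

def mid_side(left, right):
    """Split stereo samples into mid (L+R)/2 and side (L-R)/2 signals."""
    if len(left) != len(right):
        raise ValueError("left and right channels must have the same length")
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

# Hypothetical mix: a vocal panned to the center (identical in both channels)
# and a guitar that appears only in the left channel.
sample_rate = 8000
n = 800
vocal = [math.sin(2 * math.pi * 220.0 * t / sample_rate) for t in range(n)]
guitar = [0.5 * math.sin(2 * math.pi * 330.0 * t / sample_rate) for t in range(n)]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

mid, side = mid_side(left, right)
# side contains only the guitar (at half amplitude): the center vocal cancels.
# mid contains the vocal plus half of the guitar, so accompaniment still leaks in.
```

In this toy case the center-panned vocal cancels in the side signal, but the mid signal still carries part of the accompaniment, which is consistent with the residual accompaniment that the invention aims to reduce.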


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L25/48, G10L25/30, G10L25/24, G10L13/08

CPC: G10L25/48, G10L25/24, G10L25/30, G10L13/08

Inventor: 郑杰文, 鄢腊梅, 张祥泉, 蒋小花, 郭渝慧, 袁友伟

Owner: HANGZHOU DIANZI UNIV
