Singing separation method using melody extraction and speech synthesis technology

A singing separation method that combines melody extraction with speech synthesis. It is applied in speech synthesis, speech analysis, instruments, and related fields, and addresses the poor separation quality of singing voices.

Active Publication Date: 2019-12-20
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide an efficient, accurate, and clear separation method. It targets the poor quality of current singing voice separation, in particular the non-vocal parts that remain in the separated audio, and so makes up for the shortcomings of traditional singing voice separation methods.

Embodiment Construction

[0098] The method provided by the present invention will be further explained below in conjunction with the accompanying drawings.

[0099] In the field of music information retrieval, this method separates singing by using melody extraction and speech synthesis technology. Unlike traditional separation methods, it converts the separation problem into an extraction-and-synthesis problem, which effectively avoids accompaniment interference. Compared with previous methods, the separated voice is clearer and contains less accompaniment noise, and in most cases the degree of separation is better than that of traditional methods. Using OSCC instead of the traditional MFCC as the audio feature for main melody extraction is better suited to music, and the musical characteristics of the audio can be obtained during feature extraction. Using multiple speaker sound banks to generate an average model after HMM training can have a better effect on speaker monoc...

Abstract

The invention discloses a singing separation method using melody extraction and speech synthesis technology. The method comprises the following steps: performing a main melody extraction operation on a given input audio to obtain the main melody and the note durations of a song; establishing a speaker speech library and performing HMM (Hidden Markov Model) training to obtain a feature model of the speaker speech library; extracting the pronunciation characteristics of the input audio and completing speaker timbre conversion; and outputting speech from a given lyric text through TTS (Text-to-Speech) technology, then synthesizing it in combination with the extracted melody, the note durations, and similar information to obtain the final human voice audio. The human voice is simulated through synthetic speech, and compared with traditional singing separation methods, the accompaniment information left after separation can be greatly reduced.
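The abstract does not say how the extracted melody and note durations are handed to the synthesizer. As a minimal sketch of one possible hand-off, the example below expands a list of notes into a frame-level fundamental-frequency (F0) contour. The `Note` type, the MIDI pitch representation, the 5 ms frame length, and both function names are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    """Hypothetical note record; the patent does not define this structure."""
    midi_pitch: int    # assumed pitch representation: MIDI note number
    duration_s: float  # note duration in seconds


def midi_to_hz(midi_pitch: int) -> float:
    """Equal-temperament conversion with A4 (MIDI 69) tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12.0)


def notes_to_f0_contour(notes: List[Note], frame_s: float = 0.005) -> List[float]:
    """Repeat each note's frequency once per synthesis frame it spans."""
    contour: List[float] = []
    for note in notes:
        n_frames = round(note.duration_s / frame_s)
        contour.extend([midi_to_hz(note.midi_pitch)] * n_frames)
    return contour


# Toy melody: A4 for 0.5 s, then B4 and C5 for 0.25 s each.
melody = [Note(69, 0.5), Note(71, 0.25), Note(72, 0.25)]
f0 = notes_to_f0_contour(melody)
```

A contour of this kind is one common way to impose a target pitch on synthesized speech; the actual synthesis step in the patent may use a different representation.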

Description

Technical field

[0001] The invention relates to the field of music information retrieval, in particular to a singing voice separation method using melody extraction and speech synthesis technology.

Background art

[0002] Vocal separation is a technique for separating the vocal portion of a song from the instrumental accompaniment. With singing voice separation, a complex song recording can be reduced to pure human voice audio, which makes it a front-end technology for applications such as lyrics alignment, vocal recognition, and high-level semantic analysis of songs. For multi-channel songs, the traditional singing voice separation method mostly relies on differences in the spatial position of the audio signals. It takes advantage of the fact that song producers place the vocal track in the center channel during production, and combines the left-channel and right-channel audio of the song to calculate the audio in...
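The description of the traditional spatial method is truncated above, so its exact computation is not reproduced here. As a generic illustration of the underlying idea only, the sketch below uses a standard mid/side decomposition of a stereo signal; the function name `mid_side` and the toy sample values are assumptions, not the patent's procedure.

```python
from typing import List, Tuple


def mid_side(left: List[float], right: List[float]) -> Tuple[List[float], List[float]]:
    """Split a stereo signal into mid (L+R)/2 and side (L-R)/2 components."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


# Toy mix: the vocal is panned to the center (identical in both channels),
# while a guitar appears only in the left channel.
vocal = [0.5, -0.25, 0.125, 0.0]
guitar = [0.25, 0.25, -0.5, 0.75]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

mid, side = mid_side(left, right)
```

In this toy mix the center-panned vocal cancels out of `side`, but `mid` still carries half of the guitar signal. Residual accompaniment of this kind is the separation-quality problem the invention sets out to address.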

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/48, G10L25/30, G10L25/24, G10L13/08
CPC: G10L25/48, G10L25/24, G10L25/30, G10L13/08
Inventor: 郑杰文, 鄢腊梅, 张祥泉, 蒋小花, 郭渝慧, 袁友伟
Owner: HANGZHOU DIANZI UNIV