
A Singing Voice Separation Method Using Melody Extraction and Speech Synthesis Technology

A singing voice separation technique based on melody extraction and speech synthesis, applicable to speech synthesis, speech analysis, instruments, and related fields. It addresses the poor quality of existing singing voice separation and achieves less accompaniment interference and a higher degree of separation.

Active Publication Date: 2022-03-01
HANGZHOU DIANZI UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide an efficient, accurate, and clear separation method that addresses the shortcomings of traditional singing voice separation, namely poor separation quality and non-vocal parts of the audio being carried into the separated result.



Examples


Embodiment Construction

[0098] The method provided by the present invention will be further explained below in conjunction with the accompanying drawings.

[0099] In the field of music information retrieval, this method separates the singing voice using melody extraction and speech synthesis technology. Unlike traditional separation methods, it converts the separation problem into an extraction and synthesis problem, which effectively avoids accompaniment interference. Compared with previous methods, the separated voice is clearer and contains less accompaniment noise, and in most cases the degree of separation is better than with traditional methods. Using OSCC instead of the traditional MFCC as the audio feature in main melody extraction is better suited to musical melody, because musical characteristics are captured during feature extraction. Using multiple speaker sound banks to generate an average model after HMM training can have a better effect on speaker monoc...
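The text does not define OSCC. The sketch below assumes it refers to cepstral coefficients computed over an octave-scale filter bank, and only illustrates how such band edges might be laid out so that each band spans a musical interval. The function names `octave_band_edges` and `band_index`, the 55 Hz starting frequency, and the number of octaves are all hypothetical choices, not values taken from the patent.

```python
def octave_band_edges(f_min, n_octaves, bands_per_octave=1):
    """Return band-edge frequencies (Hz) spaced evenly on a log2 (octave) scale.

    Assumption: this layout is only an illustration of an octave-scale
    filter bank; the patent does not specify its OSCC parameters.
    """
    if f_min <= 0 or n_octaves < 1 or bands_per_octave < 1:
        raise ValueError("f_min must be > 0; octave and band counts must be >= 1")
    n_edges = n_octaves * bands_per_octave + 1
    # Each step multiplies the frequency by 2^(1 / bands_per_octave).
    return [f_min * 2.0 ** (i / bands_per_octave) for i in range(n_edges)]


def band_index(freq_hz, edges):
    """Return k such that edges[k] <= freq_hz < edges[k + 1], or None."""
    for k in range(len(edges) - 1):
        if edges[k] <= freq_hz < edges[k + 1]:
            return k
    return None


# Four one-octave bands starting at A1 (55 Hz): 55, 110, 220, 440, 880 Hz.
edges = octave_band_edges(55.0, 4)
```

With this layout, notes one octave apart always fall into adjacent bands, which is one plausible reason an octave-scale feature could suit melody extraction better than a mel-scale one.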



Abstract

The invention discloses a singing voice separation method using melody extraction and speech synthesis technology. The method extracts the main melody from the given input audio to obtain the main melody and note durations of the song; builds a speaker voice library and performs HMM training to obtain a feature model of that library; extracts pronunciation features from the input audio and completes speaker timbre conversion; and uses TTS technology to generate speech from the given lyrics text, combining it with the extracted melody and note durations to synthesize the final vocal audio. Because the human voice is simulated by synthesized speech, the separated result contains far less accompaniment information than with traditional singing voice separation methods.
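As a rough illustration of how an extracted melody and its note durations might be handed to a synthesizer, the sketch below expands a list of notes into a frame-level fundamental-frequency (F0) contour. The note representation (MIDI number plus duration in seconds), the 10 ms hop size, and the function names are assumptions made for this example; the abstract does not specify the data format the method uses.

```python
def midi_to_hz(midi_note):
    """Convert a MIDI note number to frequency in Hz (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


def notes_to_f0_contour(notes, hop_s=0.01):
    """Expand (midi_note, duration_s) pairs into one F0 value per frame.

    A midi_note of None marks a rest and is written as 0.0 Hz (unvoiced).
    hop_s is a hypothetical frame hop, not a value from the patent.
    """
    contour = []
    for midi_note, duration_s in notes:
        n_frames = int(round(duration_s / hop_s))
        f0 = 0.0 if midi_note is None else midi_to_hz(midi_note)
        contour.extend([f0] * n_frames)
    return contour


# A4 for 30 ms, a 20 ms rest, then A5 for 20 ms.
melody = [(69, 0.03), (None, 0.02), (81, 0.02)]
contour = notes_to_f0_contour(melody)
```

A contour like this could then be imposed on the TTS output so that the synthesized voice follows the pitch and timing of the original song.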

Description

Technical field

[0001] The invention relates to the field of music information retrieval, in particular to a singing voice separation method using melody extraction and speech synthesis technology.

Background

[0002] Vocal separation is a technique for separating the vocal portion of a song from the instrumental accompaniment. With singing voice separation, a complex song recording can be reduced to pure vocal audio, which makes it the front-end technology for applications such as lyrics alignment, vocal recognition, and high-level semantic analysis of songs. For multi-channel songs, traditional singing voice separation methods mostly rely on differences in the spatial position of the audio signals. They take advantage of the fact that song producers place the vocal track in the center channel during production, and combine the left and right channel audio of the song, calculate the audio in...
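The center-channel idea in the background can be illustrated with a classic mid/side decomposition of a stereo signal. This is a generic sketch, not the computation the patent goes on to describe (that passage is truncated); the function name `mid_side` and the sample values are made up for the example.

```python
def mid_side(left, right):
    """Split a stereo signal into mid (L+R)/2 and side (L-R)/2 components."""
    if len(left) != len(right):
        raise ValueError("left and right channels must have the same length")
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


# Toy signal: the vocal is identical in both channels (center-panned),
# while the accompaniment differs between left and right.
vocal = [1.0, -2.0, 3.0]
accomp_left = [0.5, 0.25, -0.5]
accomp_right = [-0.5, 0.75, 0.5]

left = [v + a for v, a in zip(vocal, accomp_left)]
right = [v + a for v, a in zip(vocal, accomp_right)]

mid, side = mid_side(left, right)
# side contains no vocal at all; mid contains the vocal plus the average
# of the two accompaniment channels, so some accompaniment always remains.
```

The residual accompaniment in the mid signal is the kind of interference that motivates replacing separation with extraction and synthesis.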


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L25/48, G10L25/30, G10L25/24, G10L13/08
CPC: G10L25/48, G10L25/24, G10L25/30, G10L13/08
Inventors: 郑杰文, 鄢腊梅, 张祥泉, 蒋小花, 郭渝慧, 袁友伟
Owner: HANGZHOU DIANZI UNIV