Many-to-many speaker conversion method based on STARWGAN-GP and x-vector
A many-to-many speaker conversion method, applicable to neural learning methods, speech analysis, and speech recognition, that addresses problems such as vanishing gradients and unstable GAN training.
Embodiment Construction
[0054] As shown in Figure 1, the high-quality speech conversion method of the present invention has two parts: the training part obtains the parameters and conversion functions required for speech conversion, and the conversion part converts the source speaker's voice into the target speaker's voice.
[0055] The implementation steps of the training phase are:
[0056] 1.1) Obtain a training corpus of non-parallel text. The corpus contains speech from multiple speakers, including the source speaker and the target speaker, and is taken from the VCC2018 speech corpus. The training set of this corpus has 6 male and 6 female speakers, with 81 sentences per speaker. Because the method supports conversion with both parallel and non-parallel text, the training corpus may be non-parallel.
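The corpus layout described in step 1.1 can be illustrated with a minimal Python sketch. It only indexes utterances and draws non-parallel training pairs. The speaker IDs (`M1`, `F1`, and so on), the `corpus` dictionary, and the `sample_pair` helper are hypothetical names chosen for illustration; they are not taken from the patent or from the VCC2018 distribution.

```python
import random
from itertools import permutations

# Figures stated in step 1.1: 6 male and 6 female speakers, 81 sentences each.
NUM_MALE, NUM_FEMALE, SENTENCES_PER_SPEAKER = 6, 6, 81

# Hypothetical speaker IDs; the actual VCC2018 speaker names are not given here.
speakers = [f"M{i}" for i in range(1, NUM_MALE + 1)] + \
           [f"F{i}" for i in range(1, NUM_FEMALE + 1)]

# Hypothetical utterance index: each entry is a (speaker, sentence number) pair.
corpus = {s: [(s, n) for n in range(1, SENTENCES_PER_SPEAKER + 1)] for s in speakers}


def sample_pair(rng):
    """Draw one source utterance and a different target speaker.

    The target speaker does not need to have spoken the same sentence,
    which is what makes the training data non-parallel.
    """
    src, tgt = rng.sample(speakers, 2)  # two distinct speakers
    utterance = rng.choice(corpus[src])
    return utterance, tgt


total_utterances = sum(len(utts) for utts in corpus.values())
conversion_directions = len(list(permutations(speakers, 2)))
```

Under these assumptions the index holds 12 × 81 = 972 utterances, and a many-to-many model over 12 speakers covers 12 × 11 = 132 ordered source-to-target directions.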
[0057] 1.2) Use the WORLD speech analysis/synthesis model to extract from the training corpus the spectral envelope features x, non-period...