Multi-scale StarGAN voice conversion method based on shared training
A multi-scale speech conversion technology, applied in speech analysis, speech synthesis, neural learning methods, and related fields. It addresses problems such as the inability to focus feature extraction and limited conversion performance, with the aim of high speech similarity.
Embodiment Construction
[0064] As shown in Figure 1, the method of the present invention is divided into two parts: the training part obtains the parameters and conversion functions required for voice conversion, and the conversion part converts the source speaker's voice into the target speaker's voice.
[0065] The implementation steps of the training phase are:
[0066] 1.1) Obtain a training corpus of non-parallel text. The training corpus contains speech from multiple speakers, including source speakers and target speakers, and is taken from the VCC2018 speech corpus. The training set of this corpus contains 6 male and 6 female speakers, each with 81 utterances. Select 4 source speakers (two male and two female) and 4 target speakers (two male and two female). The 4 source speakers share the same speech content, while the speech content of the 4 target speakers differs from that of the 4 source speakers, so the method is based on no...
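The speaker selection in step 1.1 can be illustrated with a minimal Python sketch. It is not the patent's implementation. The speaker IDs (`SF1`, `TM2`, and so on), the split of the 12 speakers into 8 source and 4 target candidates, and the rule of taking the first two speakers of each gender are all assumptions made for illustration; the text only states that 4 source and 4 target speakers (two male and two female each) are chosen, with 81 utterances per speaker.

```python
from dataclasses import dataclass

# Stated in the text: each speaker in the training set has 81 utterances.
UTTERANCES_PER_SPEAKER = 81


@dataclass(frozen=True)
class Speaker:
    speaker_id: str  # hypothetical ID, e.g. "SF1" (assumption, not from the text)
    role: str        # "source" or "target"
    gender: str      # "F" or "M"


def build_roster():
    """Build a hypothetical 12-speaker roster: 6 female and 6 male.

    The 8-source / 4-target split is an assumption for this sketch.
    """
    roster = []
    for gender in ("F", "M"):
        for i in range(1, 5):
            roster.append(Speaker(f"S{gender}{i}", "source", gender))
        for i in range(1, 3):
            roster.append(Speaker(f"T{gender}{i}", "target", gender))
    return roster


def pick(roster, role, per_gender=2):
    """Pick `per_gender` speakers of each gender for the given role.

    Takes the first speakers in ID order; the text does not say which
    speakers are chosen, so this rule is purely illustrative.
    """
    chosen = []
    for gender in ("F", "M"):
        candidates = sorted(
            (s for s in roster if s.role == role and s.gender == gender),
            key=lambda s: s.speaker_id,
        )
        if len(candidates) < per_gender:
            raise ValueError(f"not enough {gender} {role} speakers")
        chosen.extend(candidates[:per_gender])
    return chosen


def select_speakers(roster):
    """Return (sources, targets): 4 speakers each, two female and two male."""
    return pick(roster, "source"), pick(roster, "target")


if __name__ == "__main__":
    sources, targets = select_speakers(build_roster())
    print("sources:", [s.speaker_id for s in sources])
    print("targets:", [s.speaker_id for s in targets])
    total = (len(sources) + len(targets)) * UTTERANCES_PER_SPEAKER
    print("training utterances:", total)
```

With 8 selected speakers and 81 utterances each, the selected training set in this sketch holds 8 × 81 = 648 utterances.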