Many-to-many speech conversion system based on VAE and i-vector under non-parallel text conditions
A voice conversion technology for non-parallel text, applied in the field of signal processing, that addresses the problem that the converted voice is not sufficiently similar to the target speaker's individual voice characteristics.
Examples
Embodiment 1
[0027] Referring to Figure 1 and Figure 2, the present embodiment provides a many-to-many speech conversion system based on VAE and i-vector under non-parallel text conditions. The method is divided into two stages, training and conversion:
[0028] 1 Speaker speech training stage
[0029] 1.1 Obtain the training corpus. The speech library used here is VCC2018, which contains 8 source speakers and 4 target speakers. The training corpus is divided into two groups: 4 male speakers and 4 female speakers. For each speaker that undergoes full training, 81 sentences are used as the training corpus and 35 sentences are used as the test corpus for model evaluation;
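The per-speaker split described in step 1.1 can be sketched as follows. This is only an illustration under stated assumptions: the speaker labels (`M1` to `F4`), the file names, and the rule of taking the first 81 sorted utterances for training are hypothetical and are not taken from the patent or from the VCC2018 distribution.

```python
TRAIN_PER_SPEAKER = 81  # training sentences per speaker (from step 1.1)
TEST_PER_SPEAKER = 35   # test sentences per speaker (from step 1.1)


def split_speaker(utterances):
    """Split one speaker's utterances into (train, test) lists.

    ASSUMPTION: utterances are sorted by name and the first 81 are used
    for training; the source does not say how the sentences are chosen.
    """
    ordered = sorted(utterances)
    needed = TRAIN_PER_SPEAKER + TEST_PER_SPEAKER
    if len(ordered) < needed:
        raise ValueError(f"need {needed} utterances, got {len(ordered)}")
    train = ordered[:TRAIN_PER_SPEAKER]
    test = ordered[TRAIN_PER_SPEAKER:needed]
    return train, test


# Hypothetical speaker labels: 4 male and 4 female speakers.
speakers = [f"M{i}" for i in range(1, 5)] + [f"F{i}" for i in range(1, 5)]

# Hypothetical file names, 116 utterances per speaker.
corpus = {spk: [f"{spk}_{n:03d}.wav" for n in range(116)] for spk in speakers}

splits = {spk: split_speaker(utts) for spk, utts in corpus.items()}
```

Keeping the split deterministic makes the training and evaluation sets reproducible across runs.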
[0030] 1.2 Use the WORLD speech analysis and synthesis model to extract the speech features of each frame of the speaker's sentences: the spectral envelope sp', the logarithmic fundamental frequency log f0, and the aperiodicity ap. Then calculate the energy en of each frame of speech and recalculate the spectral envelope, i.e. sp...
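The post-processing in step 1.2 might look roughly like the sketch below, which starts from features that WORLD has already extracted. Because the source text is truncated before it gives the formulas, two points here are assumptions rather than the patent's method: the frame energy en is taken as the sum of the envelope bins, and the recalculated envelope is taken as sp = sp' / en. The function names and the toy input values are hypothetical. The only WORLD-specific fact relied on is that unvoiced frames have f0 equal to 0, so the logarithm is applied to voiced frames only.

```python
import math


def frame_energy(sp_frame):
    """Energy of one frame. ASSUMPTION: sum of the envelope bins."""
    return sum(sp_frame)


def normalize_envelope(sp_prime):
    """Return (en, sp): per-frame energies and normalized envelopes.

    ASSUMPTION: sp = sp' / en. The source is truncated before the actual
    formula, so this is one plausible reading, not the stated method.
    """
    en = [frame_energy(frame) for frame in sp_prime]
    sp = [
        [v / e for v in frame] if e > 0 else list(frame)
        for frame, e in zip(sp_prime, en)
    ]
    return en, sp


def log_f0(f0):
    """Log fundamental frequency; unvoiced frames (f0 == 0) stay at 0.0."""
    return [math.log(v) if v > 0 else 0.0 for v in f0]


# Toy input: 2 frames x 4 frequency bins (hypothetical values).
sp_prime = [[1.0, 2.0, 3.0, 2.0], [0.5, 0.5, 1.0, 2.0]]
f0 = [0.0, 220.0]  # first frame unvoiced, second frame voiced

en, sp = normalize_envelope(sp_prime)
lf0 = log_f0(f0)
```

Separating the energy from the envelope shape in this way would let a model treat loudness and spectral shape as distinct inputs, although whether the patent uses the two quantities that way is not visible in the excerpt.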