Many-to-many speaker conversion method based on STARGAN and ResNet
A speaker conversion method, applied in fields such as neural learning methods, speech analysis, and instruments, that can solve problems such as network degradation.
Embodiment Construction
[0061] The present invention builds a ResNet between the encoding network and the decoding network of the generator. This effectively addresses network degradation during training, makes it easier for the encoding network to learn semantics, and improves the quality of the spectra generated by the decoding network, thereby improving the naturalness and fluency of the converted speech. BN (batch normalization) standardizes across the different samples in a batch, and standardizing each batch matters for keeping the data distribution consistent. In speech conversion, however, the generated result depends mainly on an individual speech sample instance. The batch-level information that BN captures brings no benefit, and the noise it introduces weakens the independence between instances. The effect of standardizing with BN in the network is therefore not obvious: the naturalness is not greatly improved, and...
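The two ideas in this paragraph can be illustrated with a minimal pure-Python sketch. It is not the patent's network: the function names, the single-channel simplification, and the toy inputs are all assumptions made for illustration, and the excerpt is truncated before it says which normalization replaces BN. The sketch only shows (a) that batch-level statistics make one utterance's normalized features depend on the other samples in the batch, whereas per-sample statistics do not, and (b) that a residual block reduces to the identity when its learned branch outputs zero, which is the usual reason skip connections are credited with easing network degradation.

```python
EPS = 1e-5  # small constant to avoid division by zero (assumed value)


def _standardize(values, mean, var):
    """Shift and scale values using the given mean and variance."""
    scale = (var + EPS) ** 0.5
    return [(v - mean) / scale for v in values]


def _mean_var(values):
    """Return the mean and population variance of a flat list."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def batch_norm(batch):
    """Batch-style normalization (single channel): one mean/variance
    is computed over every value of every sample in the batch."""
    flat = [v for sample in batch for v in sample]
    mean, var = _mean_var(flat)
    return [_standardize(sample, mean, var) for sample in batch]


def per_sample_norm(batch):
    """Per-sample normalization: each sample uses only its own
    mean/variance, so samples do not influence one another."""
    return [_standardize(sample, *_mean_var(sample)) for sample in batch]


def residual_block(x, transform):
    """Skip connection: y = x + F(x), where F is the learned branch."""
    fx = transform(x)
    return [a + b for a, b in zip(x, fx)]


# Hypothetical feature vectors standing in for spectral frames.
utterance = [1.0, 2.0, 3.0]
quiet_companion = [1.5, 2.0, 2.5]
loud_companion = [10.0, 20.0, 30.0]

# Same utterance, different batch mates: batch statistics change its output.
bn_with_quiet = batch_norm([utterance, quiet_companion])[0]
bn_with_loud = batch_norm([utterance, loud_companion])[0]

# Per-sample statistics give the same output regardless of batch mates.
ps_with_quiet = per_sample_norm([utterance, quiet_companion])[0]
ps_with_loud = per_sample_norm([utterance, loud_companion])[0]

# A residual branch that outputs zeros leaves the input unchanged.
identity_out = residual_block(utterance, lambda x: [0.0] * len(x))
```

Here `bn_with_quiet` and `bn_with_loud` differ even though the utterance is the same, while `ps_with_quiet` and `ps_with_loud` are identical, which is one way to read the paragraph's remark that batch statistics weaken the independence between instances.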