Many-to-many speaker conversion method based on SE-ResNet STARGAN
A speaker conversion method, applicable to speech analysis, speech recognition, and related fields, that addresses problems such as network degradation by strengthening useful features, improving feature extraction, and enhancing the representation capability of the model.
Embodiment Construction
[0042] In convolutional neural networks, convolution kernels capture local spatial relationships in the form of feature maps, but the features of different channels are then used with equal weights. As a result, globally irrelevant features propagate through the network and reduce accuracy. To solve this problem, the present invention builds an SE-ResNet network by adding SE-Net (Squeeze-and-Excitation Networks) to ResNet. The network models the interdependencies between channel features and introduces an attention idea and a gating mechanism to recalibrate the channel features output by the convolutional network, emphasizing useful features and suppressing useless ones. While effectively solving the problem of network degradation, this further enhances the representation ability of the model, thereby improving the quality of the spectrum generated by the decoding network. The present invention proposes a voice method ba...
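The channel recalibration described in paragraph [0042] can be illustrated with a minimal, dependency-free sketch of a generic squeeze-and-excitation residual block. This is not the patent's network: the layer sizes, the reduction ratio, and the placement of the block inside the STARGAN generator are not given in the text above, and the names `squeeze`, `excite`, `se_residual_block`, and `residual_branch` are hypothetical. The sketch assumes a feature map stored as nested lists of shape channels × height × width, a ReLU bottleneck, and a sigmoid gate.

```python
import math


def squeeze(feature_maps):
    """Squeeze step: global average pooling, one scalar per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


def dense(vec, weights, bias):
    """Fully connected layer; weights is a list of rows (outputs x inputs)."""
    return [
        sum(w * v for w, v in zip(row, vec)) + b
        for row, b in zip(weights, bias)
    ]


def excite(z, w1, b1, w2, b2):
    """Excitation step: bottleneck FC + ReLU, then FC + sigmoid gate per channel."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]


def se_residual_block(x, residual_branch, w1, b1, w2, b2):
    """Apply residual_branch, rescale its channels by the SE gates, add the skip path.

    residual_branch is a stand-in (assumption) for the convolutional layers
    of a ResNet block; it must return a feature map with the same shape as x.
    """
    u = residual_branch(x)
    gates = excite(squeeze(u), w1, b1, w2, b2)
    # Scale: each channel is multiplied by its gate (useful channels kept,
    # less useful channels attenuated).
    scaled = [
        [[g * v for v in row] for row in channel]
        for channel, g in zip(u, gates)
    ]
    # Identity shortcut of the residual block, which counters degradation.
    return [
        [[s + v for s, v in zip(s_row, x_row)] for s_row, x_row in zip(s_ch, x_ch)]
        for s_ch, x_ch in zip(scaled, x)
    ]
```

For a feature map with C channels and a bottleneck of size R, `w1` has shape R × C and `w2` has shape C × R. In a real model these weights are learned, and the block would normally be written with a tensor library rather than nested lists; plain Python is used here only so that the three steps (squeeze, excitation, scale) are visible without extra dependencies.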


