Cross-language speech conversion method based on activation guidance and inner convolution

A cross-language speech conversion technology, applicable to speech synthesis, speech analysis, and speech recognition, that improves accuracy and versatility, enables flexible modeling, and increases the similarity of the converted voice to the target speaker's individual characteristics.

Active Publication Date: 2021-12-17
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0007] Technical problem to be solved: the present invention provides a cross-lingual speech conversion method based on activation guidance and inner convolution. The activation guidance adopted by the method effectively extracts the content representation in the speech and addresses the over-smoothing problem in FHVAE, significantly improving the quality of the converted speech. Using inner convolution in place of traditional convolution greatly reduces the number of parameters and the amount of computation in the model, improving the operating efficiency of the algorithm. Together, these realize high-quality cross-lingual speech conversion for arbitrary speakers in the open-set case.
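The parameter saving claimed for inner convolution can be illustrated with a small sketch. It assumes that "inner convolution" behaves like the involution operator, in which a small bottleneck generates a kernel for every frame from the features at that frame, and each kernel is shared by all channels in a group. The text above does not spell out the operator, so this reading is an assumption. The function names, the channel count (256), kernel size (5), group count (16), and reduction ratio (4) are hypothetical values chosen for illustration, not figures from the patent.

```python
import numpy as np

def conv1d_params(channels, kernel_size):
    # Standard 1-D convolution (C in, C out, no bias):
    # one K-tap filter for every (input, output) channel pair.
    return channels * channels * kernel_size

def involution1d_params(channels, kernel_size, groups, reduction):
    # Assumed kernel-generating bottleneck: C -> C/r, then C/r -> K*G.
    hidden = channels // reduction
    return channels * hidden + hidden * kernel_size * groups

def involution1d(x, w_reduce, w_span, kernel_size, groups):
    """Apply a 1-D involution-style operator (odd kernel_size assumed).

    x:        (C, T) feature map, C channels over T frames
    w_reduce: (C/r, C) bottleneck weights
    w_span:   (K*G, C/r) weights that emit one K-tap kernel per group
    Returns an array of shape (C, T).
    """
    channels, frames = x.shape
    pad = kernel_size // 2
    # Generate a kernel for every frame from the features at that frame.
    hidden = np.maximum(w_reduce @ x, 0.0)                        # (C/r, T)
    kernels = (w_span @ hidden).reshape(groups, kernel_size, frames)
    # Collect the K neighbouring frames around each position.
    x_pad = np.pad(x, ((0, 0), (pad, pad)))
    windows = np.stack(
        [x_pad[:, k:k + frames] for k in range(kernel_size)], axis=1
    )                                                             # (C, K, T)
    windows = windows.reshape(groups, channels // groups, kernel_size, frames)
    # Each group of channels shares the kernel generated for its frame.
    out = (windows * kernels[:, None, :, :]).sum(axis=2)          # (G, C/G, T)
    return out.reshape(channels, frames)

print(conv1d_params(256, 5))               # 327680
print(involution1d_params(256, 5, 16, 4))  # 21504
```

With these illustrative settings the kernel-generating weights total 21,504 values against 327,680 for a standard convolution of the same width, roughly a 15x reduction. The actual saving in the patented model depends on layer sizes that are not given here.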



Examples


Embodiment Construction

[0060] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0061] The present invention proposes a cross-language speech conversion method based on activation guidance and inner convolution, comprising a training phase and a conversion phase. The training phase obtains the conversion network and the parameters required for voice conversion, while the conversion phase converts the speaker-identity information of the source speaker's voice into the speaker-identity information of the target speaker's...


Abstract

The invention discloses a cross-language speech conversion method based on activation guidance and inner convolution, comprising a training stage and a conversion stage. The speech conversion model consists of an encoder and a decoder. First, inner convolution replaces traditional convolution in both the encoder and the decoder, which greatly reduces the parameter count and computation of the model and improves the operating efficiency of the algorithm. Second, activation guidance in the encoder extracts the content information from the source speaker's utterance, while the target speaker's individual characteristics are passed from the encoder to the decoder through a U-shaped connection; the decoder then reconstructs speech from these characteristics and the source content information, achieving high-quality cross-language speech conversion. The method can also convert speakers who are not in the training set, that is, it completes cross-language conversion for arbitrary speakers under open-set conditions.
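The encoder-to-decoder data flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented network: it assumes that activation guidance means squashing the normalized encoder features with a sigmoid so that they act as a content bottleneck, and that the U-shaped connection carries per-channel mean and standard deviation (removed by instance normalization in the encoder) to the decoder, which re-applies them in the style of adaptive instance normalization. Single weight matrices stand in for the encoder and decoder layers, and all function and variable names are hypothetical.

```python
import numpy as np

def instance_norm(h, eps=1e-5):
    # Normalize each channel over time and return the removed statistics,
    # which are assumed here to carry the speaker characteristics.
    mean = h.mean(axis=1, keepdims=True)
    std = h.std(axis=1, keepdims=True) + eps
    return (h - mean) / std, mean, std

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(mel, w_enc):
    # mel: (n_mels, T) spectrogram; w_enc: (C, n_mels) stand-in encoder.
    h = w_enc @ mel
    normed, mean, std = instance_norm(h)
    # Assumed activation guidance: a sigmoid bounds the content features.
    content = sigmoid(normed)
    return content, (mean, std)

def decode(content, speaker_stats, w_dec):
    # Re-inject the speaker statistics received over the skip connection.
    mean, std = speaker_stats
    normed, _, _ = instance_norm(content)
    h = normed * std + mean
    return w_dec @ h

def convert(source_mel, target_mel, w_enc, w_dec):
    # Content comes from the source utterance, statistics from the target.
    content, _ = encode(source_mel, w_enc)
    _, target_stats = encode(target_mel, w_enc)
    return decode(content, target_stats, w_dec)
```

Because the statistics are averaged over time, the source and target utterances may have different lengths and different texts. Passing the same utterance as both source and target gives a self-reconstruction setup, which is one plausible form for the training stage, although the text here does not state the training objective.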

Description

Technical field

[0001] The invention relates to the technical field of voice conversion, in particular to a cross-lingual voice conversion method based on activation guidance and inner convolution.

Background

[0002] Speech conversion is an important research branch in the field of speech signal processing. Given the source speaker's speech and the target speaker's speech, the task is to generate speech that carries the content of the source speaker's utterance and the individual characteristics of the target speaker. Traditional speech conversion addresses same-language conversion, which requires the source and target speakers to use the same language; cross-language speech conversion removes this restriction, so the source and target speakers speak different languages and different texts. From another perspective, whether it is traditional same-language speech conversion or cross-language speech conversio...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/02, G10L15/00, G10L15/02, G10L15/06, G10L15/16, G10L19/02, G10L25/24, G10L25/30
CPC: G10L15/005, G10L15/02, G10L15/063, G10L25/24, G10L13/02, G10L15/16, G10L25/30, G10L19/02, Y02T10/40
Inventor: 李燕萍, 戴少梁, 邱祥天
Owner: NANJING UNIV OF POSTS & TELECOMM