Voice style conversion method and device, equipment and storage medium

A voice style conversion technology applied in speech analysis, instruments, and related fields. It addresses the problems of a degraded voice-changing effect and low style similarity.

Pending Publication Date: 2020-06-19
GUANGZHOU BAIGUOYUAN INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] At present, the target voice to convert to is usually determined in advance. For each predetermined target voice, a large number of historical source voices are used as training samples to train voice style conversion under that target voice. The training set for voice style conversion therefore contains a large number of trained target voices, so that a source voice can later be converted accurately to a target voice in the training set.

Examples

Embodiment 1

[0028] Figure 1 is a flowchart of a speech style conversion method provided by Embodiment 1 of the present invention. This embodiment applies to changing the voice of any speech into a specific style while leaving the speech content unchanged. The method provided in this embodiment can be performed by the voice style conversion device provided in the embodiments of the present invention. The device can be implemented in software and/or hardware and integrated into the equipment that executes the method, which may be a user terminal configured with any voice-changing application.

[0029] Specifically, referring to Figure 1, the method may include the following steps:

[0030] S110. Acquire a source-style speech, a target-style speech, and an initial conversion speech.

[0031] Specifically, in order to show users various voices in various voice styles, the audio voice changing technology set in the voice changing application is usually used...

Embodiment 2

[0045] Figure 2A is a flowchart of a speech style conversion method provided by Embodiment 2 of the present invention, and Figure 2B is a schematic diagram of the principle behind the various speech losses computed during loss optimization in that method. This embodiment is optimized on the basis of the foregoing embodiments. Specifically, as shown in Figure 2A, this embodiment explains in detail how the speech content loss and the speech style loss are calculated.

[0046] Optionally, as shown in Figure 2A, this embodiment may include the following steps:

[0047] S210. Acquire source-style speech, target-style speech and initial conversion speech.

[0048] S220. Determine the speech content features of the source-style speech, the speech style features of the target-style speech, and the speech content and speech style features of the initial converted speech.

[0049] Specifically, ...
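The visible text does not say how the features or the two losses are computed, so the following is only a minimal sketch under stated assumptions. It assumes each feature is a small channels-by-frames matrix, takes the content loss as a mean squared error between feature maps, and takes the style loss as a mean squared error between Gram matrices, a formulation borrowed from image style transfer rather than from this patent. All function names and toy values are hypothetical.

```python
def gram_matrix(features):
    """Inner products between every pair of channels, averaged over frames."""
    frames = len(features[0])
    return [[sum(a * b for a, b in zip(row_i, row_j)) / frames
             for row_j in features]
            for row_i in features]


def mean_squared_error(a, b):
    """Mean squared difference between two equally shaped matrices."""
    diffs = [(x - y) ** 2
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def content_loss(converted_content, source_content):
    # Assumption: content similarity is measured directly on the feature maps.
    return mean_squared_error(converted_content, source_content)


def style_loss(converted_style, target_style):
    # Assumption: style is summarised by channel correlations (Gram matrix).
    return mean_squared_error(gram_matrix(converted_style),
                              gram_matrix(target_style))


# Toy features (2 channels x 3 frames); real features would come from a
# feature extractor that the visible text does not describe.
source_content = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
target_style = [[2.0, 0.0, 2.0], [1.0, 1.0, 1.0]]
converted = [[1.0, 2.0, 2.0], [0.0, 1.0, 1.0]]

print("content loss:", content_loss(converted, source_content))
print("style loss:", style_loss(converted, target_style))
```

In this sketch the same toy matrix stands in for both the content features and the style features of the initial converted speech; S220 treats these as two separate feature sets.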

Embodiment 3

[0068] Figure 3 is a flowchart of a speech style conversion method provided by Embodiment 3 of the present invention. This embodiment is optimized on the basis of the foregoing embodiments. Specifically, this embodiment explains in detail the process of performing loss optimization on the initial converted speech.

[0069] Optionally, as shown in Figure 3, this embodiment may include the following steps:

[0070] S301. Acquire a source-style speech, a target-style speech, and an initial conversion speech.

[0071] S302. According to the speech content loss between the initial converted speech and the source-style speech and the speech style loss between the initial converted speech and the target-style speech, use the gradient descent algorithm to perform the corresponding gradient optimization on the gradient loss of the initial converted speech, and obtain a new gradient loss.

[0072] Optionally, when performing loss optimization on the initial converte...
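The idea of updating the initial converted speech by gradient descent until a stopping condition holds can be illustrated with a toy sketch. It is not the patent's procedure: it assumes the speech is a short list of numbers, uses a squared error against the source as the content loss, and uses the squared gap between the mean values of the converted and target signals as a stand-in style loss. The names `total_loss`, `loss_gradient`, and `convert`, the weights `alpha` and `beta`, the learning rate, and the tolerance are all hypothetical.

```python
def total_loss(x, source, target, alpha=1.0, beta=1.0):
    """Weighted sum of a toy content loss and a toy style loss."""
    n = len(x)
    content = sum((xi - si) ** 2 for xi, si in zip(x, source)) / n
    style = (sum(x) / n - sum(target) / len(target)) ** 2
    return alpha * content + beta * style


def loss_gradient(x, source, target, alpha=1.0, beta=1.0):
    """Analytic gradient of total_loss with respect to each sample of x."""
    n = len(x)
    mean_gap = sum(x) / n - sum(target) / len(target)
    return [2.0 * alpha * (xi - si) / n + 2.0 * beta * mean_gap / n
            for xi, si in zip(x, source)]


def convert(source, target, initial, lr=0.5, max_steps=500, tol=1e-6):
    """Update the converted signal itself until the loss stops improving."""
    x = list(initial)
    loss = total_loss(x, source, target)
    for _ in range(max_steps):
        grad = loss_gradient(x, source, target)
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
        new_loss = total_loss(x, source, target)
        converged = abs(loss - new_loss) < tol  # assumed stopping condition
        loss = new_loss
        if converged:
            break
    return x, loss


source = [0.0, 1.0, 0.0, -1.0]   # toy source-style signal
target = [2.0, 2.0, 2.0, 2.0]    # toy target-style signal
initial = [0.5, 0.5, 0.5, 0.5]   # toy initial converted signal

converted, final_loss = convert(source, target, initial)
print("initial loss:", total_loss(initial, source, target))
print("final loss:", final_loss)
```

Note that the optimization variable here is the converted signal, not the weights of a model, which is what makes the approach independent of pre-training for a particular target style.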

Abstract

The embodiments of the invention disclose a voice style conversion method, device, equipment, and storage medium. The method comprises: acquiring a source-style voice, a target-style voice, and an initial converted voice; performing loss optimization on the initial converted voice, according to the voice content loss between the initial converted voice and the source-style voice and the voice style loss between the initial converted voice and the target-style voice, to obtain a new initial converted voice; and continuing loss optimization until the new initial converted voice meets a preset loss optimization condition, at which point it is taken as the style-converted voice of the source-style voice under the target style. This technical scheme achieves accurate conversion of the source-style voice into the target style without pre-training voice style conversion for that target style, supports conversion to target voices that were never pre-trained, and improves the comprehensiveness and accuracy of voice style conversion.

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of voice changing, and in particular to a voice style conversion method, device, equipment, and storage medium.

Background

[0002] With the rapid development of intelligent voice technology, audio voice-changing technology has become a popular emerging technology. It aims to convert a source voice into a target voice that has a characteristic voice style while the voice content stays unchanged. For example, an app plays a piece of audio recorded by the user with the voice-changing effect of a specific target.

[0003] At present, the target voice to convert to is usually determined in advance. For each predetermined target voice, a large number of historical source voices are used as training samples to train voice style conversion under that target voice, so that there are a large number of target voices that have been trained in the training set ...

Application Information

IPC(8): G10L21/007, G10L21/013, G10L25/48
CPC: G10L21/007, G10L21/013, G10L25/48, G10L2021/0135
Inventor: 娄帆
Owner: GUANGZHOU BAIGUOYUAN INFORMATION TECH CO LTD