A voice special effect synthesis method, device, electronic equipment and storage medium
A synthesis method and special-effect technology, applied in the field of voice processing, that addresses the problem that a single kind of synthesized voice cannot meet the diverse needs for special-effect voices, so that those diverse needs can be met.
Examples
Embodiment 1
[0032] Figure 1 is a flowchart of a speech special effect synthesis method provided in Embodiment 1 of the present invention. This embodiment is applicable to synthesizing special-effect speech according to different special effect requirements. The method can be executed by a speech special effect synthesis device, which can be implemented in software and/or hardware. As shown in Figure 1, the method includes the following operations:
[0033] S110. Acquire text data corresponding to the original speech data, and obtain basic prosodic features and basic acoustic features matching the text data.
[0034] The original voice data may be input manually and is the voice data that needs to be converted into special-effect synthesized speech. The basic prosodic features may be the prosodic features of the original speech data, for example the pitch, stress, word segmentation, and pause features of the phonetic pinyin. The basic ...
Embodiment 2
[0046] Figure 2 is a flowchart of a speech special effect synthesis method provided in Embodiment 2 of the present invention. This embodiment builds on the above embodiments and gives a specific implementation of adjusting the basic prosodic features and/or the basic acoustic features with the feature adjustment parameters to obtain the target prosodic features and/or the target acoustic features. As shown in Figure 2, the method includes the following operations:
[0047] S210. Acquire text data corresponding to the original voice data, and obtain basic prosodic features and basic acoustic features matching the text data.
[0048] S220. Acquire feature adjustment parameters corresponding to required special effects according to the pre-established mapping relationship between at least two special effects and corresponding feature adjustment parameters, where the feature adjustment paramete...
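The lookup in S220 and the adjustment that follows it can be sketched roughly as below. This is a minimal illustration, not the patented implementation: the effect names, the `FeatureAdjustment` fields, the scale values, and the choice of multiplicative scaling are all assumptions made for the example. The optional fields model the "and/or" in the text, so an effect may adjust prosodic features, acoustic features, or both.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FeatureAdjustment:
    """Hypothetical feature adjustment parameters for one special effect."""
    pause_scale: Optional[float] = None  # prosodic: multiplier for pause durations
    pitch_scale: Optional[float] = None  # acoustic: multiplier for pitch values


# Pre-established mapping between special effects and adjustment parameters.
# Names and values are invented for illustration only.
EFFECT_PARAMS: Dict[str, FeatureAdjustment] = {
    "robot": FeatureAdjustment(pause_scale=1.5),
    "child": FeatureAdjustment(pitch_scale=1.25),
    "slow_deep": FeatureAdjustment(pause_scale=2.0, pitch_scale=0.5),
}


def get_adjustment(effect: str) -> FeatureAdjustment:
    """Return the adjustment parameters mapped to the required effect."""
    try:
        return EFFECT_PARAMS[effect]
    except KeyError:
        raise ValueError(f"unknown special effect: {effect!r}") from None


def apply_adjustment(
    pauses_ms: List[float],
    pitch_hz: List[float],
    adj: FeatureAdjustment,
) -> Tuple[List[float], List[float]]:
    """Scale the basic features to obtain the target features.

    A feature whose scale is None is copied through unchanged.
    """
    if adj.pause_scale is not None:
        target_pauses = [p * adj.pause_scale for p in pauses_ms]
    else:
        target_pauses = list(pauses_ms)
    if adj.pitch_scale is not None:
        target_pitch = [f * adj.pitch_scale for f in pitch_hz]
    else:
        target_pitch = list(pitch_hz)
    return target_pauses, target_pitch
```

Keeping the mapping as plain data means a new special effect can be added by adding one entry, without touching the adjustment logic.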
Embodiment 3
[0079] Figure 3 is a schematic diagram of a speech special effect synthesis device provided in Embodiment 3 of the present invention. As shown in Figure 3, the device includes: a data acquisition module 310, a parameter acquisition module 320, a parameter adjustment module 330 and a speech synthesis module 340, wherein:
[0080] A data acquisition module 310, configured to acquire text data corresponding to the original voice data, and acquire basic prosodic features and basic acoustic features matched with the text data;
[0081] A parameter acquisition module 320, configured to acquire feature adjustment parameters corresponding to required special effects according to the pre-established mapping relationship between at least two special effects and corresponding feature adjustment parameters, the feature adjustment parameters including: target prosodic feature adjustment parameters and/or target acoustic feature adjustment parameters;
[0082] A parameter adjustment module 330, configured...
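One way to picture how the four modules fit together is as pluggable stages called in sequence. The sketch below is only an assumed wiring: the class name, the callable signatures, and the `Features` type are hypothetical, and the inputs of modules 330 and 340 are inferred from the method embodiments because their descriptions are cut off here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

Features = Dict[str, Any]  # hypothetical container for feature values


@dataclass
class SpeechEffectSynthesisDevice:
    """Assumed composition of modules 310 to 340 as injected callables."""
    # 310: original speech -> (text, basic prosodic, basic acoustic features)
    acquire_data: Callable[[bytes], Tuple[str, Features, Features]]
    # 320: required special effect -> feature adjustment parameters
    acquire_params: Callable[[str], Features]
    # 330: (basic prosodic, basic acoustic, params) -> target features
    adjust: Callable[[Features, Features, Features], Tuple[Features, Features]]
    # 340: (text, target prosodic, target acoustic) -> synthesized speech
    synthesize: Callable[[str, Features, Features], bytes]

    def run(self, original_speech: bytes, effect: str) -> bytes:
        """Run the modules in order and return the special-effect speech."""
        text, prosody, acoustic = self.acquire_data(original_speech)
        params = self.acquire_params(effect)
        target_prosody, target_acoustic = self.adjust(prosody, acoustic, params)
        return self.synthesize(text, target_prosody, target_acoustic)
```

Injecting each module as a callable keeps the sketch neutral about whether a module is realized in software or hardware, and lets each stage be replaced with a stub for testing.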


