A voice special effect synthesis method, device, electronic equipment and storage medium

A voice special effect synthesis method applied in the field of voice processing. It addresses the problem that synthesized speech is limited to a single type and cannot meet the diverse needs of special-effect speech.

Active Publication Date: 2022-05-13
出门问问创新科技有限公司
5 Cites, 0 Cited by
AI Technical Summary

Problems solved by technology

[0004] In view of this, embodiments of the present invention provide a speech special effect synthesis method, device, electronic equipment, and storage medium. The main purpose is to solve the problem that a speech synthesis system produces only a single type of synthesized speech and cannot meet the diversity requirements of special-effect speech.

Examples

Embodiment 1

[0032] Figure 1 is a flow chart of a speech special effect synthesis method provided in Embodiment 1 of the present invention. This embodiment is applicable to generating special-effect synthesized speech according to different special effect requirements. The method can be executed by a speech special effect synthesis device, which can be implemented in software and/or hardware. Correspondingly, as shown in Figure 1, the method includes the following operations:

[0033] S110. Acquire text data corresponding to the original speech data, and obtain basic prosodic features and basic acoustic features matching the text data.

[0034] The original voice data may be input manually and is the voice data that needs to be converted into special-effect synthesized speech. The basic prosodic features may be the prosodic features of the original speech data, for example, the pitch, stress, word segmentation, and pause features of the pinyin. The basic ...

Embodiment 2

[0046] Figure 2 is a flow chart of a speech special effect synthesis method provided in Embodiment 2 of the present invention. This embodiment elaborates on the above embodiments and gives a specific implementation of using the feature adjustment parameters to adjust the basic prosodic features and/or the basic acoustic features to obtain the target prosodic features and/or the target acoustic features. Correspondingly, as shown in Figure 2, the method includes the following operations:

[0047] S210. Acquire text data corresponding to the original voice data, and obtain basic prosodic features and basic acoustic features matching the text data.

[0048] S220. Acquire feature adjustment parameters corresponding to required special effects according to the pre-established mapping relationship between at least two special effects and corresponding feature adjustment parameters, where the feature adjustment paramete...
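The lookup in S220 and the adjustment that follows it can be sketched as below. This is a minimal illustration, not the method disclosed in the patent: the effect names (`effect_a`, `effect_b`), the feature names (`pause`, `f0`), the multiplicative form of the adjustment, and all numeric values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical representation: each named feature is a list of numbers.
Features = Dict[str, List[float]]


@dataclass(frozen=True)
class FeatureAdjustment:
    """Adjustment parameters for one special effect (assumed to be scale factors).

    Either dictionary may be empty, mirroring the "and/or" in the text.
    """
    prosodic: Dict[str, float] = field(default_factory=dict)
    acoustic: Dict[str, float] = field(default_factory=dict)


# Pre-established mapping between at least two special effects and their
# adjustment parameters. Names and values are illustrative only.
EFFECT_TO_ADJUSTMENT: Dict[str, FeatureAdjustment] = {
    "effect_a": FeatureAdjustment(prosodic={"pause": 2.0}),
    "effect_b": FeatureAdjustment(prosodic={"pause": 0.5}, acoustic={"f0": 2.0}),
}


def get_adjustment(effect: str) -> FeatureAdjustment:
    """Return the adjustment parameters registered for the required effect."""
    try:
        return EFFECT_TO_ADJUSTMENT[effect]
    except KeyError:
        raise ValueError(f"unknown special effect: {effect!r}") from None


def scale_features(base: Features, factors: Dict[str, float]) -> Features:
    """Scale each basic feature by its factor; features without a factor are unchanged."""
    return {
        name: [value * factors.get(name, 1.0) for value in values]
        for name, values in base.items()
    }


adjustment = get_adjustment("effect_b")
target_prosodic = scale_features({"pause": [0.25, 0.5]}, adjustment.prosodic)
target_acoustic = scale_features({"f0": [100.0, 200.0]}, adjustment.acoustic)
```

Keeping the mapping as plain data means a new special effect can be added without touching the adjustment code, which is one plausible reading of why the mapping is established in advance.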

Embodiment 3

[0079] Figure 3 is a schematic diagram of an answer output device provided in Embodiment 3 of the present invention. As shown in Figure 3, the device includes: a data acquisition module 310, a parameter acquisition module 320, a parameter adjustment module 330 and a speech synthesis module 340, wherein:

[0080] A data acquisition module 310, configured to acquire text data corresponding to the original voice data, and acquire basic prosodic features and basic acoustic features matched with the text data;

[0081] A parameter acquisition module 320, configured to acquire feature adjustment parameters corresponding to required special effects according to the pre-established mapping relationship between at least two special effects and corresponding feature adjustment parameters, the feature adjustment parameters including: target prosodic feature adjustment parameters and/or target acoustic feature adjustment parameters;

[0082] A parameter adjustment module 330, configured...
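The four modules named above can be wired together as in the following sketch. Only the module names and reference numerals come from the text; the class name `SpeechEffectDevice`, the callable signatures, and every `stub_*` function are hypothetical placeholders standing in for real prosody, acoustic, and vocoder components.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representation: each named feature is a list of numbers.
Features = Dict[str, List[float]]
Params = Dict[str, float]


@dataclass
class SpeechEffectDevice:
    """Chains the four modules in the order 310 -> 320 -> 330 -> 340."""
    acquire_data: Callable[[str], Features]        # data acquisition module 310
    acquire_params: Callable[[str], Params]        # parameter acquisition module 320
    adjust: Callable[[Features, Params], Features] # parameter adjustment module 330
    synthesize: Callable[[Features], bytes]        # speech synthesis module 340

    def run(self, text: str, effect: str) -> bytes:
        base = self.acquire_data(text)
        params = self.acquire_params(effect)
        target = self.adjust(base, params)
        return self.synthesize(target)


# Toy stand-ins (assumptions for the example, not real models).
def stub_acquire_data(text: str) -> Features:
    # Pretend each word yields one "pitch" value equal to its length.
    return {"pitch": [float(len(word)) for word in text.split()]}


def stub_acquire_params(effect: str) -> Params:
    # Illustrative mapping with two effects.
    return {"deep": {"pitch": 0.5}, "bright": {"pitch": 2.0}}[effect]


def stub_adjust(base: Features, params: Params) -> Features:
    # Multiply each feature by its factor, defaulting to no change.
    return {k: [v * params.get(k, 1.0) for v in vs] for k, vs in base.items()}


def stub_synthesize(target: Features) -> bytes:
    # Stand-in for waveform generation: serialize the target features.
    return ",".join(f"{v:.1f}" for v in target["pitch"]).encode()


device = SpeechEffectDevice(
    stub_acquire_data, stub_acquire_params, stub_adjust, stub_synthesize
)
output = device.run("hi there", "deep")
```

Passing the modules in as callables keeps each stage replaceable, which fits the statement in Embodiment 1 that the device may be realized in software and/or hardware.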


Abstract

Embodiments of the present invention disclose a voice special effect synthesis method, device, electronic equipment, and storage medium. The method includes: acquiring text data corresponding to the original voice data, and acquiring basic prosodic features and basic acoustic features matching the text data; acquiring, according to a pre-established mapping relationship between at least two special effects and corresponding feature adjustment parameters, the feature adjustment parameters corresponding to the required special effect, where the feature adjustment parameters include target prosodic feature adjustment parameters and/or target acoustic feature adjustment parameters; using the feature adjustment parameters to adjust the basic prosodic features and/or basic acoustic features to obtain target prosodic features and/or target acoustic features; and generating, according to the target prosodic features and/or target acoustic features, special-effect synthesized speech corresponding to the original speech data. The technical solutions of the embodiments of the present invention can meet the diversity requirements of special-effect speech.

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of voice processing, and in particular, to a voice special effect synthesis method, device, electronic equipment, and storage medium.

Background art

[0002] Speech synthesis, also known as text-to-speech (TTS) technology, can convert any text information into smooth speech in real time.

[0003] Existing speech synthesis technologies usually use a pre-trained prosody model and a pre-trained acoustic model to process the text data of the original speech data to obtain the synthesized speech corresponding to the original speech data. During implementation, the inventors found the following defect in the prior art: processing the text data of the original speech data with the pre-trained prosody model and acoustic model yields only a fixed type of synthesized speech, which cannot satisfy the diversity requirements of special-effect speech.

Contents of t...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L13/033, G10L13/04, G10L13/10
CPC: G10L13/033, G10L13/04, G10L13/10
Inventors: 张冉, 张征
Owner: 出门问问创新科技有限公司