Speech synthesis method and device, electronic equipment and program product
A speech synthesis method and device, electronic equipment and a program product, applied in the technical field of speech synthesis and speech data processing. The disclosure addresses the problem that the LPCNet vocoder is complex and computationally heavy, so that it consumes large amounts of computing time and computing resources and is unfavorable for practical applications.
Detailed Description
[0031] The speech synthesis method provided by the embodiments of the present disclosure can be applied to the system shown in Figure 1, which is a schematic diagram of the network-based system architecture of the present disclosure. As shown in Figure 1, the network system includes a speech synthesis device 1 and an electronic device 2.
[0032] The speech synthesis device 1 described in the present disclosure can be installed or integrated in the electronic device 2. The electronic device 2 can specifically be a smart terminal, such as a smart phone, a tablet computer or a desktop computer, that is, processing equipment that can perform data calculations according to preset calculation logic.
[0033] The electronic device 2 can acquire the text to be synthesized into speech from the network and analyze it to obtain the corresponding acoustic feature data and the feature sampling data corresponding to that acoustic feature data. Then, the speech synthesis device 1 will acquire t...
Embodiment 1
[0110] Embodiment 1. A speech synthesis method, comprising:
[0111] Acquiring feature sampling data of the acoustic feature data at multiple sampling moments;
[0112] Using a speech synthesis network to simultaneously perform prediction processing on the feature sampling data at the plurality of sampling moments, to obtain linear prediction data and nonlinear prediction data at any two target sampling moments among the plurality of sampling moments;
[0113] The speech synthesis data at the two target sampling moments are determined according to the linear prediction data and the nonlinear prediction data at the two target sampling moments.
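The final step of Embodiment 1 can be illustrated with a minimal sketch. It assumes the LPCNet-style convention in which each output sample is the sum of a linear prediction and a nonlinear (excitation) term; the source only says the synthesis data are determined "according to" the two, so the additive form is an assumption. The function names `linear_prediction` and `combine_two_samples` are hypothetical and used only for illustration.

```python
from typing import Sequence, Tuple


def linear_prediction(lpc_coeffs: Sequence[float],
                      past_samples: Sequence[float]) -> float:
    """Predict one sample as a weighted sum of earlier samples.

    Assumption (LPCNet-style): p_t = sum_k a_k * s_(t-k),
    where past_samples[0] is s_(t-1), past_samples[1] is s_(t-2), and so on.
    """
    return sum(a * s for a, s in zip(lpc_coeffs, past_samples))


def combine_two_samples(linear: Tuple[float, float],
                        nonlinear: Tuple[float, float]) -> Tuple[float, float]:
    """Combine linear and nonlinear predictions for two target moments.

    Assumption: synthesis data = linear prediction + nonlinear prediction,
    applied independently at each of the two target sampling moments.
    """
    p_first, p_second = linear
    e_first, e_second = nonlinear
    return (p_first + e_first, p_second + e_second)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the disclosure.
    p = linear_prediction([0.5, 0.25], [1.0, 2.0])
    s = combine_two_samples((p, 0.25), (0.125, -0.25))
    print(p, s)
```

Producing both target samples in one call mirrors the idea of handling two sampling moments per network step instead of one; how the network itself yields the two nonlinear terms is not covered by this sketch.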
Embodiment 2
[0114] Embodiment 2. The speech synthesis method according to Embodiment 1, wherein using the speech synthesis network to simultaneously perform prediction processing on the feature sampling data at the plurality of sampling moments, to obtain the linear prediction data and nonlinear prediction data at any two target sampling moments among the plurality of sampling moments, comprises:
[0115] Performing linear prediction processing on the feature sampling data at the multiple sampling moments, to obtain the linear speech data P_m at the m-th sampling moment and the linear speech data P_{m+1} at the (m+1)-th sampling moment;
[0116] Obtaining the speech synthesis data S_{m-1} and nonlinear speech data E_{m-1} at the (m-1)-th sampling moment, and the speech synthesis data S_{m-2} and nonlinear speech data E_{m-2} at the (m-2)-th sampling moment;
[0117] For the feature sampling data, the speech synthesis data S_{m-1}, the nonlinear speech data E_{m-1}, the speech synthesis data S_{m-2}, the nonlinear speech data E_{m-2}, the linear speech data P_m and the linear speech d...