Personalized speech synthesis model construction method and device, speech synthesis method and device, and personalized speech synthesis model test method and device

A technology for constructing speech synthesis models, applied in speech synthesis, speech analysis, speech recognition, and related fields. It addresses the problem that speech in a specific speaker's style cannot be synthesized, and achieves the effect of improving user experience.
CN112863476A · Pending · Publication Date: 2021-05-28 · ALIBABA GRP HLDG LTD

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
ALIBABA GRP HLDG LTD
Publication Date
2021-05-28


Abstract

The invention discloses a personalized speech synthesis model construction method and device, a speech synthesis method and device, and a personalized speech synthesis model test method and device. The construction method of the personalized speech synthesis model comprises the following steps: determining training data similar to a user from the training set data of a plurality of speakers of a multi-speaker speech synthesis model; selecting, from the plurality of speakers excluding the speaker to whom the similar training data belongs, similar speakers belonging to the same category as the user; and training the multi-speaker speech synthesis model according to the training data similar to the user and the selected similar speakers to obtain a personalized speech synthesis model of the user. Speech in the user's specific speaking style can thus be synthesized, and the user experience is improved.
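The two selection steps in the abstract can be illustrated with a minimal sketch. The abstract does not say how similarity or category is determined, so everything here is an assumption made for illustration: speakers are represented by a hypothetical fixed-size voice embedding, similarity is cosine similarity, and `category` is a hypothetical label. The training step itself is not shown.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Speaker:
    name: str
    category: str            # hypothetical category label (not defined in the source)
    embedding: List[float]   # hypothetical voice embedding (an assumption)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_training_speakers(
    user: Speaker, speakers: List[Speaker]
) -> Tuple[Speaker, List[Speaker]]:
    # Step 1: find the speaker whose training data is most similar to the user
    # (assumed here to mean highest cosine similarity of embeddings).
    closest = max(speakers, key=lambda s: cosine(user.embedding, s.embedding))
    # Step 2: from the remaining speakers, keep those in the user's category.
    similar = [
        s for s in speakers
        if s is not closest and s.category == user.category
    ]
    return closest, similar


user = Speaker("user", "A", [1.0, 0.0])
pool = [
    Speaker("s1", "A", [0.9, 0.1]),
    Speaker("s2", "A", [0.5, 0.5]),
    Speaker("s3", "B", [0.0, 1.0]),
]
closest, similar = select_training_speakers(user, pool)
```

In this sketch the data of `closest` and of the speakers in `similar` would then be passed to whatever procedure trains the multi-speaker model for the user; that procedure is outside the scope of the example.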

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence, and in particular to a method and device for constructing a personalized speech synthesis model, a speech synthesis method and device, and a method and device for testing a personalized speech synthesis model.

Background technology

[0002] Voice interaction scenarios in artificial intelligence technology require personalized speech synthesis. Personalized speech synthesis is in strong commercial demand, and it is also one of the future trends in the field of speech synthesis.

[0003] In traditional speech synthesis technology, hundreds of hours of training data from hundreds of speakers can be used to construct a multi-speaker speech synthesis system based on massive data. Specifically, a multi-speaker speech synthesis model can be used, for example a neural-network-based text-to-speech (Neural TTS) model. In the training data of this model, the voice data volume of a single speaker often ranges from a few hours to doz...

Claims
