Voice processing method, device, equipment and storage medium

A voice processing technology, applied in the computer field, that addresses three problems: retraining consumes large amounts of computing resources and time, training data for a new target speaker is lacking, and voice conversion models therefore cannot be retrained adequately. It reduces the computing resources and time consumed, lowers the application threshold, and improves voice processing efficiency.
CN112712813A · Active · Publication Date: 2021-04-27 · BEIJING DAJIA INTERNET INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
BEIJING DAJIA INTERNET INFORMATION TECH CO LTD
Publication Date
2021-04-27


Abstract

The invention relates to a voice processing method, device, equipment and storage medium. The method comprises: obtaining a first voice and a second voice to be processed; calling an encoder in a voice processing model, obtained by optimization training on at least one utterance of the target speaker, to encode the obtained voices, yielding a first feature that represents text information independent of the speaker's identity and a second feature that represents the timbre of the target speaker; and performing decoding and voice reconstruction based on the first feature and the second feature to obtain a target voice after timbre conversion. Because the voice processing model is end-to-end, it does not need a large number of target speaker utterances; it can model the target speaker's timbre from only a small number of utterances, which reduces the computing resources and time consumed by model training.
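
The data flow described in the abstract can be illustrated with a toy sketch. This is not the patented model: the neural encoder and decoder are replaced here by simple mean statistics purely to show how the two features are separated and recombined. All names (`Frames`, `encode_content`, `encode_timbre`, `decode`, `convert`) are hypothetical, and the mapping of the first voice to the content feature and the second voice to the timbre feature is an assumption read from the abstract.

```python
from statistics import fmean
from typing import List

# Hypothetical representation: one feature vector per audio frame.
Frames = List[List[float]]


def encode_timbre(voice: Frames) -> List[float]:
    """Toy stand-in for the second feature: average each dimension over time."""
    dims = len(voice[0])
    return [fmean(frame[d] for frame in voice) for d in range(dims)]


def encode_content(voice: Frames) -> Frames:
    """Toy stand-in for the first feature: remove the utterance-level mean,
    leaving only the frame-to-frame variation."""
    mean = encode_timbre(voice)
    return [[x - m for x, m in zip(frame, mean)] for frame in voice]


def decode(content: Frames, timbre: List[float]) -> Frames:
    """Toy stand-in for decoding and reconstruction: add the timbre back."""
    return [[c + t for c, t in zip(frame, timbre)] for frame in content]


def convert(first_voice: Frames, second_voice: Frames) -> Frames:
    """Assumed flow: content from the first voice, timbre from the second."""
    first_feature = encode_content(first_voice)
    second_feature = encode_timbre(second_voice)
    return decode(first_feature, second_feature)


source = [[1.0, 2.0], [3.0, 4.0]]      # utterance whose content is kept
target = [[10.0, 20.0], [10.0, 20.0]]  # utterance of the target speaker
converted = convert(source, target)
```

In this sketch the converted frames keep the frame-to-frame variation of `source` while their average matches that of `target`, which mirrors, in a very reduced form, the idea of keeping the content and swapping the timbre.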

Description

Technical field

[0001] The present disclosure relates to the field of computer technology, in particular to a voice processing method, device, equipment and storage medium.

Background technique

[0002] Voice conversion changes the timbre of the original speaker's speech into the timbre of a target speaker while keeping the linguistic content unchanged. It plays an important role in voice changing for video, video dubbing, human-computer interaction and other fields.

[0003] In related technologies, existing voice conversion systems are usually trained on large data sets. When the target speaker changes, a large amount of data must be obtained to retrain a voice conversion model, which consumes a lot of computing resources and time. Moreover, in some special scenarios, especially when voice data for the new target speaker is scarce, the available data is not sufficient to retrain a voice conversion model for a new target speake...
