Acoustic model post-processing method based on probability diffusion model, server and readable memory

The technique combines a diffusion model with an acoustic model in the field of artificial intelligence. It addresses the insufficient quality of the predicted acoustic spectrum and the resulting low naturalness of synthesized speech.

Pending Publication Date: 2022-05-17
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

The acoustic spectrum predicted by the acoustic model is not of high enough quality, so the synthesized speech waveform is correspondingly less natural.



Detailed Description of Embodiments

[0037] To better explain the method of the present invention, its technical solution is further described below through specific embodiments in conjunction with the accompanying drawings. From the contents disclosed in this specification, those skilled in the art can readily understand the function of the present invention. The present invention can also be implemented or applied through other specific embodiments, and the details of this specification can be modified or changed from different viewpoints without departing from the essence of the present invention. Provided there is no conflict, the following embodiments and the features therein can be combined with one another.

[0038] In detail, the present invention proposes an acoustic model post-processing method based on a probability diffusion model, including the followi...


Abstract

The invention discloses an acoustic model post-processing method based on a probability diffusion model, together with a server and a readable memory. The method comprises two stages. Model training: the probability diffusion model is trained on the server, and its parameters are optimized by reducing a loss function until the model converges, which yields the weights of the probability diffusion model. Model inference: using the weights obtained in the training stage, the server performs spectrum optimization on the input predicted spectrum. By learning the feature similarity between an input predicted spectrum and the real spectrum, and by using the data-fitting capability of the noise estimation network in the model, the method realizes a diffusion-based transfer of probability distribution, so that the input predicted spectrum more closely approximates the real spectrum. Improving the quality of the spectrum in turn improves the naturalness of the synthesized speech. The method can optimize spectral detail for spectra obtained from various acoustic models and, compared with other methods, achieves a better spectrum generation effect.
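The two stages in the abstract can be sketched in code. The abstract does not give the loss function, noise schedule, network architecture, or the starting point of inference, so everything below is an assumption: a standard DDPM-style noise-prediction loss, a linear schedule with illustrative values, and sampling that starts from Gaussian noise and is conditioned on the predicted spectrum. The names `make_schedule`, `q_sample`, `training_loss`, `refine`, and `oracle_net` are hypothetical and do not come from the patent.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.05):
    """Linear noise schedule; the values are illustrative assumptions."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative signal retention per step
    return betas, alphas, alpha_bars


def q_sample(real_spec, t, alpha_bars, noise):
    """Forward diffusion: corrupt the real spectrum up to step t."""
    return np.sqrt(alpha_bars[t]) * real_spec + np.sqrt(1.0 - alpha_bars[t]) * noise


def training_loss(noise_net, real_spec, pred_spec, t, alpha_bars, rng):
    """Assumed loss: MSE between the injected noise and the network's estimate.

    noise_net(x_t, pred_spec, t) stands in for the noise estimation network,
    conditioned on the acoustic model's predicted spectrum.
    """
    noise = rng.standard_normal(real_spec.shape)
    x_t = q_sample(real_spec, t, alpha_bars, noise)
    estimate = noise_net(x_t, pred_spec, t)
    return float(np.mean((noise - estimate) ** 2))


def refine(noise_net, pred_spec, betas, alphas, alpha_bars, rng):
    """Assumed inference: DDPM reverse process conditioned on pred_spec."""
    x = rng.standard_normal(pred_spec.shape)  # assumed starting point
    for t in range(len(betas) - 1, -1, -1):
        estimate = noise_net(x, pred_spec, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * estimate) / np.sqrt(alphas[t])
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean  # no noise is added at the final step
    return x


# Sanity check with synthetic arrays and a hypothetical "oracle" network.
rng = np.random.default_rng(0)
betas, alphas, alpha_bars = make_schedule()
real_spec = rng.standard_normal((80, 20))  # stand-in for a real spectrum
pred_spec = real_spec + 0.3 * rng.standard_normal((80, 20))  # stand-in prediction


def oracle_net(x_t, cond, t):
    """Returns the exact noise relative to real_spec; ignores cond.

    Not a trained model: it exists only to check the arithmetic above.
    """
    return (x_t - np.sqrt(alpha_bars[t]) * real_spec) / np.sqrt(1.0 - alpha_bars[t])


loss = training_loss(oracle_net, real_spec, pred_spec, t=10, alpha_bars=alpha_bars, rng=rng)
refined = refine(oracle_net, pred_spec, betas, alphas, alpha_bars, rng)
```

With the oracle, the loss is numerically zero and the last reverse step reproduces `real_spec`, which only confirms that the update equations are consistent. In a real system `noise_net` would be a trained network that actually uses `pred_spec`, and the loss would be minimized over many sampled steps `t` until convergence, as the abstract describes.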

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence. To address the need for higher-quality speech synthesis, an acoustic model post-processing method based on a probability diffusion model is proposed. Optimizing the details of the predicted spectrum improves the naturalness of the synthesized speech.

Background art

[0002] With the continuous development of artificial intelligence technology, great progress has been made on problems in the many fields that rely on it. Speech synthesis is an important direction in artificial intelligence technology. Specifically, speech synthesis refers to the synthesis of audio from text. The goal of speech synthesis technology is to enable machines to speak like people. High-quality speech synthesis technology can meet society's current needs for human-computer interaction.

[0003] The existing speech synthesis technology first con...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/10, G10L25/18, G10L25/27
CPC: G10L13/10, G10L25/18, G10L25/27
Inventor: 张晨, 张宗煜, 陈积明, 史治国
Owner: ZHEJIANG UNIV