Mel spectrum prediction method and device, equipment and storage medium

A Mel spectrum prediction technology, applied in speech synthesis, speech analysis, instruments, etc., that addresses the problem of cumbersome and complicated model training steps.

Pending Publication Date: 2021-08-06
PING AN TECH (SHENZHEN) CO LTD
View PDF4 Cites 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] The main purpose of this application is to provide a Mel spectrum prediction method, device, equipment and storage medium. It aims to solve a technical problem of prior-art singing synthesis methods: Mel spectrum prediction requires training an additional duration model, which makes the model training steps cumbersome and complicated.

Examples

Detailed Description of the Embodiments

[0075] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0076] In prior-art singing synthesis methods, an additional duration model must be trained when performing mel spectrum prediction, which makes the model training steps cumbersome and complicated. To solve this technical problem, this application proposes a mel spectrum prediction method. The method is applied to the technical field of artificial intelligence, and more specifically to the field of neural networks. The method is executed by a device capable of realizing the prediction method of the...

Abstract

The invention relates to the technical field of artificial intelligence and discloses a Mel spectrum prediction method, device, equipment and storage medium. The method comprises the following steps: inputting a to-be-predicted text sequence into a text encoder of a target acoustic module for feature extraction to obtain target text coding feature data; performing alignment position prediction on the target text coding feature data through an alignment position predictor of the target acoustic module to obtain target alignment position data; performing time alignment feature calculation on the target text coding feature data and the target alignment position data through an alignment graph reconstructor of the target acoustic module to obtain a target time alignment feature value; and performing Mel spectrum calculation on the target time alignment feature value through a decoder of the target acoustic module to obtain target Mel spectrum data. Duration modeling is implicitly integrated into the target acoustic module by adopting an input-output feature alignment strategy, so no additional duration model is needed. The invention further relates to blockchain technology.
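The four stages named in the abstract can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not the patented implementation: the excerpt does not disclose the internals of any component, so the embedding-lookup encoder, the softplus step used to predict positions, the Gaussian soft alignment, the linear decoder, the random weights, and all function and parameter names below are hypothetical.

```python
import math
import random

def text_encoder(token_ids, embed):
    """Toy encoder (assumption): one feature vector per input token."""
    return [embed[t] for t in token_ids]

def alignment_position_predictor(features, frames_per_unit=4.0):
    """Toy predictor (assumption): a positive step per token, accumulated
    into monotonically increasing positions on the output frame axis."""
    positions, cursor = [], 0.0
    for vec in features:
        # softplus keeps every step strictly positive
        step = frames_per_unit * math.log1p(math.exp(sum(vec)))
        positions.append(cursor + step / 2.0)  # centre of this token's span
        cursor += step
    return positions, cursor  # cursor is the implied total length in frames

def alignment_reconstructor(features, positions, n_frames, sigma=2.0):
    """Toy reconstructor (assumption): Gaussian soft alignment of each output
    frame to the token positions, then a weighted mix of token features."""
    dim = len(features[0])
    aligned, weights = [], []
    for t in range(n_frames):
        scores = [-((t - p) ** 2) / (2.0 * sigma ** 2) for p in positions]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w)
        aligned.append([sum(w[i] * features[i][d] for i in range(len(features)))
                        for d in range(dim)])
    return aligned, weights

def decoder(aligned, proj):
    """Toy decoder (assumption): linear projection of each frame to mel bins."""
    n_mels = len(proj[0])
    return [[sum(f[d] * proj[d][k] for d in range(len(f))) for k in range(n_mels)]
            for f in aligned]

random.seed(0)
DIM, N_MELS, VOCAB = 8, 80, 16  # illustrative sizes, not from the source
embed = [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(VOCAB)]
proj = [[random.uniform(-0.5, 0.5) for _ in range(N_MELS)] for _ in range(DIM)]

token_ids = [3, 7, 1, 12]  # stand-in for the to-be-predicted text sequence
features = text_encoder(token_ids, embed)
positions, total_len = alignment_position_predictor(features)
n_frames = max(1, math.ceil(total_len))
aligned, weights = alignment_reconstructor(features, positions, n_frames)
mel = decoder(aligned, proj)
print(len(mel), len(mel[0]))  # number of frames, number of mel bins
```

In this sketch the output length falls out of the predicted positions themselves, which is one way to read the abstract's statement that duration modeling is folded into the acoustic module instead of being handled by a separately trained duration model.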

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, in particular to a prediction method, device, equipment and storage medium of a mel spectrum.

Background

[0002] Singing synthesis is a technology that converts information such as lyrics and music scores into singing audio. With the popularization of the mobile Internet and people's rising expectations for entertainment and quality of life, singing synthesis technology has gradually emerged in fields such as video games, short video applications, and virtual singers.

[0003] In existing singing synthesis methods, predicting the mel spectrum requires obtaining phoneme/note duration information through manual labeling or automatic machine labeling and training an additional duration model on that information. The durations predicted by the duration model may even need post-processing, which leads to cumbersome ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/08, G10L25/18
CPC: G10L13/08, G10L25/18
Inventors: 刘正晨, 缪陈峰, 朱清影, 陈闽川, 马骏, 王少军, 肖京
Owner: PING AN TECH (SHENZHEN) CO LTD