Vocoder training method, terminal and storage medium

A vocoder and acoustic-model technology, applied in the Internet field, that addresses problems such as spectral data the vocoder cannot recognize and mismatch between the acoustic model and the vocoder.

Pending Publication Date: 2021-09-07

TENCENT MUSIC ENTERTAINMENT TECH SHENZHEN CO LTD


Problems solved by technology

[0004] The vocoder is trained on spectral data obtained from real sounds. During actual use, however, the spectral data input into the trained vocoder is produced by the acoustic model from phoneme sequences and pause information. That data only resembles the spectral data of real sounds; it is not the spectral data of real sound. This causes a mismatch between the trained acoustic model and the trained vocoder, which may leave the vocoder unable to recognize the spectral data obtained by the acoustic model, making the "rustle" in the synthetic sound the sound of ...



Embodiment Construction

[0081] To make the purpose, technical solution and advantages of the present application clearer, the implementations of the present application are described in further detail below with reference to the accompanying drawings.

[0082] Figure 1 is a schematic diagram of an implementation environment of a method for training a vocoder provided in an embodiment of the present application. As shown in Figure 1, the method can be implemented by the terminal 101 or the server 102.

[0083] The terminal 101 may include components such as a processor and a memory. The processor, which may be a CPU (central processing unit) or the like, can be used to obtain the time domain data of the sample audio, determine the first spectral data corresponding to the reference time domain data, input the first spectral data into the self-attention learning module in the trained acoustic model to obtain the second spectral data, the second ...


Abstract

The invention discloses a vocoder training method, a terminal and a storage medium, and belongs to the technical field of the Internet. The method comprises the following steps: acquiring time domain data of a sample audio as reference time domain data; determining first frequency spectrum data corresponding to the reference time domain data, and inputting the first frequency spectrum data into a self-attention learning module in a trained acoustic model to obtain second frequency spectrum data; inputting the second frequency spectrum data into a vocoder to obtain predicted time domain data; and training the vocoder based on the predicted time domain data and the reference time domain data. A vocoder trained by this method matches the trained acoustic model more closely than the trained vocoder matches the trained acoustic model in the prior art, and the "sand sound" (the rustling noise) in the synthetic sound is reduced to a certain extent.
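The four steps in the abstract can be illustrated with a toy training loop. This is only a minimal sketch under loud assumptions, not the patented implementation: the "sample audio" is three short cosine frames, the first spectral data is a plain DFT magnitude, `frozen_self_attention` is a hypothetical parameter-free stand-in for the self-attention learning module of the trained acoustic model, and `LinearVocoder` is a hypothetical one-layer model trained with squared error. None of these names or choices come from the patent.

```python
import cmath
import math
import random

def magnitude_spectrum(frame):
    """First spectral data: DFT magnitudes of one time-domain frame."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def frozen_self_attention(frames):
    """Hypothetical stand-in for the self-attention learning module of the
    trained acoustic model. It has no trainable weights here, mirroring a
    module that is already trained and is not updated."""
    scale = math.sqrt(len(frames[0]))
    mixed = []
    for query in frames:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in frames]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(weights)
        mixed.append([sum(w * value[i] for w, value in zip(weights, frames)) / total
                      for i in range(len(query))])
    return mixed

class LinearVocoder:
    """Toy vocoder (assumption): one linear layer, spectrum frame -> waveform frame."""
    def __init__(self, n_bins, n_samples, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_bins)]
                  for _ in range(n_samples)]

    def __call__(self, spectrum):
        return [sum(w * s for w, s in zip(row, spectrum)) for row in self.w]

def train_vocoder(reference, epochs=50, lr=0.1):
    """Train the vocoder on (second spectral data, reference time-domain data)."""
    first = [magnitude_spectrum(f) for f in reference]   # first spectral data
    second = frozen_self_attention(first)                # second spectral data
    vocoder = LinearVocoder(len(second[0]), len(reference[0]))
    n = len(reference) * len(reference[0])
    losses = []
    for _ in range(epochs):
        loss = 0.0
        for spectrum, target in zip(second, reference):
            predicted = vocoder(spectrum)                # predicted time-domain data
            for j, (p, r) in enumerate(zip(predicted, target)):
                err = p - r
                loss += err * err / n                    # running mean squared error
                for i, s in enumerate(spectrum):         # per-frame gradient step
                    vocoder.w[j][i] -= lr * 2 * err * s / n
        losses.append(loss)
    return vocoder, first, second, losses

# Toy "sample audio": three 8-sample frames, each a cosine at a different frequency.
reference = [[math.cos(2 * math.pi * k * t / 8) for t in range(8)] for k in (1, 2, 3)]
vocoder, first, second, losses = train_vocoder(reference)
print(f"loss: first epoch {losses[0]:.4f}, last epoch {losses[-1]:.6f}")
```

The point of the sketch is the data flow: the vocoder never sees `first` directly, only `second`, so it is fitted to the kind of spectrum the acoustic model side actually emits. In this toy setup `second` is close to `first` but not identical to it, which is the similar-but-not-real situation the mismatch problem describes.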

Description

Technical field

[0001] The present application relates to the technical field of the Internet, in particular to a method for training a vocoder, a terminal and a storage medium.

Background

[0002] With the continuous development of Internet technology, people reading novels often have the content read to them by AI models.

[0003] In related technologies, the AI model actually consists of a phoneme conversion model, a pause prediction model, an acoustic model, and a vocoder. These models are applied to a target text as follows: the target text is input into the phoneme conversion model and the pause prediction model respectively to obtain a phoneme sequence and pause information, where the pause information includes pause position and pause duration. The phoneme sequence and pause information are input into the trained acoustic model to obtain spectrum data. The spectrum data is input into the trained vocoder to obtain the target t...
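The four-model pipeline described in the background can be sketched as a chain of callables. This is an illustrative sketch only: `PauseInfo`, `SpeechPipeline`, and every `toy_*` function are hypothetical names, the pause position is assumed to be an index into the phoneme sequence, the duration is assumed to be in seconds, and the toy models are placeholders that do no real speech processing.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class PauseInfo:
    position: int    # assumed: index in the phoneme sequence where the pause falls
    duration: float  # assumed: pause length in seconds

Spectrum = List[List[float]]  # frames x frequency bins

@dataclass
class SpeechPipeline:
    phoneme_model: Callable[[str], List[str]]
    pause_model: Callable[[str], List[PauseInfo]]
    acoustic_model: Callable[[Sequence[str], Sequence[PauseInfo]], Spectrum]
    vocoder: Callable[[Spectrum], List[float]]

    def synthesize(self, text: str) -> List[float]:
        phonemes = self.phoneme_model(text)               # text -> phoneme sequence
        pauses = self.pause_model(text)                   # text -> pause information
        spectrum = self.acoustic_model(phonemes, pauses)  # -> spectrum data
        return self.vocoder(spectrum)                     # -> time-domain samples

# Placeholder models so the sketch runs; real models would be trained networks.
def toy_phonemes(text: str) -> List[str]:
    return [c for c in text if c != " "]

def toy_pauses(text: str) -> List[PauseInfo]:
    pauses, count = [], 0
    for c in text:
        if c == " ":
            pauses.append(PauseInfo(position=count, duration=0.2))
        else:
            count += 1
    return pauses

def toy_acoustic(phonemes: Sequence[str], pauses: Sequence[PauseInfo]) -> Spectrum:
    frames = [[float(ord(p) % 7), 1.0] for p in phonemes]
    # Insert one silent frame per pause, back to front so positions stay valid.
    for pause in sorted(pauses, key=lambda p: p.position, reverse=True):
        frames.insert(pause.position, [0.0, 0.0])
    return frames

def toy_vocoder(spectrum: Spectrum) -> List[float]:
    return [frame[0] for frame in spectrum]

pipeline = SpeechPipeline(toy_phonemes, toy_pauses, toy_acoustic, toy_vocoder)
samples = pipeline.synthesize("ab c")
print(samples)
```

Keeping the four stages as separate callables makes the hand-off explicit: the vocoder only ever receives what the acoustic model emits, never a spectrum computed from real audio.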


Application Information


IPC(8): G10L19/16, G06N3/08, G06N3/04

CPC: G10L19/16, G06N3/08, G06N3/044, G06N3/045

Inventor: 徐东

Owner: TENCENT MUSIC ENTERTAINMENT TECH SHENZHEN CO LTD
