A prosodic labeling method, device, equipment and medium

A prosodic labeling technology, applied in the field of speech synthesis, that addresses the problems of complex implementation, difficult system design, and inaccurate prosodic information, improving labeling accuracy and avoiding superimposed errors.

Active Publication Date: 2022-03-15
ZHEJIANG TONGHUASHUN INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] Solution 1 cannot screen out the correct prosodic information when the recording was not read according to the prosody predicted from the text.
Solution 2 severs the internal connection between speech and text, and cannot achieve a good prosodic labeling result.
Moreover, the prosodic labeling process in the existing solutions includes multiple processing stages. Building the components of each stage requires rich domain knowledge, so the entire system is difficult to design and complicated to implement. The errors of each stage are superimposed in the final stage, so the resulting prosodic information is inaccurate.


Examples

Embodiment Construction

[0046] The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments of the application. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0047] Existing prosodic labeling schemes either have low labeling efficiency or sever the internal relationship between the acoustic features of the audio to be labeled and the corresponding text features. In addition, the prosodic labeling process includes multiple processing stages, building the components of each stage requires rich domain knowledge, the whole system is difficult to design, and the implementation is complicated. The errors of each stage will be s...


Abstract

The present application discloses a prosodic labeling method, apparatus, device, and medium. The method includes: acquiring a first acoustic feature, a first text feature, and a first prosodic labeling result corresponding to sample audio; training an end-to-end neural network with the first acoustic feature as the input of the encoder, the first text feature as the input of the decoder, and the first prosodic labeling result as the output of the network, to obtain a trained end-to-end neural network; and, when a second acoustic feature and a second text feature of audio whose prosody is to be labeled are obtained, using the trained end-to-end neural network to directly output a second prosodic labeling result. The method effectively fuses the acoustic features with the corresponding text features and improves the accuracy of prosodic labeling.
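The data flow described in the abstract can be illustrated with a toy forward pass. The following is a minimal sketch only, not the network disclosed in the application: the class name `ToyProsodyTagger`, the single attention step, all dimensions, and the use of four label classes are assumptions made for illustration, and the weights are random rather than trained. It shows only how acoustic features enter the encoder, text features enter the decoder, and one prosodic label per text token comes out.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (rows x cols) by vector x (length cols)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class ToyProsodyTagger:
    """Hypothetical encoder-decoder: acoustic frames in, one label per text token out."""

    def __init__(self, acoustic_dim, text_dim, hidden_dim, num_labels, seed=0):
        rng = random.Random(seed)
        self.W_enc = rand_matrix(hidden_dim, acoustic_dim, rng)     # encoder projection
        self.W_query = rand_matrix(hidden_dim, text_dim, rng)       # decoder query projection
        self.W_out = rand_matrix(num_labels, 2 * hidden_dim, rng)   # label classifier

    def encode(self, acoustic_frames):
        # Encoder input: acoustic features, one vector per audio frame.
        return [[math.tanh(h) for h in matvec(self.W_enc, f)] for f in acoustic_frames]

    def decode(self, text_features, memory):
        # Decoder input: text features; each token attends over the encoded audio.
        label_probs = []
        for token in text_features:
            query = matvec(self.W_query, token)
            weights = softmax([sum(q * m for q, m in zip(query, mem)) for mem in memory])
            context = [sum(w * mem[i] for w, mem in zip(weights, memory))
                       for i in range(len(query))]
            # Fuse the text query with its acoustic context, then classify.
            label_probs.append(softmax(matvec(self.W_out, query + context)))
        return label_probs

    def predict(self, acoustic_frames, text_features):
        probs = self.decode(text_features, self.encode(acoustic_frames))
        return [max(range(len(p)), key=p.__getitem__) for p in probs]

# Usage with made-up features: 10 audio frames (4-dim) and 5 text tokens (3-dim).
rng = random.Random(1)
frames = [[rng.random() for _ in range(4)] for _ in range(10)]
tokens = [[rng.random() for _ in range(3)] for _ in range(5)]
tagger = ToyProsodyTagger(acoustic_dim=4, text_dim=3, hidden_dim=6, num_labels=4)
probs = tagger.decode(tokens, tagger.encode(frames))
labels = tagger.predict(frames, tokens)
```

In a trained system, the weights would instead be fitted so that the predicted labels match the first prosodic labeling results of the sample audio, for example by minimizing a cross-entropy loss; that training step is omitted here.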

Description

Technical field

[0001] The present application relates to the technical field of speech synthesis, in particular to a prosodic labeling method, apparatus, device, and medium.

Background

[0002] A sound library for speech synthesis generally includes a large number of high-quality recorded audio clips, the corresponding transcribed texts, and prosodic annotations on the transcribed texts based on the prosodic information of the recorded audio clips. How to have a computer automatically and accurately perform prosodic labeling of such a sound library has become an important technology in the field of speech synthesis.

[0003] Existing technical solution 1: first use a pre-trained text prosody prediction model to predict the prosodic information of the text, and then use the pre-recorded audio to verify and screen the predicted prosodic information, eliminate the incorrect prosodic information, and keep the correct prosodic information to ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L13/10, G10L25/24, G10L25/30
CPC: G10L13/10, G10L25/24, G10L25/30
Inventor: 谌明陆健徐欣康胡新辉
Owner: ZHEJIANG TONGHUASHUN INTELLIGENT TECH CO LTD