Prosodic labeling method, device and apparatus, and medium

A prosodic labeling technology, applied in the field of speech synthesis, that addresses problems such as poor prosodic labeling quality, difficult system design, and the inability to reliably screen out the correct prosodic information.

Active Publication Date: 2019-11-15
ZHEJIANG TONGHUASHUN INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] If the recording was not read according to the prosody predicted from the text, Solution 1 cannot ultimately screen out the correct prosodic information.
Solution 2 severs the internal connection between speech and text, and so cannot achieve good prosodic labeling results.
Moreover, the prosodic labeling process in the existing solutions involves multiple processing stages. Building the components of each stage requires rich domain knowledge, so the overall system is difficult to design and complicated to implement. The errors of each stage accumulate in the final stage, so the resulting prosodic information is inaccurate.

Embodiment Construction

[0046] The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments of the application. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.

[0047] The existing prosodic labeling schemes either have low labeling efficiency or sever the internal relationship between the acoustic features of the prosody to be labeled and the corresponding text features. In addition, the prosodic labeling process includes multiple stages of processing, building the components of each stage requires rich domain knowledge, the design of the whole system is difficult, and the implementation is complicated. The errors of each stage will be s...

Abstract

The application discloses a prosodic labeling method, device, apparatus, and medium. The method comprises the following steps: acquiring a first acoustic feature, a first text feature, and a first prosodic labeling result corresponding to a sample audio; training an end-to-end neural network, with the first acoustic feature as the input of its encoder, the first text feature as the input of its decoder, and the first prosodic labeling result as its output, to obtain a trained end-to-end neural network; and, when a second acoustic feature and a second text feature of the prosody to be labeled are obtained, directly outputting a second prosodic labeling result using the trained end-to-end neural network. The method effectively fuses the acoustic features with the corresponding text features, thereby improving the accuracy of prosodic labeling.
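The data flow in the abstract (acoustic features into the encoder, text features into the decoder, one prosodic label out per text unit) can be illustrated with a minimal, untrained sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the label inventory `LABELS`, the class name `TinyProsodyTagger`, the single-layer tanh projections, and the dot-product attention used to let each text token look at the audio frames. Training (fitting the weights to the first prosodic labeling result) is omitted.

```python
import math
import random
from dataclasses import dataclass
from typing import List

# Hypothetical label inventory; the source does not list the prosodic labels.
LABELS = ["none", "prosodic_word", "prosodic_phrase", "intonation_phrase"]


@dataclass
class Sample:
    acoustic: List[List[float]]  # one feature vector per audio frame (encoder input)
    text: List[List[float]]      # one feature vector per text token (decoder input)
    labels: List[int]            # one label index per token (training target)


def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def init(rows, cols, rng):
    # Small random weights; a real system would learn these from samples.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyProsodyTagger:
    """Assumed toy architecture: projection encoder, attention decoder."""

    def __init__(self, acoustic_dim, text_dim, hidden=8, seed=0):
        rng = random.Random(seed)
        self.enc = init(hidden, acoustic_dim, rng)
        self.dec = init(hidden, text_dim, rng)
        self.out = init(len(LABELS), 2 * hidden, rng)

    def encode(self, acoustic):
        # Encoder: map every audio frame to a hidden vector.
        return [[math.tanh(x) for x in matvec(self.enc, f)] for f in acoustic]

    def label_probs(self, acoustic, text):
        memory = self.encode(acoustic)
        result = []
        for token in text:
            # Decoder: project the text token, then attend over audio frames.
            query = [math.tanh(x) for x in matvec(self.dec, token)]
            scores = [sum(a * b for a, b in zip(query, m)) for m in memory]
            weights = softmax(scores)
            context = [
                sum(w * m[i] for w, m in zip(weights, memory))
                for i in range(len(query))
            ]
            # Fuse text and acoustic evidence, then score each label.
            result.append(softmax(matvec(self.out, query + context)))
        return result

    def predict(self, acoustic, text):
        # Pick the most probable label index for every text token.
        return [
            max(range(len(LABELS)), key=p.__getitem__)
            for p in self.label_probs(acoustic, text)
        ]
```

Calling `TinyProsodyTagger(acoustic_dim=3, text_dim=2).predict(frames, tokens)` returns one label index per token. Because the weights are random, the labels are meaningless until the network is trained; the sketch only shows how both feature streams reach a single output without intermediate stages.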

Description

Technical field

[0001] The present application relates to the technical field of speech synthesis, in particular to a prosodic labeling method, device, apparatus, and medium.

Background

[0002] A speech synthesis corpus generally includes a large number of high-quality recorded audio clips, the corresponding transcribed texts, and prosodic annotations on the transcribed texts based on the prosodic information of the recorded audio clips. How to have a computer label the prosody of a synthesis corpus automatically and accurately has become an important technology in the field of speech synthesis.

[0003] Existing technical solution 1: first use a pre-trained text prosody prediction model to predict the prosodic information of the text, then use the pre-recorded audio to verify and screen the predicted prosodic information, eliminate the incorrect prosodic information, and keep the correct prosodic information to ...

Application Information

IPC(8): G10L13/10, G10L25/24, G10L25/30
CPC: G10L13/10, G10L25/24, G10L25/30
Inventor: 谌明, 陆健, 徐欣康, 胡新辉
Owner: ZHEJIANG TONGHUASHUN INTELLIGENT TECH CO LTD