Polyphone disambiguation and rhythm control combined method and system and electronic equipment

A joint technique for polyphonic characters and prosody, applied in the field of Chinese speech synthesis, that addresses problems such as errors accumulating across modules and degrading the quality of the synthesized speech

Active Publication Date: 2021-07-30

HISENSE VISUAL TECH CO LTD

6 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] At present, the general practice is to use two independent models in the front-end processing, a polyphone disambiguation model and a prosodic prediction model, to perform polyphone disambiguation and prosodic prediction separately. This pipeline-style processing causes the errors of each module to accumulate, which degrades the final speech synthesis quality.
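The accumulation of errors can be illustrated with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the modules fail independently, so the share of inputs processed correctly by every stage is the product of the per-stage accuracies. The stage names and accuracy values are hypothetical and do not come from the patent.

```python
from math import prod

def pipeline_accuracy(module_accuracies):
    """Fraction of inputs that every stage handles correctly.

    Assumes independent errors, which is a simplification.
    """
    return prod(module_accuracies)

# Hypothetical per-module accuracies, not measured values.
stages = {
    "polyphone_disambiguation": 0.95,
    "prosody_prediction": 0.90,
}

overall = pipeline_accuracy(stages.values())
print(f"overall accuracy under independence: {overall:.3f}")
```

Under these assumptions the combined accuracy is lower than that of either module alone, which is the effect a jointly trained model is meant to reduce.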

Embodiment Construction

[0054] To make the purposes, technical solutions, and advantages of the exemplary embodiments of the present application clearer, the technical solutions in those embodiments are described clearly and completely below with reference to the accompanying drawings. The described exemplary embodiments are only some of the embodiments of the present application, not all of them.

[0055] Based on the exemplary embodiments shown in this application, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the scope of protection of this application. In addition, although the disclosures in this application are introduced by way of one or several exemplary examples, it should be understood that each aspect of these disclosures may also independently constitute a complete technical solution.

[0056] It sho...

Abstract

The invention provides a joint polyphone disambiguation and prosody (rhythm) control method, a corresponding system, and electronic equipment. The method comprises: obtaining a to-be-processed text and its parts of speech, converting them into character vectors and part-of-speech vectors, and concatenating (splicing) these to obtain a spliced vector; training with an alternating training strategy to obtain a joint model, a first group of weights, and a second group of weights, wherein the joint model comprises a first neural network and a second neural network; encoding the spliced vector through the joint model to obtain a first in-sentence encoding and a second in-sentence encoding of the character; computing a polyphone weighted sum according to the first group of weights and passing it through a first fully connected layer to obtain the pronunciation probability distribution of the polyphone; masking out incorrect pronunciations in that distribution to obtain the final pronunciation prediction; and computing a prosody weighted sum according to the second group of weights and passing it through a second fully connected layer and a conditional random field to obtain the prosodic pause level. This eliminates the error accumulation caused by pipeline-style processing and improves the computation speed of text-to-speech conversion.
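Three of the steps in the abstract (splicing, the weighted sum, and pronunciation masking) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes each group of weights holds one scalar per in-sentence encoding, it replaces the fully connected layer with a hand-written score vector, and it renormalizes after masking. The pronunciation inventory, the `VALID` lookup table, and all numeric values are hypothetical.

```python
import math

def splice(char_vec, pos_vec):
    """Concatenate a character vector with its part-of-speech vector."""
    return list(char_vec) + list(pos_vec)

def weighted_sum(weights, encodings):
    """Combine same-length encodings using one scalar weight each (assumed form)."""
    dim = len(encodings[0])
    return [sum(w * enc[i] for w, enc in zip(weights, encodings)) for i in range(dim)]

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mask_pronunciations(probs, mask):
    """Zero out pronunciations the character cannot take, then renormalize."""
    kept = [p * m for p, m in zip(probs, mask)]
    total = sum(kept)
    return [k / total for k in kept]

# Hypothetical toy inventory and per-character valid readings.
INVENTORY = ["hang2", "xing2", "le4", "yue4"]
VALID = {"行": {"hang2", "xing2"}, "乐": {"le4", "yue4"}}

spliced = splice([0.1, 0.2], [1.0])
combined = weighted_sum([0.75, 0.25], [[1.0, 2.0], [3.0, 4.0]])

# Stand-in for the output of the first fully connected layer (made-up scores).
scores = [0.2, 1.1, 3.0, 0.5]
probs = softmax(scores)
mask = [1.0 if p in VALID["行"] else 0.0 for p in INVENTORY]
masked = mask_pronunciations(probs, mask)
prediction = INVENTORY[max(range(len(masked)), key=masked.__getitem__)]
print(spliced, combined, prediction)
```

In this toy case the unmasked distribution would favor a reading that the character 行 cannot take; the mask restricts the choice to its valid readings before the final prediction is made.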

Description

Technical field

[0001] The present application relates to the technical field of Chinese speech synthesis, and in particular to a joint polyphone disambiguation and prosody control method, system, and electronic equipment.

Background

[0002] To avoid mispronounced polyphonic characters or overly flat speech in text-to-speech output, and to make the synthesized speech more accurate and "personalized", polyphone disambiguation and prosodic pause control are often added during processing.

[0003] In the traditional approach, text-to-speech mainly comprises two parts: front-end text-to-phoneme conversion and back-end phoneme-to-speech-signal conversion. The back-end processing is based on acoustic features and is used to achieve end-to-end training and synthesis, while the front end includes the clause segmentation model, text regularization model, natural tone sandhi model, polyphone di...

Application Information

IPC(8): G06F40/284, G06F40/44, G06N3/04, G06N3/08

CPC: G06F40/284, G06F40/44, G06N3/08, G06N3/044, G06N3/045

Inventor: 马明, 刘宇

Owner: HISENSE VISUAL TECH CO LTD
