Chinese Mandarin character pronunciation conversion method based on a self-attention mechanism

A Mandarin word-to-sound (character-to-pronunciation) conversion technology, applied in speech analysis, speech synthesis, computer components, etc., that addresses problems such as difficult computation.

Active Publication Date: 2020-05-12
INST OF ACOUSTICS CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to exploit the highly parallel nature of the self-attention model to solve the problem of difficult computation, while also realizing end-to-end direct prediction from a character string to its pronunciation after tone change (tone sandhi).
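The parallelism referred to in [0005] can be illustrated with a minimal sketch of scaled dot-product self-attention. This is not the patented model: the identity projections (Q = K = V = x), the two-dimensional embeddings, and the helper names `softmax`, `matmul`, and `self_attention` are assumptions made for illustration. The point is that every output row depends only on the input sequence and not on a previous output, so all positions can be computed at once.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def self_attention(x):
    """Single-head self-attention with identity projections (assumed: Q = K = V = x)."""
    d = len(x[0])
    x_t = [list(col) for col in zip(*x)]          # transpose of x
    scores = matmul(x, x_t)                       # all pairwise dot products at once
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, x), weights

# Three hypothetical character embeddings (illustrative values only).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(embeddings)
```

Each row of `weights` sums to 1 and describes how strongly one character attends to every character in the sentence, which is how a self-attention layer can capture dependencies between characters without recurrence.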



Examples


Embodiment Construction

[0048] In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0049] 1. Structure of the neural network model:

[0050] Figure 1 is a hierarchical diagram of the Chinese Mandarin word-to-sound conversion neural network model provided by the embodiments of the present invention. As shown in the figure, it includes an embedding layer (Embedding Layer), a self-attention layer (in the figure, Self ...



Abstract

The embodiment of the invention provides a Chinese Mandarin character pronunciation conversion method based on a self-attention mechanism. The method can be used for direct prediction from Chinese sentences to their pronunciation after tone change. It combines multi-task learning and relative position coding with a self-attention model: the self-attention mechanism captures the dependency relationships between characters in the input sentence, and part-of-speech plus three pinyin attributes are introduced into multi-task learning as additional sub-tasks. The tone transfer relationship is modeled with a CRF, and the position information of the sequence is modeled effectively by relative position coding. Finally, the pronunciation can be taken from the main-task prediction, or obtained as a joint decision over the three pinyin-attribute sub-tasks. The method greatly improves the performance of Chinese Mandarin character pronunciation conversion.
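How a CRF-style layer can model tone transitions may be sketched with Viterbi decoding over per-position label scores plus label-to-label transition scores. Everything specific here is an assumption for illustration: the function name `viterbi`, the two-label set, and all numeric scores are hypothetical, and in a trained model the scores would be learned rather than written by hand.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label index sequence.

    emissions[t][j]   : score of label j at position t
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(emissions[0])
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for ending in label j at this position.
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(range(n_labels), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

LABELS = ["tone2", "tone3"]                 # hypothetical label set
emissions = [[0.0, 1.0], [0.0, 2.0]]        # both positions prefer tone3 in isolation
no_transitions = [[0.0, 0.0], [0.0, 0.0]]
penalize_33 = [[0.0, 0.0], [0.0, -5.0]]     # hypothetical penalty for tone3 -> tone3

independent = [LABELS[i] for i in viterbi(emissions, no_transitions)]
with_crf = [LABELS[i] for i in viterbi(emissions, penalize_33)]
```

With zero transition scores each position is decided independently and both syllables come out as `tone3`; with the penalty on a `tone3` to `tone3` transition the decoder instead returns `tone2` followed by `tone3`, showing how transition scores let neighbouring tones influence one another.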

Description

Technical field

[0001] The invention relates to the field of speech synthesis, in particular to a Chinese Mandarin word-to-sound conversion method based on a self-attention mechanism.

Background

[0002] TTS technology is widely used in e-books, voice assistants, car navigation, voice customer service and other products. In Chinese speech synthesis, whether the model is parametric or sequence-to-sequence, the phoneme-level modeling unit is compact enough to be trained effectively. The role of word-to-sound conversion is to map Chinese characters to their pronunciation.

[0003] At the heart of word-to-sound conversion are polyphone disambiguation and tone sandhi, and in some cases the pronunciation is determined by semantics. For example, the character "还" is read "huan2" when it means "to return" and "hai2" when it means "still". Some tone changes also depend on the tonal environment, such as two consecutive third-tone syllables, where the former is usually pronou...
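The two phenomena named in [0003] can be made concrete with a small rule-based sketch. It is only an illustration of the problem, not the patented method, which predicts pronunciations with a neural model. The table `POLYPHONE_READINGS`, the sense labels, and both function names are hypothetical. The sandhi function encodes the widely described rule that a third tone directly before another third tone is read as a second tone; that rule is general background knowledge, not a quotation from the truncated passage.

```python
# Hypothetical sense-to-reading table for the huan2 / hai2 example in the text.
POLYPHONE_READINGS = {
    "还": {"return": "huan2", "still": "hai2"},
}

def read_polyphone(char, sense):
    """Pick a reading for a polyphonic character given a (hypothetical) sense label."""
    return POLYPHONE_READINGS[char][sense]

def apply_third_tone_sandhi(syllables):
    """Rewrite tone 3 as tone 2 when the following syllable is also tone 3.

    Syllables are pinyin strings ending in a tone digit, e.g. "ni3".
    The check uses the original tones, so a run of third tones keeps only
    its last third tone; real speech also depends on phrasing.
    """
    result = list(syllables)
    for i in range(len(syllables) - 1):
        if syllables[i].endswith("3") and syllables[i + 1].endswith("3"):
            result[i] = syllables[i][:-1] + "2"
    return result

reading_return = read_polyphone("还", "return")
reading_still = read_polyphone("还", "still")
greeting = apply_third_tone_sandhi(["ni3", "hao3"])
```

Here `greeting` becomes `["ni2", "hao3"]`. Hand-written tables and rules like these do not scale to context-dependent cases, which is the motivation for learning the mapping end to end.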


Application Information

IPC(8): G10L13/02, G10L13/10, G10L25/30, G06K9/62, G06N3/04
CPC: G10L13/02, G10L13/10, G10L25/30, G06N3/045, G06F18/214
Inventors: 张鹏远, 尚增强, 颜永红
Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI