Word acoustic feature system, training method and system for word acoustic feature system

A word acoustic feature technology, applied in speech analysis, speech synthesis, speech recognition, etc. It addresses the poor synthesis quality that results when models focus only on the meaning of words and ignore their pronunciation, yielding more accurate word acoustic features and higher synthesis quality.

Active Publication Date: 2022-08-05
AISPEECH CO LTD

AI Technical Summary

Problems solved by technology

[0009] Existing models in existing methods focus only on the meaning of a word and ignore its pronunciation, which makes the resulting feature vectors less effective at improving the quality of speech synthesis. The invention aims to solve at least this problem.

Detailed Description of Embodiments

[0045] To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0046] FIG. 1 is a schematic structural diagram of a word acoustic feature system provided by an embodiment of the present invention. The system can be configured in a terminal.

[0047] The word acoustic feature system 10 provided in this embodiment includes a word encoder 11 and a word phoneme aligner 12.
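The two-component structure in [0047] can be pictured with a minimal Python sketch. The internals of the word encoder and the word phoneme aligner are not given in this excerpt, so everything below is an assumption made for illustration: the lookup-table encoder, the zero-vector fallback, the "repeat each word vector once per phoneme" alignment rule, and all class and parameter names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WordEncoder:
    """Hypothetical word encoder: maps each word to a vector via a lookup table."""
    table: dict
    dim: int = 2

    def encode(self, words):
        # Unknown words fall back to a zero vector (an assumption of this sketch).
        return [list(self.table.get(w, [0.0] * self.dim)) for w in words]


@dataclass
class WordPhonemeAligner:
    """Hypothetical aligner: repeats each word vector once per phoneme of that word."""

    def align(self, word_vectors, phonemes_per_word):
        aligned = []
        for vec, n in zip(word_vectors, phonemes_per_word):
            aligned.extend(list(vec) for _ in range(n))
        return aligned


@dataclass
class WordAcousticFeatureSystem:
    """Composition of the two components named in [0047]."""
    encoder: WordEncoder
    aligner: WordPhonemeAligner = field(default_factory=WordPhonemeAligner)

    def __call__(self, words, phonemes_per_word):
        # Encode words, then stretch the result to phoneme resolution.
        return self.aligner.align(self.encoder.encode(words), phonemes_per_word)


# Toy usage: two words with four phonemes each (illustrative values only).
system = WordAcousticFeatureSystem(
    WordEncoder({"hello": [0.1, 0.2], "world": [0.3, 0.4]})
)
features = system(["hello", "world"], [4, 4])
```

The point of the sketch is only the data shape: the output has one vector per phoneme, which is what allows it to be spliced onto a phoneme feature sequence later.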

[004...

Abstract

Embodiments of the present invention provide a method for training a word acoustic feature system. The method includes: splicing the word acoustic features output by the word acoustic feature system with the phoneme feature sequence output by a phoneme encoder to obtain a phoneme feature sequence with word acoustic features, and splicing that sequence with actual prosody features to obtain a phoneme feature sequence with prosody and word acoustic features; adjusting the coding length, adding pitch and energy features, and decoding to obtain a predicted mel spectrum; and training the word acoustic feature system based on the actual mel spectrum and the predicted mel spectrum. Embodiments of the present invention also provide a word acoustic feature system and a training system for the word acoustic feature system. The trained word acoustic feature system yields word acoustic features that capture not only word meaning but also pronunciation, and continued training makes these features more accurate, thereby further improving the quality of speech synthesis.
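The order of operations in the abstract can be traced with a small pure-Python sketch. It is a data-flow illustration under stated assumptions, not the patented implementation: the feature values, the duration-based length adjustment, the way pitch and energy are added, the `toy_decoder`, and the mean absolute error loss are all hypothetical stand-ins, since the abstract does not specify them.

```python
def splice(seq_a, seq_b):
    """Concatenate two equally long sequences of feature vectors, step by step."""
    assert len(seq_a) == len(seq_b)
    return [a + b for a, b in zip(seq_a, seq_b)]


def regulate_length(seq, durations):
    """Repeat each phoneme vector for its number of frames (assumed length adjustment)."""
    frames = []
    for vec, d in zip(seq, durations):
        frames.extend(list(vec) for _ in range(d))
    return frames


def add_pitch_energy(frames, pitch, energy):
    """Add per-frame scalar pitch and energy to every dimension (toy stand-in)."""
    return [[x + p + e for x in f] for f, p, e in zip(frames, pitch, energy)]


def toy_decoder(frames, n_mels):
    """Hypothetical decoder: the mean of each frame, broadcast to n_mels bins."""
    return [[sum(f) / len(f)] * n_mels for f in frames]


def l1_loss(pred, target):
    """Mean absolute error between two mel spectrograms (loss choice is an assumption)."""
    total = sum(abs(p - t) for pf, tf in zip(pred, target) for p, t in zip(pf, tf))
    return total / sum(len(f) for f in pred)


# Illustrative inputs for three phonemes (all values made up).
phoneme_feats = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # phoneme encoder output
word_acoustic = [[1.0], [1.0], [2.0]]                 # word acoustic features, per phoneme
prosody = [[0.0], [0.5], [1.0]]                       # actual prosody features

with_word = splice(phoneme_feats, word_acoustic)      # phoneme + word acoustic features
with_prosody = splice(with_word, prosody)             # + prosody features
frames = regulate_length(with_prosody, [2, 1, 3])     # adjust the coding length
frames = add_pitch_energy(frames, [0.1] * 6, [0.2] * 6)
predicted_mel = toy_decoder(frames, n_mels=4)
actual_mel = [[0.0] * 4 for _ in range(6)]
loss = l1_loss(predicted_mel, actual_mel)             # training signal
```

In a real system the loss would be backpropagated through the decoder into the word acoustic feature system; the sketch stops at computing the loss because no training framework is named in the source.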

Description

Technical field

[0001] The invention relates to the field of intelligent speech, and in particular to a word acoustic feature system and a training method and system for the word acoustic feature system.

Background

[0002] End-to-end text-to-speech synthesis models with sequence-to-sequence architectures have achieved great success in generating natural speech. Word features are obtained through text analysis or by extracting word vector representations from a pre-trained model, and the output of the word vector encoder is then aligned and spliced with the phoneme feature sequence (the output of the phoneme encoder). Ways to obtain these feature vectors include:

[0003] Obtaining word features by statistical methods, such as word frequency, and then using text analysis to generate word feature vectors;

[0004] Extracting the encoder output of a common machine learning task (such as translation) as word vectors;

[0005] Using the BERT coding layer to extract word vectors;

...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/02, G10L25/24, G10L13/10, G10L13/02, G10L15/14
CPC: G10L15/02, G10L25/24, G10L13/10, G10L13/02, G10L15/142, G10L2015/025
Inventor: 俞凯, 沈飞宇, 杜晨鹏
Owner: AISPEECH CO LTD