
Front-end text analysis method based on multi-task learning

A multi-task learning and text analysis technology, applied in the field of speech synthesis, that addresses problems such as the complexity of the training process and achieves the effects of simplifying the training process, reducing workload, and reducing the use of computing resources.

Active Publication Date: 2022-07-05
慧言科技(天津)有限公司

AI Technical Summary

Problems solved by technology

The serial structure of the front end brings two problems.
One is complex feature engineering and data labeling, since each component requires different input and output labels.
The other is that the front-end components must be trained and optimized separately, which makes the training process very complicated.



Examples


Embodiment

[0049] To verify the present invention, experiments were performed on a self-built database. In this dataset, the training set contains 971,500 sentences, the test set and validation set contain 3,000 sentences, the polyphonic word dictionary contains 312 entries, and the prosody labels comprise #1 and #3. The algorithm flow of the whole system is shown in Figure 1. The present invention is described in further detail below with reference to Figure 1.

[0050] Figure 1 is the model framework diagram of the front-end text analysis method based on multi-task learning of the present invention. As shown in Figure 1, the method mainly includes the following steps:

[0051] S1. Data annotation:

[0052] First, perform data processing: segment each sentence in the corpus by word, and filter out sentences with a length of more than 250;

[0053] Then manually label the same source corpus, splicing the polyphonic label and prosodic label corresponding to each ...
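
The data-processing step in [0052] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `segment` and `preprocess` are hypothetical, and the segmentation unit is an assumption (one non-whitespace character per unit), since the source does not define how a sentence is split or how length is counted. Only the threshold of 250 comes from the text.

```python
from typing import Iterable, List

MAX_LEN = 250  # length threshold stated in [0052]


def segment(sentence: str) -> List[str]:
    """Split a sentence into units.

    Assumption: each non-whitespace character is one unit. The source
    does not specify the segmentation unit.
    """
    return [ch for ch in sentence if not ch.isspace()]


def preprocess(corpus: Iterable[str], max_len: int = MAX_LEN) -> List[List[str]]:
    """Segment every sentence and keep only those of at most max_len units."""
    segmented = (segment(s) for s in corpus)
    return [units for units in segmented if len(units) <= max_len]


if __name__ == "__main__":
    sample = ["你好 世界", "a" * 251, "b" * 250]
    kept = preprocess(sample)
    # The 251-unit sentence is dropped; the other two are kept.
    print(len(kept))
```

A different tokenizer can be swapped in for `segment` without changing the filter, as long as length is measured in the same units as the threshold.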



Abstract

The invention discloses a front-end text analysis method based on multi-task learning. The method labels features and results on the same corpus, uses a CNN (Convolutional Neural Network) as a shared layer to extract features from the corpus, then feeds the features into two Bi-LSTMs (Bidirectional Long Short-Term Memory networks) for parallel training, and outputs results for two tasks. It comprises the following steps: S1, data labeling; S2, feature preparation; S3, feature fusion; and S4, classification. The method combines the polyphone prediction and prosody prediction tasks through multi-task learning and realizes a unified end-to-end text processing model, that is, a unified front-end structure, so that a high-quality Mandarin TTS system can be built more quickly and easily. The unified model is trained on the same data as input and predicts polyphones and prosody directly from the original text at the same time. The two tasks can be trained in parallel, which reduces the data labeling workload, saves training cost, outputs both results at the same time, and simplifies the training process.
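
The shared-layer structure described in the abstract can be summarized in equations. This is a hedged sketch rather than the patent's own notation: the symbols $x$, $h$, $W_p$, $W_r$, the softmax output layers, and the weighted-sum loss with weights $\lambda_p$ and $\lambda_r$ are assumptions introduced for illustration. The source states only that a shared CNN feeds two Bi-LSTMs that are trained in parallel, one per task.

```latex
\begin{aligned}
h &= \mathrm{CNN}(x) && \text{shared feature extraction} \\
\hat{y}^{\text{poly}} &= \mathrm{softmax}\big(W_p\,\mathrm{BiLSTM}_p(h)\big) && \text{polyphone prediction} \\
\hat{y}^{\text{pros}} &= \mathrm{softmax}\big(W_r\,\mathrm{BiLSTM}_r(h)\big) && \text{prosody prediction} \\
\mathcal{L} &= \lambda_p\,\mathcal{L}_{\text{poly}} + \lambda_r\,\mathcal{L}_{\text{pros}} && \text{joint objective (assumed form)}
\end{aligned}
```

Because both branches read the same $h$, a single input sentence yields both outputs in one forward pass, which is consistent with the abstract's claim that the two results are output at the same time.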

Description

Technical field

[0001] The invention relates to the technical field of speech synthesis, in particular to a front-end text analysis method based on multi-task learning.

Background technique

[0002] Text-to-Speech (TTS), also known as speech synthesis, aims to synthesize intelligible, natural speech from text. It has a wide range of applications in human communication and has long been a research topic in the fields of artificial intelligence, natural language processing, and speech processing. Developing a TTS system requires knowledge of language and human speech production across multiple disciplines, including linguistics, acoustics, digital signal processing, and machine learning. With the development of deep learning, neural network-based TTS has flourished, and much research has focused on different aspects of neural TTS. As a result, the quality of synthesized speech has greatly improved in recent years.

[0003] In Mandarin text-to-speech synthesis, t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/289, G06N3/04, G06K9/62, G06F16/35
CPC: G06F40/289, G06F16/35, G06N3/044, G06F18/253, Y02D10/00
Inventor: 黎天宇, 张句, 关昊天, 王宇光
Owner: 慧言科技(天津)有限公司