Voice recognition and voice synthesis model training method based on dual learning

A speech recognition and speech synthesis model training technique, applied in the fields of speech recognition and speech synthesis. It addresses the high cost, the time and labor required, and the difficulty of ensuring data quality when collecting paired data, thereby saving cost and mitigating the problem of having too little data.

Inactive Publication Date: 2018-06-08

RUN TECH CO LTD

0 Cites, 17 Cited by

Problems solved by technology

[0003] Traditional methods for training speech recognition and speech synthesis models require a large amount of speech data and text data in one-to-one correspondence. Collecting enough of this paired data is time-consuming and laborious, the quality of the collected data is difficult to guarantee, and the collection itself incurs substantial cost.

An insufficient amount of high-quality data has become a major obstacle to improving the accuracy and conversion efficiency of speech recognition and speech synthesis models.

Embodiment Construction

[0019] The present invention will be further described below in conjunction with specific drawings and embodiments.

[0020] The general idea of the present invention is: first, pre-train the speech recognition model and the speech synthesis model on a small amount of labeled data; then, exploiting the nature of dual learning and a large amount of unlabeled data, further train the speech recognition model and the speech synthesis model in an unsupervised way.

[0021] First, define the inputs of the algorithm: a speech data set D_A and a text data set D_B for training the speech recognition and speech synthesis models; the speech recognition model Θ_AB to be trained; the speech synthesis model Θ_BA to be trained; a pre-trained speech language model LM_A, used to calculate the confidence that speech data was produced by humans rather than generated by machines; a pre-trained text language model LM_B, used to calculate the confidence that text data was written by humans rather than generated by machines; and, when updating parameters, the hyperparameter α used to...
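As a rough illustration, the inputs listed in [0021] could be bundled into a single structure. This is a minimal sketch, not the patent's implementation: the class name `DualTrainingInputs`, the field names, and the `Speech` and `Text` type aliases are all hypothetical, and each model is represented by a plain callable. The role of α is truncated in the source, so the sketch only stores it.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical stand-ins: a speech sample is a sequence of floats, a text is a string.
Speech = Sequence[float]
Text = str


@dataclass
class DualTrainingInputs:
    """Bundle of the inputs named in [0021] (all field names are assumptions)."""
    speech_data: Sequence[Speech]            # D_A: speech data set
    text_data: Sequence[Text]                # D_B: text data set
    recognizer: Callable[[Speech], Text]     # Theta_AB: speech recognition model to be trained
    synthesizer: Callable[[Text], Speech]    # Theta_BA: speech synthesis model to be trained
    speech_lm: Callable[[Speech], float]     # LM_A: confidence that speech is human-produced
    text_lm: Callable[[Text], float]         # LM_B: confidence that text is human-written
    alpha: float                             # hyperparameter alpha; its role is truncated in the source

    def __post_init__(self) -> None:
        # Both data sets are needed before any training can start.
        if not self.speech_data or not self.text_data:
            raise ValueError("speech_data and text_data must be non-empty")


# Toy usage with constant stand-in models.
inputs = DualTrainingInputs(
    speech_data=[[0.1, 0.2]],
    text_data=["hello"],
    recognizer=lambda speech: "hello",
    synthesizer=lambda text: [0.1, 0.2],
    speech_lm=lambda speech: 0.9,
    text_lm=lambda text: 0.8,
    alpha=0.5,
)
```

Keeping the two language models as separate fields mirrors the source, which scores speech and text with different pre-trained models.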

Abstract

The invention provides a voice recognition and voice synthesis model training method based on dual learning. The method comprises the following steps: first, voice recognition acts as the "main task" and voice synthesis acts as the "dual task"; the voice data A is converted into a text B' by the voice recognition model to be trained; a pre-trained text language model calculates the confidence that the text B' was written by humans rather than by machines; the text B' is converted back into voice data A' by the voice synthesis model to be trained; the "reconstruction similarity" between the voice data A' and the original voice data A is calculated using a pre-trained voice language model; and the final "reward" is calculated, and the parameters of the voice recognition model and the voice synthesis model to be trained are updated using the REINFORCE algorithm from reinforcement learning. The method saves much of the cost overhead caused by data collection.
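The loop described in the abstract can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the patented implementation: speech and text are reduced to small integer symbols, both models are softmax tables (`theta_ab`, `theta_ba`), and `text_lm` and `similarity` are hypothetical callables standing in for the pre-trained language models. The source does not state how the two terms are combined into the final reward, so the weighted sum with `alpha` below is an assumption. Only the speech-to-text-to-speech direction is shown.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_update(logits, action, reward, lr):
    """In-place REINFORCE step for a softmax policy.

    The gradient of log pi(action) with respect to the logits is
    onehot(action) - softmax(logits).
    """
    probs = softmax(logits)
    for i, p in enumerate(probs):
        grad_log_pi = (1.0 if i == action else 0.0) - p
        logits[i] += lr * reward * grad_log_pi


def dual_step(a, theta_ab, theta_ba, text_lm, similarity, alpha, lr, rng):
    """One dual-learning step starting from speech symbol `a`."""
    b_prime = sample(softmax(theta_ab[a]), rng)        # recognition: A -> B'
    r_lm = text_lm(b_prime)                            # confidence that B' is human-written
    a_prime = sample(softmax(theta_ba[b_prime]), rng)  # synthesis: B' -> A'
    r_rec = similarity(a, a_prime)                     # reconstruction similarity of A' and A
    # ASSUMPTION: the final reward is a weighted sum; the source does not give the formula.
    reward = alpha * r_lm + (1.0 - alpha) * r_rec
    reinforce_update(theta_ab[a], b_prime, reward, lr)
    reinforce_update(theta_ba[b_prime], a_prime, reward, lr)
    return reward


# Toy usage: 3 speech symbols, 3 text symbols, all logits start at zero.
rng = random.Random(0)
theta_ab = [[0.0] * 3 for _ in range(3)]
theta_ba = [[0.0] * 3 for _ in range(3)]
for _ in range(200):
    a = rng.randrange(3)
    dual_step(a, theta_ab, theta_ba,
              text_lm=lambda b: 1.0,
              similarity=lambda x, y: 1.0 if x == y else 0.0,
              alpha=0.5, lr=0.1, rng=rng)
```

A real system would replace the tables with neural sequence models and would typically subtract a baseline from the reward to reduce the variance of the REINFORCE estimate; neither detail is taken from the source.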

Description

Technical field

[0001] The present invention relates to the technical field of speech recognition and speech synthesis. In particular, it is a method for training speech recognition and speech synthesis models built with deep learning technology, in an unsupervised manner, by exploiting the nature of dual learning together with a large amount of unlabeled data and reinforcement learning technology. The method can be applied in the fields of speech recognition and speech synthesis.

Background

[0002] Speech is the most basic and most effective way for people to communicate in daily life. As artificial intelligence technology matures, people also hope to communicate and exchange information with computers through direct dialogue, so speech recognition and speech synthesis have become a major topic in the field of natural language processing. The demand for various forms such as speech-to-text and text-to-speech synthesis is expan...

Application Information

IPC(8): G10L15/06, G10L13/08, G10L25/27

CPC: G10L15/063, G10L13/08, G10L25/27

Inventors: 杨华兴, 刘云浩

Owner: RUN TECH CO LTD
