Voice synthesis method and device and computer readable storage medium

A speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc. It addresses the large storage footprint, high deployment cost, and practical difficulty of maintaining per-user acoustic models, with the effect of reducing model storage and computing power requirements and therefore cost.

Active Publication Date: 2020-06-12
HUAWEI TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] However, personalized TTS by its nature requires a separate acoustic model for each user. A product with millions of users must therefore provide millions of different acoustic models, and storing that many models requires a large amount of storage space. Processing equipment such as servers that performs the speech synthesis must also meet high configuration and computing power requirements. This greatly increases the deployment cost of acoustic models and makes practical application more difficult.
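To make the scale of the storage problem concrete, the following back-of-the-envelope sketch multiplies the number of users by a per-user model size. The 50 MB model size and the helper name `per_user_model_storage_gb` are assumptions chosen purely for illustration; the source gives no model sizes.

```python
def per_user_model_storage_gb(num_users: int, model_size_mb: float) -> float:
    """Total server storage in GB (1 GB = 1024 MB) when every user
    has a complete acoustic model of their own stored on the server."""
    return num_users * model_size_mb / 1024


# Hypothetical figures: one million users, 50 MB per acoustic model.
total_gb = per_user_model_storage_gb(num_users=1_000_000, model_size_mb=50.0)
print(f"{total_gb:.1f} GB")
```

Storage grows linearly with the number of users, which is why a per-user full model becomes costly at the scale the paragraph describes.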


Examples

Embodiment Construction

[0065] The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.

[0066] The solution of this application can be applied in various voice interaction scenarios, such as smart voice assistants on mobile phones, smart bracelets, and other wearable devices that can produce sound, smart speakers, and various machines or devices that can talk to people. When these devices interact with people, they can output personalized voices customized by the user. Several possible application scenarios of personalized speech synthesis are introduced below.

[0067] Application Scenario 1: Voice Cloning

[0068] In the voice cloning scenario, a speaker's voice can be simulated so that the user hears the speaker voice that they have customized. Voice cloning can be widely used in life and work. For example, it can be ...

Abstract

The invention provides a voice synthesis method and device and a computer readable storage medium, and relates to the field of artificial intelligence, in particular to a voice synthesis technology in the field of voice recognition. The method comprises the following steps: acquiring to-be-processed data of a first user; processing the to-be-processed data through a target model to obtain first data, wherein the target model is obtained by training a first sub-model of a basic acoustic model on the personalized training data of the first user; sending the first data to a data processing device; and receiving a processing result, wherein the processing result is obtained by the data processing device processing the first data based on a second sub-model of the basic acoustic model. By splitting the processing between the terminal and the data processing device, the requirements on the model storage and computing capability of the data processing device are reduced, so that the deployment and implementation cost of personalized speech synthesis is greatly reduced.
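The division of work described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names `Terminal` and `DataProcessingDevice`, the toy `LinearLayer` standing in for each sub-model, and all numeric values are assumptions. The abstract does not specify the model architecture or the transport between terminal and device, so the "send" step is modeled as a plain method call.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class LinearLayer:
    """Toy stand-in for a sub-model: y = W x + b (hypothetical)."""
    weights: List[Vector]
    bias: Vector

    def forward(self, x: Vector) -> Vector:
        return [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(self.weights, self.bias)
        ]


class DataProcessingDevice:
    """Stores one shared second sub-model that serves every user."""

    def __init__(self, second_sub_model: LinearLayer) -> None:
        self.second_sub_model = second_sub_model

    def process(self, first_data: Vector) -> Vector:
        return self.second_sub_model.forward(first_data)


class Terminal:
    """Stores one user's target model (the personalized first sub-model)."""

    def __init__(self, target_model: LinearLayer) -> None:
        self.target_model = target_model

    def synthesize(self, to_be_processed: Vector,
                   device: DataProcessingDevice) -> Vector:
        # Step 1: run the personalized first sub-model on the terminal.
        first_data = self.target_model.forward(to_be_processed)
        # Step 2: send the first data and receive the processing result.
        return device.process(first_data)


# Two users with different target models share a single device-side model.
device = DataProcessingDevice(LinearLayer([[1.0, 1.0]], [0.0]))
alice = Terminal(LinearLayer([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]))
bob = Terminal(LinearLayer([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0]))

features = [1.0, 2.0]
alice_out = alice.synthesize(features, device)
bob_out = bob.synthesize(features, device)
print(alice_out, bob_out)
```

In this sketch the device holds a single set of second sub-model parameters regardless of the number of users, while each user's personalization lives only on that user's terminal. The same input yields different outputs for the two users even though the device-side model is identical.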

Description

Technical field

[0001] The present application relates to the field of artificial intelligence, in particular to a speech synthesis technology in the field of speech recognition, and more specifically, to a speech synthesis method, device and computer-readable storage medium.

Background

[0002] In recent years, speech synthesis technology has made great progress, and machine voice broadcasts have been widely used in smart mobile terminals, smart homes, car audio and other equipment. People no longer require speech synthesis merely to be "able to hear clearly"; they now expect it to be "expressive" and "personalized". The personalized function of speech synthesis has therefore gradually become the "black technology" advertised by many products and a highlight of product competitiveness. Personalized speech synthesis (text to speech, TTS) system, that is, a speech synthesis system that integrates user-customized features, wh...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/027, G10L13/04
CPC: G10L13/027
Inventors: 邓利群, 张旸, 郑念祖, 王雅圣
Owner: HUAWEI TECH CO LTD