Speech synthesis playing method and device, and storage medium

A speech synthesis playback method, applied in the field of speech technology, that addresses problems such as the inability to accurately predict the pronunciation of polyphonic characters.

Pending Publication Date: 2019-12-20
TENCENT TECH (SHENZHEN) CO LTD
6 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0004] During research on and practice of the prior art, the inventors of the present invention found that existing speech synthesis technology handles polyphonic characters poorly: when faced with an uncommon context, it often cannot accurately predict the pronunciation of a polyphonic character.



Examples


Embodiment 1

[0086] An embodiment of the present invention provides a speech synthesis playback method suitable for a user terminal, including: receiving a speech synthesis request, and obtaining, according to the request, the text to be synthesized; sending the text to be synthesized to a speech synthesis server for speech synthesis, so that the server returns the synthesized speech corresponding to the text; playing the synthesized speech, and receiving a pronunciation correction request for it; receiving, according to the pronunciation correction request, correction data corresponding to the synthesized speech, and sending the correction data to the speech synthesis server, so that the server updates the synthesized speech according to the correction data and returns the updated synthesized speech; replacing the currently played synthesized speech with the updated synthe...
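The terminal-side flow in [0086] can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `StubSynthesisServer` and `UserTerminal`, the string used in place of real audio, and the dictionary format of the correction data are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field


class StubSynthesisServer:
    """Hypothetical stand-in for the speech synthesis server."""

    def synthesize(self, text: str) -> str:
        # A tagged string stands in for real audio data.
        return f"audio<{text}>"

    def update(self, text: str, correction: dict) -> str:
        # Re-synthesize with the corrected readings attached (assumed format).
        readings = ",".join(f"{ch}={py}" for ch, py in sorted(correction.items()))
        return f"audio<{text}|{readings}>"


@dataclass
class UserTerminal:
    server: StubSynthesisServer
    text: str = ""
    now_playing: str = ""
    history: list = field(default_factory=list)

    def handle_synthesis_request(self, text: str) -> None:
        # Obtain the text, send it to the server, and play the result.
        self.text = text
        self._play(self.server.synthesize(text))

    def handle_correction_request(self, correction: dict) -> None:
        # Forward the correction data, then replace what is being played.
        self._play(self.server.update(self.text, correction))

    def _play(self, audio: str) -> None:
        self.now_playing = audio
        self.history.append(audio)


terminal = UserTerminal(StubSynthesisServer())
terminal.handle_synthesis_request("银行")
terminal.handle_correction_request({"行": "hang2"})
```

After the correction request, `now_playing` holds the updated speech, while `history` still records that the original version was played first.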

Embodiment 2

[0171] An embodiment of the present invention also provides a speech synthesis playback method suitable for a speech synthesis server, including: when receiving text to be synthesized from a user terminal, performing speech synthesis on the text according to a pre-trained speech synthesis model to obtain synthesized speech; returning the synthesized speech to the user terminal for playback, and receiving the correction data corresponding to the synthesized speech returned by the user terminal; updating the synthesized speech according to the correction data to obtain updated synthesized speech; and returning the updated synthesized speech to the user terminal, so that the user terminal replaces the synthesized speech with the updated synthesized speech for playback.
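A minimal sketch of the server side in [0171], assuming that correction data maps a character position to the corrected reading. The lookup table `DEFAULT_READINGS` is a toy stand-in for the pre-trained speech synthesis model, and the class and function names are hypothetical; the patent text shown here does not specify any of these details.

```python
# Toy stand-in for the pre-trained model: one default reading per character.
DEFAULT_READINGS = {"银": "yin2", "行": "xing2"}


def predict_readings(text):
    """Predict one reading per character; unknown characters map to '?'."""
    return [DEFAULT_READINGS.get(ch, "?") for ch in text]


class SynthesisServer:
    def __init__(self):
        self._latest = {}  # text -> most recent list of readings

    def synthesize(self, text):
        # First pass: rely on the model's predictions only.
        readings = predict_readings(text)
        self._latest[text] = readings
        return list(readings)

    def update(self, text, corrections):
        # Apply corrections (assumed format: {character index: reading}).
        readings = list(self._latest.get(text) or predict_readings(text))
        for index, reading in corrections.items():
            if not 0 <= index < len(readings):
                raise IndexError(f"no character at position {index}")
            readings[index] = reading
        self._latest[text] = readings
        return list(readings)


server = SynthesisServer()
first = server.synthesize("银行")         # the stand-in model picks "xing2" for 行
updated = server.update("银行", {1: "hang2"})  # the user supplies the right reading
```

The sketch returns reading sequences rather than audio; a real server would pass the corrected readings on to its synthesis back end before returning speech to the terminal.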

[0172] Referring to Figure 3, the flow of the speech synthesis playback method can be as follows:

[0173] 301. When receiving text to be synthesized from a user terminal, perform speech synthe...

Embodiment 3

[0193] According to the methods described in the previous embodiments, examples will be given below for further description.

[0194] As shown in Figure 4, the flow of the speech synthesis playback method can be as follows:

[0195] 401. The user terminal receives a speech synthesis request, extracts the text from the display content of the foreground application according to the request, divides the extracted text into multiple clauses according to a preset sentence segmentation strategy, and sets each clause in turn as the text to be synthesized and sends it to the speech synthesis server.
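Step 401 can be illustrated with a short sketch. The patent text shown here does not define the preset sentence segmentation strategy, so the code below assumes one: a clause ends at any common Chinese or Western sentence-ending punctuation mark. The function names and the `send` callback are hypothetical.

```python
import re

# Assumed strategy: a clause is a run of non-terminator characters
# followed by any terminators that close it.
_CLAUSE = re.compile(r"[^。！？!?；;.]+[。！？!?；;.]*")


def split_into_clauses(text):
    """Split extracted text into clauses, keeping closing punctuation."""
    clauses = (match.group().strip() for match in _CLAUSE.finditer(text))
    return [clause for clause in clauses if clause]


def send_clauses(text, send):
    """Send each clause, in order, as one unit of text to be synthesized."""
    clauses = split_into_clauses(text)
    for clause in clauses:
        send(clause)
    return clauses


sent = []
send_clauses("今天天气很好。我们去银行吧！", sent.append)
```

Because this assumed rule treats every period as a terminator, it would also split decimals and abbreviations; a production strategy would need extra handling for those cases.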

[0196] In the embodiment of the present invention, the user terminal may receive an externally input speech synthesis request in real time, thereby triggering speech synthesis, and converting the corresponding text into speech for output.

[0197] After receiving the speech synthe...


Abstract

An embodiment of the invention discloses a speech synthesis playback method and device, and a storage medium. A user terminal receives a speech synthesis request and obtains, according to the request, the text to be synthesized. The text is sent to a speech synthesis server for speech synthesis to obtain the corresponding synthesized speech. The synthesized speech is then played, and a pronunciation correction request for it is received. Correction data corresponding to the synthesized speech is received according to the pronunciation correction request and sent to the speech synthesis server, which updates the synthesized speech to obtain updated synthesized speech. The currently played synthesized speech is then replaced with the updated synthesized speech. Compared with the related art, the invention can correct and update the synthesized speech in real time while it is being played; therefore, even when the pronunciation of a polyphonic character is predicted wrongly, it can be corrected in time.

Description

Technical field

[0001] The present invention relates to the technical field of speech, and in particular to a speech synthesis playback method, device and storage medium.

Background

[0002] Speech synthesis technology, also known as text-to-speech (TTS) technology, aims to let machines convert text information into voice output through recognition and understanding, so that machines can speak. It is an important branch of future human-computer interaction.

[0003] Speech synthesis technology is widely used, for example in reading web page content aloud, audio reading of novels, and e-mail reading. Take the audio reading of novels as an example: through speech synthesis, user terminals such as mobile phones and tablet computers can read a novel aloud, so that users can "read" novels with their eyes closed.

[0004] During the research and practice of the prior art, the inventors of the present invention found that the pol...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/08
CPC: G10L13/086
Inventor: 杨木文
Owner: TENCENT TECH (SHENZHEN) CO LTD