Speech synthesis playing method and device, and storage medium

A speech synthesis and playback method, applied in the field of speech technology, that addresses problems such as the inability to accurately predict the pronunciation of polyphonic characters.

Pending Publication Date: 2019-12-20

TENCENT TECH (SHENZHEN) CO LTD

6 Cites, 5 Cited by

Problems solved by technology

[0004] During the research and practice of the prior art, the inventors of the present invention found that existing speech synthesis technology is deficient in handling polyphonic characters: when faced with an uncommon context, it often cannot accurately predict the pronunciation of a polyphonic character.

Examples

Embodiment 1

[0086] An embodiment of the present invention provides a speech synthesis playback method suitable for a user terminal, including: receiving a speech synthesis request, and obtaining, according to the request, the text to be synthesized; sending the text to be synthesized to a speech synthesis server for speech synthesis, so that the server returns the synthesized speech corresponding to the text; playing the synthesized speech, and receiving a pronunciation correction request for the synthesized speech; receiving correction data corresponding to the synthesized speech according to the pronunciation correction request, and sending the correction data to the speech synthesis server, so that the server updates the synthesized speech according to the correction data and returns the updated synthesized speech; replacing the currently played synthesized speech with the updated synthe...
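The terminal-side flow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `FakeSynthesisServer`, `UserTerminal`, and their method names are hypothetical, the server is an in-memory stand-in, synthesized speech is represented by a tagged string rather than audio, and the correction data is assumed to map a character index to a corrected reading (the source does not specify its format).

```python
class FakeSynthesisServer:
    """Hypothetical in-memory stand-in for the speech synthesis server."""

    def __init__(self):
        self._corrections = {}  # text -> {character index: corrected reading}

    def synthesize(self, text):
        # A tagged string stands in for real audio data.
        fixes = self._corrections.get(text, {})
        tags = "".join(f"[{i}={r}]" for i, r in sorted(fixes.items()))
        return f"<speech:{text}>{tags}"

    def update(self, text, correction_data):
        # Merge the correction data, then re-synthesize the same text.
        self._corrections.setdefault(text, {}).update(correction_data)
        return self.synthesize(text)


class UserTerminal:
    """Hypothetical terminal that plays speech and forwards corrections."""

    def __init__(self, server):
        self.server = server
        self.text = None
        self.now_playing = None

    def handle_synthesis_request(self, text):
        # Send the text to the server and start "playing" the result.
        self.text = text
        self.now_playing = self.server.synthesize(text)
        return self.now_playing

    def handle_correction_request(self, correction_data):
        # Forward the correction data, then replace the speech being played.
        if self.now_playing is None:
            raise RuntimeError("no synthesized speech is being played")
        self.now_playing = self.server.update(self.text, correction_data)
        return self.now_playing


terminal = UserTerminal(FakeSynthesisServer())
first = terminal.handle_synthesis_request("银行")
updated = terminal.handle_correction_request({1: "hang2"})
```

Here the terminal keeps a reference to whatever is currently playing, so replacing it with the updated speech is a single assignment. A real player would also have to decide where in the audio to resume.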

Embodiment 2

[0171] An embodiment of the present invention also provides a speech synthesis playback method suitable for a speech synthesis server, including: when receiving text to be synthesized from a user terminal, performing speech synthesis on the text according to a pre-trained speech synthesis model to obtain synthesized speech; returning the synthesized speech to the user terminal for playback, and receiving the correction data corresponding to the synthesized speech returned by the user terminal; updating the synthesized speech according to the correction data to obtain updated synthesized speech; and returning the updated synthesized speech to the user terminal, so that the user terminal replaces the synthesized speech with the updated synthesized speech for playback.
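As a rough sketch of the server-side update step, the Python below models synthesized speech as a list of per-character readings. Everything here is an assumption made for illustration: the "pre-trained model" is reduced to a toy lookup table (`DEFAULT_READINGS`), the function names are hypothetical, and the correction data is assumed to map a character index to the reading the user supplied. The example uses the polyphonic character 行, which is read "hang2" in 银行 but "xing2" in many other words.

```python
# Toy stand-in for a pre-trained model: one fixed reading per character.
DEFAULT_READINGS = {"银": "yin2", "行": "xing2", "人": "ren2"}


def synthesize(text):
    """Return one reading per character (unknown characters pass through)."""
    return [DEFAULT_READINGS.get(ch, ch) for ch in text]


def apply_correction(speech, correction_data):
    """Return updated speech with corrected readings at the given indexes."""
    updated = list(speech)  # copy, so the original speech is left untouched
    for index, reading in correction_data.items():
        if not 0 <= index < len(updated):
            raise IndexError(f"correction index {index} is out of range")
        updated[index] = reading
    return updated


speech = synthesize("银行")                          # toy model mispredicts 行
updated_speech = apply_correction(speech, {1: "hang2"})
```

Returning a new list instead of mutating the original keeps the previously returned speech valid until the terminal has switched to the updated version.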

[0172] Referring to Figure 3, the flow of the speech synthesis playback method may be as follows:

[0173] 301. When receiving text to be synthesized from a user terminal, perform speech synthe...

Embodiment 3

[0193] According to the methods described in the previous embodiments, examples will be given below for further description.

[0194] As shown in Figure 4, the flow of the speech synthesis playback method may be as follows:

[0195] 401. The user terminal receives the speech synthesis request, extracts the text from the display content of the foreground application according to the request, divides the extracted text into multiple clauses according to a preset sentence segmentation strategy, and sequentially sets each resulting clause as the text to be synthesized and sends it to the speech synthesis server.
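The segmentation part of step 401 might look like the Python sketch below. The source does not describe the preset sentence segmentation strategy, so this assumes a deliberately naive one: a clause ends at common Chinese or Western sentence punctuation. The names `split_into_clauses` and `send_sequentially` are hypothetical, and the `send` callback stands in for the request to the speech synthesis server.

```python
import re

# Assumed strategy: a clause is a run of non-terminator characters followed
# by at most one terminator. This is naive (it also splits "3.14", for example).
_CLAUSE = re.compile(r"[^。！？；!?;.]+[。！？；!?;.]?")


def split_into_clauses(text):
    """Split extracted text into clauses, keeping the ending punctuation."""
    clauses = (match.group().strip() for match in _CLAUSE.finditer(text))
    return [clause for clause in clauses if clause]


def send_sequentially(text, send):
    """Pass each clause, in order, to `send` as the text to be synthesized."""
    for clause in split_into_clauses(text):
        send(clause)


sent = []
send_sequentially("今天天气很好。我们去银行吧！", sent.append)
english = split_into_clauses("Hello world. How are you?")
```

Sending one clause at a time keeps each synthesized unit short, so a correction only needs to refer to the clause currently being played.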

[0196] In the embodiment of the present invention, the user terminal may receive an externally input speech synthesis request in real time, thereby triggering speech synthesis, and converting the corresponding text into speech for output.

[0197] After receiving the speech synthe...

Abstract

The embodiment of the invention discloses a speech synthesis playing method and device, and a storage medium. A user terminal can receive a speech synthesis request, and obtain the text to be synthesized according to the speech synthesis request; the text to be synthesized is then sent to a speech synthesis server for speech synthesis to obtain the corresponding synthesized speech; the synthesized speech is then played, and a pronunciation correction request for the synthesized speech is received; correction data corresponding to the synthesized speech is received according to the pronunciation correction request, and the correction data is sent to the speech synthesis server to update the synthesized speech, so that the updated synthesized speech is obtained; and the currently played synthesized speech is replaced with the updated synthesized speech for playing. Compared with the related art, the invention can correct and update the played synthesized speech in real time while it is being played; therefore, even when the pronunciation of a polyphonic character is predicted wrongly, the pronunciation can be corrected in time.

Description

Technical field

[0001] The present invention relates to the technical field of speech, in particular to a speech synthesis playback method, device and storage medium.

Background technique

[0002] Speech synthesis technology, also known as text-to-speech (TTS) technology, aims to let machines convert text information into voice output through recognition and understanding, so that machines can speak. It is an important branch of future human-computer interaction.

[0003] Speech synthesis technology is widely used, for example in reading web page content aloud, audio reading of novels, and reading e-mail aloud. Take the audio reading of novels as an example: through speech synthesis, user terminals such as mobile phones and tablet computers can read aloud the novels that users are reading, so that users can "read" novels with their eyes closed.

[0004] During the research and practice of the prior art, the inventors of the present invention found that the pol...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L13/08

CPC: G10L13/086

Inventor: 杨木文

Owner: TENCENT TECH (SHENZHEN) CO LTD
