Method and system for simulating reading and pronunciation of real person

Real-person voice technology, applied to speech synthesis, speech analysis, instruments, etc. It addresses problems such as tone and emotion that are not rich enough, pronunciation that does not sound like a real person reading, and pronunciation that is rigid and not vivid.

Active Publication Date: 2015-08-26

ZHANGJIAGANG INST OF IND TECH SOOCHOW UNIV +1

Cites: 3. Cited by: 12.


Problems solved by technology

[0004] In view of this, the present application provides a method and system for simulating a real person reading aloud, to overcome the problems of text pronunciation systems in the prior art: pronunciation that is rigid and not vivid, tone that is not rich enough, and an inability to reproduce the way a real person reads and pronounces text.


Examples


Embodiment 1

[0054] Figure 1 is a flow chart of the method for simulating a real person's reading pronunciation provided in Embodiment 1 of the present application. As shown in Figure 1, the method includes:

[0055] S101: Input the text to be pronounced into the pre-built classification model.

[0056] The classification model is built in advance from a large number of webcaster audio recordings collected from the Internet, together with their corresponding texts. After the text to be pronounced is input, suitable speech can then be found according to the classification model and pronounced.

[0057] S102: Split the text to be pronounced into multiple words, and acquire the text vector of each word in sequence.

[0058] In practical applications, the text to be pronounced may be a sentence or a paragraph. When the text to be pronounced is a sentence, the sentence can be split directly into multiple words; when the text to be pronounced is a paragraph, you first need to split the te...
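The splitting described in S102 can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the function names are hypothetical, sentence boundaries are assumed to be `.`, `!` or `?`, and words are assumed to be runs of word characters. Chinese text would need a dedicated word segmenter instead.

```python
import re


def split_into_sentences(text):
    """Split a paragraph into sentences (assumed rule: '.', '!' or '?' ends one)."""
    chunks = re.findall(r"[^.!?]+[.!?]*", text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


def split_into_words(sentence):
    """Split one sentence into words (assumed rule: runs of word characters)."""
    return re.findall(r"\w+", sentence)


def split_text(text):
    """Handle both cases from the text: a single sentence or a whole paragraph."""
    return [split_into_words(s) for s in split_into_sentences(text)]
```

A single sentence yields one word list, and a paragraph yields one word list per sentence, so later steps can process the words in sequence.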

Embodiment 2

[0064] On the basis of Embodiment 1, Embodiment 2 of the present application provides a method of simulating the pronunciation of a real person reading aloud, the method comprising:

[0065] Input the text to be pronounced into the pre-built classification model.

[0066] Split the text to be pronounced into multiple words, and obtain the text vector of each word in turn.

[0067] Obtain the speech vector corresponding to the word according to the text vector of the word.

[0068] Convert the speech vector to audio and play it through the player.
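The four steps above can be sketched end to end. Everything concrete here is an assumption made for illustration: the two lookup tables stand in for the trained classification model, a speech vector is assumed to be a (pitch in Hz, duration in ms) pair, the nearest-key lookup is a stand-in for the model's classification, and audio is rendered as plain sine tones rather than real speech.

```python
import math

# Hypothetical toy tables standing in for the pre-built classification model.
TEXT_VECTORS = {"hello": (1.0, 0.0), "world": (0.0, 1.0)}
# Assumed encoding of a speech vector: (pitch in Hz, duration in milliseconds).
SPEECH_VECTORS = {(1.0, 0.0): (220, 200), (0.0, 1.0): (330, 300)}


def text_vector(word):
    """Look up the text vector of a word; unknown words return None."""
    return TEXT_VECTORS.get(word.lower())


def speech_vector(tvec):
    """Pick the speech vector whose key is closest to the text vector."""
    nearest = min(
        SPEECH_VECTORS,
        key=lambda k: sum((a - b) ** 2 for a, b in zip(k, tvec)),
    )
    return SPEECH_VECTORS[nearest]


def to_samples(svec, rate=8000):
    """Render a speech vector as 16-bit sine-wave samples (a toy 'audio')."""
    pitch, millis = svec
    count = rate * millis // 1000
    return [
        int(16000 * math.sin(2 * math.pi * pitch * i / rate))
        for i in range(count)
    ]


def synthesize(text):
    """Split the text, map each word to a speech vector, and join the audio."""
    samples = []
    for word in text.split():
        tvec = text_vector(word)
        if tvec is None:  # assumption: skip words the toy model does not know
            continue
        samples.extend(to_samples(speech_vector(tvec)))
    return samples
```

The returned sample list could then be written to a sound file or handed to a player; that last step is left out because the source does not say which player is used.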

[0069] Figure 2 is a flow chart of constructing the classification model provided in Embodiment 2 of the present application. As shown in Figure 2, the construction method of the classification model includes:

[0070] S201: Collect a training sample set.

[0071] Specifically, a large number of samples need to be taken in advance, and a sentence is generally taken as a sample. It should be noted that when...

Embodiment 3

[0095] To implement the method for simulating a real person's reading pronunciation described in Embodiment 1, Embodiment 3 of the present application provides a system for simulating a real person's reading pronunciation. Figure 5 is a schematic structural diagram of the system provided in Embodiment 3 of the present application. As shown in Figure 5, the system includes a construction unit 401, an input unit 402, a split unit 403, an acquisition unit 404 and a conversion unit 405, wherein:

[0096] The construction unit 401 is configured to pre-construct the classification model.

[0097] The classification model is built in advance by the construction unit 401 from a large number of webcaster audio recordings collected from the Internet, together with their corresponding texts. In this way, after inputting the text to be pronounced, a suitable speech can be found out according to the classificat...
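One way to picture how the five units fit together is the sketch below. It is a hypothetical arrangement, not the patent's design: the source text is truncated before it describes units 402 to 405, so their roles are inferred from the method steps of Embodiment 1, and the class name, field names, and toy callables are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Model = Dict[str, Vector]


@dataclass
class ReadingSystem:
    construct: Callable[[], Model]             # construction unit 401
    split: Callable[[str], List[str]]          # split unit 403
    acquire: Callable[[Model, str], Vector]    # acquisition unit 404
    convert: Callable[[Vector], bytes]         # conversion unit 405

    def run(self, text: str) -> List[bytes]:
        """Input unit 402 (assumed role): feed the text through the other units."""
        model = self.construct()
        return [self.convert(self.acquire(model, w)) for w in self.split(text)]


# Toy wiring: every callable below is a placeholder, not a real model.
demo = ReadingSystem(
    construct=lambda: {"hi": [1.0]},
    split=str.split,
    acquire=lambda model, word: model.get(word, [0.0]),
    convert=lambda vec: bytes([int(vec[0])]),
)
```

Keeping each unit as a separate callable mirrors the one-unit-per-step structure of the embodiment and lets any unit be swapped without touching the others.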


Abstract

The invention provides a method and system for simulating a real person's reading pronunciation. The method comprises the following steps: inputting a text to be pronounced into a classification model constructed in advance; splitting the text to be pronounced into a plurality of words and obtaining a text vector of each word in sequence; obtaining the speech vector corresponding to each word according to the text vector of that word; and converting the speech vectors into audio and playing the audio through a player. Because the text is split into words and each word's speech vector is obtained from its text vector through the pre-built classification model, the generated sound is natural, and a whole text can be read with the sentence or paragraph as the basic unit. Each word keeps its own properties, such as pronunciation, tone and pauses, so the effect of real-person pronunciation is achieved.

Description

Technical field

[0001] The present application relates to the technical field of human pronunciation, and in particular to a method and system for simulating a real person's reading pronunciation.

Background

[0002] Many text pronunciation systems are now on the market, such as Lingoes and text-to-speech broadcasting systems, which can convert a piece of text into speech. Traditional text pronunciation uses TTS (Text To Speech, i.e. speech synthesis) technology, a process that converts text into speech output. The process mainly decomposes the text to be output into phonemes, character by character or word by word; analyzes numbers, currency units, word inflections, punctuation, and other symbols in the text that require special processing; and then generates digital audio from the phonemes, which is played through a speaker or saved as a sound file for playback with multimedia software.

[0003] The more common TTS technology now is realized by the voice g...


Application Information


IPC(8): G10L13/02

Inventors: 严建峰, 李云飞, 杨晓峰, 贾俊铖, 杨璐

Owner: ZHANGJIAGANG INST OF IND TECH SOOCHOW UNIV
