Text-to-voice conversion method and device and computer equipment

A text-to-speech conversion technology, applied in speech synthesis, speech analysis, instruments, etc., that addresses problems such as information loss, a heavily mechanical sound, and poor listening quality, and achieves better speech quality and better overall fluency.

Active Publication Date: 2018-09-04
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

However, because this method approximates the signal using prior knowledge derived from human research, it loses a certain amount of information. As a result, the final synthesized speech sounds mechanical and unpleasant to the ear, and it cannot match a natural human voice.

Embodiment Construction

[0026] Embodiments of the present application are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary, and are intended to explain the present application, and should not be construed as limiting the present application.

[0027] Existing text-to-speech conversion schemes do not perform speech conversion directly on the frequency spectrum of the audio. Instead, acoustic parameters such as the fundamental frequency and the spectral envelope are first extracted from the audio, and speech conversion is then performed on these indirect acoustic features. Because the sound spectrum is complex, a large number of approximations are inevitably introduced to simplify the conversion process, which...

Abstract

The invention provides a text-to-voice conversion method and device and computer equipment. The method comprises the following steps: the frames corresponding to the text to be converted are obtained; for the current frame, the vector feature and the text prosodic feature of the phone corresponding to that frame are obtained, together with the mapping feature of the linear spectrum of the frame preceding the current frame; the vector feature, the text prosodic feature and the mapping feature are input into a pre-trained neural network model to obtain the linear spectrum of the current frame; and once the linear spectra of all frames corresponding to the text have been obtained, the voice corresponding to the text is obtained from those linear spectra. Because the voice is obtained directly from the linear spectra of the frames, no approximation error is introduced, so the quality of the resulting voice and its overall fluency are better.
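The frame-by-frame loop described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the feature sizes, the random weights standing in for the pre-trained neural network model, the `tanh` mapping of the previous frame's spectrum, and the all-zero spectrum used before the first frame are all hypothetical. The final step of turning the linear spectra into a waveform is not shown, because the text here does not say how it is done.

```python
import math
import random

# All sizes below are hypothetical; the source does not give dimensions.
N_BINS = 8                                   # linear-spectrum bins per frame
PHONE_DIM, PROSODY_DIM, MAP_DIM = 4, 2, 3    # feature sizes

rng = random.Random(0)


def random_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


W_MAP = random_matrix(MAP_DIM, N_BINS)
W_OUT = random_matrix(N_BINS, PHONE_DIM + PROSODY_DIM + MAP_DIM)


def mapping_feature(prev_spectrum):
    """Hypothetical mapping feature of the previous frame's linear spectrum."""
    return [math.tanh(x) for x in matvec(W_MAP, prev_spectrum)]


def model(phone_vec, prosody_vec, map_feat):
    """Stand-in for the pre-trained neural network model (one linear layer)."""
    inputs = phone_vec + prosody_vec + map_feat
    return [abs(x) for x in matvec(W_OUT, inputs)]  # magnitudes are non-negative


def predict_spectra(phone_vecs, prosody_vecs):
    """Predict one linear spectrum per frame, feeding each result forward."""
    prev = [0.0] * N_BINS  # assumption: zero spectrum before the first frame
    spectra = []
    for phone_vec, prosody_vec in zip(phone_vecs, prosody_vecs):
        cur = model(phone_vec, prosody_vec, mapping_feature(prev))
        spectra.append(cur)
        prev = cur  # the current frame becomes the "previous frame" next time
    return spectra


n_frames = 5
phones = [[rng.gauss(0, 1) for _ in range(PHONE_DIM)] for _ in range(n_frames)]
prosody = [[rng.gauss(0, 1) for _ in range(PROSODY_DIM)] for _ in range(n_frames)]
spectra = predict_spectra(phones, prosody)
print(len(spectra), len(spectra[0]))  # prints "5 8"
```

The point of the sketch is the data flow: each frame's prediction depends on the phone and prosody features of that frame and on the spectrum predicted for the frame before it.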

Description

Technical field

[0001] The present application relates to the technical field of speech synthesis, and in particular to a text-to-speech conversion method, device and computer equipment.

Background

[0002] TTS is the abbreviation of Text To Speech. It is a part of human-machine dialogue, and its purpose is to enable a machine to speak according to text. A phone is the smallest vocal unit of human speech. In Chinese, each initial or final is a phone. For a machine to produce the corresponding sounds from text, an acoustic model must be built for each phone.

[0003] In the existing related art, a vocoder is used for modeling. This modeling method first divides the speech signal into frames, and then divides the acoustic model of each frame into three parts for modeling: (1) whether this frame is voiced; (2) the fundamental frequency of this frame; (3) the impulse response of this frame relative to the fundamental frequenc...
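To make the vocoder-style decomposition in [0003] concrete, here is a rough sketch of the first two per-frame quantities on a synthetic signal. It rests on several assumptions that are not in the source: a 16 kHz sample rate, 25 ms frames with a 10 ms hop, an energy threshold for the voiced decision, and a plain autocorrelation peak search for the fundamental frequency. These are common textbook choices swapped in for illustration; the third quantity, the impulse response, is not modeled here.

```python
import math

SAMPLE_RATE = 16000  # assumption: 16 kHz audio
FRAME_LEN = 400      # assumption: 25 ms frames
HOP = 160            # assumption: 10 ms hop between frames


def split_frames(signal, frame_len=FRAME_LEN, hop=HOP):
    """Divide the signal into overlapping fixed-length frames."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def is_voiced(frame, threshold=0.01):
    """Hypothetical voiced decision: mean energy above a threshold."""
    energy = sum(x * x for x in frame) / len(frame)
    return energy > threshold


def estimate_f0(frame, fmin=50.0, fmax=400.0):
    """Estimate the fundamental frequency from the autocorrelation peak."""
    lo = int(SAMPLE_RATE / fmax)  # shortest candidate period, in samples
    hi = int(SAMPLE_RATE / fmin)  # longest candidate period, in samples
    best_lag, best_val = lo, float("-inf")
    for lag in range(lo, hi + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return SAMPLE_RATE / best_lag


# Synthetic test signal: 0.1 s of a 200 Hz tone followed by 0.1 s of silence.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1600)]
signal = tone + [0.0] * 1600

frames = split_frames(signal)
first, last = frames[0], frames[-1]
print(is_voiced(first), estimate_f0(first))  # voiced frame and its F0 estimate
print(is_voiced(last))                       # silent frame
```

Each of these hand-designed steps is an approximation of the kind the background section refers to.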

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/10, G10L13/04, G10L25/30
CPC: G10L13/04, G10L13/10, G10L25/30
Inventor 张黄斌
Owner BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD