System and method for voice transformation

Voice technology applied in the field of voice transformation. It addresses the problems of robotic, unnatural-sounding voice and severe degradation of voice quality, and achieves high-quality speech and improved accuracy of automatic speech recognition.

Active Publication Date: 2014-05-22
THE TRUSTEES OF COLUMBIA UNIV IN THE CITY OF NEW YORK

AI Technical Summary

Benefits of technology

The patent text describes a method for making high-quality speech using a compact speech database. This method involves using a parametric representation of speech signals to modify speech segments to meet different requirements, such as adult male or female voices with different emergency situations. The method can also be used to improve the accuracy of automatic speech recognition. Overall, this invention provides a way to build better speech synthesis and recognition systems using advanced technologies.

Problems solved by technology

These systems generate intelligible speech, but the speech sounds robotic and unnatural.
However, using those methods, the quality of voice is severely degraded.
Obviously, the accuracy of speech parameterization affects the overall accuracy.

Embodiment Construction

[0018]Various exemplary embodiments of the present invention are implemented on a computer system including one or more processors and one or more memory units. In this regard, according to exemplary embodiments, steps of the various methods described herein are performed on one or more computer processors according to instructions encoded on a computer-readable medium.

[0019]FIG. 1 is a block diagram of the voice transformation system according to an exemplary embodiment of the present invention. The source is the voice of a speaker 101. Through a microphone 102, the voice is converted into an electrical signal and recorded in the computer as a PCM (Pulse Code Modulation) signal 103. The PCM signal 103 is then segmented by segmenter 104 into frames 105, according to segment points 110. There are two methods to generate the segment points. The first is to use an electroglottograph (EGG) 106 to detect the glottal closure instants (GCI) 107 directly (see FIG. 2). The second one is to...
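The segmentation step described above can be illustrated with a minimal sketch. It assumes the segment points are already available as sample indices (for example, GCIs detected from an EGG signal); the function name `segment_frames` and the choice to discard samples outside the first and last segment points are illustrative assumptions, not details taken from the patent.

```python
from typing import List, Sequence


def segment_frames(pcm: Sequence[int], segment_points: Sequence[int]) -> List[List[int]]:
    """Cut a PCM signal into non-overlapping frames at the given segment points.

    Hypothetical helper: `segment_points` are sample indices (e.g. GCIs).
    Each frame runs from one segment point up to, but not including, the next.
    Samples before the first point and after the last point are discarded.
    """
    # Keep only valid indices, remove duplicates, and put them in time order.
    points = sorted({p for p in segment_points if 0 <= p <= len(pcm)})
    # Pair each segment point with its successor to delimit one frame.
    return [list(pcm[start:end]) for start, end in zip(points, points[1:])]


if __name__ == "__main__":
    signal = list(range(10))          # stand-in for recorded PCM samples
    frames = segment_frames(signal, [5, 2, 9])
    print(frames)
```

Because the frames are bounded by consecutive segment points, their lengths follow the local pitch period rather than a fixed window size.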

Abstract

The present invention is a method and system for converting a speech signal into a parametric representation in terms of timbre vectors, and for recovering the speech signal from that representation. The speech signal is first segmented into non-overlapping frames using glottal closure instant information. Each frame is converted into an amplitude spectrum using a Fourier analyzer, and Laguerre functions are then used to generate a set of coefficients that constitute a timbre vector. A sequence of timbre vectors can be subjected to a variety of manipulations. The new timbre vectors are converted back into voice signals by first transforming them into amplitude spectra using Laguerre functions, then generating phase spectra from the amplitude spectra using Kramers-Kronig relations. A Fourier transformer converts the amplitude spectra and phase spectra into elementary waveforms, which are then superposed to form the output voice. The method and system can be used for voice transformation, speech synthesis, and automatic speech recognition.
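The phase-recovery step can be sketched numerically. The example below is not the patent's implementation: it assumes the elementary waveform is minimum-phase and uses cepstral folding, a common discrete counterpart of the Kramers-Kronig relations, to derive a phase spectrum from an amplitude spectrum. The function names, the naive DFT, and the requirement of an even-length, strictly positive, symmetric amplitude spectrum are all assumptions made for illustration.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive O(N^2) discrete Fourier transform (forward or inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def phase_from_amplitude(amplitude: Sequence[float]) -> List[float]:
    """Minimum-phase spectrum from an amplitude spectrum (assumed method).

    `amplitude` must cover all N DFT bins (N even), be strictly positive,
    and be symmetric as for a real signal.
    """
    n = len(amplitude)
    # Real cepstrum: inverse DFT of the log amplitude spectrum.
    cepstrum = [c.real for c in dft([math.log(a) for a in amplitude], inverse=True)]
    # Fold the cepstrum onto non-negative quefrencies to make it causal.
    folded = [0.0] * n
    folded[0] = cepstrum[0]
    folded[n // 2] = cepstrum[n // 2]
    for i in range(1, n // 2):
        folded[i] = 2.0 * cepstrum[i]
    # The imaginary part of the transformed causal cepstrum is the phase.
    return [z.imag for z in dft(folded)]


def elementary_waveform(amplitude: Sequence[float]) -> List[float]:
    """Combine amplitude and derived phase, then return to the time domain."""
    phase = phase_from_amplitude(amplitude)
    spectrum = [a * cmath.exp(1j * p) for a, p in zip(amplitude, phase)]
    return [v.real for v in dft(spectrum, inverse=True)]


if __name__ == "__main__":
    # A short minimum-phase test signal: its waveform should be recovered
    # (approximately) from the amplitude spectrum alone.
    original = [1.0, 0.5] + [0.0] * 62
    amp = [abs(z) for z in dft(original)]
    recovered = elementary_waveform(amp)
    print([round(v, 6) for v in recovered[:4]])
```

The sketch omits the Laguerre-function expansion that maps between amplitude spectra and timbre vectors, since the text shown here does not give its details.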

Description

FIELD OF THE INVENTION

[0001]The present invention generally relates to voice transformation, in particular to voice transformation using orthogonal functions, and its applications in speech synthesis and automatic speech recognition.

BACKGROUND OF THE INVENTION

[0002]Voice transformation involves parameterization of a speech signal into a mathematical format which can be extensively manipulated such that the properties of the original speech, for example, pitch, speed, relative length of phones, prosody, and speaker identity, can be changed, but still sound natural. A straightforward application of voice transformation is singing synthesis. If the new parametric representation is successfully demonstrated to work well in voice transformation, it can be used for speech synthesis and automatic speech recognition.

[0003]Speech synthesis, or text-to-speech (TTS), involves the use of a computer-based system to convert a written document into audible speech. A good TTS system should generate...

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G10L13/06
CPC: G10L13/08, G10L15/02, G10L13/04, G10L15/04
Inventor: CHEN, CHENGJUN JULIAN
Owner: THE TRUSTEES OF COLUMBIA UNIV IN THE CITY OF NEW YORK