84 patent results for "Text to speech synthesis"

Text-to-speech (TTS) is a speech synthesis application that produces a spoken version of the text in a computer document, such as a help file or a web page.

Personal message service with enhanced text to speech synthesis

A server in a network gathers textual information, such as news items, E-mail and the like. From that information, the server develops or identifies messages for use by individual subscribers. The same server that accumulates the text messages or another server in the network converts the textual information in each message to a sequence of speech synthesizer instructions. The converted messages, containing the sequences of speech synthesizer instructions, are transmitted to each identified subscriber's terminal device. A synthesizer in the terminal generates an audio waveform signal, representing the speech information, in response to the instructions. In the preferred embodiment, the terminals utilize concatenative type speech synthesizers, each of which has an associated vocabulary of stored fundamental sound samples. The instructions identify the sound samples, in order. The instructions also provide parameters for controlling characteristics of the signal generated during waveform synthesis for each sound sample in each sequence. For example, the instructions may specify the pitch, duration, amplitude, attack envelope and decay envelope for each sample. The division of the text to speech synthesis processing between the server and the terminals places the cost of the front end processing in the server, which is a shared resource. As a result, the hardware and software of the terminal may be relatively simple and inexpensive. Also, it is possible to upgrade the quality of the synthesis by upgrading the server software, without modifying the terminals.
Owner: GOOGLE LLC
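
The server/terminal split described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `UnitInstruction` fields follow the parameters the abstract lists (pitch, duration, amplitude, attack and decay envelopes), but the names, the 8000 Hz sample rate, the letter-to-pitch mapping, and the single-cycle sine "vocabulary" are all assumptions chosen to keep the example small.

```python
import math
from dataclasses import dataclass

SAMPLE_RATE = 8000  # Hz; assumed value for this sketch


@dataclass(frozen=True)
class UnitInstruction:
    """One synthesizer instruction: which stored sample to play, and how."""
    unit_id: str
    pitch_hz: float
    duration_s: float
    amplitude: float
    attack_s: float
    decay_s: float


def text_to_instructions(text):
    """Server side (toy front end): one instruction per ASCII letter."""
    instructions = []
    for ch in text.lower():
        if "a" <= ch <= "z":
            # Hypothetical mapping: the pitch rises with the letter's position.
            pitch = 100.0 + 5.0 * (ord(ch) - ord("a"))
            instructions.append(UnitInstruction(ch, pitch, 0.05, 0.8, 0.01, 0.01))
    return instructions


def build_vocabulary(table_size=64):
    """Terminal side: stored sound samples, here one sine cycle per letter."""
    cycle = [math.sin(2.0 * math.pi * k / table_size) for k in range(table_size)]
    return {chr(ord("a") + i): cycle for i in range(26)}


def render_unit(instr, vocabulary):
    """Play one stored sample at the requested pitch, length, and envelope."""
    table = vocabulary[instr.unit_id]
    n = int(round(instr.duration_s * SAMPLE_RATE))
    attack_n = int(round(instr.attack_s * SAMPLE_RATE))
    decay_n = int(round(instr.decay_s * SAMPLE_RATE))
    out = []
    for i in range(n):
        # Step through the stored cycle at the requested pitch.
        phase = (i * instr.pitch_hz / SAMPLE_RATE) % 1.0
        sample = table[int(phase * len(table)) % len(table)]
        gain = instr.amplitude
        if attack_n and i < attack_n:
            gain *= i / attack_n  # linear fade-in
        if decay_n and i >= n - decay_n:
            gain *= (n - 1 - i) / decay_n  # linear fade-out
        out.append(sample * gain)
    return out


def synthesize(instructions, vocabulary):
    """Terminal side: concatenate the rendered units in instruction order."""
    waveform = []
    for instr in instructions:
        waveform.extend(render_unit(instr, vocabulary))
    return waveform
```

In this sketch, `text_to_instructions` stands in for the shared server-side front end, while `build_vocabulary` and `synthesize` run on the terminal. Changing the server function alone changes what the terminal plays, which mirrors the upgrade path the abstract describes.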

Methods and apparatus related to pruning for concatenative text-to-speech synthesis

The present invention provides, among other things, automatic identification of near-redundant units in a large TTS voice table, identifying which units are distinctive enough to keep and which units are sufficiently redundant to discard. According to an aspect of the invention, pruning is treated as a clustering problem in a suitable feature space. All instances of a given unit (e.g., a word or characters expressed as Unicode strings) are mapped onto the feature space and clustered in that space using a suitable similarity measure. Since all units in a given cluster are, by construction, closely related from the point of view of the measure used, they are suitably redundant and can be replaced by a single instance. The disclosed method can detect near-redundancy in TTS units in a completely unsupervised manner, based on an original feature extraction and clustering strategy. Each unit can be processed in parallel, and the algorithm is fully scalable, with a pruning factor that a user can set through the near-redundancy criterion. In an exemplary implementation, a matrix-style modal analysis via Singular Value Decomposition (SVD) is performed on the matrix of the observed instances for the given word unit, so that each row of the matrix is associated with a feature vector, which can then be clustered using an appropriate closeness measure. Pruning is achieved by mapping each instance to the centroid of its cluster.
Owner: APPLE INC
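
The clustering and pruning steps in the abstract above can be sketched in plain Python. This is only an illustration under stated assumptions: the feature vectors are taken as given (the SVD-based feature extraction is not reproduced here), cosine similarity stands in for the unspecified closeness measure, a greedy single-pass scheme stands in for the unspecified clustering method, and each cluster is reduced to the instance nearest its centroid. The function names and the `threshold` parameter are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_instances(features, threshold):
    """Greedy clustering: join the first cluster whose centroid is close enough."""
    clusters = []  # each cluster is a list of instance indices
    for idx, vec in enumerate(features):
        for members in clusters:
            center = centroid([features[i] for i in members])
            if cosine_similarity(vec, center) >= threshold:
                members.append(idx)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append([idx])
    return clusters


def prune(features, threshold):
    """Keep one instance per cluster: the one most similar to the centroid."""
    kept = []
    for members in cluster_instances(features, threshold):
        center = centroid([features[i] for i in members])
        best = max(members, key=lambda i: cosine_similarity(features[i], center))
        kept.append(best)
    return sorted(kept)
```

Here `threshold` plays the role of the near-redundancy criterion: raising it toward 1.0 keeps more instances, while lowering it merges more of them and prunes harder. Because each unit's instances are clustered independently, `prune` could be called once per unit in parallel.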