116 results about "Text to speech conversion" patented technology

Audio user interface for computing devices

An audio user interface that generates audio prompts that help a user interact with a user interface of a device is disclosed. One aspect of the present invention pertains to techniques for providing the audio user interface by efficiently leveraging the computing resources of a host computer system. The relatively powerful computing resources of the host computer can convert text strings into audio files that are then transferred to the computing device. The host system performs the process-intensive text-to-speech conversion so that a computing device, such as a hand-held device, only needs to perform the less intensive task of playing the audio file. The computing device can be, for example, a media player such as an MP3 player, a mobile phone, or a personal digital assistant.
Owner:APPLE INC

Intelligent Text-to-Speech Conversion

Techniques for improved text-to-speech processing are disclosed. The improved text-to-speech processing can convert text from an electronic document into an audio output that includes speech associated with the text as well as audio contextual cues. One aspect provides audio contextual cues to the listener when outputting speech (spoken text) pertaining to a document. The audio contextual cues can be based on an analysis of a document prior to a text-to-speech conversion. Another aspect can produce an audio summary for a file. The audio summary for a document can thereafter be presented to a user so that the user can hear a summary of the document without having to process the document to produce its spoken text via text-to-speech conversion.
Owner:APPLE INC

System for integrated electronic communications

A system is disclosed for integrating electronic mail, voice mail, and fax mail in a universal mailbox. Message receivers may access their messages with a telephone or a computer regardless of the communication medium used by the message sender. Using a telephone, message receivers may playback voice mail, redirect fax mail, and "listen" to e-mail through a text-to-speech conversion process. Using a computer and modem, message receivers may playback voice mail, view fax mail, and read e-mail by accessing the universal mailbox via connection software. Message senders and receivers may choose from a variety of filter and forward options that allow them to manage their communications via the universal mailbox. Forwarding and conversion of messages is performed automatically. The options are used to define a set of rules to be applied to inbound and outbound messages so that messages are sent and received in accordance with the preferences of the senders and receivers.
Owner:CRANBERRY PROPERTIES

Electronic book with voice emulation features

A method and system for providing text-to-audio conversion of an electronic book displayed on a viewer. A user selects a portion of displayed text and converts it into audio. The text-to-audio conversion may be performed via a header file and pre-recorded audio for each electronic book, via text-to-speech conversion, or other available means. The user may select manual or automatic text-to-audio conversion. The automatic text-to-audio conversion may be performed by automatically turning the pages of the electronic book or by the user manually turning the pages. The user may also select to convert the entire electronic book, or portions of it, into audio. The user may also select an option to receive an audio definition of a particular word in the electronic book. The present invention allows a user to control the system by selecting options from a screen or by entering voice commands.
Owner:ADREA LLC

Text-to-speech user interface control

A system and method includes detecting computer-readable text associated with a device, detecting a starting point for a text-to-speech conversion of the text, beginning the text-to-speech conversion upon detection of movement of a pointing device in a direction of text flow, and controlling a rate of the text-to-speech conversion based on a rate of movement of the pointing device in relation to the text to be converted.
Owner:NOKIA CORP
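
The rate control described in the abstract above can be illustrated with a minimal sketch. The function name, the nominal reading speed, and the clamping bounds are all assumptions made for illustration; the abstract does not specify how pointer speed is mapped to speech rate.

```python
def speech_rate_from_pointer(chars_per_second: float,
                             nominal_cps: float = 15.0,
                             min_rate: float = 0.5,
                             max_rate: float = 2.0) -> float:
    """Map pointer speed along the text to a speech-rate multiplier.

    chars_per_second is positive when the pointer moves in the direction
    of text flow. All default values are hypothetical.
    """
    if chars_per_second <= 0:
        # Pointer stopped or moving against the text flow: pause speech.
        return 0.0
    rate = chars_per_second / nominal_cps
    # Clamp so very slow or very fast movement stays intelligible.
    return max(min_rate, min(max_rate, rate))
```

A caller would sample the pointer position periodically, estimate characters traversed per second, and pass the resulting multiplier to whatever speech engine is in use.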

Method and system for statistic-based distance definition in text-to-speech conversion

A method for distance definition in a text-to-speech conversion system by applying Gaussian Mixture Model (GMM) to a distance definition. According to an embodiment, the text that is to be subjected to text-to-speech conversion is analyzed to obtain a text with descriptive prosody annotation; clustering is performed for samples in the obtained text; and a GMM model is generated for each cluster, to determine the distance between the sample and the corresponding GMM model.
Owner:CERENCE OPERATING CO
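
As a rough illustration of a statistic-based distance, the sketch below scores a sample against a one-dimensional Gaussian mixture. Using the negative log-likelihood as the distance, restricting features to one dimension, and the function names are assumptions; the abstract only states that a GMM is generated per cluster and used to determine the distance.

```python
import math
import sys

def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gmm_distance(x: float, components) -> float:
    """Negative log-likelihood of x under a 1-D GMM.

    components is a list of (weight, mean, variance) tuples whose weights
    sum to 1. A smaller value means the sample fits the cluster better.
    """
    likelihood = sum(w * gaussian_pdf(x, m, v) for w, m, v in components)
    # Guard against underflow to zero for samples far from every component.
    likelihood = max(likelihood, sys.float_info.min)
    return -math.log(likelihood)
```

In a unit-selection setting, one such mixture would be fitted per cluster, and a candidate sample would be compared against the mixture of its own cluster.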

Methods and systems for personal interactive voice response

A personal interactive voice response system with a web-based interface allowing the user to specify treatment of incoming calls based on voice or touchtone responses provided by the calling party. A graphical user interface available over a computer network, such as the Internet, allows the user to personalize greetings that callers hear, as well as customizing treatment of callers based on the caller's response. The user may record an initial greeting or other messages, either over the telephone or over the Internet, so that the messages are played to callers in the user's voice. Additionally, the user may enter text, via a PC or wireless device connected to the Internet, that is played back for the caller, based on the caller's response, via text-to-speech conversion using voice extensible markup language technology. Resulting actions, such as call forwarding, distinctive ringing, or remote notification of the incoming call may also be included.
Owner:AT&T DELAWARE INTPROP INC

Systems and methods for the visually impaired

Embodiments herein provide / acquire a document in electronic form (e.g., by receiving, copying, retrieving from storage, scanning combined with optical character recognition, etc.) and receive user input regarding visual impairment. In response to one or more levels of user visual impairment, embodiments herein automatically change (for example, immediately after scanning text) an appearance of the document, without requiring any user input, other than the visual impairment input. More specifically, when changing the appearance of the document, embodiments herein can increase the size of characters in the document, change the contrast or coloring of the text and / or background, and provide text-to-speech conversion of the document, thereby (in one embodiment) producing audio output of the text-to-speech conversion in coordination with a corresponding portion of the document being displayed. When changing the appearance of the document, embodiments herein also reformat the document (e.g., around graphic elements) to accommodate the increased size of the characters.
Owner:XEROX CORP

System for providing translated information to a driver of a vehicle

A vehicle mounted translation system for providing language translation to a driver of a vehicle. The translation system may be associated with a vehicle navigation system. The translation system includes a translation device and a storage unit for storing language and translation information. The system further includes the ability to enter information to be translated into the system, data processing for retrieving a translation from storage based on the input of the first information, and the ability to provide the retrieved translation to the driver. Output of the translated information may be accomplished by speech-to-speech and / or text-to-speech conversion of words and / or a text or image output to a visual display.
Owner:HARMAN BECKER AUTOMOTIVE SYST

Cellular phone with scanning capability

A cellular phone is provided with a media scanning capability. Scanner optics, an optional light source and related scanning circuitry are integrated within a cellular phone to enable image or text scanning, facsimile, text-to-speech conversion, and language translation. Position sensors provide position data as the scanner is manually moved, in one or more passes across the scanned media, to enable a bit-mapped image of the strip to be created in a data buffer. Image data from the strips is processed to remove redundant overlap data and skew position errors, to give a bit-mapped final image of the entire scanned item. Image compression is provided to compress the image into standard JPEG format for storage or transmission, or into facsimile format for transmission of the document to any fax machine. Optical character recognition (OCR) is provided to convert image data to text which may be sent as email, locally displayed, stored for later use, or further processed. Further processing of text data includes language translation and text-to-speech conversion of either the original or translated text. The resulting speech audio can be heard locally or transmitted over the cellular network.
Owner:TEXAS INSTR INC

Communications system providing automatic text-to-speech conversion features and related methods

Active; US20050192061A1; Enhanced text-to-speech conversion; Enhanced delivery feature; Substation speech amplifiers; Communication supplementary services; Communications system; Wireless transceiver
A communications system may include at least one mobile wireless communications device, and a wireless communications network for sending text messages thereto. More particularly, the at least one mobile wireless communications device may include a wireless transceiver and a controller for cooperating therewith for receiving text messages from the wireless communications network. It may further include a headset output connected to the controller. The controller may be for switching between a normal message mode and an audio message mode based upon a connection between the headset output and a headset. Moreover, when in the audio message mode, the controller may output at least one audio message including speech generated from at least one of the received text messages via the headset output.
Owner:MALIKIE INNOVATIONS LTD
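
The mode switching described above can be sketched as a small state holder. The class and method names are hypothetical, and the speech output is represented by a tagged tuple rather than real synthesis; the abstract does not describe the controller's internals.

```python
from dataclasses import dataclass, field

@dataclass
class MessageController:
    """Toy controller that picks an output path based on headset state."""
    headset_connected: bool = False
    outputs: list = field(default_factory=list)

    @property
    def mode(self) -> str:
        # Audio message mode applies only while a headset is connected.
        return "audio" if self.headset_connected else "normal"

    def on_headset_change(self, connected: bool) -> None:
        self.headset_connected = connected

    def on_text_message(self, text: str) -> None:
        if self.mode == "audio":
            # Placeholder for text-to-speech output via the headset.
            self.outputs.append(("headset_speech", text))
        else:
            self.outputs.append(("display", text))
```

Deriving the mode from the connection state, instead of storing it separately, keeps the two from drifting apart.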

Remote access system and method and intelligent agent therefor

The invention relates to remote access systems and methods using automatic speech recognition to access a computer system. The invention also relates to an intelligent agent resident on the computer system for facilitating remote access to, and receipt of, information on the computer system through speech recognition or text-to-speech read-back. The remote access systems and methods can be used by a user of the computer system while traveling. The user can dial into a server system which is configured to interact with the user by automatic speech recognition and text-to-speech conversion. The server system establishes a connection to an intelligent agent running on the user's remotely located computer system by packet communication over a public network. The intelligent agent sources information on the user's computer system or a network accessible to the computer system, processes the information and transmits it to the server system over the public network. The server system converts the information into speech signals and transmits the speech signals to a telephone operated by the user.
Owner:VOICE ON THE GO

Providing descriptions of visually presented information to video teleconference participants who are not video-enabled

Descriptions of visually presented material are provided to one or more conference participants that do not have video capabilities. This presented material could be any one or more of a document, PowerPoint® presentation, spreadsheet, Webex® presentation, whiteboard, chalkboard, interactive whiteboard, description of a flowchart, picture, or in general, any information visually presented at a conference. For this visually presented information, descriptions thereof are assembled and forwarded via one or more of a message, SMS message, whisper channel, text information, non-video channel, MSRP, or the like, to one or more conference participant endpoints. These descriptions of visually presented information, such as a document, spreadsheet, spreadsheet presentation, multi-media presentation, or the like, can be assembled in cooperation with one or more of OCR recognition and text-to-speech conversion, human input, or the like.
Owner:AVAYA INC

Method and apparatus for contextual text to speech conversion

The present specification discloses systems and methods for contextual text to speech conversion, in part, by interpreting the contextual format of the underlying document, and modifying the literal text so as to reflect that context in the conversion, thereby converting text to contextually appropriate speech.
Owner:STUDY OUTLOUD

Apparatus and methods for converting textual information to audio-based output

Inactive; US7454346B1; Conversion occurs more quickly; Suitable for use; Speech synthesis; Voice transformation; Audio frequency
A system for providing text-to-speech conversion of a body of text is presented. The system includes a first executable resource which generates text portions from the body of text in response to receiving an initial web request to convert the body of text to speech and provides an output in response to generating the text portions comprising a sequence of resource identifiers suitable for use in the text-to-speech conversion of the text portions. The system further includes a second executable resource which receives a text portion web request that requests the conversion of at least one text portion to an audio format, the text portion web request comprising the at least one text portion and one of the resource identifiers, and further provides at least one media file suitable for audio output based on the text portion web request.
Owner:CISCO TECH INC
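
A minimal sketch of the first resource's job, splitting a body of text into portions and emitting one resource identifier per portion, might look like the following. The sentence-based splitting rule, the size limit, and the `/tts/portion` path are assumptions for illustration and do not come from the abstract.

```python
import re
from urllib.parse import urlencode

def split_into_portions(body: str, max_chars: int = 200) -> list:
    """Group whole sentences into portions of at most max_chars where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", body.strip())
    portions, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            portions.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        portions.append(current)
    return portions

def resource_identifiers(portions: list, base: str = "/tts/portion") -> list:
    """Return one hypothetical request path per portion, in order."""
    return [f"{base}?{urlencode({'id': i})}" for i, _ in enumerate(portions)]
```

A client could then request each identifier in sequence, so that playback of the first portion can begin before later portions are converted.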

Text-to-speech conversion system on an integrated circuit

A text-to-speech conversion system that includes a first module to convert text into words, a second module to convert words into phonemes, a third module to map phonemes to sound units, and a storage unit to store speech representations for a library of sound units. The first, second, and third modules and the storage unit are implemented within a single integrated circuit to reduce size and cost. The system typically further includes a ROM to store the codes for the modules, a RAM to store the text and intermediate results, a processor to execute the codes for the modules, a control module to direct the operation of the first, second, and third modules. The storage unit may be implemented with a multi-level, non-volatile analog storage array and may be programmed with a new library of speech representations by a programming module.
Owner:WINBOND ELECTRONICS CORP
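
The three-module pipeline in the abstract above (text to words, words to phonemes, phonemes to sound units) can be mimicked in software with a toy sketch. The lexicon, the phoneme symbols, and the unit indices are hypothetical stand-ins for the stored library; the patent describes a hardware implementation on one integrated circuit.

```python
import re

# Hypothetical toy lexicon and sound-unit library.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}
SOUND_UNITS = {"AH": 0, "D": 1, "ER": 2, "HH": 3, "L": 4, "OW": 5, "W": 6}

def text_to_words(text: str) -> list:
    """First module: normalize case and split text into words."""
    return re.findall(r"[a-z']+", text.lower())

def words_to_phonemes(words: list) -> list:
    """Second module: look up each word; unknown words are skipped here."""
    phonemes = []
    for word in words:
        phonemes.extend(LEXICON.get(word, []))
    return phonemes

def phonemes_to_units(phonemes: list) -> list:
    """Third module: map each phoneme to an index in the sound-unit library."""
    return [SOUND_UNITS[p] for p in phonemes]
```

Chaining the three functions on "Hello, world!" yields a list of unit indices that a playback stage could use to fetch stored speech representations.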

Text-to-speech conversion with associated mood tag

A method (and associated apparatus) comprises associating a mood tag with text. The mood tag specifies a mood to be applied when the text is subsequently converted to an audio signal. In accordance with another embodiment, a method (and associated apparatus) comprises receiving text having an associated mood tag and converting the text to speech in accordance with the associated mood tag.
Owner:HEWLETT PACKARD DEV CO LP

Text to speech conversion system

A system for the automated conversion to audio of text displayed on a surface.
Owner:SHARP KK

Converting text-to-speech and adjusting corpus

The present invention provides a method and apparatus for text-to-speech conversion, and a method and apparatus for adjusting a corpus. The text-to-speech method comprises: a text analysis step for parsing the text to obtain descriptive prosody annotations of the text based on a TTS model generated from a first corpus; a prosody parameter prediction step for predicting the prosody parameter of the text according to the result of the text analysis step; and a speech synthesis step for synthesizing speech of the text based on the predicted prosody parameter. The descriptive prosody annotations include a prosody structure for the text, and this prosody structure is adjusted according to a target speech speed for the synthesized speech, so that the synthesized speech has improved quality.
Owner:CERENCE OPERATING CO

Network provided information using text-to-speech and speech recognition and text or speech activated network control sequences for complimentary feature access

A real-time networked telephony or computer system has a feature complex and / or applications that offer a class of features to a subscriber, including call information, and permits the subscriber to manage incoming and existing calls through available features accessed using spoken utterances. A speech processing unit coupled to the system interprets a subscriber's spoken utterances without requiring the subscriber to train the system to recognize his or her voice. The interpretation of spoken utterances is enabled by a system state database that is maintained at the speech processing unit and comprises a database of the possible system states, including possible call flows for a call, and a database associated with the system state database comprising context-specific grammar that a subscriber may recite at respective points in the call flow. The speech processing unit may also convert message signals from the network to speech which is read to the subscriber using a text to speech translator. The network can identify the voice or subscriber voice, or language used and will thereafter recognize all further commands using specific grammar for that language as well as perform text-to-speech conversion using the identified language. Use of the features can be applied to update of grammars, profiles and templates, etc. by transmitting results of transactions.
Owner:SOUND VIEW INNOVATIONS

Method to enable instant collaboration via use of pervasive messaging

A system (10) and method (50) to enable instant collaboration via the use of pervasive messaging can include the steps of receiving (52) a call from a caller to a callee, transferring (54) the call to a voicemail system when the callee is unavailable, determining (56) if the callee is available via instant messaging, and querying (58) the caller if they want to leave one among a voice message and an instant message. In another aspect, a system and method (70) enables instant collaboration by receiving (72) a text message having a designation for text-to-speech conversion via an instant messaging network (14) where the text message is intended for a phone (16 or 18) coupled to a voicemail system (20), recognizing (74) the designation, converting (76) the text message to a voice message, calling (78) the phone, and delivering (86) the voice message to the voicemail system.
Owner:SNAP INC

Method and system for operating interactive voice response systems tandem

First and second voice interactive response systems operate in tandem. The first voice response system receives a call from a caller and interacts with the caller to provide information to the caller and / or obtain information from the caller. To make additional media services (e.g., automatic speech recognition or text-to-speech conversion) available for the interaction with the caller, the first voice response system extends the call to a second voice response system that is able to provide the additional media services.
Owner:SPRINT SPECTRUM LLC

System and method for word-sense disambiguation by recursive partitioning

A device and related methods for word-sense disambiguation during a text-to-speech conversion are provided. The device, for use with a computer-based system capable of converting text data to synthesized speech, includes an identification module for identifying a homograph contained in the text data. The device also includes an assignment module for assigning a pronunciation to the homograph using a statistical test constructed from a recursive partitioning of training samples, each training sample being a word string containing the homograph. The recursive partitioning is based on determining for each training sample an order and a distance of each word indicator relative to the homograph in the training sample. An absence of one of the word indicators in a training sample is treated as equivalent to the absent word indicator being more than a predefined distance from the homograph.
Owner:CERENCE OPERATING CO
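
The order-and-distance feature described in the abstract above can be sketched as follows. The cutoff value, the function names, and the tiny hand-written decision rule for "lead" are assumptions; a real system would learn the partitioning from training samples rather than hard-code it.

```python
MAX_DISTANCE = 5  # hypothetical predefined distance cutoff

def indicator_offset(words, homograph, indicator, max_distance=MAX_DISTANCE):
    """Signed offset of the nearest indicator word relative to the homograph.

    Negative means the indicator precedes the homograph. Returns None when
    the indicator is absent or farther away than max_distance, so both
    cases are treated identically.
    """
    if homograph not in words:
        return None
    h = words.index(homograph)
    offsets = [i - h for i, w in enumerate(words) if w == indicator and i != h]
    offsets = [o for o in offsets if abs(o) <= max_distance]
    return min(offsets, key=abs) if offsets else None

def pronounce_lead(words):
    """Hand-written stand-in for a tree produced by recursive partitioning."""
    if indicator_offset(words, "lead", "to") == -1:
        return "L IY D"  # "to lead": verb reading
    if indicator_offset(words, "lead", "pipe") is not None:
        return "L EH D"  # metal reading
    return "L IY D"      # default leaf
```

Each `if` corresponds to one split in the tree; the first split tests both order and distance, the second only proximity.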