145 results about "VoiceXML" patented technology

VoiceXML (VXML) is a digital document standard for specifying interactive media and voice dialogs between humans and computers. It is used for developing audio and voice response applications, such as banking systems and automated customer service portals. VoiceXML applications are developed and deployed in a manner analogous to how a web browser interprets and visually renders the Hypertext Markup Language (HTML) it receives from a web server. VoiceXML documents are interpreted by a voice browser, and in common deployment architectures users interact with voice browsers via the public switched telephone network (PSTN).
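
To make the browser analogy concrete, here is a minimal sketch of a VoiceXML 2.0 document and of the first step a voice browser would take with it: parsing the markup and finding the prompts to speak. The document content, the `greeting` form id, and the `list_prompts` helper are illustrative assumptions, not part of any listed patent; a real voice browser would go on to synthesize the prompt and listen for input.

```python
import xml.etree.ElementTree as ET

# Namespace used by VoiceXML 2.0 documents.
VXML_NS = {"v": "http://www.w3.org/2001/vxml"}

# Illustrative document: one form that speaks a single prompt.
DOC = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="greeting">
    <block>
      <prompt>Hello, world.</prompt>
    </block>
  </form>
</vxml>"""


def list_prompts(document):
    """Return (form id, prompt text) pairs found in a VoiceXML document."""
    root = ET.fromstring(document)
    return [
        (form.get("id"), "".join(prompt.itertext()).strip())
        for form in root.findall("v:form", VXML_NS)
        for prompt in form.iterfind(".//v:prompt", VXML_NS)
    ]


print(list_prompts(DOC))
```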

Supporting Multi-Lingual User Interaction With A Multimodal Application

Methods, apparatus, and products are disclosed for supporting multi-lingual user interaction with a multimodal application, the application including a plurality of VoiceXML dialogs, each dialog characterized by a particular language, supporting multi-lingual user interaction implemented with a plurality of speech engines, each speech engine having a grammar and characterized by a language corresponding to one of the dialogs, with the application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the application operatively coupled to the speech engines through a VoiceXML interpreter, the VoiceXML interpreter: receiving a voice utterance from a user; determining in parallel, using the speech engines, recognition results for each dialog in dependence upon the voice utterance and the grammar for each speech engine; administering the recognition results for the dialogs; and selecting a language for user interaction in dependence upon the administered recognition results.
Owner:NUANCE COMM INC
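
The language-selection idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the "speech engines" are stand-in functions that score an already transcribed utterance against a small vocabulary, the confidence measure is invented for the example, and "administering the recognition results" is reduced to picking the highest score. The names `make_engine`, `ENGINES`, and `select_language` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_engine(language, vocabulary):
    """Build a stand-in recognizer that returns (language, confidence)."""
    def recognize(utterance):
        words = utterance.lower().split()
        hits = sum(1 for word in words if word in vocabulary)
        # Assumed confidence: fraction of words covered by this vocabulary.
        return language, (hits / len(words) if words else 0.0)
    return recognize


# Hypothetical engines, one per dialog language.
ENGINES = [
    make_engine("en", {"hello", "please", "balance"}),
    make_engine("es", {"hola", "por", "favor", "saldo"}),
]


def select_language(utterance, engines=ENGINES):
    """Run every engine on the same utterance in parallel and pick the best."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        results = list(pool.map(lambda engine: engine(utterance), engines))
    # Ties resolve to the first engine in the list.
    return max(results, key=lambda result: result[1])[0]
```

Calling `select_language("hola saldo por favor")` with these stand-in engines selects `"es"`, because only the Spanish vocabulary covers the words.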

Ordering Recognition Results Produced By An Automatic Speech Recognition Engine For A Multimodal Application

Ordering recognition results produced by an automatic speech recognition (‘ASR’) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, includes: receiving, in the VoiceXML interpreter from the multimodal application, a voice utterance; determining, by the VoiceXML interpreter using the ASR engine, a plurality of recognition results in dependence upon the voice utterance and the grammar; determining, by the VoiceXML interpreter according to semantic interpretation scripts of the grammar, a weight for each recognition result; and sorting, by the VoiceXML interpreter, the plurality of recognition results in dependence upon the weight for each recognition result.
Owner:NUANCE COMM INC
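
The final sorting step in the abstract above might look like the sketch below. The `RecognitionResult` type, the example weights, and the choice of descending order are assumptions for illustration; the abstract only says that results are sorted in dependence upon a weight derived from the grammar's semantic interpretation scripts.

```python
from dataclasses import dataclass


@dataclass
class RecognitionResult:
    text: str
    weight: float  # assumed to come from a semantic interpretation script


def order_results(results):
    """Return the results sorted by descending weight (ties keep input order)."""
    return sorted(results, key=lambda result: result.weight, reverse=True)


# Illustrative data, not taken from the patent.
results = [
    RecognitionResult("boston", 0.4),
    RecognitionResult("austin", 0.9),
    RecognitionResult("houston", 0.6),
]
print([result.text for result in order_results(results)])
```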

Visual interactive response system and method translated from interactive voice response for telephone utility

A system, method, and computer readable medium storing a software program for translating a script for an interactive voice response system to a script for a visual interactive response system. The visual interactive response system executes the translated visual-based script when a user using a display telephone calls the visual interactive response system. The visual interactive response system then transmits a visual menu to the display telephone to allow the user to select a desired response, which is subsequently sent back to the visual interactive response system for processing. The voice-based script may be defined in voice extensible markup language and the visual-based script may be defined in wireless markup language, hypertext markup language, or handheld device markup language. The translation system and program includes a parser for extracting command structures from the voice-based script, a visual-based structure generator for generating corresponding command structure for the visual-based script, a text prompt combiner for incorporating text translated from voice prompts into command structure generated by the structure generator, an automatic speech recognition routine for automatically converting voice prompts into translated text, and an editor for editing said visual-based script.
Owner:RPX CLEARINGHOUSE

Handling of speech recognition in a declarative markup language

Declarative markup languages for speech applications such as VoiceXML are becoming more prevalent programming modalities for describing speech applications. Present declarative markup languages for speech applications model the running speech application as a state machine with the program specifying the transitions amongst the states. These languages can be extended to support a marker-semantic to more easily solve several problems that are otherwise not easily solved. In one embodiment, a partially overlapping target window is implemented using a mark semantic. Other uses include measurement of user listening time, detection and avoidance of errors, and better resumption of playback after a false barge in.
Owner:MICROSOFT TECH LICENSING LLC

Voice browser implemented as a distributable component

A system for implementing voice services can include at least one virtual machine, such as a Java 2 Enterprise Edition (J2EE) virtual machine. The virtual machine can include a bean container for handling software beans, such as Enterprise Java Beans. The bean container can include a voice browser bean. The voice browser bean can include a VoiceXML browser.
Owner:NUANCE COMM INC

A Test System and method of Operation

A test system comprises a test processor which is arranged to perform hardware level tests on a unit under test. A voice interface interfaces to an external voice communication link coupled to a remote voice communication unit. A test controller is coupled to the test processor and the voice interface and comprises a script processor for executing a test control script. The test control script is in accordance with a voice scripting language standard, such as the Voice extensible Markup Language, VXML, standard. The script processor comprises a first interface for interfacing with the test processor in response to the test control script and a second interface for interfacing with the voice interface in response to the test control script. The invention may allow a user friendly speech interface to a hardware level test system.
Owner:EMERSON NETWORK POWER EMBEDDED COMPUTING

System and method for enhancing performance of VoiceXML gateways

A system and method are disclosed for managing frequently used VoiceXML documents. In particular, a VoiceXML gateway is provided having an administrator-managed and provisioned local file system. Specifically, the administrator provisions the files that are to be stored on the local file system. Importantly, neither the VoiceXML interpreter nor the VoiceXML interpreter context manages the local file system. Accordingly, the local file system is not subject to the cache control directives that require regular retransmission of frequently used VoiceXML documents and other files from the remote document servers. To that end, administrator-provisioned files may be permanently stored on the local file system, thereby minimizing their search and access time.
Owner:LUCENT TECH INC
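
A rough sketch of the lookup order implied by the abstract above: administrator-provisioned files are served from the local store, and everything else goes to the remote document server. The class name, the in-memory dictionary standing in for the local file system, and the `fetch_remote` callback are hypothetical; cache-control handling for remote documents is deliberately left out.

```python
class VoiceXMLGateway:
    """Serve provisioned documents locally; fetch all others remotely."""

    def __init__(self, provisioned_files, fetch_remote):
        # Provisioned by the administrator, not by the interpreter.
        self._local = dict(provisioned_files)
        self._fetch_remote = fetch_remote

    def load(self, uri):
        if uri in self._local:
            # Local copies are not subject to cache expiry or revalidation.
            return self._local[uri]
        return self._fetch_remote(uri)


remote_calls = []


def fetch_remote(uri):
    """Stand-in for a request to a remote document server."""
    remote_calls.append(uri)
    return f"<vxml><!-- fetched {uri} --></vxml>"


gateway = VoiceXMLGateway(
    {"main_menu.vxml": "<vxml><!-- local copy --></vxml>"},
    fetch_remote,
)
```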

Method and system for presenting dynamic commercial content to clients interacting with a voice extensible markup language system

A system for selecting a voice dialog, which may be an advertisement or information message, from a pool of voice dialogs and for causing the selected voice dialog to be utilized by a voice application for presentation to a caller during an automated voice interactive session includes a voice-enabled interaction interface hosting the voice application; and a server monitoring the voice-enabled interaction interface for selecting the voice dialog and for serving at least identification and location of the dialog to be presented to the caller via the voice application.
Owner:APPTERA

Method and apparatus for adapting a voice extensible markup language-enabled voice system for natural speech recognition and system response

A system for analyzing natural language spoken through a voice recognition system comprising: a language separator for separating a natural language expression into multiple word segments; and a grammar module for creating XML-based description sets or binary sets using word segments as input. In a preferred embodiment, the word segments are further processed as class objects and then organized according to original spoken order and wherein content fields are created to contain the class objects for comparison during voice interaction using the voice recognition system.
Owner:APPTERA

Methods and systems for personal interactive voice response

A personal interactive voice response system with a web-based interface allowing the user to specify treatment of incoming calls based on voice or touchtone responses provided by the calling party. A graphical user interface available over a computer network, such as the Internet, allows the user to personalize greetings that callers hear, as well as customizing treatment of callers based on the caller's response. The user may record an initial greeting or other messages, either over the telephone or over the Internet, so that the messages are played to callers in the user's voice. Additionally, the user may enter text, via a PC or wireless device connected to the Internet, that is played back for the caller, based on the caller's response, via text-to-speech conversion using voice extensible markup language technology. Resulting actions, such as call forwarding, distinctive ringing, or remote notification of the incoming call may also be included.
Owner:AT&T DELAWARE INTPROP INC

Dynamically Generating a Vocal Help Prompt in a Multimodal Application

Dynamically generating a vocal help prompt in a multimodal application includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.
Owner:NUANCE COMM INC

Methods and systems for multi-modal browsing and implementation of a conversational markup language

A new application programming language is provided which is based on user interaction with any device which a user is employing to access any type of information. The new language is referred to herein as a “Conversational Markup Language” (CML). In a preferred embodiment, CML is a high level XML based language for representing “dialogs” or “conversations” the user will have with any given computing device. For example, interaction may comprise, but is not limited to, visual based (text and graphical) user interaction and speech based user interaction. Such a language allows application authors to program applications using interaction-based elements referred to herein as “conversational gestures.” The present invention also provides for various embodiments of a multimodal browser capable of supporting the features of CML in accordance with various modality specific representations, e.g., HTML based graphical user interface (GUI) browser, VoiceXML based speech browser, etc.
Owner:TWITTER INC

Method and apparatus for providing a reliable voice extensible markup language service

A method and apparatus for providing a reliable Voice Extensible Markup Language (VXML) service over packet networks such as Voice over Internet Protocol (VoIP) and Service over Internet Protocol (SoIP) networks are disclosed. For example, a service provider may utilize a plurality of content servers that can be accessed by at least one telephony browser. The telephony browser can reach the content servers directly as well as through a shared server that may load balance among the content servers. When a request for VXML content, e.g., a VXML application, is received, the telephony browser sends the request to the shared server. If the request fails or a response is not received prior to expiration of a predetermined time interval, then the telephony browser sends a second request directly to one of the content servers that is capable of providing the requested content.
Owner:AMERICAN TELEPHONE & TELEGRAPH CO
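
The fallback behaviour described in the abstract above can be sketched as follows. Servers are modelled as plain callables that either return a document or raise `TimeoutError` / `ConnectionError`; that modelling, the `timeout_s` parameter, and the choice to try content servers one after another until one answers are assumptions made for the example (the abstract only says a second request goes directly to a content server capable of providing the content).

```python
def fetch_vxml(request, shared_server, content_servers, timeout_s=2.0):
    """Try the shared server first, then fall back to direct requests."""
    try:
        return shared_server(request, timeout_s)
    except (TimeoutError, ConnectionError):
        pass  # request failed or timed out; fall through to direct requests
    for server in content_servers:
        try:
            return server(request, timeout_s)
        except (TimeoutError, ConnectionError):
            continue  # assumed policy: move on to the next content server
    raise ConnectionError("no server could provide the requested content")


def down(request, timeout_s):
    """Stand-in for a server that never answers in time."""
    raise TimeoutError("server did not answer in time")


def content_a(request, timeout_s):
    """Stand-in for a healthy content server."""
    return f"<vxml><!-- {request} from content_a --></vxml>"
```

With these stand-ins, `fetch_vxml("menu.vxml", down, [content_a])` returns the document served by `content_a`, since the shared server times out.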

System and method for generating and presenting multi-modal applications from intent-based markup scripts

Systems and methods are provided for rendering modality-independent scripts (e.g., intent-based markup scripts) in a multi-modal environment, whereby a user can interact with an application using a plurality of modalities (e.g., speech and GUI) with I/O events being automatically synchronized over the plurality of modalities presented. In one aspect, immediate synchronized rendering of the modality-independent document in each of the supported modalities is provided. In another aspect, deferred rendering and presentation of intent-based scripts to an end user is provided, wherein a speech markup language script (such as a VoiceXML document) is generated from the modality-independent script and rendered (via, e.g., VoiceXML browser) at a later time.
Owner:IBM CORP

Method and system of building a grammar rule with baseforms generated dynamically from user utterances

A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
Owner:MICROSOFT TECH LICENSING LLC

System and methodology for voice activated access to multiple data sources and voice repositories in a single session

A system, method, and computer program for seamlessly accessing multiple data sources and voice repositories using voice commands in a single phone call session. The system comprises voice grammars that span various contexts for all data sources and voice repositories, a telephony platform, an automatic speech recognition engine, extractors for extracting information from the data sources and voice repositories, and an interpreter for controlling the extractors and telephony platform. It is the co-operation between the voice grammars and the telephony platform, controlled by a VoiceXML interpreter, that enables this seamless access to information from the multiple data sources and voice repositories.
Owner:CISCO TECH INC

Pausing A VoiceXML Dialog Of A Multimodal Application

Pausing a VoiceXML dialog of a multimodal application, including generating by the multimodal application a pause event; responsive to the pause event, temporarily pausing the dialog by the VoiceXML interpreter; generating by the multimodal application a resume event; and responsive to the resume event, resuming the dialog. Embodiments are implemented with the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the VoiceXML interpreter is interpreting the VoiceXML dialog to be paused.
Owner:NUANCE COMM INC

Enabling Natural Language Understanding In An X+V Page Of A Multimodal Application

Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in dependence upon the action identifier.
Owner:NUANCE COMM INC

Providing Expressive User Interaction With A Multimodal Application

Methods, apparatus, and products are disclosed for providing expressive user interaction with a multimodal application, the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a speech engine through a VoiceXML interpreter, including: receiving, by the multimodal browser, user input from a user through a particular mode of user interaction; determining, by the multimodal browser, user output for the user in dependence upon the user input; determining, by the multimodal browser, a style for the user output in dependence upon the user input, the style specifying expressive output characteristics for at least one other mode of user interaction; and rendering, by the multimodal browser, the user output in dependence upon the style.
Owner:NUANCE COMM INC

Reusable voiceXML dialog components, subdialogs and beans

Systems and methods for building speech-based applications using reusable dialog components based on VoiceXML (Voice eXtensible Markup Language). VoiceXML reusable dialog components can be used for building a voice interface for use with multi-modal, multi-channel and conversational applications that offer universal access to information anytime, from any location, using any pervasive computing device regardless of its I/O modality. In one embodiment, a framework for reusable dialog components built within the VoiceXML specifications is based on the <subdialog> tag and ECMAScript parameter objects to pass parameters, configuration and results. This solution is interpreted at the client side (VoiceXML browser). In another embodiment, a framework for reusable dialog components is based on JSP (Java Server Pages) and beans that generate VoiceXML subdialogs. This solution can be evaluated at the server side. These frameworks can be mixed and matched depending on the application.
Owner:IBM CORP

Method and system for monitoring and managing multi-sourced call centers

A mid-point call management node subject to monitoring through a workstation communicatively coupled thereto, provides call services (e.g., through extensible markup language (XML), and in particular call control extensible markup language (CCXML) and/or voice extensible markup language (VXML), instructions) for an inbound call received from an originating network at an originating point-of-presence (POP) associated with multiple, disparate call centers, the call services being provided in response to call management application instructions issued according to enterprise-specific strategies for optimizing call handling between the originating POP and domestic and/or international ones of the disparate call centers communicatively coupled thereto. Call center information (e.g., call load information) received at the management node from the multiple call centers may be used in connection with providing the call services. The enterprise-specific strategies may be instantiated as processes for: call routing, load balancing, work force management, and/or customer relationship management.
Owner:BROADSOFT

Dynamically defining a voicexml grammar in an x+v page of a multimodal application

Dynamically defining a VoiceXML grammar of a multimodal application, implemented with the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a VoiceXML interpreter, and the method includes loading the X+V page by the multimodal application, from a web server into the multimodal device for execution, the X+V page including one or more VoiceXML grammars in one or more VoiceXML dialogs, including at least one in-line grammar that is declared but undefined; retrieving by the multimodal application a grammar definition for the in-line grammar from the web server without reloading the X+V page; and defining by the multimodal application the in-line grammar with the retrieved grammar definition before executing the VoiceXML dialog containing the in-line grammar.
Owner:NUANCE COMM INC

Switching between modalities in a speech application environment extended for interactive text exchanges

The present solution includes a method for dynamically switching modalities in a dialogue session involving a voice server. In the method, a dialogue session can be established between a user and a speech application. During the dialogue session, the user can interact using an original modality, which is either a speech modality, a text exchange modality, or a multi mode modality that includes a text exchange modality. The speech application can interact using a speech modality. A modality switch trigger can be detected that changes the original modality to a different modality. The modality transition to the second modality can be transparent to the speech application. The speech application can be a standard VoiceXML based speech application that lacks an inherent text exchange capability.
Owner:NUANCE COMM INC

Method of implementing a VXML application into an IP device and an IP device having VXML capability

A method of implementing a Voice Extensible Markup Language (VXML) application in an Internet Protocol (IP) device, and an IP device having VXML capability, are disclosed. An IP device having a VXML browser is provided. A VXML script file containing a plurality of instructions for a particular VXML application is fetched into the IP device from a server via an IP network to which the IP device is connected. The fetched VXML script file is then parsed into an appropriate format, and a VXML engine in the VXML browser executes the instructions of the parsed VXML script file to establish an audio interface with either the user of the IP device or a user of another IP device that is connected to the IP network.
Owner:AVAYA TECH CORP

Multi-tenant self-service VXML portal

A multi-tenant voice extensible markup language (VXML) voice system includes a voice portal connected to at least one telephony network; a voice application server integrated with the voice portal; and a multi-tenant configuration application integrated with the voice application server, the configuration application accessible to the tenants from a data packet network.
Owner:APPTERA

System and process for developing a voice application

A system for use in developing a voice application, including a dialog element selector for defining execution paths of the application by selecting dialog elements and adding the dialog elements to a tree structure, each path through the tree structure representing one of the execution paths, a dialog element generator for generating the dialog elements on the basis of predetermined templates and properties of the dialog elements, the properties received from a user of the system, each of said dialog elements corresponding to at least one voice language template, and a code generator for generating at least one voice language module for the application on the basis of said at least one voice language template and said properties. The voice language templates include VoiceXML elements, and the dialog elements can be regenerated from the voice language module. The voice language module can be used to provide the voice application for an IVR.
Owner:TELSTRA CORPORATION LIMITD

System and method for tracking VoiceXML document execution in real-time

A method and apparatus for tracking execution of a VoiceXML document of a VoiceXML application by a VoiceXML execution client are disclosed. In one embodiment, the method includes trapping a desired VoiceXML document and executing the trapped document in an execution tracking mode as a function of marker information. The marker information is configured to enable a highlighting of a respective VoiceXML element within a listing of at least a portion of the VoiceXML document on a remote display.
Owner:AVAYA INC

Automated directory assistance system for a hybrid TDM/VoIP network

An automated directory assistance platform architecture is provided for at least partial automatic processing of 411 calls from TDM-based telephone networks and from VoIP networks. The architecture includes three layers. One layer is a telephony network interface that accepts information from both TDM and VoIP based DA networks. The telephony layer sequesters the other two layers from the complexities of interacting with different source networks. Another layer is a VoiceXML-based IVR dialog engine that directs information received from the telephony interface. The third layer is an App Server Layer that processes information received from the dialog engine by retrieving information from an internet-accessible database. Calls that cannot be handled completely by automation are handed off to a live operator working in an all-IP environment.
Owner:SBC KNOWLEDGE VENTURES LP