
116 results about "Speech interface" patented technology

A speech interface is a software application that enables interaction between humans and voice-enabled applications, such as virtual assistants and voice assistants. Speech interfaces use and mimic human speech via speech recognition technology. But designing an effective speech interface requires more than writing a script for your voice assistant.

Multi-functional surgical control system and switching interface

An interface which allows a surgeon to operate multiple surgical devices from a single input device. The input device may be a foot pedal that provides output signals to actuate a number of different surgical devices. The surgical devices may include a robotic arm, a laser, an electrocautery device, or an operating table. The interface has an input channel that is coupled to the input device and a plurality of output channels that are coupled to the surgical devices. The interface also has a select input channel which can receive input commands to switch the input channel to one of the output channels. The select channel may be coupled to a speech interface that allows the surgeon to select one of the surgical devices with a voice command. The surgeon can operate any device by providing an input command which switches the input channel to the desired output channel.
Owner:INTUITIVE SURGICAL OPERATIONS INC
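
The channel switching described in this abstract can be illustrated with a minimal Python sketch. The class name, method names, and the device callables are hypothetical and are not taken from the patent; the sketch only shows one input channel being routed to whichever output channel was last selected, for example by a recognized voice command.

```python
class SwitchingInterface:
    """Routes a single input channel to one of several output channels."""

    def __init__(self, devices):
        # devices: mapping of spoken device name -> callable that actuates it
        # (hypothetical stand-ins for the surgical devices)
        self.devices = dict(devices)
        self.selected = None

    def select(self, command):
        """Handle a select-channel command, e.g. a recognized voice command."""
        name = command.strip().lower()
        if name not in self.devices:
            return False  # unknown device: keep the current routing
        self.selected = name
        return True

    def actuate(self, signal):
        """Forward an input signal (e.g. from a foot pedal) to the selected device."""
        if self.selected is None:
            raise RuntimeError("no output channel selected")
        return self.devices[self.selected](signal)
```

In this sketch an unrecognized select command leaves the routing unchanged, which is an assumed safety choice rather than something the abstract states.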

Dynamic interactive voice interface

A dynamic voice user interface system is provided. The dynamic voice user interface system interacts with a user at a first level of formality. The voice user interface system then monitors history of user interaction and adjusts the voice user interface to interact with the user with a second level of formality based on the history of user interaction.
Owner:INTELLECTUAL VENTURES I LLC
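
A rough Python sketch of the idea of moving between formality levels based on interaction history follows. The prompts, the threshold of three successful turns, and the reset-on-failure policy are all assumptions made for illustration; the abstract does not specify how the history is evaluated.

```python
from dataclasses import dataclass

# Hypothetical prompts for the two formality levels.
PROMPTS = {
    "formal": "Please say the name of the person you would like to call.",
    "casual": "Who do you want to call?",
}


@dataclass
class FormalityTracker:
    threshold: int = 3  # assumed number of consecutive successful turns
    successes: int = 0  # running count kept as the interaction history

    def record(self, success: bool) -> None:
        # Assumed policy: a failed turn resets the history.
        self.successes = self.successes + 1 if success else 0

    @property
    def level(self) -> str:
        return "casual" if self.successes >= self.threshold else "formal"

    def prompt(self) -> str:
        return PROMPTS[self.level]
```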

Method and system for the generation of a voice extensible markup language application for a voice interface process

A method and system for Extensible Markup Language (XML) application transformation may include converting a call flow diagram describing a voice interface process into a list of states in a XML format, and creating a lookup table of audio states in the XML format by mapping a plurality of audio prompts and their corresponding textual representations with states of a list of states that play audio files associated with the plurality of audio prompts. The method and system may include creating an intermediate application in the XML format and from the list of states by merging audio prompts in the lookup table with states of the list of states that play audio files, and transforming the intermediate application into a second application of a second format that is a representation of the call flow diagram.
Owner:MICROSOFT TECH LICENSING LLC

A Test System and method of Operation

A test system comprises a test processor which is arranged to perform hardware level tests on a unit under test. A voice interface interfaces to an external voice communication link coupled to a remote voice communication unit. A test controller is coupled to the test processor and the voice interface and comprises a script processor for executing a test control script. The test control script is in accordance with a voice scripting language standard, such as the Voice extensible Markup Language, VXML, standard. The script processor comprises a first interface for interfacing with the test processor in response to the test control script and a second interface for interfacing with the voice interface in response to the test control script. The invention may allow a user friendly speech interface to a hardware level test system.
Owner:EMERSON NETWORK POWER EMBEDDED COMPUTING

Automated speech-enabled application creation method and apparatus

A system for creating and hosting speech-enabled applications having a speech interface that can be customised by a user is disclosed. The system comprises a customisation module that manages the components, e.g. templates, needed to enable the user to create a speech-enabled application. The customisation module allows a non-expert user rapidly to design and deploy complex speech interfaces. Additionally, the system can automatically manage the deployment of the speech-enabled applications once they have been created by the user, without the need for any further intervention by the user or use of the user's own computer processing resources.
Owner:VOX GENERATION LTD

Speech interface for an automated endoscopic system

A robotic system which controls the movement of a surgical instrument in response to voice commands from the user. The robotic system has a computer controlled arm that holds the surgical instrument. The user provides voice commands to the computer through a microphone. The computer contains a phrase recognizer that matches the user's speech with words stored in the computer. Matched words are then processed to determine whether the user has spoken a robot command. If the user has spoken a recognized robot command the computer will move the robotic arm in accordance with the command.
Owner:INTUITIVE SURGICAL OPERATIONS INC

Terminal device for voice-directed work and information exchange

A device comprises processing circuitry operable for providing a speech interface that facilitates a speech dialog with a user or generates commands for a user. An RFID reader, operably coupled with the processing circuitry, is operable for reading data from an RFID tag. In one aspect, the reading occurs in the context of a speech dialog. In another aspect, the data is used to generate speech commands. In another aspect, information is stored to an RFID tag during the speech dialog.
Owner:VOCOLLECT

Constrain-free, imperceptible sleep disorder measuring device and its method

The invention discloses a constraint-free, imperceptible device and method for measuring sleep disorders. A sound pickup and a microprocessor are set in a pillow. Microphones are placed on both sides of the pillow, facing the mouth and nose of the person being monitored, and the sound they receive is fed to the speech interface of the microprocessor. The amplified sound signal is transmitted from the microphones to the microprocessor, which filters it and makes an intelligent judgment, measuring the sleep disorder without constraining the sleeper or making the sleeper aware of the measurement. The judgment covers breathing during sleep and symptoms of pauses in breathing during sleep. The result is sent through an interface to a specialist so that the sleep disorder can be treated and improved.
Owner:ZHEJIANG UNIV OF TECH

Operation method of web page speech interface

The operation method, suitable for a graphical user interface system, controls a web page through voice commands. The web page is operated by selecting from multiple content events. The method includes the following steps: receiving registration of the web page's content events; creating, from the data of each content event, a corresponding comparison signal and storing it in a comparison-table database; receiving a voice command and converting it into a signal in the same form as the comparison signals; searching the database to find the matching content event; and choosing whether the content event is displayed on the web page or executed by the appropriate command.
Owner:ACER INC

Method and apparatus for implementing a speech interface for a GUI

A method and apparatus for providing speech control to a graphical user interface (GUI) divide a GUI into a plurality of screen areas; assign the screen areas priorities; receive a first audio input relating to the selection of one of the objects in the interface; determine the one of the screen areas having the highest priority and including a first object matching the first audio input; and select the first object in the determined screen area if the determined screen area only contains one object matching the first audio input. The method and apparatus also select one of the objects that matches the first audio input in the determined screen area if the determined screen area contains more than one object that matches the first audio input.
Owner:SAP LABS
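
The priority-based selection in this abstract can be sketched in Python as below. The function name, the data layout, the use of substring matching, and the rule of returning the first match in an ambiguous area are assumptions for illustration; the abstract only says that one of the matching objects is selected.

```python
def select_object(areas, spoken):
    """Pick a GUI object for a spoken label.

    areas:  list of (priority, [object labels]); a larger number is assumed
            to mean a higher priority.
    spoken: recognized text of the audio input.
    Returns (label, was_ambiguous), or None if nothing matches.
    """
    spoken = spoken.lower()
    # Visit screen areas from highest to lowest priority.
    for _priority, labels in sorted(areas, key=lambda a: a[0], reverse=True):
        matches = [label for label in labels if spoken in label.lower()]
        if matches:
            # One match: select it directly. Several matches: this sketch
            # returns the first and flags the ambiguity for the caller.
            return matches[0], len(matches) > 1
    return None
```

A caller could use the ambiguity flag to prompt the user for a second utterance instead of acting immediately.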

Voice-Interfaced In-Vehicle Assistance

Voice-interfaced, in-vehicle assistance includes receiving a voice-based query from a user in the vehicle, and then determining at least one of a user emotional state, user expertise level and speech recognition confidence level associated with the voice-based query. A text-based query may then be derived from the voice-based query, and used to search a help database for answers corresponding to the voice-based query. At least one response is then provided to the user in the form of voice-based assistance in accordance with at least one of the user emotional state, user expertise level and speech recognition confidence level.
Owner:BAYERISCHE MOTOREN WERKE AG

Voice interface and methods for improving recognition accuracy of voice search queries

A system and associated methods are disclosed for improving voice recognition accuracy when a user conducts a search by voice. One method involves prompting the user to enter a set of characters of the query (e.g., the first N letters of a query term), and then using these letters to execute a preliminary search. The results of the preliminary search are then used to generate a dynamic grammar for interpreting the full voice query. The grammar may alternatively be retrieved from a cache or other memory that stores the grammars for various combinations of letters. In one embodiment, the user enters the characters by selecting the corresponding keys on a standard telephone keypad (one depression per letter) and then saying the letters, and the keypad entries are used to reduce the number of possible interpretations of each character utterance. Another method, which is useful for search refinement, involves generating a dynamic grammar from a set of search results (e.g., when the number of hits is large), and then using this grammar to interpret utterances of additional query terms to be added to the query.
Owner:A9 COM INC
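
As a hedged illustration of the keypad-assisted approach, the Python sketch below builds a small dynamic grammar from keypad entries. The catalog, the function names, and the simplification of using only the key presses (ignoring the spoken letters) are assumptions; a real system would combine the keypad entries with the recognizer's letter hypotheses.

```python
# Letters on a standard telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def matches_keys(term, keys):
    """True if the first len(keys) letters of term fit the pressed keys."""
    term = term.lower()
    if len(term) < len(keys):
        return False
    return all(ch in KEYPAD.get(key, "") for ch, key in zip(term, keys))


def build_grammar(catalog, keys):
    """Preliminary search: keep only catalog terms consistent with the keys.

    The returned list stands in for the dynamic grammar used to interpret
    the full voice query.
    """
    return sorted(term for term in catalog if matches_keys(term, keys))
```

With one key press per letter, pressing 4 then 2 narrows the candidates to terms whose first letter is g, h, or i and whose second letter is a, b, or c.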

System and method for a real time client server text to speech interface

A method and system may provide an interface (e.g., “API”), client side software module or other process that may accept an input from a client process such as a website, being executed on a local computer. The module may send the input and possibly authentication information to a remote server, which may produce text-to-speech content or output and transmit the output back to the module, which may produce the output for the client process. The module may be loaded by a security or bootstrap process. The module may analyze client side status, or may otherwise generate authentication or security conditions or information.
Owner:ODDCAST

Voice interface ocx

A medical dictation workflow system can be customized from the selection of available user application programs. A voice interface OCX can interface speech technologies with the selected user application programs of the medical dictation workflow system. The medical dictation workflow system may be directed to generating reports through filling out defined fields. The fields can be generated through a tracking system subscribing to a core reporting system and requesting certain information be captured or through a user. The voice interface OCX can provide macros so a user can customize the fields, navigate among the fields, or fill in the fields with data through a voice recognition engine or a wave player control. The data entered into the fields can be automatically entered into corresponding database elements of a database.
Owner:ATIRIX MEDICAL SYST

Multilevel speech recognition method and apparatus

A multilevel speech recognition method and an apparatus performing the method are disclosed. The method includes receiving a first speech command from a user through a speech interface, and extracting a keyword from the first speech command. The method also includes providing a candidate application group of a category providing a service associated with the keyword, and processing a second speech command from the user associated with an application selected from the candidate application group.
Owner:SAMSUNG ELECTRONICS CO LTD
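
The two-level flow in this abstract might look like the following Python sketch. The category table, the application names, and the substring-based selection are hypothetical; the abstract does not say how keywords are extracted or how the second command is matched.

```python
# Hypothetical mapping from keyword (service category) to candidate applications.
CATEGORIES = {
    "message": ["Mail", "Chat"],
    "music": ["Player", "Radio"],
}


def candidate_apps(first_command):
    """Level 1: extract a known keyword and return its candidate application group."""
    for word in first_command.lower().split():
        if word in CATEGORIES:
            return word, CATEGORIES[word]
    return None, []


def select_app(candidates, second_command):
    """Level 2: choose the application named in the second speech command."""
    spoken = second_command.lower()
    for app in candidates:
        if app.lower() in spoken:
            return app
    return None
```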

Speech interface for search engines

An embodiment provides search results from a speech initiated search query. The system receives voiced utterances from a user, converts the voiced utterances through use of a speech recognition application, system or method into data strings, identifies from the data strings a search engine identifier representing a search engine chosen by the user to perform a search, identifies from the data strings a query term to be searched for, modifies the query term to be searched for by replacing any spaces in the query term to be searched for with a query term separator compatible with the search engine represented by the search engine identifier thereby creating a modified query term to be searched for, constructs a uniform resource locator that includes the modified query term to be searched for such that the constructed uniform resource locator represents a valid request to the search engine represented by the search engine identifier, opens the constructed uniform resource locator which causes the chosen search engine to make a search for the modified query term, and provides the results of the search system to the user.
Owner:GOOGLE LLC
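
The URL construction step described above can be sketched in Python as follows. The engine table, the example.com and example.org addresses, and the convention that the first recognized word names the engine are assumptions made for illustration; they are not from the patent.

```python
from urllib.parse import quote

# Hypothetical engines: identifier -> (base URL, query term separator).
ENGINES = {
    "example": ("https://search.example.com/find?q=", "+"),
    "sample": ("https://sample.example.org/s/", "-"),
}


def build_search_url(utterance):
    """Turn recognized text such as 'example speech interface' into a URL.

    Assumes the first word is the search engine identifier and the rest is
    the query term. Returns None if the engine is unknown or the query is empty.
    """
    words = utterance.lower().split()
    if not words or words[0] not in ENGINES:
        return None
    base, separator = ENGINES[words[0]]
    # Percent-encode each word, then join with the engine's separator in
    # place of the spaces in the query term.
    terms = [quote(word, safe="") for word in words[1:]]
    if not terms:
        return None
    return base + separator.join(terms)
```

Opening the returned URL, for instance with the standard library's `webbrowser.open`, would correspond to the step that causes the chosen engine to run the search.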

Dynamic Interactive Voice Interface

Inactive; US20110246203A1; Less of a distraction for the user; Minimize extra and unnatural step; Automatic call-answering/message-recording/conversation-recording; Speech recognition; Zoom; Voice user interface
A dynamic voice user interface system is provided. The dynamic voice user interface system interacts with a user at a first level of formality. The voice user interface system then monitors history of user interaction and adjusts the voice user interface to interact with the user with a second level of formality based on the history of user interaction.
Owner:INTELLECTUAL VENTURES I LLC

Creating and editing web 2.0 entries including voice enabled ones using a voice only interface

The present invention discloses a method for creating Web 2.0 entries, such as WIKI entries. In the method, a voice communication channel can be established between a user and an automated response system. User speech input can be received over the voice communication channel. A Web 2.0 entry can be created based upon the speech input. The Web 2.0 entry can be saved in a data store accessible by a Web 2.0 server. The Web 2.0 server can serve the saved Web 2.0 entry to Web 2.0 clients. The Web 2.0 clients can include a graphical and / or a voice interface through which the Web 2.0 entry can be presented to users of the clients. The created Web 2.0 entries (e.g. Web 2.0 application) can be formatted in an ATOM PUBLISHING PROTOCOL compliant manner.
Owner:IBM CORP

Mobile voice platform architecture with remote service interfaces

A mobile voice platform for providing a user speech interface to computer-based services includes a mobile device having a processor, communication circuitry that provides access to the computer-based services, an operating system, and one or more applications that are run using the operating system and that utilize one or more of the computer-based services via the communication circuitry. The mobile voice platform includes at least one non-transient digital storage medium storing a program module having computer instructions that, upon execution by the processor, receives speech recognition results representing user speech that has been processed using automated speech recognition, determines a desired computer-based service based on the speech recognition results, accesses a remotely-stored service interface associated with the desired service, initiates the desired service using the service interface, receives a service result from the desired service, and provides a text-based service response for conversion to a speech response to be provided to the user.
Owner:GM GLOBAL TECH OPERATIONS LLC

Systems and methods for neural voice cloning with a few samples

Voice cloning is a highly desired capability for personalized speech interfaces. Neural network-based speech synthesis has been shown to generate high quality speech for a large number of speakers. Neural voice cloning systems that take a few audio samples as input are presented herein. Two approaches, speaker adaptation and speaker encoding, are disclosed. Speaker adaptation embodiments are based on fine-tuning a multi-speaker generative model with a few cloning samples. Speaker encoding embodiments are based on training a separate model to directly infer a new speaker embedding from cloning audios, which is used in or with a multi-speaker generative model. Both approaches achieve good performance in terms of naturalness of the speech and its similarity to the original speaker, even with very few cloning audios.
Owner:BAIDU USA LLC

Interactive voice browsing server for mobile devices on wireless networks

The wireless data communication system described herein generally includes a wireless mobile computing device, a WLAN infrastructure, a server, and a database. The mobile computing device includes voice interface functionality that enables the device to receive voice signals and generate voice commands for wireless transmission to the WLAN infrastructure. The server includes a voice command interpreter that processes received voice commands to identify the data requested by the user of the wireless mobile computing device. Once the requested data is identified, the server obtains the requested data from the database and generates a reply for the wireless mobile computing device. The reply is formatted such that it initiates a response at the wireless mobile computing device, where the response conveys at least a portion of the requested data.
Owner:SYMBOL TECH INC

Interactive voice browsing for mobile devices on wireless networks

The wireless data communication system described herein generally includes a wireless mobile computing device, a WLAN infrastructure, a server, and a database. The mobile computing device includes voice interface functionality that enables the device to receive voice signals and generate voice commands for wireless transmission to the WLAN infrastructure. The server includes a voice command interpreter that processes received voice commands to identify the data requested by the user of the wireless mobile computing device. Once the requested data is identified, the server obtains the requested data from the database and generates a reply for the wireless mobile computing device. The reply is formatted such that it initiates a response at the wireless mobile computing device, where the response conveys at least a portion of the requested data.
Owner:SYMBOL TECH INC

Non-native speech recognition system and method thereof

The invention relates to a non-native speech recognition system based on mixed model state correction and a method thereof. The non-native speech recognition system comprises a non-native speech interface, a native model module, a non-native model module, a native state decoding module, a non-native state forced alignment module, a native-non-native state similarity matrix computation module, a native-non-native state mapping table computation module and a non-native state correction model decoding module. In the system and the method thereof, a non-native acoustic model is corrected at the state level based on a native acoustic model of a speaker and state mapping among different models, thus obtaining a model that better meets non-native pronunciation characteristics. Compared with a recognition system that is not corrected by the method, the system and method markedly improve recognition performance using only native training data, without adding any non-native speech training data, cause no obvious drop in recognition speed, and are highly practical.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI +1

Automatic selection of a disambiguation data field for a speech interface

A method of disambiguating database search results can include retrieving multiple database entries responsive to a database search. The retrieved database entries can include a plurality of common data fields. The retrieved database entries can be processed according to predetermined speech interface criteria. At least one data field can be selected from the plurality of common data fields for uniquely identifying each retrieved database entry. The data items corresponding to the selected data field for each retrieved database entry can be presented through a speech interface.
Owner:NUANCE COMM INC
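
One way to picture the field selection is the Python sketch below. The function name and the ordered list of preferred fields are hypothetical; the ordered list stands in for the "predetermined speech interface criteria", which the abstract does not define.

```python
def pick_disambiguation_field(entries, preferred_order):
    """Return the first field whose values uniquely identify every entry.

    entries:         list of dicts sharing common data fields.
    preferred_order: fields to try, most speech-friendly first (an assumed
                     stand-in for the predetermined criteria).
    Returns None if no single field distinguishes all entries.
    """
    for field in preferred_order:
        values = [entry.get(field) for entry in entries]
        # The field qualifies only if every entry has it and no value repeats.
        if None not in values and len(set(values)) == len(values):
            return field
    return None
```

The data items of the returned field could then be read out through the speech interface, for example "the one in Austin or the one in Boston".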

Enhancement of digital image files

An apparatus to enhance digital image files includes at least a digital image file editing capability and may also include a digital image file capture capability. Upon capture of a digital image file, an editing module allows the digital image file to be enhanced by adding information as desired by the user such as context information and subject matter information. The information to be added may be received by the apparatus through a voice interface or a text interface and then converted (if necessary) for storage as metadata associated with the digital image file. The metadata can be stored in a field or in association with a tag that is part of the digital image file format.
Owner:KYOCERA CORP

Intelligent robot training method and intelligent robot training device, computer equipment and storage medium

The invention discloses an intelligent robot training method and an intelligent robot training device, computer equipment and a storage medium. The method comprises the following steps: acquiring a mode selection request; if the mode selection request is a training speech mode request, entering a training speech interface, acquiring a search request, and acquiring at least one search keyword based on the search request; searching a training database based on at least one search keyword, acquiring a target presentation file and target handout speech data which are mutually associated, and displaying the target presentation file; meanwhile, playing the target handout speech data which are mutually associated with the target presentation file. According to the technical scheme provided by the invention, to-be-trained persons are trained by adopting the intelligent robot training method, and the effect of training the to-be-trained persons one by one according to different demands is achieved, so that personalized demands of the to-be-trained persons are effectively met.
Owner:PING AN TECH (SHENZHEN) CO LTD

Method, apparatus and computer-readable media for touch and speech interface with audio location

Method, apparatus, and computer-readable media for touch and speech interface, with audio location, include structure and/or function whereby at least one processor: (i) receives a touch input from a touch device; (ii) establishes a touch-speech time window; (iii) receives a speech input from a speech device; (iv) determines whether the speech input is present in a global dictionary; (v) determines a location of a sound source from the speech device; (vi) determines whether the touch input and the location of the speech input are both within a same region; (vii) if the speech input is in the dictionary, determines whether the speech input has been received within the window; and (viii) if the speech input has been received within the window, and the touch input and the speech input are both within the same region, activates an action corresponding to both the touch input and the speech input.
Owner:NUREVA INC
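
The sequence of checks in this abstract can be sketched in Python as below. The data classes, the two-second window that opens at the touch, and the representation of regions as plain strings are assumptions for illustration; the abstract does not state how the window or the regions are defined.

```python
from dataclasses import dataclass


@dataclass
class Touch:
    time: float   # seconds
    region: str   # screen region containing the touch
    target: str   # object that was touched


@dataclass
class Speech:
    time: float   # seconds
    region: str   # region containing the located sound source
    word: str     # recognized speech input


def fuse(touch, speech, dictionary, window=2.0):
    """Return (action, target) if touch and speech belong together, else None.

    dictionary maps recognized words to action names (a stand-in for the
    global dictionary). The window is assumed to start at the touch.
    """
    if speech.word not in dictionary:
        return None  # speech input is not in the dictionary
    if not (touch.time <= speech.time <= touch.time + window):
        return None  # speech arrived outside the touch-speech time window
    if touch.region != speech.region:
        return None  # touch and sound source are in different regions
    return dictionary[speech.word], touch.target
```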