10704 results about "Subvocal recognition" patented technology

Subvocal recognition (SVR) is the process of taking subvocalization and converting the detected results to a digital output, aural or text-based.

Multimodal disambiguation of speech recognition

The present invention provides a speech recognition system combined with one or more alternate input modalities to ensure efficient and accurate text input. The speech recognition system achieves less than perfect accuracy due to limited processing power, environmental noise, and / or natural variations in speaking style. The alternate input modalities use disambiguation or recognition engines to compensate for reduced keyboards, sloppy input, and / or natural variations in writing style. The ambiguity remaining in the speech recognition process is mostly orthogonal to the ambiguity inherent in the alternate input modality, such that the combination of the two modalities resolves the recognition errors efficiently and accurately. The invention is especially well suited for mobile devices with limited space for keyboards or touch-screen input.
Owner:TEGIC COMM
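
The abstract above argues that speech ambiguity and reduced-keyboard ambiguity are mostly orthogonal, so combining them removes most errors. A minimal sketch of that idea follows, assuming a T9-style keypad and a scored n-best list from the recognizer. The names `KEYPAD`, `key_sequence`, and `disambiguate` are hypothetical and are not taken from the patent.

```python
# Hypothetical T9-style layout, used only for illustration.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def key_sequence(word):
    """Map a word to the digit sequence a reduced keypad would produce."""
    return "".join(LETTER_TO_KEY.get(ch, "?") for ch in word.lower())


def disambiguate(speech_nbest, keys):
    """Keep speech hypotheses consistent with the key presses, best score first.

    speech_nbest: list of (word, score) pairs from a speech recognizer (assumed).
    keys: digit string typed on the reduced keypad.
    """
    matches = [(word, score) for word, score in speech_nbest
               if key_sequence(word) == keys]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in matches]


nbest = [("home", 0.45), ("foam", 0.40), ("comb", 0.15)]
print(disambiguate(nbest, "4663"))  # ['home']
```

On the keypad alone, "4663" could also be "good" or "gone"; on speech alone, "home" and "foam" score close together. Only "home" survives both filters.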

Portable computing apparatus particularly useful in a weight management program

Portable computing apparatus for aiding a user in the monitoring of the consumption of consumable items, such as food items or prescribed medicaments, and reordering such items includes a common database for use in monitoring the items as consumed, and for preparing the reorder list at the proper time. The apparatus preferably includes an imaging device for recording the image of the item to be consumed, and recognition circuitry for utilizing the recorded image to identify the item and also to provide information concerning its nutritional content in a weight management program. The consumable item may also be identified in other manners, such as by a barcode reader, or a voice-recognition circuit.
Owner:HEALTHETECH

Multiple web-based content category searching in mobile search application

In embodiments of the present invention improved capabilities are described for multiple web-based content category searching for web content on a mobile communication facility comprising capturing speech presented by a user using a resident capture facility on the mobile communication facility; transmitting at least a portion of the captured speech as data through a wireless communication facility to a speech recognition facility; generating speech-to-text results for the captured speech utilizing the speech recognition facility; and transmitting the text results and a plurality of formatting rules specifying how search text may be used to form a query for a search capability on the mobile communications facility, wherein each formatting rule is associated with a category of content to be searched.
Owner:VLINGO CORP +1

Sending a communications header with voice recording to send metadata for use in speech recognition, formatting, and search in mobile search application

In embodiments of the present invention improved capabilities are described for sending a communications header with the voice recording to send metadata for use in speech recognition, formatting, and search in searching for web content on a mobile communication facility comprising capturing speech presented by a user using a resident capture facility on the mobile communication facility; transmitting a communications header to a speech recognition facility from the mobile communication facility through a wireless communications facility, wherein the communications header includes at least one of device name, network type, audio source, display parameters for the wireless communications facility, geographic location, and phone number information; transmitting at least a portion of the captured speech as data through the wireless communication facility to a speech recognition facility; generating speech-to-text results utilizing the speech recognition facility based at least in part on the information relating to the captured speech and the communications header; and transmitting text from the speech-to-text results along with URL usage information configured to enable a user to conduct a search on the mobile communication facility.
Owner:NUANCE COMM INC +1

System for handling frequently asked questions in a natural language dialog service

A voice-enabled help desk service is disclosed. The service comprises an automatic speech recognition module for recognizing speech from a user, a spoken language understanding module for understanding the output from the automatic speech recognition module, a dialog management module for generating a response to speech from the user, a natural voices text-to-speech synthesis module for synthesizing speech to generate the response to the user, and a frequently asked questions module. The frequently asked questions module handles frequently asked questions from the user by changing voices and providing predetermined prompts to answer the frequently asked question.
Owner:NUANCE COMM INC

Method for processing the output of a speech recognizer

A system and method for processing speech input comprising a speech recognizer and a logical command processor which facilitates additional processing of speech input beyond the speech recognizer level. A speech recognizer receives input from a user, and when a command is identified in the speech input, if the command meets conditions that require additional processing, a representation of the speech input is stored for subsequent processing. A logical command processor performs additional processing of command input by analyzing the command and its elements, determining which elements are required for successfully processing the command, which elements are present, and which are lacking. The user is prompted to supply missing information, and subsequent user input is added to the command structure until the command input is aborted or the command structure reaches sufficient completeness to enable execution of the command. Thereby, speech input of complex commands in natural language in a system running a plurality of applications and processes is made possible.
Owner:GREAT NORTHERN RES
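
The prompt-until-complete loop described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the `REQUIRED` table, the `send_email` command, and the literal reply `"abort"` are all hypothetical, and user replies are passed in as a list instead of being captured from speech.

```python
# Hypothetical table of required elements per command.
REQUIRED = {"send_email": ["recipient", "subject"]}


def missing_elements(command, elements):
    """Return the required elements that the command structure still lacks."""
    return [name for name in REQUIRED[command] if name not in elements]


def process(command, elements, replies):
    """Fill missing elements from user replies until complete or aborted."""
    elements = dict(elements)  # work on a copy of the command structure
    replies = iter(replies)
    while True:
        missing = missing_elements(command, elements)
        if not missing:
            return "execute", elements
        reply = next(replies, None)  # stands in for prompting the user
        if reply is None or reply == "abort":
            return "aborted", elements
        elements[missing[0]] = reply


print(process("send_email", {"recipient": "alice"}, ["status report"]))
# ('execute', {'recipient': 'alice', 'subject': 'status report'})
```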

Enhanced speech-to-speech translation system and methods

A speech translation system and methods for cross-lingual communication that enable users to improve and modify content and usage of the system and easily abort or reset translation. The system includes a speech recognition module configured for accepting an utterance, a machine translation module, an interface configured to communicate the utterance and proposed translation, a correction module and an abort action unit that removes any hypotheses or partial hypotheses and terminates translation. The system also includes modules for storing favorites, changing language mode, automatically identifying language, providing language drills, viewing third party information relevant to conversation, among other things.
Owner:META PLATFORMS INC

Method for Voice Activation of a Software Agent from Standby Mode

A method for voice activation of a software agent from a standby mode. In one embodiment, an audio recording (2) is buffered in an audio buffer (6) and at the same time, the audio recording is input to a secondary voice recognition process (7) which is economical in terms of energy and has an increased false positive rate. When a keyword is recognized, a primary voice recognition process (8) is activated from an inactive state, which converts the audio buffer to text and inputs it to a dialog system (9) which analyzes as to whether there is a relevant question made by the user. If this is the case, the user gets an acoustic reply (3), and if this is not the case, the dialog system and the primary voice recognition process immediately return to the inactive state and transfer the control to the secondary voice recognition process.
Owner:INODYN NEWMEDIA
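
A rough sketch of the two-stage activation described above, assuming text strings stand in for audio frames. The cheap substring check plays the role of the low-energy secondary process (and, like it, produces false positives), while joining the buffer plays the role of the primary recognizer. `run_agent` and `is_relevant` are hypothetical names.

```python
from collections import deque


def run_agent(frames, keyword, is_relevant, buffer_size=8):
    """Simulate standby activation over a stream of text 'frames'."""
    buffer = deque(maxlen=buffer_size)  # rolling audio buffer
    replies = []
    for frame in frames:
        buffer.append(frame)
        # Secondary process: cheap keyword check that tolerates false positives.
        if keyword in frame.lower():
            # Primary process: "transcribe" everything buffered so far.
            text = " ".join(buffer)
            # Dialog system: reply only if the buffered text is a relevant question.
            if is_relevant(text):
                replies.append("reply to: " + text)
            # Either way, return to standby with an empty buffer.
            buffer.clear()
    return replies


asks_time = lambda text: "time" in text
print(run_agent(["what time", "is it", "computer"], "computer", asks_time))
# ['reply to: what time is it computer']
```

Because the buffer holds frames recorded before the keyword, a question that ends with the keyword is still captured in full.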

Interface with Gaze Detection and Voice Input

Methods, computer programs, and systems for interfacing a user with a computer program, utilizing gaze detection and voice recognition, are provided. One method includes an operation for determining if a gaze of a user is directed towards a target associated with the computer program. The computer program is set to operate in a first state when the gaze is determined to be on the target, and set to operate in a second state when the gaze is determined to be away from the target. When operating in the first state, the computer program processes voice commands from the user, and, when operating in the second state, the computer program omits processing of voice commands.
Owner:SONY COMPUTER ENTERTAINMENT INC
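
The two-state behaviour in the abstract above amounts to a small gate on voice input. The sketch below assumes a simplified event stream of `("gaze", bool)` and `("voice", str)` tuples; that event format and the name `handle_events` are hypothetical.

```python
def handle_events(events):
    """Process voice commands only while the gaze is on the target."""
    gaze_on_target = False  # start in the second state (voice ignored)
    processed = []
    for kind, value in events:
        if kind == "gaze":
            gaze_on_target = value
        elif kind == "voice" and gaze_on_target:
            processed.append(value)
    return processed


events = [
    ("voice", "pause"),   # ignored: user is not looking at the target
    ("gaze", True),
    ("voice", "open map"),
    ("gaze", False),
    ("voice", "quit"),    # ignored again
]
print(handle_events(events))  # ['open map']
```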

Speech recognition repair using contextual information

A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application.
Owner:APPLE INC

Apparatus and Method for Analyzing Intention

An apparatus and system for analyzing intention are provided. The apparatus for analyzing an intention applies a context-free grammar to each of one or more sentences in units of one or more phrases to perform phrase spotting on each sentence, thereby extending a recognition range for an out-of-grammar (OOG) expression. Meanwhile, the apparatus for analyzing an intention determines whether sentences that have undergone phrase spotting are grammatically valid by applying a dependency grammar to the sentences to filter out invalid sentences, and generates the intention analysis result of a valid sentence, thereby grammatically and / or semantically verifying a sentence that has undergone speech recognition while extending a speech recognition range.
Owner:SAMSUNG ELECTRONICS CO LTD

Training and using pronunciation guessers in speech recognition

The error rate of a pronunciation guesser that guesses the phonetic spelling of words used in speech recognition is improved by causing its training to weigh letter-to-phoneme mappings used as data in such training as a function of the frequency of the words in which such mappings occur. Preferably the ratio of the weight to word frequency increases as word frequency decreases. Acoustic phoneme models for use in speech recognition with phonetic spellings generated by a pronunciation guesser that makes errors are trained against word models whose phonetic spellings have been generated by a pronunciation guesser that makes similar errors. As a result, the acoustic models represent blends of phoneme sounds that reflect the spelling errors made by the pronunciation guessers. Speech recognition enabled systems are made by storing in them both a pronunciation guesser and a corresponding set of such blended acoustic models.
Owner:CERENCE OPERATING CO
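
The frequency-weighting condition in the abstract above (the weight-to-frequency ratio grows as word frequency falls) is satisfied by any sublinear weight. The sketch below assumes a square-root weight purely as an example; the abstract does not specify the function, and `training_weight` and `weighted_mapping_counts` are hypothetical names.

```python
def training_weight(word_frequency, exponent=0.5):
    """Sublinear weight: weight / frequency rises as frequency falls."""
    return word_frequency ** exponent


def weighted_mapping_counts(lexicon):
    """Accumulate weighted counts of letter-to-phoneme mappings.

    lexicon: list of (word_frequency, [(letter, phoneme), ...]) entries.
    """
    counts = {}
    for frequency, mappings in lexicon:
        weight = training_weight(frequency)
        for mapping in mappings:
            counts[mapping] = counts.get(mapping, 0.0) + weight
    return counts


# A word seen 100 times counts 10x, not 100x; a word seen 4 times counts 2x.
lexicon = [(100, [("c", "k")]), (4, [("c", "s")])]
print(weighted_mapping_counts(lexicon))
# {('c', 'k'): 10.0, ('c', 's'): 2.0}
```

With raw frequencies the common word would outweigh the rare one 25 to 1; with this weighting the ratio drops to 5 to 1, so mappings from rare words keep some influence.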

System and method of a list commands utility for a speech recognition command system

In embodiments of the present invention, a system and computer-implemented method for enabling a user to interact with a mobile device using a voice command may include the steps of defining a structured grammar for generating a global voice command, defining a global voice command of the structured grammar, wherein the global voice command enables access to an object of the mobile device using a single command, and mapping at least one function of the object to the global voice command, wherein upon receiving voice input from the user of the mobile device, the object recognizes the global voice command and controls the function.
Owner:PATCH KIMBERLY C

Method for processing speech signal features for streaming transport

Speech signal information is formatted, processed and transported in accordance with a format adapted for TCP / IP protocols used on the Internet and other communications networks. NULL characters are used for indicating the end of a voice segment. The method is useful for distributed speech recognition systems such as a client-server system, typically implemented on an intranet or over the Internet based on user queries at his / her computer, a PDA, or a workstation using a speech input interface.
Owner:NUANCE COMM INC
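
The NULL-terminated segment framing mentioned above can be illustrated with a small receiver-side helper. This sketch assumes the feature payload is encoded so that it never contains a NULL byte itself; the abstract does not describe the encoding, and `split_segments` is a hypothetical name.

```python
def split_segments(stream: bytes):
    """Split received bytes into complete voice segments and a remainder.

    A NULL byte marks the end of each segment. Bytes after the last NULL
    belong to a segment that has not finished arriving yet.
    """
    parts = stream.split(b"\x00")
    complete, remainder = parts[:-1], parts[-1]
    return complete, remainder


received = b"seg1\x00seg2\x00par"
print(split_segments(received))  # ([b'seg1', b'seg2'], b'par')
```

A receiver reading from a TCP socket would prepend the remainder to the next chunk before splitting again, since TCP does not preserve message boundaries.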

Multimodal interface for input of text

The disclosure describes an overall system / method for text-input using a multimodal interface with a combination of speech recognition and text prediction. Specifically, an “always listening” mode for entering words is combined with a push-to-speak mode for entering symbols and phrases. In addition, these two modes are further combined with keypad based text prediction. Finally, the overall user-interface of the proposed system is designed such that it enhances existing standard text-input methods; thereby minimizing the behavior change for mobile users.
Owner:RAO ASHWIN P +2

Mobile office with speech recognition

A user interface for a mobile office is provided for allowing simple, safe, and convenient access to electronic mail, calendar, news, and web browser functions. The dialog or number of steps required to access desired items is minimized using a state controller responsive to voice commands and manual activations of reconfigurable steering wheel switches. Someone unfamiliar with the user interface is assisted by prompts for various commands and can use the mobile office without needing to resort to use of the reconfigurable steering wheel control elements. A more experienced user can bypass prompts by interrupting them with voice commands and can quickly move through various steps by utilizing the configurable steering wheel control elements to gain access to individual items within the mail, calendar, and news functions.
Owner:VISTEON GLOBAL TECH INC

Method, medium and apparatus for providing mobile voice web service

Provided are a method and apparatus for providing a mobile voice web service in a mobile terminal. The method includes analyzing a web history of a user from web search logs of the user and generating a voice access list based on the analysis results, and performing voice recognition by dynamically generating a voice recognition syntax according to the generated voice access list. Accordingly, by limiting syntax required for voice recognition by generating a syntax suitable for a web context of the user, efficient voice recognition, which can be performed in a terminal not a server, can be implemented.
Owner:SAMSUNG ELECTRONICS CO LTD
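
One way to picture the voice access list described above is to rank sites by visit count and accept only utterances that match the resulting list. This is a toy sketch: the `.example` host names are placeholders, text strings stand in for recognizer hypotheses, and `voice_access_list` and `recognize` are hypothetical names.

```python
from collections import Counter


def voice_access_list(history, size=3):
    """Build a small access list from the most frequently visited sites."""
    return [site for site, _ in Counter(history).most_common(size)]


def recognize(hypothesis, access_list):
    """Accept a hypothesis only if the restricted syntax allows it."""
    return hypothesis if hypothesis in access_list else None


history = [
    "news.example", "mail.example", "news.example",
    "maps.example", "news.example", "mail.example",
]
allowed = voice_access_list(history, size=2)
print(allowed)                             # ['news.example', 'mail.example']
print(recognize("maps.example", allowed))  # None
```

Keeping the list short is what makes on-device recognition plausible: the recognizer only has to discriminate among a handful of entries.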

System for automated translation of speech

The present invention allows subscribers to an online information service to participate in real-time conferencing or chat sessions in which a message originating from a subscriber in accordance with a first language is translated to one or more languages before it is broadcast to the other conference areas. Messages in a first language are translated automatically to one or more other languages through language translation capabilities resident at online information service host computers. Access software that subscribers use for participating in conference is integrated with speech recognition and speech generation software such that a subscriber may speak the message he or she would like to share with other participants and may hear the messages from the other participants in the conference. Speech-to-speech translation may be accomplished as a message spoken into a computer microphone in accordance with a first language may be recited by a remote computer in accordance with a second language.
Owner:META PLATFORMS INC

Home control, monitoring and communication system using remote voice commands

A voice-activated command interface module for interacting with a plurality of home-based electronic devices so as to allow for a remotely-located home owner to communicate with, command and control various ones of the electronic devices. The module includes a plurality of communication ports, each communication port associated with a different type of communication interface for providing communications to and from the plurality of electronic devices. The module also includes a voice network communication port for receiving the voice commands from the home owner and a data network communication port for transmitting monitoring and control information between the plurality of electronic devices and the home owner. In operation, the command interface module is responsive to voice commands received from a remote user via an incoming telephone line (either data or voice). A voice recognition unit within the command interface module is utilized to translate the received voice signal into an “action / control” signal and then perform the desired activity.
Owner:AMERICAN TELEPHONE & TELEGRAPH CO

System, Method, Apparatus and Computer Program Product for Providing Dynamic Vocabulary Prediction for Speech Recognition

Inactive | US20080154600A1 | Improve speech processing | Accurate word recognition | Speech recognition | Natural language processing | Confidence measures
An apparatus for providing dynamic vocabulary prediction for setting up a speech recognition network of resource constrained portable devices may include a recognition network element. The recognition network element may be configured to determine a confidence measure for each candidate recognized word for a current word to be recognized. The recognition network element may also be configured to select a subset of candidate recognized words as selected candidate words based on the confidence measure of each one of the candidate recognized words, and determine a recognition network for a next word to be recognized, the recognition network including likely follower words for each of the selected candidate words, e.g. using language model and highly frequently used words.
Owner:NOKIA CORP
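
The selection step described above (keep the most confident candidates for the current word, then build the next recognition network from their likely followers plus frequent words) could look roughly like this. The `followers` table stands in for a language model, and `next_network`, `top_k`, and `frequent_words` are hypothetical names.

```python
def next_network(candidates, followers, top_k=2, frequent_words=()):
    """Pick the top candidates and build the vocabulary for the next word.

    candidates: dict mapping candidate word -> confidence measure.
    followers: dict mapping word -> likely follower words (assumed to come
               from a language model).
    """
    selected = sorted(candidates, key=candidates.get, reverse=True)[:top_k]
    network = set(frequent_words)  # always include very frequent words
    for word in selected:
        network.update(followers.get(word, ()))
    return selected, network


candidates = {"call": 0.80, "tall": 0.15, "fall": 0.05}
followers = {"call": ["mom", "home"], "tall": ["building"], "fall": ["down"]}
selected, network = next_network(candidates, followers, frequent_words=["the"])
print(selected)         # ['call', 'tall']
print(sorted(network))  # ['building', 'home', 'mom', 'the']
```

Dropping the low-confidence candidate "fall" also drops its follower "down", which is how the network stays small enough for a resource-constrained device.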

Providing menu and other services for an information processing system using a telephone or other audio interface

A method and system for providing efficient menu services for an information processing system that uses a telephone or other form of audio user interface. In one embodiment, the menu services provide effective support for novice users by providing a full listing of available keywords and rotating house advertisements which inform novice users of potential features and information. For experienced users, cues are rendered so that at any time the user can say a desired keyword to invoke the corresponding application. The menu is flat to facilitate its usage. Full keyword listings are rendered after the user is given a brief cue to say a keyword. Service messages rotate words and word prosody. When listening to receive information from the user, after the user has been cued, soft background music or other audible signals are rendered to inform the user that a response may now be spoken to the service. Other embodiments determine default cities, on which to report information, based on characteristics of the caller or based on cities that were previously selected by the caller. Other embodiments provide speech concatenation processes that have co-articulation and real-time subject-matter-based word selection which generate human sounding speech. Other embodiments reduce the occurrences of falsely triggered barge-ins during content delivery by only allowing interruption for certain special words. Other embodiments offer special services and modes for calls having voice recognition trouble. The special services are entered after predetermined criteria have been met by the call. Other embodiments provide special mechanisms for automatically recovering the address of a caller.
Owner:MICROSOFT TECH LICENSING LLC

Digital picture frame and method for editing

Inactive | US20060170669A1 | Prevent attackers from duplicating | Editing and manipulating pictures is simplified | Television system details | Color television details | Computer module | Digital pictures
The invention provides a digital picture frame that allows a user to edit a displayed picture using simple and intuitive controls. Modifications to an image may be stored by the digital picture frame so that the digital picture frame may later display the edited or modified version of the picture rather than the original version. A user may edit a picture using mechanical controls (e.g., knobs, switches, slider-bars, wheels), sensors (e.g., a position sensor, a tilt sensor, a microphone, a light sensor), a voice recognition module, and / or a touch screen. A digital picture frame may identify a user and based on the user's identity (e.g., the user's preferences or permissions), may display pictures to the user. Further, different users may modify a picture in different ways, so that two different users may view two different versions of the same picture.
Owner:WALKER DIGITAL

System and method for input of text to an application operating on a device

A device comprises a display screen and an audio circuit for generating an audio signal representing spoken words uttered by the user. A processor executes a first application, a second application, and a text mark-up object. The first application may render a depiction of text on the display screen. The text mark-up object may: i) receive at least a portion of the audio signal representing spoken words uttered by the user; ii) perform speech recognition to generate a text representation of the spoken words uttered by the user; iii) determine a selected text segment; and iv) perform an input function to input the selected text segment to the second application. The selected text segment may be text which corresponds to both a portion of the depiction of text on the display screen and the text representation of the spoken words uttered by the user.
Owner:SONY ERICSSON MOBILE COMM AB