
45,606 results about "Speech recognition" patented technology

Intelligent Automated Assistant

An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionality powered by external services with which the system can interact.
Owner:APPLE INC

System and methods for recognizing sound and music signals in high noise and distortion

A method for recognizing an audio sample locates an audio file that most closely matches the audio sample from a database indexing a large set of original recordings. Each indexed audio file is represented in the database index by a set of landmark timepoints and associated fingerprints. Landmarks occur at reproducible locations within the file, while fingerprints represent features of the signal at or near the landmark timepoints. To perform recognition, landmarks and fingerprints are computed for the unknown sample and used to retrieve matching fingerprints from the database. For each file containing matching fingerprints, the landmarks are compared with landmarks of the sample at which the same fingerprints were computed. If a large number of corresponding landmarks are linearly related, i.e., if equivalent fingerprints of the sample and retrieved file have the same time evolution, then the file is identified with the sample. The method can be used for any type of sound or music, and is particularly effective for audio signals subject to linear and nonlinear distortion such as background noise, compression artifacts, or transmission dropouts. The sample can be identified in a time proportional to the logarithm of the number of entries in the database; given sufficient computational power, recognition can be performed in nearly real time as the sound is being sampled.
Owner:APPLE INC
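
The matching step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes fingerprints have already been computed as integer tokens paired with landmark times, and it only handles the simplest linear relation (a constant time offset between sample and file, with no time stretching). The names `build_index`, `identify`, and `min_votes` are hypothetical.

```python
from collections import Counter, defaultdict

def build_index(files):
    """Map each fingerprint to the (file_id, landmark_time) pairs where it occurs."""
    index = defaultdict(list)
    for file_id, prints in files.items():
        for fingerprint, t in prints:
            index[fingerprint].append((file_id, t))
    return index

def identify(index, sample, min_votes=3):
    """Return (file_id, votes) for the best match, or None.

    For every fingerprint shared by the sample and a file, record the
    offset between the two landmark times. Many matches with the same
    offset mean the landmarks are linearly related (slope 1 assumed here).
    """
    offsets = defaultdict(Counter)
    for fingerprint, t_sample in sample:
        for file_id, t_file in index.get(fingerprint, ()):
            offsets[file_id][t_file - t_sample] += 1

    best = None
    for file_id, counter in offsets.items():
        _, votes = counter.most_common(1)[0]  # size of the largest offset cluster
        if votes >= min_votes and (best is None or votes > best[1]):
            best = (file_id, votes)
    return best

# Toy data: (fingerprint, landmark_time) pairs, invented for illustration.
files = {
    "song_a": [(11, 0), (22, 5), (33, 9), (44, 14), (55, 20)],
    "song_b": [(22, 1), (66, 4), (33, 30), (77, 31)],
}
index = build_index(files)
sample = [(22, 0), (33, 4), (44, 9), (99, 12)]
result = identify(index, sample)
print(result)
```

In this toy data, three of the sample's fingerprints line up with "song_a" at the same offset, while the two hits in "song_b" fall at unrelated offsets, so only "song_a" passes the vote threshold.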

System and method for generating voice pages with included audio files for use in a voice page delivery system

A content provider system enables content providers to create voice pages that include audio files, for use in a voice page delivery network in which subscribers request a voice page and a voice page server system delivers the voice page audibly to the subscriber. A content provider selects a voice page into which the audio file is to be incorporated and selects the audio file. The content provider system then transfers the audio file to a voice page server system, which generates a voice page with the audio file included using XML-based tags designated for audio files. The audio files are uploaded from a number of user devices including a telephony device, a web-based system and a PDA.
Owner:GENESYS TELECOMMUNICATIONS LABORATORIES INC

Interactive speech recognition device and system for hands-free building control

A self-contained wireless interactive speech recognition control device and system that integrates with automated systems and appliances to provide totally hands-free speech control capabilities for a given space. Preferably, each device comprises a programmable microcontroller having embedded speech recognition and audio output capabilities, a microphone, a speaker and a wireless communication system through which a plurality of devices can communicate with each other and with one or more system controllers or automated mechanisms. The device may be enclosed in a stand-alone housing or within a standard electrical wall box. Several devices may be installed in close proximity to one another to ensure hands-free coverage throughout the space. When two or more devices are triggered simultaneously by the same speech command, real time coordination ensures that only one device will respond to the command.
Owner:ROSENBERGER THEODORE ALFRED
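
The abstract does not say how the devices coordinate, so the following is only one possible sketch of the "only one device responds" rule. It assumes each triggered device reports a recognition confidence for the command, and that the highest confidence wins with ties broken by device id. `Trigger` and `elect_responder` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trigger:
    device_id: str     # hypothetical identifier of the device that heard the command
    command: str       # recognized speech command
    confidence: float  # assumed recognition confidence in [0, 1]

def elect_responder(triggers):
    """Choose exactly one responding device per command.

    Assumed rule: highest confidence wins; ties go to the
    lexicographically smallest device_id so every device that runs
    this function on the same reports reaches the same answer.
    """
    by_command = {}
    for trig in triggers:
        by_command.setdefault(trig.command, []).append(trig)
    return {
        command: min(group, key=lambda t: (-t.confidence, t.device_id)).device_id
        for command, group in by_command.items()
    }

triggers = [
    Trigger("hall", "lights on", 0.62),
    Trigger("kitchen", "lights on", 0.91),
    Trigger("den", "lights on", 0.91),
]
responders = elect_responder(triggers)
print(responders)
```

Because the rule is deterministic, devices that exchange the same trigger reports can each compute the winner locally without a central arbiter.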

Synchronized pattern recognition source data processed by manual or automatic means for creation of shared speaker-dependent speech user profile

Inactive | US20060149558A1 | Avoids time-consuming generation | Maximize likelihood | Speech recognition | Graphics | Data segment
An apparatus for collecting data from a plurality of diverse data sources, the diverse data sources generating input data selected from the group including text, audio, and graphics, the diverse data sources selected from the group including real-time and recorded, human and mechanically-generated audio, single-speaker and multispeaker, the apparatus comprising: means for dividing the input data into one or more data segments, the dividing means acting separately on the input data from each of the plurality of diverse data sources, each of the data segments being associated with at least one respective data buffer such that each of the respective data buffers would have the same number of segments given the same data; means for selective processing of the data segments within each of the respective data buffers; and means for distributing at least one of the respective data buffers such that the collected data associated therewith may be used for further processing.
Owner:CUSTOM SPEECH USA

Automatically Adapting User Interfaces For Hands-Free Interaction

A user interface for a system such as a virtual assistant is automatically adapted for hands-free use. A hands-free context is detected via automatic or manual means, and the system adapts various stages of a complex interactive system to modify the user experience to reflect the particular limitations of such a context. The system of the present invention thus allows for a single implementation of a complex system such as a virtual assistant to dynamically offer user interface elements and alter user interface behavior to allow hands-free use without compromising the user experience of the same system for hands-on use.
Owner:APPLE INC

Natural language task-oriented dialog manager and method

A system for conversant interaction includes a recognizer for receiving and processing input information and outputting a recognized representation of the input information. A dialog manager is coupled to the recognizer for receiving the recognized representation of the input information, the dialog manager having task-oriented forms for associating user input information therewith, the dialog manager being capable of selecting an applicable form from the task-oriented forms responsive to the input information by scoring the forms relative to each other. A synthesizer is employed for converting a response generated by the dialog manager to output the response. A program storage device and method are also provided.
Owner:NUANCE COMM INC
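
Selecting a form by scoring the forms relative to each other can be sketched as follows. The scoring function is an assumption, since the abstract does not define one: here each form is scored by the share of its slots that the recognized input fills. The form names, slot names, and the functions `score_forms` and `select_form` are all hypothetical.

```python
# Hypothetical task-oriented forms, each defined by the slots it can hold.
FORMS = {
    "book_flight": {"origin", "destination", "date"},
    "check_weather": {"city", "date"},
    "set_alarm": {"time"},
}

def score_forms(recognized, forms=FORMS):
    """Score every form by the fraction of its slots present in the input."""
    keys = set(recognized)
    return {name: len(keys & slots) / len(slots) for name, slots in forms.items()}

def select_form(recognized, forms=FORMS):
    """Return the highest-scoring form, or None when no form matches at all."""
    scores = score_forms(recognized, forms)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

# Recognized representation of an utterance such as "fly to Boston tomorrow".
recognized = {"destination": "Boston", "date": "tomorrow"}
chosen = select_form(recognized)
print(chosen)
```

With this input, "book_flight" fills two of three slots and outranks "check_weather", which fills one of two, so the dialog manager would continue by asking for the missing origin.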

System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input

Systems and methods are provided for performing focus detection, referential ambiguity resolution and mood classification in accordance with multi-modal input data, in varying operating conditions, in order to provide an effective conversational computing environment for one or more users.
Owner:IBM CORP

Conversational computing via conversational virtual machine

A conversational computing system provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.
Owner:UNILOC 2017 LLC