
3554 results about "Speech input" patented technology

Speech input is one of the most innovative browser technologies to appear in recent months. It’s easy to implement and there are several obvious uses: assistive dictation for those with impaired mobility, an alternative input option for mobile phones and tablets, and any environment where a keyboard or mouse is impractical.

Distributed voice user interface

A distributed voice user interface system includes a local device which receives speech input issued from a user. Such speech input may specify a command or a request by the user. The local device performs preliminary processing of the speech input and determines whether it is able to respond to the command or request by itself. If not, the local device initiates communication with a remote system for further processing of the speech input.
Owner:INTELLECTUAL VENTURES I LLC
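
The local-first routing described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the command table, `handle_locally`, `respond`, and `fake_remote` are all hypothetical names, and the remote system is replaced by a stand-in function.

```python
from typing import Callable, Optional

# Hypothetical table of commands the local device can answer by itself.
LOCAL_COMMANDS = {
    "volume up": "volume raised",
    "volume down": "volume lowered",
}


def handle_locally(utterance: str) -> Optional[str]:
    """Preliminary processing: return a response if the device can answer itself."""
    return LOCAL_COMMANDS.get(utterance.strip().lower())


def respond(utterance: str, remote: Callable[[str], str]) -> str:
    """Try the local device first; otherwise hand the input to the remote system."""
    local = handle_locally(utterance)
    if local is not None:
        return local
    return remote(utterance)


def fake_remote(utterance: str) -> str:
    # Stand-in for a network call to the remote system (assumption).
    return f"remote handled: {utterance}"
```

Calling `respond("Volume Up", fake_remote)` is answered on the device, while an unknown request is passed through to `fake_remote`.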

Intelligent automated assistant for TV user interactions

Systems and processes are disclosed for controlling television user interactions using a virtual assistant. A virtual assistant can interact with a television set-top box to control content shown on a television. Speech input for the virtual assistant can be received from a device with a microphone. User intent can be determined from the speech input, and the virtual assistant can execute tasks according to the user's intent, including causing playback of media on the television. Virtual assistant interactions can be shown on the television in interfaces that expand or contract to occupy a minimal amount of space while conveying desired information. Multiple devices associated with multiple displays can be used to determine user intent from speech input as well as to convey information to users. In some examples, virtual assistant query suggestions can be provided to the user based on media content shown on a display.
Owner:APPLE INC

Speech interface system and method for control and interaction with applications on a computing system

A speech processing system which exploits statistical modeling and formal logic to receive and process speech input, which may represent data to be received, such as dictation, or commands to be processed by an operating system, application or process. A command dictionary and dynamic grammars are used in processing speech input to identify, disambiguate and extract commands. The logical processing scheme ensures that putative commands are complete and unambiguous before processing. Context sensitivity may be employed to differentiate data and commands. A multi faceted graphic user interface may be provided for interaction with a user to speech enable interaction with applications and processes that do not necessarily have native support for speech input.
Owner:SAMSUNG ELECTRONICS CO LTD

Automated database assistance using a telephone for a speech based or text based multimedia communication mode

An interface for remote human input for reading a database, the interface including an automatic voice question unit for eliciting speech input, a speech recognition unit for recognizing human speech input, and a data recognition unit for recognizing remote data input. The interface is associated with a database to search the database using the recognized input. A typical application is as an automated directory enquiry service.
Owner:AMAZON TECH INC

Multimodal natural language query system for processing and analyzing voice and proximity-based queries

The present invention provides a natural language query system and method for processing and analyzing multimodally-originated queries, including voice and proximity-based queries. The natural language query system includes a Web-enabled device including a speech input module for receiving a voice-based query in natural language form from a user and a location / proximity module for receiving location / proximity information from a location / proximity device. The query system also includes a speech conversion module for converting the voice-based query in natural language form to text in natural language form and a natural language processing module for converting the text in natural language form to text in searchable form. The query system further includes a semantic engine module for converting the text in searchable form to a formal database query and a database-look-up module for using the formal database query to obtain a result related to the voice-based query in natural language form from a database.
Owner:PORTAL COMM LLC

Method for processing the output of a speech recognizer

A system and method for processing speech input comprising a speech recognizer and a logical command processor which facilitates additional processing of speech input beyond the speech recognizer level. A speech recognizer receives input from a user, and when a command is identified in the speech input, if the command meets conditions that require additional processing, a representation of the speech input is stored for subsequent processing. A logical command processor performs additional processing of command input by analyzing the command and its elements, determining which elements are required for successfully processing the command and which elements are present and lacking. The user is prompted to supply missing information, and subsequent user input is added to the command structure until the command input is aborted or the command structure reaches sufficient completeness to enable execution of the command. Thereby, speech input of complex commands in natural language in a system running a plurality of applications and processes is made possible.
Owner:GREAT NORTHERN RES
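
The prompt-until-complete loop in this abstract resembles what is often called slot filling. The sketch below is one possible reading under stated assumptions: the `REQUIRED` schema, the `Command` class, the `complete` function, and the literal abort word `"cancel"` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional

# Hypothetical schema: command name -> elements required before execution.
REQUIRED = {"send_email": ["recipient", "subject"]}


@dataclass
class Command:
    name: str
    elements: Dict[str, str] = field(default_factory=dict)

    def missing(self) -> List[str]:
        """Required elements that have not been supplied yet."""
        return [e for e in REQUIRED[self.name] if e not in self.elements]


def complete(command: Command, answers: Iterable[str]) -> Optional[Command]:
    """Add successive user answers to the command until nothing is missing.

    Returns the completed command, or None if the user aborts ("cancel")
    or stops answering before the command is complete.
    """
    replies = iter(answers)
    while command.missing():
        slot = command.missing()[0]  # the element the user would be prompted for
        reply = next(replies, None)
        if reply is None or reply == "cancel":
            return None
        command.elements[slot] = reply
    return command
```

A command that already carries a recipient only needs one further answer for the subject; an empty command needs two.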

Interface with Gaze Detection and Voice Input

Methods, computer programs, and systems for interfacing a user with a computer program, utilizing gaze detection and voice recognition, are provided. One method includes an operation for determining if a gaze of a user is directed towards a target associated with the computer program. The computer program is set to operate in a first state when the gaze is determined to be on the target, and set to operate in a second state when the gaze is determined to be away from the target. When operating in the first state, the computer program processes voice commands from the user, and, when operating in the second state, the computer program omits processing of voice commands.
Owner:SONY COMPUTER ENTERTAINMENT INC
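
The two-state behaviour described above can be illustrated with a small state holder. This is only a sketch: `GazeGatedListener` and its method names are hypothetical, and gaze detection itself is assumed to happen elsewhere and arrive as a boolean.

```python
from typing import Callable


class GazeGatedListener:
    """Processes voice commands only while the user's gaze is on the target."""

    def __init__(self, execute: Callable[[str], None]) -> None:
        self._execute = execute
        self._on_target = False  # start in the second state (gaze away)

    def on_gaze(self, on_target: bool) -> None:
        """Switch state based on the latest gaze determination."""
        self._on_target = on_target

    def on_voice(self, command: str) -> bool:
        """Run the command in the first state; skip it in the second.

        Returns True if the command was processed.
        """
        if not self._on_target:
            return False
        self._execute(command)
        return True
```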

Process for automatic control of one or more devices by voice commands or by real-time voice dialog and apparatus for carrying out this process

A speech dialog system wherein a process for automatic control of devices by speech dialog is used applying methods of speech input, speech signal processing and speech recognition, syntactical-grammatical postediting as well as dialog, executive sequencing and interface control, and which is characterized in that syntax and command structures are set during real-time dialog operation; preprocessing, recognition and dialog control are designed for operation in a noise-encumbered environment; no user training is required for recognition of general commands; training of individual users is necessary for recognition of special commands; the input of commands is done in linked form, the number of words used to form a command for speech input being variable; a real-time processing and execution of the speech dialog is established; and the speech input and output is done in the hands-free mode.
Owner:NUANCE COMM INC

Interface for a Virtual Digital Assistant

The digital assistant displays a digital assistant object in an object region of a display screen. The digital assistant then obtains at least one information item based on a speech input from a user. Upon determining that the at least one information item can be displayed in its entirety in the display region of the display screen, the digital assistant displays the at least one information item in the display region, where the display region and the object region are not visually distinguishable from one another. Upon determining that the at least one information item cannot be displayed in its entirety in the display region of the video display screen, the digital assistant displays a portion of the at least one information item in the display region, where the display region and the object region are visually distinguishable from one another.
Owner:APPLE INC

System and method of a list commands utility for a speech recognition command system

In embodiments of the present invention, a system and computer-implemented method for enabling a user to interact with a mobile device using a voice command may include the steps of defining a structured grammar for generating a global voice command, defining a global voice command of the structured grammar, wherein the global voice command enables access to an object of the mobile device using a single command, and mapping at least one function of the object to the global voice command, wherein upon receiving voice input from the user of the mobile device, the object recognizes the global voice command and controls the function.
Owner:PATCH KIMBERLY C

Techniques for disambiguating speech input using multimodal interfaces

A technique is disclosed for disambiguating speech input for multimodal systems by using a combination of speech and visual I / O interfaces. When the user's speech input is not recognized with sufficiently high confidence, the user is presented with a set of possible matches using a visual display and / or speech output. The user then selects the intended input from the list of matches via one or more available input mechanisms (e.g., stylus, buttons, keyboard, mouse, or speech input). These techniques involve the combined use of speech and visual interfaces to correctly identify the user's speech input. The techniques disclosed herein may be utilized in computer devices such as PDAs, cellphones, desktop and laptop computers, tablet PCs, etc.
Owner:WALOOMBA TECH
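
The confidence check and fallback selection described in this abstract might look like the following sketch. The threshold of 0.8, the limit of three options, and the `choose` callback (standing in for the stylus, button, or speech selection) are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Hypothesis = Tuple[str, float]  # (recognized text, confidence in [0, 1])


def resolve(
    hypotheses: List[Hypothesis],
    choose: Callable[[List[str]], int],
    threshold: float = 0.8,  # assumed cut-off for "sufficiently high confidence"
    max_options: int = 3,    # assumed size of the list shown to the user
) -> str:
    """Accept the top hypothesis if confident enough; otherwise let the user pick.

    Expects at least one hypothesis.
    """
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    best_text, best_conf = ranked[0]
    if best_conf >= threshold:
        return best_text
    # Low confidence: present the closest matches and use the user's selection.
    options = [text for text, _ in ranked[:max_options]]
    return options[choose(options)]
```

The `choose` callback receives the displayed options and returns the index the user selected, which keeps the sketch independent of any particular input mechanism.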

Method for processing speech signal features for streaming transport

Speech signal information is formatted, processed and transported in accordance with a format adapted for TCP / IP protocols used on the Internet and other communications networks. NULL characters are used for indicating the end of a voice segment. The method is useful for distributed speech recognition systems such as a client-server system, typically implemented on an intranet or over the Internet based on user queries at his / her computer, a PDA, or a workstation using a speech input interface.
Owner:NUANCE COMM INC
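
As a rough illustration of NULL-terminated voice segments on a byte stream, the sketch below frames each segment with a trailing NULL byte and reassembles segments from arbitrarily split chunks. It assumes the payload itself never contains a NULL byte (for example, text-encoded features); `frame` and `SegmentReader` are hypothetical names, and the actual format in the patent may differ.

```python
from typing import List


def frame(segment: bytes) -> bytes:
    """Append a NULL byte to mark the end of a voice segment."""
    if b"\x00" in segment:
        raise ValueError("payload must not contain NULL bytes")
    return segment + b"\x00"


class SegmentReader:
    """Accumulates stream chunks and returns complete NULL-terminated segments."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add a chunk; return every segment completed so far."""
        self._buffer.extend(chunk)
        # Everything before the last NULL is complete; the tail is still partial.
        *complete, rest = bytes(self._buffer).split(b"\x00")
        self._buffer = bytearray(rest)
        return complete
```

Because TCP delivers a byte stream rather than messages, the reader keeps any partial segment in its buffer until the terminating NULL arrives.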

Digital assistant providing whispered speech

Systems and processes for detecting and / or providing a whispered speech response are provided. In one example process, speech is received from a user, and based on the speech input, it is determined that a whispered speech response is to be provided. Upon determining that a whispered speech response is to be provided, the whispered speech response is generated and provided to the user.
Owner:APPLE INC

Handwriting and voice input with automatic correction

A hybrid approach to improve handwriting recognition and voice recognition in data processing systems is disclosed. In one embodiment, a front end is used to recognize strokes, characters and / or phonemes. The front end returns candidates with relative or absolute probabilities of matching to the input. Based on linguistic characteristics of the language (e.g. alphabetical or ideographic language for the words being entered, frequency of words and phrases being used, likely part of speech of the word entered, the morphology of the language, or the context in which the word is entered), a back end combines the candidates determined by the front end from inputs for words to match with known words and the probabilities of the use of such words in the current context.
Owner:TEGIC COMM

Method and apparatus for searching for music based on speech recognition

Provided is a method and apparatus for searching music based on speech recognition. By calculating search scores with respect to a speech input using an acoustic model, calculating preferences in music using a user preference model, reflecting the preferences in the search scores, and extracting a music list according to the search scores in which the preferences are reflected, a personal expression of a search result using speech recognition can be achieved, and an error or imperfection of a speech recognition result can be compensated for.
Owner:SAMSUNG ELECTRONICS CO LTD
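
One simple way to reflect user preferences in acoustic search scores is a weighted combination, sketched below. The abstract does not say how the two scores are combined, so the linear interpolation, the `weight` parameter, and the function name `rerank` are assumptions.

```python
from typing import Dict, List


def rerank(
    acoustic: Dict[str, float],
    preference: Dict[str, float],
    weight: float = 0.3,  # assumed share given to the preference model
    top_n: int = 3,
) -> List[str]:
    """Blend acoustic scores with preference scores and return the top titles.

    Titles without a preference entry are treated as having preference 0.0.
    """
    combined = {
        title: (1.0 - weight) * score + weight * preference.get(title, 0.0)
        for title, score in acoustic.items()
    }
    return sorted(combined, key=combined.get, reverse=True)[:top_n]
```

With `weight=0` the ranking is purely acoustic; raising the weight lets a strongly preferred title overtake a slightly better acoustic match, which is how a recognition error can be compensated for.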

Method and apparatus for improving the transcription accuracy of speech recognition software

A virtual vocabulary database is provided for use with a particular user database as part of a speech recognition system. Vocabulary elements within the virtual database are imported from the user database and are tagged to include numerical data corresponding to the historical use of the vocabulary element within the user database. For each speech input, potential vocabulary element matches from the speech recognition system are provided to the virtual database software which creates virtual sub-vocabularies from the criteria according to predefined criteria templates. The software then applies vocabulary element weighting adjustments according to the virtual sub-vocabulary weightings and applies the adjustment to the default weighting provided by the speech recognition system. The modified weightings are returned with the associated vocabulary elements to the speech engine for selection of an appropriate match to the input speech.
Owner:COIFMAN ROBERT E

Method and apparatus for media rendering services using gesture and/or voice control

An approach for providing media rendering services using touch input and voice input. An apparatus invokes a media application and presents media content at the apparatus. The apparatus monitors for touch input and / or voice input to execute a function to apply to the media content. The apparatus receives user input as a sequence of user actions, wherein each of the user actions is provided via the touch input or the voice input. The touch input or the voice input is received without presentation of an input prompt that overlays or alters the media content.
Owner:VERIZON PATENT & LICENSING INC

Communication system with handset for distributed processing

A communication system comprising at least one mobile handheld telephone handset adapted to communicate via a wireless telephony medium with a telephone network handling system. The handset comprises input devices to receive input from a user and produce signals dependent thereupon, an onboard processor to adapt speech input to produce a voice transmission signal as part of a telephone conversation with a third party; and an antenna to transmit the voice transmission signal via the wireless telephony medium. The telephone network handling system comprises a receiver to receive the voice transmission signal, and means to forward the voice signal to a third party. The handset further comprises a first processor to carry out a first processing step on selected input signals and produce data dependent thereupon which preserves predetermined information necessary to carry out a remote second processing step, an onboard processor to adapt the data according to a conventional wireless telephony protocol to produce a transmission signal, and an antenna to transmit the transmission signal via the wireless telephony medium to the telephone network handling system. The system further comprises a remote processor adapted to receive and adapt the transmission signal from the telephone network handling system to regenerate the data, and to carry out a second processing step on the data and produce an output dependent thereupon.
Owner:CABLE & WIRELESS PLC

Multimodal natural language query system for processing and analyzing voice and proximity-based queries

Active | US20110093271A1 | Inaccurate and imprecise and unreliable and training | Data processing applications | Semantic analysis | Database query | Speech input
The present disclosure provides a natural language query system and method for processing and analyzing multimodally-originated queries, including voice and proximity-based queries. The natural language query system includes a Web-enabled device including a speech input module for receiving a voice-based query in natural language form from a user and a location / proximity module for receiving location / proximity information from a location / proximity device. The query system also includes a speech conversion module for converting the voice-based query in natural language form to text in natural language form and a natural language processing module for converting the text in natural language form to text in searchable form. The query system further includes a semantic engine module for converting the text in searchable form to a formal database query and a database-look-up module for using the formal database query to obtain a result related to the voice-based query in natural language form from a database.
Owner:PORTAL COMM LLC

Recognition architecture for generating Asian characters

Architecture for correcting incorrect recognition results in an Asian language speech recognition system. A spelling mode can be launched in response to receiving speech input, the spelling mode for correcting incorrect spelling of the recognition results or generating new words. Correction can be obtained using speech and / or manual selection and entry. The architecture facilitates correction in a single pass, rather than multiple times as in conventional systems. Words corrected using the spelling mode are corrected as a unit and treated as a word. The spelling mode applies to languages of at least the Asian continent, such as Simplified Chinese, Traditional Chinese, and / or other Asian languages such as Japanese.
Owner:MICROSOFT TECH LICENSING LLC

Method and system for providing alternatives for text derived from stochastic input sources

A computer-implemented method for providing a candidate list of alternatives for a text selection containing text from multiple input sources, each of which can be stochastic (such as a speech recognition unit, handwriting recognition unit, or input method editor) or non-stochastic (such as a keyboard and mouse). A text component of the text selection may be the result of data processed through a series of stochastic input sources, such as speech input that is converted to text by a speech recognition unit before being used as input into an input method editor. To determine alternatives for the text selection, a stochastic input combiner parses the text selection into text components from different input sources. For each stochastic text component, the combiner retrieves a stochastic model containing alternatives for the text component. If the stochastic text component is the result of a series of stochastic input sources, the combiner derives a stochastic model that accurately reflects the probabilities of the results of the entire series. The combiner creates a list of alternatives for the text selection by combining the stochastic models retrieved. The combiner may revise the list of alternatives by applying natural language principles to the text selection as a whole. The list of alternatives for the text selection is then presented to the user. If the user chooses one of the alternatives, then the word processor replaces the text selection with the chosen candidate.
Owner:MICROSOFT TECH LICENSING LLC

Voice Recognition Device and Method, and Program

A speech recognition system in which a user may correct a recognition error resulting from speech recognition more efficiently and easily. Speech recognition means compares a plurality of words inputted from speech input means with a plurality of words stored in dictionary means, respectively, and determines a most-competitive word candidate. Word correction means has a word correction function of correcting the words constituting a word sequence displayed on a screen. Competitive word display commanding means selects one or more competitive words having competitive probabilities close to the competitive probability of the most-competitive word candidate and displays the one or more competitive words adjacent to the most-competitive word candidate. Competitive word selection means selects an appropriate correction word from the one or more competitive words. Word replacement commanding means causes the most-competitive word candidate to be replaced with the correction word selected by the competitive word selection means.
Owner:NAT INST OF ADVANCED IND SCI & TECH

Method for correcting a speech response and natural language dialogue system

A natural language dialogue system and a method capable of correcting a speech response are provided. The method includes the following steps. A first speech input is received. At least one keyword included in the first speech input is parsed to obtain a candidate list having at least one report answer. One of the report answers is selected from the candidate list as a first report answer, and a first speech response is output according to the first report answer. A second speech input is received and parsed to determine whether the first report answer is correct. If the first report answer is incorrect, another report answer other than the first report answer is selected from the candidate list as a second report answer. According to the second report answer, a second speech response is output.
Owner:VIA TECH INC
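
The correction steps in this abstract can be sketched as a small dialogue object that moves to the next candidate when the follow-up input rejects the current answer. The rejection keywords, `is_rejection`, and the `Dialogue` class are hypothetical; a real system would parse the second speech input far more carefully.

```python
from typing import List

# Hypothetical keywords that mark the previous answer as incorrect.
REJECTION_WORDS = {"no", "wrong", "incorrect"}


def is_rejection(utterance: str) -> bool:
    """Very rough check of whether the second input rejects the first answer."""
    tokens = utterance.lower().replace(",", " ").split()
    return any(token in REJECTION_WORDS for token in tokens)


class Dialogue:
    """Holds the candidate list and the report answer currently in use."""

    def __init__(self, candidates: List[str]) -> None:
        self._candidates = list(candidates)
        self._index = 0

    def current(self) -> str:
        return self._candidates[self._index]

    def feedback(self, utterance: str) -> str:
        """Select another candidate if the user rejects the current one."""
        if is_rejection(utterance) and self._index + 1 < len(self._candidates):
            self._index += 1
        return self.current()
```

When the list is exhausted the sketch simply stays on the last candidate; how the patented system handles that case is not stated in the abstract.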