
44 results about "Speech Recognition Software" patented technology

Software capable of recognizing dictation and transcribing the spoken words into written text.

Method and apparatus for improving the transcription accuracy of speech recognition software

A virtual vocabulary database is provided for use with a particular user database as part of a speech recognition system. Vocabulary elements within the virtual database are imported from the user database and are tagged to include numerical data corresponding to the historical use of the vocabulary element within the user database. For each speech input, potential vocabulary element matches from the speech recognition system are provided to the virtual database software which creates virtual sub-vocabularies from the criteria according to predefined criteria templates. The software then applies vocabulary element weighting adjustments according to the virtual sub-vocabulary weightings and applies the adjustment to the default weighting provided by the speech recognition system. The modified weightings are returned with the associated vocabulary elements to the speech engine for selection of an appropriate match to the input speech.
Owner:COIFMAN ROBERT E
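
The weighting adjustment described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the `Candidate` type, the `rescore` function, the `boost` parameter, and the multiplicative formula are all assumptions chosen for illustration. It only shows the general idea of adjusting a recognizer's default score using historical usage counts.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    word: str
    engine_score: float  # default weighting from the recognizer (assumed scale 0..1)


def rescore(candidates, usage_counts, boost=0.5):
    """Adjust each default score by the word's share of historical use.

    `boost` and the multiplicative formula are illustrative assumptions.
    """
    total = sum(usage_counts.get(c.word, 0) for c in candidates)
    rescored = []
    for c in candidates:
        share = usage_counts.get(c.word, 0) / total if total else 0.0
        rescored.append((c.word, c.engine_score * (1.0 + boost * share)))
    # Best adjusted score first, ready to hand back to the speech engine.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    heard = [Candidate("ileum", 0.50), Candidate("ilium", 0.52)]
    history = {"ileum": 9, "ilium": 1}  # hypothetical per-user usage counts
    print(rescore(heard, history))
```

With no usage history the engine's own ranking is returned unchanged, so the adjustment only takes effect once a user database exists.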

Collection and use of side information in voice-mediated mobile search

Methods and systems for providing voice-mediated search capability to a mobile communications device involve receiving a signal from the mobile device that includes a representation of a spoken search request from a user of the mobile device, using speech recognition software to convert the search request into a text search request, extracting side information contained implicitly within the received signal, using the extracted side information to assign the user to a category, sending the text search request and the user category to content providers, receiving from the content providers content that is responsive to the text search request and the user category, and sending to the mobile device search results that are based on content from content providers. The methods and systems further involve sending searches and user categories to advertising providers, and sending advertisements returned by the advertising providers to the mobile device along with the search results.
Owner:CERENCE OPERATING CO

Capture and application of sender voice dynamics to enhance communication in a speech-to-text environment

A method for providing voice dynamics of human utterances converted to and represented by text within a data processing system. A plurality of predetermined parameters for recognition and representation of dynamics in human utterances are selected. An enhanced human speech recognition software program is created implementing the predetermined parameters on a data processing system. The enhanced software program includes an ability to monitor and record human voice dynamics and provide speech-to-text recognition. The dynamics in a human utterance are captured utilizing the enhanced human speech recognition software. The human utterance is converted into a textual representation utilizing the speech-to-text ability of the software. Finally, the dynamics are merged along with the textual representation of the human utterance to produce a marked-up text document on the data processing system.
Owner:NUANCE COMM INC

Apparatus and method for processing service interactions

An interactive voice and data response system that directs input to a voice, text, and web-capable software-based router, which is able to intelligently respond to the input by drawing on a combination of human agents, advanced speech recognition and expert systems, connected to the router via a TCP / IP network. The digitized input is broken down into components so that the customer interaction is managed as a series of small tasks rather than one ongoing conversation. The router manages the interactions and keeps pace with a real-time conversation. The system utilizes both speech recognition and human intelligence for purposes of interpreting customer utterance or customer text. The system may use more than one human agent, or both human agents and speech recognition software, to interpret simultaneously the same component for error-checking and interpretation accuracy.
Owner:ARES VENTURE FINANCE

Method for comparing a transcribed text file with a previously created file

A method of creating a final text from an audio file comprising (a) transcribing the audio file into a transcribed text file using speech recognition software; (b) loading a first window with the transcribed text file; (c) loading a second window with a previously created text file; (d) comparing the transcribed text file and the previously created file to find differences between the text in the transcribed text file and the text in the previously created text file; and (e) correcting the transcribed text file based upon the differences to create the final text. The method may also include searching for the previously created text file.
Owner:CUSTOM SPEECH USA
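
Step (d) above, finding the differences between the two texts, can be sketched with Python's standard `difflib` module. This is an illustrative assumption about how such a comparison might be done, not the method claimed in the patent; the function name `word_differences` and the word-level granularity are hypothetical choices.

```python
import difflib


def word_differences(transcribed, reference):
    """List word-level differences between a transcript and an earlier text.

    Each item is (operation, words_in_transcript, words_in_reference), where
    operation is 'replace', 'delete', or 'insert' as reported by difflib.
    """
    a, b = transcribed.split(), reference.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return diffs


if __name__ == "__main__":
    for diff in word_differences("the patient has a fever",
                                 "the patient had a fever"):
        print(diff)
```

A reviewer could then walk the returned list and accept or reject each difference to produce the final text.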

Spatially indexed grammar and methods of use

Improved systems and methods are described which simplify the individual's interaction with speech recognition software, expand the database of spoken point names that can be recognized, and increase the quality and therefore likelihood of success of speech recognition applications. The present systems and methods apply to various uses, such as providing driving directions, finding the nearest location based service, and finding the nearest “Where Am I?” type of location based services.
Owner:NEUSTAR INFORMATION SERVICES

Apparatus and method for processing service interactions

An interactive voice and data response system that directs input to a voice, text, and web-capable software-based router, which is able to intelligently respond to the input by drawing on a combination of human agents, advanced speech recognition and expert systems, connected to the router via a TCP / IP network. The digitized input is broken down into components so that the customer interaction is managed as a series of small tasks rather than one ongoing conversation. The router manages the interactions and keeps pace with a real-time conversation. The system utilizes both speech recognition and human intelligence for purposes of interpreting customer utterance or customer text. The system may use more than one human agent, or both human agents and speech recognition software, to interpret simultaneously the same component for error-checking and interpretation accuracy.
Owner:ARES VENTURE FINANCE

Spatial asset management system and method

A data collection and automatic database population system which combines global positioning system (GPS), speech recognition software, radio frequency (RF) communications, and geographic information system (GIS) to allow rapid capture of field data, asset tracking, and automatic transfer of the data to a GIS database. A pre-defined grammar allows observations to be continuously captured along with GPS location and time, and stored on the field mobile unit. A mobile unit's location is tracked in real time or post processed through wireless RF transmission of location information between the mobile unit and a central processing station. The captured data is electronically transferred to a central processing station for quality assurance and automatic population of the GIS database. The system provides for automatic correlation of field data with other GIS database layers. Tools to generate predefined or user defined reports, work orders, and general data queries allow exploitation of the GIS database.
Owner:LOCKHEED MARTIN CORP

Electronic health record system and method for patient encounter transcription and documentation

A patient encounter documentation and analytics system includes a mobile computing platform and a server-based host platform. A mobile application in tandem with a wireless microphone collects voice signals during a patient-caregiver encounter, transforms the voice signals into audio data files, and uploads the audio data files to the server. A speech recognition software module digitally transcribes the audio data file into text. A text processing module extracts and organizes relevant clinical data based on keyword, key phrase and question / answer analysis. Relevance of words and phrases may be determined in view of, e.g., their presence, frequency and context. A diagnostic decision support module enables the healthcare provider to review the determined clinical information and provide a diagnosis associated with the encounter. A documentation skeleton module extracts diagnosis-specific text components from the transcribed text file and assembles an electronic medical document based on the diagnosis and the diagnosis-specific text components.
Owner:VOICEHIT

Method and apparatus for improving the transcription accuracy of speech recognition software

A virtual vocabulary database is provided for use with a particular user database as part of a speech recognition system. Vocabulary elements within the virtual database are imported from the user database and are tagged to include numerical data corresponding to the historical use of the vocabulary element within the user database. For each speech input, potential vocabulary element matches from the speech recognition system are provided to the virtual database software which creates virtual sub-vocabularies from the criteria according to predefined criteria templates. The software then applies vocabulary element weighting adjustments according to the virtual sub-vocabulary weightings and applies the adjustment to the default weighting provided by the speech recognition system. The modified weightings are returned with the associated vocabulary elements to the speech engine for selection of an appropriate match to the input speech.
Owner:COIFMAN ROBERT E

Spatially indexed grammar and methods of use

Improved systems and methods are described which simplify the individual's interaction with speech recognition software, expand the database of spoken point names that can be recognized, and increase the quality and therefore likelihood of success of speech recognition applications. The present systems and methods apply to various uses, such as providing driving directions, finding the nearest location based service, and finding the nearest “Where Am I?” type of location based services.
Owner:NEUSTAR INFORMATION SERVICES

Method and apparatus for improving the transcription accuracy of speech recognition software

The present invention involves an enhanced method for operating a speech recognition system in which sequential vocabularies are loaded for comparison to the input speech. From each sequential vocabulary, a subset of candidate vocabulary elements is selected for matching, and the probability match score of each selected candidate vocabulary element is weighted by a weighting factor that depends on the particular vocabulary from which the element was selected. As each set of candidate vocabulary elements is selected from the next sequential vocabulary, the selected set is combined with the set of previously selected and weighted vocabulary elements, which is then further weighted according to the weighting criteria of the then active vocabulary. At the end of the sequential vocabulary selection grouping process, the final, sequentially weighted match scores for the candidate vocabulary elements are arranged and an appropriate match to the input speech is presented.
Owner:COIFMAN ROBERT E
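
One possible reading of the sequential weighting described in this abstract is sketched below in Python. The abstract does not specify the selection size, the weighting formula, or how repeated candidates are handled, so `top_k`, the multiplicative weights, and the rule that a word keeps its earlier score are all assumptions, and `sequential_match` is a hypothetical name.

```python
def sequential_match(vocabularies, score_fn, top_k=3):
    """Rank candidates across vocabularies that are loaded one after another.

    vocabularies: list of (words, weight) pairs in load order.
    score_fn:     maps a word to its raw match score against the input speech.
    """
    pool = {}
    for words, weight in vocabularies:
        # Select the best-matching candidates from the active vocabulary.
        picked = sorted(words, key=score_fn, reverse=True)[:top_k]
        for word in picked:
            # Assumption: a word selected earlier keeps its accumulated score.
            pool.setdefault(word, score_fn(word))
        # Weight the whole combined set by the active vocabulary's factor.
        pool = {word: score * weight for word, score in pool.items()}
    return sorted(pool.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    raw = {"cat": 0.8, "cap": 0.6, "cab": 0.7}  # hypothetical match scores
    vocabularies = [(["cat", "cap"], 1.2), (["cab", "cat"], 0.9)]
    print(sequential_match(vocabularies, raw.get, top_k=2))
```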

Real-time voice recognition on a handheld device

A method and apparatus for implementation of real-time speech recognition using a handheld computing apparatus are provided. The handheld computing apparatus receives an audio signal, such as a user's voice. The handheld computing apparatus ultimately transmits the voice data to a remote or distal computing device with greater processing power and operating a speech recognition software application. The speech recognition software application processes the signal and outputs a set of instructions for implementation either by the computing device or the handheld apparatus. The instructions can include a variety of items including instructing the presentation of a textual representation of dictation, or a function or command to be executed by the handheld device (such as linking to a website, opening a file, cutting, pasting, saving, or other file menu type functionalities), or by the computing device itself.
Owner:NUANCE COMM INC

Media device with speech recognition and method for using same

A media player utilizing speech recognition software to perform functions of the media player or make file selections that may be played by the media player. The media player may include one or more microphones to receive a voice command from the user. The one or more microphones may be actuated into a state for receiving a voice command and providing the voice command to one or more microprocessors which perform a function based on the voice command.
Owner:MAU II FREDERICK W

Generic spelling mnemonics

A system and method for creating a mnemonics Language Model for use with a speech recognition software application, wherein the method includes generating an n-gram Language Model containing a predefined large body of characters, wherein the n-gram Language Model includes at least one character from the predefined large body of characters, constructing a new language Model (LM) token for each of the at least one character, extracting pronunciations for each of the at least one character responsive to a predefined pronunciation dictionary to obtain a character pronunciation representation, creating at least one alternative pronunciation for each of the at least one character responsive to the character pronunciation representation to create an alternative pronunciation dictionary and compiling the n-gram Language Model for use with the speech recognition software application, wherein compiling the Language Model is responsive to the new Language Model token and the alternative pronunciation dictionary.
Owner:MICROSOFT TECH LICENSING LLC

Microphone Circuit

Microphones are used in acoustically insulated masks, headsets, phones and personal digital assistants. Frequently, the microphone provides an input to speech recognition software. The working environment is often humid and the speaker's mouth is in close proximity to the microphone. Frequently the signal suffers from clipping and distortion caused by the large signals and nonlinear response of the microphone circuitry. The claimed invention uses a resistor connected in parallel with the signal source to reduce its sensitivity and to produce a signal suitable for use with speech recognition software. The resistor can be varied for different speakers.
Owner:VAN KATZ ARTHUR WILLIAM +1

Real-time voice recognition on a handheld device

A method and apparatus for implementation of real-time speech recognition using a handheld computing apparatus are provided. The handheld computing apparatus receives an audio signal, such as a user's voice. The handheld computing apparatus ultimately transmits the voice data to a remote or distal computing device with greater processing power and operating a speech recognition software application. The speech recognition software application processes the signal and outputs a set of instructions for implementation either by the computing device or the handheld apparatus. The instructions can include a variety of items including instructing the presentation of a textual representation of dictation, or a function or command to be executed by the handheld device (such as linking to a website, opening a file, cutting, pasting, saving, or other file menu type functionalities), or by the computing device itself.
Owner:NUANCE COMM INC

Method for an Automated Distress Alert System with Speech Recognition

A software application for an automated alert system that utilizes speech recognition software to determine the emergency status of a person and respond accordingly. The software application includes monitoring ambient noise through a microphone for an utterance. Once an utterance is identified, it is recorded into an audio signal and fed into speech-to-text software. The speech-to-text software converts the audio signal into a corresponding text. The corresponding text is then compared against a preconfigured plurality of distress passphrases in order to identify a positive match. Each of the plurality of distress passphrases is associated with at least one type of alarm. If a positive match is identified for a specific distress passphrase, the software application triggers the type of alarm associated with the specific passphrase. If no match is found, the system continues to monitor the ambient noise and repeats the process for each utterance that is spoken.
Owner:ZHONG VICTORIA
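
The passphrase comparison step described in this abstract can be sketched in a few lines of Python. The passphrases, alarm names, and the `match_alarm` function are hypothetical, and plain substring matching on normalized text is an assumption; the abstract does not say how the comparison is performed.

```python
def match_alarm(text, passphrases):
    """Return the alarm type for the first passphrase found in text, else None.

    Matching is a case-insensitive substring test on whitespace-normalized
    text, so it does not respect word boundaries (an illustrative shortcut).
    """
    normalized = " ".join(text.lower().split())
    for phrase, alarm in passphrases.items():
        if phrase in normalized:
            return alarm
    return None


if __name__ == "__main__":
    # Hypothetical configuration: passphrase -> alarm type.
    passphrases = {
        "help me": "notify_emergency_contact",
        "call an ambulance": "medical_alarm",
    }
    print(match_alarm("Please HELP  me now", passphrases))
    print(match_alarm("nice weather today", passphrases))
```

When `None` is returned, the caller would simply go back to monitoring for the next utterance.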

System and method for generating and sending a simplified message using speech recognition

An embodiment provides a system and method for generating and sending a simplified message using speech recognition. The system provides speech recognition software that may be utilized for receiving audio, converting audio to text derived from audio, comparing text derived from audio to match fields to find matches, replacing matched text with contents of replacement fields associated with the match fields, generating an output message incorporating the replacement text into the text derived from audio, transmitting the output message to a messaging system and redistributing the output message to recipients.
Owner:GOOGLE LLC
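
The match-and-replace step described in this abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patented system: the `simplify_message` function and the sample match and replacement fields are hypothetical, and case-insensitive whole-word matching is an assumed policy.

```python
import re


def simplify_message(text, replacements):
    """Replace every matched field in text with its replacement field.

    replacements maps match fields to replacement fields. Longer match
    fields are tried first so that multi-word phrases take precedence.
    """
    if not replacements:
        return text
    fields = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(f) for f in fields) + r")\b",
        re.IGNORECASE,
    )
    lookup = {field.lower(): value for field, value in replacements.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


if __name__ == "__main__":
    table = {"running late": "delayed", "ten minutes": "10 min"}
    print(simplify_message("Running late be there in ten minutes", table))
```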

System and method for transcribing audio files of various languages

System, method and program product for transcribing an audio file included in or referenced by a web page. A language of text in the web page is determined. Then, voice recognition software of the language of text is selected and used to transcribe the audio file. If the language of the text is not the language of the audio file, then a related language is determined. Then, voice recognition software of the related language is selected and used to transcribe the audio file. The related language can be related geographically, by common root, as another dialect of the same language, or as another language commonly spoken in the same country as the language of the text. Another system, method and program product is disclosed for transcribing an audio file included in or referenced by a web page. A domain extension or full domain of the web page and an official language of the domain extension or full domain are determined. Then, voice recognition software of the official language is used to attempt to transcribe the audio file. If the official language is not a language of the audio file, then a language related to the official language is determined. Then, voice recognition software of the related language is selected and used to transcribe said audio file. The related language can be related geographically, by common root, as another dialect of the same language, or as another language commonly spoken in the same country as the official language.
Owner:MICROSOFT TECH LICENSING LLC

Free text voice training

A system and method provide acoustic training of a voice or speech recognition engine and / or voice or speech recognition software application. Instead of requiring a user to read from a prepared or predetermined script, the system and method described herein enable acoustic training using any free text spoken phrases provided by the user directly, or by a previously recorded speech, presentation, or the like, performed by the user.
Owner:NUANCE COMM INC

Speech recognition method and system capable of filtering loudspeaker noises

The invention provides a speech recognition method and system capable of filtering loudspeaker noises. The method comprises: when detecting that user voice is recorded through a microphone, and a loudspeaker is playing a voice file stored in an intelligent terminal, obtaining the composite tone of the user voice and loudspeaker sound; according to the first frequency and first amplitude of the loudspeaker sound sampled by the intelligent terminal, and the composite tone frequency and composite tone amplitude of the composite tone, obtaining the second frequency and second amplitude of the user voice; filtering the timbre of the loudspeaker sound in the composite tone, and recovering with the second frequency and second amplitude of the user voice to obtain the user voice; and according to a voice database, converting the user voice into a text. According to the invention, when a user employs speech recognition software and a loudspeaker plays external sound, the processor in a terminal performs analysis according to sound composition, filters loudspeaker sound, reduces environmental noise in the user voice received by the background, and achieves efficient speech recognition.
Owner:HUIZHOU TCL MOBILE COMM CO LTD

Electret Microphone Circuit

Microphones are used in acoustically insulated masks to prevent the speaker's voice from being overheard by others. Frequently, the microphone provides an input to speech recognition software. The environment inside the mask is often humid and the speaker's mouth is in close proximity to the microphone. The shape of the mask's shell and the restricted volume within the shell introduce distortion and the signal suffers further from clipping and distortion caused by the large signals and nonlinear response of the microphone circuitry. The use of an electret microphone is particularly troublesome due to its high sensitivity. This invention uses a resistor connected in parallel with the microphone to reduce the sensitivity of an electret microphone used in these conditions and produces a signal suitable for use with speech recognition software. The resistor can be varied for different speakers.
Owner:VAN KATS ARTHUR WILLIAM +1

Multi-lingual transcription system

A multi-lingual transcription system for processing a synchronized audio / video signal containing an auxiliary information component from an original language to a target language is provided. The system filters text data from the auxiliary information component, translates the text data into the target language and displays the translated text data while simultaneously playing an audio and video component of the synchronized signal. The system additionally provides a memory for storing a plurality of language databases which include a metaphor interpreter and thesaurus and may optionally include a parser for identifying parts of speech of the translated text. The auxiliary information component can be any language text associated with an audio / video signal, i.e., video text, text generated by speech recognition software, program transcripts, electronic program guide information, closed caption text, etc.
Owner:KONINKLIJKE PHILIPS ELECTRONICS NV

System for classifying lists of telephone numbers

A system and method for automatically classifying lists of telephone numbers into one or more predetermined categories is provided. The system initiates calls to each telephone number. A specific request is made of the callee, which confirms the number as being “live-answered.” The system uses speech recognition software to compare audible sounds received on the other end of the line to a plurality of spoken messages to classify each telephone number that has been classified as “not live-answered” for future use or exclusion. One embodiment of the system records each telephone call, enabling the process to be performed using a single call to each telephone number.
Owner:PRAIRIE SYST

Systems and methods for NACHA compliant ACH transfers using an automated voice response system

Embodiments of the invention describe a method for processing an ACH transfer in compliance with NACHA regulations. The method comprises receiving a request, through a phone call from a customer, to initiate an ACH transfer, the request including at least an account identifier and a payment amount. The method confirms the ACH transfer using speech recognition software and an interactive voice response unit, wherein the customer's identity, the date of transfer, the account identifier, the payment amount, a contact phone number, and the date of the confirmation are confirmed by the customer using a verbal response recognized by the speech recognition software. The method also includes recording, using a recording server, the verbal response of the customer in a sound file, tagging the sound file with at least the field of an account identifier, and storing the sound file for at least two years in a data repository.
Owner:HSBC TECH & SERVICES (USA) INC

Generic spelling mnemonics

A system and method for creating a mnemonics Language Model for use with a speech recognition software application, wherein the method includes generating an n-gram Language Model containing a predefined large body of characters, wherein the n-gram Language Model includes at least one character from the predefined large body of characters, constructing a new language Model (LM) token for each of the at least one character, extracting pronunciations for each of the at least one character responsive to a predefined pronunciation dictionary to obtain a character pronunciation representation, creating at least one alternative pronunciation for each of the at least one character responsive to the character pronunciation representation to create an alternative pronunciation dictionary and compiling the n-gram Language Model for use with the speech recognition software application, wherein compiling the Language Model is responsive to the new Language Model token and the alternative pronunciation dictionary.
Owner:MICROSOFT TECH LICENSING LLC

Use of multiple speech recognition software instances

A wireless communication device is disclosed that accepts recorded audio data from an end-user. The audio data can be in the form of a command requesting user action. Likewise, the audio data can be converted into a text file. The audio data is reduced to a digital file in a format that is supported by the device hardware, such as a .wav, .mp3, .vnf file, or the like. The digital file is sent via secured or unsecured wireless communication to one or more server computers for further processing. In accordance with an important aspect of the invention, the system evaluates the confidence level of the speech recognition process. If the confidence level is high, the system automatically builds the application command or creates the text file for transmission to the communication device. Alternatively, if the confidence of the speech recognition is lower, the recorded audio data file is routed to a human transcriber employed by the telecommunications service, who manually reviews the digital voice file and builds the application command or text file. Once the application command is created, it is transmitted to the communication device. As a result of the present invention, speech recognition in the context of a communications device has been shown to be accurate over 90% of the time.
Owner:NUANCE COMM INC
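
The confidence-based routing described in this abstract can be sketched in Python as below. The abstract gives no numeric cutoff, so the `threshold` value, the `Recognition` type, and the `route` function are illustrative assumptions rather than details of the patented system.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    text: str
    confidence: float  # recognizer's confidence, assumed to lie in 0..1


def route(result, threshold=0.85):
    """Decide whether a recognition result is used directly or reviewed.

    Returns ('automatic', text) when confidence meets the threshold,
    otherwise ('human_review', None) so the audio goes to a transcriber.
    """
    if result.confidence >= threshold:
        return ("automatic", result.text)
    return ("human_review", None)


if __name__ == "__main__":
    print(route(Recognition("call home", 0.93)))
    print(route(Recognition("coal hum", 0.41)))
```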