42 results about "Phonetic representation" patented technology

Phonetic representation, more commonly called phonetic transcription, is the representation of speech sounds using the symbols of a phonetic alphabet such as IPA, X-SAMPA, or Kirshenbaum, for linguistic study and for learning the pronunciation of languages. Among these systems, the International Phonetic Alphabet (IPA) is the most widely used; its symbols are printed in most dictionaries and books on linguistics.

Generating large units of graphonemes with mutual information criterion for letter to sound conversion

Inactive | US7693715B2 | Speech recognition | Speech synthesis | Syllable | Letter to sound

A method and apparatus are provided for segmenting words into component parts. Under the invention, mutual information scores for pairs of graphoneme units found in a set of words are determined. Each graphoneme unit includes at least one letter. The graphoneme units of one pair of graphoneme units are combined based on the mutual information score. This forms a new graphoneme unit. Under one aspect of the invention, a syllable n-gram model is trained based on words that have been segmented into syllables using mutual information. The syllable n-gram model is used to segment a phonetic representation of a new word into syllables. Similarly, an inventory of morphemes is formed using mutual information and a morpheme n-gram is trained that can be used to segment a new word into a sequence of morphemes.

Owner:MICROSOFT TECH LICENSING LLC
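
The merging step described in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the scoring formula, the toy word list, and the names `best_pair`, `merge_units`, and `apply_merge` are all assumptions made for illustration. Each graphoneme unit is modelled as a (letters, phonemes) tuple.

```python
import math
from collections import Counter

def best_pair(words):
    """Return the adjacent pair of units with the highest mutual information score."""
    unit_counts = Counter(unit for word in words for unit in word)
    pair_counts = Counter(pair for word in words for pair in zip(word, word[1:]))
    total_units = sum(unit_counts.values())
    total_pairs = sum(pair_counts.values())

    def score(pair):
        # Assumed scoring: p(x, y) * log(p(x, y) / (p(x) * p(y))).
        p_xy = pair_counts[pair] / total_pairs
        p_x = unit_counts[pair[0]] / total_units
        p_y = unit_counts[pair[1]] / total_units
        return p_xy * math.log(p_xy / (p_x * p_y))

    return max(pair_counts, key=score)

def merge_units(a, b):
    """Combine two (letters, phonemes) units into one larger unit."""
    letters = a[0] + b[0]
    phonemes = " ".join(p for p in (a[1], b[1]) if p)
    return (letters, phonemes)

def apply_merge(word, pair):
    """Replace every adjacent occurrence of `pair` in `word` with the merged unit."""
    merged, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
            merged.append(merge_units(*pair))
            i += 2
        else:
            merged.append(word[i])
            i += 1
    return merged

# Hypothetical training words, pre-split into single-letter units.
WORDS = [
    [("s", "SH"), ("h", ""), ("i", "IH"), ("p", "P")],
    [("s", "SH"), ("h", ""), ("o", "AA"), ("p", "P")],
    [("s", "SH"), ("h", ""), ("u", "AH"), ("t", "T")],
]

pair = best_pair(WORDS)
WORDS_AFTER_MERGE = [apply_merge(word, pair) for word in WORDS]
```

Repeating the score-and-merge step would grow progressively larger units; when to stop is a design choice that this sketch does not address.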

Synthesis by Generation and Concatenation of Multi-Form Segments

Active | US20090048841A1 | Character and pattern recognition | Speech recognition | Speech synthesis | Concatenation

A speech synthesis system and method is described. A speech segment database references speech segments having various different speech representational structures. A speech segment selector selects from the speech segment database a sequence of speech segment candidates corresponding to a target text. A speech segment sequencer generates from the speech segment candidates sequenced speech segments corresponding to the target text. A speech segment synthesizer combines the selected sequenced speech segments to produce a synthesized speech signal output corresponding to the target text.

Owner:CERENCE OPERATING CO

Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition

Inactive | US20050273337A1 | Speech recognition | Speech synthesis | Spoken language | Speech sound

When a speaker-independent voice-recognition (SIVR) system recognizes a spoken utterance that matches a phonetic representation of a speech element belonging to a predefined vocabulary, it may play a synthesized speech fragment as a means for the user to verify that the utterance was correctly recognized. When a speech element in the vocabulary has more than one possible pronunciation, the system may select the one most closely matching the user's utterance, and play a synthesized speech fragment corresponding to that particular representation.

Owner:MARVELL WORLD TRADE LTD

Method and apparatus for recognizing a speaker in lawful interception systems

Active | US20090043573A1 | Reduce in quantity | Speech recognition | Transmission | Lawful interception | Loudspeaker

A method and apparatus for identifying a speaker within a captured audio signal from a collection of known speakers. The method and apparatus receive or generate voice representations for each known speaker and tag the representations according to metadata related to the known speaker or to the voice. The representations are grouped into one or more groups according to the indices. When a voice to be recognized is introduced, characteristics are determined according to which the groups are prioritized, so that only the representations belonging to some of the groups are matched against the voice to be identified, thus reducing identification time and improving the statistical significance.

Owner:CYBERBIT

Entering text into an electronic communications device

Inactive | US7385531B2 | Easy to get | Easy to browse | Input/output for user-computer interaction | Electronic switching | Graphics | Syllable

In a method of entering text into an electronic communications device by means of a keypad having a number of keys, each key representing a plurality of letters and/or phonetic symbols, entered text is displayed on a display on the device. Possible phonetic syllables corresponding to an activated key sequence are generated. These are compared with a stored vocabulary comprising syllables and corresponding characters occurring in a given language. Those stored syllables and corresponding characters that match the possible syllables are pre-selected, and a number of these are presented in a separate first graphical object arranged predominantly on the display. Characters corresponding to one of the syllables in the first object are simultaneously presented in a second graphical object. This provides a way of entering text with characters having a phonetic representation by means of keys representing a plurality of letters or phonetic symbols, which may be easier to use even in the case where a phonetic syllable corresponds to several characters.

Owner:SONY ERICSSON MOBILE COMM AB
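
The pre-selection step in the abstract above can be sketched as a lookup: expand the key sequence into every possible letter string, then keep only those found in the stored vocabulary. The keypad layout is the common phone layout; the tiny vocabulary and the function names are hypothetical and only illustrate the idea.

```python
from itertools import product

# Common phone keypad layout: each key represents several letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Hypothetical stored vocabulary: phonetic syllable -> characters it can stand for.
VOCABULARY = {
    "ma": ["妈", "马", "吗"],
    "na": ["那", "拿"],
    "mi": ["米"],
    "ni": ["你"],
}

def candidate_syllables(key_sequence):
    """Generate every letter string that the pressed keys could represent."""
    letter_groups = [KEYPAD[key] for key in key_sequence]
    return ["".join(letters) for letters in product(*letter_groups)]

def preselect(key_sequence):
    """Keep only candidates that exist in the vocabulary, with their characters."""
    return {
        syllable: VOCABULARY[syllable]
        for syllable in candidate_syllables(key_sequence)
        if syllable in VOCABULARY
    }

matches = preselect("62")  # keys "mno" then "abc"
```

In the terms of the abstract, the keys of `matches` would populate the first graphical object, and the character list of the highlighted syllable would populate the second.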

Arrangement for Creating and Using a Phonetic-Alphabet Representation of a Name of a Party to a Call

Active | US20090248421A1 | Substation equipment | Automatic exchanges | Natural language processing | Lettering

A first party creates and edits a phonetic-alphabet representation of its name. The phonetic representation is conveyed to a second party as "caller-identification" information by the messages that set up a call between the parties. The phonetic representation of the name is displayed to the second party, converted to speech, and/or converted to an alphabet of a language of the second party and then displayed to the second party.

Owner:AVAYA INC

Method and apparatus for recognizing a speaker in lawful interception systems

Active | US8219404B2 | Reduce in quantity | Speech recognition | Transmission | Lawful interception | Loudspeaker

A method and apparatus for identifying a speaker within a captured audio signal from a collection of known speakers. The method and apparatus receive or generate voice representations for each known speaker and tag the representations according to metadata related to the known speaker or to the voice. The representations are grouped into one or more groups according to the indices. When a voice to be recognized is introduced, characteristics are determined according to which the groups are prioritized, so that only the representations belonging to some of the groups are matched against the voice to be identified, thus reducing identification time and improving the statistical significance.

Owner:CYBERBIT

Entering text into an electronic communications device

Inactive | US20050144566A1 | Amount | Considerable amount | Input/output for user-computer interaction | Electronic switching | Graphics | Syllable

In a method of entering text into an electronic communications device by means of a keypad having a number of keys, each key representing a plurality of letters and/or phonetic symbols, entered text is displayed on a display on the device. Possible phonetic syllables corresponding to an activated key sequence are generated. These are compared with a stored vocabulary comprising syllables and corresponding characters occurring in a given language. Those stored syllables and corresponding characters that match the possible syllables are pre-selected, and a number of these are presented in a separate first graphical object arranged predominantly on the display. Characters corresponding to one of the syllables in the first object are simultaneously presented in a second graphical object. This provides a way of entering text with characters having a phonetic representation by means of keys representing a plurality of letters or phonetic symbols, which may be easier to use even in the case where a phonetic syllable corresponds to several characters.

Owner:SONY ERICSSON MOBILE COMM AB

Reducing a size of a compiled speech recognition grammar

Inactive | US20090171663A1 | Speech recognition | Hard disc drive | Reduced size

The present invention discloses creating and using speech recognition grammars of reduced size. The reduced speech recognition grammars can include a set of entries, each entry having a unique identifier and a phonetic representation that is used when matching speech input against the entries. Each entry can lack a textual spelling corresponding to the phonetic representation. The reduced speech recognition grammar can be digitally encoded and stored in a computer-readable medium, such as a hard drive or flash memory of a portable speech-enabled device.

Owner:NUANCE COMM INC

Context sensitive multi-stage speech recognition

Inactive | US20090182559A1 | Speech recognition | Speech input | Context sensitivity

A system enables devices to recognize and process speech. The system includes a database that retains one or more lexical lists. A speech input detects a verbal utterance and generates a speech signal corresponding to the detected verbal utterance. A processor generates a phonetic representation of the speech signal that is designated a first recognition result. The processor generates variants of the phonetic representation based on context information provided by the phonetic representation. One or more of the variants of the phonetic representation selected by the processor are designated as a second recognition result. The processor matches the second recognition result with stored phonetic representations of one or more of the stored lexical lists.

Owner:NUANCE COMM INC
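
The two recognition stages in the abstract above can be sketched as follows, under loud assumptions: the substitution rules, the phoneme labels, the lexicon, and the function names are invented for illustration, and a real recognizer would select among variants rather than keep all of them.

```python
from itertools import product

# Hypothetical context rules: (phoneme, next phoneme or None at word end)
# maps to the phonemes it is plausibly confused with in that context.
CONTEXT_RULES = {
    ("d", None): ["t"],   # word-final "d" may have been a "t"
    ("n", "b"): ["m"],    # "n" before "b" may have been an "m"
}

# Hypothetical lexical list with stored phonetic representations.
LEXICON = {
    "bat": ("b", "ae", "t"),
    "bad": ("b", "ae", "d"),
    "ban": ("b", "ae", "n"),
}

def variants(phonemes):
    """Expand a first recognition result into context-dependent variants."""
    options = []
    for i, phoneme in enumerate(phonemes):
        following = phonemes[i + 1] if i + 1 < len(phonemes) else None
        options.append([phoneme] + CONTEXT_RULES.get((phoneme, following), []))
    return [tuple(choice) for choice in product(*options)]

def match_lexicon(phonemes, lexicon):
    """Match every variant against the stored phonetic representations."""
    candidates = set(variants(phonemes))
    return sorted(word for word, stored in lexicon.items() if stored in candidates)

first_result = ("b", "ae", "d")
second_result = variants(first_result)
matched_words = match_lexicon(first_result, LEXICON)
```

The original phoneme sequence is always the first variant, so a correct first-stage result still matches its own lexicon entry.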

System and a Method For Representing Unrecognized Words in Speech to Text Conversions as Syllables

Inactive | US20080140398A1 | Speech recognition | Syllable | Natural language processing

The present invention is a novel system and method for overcoming the shortcomings of existing speech-to-text systems that relate to the processing of unrecognized words. On encountering words that it cannot decipher, the preferred embodiment of the present invention analyzes the syllables that make up these words and translates them into the appropriate phonetic representations. The method described by the present invention ensures that words that were not uttered clearly are not lost or distorted in the process of transcribing the text. Additionally, it allows the use of smaller and simpler speech-to-text applications, which are suitable for mobile devices with limited storage and processing resources, since these applications may use smaller dictionaries and may be designed to identify only commonly used words. Also disclosed are several examples of possible implementations of the described system and method.

Owner:SHPIGEL AVRAHAM

Method and system for obtaining personal aliases through voice recognition

Inactive | US7428491B2 | Speech recognition | Special data processing applications | Spoken language | Speech identification

Methods and systems for recognizing a spoken alias are disclosed. The present invention includes generating a plurality of alias variations based on a discoverable name and creating a phonetic representation for each of the alias variations. The present invention also includes capturing a phonetic pronunciation of the spoken alias. At least one of the created alias variations that has a phonetic representation that corresponds to the captured phonetic pronunciation is selected.

Owner:MICROSOFT TECH LICENSING LLC

Midi-compatible hearing device and reproduction of speech sound in a hearing device

Inactive | US20100260363A1 | Improve compatibility | Good and realistic audio reproduction | Electrophonic musical instruments | Hearing aids signal processing | Digital interface | Hearing apparatus

The method for providing a user of a hearing device with speech sound comprises the step of
a) providing in the hearing device speech-representing data representative of speech-bound contents.
The speech-bound contents is encoded in said speech-representing data in a compressed way by means of a set of encoded-speech-segment data, wherein each of the encoded-speech-segment data of the set is indicative of one speech segment, and wherein the speech-representing data comprise a multitude of the encoded-speech-segment data.
And it also comprises the steps of
b) deriving from the multitude of the encoded-speech-segment data audio signals representative of the speech-bound contents by composing audio signal segments derived by decoding the multitude of encoded-speech-segment data; and
c) converting the so-derived audio signals into speech sound by means of an output converter of the hearing device.
Preferably, the encoded-speech-segment data are MIDI data, wherein MIDI stands for Musical Instrument Digital Interface. For example, the speech-bound contents is the contents of an audio book or news to which the user wants to listen.

Owner:PHONAK

Method and system for autocompletion for languages having ideographs and phonetic characters

Active | CN101194256A | Natural language translation | Web data indexing | Natural language processing | Query string

When a user enters text in a text input box (e.g., in a browser or a toolbar), an ordered set of predicted completion strings that includes ideographic strings is presented to the user. The user-entered text may include zero or more ideographs followed by one or more phonetic characters, or the entered text may be one or more. The predicted completion strings can be URLs or query strings. Ordering can be based on any number of factors (e.g., the frequency of queries submitted by user groups). URLs may be ranked based on the URL's importance value. The ordered set of predicted completion strings may be obtained by matching a fingerprint of the user input string against a fingerprint-to-table mapping that contains the ordered set of predicted completion strings. Generation of the ordered predicted strings takes into account multiple phonetic representations of an ideographic string.

Owner:GOOGLE LLC

Method for Speech Recognition on All Languages and for Inputing words using Speech Recognition

Inactive | US20110066434A1 | Easy to recognize | Easy input | Natural language data processing | Speech recognition | M category | Speech identification

The invention can recognize all languages and input words. It needs m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any language, dialect, or accent. Each word is classified into one of the m categories, represented by its most similar unknown voice. When the user pronounces a word, the invention finds its F most similar unknown voices. All words in the F categories represented by those F unknown voices are arranged according to their pronunciation similarity and alphabetic letters. The pronounced word should be among the top words. Since only the F most similar unknown voices are found from the m (= 500) unknown voices, and since the same word can be classified into several categories, the recognition method is stable for all users, can quickly and accurately recognize all languages (English, Chinese, etc.), and can input many more words without using samples.

Owner:LI TZE FEN +4

Method and apparatus for voice controlled devices with improved phrase storage, use, conversion, transfer, and recognition

Inactive | US7283964B1 | Preserve capability | Easy to use | Sound input/output | Speech recognition | Speech identification | Speech sound

The embodiments of the invention provide for the storage of speech phrases. Speech phrases are processed by a speaker-independent speech recognition engine of a voice controlled device. This engine returns a speaker-independent representation of the phrase. The speaker-independent representation is stored. Embodiments of the invention include methods of converting text to speaker-independent representations of speech and speaker-independent representations of speech into text.

Owner:WINBOND ELECTRONICS CORP

Generating large units of graphonemes with mutual information criterion for letter to sound conversion

Inactive | US20050203739A1 | Character and pattern recognition | Speech recognition | Syllable | Morpheme

A method and apparatus are provided for segmenting words into component parts. Under the invention, mutual information scores for pairs of graphoneme units found in a set of words are determined. Each graphoneme unit includes at least one letter. The graphoneme units of one pair of graphoneme units are combined based on the mutual information score. This forms a new graphoneme unit. Under one aspect of the invention, a syllable n-gram model is trained based on words that have been segmented into syllables using mutual information. The syllable n-gram model is used to segment a phonetic representation of a new word into syllables. Similarly, an inventory of morphemes is formed using mutual information and a morpheme n-gram is trained that can be used to segment a new word into a sequence of morphemes.

Owner:MICROSOFT TECH LICENSING LLC

Providing speech recognition data to a speech enabled device when providing a new entry that is selectable via a speech recognition interface of the device

Inactive | US20090157392A1 | Reduce consumption | Enhance solution scalability | Speech recognition | Speech identification | Speech sound

The present invention discloses a solution for providing a phonetic representation for a content item along with the content item delivered to a speech-enabled computing device. The phonetic representation can be specified in a manner that enables it to be added to a speech recognition grammar of the speech-enabled computing device. Thus, the device can recognize speech commands that involve the content item using the newly added phonetic representation. Current implementations of speech recognition systems of this type rely on internal generation of the speech recognition data that is added to the speech recognition grammar. Generation of speech recognition data can, however, be resource intensive, which can be particularly problematic when the speech-enabled device is resource limited. The disclosed solution offloads the task of providing the speech recognition data to an external device, such as a relatively resource-rich server or a desktop device.

Owner:IBM CORP

Method and apparatus for providing foreign language text display when encoding is not available

Inactive | US20060150098A1 | Character and pattern recognition | Fluid-tightness measurement | Text display | Display device

A method and apparatus include referencing a phonetic language database that includes double-byte font entries and associated phonetic representations of the double-byte font entries. At least one of the double-byte font entries is used to obtain a phonetic representation of the used at least one double-byte font. The phonetic representation is displayed on a display device.

Owner:MICROSOFT TECH LICENSING LLC

System and method for phonetic searching of data

Active | US20140067820A1 | Efficient access | Natural language analysis | Digital data processing details | Distributed File System | Sequential data

A method of phonetically searching media information comprises receiving a plurality of search queries from one or more client systems and providing a phonetic representation of each search query. One or more search jobs are instantiated, each search job comprising a plurality of tasks, each task being arranged to sequentially read a block from an archive file. The archive file is stored within a distributed filing system (DFS) in which sequential blocks of data comprising the archive file are replicated to be locally available to one or more processors from a cluster of processors for executing the tasks. Each block stores index files corresponding to a plurality of source media files, each index file containing a phonetic stream corresponding to audio information for a given source media file. Each task obtains phonetic representations of outstanding search queries for a block and sequentially searches the block for each outstanding search query.

Owner:AVAYA INC

Flexible keyword searching

Inactive | US7502781B2 | Data processing applications | Web data indexing | Engineering | Word group

A search engine implements a multi-level search scheme. A first level involves performing a keyword search based on character matching. A second level, performed only if the first level yields no results, is a keyword search based on phonetic representations of a search phrase and of the keywords. A third level, performed only if the first and second levels yield no results, is a rough matching search. The keywords or keyword phrases are specified in a phrase table. Each entry of the phrase table specifies a keyword phrase, its phonetic representation, a topic URL, and an action that is to be performed in conjunction with the topic URL. There are a plurality of defined actions, having different priorities. If multiple keyword phrases are found in the multi-level search, the one having the action with the highest priority is initiated. If there is a tie for the highest priority, the results are listed in a results page, regardless of the actions associated with the matched entries. Different actions can be specified in the phrase table entries, corresponding to different levels of the multi-level search scheme that were required to discover a matching entry.

Owner:MICROSOFT TECH LICENSING LLC
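
The three-level fallback in the abstract above can be sketched as below. This is an illustration under assumptions, not the patented scheme: `phonetic_key` is a crude stand-in for a real phonetic encoding, the phrase table and its URLs are hypothetical, the rough level uses `difflib.get_close_matches` from the Python standard library, and the action priorities described in the abstract are left out.

```python
import difflib

# Hypothetical phrase table; the abstract's "action" field is omitted here.
PHRASE_TABLE = [
    {"phrase": "printer", "url": "https://example.com/help/printer"},
    {"phrase": "password", "url": "https://example.com/help/password"},
]

def phonetic_key(text):
    """Crude stand-in for a phonetic encoding: keep the first letter,
    drop later vowels and h/w/y, and collapse repeated letters."""
    letters = [c for c in text.lower() if c.isalpha()]
    key = []
    for i, c in enumerate(letters):
        if i > 0 and c in "aeiouyhw":
            continue
        if key and key[-1] == c:
            continue
        key.append(c)
    return "".join(key)

def search(query):
    """Return (level, matching entries), trying each level only if the
    previous one found nothing."""
    q = query.lower().strip()

    exact = [e for e in PHRASE_TABLE if e["phrase"] == q]
    if exact:
        return "exact", exact

    key = phonetic_key(q)
    phonetic = [e for e in PHRASE_TABLE if phonetic_key(e["phrase"]) == key]
    if phonetic:
        return "phonetic", phonetic

    phrases = [e["phrase"] for e in PHRASE_TABLE]
    close = difflib.get_close_matches(q, phrases, n=3, cutoff=0.6)
    return "rough", [e for e in PHRASE_TABLE if e["phrase"] in close]

level_1 = search("printer")
level_2 = search("printar")   # misspelling that sounds the same
level_3 = search("pritner")   # transposed letters, different phonetic key
```

Returning the level alongside the matches leaves room for the behaviour the abstract mentions, where a different action can be chosen depending on which level found the entry.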

Semantic recognition method, device, storage medium and computer equipment based on voice interaction

Inactive | CN107451119A | Avoid errors | Improve accuracy | Semantic analysis | Speech recognition | Representation term | Speech sound

The invention provides a semantic recognition method, device, storage medium and computer equipment based on voice interaction. The semantic recognition method based on voice interaction comprises the steps that voice data acquired is converted into phonetic text, wherein the phonetic text comprises at least one phonetic representation character; by means of the phonetic representation characters in the phonetic text, term matching is conducted, so that text is acquired, wherein the text contains meaning representation terms matched with the phonetic representation characters; semantic understanding processing is conducted on the text, so that the semantic meaning of the voice data is acquired. By means of the technical scheme, the accuracy of voice recognition can be improved.

Owner:SHANGHAI XIAOI ROBOT TECH CO LTD

System and method for phonetic searching of data

Inactive | US20140067374A1 | Speed up search | Natural language analysis | Special data processing applications | Document preparation | Voice search

A method for phonetically searching media including a plurality of audio tracks is disclosed, where each audio track is indexed to provide a phonetic representation of the audio track. The method comprises obtaining a text search query and searching for the text query against a set of reference documents to obtain a sub-set of pseudo-relevant documents. The pseudo-relevant documents are examined for a set of search expressions characterizing the pseudo-relevant documents. A phonetic representation corresponding to at least some of the set of search expressions is provided and, for each of the phonetic representations of the search expressions, the indexed phonetic representations for one or more of the plurality of audio tracks are phonetically searched to provide any indicators of the incidence of the search expression within the one or more audio tracks.

Owner:AVAYA INC

Method and system for obtaining personal aliases through voice recognition

Inactive | US20060129398A1 | Speech recognition | Special data processing applications | Spoken language | Speech sound

Methods and systems for recognizing a spoken alias are disclosed. The present invention includes generating a plurality of alias variations based on a discoverable name and creating a phonetic representation for each of the alias variations. The present invention also includes capturing a phonetic pronunciation of the spoken alias. At least one of the created alias variations that has a phonetic representation that corresponds to the captured phonetic pronunciation is selected.

Owner:MICROSOFT TECH LICENSING LLC

Synthesis by generation and concatenation of multi-form segments

Active | CN101828218A | Speech recognition | Speech synthesis | Concatenation

A speech synthesis system and method is described. A speech segment database references speech segments having various different speech representational structures. A speech segment selector selects from the speech segment database a sequence of speech segment candidates corresponding to a target text. A speech segment sequencer generates from the speech segment candidates sequenced speech segments corresponding to the target text. A speech segment synthesizer combines the selected sequenced speech segments to produce a synthesized speech signal output corresponding to the target text.

Owner:DIFFERENTIAL COMM CORP

Apparatus and methods for pronunciation lexicon compression

Inactive | US20060004564A1 | Code conversion | Natural language data processing | Prediction algorithms | Speech sound

A compressed pronunciation lexicon file is generated from a source pronunciation lexicon using a pronunciation prediction algorithm in a multi-output mode. The pronunciation prediction algorithm may generate a deterministic ordered list of phoneme strings from the textual representation of a particular word. The compressed pronunciation lexicon file may include a sorted list of records of compressed textual representations of words and compressed phonetic representations of the words.

Owner:MARVELL ASIA PTE LTD
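
One plausible reading of the lexicon-compression abstract above is that, because the predictor returns a deterministic ordered list of candidate phoneme strings, a record can store the index of the correct pronunciation in that list rather than the full string. The sketch below illustrates only that idea for the phonetic side. `toy_predictor` is a hypothetical placeholder, the literal fallback for unpredicted pronunciations is an assumption, and compression of the textual representation is not shown.

```python
from typing import Callable, Union

Predictor = Callable[[str], list[str]]
Record = tuple[str, Union[int, str]]


def toy_predictor(word: str) -> list[str]:
    """Hypothetical multi-output predictor: deterministic, ordered candidates."""
    spelled = " ".join(word.upper())
    return [spelled, spelled.replace("C", "K"), spelled.replace("C", "S")]


def compress(lexicon: dict[str, str], predict: Predictor) -> list[Record]:
    """Build a sorted list of records, storing a candidate index where possible."""
    records: list[Record] = []
    for word in sorted(lexicon):
        pron = lexicon[word]
        candidates = predict(word)
        if pron in candidates:
            # A small integer replaces the whole phoneme string.
            records.append((word, candidates.index(pron)))
        else:
            # Fall back to the literal pronunciation (assumption).
            records.append((word, pron))
    return records


def decompress(records: list[Record], predict: Predictor) -> dict[str, str]:
    """Rebuild the lexicon by re-running the same deterministic predictor."""
    return {
        word: predict(word)[code] if isinstance(code, int) else code
        for word, code in records
    }
```

The round trip only works if compression and decompression use the same predictor, which is why the deterministic ordering matters.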

Method and apparatus for providing foreign language text display when encoding is not available

Inactive | US7260780B2 | Digital computer details | Character and pattern recognition | Text display | Display device

A method and apparatus include referencing a phonetic language database that includes double-byte font entries and associated phonetic representations of the double-byte font entries. At least one of the double-byte font entries is used to look up its associated phonetic representation. The phonetic representation is displayed on a display device.

Owner:MICROSOFT TECH LICENSING LLC
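
The lookup described in the abstract above amounts to mapping each double-byte character to a phonetic string that can be shown when the character itself cannot be rendered. A minimal sketch follows, assuming a tiny in-memory table. `PHONETIC_DB` and its pinyin-with-tone-number entries are illustrative assumptions, since the abstract does not name a language or a storage format.

```python
# Hypothetical phonetic language database: character -> phonetic representation.
PHONETIC_DB = {
    "你": "ni3",
    "好": "hao3",
    "世": "shi4",
    "界": "jie4",
}


def to_phonetic(text: str, db: dict[str, str] = PHONETIC_DB, unknown: str = "?") -> str:
    """Return a displayable phonetic string for text the device cannot render."""
    # Characters missing from the database are shown as a placeholder.
    return " ".join(db.get(ch, unknown) for ch in text)
```

A device without the needed font or encoding could then display the result of `to_phonetic(...)` in place of the original characters.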

System and method for combining phonetic and automatic speech recognition search

Active | US20210065679A1 | Improve prior search technology | Less storage | Speech recognition | Special data processing applications | Automatic speech | Voice search

A text search query including one or more words may be received. An ASR index created for an audio recording may be searched using the query to produce ASR search results including words, each word associated with a confidence score. For each word in the ASR search results whose confidence score is below a threshold (and which, in some cases, has one or more preceding words and one or more subsequent words in the ASR index), a phonetic representation of the audio recording may be searched for that word at the place where it occurs in the audio recording, possibly after the one or more preceding words and before the one or more subsequent words, to produce phonetic search results. The search results returned may include both ASR and phonetic results.

Owner:NICE LTD
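
The two-stage search in the abstract above can be sketched as follows: ASR hits at or above a confidence threshold are accepted directly, while low-confidence hits are re-checked against a phonetic representation at the same location. This is one interpretation, not the patented implementation. The `AsrWord` type, the position-keyed `phonetic_index`, the `to_phones` callback, and the default threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AsrWord:
    text: str
    confidence: float
    position: int  # index of the word within the recording's transcript


def phonetic_match(phonetic_index: dict[int, str], phones: str, position: int) -> bool:
    """Hypothetical phonetic search: compare the phone string stored at a position."""
    return phonetic_index.get(position) == phones


def combined_search(
    query: str,
    asr_index: list[AsrWord],
    phonetic_index: dict[int, str],
    to_phones: Callable[[str], str],
    threshold: float = 0.6,
) -> list[tuple[int, str]]:
    """Return (position, source) pairs, where source is 'asr' or 'phonetic'."""
    results: list[tuple[int, str]] = []
    for word in asr_index:
        if word.text != query:
            continue
        if word.confidence >= threshold:
            # Confident ASR hit: accept without a phonetic check.
            results.append((word.position, "asr"))
        elif phonetic_match(phonetic_index, to_phones(query), word.position):
            # Low-confidence hit confirmed by the phonetic representation.
            results.append((word.position, "phonetic"))
    return results
```

Restricting the phonetic check to low-confidence positions keeps the more expensive phonetic search small, which fits the abstract's focus on words below the threshold.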

Database storing syllables and sound units for use in text to speech synthesis system

Inactive | US20070203705A1 | Speech synthesis | Syllable | Text to speech synthesis

In embodiments the present invention includes a method for populating a text to speech synthesis database. This method can include the steps of defining a set of phonetic symbols, wherein each symbol is a single alphabetic character representing a separate sound, representing a syllable by at least one phonetic symbol of the set of phonetic symbols to form a phonetic representation of the syllable, recording a verbal expression of the syllable using the phonetic representation, indexing the recording of the verbal expression of the syllable to a description of the recording, and storing the indexed recording of the verbal expression of the syllable in a database.

Owner:ALPINE ELECTRONICS INC

Method for speech recognition on all languages and for inputing words using speech recognition

Inactive | US8352263B2 | Easy to recognize | Easy input | Natural language data processing | Speech recognition | M category | Accentual verse

The invention recognizes speech in any language and supports word input by voice. It uses m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any language, dialect, or accent. Each word is classified into one of the m categories, namely the one represented by its most similar unknown voice. When a user pronounces a word, the invention finds its F most similar unknown voices. All words in the F categories represented by those F unknown voices are then arranged according to pronunciation similarity and alphabetical order. The pronounced word should appear among the top words. Because only the F most similar unknown voices are selected from the m (=500) unknown voices, and because the same word can be classified into several categories, the recognition method is stable for all users and can quickly and accurately recognize all languages (English, Chinese, etc.) and input many more words without using samples.

Owner:LI TZE FEN +4
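
The category lookup in the abstract above (find the F most similar of m "unknown voices", pool the words in their categories, then rank by similarity and alphabetical order) can be sketched as a nearest-prototype search. Feature vectors as plain tuples, Euclidean distance as the similarity measure, and the per-word `word_features` table are all assumptions; the abstract does not specify how similarity is computed.

```python
import math

Vector = tuple[float, ...]


def recognize(
    utterance: Vector,
    prototypes: list[Vector],       # the m "unknown voices", one per category
    categories: list[list[str]],    # categories[i] holds the words of prototype i
    word_features: dict[str, Vector],
    f: int = 2,
) -> list[str]:
    """Return candidate words, best match first."""
    # Pick the F prototypes closest to the utterance.
    nearest = sorted(
        range(len(prototypes)),
        key=lambda i: math.dist(utterance, prototypes[i]),
    )[:f]
    # Pool the words from those F categories (a word may sit in several).
    candidates = {word for i in nearest for word in categories[i]}
    # Rank by distance to the utterance, breaking ties alphabetically.
    return sorted(
        candidates,
        key=lambda w: (math.dist(utterance, word_features[w]), w),
    )
```

Because only F of the m prototypes are compared in detail, the ranking step touches a small pool of words rather than the whole vocabulary.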
