89 patent results for "Phonetic transcription"

Phonetic transcription (also known as phonetic script or phonetic notation) is the visual representation of speech sounds (or phones). The most common type of phonetic transcription uses a phonetic alphabet, such as the International Phonetic Alphabet.

Method and apparatus for providing unsupervised adaptation of phonetic transcriptions in a speech recognition dictionary

An adaptive speech recognition system is provided, including an input for receiving a signal derived from a spoken utterance indicative of a certain vocabulary item, a speech recognition dictionary, a speech recognition unit and an adaptation module. The speech recognition dictionary has a plurality of vocabulary items, each associated with a respective dictionary transcription group. The speech recognition unit is in an operative relationship with the speech recognition dictionary and selects a certain vocabulary item from the dictionary as a likely match to the signal received at the input. The results of the speech recognition process are provided to the adaptation module. The adaptation module includes a transcriptions bank having a plurality of orthographic groups, each including a plurality of transcriptions associated with a common vocabulary item. A transcription selector module in the adaptation module retrieves a given orthographic group from the transcriptions bank on the basis of the vocabulary item recognized by the speech recognition unit. The transcription selector module processes the given orthographic group on the basis of the signal received at the input to select a certain transcription from the transcriptions bank. The adaptation module then uses the selected transcription to modify the dictionary transcription group of the vocabulary item that was selected as a likely match to the input signal.
Owner:AVAYA INC
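
The adaptation loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say how the selector scores candidates against the signal, so the sketch assumes the signal has already been decoded into a phoneme sequence and uses Levenshtein distance as a stand-in score. The names `edit_distance`, `select_transcription` and `adapt`, and the sample phoneme data, are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_transcription(orthographic_group, observed_phones):
    """Pick the candidate closest to what was heard (assumed scoring rule)."""
    return min(orthographic_group,
               key=lambda t: edit_distance(t, observed_phones))


def adapt(dictionary, bank, recognized_item, observed_phones):
    """Add the best-matching bank transcription to the dictionary group."""
    group = bank[recognized_item]          # orthographic group for the item
    best = select_transcription(group, observed_phones)
    entry = dictionary.setdefault(recognized_item, [])
    if best not in entry:                  # modify the dictionary group
        entry.append(best)
    return best


# Hypothetical data: one vocabulary item with two candidate pronunciations.
bank = {"tomato": [("t", "ah", "m", "ey", "t", "ow"),
                   ("t", "ah", "m", "aa", "t", "ow")]}
dictionary = {"tomato": [("t", "ah", "m", "ey", "t", "ow")]}
chosen = adapt(dictionary, bank, "tomato", ("t", "ah", "m", "aa", "d", "ow"))
```

Because no labelled data is needed, only the recognizer's own output and the incoming signal, the update is unsupervised in the sense the title uses.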

Streaming phonetic transcription system based on self-attention mechanism

The invention discloses a streaming phonetic transcription system based on a self-attention mechanism. The system comprises a feature front-end processing module, a self-attention audio coding network module, a self-attention prediction network module and a united network module. The feature front-end processing module receives an input acoustic feature and converts it into a vector of a specific dimensionality; the self-attention audio coding network module is connected to the feature front-end processing module, receives the processed acoustic feature and produces a coded acoustic state vector; the self-attention prediction network module generates a language state vector from the prediction mark input at the previous moment; and the united network module is connected to both the self-attention audio coding network module and the self-attention prediction network module, combines the acoustic state with the language state, and calculates the probability of a new prediction mark. The invention provides a streaming feedforward voice coder based on the self-attention mechanism, improving on the calculation efficiency and precision of a traditional voice coder.
Owner:北京中科智极科技有限公司
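
The final step in this abstract, where the united network module combines an acoustic state vector with a language state vector and outputs a probability for the next prediction mark, might look something like the sketch below. The abstract does not give the combination rule, so this assumes an additive merge followed by `tanh`, a linear projection and a softmax, which is a common choice in transducer-style models. The function names, weights and vector sizes are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_network(acoustic_state, language_state, weights, bias):
    """Combine the two state vectors and return next-mark probabilities.

    Assumed form: h = tanh(acoustic + language); logits = W h + b.
    """
    combined = [math.tanh(a + l)
                for a, l in zip(acoustic_state, language_state)]
    logits = [sum(w * h for w, h in zip(row, combined)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Hypothetical 2-dimensional states and a 3-symbol output vocabulary.
probs = joint_network(
    acoustic_state=[0.2, -0.1],
    language_state=[0.1, 0.3],
    weights=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    bias=[0.0, 0.0, 0.0],
)
```

In a streaming setting this function would be called once per (audio frame, previous mark) pair, so keeping it a cheap feedforward computation matters for latency.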

Intelligent mobile platform Pinyin (phonetic transcriptions of Chinese characters) input method based on language models

The invention relates to an intelligent mobile platform Pinyin (phonetic transcriptions of Chinese characters) input method based on language models. The method comprises the following steps: first, training on a Pinyin text to obtain a letter-based language model and a Pinyin-based language model; second, decoding an input Pinyin string with an HMM (Hidden Markov Model) decoding method; and third, predicting the next input step and giving an input prompt. The prediction itself has three parts: first, using the letter-based language model to obtain every letter that can validly follow a single Pinyin letter, together with its probability; then, using the Pinyin-based language model to obtain every letter that can validly follow each possible Pinyin prefix, together with its probability; and finally, combining the information from the previous two parts to obtain all possible next letters and their probabilities, comparing the probabilities, making the input prediction from the comparison, and giving the input prompt. The method improves the accuracy and fluency of user input and greatly improves input efficiency.
Owner:SHANGHAI JIAO TONG UNIV
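
The third step of this abstract, merging the letter-level and Pinyin-level predictions into one ranked list of next letters, can be illustrated with the sketch below. The abstract only says the two sources are considered together, so the linear interpolation and the `weight` parameter here are assumptions, and the probability tables are made-up sample values rather than trained models.

```python
def combine_predictions(letter_probs, pinyin_probs, weight=0.5):
    """Merge two next-letter distributions and rank the candidates.

    letter_probs: P(next letter | last typed letter)
    pinyin_probs: P(next letter | current Pinyin prefix)
    weight:       share given to the letter-based model (assumed)
    """
    letters = set(letter_probs) | set(pinyin_probs)
    scores = {
        c: weight * letter_probs.get(c, 0.0)
           + (1.0 - weight) * pinyin_probs.get(c, 0.0)
        for c in letters
    }
    total = sum(scores.values())
    if total == 0:
        return []
    # Highest probability first; ties broken alphabetically for stability.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(c, s / total) for c, s in ranked]


# Hypothetical distributions after the user has typed "zh".
after_letter_h = {"a": 0.4, "u": 0.3, "e": 0.3}
after_prefix_zh = {"a": 0.2, "u": 0.2, "e": 0.1, "i": 0.2, "o": 0.3}
suggestions = combine_predictions(after_letter_h, after_prefix_zh)
```

An input method could then highlight or enlarge the top few keys in `suggestions` as its prompt to the user.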

Word segmentation phonetic transcription and ligature writing method and device based on SC grammar

The invention relates to a word segmentation phonetic transcription and ligature writing method and device based on an SC grammar, and belongs to the technical field of computer translation in computer science. First, based on a word segmentation ambiguity rule of the SC grammar, an ambiguity segmentation rule library is built from adjacency constraint conditions in natural language, and illegal segmentations are eliminated to improve word segmentation precision. Second, alongside a word segmentation ligature writing rule library based on the SC grammar, a ligature writing corpus statistics library handles ligature writing knowledge that cannot be expressed as rules. Finally, based on a dictionary library of the SC grammar, the dictionary is used for maximum-matching word segmentation, the word segmentation ambiguity rule is invoked for spans where ambiguity arises so that a correct segmentation is obtained, and the context of each word is analyzed to obtain correct part-of-speech tagging and phonetic transcription. Compared with the prior art, word segmentation accuracy is improved, and the word segmentation ambiguity rule library, a combined ambiguity word library, the ligature writing rule library, the dictionary library and the ligature writing corpus statistics library are easy to expand and maintain.
Owner:HUAJIAN YUTONG TECH BEIJING CO LTD +1
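
The dictionary-based maximum matching mentioned in this abstract can be sketched as below. This is the generic forward maximum matching algorithm only; it does not implement the SC grammar ambiguity rules or the ligature writing step, and the dictionary and the unspaced English input are hypothetical stand-ins chosen for readability.

```python
def forward_maximum_matching(text, dictionary, max_word_len=None):
    """Segment text greedily, always taking the longest dictionary word.

    Characters that start no dictionary word are emitted on their own.
    """
    if max_word_len is None:
        max_word_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink until something matches.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical dictionary; the input has no word boundaries.
words = {"the", "theta", "table", "bled", "down", "own", "there"}
tokens = forward_maximum_matching("thetabledownthere", words)
# tokens == ["theta", "bled", "own", "there"]
```

The greedy result differs from the intended reading "the table down there", which is the kind of segmentation ambiguity a separate rule library would have to detect and correct.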

Plurilingual voice decoding diagram establishment method, device, server and medium

Active | CN109616096A | The need for voice recognition | Speech recognition | Crowds | Speech sound
The embodiment of the invention discloses a plurilingual voice decoding diagram establishment method, a device, a server and a medium, and relates to the technical field of voice recognition. The method comprises the following steps: marking the main language words and secondary language words in a sample corpus bank with phonetic symbols to obtain their pronunciation phonemes; determining the acoustic features of the main language words and the secondary language words from the sample voice associated with the sample corpora in the sample corpus bank; and determining the decoding diagram for plurilingual recognition from the main language words and secondary language words in the sample corpora, together with their pronunciation phonemes and acoustic features. In this way the pronunciation phonemes of both sets of words are obtained from the sample corpus bank, the acoustic features associated with them are then determined, and the decoding diagram for plurilingual recognition is finally obtained, which meets the voice recognition needs of speakers who mix several languages in one utterance.
Owner:北京如布科技有限公司
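
The first step of this abstract, marking main-language and secondary-language words with phonetic symbols, could be approximated by a lexicon lookup like the one below. This is a minimal sketch that covers only the annotation step, not acoustic feature extraction or construction of the decoding diagram. The function name `annotate`, the two small lexicons and the choice of phoneme notation are all illustrative assumptions.

```python
def annotate(tokens, main_lexicon, secondary_lexicon):
    """Tag each word with its language and pronunciation phonemes.

    Words found in neither lexicon are kept with phonemes set to None
    so they can be reviewed or sent to a grapheme-to-phoneme step.
    """
    annotated = []
    for word in tokens:
        if word in main_lexicon:
            annotated.append((word, "main", main_lexicon[word]))
        elif word.lower() in secondary_lexicon:
            annotated.append(
                (word, "secondary", secondary_lexicon[word.lower()]))
        else:
            annotated.append((word, "unknown", None))
    return annotated


# Hypothetical lexicons: Mandarin as the main language (Pinyin-style
# initials and finals with tone digits), English as the secondary one.
main_lexicon = {"我": ["w", "o3"], "想": ["x", "iang3"], "听": ["t", "ing1"]}
secondary_lexicon = {"jazz": ["JH", "AE", "Z"]}
result = annotate(["我", "想", "听", "Jazz"], main_lexicon, secondary_lexicon)
```

Keeping the language tag next to each phoneme sequence lets a later stage choose the matching acoustic units for each word.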