223 results about how to "Improve speech recognition accuracy" patented technology

Transparent monitoring and intervention to improve automatic adaptation of speech models

A system and method for improving the automatic adaptation of one or more speech models in automatic speech recognition systems. After a dialog begins, the dialog asks the customer for spoken input, which is recorded. If the speech recognizer determines that it may not have transcribed the verbal response (i.e., the voice input) correctly, the invention uses monitoring and, if necessary, intervention to ensure that the next transcription of the verbal response is correct. The dialog asks the customer to repeat the verbal response, which is recorded, and a transcription of the input is sent to a human monitor (i.e., an agent or operator). If the transcription is correct, the human does not intervene and the transcription remains unmodified. If the transcription is incorrect, the human intervenes and corrects the misrecognized word. In both cases, the dialog asks the customer to confirm the unmodified or corrected transcription. If the customer confirms it, the dialog continues and the customer does not hang up in frustration, because in most cases only one misrecognition has occurred. Finally, the invention uses the first and second customer recordings of the misrecognized word or utterance, together with the corrected or unmodified transcription, to automatically adapt one or more speech models, which improves the performance of the speech recognition system.
Owner:AVAYA INC
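
The monitoring and intervention flow described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `handle_utterance`, `AdaptationStore`, `recognize`, `human_review` and `customer_confirms`, the confidence threshold of 0.8, and the use of strings to stand in for audio recordings are all assumptions made for the example. The abstract also does not say what happens when the customer rejects the transcription, so the sketch simply returns `None` in that case.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class AdaptationStore:
    """Collects (recording, transcription) pairs for later model adaptation."""
    pairs: List[Tuple[str, str]] = field(default_factory=list)


def handle_utterance(
    first_audio: str,
    second_audio: str,
    recognize: Callable[[str], Tuple[str, float]],
    human_review: Callable[[str, str], str],
    customer_confirms: Callable[[str], bool],
    store: AdaptationStore,
    threshold: float = 0.8,  # hypothetical confidence cut-off
) -> Optional[str]:
    """Return the accepted transcription, or None if the customer rejects it."""
    text, confidence = recognize(first_audio)
    if confidence >= threshold:
        # The recognizer trusts its own result; no monitoring is needed.
        return text

    # Low confidence: the dialog has asked the customer to repeat the input.
    second_text, _ = recognize(second_audio)

    # The human monitor returns the transcription unchanged if it is correct,
    # or a corrected version if it is not.
    final_text = human_review(second_audio, second_text)

    # In both cases the customer is asked to confirm.
    if not customer_confirms(final_text):
        return None  # assumption: the abstract does not cover this branch

    # Both recordings are paired with the confirmed transcription so that
    # they can be used to adapt the speech models later.
    store.pairs.append((first_audio, final_text))
    store.pairs.append((second_audio, final_text))
    return final_text
```

In a test harness, `recognize`, `human_review` and `customer_confirms` can be replaced with simple stubs, which keeps the control flow easy to check without a real recognizer.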

Speech-to-speech translation system with user-modifiable paraphrasing grammars

The present invention discloses a speech-to-speech translation device which allows one or more users to input a spoken utterance in one language, translates the utterance into one or more second languages, and outputs the translation in speech form. Additionally, the device allows for translation in both directions, recognizing inputs in the one or more second languages and translating them back into the first language. The device recognizes and translates utterances in a limited domain, as in a phrase book translation system, so the translation accuracy is essentially 100%. By limiting the domain, the system increases the accuracy of the speech recognition component and thus the accuracy of the overall system. However, unlike other phrase book systems, the device also allows wide variations and paraphrasing in the input, so that the user is much more likely to find the desired phrase in the stored list of phrases. The device paraphrases the input to a basic canonical form and performs the translation on that canonical form, ignoring the non-essential variations in the surface form of the input. The device can provide visual and/or auditory feedback to confirm the recognized input, so that non-bilingual users can use the system with absolute confidence.
Owner:EHSANI FARZAD +2
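
The idea of reducing varied input to a canonical form before a phrase book lookup can be sketched as follows. This is only an illustration under stated assumptions: the abstract mentions user-modifiable paraphrasing grammars, while the sketch substitutes a handful of regular expressions. The phrase book entries, the patterns, and the function names `normalize`, `to_canonical` and `translate` are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical phrase book: canonical form -> stored translation.
PHRASE_BOOK = {
    "where is the hospital": "¿Dónde está el hospital?",
    "i need water": "Necesito agua.",
}

# Hypothetical paraphrase rules: surface pattern -> canonical form.
PARAPHRASE_RULES = [
    (re.compile(r"^(please )?(can|could) you (tell|show) me where the hospital is$"),
     "where is the hospital"),
    (re.compile(r"^where('s| is) the (nearest )?hospital$"),
     "where is the hospital"),
    (re.compile(r"^(i need|i would like|can i have) (some )?water( please)?$"),
     "i need water"),
]


def normalize(utterance: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z' ]", " ", utterance.lower())
    return " ".join(text.split())


def to_canonical(utterance: str) -> Optional[str]:
    """Map a recognized utterance to its canonical form, if any rule matches."""
    text = normalize(utterance)
    for pattern, canonical in PARAPHRASE_RULES:
        if pattern.match(text):
            return canonical
    return None


def translate(utterance: str) -> Optional[str]:
    """Translate via the canonical form; None means the input is out of domain."""
    canonical = to_canonical(utterance)
    if canonical is None:
        return None
    return PHRASE_BOOK[canonical]
```

Because translation is performed only on the canonical form, every paraphrase that a rule accepts maps to the same stored translation, and out-of-domain input is rejected rather than mistranslated.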

Speech recognition method, speech assessment method, speech recognition system, and speech assessment system

The invention discloses a speech recognition method for recognizing a user's speech and generating a speech recognition result. The method includes the following steps: a speech obtaining step, in which the user's speech is obtained; a speech recognition step, in which the obtained speech is recognized as text data that serves as an initial speech recognition result; and an error correction step, in which an error correction list recording a plurality of original vocabulary-deviation vocabulary associations is checked. If the initial speech recognition result contains the deviation vocabulary of an association in the list, error correction is performed: the matching vocabulary in the initial result is replaced with the original vocabulary of that association. The result after error correction is taken as the speech recognition result generated by the method. The invention also discloses a speech assessment method based on the speech recognition method, as well as a corresponding speech recognition system and speech assessment system.
Owner:RICOH KK
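
The error correction step in this abstract amounts to a lookup-and-replace pass over the initial recognition result. A minimal sketch, assuming whitespace-separated tokens and case-insensitive matching (neither is stated in the abstract), with a hypothetical list and a hypothetical function name `apply_error_correction`:

```python
from typing import Dict

# Hypothetical error correction list: deviation vocabulary -> original vocabulary.
ERROR_CORRECTION_LIST: Dict[str, str] = {
    "rico": "Ricoh",
    "speach": "speech",
}


def apply_error_correction(initial_result: str, corrections: Dict[str, str]) -> str:
    """Replace every token that appears as a deviation with its original vocabulary."""
    corrected_tokens = []
    for token in initial_result.split():
        # Tokens not found in the list pass through unchanged.
        corrected_tokens.append(corrections.get(token.lower(), token))
    return " ".join(corrected_tokens)
```

For text without word boundaries, such as Chinese or Japanese, token lookup would have to give way to substring matching; that variant is not shown here.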

Speech recognition method and device based on Chinese and English mixed dictionary

The invention provides a speech recognition method and device based on a Chinese and English mixed dictionary. The method comprises the steps of: acquiring a Chinese and English mixed dictionary annotated with the International Phonetic Alphabet (IPA), wherein the mixed dictionary comprises a Chinese dictionary and an English dictionary corrected by means of the Chinese dictionary; training a model consisting of one convolutional neural network (CNN) layer plus five Long Short-Term Memory (LSTM) layers, with the mixed dictionary as the training dictionary, the states of the IPA as the targets, and Connectionist Temporal Classification (CTC) as the training criterion, so as to obtain a trained CTC acoustic model; and using the trained CTC acoustic model to perform speech recognition on mixed Chinese and English speech. Because the dictionary used for training comprises both the Chinese dictionary and the English dictionary corrected by means of the Chinese dictionary, English word coverage is comprehensive and Chinglish can be recognized, and applying the CTC acoustic model further improves the accuracy of mixed Chinese and English recognition.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
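
A CTC-trained acoustic model emits one score vector per audio frame, including a special blank label. The abstract does not describe decoding, so the following is a generic sketch of greedy (best-path) CTC decoding rather than the patented method: pick the best label in each frame, merge consecutive repeats, then drop blanks. The function name `ctc_greedy_decode`, the choice of index 0 for the blank, and the toy scores in the usage note are assumptions.

```python
from typing import List, Sequence

BLANK = 0  # assumption: label index 0 is the CTC blank


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank: int = BLANK
) -> List[int]:
    """Best-path CTC decoding: per-frame argmax, merge repeats, remove blanks."""
    # Step 1: choose the highest-scoring label index in every frame.
    best_path = [
        max(range(len(frame)), key=lambda i: frame[i]) for frame in frame_scores
    ]

    # Step 2: collapse runs of the same label and skip blanks. A blank between
    # two identical labels keeps them as two separate output symbols.
    decoded: List[int] = []
    previous = None
    for label in best_path:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

For example, a best path of `[1, 1, 0, 1, 2, 2, 0]` decodes to `[1, 1, 2]`: the first two frames merge into one symbol, the blank separates it from the next `1`, and the repeated `2` collapses. In the setting of this abstract the label indices would correspond to IPA-based targets.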

Voice recognition method and device based on Chinese and English mixed dictionary

The invention proposes a voice recognition method and device based on a Chinese and English mixed dictionary. The method comprises the steps of: obtaining a Chinese and English mixed dictionary annotated with the IPA (International Phonetic Alphabet), wherein the mixed dictionary comprises a Chinese dictionary and an English dictionary that has undergone Chinglish correction; training a model consisting of one CNN (Convolutional Neural Network) layer plus five LSTM (Long Short-Term Memory) recurrent layers, with the mixed dictionary as the training dictionary, syllables or words as the targets, and CTC (Connectionist Temporal Classification) as the training criterion, so as to obtain a trained CTC acoustic model; and using the trained CTC acoustic model to carry out voice recognition of mixed Chinese and English speech. According to the embodiment of the invention, the mixed dictionary used for training comprises the Chinese dictionary and the English dictionary that has undergone Chinglish correction, so the coverage of Chinese and English words is complete and Chinglish can be recognized. Combining this with the CTC acoustic model further improves the recognition accuracy for mixed Chinese and English speech.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
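
One way to picture the mixed dictionary is as a single lexicon that maps both Chinese and English words to pronunciations written with a shared phone inventory. The sketch below is an assumption-heavy illustration: the abstract does not say how the Chinglish correction is carried out, so the sketch models it as extra accented pronunciation variants for English words. The function names `merge_lexicons` and `phone_inventory` are hypothetical, and the transcriptions in the usage note are toy values rather than authoritative IPA.

```python
from typing import Dict, List

# word -> list of pronunciations, each a list of phone symbols
Lexicon = Dict[str, List[List[str]]]


def merge_lexicons(chinese: Lexicon, english: Lexicon, accent_variants: Lexicon) -> Lexicon:
    """Build one mixed lexicon, keeping every distinct pronunciation per word."""
    mixed: Lexicon = {}
    for source in (chinese, english, accent_variants):
        for word, pronunciations in source.items():
            entry = mixed.setdefault(word, [])
            for pronunciation in pronunciations:
                if pronunciation not in entry:  # skip exact duplicates
                    entry.append(list(pronunciation))
    return mixed


def phone_inventory(lexicon: Lexicon) -> List[str]:
    """Return the sorted set of phone symbols used anywhere in the lexicon."""
    return sorted(
        {symbol
         for pronunciations in lexicon.values()
         for pronunciation in pronunciations
         for symbol in pronunciation}
    )
```

With toy inputs such as `{"你好": [["n", "i", "x", "a", "u"]]}` for Chinese, `{"hello": [["h", "ə", "l", "o", "ʊ"]]}` for English, and `{"hello": [["h", "a", "l", "o", "u"]]}` as an accented variant, the merged lexicon holds two pronunciations for "hello", and the phone inventory gives the shared symbol set that an acoustic model could be trained against.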

Small data speech acoustic modeling method in speech recognition

The invention belongs to the technical field of signal processing in the electronics industry and addresses the low discriminative performance of an acoustic model for a target language that has only a small amount of labeled data. To solve this problem, the invention provides a small-data speech acoustic modeling method for speech recognition, comprising the steps of: performing adversarial training on the acoustic features of multiple languages with a language adversarial discriminator, so as to build a multilingual adversarial bottleneck network model; feeding the acoustic features of the target language into the multilingual adversarial bottleneck network model, so as to extract language-independent bottleneck features; fusing the language-independent bottleneck features with the acoustic features of the target language to obtain fused features; and training on the fused features to build an acoustic model of the target language. The method effectively overcomes a shortcoming of the prior art, in which the bottleneck information contains language-dependent information and therefore yields little improvement in target-language recognition performance or even negative transfer, thereby improving the speech recognition accuracy for the target language.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
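
The fusion step in this abstract can be illustrated with a small sketch. The abstract does not say how the features are fused, so frame-wise concatenation is assumed here as one common choice. The bottleneck network itself (including its adversarial training) is outside the scope of the sketch and is represented by a caller-supplied function; `fuse_features` and `bottleneck_extractor` are hypothetical names.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]


def fuse_features(
    acoustic_frames: Sequence[Frame],
    bottleneck_extractor: Callable[[Frame], Sequence[float]],
) -> List[List[float]]:
    """Concatenate each acoustic frame with its bottleneck features."""
    fused: List[List[float]] = []
    for frame in acoustic_frames:
        # Stand-in for the trained multilingual adversarial bottleneck network.
        bottleneck = bottleneck_extractor(frame)
        # Assumed fusion: append the bottleneck vector to the original frame.
        fused.append(list(frame) + list(bottleneck))
    return fused
```

Each fused frame has the dimensionality of the acoustic features plus that of the bottleneck layer, and the resulting sequence is what the target-language acoustic model would be trained on.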