101 results about "Speech recognition performance" patented technology

Exception dictionary creating unit, exception dictionary creating method, and program therefor, as well as speech recognition unit and speech recognition method

An exception dictionary creating device, an exception dictionary creating method, and a program therefor are provided that create an exception dictionary affording high speech recognition performance while reducing the size of the exception dictionary, as well as a speech recognition device and a speech recognition method capable of recognizing speech with high accuracy by using the exception dictionary. To achieve this, a text-to-phonetic symbol converting unit (21) of an exception dictionary creating device (10) creates a converted phonetic symbol sequence by converting a text sequence of vocabulary list data (12) into a phonetic symbol sequence. A recognition degradation contribution degree calculating unit (24) calculates a recognition degradation contribution degree when the converted phonetic symbol sequence is not identical to a correct phonetic symbol sequence registered in a database or word dictionary (50). An exception dictionary registering unit (41) registers in the exception dictionary (60) the text sequence and the phonetic symbol sequence of vocabulary list data (12) entries with a high recognition degradation contribution degree, so as not to exceed the data limitation capacity indicated by exception dictionary memory size content (71).
Owner:ASAHI KASEI KK

Speech recognition apparatus

The present invention provides a speech recognition apparatus having high speech recognition performance and capable of performing speech recognition in a highly efficient manner. A matching unit 14 calculates the scores of words selected by a preliminary word selector 13 and determines a candidate for a speech recognition result on the basis of the calculated scores. A control unit 11 produces word connection relationships among words included in a word series employed as a candidate for the speech recognition result and stores them into a word connection information storage unit 16. A reevaluation unit 15 corrects the word connection relationships one by one. On the basis of the corrected word connection relationships, the control unit 11 determines the speech recognition result. A word connection managing unit 21 limits times allowed for a boundary between words represented by the word connection relationships to be located thereat. A word connection managing unit 22 limits start times of words preliminarily selected by the preliminary word selector 13. The present invention can be applied to an interactive system that recognizes an input speech and responds to the speech recognition result.
Owner:SONY CORP

Speech processing method and apparatus for improving speech quality and speech recognition performance

A speech processing apparatus which, in the process of performing echo canceling by using a pseudo acoustic echo signal, continuously uses an impulse response used for the previous frame as an impulse response to generate the pseudo acoustic echo signal when a voice is contained in the microphone input signal, and which uses a newly updated impulse response when a voice is not contained in the microphone input signal.
Owner:ASAHI KASEI KK
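
The update rule in the abstract above (keep the previous frame's impulse response while a voice is present, adopt a newly updated one otherwise) can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not say how the impulse response is updated, so a normalized LMS (NLMS) update is assumed here, and the function name `cancel_echo_frame`, the step size `mu`, and the zero-padded frame history are all hypothetical.

```python
from typing import List, Tuple


def cancel_echo_frame(
    mic: List[float],
    ref: List[float],
    h: List[float],
    voice_present: bool,
    mu: float = 0.5,
    eps: float = 1e-8,
) -> Tuple[List[float], List[float]]:
    """Cancel echo in one frame and return (output frame, impulse response).

    mic: microphone samples; ref: loudspeaker reference samples;
    h: impulse response carried over from the previous frame.
    Assumption: NLMS adaptation, with reference history before the
    frame start treated as zero for simplicity.
    """
    taps = len(h)
    h_work = list(h)
    out = []
    for n in range(len(mic)):
        # Most recent reference samples, newest first, zero-padded.
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        pseudo_echo = sum(w * s for w, s in zip(h_work, x))
        err = mic[n] - pseudo_echo
        out.append(err)
        if not voice_present:
            # No near-end voice: adapt the impulse response (NLMS, assumed).
            norm = sum(s * s for s in x) + eps
            h_work = [w + mu * err * s / norm for w, s in zip(h_work, x)]
        # Voice present: h_work stays equal to the previous frame's response.
    return out, h_work
```

Freezing adaptation while the user speaks keeps the near-end voice from being mistaken for echo and corrupting the estimated impulse response.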

Method of using microphone characteristics to optimize speech recognition performance

A system and method for tuning a speech recognition engine to an individual microphone using a database containing acoustical models for a plurality of microphones. Microphone performance characteristics are obtained from a microphone at a speech recognition engine, the database is searched for an acoustical model that matches the characteristics, and the speech recognition engine is then modified based on the matching acoustical model.
Owner:GENERAL MOTORS LLC

Speech recognition device and speech recognition method

The invention discloses a speech recognition device and a speech recognition method. The speech recognition device comprises a unit configured to obtain speech input by a current user, a unit configured to split the obtained speech and output at least two voice command segments, a unit configured to recognize a first predefined voice command from the voice command segments by using a speaker-independent acoustic model, a unit configured to calculate a transformation matrix for the current user based on the voice command segment recognized as the first predefined voice command, a unit configured to select an acoustic model for the current user from the acoustic models registered in the speech recognition device based on the calculated transformation matrix, and a unit configured to recognize a second voice command from the voice command segments by using the selected acoustic model. With the speech recognition device and the speech recognition method of the invention, speech recognition performance can be improved by using the selected acoustic model (AM).
Owner:CANON KK

System and method for measuring confusion among words in an adaptive speech recognition system

A system and method are proposed for measuring confusability or similarity between given entry pairs, including text string pairs and acoustic model pairs, in systems such as speech recognition and synthesis systems. A string edit distance (Levenshtein distance) can be applied to measure the distance between any pair of text strings. It can also be used to calculate a confusion measurement between acoustic model pairs of different words, and a model-driven method can be used to calculate an HMM model confusion matrix. This model-based approach can be calculated efficiently with low memory and low computational resources. Thus it can improve the speech recognition performance and models trained from a text corpus.
Owner:NOKIA CORP
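
The string edit distance named in the abstract above can be sketched with the standard dynamic-programming recurrence. The `confusability` helper is a hypothetical normalization added for illustration only; the abstract does not state how the distance is turned into a confusion score.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein (string edit) distance with unit-cost edits."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def confusability(a: str, b: str) -> float:
    """Hypothetical similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

Keeping only two rows of the table holds memory at O(len(b)), in line with the low-memory goal the abstract mentions.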

Speech recognition processing method and device

The invention discloses a speech recognition processing method and device. In the method, a preset processing model is trained on speech sample data from all areas of the country to generate a universal Mandarin acoustic model; then, on the basis of the speech sample data of each province, adaptive training is carried out on the universal Mandarin acoustic model to generate Mandarin acoustic models with dialectal accents, one corresponding to each province. Because Mandarin acoustic models with dialectal accents are established on the basis of the accent differences of users in different areas, speech recognition performance is improved.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Distributed speech recognition system and method

A distributed speech recognition system and method thereof in accordance with the present invention enables a word and a natural language to be recognized using detection of a pause period in a speech period in an inputted speech signal, and various groups of recognition vocabulary (for example, a home speech recognition vocabulary, a telematics vocabulary for a vehicle, a vocabulary for call center, and so forth) to be processed in the same speech recognition system by determining the recognition vocabulary required by a corresponding terminal using an identifier of the terminal since various terminals require various speech recognition targets. In addition, various types of channel distortion occurring due to the type of terminal and the recognition environment are minimized by adapting them to a speech database model using a channel estimation method so that the speech recognition performance is enhanced.
Owner:SAMSUNG ELECTRONICS CO LTD

Voice recognition performance estimation apparatus, method and program allowing insertion of an unnecessary word

A voice recognition estimating apparatus for a voice recognition apparatus, including a voice data property generator that generates properties of voice data used to determine, based on an estimation item, a feature of synthetic voice data. The estimation item is used to estimate a performance of the voice recognition apparatus. The voice data property generator includes an acquisition unit that acquires vocabulary data and unnecessary word data as the estimation item. The unnecessary word data indicates an unnecessary word inserted in the vocabulary data and an insertion position of the unnecessary word. The voice data property generator also includes a generator that generates the properties of the voice data. The properties of the voice data include selected voice quality data items, the vocabulary data and the unnecessary word data.
Owner:KK TOSHIBA

Method and apparatus for recognizing speech

Disclosed herein are an apparatus and method for recognizing speech. The apparatus includes a frame-based speech recognition unit, a segment division unit, a segment feature extraction unit, a segment speech recognition performance unit, and a combination and synchronization unit. The frame-based speech recognition unit extracts frame speech feature vectors from a speech signal, and performs speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model. The segment division unit divides the speech signal into segments. The segment feature extraction unit extracts segment speech feature vectors around a boundary between the segments. The segment speech recognition performance unit performs speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model. The combination and synchronization unit combines results of the speech recognition for the frames with results of the speech recognition for the segments.
Owner:ELECTRONICS & TELECOMM RES INST

Image display apparatus and method of controlling the same

Provided are an image display apparatus able to improve voice recognition performance by decreasing the volume of an audio signal output from the image display apparatus to a predetermined level or less when the image display apparatus recognizes user voice, and a method of controlling the same. The image display apparatus enabling voice recognition includes a first voice input unit to receive a user-side audio signal, an audio output unit to output an audio signal processed by the image display apparatus, a first voice recognizer to recognize the user-side audio signal received through the first voice input unit, and a controller to decrease the volume of the audio signal output through the audio output unit to a predetermined level if a voice recognition start command is received.
Owner:SAMSUNG ELECTRONICS CO LTD

Voice data processing method and apparatus of mobile terminal

The invention provides a voice data processing method and apparatus of a mobile terminal. The method includes: acquiring voice data input by users; recognizing the voice data; obtaining a keyword corresponding to the voice data and identification information of a controlled electronic device according to a recognition result, and performing classification according to goals of the users in the voice data; searching historical voice control events for a first historical voice control event that matches the keyword and a classification result, wherein an execution result of the controlled electronic device corresponding to the first historical voice control event is execution success; and obtaining a corresponding historical control instruction from the first historical voice control event, and sending the historical control instruction to the controlled electronic device corresponding to the identification information to control the controlled electronic device to execute an operation corresponding to the historical control instruction. By employing the above scheme, the speech recognition performance can be enhanced.
Owner:SHANGHAI PATEO INTERNET TECH SERVICE CO LTD

Method for speech recognition using uncertainty information for sub-bands in noise environment and apparatus thereof

According to a method and apparatus of the present invention for speech recognition in a noise environment using uncertainty information for sub-bands, uncertainty information of each sub-band is extracted from estimated clean speech using noise modeling, and the extracted uncertainty information is used as a weight for each sub-band to help extract speech features that are robust to noise. Also, an acoustic model is converted according to each sub-band weight, and speech recognition is performed based on the converted acoustic model and the extracted speech features. As a result, even when the noise modeling over time is not very accurate, the noise influence resulting from sub-bands having high corruption can be reduced according to the uncertainty information of the corresponding sub-band, and speech recognition performance in complex noise environments can be improved.
Owner:ELECTRONICS & TELECOMM RES INST

Sparse representation features for speech recognition

Techniques are disclosed for generating and using sparse representation features to improve speech recognition performance. In particular, principles of the invention provide sparse representation exemplar-based recognition techniques. For example, a method comprises the following steps. A test vector and a training data set associated with a speech recognition system are obtained. A subset of the training data set is selected. The test vector is mapped with the selected subset of the training data set as a linear combination that is weighted by a sparseness constraint such that a new test feature set is formed wherein the training data set is moved more closely to the test vector subject to the sparseness constraint. An acoustic model is trained on the new test feature set. The acoustic model trained on the new test feature set may be used to decode user speech input to the speech recognition system.
Owner:NUANCE COMM INC

Speech recognition terminal evaluation system and method

The invention provides a speech recognition terminal evaluation system and method. The speech recognition terminal evaluation system comprises a speech playback device for outputting a test speech corpus; a terminal to be tested for recognizing the test speech corpus in different test environments, including a noise test environment, to obtain a recognition result; a noise generating device for generating the noise required for the test; an image collecting device for performing image acquisition on the recognition result to obtain a speech recognition image and transmit it to a control device; and the control device, which is used for converting a test text corpus into the test speech corpus through a speech synthesis method, performing image recognition on the speech recognition image based on a deep learning algorithm to obtain the recognition result, and comparing the recognition result with preset tagged data to obtain a comparison result indicating the speech recognition performance of the terminal to be tested. According to the scheme, automated testing is adopted to support repetitive testing, and the use of functional testing based on deep learning algorithm comparison can reduce labor costs.
Owner:CHINA ACADEMY OF INFORMATION & COMM

Far-field pickup device and method for collecting human voice signals in far-field pickup device

The invention provides a far-field pickup device and a method for acquiring a human voice signal in the far-field pickup device. The far-field pickup device comprises a device main body and a microphone pickup unit which are separated from each other. The microphone pickup unit collects echoes of the user voice and of a sound signal played by the device main body after spatial propagation, and sends the echoes back to the device main body for processing. The device main body comprises a playing signal source, a synchronization signal generator, a horn, a delay determination unit, and an echo cancellation unit. According to the embodiment of the invention, the problem in the prior art that the microphone signal and the echo reference signal cannot be synchronized can be solved, and the speech recognition performance is improved.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Multi-mode speech recognition

Inactive; US20110131040A1; Enhancing and improving speech recognition performance; Improve speech recognition accuracy; Speech recognition; Electric/fluid circuit; In vehicle; Speech recognition performance
A method and an in-vehicle system having a speech recognition component are provided for improving speech recognition performance. The speech recognition component may have multiple vocabulary dictionaries, each of which may include phonetics associated with commands. When the in-vehicle system receives speech input, the speech recognition component may determine whether the received speech input includes a speech access command. If the received speech input is determined to include a speech access command, then a dictionary changing component may transition a currently-used dictionary of the speech recognition component to a vocabulary dictionary associated with the determined speech access command. Otherwise, the dictionary changing component may transition the currently-used dictionary to a first vocabulary dictionary. A command included in the received speech input may then be recognized by the speech recognition component using the transitioned currently-used dictionary.
Owner:HONDA MOTOR CO LTD
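
The dictionary transition logic in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented system: speech input is stood in for by a list of already-decoded word tokens, and the class name `DictionaryChanger`, its fields, and the example vocabularies are hypothetical.

```python
from typing import Dict, List, Optional, Set


class DictionaryChanger:
    """Switches the active vocabulary based on a speech access command."""

    def __init__(
        self,
        dictionaries: Dict[str, Set[str]],  # dictionary name -> commands
        access_commands: Dict[str, str],    # access command -> dictionary name
        first: str,                         # name of the first dictionary
    ) -> None:
        self.dictionaries = dictionaries
        self.access_commands = access_commands
        self.first = first
        self.current = first

    def recognize(self, tokens: List[str]) -> Optional[str]:
        # Transition to the dictionary tied to the access command if one
        # is present; otherwise fall back to the first dictionary.
        target = self.first
        for token in tokens:
            if token in self.access_commands:
                target = self.access_commands[token]
                break
        self.current = target
        # Recognize a command using the transitioned dictionary only.
        vocabulary = self.dictionaries[self.current]
        for token in tokens:
            if token in vocabulary:
                return token
        return None


changer = DictionaryChanger(
    dictionaries={
        "general": {"help", "cancel"},
        "audio": {"play", "pause"},
        "nav": {"home", "route"},
    },
    access_commands={"audio": "audio", "navigation": "nav"},
    first="general",
)
```

With this setup, `changer.recognize(["audio", "play"])` matches against the audio vocabulary, while `changer.recognize(["play"])` falls back to the first dictionary and finds no command. Restricting each utterance to one small vocabulary is what narrows the search space.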

Apparatus and method for improving performance of voice recognition in a portable terminal

An apparatus and method for improving the performance of voice recognition in a portable terminal are provided. The apparatus includes a voice recognition management unit, and a controller. After recognizing a user's voice and extracting at least one voice parameter, the voice recognition management unit determines if the extracted at least one voice parameter meets a criterion for determining one of success and failure of voice recognition. The controller analyzes a result of the determination by the voice recognition management unit and outputs a result of the analysis.
Owner:SAMSUNG ELECTRONICS CO LTD

Accumulating transformations for hierarchical linear regression HMM adaptation

A new iterative hierarchical linear regression method for generating a set of linear transforms to adapt HMM speech models to a new environment for improved speech recognition is disclosed. The method determines a new set of linear transforms at an iterative step by Estimate-Maximize (EM) estimation, and then combines the new set of linear transforms with the prior set of linear transforms to form a new merged set of linear transforms. An iterative step may include realignment of adaptation speech data to the adapted HMM models to further improve speech recognition performance.
Owner:TEXAS INSTR INC

Weight Coefficient Generation Device, Voice Recognition Device, Navigation Device, Vehicle, Weight Coefficient Generation Method, and Weight Coefficient Generation Program

A weight coefficient generation device, a speech recognition device, a navigation system, a vehicle, a weight coefficient generation method, and a weight coefficient generation program are provided for the purpose of improving speech recognition performance for place names. To this end, an address database 12 has address information data items including country names, city names, street names, and house numbers, and manages the address information in a tree structure indicating hierarchical relationships between the place names from a wide area to a narrow area. Each of the place names stored in the address database 12 is taken as a speech recognition candidate. A weight coefficient calculation unit 11 of a weight coefficient generation device 10 calculates a weight coefficient of the likelihood of the aforementioned recognition candidate based on the number of street names belonging to the lower hierarchy below the city names.
Owner:ASAHI KASEI KK

Cross-modal multi-feature fusion audio and video speech recognition method and system

The invention relates to audio and video speech recognition technology, and provides a cross-modal multi-feature fusion audio and video speech recognition method and system, considering that speech interaction is easily affected by complex environmental noise whereas facial motion information acquired through video is relatively stable in an actual robot application environment. According to the method, speech information, visual information and visual motion information are fused through an attention mechanism, and the speech content expressed by a user is acquired more accurately by using the relevance among different modes, so that the speech recognition precision under complex background noise is improved, the speech recognition performance in human-computer interaction is improved, and the problem of low pure-speech recognition accuracy in a noise environment is effectively solved.
Owner:HUNAN UNIV

Method and apparatus for improving spontaneous speech recognition performance

Active; US20180247642A1; Improving spontaneous speech recognition performance; Improve speech recognition performance; Speech recognition; Neural learning methods; Speech rate; Length variation
The present invention relates to a method and apparatus for improving spontaneous speech recognition performance. The present invention is directed to providing a method and apparatus for improving spontaneous speech recognition performance by extracting a phase feature as well as a magnitude feature of a voice signal transformed to the frequency domain, detecting a syllabic nucleus on the basis of a deep neural network using a multi-frame output, determining a speaking rate by dividing the number of syllabic nuclei by a voice section interval detected by a voice detector, calculating a length variation or an overlap factor according to the speaking rate, and performing cepstrum length normalization or time scale modification with a voice length appropriate for an acoustic model.
Owner:ELECTRONICS & TELECOMM RES INST
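
The speaking-rate step in the abstract above (number of syllabic nuclei divided by the detected voice section interval) can be sketched as follows. The nucleus count and voice sections are taken as given inputs here; the `time_scale_factor` helper and its `target_rate` default are hypothetical, since the abstract does not give the mapping from speaking rate to length variation.

```python
from typing import List, Tuple


def speaking_rate(num_nuclei: int, voice_sections: List[Tuple[float, float]]) -> float:
    """Syllabic nuclei per second of detected voice.

    voice_sections: (start, end) times in seconds from a voice detector.
    """
    total = sum(end - start for start, end in voice_sections)
    if total <= 0:
        raise ValueError("voice sections must have positive total duration")
    return num_nuclei / total


def time_scale_factor(rate: float, target_rate: float = 4.0) -> float:
    """Hypothetical stretch factor that brings speech to target_rate.

    A factor above 1 lengthens fast speech; below 1 shortens slow speech.
    """
    if rate <= 0 or target_rate <= 0:
        raise ValueError("rates must be positive")
    return rate / target_rate
```

For example, 12 nuclei over voice sections `[(0.0, 2.0), (3.0, 4.0)]` give 3 seconds of voice and a rate of 4 nuclei per second.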

Method and apparatus for speech recognition

A method and apparatus for enhancing the performance of speech recognition by adaptively changing a process of determining the final, recognized word depending on a user's selection in a list of alternative words represented by a result of speech recognition. A speech recognition method comprising: inputting speech uttered by a user; recognizing the input speech and creating a predetermined number of alternative words to be recognized in the order of similarity; and displaying a list of alternative words arranged in a predetermined order and determining an alternative word that a cursor currently indicates as the final, recognized word if a user's selection from the list of alternative words has not been changed within a predetermined standby time.
Owner:SAMSUNG ELECTRONICS CO LTD

Intelligent voice recognizing method, voice recognizing apparatus, intelligent computing device and server

Provided are an intelligent voice recognition method, a voice recognition device, and an intelligent computing device. A method of intelligently recognizing a voice in a voice recognition device includes obtaining an ambient noise signal from the microphone detection signal when a voice is detected from the microphone detection signal; updating a previously learned noise cancellation model based on the ambient noise; and removing the ambient noise signal from the microphone detection signal. Therefore, a voice recognition performance of the voice recognition device can be maximized. At least one of the voice recognition device, the intelligent computing device and the server of the present disclosure may be associated with an Artificial Intelligence module, a drone (Unmanned Aerial Vehicle, UAV), robot, Augmented Reality (AR) device, virtual reality (VR) device and a device related to the 5G service.
Owner:LG ELECTRONICS INC

Miniature stylish noise and wind canceling microphone housing, providing enhanced speech recognition performance for wireless headsets

A miniature microphone enclosure is provided for use with a wireless headset. The microphone enclosure provides superior noise and wind cancellation without any circuitry. Two sound ports and wind suppression material help minimize any noise attributable to wind or air. Moreover the microphone enclosure has a directivity such that the microphone does not have to be placed directly in front of a user's mouth.
Owner:ANDREA DOUGLAS +2