
Testing and tuning of speech recognition systems using synthetic inputs

A technology for speech recognition and speech recognition testing, applied in speech recognition, speech analysis, speech synthesis, etc., to solve problems such as inability to recognize individual words, language model errors, and inability to recognize

Inactive Publication Date: 2006-04-19
MICROSOFT CORP
1 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

Errors introduced by the language model arise when a speech recognition system can recognize individual words when they stand alone, but not in the context in which those words appear in the test.
For example, if the language model can recognize "to hose" alone, but not "want to hose" (for example, the system recognizes the input as "want to host"), this is a language model error.
In a second example of this error, the language model correctly recognizes "July25th", but not "July 25th."
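
The distinction above can be sketched as a small check. This is only an illustration under stated assumptions, not the method of the patent: `recognize` is a hypothetical stand-in for a recognizer (reference text in, output text out), and `stub_recognize` is a lookup table that merely reproduces the "want to hose" example.

```python
from typing import Callable


def contains_words(haystack: str, needle: str) -> bool:
    """True if the words of `needle` appear contiguously in `haystack`."""
    h, n = haystack.split(), needle.split()
    return any(h[i:i + len(n)] == n for i in range(len(h) - len(n) + 1))


def is_language_model_error(recognize: Callable[[str], str],
                            phrase: str, context: str) -> bool:
    """Flag the case where `phrase` is recognized alone but lost in `context`.

    `recognize` is a hypothetical stand-in for a speech recognizer.
    """
    recognized_alone = recognize(phrase) == phrase
    recognized_in_context = contains_words(recognize(context), phrase)
    return recognized_alone and not recognized_in_context


# Stub behaviour assumed for illustration only.
_STUB_OUTPUT = {"to hose": "to hose", "want to hose": "want to host"}


def stub_recognize(text: str) -> str:
    """Return the canned output if one exists, else echo the input."""
    return _STUB_OUTPUT.get(text, text)


print(is_language_model_error(stub_recognize, "to hose", "want to hose"))
```

With the stub above, the phrase is recognized alone but is output as "want to host" in context, so the check reports a language model error.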



Examples


Embodiment Construction

[0025] The invention relates to testing or tuning speech recognizers based on individually generated feature vectors. Before describing the invention in more detail, one exemplary environment in which the invention can be used will be described.

[0026] Figure 1 illustrates one example of a computing system environment 100 suitable for implementing the present invention. Computing system environment 100 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing environment 100 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment 100.

[0027] The invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Well-known computing systems, environments and/or configurations suitable ...


Abstract

A system and method for testing a speech recognition system by providing pronunciations to the speech recognizer. First, a text document is provided to the system and converted into a sequence of phonemes representative of the words in the text. The phonemes are then converted to model units, such as Hidden Markov Models. From the models, a probability is obtained for each model or state, and feature vectors are determined. For each model, the feature vector matching the most probable vector for each state is selected. These ideal feature vectors are provided to the speech recognizer and processed. The end result is compared with the original text, and modifications to the system can be made based on the output text.
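
The flow in the abstract can be sketched end to end with a toy model. Everything below is an assumption made for illustration and is not taken from the patent: the lexicon, the three-states-per-phoneme layout, the single Gaussian per state (so the most probable vector of a state is its mean), and the nearest-mean `recognize` function that stands in for a real speech recognizer.

```python
import numpy as np

# Hypothetical toy lexicon: word -> phoneme sequence.
LEXICON = {
    "want": ["w", "aa", "n", "t"],
    "to": ["t", "uw"],
    "hose": ["hh", "ow", "z"],
}
STATES_PER_PHONEME = 3  # assumed model topology
DIM = 4                 # assumed feature dimension


def build_acoustic_model(lexicon, seed=0):
    """Give each phoneme a block of Gaussian state means (random, for the demo)."""
    rng = np.random.default_rng(seed)
    phonemes = sorted({p for phones in lexicon.values() for p in phones})
    return {p: rng.normal(size=(STATES_PER_PHONEME, DIM)) for p in phonemes}


def text_to_phonemes(text, lexicon):
    """Convert text into the phoneme sequence of its words."""
    return [p for word in text.split() for p in lexicon[word]]


def ideal_feature_vectors(phonemes, model):
    """Pick the most probable vector per state; for a Gaussian that is the mean."""
    return np.concatenate([model[p] for p in phonemes], axis=0)


def recognize(features, model, lexicon):
    """Toy stand-in recognizer: nearest-mean phonemes, then longest lexicon match."""
    phonemes = []
    for start in range(0, len(features), STATES_PER_PHONEME):
        chunk = features[start:start + STATES_PER_PHONEME]
        phonemes.append(
            min(model, key=lambda p: float(np.sum((model[p] - chunk) ** 2))))
    words, i = [], 0
    while i < len(phonemes):
        candidates = [w for w, ph in lexicon.items()
                      if phonemes[i:i + len(ph)] == ph]
        if not candidates:
            words.append("<unk>")
            i += 1
            continue
        best = max(candidates, key=lambda w: len(lexicon[w]))
        words.append(best)
        i += len(lexicon[best])
    return " ".join(words)


def run_synthetic_test(text, model, lexicon):
    """Feed ideal vectors to the recognizer and compare output with the input text."""
    features = ideal_feature_vectors(text_to_phonemes(text, lexicon), model)
    output = recognize(features, model, lexicon)
    return output, output == text


model = build_acoustic_model(LEXICON)
print(run_synthetic_test("want to hose", model, LEXICON))
```

Because the synthetic vectors come straight from the acoustic model, a mismatch between the output and the input text in a real system would point away from the acoustic front end and toward other components, which is the kind of diagnosis the abstract describes.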

Description

Technical field

[0001] The present invention relates to speech recognition, and more particularly to testing and tuning of speech recognizers.

Background

[0002] First, the basic processing used in a speech recognition system will be described. In a speech recognition system, an input speech signal is converted into words representing the spoken content of the speech signal. The conversion begins by converting the analog speech signal into a series of digital values. The digital values then pass through a feature extraction unit, which computes a sequence of feature vectors based on the digital values. Each feature vector is typically multidimensional and represents a single frame of the speech signal.

[0003] To identify the most likely sequence of words, the feature vectors are applied to one or more models that have been trained using training text. Typically, this involves applying the feature vectors to a frame-based acoustic model, where a single frame...
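
The front end described in [0002] (digital values in, one feature vector per frame out) can be sketched as follows. The frame length, hop size, and the two features used here (log energy and zero-crossing count) are assumptions chosen to keep the example short; the text does not say which features the feature extraction unit computes.

```python
import numpy as np


def frame_signal(samples, frame_len, hop):
    """Split a 1-D array of digital values into overlapping frames."""
    if len(samples) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(samples) - frame_len) // hop
    return np.stack([samples[i * hop:i * hop + frame_len]
                     for i in range(n_frames)])


def extract_features(samples, frame_len=400, hop=160):
    """Return one feature vector per frame: [log energy, zero-crossing count]."""
    frames = frame_signal(samples, frame_len, hop)
    log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    zero_crossings = np.sum(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.column_stack([log_energy, zero_crossings])


# One second of a 440 Hz tone at an assumed 16 kHz sampling rate.
t = np.arange(16000) / 16000.0
samples = np.sin(2 * np.pi * 440 * t)
features = extract_features(samples)
print(features.shape)  # (number of frames, features per frame)
```

Each row of `features` plays the role of one multidimensional feature vector for a single frame of the signal.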


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/00, G10L15/06, G10L15/26
CPC: G10L15/01, G10L13/08, G10L15/142
Inventor: R·洛佩斯-巴基利亚
Owner: MICROSOFT CORP