Apparatus for generating accurate speech transcription from natural speech, comprising:
a data storage for storing a plurality of audio data items, each of which is a recitation of text by a specific speaker;
a plurality of ASR modules, each of which is trained to optimally create a unique acoustic/linguistic model according to the spectral components contained in said audio data items, each audio data item being analyzed and represented by an ASR module;
a memory for storing all unique acoustic/linguistic models;
a controller, adapted to:
receive natural speech audio signals and divide each natural speech audio signal into equal segments of a predetermined time;
adjust the length of each segment, such that each segment will contain one or more complete words;
distribute said segments to all ASR modules and activate each ASR module to generate a transcription of the words in each segment according to the level of matching to its unique acoustic/linguistic model;
calculate, for each given word in a segment, a confidence measure being the probability that said given word is correct;
for each segment and for each ASR module, calculate the average confidence of the transcription;
obtain the confidence for each word in the segment and calculate a mean confidence value of said word;
for each segment, decide which transcription is the most accurate by choosing only the ASR module with the highest average confidence, from all chosen ASR modules for said segment; and
create the transcription of said audio signal by combining all transcriptions resulting from the decisions made for each segment.
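
The per-segment decision recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes each ASR module has already returned, for each segment, a list of words with per-word confidence values. The names `Word`, `average_confidence`, `best_for_segment`, `combine`, and the module labels `asr_a` and `asr_b` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Word:
    text: str
    confidence: float  # assumed probability in [0, 1] that the word is correct


def average_confidence(words):
    """Average per-word confidence of one module's transcription of a segment."""
    return mean(w.confidence for w in words) if words else 0.0


def best_for_segment(hypotheses):
    """Pick the (module, words) pair with the highest average confidence.

    `hypotheses` maps a hypothetical module name to that module's word list.
    """
    return max(hypotheses.items(), key=lambda kv: average_confidence(kv[1]))


def combine(segments):
    """Concatenate the winning transcription of every segment, in order."""
    parts = []
    for hypotheses in segments:
        _module, words = best_for_segment(hypotheses)
        parts.append(" ".join(w.text for w in words))
    return " ".join(parts)


# Illustrative data only: two segments, each transcribed by two modules.
segments = [
    {
        "asr_a": [Word("hello", 0.9), Word("world", 0.8)],
        "asr_b": [Word("hollow", 0.6), Word("world", 0.9)],
    },
    {
        "asr_a": [Word("good", 0.5), Word("mourning", 0.4)],
        "asr_b": [Word("good", 0.9), Word("morning", 0.95)],
    },
]
transcript = combine(segments)
print(transcript)
```

In this sketch the first segment is taken from `asr_a` and the second from `asr_b`, since those modules have the higher average confidence for the respective segments. Segmentation of the audio and adjustment of segment boundaries to whole words are outside the scope of the sketch.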