A system for substantially automating transcription services for multiple voice users includes a manual transcription station, a speech recognition program and a routing program. The
system establishes a profile for each of the voice users containing a training status, which is selected from the group of enrollment, training, automated and stop automation. When the system receives a voice dictation file from a current voice user, it routes the file, based on that user's training status, to the manual transcription station and the
speech recognition program. A human transcriptionist creates a transcribed file for each received voice dictation file. The
speech recognition program automatically creates a written text for each received voice dictation file if the training status of the current user is training or automated. If the training status of the current user is enrollment or training, a verbatim file is manually established, and the speech recognition program is trained with an acoustic model for the current user using the verbatim file and the voice dictation file. The transcribed file is returned to the current user if the training status is enrollment or training, or the written text is returned if the training status is automated. An apparatus and method are also disclosed for simplifying the manual establishment of the verbatim file. A method for substantially automating transcription services is also disclosed.
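
The status-dependent behavior described above can be summarized as a small decision table. The following is only an illustrative sketch, not the disclosed implementation: the names `TrainingStatus`, `RoutingPlan` and `plan_for` are hypothetical. Two points are assumptions rather than statements from the text: that a file goes to the manual transcription station whenever the transcribed file is what gets returned, and that stop automation falls back to manual transcription only.

```python
from dataclasses import dataclass
from enum import Enum


class TrainingStatus(Enum):
    """The four training statuses named in the abstract."""
    ENROLLMENT = "enrollment"
    TRAINING = "training"
    AUTOMATED = "automated"
    STOP_AUTOMATION = "stop automation"


@dataclass(frozen=True)
class RoutingPlan:
    """Hypothetical summary of what happens to one voice dictation file."""
    to_manual_station: bool        # human transcriptionist creates a transcribed file
    to_speech_recognition: bool    # program creates a written text
    make_verbatim_and_train: bool  # verbatim file established, acoustic model trained
    returned: str                  # "transcribed file" or "written text"


def plan_for(status: TrainingStatus) -> RoutingPlan:
    """Map a user's training status to a routing plan."""
    s = TrainingStatus
    # Stated in the abstract: verbatim file and training for enrollment or training.
    learning = status in (s.ENROLLMENT, s.TRAINING)
    # Stated in the abstract: written text is created for training or automated.
    recognize = status in (s.TRAINING, s.AUTOMATED)
    # Assumption: manual transcription whenever the transcribed file is returned,
    # and also for stop automation, which the abstract does not spell out.
    manual = learning or status is s.STOP_AUTOMATION
    # Stated in the abstract: written text is returned only when automated.
    returned = "written text" if status is s.AUTOMATED else "transcribed file"
    return RoutingPlan(manual, recognize, learning, returned)
```

For example, `plan_for(TrainingStatus.TRAINING)` sends the file to both the manual transcription station and the speech recognition program, but still returns the human-made transcribed file to the user.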