The invention discloses an automatic voice data labeling method and system for voice recognition, and relates in particular to the field of voice recognition. The
system comprises a mute detection module, a volume screening module, a length screening module, a voice recognition module, a recognition result judgment module and a manual proofreading module. The mute detection module splits each voice recording into a plurality of voice segments through a mute detection algorithm, and the volume screening module uses a volume threshold to keep the voices that meet the requirements and remove those that do not. The invention discloses a combined
system of multiple modules. According to the system, speech preprocessing and speech recognition are carried out in a public cloud mode, recognition result judgment and manual proofreading are carried out, and the voice data annotation is constructed. After multiple iterations of these processes, a new corpus is continuously trained and high-quality corpus data is obtained, so that manpower is reduced, the voice data annotation quality is improved, and the problems of a long manual annotation period, high cost and low efficiency are solved.
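
The preprocessing stages named above (mute detection, volume screening, length screening) can be illustrated with a minimal sketch. The abstract does not specify the algorithms, so this example assumes a simple energy-based approach: a frame counts as silent when its RMS level falls below a threshold, and a recording is cut wherever enough silent frames occur in a row. All function names, the frame size and every threshold value are hypothetical choices made for illustration, not values taken from the invention.

```python
import math
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (start_sample, end_sample), end exclusive


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def split_on_silence(samples: Sequence[float], frame_len: int = 160,
                     silence_rms: float = 0.01,
                     min_silence_frames: int = 3) -> List[Segment]:
    """Assumed mute detection: cut at runs of >= min_silence_frames quiet frames."""
    levels = [rms(samples[i:i + frame_len])
              for i in range(0, len(samples), frame_len)]
    segments: List[Segment] = []
    seg_start = None  # frame index where the current segment began
    quiet_run = 0     # consecutive quiet frames seen inside the segment
    for i, level in enumerate(levels):
        if level >= silence_rms:
            if seg_start is None:
                seg_start = i
            quiet_run = 0
        elif seg_start is not None:
            quiet_run += 1
            if quiet_run >= min_silence_frames:
                end_frame = i - quiet_run + 1  # first frame of the quiet run
                segments.append((seg_start * frame_len, end_frame * frame_len))
                seg_start, quiet_run = None, 0
    if seg_start is not None:
        # Close the last segment, dropping any trailing quiet frames.
        end_frame = len(levels) - quiet_run
        segments.append((seg_start * frame_len,
                         min(end_frame * frame_len, len(samples))))
    return segments


def screen_segments(samples: Sequence[float], segments: Sequence[Segment],
                    min_rms: float = 0.05, min_len: int = 800,
                    max_len: int = 160000) -> List[Segment]:
    """Assumed volume and length screening: keep segments passing both checks."""
    kept = []
    for start, end in segments:
        segment = samples[start:end]
        loud_enough = rms(segment) >= min_rms
        length_ok = min_len <= len(segment) <= max_len
        if loud_enough and length_ok:
            kept.append((start, end))
    return kept
```

In a pipeline of the kind described, only the segments returned by `screen_segments` would be passed on to the voice recognition module; the judgment and manual proofreading stages are not sketched here because the abstract gives no detail about them.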