A pronunciation training system extracts pronunciation features from various pronunciation samples, links the pronunciation features with corresponding muscle movements and diagram representations, displays related waveforms and pronunciation processes, and marks the differences between different waveforms and different pronunciation processes to help a user distinguish different sounds. First, the system collects pronunciation samples from people, categorizes these samples, analyzes them in the time domain and in the frequency domain, identifies the positions and movements of pronunciation organs, provides interfaces for experts to define pronunciation features, extracts and compares pronunciation features, and builds links between pronunciation features and pronunciation processes. Then, the system collects pronunciation samples from a user, analyzes the samples, extracts pronunciation features from them, regenerates the pronunciation process, and displays related waveforms to enhance the user's awareness of different sounds. The system can further increase the user's awareness of how a sound relates to a pronunciation feature and to the muscle movements of a pronunciation organ by providing interfaces for the user to create different sounds by modifying the loudness, tone, duration, and pace of existing sounds, by modifying features in the time domain or frequency domain, and by modifying the muscle movements of related pronunciation organs.
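
The analysis and comparison steps described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the function names (`synth_tone`, `extract_features`, `mark_differences`), the choice of features (duration, RMS loudness, dominant frequency), the 10% tolerance, and the use of synthetic sine tones in place of recorded pronunciation samples are all assumptions made for illustration. The sketch extracts one time-domain and one frequency-domain feature from a reference sample and a user sample, then marks the features that differ.

```python
import cmath
import math


def synth_tone(freq_hz, duration_s, sample_rate=8000, amplitude=1.0):
    """Hypothetical stand-in for a recorded sample: a pure sine tone."""
    n = round(duration_s * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def dominant_frequency(samples, sample_rate):
    """Frequency-domain feature: strongest bin of a naive O(n^2) DFT."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        coeff = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


def extract_features(samples, sample_rate):
    """Collect assumed time-domain and frequency-domain features."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return {
        "duration_s": len(samples) / sample_rate,  # time domain
        "rms": rms,                                # time domain (loudness)
        "dominant_hz": dominant_frequency(samples, sample_rate),
    }


def mark_differences(reference, user, tolerance=0.1):
    """Return features whose relative difference exceeds the tolerance."""
    marked = {}
    for name, ref_value in reference.items():
        scale = max(abs(ref_value), 1e-12)  # avoid division by zero
        if abs(user[name] - ref_value) / scale > tolerance:
            marked[name] = (ref_value, user[name])
    return marked


if __name__ == "__main__":
    rate = 8000
    reference = extract_features(synth_tone(200, 0.05, rate), rate)
    attempt = extract_features(synth_tone(300, 0.05, rate), rate)
    for name, (ref_value, user_value) in mark_differences(reference, attempt).items():
        print(f"{name}: reference={ref_value:.2f}, user={user_value:.2f}")
```

A real system would presumably replace the naive DFT with an FFT and use richer features defined by experts; the sketch only shows how marked differences could be derived from per-sample feature sets.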