Embodiments of the invention provide a voice recognition method, device, system and terminal. The method comprises: receiving voice to be recognized; performing feature extraction on the voice to be recognized to obtain feature information; and inputting the feature information to a weighted finite-state transducer (WFST) for recognition. The WFST is created in advance by combining an acoustic model, a pronunciation dictionary and a language model. In the acoustic model, the first-language phonemes and the second-language phonemes are placed in correspondence with one another, and in the pronunciation dictionary every first-language word is transcribed using second-language phonemes. Applying this scheme can improve voice recognition accuracy.
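The dictionary step described above can be illustrated with a minimal sketch. It assumes the phoneme correspondence is a simple one-to-one lookup table. The phoneme symbols, the example words and the names `PHONEME_MAP`, `FIRST_LANGUAGE_LEXICON` and `transcribe_with_second_language` are all hypothetical and do not come from the source. The sketch only rewrites dictionary entries; it does not build or compose a WFST.

```python
# Hypothetical correspondence between first-language and second-language
# phonemes. The symbols are placeholders, not taken from the source.
PHONEME_MAP = {
    "HH": "h",
    "AH": "a",
    "L": "l",
    "OW": "ou",
}

# Hypothetical first-language entries, written in first-language phonemes.
FIRST_LANGUAGE_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "low": ["L", "OW"],
}


def transcribe_with_second_language(lexicon, phoneme_map):
    """Return a lexicon whose entries use second-language phonemes only."""
    converted = {}
    for word, phonemes in lexicon.items():
        # Refuse to emit an entry if any phoneme lacks a counterpart.
        missing = [p for p in phonemes if p not in phoneme_map]
        if missing:
            raise KeyError(f"no second-language phoneme for {missing} in {word!r}")
        converted[word] = [phoneme_map[p] for p in phonemes]
    return converted


unified_lexicon = transcribe_with_second_language(FIRST_LANGUAGE_LEXICON, PHONEME_MAP)
```

Under these assumptions, every first-language word ends up expressed in the same phoneme inventory as the second language, so a single acoustic model could score both. A real correspondence may be one-to-many or context dependent, which this lookup table does not capture.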