The embodiment of the invention provides a modeling method for language identification, which comprises the following steps: inputting voice data; preprocessing the voice data to obtain a feature sequence; mapping the feature vectors to form a supervector; performing projection compensation on the supervector; and training a language model with a support vector machine algorithm. The same steps are then applied to the voice under test to obtain a supervector under test, projection compensation is performed on that supervector, the supervector is scored using the language model, and the language type of the voice under test is identified. The embodiment of the invention also provides a modeling device for
language identification, which comprises a voice preprocessing module, a feature extraction module, a multi-coordinate-system origin selection module, a feature vector mapping module, a subspace extraction module, a subspace projection compensation module, a training module and an identification module. With the method and the device provided by the embodiment of the invention, information in the high-dimensional statistics that does not contribute to identification is removed, the accuracy of language identification is improved, and the computational complexity on an integrated circuit is reduced.
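
The projection compensation and scoring steps can be illustrated with a minimal sketch. The abstract does not specify the form of the projection, so the sketch assumes an orthogonal projection that removes the component of the supervector lying in a "nuisance" subspace, and it assumes linear models given as a weight vector and bias per language (support vector machine training itself is omitted). All function names, the basis and the model values are hypothetical and are not taken from the embodiment.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(directions, eps=1e-12):
    """Gram-Schmidt: turn nuisance directions into an orthonormal basis."""
    basis = []
    for d in directions:
        r = list(d)
        for u in basis:
            c = dot(r, u)
            r = [ri - c * ui for ri, ui in zip(r, u)]
        norm = math.sqrt(dot(r, r))
        if norm > eps:  # skip directions already spanned by the basis
            basis.append([ri / norm for ri in r])
    return basis


def compensate(supervector, basis):
    """Assumed compensation: subtract the part lying in the nuisance subspace."""
    out = list(supervector)
    for u in basis:
        c = dot(supervector, u)
        out = [oi - c * ui for oi, ui in zip(out, u)]
    return out


def identify(supervector, basis, models):
    """Score the compensated supervector against each linear model (w, b)."""
    v = compensate(supervector, basis)
    scores = {lang: dot(w, v) + b for lang, (w, b) in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy data: one nuisance direction and two language models.
basis = orthonormalize([[1.0, 1.0, 0.0, 0.0]])
models = {
    "lang_a": ([0.0, 0.0, 1.0, 0.0], 0.0),
    "lang_b": ([0.0, 0.0, 0.0, 1.0], 0.0),
}
label, scores = identify([5.0, 5.0, 0.2, 0.9], basis, models)
```

In this toy case the large component along the nuisance direction is removed before scoring, so the decision depends only on the remaining coordinates. The same compensation would be applied to training supervectors before the models are trained.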