The invention provides a voiceprint modeling method and device. Based on actual application scenarios, it provides an automatic voiceprint modeling framework for multi-speaker conversational speech. Building on client-side and server-side implementations, which include presetting the number of speakers and pre-collecting voice data from a reference speaker, the framework uses this prior information to constrain the problem, so that the separation and modeling of mixed multi-speaker speech are handled more effectively. Hardware requirements are low. Collection is completed on the client side and processing on the server side, so no extra collection equipment is needed and distributed deployment is supported. Time-consuming and labor-intensive manual editing in audio editing software is avoided: voiceprint registration is completed automatically throughout the whole process without manual intervention, which effectively improves working efficiency.
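
As an illustration only, the following Python sketch shows how the two priors named in the abstract (the preset number of speakers and the pre-collected reference speaker data) could constrain the separation step on the server side. The `EnrollmentRequest` type, the `separate_and_model` function, the per-segment feature vectors, and the use of k-means clustering are all assumptions made for this sketch; the abstract does not specify the separation or modeling algorithm.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Vector = List[float]


@dataclass
class EnrollmentRequest:
    """Hypothetical payload sent from the client side to the server side."""
    segments: List[Vector]              # one feature vector per speech segment (assumed)
    num_speakers: int                   # prior: preset number of speakers
    reference: Optional[Vector] = None  # prior: features of the reference speaker


def _dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def separate_and_model(req: EnrollmentRequest, iterations: int = 20) -> Dict[str, Vector]:
    """Cluster segments into num_speakers groups and return one centroid per speaker.

    k-means is a stand-in here, not the method of the invention. The priors
    constrain it in two ways: the cluster count is fixed, and the first
    centroid is seeded from the reference speaker when one is supplied.
    """
    k = req.num_speakers
    if k < 1 or len(req.segments) < k:
        raise ValueError("need at least one segment per speaker")

    # Seed the first centroid from the reference speaker if available.
    first = req.reference if req.reference is not None else req.segments[0]
    centroids = [list(first)]
    # Farthest-point seeding for the remaining speakers.
    while len(centroids) < k:
        farthest = max(req.segments,
                       key=lambda s: min(_dist2(s, c) for c in centroids))
        centroids.append(list(farthest))

    for _ in range(iterations):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for seg in req.segments:
            nearest = min(range(k), key=lambda i: _dist2(seg, centroids[i]))
            groups[nearest].append(seg)
        # Keep the old centroid when a cluster receives no segments.
        updated = [_mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated

    labels = ["reference" if i == 0 and req.reference is not None else f"speaker_{i}"
              for i in range(k)]
    return dict(zip(labels, centroids))
```

In this sketch each returned centroid plays the role of a registered voiceprint. A caller would build an `EnrollmentRequest` from the client upload and pass it to `separate_and_model`; a real system would replace the centroids with whatever speaker model the server side actually trains.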