Speech voiceprint modeling method and device

A voiceprint modeling method in the field of speech analysis. It addresses the time-consuming, labor-intensive manual editing of multi-speaker recordings, has low hardware requirements, and meets the need to separate and model individual speakers from multi-person speech.

Active Publication Date: 2018-09-28
北京远鉴信息技术有限公司
Cites: 5 · Cited by: 10

AI Technical Summary

Problems solved by technology

However, this manual method is not only time-consuming and labor-intensive; it also becomes infeasible with manpower alone when the number of target speakers grows rapidly and model training must be completed quickly.

Examples

Embodiment 1

[0050] Referring to Figure 1, which shows a flowchart of a voiceprint modeling method: the method is applied to the client and specifically includes the following steps:

[0051] S101. Receive the request information input by the user, and transmit the request information to the server, so as to trigger the server to verify the request information;

[0052] Specifically, the user submits a collection request through the client, and the server checks the user ID and the validity of the parameters. Automatically estimating the number of speakers in a multi-person conversation is a difficult point in speech separation. The present invention draws on the actual application scenario and lets the user enter the actual number of people participating in the conversation, so that speech segmentation and clustering can be solved as a more tightly constrained problem;

[0053] S102. Receive a verification result of the request information transmitted by the server;

[0054] S103. When the verification...
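To illustrate why a user-supplied speaker count simplifies clustering, the following is a minimal Python sketch, not the patented algorithm. It assumes each speech segment has already been turned into a fixed-length embedding vector, and it uses average-linkage agglomerative clustering with cosine distance as a stand-in technique; the source does not say which clustering method is used. The names `cosine_distance`, `cluster_segments`, and `num_speakers` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def cluster_segments(embeddings, num_speakers):
    """Group segment embeddings into exactly `num_speakers` clusters.

    Assumption: `num_speakers` is the count entered by the user, so no
    stopping threshold has to be estimated from the audio itself.
    Returns one integer cluster label per segment.
    """
    if not 1 <= num_speakers <= len(embeddings):
        raise ValueError("num_speakers must be between 1 and the number of segments")

    # Start with one cluster per segment and merge the closest pair
    # (average linkage) until the requested number of clusters remains.
    clusters = [[i] for i in range(len(embeddings))]
    while len(clusters) > num_speakers:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    cosine_distance(embeddings[p], embeddings[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for member in members:
            labels[member] = label
    return labels


# Five toy two-dimensional "embeddings" from a two-person conversation.
segments = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [1.0, 0.0]]
print(cluster_segments(segments, num_speakers=2))
```

Because the merge loop stops at a known cluster count, the only remaining question is which segments belong together, which is the narrower problem the paragraph above describes.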

Embodiment 2

[0057] Referring to Figure 2, which shows a flowchart of voiceprint modeling: the method builds on the voiceprint modeling method provided in Embodiment 1, is applied to a server, and specifically includes the following steps:

[0058] S201. Receive the request information sent by the client, verify the request information, and transmit the verification result to the client;

[0059] After the server responds to the registration request, the display device on the client asks whether to collect the voice data of the reference person in advance. In practical applications, the chat host or conference host is relatively fixed and their voiceprint is usually not of interest, so their speech can be configured to be removed as invalid information. If the voice of the reference person is not collected in advance, all speakers participating in the conversation are treated as persons of interest;

[0060] S202. When the verification result is valid, receive the original voice data...
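The reference-person option in [0059] can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the patent's implementation: it assumes each separated speaker is represented by one embedding vector, that the reference person (for example the host) is matched by cosine similarity, and that a similarity `threshold` of 0.8 decides whether a match counts. The function names and the threshold value are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_target_speakers(speaker_embeddings, reference_embedding=None, threshold=0.8):
    """Return the labels of speakers whose voiceprints should be modeled.

    speaker_embeddings: dict mapping a speaker label to its embedding.
    reference_embedding: embedding of the pre-collected reference person,
        or None when no reference voice was collected.
    threshold: hypothetical minimum similarity for a match.
    """
    # No reference voice collected: every speaker is a person of interest.
    if reference_embedding is None:
        return sorted(speaker_embeddings)

    scores = {
        label: cosine_similarity(vector, reference_embedding)
        for label, vector in speaker_embeddings.items()
    }
    best_label = max(scores, key=scores.get)

    # Drop only the single best match, and only if it is similar enough.
    if scores[best_label] < threshold:
        return sorted(speaker_embeddings)
    return sorted(label for label in speaker_embeddings if label != best_label)


speakers = {"spk0": [1.0, 0.0], "spk1": [0.0, 1.0], "spk2": [0.7, 0.7]}
host = [0.95, 0.05]
print(select_target_speakers(speakers, host))  # host matches spk0, which is dropped
print(select_target_speakers(speakers))        # no reference: all speakers kept
```

Removing at most one cluster reflects the assumption that a single reference person is collected per session; a deployment with several reference persons would need a different matching rule.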

Embodiment 3

[0087] Corresponding to the speech voiceprint modeling method provided in Embodiment 1, this embodiment of the present invention provides a voiceprint modeling device. Referring to Figure 3, which shows a structural block diagram of the voiceprint modeling device: the device is applied to the client and includes the following parts:

[0088] The input module 31 is configured to receive request information input by the user, and transmit the request information to the server, so as to trigger the server to verify the request information;

[0089] A receiving module 32, configured to receive a verification result of the request information transmitted by the server;

[0090] The acquisition module 33 is configured to collect original voice data when the verification result is valid and an instruction to collect voice is received from the user, and to transmit the original voice data to the server, so that the server can process the original voice data.
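The three client modules described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the device itself: the server is replaced by an in-memory `StubServer`, the verification rule (known user ID and at least two speakers) is an assumption standing in for the unspecified "user ID and parameter validity" checks, and all class, method, and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user_id: str
    num_speakers: int  # number of conversation participants entered by the user


class StubServer:
    """In-memory stand-in for the server; its rules are assumptions."""

    def __init__(self, known_users):
        self.known_users = set(known_users)
        self.received = []

    def verify(self, request):
        # Assumed checks: the user ID is registered and the count is plausible.
        return request.user_id in self.known_users and request.num_speakers >= 2

    def process(self, user_id, audio):
        # A real server would separate speakers and train models here.
        self.received.append((user_id, audio))


class VoiceprintClient:
    def __init__(self, server):
        self.server = server
        self.request = None
        self.verified = False

    def submit_request(self, user_id, num_speakers):
        """Input module 31: send the request so the server verifies it."""
        self.request = Request(user_id, num_speakers)
        self.receive_result(self.server.verify(self.request))

    def receive_result(self, result):
        """Receiving module 32: store the verification result."""
        self.verified = bool(result)

    def collect(self, record_audio, user_confirmed):
        """Acquisition module 33: record and upload only when allowed."""
        if not (self.verified and user_confirmed):
            return False
        audio = record_audio()  # hypothetical callable returning raw audio bytes
        self.server.process(self.request.user_id, audio)
        return True


server = StubServer(known_users={"alice"})
client = VoiceprintClient(server)
client.submit_request("alice", num_speakers=3)
uploaded = client.collect(lambda: b"\x00\x01", user_confirmed=True)
print(uploaded, server.received)
```

Keeping the recording step behind both the verification result and the user's explicit instruction mirrors the two conditions stated for the acquisition module.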

[0091] The emb...

Abstract

The invention provides a speech voiceprint modeling method and device. Drawing on actual application scenarios, it provides an automatic voiceprint modeling framework for multi-person conversational speech. Based on client-side and server-side implementations that include presetting the number of speakers and pre-collecting the reference person's voice data, prior information is used to constrain the problem, so that the needs of separating and modeling multi-person mixed speech are met more effectively. Hardware requirements are low. Collection is completed by the client and processing by the server, so no extra collection equipment is needed and distributed deployment is supported. Time-consuming, labor-intensive manual editing in audio editing software is avoided, and voiceprint registration is completed automatically end to end in situations that manpower alone cannot handle, which effectively improves working efficiency.

Description

Technical field

[0001] The present invention relates to the technical field of speech processing, and in particular to a speech voiceprint modeling method and device.

Background

[0002] Voiceprint recognition, also known as speaker recognition, identifies a speaker through the parameters in the speech waveform that reflect the speaker's physiological and behavioral characteristics. It offers high security and convenient data collection.

[0003] The application scenarios targeted by this patent involve conversational speech of two or more people, such as synchronized recordings of interview transcripts and conference conversations. In multi-person conversation, the difficulty of applying voiceprint technology is how to separate multiple single-speaker voices from the mixed multi-speaker speech, especially when training the voiceprint model: a given multi-person recording needs to be separated into multiple single-per...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L17/04
CPC: G10L17/04
Inventors: 郑榕, 王黎明
Owner: 北京远鉴信息技术有限公司