Voiceprint open-set identification method with unknown category internal division capability

A recognition method, applied in speech analysis and related instruments, that addresses problems such as the inability to determine the number of speakers in advance and the resulting inability to use K-Means. It improves the clustering effect, and its preprocessing removes the effect of the vocal cords and lips, eliminates DC drift, and suppresses random noise.

Inactive Publication Date: 2021-04-30

10TH RES INST OF CETC

5 Cites, 6 Cited by

Problems solved by technology

Although these methods identify unknown classes well, they cannot classify known classes.

Furthermore, for speech data of an unknown class, the number of speakers must be determined and each piece of speech data attributed to its speaker, which is an unsupervised online clustering problem.

Because the number of speakers cannot be determined in advance and the clustering must run in real time, methods such as K-Means and LDA cannot be used.

Traditional approaches such as hierarchical clustering, spectral clustering, and GMM are unsupervised clustering processes, so it is difficult to improve their clustering effect through training.
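To illustrate the online clustering setting described above, where the number of speakers is not fixed in advance and each utterance must be assigned as it arrives, here is a minimal sketch. It is not the method of the invention (which trains a neural network for this step); it is a plain threshold-based incremental clustering on cosine similarity. The function name `online_cluster` and the `threshold` value are assumptions made for illustration.

```python
import numpy as np

def online_cluster(embeddings, threshold=0.7):
    """Assign each embedding to a cluster as it arrives.

    Illustrative only: a new cluster is opened whenever no existing
    centroid has cosine similarity >= threshold (a hypothetical value).
    Returns the per-item labels and the number of clusters found.
    """
    centroid_sums = []  # running sum of unit vectors, one per cluster
    labels = []
    for e in embeddings:
        e = np.asarray(e, dtype=float)
        e = e / np.linalg.norm(e)
        best, best_sim = -1, -np.inf
        for i, c in enumerate(centroid_sums):
            sim = float(c @ e / np.linalg.norm(c))  # cosine similarity
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            centroid_sums[best] = centroid_sums[best] + e  # update centroid
            labels.append(best)
        else:
            centroid_sums.append(e.copy())  # open a new cluster (new speaker)
            labels.append(len(centroid_sums) - 1)
    return labels, len(centroid_sums)

# Five toy 2-D "embeddings" arriving one at a time.
stream = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.95], [1, 0.05]]
labels, n_speakers = online_cluster(stream)
print(labels, n_speakers)
```

Unlike K-Means, the number of clusters is an output rather than an input, and each item is processed once. Because the threshold is fixed by hand, nothing in this sketch can be improved through training, which is the limitation the text attributes to traditional unsupervised methods.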

Embodiment Construction

[0018] In order to make the purpose, technical solution and advantages of the present invention clearer, the present invention will be further elaborated below in conjunction with the accompanying drawings.

[0019] Refer to Figure 1. According to the present invention, a voiceprint open-set recognition method capable of internally dividing unknown categories comprises the following steps:

[0020] First, a text-independent voiceprint open-set recognition data set is constructed from multiple voice fragments of native speakers with different accents. Second, the voice data of different speakers is used as the input of the voiceprint open-set recognition system, and the original audio data in the data set undergoes feature transformation through pre-emphasis, framing, windowing, fast Fourier transform (FFT), Mel filter bank filtering, logarithm, and discrete cosine transform (DCT). After these several preprocess...
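The feature transformation chain named in [0020] (pre-emphasis, framing, windowing, FFT, Mel filter bank, logarithm, DCT) is the standard route to Mel-frequency cepstral coefficients. The sketch below follows those steps in NumPy. All parameter values (16 kHz sample rate, 400-sample frames, 160-sample hop, 512-point FFT, 26 filters, 13 coefficients, pre-emphasis factor 0.97, Hamming window) are common defaults assumed here, not values taken from the patent, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13, preemph=0.97):
    signal = np.asarray(signal, dtype=float)
    # 1. Pre-emphasis: y[n] = x[n] - a * x[n-1], boosts high frequencies.
    emphasized = np.append(signal[0], signal[1:] - preemph * signal[:-1])
    # 2. Framing: overlapping frames of frame_len samples, hop samples apart.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]
    # 3. Windowing (Hamming window assumed).
    frames = frames * np.hamming(frame_len)
    # 4. FFT -> power spectrum.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 5. Mel filter bank energies.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    # 6. Logarithm (small constant avoids log(0)).
    log_energies = np.log(energies + 1e-10)
    # 7. DCT-II (unnormalized), keeping the first n_ceps coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None]
                   * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_energies @ basis.T

# One second of a 440 Hz tone as stand-in audio.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
features = mfcc(tone)
print(features.shape)  # (98, 13): 98 frames, 13 coefficients each
```

The signal must be at least `frame_len` samples long. In practice a tested library implementation would normally be preferred; the explicit version is shown only to make each step of the chain visible.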


Abstract

The invention discloses a voiceprint open-set identification method capable of internally dividing unknown categories, with relatively high accuracy and good applicability. The method works as follows. Voice data of different speakers is taken as the input of a voiceprint open-set recognition system, and its Mel-frequency cepstral coefficient (MFCC) features are computed one by one. A time-series-based audio encoding module, GE2E, is trained with a generalized end-to-end loss function, which effectively eliminates ambiguity between speakers. Using the audio encodings output by GE2E, and combining a multivariate Gaussian model, a probabilistic ladder model, CGDL, is trained to judge whether any given audio sample belongs to a known category and to classify the samples judged to be known. For audio data that CGDL judges to be of an unknown category, an unbounded interleaved-state neural network is constructed and trained to cluster the audio data online; the resulting number of clusters is the number of speakers, and all audio data in a given cluster belongs to the same speaker.
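The known-versus-unknown decision in the abstract can be illustrated with a much simpler stand-in. The sketch below is not CGDL, which is a trained model; it only fits one multivariate Gaussian per known speaker on fixed embeddings and rejects a sample as unknown when its Mahalanobis distance to every known class exceeds a threshold. The function names, the `reg` regularizer, and the `max_dist` threshold are assumptions for illustration, and the random vectors stand in for encoder outputs.

```python
import numpy as np

def fit_class_gaussians(embeddings, labels, reg=1e-3):
    """Fit a mean and inverse covariance per known class.

    reg adds a small diagonal term so the covariance is invertible.
    """
    models = {}
    for c in np.unique(labels):
        x = embeddings[labels == c]
        mean = x.mean(axis=0)
        cov = np.cov(x, rowvar=False) + reg * np.eye(x.shape[1])
        models[int(c)] = (mean, np.linalg.inv(cov))
    return models

def classify_open_set(e, models, max_dist=3.0):
    """Return the nearest known class, or -1 if the sample looks unknown.

    max_dist is a hypothetical Mahalanobis-distance threshold.
    """
    best_label, best_dist = None, np.inf
    for label, (mean, cov_inv) in models.items():
        d = e - mean
        dist = float(np.sqrt(d @ cov_inv @ d))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_dist else -1

# Toy data: two known "speakers" as 2-D Gaussian blobs.
rng = np.random.default_rng(0)
train = np.vstack([rng.normal([0, 0], 1.0, size=(200, 2)),
                   rng.normal([10, 10], 1.0, size=(200, 2))])
train_labels = np.array([0] * 200 + [1] * 200)
models = fit_class_gaussians(train, train_labels)

print(classify_open_set(np.array([0.2, -0.1]), models))   # near speaker 0
print(classify_open_set(np.array([10.1, 9.8]), models))   # near speaker 1
print(classify_open_set(np.array([5.0, -20.0]), models))  # far from both
```

Samples returned as -1 are the ones that, in the described method, would be passed on to the online clustering stage to be divided among unknown speakers.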

Description

Technical field

[0001] The invention belongs to the technical field of voiceprint open-set recognition, and in particular relates to a voiceprint open-set recognition method capable of internally dividing unknown categories.

Background

[0002] With the development of information technology, demand for identification technology keeps growing. Identification plays an increasingly important role in information security, and the requirements on its security and reliability are becoming more stringent. Identification based on traditional password authentication has exposed many shortcomings in real network applications, while identification based on biometrics has shown great advantages thanks to its stability, uniqueness, and convenience, and has become an important research direction in the field of identification. Voiceprint Recognition (VPR), also known as Speake...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L17/02, G10L17/04, G10L17/18, G10L17/20, G10L17/22

CPC: G10L17/02, G10L17/18, G10L17/20, G10L17/04, G10L17/22

Inventor: 庄旭, 袁鑫, 尹可鑫, 甘翼, 丛迅超

Owner: 10TH RES INST OF CETC
