Voiceprint open-set identification method with unknown category internal division capability

A recognition method for open-set voiceprint identification, applied in speech analysis, instruments, etc. It addresses problems such as the inability to determine the number of speakers in advance and the resulting inability to use K-Means, and it improves the clustering effect. The method also eliminates the effects of the vocal cords and lips, removes DC drift, and suppresses random noise.

Status: Inactive. Publication date: 2021-04-30
10TH RES INST OF CETC
5 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

Although these methods identify unknown classes well, they cannot classify known classes.
Furthermore, for speech data of an unknown class, the number of speakers must be determined and each piece of speech data assigned to its speaker, which is an unsupervised online clustering problem.
Because the number of speakers cannot be determined in advance and the clustering process must run in real time, methods such as K-Means and LDA cannot be used.
Traditional approaches can use hierarchical clustering, spectral clustering, GMM, and similar methods. These are unsupervised clustering processes, so it is difficult to improve their clustering effect through training.
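To illustrate why the number of speakers does not need to be fixed in advance in an online setting, here is a minimal sketch of greedy threshold-based online clustering. It is a simple stand-in for illustration, not the trained clustering network the patent describes. The class name `OnlineClusterer`, the cosine-similarity measure, and the threshold of 0.8 are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class OnlineClusterer:
    """Hypothetical greedy clusterer: the cluster count grows with the stream."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed similarity cut-off
        self.centroids = []         # one running-mean vector per cluster
        self.counts = []            # number of members per cluster

    def assign(self, emb):
        # Find the most similar existing centroid, if any.
        best, best_sim = -1, -1.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(emb, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= self.threshold:
            # Update the running mean of the matched cluster.
            n = self.counts[best]
            self.centroids[best] = [
                (c * n + e) / (n + 1) for c, e in zip(self.centroids[best], emb)
            ]
            self.counts[best] = n + 1
            return best
        # No centroid is close enough: open a new cluster (a new speaker).
        self.centroids.append(list(emb))
        self.counts.append(1)
        return len(self.centroids) - 1


# Toy 2-D "embeddings" arriving one at a time.
stream = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [1.0, 0.0], [0.0, 0.9]]
clusterer = OnlineClusterer(threshold=0.8)
labels = [clusterer.assign(e) for e in stream]
print(labels, "speakers found:", len(clusterer.centroids))
```

Each sample is processed once as it arrives, and the number of clusters at the end serves as the estimated number of speakers. Unlike K-Means, no cluster count is supplied up front, but the result depends on the hand-picked threshold and nothing is learned from training data.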



Examples


Embodiment Construction

[0018] To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described further below with reference to the accompanying drawings.

[0019] See Figure 1. According to the present invention, a voiceprint open-set recognition method capable of internally dividing unknown categories comprises the following steps:

[0020] First, a text-independent voiceprint open-set recognition data set is constructed from multiple voice fragments of native speakers with different accents. Second, the voice data of different speakers are used as the input of the voiceprint open-set recognition system, and feature transformation is performed on the original audio data in the data set. After pre-emphasis, framing, windowing, fast Fourier transform (FFT), Mel filter bank filtering, logarithm, and discrete cosine transform (DCT), these several preprocess...
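The preprocessing chain listed in [0020] is the standard route to Mel-frequency cepstral coefficients. The sketch below walks through the seven steps in plain Python so each stage is visible. It is not the patent's implementation: the frame length, hop size, filter count, pre-emphasis factor 0.97, and all function names are assumptions, and a naive DFT stands in for the FFT (same result, slower).

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = [[0.0] * n_bins for _ in range(n_filters)]
    for j in range(n_filters):
        left, centre, right = bins[j], bins[j + 1], bins[j + 2]
        for k in range(left, centre):      # rising edge
            bank[j][k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge
            bank[j][k] = (right - k) / (right - centre)
    return bank


def mfcc(signal, sample_rate, frame_len=64, hop=32,
         n_filters=10, n_ceps=5, alpha=0.97):
    # 1. Pre-emphasis: boost high frequencies.
    emph = [signal[0]] + [signal[i] - alpha * signal[i - 1]
                          for i in range(1, len(signal))]
    # 3. Hamming window, computed once and applied to every frame.
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    n_bins = frame_len // 2 + 1
    features = []
    # 2. Framing: overlapping frames of frame_len samples.
    for start in range(0, len(emph) - frame_len + 1, hop):
        frame = [emph[start + n] * window[n] for n in range(frame_len)]
        # 4. Power spectrum (naive DFT in place of an FFT).
        power = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            power.append(abs(acc) ** 2 / frame_len)
        # 5-6. Mel filter bank energies, then logarithm (floored to avoid log 0).
        log_mel = [math.log(max(sum(w * p for w, p in zip(filt, power)), 1e-10))
                   for filt in bank]
        # 7. DCT-II decorrelates the log energies into cepstral coefficients.
        ceps = [sum(log_mel[m] * math.cos(math.pi * c * (m + 0.5) / n_filters)
                    for m in range(n_filters))
                for c in range(n_ceps)]
        features.append(ceps)
    return features


# Usage: a 440 Hz test tone sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(256)]
feats = mfcc(tone, sr)
print(len(feats), "frames x", len(feats[0]), "coefficients")
```

The small frame size keeps the naive DFT fast enough for a demonstration. Real speech front ends typically use frames of roughly 25 ms with a 10 ms hop and an FFT routine from a numerical library.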



Abstract

The invention discloses a voiceprint open-set identification method capable of internally dividing unknown categories. The method has relatively high accuracy and relatively good applicability. It is realized through the following scheme. Voice data of different speakers are taken as the input of a voiceprint open-set recognition system, and the Mel-frequency cepstral coefficient features of the voice data are calculated one by one. A time-sequence-based audio coding module, GE2E, is trained with a generalized end-to-end loss function, which effectively eliminates ambiguity between speakers. Using the audio encodings output by GE2E, combined with a multivariate Gaussian model and a trained probabilistic ladder model, CGDL, the method judges whether any piece of audio data belongs to a known category and classifies the audio data judged to be of a known category. For the audio data that CGDL judges to be of an unknown category, an unbounded interleaved-state neural network is constructed and trained to cluster the audio data online. The resulting number of clusters is the number of speakers, and all audio data in a given cluster belong to the same speaker.
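The known-versus-unknown decision in the abstract can be pictured with a much simpler model than CGDL. The sketch below fits one diagonal-covariance Gaussian per known speaker and rejects an embedding as unknown when it is too far from every speaker under a Mahalanobis distance. This is a simplified stand-in, not the patent's CGDL model. The speaker labels, the toy 2-D embeddings, the variance floor `eps`, and the threshold of 9.0 are all assumptions.

```python
import math


def fit_diag_gaussian(embeddings, eps=1e-3):
    """Per-dimension mean and variance; eps is an assumed variance floor."""
    n = len(embeddings)
    dim = len(embeddings[0])
    mean = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    var = [sum((e[d] - mean[d]) ** 2 for e in embeddings) / n + eps
           for d in range(dim)]
    return mean, var


def mahalanobis_sq(x, mean, var):
    """Squared Mahalanobis distance under a diagonal covariance."""
    return sum((xi - m) ** 2 / v for xi, m, v in zip(x, mean, var))


def classify_open_set(x, models, max_dist_sq):
    """Return the closest known label, or 'unknown' if none is close enough."""
    best_label, best_dist = None, float("inf")
    for label, (mean, var) in models.items():
        dist = mahalanobis_sq(x, mean, var)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_dist_sq else "unknown"


# Toy enrolment data: hypothetical 2-D embeddings for two known speakers.
known = {
    "speaker_a": [[1.0, 0.0], [1.1, 0.1], [0.9, -0.1]],
    "speaker_b": [[0.0, 1.0], [0.1, 1.1], [-0.1, 0.9]],
}
models = {label: fit_diag_gaussian(embs) for label, embs in known.items()}

THRESHOLD = 9.0  # assumed cut-off on the squared distance
for probe in ([1.05, 0.05], [0.05, 0.95], [0.5, 0.5]):
    print(probe, "->", classify_open_set(probe, models, THRESHOLD))
```

Embeddings rejected here are the ones that would be passed on to the online clustering stage. In the method described above, the rejection rule comes from a trained model rather than a hand-set distance threshold.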

Description

Technical field

[0001] The invention belongs to the technical field of voiceprint open-set recognition, and in particular relates to a voiceprint open-set recognition method capable of internally dividing unknown categories.

Background

[0002] With the development of information technology, demand for identification technology keeps growing. Identification plays an increasingly important role in information security, and the requirements for its security and reliability are becoming more stringent. Identification based on traditional password authentication has exposed many shortcomings in real network applications, while identification based on biometrics has shown great advantages thanks to its stability, uniqueness, and convenience, and it has become an important research direction in the field. Voiceprint Recognition (VPR), also known as Speake...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L17/02, G10L17/04, G10L17/18, G10L17/20, G10L17/22
CPC: G10L17/02, G10L17/18, G10L17/20, G10L17/04, G10L17/22
Inventor: 庄旭, 袁鑫, 尹可鑫, 甘翼, 丛迅超
Owner: 10TH RES INST OF CETC