Voiceprint recognition method based on variational information bottleneck and system thereof

A voiceprint recognition and information bottleneck technology, applied to voiceprint recognition methods and systems based on a variational information bottleneck. It addresses the problem of low voiceprint recognition accuracy, with the effects of improving recognition accuracy, reducing feature redundancy, and improving robustness.

Active Publication Date: 2021-10-08
WUHAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0006] The present invention proposes a voiceprint recognition method and system based on a variational information bottleneck, which solves, or at least partially solves, the technical problem of low voiceprint recognition accuracy in practical application scenarios.

Examples

Embodiment 1

[0048] An embodiment of the present invention provides a voiceprint recognition method based on a variational information bottleneck, including:

[0049] S1: Obtain original voice data;

[0050] S2: Build a voiceprint recognition model that introduces a variational information bottleneck. The voiceprint recognition model includes an acoustic feature parameter extraction layer, a frame-level feature extraction network, a feature aggregation layer, a variational information bottleneck layer, and a classifier. The acoustic feature parameter extraction layer converts the input original speech waveform into FBank acoustic feature parameters; the frame-level feature extraction network extracts multi-scale and multi-frequency frame-level speaker information from the FBank acoustic feature parameters by one-shot aggregation to obtain frame-level feature vectors; the feature aggregation layer converts the frame-level feature vectors into low-dimensiona...
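
The feature aggregation step described in S2 can be illustrated with a minimal sketch. The excerpt is truncated before it says how the aggregation layer works, so the mean-and-standard-deviation pooling below is an assumption chosen for illustration, not the patent's stated method; the function name `stats_pool` and the toy frame values are hypothetical.

```python
import math
from typing import List


def stats_pool(frames: List[List[float]]) -> List[float]:
    """Collapse T frame-level vectors of size D into one vector of size 2*D.

    Assumption: aggregation by per-dimension mean and (population)
    standard deviation over time. The source does not confirm this choice.
    """
    if not frames:
        raise ValueError("need at least one frame")
    t = len(frames)
    dim = len(frames[0])
    # Per-dimension mean over the time axis.
    means = [sum(f[d] for f in frames) / t for d in range(dim)]
    # Per-dimension standard deviation over the time axis.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / t)
        for d in range(dim)
    ]
    return means + stds


# Three hypothetical frames of a 2-dimensional frame-level feature.
frames = [[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]]
utterance_vector = stats_pool(frames)
```

Whatever pooling is actually used, the point of the layer is the same: an utterance of any number of frames becomes a single fixed-size vector that the variational information bottleneck layer and the classifier can consume.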

Embodiment 2

[0118] Based on the same inventive concept, this embodiment provides a voiceprint recognition system based on a variational information bottleneck, including:

[0119] The data acquisition module is used to obtain the original voice data;

[0120] The model construction module is used to construct a voiceprint recognition model that introduces a variational information bottleneck. The voiceprint recognition model includes an acoustic feature parameter extraction layer, a frame-level feature extraction network, a feature aggregation layer, a variational information bottleneck layer, and a classifier. The acoustic feature parameter extraction layer converts the input original speech waveform into FBank acoustic feature parameters; the frame-level feature extraction network extracts multi-scale and multi-frequency frame-level speaker information from the FBank acoustic feature parameters to obtain frame-level feature vectors; the featur...

Abstract

The invention provides a voiceprint recognition method based on a variational information bottleneck, and a system thereof, which address the poor robustness and low discriminability of the speaker embeddings extracted by existing voiceprint recognition models. The method first provides a feature extraction network consisting of VoVNet and an ultra-lightweight subspace attention mechanism (ULSAM), which extracts multi-scale and multi-frequency frame-level speaker information. It then introduces a variational information bottleneck as a regularization method that further compresses the speaker feature vector, removing speaker-irrelevant information and keeping only the information relevant to identifying the speaker, so that the finally extracted speaker embedding is more robust. Compared with existing voiceprint recognition technology, the method improves recognition accuracy against background noise and makes voiceprint recognition better suited to real-life scenarios.
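
How a variational information bottleneck can act as a regularizer is sketched below. The abstract states only that the bottleneck compresses the speaker feature vector; the diagonal-Gaussian encoder, the standard normal prior, the weight `beta`, and all function names here are assumptions taken from the usual variational information bottleneck formulation, not details confirmed by this excerpt.

```python
import math
import random
from typing import Sequence, List


def sample_latent(mu: Sequence[float], log_var: Sequence[float],
                  rng: random.Random) -> List[float]:
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu: Sequence[float],
                          log_var: Sequence[float]) -> float:
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Negative log-softmax of the true class, computed stably."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[label]


def vib_loss(logits: Sequence[float], label: int, mu: Sequence[float],
             log_var: Sequence[float], beta: float) -> float:
    """Classification loss plus beta times the compression (KL) penalty."""
    return cross_entropy(logits, label) + beta * kl_to_standard_normal(mu, log_var)


# Hypothetical 3-dimensional bottleneck and a 2-speaker classifier output.
rng = random.Random(0)
z = sample_latent([0.5, -1.0, 0.0], [0.0, -2.0, 0.1], rng)
loss = vib_loss([2.0, 0.5], 0, [0.5, -1.0, 0.0], [0.0, -2.0, 0.1], beta=0.01)
```

In this formulation the KL term penalizes information carried by the latent vector, while the cross-entropy term rewards information that predicts the speaker label; `beta` (a hypothetical hyperparameter here) trades one against the other.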

Description

Technical field

[0001] The invention relates to the fields of deep learning and voiceprint recognition, and in particular to a voiceprint recognition method and system based on a variational information bottleneck.

Background art

[0002] Voiceprint recognition, also known as speaker recognition, is a technology that automatically identifies a speaker's identity from the speech parameters in the sound waveform that reflect the speaker's physiological and behavioral characteristics. The emergence of deep learning has greatly advanced voiceprint recognition. End-to-end voiceprint recognition based on deep neural networks has become the mainstream technology: it uses the strong learning ability of deep neural networks to learn a speaker representation vector, called a speaker embedding, from the voice signal.

[0003] Voiceprint recognition based on deep speaker embeddings usually consists of three parts: feature extraction netwo...

Application Information

IPC(8): G10L17/04, G10L17/02, G10L17/18, G10L17/20, G06N3/04, G06N3/08
CPC: G10L17/04, G10L17/02, G10L17/18, G10L17/20, G06N3/084, G06N3/045
Inventors: 熊盛武, 王丹, 董元杰
Owner WUHAN UNIV OF TECH