End-to-end text-independent voiceprint recognition method and system

A text-independent voiceprint recognition technology, applied in speech analysis, instruments, etc.; it addresses the problem of low voiceprint recognition accuracy and achieves the effects of reducing intra-class distance and improving recognition accuracy.

Pending Publication Date: 2021-12-07
WUHAN UNIV OF TECH
Cites: 0 · Cited by: 1

AI Technical Summary

Problems solved by technology

[0005] The present invention proposes an end-to-end text-independent voiceprint recognition method and system, which is used to solve, or at least partially solve, the technical problem of low voiceprint recognition accuracy in prior-art methods.



Examples


Embodiment 1

[0036] An embodiment of the present invention provides an end-to-end text-independent voiceprint recognition method, including:

[0037] S1: Obtain a large amount of speaker voice data as a training data set;

[0038] S2: Build a voiceprint recognition model, where the voiceprint recognition model includes a frame-level feature extraction layer, an utterance-level feature extraction layer, a high-order attention pooling layer, and a fully connected layer. The frame-level feature extraction layer includes three time-delay neural networks (TDNN), used to extract frame-level features from the input speech data; the utterance-level feature extraction layer includes three gated recurrent units (GRU), which perform global feature extraction and temporal modeling of the frame-level features to generate utterance-level features; the high-order attention pooling layer includes a high-order statistical pooling layer and a high-order attention pooling layer. The high-order ...
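The pooling step described above combines statistics pooling with an attention mechanism. As a rough illustration (not the patent's exact formulation), the following numpy sketch shows attentive statistics pooling: each frame gets a learned attention weight, and the weighted mean and standard deviation are concatenated into an utterance-level vector. The parameter names `w`, `b`, `v` and the single-head scoring function are assumptions for illustration.

```python
# Hypothetical sketch of attentive statistics pooling: frame-level
# features are weighted by attention scores, then pooled into a
# weighted mean and standard deviation forming the utterance vector.
import numpy as np

def attentive_stats_pooling(frames, w, b, v):
    """frames: (T, D) frame-level features; w (D, A), b (A,), v (A,)."""
    # scalar attention score per frame: e_t = v . tanh(W h_t + b)
    scores = np.tanh(frames @ w + b) @ v          # (T,)
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                          # softmax over frames
    mean = (alpha[:, None] * frames).sum(axis=0)  # weighted mean (D,)
    var = (alpha[:, None] * (frames - mean) ** 2).sum(axis=0)
    std = np.sqrt(np.maximum(var, 1e-8))          # weighted std (D,)
    return np.concatenate([mean, std])            # (2D,) utterance vector

rng = np.random.default_rng(0)
T, D, A = 50, 8, 16
out = attentive_stats_pooling(rng.normal(size=(T, D)),
                              rng.normal(size=(D, A)),
                              rng.normal(size=A),
                              rng.normal(size=A))
print(out.shape)  # (16,)
```

Concatenating the standard deviation alongside the mean is what makes this a higher-order pooling: it preserves second-order statistics of the frame distribution rather than only its center.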

Embodiment 2

[0085] Based on the same inventive concept, this embodiment provides an end-to-end text-independent voiceprint recognition system; please refer to Figure 5. The system consists of:

[0086] The training data set obtaining module 201 is used to obtain a large amount of speaker voice data as a training data set;

[0087] The voiceprint recognition model construction module 202 is used to build a voiceprint recognition model, wherein the voiceprint recognition model includes a frame-level feature extraction layer, an utterance-level feature extraction layer, a high-order attention pooling layer, and a fully connected layer. The frame-level feature extraction layer includes three time-delay neural networks (TDNN) for extracting frame-level features from the input speech data; the utterance-level feature extraction layer includes three gated recurrent units (GRU) for global feature extraction and temporal modeling of the frame-level features, generating utterance-level features; the high-order attention ...
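The abstract states that the model is trained with an AM-softmax (additive-margin softmax) loss. As a minimal numpy sketch of that loss, not the patent's training code, the following subtracts a margin from the target-class cosine logit before the softmax; the scale `s=30.0` and margin `m=0.35` are common illustrative values, not values from the patent.

```python
# Sketch of the AM-softmax loss: cosine logits with an additive
# margin on the true class, which pulls same-speaker embeddings
# together (reduced intra-class distance) during training.
import numpy as np

def am_softmax_loss(emb, weights, label, s=30.0, m=0.35):
    """emb: (D,) speaker embedding; weights: (C, D) class weights."""
    e = emb / np.linalg.norm(emb)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = w @ e                            # cosine to each class (C,)
    logits = s * cos
    logits[label] = s * (cos[label] - m)   # additive margin on target
    logits -= logits.max()                 # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[label])               # cross-entropy on target

rng = np.random.default_rng(1)
loss = am_softmax_loss(rng.normal(size=16), rng.normal(size=(10, 16)), label=3)
print(float(loss))
```

Because the margin lowers the target logit, the loss with a margin is strictly larger than without one for the same inputs, which is what forces embeddings to clear the margin at convergence.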



Abstract

The invention provides an end-to-end text-independent voiceprint recognition method and system. The method comprises the following steps: capturing important narrow-band speaker characteristics of the original voice sample with a filter designed from the Sinc function; using a time-delay neural network (TDNN) and gated recurrent units (GRU) to form a hybrid neural network structure that generates complementary speaker information at different levels; adopting a multi-level pooling strategy, adding an attention mechanism to the pooling layer, and extracting the feature information most representative of the speaker at the frame level and the utterance level from the TDNN layer and the GRU layer; applying regularization to the speaker-vector extraction layer; training with an AM-softmax loss function; and finally realizing the end-to-end text-independent voiceprint recognition process through similarity calculation between the embedding model and the recognition model. Thereby, the accuracy and applicability of end-to-end text-independent voiceprint recognition are improved.
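The Sinc-function filter design mentioned above is commonly realized (e.g. in SincNet-style front ends) as a band-pass FIR kernel built from the difference of two windowed sinc low-pass filters, where only the two cutoff frequencies are learnable. The sketch below illustrates that construction under those assumptions; the cutoffs 0.05 and 0.15 (in cycles per sample) and the 101-tap length are illustrative, not values from the patent.

```python
# SincNet-style parametrized band-pass filter: the difference of two
# low-pass sinc kernels gives a band-pass response between normalized
# cutoffs f1 < f2; a Hamming window reduces spectral leakage.
import numpy as np

def sinc_bandpass(f1, f2, length=101):
    """Band-pass FIR kernel between cutoffs 0 < f1 < f2 < 0.5."""
    n = np.arange(length) - (length - 1) / 2
    # np.sinc(x) = sin(pi x)/(pi x), so 2f*sinc(2f n) is a low-pass at f
    h = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    return h * np.hamming(length)    # windowed to tame ripple

h = sinc_bandpass(0.05, 0.15)
H = np.abs(np.fft.rfft(h, 1024))
print(H.argmax() / 1024)  # peak response lies inside the (0.05, 0.15) band
```

Restricting each filter to two parameters is what lets such a front end learn narrow-band, speaker-discriminative pass bands directly from raw waveforms with very few weights.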

Description

Technical field

[0001] The invention relates to the fields of speech signal processing and deep learning, and in particular to an end-to-end text-independent voiceprint recognition method and system.

Background technique

[0002] Today, with the rapid development of information technology, the demand for identity recognition is becoming more and more widespread. Voiceprint recognition is a biometric technology that uses the unique characteristics of the human voice to verify identity. As the third major biometric technology, voiceprint recognition has begun to enter people's lives. At present, voiceprint recognition technology has been put into use in some banks, where users log in to the mobile banking app to conduct transfers, payments and other transactions. Taking the emerging voiceprint recognition as an example, beyond the financial and investigation fields, it has also begun to enter public security, smart home, smart car, smart education, smart community ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L17/04, G10L17/02, G10L17/18
CPC: G10L17/04, G10L17/02, G10L17/18
Inventor: 熊盛武, 字云飞, 冯莹, 王旭, 李涛
Owner: WUHAN UNIV OF TECH