End-to-end text-independent voiceprint recognition method and system

A text-independent voiceprint recognition technology, applied in speech analysis, instruments, etc.; it addresses the problem of low voiceprint recognition accuracy and achieves the effects of reducing intra-class distance and improving recognition accuracy.

Pending Publication Date: 2021-12-07
WUHAN UNIV OF TECH
Cites: 0 · Cited by: 1

AI Technical Summary

Problems solved by technology

[0005] The present invention proposes an end-to-end text-independent voiceprint recognition method and system, which is used to solve, or at least partially solve, the technical problem of low voiceprint recognition accuracy in prior-art methods.



Examples


Embodiment 1

[0036] An embodiment of the present invention provides an end-to-end text-independent voiceprint recognition method, including:

[0037] S1: Obtain a large amount of speaker voice data as a training data set;

[0038] S2: Build a voiceprint recognition model, where the voiceprint recognition model includes a frame-level feature extraction layer, an utterance-level feature extraction layer, a high-order attention pooling layer, and a fully connected layer. The frame-level feature extraction layer includes three time-delay neural networks (TDNN), used to extract frame-level features from the input speech data; the utterance-level feature extraction layer includes three gated recurrent units (GRU), which perform global feature extraction and temporal modeling of the frame-level features to generate utterance-level features; the high-order attention pooling layer includes a high-order statistical pooling layer and a high-order attention pooling layer. The high-order ...
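The pooling step described above combines statistics pooling with an attention mechanism. As a rough illustration (not the patent's exact formulation), the following numpy sketch shows attentive statistics pooling: each frame gets a learned attention weight, and the weighted mean and standard deviation are concatenated into an utterance-level vector. The parameter names `w`, `b`, `v` and the single-head scoring function are assumptions for illustration.

```python
# Hypothetical sketch of attentive statistics pooling: frame-level
# features are weighted by attention scores, then pooled into a
# weighted mean and standard deviation forming the utterance vector.
import numpy as np

def attentive_stats_pooling(frames, w, b, v):
    """frames: (T, D) frame-level features; w (D, A), b (A,), v (A,)."""
    # scalar attention score per frame: e_t = v . tanh(W h_t + b)
    scores = np.tanh(frames @ w + b) @ v          # (T,)
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                          # softmax over frames
    mean = (alpha[:, None] * frames).sum(axis=0)  # weighted mean (D,)
    var = (alpha[:, None] * (frames - mean) ** 2).sum(axis=0)
    std = np.sqrt(np.maximum(var, 1e-8))          # weighted std (D,)
    return np.concatenate([mean, std])            # (2D,) utterance vector

rng = np.random.default_rng(0)
T, D, A = 50, 8, 16
out = attentive_stats_pooling(rng.normal(size=(T, D)),
                              rng.normal(size=(D, A)),
                              rng.normal(size=A),
                              rng.normal(size=A))
print(out.shape)  # (16,)
```

Concatenating the standard deviation alongside the mean is what makes this a higher-order pooling: it preserves second-order statistics of the frame distribution rather than only its center.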

Embodiment 2

[0085] Based on the same inventive concept, this embodiment provides an end-to-end text-independent voiceprint recognition system; please refer to Figure 5. The system consists of:

[0086] The training data set obtaining module 201 is used to obtain a large amount of speaker voice data as a training data set;

[0087] The voiceprint recognition model construction module 202 is used to build a voiceprint recognition model, wherein the voiceprint recognition model includes a frame-level feature extraction layer, an utterance-level feature extraction layer, a high-order attention pooling layer, and a fully connected layer. The frame-level feature extraction layer includes three time-delay neural networks (TDNN) for extracting frame-level features from the input speech data; the utterance-level feature extraction layer includes three gated recurrent units (GRU) for global feature extraction and temporal modeling of the frame-level features, generating utterance-level features; the high-order attention ...
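The abstract states that the model is trained with an AM-softmax (additive-margin softmax) loss. As a minimal numpy sketch of that loss, not the patent's training code, the following subtracts a margin from the target-class cosine logit before the softmax; the scale `s=30.0` and margin `m=0.35` are common illustrative values, not values from the patent.

```python
# Sketch of the AM-softmax loss: cosine logits with an additive
# margin on the true class, which pulls same-speaker embeddings
# together (reduced intra-class distance) during training.
import numpy as np

def am_softmax_loss(emb, weights, label, s=30.0, m=0.35):
    """emb: (D,) speaker embedding; weights: (C, D) class weights."""
    e = emb / np.linalg.norm(emb)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = w @ e                            # cosine to each class (C,)
    logits = s * cos
    logits[label] = s * (cos[label] - m)   # additive margin on target
    logits -= logits.max()                 # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[label])               # cross-entropy on target

rng = np.random.default_rng(1)
loss = am_softmax_loss(rng.normal(size=16), rng.normal(size=(10, 16)), label=3)
print(float(loss))
```

Because the margin lowers the target logit, the loss with a margin is strictly larger than without one for the same inputs, which is what forces embeddings to clear the margin at convergence.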



Abstract

The invention provides an end-to-end text-independent voiceprint recognition method and system. The method comprises the following steps: capturing important narrow-band speaker characteristics of the original voice sample with a filter designed from the Sinc function; using a time-delay neural network (TDNN) and gated recurrent units (GRU) to form a hybrid neural network structure that generates complementary speaker information at different levels; adopting a multi-level pooling strategy, adding an attention mechanism to the pooling layer, and extracting the feature information most representative of the speaker at the frame level and the utterance level from the TDNN layer and the GRU layer; applying regularization to the speaker-vector extraction layer; training with an AM-softmax loss function; and finally realizing the end-to-end text-independent voiceprint recognition process through similarity calculation between the embedding model and the recognition model. Thereby, the accuracy and applicability of end-to-end text-independent voiceprint recognition are improved.
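The Sinc-function filter design mentioned above is commonly realized (e.g. in SincNet-style front ends) as a band-pass FIR kernel built from the difference of two windowed sinc low-pass filters, where only the two cutoff frequencies are learnable. The sketch below illustrates that construction under those assumptions; the cutoffs 0.05 and 0.15 (in cycles per sample) and the 101-tap length are illustrative, not values from the patent.

```python
# SincNet-style parametrized band-pass filter: the difference of two
# low-pass sinc kernels gives a band-pass response between normalized
# cutoffs f1 < f2; a Hamming window reduces spectral leakage.
import numpy as np

def sinc_bandpass(f1, f2, length=101):
    """Band-pass FIR kernel between cutoffs 0 < f1 < f2 < 0.5."""
    n = np.arange(length) - (length - 1) / 2
    # np.sinc(x) = sin(pi x)/(pi x), so 2f*sinc(2f n) is a low-pass at f
    h = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    return h * np.hamming(length)    # windowed to tame ripple

h = sinc_bandpass(0.05, 0.15)
H = np.abs(np.fft.rfft(h, 1024))
print(H.argmax() / 1024)  # peak response lies inside the (0.05, 0.15) band
```

Restricting each filter to two parameters is what lets such a front end learn narrow-band, speaker-discriminative pass bands directly from raw waveforms with very few weights.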

Description

Technical field

[0001] The invention relates to the fields of speech signal processing and deep learning, and in particular to an end-to-end text-independent voiceprint recognition method and system.

Background technique

[0002] Today, with the rapid development of information technology, the demand for identity recognition is becoming more and more widespread. Voiceprint recognition is a biometric technology that uses the unique characteristics of the human voice to verify identity. As the third major biometric technology, voiceprint recognition has begun to enter people's lives. At present, voiceprint recognition technology has been put into use in some banks, where users log in to the mobile banking app to conduct transfers, payments and other transactions. Taking the emerging voiceprint recognition as an example, beyond the financial and investigation fields, it has also begun to enter public security, smart home, smart car, smart education, smart community ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L17/04, G10L17/02, G10L17/18
CPC: G10L17/04, G10L17/02, G10L17/18
Inventor: 熊盛武, 字云飞, 冯莹, 王旭, 李涛
Owner: WUHAN UNIV OF TECH