Voiceprint recognition method based on TDNN (time delay neural network)

A voiceprint recognition technology based on a neural network, applied in speech analysis, instruments, and related fields. It addresses the need for large amounts of training data and the increased computational complexity, and achieves good recognition results, simple computation, and strong feature extraction capability.

Inactive. Publication Date: 2019-08-13

NANJING SILICON INTELLIGENCE TECH CO LTD

4 Cites, 9 Cited by


AI Technical Summary

Problems solved by technology

In both cases, if the DNN is trained on in-domain data, the improvement over traditional acoustic i-vectors is significant. However, the DNN requires a large amount of training data compared to the traditional i-vector model, and the computational complexity is also greatly increased.


Embodiment Construction

[0020] The present disclosure will be described in further detail below in conjunction with the accompanying drawings.

[0021] Before voiceprint recognition can be performed, the voice must first be collected. This disclosure provides two data collection methods. The first is a mobile phone app with local recording and timing functions. After recording, the app is deployed to Alibaba Cloud and the data is saved locally. In the released version, audio is stored in WAV format at a sampling rate of 16000 Hz. The second is telephone recording: using simple back-end scheduling, the client can place a call by invoking a PHP script through a URL. This method supports dialing on 32 lines at the same time (which involves monitoring for idle ports), supports uninterrupted free-form recording, and saves long audio locally.
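The storage format described above (WAV at 16000 Hz) can be checked on each collected file before it enters the training set. The following is a minimal sketch using only Python's standard `wave` module. The helper names `write_silence` and `check_recording`, the mono channel count, and the 16-bit sample width are assumptions made for illustration; the disclosure states only the container format and the sampling rate.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 16000  # sampling rate stated in the disclosure


def write_silence(path, seconds=1.0, rate=EXPECTED_RATE_HZ):
    """Write a WAV file of silence as a stand-in for a real recording."""
    n_frames = int(seconds * rate)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # assumption: mono
        wf.setsampwidth(2)   # assumption: 16-bit PCM
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * n_frames)


def check_recording(path, expected_rate=EXPECTED_RATE_HZ):
    """Return (rate_matches, duration_in_seconds) for a WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        duration = wf.getnframes() / rate
    return rate == expected_rate, duration


# Demonstration on two synthetic files: one at 16000 Hz, one at 8000 Hz.
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "good.wav")
bad_path = os.path.join(tmp_dir, "bad.wav")
write_silence(good_path, seconds=0.5)
write_silence(bad_path, seconds=0.5, rate=8000)

good_ok, good_duration = check_recording(good_path)
bad_ok, _ = check_recording(bad_path)
```

A check like this is useful mainly for the telephone recordings, since telephone audio is often captured at a lower sampling rate than app recordings and would need resampling before the two sources are mixed.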

[0022] When collecting sound, you can formulate some test requirements and regulations, for example: 1. The environment is quiet, there is no sharp noise, no loud interference from others, and your ...


Abstract

The invention discloses a voiceprint recognition method based on a TDNN (time delay neural network), which addresses the problems that voiceprint recognition algorithms are complicated and that their data are complex. The method relies on the strong feature extraction capability of neural networks. A TDNN extracts the feature vector of a speaker's voice segment, and a pooling layer and a softmax layer produce the posterior probability of the speaker for that segment. The network is trained with a cross-entropy loss function. After training, the softmax layer is removed, and the resulting feature vectors are used to train the final PLDA (probabilistic linear discriminant analysis) model. Transcription of the training data is not needed, the computation is simple, and the recognition results are good.
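The pipeline in the abstract (TDNN frame layer, pooling, softmax posterior, cross-entropy loss, then an embedding taken with the softmax removed) can be illustrated with a toy forward pass. This is a sketch under stated assumptions, not the patented network: the layer sizes, the context offsets, the single TDNN layer, the use of mean and standard deviation as the pooling statistics, and the random untrained weights are all chosen for illustration only.

```python
import math
import random


def tdnn_layer(frames, weights, bias, context):
    """Splice each frame with its neighbours at the given offsets, then affine + ReLU."""
    start = max(0, -min(context))
    stop = len(frames) - max(0, max(context))
    out = []
    for t in range(start, stop):
        spliced = [x for off in context for x in frames[t + off]]
        out.append([max(0.0, sum(w * x for w, x in zip(row, spliced)) + b)
                    for row, b in zip(weights, bias)])
    return out


def stats_pooling(frames):
    """Collapse a variable number of frames into one vector: per-dimension mean and std."""
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
            for d in range(dim)]
    return means + stds


def affine(vec, weights, bias):
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Toy sizes (assumptions, far smaller than any practical system).
FEAT_DIM, HIDDEN, EMBED, N_SPEAKERS = 4, 8, 6, 3
CONTEXT = (-2, -1, 0, 1, 2)

rng = random.Random(0)
utterance = [[rng.gauss(0, 1) for _ in range(FEAT_DIM)] for _ in range(20)]
w1, b1 = random_matrix(rng, HIDDEN, FEAT_DIM * len(CONTEXT)), [0.0] * HIDDEN
w_emb, b_emb = random_matrix(rng, EMBED, 2 * HIDDEN), [0.0] * EMBED
w_out, b_out = random_matrix(rng, N_SPEAKERS, EMBED), [0.0] * N_SPEAKERS

hidden = tdnn_layer(utterance, w1, b1, CONTEXT)         # frame-level features
pooled = stats_pooling(hidden)                          # fixed-length segment vector
embedding = affine(pooled, w_emb, b_emb)                # kept after training
posteriors = softmax(affine(embedding, w_out, b_out))   # used during training only

true_speaker = 1                                        # hypothetical label
loss = -math.log(posteriors[true_speaker])              # cross-entropy for one segment
```

Once training is finished, only the path up to `embedding` would be evaluated, and those vectors would be passed to a separately trained PLDA model for scoring. The pooling step is what lets segments of any length map to a vector of fixed size.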

Description

Technical Field

[0001] The present disclosure relates to a voiceprint recognition method, in particular to a voiceprint recognition method based on a time-delay neural network (TDNN).

Background

[0002] Data augmentation techniques are used to improve the performance of deep neural network (DNN) embeddings for speaker recognition. The DNN is trained to distinguish speakers by mapping variable-length utterances to fixed-dimensional embeddings, which we call x-vectors. Previous research has found that embeddings make better use of large-scale training datasets than i-vectors. However, collecting large amounts of labeled data for training is challenging. Data augmentation consisting of additive noise and reverberation is used as an inexpensive way to increase the amount of training data and improve robustness. Comparing x-vector and i-vector baselines for NIST SRE 2016 Cantonese speakers, we find that while augmentation is beneficial in the probabilistic linear discrim...
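The two augmentations named in the background, additive noise and reverberation, can be sketched in a few lines. The example below is illustrative only: the function names, the 10 dB target signal-to-noise ratio, the synthetic sine tone standing in for speech, and the five-tap impulse response are assumptions, not values taken from the disclosure.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def add_noise(speech, noise, snr_db):
    """Mix noise into speech, scaled so the speech-to-noise ratio equals snr_db."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


def add_reverb(speech, impulse_response):
    """Convolve speech with a room impulse response, truncated to the input length."""
    out = [0.0] * len(speech)
    for i, s in enumerate(speech):
        for j, h in enumerate(impulse_response):
            if i + j < len(out):
                out[i + j] += s * h
    return out


RATE_HZ = 16000
rng = random.Random(0)

# Synthetic 220 Hz tone, 0.1 s long, standing in for a speech segment.
speech = [math.sin(2 * math.pi * 220 * t / RATE_HZ) for t in range(1600)]
noise = [rng.gauss(0, 1) for _ in range(len(speech))]

noisy = add_noise(speech, noise, snr_db=10)
residual = [n - s for n, s in zip(noisy, speech)]
achieved_snr_db = 20 * math.log10(rms(speech) / rms(residual))

toy_rir = [1.0, 0.0, 0.5, 0.0, 0.25]  # hypothetical impulse response
reverberant = add_reverb(speech, toy_rir)
```

Each original recording can be passed through such transforms several times with different noise sources and ratios, so the training set grows without any new labeled speech being collected.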


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L17/18, G10L17/02, G10L17/04

CPC: G10L17/02, G10L17/04, G10L17/18

Inventor: 司马华鹏, 唐翠翠

Owner: NANJING SILICON INTELLIGENCE TECH CO LTD
