Speaker recognition method based on Triplet-Loss

A speaker recognition technology, applied in the field of neural networks and deep learning, that addresses sensitivity to channel environmental noise, sensitivity to the voice data, and low accuracy, so as to improve reliability, accuracy, and discrimination, with a direct physical meaning.

Pending Publication Date: 2019-01-22

GUANGDONG UNIV OF TECH

Cites: 4. Cited by: 13.


AI Technical Summary

Problems solved by technology

[0003] Common voiceprint recognition methods, such as the early methods based on signal processing, use signal processing techniques to compute parameters of the voice data, and then perform template matching and statistical analysis of variance. These methods are extremely sensitive to the voice data, their accuracy is very low, and the recognition results are unsatisfactory.

[0004] The recognition method based on the Gaussian mixture model can achieve good results and is simple and flexible, but it requires a large amount of voice data and is very sensitive to channel environmental noise, which cannot meet the requirements of real scenarios.

[0005] Existing methods based on deep neural networks do not consider the context-dependent nature of the speech signal. The extracted features do not represent the speaker well, and the advantages of deep learning are not fully exploited.

Detailed Description of the Embodiments

[0045] The present invention is further described below with reference to a specific embodiment:

[0046] As shown in Figure 1, the speaker recognition method based on Triplet-Loss described in this embodiment comprises the following steps:

[0047] S1: Acquire a speech signal comprising three groups of samples: a group of speech sequences Xa from one speaker, another group of speech sequences Xp from the same speaker, and a group of speech sequences Xn from a different speaker;
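The three groups in S1 match the usual anchor (Xa), positive (Xp), and negative (Xn) roles of a triplet loss. The excerpt does not state the loss formula, so the following is only a minimal sketch of the commonly used form, assuming squared Euclidean distance between fixed-length embeddings and a hypothetical margin of 0.2. The function names and the margin value are illustrative and are not taken from the patent.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Common triplet-loss form (assumed here, not quoted from the patent).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows with the violation.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D embeddings standing in for network outputs for Xa, Xp, Xn.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 0.0])
hard = triplet_loss([0.0, 0.0], [1.0, 0.0], [1.0, 0.0])
```

In this sketch, `easy` is a triplet that already satisfies the margin, while `hard` places the positive and the negative at the same distance from the anchor, so only the margin term remains.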

[0048] S2: Preprocess the speech signal. Considerable channel noise is generated during speech collection, which makes the recognition task much harder, so spectral subtraction is first used to denoise the input speech data: the estimated noise spectrum is subtracted from the spectrum of the noisy speech to obtain the spectrum of the clean speech. What is eliminated here is the channel noise, which is the noise caused by the re...
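The spectral subtraction described in S2 can be illustrated on a single short frame. This is a minimal sketch under stated assumptions, not the patent's implementation: it uses a naive DFT from the standard library, subtracts a per-bin noise magnitude estimate, floors negative results at zero, and reuses the phase of the noisy frame. The function names, the flooring rule, and the way the noise estimate is supplied are all illustrative choices.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a short real-valued frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(noisy_frame, noise_magnitude, floor=0.0):
    """Subtract an estimated noise magnitude from each frequency bin.

    noise_magnitude: one non-negative estimate per DFT bin (an assumption;
    in practice it would be estimated from speech-free segments).
    """
    cleaned = []
    for value, noise in zip(dft(noisy_frame), noise_magnitude):
        magnitude = max(abs(value) - noise, floor)  # never go below the floor
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))  # keep noisy phase
    return idft(cleaned)


frame = [0.0, 1.0, 0.0, -1.0]
unchanged = spectral_subtract(frame, [0.0] * len(frame))
silenced = spectral_subtract(frame, [abs(v) for v in dft(frame)])
```

With a zero noise estimate the frame is returned unchanged, and with a noise estimate equal to the frame's own magnitude spectrum the output is silence, which are the two limiting cases of the method.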

Abstract

The invention relates to a speaker recognition method based on Triplet-Loss. The method comprises the following steps: S1, acquiring voice signals comprising three groups of samples: a group of voice sequences of a speaker, another group of voice sequences of the same speaker, and a group of voice sequences of a different speaker; S2, preprocessing the voice signals to remove the channel noise generated during voice collection; S3, extracting voice feature parameters from the denoised voice signals; S4, constructing an RNN neural network based on an LSTM neural network; S5, training the RNN neural network using 90% of the three groups of extracted voice feature parameters as its input; and S6, after the RNN neural network is trained, performing speaker recognition using the remaining 10% of the three groups of voice feature parameters as its input. The method offers high accuracy, a good recognition effect, and high reliability.
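The 90% / 10% division in S5 and S6 can be sketched as a simple split over the extracted triplets. This is only an illustration: the abstract does not say whether the data is shuffled or how the split is drawn, so the seeded shuffle, the function name `split_triplets`, and the placeholder triplets below are assumptions.

```python
import random


def split_triplets(triplets, train_fraction=0.9, seed=0):
    """Split (Xa, Xp, Xn) feature triplets into training and recognition sets.

    Shuffling with a fixed seed is an assumption made for reproducibility;
    the source only states the 90% / 10% proportions.
    """
    shuffled = list(triplets)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder triplets; real entries would hold extracted feature parameters.
triplets = [(f"xa{i}", f"xp{i}", f"xn{i}") for i in range(10)]
train_set, test_set = split_triplets(triplets)
```

Here `train_set` would feed the training in S5 and `test_set` the recognition step in S6.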

Description

Technical field

[0001] The present invention relates to the technical field of neural network and deep learning, in particular to a speaker recognition method based on Triplet-Loss.

Background

[0002] As information security issues become more and more serious, the impact is growing. The issue of "personal privacy and confidentiality" urgently needs to be resolved. How to accurately and safely determine a person's identity arouses people's thinking. As a key interface of human-computer interaction, voice plays an important role in identity authentication. Voiceprint recognition is speaker recognition. As a unique biological feature of a speaker, voiceprint is a new means to overcome traditional authentication methods. Compared with other methods, the acquisition of speech containing voiceprint features is convenient and natural, and the voiceprint extraction can be completed unconsciously, so the user's acceptance is also high; the cost of acquiring speech reco...

Application Information

IPC(8): G10L17/18, G10L17/02

CPC: G10L17/02, G10L17/18

Inventor: 王艺航, 熊晓明, 刘祥, 李辉

Owner: GUANGDONG UNIV OF TECH
