Speaker recognition method based on Triplet-Loss

A speaker recognition technology, applied in the field of neural networks and deep learning, that addresses sensitivity to channel and environmental noise, sensitivity to the voice data, and low accuracy, so as to improve reliability, accuracy, and discriminability, with a direct physical meaning.

Pending Publication Date: 2019-01-22
GUANGDONG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0003] Common voiceprint recognition methods, such as the early methods based on signal processing, use signal processing techniques to compute parameters of the voice data and then perform template matching and statistical analysis of variance. These methods are extremely sensitive to the voice data, their accuracy is very low, and the recognition results are very unsatisfactory.
[0004] The recognition method based on the Ga

Examples

Embodiment Construction

[0045] The present invention is further described below in conjunction with a specific embodiment:

[0046] As shown in Figure 1, the speaker recognition method based on Triplet-Loss described in this embodiment comprises the following steps:

[0047] S1: Acquire a speech signal comprising three groups of samples: a group of speech sequences Xa from one speaker, another group of speech sequences Xp from the same speaker, and a group of speech sequences Xn from a different speaker;

[0048] S2: Preprocess the speech signal. Considerable channel noise is generated during speech collection, which makes the recognition task much harder, so spectral subtraction is first used to denoise the input speech data; that is, the estimated noise spectrum is subtracted from the noisy speech spectrum to obtain the spectrum of the clean speech. What is eliminated here is the channel noise, which is the noise caused by the re...

Abstract

The invention relates to a speaker recognition method based on Triplet-Loss. The speaker recognition method comprises the following steps: S1, acquiring voice signals comprising three groups of samples, namely a group of voice sequences of a speaker, another group of voice sequences of the same speaker, and a group of voice sequences of a different speaker; S2, preprocessing the voice signals to remove the channel noise generated during voice collection; S3, extracting voice feature parameters from the denoised voice signals; S4, constructing an RNN neural network based on an LSTM neural network; S5, training the RNN neural network using 90% of the extracted three groups of voice feature parameters as its input; and S6, after the RNN neural network is trained, carrying out speaker recognition using the remaining 10% of the three groups of voice feature parameters as its input. The speaker recognition method has the advantages of high accuracy, good recognition performance, and high reliability.
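
As a rough illustration of the triplet objective and the 90%/10% split in steps S5 and S6, the sketch below uses the standard triplet loss, max(d(a, p) - d(a, n) + margin, 0). The text shown here does not state the distance measure or the margin, so the Euclidean distance, the margin of 0.2, and the helper names `triplet_loss` and `split_triplets` are assumptions. The vectors stand in for embeddings that the LSTM-based network would produce for Xa, Xp, and Xn.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss; the margin value is an assumed hyperparameter."""
    d_ap = euclidean(anchor, positive)  # same speaker: should be small
    d_an = euclidean(anchor, negative)  # different speaker: should be large
    return max(d_ap - d_an + margin, 0.0)

def split_triplets(triplets, train_percent=90):
    """Split a list of triplets into training and held-out parts."""
    cut = len(triplets) * train_percent // 100
    return triplets[:cut], triplets[cut:]
```

The loss is zero once the negative sample is farther from the anchor than the positive sample by at least the margin, so training pushes embeddings of the same speaker together and those of different speakers apart.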

Description

Technical field

[0001] The present invention relates to the technical field of neural networks and deep learning, in particular to a speaker recognition method based on Triplet-Loss.

Background

[0002] As information security problems become more serious, their impact keeps growing. The issue of personal privacy and confidentiality urgently needs to be resolved, and how to determine a person's identity accurately and securely has drawn wide attention. As a key interface for human-computer interaction, voice plays an important role in identity authentication. Voiceprint recognition is speaker recognition. As a biometric feature unique to each speaker, the voiceprint offers a new means of overcoming the limitations of traditional authentication methods. Compared with other methods, speech containing voiceprint features is convenient and natural to acquire, and voiceprint extraction can be completed without the user noticing, so user acceptance is also high; the cost of acquiring speech reco...

Application Information

IPC(8): G10L17/18, G10L17/02
CPC: G10L17/02, G10L17/18
Inventors: 王艺航, 熊晓明, 刘祥, 李辉
Owner: GUANGDONG UNIV OF TECH