Text-independent voiceprint recognition method

A text-independent voiceprint recognition technology, applied in speech analysis, instruments, etc. It addresses the difficulty of training deep fully connected neural networks and the resulting limit on their capability, with the effect of reducing the number of parameters and improving robustness.

Inactive; Publication Date: 2018-10-12
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

The more layers a network has, the stronger its expressive ability. However, a deep fully connected neural network is very difficult to train by gradient descent, because the gradient of a fully connected network is hard to propagate through more than 3 layers.
As a result, a deep fully connected neural network cannot be obtained in practice, which limits its capability.



Examples


Embodiment Construction

[0036] The present invention will be further described below in conjunction with specific examples.

[0037] As shown in Figure 1, the text-independent voiceprint recognition method provided in this embodiment is divided into three stages: voiceprint recognition model training, embedding extraction, and decision scoring.

[0038] First, the voiceprint recognition model is trained. A suitable corpus is selected, for example the open-source Chinese speech database AISHELL-ASR0009-OS1, which includes a training set and a test set.

[0039] As shown in Figure 2, the voiceprint recognition model training steps are as follows:

[0040] 1) Speech signal preprocessing

[0041] Each speech segment in the corpus is divided into 25 ms frames. Voice activity detection is performed to identify and remove long periods of silence from the audio stream, 20-dimensional Mel-frequency cepstral coefficients (MFCCs) are generated, and first-order and second-order...
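The framing and silence-removal part of this preprocessing step can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the 25 ms frame length comes from the text, while the 10 ms hop, the 16 kHz sample rate, the energy-based detector, and the -40 dB threshold are hypothetical choices. The function names `frame_signal` and `energy_vad` are made up for this example. MFCC extraction and the truncated first-order and second-order step are not covered.

```python
import numpy as np

def frame_signal(signal, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Split a 1-D signal into overlapping frames.

    frame_ms=25 follows the text; hop_ms=10 is an assumption.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    # Row i holds samples [i * hop_len, i * hop_len + frame_len).
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]

def energy_vad(frames, threshold_db=-40.0):
    """Return a boolean mask of frames judged to contain speech.

    A frame is kept when its log energy lies within threshold_db of the
    loudest frame. Both the detector and the threshold are assumptions.
    """
    if frames.shape[0] == 0:
        return np.zeros(0, dtype=bool)
    energy = np.sum(frames ** 2, axis=1) + 1e-12  # avoid log(0) on pure silence
    log_energy = 10.0 * np.log10(energy)
    return log_energy > (log_energy.max() + threshold_db)

# Synthetic input: 1 s silence, 1 s of a 220 Hz tone, 1 s silence.
sample_rate = 16000  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 220 * t)
silence = np.zeros(sample_rate)
signal = np.concatenate([silence, tone, silence])

frames = frame_signal(signal, sample_rate)
mask = energy_vad(frames)
voiced_frames = frames[mask]  # silent frames are dropped here
```

A simple energy threshold is used only because it keeps the sketch short; any voice activity detector that yields a per-frame mask could be substituted at the same point.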


Abstract

The invention discloses a text-independent voiceprint recognition method. The method comprises three stages: voiceprint recognition model training, embedding extraction, and decision scoring. The model training stage comprises the steps of (1) speech signal preprocessing; (2) frame-level operations on the speech; (3) a statistics pooling layer that aggregates the frame-level outputs; (4) one-dimensional convolution; and (5) fully connected layers that output the speaker classification. After model training is completed, the embedding is extracted from the first fully connected layer, before its nonlinearity. Finally, decision scoring is performed with the cosine distance to decide acceptance or rejection. The method combines neural network embedding technology with a convolutional neural network: one-dimensional convolution and a max pooling layer are used for dimensionality reduction, and the number of convolution layers is increased to extract deep features, which improves the performance of the model. Using the cosine distance as the scoring criterion makes the process faster and simpler.
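Two of the operations named in the abstract, pooling frame-level outputs into one utterance-level vector and scoring a pair of embeddings by cosine, can be illustrated with a short sketch. It rests on assumptions that the text does not confirm: the pooled statistics are taken to be the per-dimension mean and standard deviation, the score is computed as cosine similarity (cosine distance is one minus this value), and the acceptance threshold of 0.5 is hypothetical. The function names are made up for this example.

```python
import numpy as np

def statistics_pooling(frame_outputs):
    """Summarize (n_frames, dim) frame-level outputs as one (2 * dim,) vector.

    Mean and standard deviation are an assumed choice of statistics.
    """
    mean = frame_outputs.mean(axis=0)
    std = frame_outputs.std(axis=0)
    return np.concatenate([mean, std])

def cosine_score(emb_a, emb_b, eps=1e-12):
    """Cosine similarity between two embeddings, in [-1, 1]."""
    denom = np.linalg.norm(emb_a) * np.linalg.norm(emb_b) + eps
    return float(np.dot(emb_a, emb_b) / denom)

def decide(enrolled_emb, test_emb, threshold=0.5):
    """Accept when the score reaches the (hypothetical) threshold."""
    return cosine_score(enrolled_emb, test_emb) >= threshold

# Pooling: 200 frames of 8-dimensional outputs become one 16-dimensional vector.
rng = np.random.default_rng(0)
frame_outputs = rng.normal(size=(200, 8))
pooled = statistics_pooling(frame_outputs)

# Scoring: a scaled copy points in the same direction, an orthogonal vector does not.
enrolled = np.array([1.0, 2.0, 3.0])
same_direction = 2.0 * enrolled
orthogonal = np.array([-3.0, 0.0, 1.0])

accepted = decide(enrolled, same_direction)
rejected = not decide(enrolled, orthogonal)
```

Because the cosine score depends only on the angle between the two vectors, no separately trained scoring backend is needed, which is consistent with the abstract's remark that the process is quicker and simpler.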

Description

Technical field

[0001] The invention relates to the technical field of voiceprint recognition, and in particular to a text-independent voiceprint recognition method that combines neural network embedding technology with a convolutional neural network.

Background art

[0002] A voiceprint is the sound wave spectrum that carries speech information in human speech. Like a fingerprint, it is a unique biometric characteristic that can be used for identification; it is both specific to the individual and relatively stable. The sound signal is a one-dimensional continuous signal. After discretization, it becomes a sound signal that an ordinary computer can process.

[0003] Voiceprint recognition (also known as speaker recognition), like the fingerprint recognition widely used on smartphones, extracts voice features from the speech signal produced by the speaker and uses them to authenticate the speaker. r...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G10L17/02, G10L17/04, G10L17/18, G10L17/22
CPC: G10L17/02, G10L17/04, G10L17/18, G10L17/22
Inventor: 郭炜强, 平怡强, 张宇, 郑波
Owner: SOUTH CHINA UNIV OF TECH