Speaker identification method based on deep stack autoencoder network

A speaker recognition and autoencoder network technology, applied in speech analysis, instruments, etc. It addresses problems such as limited model performance, and achieves improved recognition performance, reduced impact of noise on system performance, and improved robustness.

Pending Publication Date: 2019-02-15
HUBEI UNIV OF TECH
3 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

The total variability model and the linear discriminant analysis model in the i-vector framework are feasible only on the premise that speaker information and channel information are linearly separable. In practice, a linear separation struggles to separate the two effectively, which limits model performance in complex real-world environments.


Embodiment Construction

[0034] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below in conjunction with examples. It should be understood that the implementation examples described here serve only to illustrate and explain the present invention, and are not intended to limit it.

[0035] The present invention will be further explained below in conjunction with specific embodiments.

[0036] Referring to Figures 1-4, the speaker recognition method based on a deep stack autoencoder network can be divided into three parts: 1) speaker feature extraction; 2) network design of the stack autoencoder; 3) speaker recognition and decision (softmax).
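As a rough illustration of part 3), the sketch below shows how a softmax decision over per-speaker scores might look. The function names, the speaker labels, and the example scores are hypothetical and do not come from the patent; only the use of softmax for the final decision is stated in the text.

```python
import math

def softmax(scores):
    """Convert raw per-speaker scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decide_speaker(scores, speaker_ids):
    """Return the most probable speaker together with all probabilities."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return speaker_ids[best], probs

# Hypothetical scores for three enrolled speakers (assumed values).
speaker, probs = decide_speaker([1.2, 3.4, 0.5], ["spk_a", "spk_b", "spk_c"])
```

The speaker with the largest score also has the largest probability, so the softmax mainly adds a normalized confidence value to the decision.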

[0037] 1) Speaker feature extraction. The steps are as follows:

[0038] A. Collect the original speech signal and apply pre-emphasis, framing, windowing, fast Fourier transform (FFT), triangular window filtering, logarithm, discrete ...
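The front-end steps named in step A can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the patent's implementation: the pre-emphasis coefficient 0.97, the Hamming window, the mel spacing of the triangular filters, the frame sizes, and all function names are assumptions. The sketch stops at the logarithm because the source text is truncated after "discrete".

```python
import cmath
import math

def pre_emphasize(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1] (alpha assumed)."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def frame_and_window(signal, frame_len, hop):
    """Split into overlapping frames and apply a Hamming window (assumed)."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]
    return [[signal[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, hop)]

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10 ** (mel / 2595.0) - 1.0)

def triangular_filterbank(num_filters, frame_len, sample_rate):
    """Triangular filters with mel-spaced edges (mel spacing is an assumption)."""
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((frame_len + 1) * f / sample_rate)) for f in edges]
    bank = []
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (frame_len // 2 + 1)
        for k in range(left, center):      # rising slope
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank

def log_filterbank_features(signal, sample_rate, frame_len=256, hop=128, num_filters=20):
    """Pre-emphasis, framing, windowing, FFT, triangular filtering, logarithm."""
    bank = triangular_filterbank(num_filters, frame_len, sample_rate)
    features = []
    for frame in frame_and_window(pre_emphasize(signal), frame_len, hop):
        power = [abs(c) ** 2 / frame_len for c in fft(frame)[: frame_len // 2 + 1]]
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        features.append([math.log(max(e, 1e-10)) for e in energies])  # floor avoids log(0)
    return features

# Synthetic 440 Hz tone as a stand-in for a recorded speech signal.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1024)]
features = log_filterbank_features(tone, sample_rate)
```

Each entry of `features` holds one frame's log filterbank energies. In practice a numerical library would replace the hand-written FFT; it is written out here only to keep the sketch self-contained.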

Abstract

The invention relates to a speaker identification method based on a deep stack autoencoder network. The method comprises the steps of: S1, speaker feature extraction; S2, stack autoencoder network design; and S3, speaker identification and decision making. Compared with traditional speaker identification, the method fuses the deep stack autoencoder network with the speaker identification system model and uses the multi-layer structure of the stack autoencoder to improve the characterization ability of the evaluation model. As a result, identification performance in the presence of background noise is improved, the influence of noise on system performance is reduced, noise robustness is improved, the system structure is optimized, and identification timeliness is effectively enhanced.
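To make the multi-layer structure mentioned for step S2 concrete, here is a minimal sketch of stacked autoencoder layers, where each layer's hidden code becomes the next layer's input. The layer sizes, the sigmoid activation, the random untrained weights, and all names are assumptions for illustration; the abstract does not specify the network configuration, and training is omitted entirely.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine_sigmoid(weights, biases, x):
    """Compute sigmoid(W x + b) for one layer."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

class AutoencoderLayer:
    """One autoencoder: an encoder to a hidden code and a decoder back."""

    def __init__(self, n_in, n_hidden, rng):
        # Small random weights; a real system would learn these by training.
        self.w_enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b_enc = [0.0] * n_hidden
        self.w_dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b_dec = [0.0] * n_in

    def encode(self, x):
        return affine_sigmoid(self.w_enc, self.b_enc, x)

    def decode(self, h):
        return affine_sigmoid(self.w_dec, self.b_dec, h)

def stack_encode(layers, x):
    """Feed x through each layer's encoder in turn and return every code."""
    codes = []
    for layer in layers:
        x = layer.encode(x)
        codes.append(x)
    return codes

rng = random.Random(0)
sizes = [20, 12, 6]  # hypothetical: 20 input features, then codes of 12 and 6
layers = [AutoencoderLayer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
frame = [rng.uniform(0.0, 1.0) for _ in range(sizes[0])]  # stand-in feature vector
codes = stack_encode(layers, frame)
reconstruction = layers[0].decode(codes[0])  # first layer's attempt to rebuild its input
```

The last element of `codes` is the deepest representation, which a classifier stage would consume. Each layer keeps its own decoder because an autoencoder is typically trained to reconstruct its input before the layers are stacked.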

Description

Technical field

[0001] The invention relates to the technical field of computer vision, in particular to a speaker recognition method based on a deep stack autoencoder network.

Background technique

[0002] Speaker recognition, also known as voiceprint recognition, is a biometric authentication technology that uses the speaker-specific information contained in voice signals to identify the speaker. In recent years, the identity vector (i-vector) speaker modeling method based on factor analysis has significantly improved the performance of speaker recognition systems. The i-vector method uses a low-dimensional total variability space to represent the speaker subspace and channel subspace, and maps the speaker's speech to this space to obtain a fixed-length vector representation (i.e., the i-vector). An i-vector-based speaker recognition system mainly includes three steps: extraction of sufficient statistics, i-vector mapping, and calculation of likelihood...
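For reference, the mapping into the total variability space described in the background is commonly written as the equation below. The symbols are not taken from the patent text; they follow the usual convention in the i-vector literature.

```latex
M = m + T w, \qquad w \sim \mathcal{N}(0, I)
```

Here $M$ is the speaker- and channel-dependent GMM mean supervector of an utterance, $m$ is the speaker-independent supervector of the universal background model, $T$ is the low-rank total variability matrix, and $w$ is the fixed-length i-vector. Because a single matrix $T$ captures speaker and channel variability together, separating the two is left to later stages such as linear discriminant analysis.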

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L17/00, G10L17/02, G10L17/04, G10L17/18, G10L17/22
CPC: G10L17/02, G10L17/04, G10L17/18, G10L17/22, G10L17/00
Inventor: 曾春艳, 马超峰, 武明虎, 叶佳翔, 朱莉, 王娟, 吕松南, 朱栋梁, 蔡松
Owner: HUBEI UNIV OF TECH