Speaker identification method based on deep stack autoencoder network

A speaker recognition technique based on an autoencoder network, applied in speech analysis, instruments, and related fields. It addresses the problem of limited model performance and is reported to improve recognition performance, reduce the impact of noise on system performance, and improve robustness.

Pending Publication Date: 2019-02-15

HUBEI UNIV OF TECH

Cites: 3; Cited by: 14

AI Technical Summary

Problems solved by technology

However, the total variability model and the linear discriminant analysis model in the i-vector framework are feasible only if the speaker information and the channel information are linearly separable. In practice, a linear separation rarely separates the two effectively, which limits model performance in complex real-world environments.

Embodiment Construction

[0034] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to examples. The examples described here serve only to illustrate and explain the present invention and are not intended to limit it.

[0035] The present invention will be further explained below in conjunction with specific embodiments.

[0036] Referring to Figures 1-4, the speaker recognition method based on a deep stack autoencoder network can be divided into three parts: 1) speaker feature extraction; 2) network design of the stack autoencoder; 3) speaker recognition and decision (softmax).
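The three-part structure can be illustrated with a minimal forward-pass sketch. This is not the patent's implementation: the layer sizes, the sigmoid activation, the random weights, the averaging of frame-level posteriors, and the helper names `init_stack` and `predict_speaker` are all assumptions made for illustration. In practice each encoder layer would first be trained as an autoencoder to reconstruct its input before the softmax layer is fitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_stack(layer_sizes, n_speakers, seed=0):
    """Randomly initialise stacked encoder layers and a softmax classifier.

    Hypothetical helper: real weights would come from layer-wise
    autoencoder pretraining followed by supervised fine-tuning.
    """
    rng = np.random.default_rng(seed)
    encoders = [
        (rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]
    classifier = (
        rng.normal(0.0, 0.1, (layer_sizes[-1], n_speakers)),
        np.zeros(n_speakers),
    )
    return encoders, classifier

def predict_speaker(features, encoders, classifier):
    """Map frame-level features (n_frames, n_dims) to a speaker index."""
    h = features
    for W, b in encoders:
        # Each stacked layer applies the encoder half of one autoencoder.
        h = sigmoid(h @ W + b)
    W_out, b_out = classifier
    posteriors = softmax(h @ W_out + b_out)
    # Assumption: average frame posteriors to reach an utterance-level decision.
    return int(posteriors.mean(axis=0).argmax()), posteriors
```

For example, `predict_speaker(np.zeros((10, 26)), *init_stack([26, 64, 32], n_speakers=5))` returns a speaker index in the range 0 to 4 together with a 10 x 5 matrix of per-frame posteriors.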

[0037] 1) Speaker feature extraction, the steps are as follows:

[0038] A. Collect the original speech signal and apply pre-emphasis, framing, windowing, fast Fourier transform (FFT), triangular-window filtering, logarithm, discrete ...
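The front-end steps named in this paragraph, up to the logarithm, can be sketched as follows. The source text is truncated after "discrete", so the sketch stops at the log filterbank energies and does not guess the remaining steps. The mel spacing of the triangular filters, the Hamming window, and all parameter values (16 kHz sampling, 25 ms frames, 10 ms step, 26 filters) are assumptions, and `log_filterbank_features` is a hypothetical name.

```python
import numpy as np

def log_filterbank_features(signal, sample_rate=16000, frame_len=400,
                            frame_step=160, n_fft=512, n_filters=26,
                            preemph=0.97):
    """Pre-emphasis, framing, windowing, FFT, triangular filters, log."""
    signal = np.asarray(signal, dtype=np.float64)

    # Pre-emphasis: first-order difference that boosts high frequencies.
    emphasized = np.append(signal[0], signal[1:] - preemph * signal[:-1])

    # Framing: overlapping frames, zero-padding the tail of the signal.
    n_frames = 1 + max(0, int(np.ceil((len(emphasized) - frame_len) / frame_step)))
    padded_len = (n_frames - 1) * frame_step + frame_len
    padded = np.append(emphasized, np.zeros(padded_len - len(emphasized)))
    idx = np.arange(frame_len)[None, :] + frame_step * np.arange(n_frames)[:, None]
    frames = padded[idx]

    # Windowing: taper each frame (Hamming window is an assumption).
    frames = frames * np.hamming(frame_len)

    # FFT: power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Triangular filters, spaced on the mel scale (an assumption).
    def hz_to_mel(hz):
        return 2595.0 * np.log10(1.0 + hz / 700.0)

    def mel_to_hz(mel):
        return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)

    # Logarithm of the filterbank energies, floored to avoid log(0).
    energies = power @ fbank.T
    return np.log(np.maximum(energies, np.finfo(float).eps))
```

With the default parameters, one second of 16 kHz audio yields a matrix with one row per frame and 26 columns, which could then serve as input to the network stage.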

Abstract

The invention relates to a speaker identification method based on a deep stack autoencoder network. The method comprises the steps of S1, speaker feature extraction; S2, stack autoencoder network design; and S3, speaker identification and decision making. Compared with traditional speaker identification, the method fuses the deep stack autoencoder network with the speaker identification system model and uses the multi-layer structure of the stack autoencoder to improve the representational ability of the model. As a result, identification performance in the presence of background noise is effectively improved, the influence of noise on system performance is reduced, noise robustness is improved, the system structure is optimized, and identification timeliness is enhanced.

Description

Technical field

[0001] The invention relates to the technical field of computer vision, in particular to a speaker recognition method based on a deep stack autoencoder network.

Background technique

[0002] Speaker recognition, also known as voiceprint recognition, is a biometric authentication technology that uses the speaker-specific information contained in voice signals to identify the speaker. In recent years, the identity vector (i-vector) speaker modeling method based on factor analysis has significantly improved the performance of speaker recognition systems. The i-vector method uses a low-dimensional total variability space to represent both the speaker subspace and the channel subspace, and maps the speaker's voice into this space to obtain a fixed-length vector representation (i.e., the i-vector). A speaker recognition system based on i-vectors mainly includes three steps: extraction of sufficient statistics, i-vector mapping, and calculation of likelihood...
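For orientation, the mapping into the total variability space is commonly written as below. This is the standard textbook form of the i-vector model rather than a formula quoted from this patent, and the symbols are assumptions introduced here: $M$ is the utterance-dependent Gaussian mixture mean supervector, $m$ is the speaker- and channel-independent supervector, $T$ is the low-rank total variability matrix, and $w$ is the i-vector.

```latex
M = m + T w, \qquad w \sim \mathcal{N}(0, I)
```

Because a single matrix $T$ captures both speaker and channel variability, later stages such as linear discriminant analysis are needed to separate the two, which is where the linear-separability assumption enters.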

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L17/00, G10L17/02, G10L17/04, G10L17/18, G10L17/22

CPC: G10L17/02, G10L17/04, G10L17/18, G10L17/22, G10L17/00

Inventor: 曾春艳, 马超峰, 武明虎, 叶佳翔, 朱莉, 王娟, 吕松南, 朱栋梁, 蔡松

Owner: HUBEI UNIV OF TECH
