
Verification method and device for speaker authentication and speaker authentication system

A technology relating to speaker authentication and verification devices, applied in speech analysis, instruments, and similar fields.

Inactive Publication Date: 2009-06-24
KK TOSHIBA

AI Technical Summary

Problems solved by technology

However, in DTW-based systems, a phoneme recognizer necessarily introduces additional storage requirements and computational load.

Method used



Examples


Embodiment 1

[0024] Figure 1 is a flow chart of the verification method for speaker authentication according to the first embodiment of the present invention.

[0025] The present embodiment will be described below with reference to this figure.

[0026] As shown in Figure 1, first, in step 101, a test utterance containing the password is input by the user undergoing verification. Here, the password is a specific phrase or pronunciation sequence that the user sets during the registration stage for use in verification.

[0027] Next, in step 102, a sequence of acoustic feature vectors is extracted from the test speech input in step 101. The present invention does not particularly limit the manner of expressing acoustic features; for example, MFCC (Mel-scale Frequency Cepstral Coefficients), LPCC (Linear Prediction Cepstral Coefficients), or various other coefficients derived from energy, pitch frequency, or wavelet analy...
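As a rough illustration of step 102, the sketch below frames a signal and computes a crude cepstrum-like feature per frame with numpy. It is a stand-in for a real MFCC/LPCC front end, not the patent's method, and all names and parameter values (400-sample frames, 160-sample hop, 13 coefficients) are illustrative assumptions:

```python
import numpy as np

def extract_features(signal, frame_len=400, hop=160, n_coeffs=13):
    """Split a 1-D signal into overlapping frames and return a crude
    cepstrum-like feature vector per frame (a stand-in for real MFCCs)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty((n_frames, n_coeffs))
    window = np.hamming(frame_len)
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + frame_len]
        spectrum = np.abs(np.fft.rfft(frame * window))
        log_spec = np.log(spectrum + 1e-10)
        # a simple cepstrum: inverse real FFT of the log magnitude spectrum
        cepstrum = np.fft.irfft(log_spec)
        feats[t] = cepstrum[:n_coeffs]
    return feats

rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)  # one second of noise at 16 kHz
X = extract_features(signal)
print(X.shape)  # one feature vector per 10 ms hop
```

The output is a (frames x coefficients) matrix, which is the "sequence of acoustic feature vectors" the later matching steps operate on.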

Example 1

[0040] In Example 1, the weight of each frame on the matching path is measured based on the feature distance between the target frame and its adjacent frames in time series.

[0041] First, the spectral change is measured for each frame of the speaker template X and of the test speech Y, respectively.

[0042] Specifically, formula (1) is used to calculate the spectral change d_x(i) of the speaker template X:

[0043] d_x(i) = (dist(x_i, x_{i-1}) + dist(x_i, x_{i+1})) / 2        (1)

[0044] where i is the frame index of the speaker template X, x_i is the feature vector of frame i in the speaker template X, and dist denotes the feature distance between two vectors, e.g., the Euclidean distance.

[0045] It should be understood that although formula (1), i.e., the feature distances dist(x_i, x_{i-1}) and dist(x_i, x_{i+1}), is used here to measure the spectral change of the speaker template X, the present invention is not limited thereto, and the feature distances dist(x_i, x_{i-1}) and dist...
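Formula (1) — half the sum of the feature distances between a frame and its two temporal neighbours — can be sketched as below. The boundary-frame handling (falling back to the single available neighbour) is an assumption, since the excerpt does not specify it:

```python
import numpy as np

def spectral_change(frames):
    """Per-frame spectral change: the average Euclidean feature distance
    between frame i and its neighbours i-1 and i+1. Boundary frames use
    only their one available neighbour (an assumed convention)."""
    T = len(frames)
    d = np.zeros(T)
    for i in range(T):
        dists = []
        if i > 0:
            dists.append(np.linalg.norm(frames[i] - frames[i - 1]))
        if i < T - 1:
            dists.append(np.linalg.norm(frames[i] - frames[i + 1]))
        d[i] = sum(dists) / len(dists)
    return d

# toy 1-D feature sequence: four frames
X = np.array([[0.0], [1.0], [3.0], [3.0]])
print(spectral_change(X))  # larger values where the spectrum moves faster
```

Frames in fast-changing regions get large d values; in Example 1 these values drive the per-frame weights on the matching path.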

Example 2

[0054] In Example 2, the weight of each frame on the matching path is measured based on segmentation processing using a codebook.

[0055] The codebook used in this embodiment is trained over the acoustic space of the entire application. For example, for a Chinese-language application environment the codebook needs to cover the acoustic space of Chinese speech; for an English-language application environment it needs to cover the acoustic space of English speech. Of course, for special-purpose application environments, the acoustic space covered by the codebook can be adjusted accordingly.

[0056] The codebook in this embodiment includes multiple codewords and the feature vector corresponding to each codeword. The number of codewords depends on the size of the acoustic space, the desired compression ratio, and the desired compression quality. The larger the acoustic space, the larger the number of codewor...
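As a rough illustration of codebook-based segmentation, the sketch below assigns each frame to its nearest codeword and collapses consecutive identical labels into segments. All names and the toy two-codeword codebook are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def assign_codewords(frames, codebook):
    """Label each frame with the index of its nearest codeword (Euclidean)."""
    d = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def segments(labels):
    """Collapse a frame-level label sequence into (codeword, run_length) pairs."""
    runs, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            runs.append((int(labels[start]), t - start))
            start = t
    return runs

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])       # toy 2-codeword codebook
frames = np.array([[0.1, 0.0], [0.2, 0.1], [0.9, 1.0],
                   [1.1, 0.9], [0.0, 0.1]])
labels = assign_codewords(frames, codebook)
print(segments(labels))  # runs of identical codewords form the segments
```

A real system would train the codebook (e.g., by k-means/LBG) over the whole application's acoustic space, as paragraph [0055] describes; segment lengths could then inform per-frame weights.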



Abstract

The invention provides a verification method, a verification device, and a verification system for verifying a speaker. As one aspect of the invention, the verification method comprises the following steps: inputting test speech uttered by a speaker, in which a password is contained; extracting an acoustic feature vector sequence from the test speech; obtaining a matching path between the extracted acoustic feature vector sequence and the speaker template registered by the speaker; calculating a matching score for the obtained matching path in consideration of the spectral change of the test speech and/or the spectral change of the speaker template; and comparing the matching score with a predefined discrimination threshold to determine whether the input test speech is speech containing the password uttered by the registered speaker.

Description

Technical Field

[0001] The present invention relates to information processing technology, and in particular to speaker authentication technology.

Background Technique

[0002] Different speakers can be identified by the pronunciation characteristics of each person's speech, on which basis speaker authentication can be performed. The article "Speaker recognition using hidden Markov models, dynamic time warping and vector quantisation" by K. Yu, J. Mason, and J. Oglesby (Vision, Image and Signal Processing, IEE Proceedings, Vol. 142, Oct. 1995, pp. 313-318; hereinafter referred to as Reference 1, the entire content of which is hereby incorporated by reference) introduces three common speaker recognition engine technologies: HMM (Hidden Markov Model), DTW (Dynamic Time Warping), and VQ (Vector Quantization).

[0003] Usually, a speaker verification system includes two parts: enrol...
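The background above names DTW as one of the standard engines, and the claimed method scores a DTW matching path between the test feature sequence and the speaker template. A minimal sketch of classic DTW (plain Euclidean local distance, no frame weighting; function and variable names are illustrative, not from the patent):

```python
import numpy as np

def dtw_path(X, Y):
    """Classic DTW: align two feature sequences (rows are frames) and
    return the matching path plus the total accumulated cost."""
    n, m = len(X), len(Y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(X[i - 1] - Y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # backtrack from the end to recover the matching path
    path, i, j = [], n, m
    while (i, j) != (0, 0):
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1], D[n, m]

# toy 1-D sequences: template X (3 frames) vs. test Y (2 frames)
X = np.array([[0.0], [1.0], [2.0]])
Y = np.array([[0.0], [2.0]])
path, cost = dtw_path(X, Y)
print(path, cost)
```

The invention's contribution is not this baseline but how the per-frame contributions along `path` are weighted by spectral change (Example 1) or codebook segmentation (Example 2) before the score is compared with the threshold.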

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L17/00
CPC: G10L17/24; G10L17/20
Inventors: 栾剑, 郝杰
Owner: KK TOSHIBA