Method and device for recognizing speaker based on tensor subspace analysis

A speaker recognition technology based on tensor subspace analysis. It addresses the difficulty and high computational complexity of existing approaches, and achieves lower complexity and a reduced computational load.

Active Publication Date: 2012-10-17
IFLYTEK CO LTD

AI Technical Summary

Problems solved by technology

[0004] Mapping the speaker Gaussian mixture model into a supervector has the following problems: (1) the computational complexity is high
[0050] Since both U and V are variables, the above formula is difficult to solve
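To give a feel for the complexity problem named in [0004], the following sketch compares parameter counts under stated assumptions. It assumes the speaker model is a mixture with C components over D-dimensional features, that a supervector stacks all C x D mean values, and that U and V are the two sides of a bilinear projection applied to the C x D mean matrix. The function names and the sizes used (512 components, 39-dimensional features) are hypothetical and do not come from the patent text.

```python
def supervector_projection_params(C, D, r):
    """Parameters of one dense matrix mapping a (C*D)-dim supervector to r dims."""
    return C * D * r


def bilinear_projection_params(C, D, c_low, d_low):
    """Parameters of U (C x c_low) plus V (D x d_low) in Y = U^T M V."""
    return C * c_low + D * d_low


# Hypothetical sizes, chosen only for illustration.
C, D = 512, 39          # mixture components, feature dimension
c_low, d_low = 50, 10   # size of the low-dimensional embedding matrix

dense = supervector_projection_params(C, D, c_low * d_low)
bilinear = bilinear_projection_params(C, D, c_low, d_low)
print(dense, bilinear)
```

Under these assumed sizes the two-sided projection needs far fewer parameters than a single dense projection of the supervector, which is one plausible reading of the reduced computational load the summary claims.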



Examples


Embodiment Construction

[0099] The speaker recognition method and device based on tensor subspace analysis proposed by the present invention are described in detail below with reference to the accompanying drawings and embodiments.

[0100] The present invention proposes a speaker recognition method based on tensor subspace analysis. As shown in Figure 2, the method comprises a general model training stage, a speaker model training stage, and a testing stage, characterized as follows:

[0101] 1) The general model training stage includes the following steps:

[0102] 1-a) Convert the speech data for training the general background Gaussian mixture model into spectral features through speech preprocessing and feature extraction;

[0103] 1-b) Based on the extracted spectral features, use the K-means or LBG algorithm to initialize the general background Gaussian mixture model;

[0104] 1-c) Use the expectation-maximization (EM) algorithm to update the general background Gaussian m...
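Steps 1-b) and 1-c) can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes diagonal covariances, a fixed number of iterations, and a variance floor, none of which are stated in the text above. The function names (`kmeans_init`, `train_ubm`) and all default values are hypothetical, and the feature extraction of step 1-a) is omitted, so `X` stands in for an array of spectral feature frames.

```python
import numpy as np


def kmeans_init(X, k, iters=10, seed=0):
    """Plain K-means, used only to seed the mixture means (step 1-b)."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each frame to its nearest mean (squared Euclidean distance).
        d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                means[j] = X[labels == j].mean(axis=0)
    return means


def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal-covariance Gaussian."""
    diff2 = (X[:, None, :] - means[None, :, :]) ** 2
    log_det = np.log(2.0 * np.pi * variances).sum(axis=1)
    return -0.5 * (log_det[None, :] + (diff2 / variances[None, :, :]).sum(axis=2))


def train_ubm(X, k, em_iters=20, var_floor=1e-3, seed=0):
    """K-means initialisation followed by EM updates (step 1-c)."""
    means = kmeans_init(X, k, seed=seed)
    weights = np.full(k, 1.0 / k)
    variances = np.tile(X.var(axis=0), (k, 1))
    log_liks = []
    for _ in range(em_iters):
        # E-step: posterior probability of each component for each frame.
        log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        post = np.exp(log_p - log_norm)
        log_liks.append(float(log_norm.mean()))
        # M-step: re-estimate weights, means, and variances.
        n = post.sum(axis=0) + 1e-10
        weights = n / n.sum()
        means = (post.T @ X) / n[:, None]
        second_moment = (post.T @ (X ** 2)) / n[:, None]
        variances = np.maximum(second_moment - means ** 2, var_floor)
    return weights, means, variances, log_liks
```

Calling `train_ubm(X, k)` on an `(N, D)` array of frames returns the mixture weights, a `(k, D)` mean matrix, a `(k, D)` variance matrix, and the per-iteration average log-likelihood, which is a convenient check that EM is converging.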



Abstract

The invention relates to the field of automatic speech recognition, in particular to a method and a device for recognizing a speaker based on tensor subspace analysis. The method comprises the steps of training a general background Gaussian mixture model and a general projection matrix from voice data; then establishing a speaker model from the trained general model and the voice of a target speaker; and finally calculating the correlation coefficient between the low-dimensional embedding matrix of the target speaker model and that of the tested voice, and using the correlation coefficient as the basis for recognizing the speaker. The device comprises a voice preprocessing module, a feature extraction module, the trained general background Gaussian mixture model, an adaptation module, the trained general projection matrix module, a low-dimensional embedding calculation module, a correlation coefficient calculation module, a score judging module and a storage module. The method and the device significantly reduce the amount of calculation needed to establish the speaker model, offer few model degrees of freedom and high robustness, and are applicable to text-independent speaker recognition when the length of the voice sample is limited.
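The scoring step described in the abstract can be sketched roughly as below. Several things here are assumptions rather than statements from the patent: that the low-dimensional embedding is the bilinear projection Y = U^T M V of a (components x feature dimension) mean matrix M, that the correlation coefficient is the Pearson correlation of the flattened embedding matrices, and that the decision is a simple threshold. The function names and the threshold value are hypothetical.

```python
import numpy as np


def low_dim_embedding(M, U, V):
    """Assumed bilinear projection Y = U^T M V of a (C x D) mean matrix."""
    return U.T @ M @ V


def correlation_score(Y_a, Y_b):
    """Pearson correlation of two equally shaped matrices, flattened."""
    a = Y_a.ravel() - Y_a.mean()
    b = Y_b.ravel() - Y_b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def accept(score, threshold=0.5):
    """Hypothetical score judgment: accept the claimed identity above a threshold."""
    return score >= threshold
```

In use, the target speaker's embedding would be computed once and stored, and each test utterance would be adapted, projected with the same U and V, and scored against it. The threshold would normally be tuned on held-out data rather than fixed in advance.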

Description

Technical field

[0001] The invention relates to the fields of speech recognition, pattern recognition and subspace analysis, and in particular to a speaker recognition method and device based on tensor subspace analysis.

Background

[0002] Speaker recognition, also known as voiceprint recognition, is a biometric technology that uses a computer to automatically determine a speaker's identity from their voice. Speaker recognition can be classified in several ways depending on the application scenario. By whether the speech content is known, it is divided into text-dependent and text-independent recognition. By recognition task, it is divided into speaker identification and speaker verification. Speaker recognition technology is mainly used in security monitoring, criminal investigation and justice, and e-commerce.

[0003] In recent years, mainstream text-...


Application Information

IPC(8): G10L15/28, G10L15/02, G10L15/22
Inventors: 刘加, 何亮, 孙贻滋
Owner: IFLYTEK CO LTD