Multi-background modeling method for speaker recognition

A background modeling technique for speaker recognition, applied in the field of speech recognition. It addresses the problem that the usual division of training data is not necessarily accurate, and by overcoming that inaccuracy it improves recognition accuracy.

Status: Inactive. Publication date: 2010-09-15
TSINGHUA UNIV
10 Cites, 40 Cited by

AI Technical Summary

Problems solved by technology

[0006] Dividing all speakers by gender is a natural, external division, but it is not necessarily an accurate one for speech signals.



Embodiment Construction

[0037] In a GMM-UBM system, building the UBM is a crucial step. However, there is still no complete theoretical guidance on how to select UBM training data; researchers select it empirically, judging by the final experimental results. Generally, there are two kinds of UBM, gender-independent and gender-dependent, and the gender-dependent UBM performs better. The invention generalizes the gender-dependent UBM: it divides the training data according to vocal tract length and obtains multiple background models. The implementation can be divided into three modules.
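The division of training data described in [0037] can be pictured with a minimal sketch. It assumes that each training utterance already has an estimated vocal tract length warping coefficient. The function name `group_by_warp_factor`, the utterance IDs, and the boundary values are hypothetical and chosen only for illustration; the source does not state how many groups are used or where the boundaries lie.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_warp_factor(utterance_alphas, boundaries):
    """Partition utterance IDs into len(boundaries) + 1 groups.

    utterance_alphas: dict mapping utterance ID -> warping coefficient.
    boundaries: sorted list of cut points (illustrative values, an assumption).
    """
    groups = defaultdict(list)
    for utt_id, alpha in utterance_alphas.items():
        # bisect_right returns the index of the interval that alpha falls into.
        groups[bisect_right(boundaries, alpha)].append(utt_id)
    return dict(groups)


# Toy data: four utterances with made-up warping coefficients.
utterance_alphas = {"utt_a": 0.92, "utt_b": 1.00, "utt_c": 1.08, "utt_d": 0.97}
groups = group_by_warp_factor(utterance_alphas, boundaries=[0.96, 1.04])
print(groups)
```

Each resulting group would then serve as the training set for one background model.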

[0038] Module 1: Multi-background model training module

[0039] First, the vocal tract length warping coefficient of the UBM training data must be obtained. This step uses the maximum likelihood criterion. First, all the training data are used to train a "neutral" GMM with the Baum-Welch algorithm,...
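The maximum likelihood criterion mentioned in [0039] might be sketched as a search over candidate warping coefficients. This is a sketch under stated assumptions, not the patent's procedure, whose remaining details are not shown here. It assumes features have already been extracted once per candidate coefficient (the warping itself is not implemented), that the "neutral" GMM has diagonal covariances, and that no Jacobian correction is applied. The names `pick_warp_factor` and `features_by_alpha` are hypothetical.

```python
import math


def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM."""
    comp = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for x, m, v in zip(frame, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        comp.append(ll)
    top = max(comp)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comp))


def avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of an utterance under a GMM."""
    weights, means, variances = gmm
    total = sum(diag_gmm_loglik(f, weights, means, variances) for f in frames)
    return total / len(frames)


def pick_warp_factor(features_by_alpha, neutral_gmm):
    """Return the candidate coefficient whose warped features score highest."""
    return max(features_by_alpha,
               key=lambda a: avg_loglik(features_by_alpha[a], neutral_gmm))


# Toy 1-D example: a two-component "neutral" GMM and three candidates.
neutral_gmm = ([0.5, 0.5], [[0.0], [4.0]], [[1.0], [1.0]])
features_by_alpha = {
    0.9: [[8.0], [9.0]],
    1.0: [[0.1], [3.9]],
    1.1: [[-5.0], [12.0]],
}
best_alpha = pick_warp_factor(features_by_alpha, neutral_gmm)
print(best_alpha)
```

In the toy data, the features for the middle candidate lie closest to the GMM means, so that candidate is selected.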



Abstract

The invention discloses a multi-background modeling method for speaker recognition. The method first divides the training data according to the vocal tract length warping coefficient of the speech and trains a UBM (Universal Background Model) on each group of data. A target-speaker GMM (Gaussian Mixture Model) is then obtained by adaptation from each background model, giving multiple GMM-UBM pairs. At recognition time, each GMM-UBM pair is applied to the test data to compute a log-likelihood ratio score, and the minimum of these scores is selected as the output. The invention allows the background to be modeled in finer detail, thereby improving speaker recognition accuracy.
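The recognition step in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance GMMs and an average per-frame log-likelihood ratio, and it uses hand-made toy models in place of trained UBMs and adapted speaker GMMs; the adaptation step is not shown. The function names `llr` and `multi_background_score` are hypothetical.

```python
import math


def diag_gmm_loglik(frame, gmm):
    """Log-likelihood of one frame under a diagonal-covariance GMM.

    gmm is a tuple (weights, means, variances).
    """
    weights, means, variances = gmm
    comp = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for x, m, v in zip(frame, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        comp.append(ll)
    top = max(comp)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comp))


def llr(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio of speaker GMM versus UBM."""
    diffs = [diag_gmm_loglik(f, speaker_gmm) - diag_gmm_loglik(f, ubm)
             for f in frames]
    return sum(diffs) / len(diffs)


def multi_background_score(frames, model_pairs):
    """Score against every (speaker GMM, UBM) pair and output the minimum,
    as the abstract describes. Also returns the per-pair scores."""
    scores = [llr(frames, spk, ubm) for spk, ubm in model_pairs]
    return min(scores), scores


# Toy 1-D, single-component models standing in for two background groups.
pair_1 = (([1.0], [[1.0]], [[1.0]]), ([1.0], [[0.0]], [[1.0]]))
pair_2 = (([1.0], [[1.0]], [[1.0]]), ([1.0], [[0.5]], [[1.0]]))
test_frames = [[1.0], [1.2]]
score, all_scores = multi_background_score(test_frames, [pair_1, pair_2])
print(score, all_scores)
```

A verification decision would then compare the selected score against a threshold, which is outside the scope of this sketch.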

Description

Technical field

[0001] The invention belongs to the field of speech recognition, and in particular relates to a method for building multiple background models, which can be used for speaker recognition.

Background

[0002] Speaker recognition refers to using a machine to identify a speaker's identity from a speech signal. Speaker recognition technology is mainly used in voice-based identity verification, voice interception, forensic evidence identification, and other fields.

[0003] Speaker recognition methods mainly include VQ (Vector Quantization), GMM-UBM (Gaussian Mixture Model-Universal Background Model), SVM (Support Vector Machine), and others. Among them, GMM-UBM is simple to implement, performs well, and is widely used in speaker recognition.

[0004] In a GMM-UBM system, the UBM describes the feature distribution of the average person, while the GMM describes the feature distribution of the target speaker. In the trainin...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06, G10L15/02, G10L15/07
Inventors: 张卫强, 刘加
Owner: TSINGHUA UNIV