Multi-background modeling method for speaker recognition

A background modeling technique for speaker recognition, applied in the field of speech recognition. It addresses the inaccurate division of training data and the coarseness of existing background models, and thereby improves recognition accuracy.

Status: Inactive
Publication Date: 2011-11-09

TSINGHUA UNIV


AI Technical Summary

Problems solved by technology

[0006] Dividing all speakers by gender is a natural but external division, and it is not necessarily accurate with respect to the speech signal itself.

Embodiment Construction

[0037] In a GMM-UBM system, building the UBM is a crucial step. However, there is still no complete theoretical guidance on how to select UBM training data; researchers can only choose the data empirically, judging by the final experimental results. In general there are two kinds of UBM, gender-independent and gender-dependent, and the gender-dependent UBM performs better. The invention generalizes the gender-dependent UBM: it divides the training data according to vocal tract length and obtains multiple background models. The implementation can be divided into three modules.
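The division of training data into groups can be illustrated with a minimal sketch. It assumes each training utterance already has an estimated vocal tract length warping (bending) factor; the function name `group_by_warp_factor`, the utterance IDs, and the cut points 0.95 and 1.05 are hypothetical and are not taken from the patent.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_warp_factor(warp_factors, boundaries):
    """Partition utterance IDs into groups by warping factor.

    warp_factors: dict mapping utterance ID -> estimated warping factor.
    boundaries:   sorted list of cut points (hypothetical values);
                  K cut points produce up to K + 1 groups.
    """
    groups = defaultdict(list)
    for utt_id, alpha in warp_factors.items():
        # bisect_right returns the index of the interval containing alpha.
        groups[bisect_right(boundaries, alpha)].append(utt_id)
    return dict(groups)


# Hypothetical warping factors for five training utterances.
warp_factors = {"utt1": 0.90, "utt2": 0.97, "utt3": 1.00,
                "utt4": 1.06, "utt5": 1.12}
groups = group_by_warp_factor(warp_factors, boundaries=[0.95, 1.05])
```

One UBM would then be trained on the utterances of each group, in place of the two models of a gender-dependent system.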

[0038] Module 1: Multi-background model training module

[0039] First, the vocal tract length warping factor of the UBM training data must be obtained. This step uses the maximum likelihood criterion. First, all the training data are used to train a "neutral" GMM with the Baum-Welch algorithm,...
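Because the paragraph above is truncated, the following is only a sketch of how a maximum likelihood choice of warping (bending) factor could look, not the patent's exact procedure. It assumes that features have already been extracted for each candidate factor and that the "neutral" GMM has diagonal covariances; `gmm_avg_loglik`, `select_warp_factor`, the candidate factors, and all numeric values are hypothetical.

```python
import math


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xd, md, vd in zip(x, mu, var):
                log_p += -0.5 * (math.log(2.0 * math.pi * vd)
                                 + (xd - md) ** 2 / vd)
            comp_logs.append(log_p)
        # Log-sum-exp over mixture components for numerical stability.
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def select_warp_factor(features_by_alpha, weights, means, variances):
    """Return the candidate factor whose features score highest.

    features_by_alpha: dict mapping candidate warping factor -> list of
    feature vectors extracted with that factor (assumed precomputed).
    """
    return max(
        features_by_alpha,
        key=lambda a: gmm_avg_loglik(features_by_alpha[a],
                                     weights, means, variances),
    )


# Toy "neutral" GMM: two 1-D components (hypothetical parameters).
weights = [0.5, 0.5]
means = [[-1.0], [1.0]]
variances = [[1.0], [1.0]]

# Hypothetical features of one utterance under three candidate factors.
features_by_alpha = {
    0.9: [[3.0], [4.0]],
    1.0: [[-1.0], [1.0]],
    1.1: [[-4.0], [5.0]],
}
best_alpha = select_warp_factor(features_by_alpha, weights, means, variances)
```

Passing precomputed features keeps the sketch independent of any particular frequency-warping implementation, which the visible text does not specify.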

Abstract

The invention discloses a multi-background modeling method for speaker recognition and relates to background modeling in speaker recognition. The method first divides the training data according to the vocal tract length warping factor of the speech and trains a UBM (Universal Background Model) on each group of data. A target speaker GMM (Gaussian Mixture Model) is then obtained by adaptation from each background model, yielding multiple GMM-UBM pairs. When recognizing a speaker, each GMM-UBM pair is evaluated on the test data to obtain a log-likelihood ratio score, and the minimum of these scores is output. The invention allows the background model to be described more finely, thereby improving the accuracy of speaker recognition.
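The scoring step described in the abstract can be sketched as follows. This is an illustrative toy under stated assumptions: each model is reduced to a single 1-D Gaussian so the example stays short, and the names `gaussian_logpdf`, `avg_loglik`, `multi_background_score`, and all parameter values are hypothetical. A real system would use adapted GMMs and trained UBMs in their place.

```python
import math


def gaussian_logpdf(mean, var):
    """Return a function computing the log-density of a 1-D Gaussian."""
    def logpdf(x):
        return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)
    return logpdf


def avg_loglik(frames, logpdf):
    """Average per-frame log-likelihood of the test data under one model."""
    return sum(logpdf(x) for x in frames) / len(frames)


def multi_background_score(frames, model_pairs):
    """Score test data against every (target model, background model) pair.

    Returns the minimum log-likelihood ratio, as stated in the abstract,
    together with the list of per-pair ratios.
    """
    llrs = [avg_loglik(frames, target) - avg_loglik(frames, ubm)
            for target, ubm in model_pairs]
    return min(llrs), llrs


# Two hypothetical (target, background) pairs, one per data group.
model_pairs = [
    (gaussian_logpdf(0.0, 1.0), gaussian_logpdf(-2.0, 4.0)),
    (gaussian_logpdf(0.2, 1.0), gaussian_logpdf(0.5, 4.0)),
]
test_frames = [0.1, -0.3, 0.4]
score, llrs = multi_background_score(test_frames, model_pairs)
```

The returned `score` would then be compared against a decision threshold, which the abstract does not specify.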

Description

Technical field

[0001] The invention belongs to the field of speech recognition and in particular relates to a method for building multiple background models, which can be used for speaker recognition.

Background

[0002] Speaker recognition refers to using a machine to identify a speaker's identity from a speech signal. The technology is mainly used in voice-based identity verification, voice interception, forensic evidence identification and other fields.

[0003] Speaker recognition methods mainly include VQ (Vector Quantization), GMM-UBM (Gaussian Mixture Model-Universal Background Model), SVM (Support Vector Machine) and others. Among them, GMM-UBM is simple to implement, performs well, and is widely used in speaker recognition.

[0004] In the GMM-UBM system, the UBM describes the feature distribution of the average person, while the GMM describes the feature distribution of the target speaker. In the trainin...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L15/06, G10L15/02, G10L15/07

Inventors: 张卫强, 刘加

Owner: TSINGHUA UNIV
