Text-related speaker recognition method based on infinite-state hidden Markov model

A text-related speaker recognition technology, applied in the field of speaker recognition, that addresses problems such as poor robustness, a limited number of states, and overfitting of the training data.

Status: Inactive. Publication Date: 2011-07-20

NANJING UNIV OF POSTS & TELECOMM

5 Cites, 22 Cited by

Problems solved by technology

First, the number of states in the traditional GHMM is finite: it is preset before training and stays fixed throughout training, so the model easily overfits or underfits the training data.

Second, the output probability distribution function of each state in the traditional GHMM is represented by a Gaussian mixture model. In practical applications, a drawback of the Gaussian mixture model is its poor robustness to the noise and outliers that readily arise during data acquisition.

These problems often lead to poor recognition accuracy in traditional hidden Markov model-based text-related speaker recognition systems.

Embodiment Construction

[0058] The technical solutions of the present invention are further described below with reference to the drawings and embodiments. Figure 1 is a flow chart of the method of the present invention, which is divided into four steps.

[0059] The first step: preprocessing of the speech signal

[0060] (1) Sampling and quantization

[0061] Each analog speech signal y_a(t) in the training data set and in the recognition data set is sampled to obtain the amplitude sequence y(n) of the digital speech signal. y(n) is then quantized and coded using pulse code modulation (PCM) to obtain the quantized representation y'(n) of the amplitude sequence. The precision of sampling and quantization is chosen according to the requirements of the speaker recognition system in its operating environment. For most speech signals, the sampling frequency F is 8 kHz to 10 kHz, and the q...
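The sampling and PCM quantization described in this step can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patent's implementation: the 16-bit depth, the normalization of amplitudes to [-1, 1], the synthetic test tone, and the function names sample_signal and pcm_quantize are all hypothetical. Only the 8 kHz sampling frequency is taken from the range given in the text.

```python
import math


def sample_signal(analog, duration_s, fs_hz):
    """Sample a continuous-time signal analog(t) at fs_hz to obtain y(n)."""
    n_samples = int(round(duration_s * fs_hz))
    return [analog(n / fs_hz) for n in range(n_samples)]


def pcm_quantize(y, bits):
    """Uniform PCM: map amplitudes in [-1, 1] to signed integer codes y'(n)."""
    max_code = 2 ** (bits - 1) - 1
    codes = []
    for value in y:
        clipped = max(-1.0, min(1.0, value))  # clip out-of-range amplitudes
        codes.append(int(round(clipped * max_code)))
    return codes


FS_HZ = 8000  # sampling frequency F, lower end of the range in the text
BITS = 16     # assumed bit depth; the source text is truncated at this point


def y_a(t):
    """Hypothetical analog signal standing in for y_a(t): a 440 Hz tone."""
    return 0.5 * math.sin(2 * math.pi * 440.0 * t)


y = sample_signal(y_a, 0.01, FS_HZ)  # amplitude sequence y(n), 10 ms of audio
y_q = pcm_quantize(y, BITS)          # quantized sequence y'(n)
```

A real system would read y(n) from a recording device or audio file rather than from a synthetic tone; the tone is used here only to keep the sketch self-contained.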

Abstract

The invention discloses a text-related speaker recognition method based on an infinite-state hidden Markov model, which addresses the tendency of the traditional hidden Markov model to overfit or underfit the data. The method comprises the following steps. First, the speech signal set used for training is preprocessed and features are extracted. Then, during training, the training set is described by an infinite-state hidden Markov model: before the training data arrives, the model has an infinite number of states, and the output probability distribution function of each state is represented by a Student's t mixture model; after the training data arrives, the parameter values of the model and the distributions of its random variables are computed. During recognition, the speech to be recognized undergoes preprocessing and feature extraction, a likelihood value is computed for each trained speaker model, and the speaker with the maximum likelihood value is taken as the recognition result. The method effectively improves the recognition accuracy of a text-related speaker recognition system and makes the system more robust to noise.
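The recognition rule in the abstract (score the utterance against every trained speaker model and pick the maximum likelihood) can be sketched as below. This is a simplified illustration under several assumptions that are not from the patent: a small finite HMM stands in for each trained infinite-state model, features are scalars rather than feature vectors, likelihoods are computed with the standard forward algorithm, and all model parameters and names (speaker_a, speaker_b, recognize, and so on) are hypothetical toy values.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def student_t_logpdf(x, mean, scale, dof):
    """Log density of a univariate Student's t distribution."""
    z = (x - mean) / scale
    return (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
            - 0.5 * math.log(dof * math.pi) - math.log(scale)
            - (dof + 1) / 2 * math.log1p(z * z / dof))


def emission_loglik(x, mixture):
    """Log density of x under a Student's t mixture.

    mixture is a list of (weight, mean, scale, dof) tuples.
    """
    return logsumexp([math.log(w) + student_t_logpdf(x, m, s, v)
                      for (w, m, s, v) in mixture])


def forward_loglik(obs, init, trans, mixtures):
    """Log-likelihood of obs under an HMM (forward algorithm, log domain).

    All probabilities in init and trans must be strictly positive.
    """
    n = len(init)
    log_alpha = [math.log(init[i]) + emission_loglik(obs[0], mixtures[i])
                 for i in range(n)]
    for x in obs[1:]:
        log_alpha = [
            logsumexp([log_alpha[i] + math.log(trans[i][j]) for i in range(n)])
            + emission_loglik(x, mixtures[j])
            for j in range(n)
        ]
    return logsumexp(log_alpha)


def recognize(obs, speaker_models):
    """Return the speaker whose model assigns obs the highest likelihood."""
    return max(speaker_models,
               key=lambda name: forward_loglik(obs, *speaker_models[name]))


# Hypothetical toy models: (initial probs, transition matrix, per-state mixtures).
INIT = [0.5, 0.5]
TRANS = [[0.7, 0.3], [0.3, 0.7]]
speaker_models = {
    "speaker_a": (INIT, TRANS, [[(0.6, 0.0, 0.5, 3.0), (0.4, 0.3, 0.8, 3.0)],
                                [(1.0, 1.0, 0.5, 3.0)]]),
    "speaker_b": (INIT, TRANS, [[(1.0, 5.0, 0.5, 3.0)],
                                [(1.0, 6.0, 0.5, 3.0)]]),
}
features = [0.1, 0.9, 0.2, 1.1]  # toy feature sequence of the test utterance
best = recognize(features, speaker_models)
```

Working in the log domain avoids numerical underflow on long feature sequences. Compared with a Gaussian, the heavier tails of the Student's t density reduce the penalty an isolated outlier frame imposes on the total score, which is consistent with the robustness claim in the abstract.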

Description

Technical field

[0001] The invention relates to the fields of signal processing and pattern recognition, and in particular to a text-related speaker recognition method based on an infinite-state hidden Markov model.

Background

[0002] Automatic speaker recognition, especially text-related speaker recognition, plays an increasingly important role in access control, credit card transactions, and court evidence. Its goal is to correctly identify the speech to be recognized as belonging to one of the many reference speakers in the speech library.

[0003] Among text-related speaker recognition methods, the traditional hidden Markov model (GHMM) method has attracted increasing attention. Because of its high recognition rate, simple training, and small training data requirements, it has become the current mainstream method for text-related speaker recognition. Since GHMM has a good ability to represent the distribution of data, as long as there are enough st...


Application Information


IPC(8): G10L17/00, G10L17/16

Inventor: 魏昕

Owner: NANJING UNIV OF POSTS & TELECOMM
