Speaker recognition method based on semantic cell mixing model

Speaker recognition and mixture model technology, applied in speech analysis, instruments, etc.; it addresses problems such as limited accuracy and weak targeting.

Inactive
Publication Date: 2015-04-22
ZHEJIANG UNIV
4 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

The drawback of this invention is that the dimensionality of the feature vectors is reduced from a purely statistical point of view, using principal component analysis, so the reduction is not well targeted.
In addition, this invention uses the semantic cell mixture model as the recognition model of the classifier, which limits the accuracy of the method when it is used for speaker recognition.

Examples

Embodiment Construction

[0094] The present invention is further described below with reference to Figures 1 and 2. The implementation method of the present invention comprises five steps.

[0095] Step (1), constructing the speech library: each input speech signal must carry an identifier of the speaker, such as a name.

[0096] The speech database constructed in this embodiment includes 138 speakers (106 male and 32 female) with 10 utterances each, for a total of 1380 speech samples.

[0097] The preprocessing in step (2) includes pre-emphasis, framing and windowing. For the specific process, please refer to the patent application with publication number CN104200814A.

[0098] (2-1) The power spectrum of the speech signal decreases with the increase of frequency, and most of its energy is concentrated in the low frequency range. As a result, the signal-to-noise ratio at the high-frequency end of the speech signal may drop to an unacceptable level. However, because the energy of the higher frequency component...
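
The three preprocessing operations named in step (2) can be sketched as follows. This is a minimal illustration, not the procedure of CN104200814A, which the text defers to for specifics. The pre-emphasis coefficient (0.97), the frame length and hop (400 and 160 samples, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate), and the choice of a Hamming window are all assumptions introduced here; the function names are hypothetical.

```python
import math

def pre_emphasize(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].
    alpha is an assumed value; the first sample is passed through unchanged."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames.
    Trailing samples that do not fill a whole frame are dropped."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]

def hamming(frame_len):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
            for n in range(frame_len)]

def preprocess(signal, frame_len=400, hop=160, alpha=0.97):
    """Pre-emphasis, then framing, then windowing of every frame."""
    emphasized = pre_emphasize(signal, alpha)
    window = hamming(frame_len)
    return [[s * w for s, w in zip(frame, window)]
            for frame in split_frames(emphasized, frame_len, hop)]
```

With these assumed defaults, a 1000-sample signal yields 4 windowed frames of 400 samples each. Pre-emphasis boosts the high-frequency end, which is the part of the spectrum the paragraph above describes as carrying the least energy.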

Abstract

The invention discloses a speaker recognition method based on a semantic cell mixing model. The method comprises the following steps: (1) establishing a voice library comprising multiple voice signals of multiple speakers; (2) preprocessing each voice signal in the voice library and extracting voice features to obtain the feature vectors of each speaker; (3) reducing the dimensionality of the feature vectors with a feature selection method based on semantic cells to obtain reduced feature vectors, and training the semantic cell mixing model; (4) constructing an SVM classifier for each speaker using a kernel function based on the semantic cell mixing model, and training the recognition model of the SVM classifier; and (5) recognizing the unknown speaker using the recognition model. The method addresses the problem that the kernel function of a conventional SVM model is not optimized for a specific speaker. In addition, its selection of the voice features used to train the classifier is more targeted than that of conventional methods, and the storage space needed for the model can be reduced.
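
Step (4) builds one SVM classifier per speaker. A common way to prepare training data for per-speaker classifiers is one-vs-rest labeling, sketched below with the library size reported in the embodiment (138 speakers, 10 utterances each). The source does not say how negative examples are chosen, so using every other speaker's utterances as negatives is an assumption, and the function names are hypothetical. The sketch covers only the labeling, not the semantic cell kernel or the SVM training itself.

```python
def build_library(num_speakers=138, utterances_per_speaker=10):
    """Return (speaker_id, utterance_index) pairs standing in for labeled recordings."""
    return [(spk, utt)
            for spk in range(num_speakers)
            for utt in range(utterances_per_speaker)]

def one_vs_rest_labels(library, target_speaker):
    """+1 for utterances of the target speaker, -1 for all others (assumed scheme)."""
    return [1 if spk == target_speaker else -1 for spk, _ in library]

library = build_library()
labels = one_vs_rest_labels(library, target_speaker=0)
print(len(library), labels.count(1), labels.count(-1))  # prints: 1380 10 1370
```

Under this assumed scheme each speaker's classifier sees 10 positive and 1370 negative examples, an imbalance worth keeping in mind when a kernel is tuned to a specific speaker.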

Description

Technical field

[0001] The invention relates to the fields of signal processing and pattern recognition, and in particular to a speaker recognition method based on a semantic cell mixing model.

Background

[0002] Speaker recognition refers to the process of automatically determining whether a speaker belongs to the registered speaker set, and identifying the specific speaker, by performing feature extraction and other analysis on the speech signal produced by the unknown speaker. Because the shape and size of the vocal tract, larynx, and other sound-producing organs vary between individuals, the speech characteristics of any two individuals differ (see Kinnunen T, Li H. An overview of text-independent speaker recognition: from features to supervectors. Speech Communication, 2010, 52(1):12-40). This technology can be used in processes that need to identify the operator, such as telephone banking, voice access control, and telephone shopping. [00...

Application Information

IPC(8): G10L17/02, G10L17/04
Inventor: 孙凌云, 何博伟, 尤伟涛, 李彦, 郑楷洪
Owner: ZHEJIANG UNIV