A speaker recognition method based on GMM Token matching similarity correction scores

A speaker recognition technology, applied in speech analysis, instruments, etc., that addresses the incomparability of test scores and the low reliability of recognition results, improving system recognition performance and reducing the scores of some impostor speakers.

Active Publication Date: 2017-08-15
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

Because the scores of a test utterance come from different sources, they are not directly comparable, so the recognition results obtained by traditional methods have low reliability.

Embodiment Construction

[0034] In order to describe the present invention more specifically, the technical solutions of the present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0035] The experimental data in this embodiment come from the Chinese emotional speech database MASC@CCNT, which was recorded in a quiet environment with an Olympus DM-20 voice recorder. The database consists of 68 native Chinese speakers, including 45 male speakers and 23 female speakers. The identification method provided in this embodiment admits many choices; to facilitate description and provide specific test results, five emotional states are selected here, which are neutral, angry, happy, angry and sad. That is, each speaker has speech in a total of 5 emotional states. Each speaker reads 2 paragraphs (about 30s recording length) and reads 5 words an...

Abstract

The invention discloses a speaker recognition method based on GMM Token matching similarity correction scores. The method calculates, on a UBM, the GMM Token matching similarity between a test utterance and the training speech of each target speaker, and uses this similarity to apply a weighting correction to the likelihood scores of the test utterance on all target speaker models, making the corrected likelihood scores more comparable. Before the scores are output, their reliability is evaluated and unreliable scores are penalized, which reduces the scores of some impostor speakers and improves system recognition performance.
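The flow described in the abstract can be illustrated with a rough Python sketch. The abstract does not give formulas, so everything specific here is an assumption made for illustration: a frame's token is taken to be the index of its most likely UBM component, an utterance is summarized by the proportion of frames per token, similarity is measured with cosine similarity, and the `threshold` and `penalty` values and all function names are hypothetical rather than the patented procedure.

```python
import numpy as np

def token_proportions(frames, weights, means, variances):
    """Fraction of frames assigned to each UBM component.

    Assumption: a frame's token is the index of the diagonal-covariance
    UBM component with the highest posterior for that frame.
    """
    diff = frames[:, None, :] - means[None, :, :]               # (T, M, D)
    log_dens = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    tokens = np.argmax(np.log(weights) + log_dens, axis=1)      # (T,)
    return np.bincount(tokens, minlength=len(weights)) / len(frames)

def similarity(p, q):
    """Cosine similarity between two token-proportion vectors (assumed measure)."""
    denom = np.linalg.norm(p) * np.linalg.norm(q)
    return float(np.dot(p, q) / denom) if denom > 0 else 0.0

def correct_scores(scores, sims, threshold=0.5, penalty=0.5):
    """Weight scores by similarity; shrink the weight further when it is low.

    threshold and penalty are illustrative values, not taken from the source.
    """
    scores = np.asarray(scores, dtype=float)
    sims = np.asarray(sims, dtype=float)
    w = np.where(sims < threshold, penalty * sims, sims)
    # Written so that a weight below 1 lowers the score whatever its sign.
    return scores - np.abs(scores) * (1.0 - w)

# Toy 1-D UBM with two well-separated components.
ubm = (np.array([0.5, 0.5]), np.array([[0.0], [10.0]]), np.ones((2, 1)))
test = token_proportions(np.array([[0.1], [-0.2], [9.8], [0.3]]), *ubm)
train_a = token_proportions(np.array([[0.0], [0.4], [10.1], [-0.3]]), *ubm)
train_b = token_proportions(np.array([[9.9], [10.2], [10.0], [9.7]]), *ubm)
sims = [similarity(test, train_a), similarity(test, train_b)]
corrected = correct_scores([2.0, 2.0], sims)
```

In this toy run both target models start with the same raw score, but the model whose training speech has a token distribution unlike the test utterance ends up with the lower corrected score, which is the qualitative effect the abstract describes.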

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to a speaker recognition method based on GMM Token matching similarity correction scores.

Background

[0002] Speaker recognition technology uses signal processing and pattern recognition to determine a speaker's identity from the speaker's voice. It mainly comprises two steps: speaker model training and speech testing.

[0003] Currently, the main features used in speaker recognition include Mel-frequency cepstral coefficients (MFCC), linear predictive cepstral coefficients (LPCC), and perceptual linear prediction coefficients (PLP). Speaker recognition algorithms mainly include Vector Quantization (VQ), the Gaussian Mixture Model-Universal Background Model (GMM-UBM), Support Vector Machines (SVM), and so on. Among them, GMM-UBM is widely used in the field of speaker recognition.

[0004] In the test speech recognition stage ba...
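For readers unfamiliar with the GMM-UBM approach named in the background, a common way to score a test utterance is the average per-frame log-likelihood ratio between a target speaker's GMM and the UBM. The sketch below assumes diagonal covariances and represents each model as a `(weights, means, variances)` tuple; these names and this representation are illustrative choices, not details taken from the patent text.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    diff = frames[:, None, :] - means[None, :, :]                 # (T, M, D)
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=2))    # (T, M)
    # Log-sum-exp over components, stabilized by subtracting the row maximum.
    peak = log_comp.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.sum(np.exp(log_comp - peak), axis=1))
    return float(frame_ll.mean())

def llr_score(frames, speaker, ubm):
    """Log-likelihood ratio of the speaker model against the UBM."""
    return gmm_avg_loglik(frames, *speaker) - gmm_avg_loglik(frames, *ubm)

# Single-component 1-D models: the speaker is centred at 0, the UBM at 2.
speaker = (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]]))
ubm = (np.array([1.0]), np.array([[2.0]]), np.array([[1.0]]))
score = llr_score(np.array([[0.0]]), speaker, ubm)
```

A positive score means the frames are better explained by the speaker model than by the UBM. Scores obtained this way against different speaker models are the quantities whose comparability the invention aims to improve.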

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L17/04, G10L17/02
Inventor: 杨莹春, 吴朝晖, 邓立才
Owner: ZHEJIANG UNIV