
Voiceprint identification method based on global change space and deep learning hybrid modeling

A voiceprint identification technology combining global variation space modeling and deep learning, applied in speech analysis, instruments, etc. It addresses the performance degradation caused by mismatch between training and testing environments, speech noise, and multi-channel variation, achieving improved performance, compensating for the deficiencies of each individual method, and strong robustness.

Inactive Publication Date: 2016-05-11
中科极限元(杭州)智能科技股份有限公司


Problems solved by technology

However, in practical applications voiceprint recognition faces mismatch between training and testing environments, speech noise, multi-channel variation, and other factors, all of which degrade the performance of voiceprint recognition methods.



Examples


Embodiment 1

[0058] In step S100, the original speech is obtained and Mel-frequency cepstral coefficient (MFCC) features are extracted. Endpoint detection is performed using short-term energy and the short-term zero-crossing rate, and non-speech data is removed from the original speech to obtain speech-segment data. The MFCC features consist of 19 cepstral dimensions plus 1 energy dimension, together with their first-order and second-order dynamic parameters, for a total of 60 dimensions per frame.
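As an illustrative sketch (not the patent's code), the endpoint-detection step of [0058] can be approximated with short-term energy and zero-crossing-rate thresholds. The frame length, hop size, and both thresholds below are assumed values:

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def short_term_energy(frames):
    return np.sum(frames.astype(float) ** 2, axis=1)

def zero_crossing_rate(frames):
    signs = np.sign(frames)
    signs[signs == 0] = 1          # treat exact zeros as positive
    return np.mean(np.abs(np.diff(signs, axis=1)) / 2, axis=1)

def endpoint_detect(x, energy_ratio=0.1, zcr_thresh=0.25):
    """Keep frames whose energy exceeds a fraction of the maximum frame
    energy and whose zero-crossing rate stays below a threshold."""
    frames = frame_signal(x)
    e = short_term_energy(frames)
    z = zero_crossing_rate(frames)
    speech = (e > energy_ratio * e.max()) & (z < zcr_thresh)
    return frames[speech]

# Synthetic check: 0.5 s silence + 1 s of a 200 Hz tone + 0.5 s silence at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
sig = np.concatenate([np.zeros(sr // 2),
                      0.5 * np.sin(2 * np.pi * 200 * t),
                      np.zeros(sr // 2)])
kept = endpoint_detect(sig)
print(kept.shape[0], "speech frames kept out of", frame_signal(sig).shape[0])
```

In the full pipeline of [0058], the 60-dimensional MFCC-plus-delta feature vectors would then be computed only on the frames this step retains.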

[0059] Universal background models are trained separately for male and female voices, reflecting the different characteristics of the two. Since a universal background model is used to describe the common characteristics of all speaker data, its number of mixtures is relatively high, and 2048 mixtures ar...
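The gender-dependent UBM training of [0059] amounts to fitting one large diagonal-covariance GMM per gender with EM. Below is a minimal numpy sketch on toy 2-D data with 2 mixtures (a real UBM would use the 60-dimensional MFCC frames and on the order of 2048 mixtures); the deterministic initialization is an assumption for illustration:

```python
import numpy as np

def train_ubm(X, n_mix=8, n_iter=20):
    """Minimal EM for a diagonal-covariance GMM used as a toy UBM.
    X: (n_frames, dim) feature matrix. Returns weights, means, variances."""
    n, d = X.shape
    # Deterministic spread initialization (an assumption for illustration):
    # sort frames by coordinate sum and take evenly spaced ones as means.
    order = np.argsort(X.sum(axis=1))
    mu = X[order[np.linspace(0, n - 1, n_mix).astype(int)]].astype(float)
    var = np.tile(X.var(axis=0), (n_mix, 1)) + 1e-3
    w = np.full(n_mix, 1.0 / n_mix)
    for _ in range(n_iter):
        # E-step: responsibility of each mixture for each frame.
        logp = (np.log(w)
                - 0.5 * (np.log(2 * np.pi * var).sum(axis=1)
                         + (((X[:, None, :] - mu) ** 2) / var).sum(axis=2)))
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, diagonal variances.
        nk = r.sum(axis=0) + 1e-10
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ X ** 2) / nk[:, None] - mu ** 2 + 1e-3
        w = nk / n
    return w, mu, var

# Toy data: two well-separated 2-D "speaker populations".
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-3.0, 0.5, size=(500, 2)),
               rng.normal(3.0, 0.5, size=(500, 2))])
w, mu, var = train_ubm(X, n_mix=2)
print(np.round(np.sort(mu[:, 0]), 1))  # roughly one mean near -3, one near 3
```

Gender-dependent UBMs as in [0059] are then simply two such models, one fitted on male speech frames and one on female speech frames.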



Abstract

The invention discloses a voiceprint identification method based on hybrid modeling of global variation space and deep learning, comprising the steps of: obtaining speech-segment training data; applying a global variation space modeling method to train an identity vector extractor and obtain a TVM-IVECTOR; applying a deep neural network method to train and obtain an NN-IVECTOR; fusing the two vectors of the same audio file to form a new I-IVECTOR feature extractor; for the audio to be tested, fusing the TVM-IVECTOR and the NN-IVECTOR and then extracting a final I-IVECTOR; and, after channel compensation, scoring against the speaker models in a model base to obtain an identification result. The method is robust to interfering environmental factors such as environment mismatch, multi-channel variation, and noise, and improves voiceprint identification performance.
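The abstract's fuse-then-score pipeline could look like the following sketch. The patent does not disclose the fusion operator or the scoring function, so length-normalized concatenation and cosine scoring are assumptions here, and channel compensation is omitted:

```python
import numpy as np

def fuse_ivectors(tvm_ivec, nn_ivec):
    """Hypothetical fusion: length-normalize each extractor's output,
    then concatenate. The patent does not disclose the exact operator."""
    a = tvm_ivec / np.linalg.norm(tvm_ivec)
    b = nn_ivec / np.linalg.norm(nn_ivec)
    return np.concatenate([a, b])

def cosine_score(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def identify(test_ivec, model_base):
    """Score the test i-vector against every enrolled speaker model."""
    scores = {spk: cosine_score(test_ivec, m) for spk, m in model_base.items()}
    return max(scores, key=scores.get), scores

# Toy demo: random 200-dim vectors stand in for real extractor output.
rng = np.random.default_rng(0)
tvm = {s: rng.normal(size=200) for s in ("spk_A", "spk_B")}
nn = {s: rng.normal(size=200) for s in ("spk_A", "spk_B")}
model_base = {s: fuse_ivectors(tvm[s], nn[s]) for s in tvm}
# Test utterance: noisy versions of speaker A's two i-vectors.
test = fuse_ivectors(tvm["spk_A"] + 0.2 * rng.normal(size=200),
                     nn["spk_A"] + 0.2 * rng.normal(size=200))
best, scores = identify(test, model_base)
print(best)
```

In the patented method, a channel-compensation step (e.g. over the fused I-IVECTOR) would precede scoring; the sketch skips it to keep the fusion-and-scoring flow visible.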

Description

technical field

[0001] The invention relates to a voiceprint recognition method, in particular to a voiceprint recognition method based on hybrid modeling of global variation space and deep learning.

Background technique

[0002] Language is one of the main sources from which human beings obtain information, and speech is the most convenient, effective, and natural tool for exchanging information with the outside world. Beyond the actual pronunciation content, speech also carries information about who the speaker is. Voiceprint recognition is a biometric recognition method that identifies the speaker from the speech signal by comparing it with pre-extracted speaker voice features to verify or identify the speaker's identity.

[0003] Voiceprint recognition has a wide range of uses. In the field of justice and public security, as a means of technical reconnaissance, it can...


Application Information

IPC(8): G10L17/10; G10L17/04; G10L17/02
CPC: G10L17/10; G10L17/02; G10L17/04
Inventors: 徐明星, 车浩
Owner: 中科极限元(杭州)智能科技股份有限公司