
Robust speaker distinguishing method based on multifactor frequency displacement invariant feature

A multi-factor, frequency-displacement-invariant feature technology, applied in voice analysis, instruments, etc., that addresses poor robustness and degraded speaker discrimination performance, improving accuracy and reducing interference.

Status: Inactive
Publication Date: 2014-04-16
SHANDONG UNIV
4 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0016] At present, in text-independent speaker identification systems, the signal's feature coefficients achieve good recognition accuracy in a relatively quiet environment, but when the surrounding environment is complex, with many sources of interference and noise, speaker identification performance degrades. The main reasons are that single-factor features (such as the spectrum) have poor robustness and that the trained speaker model does not match the test data.



Embodiment Construction

[0058] The present invention will be further described below in conjunction with drawings and embodiments.

[0059] As shown in Figure 2, the frequency displacement invariant feature extraction method of the present invention, which considers multiple factors in speech, specifically comprises the following steps:

[0060] (1) Preprocess the speech data x(t) of 51 children in the TIDIGITS database. The sampling rate is 8 kHz. A Hamming window is applied, with a window length of 23 ms and a window shift of 10 ms. Compute the energy spectrum S(f, t) of the signal with the Fourier transform;
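Step (1) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `energy_spectrum`, the rounding of the window and shift to whole samples, and the synthetic 1 kHz test tone standing in for x(t) are all assumptions. Only the 8 kHz rate, the Hamming window, the 23 ms length and the 10 ms shift come from the text.

```python
import numpy as np

def energy_spectrum(x, fs=8000, win_ms=23.0, hop_ms=10.0):
    """Return S[f, t] = |STFT|^2 computed with a Hamming window."""
    win_len = int(round(fs * win_ms / 1000.0))   # 184 samples at 8 kHz
    hop = int(round(fs * hop_ms / 1000.0))       # 80 samples at 8 kHz
    if len(x) < win_len:
        raise ValueError("signal is shorter than one analysis window")
    n_frames = 1 + (len(x) - win_len) // hop
    window = np.hamming(win_len)
    # Build an index matrix so that each row selects one overlapping frame.
    idx = np.arange(win_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * window
    spectrum = np.fft.rfft(frames, axis=1)       # one-sided FFT per frame
    return (np.abs(spectrum) ** 2).T             # shape: (freq bins, frames)

# Hypothetical stand-in for x(t): one second of a 1 kHz tone.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000.0 * t)
S = energy_spectrum(x, fs)
```

With these settings each frame holds 184 samples, so the one-sided spectrum has 93 frequency bins per frame.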

[0061] (2) Filter the energy spectrum S(f, t) with two-dimensional complex wavelet transforms at 4 different scales and 4 different phases to obtain a multi-factor tensor representation of the speech signal. This representation is a 4th-order tensor whose orders correspond to frequency, time, scale and phase. A bank of 36 Mel-scale filters is then used for frequency-order filtering of t...
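The tensor construction in step (2) might look like the sketch below. The text does not specify the wavelet, so a Gabor-style kernel (Gaussian envelope times a cosine carrier along the frequency axis, i.e. the real part of a complex Gabor function with a phase offset) is swapped in as an assumption. The scale values, phase values, kernel widths and the names `gabor_kernel`, `conv2_same` and `multifactor_tensor` are likewise hypothetical. The Mel filtering stage is not shown.

```python
import numpy as np

def gabor_kernel(scale, phase):
    """Real part of a complex Gabor function (assumed wavelet shape)."""
    sigma = 2.0 * scale                  # assumed envelope width
    half = int(3 * sigma)
    u = np.arange(-half, half + 1)
    ff, tt = np.meshgrid(u, u, indexing="ij")
    envelope = np.exp(-(ff ** 2 + tt ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * ff / (4.0 * scale) + phase)
    return envelope * carrier

def conv2_same(image, kernel):
    """FFT-based 2-D linear convolution, cropped to the image size."""
    h, w = image.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)     # zero-pad to avoid wrap-around
    full = np.fft.irfft2(np.fft.rfft2(image, shape) *
                         np.fft.rfft2(kernel, shape), shape)
    r0, c0 = (kh - 1) // 2, (kw - 1) // 2
    return full[r0:r0 + h, c0:c0 + w]

def multifactor_tensor(S, scales=(1, 2, 3, 4),
                       phases=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Stack filter responses into a (freq, time, scale, phase) tensor."""
    out = np.empty(S.shape + (len(scales), len(phases)))
    for i, s in enumerate(scales):
        for j, p in enumerate(phases):
            out[:, :, i, j] = conv2_same(S, gabor_kernel(s, p))
    return out

# Hypothetical energy spectrum: 93 frequency bins by 20 frames.
rng = np.random.default_rng(0)
S_demo = rng.random((93, 20))
tensor = multifactor_tensor(S_demo)
```

The responses here are signed. A non-negative decomposition downstream would need non-negative input, and how the source ensures that is not stated in the visible text.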


Abstract

The invention discloses a frequency displacement invariant feature extraction method that considers multiple factors in speech, used for text-independent speaker discrimination in complex environments. Taking into account the time, frequency, scale and phase information of speech, the method builds a multi-factor representation of the speech signal's energy spectrum through two-dimensional complex wavelet transforms of different scales and phases. To exploit the displacement invariance of frequency, it computes a displacement invariant feature projection matrix on the frequency order through convolutive non-negative tensor decomposition, yielding a sparse multi-factor feature. The feature is decorrelated with the discrete cosine transform, and its first-order and second-order difference coefficients are computed, finally giving a speaker feature with good robustness. Because the multi-factor frequency displacement invariant feature obtained through convolutive non-negative tensor decomposition is robust, the method discriminates text-independent speakers in noisy environments with good accuracy.
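The last two stages named in the abstract, DCT decorrelation and first- and second-order difference coefficients, can be sketched as below. The abstract does not give the difference formula, so the regression-style delta with a two-frame half-width commonly used for speech features is an assumption, as are the names `dct_ii` and `delta` and the random 12-by-50 feature matrix standing in for the sparse multi-factor feature.

```python
import numpy as np

def dct_ii(features):
    """Orthonormal DCT-II along axis 0 (the feature axis)."""
    n = features.shape[0]
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    basis[0, :] /= np.sqrt(2.0)          # scale the DC row for orthonormality
    return basis @ features

def delta(features, width=2):
    """Regression-style difference along the time axis (axis 1)."""
    n_frames = features.shape[1]
    # Repeat the edge frames so every frame has `width` neighbours per side.
    padded = np.pad(features, ((0, 0), (width, width)), mode="edge")
    num = sum(n * (padded[:, width + n: width + n + n_frames]
                   - padded[:, width - n: width - n + n_frames])
              for n in range(1, width + 1))
    return num / (2.0 * sum(n * n for n in range(1, width + 1)))

# Hypothetical feature matrix: 12 coefficients by 50 frames.
rng = np.random.default_rng(0)
feats = rng.random((12, 50))
static = dct_ii(feats)                   # decorrelated coefficients
d1 = delta(static)                       # first-order differences
d2 = delta(d1)                           # second-order differences
full = np.vstack([static, d1, d2])       # static + delta + delta-delta
```

Stacking the three blocks triples the feature dimension per frame, here from 12 to 36.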

Description

Technical field

[0001] The invention relates to a feature extraction method for improving speaker identity discrimination performance, and belongs to the technical field of speech signal processing.

Background technique

[0002] With the continuous development of computer and artificial intelligence technology, various intelligent machines participate in human production and social activities, so improving the relationship between people and these machines, and making the machines easier for people to operate, is becoming more and more important. Language is the best way for people to communicate with machines.

[0003] Speech signal processing is an interdisciplinary subject combining linguistics and digital signal processing technology. It is one of the important means of intelligent computer interfaces and human-computer interaction. Speaker identification is an efficient means of human-computer interaction. Its characteristics are that signal collection is conve...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L17/02, G10L17/20
Inventor: 吴强, 刘琚, 孙建德
Owner: SHANDONG UNIV