Robust speaker distinguishing method based on multifactor frequency displacement invariant feature

The method combines frequency displacement invariance with a multi-factor representation of speech and is applied in fields such as voice analysis and instruments. It addresses the poor robustness of existing features and the resulting degradation in speaker discrimination performance, improving accuracy and reducing the effect of interference.

Status: Inactive. Publication date: 2014-04-16

SHANDONG UNIV

Cites: 4. Cited by: 2.


Problems solved by technology

[0016] At present, the signal feature coefficients used in text-independent speaker identification systems achieve good recognition accuracy in relatively quiet environments, but when the surrounding environment is complex, with many sources of interference and noise, speaker identification performance degrades. The main reasons are that single-factor features (such as the spectrum) have poor robustness, and that the trained speaker model does not match the test data.


Embodiment Construction

[0058] The present invention will be further described below in conjunction with drawings and embodiments.

[0059] As shown in Figure 2, the multi-factor frequency-displacement-invariant feature extraction method for speech of the present invention comprises the following steps:

[0060] (1) Preprocess the speech data x(t) of 51 children from the TIDIGITS database. The sampling rate is 8 kHz. Apply a Hamming window with a window length of 23 ms and a window shift of 10 ms, then compute the energy spectrum S(f, t) of the signal by Fourier transform;

[0061] (2) Filter the energy spectrum S(f, t) with a two-dimensional complex wavelet transform at 4 different scales and 4 different phases to obtain the multi-factor tensor representation of the speech signal. This representation is a 4th-order tensor [symbol and size missing in the source text] whose orders correspond to frequency, time, scale and phase. A bank of 36 Mel-scale filters is then used for frequency-order filtering of t...
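Steps (1) and (2) can be illustrated with a minimal NumPy sketch. It is not the patent's implementation. The 8 kHz rate, 23 ms Hamming window, 10 ms shift, and the 4 scales by 4 phases come from the text; everything else is an assumption made for illustration: the function names, the FFT size, the use of a Gabor kernel as the two-dimensional complex wavelet, the scale values, and the reading of "phase" as the orientation of the kernel in the frequency-time plane. The Mel filtering step is omitted because the source text is truncated there.

```python
import numpy as np

def energy_spectrum(x, fs=8000, win_ms=23.0, hop_ms=10.0):
    """Return S[f, t], the squared magnitude of a Hamming-windowed short-time FFT."""
    win_len = int(round(fs * win_ms / 1000.0))   # 184 samples at 8 kHz
    hop = int(round(fs * hop_ms / 1000.0))       # 80 samples at 8 kHz
    n_fft = 1 << (win_len - 1).bit_length()      # next power of two (assumption)
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, n=n_fft, axis=1)
    return (np.abs(spec) ** 2).T                 # shape (n_fft // 2 + 1, n_frames)

def gabor_kernel(scale, phase):
    """Hypothetical 2D complex Gabor kernel; `phase` is treated as an orientation angle."""
    sigma = 1.5 * scale                          # envelope width (assumption)
    half = (int(6 * sigma) | 1) // 2             # odd kernel size covering about 3 sigma
    f, t = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    rotated = f * np.cos(phase) + t * np.sin(phase)
    envelope = np.exp(-(f ** 2 + t ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(1j * 2.0 * np.pi * rotated / (4.0 * scale))

def filter2d_same(image, kernel):
    """Linear 2D convolution via zero-padded FFT, cropped to the input size."""
    shape = (image.shape[0] + kernel.shape[0] - 1,
             image.shape[1] + kernel.shape[1] - 1)
    full = np.fft.ifft2(np.fft.fft2(image, s=shape) * np.fft.fft2(kernel, s=shape))
    r0, c0 = kernel.shape[0] // 2, kernel.shape[1] // 2
    return full[r0:r0 + image.shape[0], c0:c0 + image.shape[1]]

def multifactor_tensor(S, scales=(1, 2, 3, 4),
                       phases=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Stack filter magnitudes into a 4th-order tensor: frequency x time x scale x phase."""
    out = np.empty(S.shape + (len(scales), len(phases)))
    for i, s in enumerate(scales):
        for j, p in enumerate(phases):
            out[:, :, i, j] = np.abs(filter2d_same(S, gabor_kernel(s, p)))
    return out

# One second of synthetic noise stands in for a speech recording.
rng = np.random.default_rng(0)
x = rng.standard_normal(8000)
S = energy_spectrum(x)
T = multifactor_tensor(S)
print(S.shape, T.shape)   # (129, 98) (129, 98, 4, 4)
```

Taking the magnitude of each complex filter response keeps every tensor entry non-negative, which a non-negative tensor decomposition requires of its input.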


Abstract

The invention discloses a multi-factor frequency-displacement-invariant feature extraction method for speech, used for text-independent speaker discrimination in complex environments. Taking into account the time, frequency, scale and phase information of speech, the method builds a multi-factor representation of the speech signal's energy spectrum using two-dimensional complex wavelet transforms at different scales and phases. To exploit displacement invariance along frequency, it computes a displacement-invariant feature projection matrix on the frequency order by convolutive non-negative tensor decomposition, yielding a multi-factor sparse feature. The feature is then decorrelated by the discrete cosine transform, and its first-order and second-order difference coefficients are calculated, finally giving a robust speaker feature. Because the multi-factor frequency-displacement-invariant feature obtained by convolutive non-negative tensor decomposition is robust, the method achieves good discrimination accuracy for text-independent speakers in noisy environments.
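The last two operations in the abstract, decorrelation by the discrete cosine transform and the first-order and second-order difference coefficients, can be sketched as follows. This is a hedged illustration rather than the patent's procedure: the function names, the 36-row placeholder input, the choice of 13 retained coefficients, and the regression-style difference formula with a two-frame context are all assumptions, since the text does not specify them.

```python
import numpy as np

def dct2_matrix(n_in, n_out):
    """Orthonormal DCT-II basis with n_out rows, each a cosine basis vector of length n_in."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    basis = np.cos(np.pi * (n + 0.5) * k / n_in) * np.sqrt(2.0 / n_in)
    basis[0] /= np.sqrt(2.0)
    return basis

def delta(feat, width=2):
    """Regression-based difference along the time axis (axis 1), with edge padding."""
    padded = np.pad(feat, ((0, 0), (width, width)), mode="edge")
    n_frames = feat.shape[1]
    num = sum(
        n * (padded[:, width + n:width + n + n_frames]
             - padded[:, width - n:width - n + n_frames])
        for n in range(1, width + 1)
    )
    return num / (2.0 * sum(n * n for n in range(1, width + 1)))

def add_dynamics(sparse_feat, n_ceps=13):
    """Decorrelate with the DCT, then append first- and second-order differences."""
    static = dct2_matrix(sparse_feat.shape[0], n_ceps) @ sparse_feat
    d1 = delta(static)
    d2 = delta(d1)
    return np.vstack([static, d1, d2])

# Placeholder input: 36 feature rows by 98 frames of random non-negative values.
rng = np.random.default_rng(0)
sparse_feat = rng.random((36, 98))
features = add_dynamics(sparse_feat)
print(features.shape)   # (39, 98)
```

The difference coefficients describe how the static feature changes over neighbouring frames, so the stacked output carries both the static values and their short-term dynamics.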

Description

Technical field

[0001] The invention relates to a feature extraction method for improving speaker identity discrimination performance, and belongs to the technical field of speech signal processing.

Background

[0002] With the continuous development of computer and artificial intelligence technology, various intelligent machines take part in human production and social activities. Improving the relationship between people and these machines, and making the machines easier to operate, is therefore becoming more and more important, and language is the best way for people to communicate with machines.

[0003] Speech signal processing is an interdisciplinary subject combining linguistics and digital signal processing technology. It is one of the important means of intelligent computer interfaces and human-computer interaction. Speaker identification is an efficient means of human-computer interaction. Its characteristics are that signal collection is conve...


Application Information


Patent type and authority: Patents (China)

IPC(8): G10L17/02, G10L17/20

Inventors: 吴强, 刘琚, 孙建德

Owner: SHANDONG UNIV
