Playback speech detection method

A speech detection technology, applied in speech analysis, instruments, etc., that addresses the problems of high algorithm complexity, background noise pollution, and small silent-segment amplitude.

Active Publication Date: 2018-12-11

NINGBO UNIV

AI Technical Summary

Problems solved by technology

In early research, Zhang Lipeng et al. proposed a detection algorithm that models the silent segments of speech. Although the algorithm achieves some success in detecting playback speech, the silent segments it relies on have small amplitude and are easily polluted by background noise, so the algorithm has clear limitations.

Wang Zhifeng et al. proposed a recording playback detection algorithm based on channel mode noise by exploring the mechanism of playback voice generation and analyzing the noise differences introduced by different devices. Although this algorithm detects playback speech well, its experiments used only one recording device and one playback device, so the robustness of the algorithm remains to be studied.

In addition, among foreign researchers, Shang and Stevenson exploited the randomness of voice generation and proposed an algorithm that detects the similarity between the test voice and the legitimate voice on the peak map. This algorithm can only be applied to text-dependent voiceprint authentication systems.

Building on this peak-map similarity algorithm, Jakub Galka et al. added the positional relationship of each frequency point to the peak-map features, which further improved the anti-replay performance of the voiceprint authentication system. However, the algorithm is still limited to text-dependent voiceprint authentication systems.

In 2016, Todisco M, Delgado H et al. proposed the CQCC (Constant Q Cepstral Coefficients) feature, which is based on the constant Q transform. Although it has a certain effect on the detection of playback speech, the detection accuracy still needs to be improved.

Ji Z et al. combined a variety of features with an ensemble of classifiers to reduce the equal error rate (EER) of playback speech detection to about 20%, but the complexity of the algorithm is extremely high.

Lantian Li et al. proposed the I-MFCC method for playback speech detection by using the F-ratio method to analyze the difference between real speech and playback speech. Experiments show that although this feature has a certain detection effect, its robustness is poor.

Embodiment Construction

[0036] The present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments.

[0037] The overall implementation block diagram of the playback voice detection method proposed by the present invention is shown in Figure 1. The method includes the following steps:

[0038] Step 1: Select N_real different real voices, each with a duration greater than or equal to 1 second; then obtain a number of playback voices corresponding to each real voice. Use each real voice as a positive sample, and select at least one of the playback voices corresponding to each real voice as a negative sample. The voice database is composed of all positive samples and all negative samples: the number of positive samples is N_real and the number of negative samples is N_back, so the database contains N_real + N_back speech samples in total, where N_real ≥ 500, e.g. taking N_real ...
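A minimal sketch of how the database in Step 1 could be assembled, assuming the recordings are already indexed by identifier. The function name `build_speech_database`, its parameters, and the choice to take the first playback of each real voice are hypothetical illustrations, not part of the patent text; only the 1-second minimum duration, the "at least one playback per real voice" rule, and the N_real ≥ 500 condition come from the paragraph above.

```python
def build_speech_database(real_durations, playbacks, min_real=500, picks_per_real=1):
    """Assemble positive (real) and negative (playback) sample lists.

    real_durations: dict mapping a real-voice id to its duration in seconds.
    playbacks:      dict mapping a real-voice id to a list of playback ids.
    picks_per_real: how many playbacks to keep per real voice (assumption;
                    the source only requires "at least one").
    """
    if picks_per_real < 1:
        raise ValueError("at least one playback must be kept per real voice")

    # Keep only real voices lasting at least 1 second.
    positives = [rid for rid, sec in real_durations.items() if sec >= 1.0]
    if len(positives) < min_real:
        raise ValueError(f"need at least {min_real} real voices, got {len(positives)}")

    negatives = []
    for rid in positives:
        candidates = playbacks.get(rid, [])
        if len(candidates) < picks_per_real:
            raise ValueError(f"not enough playback voices for {rid}")
        # Hypothetical selection rule: take the first few playbacks.
        negatives.extend(candidates[:picks_per_real])

    # len(positives) plays the role of N_real, len(negatives) of N_back.
    return positives, negatives
```

With `picks_per_real=1` the database has as many negative as positive samples; a larger value gives N_back > N_real.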

Abstract

The invention discloses a playback speech detection method. At the training stage, a first variation coefficient vector, a normalized first cepstrum feature matrix, a second variation coefficient vector and a normalized second cepstrum feature matrix are first acquired for each speech sample in a speech database, giving four kinds of features. The four kinds of features of all positive samples are then input to a GMM model for training to obtain four positive sample feature models; four negative sample feature models are obtained in the same way. At the testing stage, the four kinds of features of the to-be-detected speech are acquired in the same way and input to the corresponding positive sample feature models and negative sample feature models, yielding four likelihood ratio scores. A final score is obtained from the four likelihood ratio scores, and comparing the final score with a decision threshold determines whether the speech is playback speech. The method is not limited to text-dependent voiceprint authentication systems, and has the advantages of a low equal error rate, strong robustness and relatively low computational complexity.
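The testing stage described above can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the GMMs are taken to be diagonal-covariance models whose parameters are already trained, the four likelihood ratio scores are fused by a plain average, and speech is flagged as playback when the final score falls below the threshold. The abstract does not specify any of these details, and all function names are hypothetical.

```python
import math

def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

def llr_score(frames, pos_gmm, neg_gmm):
    """Average log-likelihood ratio of the positive (real) model against
    the negative (playback) model over all feature vectors."""
    total = 0.0
    for frame in frames:
        total += diag_gmm_loglik(frame, *pos_gmm) - diag_gmm_loglik(frame, *neg_gmm)
    return total / len(frames)

def detect_playback(features, pos_models, neg_models, threshold=0.0):
    """features, pos_models and neg_models each hold four entries, one per
    feature kind. Returns (is_playback, final_score)."""
    scores = [llr_score(f, p, n) for f, p, n in zip(features, pos_models, neg_models)]
    final = sum(scores) / len(scores)  # assumption: simple average fusion
    return final < threshold, final    # assumption: low score means playback
```

Each model here is a `(weights, means, variances)` tuple. A variation coefficient vector is passed as a list containing a single vector, while a cepstrum feature matrix is passed as a list of per-frame vectors.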

Description

Technical field

[0001] The invention relates to voice detection technology, and in particular to a playback voice detection method.

Background

[0002] In the field of biometric technology, voiceprint recognition systems are widely used in daily life, finance and the justice system because of their high security and the ease of acquiring voice samples. With the continuous development of voiceprint recognition technology, attacks on voiceprint recognition systems using counterfeit voices are becoming more and more severe. In the past few years, research on counterfeit voice detection has focused mainly on synthetic voice and converted voice, and has to some extent neglected playback voice attacks on voiceprint recognition systems. In fact, first, since the playback voice is recorded directly from the real voice, it is more threatening than synthetic voice and converted voice; second, the playback voice is more convenient to obtain than other counte...

Application Information

IPC(8): G10L17/00, G10L17/02, G10L17/04, G10L25/24

CPC: G10L17/00, G10L17/02, G10L17/04, G10L25/24

Inventors: 王让定 (Wang Rangding), 林朗 (Lin Lang), 严迪群 (Yan Diqun), 胡君 (Hu Jun)

Owner: NINGBO UNIV
