Speech speed multiplication attack detection method based on rhythm features and random forest classifier

An attack detection technique based on a random forest classifier, applied in the fields of speech recognition and security. It addresses the difficulty of detecting speed-multiplied ("double-speed") speech attacks and offers high detection accuracy, low cost, and broad application prospects.

Pending Publication Date: 2022-05-27
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

[0005] Many existing studies defend against voice attacks by detecting the noise and distortion introduced when adversarial voice samples are generated. However, such methods struggle to detect speed-multiplication ("double-speed") voice attacks, which add no noise.



Examples


Embodiment Construction

[0033] The technical solutions provided by each embodiment of the present invention will be described in detail below with reference to the accompanying drawings. The flow charts shown in the figures are merely illustrative and do not necessarily include all steps. For example, some steps can be decomposed, and some steps can be combined or partially combined, so the actual execution order may be changed according to the actual situation.

[0034] A speech speed-multiplication ("double-speed") attack is an adversarial attack on speech that simply speeds up or slows down the original audio rather than adding perturbations. To distinguish normal audio from speed-multiplied adversarial audio that carries no added noise, the present invention exploits the "unnatural" distortion introduced by the speed change and proposes a detection method based on prosodic features and a random forest classifier. The microphone of the device receives the voice signal to extrac...
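The classification idea in [0034], separating normal audio from speed-modified audio on the basis of a small feature vector, can be illustrated with a toy random forest. The following is a minimal pure-Python sketch under stated assumptions, not the patent's implementation: the function names (`make_toy_data`, `train_forest`, `predict_forest`), the tree depth, the number of trees, and the three-value feature vectors are hypothetical, and the synthetic numbers are placeholders chosen only to make the demo separable, not measurements.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, rng, depth=0, max_depth=4):
    """Grow one tree; each split looks at a single randomly chosen feature."""
    if depth >= max_depth or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    feat = rng.randrange(len(rows[0]))
    best = None
    for thr in sorted({r[feat] for r in rows}):
        left = [i for i, r in enumerate(rows) if r[feat] <= thr]
        right = [i for i, r in enumerate(rows) if r[feat] > thr]
        if not left or not right:
            continue
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right]))
        if best is None or score < best[0]:
            best = (score, thr, left, right)
    if best is None:  # no usable split on this feature
        return Counter(labels).most_common(1)[0][0]
    _, thr, left, right = best
    return (feat, thr,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       rng, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       rng, depth + 1, max_depth))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(node, tuple):
        feat, thr, low, high = node
        node = low if row[feat] <= thr else high
    return node


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


def make_toy_data(seed=1, n_per_class=20):
    """SYNTHETIC placeholder features [f1, f2, f3]; not real measurements."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_per_class):
        rows.append([rng.uniform(0.005, 0.015), rng.uniform(0.02, 0.04),
                     rng.uniform(18.0, 25.0)])
        labels.append("normal")
        rows.append([rng.uniform(0.03, 0.05), rng.uniform(0.07, 0.10),
                     rng.uniform(8.0, 13.0)])
        labels.append("attack")
    return rows, labels
```

A typical call would be `forest = train_forest(*make_toy_data())` followed by `predict_forest(forest, feature_vector)`. Choosing one random feature per split is the simplest form of the feature subsampling that random forests use; a production system would more likely rely on an established library.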


Abstract

The invention discloses a method for detecting speech speed-multiplication attacks based on rhythm (prosodic) features and a random forest classifier, and belongs to the technical field of speech recognition and security. The method comprises: acquiring an audio data set that contains normal audio and speed-multiplied adversarial audio; extracting the jitter, tremolo, and harmonic-to-noise ratio features of every audio sample in the data set to form feature vectors; training a random forest classifier with the feature vectors of the normal audio and the speed-multiplied adversarial audio; and using the trained classifier to detect speech speed-multiplication attacks. Speed-multiplication spoofing attacks can be detected efficiently using the existing microphone and voice hardware of the speech recognition system. The method is low in cost and high in detection accuracy, can be used to protect speech recognition systems on smart devices such as mobile phones, and has broad demand and application prospects.
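The feature-vector step in the abstract can be sketched as follows. This is a hedged illustration, not the patent's formulas: it assumes that "jitter" means cycle-to-cycle variation of the pitch period, that "tremolo" corresponds to cycle-to-cycle amplitude variation (called shimmer in the voice-analysis literature), and that the harmonic-to-noise ratio is estimated from the normalized autocorrelation at the pitch lag. The function names are hypothetical, and the pitch periods, peak amplitudes, and pitch lag are assumed to come from a pitch tracker that is not shown.

```python
import math


def local_jitter(periods):
    """Mean absolute difference of consecutive pitch periods / mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))


def local_shimmer(amplitudes):
    """Same measure applied to per-cycle peak amplitudes (assumed 'tremolo')."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))


def hnr_db(signal, lag):
    """HNR in dB from the normalized autocorrelation r at the pitch lag:
    10 * log10(r / (1 - r)). r is clipped to keep the logarithm finite."""
    n = len(signal) - lag
    num = sum(signal[i] * signal[i + lag] for i in range(n))
    e0 = sum(signal[i] ** 2 for i in range(n))
    e1 = sum(signal[i + lag] ** 2 for i in range(n))
    r = num / math.sqrt(e0 * e1)
    r = min(max(r, 1e-6), 1.0 - 1e-6)
    return 10.0 * math.log10(r / (1.0 - r))


def feature_vector(periods, amplitudes, signal, lag):
    """Concatenate the three features into one vector for a classifier."""
    return [local_jitter(periods), local_shimmer(amplitudes),
            hnr_db(signal, lag)]
```

For example, `local_jitter([0.010, 0.011, 0.010, 0.011])` divides a mean period difference of 0.001 s by a mean period of 0.0105 s. A perfectly periodic signal drives `hnr_db` to its clipped upper limit, while added noise lowers it.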

Description

Technical field

[0001] The invention belongs to the field of speech recognition and security technology, and in particular relates to a speech speed-multiplication ("double-speed") attack detection method based on prosodic features and a random forest classifier.

Background art

[0002] Automatic Speech Recognition (ASR) systems recognize speech and output the recognized text. Popular ASR systems include open-source systems (such as Kaldi and DeepSpeech) and commercial systems (such as Google Cloud Speech-to-Text, Baidu ASR, and iFlytek). For the input audio, an ASR system first performs signal processing to reduce noise and remove irrelevant frequency components; it then divides the processed audio signal into short segments and extracts features such as Mel Frequency Cepstral Coefficients (MFCC); finally, a pre-trained speech recognition model uses the extracted features to infer the most likely word sequence.

[0003] Time-scale Modification (TSM) refers to an...
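The segmentation step described in [0002], in which the processed audio is divided into short segments before feature extraction, can be sketched as below. This is an illustrative assumption rather than any particular ASR system's front end: the function names are hypothetical, and the 25 ms frame length and 10 ms hop are common choices in speech processing that the source does not state.

```python
def frame_params(sample_rate, frame_ms=25, hop_ms=10):
    """Convert frame length and hop from milliseconds to sample counts."""
    return sample_rate * frame_ms // 1000, sample_rate * hop_ms // 1000


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames of frame_len samples.

    Trailing samples that do not fill a whole frame are dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]
```

At a 16 kHz sample rate these defaults give 400-sample frames with a 160-sample hop; per-frame features such as MFCCs would then be computed on each frame.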


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/51, G10L25/27, G06K9/00, G06N3/00
CPC: G10L25/51, G10L25/27, G06N3/006, G06F2218/08, G06F2218/12
Inventor: 徐文渊, 冀晓宇, 闫琛, 何睿文, 石卓扬, 李超豪
Owner: ZHEJIANG UNIV