A Synthetic Speech Detection Method Based on Speech Segmentation

A synthetic speech detection technology, applied in speech analysis, speech recognition, instruments, etc. It addresses the high threat that synthetic and converted speech pose to ASV systems and improves detection accuracy.

Active Publication Date: 2022-05-31
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

The principle of a converted speech attack is similar to that of a synthetic speech attack, and it poses a greater threat to the ASV system.
At the same time, these two attacks often appear in other application scenarios of speech recognition technology, such as phone fraud.

Examples

Embodiment Construction

[0040] The training phase mainly includes two parts: data processing and model training.

[0043] Step A1: Obtain all training data from the training set, and check the sampling rate of the speech data in the training set.
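The sampling-rate check in Step A1 could be sketched as follows. This is only an illustration under stated assumptions: the source does not name a file format or a target rate, so WAV input and a 16 kHz target are assumed here, and the helper names (wav_sample_rate, find_mismatched, make_silent_wav) are hypothetical.

```python
import io
import wave


def wav_sample_rate(source):
    """Return the sampling rate in Hz of a WAV file path or file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def find_mismatched(sources, expected_hz=16000):
    """List the names of recordings whose rate differs from expected_hz.

    sources is an iterable of (name, path_or_file_object) pairs.
    expected_hz=16000 is an assumption, not a value from the source.
    """
    return [name for name, src in sources if wav_sample_rate(src) != expected_hz]


def make_silent_wav(rate_hz, n_samples=160):
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * n_samples)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    demo = [("a.wav", make_silent_wav(16000)), ("b.wav", make_silent_wav(8000))]
    print(find_mismatched(demo))
```

Recordings reported by find_mismatched would then need resampling or exclusion before feature extraction; which of the two the method uses is not stated in the available text.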

[0052] Step B2: The voice data is divided into short segments of 10 ms each, with some overlap between adjacent segments. Speech signal in macro...
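The segmentation in Step B2 can be illustrated with a minimal pure-Python sketch. The 10 ms segment length comes from the text; the amount of overlap is not given, so a 5 ms hop (50% overlap) is assumed, and the function name frame_signal is hypothetical.

```python
def frame_signal(samples, sample_rate_hz, frame_ms=10.0, hop_ms=5.0):
    """Split a sequence of samples into overlapping fixed-length frames.

    frame_ms=10.0 follows Step B2. hop_ms=5.0 is an assumed value: any hop
    shorter than the frame length produces overlapping frames.
    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    hop_len = int(round(sample_rate_hz * hop_ms / 1000.0))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop must each cover at least one sample")

    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames


if __name__ == "__main__":
    signal = list(range(400))            # 25 ms of dummy samples at 16 kHz
    frames = frame_signal(signal, 16000)
    print(len(frames), len(frames[0]))
```

With these assumed settings, the second half of each frame is repeated as the first half of the next one, which is one common way to satisfy the overlap requirement.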

[0063] The main purpose of the deployment stage is to deploy the model, with its trained parameters, to the synthetic speech detection inference module on the device.

[0064] Use the trained model for inference detection. Inference detection is mainly divided into three parts: the data processing part, the inference part...

[0065] As shown in Figure 4, the deployment stage is divided into high-precision detection deployment and rapid detection deployment.

[0066] Step D: High-precision inference deployment stage. The detailed steps are as follows:

[0068] Step D2:...

Abstract

The invention discloses a synthetic speech detection method based on speech segmentation, applied in the field of speech detection. To address the low detection accuracy of the prior art, the invention extracts two features from the audio: the CQCC feature of the audio segments and the average zero-crossing rate of the silent segments. Two GMM models are then used to fit the two features respectively, and different weights are assigned to the two GMMs and tested to find the most suitable weighting. This significantly improves synthetic speech detection accuracy.
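Two ideas from the abstract can be sketched in a few lines: the zero-crossing rate of a segment, and the search for a weight that combines the scores of the two GMMs. This is a hedged illustration, not the patented procedure. It assumes each GMM has already produced one score per utterance (for example a log-likelihood ratio, positive meaning "genuine"), that the fused score is a convex combination thresholded at zero, and that the weight is chosen by a simple grid search on labeled development data. The function names and the label convention (1 = genuine, 0 = synthetic) are hypothetical.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def best_fusion_weight(cqcc_scores, zcr_scores, labels, steps=11):
    """Grid-search a weight w in [0, 1] for w * cqcc + (1 - w) * zcr.

    An utterance is predicted genuine (1) when the fused score is > 0.
    Returns (weight, accuracy) for the first weight reaching the best accuracy.
    """
    best_w, best_acc = 0.0, -1.0
    for i in range(steps):
        w = i / (steps - 1)
        correct = 0
        for a, b, label in zip(cqcc_scores, zcr_scores, labels):
            fused = w * a + (1.0 - w) * b
            predicted = 1 if fused > 0 else 0
            correct += predicted == label
        acc = correct / len(labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


if __name__ == "__main__":
    # Made-up development scores, for demonstration only.
    cqcc = [2.0, 1.0, -1.0, -2.0]
    zcr = [-1.0, 0.5, 0.5, -0.5]
    labels = [1, 1, 0, 0]
    print(zero_crossing_rate([1, -1, 1, -1]))
    print(best_fusion_weight(cqcc, zcr, labels))
```

In practice the weight would be selected on held-out data rather than on the test set, and a different fusion rule or threshold may be what the full description specifies.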

Description

Technical field

[0001] The invention belongs to the field of speech detection, and in particular relates to a synthetic speech detection technology.

Background technique

[0002] With the development of artificial intelligence, great changes have taken place in embedded devices. The application of image recognition and face unlock in embedded equipment greatly facilitates production and life. As a representative of acoustic artificial intelligence, speech recognition has been more and more widely used in embedded devices, for example in voice assistants and voiceprint unlocking. Speech recognition technology refers to the process of identification and analysis that enables a computer to convert speech signals into corresponding text or commands. Automatic speaker verification (ASV) is a speech recognition technology that identifies individuals by distinguishing the voiceprint characteristics of human speech. ...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G10L15/02, G10L15/04, G10L15/06, G10L25/24
CPC: G10L15/02, G10L15/04, G10L15/063, G10L25/24
Inventor: 詹瑾瑜江维蒲治北杨永佳边晨雷洪江昱呈于安泰
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA