A Synthetic Speech Detection Method Based on Speech Segmentation

A synthetic speech detection method, applicable to speech analysis and speech recognition, that addresses the high threat spoofing attacks pose to ASV systems and improves detection accuracy.

Active Publication Date: 2022-05-31

UNIV OF ELECTRONICS SCI & TECH OF CHINA

Cites: 13. Cited by: 0.


AI Technical Summary

Problems solved by technology

A converted speech attack works on a principle similar to that of a synthetic speech attack, and it poses a greater threat to ASV systems.

Both attacks also appear in other application scenarios of speech technology, such as phone fraud.



Embodiment Construction

[0040] The training phase mainly includes two parts: data processing and model training.

[0043] Step A1: Obtain all training data from the training set, and check the sampling rate of the speech recordings in the training set.
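The sampling-rate check in Step A1 could be sketched as below. This is only an illustration: the source does not say how the training data is stored, so the WAV format, the function name `find_rate_mismatches`, and the 16000 Hz default are all assumptions.

```python
import wave


def find_rate_mismatches(wav_paths, expected_rate=16000):
    """Return {path: rate} for every WAV file whose rate differs from expected_rate.

    expected_rate=16000 is a hypothetical default, not a value from the patent.
    """
    mismatches = {}
    for path in wav_paths:
        # Only the header is needed, so no audio frames are read here.
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_rate:
            mismatches[str(path)] = rate
    return mismatches
```

Files reported by such a check would need resampling before feature extraction, since frame lengths in samples depend on the sampling rate.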


[0052] Step B2: The voice data is divided into short segments of 10 ms each, with some overlap between adjacent segments. Speech signal in macro...
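The segmentation in Step B2 can be sketched roughly as follows. The 10 ms segment length comes from the text; the amount of overlap is not stated, so the 5 ms hop (50% overlap) and the function name `split_into_frames` are assumptions made for illustration.

```python
def split_into_frames(samples, sample_rate, frame_ms=10.0, hop_ms=5.0):
    """Split a 1-D sequence of samples into overlapping fixed-length frames.

    frame_ms=10 follows Step B2; hop_ms=5 (50% overlap) is an assumption.
    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop must each cover at least one sample")
    if hop_len > frame_len:
        raise ValueError("hop longer than frame leaves gaps instead of overlap")
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(list(samples[start:start + frame_len]))
        start += hop_len  # advancing by less than frame_len creates the overlap
    return frames
```

At a 16 kHz sampling rate, each frame holds 160 samples and consecutive frames share 80 of them under these assumed settings.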


[0063] The main purpose of the deployment stage is to load the model with its trained parameters into the synthetic speech detection inference engine on the device.

[0064] The trained model is used for inference detection, which is divided into three main parts: the data processing part, the inference part...

[0065] As shown in Figure 4, the deployment stage is divided into two modes: high-precision detection deployment and rapid detection deployment.

[0066] Step D: High-precision inference deployment stage. The detailed steps are as follows:

[0068] Step D2:...


Abstract

The invention discloses a synthetic speech detection method based on speech segmentation, applied in the field of speech detection. To address the low detection accuracy of the prior art, the invention extracts two features from the audio: the CQCC feature of the audio segments and the average zero-crossing rate feature of the silent segments. Two GMMs are then fitted, one to each feature; different weights are assigned to the two GMMs and tested to find the most suitable weighting. This significantly improves synthetic speech detection accuracy.
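The average zero-crossing rate feature and the weight search over the two GMMs might look like the sketch below. The abstract does not give the fusion formula, the decision rule, or the search procedure, so the convex combination in `fuse_scores`, the grid search in `search_weight`, the threshold of 0, and the label convention (1 = genuine, 0 = synthetic) are all assumptions. The per-utterance scores are taken as given, for example as log-likelihood ratios produced by each fitted GMM.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def average_zcr(silent_frames):
    """Mean zero-crossing rate over the frames of a silent segment."""
    if not silent_frames:
        return 0.0
    return sum(zero_crossing_rate(f) for f in silent_frames) / len(silent_frames)


def fuse_scores(cqcc_score, zcr_score, weight):
    """Weighted sum of the two model scores; weight in [0, 1] (assumed form)."""
    return weight * cqcc_score + (1.0 - weight) * zcr_score


def search_weight(cqcc_scores, zcr_scores, labels, steps=10, threshold=0.0):
    """Grid-search the weight with the highest accuracy on labeled scores.

    labels: 1 = genuine, 0 = synthetic (assumed convention). A fused score
    above `threshold` is classified as genuine. Ties keep the smallest weight.
    """
    if not labels:
        raise ValueError("at least one labeled example is required")
    best_weight, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        correct = sum(
            1
            for c, z, y in zip(cqcc_scores, zcr_scores, labels)
            if (fuse_scores(c, z, w) > threshold) == (y == 1)
        )
        acc = correct / len(labels)
        if acc > best_acc:
            best_weight, best_acc = w, acc
    return best_weight, best_acc
```

In practice the weight would be chosen on a held-out development set rather than on the training data, so that the reported accuracy is not inflated by the search itself.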

Description

A synthetic speech detection method based on speech segmentation

Technical field

[0001] The invention belongs to the field of speech detection, and in particular relates to a synthetic speech detection technology.

Background

[0002] With the development of artificial intelligence, embedded devices have changed greatly. Applications such as image recognition and face unlock on embedded devices greatly facilitate production and daily life. As a representative of acoustic artificial intelligence, speech recognition is increasingly used on embedded devices in applications such as voice assistants and voiceprint unlocking. Speech recognition technology refers to a process of identification and analysis that enables a computer to convert speech signals into the corresponding text or commands. Automatic speaker verification (ASV) is a speech recognition technology that identifies individuals by distinguishing the voiceprint characteristics of human speech. ...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L15/02, G10L15/04, G10L15/06, G10L25/24

CPC: G10L15/02, G10L15/04, G10L15/063, G10L25/24

Inventor: 詹瑾瑜, 江维, 蒲治北, 杨永佳, 边晨, 雷洪, 江昱呈, 于安泰

Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA
