Audio segmentation method based on decision tree and speaker change detection

An audio segmentation technology based on a decision tree and speaker change detection. It is applied in speech analysis, speech recognition, instruments, etc. It addresses problems such as unreliable distance values, saving calculation time and improving accuracy.

Status: Inactive. Publication Date: 2006-01-04

ZHEJIANG UNIV

0 Cites, 40 Cited by

Problems solved by technology

The KL2 distance-based algorithm calculates the distance between speech segments taken from a moving fixed-length window, which makes the distance values unreliable.
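For context, the KL2 distance between adjacent fixed-length windows can be sketched as follows. This is a minimal illustration, not the formulation used in the patent: it assumes each window holds a one-dimensional feature modelled as a single Gaussian, and the names `kl2` and `adjacent_window_distances`, along with the `eps` guard, are hypothetical.

```python
from statistics import mean, pvariance


def kl2(a, b, eps=1e-12):
    """Symmetric KL (KL2) distance between two windows of 1-D features.

    Assumption: each window is modelled as a univariate Gaussian.
    eps keeps the variances strictly positive for constant windows.
    """
    mu_a, mu_b = mean(a), mean(b)
    var_a = pvariance(a, mu_a) + eps
    var_b = pvariance(b, mu_b) + eps
    d2 = (mu_a - mu_b) ** 2
    # KL(A||B) + KL(B||A); the log-variance terms cancel in the sum.
    return (var_a + d2) / (2 * var_b) + (var_b + d2) / (2 * var_a) - 1.0


def adjacent_window_distances(x, win):
    """KL2 between the window ending at i and the window starting at i."""
    return [
        kl2(x[i - win:i], x[i:i + win])
        for i in range(win, len(x) - win + 1)
    ]


features = [0.0, 2.0, 0.0, 2.0, 10.0, 12.0, 10.0, 12.0]
distances = adjacent_window_distances(features, win=2)
```

With a short fixed window, each Gaussian is estimated from very few samples, which is one way to see why the resulting distance values can be unreliable.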

Embodiment Construction

[0011] The present invention is further described below with reference to the accompanying drawings and embodiments. The audio segmentation method based on a decision tree and speaker change detection consists of six steps:

[0012] Step 1: Audio Preprocessing

[0013] Audio preprocessing consists of sampling and quantization, zero-drift removal, pre-emphasis, and windowing.

[0014] 1. Sampling and quantization

[0015] A) Filter the audio signal with a sharp cut-off filter so that its Nyquist frequency $F_N$ is 4 kHz;

[0016] B) Set the audio sampling rate $F = 2F_N$;

[0017] C) Sample the audio signal $s_a(t)$ periodically to obtain the amplitude sequence of the digital audio signal, $s(n) = s_a(n/F)$;

[0018] D) $s(n)$ is quantized and coded by ...
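Steps A) to C) can be illustrated with a short sketch. It assumes the input is available as a Python function of time; the `tone` signal and the `sample` helper are hypothetical stand-ins, and the quantization of step D) is not shown because the source text is truncated at that point.

```python
import math

F_N = 4000.0   # Nyquist frequency in Hz (step A)
F = 2 * F_N    # sampling rate in Hz (step B)


def sample(s_a, num_samples, rate=F):
    """Return s(n) = s_a(n / rate) for n = 0 .. num_samples - 1 (step C)."""
    return [s_a(n / rate) for n in range(num_samples)]


def tone(t, freq=1000.0):
    """Hypothetical analog input: a 1 kHz sine, well below F_N."""
    return math.sin(2 * math.pi * freq * t)


s = sample(tone, 16)
```

At 8 kHz, a 1 kHz tone spans eight samples per period, so `s[2]` falls on the first peak of the sine.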


Abstract

The audio segmentation method based on a decision tree and speaker change detection first applies adaptive silence detection to find the silences in the audio and uses them to coarsely segment the audio signal. It then applies abrupt change detection to finely segment the audio signal and uses a decision tree to classify the audio segments as speech or non-speech. Finally, it detects the speaker change points within the speech segments to obtain the final segmentation result. The invention performs speech detection by combining silence detection with abrupt change detection and by adopting a speech/non-speech decision tree, which raises the accuracy of speech detection, and it performs speaker change detection within the speech segments, which saves calculation time.
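The coarse segmentation stage can be pictured with a small sketch. The abstract does not give the adaptive rule, so everything here is an assumption for illustration: frames are non-overlapping, the silence threshold is a fixed fraction (`ratio`) of the mean frame energy, and the names `frame_energies` and `coarse_segments` are hypothetical.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(v * v for v in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def coarse_segments(samples, frame_len=4, ratio=0.1):
    """Return (start_frame, end_frame) pairs of non-silent runs.

    Assumption: the threshold adapts to the input as
    ratio * mean frame energy; the patent's actual rule is not shown.
    """
    energies = frame_energies(samples, frame_len)
    if not energies:
        return []
    threshold = ratio * sum(energies) / len(energies)
    segments, start = [], None
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx                     # a non-silent run begins
        elif energy <= threshold and start is not None:
            segments.append((start, idx))   # run ends at a silent frame
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments


audio = [1.0] * 8 + [0.0] * 8 + [1.0] * 4
runs = coarse_segments(audio)
```

In the described method, each such run would then be refined by abrupt change detection, classified by the decision tree, and, if it is speech, searched for speaker change points.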

Description

Technical field

[0001] The invention relates to signal processing and pattern recognition, and in particular to an audio segmentation method based on a decision tree and speaker change detection.

Background

[0002] Speaker retrieval refers to retrieving specific speakers from a large collection of audio documents using signal processing and pattern recognition methods. It must answer two questions: who is speaking, and when. Speaker retrieval commonly uses speaker recognition to determine who is speaking, while determining when requires audio segmentation.

[0003] Commonly used segmentation methods include segmentation based on the Bayesian information criterion and segmentation based on the KL2 distance. The Bayesian information criterion method decides whether to segment by calculating the Bayesian values of two hypotheses: "two audio features obey the same Gaussian distribution" an...


Application Information


IPC(8): G10L15/08, G10L15/00, G10L25/87

Inventor: 吴朝晖赵民德孟晓楠李红厉蒋姜旭锋

Owner: ZHEJIANG UNIV
