Complex audio segmentation clustering method based on bottleneck feature

A segmentation and clustering technology based on bottleneck features, applied in speech analysis, speech recognition, special data processing applications, and similar fields. It addresses the strong subjectivity, high cost, and low efficiency of manual labeling.

Status: Inactive; Publication Date: 2017-07-14

SOUTH CHINA UNIV OF TECH

Cites: 1; Cited by: 49


Problems solved by technology

Manual labeling can identify the audio types in an audio stream, but it is costly, highly subjective, and inefficient. Supervised audio classification, in turn, requires knowing the audio types in the stream in advance and training a classifier for those specific audio types beforehand.


Embodiment

[0165] Figure 4 is a flowchart of an embodiment of the complex audio segmentation and clustering method based on bottleneck features. It mainly includes the following processes:

[0166] 1. Construction of a deep neural network (DNN) with a bottleneck layer: read in the training data, extract Mel-frequency cepstral coefficient (MFCC) features, and then train a DNN feature extractor with a bottleneck layer in two steps, unsupervised pre-training followed by supervised fine-tuning. The specific steps include:

[0167] S1.1. Read in the training data and extract the MFCC features. The specific steps are as follows:

[0168] S1.1.1. Pre-emphasis: set the transfer function of the digital filter to H(z) = 1 - αz^(-1), where α is a coefficient with 0.9 ≤ α ≤ 1. The read-in audio stream is pre-emphasized by passing it through this digital filter;

[0169] S1.1.2. Framing: set the frame length of the audio frame to 25 milliseconds, the frame shift to 10 milliseconds, and the number ...
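The pre-emphasis and framing steps in S1.1.1 and S1.1.2 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the default α = 0.97 (chosen from the stated range 0.9 ≤ α ≤ 1), the 16 kHz sampling rate used in the usage lines, the zero initial condition for the filter, and the choice to drop a trailing partial frame are all assumptions made for illustration.

```python
def pre_emphasize(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], the time-domain form of H(z) = 1 - alpha*z^-1.

    alpha=0.97 is an assumed default inside the range 0.9 <= alpha <= 1.
    """
    if not 0.9 <= alpha <= 1.0:
        raise ValueError("alpha must satisfy 0.9 <= alpha <= 1")
    if not samples:
        return []
    out = [samples[0]]  # assumption: x[-1] = 0, so the first sample is unchanged
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def split_into_frames(samples, sample_rate, frame_ms=25, shift_ms=10):
    """Cut a signal into overlapping frames of frame_ms, advancing by shift_ms.

    A trailing partial frame is dropped (an assumption; padding is another option).
    """
    frame_len = sample_rate * frame_ms // 1000
    shift = sample_rate * shift_ms // 1000
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += shift
    return frames


# Usage: one second of a hypothetical 16 kHz signal.
signal = [float(i % 7) for i in range(16000)]
frames = split_into_frames(pre_emphasize(signal), sample_rate=16000)
```

At the assumed 16 kHz rate, a 25 ms frame holds 400 samples and a 10 ms shift advances 160 samples, so consecutive frames overlap by 240 samples.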


Abstract

The invention discloses a complex audio segmentation and clustering method based on bottleneck features. The method comprises the following steps: a deep neural network with a bottleneck layer is constructed; a complex audio stream is read in and endpoint detection is performed on it; audio features of the non-silent segments are extracted and input into the deep neural network; bottleneck features are extracted from the bottleneck layer of the deep neural network; with the bottleneck features as input, an audio segmentation method based on the Bayesian information criterion is applied, so that each audio segment contains only one audio type and adjacent audio segments have different audio types; a spectral clustering algorithm clusters the segmented audio segments to obtain the number of audio types in the complex audio; and audio segments of the same audio type are merged. The bottleneck feature used by the invention is a deep transformed feature that describes the differences between complex audio types more effectively than traditional audio features and achieves excellent results in complex audio segmentation and clustering.
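The segmentation step based on the Bayesian information criterion (BIC) can be sketched as follows. This is a simplified illustration, not the patent's procedure: it models each segment with a single diagonal-covariance Gaussian (the widely used ΔBIC formulation uses full covariances), searches for only one change point in a window, and the names `delta_bic`, `best_change_point`, `penalty_weight`, and `margin` are hypothetical. In the method described above, the input frames would be bottleneck feature vectors.

```python
import math


def _diag_log_det(frames):
    """Log-determinant of a diagonal covariance: the sum of log variances per dimension."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n
        total += math.log(max(var, 1e-10))  # variance floor avoids log(0)
    return total


def delta_bic(frames, split, penalty_weight=1.0):
    """Delta-BIC for modelling frames as two Gaussians split at `split` versus one.

    A positive value favours a change point at `split`.
    """
    n = len(frames)
    dim = len(frames[0])
    left, right = frames[:split], frames[split:]
    gain = 0.5 * (n * _diag_log_det(frames)
                  - len(left) * _diag_log_det(left)
                  - len(right) * _diag_log_det(right))
    # A diagonal Gaussian has 2*dim free parameters (means and variances).
    penalty = 0.5 * penalty_weight * (2 * dim) * math.log(n)
    return gain - penalty


def best_change_point(frames, margin=5, penalty_weight=1.0):
    """Return the split index with the largest positive delta-BIC, or None if none is positive.

    `margin` (assumed >= 1) keeps enough frames on each side to estimate a variance.
    """
    best_split, best_score = None, 0.0
    for split in range(margin, len(frames) - margin + 1):
        score = delta_bic(frames, split, penalty_weight)
        if score > best_score:
            best_split, best_score = split, score
    return best_split


# Usage: 20 one-dimensional frames near 0 followed by 20 frames near 10.
toy = [[0.1 * (i % 2)] for i in range(20)] + [[10.0 + 0.1 * (i % 2)] for i in range(20)]
change = best_change_point(toy)
```

Raising `penalty_weight` makes the detector more conservative, which trades missed boundaries against spurious ones; a full segmenter would slide or grow the window along the stream and repeat the search.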

Description

Technical field

[0001] The invention relates to audio signal processing and pattern recognition technology, in particular to a complex audio segmentation and clustering method based on bottleneck features.

Background art

[0002] With the development and popularization of multimedia acquisition equipment, the Internet, and cloud storage platforms, the demand for analysis and retrieval of massive and complex audio content is becoming increasingly urgent. As an unsupervised method, complex audio segmentation and clustering is one of the important means of audio content analysis. Manual labeling can identify the audio types in an audio stream, but it is costly, highly subjective, and inefficient, while supervised audio classification requires knowing the audio types in the stream in advance and training a classifier for those specific audio types beforehand. Therefore, unsupervised complex audio segmentatio...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L15/04, G10L15/26, G10L25/24, G10L25/30, G10L25/51, G06F17/30

Inventor: 李艳雄, 王琴, 李先苦, 张雪, 张聿晗

Owner: SOUTH CHINA UNIV OF TECH
