System and method using blind change detection for audio segmentation

Combining multiple blind change detection algorithms for audio segmentation addresses the inaccuracy and robustness issues in existing techniques, enhancing the detection of audio changes with improved accuracy and reduced false alarms for applications like speech recognition and audio indexing.

Inactive Publication Date: 2011-08-02
INTERNATIONAL BUSINESS MACHINE CORPORATION
7 Cites, 4 Cited by

AI Technical Summary

Benefits of technology

This approach improves the accuracy and robustness of audio segmentation by minimizing the detection time for changes while maintaining a low false alarm rate, making it suitable for applications like speech recognition, speaker recognition, and online audio indexing.

Problems solved by technology

The performance of many applications based on audio streams, such as speech recognition and audio indexing, degrades significantly in the presence of irrelevant portions of the audio stream.
Informed automatic segmentation is limited to applications where a sufficient amount of training data is available for building the acoustic models.
It cannot generalize to acoustic conditions that are unseen in the training data.
Unfortunately, all current techniques for automatic blind segmentation, such as those using the Kullback-Leibler distance, the generalized likelihood ratio distance, or the Bayesian Information Criterion (BIC), try to optimize an objective function that is not directly related to minimizing the missing probability for a given false alarm rate.
Known solutions to this problem, such as the BIC criterion, are not accurate enough and have robustness problems because they employ a single criterion, which is not directly related to minimizing the missing probability for a given false alarm rate, and compare this criterion to a threshold.

Embodiment Construction

[0025] The present invention is directed to a system and method that combines various approaches to change detection for audio segmentation, each using a different statistical model of the data and optimizing a different criterion, to generate an automatic segmentation of the audio stream.

[0026] While an example embodiment described herein utilizes three (3) automatic change detection audio segmentation algorithms, it is understood that other algorithms providing for automatic segmentation of the audio data may be used in addition to, or as alternatives to, the three algorithms described herein. While it is understood that the invention contemplates use of at least two algorithms, three (3) algorithms employed according to the present invention are now described:

[0027] A. Change Detection Using the CuSum Algorithm

[0028] Under the assumption that the sequence of the log likelihood ratios, $\{l_i\}_{i=1}^{n}$,

[0029] is an i.i.d. process, the CuSum algorithm is optimal in the sense of minimizing detection time for a gi...
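
The excerpt above is cut off before the CuSum recursion is given, so the following is only a minimal sketch of the textbook one-sided CuSum (Page's) test applied to a sequence of log likelihood ratios, not the patent's exact formulation. The function name `cusum_first_alarm`, the `threshold` parameter, and the toy input are assumptions made for illustration.

```python
from typing import Iterable, Optional


def cusum_first_alarm(llrs: Iterable[float], threshold: float) -> Optional[int]:
    """Return the index of the first sample at which the CuSum statistic
    exceeds `threshold`, or None if no alarm is raised.

    `llrs` holds per-sample log likelihood ratios l_i (change vs. no change).
    Both the name and the threshold rule are illustrative assumptions.
    """
    g = 0.0  # running statistic, clipped at zero so old evidence cannot go negative
    for i, l_i in enumerate(llrs):
        g = max(0.0, g + l_i)  # standard one-sided CuSum update
        if g > threshold:
            return i
    return None


if __name__ == "__main__":
    # Toy input (made up): five frames favouring "no change", then five favouring "change".
    toy = [-1.0] * 5 + [1.0] * 5
    print(cusum_first_alarm(toy, threshold=2.5))
```

Raising `threshold` lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the paragraph refers to.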


Abstract

A system, method and computer program product for performing blind change detection audio segmentation that combines hypothesized boundaries from several segmentation algorithms to achieve the final segmentation of the audio stream. Automatic segmentation of the audio streams according to the system and method of the invention may be used for many applications like speech recognition, speaker recognition, audio data mining, online audio indexing, and information retrieval systems, where the actual boundaries of the audio segments are required.
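
The abstract does not state how the hypothesized boundaries are combined, so the sketch below shows one plausible scheme only: boundaries from different detectors that fall within a time tolerance of each other are grouped, and a group is kept when enough distinct detectors agree. The function name `combine_boundaries`, the `tolerance` and `min_votes` parameters, and the sample times are all hypothetical.

```python
from typing import List, Sequence, Tuple


def combine_boundaries(
    hypotheses: Sequence[Sequence[float]],
    tolerance: float = 0.5,
    min_votes: int = 2,
) -> List[float]:
    """Merge boundary times (in seconds) proposed by several detectors.

    Hypothetical voting rule, not taken from the patent: consecutive pooled
    boundaries closer than `tolerance` form a cluster, and a cluster is kept
    if at least `min_votes` distinct detectors contributed to it.
    """
    # Pool every (time, detector index) pair and sort by time.
    pooled: List[Tuple[float, int]] = sorted(
        (t, d) for d, times in enumerate(hypotheses) for t in times
    )
    merged: List[float] = []
    cluster: List[Tuple[float, int]] = []

    def flush() -> None:
        detectors = {d for _, d in cluster}
        if len(detectors) >= min_votes:
            # Report the mean time of the agreeing hypotheses.
            merged.append(sum(t for t, _ in cluster) / len(cluster))

    for t, d in pooled:
        if cluster and t - cluster[-1][0] > tolerance:
            flush()
            cluster = []
        cluster.append((t, d))
    if cluster:
        flush()
    return merged


if __name__ == "__main__":
    # Made-up boundary times from three detectors.
    print(combine_boundaries([[1.0, 5.0], [1.2, 9.0], [1.1, 5.3]]))
```

Requiring agreement between detectors is one way to trade a few missed boundaries for fewer false alarms; other fusion rules, such as weighting detectors by confidence, would fit the same interface.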

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] The present application is a continuation application of U.S. Ser. No. 11/206,621, filed Aug. 18, 2005; and relates to and claims the benefit of U.S. Provisional Patent Application Ser. No. 60/663,079 filed Mar. 18, 2005, the entire contents and disclosure of which is incorporated by reference herein.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] This invention was made with Government support under contract number H98230-04-3-0001 awarded by the Distillery Phase II Program. The Government has certain rights in this invention.

BACKGROUND OF THE INVENTION

[0003] 1. Field of the Invention

[0004] The present invention relates generally to the field of audio data processing systems and methods, and, more particularly, to a novel system and method for performing blind change detection audio segmentation.

[0005] 2. Discussion of the Prior Art

[0006] Many audio resources like broadcast news contain different kinds of audio signals li...

Application Information

Patent Type & Authority: Patents (United States)
IPC: IPC(8): G10L21/00
CPC: G10L25/48
Owner: INTERNATIONAL BUSINESS MACHINE CORPORATION