Audio indexing method based on multi-distance sound sensor

Acoustic sensor technology, applied in the fields of instruments, speech analysis, and speech recognition. It addresses problems such as small sample sizes, limited training data, and loss of discriminative information.

Active Publication Date: 2012-06-20

TSINGHUA UNIV

2 Cites, 9 Cited by


Problems solved by technology

The GMM-SVM method performs relatively well, but several problems remain: the GMM has too many parameters to estimate when modeling the probability density, the training data is limited, and GMM-SVM mainly targets speaker recognition and has not developed into a general-purpose technique.

A drawback of LPP-based algorithms is that the dimensionality reduction alters the manifold distribution of the data, which leads to loss of discriminative information, the small-sample problem, and related issues.

To address the small-sample problem, Yang et al. proposed the Null-space Locality Preserving Projections algorithm (NDLPP), but this method uses only the discriminative information in the null space and ignores the discriminative information in the principal space.
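As background for the problems listed above, the following is a minimal sketch of standard Locality Preserving Projections (LPP). It is not the algorithm of this patent and not NDLPP. The function name `lpp`, its parameters, and the small ridge term `reg` are assumptions made for illustration. The ridge term is included because the matrix X^T D X becomes singular when there are fewer samples than feature dimensions, which is the small-sample problem mentioned above.

```python
import numpy as np


def lpp(X, n_components, n_neighbors=5, t=1.0, reg=1e-6):
    """Standard LPP sketch. X has shape (n_samples, n_features).

    Returns a projection matrix of shape (n_features, n_components).
    All parameter names here are illustrative assumptions.
    """
    n, d = X.shape

    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # k-nearest-neighbour graph with heat-kernel weights.
    # Column 0 of the sorted indices is normally the sample itself, so skip it.
    idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), idx.shape[1])
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / t)
    W = np.maximum(W, W.T)  # make the graph symmetric

    D = np.diag(W.sum(axis=1))  # degree matrix
    L = D - W                   # graph Laplacian

    A = X.T @ L @ X
    # Without the ridge term, B is singular when n < d (small-sample problem).
    B = X.T @ D @ X + reg * np.eye(d)

    # Solve the generalized eigenproblem A a = lambda B a by Cholesky whitening.
    C = np.linalg.cholesky(B)
    C_inv = np.linalg.inv(C)
    M = C_inv @ A @ C_inv.T
    M = (M + M.T) / 2.0         # remove round-off asymmetry
    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order

    # The eigenvectors with the smallest eigenvalues preserve locality best.
    return C_inv.T @ vecs[:, :n_components]
```

A call such as `Y = X @ lpp(X, n_components=2)` projects the data to two dimensions. A plain ridge term is only one common workaround for the singular case; the null-space approach cited above handles it differently.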

Embodiment Construction

[0048] The present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments.

[0049] Input devices for SPKR include headset microphones, single microphones, microphone arrays, and multiple distance microphones (Multiple Distance Microphones). The multi-distance acoustic sensor suits complex dialogue scenarios with multiple sound sources and directions, and can be applied to sound source localization, speaker clustering, speaker identification, and similar tasks. Because of the particular topology of the multi-distance acoustic sensor, the multi-time-delay feature can be used to classify spatially non-overlapping sound sources.
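To make the idea of a multi-time-delay feature concrete, here is a minimal sketch that estimates the delay between every pair of microphones by cross-correlation and stacks the delays into one feature vector. This is an assumption about what such a feature could look like, not the extraction procedure defined in the patent. The function names `pairwise_delay`, `multi_delay_feature`, and `simulate_mics` are hypothetical, and the simulated signal is white noise rather than real speech.

```python
import itertools

import numpy as np


def pairwise_delay(x, y, max_lag):
    """Lag in samples, within [-max_lag, max_lag], that maximizes the
    cross-correlation of x and y. A positive lag means y lags behind x."""
    lags = np.arange(-max_lag, max_lag + 1)
    scores = []
    for lag in lags:
        if lag >= 0:
            a, b = x[: len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[: len(y) + lag]
        scores.append(float(np.dot(a, b)))
    return int(lags[int(np.argmax(scores))])


def multi_delay_feature(frame, max_lag=20):
    """frame has shape (n_mics, n_samples). Returns one delay per
    microphone pair (i, j) with i < j, in lexicographic pair order."""
    n_mics = frame.shape[0]
    return np.array([
        pairwise_delay(frame[i], frame[j], max_lag)
        for i, j in itertools.combinations(range(n_mics), 2)
    ])


def simulate_mics(delays, n_samples=2048, seed=0):
    """Hypothetical test signal: one white-noise source that reaches
    microphone k after delays[k] samples."""
    rng = np.random.default_rng(seed)
    pad = max(delays)
    source = rng.standard_normal(n_samples + pad)
    return np.stack([source[pad - d: pad - d + n_samples] for d in delays])


frame = simulate_mics([0, 3, 7, 12])
feature = multi_delay_feature(frame)
```

With four microphones the feature has six entries, one per microphone pair. A source at a different position would produce a different delay vector, which is why such a feature can separate sources that do not overlap in space.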

[0050] As shown in Figure 1, the multi-distance acoustic sensor system includes multiple acoustic sensors. Figure 1 represents them with four acoustic sensors 111-114, which are placed at random positions on the same platform. Likewise, only three sou...

Abstract

The invention discloses an audio indexing method based on a multi-distance sound sensor. In the method, a multi-distance sound sensor serves as the audio recording device and records the audio of a multimedia conference. A spatial multi-delay feature is extracted from the multi-distance sound sensor and used to distinguish different speakers, and a new manifold algorithm reduces the dimensionality of the multi-delay feature and classifies the speakers by identity. The method can reduce the complexity and computational cost of the system. Finally, the system outputs the audio segments and identity of each speaker as the audio index information. The set of optimal discriminant vectors obtained by the method achieves optimal discrimination in theory, and the method can be applied to multi-person, multi-party conversation scenes in complicated acoustic environments.
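The final output step described in the abstract, audio segments paired with speaker identities, can be sketched as follows. This is a minimal illustration that assumes the classifier has already produced one speaker label per fixed-length frame. The function name `build_audio_index`, the dictionary keys, and the `frame_shift_s` parameter are hypothetical and do not come from the patent.

```python
from itertools import groupby


def build_audio_index(frame_labels, frame_shift_s=0.01):
    """Merge consecutive frames with the same speaker label into segments.

    frame_labels: one speaker label per frame, in time order.
    frame_shift_s: assumed time step between frames, in seconds.
    Returns a list of dicts with the speaker and the segment boundaries.
    """
    index = []
    start = 0  # index of the first frame of the current run
    for speaker, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        index.append({
            "speaker": speaker,
            "start_s": start * frame_shift_s,
            "end_s": (start + length) * frame_shift_s,
        })
        start += length
    return index


segments = build_audio_index(["A", "A", "B", "B", "B", "A"], frame_shift_s=0.5)
```

Each entry can then be searched by speaker or by time range. A practical system would probably also smooth very short runs before merging, which this sketch leaves out.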

Description

Technical field

[0001] The invention belongs to the technical field of audio and relates to audio indexing, in particular to an audio indexing method based on a multi-distance acoustic sensor.

Background

[0002] Teleconferencing and video conferencing have increasingly penetrated business activities and daily life, and the corresponding recorded data has grown geometrically. In such scenarios, a piece of audio data usually contains multiple sound sources. Audio indexing techniques can process such data and reduce the load on post-processing methods such as speech recognition.

[0003] Audio indexing technology automatically extracts information from audio data in order to search for and discover target content. Speaker classification is the key technology of audio indexing. It consists of three parts: feature extraction, speech segmentation, and classification decision-making. The main algorithms are mixed Gaussian log-likelihood ra...

Application Information

IPC(8): G10L15/08

Inventors: 杨毅, 陈国顺, 王胜开

Owner: TSINGHUA UNIV
