Neural network classifier for separating audio sources from a monophonic audio signal

A technique combining audio signal processing with a neural network, applicable to voice analysis, instrumentation, voice recognition, and related fields, that addresses problems such as ineffective equalization.

Inactive Publication Date: 2009-02-11
DTS BVI

AI Technical Summary

Problems solved by technology

Equalization is not effective for extracting overlapping sources.


Embodiment Construction

[0024] The present invention provides the ability to separate and categorize multiple arbitrary and a priori unknown audio sources downmixed to a single monophonic audio signal.

[0025] As shown in FIG. 1, multiple audio sources 10 such as speech, stringed instruments, and percussion are downmixed (step 12) to a single monophonic audio channel 14. The mono signal can be a conventional mono mix, or it can be one channel of a stereo or multi-channel signal. In most cases, there is no a priori information about the specific types of audio sources in a specific mix, the signals themselves, how many different signals are included, or the mixing coefficients. What is known is the set of source types that can be included in a particular mix. For example, an application could be categorizing the sources or primary sources in a music mix. The classifier will know that possible sources include male voice, female voice, stringed instruments, percussion, etc. The classifier will not know which or how many o...
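The downmix step described above can be pictured with a minimal sketch. It assumes the mix is a plain weighted sum of the source signals; the function name `downmix`, the sample rate, the two synthetic stand-in sources, and the coefficients are all hypothetical and are not taken from the patent. The point is only that the mono channel carries the summed samples, while the source identities and the mixing coefficients are not recoverable from it directly.

```python
import math
import random

def downmix(sources, coefficients):
    """Sum equally long source signals into one mono channel, sample by sample."""
    n = len(sources[0])
    assert all(len(s) == n for s in sources), "sources must have equal length"
    return [sum(c * s[i] for c, s in zip(coefficients, sources)) for i in range(n)]

SAMPLE_RATE = 8000  # Hz, assumed for illustration
n = 800             # 0.1 s of audio at the assumed rate

# Hypothetical stand-ins for real sources: a steady tone and a short noise burst.
tone = [math.sin(2 * math.pi * 220 * t / SAMPLE_RATE) for t in range(n)]
rng = random.Random(0)
noise_burst = [rng.uniform(-1.0, 1.0) if 200 <= t < 400 else 0.0 for t in range(n)]

# The mixing coefficients are known here only because this sketch builds the mix;
# a classifier working on `mono` alone would not have them.
mono = downmix([tone, noise_burst], [0.6, 0.4])
```

In this sketch the two sources overlap in time between samples 200 and 399, which is the situation where per-frame source labels become useful.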

Abstract

A neural network classifier provides the ability to separate and categorize multiple arbitrary and previously unknown audio sources down-mixed to a single monophonic audio signal. This is accomplished by breaking the monophonic audio signal into baseline frames (possibly overlapping), windowing the frames, extracting a number of descriptive features in each frame, and employing a pre-trained nonlinear neural network as a classifier. Each neural network output indicates the presence of a pre-determined type of audio source in each baseline frame of the monophonic audio signal. The neural network classifier is well suited to address widely changing parameters of the signal and sources, time and frequency domain overlapping of the sources, and reverberation and occlusions in real-life signals. The classifier outputs can be used as a front-end to create multiple audio channels for a source separation algorithm (e.g., ICA) or as parameters in a post-processing algorithm (e.g., to categorize music, track sources, generate audio indexes for the purposes of navigation, re-mixing, security and surveillance, telephone and wireless communications, and teleconferencing).
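The processing chain in the abstract (framing, windowing, feature extraction, neural network with one output per source type) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the Hann window, the three features (energy, zero-crossing rate, normalized spectral centroid), the one-hidden-layer network shape, the label list, and every weight value are hypothetical choices made here. In particular the weights are untrained placeholders standing in for the pre-trained network the abstract refers to.

```python
import cmath
import math

# Assumed label set; the classifier has one output per pre-determined source type.
SOURCE_TYPES = ["male voice", "female voice", "strings", "percussion"]

def frame_signal(signal, frame_len, hop):
    """Split a signal into baseline frames; hop < frame_len gives overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def hann_window(frame):
    """Apply a Hann window (an assumed choice; the source does not name a window)."""
    n = len(frame)
    return [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]

def extract_features(frame, sample_rate):
    """Return [energy, zero-crossing rate, spectral centroid / Nyquist] for one frame."""
    n = len(frame)
    energy = sum(x * x for x in frame) / n
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (n - 1)
    # Naive DFT magnitudes for bins 0..n/2 (slow, but keeps the sketch dependency-free).
    mags = [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame)))
            for k in range(n // 2 + 1)]
    total = sum(mags)
    centroid_hz = (sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total
                   if total > 0 else 0.0)
    return [energy, zcr, centroid_hz / (sample_rate / 2)]

def classify(feats, w_hidden, b_hidden, w_out, b_out):
    """Nonlinear feedforward pass: tanh hidden layer, one sigmoid output per source type."""
    hidden = [math.tanh(sum(w * f for w, f in zip(row, feats)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
            for row, b in zip(w_out, b_out)]

# Demo on a synthetic 440 Hz tone (hypothetical input, not data from the patent).
SAMPLE_RATE = 8000
signal = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(1024)]
frames = frame_signal(signal, frame_len=256, hop=128)  # 50% overlap
feature_rows = [extract_features(hann_window(f), SAMPLE_RATE) for f in frames]

# Placeholder weights only; a real system would load weights obtained by training.
w_hidden = [[0.5, -1.0, 2.0], [-0.3, 0.8, 0.1]]
b_hidden = [0.0, 0.1]
w_out = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0], [0.2, -0.7]]
b_out = [0.0, 0.0, 0.0, 0.0]
scores = [classify(f, w_hidden, b_hidden, w_out, b_out) for f in feature_rows]
```

Each row of `scores` holds one value per entry in `SOURCE_TYPES` for one frame. Independent sigmoid outputs are used here rather than a softmax so that several source types can be flagged as present in the same frame, which matches the overlapping-source setting the abstract describes.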

Description

Technical field

[0001] The present invention relates to the separation of multiple unknown audio sources downmixed to a single monophonic audio signal.

Background

[0002] There are various techniques for extracting sources from stereo or multi-channel audio signals. Independent component analysis (ICA) is the most widely used and researched method. However, ICA is only capable of extracting a number of sources equal to or less than the number of channels in the input signal. Therefore, it cannot be used to separate the components of a mono signal.

[0003] Extracting audio sources from monophonic signals can be used to extract speech signal features, synthesize multi-channel signal representations, classify music, track sources, generate additional channels for ICA, generate audio indexes for navigation purposes (browsing), and support remixing (consumer and professional), security and surveillance, telephony and wireless communications, and video conferencing. Extraction of speech signal features (su...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L19/00, G10L15/00, G10L21/00, G10L21/04
CPC: G10L21/0272, G10L25/30
Inventor: D. V. Shmunk
Owner: DTS BVI