Techniques are disclosed for controlling a multimedia software application using high-level metadata features and symbolic object labels derived from an audio source. A first pass of low-level signal analysis is performed, followed by a stage of statistical and perceptual processing, followed by a symbolic machine-learning or data-mining processing component. This multi-stage analysis system delivers high-level metadata features, sound object identifiers, stream labels, or other symbolic metadata to application scripts or programs, which use the data to configure processing chains or map it to other media. Embodiments of the invention can be incorporated into multimedia content players, musical instruments, recording studio equipment, installed and live sound equipment, broadcast equipment, metadata-generation applications, software-as-a-service applications, search engines, and mobile devices.
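
By way of non-limiting illustration, the three analysis stages and the hand-off of a symbolic label to an application may be sketched as follows. Everything named here is an assumption made for the example and is not taken from the disclosure: the choice of RMS energy and zero-crossing rate as low-level features, the threshold values, the label names, the `CHAINS` table, and the function names. A simple threshold rule stands in for the symbolic machine-learning or data-mining component, and the perceptual processing of the second stage is omitted.

```python
import math
from statistics import mean, pstdev


def low_level_features(samples, frame_size=256):
    """Stage 1 (assumed features): per-frame RMS energy and zero-crossing rate."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        frames.append({"rms": rms, "zcr": crossings / (frame_size - 1)})
    return frames


def summarize(frames):
    """Stage 2: statistical summary of the frame-level features."""
    if not frames:
        raise ValueError("need at least one full frame of audio")
    rms = [f["rms"] for f in frames]
    zcr = [f["zcr"] for f in frames]
    return {"rms_mean": mean(rms), "rms_std": pstdev(rms), "zcr_mean": mean(zcr)}


def label(summary, silence_rms=0.01, noisy_zcr=0.3):
    """Stage 3: symbolic label. A threshold rule stands in for a trained model."""
    if summary["rms_mean"] < silence_rms:
        return "silence"
    return "noisy" if summary["zcr_mean"] > noisy_zcr else "tonal"


# Hypothetical mapping from symbolic labels to processing-chain settings.
CHAINS = {
    "silence": {"gate": True, "eq": None},
    "tonal": {"gate": False, "eq": "warm"},
    "noisy": {"gate": False, "eq": "low_pass"},
}


def configure_chain(samples):
    """Run all three stages and return the label with its chain settings."""
    symbolic_label = label(summarize(low_level_features(samples)))
    return symbolic_label, CHAINS[symbolic_label]


if __name__ == "__main__":
    # A 440 Hz sine sampled at 44.1 kHz, used only as a stand-in audio source.
    sine = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(2048)]
    print(configure_chain(sine))
```

In this sketch the application never sees raw audio; it receives only the symbolic label and looks up its own configuration, which mirrors the separation between the analysis system and the application scripts described above.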