An apparatus for generating, storing or editing an audio representation of an audio scene includes audio processing means for generating a plurality of speaker signals from a plurality of input channels, and means for providing an object-oriented description of the audio scene. The object-oriented description includes a plurality of audio objects, each audio object being associated with an audio signal, a starting time instant and an end time instant. The apparatus further includes mapping means for mapping the object-oriented description of the audio scene to the plurality of input channels, wherein the mapping means assigns temporally overlapping audio objects to parallel input channels and temporally sequential audio objects to the same input channel. An object-oriented representation is thereby converted into a channel-oriented representation, so that the optimal representation of a scene may be used on the object-oriented side, while the channel-oriented concept familiar to users is retained on the channel-oriented side.
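
The mapping described above can be illustrated with a minimal sketch. It is not taken from the source: the names `AudioObject` and `map_objects_to_channels` are hypothetical, the audio signal itself is omitted, and the greedy strategy (reuse the lowest-numbered free channel, otherwise open a new one) is only one possible way to satisfy the stated rule. The sketch also assumes that an object ending at exactly the instant another begins counts as sequential rather than overlapping.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioObject:
    """Hypothetical stand-in for an audio object (signal omitted)."""
    name: str
    start: float  # starting time instant
    end: float    # end time instant


def map_objects_to_channels(objects):
    """Assign each object a channel index.

    Temporally overlapping objects get different (parallel) channels;
    temporally sequential objects may share a channel.
    Returns a dict {object name: channel index}.
    """
    assignment = {}
    busy = []          # min-heap of (end_time, channel) for occupied channels
    free = []          # min-heap of channel indices that are free again
    next_channel = 0   # index of the next channel to open

    for obj in sorted(objects, key=lambda o: (o.start, o.end)):
        # Assumption: a channel is free once its object has ended at or
        # before the start of the current object.
        while busy and busy[0][0] <= obj.start:
            _, released = heapq.heappop(busy)
            heapq.heappush(free, released)

        if free:
            channel = heapq.heappop(free)   # reuse a channel (sequential case)
        else:
            channel = next_channel          # open a parallel channel (overlap)
            next_channel += 1

        assignment[obj.name] = channel
        heapq.heappush(busy, (obj.end, channel))

    return assignment


scene = [
    AudioObject("A", 0.0, 10.0),
    AudioObject("B", 5.0, 15.0),   # overlaps A
    AudioObject("C", 10.0, 20.0),  # follows A, overlaps B
    AudioObject("D", 16.0, 25.0),  # follows B, overlaps C
]
print(map_objects_to_channels(scene))
# {'A': 0, 'B': 1, 'C': 0, 'D': 1}
```

In this example four audio objects are carried on two input channels: A and C share one channel because they are sequential, while B and D occupy a second, parallel channel.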