A method and apparatus for editing heterogeneous media objects in a digital imaging device having a display screen, where each one of the media objects has one or more media types associated therewith, such as a still image, a sequential image, video, audio, and text. The method aspect of the present invention begins by displaying a representation of each one of the media objects on the display screen to allow a user to randomly select a particular media object to edit. In response to the user pressing a key to edit the selected media object, one or more specialized edit screens are invoked for editing the media types associated with the selected media object. If the media object includes a still or a sequential image, then an image editing screen is invoked. If the media object includes a video clip, then a video editing screen is invoked. If the media object includes an audio clip, then an audio editing screen is invoked. If the media object includes a text clip, then a text editing screen is invoked.
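
The selection of edit screens described above can be illustrated with a minimal sketch. It is not the claimed implementation. The names MediaType, MediaObject, SCREEN_FOR_TYPE, edit_screens_for, and the screen identifier strings are all hypothetical and chosen only for illustration; the abstract specifies only that each media type maps to an image, video, audio, or text editing screen. The sketch also assumes that a media object holding both a still and a sequential image invokes the image editing screen once.

```python
from dataclasses import dataclass
from enum import Enum, auto


class MediaType(Enum):
    """Media types named in the abstract."""
    STILL_IMAGE = auto()
    SEQUENTIAL_IMAGE = auto()
    VIDEO = auto()
    AUDIO = auto()
    TEXT = auto()


# Hypothetical screen identifiers. Still and sequential images share
# the image editing screen, as stated in the abstract.
SCREEN_FOR_TYPE = {
    MediaType.STILL_IMAGE: "image_edit_screen",
    MediaType.SEQUENTIAL_IMAGE: "image_edit_screen",
    MediaType.VIDEO: "video_edit_screen",
    MediaType.AUDIO: "audio_edit_screen",
    MediaType.TEXT: "text_edit_screen",
}


@dataclass(frozen=True)
class MediaObject:
    """A heterogeneous media object with one or more media types."""
    name: str
    media_types: frozenset


def edit_screens_for(obj):
    """Return the edit screens to invoke for the selected media object.

    Screens are listed in MediaType definition order, and each screen
    appears at most once (an assumption, not stated in the abstract).
    """
    screens = []
    for media_type in MediaType:  # Enum iteration follows definition order
        if media_type in obj.media_types:
            screen = SCREEN_FOR_TYPE[media_type]
            if screen not in screens:
                screens.append(screen)
    return screens


if __name__ == "__main__":
    # A still image with an attached audio annotation.
    photo = MediaObject(
        "photo_with_sound",
        frozenset({MediaType.STILL_IMAGE, MediaType.AUDIO}),
    )
    print(edit_screens_for(photo))
```

Under these assumptions, a media object that combines a still image with an audio clip yields the image editing screen followed by the audio editing screen, while a text-only object yields only the text editing screen.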