A video processing system enhances the quality of an overlay image, such as a logo, text, game scores, or another area forming a region of interest (ROI) in a video stream. The system enhances the video quality of the ROI separately from the remaining video, particularly when screen size is reduced. The enhancement can be performed at decoding, using metadata provided with the video data, so that the ROI can be enhanced separately from the video. To improve legibility, the ROI enhancer can increase contrast, brightness, hue, saturation, and bit density of the ROI. The ROI enhancer can operate down to a pixel-by-pixel level. The ROI enhancer may also use stored reference picture templates, enhancing a current ROI based on a comparison with those templates. When the ROI includes text, a minimum reduction size for the ROI relative to the remaining video can be identified so that the ROI is not reduced below the limit of human perceptibility.
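
As an illustration only, the following minimal sketch shows one way the decode-side behavior described above could look: a pixel-by-pixel contrast and brightness adjustment confined to the ROI, and a scale floor that keeps ROI text above a minimum height when the video is reduced. The `RoiMetadata` fields, the function names `enhance_roi` and `roi_scale`, the single-channel 8-bit frame layout, and the linear contrast formula are all assumptions made for this example and are not taken from the described system.

```python
from dataclasses import dataclass


@dataclass
class RoiMetadata:
    """Hypothetical per-ROI metadata delivered alongside the video data."""
    x: int                       # left edge of the ROI, in pixels
    y: int                       # top edge of the ROI, in pixels
    width: int
    height: int
    contrast_gain: float         # assumed linear gain around mid-gray (128)
    brightness_offset: int       # assumed additive offset
    min_text_height_px: int = 0  # 0 means the ROI carries no text constraint


def enhance_roi(frame, roi):
    """Return a copy of an 8-bit single-channel frame with only the ROI enhanced.

    `frame` is a list of rows, each a list of integers in 0..255.
    Pixels outside the ROI are copied unchanged.
    """
    out = [row[:] for row in frame]
    for r in range(roi.y, roi.y + roi.height):
        for c in range(roi.x, roi.x + roi.width):
            value = (frame[r][c] - 128) * roi.contrast_gain + 128
            value += roi.brightness_offset
            out[r][c] = max(0, min(255, round(value)))  # clamp to 8 bits
    return out


def roi_scale(video_scale, roi):
    """Pick the ROI scale factor when the video is scaled by `video_scale`.

    The ROI follows the video unless that would shrink its text below
    `min_text_height_px`; in that case the ROI is held at the floor.
    The floor is capped at 1.0 so the ROI is never forced to upscale.
    """
    if roi.min_text_height_px <= 0:
        return video_scale
    floor = min(1.0, roi.min_text_height_px / roi.height)
    return max(video_scale, floor)
```

In this sketch the ROI is enhanced after the frame is decoded and before it is scaled, and the ROI's scale factor is chosen independently of the scale applied to the remaining video. A real decoder would more likely operate on luma and chroma planes and could also adjust hue, saturation, or bit depth.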