JPEG compression does not work very well on non-realistic images, such as cartoons or line drawings.
The algorithm achieves much of its compression by exploiting known limitations of the human eye, notably the fact that small color variations are not perceived as well as small variations in brightness.
1. Transform the image into a suitable color space. This is a no-op for grayscale images. For color images, RGB information is transformed into a luminance/chrominance color space (e.g., YCbCr or YUV). The luminance component is grayscale and the other two axes carry color information.
2. (Optional) Downsample each component by averaging together groups of pixels. The luminance component is left at full resolution, while the chroma components are often reduced 2:1 horizontally and either 2:1 or 1:1 (no change) vertically. In JPEG, these alternatives are usually called 2h2v and 2h1v sampling, but you may also see the terms "411" and "422" sampling. This step immediately reduces the data volume by one-half (2h2v) or one-third (2h1v). In numerical terms it is highly lossy, but for most images it has almost no impact on perceived quality, because of the eye's poorer resolution for chroma information. Note that downsampling is not applicable to grayscale data; this is one reason color images are more compressible than grayscale.
3. Group the pixel values for each component into 8×8 blocks. Transform each 8×8 block through a discrete cosine transform (DCT). The DCT is a relative of the Fourier transform and likewise gives a frequency map, with 8×8 components. Thus you now have numbers representing the average value in each block and successively higher-frequency changes within the block. The motivation for doing this is that you can now throw away high-frequency information without affecting low-frequency information. (The DCT itself is reversible except for roundoff error.)
4. In each block, divide each of the 64 frequency components by a separate "quantization coefficient" and round the results to integers. This is the fundamental information-losing step. The larger the quantization coefficients, the more data is discarded. Note that even the minimum possible quantization coefficient, 1, loses some information, because the exact DCT outputs are typically not integers. Higher frequencies are generally quantized less accurately (given larger coefficients) than lower ones, since they are less visible to the eye. Also, the luminance data is typically quantized more accurately than the chroma data, by using separate 64-element quantization tables.
6. Tack on appropriate headers, etc., and output the result. In a normal "interchange" JPEG file, all of the compression parameters are included in the headers so that the decompressor can reverse the process. These parameters include the quantization tables and the Huffman coding tables.
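As a rough illustration of steps 1 through 4 above, the following Python sketch converts a pixel to YCbCr, averages chroma samples 2:1 in both directions, applies an 8×8 DCT to one block, and quantizes the result. The conversion weights are the common JFIF/BT.601 ones, which the text does not specify. The flat quantization table, the subtraction of 128 before the DCT, and all function names are assumptions made for illustration and are not taken from any particular codec.

```python
import math

def rgb_to_ycbcr(r, g, b):
    """Step 1: RGB -> YCbCr for 8-bit samples (assumed JFIF/BT.601 weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def downsample_2h2v(plane):
    """Step 2: average each 2x2 group of samples (even dimensions assumed)."""
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, len(plane[0]), 2)]
            for y in range(0, len(plane), 2)]

def dct_8x8(block):
    """Step 3: direct (slow) 2-D DCT-II of one 8x8 block."""
    def scale(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[v][u] = 0.25 * scale(u) * scale(v) * total
    return out

def quantize(coeffs, table):
    """Step 4: divide by the table entry and round to an integer.

    Python's round() rounds exact ties to even; a real encoder may
    round ties differently.
    """
    return [[round(coeffs[v][u] / table[v][u]) for u in range(8)]
            for v in range(8)]

# Hypothetical flat table; real tables use larger entries at high frequencies.
flat_table = [[16] * 8 for _ in range(8)]
# A uniform gray block of value 100, shifted by 128 (an assumed convention).
block = [[100 - 128] * 8 for _ in range(8)]
quantized = quantize(dct_8x8(block), flat_table)
```

For a perfectly flat block such as this one, only the top-left (average value) entry of `quantized` should be nonzero, which matches the description of the DCT output as an average plus successively higher-frequency changes.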
This oversimplified case reveals two of the most difficult problems in motion compensation: 1) determining if an image is stationary; and 2) determining how and what portion of an image to extract for the portion of the image that moves.
However, when there is complex motion or new imagery, these error coding schemes may perform poorly, and the error signal may be harder to encode than the original signal.
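A toy calculation can make this point concrete. The sketch below treats two short one-dimensional sample lists as consecutive frames and uses the energy around the mean as a crude stand-in for coding cost. That proxy, the sample values, and the function names are all assumptions chosen for illustration; a real coder measures cost quite differently.

```python
def residual(prev_frame, cur_frame):
    """Error signal: current frame minus the prediction (the previous frame)."""
    return [c - p for p, c in zip(prev_frame, cur_frame)]

def ac_energy(signal):
    """Sum of squared deviations from the mean (assumed proxy for coding cost)."""
    mean = sum(signal) / len(signal)
    return sum((v - mean) ** 2 for v in signal)

prev_frame = [10, 10, 10, 10, 200, 200, 200, 200]
static_frame = list(prev_frame)                    # nothing moved
new_frame = [200, 200, 10, 10, 10, 10, 200, 200]   # unrelated new imagery

static_cost = ac_energy(residual(prev_frame, static_frame))
new_residual_cost = ac_energy(residual(prev_frame, new_frame))
new_direct_cost = ac_energy(new_frame)

print(static_cost, new_residual_cost, new_direct_cost)
```

With these values the static case leaves an all-zero error signal, while for the unrelated frame the error signal carries more energy than the frame itself, so coding the difference would be a poor choice there.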
The primary visual artifacts affecting current image compression systems are blocking effects and intermittent distortions near object boundaries, the latter often called mosquito noise.
Blocking effects are produced because adjacent blocks in an image are processed independently, and the resulting independent distortion from block to block causes a lack of continuity between neighboring blocks.
In addition, block-type contouring, which is a special case of the blocking effect, often arises where the intensity of an image changes slowly.
Block DCT applications are not effective at representing sharp edges.
Accordingly, there is considerable distortion at sharp edges: the reconstructed edges are not as sharp as in the original, and the adjacent regions are not as uniform as they should be.
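Both the block contouring and the edge distortion described above can be reproduced in one dimension. The sketch below is a minimal illustration under stated assumptions: it uses an orthonormal 8-point DCT, a single hypothetical quantization step of 32 for every coefficient, and simple removal of the four highest-frequency coefficients as a stand-in for coarse quantization. None of these choices comes from a specific codec.

```python
import math

N = 8  # block length

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct_1d(x):
    """Orthonormal DCT-II of one N-sample block."""
    return [_scale(k) * sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                            for n in range(N))
            for k in range(N)]

def idct_1d(coeffs):
    """Inverse of dct_1d."""
    return [sum(_scale(k) * coeffs[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                for k in range(N))
            for n in range(N)]

def code_block(x, step):
    """Quantize every coefficient with one step size, then reconstruct."""
    return idct_1d([round(c / step) * step for c in dct_1d(x)])

# Block contouring: a slow ramp spanning two independently coded blocks.
ramp = list(range(100, 116))
coded = code_block(ramp[:N], 32) + code_block(ramp[N:], 32)
boundary_jump = coded[N] - coded[N - 1]   # the original step here is 1

# Edge distortion: a sharp edge with its high frequencies discarded.
edge = [0, 0, 0, 0, 255, 255, 255, 255]
smoothed = idct_1d(dct_1d(edge)[:4] + [0.0] * 4)

print(boundary_jump)
print([round(v, 1) for v in smoothed])
```

In the ramp case each block collapses to a constant, so the gentle slope turns into a visible step at the block boundary. In the edge case the reconstructed transition is spread over several samples and the formerly flat regions on either side ripple.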
Thus, most reproduced images may be adversely affected by blocking effects and edge distortion.
Whether the set-top box is functioning as an encoder or a decoder, both analog TVs and DTVs are adversely affected by the image artifacts introduced by block DCT coding.