We are witnessing phenomenal increases in the use of images in many different applications. This is mainly due to: 1) technological advances impacting several image operations; 2) the availability of sophisticated software tools for image manipulation and management; and 3) the World Wide Web (WWW) providing easy access to a wide range of users. Typical applications using huge amounts of images are medical imaging, remote sensing, entertainment, digital libraries, distance learning and training, and multimedia.

Digital images require huge amounts of space for storage and large bandwidths for transmission. For example, a single 640 × 480 pixel color image using 24 bits/pixel requires close to one megabyte of space. Despite the technological advances in storage and transmission, the demands placed on storage capacities and on communication bandwidth exceed the availability. Image compression has proved to be a viable solution to this problem.

Digital images generally contain significant amounts of spatial and spectral redundancy. Spatial redundancy is due to the correlation between neighboring pixel values, and spectral redundancy is due to the correlation between different color planes. Image compression (coding) techniques reduce the number of bits required to represent an image by taking advantage of these redundancies. An inverse process called decompression (decoding) is applied to the compressed data to get the reconstructed image. The objective of compression is to reduce the number of bits as much as possible, while keeping the resolution and the visual quality of the reconstructed image as close to the original as possible.

This article gives an overview of the major image compression techniques. The decoding steps for most of the coding schemes are quite intuitive, usually just the reverse of the encoding steps. The reader is referred to the "Read more about it" list for the details. In this article, the terms compression and coding are used synonymously.

Basics of image representation

An image is essentially a 2-D signal processed by the human visual system. The signals representing images are usually in analog form. However, for processing, storage and transmission by computer applications, they are converted from analog to digital form. For display and presentation, they usually need to be converted back to analog form. In this article, the term "image" refers to "digital image."

A digital image is basically a 2-dimensional array of pixels (picture elements). An image whose pixels have one of only two intensity levels (black and white) is called a bi-tonal (or bi-level) image. Printed text on paper is a common example of this class of images. In a continuous-tone image, the pixels take a range of values. For example, in a typical gray-scale image, the pixels could have values in the range [0, 255], representing different gray levels.

In a typical color image used for display, each pixel has three color components (R, G, B) corresponding to the three primary colors: red, green and blue. A typical color image to be transmitted instead carries three components (Y, I, Q) per pixel, where Y is the luminance (brightness) component and I and Q are the chrominance (color) components. Each component of (R, G, B) or (Y, I, Q) requires 8 bits/pixel, so color images usually require 24 bits/pixel. The number of pixels in each dimension defines the image's resolution: more pixels mean more detail is visible in the image.

The taxonomy

Image compression techniques are broadly classified as either lossless or lossy, depending, respectively, on whether or not an exact replica of the original image can be reconstructed from the compressed image. Lossless compression is also referred to as entropy coding. In addition to exploiting spatial and spectral redundancies, lossy techniques take advantage of the way people see in order to discard data that are perceptually insignificant.

Lossy schemes provide much higher compression ratios than lossless schemes. Lossless compression is used only for a few applications with stringent requirements, such as medical imaging. Lossy schemes are widely used, since the quality of the reconstructed images is adequate for most applications. A taxonomy of image compression techniques is given in Fig. 1.

Practical compression systems and standards use hybrid coding, a combination of several basic lossy coding techniques, such as: a) transform coding and predictive coding, b) subband coding and transform coding, and c) predictive coding and vector quantization. In addition, the output of the lossy coding scheme is further compressed using a lossless coding scheme such as Huffman or arithmetic coding.


[Fig. 1: The taxonomy of image compression. Lossless (entropy) coding techniques: repetitive sequence encoding (RLE), statistical encoding (Huffman, arithmetic, LZW), lossless predictive coding (DPCM) and bit-plane encoding. Lossy (source) coding techniques: block truncation coding, lossy predictive coding (DPCM, ADPCM, delta modulation), transform coding (DFT, DCT, Haar, Hadamard), subband coding (subbands, wavelets), fractal coding and vector quantization.]

Lossless image compression

In lossless compression techniques, the original image can be perfectly recovered from the compressed (encoded) image. These are also called noiseless techniques (since they do not add noise to the signal) or entropy coding (since they use statistical/decomposition techniques to eliminate or minimize redundancy).

Run length encoding. This technique replaces sequences of identical symbols (pixels), called runs, by shorter symbols. It is usually applied as a post-processing step after a lossy technique has produced a set of data values that can be suitably reordered into long runs of similar values.
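To make the idea concrete, here is a minimal run-length coder in Python (an illustrative sketch, not code from the article; the (value, run length) pair representation and the 1-D input are simplifications chosen for brevity):

```python
def rle_encode(pixels):
    """Encode a sequence of pixel values as (value, run_length) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([p, 1])       # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out

# A row with long runs compresses well: 8 samples become 3 pairs.
row = [255, 255, 255, 255, 0, 0, 255, 255]
assert rle_decode(rle_encode(row)) == row
```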
Huffman coding. This is a general technique for coding symbols based on their statistical occurrence frequencies (probabilities). The pixels in the image are treated as symbols. The symbols that occur more frequently are assigned a smaller number of bits, while the symbols that occur less frequently are assigned a relatively larger number of bits. Huffman code is a prefix code, meaning that the (binary) code of any symbol is never the prefix of the code of any other symbol. Most image coding standards use lossy techniques in the earlier stages of compression and use Huffman coding as the final step.
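The following sketch builds such a code table with a priority queue, always merging the two least frequent subtrees (a standard construction; the dictionary-based tree representation is an implementation choice of this example, not the article's):

```python
import heapq
from collections import Counter

def huffman_code(pixels):
    """Build a prefix code: frequent symbols get shorter bit strings."""
    freq = Counter(pixels)
    if len(freq) == 1:                        # degenerate single-symbol case
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tiebreaker, {symbol: code-so-far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)       # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, count, merged))
        count += 1
    return heap[0][2]

codes = huffman_code([0, 0, 0, 0, 0, 255, 255, 128])
# Symbol 0 dominates, so its code is a single bit; no code prefixes another.
```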
Arithmetic coding. Like Huffman coding, this is a statistical technique. However, instead of coding each symbol separately, the whole data sequence is coded with a single code. Thus, the correlation between neighboring pixels is exploited. Arithmetic coding is based on the following principle: given that a) the symbol alphabet is finite, b) all possible symbol sequences of a given length are finite, c) all possible sequences are countably infinite, and d) the number of real numbers in the interval [0, 1] is uncountably infinite, we can assign a unique subinterval of [0, 1] to any given input (sequence of symbols). This subinterval is the code (tag) for the input. The cumulative density function (CDF) of the symbol probabilities is used to partition the interval (usually [0, 1]) into subintervals and map the sequence of symbols to a unique subinterval. This scheme is well suited to a small set of symbols with highly skewed probabilities of occurrence. Arithmetic coding is used as the final step in several image coding applications and standards.

Lempel-Ziv coding. This is based on storing frequently occurring sequences of symbols (pixels) in a dictionary (table). Such frequently occurring sequences in the original data (image) are represented by just their indices into the dictionary. This has been used in the TIFF (Tagged Image File Format) and GIF (Graphics Interchange Format) file formats. The scheme has also been used for compressing half-tone images. (Half-tone images are binary images that provide the visual effect of continuous-tone gray images by varying the density of black dots.)
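A minimal LZW-style coder, one member of the Lempel-Ziv family, is sketched below (the byte-oriented alphabet and unbounded dictionary are simplifications of this example, not details from the article):

```python
def lzw_encode(data):
    """Dictionary coder: emit indices of the longest sequences seen so far."""
    table = {bytes([b]): b for b in range(256)}    # single-symbol entries
    current = b""
    out = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                    # keep growing the match
        else:
            out.append(table[current])             # emit longest match's index
            table[candidate] = len(table)          # learn the new sequence
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out

# Repetition pays off: 10 input bytes become 5 dictionary indices.
indices = lzw_encode(b"ABABABABAB")                # [65, 66, 256, 258, 66]
```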
Predictive coding. This is based on the assumption that the pixels in images conform to an autoregressive model, where each pixel is a linear combination of its immediate neighbors. The lossless differential pulse code modulation (DPCM) technique is the most common type of lossless predictive coding. In the lossless DPCM scheme, each pixel value of the original image (except at the boundaries) is first predicted based on its neighbors to get a predicted image. Then the difference between the actual and the predicted pixel values is computed to get the differential or residual image. The residual image has a much smaller dynamic range of pixel values, so it can be efficiently encoded using Huffman coding.
countably infinite; d) the number of real Bit-plane encoding. In this scheme, nal image (lossless), when it is applied to
numbers in the interval [0, 1] is uncount- the binary representations of the values the new representation.
ably infinite, we can assign a unique of the pixels in the image are considered. However, to achieve compression the
subinterval for any given input The corresponding bits in each of the “redundant” information the human eye
(sequence of symbols). This is the code positions in the binary representation considers perceptually insignificant is
(tag) for the input. form a binary image of the same dimen- discarded. This is done using quantiza-
The cumulative density function sions as the original image. This is called tion. The new representation has desir-
(CDF) of the symbol probabilities is used a bit plane. Each of the bit planes can able properties. The quantized data has
to partition the interval (usually [0, 1]) then be efficiently coded using a lossless much less variance than the original.
into subintervals and map the sequence technique. Entropy coding is then applied to achieve
of symbols to a unique subinterval. This The underlying principle is that (in further compression.
scheme is well suited to small set of most images) the neighboring pixels are The outline of lossy compression
symbols with highly skewed probabili- correlated. That means the values of the techniques is shown in Fig. 2. Please
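A short sketch of the decomposition (the Gray-code step follows the principle above; the helper name is this example's, not the article's):

```python
import numpy as np

def bit_planes(img, gray=True):
    """Split an 8-bit image into eight binary images (bit planes).

    Mapping to a Gray code first ensures that pixel values differing by 1
    differ in only one bit, so each plane has longer uniform runs.
    """
    img = img.astype(np.uint8)
    if gray:
        img = img ^ (img >> 1)                 # binary-reflected Gray code
    # Plane k holds bit k of every pixel (0 = least significant bit).
    return [(img >> k) & 1 for k in range(8)]

img = np.array([[127, 128], [128, 129]], dtype=np.uint8)
planes = bit_planes(img)
# In plain binary, 127 -> 128 flips all eight bits; in Gray code, just one.
```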
Lossy image compression

All known lossy image compression techniques take advantage of how we see things. The human visual system is more sensitive to the lower frequencies than to the higher frequencies in the visual spectrum. Thus, we derive the (spatial) frequencies of an image and allocate more bits to those frequency components that have more visual impact. We then allocate fewer bits to, or even discard, the insignificant components. The resulting image is represented with fewer bits yet reconstructed close to the original.

To achieve this goal, one of the following operations is generally performed: 1) a predicted image is formed, with pixels predicted from the values of neighboring pixels of the original image, and a differential (residual) image is derived as the difference between the original and the predicted image; 2) a transformed image is derived by applying a transform to the original image, which essentially maps the pixel values to the frequency domain; or 3) the original image is decomposed into different components (in the frequency domain).

In 1), the dynamic range of the signal values is reduced; in 2) and 3), a representation that can be coded more efficiently is derived. In each case, there exists an inverse operation, which yields the original image exactly (lossless) when applied to the new representation.

However, to achieve compression, the "redundant" information that the human eye considers perceptually insignificant is discarded. This is done using quantization. The new representation has desirable properties: the quantized data has much less variance than the original. Entropy coding is then applied to achieve further compression.

[Fig. 2: Outline of lossy image compression. Original data → prediction/transformation/decomposition → quantization → entropy (lossless) coding → compressed data.]

The outline of lossy compression techniques is shown in Fig. 2. Note that the prediction-transformation-decomposition process is completely reversible; the quantization process (see the "Scalar quantization" box) results in loss of information.

The entropy coding applied after the quantization step, however, is lossless. Decoding is a similar but reverse process: a) entropy decoding is applied to the compressed data to get the quantized data; b) dequantization is applied to it; and c) the inverse transformation is applied to get the reconstructed image (an approximation of the original image).

The major performance considerations of a lossy compression scheme are: a) the compression ratio (CR), b) the signal-to-noise ratio (SNR) of the reconstructed image with respect to the original, and c) the speed of encoding and decoding. The compression ratio is given by

CR = (size of uncompressed data) / (size of compressed data).

The peak signal-to-noise ratio (PSNR) is given by

PSNR = 20 \log_{10}(\text{peak data value} / \text{RMSE}),

where RMSE is the root mean square error:

RMSE = \sqrt{\frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left( I_{i,j} - \hat{I}_{i,j} \right)^2},

where N × M is the image size, and I_{i,j} and \hat{I}_{i,j} are the values of the pixels at (i, j) in the original and the reconstructed (compressed-then-decompressed) images, respectively.
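These formulas transcribe directly into code; the helper below is a sketch (the function name and the test values are this example's, not the article's):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB: 20 * log10(peak / RMSE) over an N x M image."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    rmse = np.sqrt(np.mean(diff ** 2))
    if rmse == 0:
        return float("inf")                    # identical images
    return 20.0 * np.log10(peak / rmse)

a = np.full((4, 4), 100, dtype=np.uint8)
b = a.copy()
b[0, 0] = 110                                  # one pixel off by 10
print(round(psnr(a, b), 1))                    # about 40.2 dB
```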
Predictive coding. In most images, there is a high correlation among neighboring pixels, and predictive coding exploits this fact. Differential pulse code modulation (DPCM) is a popular predictive coding technique. Lossy DPCM is very similar to the lossless version. The major difference is that in lossy DPCM, the pixels are predicted based on the "reconstructed values" of certain neighboring pixels. The difference between the predicted values and the actual values of the pixels forms the differential (residual) image, which is much less correlated than the original image. The differential image is then quantized and encoded.

The schematic for a lossy DPCM coder is shown in Fig. 3, along with a third-order predictor. (In a third-order predictor, three previous values are used to predict each pixel.) Note that the decoder has access only to the reconstructed values of previous pixels while forming predictions. Since the quantization of the differential image introduces error, the reconstructed values generally differ from the original values. To ensure identical predictions at both the encoder and decoder, the encoder also uses the "reconstructed pixel values" in its prediction. This is done by placing the quantizer within the prediction loop. (In essence, the decoder is built into the encoder.)

[Fig. 3: Lossy DPCM coding scheme. The encoder quantizes the prediction error (actual value minus predicted value), entropy codes the quantized error, and reconstructs each pixel for use by the predictor. With a third-order predictor, the predicted value of a pixel is a·A + b·B + c·C for three neighboring reconstructed pixels A, B and C.]
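The closed prediction loop can be seen in a few lines. The sketch below works on one scan line with a first-order predictor (the article's Fig. 3 shows a third-order predictor; the simpler predictor and uniform quantizer here are assumptions made for clarity):

```python
def lossy_dpcm_encode(row, step=8):
    """Lossy DPCM with the quantizer inside the prediction loop.

    Predictions use the *reconstructed* previous pixel, exactly as the
    decoder will, so encoder and decoder stay in lockstep.
    """
    quantized, prev_recon = [], 0.0
    for x in row:
        d = float(x) - prev_recon              # prediction error
        q = int(round(d / step))               # uniform quantizer
        quantized.append(q)
        prev_recon += q * step                 # reconstructed value
    return quantized

def lossy_dpcm_decode(quantized, step=8):
    out, prev = [], 0.0
    for q in quantized:
        prev += q * step
        out.append(prev)
    return out

row = [100, 104, 109, 115, 122, 130]
recon = lossy_dpcm_decode(lossy_dpcm_encode(row))
# Every reconstructed value stays within step/2 = 4 of the original.
```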
The design of a DPCM coder involves optimizing the predictor and the quantizer. The inclusion of the quantizer in the prediction loop results in a complex dependency between the prediction error and the quantization error. However, the predictor and quantizer are usually optimized separately, since a joint optimization is complex. [Under the mean-squared error (MSE) optimization criterion, independent optimizations of the predictor and quantizer are good approximations to the jointly optimized solution.]

Block truncation coding. In this scheme, the image is divided into non-overlapping blocks of pixels. For each block, threshold and reconstruction values are determined. The threshold is usually the mean of the pixel values in the block. A bitmap of the block is then derived by replacing all pixels whose values are greater than or equal to the threshold by a 1, and the remaining pixels by a 0. Then, for each group (the 1s and the 0s) in the bitmap, the reconstruction value is determined: it is the average of the values of the corresponding pixels in the original block. The broad outline of block truncation coding of images is shown in Fig. 4.

[Fig. 4: Outline of block truncation coding of images. Divide the original image into blocks → determine the threshold and reconstruction levels for each block → quantize each block into a bitmap → entropy code the bitmap and reconstruction levels.]
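For a single block, the whole scheme fits in a few lines (an illustrative sketch; the guard for uniform blocks is a detail this example adds):

```python
import numpy as np

def btc_block(block):
    """Encode one block as (bitmap, low, high).

    Pixels at or above the block mean map to 1, the rest to 0; each group
    is reconstructed as the mean of the original pixels it covers.
    """
    threshold = block.mean()
    bitmap = block >= threshold
    low = block[~bitmap].mean() if (~bitmap).any() else threshold
    high = block[bitmap].mean() if bitmap.any() else threshold
    return bitmap, float(low), float(high)

def btc_reconstruct(bitmap, low, high):
    return np.where(bitmap, high, low)

block = np.array([[12, 200], [10, 190]], dtype=np.uint8)
bitmap, low, high = btc_block(block)
print(btc_reconstruct(bitmap, low, high))      # [[ 11. 195.] [ 11. 195.]]
```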
ues of (previous) pixels while forming Transform coding. In this coding image vector, the closest matching vector
predictions of pixels. Since the quantiza- scheme, transforms such as DFT in the dictionary is determined and its
tion of the differential image introduces (Discrete Fourier Transform) and DCT index in the dictionary is used as the
error, the reconstructed values generally (Discrete Cosine Transform) are used to encoding of the original image vector.
differ from the original values. To ensure change the pixels in the original image Thus, each image is represented by a
Subband coding. In this scheme, the image is analyzed to produce components containing frequencies in well-defined bands, the subbands. Subsequently, quantization and coding are applied to each of the bands. The advantage of this scheme is that the quantization and coding best suited to each subband can be designed separately. The broad outline of subband coding of images is shown in Fig. 6.

[Fig. 6: Outline of subband coding of images. The original image passes through an analysis filter bank and sub-sampling to produce image components at different frequency subbands, which are then quantized and entropy coded.]

Vector quantization. The basic idea in this technique is to develop a dictionary of fixed-size vectors, called code vectors. A vector is usually a block of pixel values. A given image is then partitioned into non-overlapping blocks (vectors) called image vectors. For each image vector, the closest matching vector in the dictionary is determined, and its index in the dictionary is used as the encoding of the original image vector. Thus, each image is represented by a sequence of indices that can be further entropy coded. The outline of the scheme is shown in Fig. 7.

[Fig. 7: Outline of vector quantization of images. Divide the original image into image vectors → look up the code book for the closest match to each image vector → entropy code the indices of the closest matches.]
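The encoding step reduces to a nearest-neighbor search (a sketch with a toy hand-made code book; designing a good code book, e.g. by clustering training vectors, is a separate problem the article does not detail):

```python
import numpy as np

def vq_encode(vectors, codebook):
    """For each image vector, return the index of the closest code vector."""
    # Euclidean distance from every image vector to every code vector.
    d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    return codebook[indices]                   # table lookup only

# Toy code book of 2x2 blocks flattened to length-4 vectors.
codebook = np.array([[0, 0, 0, 0],
                     [255, 255, 255, 255],
                     [0, 255, 0, 255]], dtype=np.float64)
vectors = np.array([[10, 5, 0, 8],            # nearly black block
                    [250, 251, 249, 252]],    # nearly white block
                   dtype=np.float64)
print(vq_encode(vectors, codebook))            # [0 1]
```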


Fractal coding. The essential idea here is to decompose the image into segments by using standard image processing techniques such as color separation, edge detection, and spectrum and texture analysis. Then each segment is looked up in a library of fractals. The library does not contain literal images; it contains compact sets of numbers called iterated function system (IFS) codes. Using a systematic procedure, a set of codes is determined for a given image such that, when the IFS codes are applied to a suitable set of image blocks, they yield an image that is a very close approximation of the original. This scheme is highly effective for compressing images that have good regularity and self-similarity. The broad outline of fractal coding of images is shown in Fig. 8.

[Fig. 8: Outline of fractal coding of images. The original image is segmented using image processing (color separation, edge detection, spectrum analysis, texture analysis); each segment is looked up in a library of fractals (IFS codes) for the closest match. An IFS is a set of contractive affine transformations, i.e., combinations of rotations, scalings and translations.]

Scalar quantization (sidebar)

Quantization is a process (function) that maps a very large (possibly infinite) set of values to a much smaller (finite) set of values. In scalar quantization, the values that are mapped are scalars (numbers). In the context of image coding and decoding, the range of pixel values, say N, is divided into L non-overlapping intervals, also known as quantization levels. Each interval i is defined by its decision boundaries (d_i, d_{i+1}) and has an associated reconstruction level r_i.

During encoding, the quantizer maps a given pixel value x to a quantization level l: l = Q(x), such that d_l ≤ x < d_{l+1}. During decoding, the (de)quantizer maps a given level l to a reconstruction pixel value x̂ = r_l = Q^{-1}(l). This introduces noise or error in the image (signal), called quantization error; it is measured as the root mean square value of x − x̂.

The essential difference among the various types of quantizers is in how the forward and inverse mappings are defined, which is dictated by the number of quantization levels, the decision boundaries and the reconstruction values. The basic design objective of a quantizer is to minimize the quantization error while being computationally simple. The quantizer has a large impact on both the compression ratio and the image quality of a lossy scheme.

There are two broad types of scalar quantizers: uniform and non-uniform. In a uniform quantizer of k levels, the range of values is divided into k equally spaced intervals, and the reconstruction values are the mid-points of the intervals. This is simple to implement, but it does not attempt to minimize the quantization error. A quantizer that takes into account the probability distribution of the pixel values performs better. Such a quantizer is a non-uniform quantizer, where the intervals are non-uniform. The most common non-uniform quantizer is the Lloyd-Max quantizer, whose decision boundaries and reconstruction levels are determined from the probability model of the image pixels such that the quantization error is minimized.—SRS
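The uniform quantizer described in the box looks like this in code (a sketch; the level count and value range are parameters of this example):

```python
import numpy as np

def uniform_quantize(x, levels=8, lo=0.0, hi=256.0):
    """Map x to one of `levels` equally spaced intervals over [lo, hi)."""
    step = (hi - lo) / levels
    return np.clip(((x - lo) // step).astype(int), 0, levels - 1)

def uniform_dequantize(l, levels=8, lo=0.0, hi=256.0):
    """Reconstruct each level as the mid-point of its interval."""
    step = (hi - lo) / levels
    return lo + (l + 0.5) * step

x = np.array([0.0, 31.0, 100.0, 255.0])
l = uniform_quantize(x)                        # [0 0 3 7]
print(uniform_dequantize(l))                   # [ 16.  16. 112. 240.]
# The quantization error per sample is bounded by step/2 = 16.
```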
Image compression standards

Image compression standards have been developed to facilitate the interoperability of compression and decompression schemes across hardware platforms, operating systems and applications. Most standards are hybrid systems, making use of a few of the basic techniques already mentioned. The major image compression standards are Group 3, Group 4 and JBIG (Joint Bi-level Image Group) for bi-tonal images, and JPEG (Joint Photographic Experts Group) for continuous-tone images. The most common application that uses compression of bi-tonal images is digital facsimile (FAX).

Group 3 Fax

The image is scanned left-to-right and top-to-bottom, and the runs of each color (black and white) are determined. A run refers to a sequence of consecutive pixels of the same value. The first run on each line is assumed to be white, and each line is considered to be made up of 1728 pixels. Thus each line is reduced to alternating runs of white and black pixels, and the runs are then encoded. The end of each line is marked with an EOL (end-of-line) code; page breaks are denoted by two successive EOLs.

Two types of codes are used for run lengths: terminating codes and make-up codes. Terminating codes are used for runs with lengths less than 64. For longer runs, a make-up code, which represents a run length that is a multiple of 64 (64, 128, 192, ...), is followed by a terminating code. Tables specifying the terminating and make-up codes for white and black runs are provided by the standard.

Group 4 Fax

The Group 4 (G4) fax standard is a superset of the Group 3 standard and is backwards compatible with it. The G4 standard is said to use a 2-dimensional coding scheme, because it also exploits spatial redundancy in the vertical direction by using the previous line as a reference while coding the current line. Most runs on a line lie nearly directly below a run of the same color in the previous line, so only the differences in the run boundaries between successive lines are coded. The cases where a line has fewer or more runs than the reference line are suitably handled. The Group 4 standard generally provides more efficient compression than Group 3.

JBIG

The Joint Bi-level Image Group (JBIG) standard was developed by the International Standards Organization (ISO) for the lossless compression of bi-level images. Typically, these are printed pages of text whose corresponding images contain either black or white pixels.

JBIG uses a combination of bit-plane encoding and arithmetic coding. The adaptivity of the arithmetic coder to the statistics of the image accounts for JBIG's improved performance. JBIG also incorporates a progressive transmission mode, which can be used for the compression of gray-scale and color images: each bit plane of the image is treated as a bi-level image. This provides lossless compression and enables progressive buildup.

JPEG

The Joint Photographic Experts Group (JPEG) standard was developed for compressing continuous-tone still images. JPEG has been widely accepted for still image compression throughout the industry. It can be used on both gray-scale and color images. JPEG consists of four modes: lossless, sequential, progressive and hierarchical. The first is a lossless mode; the other three are lossy. The sequential mode, also called baseline JPEG, is the most commonly used scheme.

The lossless JPEG mode uses linear predictive schemes and provides seven different predictors. Pixel values (except those at the boundaries) are predicted based on neighboring pixels. The residual, which is the difference between the original and the predicted image, is encoded using entropy (lossless) coding such as Huffman or arithmetic coding.

In the baseline JPEG scheme, the image is divided into non-overlapping blocks of 8 × 8 pixels. The DCT is applied to each block to obtain the transform coefficients. The coefficients are then quantized using a table, specified by the standard, that contains the quantizer step sizes. The quantized coefficients are then ordered using a zigzag ordering, and the ordered values are encoded using Huffman coding tables specified by the standard.
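The zigzag ordering is easy to generate (a sketch; the standard defines the same order as a fixed table):

```python
import numpy as np

def zigzag_indices(n=8):
    """Visit an n x n block along anti-diagonals, alternating direction.

    This orders quantized DCT coefficients roughly from low to high
    frequency, grouping the trailing zeros for run-length/Huffman coding.
    """
    order = []
    for s in range(2 * n - 1):                 # s = i + j, one anti-diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()                     # even diagonals run upward
        order.extend(diag)
    return order

block = np.arange(64).reshape(8, 8)
scan = [block[i, j] for i, j in zigzag_indices()]
print(scan[:6])                                # [0, 1, 8, 16, 9, 2]
```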
Progressive JPEG compression is similar to the sequential (baseline) JPEG scheme in the formation of DCT coefficients and quantization. The key difference is that each coefficient (image component) is coded in multiple scans instead of a single scan. Each successive scan refines the image until the quality determined by the quantization tables is reached.

Hierarchical JPEG compression offers a progressive representation of a decoded image similar to progressive JPEG, but also provides encoded images at multiple resolutions. Hierarchical JPEG creates a set of compressed images, beginning with small images and continuing with images of increasing resolution. This process is also called pyramidal coding. The hierarchical JPEG mode requires significantly more storage space, but the encoded image is immediately available at different resolutions.

Summary

The generation and use of digital images is expected to continue at an ever faster pace in the coming years. The huge size requirements of images, coupled with their explosive increase in number, are straining storage capacities and transmission bandwidths. Compression is a viable way to overcome these bottlenecks.

All the techniques described here are considered "first-generation" techniques. The second generation of compression techniques, already underway, uses a model-based approach. The images are analyzed using image processing and pattern recognition techniques to derive high-level objects. The images are then described with well-defined image models, which are expressible using much less information than the original data. The challenge is in devising good models that achieve good compression without loss of fidelity.

Read more about it

• M. F. Barnsley and L. P. Hurd, Fractal Image Compression, A. K. Peters, 1993.
• V. Bhaskaran and K. Konstantinides, Image and Video Compression Standards, Kluwer Academic, 1995.
• R. J. Clarke, Transform Coding of Images, Academic Press, London, 1985.
• A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, Kluwer Academic, 1992.
• H.-M. Hang and J. W. Woods (Eds.), Handbook of Visual Communications, Academic Press, 1995.
• A. K. Jain, "Image Data Compression: A Review," Proc. IEEE, 69(3), 1981, pp. 349-389.
• W. Kou, Digital Image Compression: Algorithms and Standards, Kluwer Academic, 1995.
• A. N. Netravali and B. G. Haskell, Digital Pictures: Representation, Compression, and Standards (2nd edition), Plenum Press, 1995.
• M. Rabbani and P. W. Jones, Digital Image Compression Techniques, SPIE, Vol. TT7, 1991.
• K. Sayood, Introduction to Data Compression (2nd edition), Morgan Kaufmann, 2000.

About the author

R. Subramanya obtained his Ph.D. in Computer Science from George Washington University, where he received the Richard Merwin memorial award from the EECS department in 1996. He received the Grant-in-Aid of Research award from Sigma Xi for his research in audio data indexing in 1997. He is currently an Assistant Professor of Computer Science at the University of Missouri-Rolla.
