Transform Coding

Ch8 Image Compression 3
Overview
Block Transform Coding
Discrete Fourier Transform
Discrete Cosine Transform
Walsh-Hadamard Transform
Transform Selection
Coefficient Quantization
Coefficient Coding
Conclusion
Overview
The previous image compression techniques
have focused on removing spatial and/or
coding redundancy
In this section, we will look at block transform
coding, the technology behind the JPEG and
MPEG compression standards
Again we will remove spatial/coding redundancy
We will also use quantization for data reduction
Block Transform Coding
The basic idea of block transform coding is to
transform the image from the spatial domain to
another domain where coefficients have lower
entropy so they can be coded more efficiently
This approach is used in both JPEG and
MPEG because it yields much higher
compression ratios than the previous methods
We subdivide the image into nxn blocks prior
to calculating the transform to improve speed
and also to localize any errors that may occur
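The subdivision step can be sketched in a few lines. This is a minimal illustration, assuming the image is a list of rows whose height and width are multiples of n; the name split_into_blocks is hypothetical and not from the slides.

```python
def split_into_blocks(image, n):
    """Split a 2-D list `image` into n x n blocks in row-major order.

    Assumption: the image height and width are multiples of n.
    """
    rows, cols = len(image), len(image[0])
    blocks = []
    for top in range(0, rows, n):
        for left in range(0, cols, n):
            # Take n consecutive rows and slice n columns out of each.
            block = [row[left:left + n] for row in image[top:top + n]]
            blocks.append(block)
    return blocks


# A 4x4 toy image split into four 2x2 blocks.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
blocks = split_into_blocks(image, 2)
```

Each block can then be transformed independently, which is what keeps the cost per block small and confines any error to the block where it occurs.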
[Figure: original image; 16x16 blocks; 16x16 transform image; decompressed image; error image]
The block transform creates an nxn block of values
T(u,v) from an nxn block from the image f(x,y)
$$T(u, v) = \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} f(x, y)\, r(x, y, u, v)$$
$$f(x, y) = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u, v)\, s(x, y, u, v)$$
r(x, y, u,v) = forward transform kernel
s(x, y, u,v) = inverse transform kernel
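The two double sums translate directly into code. The sketch below takes the kernels as ordinary functions. The example kernel (1/2)(-1)^(xu+yv) for n=2 is an assumption chosen only because it is its own inverse and gives exact numbers; the function names are hypothetical.

```python
def forward_transform(f, r):
    """T(u,v) = sum over x,y of f(x,y) * r(x,y,u,v)."""
    n = len(f)
    return [[sum(f[x][y] * r(x, y, u, v)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def inverse_transform(T, s):
    """f(x,y) = sum over u,v of T(u,v) * s(x,y,u,v)."""
    n = len(T)
    return [[sum(T[u][v] * s(x, y, u, v)
                 for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]


def toy_kernel(x, y, u, v):
    # Illustrative 2x2 kernel (an assumption): orthonormal and self-inverse.
    return 0.5 * (-1) ** (x * u + y * v)


f = [[1, 2], [3, 4]]
T = forward_transform(f, toy_kernel)        # [[5.0, -1.0], [-2.0, 0.0]]
restored = inverse_transform(T, toy_kernel)  # [[1.0, 2.0], [3.0, 4.0]]
```

Evaluated this way, each block costs on the order of n^4 multiplications, which is one reason separable kernels with fast implementations are preferred.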
What transform should we use?
We want orthogonal basis functions
We want uncorrelated coefficients
We want separable/symmetric transform
We want simple/fast implementation
Several good options to consider: the discrete Fourier, discrete cosine, and Walsh-Hadamard transforms
Discrete Fourier Transform
Yields a complex-valued transform T(u,v) with
even symmetry in the real component and odd
symmetry in the imaginary component
$$r(x, y, u, v) = e^{-i 2\pi (ux + vy)/n}$$
$$s(x, y, u, v) = \frac{1}{n^2}\, e^{i 2\pi (ux + vy)/n}$$
[Figure: Discrete Fourier Transform basis functions with n=4]
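The Fourier kernels above can be checked numerically. This sketch evaluates the double sums directly with Python's cmath module (not a fast FFT); the function names and the 4x4 test block are assumptions for illustration.

```python
import cmath


def dft_forward(f):
    """Forward transform with r = exp(-i 2 pi (ux + vy) / n)."""
    n = len(f)
    return [[sum(f[x][y] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def dft_inverse(T):
    """Inverse transform with s = exp(+i 2 pi (ux + vy) / n) / n^2."""
    n = len(T)
    return [[sum(T[u][v] * cmath.exp(2j * cmath.pi * (u * x + v * y) / n)
                 for u in range(n) for v in range(n)) / n ** 2
             for y in range(n)] for x in range(n)]


block = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 8, 7, 6],
         [5, 4, 3, 2]]
T = dft_forward(block)
restored = dft_inverse(T)
```

For a real-valued block, T(u,v) equals the complex conjugate of T(-u mod n, -v mod n), which is the even/odd symmetry of the real and imaginary components. The complex output is also the main drawback of this transform for compression.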
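Discrete Cosine Transform
$$r(x, y, u, v) = s(x, y, u, v) = \alpha(u)\,\alpha(v) \cos\left[\frac{(2x+1)\pi u}{2n}\right] \cos\left[\frac{(2y+1)\pi v}{2n}\right]$$
where
$$\alpha(u) = \begin{cases} \sqrt{1/n} & u = 0 \\ \sqrt{2/n} & u > 0 \end{cases}$$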
Yields a real-valued transform T(u,v)
[Figure: Discrete Cosine Transform basis functions with n=4]
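A short sketch of the cosine kernel makes two of its properties easy to verify: the basis functions are orthonormal, and a constant block puts all of its energy into T(0,0). The function names and the test values are assumptions for illustration.

```python
import math


def alpha(u, n):
    """Normalisation factor: sqrt(1/n) for u = 0, sqrt(2/n) otherwise."""
    return math.sqrt((1 if u == 0 else 2) / n)


def dct_kernel(x, y, u, v, n):
    """Cosine kernel; the same function serves as r and s."""
    return (alpha(u, n) * alpha(v, n)
            * math.cos((2 * x + 1) * math.pi * u / (2 * n))
            * math.cos((2 * y + 1) * math.pi * v / (2 * n)))


def dct_block(f):
    """Forward transform of one n x n block by direct summation."""
    n = len(f)
    return [[sum(f[x][y] * dct_kernel(x, y, u, v, n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def inner(u1, v1, u2, v2, n):
    """Inner product of two basis functions (1 if equal, else 0)."""
    return sum(dct_kernel(x, y, u1, v1, n) * dct_kernel(x, y, u2, v2, n)
               for x in range(n) for y in range(n))


flat = [[10] * 4 for _ in range(4)]
T = dct_block(flat)   # T[0][0] is about 40, all other entries about 0
```

Because the kernel is real and orthonormal, the forward and inverse kernels are identical and no complex arithmetic is needed.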
Walsh-Hadamard Transform
$$r(x, y, u, v) = s(x, y, u, v) = H_m$$
where
$$H_m = 2^{-1/2} \begin{bmatrix} H_{m-1} & H_{m-1} \\ H_{m-1} & -H_{m-1} \end{bmatrix} \quad \text{and} \quad H_0 = 1$$
Yields a real-valued transform T(u,v)
$$H_0 = 1 \qquad H_1 = 2^{-1/2} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad H_2 = 2^{-1} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$
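$$H_3 = 2^{-3/2} \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{bmatrix}$$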
[Figure: Walsh-Hadamard basis functions with n=4]
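The recursive definition of H_m can be followed literally in code. This is a minimal sketch, assuming the matrix is stored as a list of rows; the function names are hypothetical.

```python
def hadamard(m):
    """Build the 2^m x 2^m matrix H_m from the recursion, with H_0 = [[1]]."""
    if m == 0:
        return [[1.0]]
    h = hadamard(m - 1)
    scale = 2 ** -0.5
    # Top half is [H, H]; bottom half is [H, -H]; both scaled by 2^(-1/2).
    top = [[scale * a for a in row + row] for row in h]
    bottom = [[scale * a for a in row + [-b for b in row]] for row in h]
    return top + bottom


def signs(row):
    """Reduce a row to its +1/-1 pattern for easy comparison."""
    return [1 if a > 0 else -1 for a in row]


H3 = hadamard(3)
last_row = signs(H3[7])   # [1, -1, -1, 1, -1, 1, 1, -1]
```

Apart from the scale factor, every entry is +1 or -1, so the transform needs only additions and subtractions. That is the "simple/fast implementation" criterion from the earlier list.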
Transform Selection
How can we choose between these transforms?
Look at how much information is associated
with each of the T(u,v) coefficients
Calculate T(u,v) for each nxn block in the image
Set the smallest P% of T(u,v) values to zero
Use the inverse transform to get f(x,y)
Calculate rms error of resulting image
Results with P=50%:
Fourier: rms = 2.32
Walsh-Hadamard: rms = 1.78
Cosine: rms = 1.13
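The selection experiment (transform each block, zero the smallest P% of coefficients, invert, measure the rms error) can be sketched for a single block. The example below uses the cosine kernel. The 4x4 pixel values and all function names are assumptions for illustration, and the numbers it produces are not the results quoted above.

```python
import math


def kernel(x, y, u, v, n):
    """Cosine kernel, used for both the forward and inverse transform."""
    a = lambda k: math.sqrt((1 if k == 0 else 2) / n)
    return (a(u) * a(v)
            * math.cos((2 * x + 1) * math.pi * u / (2 * n))
            * math.cos((2 * y + 1) * math.pi * v / (2 * n)))


def forward(f):
    n = len(f)
    return [[sum(f[x][y] * kernel(x, y, u, v, n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def inverse(T):
    n = len(T)
    return [[sum(T[u][v] * kernel(x, y, u, v, n)
                 for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]


def zero_smallest(T, percent):
    """Set the smallest-magnitude `percent` of the coefficients to zero."""
    n = len(T)
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=lambda p: abs(T[p[0]][p[1]]))
    drop = set(order[:int(n * n * percent / 100)])
    return [[0.0 if (u, v) in drop else T[u][v] for v in range(n)]
            for u in range(n)]


def rms_error(f, g):
    n = len(f)
    return math.sqrt(sum((f[x][y] - g[x][y]) ** 2
                         for x in range(n) for y in range(n)) / (n * n))


block = [[52, 55, 61, 66],
         [63, 59, 55, 90],
         [62, 59, 68, 113],
         [63, 58, 71, 122]]
errors = {p: rms_error(block, inverse(zero_smallest(forward(block), p)))
          for p in (0, 50, 75)}
```

Swapping in a different kernel and comparing the errors at the same P reproduces the comparison on a small scale. A transform that packs more information into fewer coefficients gives a lower rms error.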
How can we select the best subimage size?
Vary the subimage size from 2x2, 4x4, 8x8, and so on
Perform transform
Discard smallest P% of T(u,v) coefficients
Perform inverse transform
Calculate rms error
Coefficient Quantization
The process of discarding the P% smallest
T(u,v) coefficients is a form of quantization
In general, quantization reduces the number of
distinct T(u,v) values, which reduces entropy and
increases the compression ratio
A wide variety of quantization schemes have
been devised over the years
A threshold mask shows
which P% of the coefficients
were discarded (zeros)
A zonal mask shows which
coefficients have the highest
variance in the image
[Figure: threshold mask with P=87.5%; zonal mask with P=87.5%]
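The difference between the two masks is easiest to see in code: a threshold mask is computed per block from coefficient magnitudes, while a zonal mask is computed once from the variance of each coefficient position across all blocks. The sketch below is an illustration under that reading; the function names and the two 2x2 coefficient blocks are assumptions.

```python
def top_positions(score, n, keep):
    """Return the `keep` positions (u, v) with the highest score."""
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=score, reverse=True)
    return set(order[:keep])


def threshold_mask(T, keep):
    """1 marks the `keep` largest-magnitude coefficients of one block."""
    n = len(T)
    kept = top_positions(lambda p: abs(T[p[0]][p[1]]), n, keep)
    return [[1 if (u, v) in kept else 0 for v in range(n)] for u in range(n)]


def zonal_mask(blocks, keep):
    """1 marks the `keep` positions with the highest variance over all blocks."""
    n = len(blocks[0])

    def variance(p):
        vals = [b[p[0]][p[1]] for b in blocks]
        mean = sum(vals) / len(vals)
        return sum((x - mean) ** 2 for x in vals) / len(vals)

    kept = top_positions(variance, n, keep)
    return [[1 if (u, v) in kept else 0 for v in range(n)] for u in range(n)]


A = [[10, 3], [-1, 0]]
B = [[20, -3], [1, 4]]
mask_a = threshold_mask(A, 2)      # [[1, 1], [0, 0]]
mask_b = threshold_mask(B, 2)      # [[1, 0], [0, 1]]
zonal = zonal_mask([A, B], 2)      # [[1, 1], [0, 0]]
```

With P=87.5% on an 8x8 block, keep would be 8 of the 64 coefficients. The threshold mask can differ from block to block, whereas the zonal mask is shared by every block.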
In some systems, the nxn mask for each block
is transmitted as an n^2-bit stream, followed by
the non-zero T(u,v) values for the block
Other systems save bits by using a single mask
for the entire image, or by predefining a set of
common masks, and transmitting the mask
identifiers instead of the masks
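The per-block scheme (mask bits followed by the non-zero values) can be sketched as a pair of functions. This assumes the mask is sent in row-major order and represents the bit stream as a string of '0' and '1' characters for readability; both choices and the function names are assumptions.

```python
def encode_block(T):
    """Return (mask_bits, values): n*n mask bits, then the non-zero values."""
    n = len(T)
    flat = [T[u][v] for u in range(n) for v in range(n)]
    mask_bits = ''.join('1' if c != 0 else '0' for c in flat)
    values = [c for c in flat if c != 0]
    return mask_bits, values


def decode_block(mask_bits, values, n):
    """Rebuild the n x n block by reading one value for every 1 bit."""
    it = iter(values)
    flat = [next(it) if b == '1' else 0 for b in mask_bits]
    return [flat[u * n:(u + 1) * n] for u in range(n)]


T = [[40, -3, 0, 0],
     [5, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
mask_bits, values = encode_block(T)   # '1100100000000000', [40, -3, 5]
restored = decode_block(mask_bits, values, 4)
```

The mask costs n^2 bits per block regardless of content, which is the overhead that a shared or predefined mask avoids.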
Another quantization option
is to use a variable number of
bits to encode each of the
T(u,v) coefficients
Fewer bits are used for high
frequency terms, so this
process causes smoothing
(and block artifacts)
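Variable bit allocation can be illustrated with a uniform quantizer whose number of levels depends on the coefficient position. The slides do not specify a quantizer, so the design here (a fixed range [-max_abs, max_abs], reconstruction at the bin centre, and 0 bits meaning the coefficient is dropped) is entirely an assumption, as are the names and the bit table.

```python
def quantize(value, bits, max_abs):
    """Uniformly quantize `value` in [-max_abs, max_abs] to 2**bits levels."""
    if bits == 0:
        return 0.0                       # coefficient is not transmitted
    levels = 2 ** bits
    step = 2 * max_abs / levels
    index = int((value + max_abs) / step)
    index = min(max(index, 0), levels - 1)   # clamp out-of-range values
    return -max_abs + (index + 0.5) * step   # reconstruct at the bin centre


def quantize_block(T, bit_table, max_abs):
    """Apply a per-position bit allocation to one coefficient block."""
    n = len(T)
    return [[quantize(T[u][v], bit_table[u][v], max_abs) for v in range(n)]
            for u in range(n)]


T = [[100.0, -37.0], [12.0, 3.0]]
bit_table = [[8, 4], [4, 0]]    # fewer bits toward high frequencies
Tq = quantize_block(T, bit_table, 128.0)   # [[100.5, -40.0], [8.0, 0.0]]
```

The low-frequency term is reproduced within half a unit, while the high-frequency terms are coarse or missing. That loss of detail is the source of the smoothing and block artifacts.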
Coefficient Coding
T(u,v) coefficients are often
reordered using a zig-zag
pattern prior to coding
Low frequency terms before
high frequency terms
A special symbol EOB is
used to replace trailing 0s
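The zig-zag scan and the EOB symbol can be sketched as follows. The ordering walks the anti-diagonals in alternating directions, starting (0,0), (0,1), (1,0), (2,0). For simplicity this sketch always appends EOB, even when there are no trailing zeros; that choice and the function names are assumptions.

```python
def zigzag_order(n):
    """List the (row, col) positions of an n x n block in zig-zag order."""
    def key(p):
        d = p[0] + p[1]                  # index of the anti-diagonal
        # Odd diagonals run top-to-bottom, even diagonals bottom-to-top.
        return (d, p[0] if d % 2 else -p[0])
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_encode(T):
    """Scan in zig-zag order and replace the trailing zeros with 'EOB'."""
    seq = [T[r][c] for r, c in zigzag_order(len(T))]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq + ['EOB']


def zigzag_decode(seq, n):
    """Undo zigzag_encode: pad with zeros after EOB and refill the block."""
    values = seq[:-1] + [0] * (n * n - (len(seq) - 1))
    T = [[0] * n for _ in range(n)]
    for (r, c), val in zip(zigzag_order(n), values):
        T[r][c] = val
    return T


T = [[9, 4, 0],
     [3, 1, 0],
     [2, 0, 0]]
seq = zigzag_encode(T)        # [9, 4, 3, 2, 1, 'EOB']
restored = zigzag_decode(seq, 3)
```

Because quantization zeroes mostly high-frequency terms, the zig-zag order tends to push the zeros to the end of the sequence, where a single EOB replaces them.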
After quantization, the T(u,v) values for each
block are encoded using a lossless coding
scheme such as Huffman or Arithmetic coding
The specific parameters of the coding scheme
are stored in the compressed file, so the
decompression program can undo the coding,
rescale the data, and perform the inverse
transform to obtain the output image
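As an illustration of the lossless stage, the sketch below builds a Huffman code for a short sequence of quantized coefficients. It uses only Python's heapq and collections modules; the symbol sequence and function name are assumptions, and real codecs use predefined or more elaborate tables.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each symbol to its Huffman bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate one-symbol input
        return {next(iter(freq)): '0'}
    # Heap entries: (count, tie-breaker, partial code table for the subtree).
    heap = [(count, i, {sym: ''}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)  # two least frequent subtrees
        c2, _, codes2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in codes1.items()}
        merged.update({s: '1' + c for s, c in codes2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


symbols = [0, 0, 0, 0, 0, 0, 5, 5, -1, 'EOB']
code = huffman_code(symbols)
total_bits = sum(len(code[s]) for s in symbols)   # 16 bits for 10 symbols
```

The code table is the kind of "specific parameter" that has to be stored with the compressed data so that the decoder can undo the coding. Here the frequent symbol 0 gets a 1-bit code and the rare symbols get 3-bit codes.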
Conclusion
In this section, we described block transform
coding, the image compression technique used
by JPEG and MPEG
