
Basic Image Compression Concepts

Presenter: Guan-Chen Pan
Research Advisor: Jian-Jiun Ding, Ph.D., Assistant Professor
Digital Image and Signal Processing Lab
Graduate Institute of Communication Engineering
National Taiwan University

Outline
Introduction
Basic concepts of image compression
Proposed method for arbitrary-shape image segment compression
Improvement of the boundary region by morphology
JPEG2000
Triangular and trapezoid regions and modified JPEG image compression

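Introduction
Image compression is either lossless or lossy; lossy compression is the more widely used.

YCbCr color space

$$
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.334 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix} +
\begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}
$$

Y: the luminance of the image, which represents the brightness.
Cb: the chrominance of the image, which represents the difference between the gray and blue.
Cr: the chrominance of the image, which represents the difference between the gray and red.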
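The color conversion above can be sketched per pixel as follows. This is only an illustration: the coefficient magnitudes are the ones on the slide, while the minus signs are an assumption (they are the usual ones for YCbCr and were lost in the extracted text), and the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr with the slide's coefficients.

    The negative signs are assumed (standard YCbCr convention).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.334 * g + 0.500 * b + 128
    cr = 0.500 * r - 0.419 * g - 0.081 * b + 128
    return y, cb, cr


# A mid-gray pixel has no color, so Cb and Cr stay near the 128 offset.
y, cb, cr = rgb_to_ycbcr(128, 128, 128)
print(y, cb, cr)
```

For a gray pixel the luminance equals the gray level and both chrominance values sit at (or very near) 128, which is why the chrominance channels can be subsampled with little visible loss.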

Chrominance Subsampling
The name of the format is not always related to the subsampling ratio.

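Compression ratio (CR)

$$ CR = \frac{n_1}{n_2} $$

where n1 is the data quantity of the original image, and n2 is the compressed data quantity.

Root mean square error (RMSE)

$$ RMSE = \sqrt{ \frac{1}{HW} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} \left[ f(x,y) - f'(x,y) \right]^2 } $$

where H and W are the height and the width of the images, respectively, f is the original image, and f' is the reconstructed image.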

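Peak signal-to-noise ratio (PSNR)

$$ PSNR = 20 \log_{10} \frac{255}{RMSE} $$

for 8-bit images, whose peak value is 255.

We must keep in mind that these two measurements do not necessarily reflect the perceived quality of the image.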
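As a rough illustration of the three measurements, the sketch below computes the compression ratio, RMSE, and PSNR for images stored as lists of rows. The function names and the tiny 2x2 test images are hypothetical, and the peak value 255 assumes 8-bit samples.

```python
import math


def compression_ratio(n1, n2):
    """n1: size of the original data, n2: size of the compressed data."""
    return n1 / n2


def rmse(original, reconstructed):
    """Root mean square error between two images of equal size."""
    h, w = len(original), len(original[0])
    total = sum((original[y][x] - reconstructed[y][x]) ** 2
                for y in range(h) for x in range(w))
    return math.sqrt(total / (h * w))


def psnr(original, reconstructed, peak=255):
    """PSNR in dB; infinite when the two images are identical."""
    err = rmse(original, reconstructed)
    return float("inf") if err == 0 else 20 * math.log10(peak / err)


a = [[10, 20], [30, 40]]   # hypothetical original image
b = [[12, 18], [30, 44]]   # hypothetical reconstruction
print(compression_ratio(1000, 250), rmse(a, b), psnr(a, b))
```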

Reduce the Correlation between Pixels
Transform coding
1. Coordinate rotation
2. Karhunen-Loeve transform
3. Discrete cosine transform
4. Discrete wavelet transform
Predictive coding

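Coordinate rotation
Draw a line that has the minimum mean square error with respect to all the data, for example a set of (height, weight) points.
Rotating the coordinates onto this line reduces the correlation between the two components; we do the inverse transform to get the data back.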

Karhunen-Loeve transform (KLT)
The KLT is the optimal transform coding.
For a zero-mean input vector x and a transform matrix A, the output vector is y = Ax. Substituting y with Ax, its autocorrelation matrix is

$$ C_{yy} = E[y y^T] = E[A x x^T A^T] = A C_{xx} A^T $$

We can achieve optimal decorrelation through diagonalization of C_yy, which means that A is composed of the eigenvectors of the autocorrelation matrix C_xx.
However, it takes a large amount of computation to find the eigenvectors, and all the eigenvectors need to be stored.
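For two-dimensional data the KLT reduces to the coordinate rotation described earlier, so it can be sketched without an eigenvalue solver. The example below is a minimal sketch under that assumption: the rotation angle comes from the closed-form diagonalization of a 2x2 covariance matrix, and the (height, weight) samples and function names are hypothetical.

```python
import math


def klt_2d(points):
    """Rotate zero-mean 2-D data onto the eigenvectors of its covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    # Angle that diagonalizes [[cxx, cxy], [cxy, cyy]].
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    rotated = [(c * x + s * y, -s * x + c * y) for x, y in centered]
    return theta, rotated


def cross_covariance(pts):
    """Off-diagonal covariance term of zero-mean 2-D data."""
    return sum(u * v for u, v in pts) / len(pts)


# Hypothetical (height, weight) samples that are strongly correlated.
data = [(150, 50), (160, 56), (170, 64), (180, 71), (190, 80)]
theta, rotated = klt_2d(data)
print(theta, cross_covariance(rotated))
```

After the rotation the cross-covariance is numerically zero, which is the decorrelation that diagonalization promises. For N-dimensional blocks the same idea requires a full eigen-decomposition, which is the computational cost the slide mentions.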

Discrete cosine transform
The DCT is an approximation of the KLT and is more widely used in image and video compression.
The DCT can concentrate more energy in the low-frequency bands than the DFT.
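The energy compaction can be checked with a short sketch. The code below implements an orthonormal 1-D DCT-II directly from its definition and applies it to a hypothetical smooth 8-sample ramp; the function name and the test signal are assumptions chosen for illustration.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a list of samples."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(signal[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out


ramp = [10, 12, 14, 16, 18, 20, 22, 24]   # hypothetical smooth image row
coeffs = dct_1d(ramp)
energy = sum(c * c for c in coeffs)
low_energy = coeffs[0] ** 2 + coeffs[1] ** 2
print(low_energy / energy)
```

Because the transform is orthonormal, the total energy is preserved, and for this smooth input almost all of it falls into the first two coefficients.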

Discrete wavelet transform
The wavelet transform is very similar to the conventional Fourier transform, but it is based on small waves, called wavelets, which are time-varying waves of limited duration.
We use the 2-D discrete wavelet transform in image compression.
We can see that the low-frequency subband is very similar to the original image, so we can use it to achieve image compression.

Predictive Coding
Predictive coding means that we transmit only the difference between the current pixel and the previous pixel.
The difference is usually close to zero.
However, predictive coding algorithms are more widely used in video.
Examples: delta modulation (DM), adaptive DM, DPCM, adaptive DPCM (ADPCM).
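A minimal sketch of the idea, assuming the simplest possible predictor (the previous pixel) and no quantization of the difference, so the scheme is lossless. The function names and the sample row are hypothetical.

```python
def dpcm_encode(pixels):
    """Replace each pixel by its difference from the previous pixel."""
    prev = 0
    diffs = []
    for p in pixels:
        diffs.append(p - prev)
        prev = p
    return diffs


def dpcm_decode(diffs):
    """Rebuild the pixels by accumulating the differences."""
    prev = 0
    pixels = []
    for d in diffs:
        prev += d
        pixels.append(prev)
    return pixels


row = [100, 102, 103, 103, 101, 98]   # hypothetical image row
diffs = dpcm_encode(row)
print(diffs)
```

Apart from the first value, the differences are small numbers clustered around zero, which an entropy coder can represent with fewer bits than the raw pixels.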

Quantization
Each DCT coefficient F(u,v) is divided by the corresponding entry Q(u,v) of the quantization matrix and rounded to the nearest integer.

$$ F_q(u,v) = \mathrm{Round}\left( \frac{F(u,v)}{Q(u,v)} \right) $$

Luminance quantization matrix

16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

Chrominance quantization matrix
[Table: chrominance quantization matrix, values not recoverable from the extracted text]

Quantization removes the high frequencies.
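The quantization step can be sketched as below, using the luminance matrix from the slide. The DCT block `F` is a hypothetical example with a large DC value and a few small coefficients; note that Python's `round` resolves exact ties to the nearest even integer, which may differ from a particular codec's rounding rule.

```python
# Luminance quantization matrix from the slide.
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(F, Q):
    """Divide each coefficient by its step size and round to the nearest integer."""
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]


def dequantize(Fq, Q):
    """Approximate reconstruction of the coefficients at the decoder."""
    return [[Fq[u][v] * Q[u][v] for v in range(8)] for u in range(8)]


# Hypothetical DCT block: big DC term, small low- and high-frequency terms.
F = [[0.0] * 8 for _ in range(8)]
F[0][0], F[0][1], F[1][0], F[7][7] = 415.0, -30.0, -22.0, 20.0
Fq = quantize(F, Q_LUMA)
print(Fq[0][0], Fq[0][1], Fq[1][0], Fq[7][7])
```

The high-frequency coefficient at position (7,7) is divided by a large step size and becomes zero, which is how quantization removes the high frequencies.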

Entropy Coding Algorithms
1. Huffman Coding
   Difference Coding (DC)
   Zero Run Length Coding (AC)
2. Arithmetic Coding
3. Golomb Coding

Huffman Coding
Huffman coding is the most popular technique for removing coding redundancy.
Unique prefix property
Instantaneous decoding property
Optimality
JPEG uses fixed Huffman tables, which are not optimal.
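The construction of a Huffman code can be sketched with a priority queue. This is a minimal illustration, not the table-based variant used in JPEG; the function name and the four-symbol source are hypothetical.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Build a prefix code by repeatedly merging the two least probable nodes."""
    tie = count()  # unique tie-breaker so dicts are never compared
    heap = [(w, next(tie), {s: ""}) for s, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


# Hypothetical source with dyadic probabilities.
codes = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print(codes)
```

The most probable symbol gets a 1-bit codeword and the rarest symbols get 3 bits, and no codeword is a prefix of another, so the bit stream can be decoded instantaneously.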

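Difference Coding
For DC coefficients.
The DC coefficient of a block is usually very close to those of its neighbors and usually has a much larger value than the AC coefficients, so we encode the difference between adjacent DC coefficients:

$$ Diff_i = DC_i - DC_{i-1} $$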

Zero Run Length Coding
Encode each value that is not 0, then add the number of consecutive zeros in front of it.
EOB (End of Block) = (0,0)
The run length is only a 4-bit value.
[57,45,0,0,0,0,23,0,-30,-16,0,...,0]
[(0,57)(0,45)(4,23)(1,-30)(0,-16)EOB]
Eighteen zeros followed by 3: (15,0); (2,3)
where (15,0) is 16 consecutive zeros
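The rules above can be sketched as a small encoder. This is a simplified illustration that produces only the (run, value) pairs and leaves out the subsequent Huffman stage; the function name is hypothetical, and emitting EOB only when trailing zeros exist is an assumption.

```python
def zero_run_length(coeffs):
    """Encode AC coefficients as (zero run, value) pairs with a 4-bit run limit."""
    last = len(coeffs)
    while last > 0 and coeffs[last - 1] == 0:
        last -= 1                      # trailing zeros are replaced by EOB
    pairs = []
    run = 0
    for v in coeffs[:last]:
        if v == 0:
            run += 1
            if run == 16:              # run no longer fits in 4 bits
                pairs.append((15, 0))  # (15,0) stands for 16 consecutive zeros
                run = 0
        else:
            pairs.append((run, v))
            run = 0
    if last < len(coeffs):
        pairs.append((0, 0))           # EOB
    return pairs


print(zero_run_length([57, 45, 0, 0, 0, 0, 23, 0, -30, -16, 0, 0, 0]))
print(zero_run_length([0] * 18 + [3]))
```

The first call reproduces the slide's example, ending in EOB; the second shows eighteen zeros followed by 3 becoming (15,0) and (2,3).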

Arithmetic Coding
Arithmetic coding is another coding method widely used in image and video compression, and its performance is better than that of Huffman coding.
We treat the whole input data as a single symbol and find the corresponding codeword for it.
Huffman coding is inefficient when the probability of a symbol is very close to 1.0, because every symbol still costs at least one bit.

Symbol         Probability   Sub-interval
(unreadable)   0.05          [0.00,0.05)
l              0.2           [0.05,0.25)
u              0.1           [0.25,0.35)
w              0.05          [0.35,0.40)
e              0.3           [0.40,0.70)
(unreadable)   0.2           [0.70,0.90)
(unreadable)   0.1           [0.90,1.00)

Decoding 0.071334
0.071334 lies in [0.05,0.25), so the first symbol is l.
For the interval 0.05~0.25, the sub-intervals become:

Symbol         Probability   Sub-interval
(unreadable)   0.05          [0.05,0.06)
l              0.2           [0.06,0.10)
u              0.1           [0.10,0.12)
w              0.05          [0.12,0.13)
e              0.3           [0.13,0.19)
(unreadable)   0.2           [0.19,0.23)
(unreadable)   0.1           [0.23,0.25)

0.071334 lies in [0.06,0.10), so the second symbol is l.
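The decoding procedure can be sketched as repeated interval narrowing. Exact fractions are used so that interval boundaries are not blurred by floating-point error. Several things here are assumptions: the symbol names `k`, `r` and `?` fill the three rows whose names did not survive in the table (they follow a common textbook example with the same probabilities), the message length of 7 is chosen for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

# Symbols in table order; "k", "r" and "?" are assumed names.
SYMBOLS = ["k", "l", "u", "w", "e", "r", "?"]
PROBS = [Fraction(p) for p in ("0.05", "0.2", "0.1", "0.05", "0.3", "0.2", "0.1")]


def build_intervals():
    """Map each symbol to its cumulative sub-interval of [0, 1)."""
    table = {}
    low = Fraction(0)
    for s, p in zip(SYMBOLS, PROBS):
        table[s] = (low, low + p)
        low += p
    return table


def decode(value, n_symbols):
    """Decode n_symbols by narrowing the current interval around value."""
    table = build_intervals()
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(n_symbols):
        for s, (lo, hi) in table.items():
            if low + width * lo <= value < low + width * hi:
                out.append(s)
                low, width = low + width * lo, width * (hi - lo)
                break
    return "".join(out)


print(decode(Fraction("0.071334"), 2))
print(decode(Fraction("0.071334"), 7))
```

The first two decoded symbols are both l, matching the two steps worked out on the slides.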

Golomb Coding
Golomb coding is a special case of Huffman coding.
It is optimal for data with a geometric distribution.
No codeword table is needed.

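To encode a nonnegative integer a:
1. Determine m from p.
2. Write a = qm + r, where q is the quotient and r is the remainder.
3. Convert q into the prefix. The prefix is composed of q 1 bits followed by a 0 bit.
4. Convert r into the suffix using the binary code. The threshold parameter is

$$ \theta(m) = 2^{\lceil \log_2 m \rceil} - m $$

   If $r < \theta(m)$, the length of the suffix is $\lfloor \log_2 m \rfloor$ bits.
   If $r \ge \theta(m)$, we update r into $r + \theta(m)$ and encode it into a $\lceil \log_2 m \rceil$-length suffix.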

Example
p = 0.93, m = 10, a = 19
q = 1, r = 9
Prefix = 10
Threshold parameter $\theta(m) = 2^{\lceil \log_2 m \rceil} - m = 2^4 - 10 = 6$
$r \ge$ threshold
r = r + threshold = 9 + 6 = 15
Encode 15 into a $\lceil \log_2 m \rceil$-length suffix, $\lceil \log_2 m \rceil = 4$
Suffix = 1111
Code = 101111

Decode 101111

Encoding of quotient part
q    output bits
0    0
1    10
2    110
3    1110
4    11110
5    111110
6    1111110
:    :
N    <N repetitions of 1>0

Encoding of remainder part
r    offset   binary   output bits
0    0        0000     000
1    1        0001     001
2    2        0010     010
3    3        0011     011
4    4        0100     100
5    5        0101     101
6    12       1100     1100
7    13       1101     1101
8    14       1110     1110
9    15       1111     1111

q = 1, r = 9
a = 10*1 + 9 = 19
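The encoding and decoding steps can be sketched as follows for m >= 2. The function names are hypothetical; `(m - 1).bit_length()` is used as an exact integer way of computing the ceiling of log2(m).

```python
def golomb_encode(a, m):
    """Golomb codeword of a nonnegative integer a with parameter m >= 2."""
    if m < 2:
        raise ValueError("this sketch assumes m >= 2")
    q, r = divmod(a, m)
    prefix = "1" * q + "0"                 # q ones followed by a zero
    b = (m - 1).bit_length()               # ceil(log2(m))
    threshold = 2 ** b - m
    if r < threshold:
        suffix = format(r, "0{}b".format(b - 1))
    else:
        suffix = format(r + threshold, "0{}b".format(b))
    return prefix + suffix


def golomb_decode(bits, m):
    """Inverse of golomb_encode for a single codeword."""
    if m < 2:
        raise ValueError("this sketch assumes m >= 2")
    q = 0
    i = 0
    while bits[i] == "1":                  # count the ones of the prefix
        q += 1
        i += 1
    i += 1                                 # skip the terminating zero
    b = (m - 1).bit_length()
    threshold = 2 ** b - m
    r = int(bits[i:i + b - 1], 2) if b > 1 else 0
    if r >= threshold:                     # long suffix: read one more bit
        r = int(bits[i:i + b], 2) - threshold
    return q * m + r


print(golomb_encode(19, 10), golomb_decode("101111", 10))
```

With m = 10 and a = 19 this reproduces the codeword 101111 from the example, and decoding it returns 19.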

However, Golomb coding achieves optimal coding efficiency only when the data is geometrically distributed.
To solve this problem, there is the Adaptive Golomb Code.

                   Without codeword table   Flexibility and adaptation
Huffman            NO                       GOOD
Golomb             YES                      MIDDLE
Adaptive Golomb    YES                      GOOD

Proposed Method for Arbitrary-Shape Image Segment Compression
An arbitrary-shape image segment f and its shape matrix.
Standard 8x8 DCT bases with the shape of f.
The 37 arbitrary-shape orthonormal DCT bases obtained by the Gram-Schmidt process.

$$ F(k) = \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} f(x,y)\, \phi'_{x,y}(k), \quad \text{for } k = 1, 2, \ldots, M $$
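The construction can be sketched on a small block. The code below restricts the standard DCT bases to the pixels inside a shape mask, orthonormalizes them with the Gram-Schmidt process, and then computes the coefficients F(k). It is only a sketch: the 4x4 mask, the pixel values, the low-frequency-first ordering of the bases, and the function names are all assumptions made for illustration.

```python
import math


def dct_basis(u, v, n):
    """2-D DCT basis function (u, v) sampled on an n x n grid."""
    return [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) *
             math.cos(math.pi * (2 * y + 1) * v / (2 * n))
             for y in range(n)] for x in range(n)]


def shape_adaptive_bases(mask):
    """Gram-Schmidt on DCT bases restricted to the pixels where mask is 1."""
    n = len(mask)
    pixels = [(x, y) for x in range(n) for y in range(n) if mask[x][y]]
    # Assumed ordering: low frequencies (small u + v) first.
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=lambda t: (t[0] + t[1], t[0]))
    bases = []
    for u, v in order:
        full = dct_basis(u, v, n)
        vec = [full[x][y] for x, y in pixels]
        for b in bases:                    # remove components along earlier bases
            dot = sum(p * q for p, q in zip(vec, b))
            vec = [p - dot * q for p, q in zip(vec, b)]
        norm = math.sqrt(sum(p * p for p in vec))
        if norm > 1e-6:                    # skip linearly dependent candidates
            bases.append([p / norm for p in vec])
        if len(bases) == len(pixels):
            break
    return pixels, bases


MASK = [[1, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 1, 1]]                      # hypothetical shape matrix, M = 12
pixels, bases = shape_adaptive_bases(MASK)
f = [10 * x + y for x, y in pixels]        # hypothetical pixel values
F = [sum(p * q for p, q in zip(f, b)) for b in bases]
recon = [sum(F[k] * bases[k][i] for k in range(len(bases)))
         for i in range(len(pixels))]
print(len(bases), max(abs(a - b) for a, b in zip(f, recon)))
```

The number of bases equals the number of pixels M inside the shape, so the M coefficients represent the segment exactly before quantization.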

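Quantization

$$ Q(k) = Q_a k + Q_c, \quad \text{for } k = 1, 2, \ldots, M $$

$$ F_q(k) = \mathrm{Round}\left( \frac{F(k)}{Q(k)} \right), \quad \text{where } k = 1, 2, \ldots, M $$

[Figure: plot with vertical axis from 10 to 45 and horizontal axis from 100 to 600]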

Improvement of the Boundary Region by Morphology

38

JPEG2000
JPEG 2000 is a new standard, and it can achieve better performance in image compression.
Advantages
Efficient lossy and lossless compression
Superior image quality
Additional features such as spatial scalability and region of interest
Complexity

[Figure: JPEG 2000 encoder]
[Figure: JPEG 2000 decoder]
Embedded Block Coding with Optimized Truncation (EBCOT): Tier-1 + Tier-2

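Irreversible component transform (ICT)
Irreversible and real-to-real

$$
\begin{bmatrix} V_0(x,y) \\ V_1(x,y) \\ V_2(x,y) \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.16875 & -0.33126 & 0.5 \\ 0.5 & -0.41869 & -0.08131 \end{bmatrix}
\begin{bmatrix} U_0(x,y) \\ U_1(x,y) \\ U_2(x,y) \end{bmatrix}
$$

U_0, U_1 and U_2 are just like R, G and B, respectively.
V_0, V_1 and V_2 are just like Y, Cb and Cr, respectively.

$$
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.334 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix} +
\begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}
$$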

Reversible component transform (RCT)
Reversible and integer-to-integer

$$ V_0(x,y) = \left\lfloor \frac{U_0(x,y) + 2U_1(x,y) + U_2(x,y)}{4} \right\rfloor $$

$$ V_1(x,y) = U_2(x,y) - U_1(x,y) $$

$$ V_2(x,y) = U_0(x,y) - U_1(x,y) $$
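The reversibility of the RCT can be checked with a short sketch. The forward transform follows the three equations of the slide with floor division; the inverse shown here is the standard one for this transform and is not spelled out on the slide. The function names are hypothetical.

```python
def rct_forward(u0, u1, u2):
    """Integer-to-integer forward transform (U0, U1, U2 play the role of R, G, B)."""
    v0 = (u0 + 2 * u1 + u2) // 4   # floor division
    v1 = u2 - u1
    v2 = u0 - u1
    return v0, v1, v2


def rct_inverse(v0, v1, v2):
    """Exact inverse: recovers the original integers without any loss."""
    u1 = v0 - (v1 + v2) // 4       # Python's // floors negative values too
    u0 = v2 + u1
    u2 = v1 + u1
    return u0, u1, u2


print(rct_forward(200, 30, 90), rct_inverse(*rct_forward(200, 30, 90)))
```

The inverse works because U0 + 2U1 + U2 equals (V1 + V2) + 4U1, so the floor of the division by 4 differs from the floor of (V1 + V2)/4 by exactly U1.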

Irreversible transform: Daubechies 9/7 filter

Analysis filter coefficients
n      Lowpass filter      Highpass filter
0       0.602949018236      1.115087052456
±1      0.266864118442     -0.591271763114
±2     -0.078223266528     -0.057543526228
±3     -0.016864118442      0.091271763114
±4      0.026748757410

Synthesis filter coefficients
n      Lowpass filter      Highpass filter
0       1.115087052456      0.602949018236
±1      0.591271763114     -0.266864118442
±2     -0.057543526228     -0.078223266528
±3     -0.091271763114      0.016864118442
±4                          0.026748757410

Tier-1 Encoder
Each fractional bit-plane coding pass generates the Context (CX) and the Decision (D), which are used for arithmetic coding.
zero coding
sign coding
magnitude refinement coding
run length coding

Bit-plane Conversion
Converts the quantized wavelet coefficients into several bit-planes.
The first bit-plane is the sign plane.
The other planes are the magnitude planes, from MSB to LSB.

 17  22  33  48  64  80  96 112
 22  28  38  52  67  81  96 112
 33  38  48  62  75  86 100 116
 48  52  62  70  83  96 110 125
 64  67  75  83  96 108 118 132
 80  81  86  96 108 117 128 142
 96  96 100 110 118 128 140 150
112 112 116 125 132 142 150 160

17 = 00010001_2
160 = 10100000_2
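The conversion into a sign plane and magnitude planes can be sketched as below. The function name and the small 2x2 test block are hypothetical; 8 magnitude bits are assumed, which is enough for the values in the example block.

```python
def to_bit_planes(block, n_bits=8):
    """Return (sign_plane, magnitude_planes); magnitude planes run from MSB to LSB."""
    sign = [[1 if v < 0 else 0 for v in row] for row in block]
    planes = [[[(abs(v) >> bit) & 1 for v in row] for row in block]
              for bit in range(n_bits - 1, -1, -1)]
    return sign, planes


sign, planes = to_bit_planes([[17, 160], [-17, 0]])
bits_17 = "".join(str(p[0][0]) for p in planes)
bits_160 = "".join(str(p[0][1]) for p in planes)
print(bits_17, bits_160, sign)
```

Reading one position across all magnitude planes gives back the binary representation of that coefficient, for example 00010001 for 17 and 10100000 for 160.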

Stripe and Scan Order

50

Zero Coding

d v d
h D h
d v d

D: current encode data, binary: 0 or 1
h: 0~2, v: 0~2, d: 0~4 (numbers of significant horizontal, vertical and diagonal neighbors)

Sign Coding

  v
h D h
  v

h, v: neighborhood sign status
-1: one or both negative
0: both insignificant, or both significant but of opposite sign
1: one or both positive
D: the sign bit XORed with the XOR bit selected by the context

Magnitude Refinement Coding
The state variable at [x,y] is initialized to 0, and it becomes 1 after magnitude refinement coding is first applied at [x,y].

Run-Length Coding
For four zeros, (CX,D) is (0,0).
Otherwise, (CX,D) is (0,1), followed by 2 uniform-context decisions (CX=18) that record the position of the first 1.
Example: (0110)
The first nonzero position is (01)_2, so the output is (0,1), (18,0), (18,1).
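The rule can be written as a small function that maps the four bits of one stripe column to (CX, D) pairs. This sketch follows the context numbering used on the slide (CX=0 for the run-length decision, CX=18 for the uniform context); the function name is hypothetical.

```python
def run_length_cxd(column):
    """Map four bits of one stripe column to a list of (CX, D) pairs."""
    if len(column) != 4:
        raise ValueError("a stripe column holds exactly four bits")
    if not any(column):
        return [(0, 0)]                    # all four bits are zero
    pos = column.index(1)                  # position of the first 1 (0..3)
    # Two uniform-context decisions carry the 2-bit position, MSB first.
    return [(0, 1), (18, pos >> 1), (18, pos & 1)]


print(run_length_cxd([0, 0, 0, 0]))
print(run_length_cxd([0, 1, 1, 0]))
```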

[Figure: D (0 or 1) and CX (19 contexts in total) -> Arithmetic encoder -> Compressed data]

Why Called Fractional

56

Tier-2 Encoder
Rate/distortion optimized truncation

Triangular and Trapezoid Regions and Modified JPEG Image Compression
Divide an image into 3 parts:
1. Lower frequency regions
2. Traditional image blocks
3. The arbitrarily-shaped image blocks

Shape matrix            Sections per row
1 1 1 1 1 1 1 1 1 0     1 section
0 1 1 1 1 1 1 1 1 1     1 section
0 1 1 1 1 1 1 1 1 0     1 section
0 1 1 1 1 1 1 1 0 0     1 section
0 0 1 1 0 1 1 1 1 0     2 sections
0 0 1 0 0 1 1 1 0 0     2 sections
0 0 0 0 0 0 1 1 0 0     1 section
0 0 0 0 0 0 1 1 0 0     1 section
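Counting the sections of each row amounts to counting the runs of consecutive 1s. A minimal sketch, with a hypothetical function name, applied to the shape matrix above:

```python
def count_sections(row):
    """Number of runs of consecutive 1s in one row of the shape matrix."""
    sections = 0
    prev = 0
    for v in row:
        if v == 1 and prev == 0:   # a new run of 1s starts here
            sections += 1
        prev = v
    return sections


SHAPE = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
]
sections = [count_sections(row) for row in SHAPE]
print(sections)
```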

-distance

< threshold

60

Corner

too close

Trapezoid

inside the zone

61

N = 10

= K(m) + K(M-1-m)

62

1. Construct the rectangular region and obtain the orthonormal DCT basis.
2. Select the DCT basis that satisfies p + q = even.
3. [Equation not recoverable from the extracted text]

