
Overview of Multimedia Compression Technologies

Vinay Kumar

Presentation Overview

Introduction
Data Compression
Image Compression
Video Compression
Audio Compression
Summary
Q&A

What is Compression and Why?


Input (x bytes) -> Compression Algorithm -> Output (y bytes), where y < x
Compression ratio = (number of bytes in the original image) / (number of bytes in the compressed image)
e.g. input image = 10 MB, output image = 5 MB, compression ratio = 2:1
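The ratio can be checked with a few lines of Python. This is a minimal sketch: the helper name `compression_ratio` and the sample data are illustrative assumptions, and `zlib` from the standard library stands in for "a compression algorithm".

```python
import zlib


def compression_ratio(original_size, compressed_size):
    """Bytes in the original divided by bytes in the compressed output."""
    return original_size / compressed_size


# Highly repetitive sample data (an assumption chosen so it compresses well).
data = b"abcabcabc" * 1000
packed = zlib.compress(data)
ratio = compression_ratio(len(data), len(packed))
print(f"{len(data)} bytes -> {len(packed)} bytes, ratio {ratio:.1f}:1")
```

The 10 MB to 5 MB example corresponds to `compression_ratio(10_000_000, 5_000_000)`, which gives 2.0, i.e. 2:1.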

Why? Transmission and storage

Common Terminology

Lossless Compression: perfect reconstruction
Lossy Compression: data is lost
Spatial Compression: 2D or single image
Temporal Compression: 3D or video
Codec: Compression / Decompression
Colour / intensity: same thing

1. Data Compression

Oldest form of compression: 1940s

Still important: used to encode the output from modern codecs
Lossless

Huffman Coding
Symbol  Frequency
a       19
b       10
c       8
d       8
e       5

1. Show symbols as leaf nodes, in order of decreasing frequency: a b c d e

2. Combine the two nodes with the lowest frequencies: d (8) + e (5) = 13

Huffman Coding II
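[Figure: Huffman tree. The root (50) splits into a (branch 0) and node 31 (branch 1); node 31 splits into node 13 (branch 0, holding e and d) and node 18 (branch 1, holding c and b).]

Symbol  Huffman Code
a       0
b       111
c       110
d       101
e       100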

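Fewer bits per symbol: compression achieved (on average (19 x 1 + 31 x 3) / 50 = 2.24 bits per symbol, against 3 bits for a fixed-length code)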
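The tree-building steps can be sketched in Python with a priority queue. This is a minimal illustration, not a production encoder: the function name `huffman_codes` is an assumption, and because c and d have equal frequencies the individual codes may differ from the slide, although the code lengths and the average will not.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Build a prefix code by repeatedly merging the two rarest nodes."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # An internal node is a (left, right) tuple; a leaf is a symbol.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, tree = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"  # single-symbol edge case

    walk(tree, "")
    return codes


freqs = {"a": 19, "b": 10, "c": 8, "d": 8, "e": 5}
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
average_bits = total_bits / sum(freqs.values())
print(codes, average_bits)
```

The most frequent symbol gets the shortest code, which is where the saving over a fixed 3-bit code comes from.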

LZ77 - PKZIP

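Replaces a repeated stream with a symbol
Search is finite: a windowed search over a text window and a look-ahead window

[Figure: text window and look-ahead window over the stream "anbfcatkdfjs lcatjfl"; dictionary entry: symbols "cat", code 01]

Need to store and transmit the codebook: inefficient
LZW needs no transmitted codebook: it is generated by both Tx and Rx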
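The point that LZW needs no transmitted codebook can be illustrated with a short sketch: the encoder and decoder each grow the same dictionary as they go. The function names and the 256-entry starting dictionary of single characters are assumptions for illustration; real formats add details such as variable code widths.

```python
def lzw_compress(text):
    """Encode text as a list of integer codes, growing the dictionary on the fly."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    out = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate
        else:
            out.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        out.append(dictionary[current])
    return out


def lzw_decompress(codes):
    """Rebuild the same dictionary from the code stream alone."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the entry that is about to be created.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)


stream = "anbfcatkdfjs lcatjfl"
encoded = lzw_compress(stream)
assert lzw_decompress(encoded) == stream
```

Only the list of codes crosses the channel; the receiver (Rx) reconstructs every dictionary entry the transmitter (Tx) created.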

2. Image Compression

Can break free from the stream compression associated with data compression
Intra-frame coding

Run-length Encoding

Data compression method
Replaces repetitive stream data with a tuple in the format (symbol, count)
aaaaazz encoded as (a,5) (z,2)

Good at horizontal runs
Still 1D
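A minimal sketch of the (symbol, count) scheme in Python, using `itertools.groupby` from the standard library; the function names are illustrative assumptions.

```python
from itertools import groupby


def rle_encode(stream):
    """Collapse each run of identical symbols into a (symbol, count) tuple."""
    return [(symbol, len(list(run))) for symbol, run in groupby(stream)]


def rle_decode(pairs):
    """Expand (symbol, count) tuples back into the original stream."""
    return "".join(symbol * n for symbol, n in pairs)


pairs = rle_encode("aaaaazz")
print(pairs)
```

Because the encoder only sees a 1D stream, a run in an image is found along a scan line, which is why horizontal runs compress well.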

Vector Quantization

2D based

[Figure: the source block is compared with each codebook entry; n[i] is stored, where n = matched entry; label: average intensity]

Loads of enhancements
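The matching step can be sketched as a nearest-entry search. This is an assumption-laden illustration: blocks are flattened 2x2 tuples of intensities, the codebook is hand-written rather than trained, and squared error is used as the similarity measure.

```python
def nearest_index(block, codebook):
    """Index n of the codebook entry closest to the block (squared error)."""
    def error(entry):
        return sum((p - q) ** 2 for p, q in zip(block, entry))
    return min(range(len(codebook)), key=lambda n: error(codebook[n]))


def vq_encode(blocks, codebook):
    """Replace every block by the index of its matched entry."""
    return [nearest_index(block, codebook) for block in blocks]


def vq_decode(indices, codebook):
    """Look the entries back up; the result only approximates the source."""
    return [codebook[n] for n in indices]


# Hypothetical codebook: dark, bright, and vertical-stripe 2x2 blocks.
codebook = [(0, 0, 0, 0), (255, 255, 255, 255), (0, 255, 0, 255)]
blocks = [(10, 5, 0, 3), (250, 255, 240, 255), (12, 240, 3, 251)]
indices = vq_encode(blocks, codebook)
print(indices)
```

Each block of four intensities is stored as a single small index, which is where the compression comes from; the loss is the difference between the block and its matched entry.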

Quadtree Encoding

Look for blocks of similar colour - recursive


[Figure: panels (a) and (b); legend: leaf node, pixel intensity 1, pixel intensity 0; values shown: -1 0 1 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 0 1]

Good for few colours
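The recursive search for uniform blocks can be sketched as follows. The function name, the nested-list tree representation, and the restriction to square images whose side is a power of two are assumptions made to keep the example short.

```python
def build_quadtree(img, x=0, y=0, size=None):
    """Return the block's intensity if it is uniform, else four sub-trees.

    Sub-trees are ordered top-left, top-right, bottom-left, bottom-right.
    Assumes a square image whose side is a power of two.
    """
    if size is None:
        size = len(img)
    first = img[y][x]
    uniform = all(
        img[y + dy][x + dx] == first
        for dy in range(size)
        for dx in range(size)
    )
    if uniform:
        return first  # leaf node
    half = size // 2
    return [
        build_quadtree(img, x, y, half),
        build_quadtree(img, x + half, y, half),
        build_quadtree(img, x, y + half, half),
        build_quadtree(img, x + half, y + half, half),
    ]


image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
tree = build_quadtree(image)
print(tree)
```

Large uniform regions collapse into a single leaf, so images with few colours produce shallow trees.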

GIF - PNG

GIF uses LZW compression: patent problems
A little knowledge is useful for web designers

PNG replaces GIF: no patent problems
Uses the older (but free) LZ77 algorithm

Pyramidal Techniques

Based on averaging pixels


[Figure: original image -> two pyramids -> reconstructed image, annotated with a sum of pel[i] over i = 0 to 3]

Image reconstructed by summing beta pyramid

Surface Approximation

View single image as 3D
Uses parametric equations

[Figure: two 3D surface plots; vertical axis 0 to 1, horizontal axes -2 to 2]

Recursive for complex images

Directional Filtering

Second-generation technique based on the human visual system (HVS)

[Figure: panels (a) to (d)]

Reconstruct image by adding all components

JPEG - the theory

Uses Discrete Cosine Transform (DCT)

[Figure: a waveform approximated by a sum of cosine waves]

More waves can be added for more accuracy

JPEG II in practice
[Figure: Source Image -> DCT -> DC component and AC components -> IDCT]

DC component represents the average

AC components are usually 0
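The relationship between the DC component and the average can be seen in a small sketch. JPEG applies a 2D DCT to 8x8 blocks; for brevity this example uses a 1D orthonormal DCT-II on eight samples, and the function name `dct_1d` and the sample values are assumptions.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(samples)
        )
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * total)
    return coeffs


flat = [100] * 8  # a perfectly uniform row of pixels
coeffs = dct_1d(flat)
dc, ac = coeffs[0], coeffs[1:]
print(round(dc, 3), [round(c, 6) for c in ac])
```

For the uniform row, the DC coefficient equals the average times sqrt(8) and every AC coefficient is zero up to rounding. Smooth image blocks behave similarly, which is what makes the later quantization and run-length stages effective.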

Fractals

Can a block be found that is similar to a Domain block?

[Figure: original image, Domain blocks and Range blocks]

If so, it's called a Range block
Only store Range blocks plus affine transformations
Expensive!

3. Image Sequence Compression

Uses Inter-frame encoding

Also known as image sequence or temporal coding

MPEG Process

How does it work?
1. Subsample

[Figure: R G B converted to Y U V, where Y is the luminance]

Reduces data by around 50%
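The 50% figure can be reproduced with a short sketch. Two assumptions are made here that the slide does not state: the colour conversion uses the common BT.601 weights, and the subsampling is 4:2:0, i.e. U and V are averaged over each 2x2 block while Y is kept at full resolution. The function names are illustrative and the image is assumed to have even width and height.

```python
def rgb_to_yuv(r, g, b):
    """Convert one pixel using BT.601 weights (an assumed choice)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def subsample_420(rgb_image):
    """Keep Y at full resolution; average U and V over 2x2 blocks."""
    height, width = len(rgb_image), len(rgb_image[0])
    yuv = [[rgb_to_yuv(*pixel) for pixel in row] for row in rgb_image]
    y_plane = [[p[0] for p in row] for row in yuv]
    u_plane, v_plane = [], []
    for by in range(0, height, 2):
        u_row, v_row = [], []
        for bx in range(0, width, 2):
            block = [yuv[by + j][bx + i] for j in range(2) for i in range(2)]
            u_row.append(sum(p[1] for p in block) / 4)
            v_row.append(sum(p[2] for p in block) / 4)
        u_plane.append(u_row)
        v_plane.append(v_row)
    return y_plane, u_plane, v_plane


def sample_count(*planes):
    """Total number of stored values across the given planes."""
    return sum(len(row) for plane in planes for row in plane)


image = [[(255, 255, 255)] * 4 for _ in range(4)]  # 4x4 white test image
y_plane, u_plane, v_plane = subsample_420(image)
before = 4 * 4 * 3
after = sample_count(y_plane, u_plane, v_plane)
print(before, after, after / before)
```

A 4x4 RGB image holds 48 values; after subsampling it holds 16 + 4 + 4 = 24, half as many. The eye is less sensitive to colour detail than to luminance detail, so the loss is hard to see.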

MPEG Process II

2. Motion Detection: on luminance block only
3 types of frame:
I frames: intra-coded
P frames: prediction from previous frame
B frames: use bi-directional prediction

Frame:  1  2  3  4  5  6  7  8
Type:   I  B  B  P  B  P  B  I
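Motion detection on a luminance block is commonly done by block matching: search the previous frame for the position that best matches the current block. The sketch below is one simple way to do this, not the method prescribed by MPEG; the exhaustive search, the sum of absolute differences (SAD) cost, and all names are assumptions.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between two size x size luminance blocks."""
    return sum(
        abs(prev[py + j][px + i] - cur[cy + j][cx + i])
        for j in range(size)
        for i in range(size)
    )


def find_motion_vector(prev, cur, cx, cy, size, search):
    """Exhaustively search prev around (cx, cy); return (dx, dy, cost)."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = cx + dx, cy + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(prev, cur, px, py, cx, cy, size)
                if best is None or cost < best[2]:
                    best = (dx, dy, cost)
    return best


def frame_with_block(left, top):
    """6x6 dark frame containing one bright 2x2 block (test data)."""
    frame = [[0] * 6 for _ in range(6)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 200
    return frame


previous_frame = frame_with_block(1, 1)
current_frame = frame_with_block(2, 1)  # the block moved one pixel right
vector = find_motion_vector(previous_frame, current_frame, 2, 1, 2, 2)
print(vector)
```

A P frame can then store the vector and the (ideally small) residual instead of the block itself.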

MPEG Process III - Schematic


[Schematic: Image Sequences -> RGB to YUV -> Motion Estimation -> DCT (no compression achieved) -> Quantize -> RLE -> VLC -> Output Buffer]

MPEG-1 & MPEG-2

MPEG-1

Designed for video playback at 150 KB/s: single-speed CD-ROM
Used in VCD technology

MPEG-2

Much higher bandwidth: 3 MB/s
DVD technology

MPEG-4

Very different from previous generations
Aimed at low-bandwidth applications; at the upper end, good enough for digital TV
Digital camcorders

How does it work then?

MPEG-4 II - Meshes

2-D animated meshes

Textures mapped onto meshes
Store vertices of mesh and movement parameters

MPEG-4 III - Sprites

MPEG-4 is object based: state of the art

Panoramic images: massive compression ratios, 1000:1

4. Audio Compression

Techniques from image compression can be used


e.g. DCT, with Huffman encoding of the output

MP3: huge! How does it work?

MP3 - MPEG-1 Layer 3

1. Minimal Audition Threshold

Don't store anything below the threshold of audibility; the ear is most sensitive at around 2-5 kHz

2. Masking Effect

Uses a psychoacoustic model of the ear
Don't store quiet sounds that occur simultaneously with loud ones

MP3 II

3. Joint Stereo (JS) coding

1. Intensity Stereo (IS)

Ear unable to locate some frequencies, e.g. bass
Store signal in mono + minimum for spatialization

2. Mid/Side (MS) Stereo

Used if left and right speakers are similar
Store middle (L+R) plus a side speaker (L or R), e.g.

             L      R
Raw:         10     5
Store:       (L+R)  5
Decompress:  10     5

Fewer bits
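The store and decompress steps can be sketched in a few lines. The first pair of functions follows the slide's description (middle = L+R, side = one speaker). The second pair is the sum/difference form often used to describe mid/side coding, included here as an assumption for comparison; all names are illustrative.

```python
def ms_encode(left, right):
    """Slide's scheme: store the middle (L+R) and one side channel (R)."""
    return left + right, right


def ms_decode(middle, side):
    """Recover L by subtracting the stored side channel from the middle."""
    return middle - side, side


def sum_diff_encode(left, right):
    """Assumed variant: store the sum and the difference of the channels."""
    return left + right, left - right


def sum_diff_decode(total, diff):
    """total + diff = 2L and total - diff = 2R, so both divisions are exact."""
    return (total + diff) // 2, (total - diff) // 2


stored = ms_encode(10, 5)
restored = ms_decode(*stored)
print(stored, restored)
```

In the sum/difference variant the difference is close to zero when the two channels are similar, so it needs few bits.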

MP3 III - schematic

Summary
Technique              Compression Ratio   When?
Huffman                1.5-2:1             1952
RLE                    4-10:1              1966
LZW                    2-10:1              1977 & 84
Quadtree               2:1                 1980
VQ                     10:1                1984
Directional Filtering  10-40:1             1985
Fractals               10-1000:1           1988
MPEG-1                 10-100:1            1993
Surface Methods        10-50:1             1995
MPEG-2                 10-200:1            1995
MPEG-4                 10-500:1            1999

Q&A

Anyone still awake?

Did that make sense?


Questions

The End!

Contact Me!

Vinay Kumar - NIC, DIT, MOCIT, GOI


vinay.kumar@nic.in
