Overview of Multimedia Compression Technologies: Vinay Kumar

Presentation Overview

Introduction
Data Compression
Image Compression
Video Compression
Audio Compression
Summary
Q&A
What is Compression and Why?

A compression algorithm takes x bytes as input and produces y bytes as output, where y < x
Compression ratio = (number of bytes in the original image) / (number of bytes in the compressed image)
e.g. input image = 10 MB, output image = 5 MB, compression ratio = 2:1
Why? Transmission and storage
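The ratio above can be checked on real data. This is a minimal sketch, assuming Python's standard zlib module as the compressor; the helper name compression_ratio and the sample data are illustrative only.

```python
import zlib

def compression_ratio(original_bytes, compressed_bytes):
    """Original size divided by compressed size (2.0 means 2:1)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes

# Highly repetitive sample data compresses well.
data = b"multimedia " * 1000
packed = zlib.compress(data)
ratio = compression_ratio(len(data), len(packed))
print(f"{len(data)} bytes -> {len(packed)} bytes, ratio {ratio:.1f}:1")
```

The slide's own example (10 MB in, 5 MB out) gives compression_ratio(10_000_000, 5_000_000) = 2.0, i.e. 2:1.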
Common Terminology

Lossless Compression: perfect reconstruction
Lossy Compression: data is lost
Spatial Compression: 2D or single image
Temporal Compression: 3D or video
Codec: Compression / Decompression
Colour / intensity: same thing
1. Data Compression
Oldest form of compression: 1940s
Still important: used to encode output from modern codecs
Lossless
Huffman Coding
Symbol  Frequency
a       19
b       10
c       8
d       8
e       5
1. Show symbols as leaf nodes, in order of decreasing frequency: a b c d e
2. Combine nodes with lowest frequencies: d (8) + e (5) = 13
Huffman Coding II
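[Figure: Huffman tree. The root (50) splits into a on branch 0 and node 31 on branch 1; node 31 splits into node 13 (d + e) on branch 0 and node 18 (b + c) on branch 1.]

Symbol  Huffman Code
a       0
b       111
c       110
d       101
e       100

Fewer bits per symbol: compression achieved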
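The two steps (leaf nodes, then repeatedly combining the two lowest frequencies) can be sketched in Python. This is an illustrative sketch, not the presenter's code: the function name huffman_codes is made up, and ties between equal frequencies may be broken differently from the slide, so individual bit patterns can differ while code lengths stay the same.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Return {symbol: bitstring} for a {symbol: frequency} mapping."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    # Combine the two lowest-frequency nodes until one tree remains.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes

freqs = {"a": 19, "b": 10, "c": 8, "d": 8, "e": 5}
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
print(codes, total_bits)  # 112 bits in total, against 150 for a fixed 3-bit code
```

With these frequencies, a gets a 1-bit code and the other four symbols get 3-bit codes, which matches the code lengths in the slide's table.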
LZ77 - PKZIP

Replaces a repeated stream with a symbol
Search is finite: windowed (look-ahead window and text window)
e.g. in the text "anbfcatkdfjs lcatjfl", the repeated symbols "cat" get the dictionary code 01
Need to store and transmit the codebook: inefficient
LZW needs no codebook: it is generated by both Tx and Rx
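The point that LZW needs no transmitted codebook can be illustrated with a small sketch. It assumes text whose characters all have code points below 256, and the function names lzw_encode and lzw_decode are illustrative. Both sides start from the same 256 single-character entries and grow identical tables as they go.

```python
def lzw_encode(text):
    """Transmitter side: emit integer codes, building the table on the fly."""
    table = {chr(i): i for i in range(256)}
    current, out = "", []
    for ch in text:
        if current + ch in table:
            current += ch
        else:
            out.append(table[current])
            table[current + ch] = len(table)  # new entry, never transmitted
            current = ch
    if current:
        out.append(table[current])
    return out

def lzw_decode(codes):
    """Receiver side: rebuild the same table from the code stream alone."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = prev + prev[0]  # code refers to the entry being built
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = prev + entry[0]
        prev = entry
    return "".join(out)

message = "anbfcatkdfjs lcatjfl"
assert lzw_decode(lzw_encode(message)) == message
```

On a short string like the slide's example there is little to gain; the benefit shows on longer inputs with many repeats, for example "abababab" encodes to 5 codes instead of 8 characters.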
2. Image Compression
Can break free from stream compression associated with data compression
Intra-frame coding
Run-length Encoding

Data compression method
Replaces repetitive stream data with a tuple in the format (symbol, count)
aaaaazz encoded as (a,5) (z,2)
Good at horizontal runs
Still 1D
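A minimal sketch of the (symbol, count) scheme, using itertools.groupby from the Python standard library; the function names are illustrative.

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical symbols into a (symbol, count) tuple."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand (symbol, count) tuples back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)

encoded = rle_encode("aaaaazz")
print(encoded)  # [('a', 5), ('z', 2)]
assert rle_decode(encoded) == "aaaaazz"
```

Data without runs gets larger under this scheme: "abc" becomes three tuples, which is why RLE suits images with long horizontal runs.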
Vector Quantization
2D based
[Figure: a source block is compared with the codebook entries [i]; the stored value n is the index of the matched entry. Label: average intensity.]
Loads of enhancements
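The codebook lookup can be sketched as a nearest-match search. This assumes blocks are flattened into tuples of pixel intensities and that "matched" means smallest squared error; the slide specifies neither, and the names and sample codebook are illustrative.

```python
def nearest_codeword(block, codebook):
    """Index n of the codebook entry closest to the block (squared error)."""
    def distortion(entry):
        return sum((b - e) ** 2 for b, e in zip(block, entry))
    return min(range(len(codebook)), key=lambda n: distortion(codebook[n]))

def vq_encode(blocks, codebook):
    """Replace every block by the index of its matched entry."""
    return [nearest_codeword(block, codebook) for block in blocks]

def vq_decode(indices, codebook):
    """Look each index up again; the result is an approximation (lossy)."""
    return [codebook[n] for n in indices]

# Hypothetical codebook of 2x2 blocks, flattened to 4 intensities each.
codebook = [(0, 0, 0, 0), (255, 255, 255, 255), (0, 255, 0, 255)]
blocks = [(10, 5, 0, 3), (250, 240, 255, 251), (3, 250, 9, 240)]
indices = vq_encode(blocks, codebook)
print(indices)  # [0, 1, 2]
```

Each 4-pixel block is stored as one small index, so the saving depends on the block size and the number of codebook entries.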
Quadtree Encoding
Look for blocks of similar colour - recursive

[Figure: (a) image and (b) its quadtree. Legend: leaf node, pixel intensity 1, pixel intensity 0. Node values as extracted: -1 0 1 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 0 1]
Good for few colours
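The recursive search for uniform blocks can be sketched as follows. This assumes a square image whose side is a power of two, stored as a list of rows; a uniform block becomes a leaf holding its intensity and any other block becomes a list of four sub-quadrants. The representation is an illustrative choice, not the encoding shown on the slide.

```python
def quadtree(image, x=0, y=0, size=None):
    """Return a leaf value for a uniform block, else four child quadrants."""
    if size is None:
        size = len(image)
    first = image[y][x]
    uniform = all(image[y + dy][x + dx] == first
                  for dy in range(size) for dx in range(size))
    if uniform:
        return first  # leaf node: one intensity covers the whole block
    half = size // 2
    return [
        quadtree(image, x, y, half),                # top-left
        quadtree(image, x + half, y, half),         # top-right
        quadtree(image, x, y + half, half),         # bottom-left
        quadtree(image, x + half, y + half, half),  # bottom-right
    ]

image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
print(quadtree(image))  # [1, 0, 0, [0, 0, 0, 1]]
```

Three of the four quadrants collapse to a single leaf, which is why the method works well on images with few colours and large flat regions.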
GIF - PNG

GIF uses LZW compression: patent problems
A little knowledge is useful for web designers
PNG replaces GIF: no patent problems
PNG uses the older (but free) LZ77 algorithm
Pyramidal Techniques
Based on averaging pixels

[Figure: two pyramids with levels pel[i] from i = 0 to i = 3, shown between the original image and the reconstructed image.]
The image is reconstructed by summing the beta pyramid
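The slide gives little detail, so the following is only a sketch under stated assumptions: a 1-D signal stands in for the image, one pyramid is built by averaging neighbouring pairs, and the "beta pyramid" is assumed to hold the differences between each level and the coarser level below it. The function names are illustrative.

```python
def build_pyramids(signal, levels):
    """Return (coarsest level, list of difference levels), finest first."""
    means = [list(signal)]
    for _ in range(levels):
        prev = means[-1]
        # Each coarser level averages neighbouring pairs of the finer one.
        means.append([(prev[i] + prev[i + 1]) / 2
                      for i in range(0, len(prev), 2)])
    diffs = []
    for fine, coarse in zip(means, means[1:]):
        # What the fine level adds on top of the upsampled coarse level.
        diffs.append([fine[i] - coarse[i // 2] for i in range(len(fine))])
    return means[-1], diffs

def reconstruct(top, diffs):
    """Sum the difference levels back onto the coarsest level."""
    current = top
    for diff in reversed(diffs):
        current = [current[i // 2] + d for i, d in enumerate(diff)]
    return current

signal = [1, 3, 5, 7, 2, 2, 8, 0]   # length must be divisible by 2**levels
top, diffs = build_pyramids(signal, levels=3)
restored = reconstruct(top, diffs)
```

With levels=3 there are four levels in total (i = 0 to i = 3, as in the figure). Compression in such a scheme would come from coarsely quantizing the difference levels, which this sketch does not attempt.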
Surface Approximation

View single image as 3D
Uses parametric equations
[Figure: two 3D surface plots, vertical axis from 0 to 1, horizontal axes from -2 to 2.]
Recursive for complex images
Directional Filtering
Second-generation technique based on the human visual system (HVS)
[Figure: panels (a) to (d), the components of a filtered image.]
Reconstruct image by adding all components
JPEG - the theory
Uses Discrete Cosine Transform (DCT)
More waves can be added for more accuracy
JPEG II - in practice
[Figure: source image -> DCT -> DC component and AC components -> IDCT.]
DC component: represents the average
AC components: usually 0
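A 1-D sketch of the idea, using only the standard math module. The scaling is chosen here so that the DC coefficient equals the plain average of the samples; JPEG itself applies a 2-D DCT to 8x8 blocks with a different normalisation, so treat the function names and scaling as illustrative.

```python
import math

def dct(samples):
    """DCT-II, scaled so that coefficient 0 (DC) is the sample average."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        total = sum(x * math.cos(math.pi / n * (i + 0.5) * k)
                    for i, x in enumerate(samples))
        coeffs.append(total / n if k == 0 else 2 * total / n)
    return coeffs

def idct(coeffs, keep=None):
    """Inverse transform; keep=m sums only the first m cosine waves."""
    n = len(coeffs)
    used = coeffs if keep is None else coeffs[:keep]
    return [sum(c * math.cos(math.pi / n * (i + 0.5) * k)
                for k, c in enumerate(used))
            for i in range(n)]

samples = [52, 55, 61, 66, 70, 61, 64, 73]
coeffs = dct(samples)
dc_only = idct(coeffs, keep=1)   # every sample becomes the average, 62.75
exact = idct(coeffs)             # all waves: the original samples return
```

Raising keep from 1 towards len(coeffs) adds one cosine wave at a time, which is the "more waves for more accuracy" point of the theory slide.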
Fractals
Can a block be found similar to the Domain block?
[Figure: range blocks, original image, domain blocks.]
If so, it's called a Range block
Only store Range blocks plus affine transformations
Expensive!
3. Image Sequence Compression
Uses Inter-frame encoding
Also known as image sequence or temporal coding
MPEG Process

How does it work?
1. Subsample
R G B -> Y U V (Y = luminance)
Reduces data by around 50%
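The 50% figure can be reproduced with a short calculation. This sketch assumes the common BT.601 conversion weights and a layout in which U and V are kept at half resolution in both directions (often written 4:2:0); the slide names neither, and the function names are illustrative.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 weights (assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue colour difference
    v = 0.877 * (r - y)                     # red colour difference
    return y, u, v

def samples_per_frame(width, height, chroma_step=1):
    """Stored samples when U and V keep every chroma_step-th pixel per axis."""
    luma = width * height
    chroma = 2 * (width // chroma_step) * (height // chroma_step)
    return luma + chroma

full = samples_per_frame(16, 16)                  # 768 samples
sub = samples_per_frame(16, 16, chroma_step=2)    # 384 samples
print(f"{sub / full:.0%} of the original data")   # 50%
```

The luminance channel stays at full resolution because the eye is more sensitive to brightness detail than to colour detail.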
MPEG Process II

2. Motion Detection: on luminance block only
3 types of frame:
I frames: intra-coded
P frames: prediction from previous frame
B frames: use bi-directional prediction

Frame:  1  2  3  4  5  6  7  8
Type:   I  B  B  P  B  P  B  I
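The slide does not say how motion is detected. One common approach is block matching: for a luminance block in the current frame, search a small window in the previous frame for the position with the lowest sum of absolute differences (SAD). The sketch below uses an exhaustive search; the function names and the search range are illustrative assumptions.

```python
def sad(cur, cx, cy, prev, px, py, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(cur[cy + j][cx + i] - prev[py + j][px + i])
               for j in range(size) for i in range(size))

def best_motion_vector(prev, cur, x, y, size, search):
    """Return (dx, dy, cost) of the best match for the block at (x, y)."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            # Skip candidates that fall outside the previous frame.
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(cur, x, y, prev, px, py, size)
                if best is None or cost < best[2]:
                    best = (dx, dy, cost)
    return best

# A 2x2 pattern sits at column 1 in the previous frame and column 2 now.
prev = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
prev[1][1], prev[1][2], prev[2][1], prev[2][2] = 1, 2, 3, 4
cur[1][2], cur[1][3], cur[2][2], cur[2][3] = 1, 2, 3, 4
vector = best_motion_vector(prev, cur, x=2, y=1, size=2, search=2)
print(vector)  # (-1, 0, 0)
```

A P frame can then store the vector and the (small) remaining difference instead of the block itself.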
MPEG Process III - Schematic

Image Sequences -> RGB to YUV -> Motion Estimation -> DCT -> Quantize -> RLE -> VLC -> Output Buffer
DCT: no compression achieved at this stage
MPEG-1 & MPEG-2
MPEG-1

Designed for video playback at 150 KB/s: single-speed CD-ROM
Used in VCD technology
MPEG-2
Much higher bandwidth: 3 MB/s
DVD technology
MPEG-4

Very different from previous generations
Aimed at low-bandwidth applications; at the upper end, good enough for digital TV
Digital camcorders
How does it work then?
MPEG-4 II - Meshes
2-D animated meshes
Textures mapped onto meshes
Store vertices of mesh and movement parameters
MPEG-4 III - Sprites
MPEG-4 is object based: state of the art
Panoramic images: massive compression ratios, 1000:1
4. Audio Compression
Techniques from image compression can be used

Huffman encodes the DCT output
MP3: huge! How does it work?
MP3 = MPEG-1 Layer 3
1. Minimal Audition Threshold
Don't store anything under 5 kHz
2. Masking Effect

Uses a psychoacoustic model of the ear
Don't store quiet and loud noises simultaneously: the loud one masks the quiet one
MP3 II
3. Joint Stereo (JS) coding
1. Intensity Stereo (IS)
Ear unable to locate some frequencies, e.g. bass
Store signal in mono + minimum for spatialization
2. Mid/Side (MS) Stereo
Used if left and right speakers are similar
Store middle (L+R) plus a side speaker (L or R), e.g.

     Raw   Store   Decompress
L    10            10
R    5     5       5

Fewer bits
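A sketch of mid/side coding in its conventional sum/difference form, which is slightly different from the slide's wording (the slide stores the middle plus one original channel; here the side channel is the difference L - R). Integer samples are assumed and the function names are illustrative.

```python
def ms_encode(left, right):
    """Store the middle (sum) and the side (difference) of the two channels."""
    return left + right, left - right

def ms_decode(mid, side):
    """Recover left and right exactly: mid + side = 2L and mid - side = 2R."""
    return (mid + side) // 2, (mid - side) // 2

mid, side = ms_encode(10, 5)
print(mid, side)             # 15 5
print(ms_decode(mid, side))  # (10, 5)

# When the channels are similar the side channel is small.
print(ms_encode(100, 98))    # (198, 2)
```

The saving comes from the side channel: for similar left and right signals its values stay close to zero and need fewer bits after entropy coding.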
MP3 III - schematic
Summary
Technique              Compression Ratio   When?
Huffman                1.5-2:1             1952
RLE                    4-10:1              1966
LZW                    2-10:1              1977 & 84
Quadtree               2:1                 1980
VQ                     10:1                1984
Directional Filtering  10-40:1             1985
Fractals               10-1000:1           1988
MPEG-1                 10-100:1            1993
Surface Methods        10-50:1             1995
MPEG-2                 10-200:1            1995
MPEG-4                 10-500:1            1999
Q&A
Anyone still awake?
Did that make sense?

Questions
The End!
Contact Me!
Vinay Kumar - NIC, DIT, MOCIT, GOI

vinay.kumar@nic.in
