
Implementation of an MPEG accelerator

Why video compression?


Television services are broadcast at a frame rate of 25 Hz in Europe or 30 Hz in the US. Each
frame has a resolution of 720x576 pixels with a color depth of 3 bytes per pixel (RGB). A simple
calculation shows that a bandwidth of about 248.8 Mbit/s is needed to transmit such frames at a
rate of 25 Hz.
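
As a quick check of that figure, the arithmetic as a short Python snippet (purely illustrative):

# Uncompressed bandwidth for 720x576 RGB frames at 25 Hz
width, height = 720, 576        # pixels per frame
bytes_per_pixel = 3             # one byte each for R, G and B
frame_rate = 25                 # frames per second

bits_per_second = width * height * bytes_per_pixel * 8 * frame_rate
print(bits_per_second / 1e6)    # ~248.8 Mbit/s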

This bandwidth is very high and cannot be achieved over either wired or wireless links. Delivering
video or audio information through the currently available media without compression is therefore
practically impossible.

RGB to YUV:

The RGB values are first converted to YUV (luma and chroma). The human eye is more sensitive to
brightness than to color information, and compression takes advantage of this property. To reduce
the resolution of the color information, the chroma planes are subsampled using either the 4:2:0
or the 4:2:2 subsampling pattern.
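
The sketch below illustrates this step, assuming NumPy, BT.601 conversion coefficients and an image with even dimensions; the function name rgb_to_yuv420 and the full-range scaling are illustrative choices, not something specified in this report:

import numpy as np

def rgb_to_yuv420(rgb):
    # Convert an HxWx3 uint8 RGB image to Y, Cb, Cr planes with 4:2:0 subsampling
    r = rgb[..., 0].astype(np.float32)
    g = rgb[..., 1].astype(np.float32)
    b = rgb[..., 2].astype(np.float32)

    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128.0

    # 4:2:0 -- average each 2x2 block of chroma samples
    cb = cb.reshape(cb.shape[0] // 2, 2, cb.shape[1] // 2, 2).mean(axis=(1, 3))
    cr = cr.reshape(cr.shape[0] // 2, 2, cr.shape[1] // 2, 2).mean(axis=(1, 3))

    return (np.clip(y, 0, 255).astype(np.uint8),
            np.clip(cb, 0, 255).astype(np.uint8),
            np.clip(cr, 0, 255).astype(np.uint8))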

DCT and Quantization:

A discrete cosine transform (DCT) is performed on 8x8 image blocks to convert the image to the
frequency domain. The DCT does not by itself reduce the number of bits required to represent the
image, but the energy is concentrated in the low-frequency coefficients, while the high-frequency
coefficients are near zero. The near-zero coefficients are discarded after quantization, which is
the most lossy part of the compression process. Only the remaining non-zero coefficients are used
to reconstruct the image.
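
A minimal sketch of this step is shown below, assuming an orthonormal 8x8 DCT and, purely for illustration, the well-known JPEG luminance quantization table (MPEG-2 defines its own default intra matrix):

import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

# Illustrative quantization table (JPEG luminance table, not the MPEG default matrix)
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99]], dtype=np.float32)

def dct_quantize(block, q=Q):
    # 2-D DCT of an 8x8 block (centred around zero), then divide by the
    # quantization table and round; most high-frequency results become zero
    C = dct_matrix()
    coeffs = C @ (block.astype(np.float32) - 128.0) @ C.T
    return np.round(coeffs / q).astype(np.int32)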

Motion-compensated inter-frame prediction (MCP):

Two types of information redundancy exist in a video stream: spatial redundancy within a single
frame and temporal redundancy between frames. The latter is removed by MCP.

MCP involves block matching and error prediction. In block matching, the encoder searches the
previous frame for the block that best matches the current block. It is worth noting that block
matching is the most time-consuming part of the encoding. The best matching block is subtracted
from the current block to produce the error matrix, which is encoded along with the motion vector.
The motion vector is the offset between the position of the current block and that of the matching
block.
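
Below is a minimal sketch of an exhaustive (full-search) block matcher; the sum-of-absolute-differences criterion, the +/-8 pixel search window and the function name are assumptions for illustration, since the report does not fix them:

import numpy as np

def block_match(cur_block, ref_frame, top, left, search=8):
    # Search the reference frame within +/- `search` pixels of the current
    # block position for the candidate with the smallest sum of absolute
    # differences (SAD); return the motion vector and the prediction error
    n = cur_block.shape[0]
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > ref_frame.shape[0] or x + n > ref_frame.shape[1]:
                continue
            cand = ref_frame[y:y + n, x:x + n].astype(np.int32)
            sad = np.abs(cur_block.astype(np.int32) - cand).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    dy, dx = best_mv
    best = ref_frame[top + dy:top + dy + n, left + dx:left + dx + n].astype(np.int32)
    error = cur_block.astype(np.int32) - best
    return best_mv, error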

Variable length coding (VLC):

This step compresses the remaining spatial redundancy. VLC consists of three stages: zigzag
scanning, run-length encoding (RLE) and Huffman coding. Zigzag scanning is applied to the
quantized matrix to group the high-frequency components together. RLE then codes the data stream
produced by the zigzag scan: the coefficients are coded as a run length (number of occurrences)
and a level (amplitude). For instance, instead of sending [4, 4, 4, 4, 4], which takes 5 bytes,
the run can be coded as [number of occurrences = 5, level = 4], which takes only two bytes.
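
A minimal sketch of the zigzag scan and the run/level coding described above (real MPEG VLC codes runs of zeros before each non-zero level and then Huffman-codes the pairs; this simplified version only codes repeated values):

import numpy as np

def zigzag(block):
    # Reorder an 8x8 matrix along its anti-diagonals so the (mostly zero)
    # high-frequency coefficients end up together at the tail of the sequence
    h, w = block.shape
    order = sorted(((i, j) for i in range(h) for j in range(w)),
                   key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [int(block[i, j]) for i, j in order]

def run_length_encode(seq):
    # Code each run of identical values as a (count, value) pair,
    # e.g. [4, 4, 4, 4, 4] -> [(5, 4)]
    pairs, i = [], 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1
        pairs.append((j - i, seq[i]))
        i = j
    return pairs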

Figure 1 MPEG encoder

Decompression
During decompression the coded data is decoded back into the original video frames. The decoder
performs the inverse of every step in the encoder. The input bitstream is fed to the variable
length decoder (VLD), which performs variable-length decoding, inverse zigzag scanning, run-length
expansion and dequantization. The VLD is the most time-consuming function of the video decoder.
In the next step the IDCT is performed on the dequantized coefficients. The output of the IDCT is
either an 8x8 image block or a prediction error; in the latter case the 8x8 image block is
reconstructed from the prediction error together with the motion vector data. The last step of
decompression is converting YUV back to RGB, the inverse of the color conversion in the encoder.
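
A minimal sketch of the dequantization and IDCT stage, assuming the orthonormal DCT basis C and quantization table q from the encoder sketch above (since the DCT matrix is orthonormal, its inverse is simply its transpose):

import numpy as np

def dequantize_idct(qcoeffs, q, C):
    # Multiply the quantized coefficients back by the quantization table,
    # apply the inverse 2-D DCT and undo the level shift used at the encoder
    coeffs = qcoeffs.astype(np.float32) * q
    block = C.T @ coeffs @ C + 128.0
    return np.clip(np.round(block), 0, 255).astype(np.uint8)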

Figure 2 MPEG decoder

