
Chapter 1 Introduction

Computers are becoming more and more powerful day by day, and as a result the use of digital images is increasing rapidly. Along with this increasing use comes the serious issue of storing and transferring the huge volume of data representing the images, because uncompressed multimedia (graphics, audio and video) data requires considerable storage capacity and transmission bandwidth. Despite rapid progress in mass-storage density, processor speed and the performance of digital communication systems, the demand for data storage capacity and data transmission bandwidth continues to exceed the capabilities of existing technologies. Besides, the recent growth of data-intensive, multimedia-based web applications has put much pressure on researchers to find ways of using images in web applications more effectively. Internet teleconferencing, High Definition Television (HDTV), satellite communications and digital storage of movies are not feasible without a high degree of compression. As it is, such applications are far from realizing their full potential, largely due to the limitations of common image compression techniques. An image is a largely redundant form of data, i.e. from a certain point of view it contains the same information repeated many times. By using data compression techniques, it is possible to remove some of the redundant information contained in images. Image compression minimizes the size in bytes of a graphics file without degrading the quality of the image to an unacceptable level. The reduction in file size allows more images to be stored in a given amount of disk or memory space. It also reduces the time needed for images to be sent over the Internet or downloaded from web pages. Wavelets are functions which allow the analysis of signals or images according to scale or resolution. The processing of signals by wavelet algorithms in fact works much the same way the human eye does, or the way a digital camera processes visual scales of resolution and intermediate details. The same principle also applies to cell phone signals and even digitized colour images.

Wavelets are of real use in these areas, for example in approximating data with sharp discontinuities such as choppy signals, or pictures with lots of edges. While wavelet theory is perhaps a chapter in function theory, the algorithms that result are key to the processing of digitized information: signals, time series, still images, movies and colour images. The Haar Transform is memory efficient, exactly reversible without edge effects, fast and simple; as such, the Haar Transform technique is widely used in wavelet analysis. The Fast Haar Transform (FHT) is one of the algorithms which can reduce the tedious work of calculation, and one of the earliest versions of the FHT is included in the HT. The FHT involves only addition, subtraction and division by 2, and finds application in atmospheric turbulence analysis, image analysis, and signal and image compression. In the Modified Fast Haar Wavelet Transform (MFHWT), the one-dimensional FHT approach is used to find the N/2 detail coefficients at each level for a signal of length N. This project uses the same concept of finding averages and differences, but extends that approach to 2-D images, with the additional step of taking the detail coefficients to be 0 for N/2 elements at each level. The Haar Transform and the Fast Haar Transform are explained, and the Modified Fast Haar Wavelet Transform is presented together with the proposed algorithm for 2-D images.
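To make the averaging-and-differencing idea concrete, the following MATLAB sketch (illustrative only; the function name haar_level and the test data are assumptions, not the project's final code) computes one level of the 1-D Haar decomposition for every row of a matrix whose width is even.

% One level of row-wise Haar averaging and differencing (sketch).
function [approx, detail] = haar_level(X)
    odd    = X(:, 1:2:end);        % samples x(1), x(3), ...
    even   = X(:, 2:2:end);        % samples x(2), x(4), ...
    approx = (odd + even) / 2;     % averages (approximation coefficients)
    detail = (odd - even) / 2;     % differences (detail coefficients)
end

Applying the same step first to the rows and then to the columns of the resulting approximation extends the idea to 2-D images; in the MFHWT variant considered here, the detail coefficients for N/2 elements at each level are additionally set to zero.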

Chapter 2 Introduction to Digital Image processing


2.1 Introduction:
A digital image is a collection of pixels laid out in a specific order, with a width (x) and height (y) in pixels. Each pixel has a numerical value, which corresponds to a color or gray-scale value. A pixel has no absolute size, and pixels may (though not always) carry a spatial value (spatial data is data associated with the pixels that provides information about the size of the objects in the image).

Fig: 2.1 Representation of a digital image in x and y pixel format.

An image may be defined as a two-dimensional function, f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image. The field of digital image processing refers to processing digital images by means of a digital computer. Note that a digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements and pixels. Pixel is the term most widely used to denote the elements of a digital image. Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma to radio waves. They can also operate on images generated by sources that humans are not accustomed to associating with images. These include ultrasound, electron microscopy, and computer-generated images. Thus, digital image processing
encompasses a wide and varied field of applications. There is no general agreement among authors regarding where image processing stops and other related areas, such as image analysis and computer vision, start. Sometimes a distinction is made by defining image processing as a discipline in which both the input and output of a process are images. We believe this to be a limiting and somewhat artificial boundary. For example, under this definition, even the trivial task of computing the average intensity of an image would not be considered an image processing operation. On the other hand, there are fields such as computer vision whose ultimate goal is to use computers to emulate human vision, including learning and being able to make inferences and take actions based on visual inputs. This area itself is a branch of artificial intelligence (AI), whose objective is to emulate human intelligence. The field of AI is in its earliest stages of infancy in terms of development, with progress having been much slower than originally anticipated. The area of image analysis (also called image understanding) is in between image processing and computer vision. There are no clear-cut boundaries in the continuum from image processing at one end to computer vision at the other. However, one useful paradigm is to consider three types of computerized processes in this continuum: low-, mid-, and high-level processes. Low-level processes involve primitive operations such as image pre-processing to reduce noise, contrast enhancement, and image sharpening. A low-level process is characterized by the fact that both its inputs and outputs are images. Mid-level processes on images involve tasks such as segmentation (partitioning an image into regions or objects), description of those objects to reduce them to a form suitable for computer processing, and classification (recognition) of individual objects. A mid-level process is characterized by the fact that its inputs generally are images, but its outputs are attributes extracted from those images (e.g., edges, contours, and the identity of individual objects). Finally, higher-level processing involves making sense of an ensemble of recognized objects, as in image analysis, and, at the far end of the continuum, performing the cognitive functions normally associated with human vision. Based on the preceding comments, we see that a logical place of overlap between image processing and image analysis is the area of recognition of individual regions or objects in an image. Thus, what we here call digital image processing encompasses processes whose inputs and outputs are images and, in addition, encompasses processes that extract attributes from images, up to and including the recognition of individual objects. As a simple illustration to
clarify these concepts, consider the area of automated analysis of text. The processes of acquiring an image of the area containing the text, pre-processing that image, extracting (segmenting) the individual characters, describing the characters in a form suitable for computer processing, and recognizing those individual characters are in the scope of what we here call digital image processing. Making sense of the content of the page may be viewed as being in the domain of image analysis and even computer vision, depending on the level of complexity implied by the phrase 'making sense'. Digital image processing, as we have defined it, is used successfully in a broad range of areas of exceptional social and economic value.

2.2 Digital Image Characteristics:


Pixel - An abbreviation of the term 'picture element'. A pixel is the smallest picture element of a digital image. A monochrome pixel can have two values, black or white (0 or 1). Color and gray scale require more bits; true color, displaying approximately 16.7 million colors, requires 24 bits for each pixel. A pixel may hold more data than the eye can perceive at one time.
Dot - The smallest unit that a printer can print.
Voxel - An abbreviation of the term 'volume element'. The smallest distinguishable box-shaped part of a three-dimensional space. A particular voxel is identified by the x, y and z coordinates of one of its eight corners, or perhaps its centre. The term is used in three-dimensional modeling. Voxels need not have uniform dimensions in all three coordinate planes.
To the human observer, the internal structures and functions of the human body are not generally visible. However, by various technologies, images can be created through which the medical professional can look into the body to diagnose abnormal conditions and guide therapeutic procedures. The medical image is a window to the body. No image window reveals everything; different medical imaging methods reveal different characteristics of the human body. The medical imaging process has five major components: the patient, the imaging system, the system operator, the image itself, and the observer. The objective is to make an object or condition within the patient's body visible to the observer. The visibility of specific anatomical features depends on the characteristics of the imaging system and the manner in which it is operated. Most medical imaging systems have a considerable number of variables that must be selected by the operator. They can be changeable system components, such as intensifying screens in radiography, transducers in sonography, or coils in
magnetic resonance imaging (MRI). However, most variables are adjustable physical quantities associated with the imaging process, such as kilovoltage in radiography, gain in sonography, and echo time (TE) in MRI. The values selected will determine the quality of the image and the visibility of specific body features.

2.2.1 Image Quality:


The quality of a medical image is determined by the imaging method, the characteristics of the equipment, and the imaging variables selected by the operator. Image quality is not a single factor but a composite of at least five factors: contrast, blur, noise, artefacts, and distortion. The human body contains many structures and objects that are imaged simultaneously by most imaging methods, but we often consider a single object in relation to its immediate background. In fact, with most imaging procedures the visibility of an object is determined by this relationship rather than by the overall characteristics of the total image. The task of every imaging system is to translate a specific tissue characteristic into image shades of gray or colour. If contrast is adequate, the object will be visible. The degree of contrast in the image depends on characteristics of both the object and the imaging system.

2.2.2 Image Contrast:


Contrast means difference. In an image, contrast can be in the form of different shades of gray, light intensities, or colors. Contrast is the most fundamental characteristic of an image. An object within the body will be visible in an image only if it has sufficient physical contrast relative to surrounding tissue. However, image contrast much beyond that required for good object visibility generally serves no useful purpose and in many cases is undesirable. The physical contrast of an object must represent a difference in one or more tissue characteristics. For example, in radiography, objects can be imaged relative to their surrounding tissue if there is an adequate difference in either density or atomic number and if the object is sufficiently thick. When a value is assigned to contrast, it refers to the difference between two specific points or areas in an image. In most cases we are interested in the contrast between a specific structure or object in the image and the area around it or its background.

2.2.3 Contrast Sensitivity:


The degree of physical object contrast required for an object to be visible in an image depends on the imaging method and the characteristics of the imaging system. The primary characteristic of an imaging system that establishes the relationship between image contrast and object contrast is its contrast sensitivity. Consider the situation shown below. The circular objects are the same size but are filled with different concentrations of iodine contrast medium; that is, they have different levels of object contrast. When the imaging system has a relatively low contrast sensitivity, only objects with a high concentration of iodine (i.e., high object contrast) will be visible in the image. If the imaging system has a high contrast sensitivity, the lower-contrast objects will also be visible. It should be emphasized that contrast sensitivity is a characteristic of the imaging method and the variables of the particular imaging system. It is the characteristic that relates to the system's ability to translate physical object contrast into image contrast. The contrast transfer characteristic of an imaging system can be considered from two perspectives. From the perspective of adequate image contrast for object visibility, an increase in system contrast sensitivity causes lower-contrast objects to become visible. However, if we consider an object with a fixed degree of physical contrast (i.e., a fixed concentration of contrast medium), then increasing contrast sensitivity will increase image contrast. It is difficult to compare the contrast sensitivity of various imaging methods because many are based on different tissue characteristics. However, certain methods do have higher contrast sensitivity than others. For example, computed tomography (CT) generally has a higher contrast sensitivity than conventional radiography. This is demonstrated by the ability of CT to image soft tissue objects (masses) that cannot be imaged with radiography. Consider the image below: a series of objects with different degrees of physical contrast. They could be vessels filled with different concentrations of contrast medium, with the highest concentration (and contrast) at the bottom. Now imagine a curtain coming down from the top and covering some of the objects so that they are no longer visible. Contrast sensitivity is the characteristic of the
imaging system that raises and lowers the curtain. Increasing sensitivity raises the curtain and allows us to see more objects in the body. A system with low contrast sensitivity allows us to visualize only objects with relatively high inherent physical contrast.

2.2.4 Blur and Visibility of Detail:


Structures and objects in the body vary not only in physical contrast but also in size. Objects range from large organs and bones to small structural features such as trabecular patterns and small calcifications. It is the small anatomical features that add detail to a medical image. Each imaging method has a limit as to the smallest object that can be imaged and thus on visibility of detail. Visibility of detail is limited because all imaging methods introduce blurring into the process. The primary effect of image blur is to reduce the contrast and visibility of small objects or detail. Consider the image below, which represents the various objects in the body in terms of both physical contrast and size. As we said, the boundary between visible and invisible objects is determined by the contrast sensitivity of the imaging system. We now extend the idea of our curtain to include the effect of blur. Blur has little effect on the visibility of large objects, but it reduces the contrast and visibility of small objects. When blur is present, and it always is, our curtain of invisibility covers small objects and image detail.

2.2.5 Noise:
Another characteristic of all medical images is image noise. Image noise, sometimes referred to as image mottle, gives an image a textured or grainy appearance. The source and amount of image noise depend on the imaging method and are discussed in more detail in a later chapter. We now briefly consider the effect of image noise on visibility. In the image below we find our familiar array of body objects arranged according to physical contrast and size. We now add a third factor, noise, which will affect the boundary between visible and invisible objects. The general effect of increasing image noise is to lower the curtain and reduce object visibility. In most medical imaging situations the effect of noise is most significant on the low-contrast objects that are already close to the visibility threshold.

2.2.6 Object Contrast:


The ability to see or detect an object is heavily influenced by the contrast between the object and its background. For most viewing tasks there is not a specific threshold contrast at which the object suddenly becomes visible. Instead, the accuracy of seeing or detecting a specific object increases with contrast. The contrast sensitivity of the human viewer changes with viewing conditions. When viewer contrast sensitivity is low, an object must have a relatively high contrast to be visible. The degree of contrast required depends on conditions that alter the contrast sensitivity of the observer: background brightness, object size, viewing distance, glare, and background structure.

2.2.7 Background Brightness:


The human eye can function over a large range of light levels or brightness, but vision is not equally sensitive at all brightness levels. The ability to detect objects generally increases with increasing background brightness or image illumination. To be detected in areas of low brightness, an object must be large and have a relatively high level of contrast with respect to its background. This can be demonstrated by viewing the same image under different levels of illumination: under low illumination you cannot see all of the small and low-contrast objects, and a higher level of object contrast is required for visibility.

2.3 File Formats:


A file format defines the components of the digital image (x and y values, pixel values, colour/gray scale, compression, the manner in which the pixels are laid out, etc.). Standard file formats allow the exchange of digital image information. Many file formats exist, for example: JPEG - Joint Photographic Experts Group. TIFF - Tagged Image File Format.
PNG - Portable Network Graphics.

2.4 Digital Image Representation:



An image is defined as a two-dimensional function, i.e. a matrix, f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. Color images are formed by combining individual two-dimensional images. For example, in the RGB color system, a color image consists of three individual component images, namely red, green and blue. Thus many of the techniques developed for monochrome images can be extended to color images by processing the three component images individually. When x, y and the amplitude values of f are all finite, discrete quantities, the image is called a digital image. The field of digital image processing refers to processing digital images by means of a digital computer. A digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements, pels and pixels. Since pixel is the most widely used term, the elements will be denoted as pixels from now on. An image may be continuous with respect to the x- and y-coordinates, and also in amplitude. Converting such an image to digital form requires digitizing both the coordinates and the amplitude. Digitization of the coordinate values is called sampling; digitization of the amplitude values is called quantization.

2.4.1 Coordinate Convention:


Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns; the image is then of size M × N. The values of the coordinates (x, y) are discrete quantities, and integer values are used for these discrete coordinates. In many image processing books, the image origin is set at (x, y) = (0, 0). The next coordinate values along the first row of the image are (x, y) = (0, 1); the notation (0, 1) signifies the second sample along the first row. These are not necessarily the actual values of physical coordinates when the image was sampled. Note that x ranges from 0 to M-1, and y from 0 to N-1, where x and y are integers. However, in the Wavelet Toolbox the notation (r, c) is used, where r indicates rows and c indicates columns. The order of coordinates is the same as discussed previously; the major difference is that the origin of the coordinate system is at (r, c) = (1, 1), so r ranges from 1 to M, and c from 1 to N, for r and c integers. These coordinates are referred to as pixel coordinates.

2.4.2 Images as Matrices:


The coordinate system discussed in the preceding section leads to the following representation for the digitized image function:

f(x, y) = \begin{bmatrix} f(0, 0) & f(0, 1) & \cdots & f(0, N-1) \\ f(1, 0) & f(1, 1) & \cdots & f(1, N-1) \\ \vdots & \vdots & & \vdots \\ f(M-1, 0) & f(M-1, 1) & \cdots & f(M-1, N-1) \end{bmatrix}    (2.1)

The right side of this equation is the representation of the digital image, and each element of this array (matrix) is called a pixel. In MATLAB, the digital image is represented as the following matrix:

f = \begin{bmatrix} f(1, 1) & f(1, 2) & \cdots & f(1, N) \\ f(2, 1) & f(2, 2) & \cdots & f(2, N) \\ \vdots & \vdots & & \vdots \\ f(M, 1) & f(M, 2) & \cdots & f(M, N) \end{bmatrix}    (2.2)

where M is the number of rows and N is the number of columns. Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array and so on.

2.4.3 Color Image Representation:


An RGB color image is an M × N × 3 array of color pixels, where each color pixel is a triplet corresponding to the red, green, and blue components of the image at a specific spatial location. An RGB image may be viewed as a stack of three gray-scale images that, when fed into the red, green, and blue inputs of a color monitor, produce a color image on the screen. By convention, the three images forming an RGB color image are referred to as the red, green, and blue component images. The data class of the component images determines their range of values. If an RGB image is of class double, meaning that all the pixel values are of type double, the range of values is [0, 1]. Likewise, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16,
respectively. The number of bits used to represent the pixel values of the component images determines the bit depth of an RGB color image. The RGB color space is shown graphically as an RGB color cube; the vertices of the cube are the primary (red, green, and blue) and secondary (cyan, magenta, and yellow) colors of light.

2.4.4 Indexed Images:


An indexed image has two components: a data matrix of integers, X, and a colormap matrix, map. Matrix map is an m × 3 array of class double containing floating-point values in the range [0, 1]. The length, m, of the map is equal to the number of colors it defines, and each row of map specifies the red, green, and blue components of a single color. An indexed image uses direct mapping of pixel intensity values to colormap values: the color of each pixel is determined by using the corresponding value of the integer matrix X as a pointer into map. If X is of class double, then all of its components with value 1 point to the first row of map, all components with value 2 point to the second row, and so on. If X is of class uint8 or uint16, then all components with value 0 point to the first row of map, all components with value 1 point to the second row, and so on.

2.4.5 The Basics of Color Image Processing:


Color image processing techniques deal with how color images are handled for a variety of image-processing tasks. For the purposes of the following discussion we subdivide color image processing into three principal areas: (1) color transformations (also called color mappings); (2) spatial processing of individual color planes; and (3) color vector processing. The first category deals with processing the pixels of each color plane based strictly on their values and not on their spatial coordinates; this category is analogous to intensity transformations. The second category deals with spatial (neighborhood) filtering of individual color planes and is analogous to spatial filtering. The third category deals with techniques based on processing all components of a color image simultaneously. Since full-color images have at least three components, color pixels are indeed vectors. For example, in the RGB color system a color point can be interpreted as a vector extending from the origin to that point in the RGB coordinate system. Let c represent an arbitrary vector in RGB color space:

c = \begin{bmatrix} c_R \\ c_G \\ c_B \end{bmatrix} = \begin{bmatrix} R \\ G \\ B \end{bmatrix}    (2.3)

This equation indicates that the components of c are simply the RGB components of a color image at a point. Since the color components are functions of the coordinates (x, y), we can write

c(x, y) = \begin{bmatrix} c_R(x, y) \\ c_G(x, y) \\ c_B(x, y) \end{bmatrix} = \begin{bmatrix} R(x, y) \\ G(x, y) \\ B(x, y) \end{bmatrix}    (2.4)

For an image of size M × N, there are MN such vectors, c(x, y), for x = 0, 1, ..., M-1 and y = 0, 1, ..., N-1. In order for independent color-component processing and vector-based processing to be equivalent, two conditions have to be satisfied: (i) the process has to be applicable to both vectors and scalars, and (ii) the operation on each component of a vector must be independent of the other components. Suppose, for example, that the process is neighborhood averaging. For a single component image, the averaging would be accomplished by summing the gray levels of all the pixels in the neighborhood and dividing by the number of pixels. For a color image, the averaging could be done by summing all the vectors in the neighborhood and dividing each component of the sum by the number of vectors. Each component of the resulting average vector is the average of the pixels in the image corresponding to that component, which is the same result that would be obtained if the averaging were done on the neighborhood of each component image individually and the color vector were then formed, as illustrated in the sketch below.
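The following small MATLAB sketch (with random data standing in for a 3x3 neighborhood of RGB pixels; the variable names are illustrative) checks that averaging each component image separately gives the same result as averaging the color vectors, i.e. that condition (ii) holds for neighborhood averaging.

% Neighborhood averaging: per-channel versus vector-wise (sketch).
nbhd = rand(3, 3, 3);                            % a 3x3 neighborhood of RGB pixels
per_channel = squeeze(mean(mean(nbhd, 1), 2));   % average each component image
as_vectors  = squeeze(sum(sum(nbhd, 1), 2)) / 9; % sum the 9 RGB vectors, divide by 9
max(abs(per_channel - as_vectors))               % zero, up to round-off error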

2.4.6 Reading Images:


In MATLAB, images are read into the MATLAB environment using the function imread. The syntax is imread(filename), where filename is a string containing the complete name of the image file, including any applicable extension. For example, the command line >> f = imread('x.jpg'); reads the JPEG image into the image array (image matrix) f. Since there are three color components in the image, namely the red, green and blue components, the image can be broken down into the three distinct color matrices fR, fG and fB.
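A minimal sketch of this channel split (the file name x.jpg is taken from the example above and is assumed to exist on the MATLAB path):

f  = imread('x.jpg');     % M-by-N-by-3 uint8 array for an RGB JPEG image
fR = f(:, :, 1);          % red component image
fG = f(:, :, 2);          % green component image
fB = f(:, :, 3);          % blue component image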

2.5 Standard method of image compression:


In 1992, JPEG established the first international standard for still image compression where the encoders and decoders are DCT-based. The JPEG standard specifies three modes
namely sequential, progressive, and hierarchical for lossy encoding, and one mode of lossless encoding. The performance of JPEG coders usually degrades at low bit-rates, mainly because of the underlying block-based Discrete Cosine Transform (DCT). The baseline JPEG coder [5] is the sequential encoding in its simplest form. Figs. 2.2 and 2.3 show the key processing steps in such an encoder and decoder, respectively, for grayscale images. Color image compression can be approximately regarded as compression of multiple grayscale images, which are either compressed entirely one at a time, or compressed by alternately interleaving 8x8 sample blocks from each in turn. The DCT-based encoder can be thought of as essentially compressing a stream of 8x8 blocks of image samples. Each 8x8 block makes its way through each processing step and yields output in compressed form into the data stream. Because adjacent image pixels are highly correlated, the Forward DCT (FDCT) processing step lays the basis for data compression by concentrating most of the signal in the lower spatial frequencies. For a typical 8x8 sample block from a typical source image, most of the spatial frequencies have zero or near-zero amplitude and need not be encoded.
Fig: 2.2 Encoder block diagram: the original image passes through the FDCT, the Quantizer (driven by a Quantization Table, QT) and the Entropy Encoder (driven by a Huffman Table) to give the compressed image data.


Fig: 2.3 Decoder block diagram: the compressed image data passes through the Entropy Decoder (Huffman Table), the Dequantizer (Quantization Table, QT) and the Inverse DCT to give the reconstructed image.



After output from the Forward DCT (FDCT), each of the 64 DCT coefficients is uniformly quantized in conjunction with a carefully designed 64-element Quantization Table (QT). At the decoder, the quantized values are multiplied by the corresponding QT elements to recover approximations of the original unquantized values. After quantization, the quantized coefficients are ordered into a zig-zag sequence. This ordering helps to facilitate entropy encoding by placing low-frequency non-zero coefficients before high-frequency coefficients. The DC coefficient, which contains a significant fraction of the total image energy, is differentially encoded. Entropy Coding (EC) achieves additional compression losslessly by encoding the quantized DCT coefficients more compactly based on their statistical characteristics; the JPEG proposal specifies both Huffman coding and arithmetic coding. More recently, the wavelet transform has emerged as a cutting-edge technology within the field of image analysis. Wavelets are a mathematical tool for hierarchically decomposing functions. Though rooted in approximation theory, signal processing, and physics, wavelets have also recently been applied to many problems in computer graphics, including image editing and compression, automatic level-of-detail control for editing and rendering curves and surfaces, surface reconstruction from contours, and fast methods for solving simulation problems in 3-D modelling, global illumination, and animation. Wavelet-based coding provides substantial improvements in picture quality at higher compression ratios. Over the past few years, a variety of powerful and sophisticated wavelet-based schemes for image compression have been developed and implemented. Because of the many advantages of wavelet-based image compression, the top contenders in the JPEG-2000 standard are all wavelet-based compression algorithms.
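The MATLAB sketch below mirrors the FDCT, quantization and dequantization steps for a single 8x8 block. The orthonormal DCT matrix is built directly so no toolbox is required, and the flat quantization step q = 16 and the random test block are assumptions made purely for illustration (baseline JPEG uses a full 8x8 quantization table).

% Forward DCT, uniform quantization and reconstruction of one 8x8 block (sketch).
N = 8;
k = (0:N-1)';  n = 0:N-1;
C = sqrt(2/N) * cos(pi * k * (2*n + 1) / (2*N));   % DCT-II basis functions (rows indexed by k)
C(1, :) = C(1, :) / sqrt(2);                       % orthonormal scaling of the DC row
B  = randn(N) * 30 + 128;     % stand-in for an 8x8 block of image samples
F  = C * (B - 128) * C';      % forward 2-D DCT (with the level shift used in baseline JPEG)
q  = 16;                      % flat quantization step (assumed for illustration)
Fq = round(F / q);            % quantizer: the many-to-one, lossy step
Br = C' * (Fq * q) * C + 128; % dequantize and apply the inverse DCT at the decoder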

2.6 Conclusion:
The digital image characteristics, digital image representation, the basics of colour image processing and the standard method of image compression have been discussed in this chapter. Image compression using different techniques is discussed in the next chapter.


Chapter 3 Image compression using different techniques


3.1 Introduction:
Here, some background topics of image compression are discussed, including the principles of image compression, the classification of compression methods, the framework of a general image coder, wavelets for image compression, different types of transforms, and quantization.

3.2 Principles of Image Compression:


A common characteristic of most images is that neighboring pixels are correlated and therefore hold redundant information. The foremost task then is to find a less correlated representation of the image. Two elementary components of compression are redundancy reduction and irrelevancy reduction. Redundancy reduction aims at removing duplication from the signal source (the image). Irrelevancy reduction omits parts of the signal that are not noticed by the signal receiver, namely the Human Visual System (HVS). In general, three types of redundancy can be identified: (a) Spatial Redundancy, or correlation between neighboring pixel values; (b) Spectral Redundancy, or correlation between different color planes or spectral bands; and (c) Temporal Redundancy, or correlation between adjacent frames in a sequence of images, especially in video applications. Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible.

3.3. Framework of General Image Compression Method:


A typical lossy image compression system is shown in Fig. 2.4. It consists of three closely connected components, namely (a) Source Encoder, (b) Quantizer and (c) Entropy Encoder. Compression is achieved by applying a linear transform to decorrelate the image data, quantizing the resulting transform coefficients, and entropy coding the quantized values.

Fig: 2.4 A typical lossy encoder: the input image passes through the source encoder, the quantizer and the entropy encoder to give the compressed image.



Source Encoder:
A variety of linear transforms have been developed which include Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT) and many more, each with its own advantages and disadvantages.

Quantizer:
A quantizer reduces the number of bits needed to store the transformed coefficients by reducing the precision of those values. As it is a many-to-one mapping, it is a lossy process and is the main source of compression in an encoder. Quantization can be performed on each individual coefficient, which is called Scalar Quantization (SQ), or on a group of coefficients together, which is known as Vector Quantization (VQ). Both uniform and non-uniform quantizers can be used, depending on the problem at hand.
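A minimal MATLAB sketch of uniform scalar quantization (the step size and example coefficient values are assumptions for illustration):

q  = 0.5;                       % quantization step size
x  = [0.12 0.61 1.38 -0.74];    % example transform coefficients
xq = q * round(x / q);          % uniform scalar quantizer: a many-to-one mapping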

Entropy Encoder:
An entropy encoder further compresses the quantized values losslessly to give better overall compression. It uses a model to accurately determine the probabilities of each quantized value and produces an appropriate code based on these probabilities, so that the resultant output code stream is smaller than the input stream. The most commonly used entropy encoders are the Huffman encoder and the arithmetic encoder, although for applications requiring fast execution, simple Run Length Encoding (RLE) is very effective.
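As a hedged example of the simplest of these schemes, the following MATLAB sketch run-length encodes a sequence containing long runs of identical values (the helper name rle_encode is an assumption, not a built-in function):

% Run Length Encoding of a 1-D sequence (sketch).
function [vals, runs] = rle_encode(x)
    x     = x(:).';                              % work on a row vector
    edges = [true, diff(x) ~= 0];                % marks the start of each run
    vals  = x(edges);                            % the value of each run
    runs  = diff([find(edges), numel(x) + 1]);   % the length of each run
end
% Example: [v, r] = rle_encode([5 5 5 0 0 7]) gives v = [5 0 7], r = [3 2 1].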

3.4. Image Compression:


In the last decade, there has been a lot of technological transformation in the way we communicate. This transformation includes the ever-present, ever-growing Internet, the explosive development in mobile communication and the ever-increasing importance of video communication. Data compression is one of the enabling technologies for each aspect of this multimedia revolution; cellular phones, for example, would not be able to provide communication with increasing clarity without it. Data compression is the art and science of representing information in compact form. Despite rapid progress in mass-storage density, processor speeds, and digital communication system performance, demand for data storage capacity and data-transmission
bandwidth continues to outstrip the capabilities of available technologies. In a distributed environment large image files remain a major bottleneck within systems. Image Compression is an important component of the solutions available for creating image file sizes of manageable and transmittable dimensions. Platform portability and performance are important in the selection of the compression/decompression technique to be employed.

Four Stage model of Data Compression:


Almost all data compression systems can be viewed as comprising four successive stages of data processing arranged as a processing pipeline (though some stages will often be combined with a neighboring stage, performed "off-line," or otherwise made rudimentary). The four stages are (A) preliminary pre-processing steps, (B) organization by context, (C) probability estimation, and (D) a length-reducing code. The ubiquitous compression pipeline (A-B-C-D) is what is of interest here. By (A) we mean the various pre-processing steps that may be appropriate before the final compression engine. Lossy compression often follows the same pattern as lossless, but with one or more quantization steps somewhere in (A). Sometimes clever designers may defer the loss until suggested by statistics detected in (C); an example of this would be modern zero-tree image coding. (B) Organization by context often means data reordering, for which a simple but good example is JPEG's "zigzag" ordering. The purpose of this step is to improve the estimates found by the next step. (C) A probability estimate (or its heuristic equivalent) is formed for each token to be encoded. Often the estimation formula will depend on context found by (B), with separate 'bins' of state variables maintained for each conditioned class.


(D) Finally, based on its estimated probability, each compressed file token is represented as bits in the compressed file. Ideally, a 12.5%-probable token should be encoded with three bits, but details become complicated.

Principle behind Image Compression:


Images have a considerably higher storage requirement than text, and audio and video data place still more demanding requirements on data storage. An image stored in an uncompressed file format, such as the popular BMP format, can be huge. An image with a pixel resolution of 640 by 480 pixels and 24-bit colour resolution will take up 640 * 480 * 24/8 = 921,600 bytes in an uncompressed format. The huge amount of storage space is not the only consideration; the data transmission rates required for communication of continuous media are also significantly large. An image of 1024 pixels x 1024 pixels x 24 bits, without compression, would require 3 MB of storage and 7 minutes for transmission, utilizing a high-speed, 64 Kbit/s ISDN line. Image data compression becomes still more important because the transfer of uncompressed graphical data requires far more bandwidth and a higher data transfer rate. For example, throughput in a multimedia system can be as high as 140 Mbit/s, which must be transferred between systems. This kind of data transfer rate is not realizable with today's technology, or in the near future, with reasonably priced hardware.

3.5. Fundamentals of Image Compression Techniques:


A digital image, or "bitmap", consists of a grid of dots, or "pixels", with each pixel defined by a numeric value that gives its colour. The term data compression refers to the process of reducing the amount of data required to represent a given quantity of information. A particular piece of information may contain some portion which is not important and can be comfortably removed; all such data is referred to as redundant data. Data redundancy is a central issue in digital image compression. Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible. A common characteristic of most images is that the neighboring pixels are correlated and therefore contain redundant information. The foremost task then is to find a less correlated representation of the image. In general, three types of redundancy can be identified:
1. Coding Redundancy 2. Inter-Pixel Redundancy 3. Psychovisual Redundancy

Coding Redundancy:
If the gray levels of an image are coded in a way that uses more code symbols than absolutely necessary to represent each gray level, the resulting image is said to contain coding redundancy. It is almost always present when an image's gray levels are represented with a straight or natural binary code. Let us assume that a discrete random variable r_k in the interval [0, 1] represents the gray levels of an image and that each r_k occurs with probability p_r(r_k):

p_r(r_k) = n_k / n,   k = 0, 1, 2, \ldots, L-1    (3.1)

where L is the number of gray levels, n_k is the number of times the k-th gray level appears in the image, and n is the total number of pixels in the image. If the number of bits used to represent each value of r_k is l(r_k), then the average number of bits required to represent each pixel is

L_{avg} = \sum_{k=0}^{L-1} l(r_k) \, p_r(r_k)    (3.2)

That is, the average length of the code words assigned to the various gray levels is found by summing the products of the number of bits used to represent each gray level and the probability that the gray level occurs. Thus the total number of bits required to code an M × N image is M N L_{avg}.
Inter Pixel Redundancy:


The information carried by any given pixel can be reasonably predicted from the values of its neighboring pixels, so the information carried by an individual pixel is relatively small. In order to reduce the inter-pixel redundancies in an image, the 2-D pixel array normally used for viewing and interpretation must be transformed into a more efficient but usually non-visual format. For example, the differences between adjacent pixels can be used to represent an image. Transformations of this type are referred to as mappings; they are called reversible if the original image elements can be reconstructed from the transformed data set.


Psycho visual Redundancy:


Certain information simply has less relative importance than other information in normal visual processing. Such information is said to be psychovisually redundant and can be eliminated without significantly impairing the quality of image perception. In general, an observer searches for distinguishing features such as edges or textural regions and mentally combines them into recognizable groupings. The brain then correlates these groupings with prior knowledge in order to complete the image interpretation process. The elimination of psychovisually redundant data results in a loss of quantitative information and is commonly referred to as quantization. As this is an irreversible process, i.e. visual information is lost, it results in lossy data compression. An image reconstructed following lossy compression contains degradation relative to the original, often because the compression scheme completely discards redundant information.

Image Compression Techniques:


There are basically two methods of Image Compression: 3.5.1. Lossless Coding Techniques 3.5.2. Lossy Coding Techniques

3.5.1. Lossless Coding Techniques:


In lossless compression schemes, the reconstructed image, after compression, is numerically identical to the original image; however, lossless compression can achieve only a modest amount of compression. Lossless coding guarantees that the decompressed image is absolutely identical to the image before compression. Lossless techniques can also be used for the compression of other data types where loss of information is not acceptable, and lossless compression algorithms can be used to squeeze down images and then restore them again for viewing completely unchanged. Lossless coding techniques include: 1. Run Length Encoding. 2. Huffman Encoding. 3. Entropy Encoding. 4. Area Encoding.

3.5.2. Lossy Coding Techniques:


Lossy techniques cause image quality degradation in each compression/decompression step. Careful consideration of human visual perception ensures that the degradation is often unrecognizable, though this depends on the selected compression ratio. An image reconstructed following lossy compression contains degradation relative to the original; in return, lossy schemes are capable of achieving much higher compression. Under normal viewing conditions, no visible loss is perceived (visually lossless). Lossy image coding techniques normally have three components:

Image Modeling:
It is aimed at the exploitation of statistical characteristics of the image (i.e. high correlation, redundancy). It defines such things as the transformation to be applied to the Image.

Parameter Quantization:
The aim of Quantization is to reduce the amount of data used to represent the information within the new domain.

Encoding:
Here a code is generated by associating appropriate code words with the raw output produced by the quantizer. Encoding is usually error free. It optimizes the representation of the information and may introduce some error detection codes.

3.6. Measurement of Image Quality:


The design of an imaging system should begin with an analysis of the physical characteristics of the originals and the means through which the images may be generated. For example, one might examine a representative sample of the originals and determine the level of detail that must be preserved, the depth of field that must be captured, whether they can be placed on a glass platen or require a custom book-edge scanner, whether they can tolerate exposure to high light intensity, and whether specular reflections must be captured or minimized. A detailed examination of some of the originals, perhaps with a magnifier or microscope, may be necessary to determine the level of detail within the original that might be meaningful for a researcher or scholar. For example, in drawings or paintings it may be important to preserve stippling or other characteristic techniques.

3.7. Wavelets for image compression:


The wavelet transform exploits both the spatial and frequency correlation of data by dilations (or contractions) and translations of a mother wavelet over the input data. It supports multiresolution analysis of data, i.e. it can be applied at different scales according to the details required, which allows progressive transmission and zooming of the image without the need for extra storage. Another encouraging feature of the wavelet transform is its symmetric nature: both the forward and the inverse transform have the same complexity, allowing fast compression and decompression routines. Its characteristics well suited to image compression include the ability to take into account Human Visual System (HVS) characteristics, very good energy compaction capabilities, robustness under transmission, and high compression ratio. The wavelet transform divides the information of an image into approximation and detail sub-signals. The approximation sub-signal shows the general trend of pixel values, and the three detail sub-signals show the vertical, horizontal and diagonal details or changes in the image. If these details are very small (below a threshold) they can be set to zero without significantly changing the image; the greater the number of zeros, the greater the compression ratio. If the energy retained (the amount of information retained by an image after compression and decompression) is 100%, then the compression is lossless, as the image can be reconstructed exactly. This occurs when the threshold value is set to zero, meaning that the details have not been changed.
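The sketch below illustrates this mechanism, assuming the MATLAB Wavelet Toolbox functions wavedec2/waverec2 and the demo image cameraman.tif are available; the threshold value 10 is an arbitrary choice for illustration.

f = double(imread('cameraman.tif'));
[c, s] = wavedec2(f, 2, 'haar');          % 2-level Haar decomposition
napp   = prod(s(1, :));                   % number of approximation coefficients
c0     = c;                               % keep the original coefficients
d      = c(napp+1:end);                   % detail coefficients only
d(abs(d) < 10) = 0;                       % small details set to zero (thresholding)
c(napp+1:end) = d;
f_rec  = waverec2(c, s, 'haar');          % reconstructed (approximate) image
energy_retained = 100 * sum(c.^2) / sum(c0.^2);   % percentage of energy retained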

3.8 Image Compression Methodology: Overview:


The storage requirements for the video of a typical angiogram procedure are of the order of several hundred Mbytes.
* Transmission of this data over a low-bandwidth network results in very high latency.
* Lossless compression methods can achieve compression ratios of ~2:1.
* We consider lossy techniques operating at much higher compression ratios (~10:1).
* Key issues: high-quality reconstruction is required, and angiogram data contains considerable high-frequency spatial texture.

* Proposed method applies a texture-modeling scheme to the high-frequency texture of some regions of the image.
* This allows more bandwidth allocation to important areas of the image.

3.9 Different types of transforms:


1. FT (Fourier Transform). 2. DCT (Discrete Cosine Transform). 3. DWT (Discrete Wavelet Transform).

3.9.1 Discrete Fourier Transform:


The DTFT representation of a finite-duration sequence x(n) is

X(e^{j\omega}) = \sum_{n=0}^{N-1} x(n) \, e^{-j\omega n}    (3.3)

x(n) = \frac{1}{2\pi} \int_{0}^{2\pi} X(e^{j\omega}) \, e^{j\omega n} \, d\omega    (3.4)

where x(n) is a finite-duration sequence and X(e^{j\omega}) is periodic with period 2\pi. It is convenient to sample X(e^{j\omega}) at N uniformly spaced frequencies between 0 and 2\pi, that is, at integer multiples of 2\pi/N.

Let \omega_k = \frac{2\pi k}{N}, \quad k = 0, 1, \ldots, N-1    (3.5)

Therefore X(k) = X(e^{j\omega}) \big|_{\omega = 2\pi k / N}    (3.6)

Since X(e^{j\omega}) is sampled over one period with N samples, X(k) can be expressed as

X(k) = \sum_{n=0}^{N-1} x(n) \, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1    (3.7)
3.9.2 The Discrete Cosine Transform (DCT):


The discrete cosine transform (DCT) helps separate the image into parts (or spectral sub-bands) of differing importance (with respect to the image's visual quality). The DCT is similar to the discrete Fourier transform: it transforms a signal or image from the spatial domain to the frequency domain.

3.9.3 Discrete Wavelet Transform (DWT):


The discrete wavelet transform (DWT) refers to wavelet transforms for which the wavelets are discretely sampled. It is a transform which localizes a function both in space and in scale, and it has some desirable properties compared to the Fourier transform. The transform is based on a wavelet matrix, which can be computed more quickly than the analogous Fourier matrix. Most notably, the discrete wavelet transform is used for signal coding, where the properties of the transform are exploited to represent a discrete signal in a more redundant form, often as a preconditioning for data compression. The discrete wavelet transform has a huge number of applications in science, engineering, mathematics and computer science. Wavelet compression is a form of data compression well suited to image compression (and sometimes also video and audio compression). The goal is to store image data in as little space as possible in a file; a certain loss of quality is accepted (lossy compression). Using a wavelet transform, wavelet compression methods are better at representing transients, such as percussion sounds in audio or high-frequency components in two-dimensional images, for example an image of stars on a night sky. This means that the transient elements of a data signal can be represented by a smaller amount of information than would be the case if some other transform, such as the more widespread discrete cosine transform, had been used. First a wavelet transform is applied. This produces as many coefficients as there are pixels in the image (i.e. there is no compression yet, since it is only a transform). These coefficients can then be compressed more easily because the information is statistically concentrated in just a few coefficients.

3.10 Quantization:
Quantization, as used in image processing, is a lossy technique achieved by compressing a range of values to a single quantum value. By reducing the number of discrete symbols in a given stream, the stream becomes more compressible. Reducing the number of colors required to represent an image is one example; other widely used examples are DCT data quantization in JPEG and DWT data quantization in JPEG 2000.


3.11 Entropy Encoding:


An entropy encoding is a coding scheme that assigns codes to symbols so as to match code lengths with the probabilities of the symbols. Typically, entropy encoders are used to compress data by replacing symbols represented by equal-length codes with symbols represented by codes whose length is proportional to the negative logarithm of the probability; therefore, the most common symbols use the shortest codes. According to Shannon's source coding theorem, the optimal code length for a symbol is -log_b P, where b is the number of symbols used to make the output codes and P is the probability of the input symbol. Three of the most common entropy encoding techniques are Huffman coding, range encoding, and arithmetic coding. If the approximate entropy characteristics of a data stream are known in advance (especially for signal compression), a simpler static code such as unary coding, Elias gamma coding, Fibonacci coding, Golomb coding, or Rice coding may be useful. There are three main techniques for achieving entropy coding: Huffman Coding, one of the simplest variable-length coding schemes; Run-Length Coding (RLC), very useful for binary data containing long runs of ones or zeros; and Arithmetic Coding, a relatively new variable-length coding scheme that can combine the best features of Huffman and run-length coding and also adapt to data with non-stationary statistics. We shall concentrate on the Huffman and RLC methods for simplicity.
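A one-line MATLAB check of the -log_b P rule, matching the 12.5%-probable token mentioned in the four-stage model of Section 3.4 (the probability value is just an example):

p   = 0.125;         % probability of the symbol
len = -log2(p)       % optimal binary code length in bits: 3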

3.12 Conclusion:
Here, some topics of image compression which include the principles of image compression, the classification of compression methods and the framework of a general image coder and wavelets for image compression, different types of transforms and quantization are discussed. The introduction to wavelet transforms is given in the next chapter.


Chapter 4 Introduction to wavelet transform


4.1. Introduction:
The fundamental idea behind wavelets is to analyze according to scale. Indeed, some researchers in the wavelet field feel that, by using wavelets, one is adopting a whole new mindset or perspective in processing data. Wavelets are functions that satisfy certain mathematical requirements and are used in representing data or other functions. This idea is not new. Approximation using superposition of functions has existed since the early 1800's, when Joseph Fourier discovered that he could superpose sines and cosines to represent other functions. However, in wavelet analysis, the scale that we use to look at data plays a special role. Wavelet algorithms process data at different scales or resolutions. If we look at a signal with a large "window," we would notice gross features. Similarly, if we look at a signal with a small "window," we would notice small features. The result in wavelet analysis is to see both the forest and the trees, so to speak. This makes wavelets interesting and useful. For many decades, scientists have wanted more appropriate functions than the sines and cosines which comprise the bases of Fourier analysis, to approximate choppy signals. By their definition, these functions are non-local (and stretch out to infinity). They therefore do a very poor job in approximating sharp spikes. But with wavelet analysis, we can use approximating functions that are contained neatly in finite domains. Wavelets are well-suited for approximating data with sharp discontinuities. The wavelet analysis procedure is to adopt a wavelet prototype function, called an analyzing wavelet or mother wavelet. Temporal analysis is performed with a contracted, high-frequency version of the prototype wavelet, while frequency analysis is performed with a dilated, low-frequency version of the same wavelet. Because the original signal or function can be represented in terms of a wavelet expansion (using coefficients in a linear combination of the wavelet functions), data operations can be performed using just the corresponding wavelet coefficients. And if you further choose the best wavelets adapted to your data, or truncate the coefficients below a threshold, your data is sparsely represented. This sparse coding makes wavelets an excellent tool in the field of data compression.


Other applied fields that are making use of wavelets include astronomy, acoustics, nuclear engineering, sub-band coding, signal and image processing, neurophysiology, music, magnetic resonance imaging, speech discrimination, optics, fractals, turbulence, earthquake prediction, radar, human vision, and pure mathematics applications such as solving partial differential equations.

4.2. Basis Functions:


It is simpler to explain a basis function if we move out of the realm of analog (functions) and into the realm of digital (vectors). Every two-dimensional vector (x, y) is a combination of the vectors (1, 0) and (0, 1); these two vectors are the basis vectors for (x, y). Why? Notice that x multiplied by (1, 0) is the vector (x, 0), and y multiplied by (0, 1) is the vector (0, y); the sum is (x, y). The best basis vectors have the valuable extra property that they are perpendicular, or orthogonal, to each other, and for the basis (1, 0) and (0, 1) this criterion is satisfied. Now let's go back to the analog world, and see how to relate these concepts to basis functions. Instead of the vector (x, y), we have a function f(x). Imagine that f(x) is a musical tone, say the note A in a particular octave. We can construct A by adding sines and cosines using combinations of amplitudes and frequencies. The sines and cosines are the basis functions in this example, and the elements of Fourier synthesis. For the sines and cosines chosen, we can set the additional requirement that they be orthogonal. How? By choosing the appropriate combination of sine and cosine function terms whose inner product adds up to zero. The particular set of functions that are orthogonal and that construct f(x) are our orthogonal basis functions for this problem.

Scale-Varying Basis Functions:


A basis function varies in scale by chopping up the same function or data space using different scale sizes. For example, imagine we have a signal over the domain from 0 to 1. We can divide the signal with two step functions that range from 0 to 1/2 and from 1/2 to 1. Then we can divide the original signal again using four step functions, from 0 to 1/4, 1/4 to 1/2, 1/2 to 3/4, and 3/4 to 1, and so on. Each set of representations encodes the original signal with a particular resolution or scale.
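To make this concrete, here is an illustrative numpy sketch (the sample values are arbitrary and not from the original report) that computes the step-function approximations of a short sampled signal at scales 1/2 and 1/4:

import numpy as np

# A signal sampled on [0, 1); approximate it with step functions at two scales.
n = 8
signal = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0])

# Scale 1/2: one average per half of the domain.
coarse = signal.reshape(2, n // 2).mean(axis=1)   # [5.0, 13.0]

# Scale 1/4: one average per quarter of the domain.
finer = signal.reshape(4, n // 4).mean(axis=1)    # [3.0, 7.0, 11.0, 15.0]

print(coarse, finer)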

4.3. Fourier Analysis:

Fourier Transform:


The Fourier transform's utility lies in its ability to analyze a signal in the time domain for its frequency content. The transform works by first translating a function in the time domain into a function in the frequency domain. The signal can then be analyzed for its frequency content because the Fourier coefficients of the transformed function represent the contribution of each sine and cosine function at each frequency. An inverse Fourier transform does just what you would expect: it transforms data from the frequency domain back into the time domain.

Discrete Fourier Transform:


The discrete Fourier transform (DFT) estimates the Fourier transform of a function from a finite number of its sampled points. The sampled points are supposed to be typical of what the signal looks like at all other times. The DFT has symmetry properties almost exactly the same as the continuous Fourier transform. In addition, the formula for the inverse discrete Fourier transform is easily calculated using the one for the discrete Fourier transform because the two formulas are almost identical.
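As a brief illustrative sketch (the array size is arbitrary and the code is not from the original report), the DFT can be computed directly from its defining formula and checked against numpy's FFT routine:

import numpy as np

def dft(x):
    """Discrete Fourier transform from its definition: X[k] = sum_n x[n] * exp(-2*pi*i*k*n/N)."""
    N = len(x)
    n = np.arange(N)
    k = n.reshape(N, 1)
    return np.exp(-2j * np.pi * k * n / N) @ x

x = np.random.rand(64)
print(np.allclose(dft(x), np.fft.fft(x)))   # True: the direct formula matches numpy's FFT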

Windowed Fourier Transform:


If f(t) is a non-periodic signal, a summation of periodic functions, sines and cosines, does not accurately represent it. One could artificially extend the signal to make it periodic, but this would require additional continuity at the endpoints. The windowed Fourier transform (WFT) is one solution to the problem of better representing a non-periodic signal. The WFT can be used to give information about signals simultaneously in the time domain and in the frequency domain. With the WFT, the input signal f(t) is chopped up into sections, and each section is analyzed for its frequency content separately. If the signal has sharp transitions, the input data are windowed so that each section converges to zero at its endpoints. This windowing is accomplished via a weight function that places less emphasis near the interval's endpoints than in the middle. The effect of the window is to localize the signal in time.
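A minimal sketch of this windowing idea follows (the Hann window, the section width and the step size are illustrative choices, not prescribed by the text): each section of the signal is tapered so it decays to zero at its endpoints and is then transformed separately.

import numpy as np

def windowed_fft(x, width, step):
    """Chop the signal into overlapping sections, taper each with a Hann
    window so it decays to zero at the endpoints, and FFT each section."""
    window = np.hanning(width)
    starts = range(0, len(x) - width + 1, step)
    return np.array([np.fft.rfft(x[s:s + width] * window) for s in starts])

t = np.linspace(0.0, 1.0, 1024, endpoint=False)
x = np.where(t < 0.5, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 120 * t))
spectra = windowed_fft(x, width=128, step=64)
print(spectra.shape)   # (number of sections, frequency bins): frequency content per time section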


Fast Fourier Transform:


To approximate a function by samples, and to approximate the Fourier integral by the discrete Fourier transform, requires applying a matrix whose order is the number of sample points n. Since multiplying an n x n matrix by a vector costs on the order of n^2 arithmetic operations, the problem quickly gets worse as the number of sample points increases. However, if the samples are uniformly spaced, then the Fourier matrix can be factored into a product of just a few sparse matrices, and the resulting factors can be applied to a vector in a total of order n log n arithmetic operations. This is the so-called fast Fourier transform, or FFT. For example, with n = 1024 samples the direct approach requires on the order of n^2 (about 10^6) operations, while the FFT requires only about n log2 n (about 10^4).

4.4. Similarities between Fourier and Wavelet Transform:


The fast Fourier transform (FFT) and the discrete wavelet transform (DWT) are both linear operations that generate a data structure containing log2 n segments of various lengths, usually filling it and transforming it into a different data vector of length 2^n.

The mathematical properties of the matrices involved in the transforms are similar as well. The inverse transform matrix for both the FFT and the DWT is the transpose of the original. As a result, both transforms can be viewed as a rotation in function space to a different domain. For the FFT, this new domain contains basis functions that are sines and cosines. For the wavelet transform, this new domain contains more complicated basis functions called wavelets, mother wavelets, or analyzing wavelets. Both transforms have another similarity. The basis functions are localized in frequency, making mathematical tools such as power spectra (how much power is contained in a frequency interval) and scalegrams useful at picking out frequencies and calculating power distributions.

4.5. Dissimilarities between Fourier and Wavelet Transform:


The most interesting dissimilarity between these two kinds of transforms is that individual wavelet functions are localized in space. Fourier sine and cosine functions are not. This localization feature, along with wavelets' localization of frequency, makes many functions and operators using wavelets "sparse" when transformed into the wavelet domain. This sparseness, in turn, results in a number of useful applications such as data compression, detecting features in images, and removing noise from time series.


4.6. Wavelets:
Compactly supported wavelets are functions defined over a finite interval and having an average value of zero. The basic idea of the wavelet transform is to represent any arbitrary function f(x) as a superposition of a set of such wavelets or basis functions. These basis functions are obtained from a single prototype wavelet called the mother wavelet $\psi(x)$ by dilations (scaling) and translations. Wavelet bases are very good at efficiently representing functions that are smooth except for a small set of discontinuities.

For each $n, k \in \mathbb{Z}$, define

$$\psi_{n,k}(x) = 2^{n/2}\,\psi(2^{n}x - k) \quad \text{on } \mathbb{R}, \qquad (4.1)$$

such that $\{\psi_{n,k}(x)\}_{n,k\in\mathbb{Z}}$ is an orthonormal basis on $\mathbb{R}$. As mentioned before, $\psi(x)$ is a wavelet and the collection $\{\psi_{n,k}(x)\}_{n,k\in\mathbb{Z}}$ is a wavelet orthonormal basis on $\mathbb{R}$. Constructing the function $\psi(x)$ within this framework involves the concept of a multiresolution analysis, or MRA. A multiresolution analysis is a device for computing the basis coefficients in $f = \sum_{n,k}\langle f, \psi_{n,k}\rangle\,\psi_{n,k}$. It is defined as follows. Let

$$V_{0} = \Big\{ f(x) \in L^{2}(\mathbb{R})\ \Big|\ f(x) = \sum_{n} c_{n}\,\varphi(x - n) \Big\}, \qquad (4.2)$$

where, for $f \in V_{0}$,

$$f(x) = \sum_{n} \langle f, \varphi(\cdot - n)\rangle\,\varphi(x - n). \qquad (4.3)$$

Then a multiresolution analysis on $\mathbb{R}$ is a sequence of subspaces $\{V_{n}\}_{n\in\mathbb{Z}}$ of functions on $\mathbb{R}$ satisfying the following properties:

(a) For all $n \in \mathbb{Z}$, $V_{n} \subseteq V_{n+1}$.
(b) If $f(x) \in L^{2}(\mathbb{R})$, then $f(x)$ lies in the closure of $\bigcup_{n} V_{n}$. That is, given $\varepsilon > 0$, there is an $n \in \mathbb{Z}$ and a function $g(x) \in V_{n}$ such that $\|f - g\| < \varepsilon$.
(c) $\bigcap_{n} V_{n} = \{0\}$.
(d) A function $f(x) \in V_{0}$ if and only if $f(2^{n}x) \in V_{n}$.
(e) There exists a function $\varphi(x)$ on $\mathbb{R}$, called the scaling function, such that the collection $\{\varphi(x - n)\}_{n\in\mathbb{Z}}$ is an orthonormal system of translates and $V_{0} = \overline{\operatorname{span}}\{\varphi(x - n)\}$.

Let $\{V_{n}\}$ be an MRA with scaling function $\varphi(x)$ and scaling filter $h(k)$, where $h(k) = \langle \varphi(x), \sqrt{2}\,\varphi(2x - k)\rangle$. Then the wavelet filter $g(k)$ is defined by

$$g(k) = (-1)^{k}\,\overline{h(1 - k)} \qquad (4.4)$$

and the wavelet by

$$\psi(x) = \sum_{k} g(k)\,\sqrt{2}\,\varphi(2x - k). \qquad (4.5)$$

Then $\{\psi_{n,k}(x)\}$ is a wavelet orthonormal basis on $\mathbb{R}$. The orthogonal projection of an arbitrary function $f$ onto $V_{n}$ is given by

$$P_{n}f = \sum_{k} \langle f, \varphi_{n,k}\rangle\,\varphi_{n,k}. \qquad (4.6)$$

As $k$ varies, the basis functions $\varphi_{n,k}$ are shifted in steps of $2^{-n}$, so $P_{n}f$ cannot represent any detail on a scale smaller than that. We say that the functions in $V_{n}$ have resolution or scale $2^{-n}$. Here, $P_{n}f$ is called an approximation to $f$ at resolution $2^{-n}$. For a given function $f$, an MRA provides a sequence of approximations $P_{n}f$ of increasing accuracy. The difference between the approximations at resolutions $2^{-(n+1)}$ and $2^{-n}$ is called the fine detail at resolution $2^{-n}$, which is as follows:

$$Q_{n}f(x) = P_{n+1}f(x) - P_{n}f(x), \qquad (4.7)$$

or

$$Q_{n}f = \sum_{k} \langle f, \psi_{n,k}\rangle\,\psi_{n,k}. \qquad (4.8)$$

$Q_{n}$ is also an orthogonal projection and its range $W_{n}$ is orthogonal to $V_{n}$, where the following holds:

$$V_{n} = \{f \mid P_{n}f = f\}, \qquad (4.9)$$
$$W_{n} = \{f \mid Q_{n}f = f\}, \qquad (4.10)$$
$$V_{n+1} = V_{n} \oplus W_{n}. \qquad (4.11)$$

There are choices of the numbers $h$ and $g$ such that $\{\psi_{n,k}(x)\}$ is a wavelet orthonormal basis on $\mathbb{R}$. We must show orthonormality and completeness. As for completeness, we have

$$\bigcap_{n} V_{n} = \{0\} \quad \text{and} \quad \overline{\bigcup_{n} V_{n}} = L^{2}(\mathbb{R}). \qquad (4.12)$$

Then $\{\psi_{n,k}(x)\}$ is complete if and only if

$$\overline{\operatorname{span}}\{\psi_{n,k} \mid n, k \in \mathbb{Z}\} = L^{2}(\mathbb{R}) \qquad (4.13)$$

holds, and this is true. Now, as for the orthonormality within a single scale,

$$\langle \psi_{n,k}, \psi_{n,l}\rangle = \langle \psi(x - k), \psi(x - l)\rangle = \delta(k - l). \qquad (4.14)$$

To prove orthonormality between scales, let $n, n' \in \mathbb{Z}$ with $n < n'$, and let $k, k' \in \mathbb{Z}$ be arbitrary. Since $\psi_{n,k}(x) \in V_{n+1} \subseteq V_{n'}$, we know that $\psi_{n,k}(x) = \sum_{l} \langle \psi_{n,k}, \varphi_{n',l}\rangle\,\varphi_{n',l}(x)$. Since $\langle \varphi_{n',l}, \psi_{n',k'}\rangle = 0$ for all $l, k' \in \mathbb{Z}$, it follows that

$$\langle \psi_{n,k}, \psi_{n',k'}\rangle = \sum_{l} \langle \psi_{n,k}, \varphi_{n',l}\rangle\,\langle \varphi_{n',l}, \psi_{n',k'}\rangle = 0. \qquad (4.15)$$

Therefore $\{\psi_{n,k}(x)\}$ is a wavelet orthonormal basis on $\mathbb{R}$.
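As a numerical illustration of equations (4.1), (4.4) and (4.5) for the Haar case (an informal sketch; the grid resolution and the helper psi_nk are my own choices, not part of the source), the following builds the Haar wavelet from its scaling filter and checks orthonormality across translates and scales:

import numpy as np

# Haar scaling function: phi = 1 on [0, 1), 0 elsewhere, sampled on a fine grid.
N = 1024                          # samples per unit interval (illustrative choice)
dx = 1.0 / N

def phi(x):
    return np.where((x >= 0) & (x < 1), 1.0, 0.0)

# Haar scaling filter h and the wavelet filter g(k) = (-1)^k * h(1 - k) as in (4.4).
h = np.array([1.0, 1.0]) / np.sqrt(2.0)
g = np.array([h[1], -h[0]])       # (+1/sqrt(2), -1/sqrt(2))

# Mother wavelet psi(x) = sum_k g(k) * sqrt(2) * phi(2x - k), equation (4.5).
x = np.arange(0.0, 2.0, dx)
psi = sum(g[k] * np.sqrt(2.0) * phi(2 * x - k) for k in range(2))

def psi_nk(n, k):
    # psi_{n,k}(x) = 2^{n/2} * psi(2^n x - k), sampled on the same grid, equation (4.1)
    return 2.0 ** (n / 2.0) * np.interp(2.0 ** n * x - k, x, psi, left=0.0, right=0.0)

inner = lambda f, q: np.sum(f * q) * dx
print(round(inner(psi_nk(0, 0), psi_nk(0, 0)), 3))   # ~1.0 (normalised)
print(round(inner(psi_nk(0, 0), psi_nk(0, 1)), 3))   # ~0.0 (orthogonal translates)
print(round(inner(psi_nk(0, 0), psi_nk(1, 0)), 3))   # ~0.0 (orthogonal across scales)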

i) Symmetry: Symmetric filters are preferred because they are the most effective at minimizing edge effects in the discrete wavelet transform (DWT) representation of a function; large coefficients resulting from false edges due to periodization can be avoided. Since orthogonal filters, with the exception of the Haar filter, cannot be symmetric, biorthogonal filters are almost always selected for image compression applications.

ii) Vanishing moments: From the definition of a multiresolution analysis (MRA), any wavelet $\psi(x)$ that comes from an MRA must satisfy

$$\int \psi(x)\,dx = 0. \qquad (4.16)$$

The integral $\int \psi(x)\,dx$ is referred to as the zeroth moment of $\psi(x)$, so if the above equation holds we say that $\psi(x)$ has its zeroth moment vanishing. More generally, the integral $\int x^{p}\,\psi(x)\,dx$ is referred to as the $p$-th moment of $\psi(x)$, and if it equals zero we say that $\psi(x)$ has its $p$-th moment vanishing. As a matter of fact, it is possible to have a different number of vanishing moments on the analysis filters than on the reconstruction filters. Vanishing moments on the analysis filters are desired because they produce small coefficients in the transform, whereas vanishing moments on the reconstruction filters result in fewer blocking artifacts in the compressed image and are therefore also desired. Thus, having a sufficient number of vanishing moments, which may differ between the two filters, is advantageous.

iii) Size of the filters: Long analysis filters result in greater computation time for the wavelet or wavelet packet transform. Long reconstruction filters can create unpleasant artifacts in the compressed image for the following reason: the reconstructed image is made up of the superposition of only a few scaled and shifted reconstruction filters, so features of the reconstruction filters, such as oscillations or lack of smoothness, can be clearly noticed in the reconstructed image. Smoothness can be guaranteed by requiring a large number of vanishing moments in the reconstruction filter.
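To illustrate vanishing moments numerically (a small sketch, not from the source; the D4 values are the standard Daubechies-4 coefficients, and the alternating flip g(k) = (-1)^k h(3 - k) is the usual convention for a length-4 filter), the script below computes the zeroth and first discrete moments of the Haar and D4 wavelet filters:

import numpy as np

# Zeroth and first discrete moments of a high-pass (wavelet) filter g.
def moments(g, up_to=2):
    k = np.arange(len(g))
    return [float(np.sum((k ** m) * g)) for m in range(up_to)]

# Haar wavelet filter: only the zeroth moment vanishes.
haar_g = np.array([1.0, -1.0]) / np.sqrt(2.0)

# Daubechies D4 filters: h from the standard closed form, g(k) = (-1)^k h(3 - k).
s3 = np.sqrt(3.0)
d4_h = np.array([1.0 + s3, 3.0 + s3, 3.0 - s3, 1.0 - s3]) / (4.0 * np.sqrt(2.0))
d4_g = np.array([(-1.0) ** k * d4_h[3 - k] for k in range(4)])

print(moments(haar_g))   # [~0.0, -0.707...]  -> one vanishing moment
print(moments(d4_g))     # [~0.0, ~0.0]       -> two vanishing moments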

4.7. List of Wavelet-Related Transforms:

4.7.1. Continuous Wavelet Transform:


A continuous wavelet transform is used to divide a continuous-time function into wavelets. Unlike the Fourier transform, the continuous wavelet transform can construct a time-frequency representation of a signal that offers very good time and frequency localization.

4.7.2. Multi resolution analysis:


A multiresolution analysis (MRA), or multiscale approximation (MSA), is the design method behind most practically relevant discrete wavelet transforms (DWT) and the justification for the fast wavelet transform (FWT) algorithm.

4.7.3. Discrete Wavelet Transform:


In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information.

4.7.4. Fast Wavelet Transform:


The Fast Wavelet Transform is a mathematical algorithm designed to turn a waveform or signal in the time domain into a sequence of coefficients based on an orthogonal basis of small finite waves, or wavelets. The transform can be easily extended to multidimensional signals, such as images, where the time domain is replaced with the space domain.
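A minimal one-dimensional sketch of this idea for the Haar case follows (assuming the signal length is a power of two; the function name haar_fwt and the sample values are illustrative, not from the source): each pass replaces the signal by pairwise averages and keeps the pairwise differences as detail coefficients.

import numpy as np

def haar_fwt(signal):
    """One-dimensional fast Haar wavelet transform by repeated pairwise
    averaging and differencing (orthonormal scaling by sqrt(2)).
    Assumes len(signal) is a power of two."""
    x = np.asarray(signal, dtype=float)
    coeffs = []
    while len(x) > 1:
        avg = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # approximation at the next coarser scale
        det = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # detail coefficients at this scale
        coeffs.append(det)
        x = avg
    coeffs.append(x)        # final overall approximation coefficient
    return coeffs[::-1]     # coarsest first

print(haar_fwt([4, 6, 10, 12, 8, 6, 5, 5]))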

4.8. Applications of Wavelet Transforms:


Wavelets have broad applications in fields such as signal processing and medical imaging. Due to time and space constraints, only the two applications most relevant to this work are discussed here: wavelet image compression and the progressive transmission of image files over the internet.


4.8.1. Wavelet Compression:


The point of doing the Haar wavelet transform is that areas of the original matrix that contain little variation end up as small or zero elements in the Haar-transformed matrix. A matrix is considered sparse if it has a high proportion of zero entries; sparse matrices take much less memory to store. Because we cannot expect the transformed matrices always to be sparse, we must consider wavelet compression. To perform wavelet compression, we first decide on a non-negative threshold value, known as ε. We then reset to zero any value in the Haar wavelet transformed matrix whose magnitude is less than ε. Our hope is that this leaves us with a relatively sparse matrix. If ε is equal to zero, we do not modify any of the elements and therefore we do not lose any information; this is known as lossless compression. Lossy compression occurs when ε is greater than zero: because some of the elements are reset to zero, some of the original data is lost. In the case of lossless compression we are able to reverse our operations and recover the original image exactly; with lossy compression we can only reconstruct an approximation of the original image.
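The following is a rough numerical sketch of this thresholding step (the 4x4 sample matrix, the unnormalised averaging form of the Haar step and the threshold value are illustrative assumptions, not from the source): one level of the 2D Haar transform is applied and entries with magnitude below ε are set to zero.

import numpy as np

def haar2d(a):
    """One level of the 2D Haar transform in the averaging (unnormalised) form:
    pairwise averages and differences along rows, then along columns."""
    def step(m):
        return np.hstack(((m[:, 0::2] + m[:, 1::2]) / 2.0,
                          (m[:, 0::2] - m[:, 1::2]) / 2.0))
    return step(step(a).T).T

image = np.array([[64, 64, 60, 60],
                  [64, 64, 60, 60],
                  [62, 62, 58, 58],
                  [62, 62, 58, 58]], dtype=float)

coeffs = haar2d(image)
eps = 1.5                               # the non-negative threshold epsilon
compressed = np.where(np.abs(coeffs) < eps, 0.0, coeffs)
print(int(np.count_nonzero(compressed)), "of", coeffs.size, "entries kept")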

4.8.2. Progressive Transmission:


Images are frequently downloaded from the internet, and wavelet transforms can speed up this process considerably. When a user clicks on an image to download it, the source computer recalls the wavelet transformed matrix from memory. It first sends the overall approximation coefficient and the larger detail coefficients, and then the progressively smaller detail coefficients. As the receiving computer obtains this information, it begins to reconstruct the image in progressively greater detail until the original image is fully reconstructed. This process can be interrupted at any time if the user decides that he or she does not want the image; otherwise the user would only be able to see the image after the entire image file had been downloaded. Because a compressed image file is significantly smaller, it takes far less time to download.

4.9. Conclusion:
Basis functions, Fourier analysis, the similarities and dissimilarities between the Fourier and wavelet transforms, an introduction to wavelets, the types of wavelet transforms and the applications of the wavelet transform have been discussed in this chapter. Image compression using the modified fast Haar wavelet transform is discussed next.

Chapter 5

1. Image compression using SPIHT algorithm


A. Description of the Algorithm

After the wavelet decomposition of the image data, the coefficients are organised into trees. Based on this feature, a data structure called the spatial orientation tree is defined. The spatial orientation tree structure of a 4-level wavelet decomposition is shown in Figure 5.1. We can see that each coefficient has four children except the red-marked coefficients in the LL subband and the coefficients in the highest-frequency subbands (HL1, LH1, HH1). The following sets of coordinates of coefficients are used to represent the set-partitioning method in the SPIHT algorithm. The location of a coefficient is denoted by (i, j), where i and j indicate the row and column indices, respectively.

H: roots of all the spatial orientation trees.
O(i, j): set of offspring of the coefficient (i, j), O(i, j) = {(2i, 2j), (2i, 2j + 1), (2i + 1, 2j), (2i + 1, 2j + 1)}, except when (i, j) is in the LL subband. When (i, j) is in the LL subband, O(i, j) is defined as O(i, j) = {(i, j + w), (i + h, j), (i + h, j + w)}, where w and h are the width and height of the LL subband, respectively.
D(i, j): set of all descendants of the coefficient (i, j).
L(i, j): D(i, j) - O(i, j).

Figure 5.1: Parent-child relationship in SPIHT.

A significance function $S_{n}(T)$, which decides the significance of a set of coordinates $T$ with respect to the threshold $2^{n}$, is defined by

$$S_{n}(T) = \begin{cases} 1, & \max_{(i,j)\in T} |c_{i,j}| \ge 2^{n} \\ 0, & \text{otherwise,} \end{cases}$$

where $c_{i,j}$ is the wavelet coefficient. In this algorithm, three ordered lists are used to store the significance information during set partitioning: the list of insignificant sets (LIS), the list of insignificant pixels (LIP), and the list of significant pixels (LSP). Note that the term pixel actually refers to a wavelet coefficient when the set-partitioning algorithm is applied to a wavelet-transformed image.
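A direct, minimal implementation of this significance test (the sample coefficient matrix is illustrative only) might look as follows:

import numpy as np

def significance(coeffs, indices, n):
    """S_n(T) for a set of coordinates T: 1 if any coefficient in the set
    has magnitude >= 2^n, otherwise 0."""
    return 1 if max(abs(coeffs[i, j]) for (i, j) in indices) >= 2 ** n else 0

c = np.array([[34.0, -5.0],
              [ 3.0,  2.0]])
print(significance(c, [(0, 0), (0, 1)], n=5))   # 1: |34| >= 32
print(significance(c, [(1, 0), (1, 1)], n=5))   # 0: all magnitudes below 32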

Algorithm: SPIHT

1) Initialization:
   1. Output n = floor(log2 max(i,j) |c_{i,j}|).
   2. Set LSP = empty.
   3. Set LIP = (i, j) in H.
   4. Set LIS = (i, j) in H where D(i, j) is non-empty, and set each entry in LIS as type A.

2) Sorting pass:
   1. For each (i, j) in LIP do:
      (a) Output S_n(i, j).
      (b) If S_n(i, j) = 1, move (i, j) to LSP and output sign(c_{i,j}).
   2. For each (i, j) in LIS do:
      (a) If (i, j) is of type A, then
          i. Output S_n(D(i, j)).
          ii. If S_n(D(i, j)) = 1, then
             A. For each (k, l) in O(i, j): output S_n(k, l); if S_n(k, l) = 1, append (k, l) to LSP, output sign(c_{k,l}) and set c_{k,l} = c_{k,l} - 2^n sign(c_{k,l}); otherwise append (k, l) to LIP.
             B. If L(i, j) is non-empty, move (i, j) to the end of LIS as type B; otherwise, remove (i, j) from LIS.
      (b) If (i, j) is of type B, then
          i. Output S_n(L(i, j)).
          ii. If S_n(L(i, j)) = 1, append each (k, l) in O(i, j) to the end of LIS as type A and remove (i, j) from LIS.

3) Refinement pass:
   1. For each (i, j) in LSP, except those included in the last sorting pass, output the n-th most significant bit of |c_{i,j}|.

4) Quantization-step update:
   1. Decrement n by 1.
   2. Go to step 2).

B. Analysis of the SPIHT Algorithm

Here a concrete example is used to analyze the output binary stream of SPIHT encoding. The following 3-level wavelet decomposition coefficients are used for the SPIHT encoding.

n = floor(log2 max |c(i,j)|) = 5, so the initial threshold value is T0 = 2^5 = 32. For the first pass, the output binary stream is 11100011100010000001010110000, 29 bits in all. From the SPIHT encoding results, we can see that the output bit stream contains long runs of consecutive "0" symbols, and as the quantization step is progressively refined this situation becomes even more severe, so there is a great deal of redundancy if the stream is output directly.


2. Image compression using WDR algorithm


One of the defects of SPIHT is that it only implicitly locates the positions of significant coefficients. This makes it difficult to perform operations that depend on the exact position of significant transform values, such as region selection on the compressed data. Region selection, also known as region of interest (ROI), means selecting a portion of a compressed image that requires increased resolution. Such compressed-data operations are possible with the Wavelet Difference Reduction (WDR) algorithm of Tian and Wells.

The term difference reduction refers to the way in which WDR encodes the locations of significant wavelet transform values. In WDR, the output from the significance pass consists of the signs of significant values along with sequences of bits which concisely describe the precise locations of significant values.

The WDR algorithm is a very simple procedure. A wavelet transform is first applied to the image, and then the bit-plane based WDR encoding algorithm for the wavelet coefficients is carried out. WDR mainly consists of five steps as follows: 1. Initialization: During this step, a scan order is first assigned. For an image with P pixels, a scan order is a one-to-one and onto mapping x_{i,j} = X_k, for k = 1, 2, ..., P, between the wavelet coefficients x_{i,j} and a linear ordering (X_k). The scan order is a zigzag through the subbands from higher to lower levels. Within the subbands, row-based scanning is used in the horizontal subbands, column-based scanning is used in the vertical subbands, and zigzag scanning is used in the diagonal and low-pass subbands. Once the scan order is fixed, an initial threshold T0 is chosen so that all transform values satisfy |Xm| < T0 and at least one transform value satisfies |Xm| >= T0 / 2.

2. Update threshold: Let Tk = Tk-1 / 2.


3. Significance pass: In this part, transform values are deemed significant if they are greater than or equal to the threshold value. Their index values are then encoded using the difference reduction method of Tian and Wells. The difference reduction method essentially consists of a binary encoding of the number of steps to go from the index of the last significant value to the index of the current significant value. The output from the significance pass includes the signs of the significant values along with the sequences of bits, generated by difference reduction, which describe the precise locations of the significant values.
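The sketch below conveys the flavour of difference reduction (it is a simplification for illustration, not the exact Tian-Wells bitstream): each gap between consecutive significant indices is written in binary with its most significant bit dropped, since that bit is always 1 and the sign symbol of the new significant value can serve as the separator.

def reduced_binary(gap):
    """Binary expansion of a positive integer with its most significant bit dropped;
    in this sketch the sign symbol of the new significant value acts as the separator."""
    bits = bin(gap)[2:]      # e.g. 9 -> '1001'
    return bits[1:]          # drop the leading '1' -> '001'

# Gaps between consecutive significant indices in the scan order (illustrative values).
significant_indices = [2, 3, 7, 12]
previous = 0
for idx in significant_indices:
    gap = idx - previous
    print(idx, "->", reduced_binary(gap) or "(empty)")
    previous = idx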

4. Refinement pass: The refinement pass generates the refined bits via the standard bit-plane quantization procedure, like the refinement process in the SPIHT method. Each refined value is a better approximation of the exact transform value.

5. Repeat steps (2) through (4) until the bit budget is reached.

3. Image compression using EZW algorithm


Wavelet transforming image data generates a large amount of insignificant data. After quantizing and coding according to certain rules, the insignificant data are discarded and the remaining data can represent the original data approximately. This is the principle of image compression algorithms based on the wavelet transform. Zero-tree coding is one of the most popular wavelet-based image compression methods, and Embedded Zerotree Wavelet (EZW) coding is the representative zero-tree coding method. EZW was invented by Shapiro in 1993 [3]. It is an embedded wavelet image coding algorithm with a high compression rate. It is a progressive coding method and performs well for image compression from lossy to lossless.


The main features of EZW include a compact multiresolution representation of images by the discrete wavelet transform, zero-tree coding of the significant wavelet coefficients providing compact binary maps, successive approximation quantization of the wavelet coefficients, adaptive multilevel arithmetic coding, and the capability of meeting an exact target compression rate. The basic process flow of the EZW algorithm can be described as follows. First, transform the image with the wavelet transform and quantize the coefficients. Given a series of threshold values sorted from high to low (each threshold equal to half of the previous one), for every threshold sort all the coefficients, retain the significant coefficients and discard the insignificant ones according to that threshold. Each coefficient is coded with one of four symbols: positive significant coefficient (POS), negative significant coefficient (NEG), isolated zero (IZ) and zero-tree root (ZTR); a simplified classification is sketched below. Gradually decrease the threshold and find the significant coefficients following a particular scan order. This forms a sequence of significant coefficients through the method known as successive quantization. We can see that EZW generates an embedded bit stream, thus allowing progressive transmission and precise control of the target bit rate or target distortion [3-4].
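As referenced above, a simplified classification of a single coefficient into these four symbols might look like this (an illustrative sketch only; a full EZW coder also maintains the scan order and the successive thresholds):

def ezw_symbol(value, descendants, threshold):
    """Classify one coefficient for a single EZW dominant pass (simplified sketch):
    POS / NEG if it is significant; ZTR if it and all its descendants are
    insignificant; IZ if it is insignificant but some descendant is significant."""
    if abs(value) >= threshold:
        return "POS" if value > 0 else "NEG"
    if all(abs(d) < threshold for d in descendants):
        return "ZTR"
    return "IZ"

print(ezw_symbol(40, [3, -2, 1], threshold=32))    # POS
print(ezw_symbol(-45, [8, 1, 0], threshold=32))    # NEG
print(ezw_symbol(5, [2, 1, 0], threshold=32))      # ZTR
print(ezw_symbol(6, [50, 1, 0], threshold=32))     # IZ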

Although EZW achieves a high compression rate and reconstructs the image well, it still has some defects that need to be improved: for example, it requires as much memory as the whole image while processing, and it is not well suited to browsing an image in sub-blocks.


Chapter 6

Results
Results for SPIHT wname=haar, level=3, Nb. Encoding Loops=9

Compression ratio: 22.68%, BPP: 1.8141, PSNR: 36.77 dB, L2-norm ratio: 99.48%, MSE: 13.67


Results for EZW wname=haar, level=3, Nb. Encoding Loops=9

Compression ratio: 51.06%, BPP: 4.0851, PSNR: 44.5 dB, L2-norm ratio: 99.99%, MSE: 2.305


Results for WDR wname=haar, level=3, Nb. Encoding Loops=9

Compression ratio: 60.42%, BPP: 4.8335, PSNR: 42.2 dB, L2-norm ratio: 99.91%, MSE: 3.726
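As a consistency check on the figures above (assuming 8-bit images with a peak value of 255, which is my assumption rather than stated in the report), PSNR and MSE are related by PSNR = 10 log10(255^2 / MSE). The SPIHT and EZW pairs reproduce the reported PSNR values closely; the WDR pair differs slightly, likely due to rounding in the reported values.

import math

def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for 8-bit images: 10 * log10(peak^2 / MSE)."""
    return 10.0 * math.log10(peak ** 2 / mse)

# Reported MSE values from the SPIHT, EZW and WDR runs above.
for name, mse in [("SPIHT", 13.67), ("EZW", 2.305), ("WDR", 3.726)]:
    print(name, round(psnr_from_mse(mse), 2))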


Conclusion
A picture can say more than a thousand words; however, storing an image can cost more than a million words. This is not always a problem, because computers are now capable of handling large amounts of data. Nevertheless, it is often desirable to use limited resources more efficiently. For instance, digital cameras often have a very limited amount of memory and internet connections can be slow. In these cases, the importance of image compression is greatly felt. The rapid increase in the range and use of electronic imaging justifies attention to the systematic design of an image compression system and to providing the image quality needed in different applications. Wavelets can be used effectively for this purpose. The modified fast Haar wavelet transform (MFHWT) has been presented here as the basis, along with quality measurements of the compressed images. As further work, the trade-off between the threshold value and the image quality can be studied; fixing the correct threshold value is also of great interest.


Future scope of the work


The main benefits of the MFHWT are its sparse representation, fast transformation and the possibility of implementing fast algorithms. From the test images, it can be seen that the reconstructed images are as good as those obtained with FHT and HT. Thus, in light of the above discussion, numerically accurate results can be obtained by using the MFHWT. The disadvantage of the present MFHWT implementation is that colour image compression performs correctly only for the pink, yellow and pink-with-red colours; for the remaining colours, the compressed image appears as a combination of black and white. In the future, a small modification in the use of the Haar wavelet transform may give full colour image compression. The intensities of the individual colours will need to be considered as a central part of the image compression.


References
[1]. P. Chang and P. Piau, "Modified Fast and Exact Algorithm for Fast Haar Transform," Proceedings of World Academy of Science, Engineering and Technology, 2007.
[2]. V. N. Kopenkov and V. V. Myasnikov, "Fast Algorithms of a Local Discrete Transform with Haar Basis," Proceedings of the Scientific and Technical Conference with International Participation, 2006.
[3]. P. N. Podkur, "On Constructing Some Types of Wavelets with the Scaling Coefficient N," Electronic Scientific Journal Sdelano v Rossii (Made in Russia), 2006.
[4]. Castleman, Digital Image Processing, Pearson Education India, 2007.
[5]. Kian Kee Teoh, H. Ibrahim and S. K. Bejo, "Investigation on Several Basic Interpolation Methods for the Use in Remote Sensing Application," IEEE Conference on Innovative Technologies in Intelligent Systems and Industrial Applications, 2008.
[6]. A. Lemieux and E. Knoll, "Digital Image Resolution," IEEE International Proceedings on Communication, 1999.
[7]. K. T. Gribbon and D. G. Bailey, "A Real-Time Bilinear Interpolation," IEEE International Workshop on Electronic Design, Test and Applications, 2004.
[8]. R. C. Gonzalez and R. E. Woods, Digital Image Processing, Second Edition.
[9]. Ralf Steinmetz and Klara Nahrstedt, Multimedia Computing, Communication and Applications, 2001.
[10]. A. W. Galli, G. T. Heydt and P. F. Ribeiro, "Exploring the Power of Wavelet Analysis," IEEE Computer Applications in Power, 1996.
