
Integer Wavelet Transform for Embedded Lossy to Lossless Image Compression

Julien Reichel, Gloria Menegaz, Marcus J. Nadenau and Murat Kunt Signal Processing Laboratory, Swiss Federal Institute of Technology, Lausanne, Switzerland

October 4, 2000

DRAFT

Abstract

The use of the Discrete Wavelet Transform (DWT) for embedded lossy image compression is now well established. One of the possible implementations of the DWT is the Lifting Scheme (LS). Because perfect reconstruction is guaranteed by the structure of the LS, non-linear transforms can be used, allowing efficient lossless compression as well. The Integer Wavelet Transform (IWT) is one of them. It is an interesting alternative to the DWT because its rate-distortion performance is similar and the differences can be predicted. This topic is investigated in a theoretical framework. A model of the degradations caused by the use of the IWT instead of the DWT for lossy compression is presented. The rounding operations are modeled as additive noise. The noise is then propagated through the LS structure to measure its impact on the reconstructed pixels. This methodology is verified using simulations with random noise as input. It accurately predicts the results obtained using images compressed by the well-known EZW algorithm [1]. Experiments are also performed to measure the difference in terms of bitrate and visual quality. This allows a better understanding of the impact of the IWT when applied to lossy image compression.

I. Introduction

The Discrete Wavelet Transform (DWT) is widely used in signal and image processing. One of its main applications is image compression, mainly due to the very good rate-distortion performance of DWT-based codecs. A whole family of algorithms [2], [3], [4] based on the EZW scheme [1] can be found in the literature. More recently, a new still image compression standard (JPEG2000) has also been based on the DWT. Most DWT-based codecs can produce an embedded bitstream. This means that the quality of the reconstructed image increases as more encoded bits become available to the decoder, and that the decoding can be stopped at any point of the bitstream.

The main drawback of the DWT is that the wavelet coefficients are real numbers. In this case efficient lossless coding is not possible using linear transforms. The Lifting Scheme (LS) presented by Sweldens [5], [6] allows an efficient implementation of the DWT. Another of its properties is that perfect reconstruction is ensured by the structure of the LS itself. This allows new transformations to be used. One such transformation is the Integer Wavelet Transform (IWT) [7], [8]. It is a basic modification of linear transforms, where each filter output is rounded to the nearest integer. The IWT can be used to build a unified lossy and lossless codec. It is also of interest for hardware implementations, where floating-point arithmetic is still costly. In software implementations, many processors now offer multimedia instructions which can execute multiple integer operations in parallel, like the MMX instruction set of the Pentium processor [9]. The use of the IWT is also a means to reduce the memory demands of the compression algorithm, as integers are used instead of real numbers.

Although the IWT is very interesting because of the previously cited advantages, it has one main drawback: using the IWT instead of the DWT degrades the performance of lossy codecs [10], [11]. It is important to quantify this difference in a theoretical framework. We do this by assuming that rounding is equivalent to some form of quantization, so that the non-linearity can be replaced by an additive quantization noise. Using this hypothesis, we have designed a model which predicts the difference between the IWT and the DWT for a wide range of filters.

Similar ideas have been presented for the DWT in a finite-precision framework [12], [13], [14]. These works introduce the idea of additive noise to replace non-linearities. Their main concern is the loss of the perfect reconstruction (PR) properties of the filter bank. In the case of the IWT, the PR properties are guaranteed by the structure of the transform. Models describing the degradation caused by quantization of the DWT coefficients have been studied in [15], [16], [17]. There, the noise is added to the wavelet coefficients and propagated through the filter bank to the reconstructed images. The purpose of such modeling is to optimize the quantization of the wavelet coefficients. The same kind of modeling is applied here, except that the noise is introduced at multiple positions in the LS structure.

The paper is organized as follows. The LS implementation of the DWT will be presented in Section II. This implementation allows the transformation of any perfect reconstruction filter into an integer equivalent. Section III will examine some theoretical aspects of the use of the IWT based on a noise model of the rounding.
Section IV will verify the theory using simulations and compressed data. Some experimental results linked to the bitrate and the visual quality will also be presented.

II. Lifting Scheme for Integer Wavelet Transform

The Lifting Scheme (LS) [5], [6] is one possible implementation of the DWT. It exploits the redundancy between the high pass (HP) and low pass (LP) filters necessary for perfect reconstruction (PR). It reduces the number of arithmetic operations by up to a factor of two compared to the filter-bank (FB) implementation. Its structure guarantees that the scheme is reversible, regardless of the filters used. As shown in [5], the FB and LS implementations of the DWT are mathematically equivalent, and it is possible to transform any PR-FB into the LS structure.

The basic idea behind the LS is the following. The input data are split into two signals corresponding to the evenly and oddly indexed samples. Then one signal is convolved with a lifting filter and added to the other one. This action is called a lifting step. The roles of the two signals are then reversed and further lifting steps can be applied. The aim of the scheme is that, after $K$ lifting steps, one signal $d(n)$ corresponds to the high pass coefficients and the other one, $s(n)$, to the low pass coefficients. Generally, lifting steps using the low pass signal are called prediction steps and steps using the high pass signal are called update steps, as shown in Fig. 1. The reconstruction algorithm is simply the application of the same structure, but in the reverse order. The reconstruction filter $\tilde{H}_k(z)$ is exactly equivalent to its decomposition counterpart $H_k(z)$, except for its sign, i.e.

$$\tilde{H}_k(z) = -H_k(z). \tag{1}$$

Without loss of generality, it will be assumed that the lifting structure starts with a prediction step and is composed of an even number of steps. It is sufficient to set $H_1(z)$ or $H_K(z)$ equal to zero to take all other cases into account.

Because codecs based on the DWT can be used for efficient lossy coding, some researchers have been interested in using the same kind of structure for lossless coding [18], [19]. All these schemes used modified versions of the DWT. This is necessary because wavelet coefficients have real values, and thus cannot be efficiently encoded without loss. The fact that PR is ensured by the LS construction permits another approach [7]. It is possible to replace the linear filters of the different steps by any non-linear operation and still preserve the PR properties. This last point allows the computation of the Integer Wavelet Transform (IWT). By introducing a rounding operation on the output of each filter, as shown in Fig. 2, it is possible to guarantee that all wavelet coefficients have integer values for integer inputs [8].

When the IWT coefficients are transmitted without loss, the non-linearity (rounding) introduced at the encoder is compensated by the one introduced at the decoder. If the compression is increased, the scheme becomes lossy. This is done by further quantizing the IWT coefficients. In this case the relationship between the non-linearity of the encoder and that of the decoder becomes unknown. The prediction of the differences observed between the two transforms [10], [11] becomes of primary importance for embedded coding schemes which compress data in both lossy and lossless manners.

III. Theoretical Approach

A. Principles

To quantify the expected increase in distortion caused by the use of the IWT, a model of the degradation is necessary. In this analysis, it is assumed that the incorporation of a rounding operation after each filtering increases the distortion of the signal. It is a reasonable hypothesis, confirmed by the experimental results in [10], [11], [20]. If $H_1(z)$ is seen as the perfect predictor, its rounded output will no longer match the signal to be predicted and thus performance will degrade. In this case it is possible to replace the non-linear rounding operation, which is equivalent to some quantization, by the addition of random noise, i.e.

$$\mathrm{round}(x) = x + \mathrm{noise}. \tag{2}$$
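As an illustration of how rounding inside the lifting steps preserves perfect reconstruction, the following Python sketch implements one level of a 1D integer lifting transform. It is only a sketch under stated assumptions: the two lifting filters use the common 5/3 (LeGall) coefficients, the signal is extended periodically, ties are rounded away from zero, and the helper names (`sym_round`, `lift_forward`, `lift_inverse`) are hypothetical and not taken from the paper. With `integer=False` the same structure gives the linear transform, so the difference between the two sets of coefficients is the rounding noise of Eq. (2).

```python
import numpy as np

def sym_round(x):
    """Round to nearest integer, ties away from zero (assumed rounding rule)."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.floor(np.abs(x) + 0.5)

def lift_forward(x, integer=True):
    """One level of 5/3 lifting on an even-length signal (periodic extension)."""
    q = sym_round if integer else (lambda v: v)
    x = np.asarray(x, dtype=float)
    s, d = x[0::2].copy(), x[1::2].copy()      # even / odd samples
    d = d + q(-0.5 * (s + np.roll(s, -1)))     # prediction step (filter H1)
    s = s + q(0.25 * (d + np.roll(d, 1)))      # update step (filter H2)
    return s, d

def lift_inverse(s, d, integer=True):
    """Same structure in reverse order, with the sign of each filter flipped."""
    q = sym_round if integer else (lambda v: v)
    s = np.asarray(s, dtype=float)
    d = np.asarray(d, dtype=float)
    s = s - q(0.25 * (d + np.roll(d, 1)))      # undo update
    d = d - q(-0.5 * (s + np.roll(s, -1)))     # undo prediction
    x = np.empty(2 * len(s))
    x[0::2], x[1::2] = s, d
    return x

rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=64)              # integer "pixels"
s_int, d_int = lift_forward(x, integer=True)   # IWT coefficients
s_lin, d_lin = lift_forward(x, integer=False)  # linear (DWT-like) coefficients
x_rec = lift_inverse(s_int, d_int, integer=True)
noise_d = d_int - d_lin                        # rounding noise on the high pass signal
```

In this sketch `x_rec` equals `x` exactly because the decoder recomputes the same rounded filter outputs as the encoder, and the rounding noise on the first lifting step stays within half a unit.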

The IWT can be expressed as the DWT plus the addition of rounding noise. In other words, Fig. 2 can be replaced by Fig. 3, where the noises of the decomposition and reconstruction sides are represented respectively by their power spectral density functions $\Gamma_{kd}(z)$ and $\Gamma_{kr}(z)$. Notice that another quantization noise, with power spectral density function $\Gamma_\Delta(z)$, is introduced by the compression algorithm itself and needs to be specified as well. It is also represented by the addition of another noise source, to keep the entire scheme coherent.

The construction of the model representing the degradation caused by the use of the IWT in the framework of image compression can be divided into the following steps. First, the spectral density of the noise introduced at each step and the cross-correlations between steps are analyzed in Section III-B. The same is done for the influence of the compression in Section III-C. Then, to determine the influence of the quantization noise on the reconstructed pixels, the transfer functions applied to the rounding noise are computed in three steps: first for one level of transform in Section III-D, then for the whole one-dimensional case in Section III-E, and finally for the 2D case in Section III-F.

B. Rounding Noise

As mentioned earlier, each individual rounding operation following the filters is replaced by an additive random noise. If all of the random noise sources are uncorrelated, the final degradation is the sum of the individual contributions. In contrast, if these individual noise components are correlated, their correlation must be taken into consideration in the summation. For the sake of simplicity, all rounding operations are computed using the same formula. In this case the rounding should be symmetric about the origin in order to preserve the PR properties. This implies:

$$\mathrm{round}(-x) = -\mathrm{round}(x). \tag{3}$$
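Eq. (3) rules out some common rounding rules. The short Python sketch below (helper names are hypothetical) compares the rule $\lfloor x + 1/2 \rfloor$, which is not symmetric at ties, with a round-half-away-from-zero rule, which is one possible choice satisfying Eq. (3).

```python
import math

def round_half_up(x):
    """floor(x + 1/2): ties always go towards +infinity."""
    return math.floor(x + 0.5)

def round_half_away(x):
    """Ties go away from zero, so round(-x) == -round(x)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

# Test points on a grid of quarters, which includes ties such as 0.5 and 2.5
samples = [k / 4 for k in range(-12, 13)]
half_up_symmetric = all(round_half_up(-x) == -round_half_up(x) for x in samples)
half_away_symmetric = all(round_half_away(-x) == -round_half_away(x) for x in samples)
```

Here `half_up_symmetric` is false because, for example, 0.5 is mapped to 1 while -0.5 is mapped to 0.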

The quantization noise introduced by the rounding operation is assumed to be white and uniformly distributed between $-0.5$ and $0.5$. This is based on the assumption that the probability distribution of the signal to be quantized is constant over the quantization interval [21]. In this case the quantizer is said to have a high resolution and the variance of its noise is given by [22]:

$$\sigma_k^2 = \frac{1^2}{12} = \frac{1}{12} \tag{4}$$

where the subscript $k$ is the index of the lifting step. Following the assumption of white rounding noise, its power spectral density is constant and equal to $\Gamma_k(z) = \sigma_k^2$. To differentiate the noise introduced at decomposition step $k$ from that introduced at the reconstruction step with the same index, $\Gamma_{kd}(z)$ and $\Gamma_{kr}(z)$ will denote the power spectral density for decomposition and for reconstruction respectively.

Let us now consider a specific family of lifting filters having rational numbers for their impulse response coefficients. If the input signal has integer samples, the filtered output signal has a discrete distribution. In this case the assumptions that lead to Eq. (4) are no longer valid. It is possible to find a common denominator $D_k$ for all rational coefficients. The output signal will take values that are multiples of $1/D_k$. Accordingly the quantization noise will also be a multiple of $1/D_k$. Since the noise interval is normalized to 1, there can be only $D_k$ different values of the noise. In this case the variance of the quantization error is equal to

$$\sigma_k^2 = \begin{cases} \dfrac{1}{12}\left(1 - \dfrac{1}{D_k^2}\right) & D_k \text{ odd} \\[2mm] \dfrac{1}{12}\left(1 + \dfrac{2}{D_k^2}\right) & D_k \text{ even.} \end{cases} \tag{5}$$
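Eq. (5) can be checked by exact enumeration. The sketch below assumes the noise distribution described in the Appendix (uniform over the $D_k$ possible values for odd $D_k$, half weight on the two extreme values $\pm 0.5$ for even $D_k$) and uses exact rational arithmetic. The function names are illustrative only.

```python
from fractions import Fraction

def rounding_noise_variance(D):
    """Exact variance of the rounding noise when the filter output is a multiple of 1/D."""
    var = Fraction(0)
    if D % 2 == 1:
        # odd D: D equiprobable values d/D, d = -(D-1)/2 .. (D-1)/2
        for d in range(-(D - 1) // 2, (D - 1) // 2 + 1):
            var += Fraction(d, D) ** 2 * Fraction(1, D)
    else:
        # even D: D+1 values, the extremes -1/2 and +1/2 have half the weight
        for d in range(-D // 2, D // 2 + 1):
            p = Fraction(1, 2 * D) if abs(d) == D // 2 else Fraction(1, D)
            var += Fraction(d, D) ** 2 * p
    return var

def eq5(D):
    """Closed form: (1/12)(1 - 1/D^2) for odd D, (1/12)(1 + 2/D^2) for even D."""
    correction = Fraction(-1, D * D) if D % 2 == 1 else Fraction(2, D * D)
    return Fraction(1, 12) * (1 + correction)

matches = all(rounding_noise_variance(D) == eq5(D) for D in range(1, 33))
```

For instance, $D_k = 2$ gives a variance of $1/8$, noticeably larger than the continuous value $1/12$ of Eq. (4), while both expressions tend to $1/12$ as $D_k$ grows.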

Eq. (5) corresponds to Eq. (4) with a correction term that takes the discrete nature of the noise into consideration. Because of this discrete nature, there are some cross-correlations between the noises introduced at steps $k$ and $l$. Let $P_{kl}$ be the probability that the two inputs at steps $k$ and $l$ have the same sign. In this case the cross spectral density function $\Gamma_{k,l}$ is equal to

$$\Gamma_{k,l}(z) = \frac{2P_{kl} - 1}{4 D_k D_l}. \tag{6}$$
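A quick consistency check of Eq. (6), read as $(2P_{kl} - 1)/(4 D_k D_l)$, follows the argument of Appendix B: only the extreme errors $\pm 0.5$ at step $k$ carry information about the sign of the input, so only they contribute to the expected product of the two errors. The Python sketch below assumes even $D_k$ and $D_l$ (so that the extreme values can occur) and uses hypothetical function names.

```python
from fractions import Fraction

def conditional_mean(Dl, P, sign):
    """E[e_l | e_k = sign/2] for the conditional distribution of Appendix B."""
    mean = Fraction(0)
    for d in range(-Dl // 2, Dl // 2 + 1):
        if d == sign * (Dl // 2):
            p = P / Dl                 # same extreme value as e_k
        elif d == -sign * (Dl // 2):
            p = (1 - P) / Dl           # opposite extreme value
        else:
            p = Fraction(1, Dl)        # interior values are unaffected
        mean += Fraction(d, Dl) * p
    return mean

def cross_correlation(Dk, Dl, P):
    """E[e_k * e_l]; non-extreme values of e_k contribute nothing by symmetry."""
    total = Fraction(0)
    for sign in (1, -1):
        e_k = Fraction(sign, 2)
        p_ek = Fraction(1, 2 * Dk)     # probability of each extreme error at step k
        total += p_ek * e_k * conditional_mean(Dl, P, sign)
    return total

def eq6(Dk, Dl, P):
    return (2 * P - 1) / (4 * Dk * Dl)

cases = [(Dk, Dl, P) for Dk in (2, 4, 8) for Dl in (2, 4, 8)
         for P in (Fraction(0), Fraction(1, 2), Fraction(1))]
consistent = all(cross_correlation(*c) == eq6(*c) for c in cases)
```

With $P_{kl} = 1/2$ the cross term vanishes, and with $P_{kl} = 0$ it is negative, which is the case of a decomposition step and its reconstruction counterpart.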

The details leading to Eqs. (5) and (6) can be found in the Appendix. The probability $P_{kl}$ is zero for a pair of decomposition/reconstruction steps with the same index, i.e. $k = kd$ and $l = kr$ (see Eq. (1)). For all other cases, one can assume $P_{kl} = 1/2$ due to the likely independence of the signals at these steps, in which case Eq. (6) vanishes. Accordingly, the correlation given by Eq. (6) will be especially strong between one filter and its inverse, used in the reconstruction phase.

C. Compression Noise

The quantization due to compression also has some impact on the difference between the IWT and the DWT. The input of the quantizer is integer (I) in the first case and real (R) in the second case. Using the assumption that the distribution of the data over the quantization interval is uniform, the power spectral density function of the compression noise for the DWT data is:

$$\Gamma_R(z) = \sigma_C^2 = \frac{\Delta^2}{12} \tag{7}$$

where $\Delta$ is the compression quantization step size [17]. Because the input pixels are integers, the final values of the reconstructed pixels must also be integers. For this reason the final reconstructed values after the inverse DWT should be rounded. The total noise caused by the compression algorithm therefore has to be increased by $1^2/12$ according to Eq. (4).

For embedded lossy to lossless compression, $\Delta$ takes values that are powers of 2. Thus, for the IWT, the expression $\Gamma_I(z)$ of the power spectral density function of the compression noise is very similar to Eq. (5), except that the quantization step size is no longer equal to 1:

$$\Gamma_I(z) = \sigma_I^2 = \frac{\Delta^2}{12}\left(1 + \frac{2}{\Delta^2}\right). \tag{8}$$

For $\Delta = 2^i$ with $i \le 0$, no quantization occurs in the case of the IWT and thus $\Gamma_I(z) = 0$. This corresponds to the lossless mode. Eqs. (7) and (8) hold only for small values of $\Delta$. Nevertheless, the main differences between the IWT and the DWT are found when the compression is close to lossless, as will be shown in Section IV.

D. Impact After One Level of Transformation

Using the results of the two previous sections, the spectral density of the noise added to the reconstructed signal after one basic transform block can be computed. It corresponds to the sum of all rounding noises filtered through the system, end to end. For this reason the trajectory of each noise should be traced. In other words, one needs to find the equivalent transfer function $G_k(z)$ which filters the noise introduced at step $k$. Let us define two transfer functions, $G_{ku}(z)$ and $G_{kl}(z)$, which correspond respectively to the upper and lower branch transfer functions of the reconstruction scheme, as shown in Fig. 4. The final transfer function $G_k(z)$ for the noise of step $k$ is computed by summing the lower and upper branches and taking the up-sampling into account:

$$G_k(z) = G_{ku}(z^2) + z\,G_{kl}(z^2). \tag{9}$$

For the noise added on the reconstruction side of the transform, the computation of $G_k(z)$ can easily be deduced from the structure of the lifting filters (see Fig. 5). The noise introduced at the last step, $\Gamma_{1r}(z)$, has no influence on the upper part, i.e. $G_{1u}(z) = 0$. On the lower part it is not filtered, i.e. $G_{1l}(z) = 1$. The other equivalent filters can be computed similarly, i.e.:

$$G_{2u}(z) = 1 \quad \text{and} \quad G_{2l}(z) = \tilde{H}_1(z). \tag{10}$$

For filters with a larger index, a recursive formula can be derived from the basic equations of the lifting scheme [5]. It is valid for both the lower and the upper transfer functions:

$$G_{ki}(z) = G_{(k-2)i}(z) - H_{k-1}(z)\,G_{(k-1)i}(z), \qquad i = u, l. \tag{11}$$

Notice that this iterative equation can be computed for $k = 1, \ldots, K+1$, as the last filter needed is $H_K(z)$. It should also be noted that $G_K(z)$ and $G_{K+1}(z)$ are respectively equivalent to the LP and HP reconstruction filters of the FB implementation.

Let us now consider the noise added on the decomposition side of the transform. The noise introduced at step $k$ of the decomposition goes through the decomposition filters $H_{k'}(z)$ and the reconstruction filters $\tilde{H}_{k'}(z)$, with $k' = k+1, \ldots, K$. According to Eq. (1), the effect of the filter $\tilde{H}_K(z)$ cancels that of $H_K(z)$. So, two by two, these filters cancel each other. Only filters with an index smaller than $k$ on the reconstruction side are used in the computation of $G_k(z)$. The equivalent transfer functions have already been established in Eqs. (10) and (11).

Let us now compute the influence of the rounding noises $\Gamma_{kd}(z)$ and $\Gamma_{kr}(z)$ on the reconstructed pixels. Recall that the cross spectral density $\Gamma_{k,l}(z)$ given by Eq. (6) is valid for any $k$ and $l$, be it on the decomposition or the reconstruction side. Accordingly, the possible cross-correlation between the decomposition and reconstruction steps with respective indices $kd$ and $kr$ is $\Gamma_{k,l}(z) = \Gamma_{kd,kr}(z)$. Let $\Gamma_k(z)$ be the influence of the decomposition and reconstruction step $k$. In this case $\Gamma_k(z)$ is the sum of two interdependent noise sources:

$$\Gamma_k(z) = \Gamma_{kd}(z) + \Gamma_{kr}(z) + \Gamma_{kd,kr}(z) + \Gamma_{kr,kd}(z) = \Gamma_{kd}(z) + \Gamma_{kr}(z) + 2\,\Gamma_{kd,kr}(z) \tag{12}$$

since, according to Eq. (6), $\Gamma_{kd,kr}(z) = \Gamma_{kr,kd}(z)$. Prior to the up-sampling, the power spectral density function of the noise added to samples from the upper and lower branches is:

$$\Phi_{ki}(z) = \Gamma_k(z)\,G_{ki}(z)\,G_{ki}\!\left(\frac{1}{z}\right), \qquad i = u, l. \tag{13}$$

After up-sampling, the upper and lower branches correspond to the even and odd indexed samples, respectively. As two different noises are mixed together to create the even and odd samples of the output, the resulting signal is not the realization of a stationary process. It can be seen as periodically time varying, and thus belongs to the larger family of cyclostationary signals [15]. Using the notion of wide-sense stationarity, the average spectral density function $\Phi_k(z)$ can be computed [17], [23]:

$$\Phi_k(z) = \frac{1}{2}\left(\Phi_{ku}(z) + \Phi_{kl}(z)\right) = \frac{1}{2}\,\Gamma_k(z)\,G_k(z)\,G_k\!\left(\frac{1}{z}\right). \tag{14}$$

E. Multi-Level Wavelet Transform

In the case where more than one level of transform is considered, the influence of all the transform blocks must be taken into consideration. The propagation of the noise through all the blocks to the final level modifies the transfer function $G_k(z)$ [17]. Let $G_k^{(n)}(z)$ be the transfer function after an $n$-level transformation. It is derived from $G_k(z)$ through:

$$G_k^{(n)}(z) = \begin{cases} G_k\!\left(z^{2^{n-1}}\right) \displaystyle\prod_{m=0}^{n-2} G_K\!\left(z^{2^m}\right) & n > 1 \\[2mm] G_k(z) & n = 1. \end{cases} \tag{15}$$

Eq. (14) can be extended to $n$ levels of transformation using $G_k^{(n)}(z)$ and the idea that the period of the signal in the cyclostationary sense is increased to $2^n$:

$$\Phi_k^{(n)}(z) = \frac{1}{2^n}\,\Gamma_k^{(n)}(z)\,G_k^{(n)}(z)\,G_k^{(n)}\!\left(\frac{1}{z}\right) \tag{16}$$

where $\Gamma_k^{(n)}(z)$ corresponds to the noise added at decomposition step $k$ of level $n$. Finally, the total noise due to the rounding operations, $\Phi_I(z)$, is computed by summing the contributions of all steps over all levels. For a structure of $K$ steps and $N$ levels of decomposition, it is expressed by:

$$\Phi_I(z) = \sum_{n=1}^{N} \sum_{k=1}^{K} \frac{1}{2^n}\,\Gamma_k^{(n)}(z)\,G_k^{(n)}(z)\,G_k^{(n)}\!\left(\frac{1}{z}\right). \tag{17}$$
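The equivalent transfer functions of Section III-D can also be obtained numerically, by injecting a unit impulse at a given reconstruction step of the linear (unrounded) inverse transform and recording the output. The Python sketch below does this for one level of a two-step lifting structure. The 5/3 (LeGall) lifting coefficients, the periodic extension and all function names are assumptions made for the illustration. By Eq. (14), a white noise of variance $\sigma^2$ injected at step $k$ contributes an average output variance of $\frac{1}{2}\sigma^2 \sum_n g_k(n)^2$, where $g_k(n)$ is the measured impulse response.

```python
import numpy as np

def inverse_with_injection(n_pairs, step, position):
    """Linear inverse 5/3 lifting of an all-zero signal, with a unit impulse
    added at reconstruction `step` (2 is executed first, 1 is executed last)."""
    s = np.zeros(n_pairs)                   # low pass coefficients (upper branch)
    d = np.zeros(n_pairs)                   # high pass coefficients (lower branch)
    noise = np.zeros(n_pairs)
    noise[position] = 1.0
    s = s - 0.25 * (d + np.roll(d, 1))      # reconstruction step 2 (undo update)
    if step == 2:
        s = s + noise
    d = d + 0.5 * (s + np.roll(s, -1))      # reconstruction step 1 (undo prediction)
    if step == 1:
        d = d + noise
    x = np.empty(2 * n_pairs)
    x[0::2], x[1::2] = s, d                 # upper branch -> even, lower -> odd samples
    return x

g1 = inverse_with_injection(8, step=1, position=4)   # impulse response of G_1
g2 = inverse_with_injection(8, step=2, position=4)   # impulse response of G_2
energy1 = float(np.sum(g1 ** 2))
energy2 = float(np.sum(g2 ** 2))
gain1, gain2 = 0.5 * energy1, 0.5 * energy2          # average noise gains, Eq. (14)
```

Under these assumptions the measured responses agree with $G_{1u}(z) = 0$, $G_{1l}(z) = 1$ and $G_{2u}(z) = 1$: the first impulse reaches only one odd sample, while the second reaches one even sample and its two odd neighbours with weight 0.5, giving energies of 1 and 1.5.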


A similar formula is used to compute the influence of the compression noise, corresponding to the quantization of the wavelet coefficients. Let us call $\Gamma_\Delta^{(n)}(z)$ the spectral density function of the noise caused by the quantization of the high pass coefficients $d^{(n)}$ at level $n$ (Fig. 1). Similarly, let $\Gamma_{\Delta s}(z)$ correspond to the quantization of the low pass coefficients $s^{(N)}$ at the final level $N$. The final compression noise $\Phi_\Delta(z)$ is computed by:

$$\Phi_\Delta(z) = \sum_{n=1}^{N} \frac{1}{2^n}\,\Gamma_\Delta^{(n)}(z)\,G_{K+1}^{(n)}(z)\,G_{K+1}^{(n)}\!\left(\frac{1}{z}\right) + \frac{1}{2^N}\,\Gamma_{\Delta s}(z)\,G_K^{(N)}(z)\,G_K^{(N)}\!\left(\frac{1}{z}\right). \tag{18}$$

In the case of the DWT, $\Gamma_\Delta^{(n)}(z)$ and $\Gamma_{\Delta s}(z)$ are computed using Eq. (7) with different quantization step sizes according to the level $n$. For the IWT, however, Eq. (8) must be used for the same functions. The final difference between the IWT and the DWT depends on the difference between the two compression noise functions, with the addition of the rounding noise (Eq. (17)).

F. Two-Dimensional Transforms

Separable 2D transforms with the same bases for the horizontal and vertical directions are used for image compression. Without loss of generality, we assume that the original image is first decomposed along the vertical and then along the horizontal direction.

Let us first consider noise introduced along the horizontal direction. It passes through some lifting filters along the same direction, as in the 1D case (Eq. (14)). Then it is filtered again, but this time along the vertical direction. This last filtering corresponds either to the LP or to the HP branch. Let $z_1$ and $z_2$ refer respectively to the horizontal and vertical directions. The equivalent transfer functions for one basic transform block are

$$G_{LP}(z_1, z_2) = G_k(z_1)\,G_K(z_2) \tag{19}$$

and

$$G_{HP}(z_1, z_2) = G_k(z_1)\,G_{K+1}(z_2) \tag{20}$$

respectively for noise going through the LP and HP branches along the vertical direction. Adding $n - 1$ further transformation blocks corresponds to the up-sampling of the data and a multiplication by $G_K(z_1)G_K(z_2)$ for each additional level. Using the notation of the previous section, this corresponds to the addition of the exponent $(n)$ to all variables in Eqs. (19) and (20), i.e.

$$\Phi_{LP}^{(n)}(z_1, z_2) = \frac{1}{2^n}\,G_K^{(n)}(z_2)\,G_K^{(n)}\!\left(\frac{1}{z_2}\right) \sum_{k=1}^{K} \frac{1}{2^n}\,\Gamma_k^{(n)}(z_1)\,G_k^{(n)}(z_1)\,G_k^{(n)}\!\left(\frac{1}{z_1}\right) \tag{21}$$

and

$$\Phi_{HP}^{(n)}(z_1, z_2) = \frac{1}{2^n}\,G_{K+1}^{(n)}(z_2)\,G_{K+1}^{(n)}\!\left(\frac{1}{z_2}\right) \sum_{k=1}^{K} \frac{1}{2^n}\,\Gamma_k^{(n)}(z_1)\,G_k^{(n)}(z_1)\,G_k^{(n)}\!\left(\frac{1}{z_1}\right). \tag{22}$$

The same development is carried out for noise introduced along the vertical direction. When considering one transform block, the noise passes only through vertical filters. When further levels are added, only LP branches are used. The spectral density function of this category of noise introduced at level $n$ is:

$$\Phi_v^{(n)}(z_1, z_2) = \frac{1}{2^{n-1}}\,G_K^{(n-1)}(z_1)\,G_K^{(n-1)}\!\left(\frac{1}{z_1}\right) \sum_{k=1}^{K} \frac{1}{2^n}\,\Gamma_k^{(n)}(z_2)\,G_k^{(n)}(z_2)\,G_k^{(n)}\!\left(\frac{1}{z_2}\right) \tag{23}$$

with $G_K^{(0)}(z) = 1$ for simplicity. The total impact of the rounding noises is computed by summing Eqs. (21), (22) and (23) over all levels of decomposition:

$$\Phi_I(z_1, z_2) = \sum_{n=1}^{N} \left[ \Phi_{LP}^{(n)}(z_1, z_2) + \Phi_{HP}^{(n)}(z_1, z_2) + \Phi_v^{(n)}(z_1, z_2) \right]. \tag{24}$$

The same development is also applied to the compression noise. It is possible to use Eqs. (21), (22) and (23), considering that the high pass compression noise $\Gamma_\Delta^{(n)}(z_1)$ passes first through the equivalent filter $G_{K+1}(z_1)$. This also applies to the low pass noise $\Gamma_{\Delta s}(z_1)$, which instead goes through the transfer function $G_K(z_1)$. Because of the recursive structure of the wavelet transform, the compression noises are always filtered along the horizontal direction first. Due to their obvious similarity with Eqs. (21), (22) and (23), the power spectral density functions of the compression noises are not written here separately.

IV. Simulations and Results

A. Random Noise

In this section, the input data match exactly the hypotheses of Section III, i.e. white random noise uniformly distributed between 0 and 255. The noise is rounded to integer values prior to the experiments to match the pixel definition. Different values of the compression quantization step are used to verify its relationship to the final error. In order to be in the same environment as in the case of lossy to lossless image compression, the quantization step takes only values that are powers of two. The quantization step is chosen to be the same for all levels of decomposition. It is a function of the quantization factor $Q$, i.e. $\Delta = 2^Q$. An increase of $Q$ implies more compression and lower quality in the compressed data.

Two different filters are used. The first one is the 5/3 filter presented in [7], which has the advantage of being very compact and efficient for lossless coding. It consists of two lifting filters, each one represented by a rational filter of 2 taps. It is thus possible to compute this IWT using only integer arithmetic. The second one is the 9/7 filter presented in [24], which is one of the most commonly used filters in image compression. It is composed of 4 lifting steps with 2 taps each. Each impulse response is composed of irrational numbers.

Fig. 6 presents the result of the simulation together with the theoretical prediction. The curves represent the difference between the expected Mean Square Error (MSE) using the DWT and the IWT as a function of the quantization factor $Q$. This representation is used because it reduces the number of curves in the graph and makes even small differences between the IWT and the DWT visible. The prediction is accurate for this type of input, except for the 9/7 filter with small $Q$. In this particular case some cross-correlations, which have not been taken into consideration in the model, have a clear impact. For a small quantization step size $\Delta$, there is a high probability that the output of reconstruction step $K$ (after rounding) is exactly the same as the output of decomposition step $K$. In this case the two rounding noises are highly correlated. The construction of a model for this special case is possible and experiments have been performed by the authors. However, it is omitted here, because the impact of this particular point on two-dimensional image compression is rather small.
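The random-noise experiment described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' setup: it uses a single decomposition level in 1D, the 5/3 (LeGall) lifting coefficients, periodic extension, round-half-away-from-zero rounding and no subband gain, and all names are hypothetical. It returns the MSE of the DWT-based and IWT-based chains for a given quantization factor $Q$, with $\Delta = 2^Q$.

```python
import numpy as np

def sym_round(x):
    """Round to nearest integer, ties away from zero."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.floor(np.abs(x) + 0.5)

def forward(x, integer):
    q = sym_round if integer else (lambda v: v)
    s, d = x[0::2].astype(float), x[1::2].astype(float)
    d = d + q(-0.5 * (s + np.roll(s, -1)))     # prediction step
    s = s + q(0.25 * (d + np.roll(d, 1)))      # update step
    return s, d

def inverse(s, d, integer):
    q = sym_round if integer else (lambda v: v)
    s = s - q(0.25 * (d + np.roll(d, 1)))
    d = d - q(-0.5 * (s + np.roll(s, -1)))
    x = np.empty(2 * len(s))
    x[0::2], x[1::2] = s, d
    return x

def mse_after_coding(x, Q, integer):
    """MSE after transform, uniform quantization with step 2**Q, inverse transform."""
    delta = 2.0 ** Q
    s, d = forward(x, integer)
    s_q = delta * sym_round(s / delta)         # quantize and dequantize
    d_q = delta * sym_round(d / delta)
    y = inverse(s_q, d_q, integer)
    if not integer:
        y = sym_round(y)                       # DWT output is rounded to integer pixels
    return float(np.mean((y - x) ** 2))

rng = np.random.default_rng(1)
pixels = rng.integers(0, 256, size=4096)       # white, uniform, integer-valued input
results = {Q: (mse_after_coding(pixels, Q, integer=False),
               mse_after_coding(pixels, Q, integer=True))
           for Q in range(0, 6)}               # Q -> (MSE with DWT, MSE with IWT)
```

For $Q = 0$ the IWT chain in this sketch is lossless (an MSE of exactly zero), which corresponds to the lossless mode discussed in Section III-C, while the DWT chain still carries quantization and final rounding errors.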

Another view of the difference is given in Fig. 7, where the same results are presented, but this time using the Peak Signal to Noise Ratio (PSNR), defined as $\mathrm{PSNR} = 10 \log_{10}(255^2/\mathrm{MSE})$. The spikes for $Q$ values smaller than 1 are explained by the fact that the IWT-based codec has reached the lossless mode. Fig. 7 shows that for small values of $Q$, i.e. close to the lossless rate, using the IWT with the 9/7 filter decreases the PSNR by 6 dB and increases the MSE by 0.6. For the 5/3 filter, the degradation caused by the use of the IWT is much smaller. In the worst case, the MSE is increased by 0.2, corresponding to a loss of 3.5 dB in PSNR. Other experiments have been performed, showing that, in general, filters with rational coefficients are better suited to lossless coding than those having irrational coefficients in their impulse response.

B. Compressed Images

Next, results are given for natural images and a real compression algorithm. The compression algorithm is derived from the EZW scheme [1] with uniform quantization of the wavelet coefficients. When real image compression is considered, many unknowns are added to the settings of Section III. Most of them are caused by the non-linear relationship between the bitrate and the $Q$ factor. To overcome this problem, let us consider the results as a function of $Q$. The quantization step size $\Delta$ of the subband with the lowest frequencies is then $2^Q$. The other subbands are quantized with a multiple of this factor, as shown in Fig. 8. The relationship between $\Delta$ and $Q$ is computed using the gain of the lifting scheme [5], which has been omitted so far. As explained in [7], in the case of the IWT, the gain at the end of the lifting chain is replaced by 3 additional lifting steps. As this addition would increase the noise caused by the rounding, it is preferable to modify the size of the quantization step instead. The scaling factors presented in Fig. 8 are computed by choosing the power of 2 closest to the desired gain.
As the gain depends on the filter, the relationship between $\Delta$ and $Q$ differs between the 5/3 and the 9/7 filters. In the IWT case the wavelet coefficients are all integers. Thus the quantization step sizes do not take values under 1, to avoid the coding of predictive information.

Fig. 9 presents the PSNR difference between the DWT and the IWT as a function of $Q$ for the experiments with natural images. These results are the average over 10 different images. Notice the excellent alignment of the theoretical curves and their experimental counterparts. It appears clearly that for large $Q$, i.e. large $\Delta$ or high compression, both transformations lead to the same error. In other words, the differences converge towards zero. Hence, the two transforms are equivalent for very lossy compression, while the IWT gives inferior quality when the quality is close to lossless. A special behaviour occurs for $Q = 1$ and $Q = 2$, respectively for the 9/7 and the 5/3 filter. At these values of $Q$, all quantization steps are smaller than or equal to 1, as shown in Fig. 8. Thus the IWT reaches lossless compression and its PSNR equals infinity. As a reference, the PSNR of the DWT-based compression of lena varies approximately from 60 dB to 30 dB over the addressed quality range.

Let us now consider the bitrate as a function of $Q$. The difference in bitrate between the DWT and the IWT is shown in Fig. 10 for the same filters. The IWT-based scheme needs, in the case of the 9/7 filter, up to 0.3 bpp more than the DWT to transmit coefficients quantized with the same quantization factor. This corresponds to approximately 8% of the bitrate at this quantization factor. One way to understand this increase is to recall the hypothesis that the rounding following each lifting step can be replaced by an additive random noise. In this case, the IWT coefficients prior to compression are equivalent to those of the DWT with the addition of the noise. The IWT coefficients are noisier than those of the DWT, resulting in an increase in entropy. Further investigations are necessary in order to build a model for this effect. This aspect should also be taken into consideration when comparing the IWT and DWT approaches.

C. Impact on Visual Quality

So far, the effect of the IWT on the compression performance has been discussed in terms of MSE. It is well known that MSE and PSNR are not always well correlated with visual quality.
For this reason, this subsection will try to asses the dierence in terms of visual quality between the two approaches based on an objective metric more elaborate than MSE or PSNR. Many schemes comparing the visual quality of degraded images exist and all of them use a dierent metric [25]. One metric that show a good correlation to the human assessment is the MPQM quality metric [26], [27]. We will use it here. It measures the quality of the image according to the CCIR-500 recommendation . The scale of the quality varies from 1 to 5. A score of 5 means that the compressed image is of the same
October 4, 2000 DRAFT

16

visual quality as the original and a 1 represents a very bad quality. Two images whose quality difference is below 0.1 are not distinguishable by a human observer. The MPQM has been measured with 10 images compressed using different Q factors. In order to point out the (small) difference between the IWT and the DWT, the average difference between the MPQM of the two approaches is presented in Fig. 11. The maximum difference is 0.12. There is a high probability that most human observers would not notice this difference. Another interesting point is that the maximum visual difference does not occur at the same Q factor as the maximum MSE difference.

V. Conclusions

This paper presented an analysis of the differences between the Integer Wavelet Transform (IWT) and the infinite-precision Discrete Wavelet Transform (DWT) for lossy to lossless image compression. A model for the degradation of quality caused by the use of the IWT was presented. It is based on the hypothesis that the non-linear rounding operation can be replaced by an additive white noise. Different configurations of noise have been studied. To compute the impact of the various noise sources on the reconstructed pixels, equivalent transfer functions were computed. The development was done first for a one-level transformation, then for the one-dimensional multi-level case and finally for the entire 2D system.

The model was verified via two sets of simulations. The first used white noise as input to verify the theory. The second used an entire compression system and natural images to check the validity of the theory in the framework of real applications. Both types of experiments showed that the theoretical predictions were very close to the measured data.

Furthermore, experiments concerning the visual quality of the two approaches were performed. The Mean Square Error (MSE) showed significant differences between the two approaches. The IWT can lead to much larger degradations than the DWT, especially for small quantization steps, i.e., small compression factors. However, the differences in terms of visual quality are rather small regardless of the compression factor. Finally, the two transforms are equivalent for large compression ratios from both an MSE and a visual quality point of view.
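As a minimal illustration of why the rounding after each lifting step does not prevent lossless reconstruction, the following Python sketch implements one level of an integer lifting transform for the 5/3 filter pair. The lifting coefficients (1/2 for the prediction, 1/4 for the update) are the standard 5/3 factorization; the periodic signal extension, the tie-breaking rule in `round_half_away` and all function names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def round_half_away(x):
    """Round to the nearest integer, ties away from zero (assumed rule)."""
    x = np.asarray(x, dtype=np.float64)
    return (np.sign(x) * np.floor(np.abs(x) + 0.5)).astype(np.int64)

def iwt53_forward(x):
    """One level of integer 5/3 lifting on an even-length 1-D signal."""
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2], x[1::2]
    # Prediction step: detail = odd sample minus rounded average of its neighbours.
    d = odd - round_half_away((even + np.roll(even, -1)) / 2.0)
    # Update step: approximation = even sample plus rounded correction from details.
    s = even + round_half_away((d + np.roll(d, 1)) / 4.0)
    return s, d

def iwt53_inverse(s, d):
    """Undo the lifting steps in reverse order with the same rounded values."""
    even = s - round_half_away((d + np.roll(d, 1)) / 4.0)
    odd = d + round_half_away((even + np.roll(even, -1)) / 2.0)
    x = np.empty(even.size + odd.size, dtype=np.int64)
    x[0::2], x[1::2] = even, odd
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.integers(0, 256, size=64)
    s, d = iwt53_forward(x)
    print("perfect reconstruction:", np.array_equal(iwt53_inverse(s, d), x))
```

Because the inverse subtracts exactly the rounded quantity that the forward transform added, reconstruction is exact whatever rounding rule is chosen; the rounding only affects the coefficients themselves, which is the degradation the model above describes in the lossy case.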


Appendix

A. Variance of Filter with Rational Coefficients

Let us recall that, in the case of lifting filters with rational coefficients, it is possible to find a common denominator $D_k$ for all coefficients. In this case the quantization noise is a multiple of $1/D_k$. Let $d_k$ represent the amplitude of the noise multiplied by $D_k$, so that $d_k$ takes integer values. The probability distribution of the rounding noise can be established as follows. For a positive input signal, the noise takes values in $(-0.5, 0.5]$ in steps of $1/D_k$, each with the same probability $1/D_k$. For negative input signals the corresponding interval is $[-0.5, 0.5)$, as given by Eq. (3). Thus, over the entire dynamic range of the input signal, the distribution of the noise $p(d_k)$ can take two different forms. If no rounding error is equal to $-0.5$ or $0.5$, the distribution is uniform over the $D_k$ possible values of the rounding noise. This is the case for odd values of $D_k$. On the other hand, for even values of $D_k$, the distribution of the noise is

$$\left\{ \frac{1}{2D_k},\ \frac{1}{D_k},\ \frac{1}{D_k},\ \ldots,\ \frac{1}{2D_k} \right\} \tag{25}$$

over $[-0.5, 0.5]$, the union of $[-0.5, 0.5)$ and $(-0.5, 0.5]$. The variance of the quantization error can be computed by summing all fractional errors, weighted by their probability:

$$\sigma_k^2 = \frac{1}{D_k} \sum_{d_k = -(D_k-1)/2}^{(D_k-1)/2} \left( \frac{d_k}{D_k} \right)^2 = \frac{1}{12} \left( 1 - \frac{1}{D_k^2} \right) \tag{26}$$

for odd values of $D_k$, and

$$\sigma_k^2 = \sum_{d_k = -D_k/2}^{D_k/2} \left( \frac{d_k}{D_k} \right)^2 p(d_k) = \frac{1}{12} \left( 1 + \frac{2}{D_k^2} \right) \tag{27}$$

for even values, where, according to Eq. (25),

$$p(d_k) = \begin{cases} \dfrac{1}{2D_k} & \text{for } d_k \in \{-D_k/2,\ D_k/2\} \\[2mm] \dfrac{1}{D_k} & \text{otherwise.} \end{cases} \tag{28}$$

Eqs. (26) and (27) correspond to Eq. (4) with a correction term that accounts for the discrete nature of the noise.
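The closed forms of Eqs. (26) and (27) can be checked by evaluating the sums directly. The sketch below does this with exact rational arithmetic; the function names are hypothetical, and the probabilities are those stated in Eqs. (25) and (28).

```python
from fractions import Fraction

def rounding_noise_variance(D):
    """Variance of the rounding noise on a grid of step 1/D, by direct summation."""
    if D % 2 == 1:
        # Odd D: uniform over the D values -(D-1)/2 ... (D-1)/2.
        support = range(-(D - 1) // 2, (D - 1) // 2 + 1)
        prob = lambda d: Fraction(1, D)
    else:
        # Even D: the two extremes -D/2 and D/2 each carry probability 1/(2D).
        support = range(-D // 2, D // 2 + 1)
        prob = lambda d: Fraction(1, 2 * D) if abs(d) == D // 2 else Fraction(1, D)
    assert sum(prob(d) for d in support) == 1  # sanity check: a valid distribution
    return sum(Fraction(d, D) ** 2 * prob(d) for d in support)

def closed_form_variance(D):
    """Right-hand sides of Eq. (26) for odd D and Eq. (27) for even D."""
    if D % 2 == 1:
        return Fraction(1, 12) * (1 - Fraction(1, D * D))
    return Fraction(1, 12) * (1 + Fraction(2, D * D))

if __name__ == "__main__":
    for D in (1, 2, 3, 4, 8, 16):
        print(D, rounding_noise_variance(D), closed_form_variance(D))
```

For example, with $D_k = 2$ the noise takes the values $-0.5$, $0$ and $0.5$ with probabilities $1/4$, $1/2$ and $1/4$, which gives a variance of $1/8$, in agreement with Eq. (27). As $D_k$ grows, both expressions tend to the continuous-case value $1/12$.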

B. Cross-correlation Between Rounding Noises

Let us now examine the possible correlation between the rounding noises of different lifting steps. Let us recall that the amplitude of the quantization noise can only be in $(-0.5, 0.5]$ and $[-0.5, 0.5)$ for positive and negative inputs, respectively. As the two intervals overlap on all values except the extremes, only $-0.5$ and $0.5$ can carry information about the sign of the input signal. Let $P_{kl}$ be the probability that the inputs at steps $k$ and $l$ have the same sign. If the error of step $k$ is equal to $-0.5$, the probability distribution of the noise at step $l$ becomes:

$$\left\{ \frac{P_{kl}}{D_l},\ \frac{1}{D_l},\ \frac{1}{D_l},\ \ldots,\ \frac{1 - P_{kl}}{D_l} \right\} \tag{29}$$

Symmetrically, for an error equal to $0.5$ the distribution is:

$$\left\{ \frac{1 - P_{kl}}{D_l},\ \frac{1}{D_l},\ \frac{1}{D_l},\ \ldots,\ \frac{P_{kl}}{D_l} \right\} \tag{30}$$

Accordingly, the distribution of the noise at step $l$ depends on the noise at step $k$. To compute the cross-correlation, all the different combinations of errors introduced by the two roundings at steps $k$ and $l$ are summed, weighted by their probability. The cross spectral density function $\Gamma_{k,l}(z)$ is constant and given by:

$$\Gamma_{k,l}(z) = \sum_{d_k = -D_k/2}^{D_k/2}\ \sum_{d_l = -D_l/2}^{D_l/2} \left( \frac{d_k}{D_k} \right) \left( \frac{d_l}{D_l} \right) p(d_k, d_l), \tag{31}$$

where $p(d_k, d_l)$ is the joint probability distribution of the noises at steps $k$ and $l$. Using the principle that leads to Eqs. (29) and (30), it is possible to compute $p(d_k, d_l) = p(d_l | d_k)\, p(d_k)$ as a function of $d_k$ and $d_l$:

$$p(d_l | d_k) = \begin{cases} \dfrac{1}{D_l} & \text{for } -\dfrac{D_l}{2} < d_l < \dfrac{D_l}{2},\ \forall d_k \\[2mm] \dfrac{1}{2D_l} & \text{for } d_l \in \left\{ -\dfrac{D_l}{2}, \dfrac{D_l}{2} \right\} \text{ and } -\dfrac{D_k}{2} < d_k < \dfrac{D_k}{2} \\[2mm] \dfrac{P_{kl}}{D_l} & \text{for } (d_k, d_l) \in \left\{ \left( -\dfrac{D_k}{2}, -\dfrac{D_l}{2} \right), \left( \dfrac{D_k}{2}, \dfrac{D_l}{2} \right) \right\} \\[2mm] \dfrac{1 - P_{kl}}{D_l} & \text{for } (d_k, d_l) \in \left\{ \left( -\dfrac{D_k}{2}, \dfrac{D_l}{2} \right), \left( \dfrac{D_k}{2}, -\dfrac{D_l}{2} \right) \right\} \end{cases} \tag{32}$$

Using Eqs. (28) and (32), Eq. (31) can be rewritten as:

$$\Gamma_{k,l}(z) = \frac{2P_{kl} - 1}{4 D_k D_l} \tag{33}$$
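The step from Eq. (31) to Eq. (33) can be checked numerically by evaluating the double sum with the conditional distribution of Eq. (32) and the marginal of Eq. (28). The sketch below assumes even denominators $D_k$ and $D_l$ (the case in which the extreme values $\pm 0.5$ occur); the function names are hypothetical and exact fractions are used to avoid round-off.

```python
from fractions import Fraction

def marginal(dk, Dk):
    """p(d_k) of Eq. (28) for an even denominator Dk."""
    return Fraction(1, 2 * Dk) if abs(dk) == Dk // 2 else Fraction(1, Dk)

def conditional(dl, dk, Dk, Dl, P):
    """p(d_l | d_k) of Eq. (32); P is the same-sign probability P_kl."""
    if abs(dl) != Dl // 2:
        return Fraction(1, Dl)        # interior value of d_l, any d_k
    if abs(dk) != Dk // 2:
        return Fraction(1, 2 * Dl)    # extreme d_l, interior d_k
    same_side = (dk > 0) == (dl > 0)  # both extremes on the same side
    return (P if same_side else 1 - P) / Dl

def cross_correlation(Dk, Dl, P):
    """Direct evaluation of the double sum in Eq. (31)."""
    total = Fraction(0)
    for dk in range(-Dk // 2, Dk // 2 + 1):
        for dl in range(-Dl // 2, Dl // 2 + 1):
            joint = conditional(dl, dk, Dk, Dl, P) * marginal(dk, Dk)
            total += Fraction(dk, Dk) * Fraction(dl, Dl) * joint
    return total

def closed_form(Dk, Dl, P):
    """Right-hand side of Eq. (33)."""
    return (2 * P - 1) / (4 * Dk * Dl)

if __name__ == "__main__":
    P = Fraction(3, 4)
    print(cross_correlation(2, 4, P), closed_form(2, 4, P))
```

Only the four corner combinations contribute to the sum, since for every other value of $d_k$ the conditional distribution of $d_l$ is symmetric around zero. The result vanishes for $P_{kl} = 1/2$, i.e., when the signs of the two inputs are independent.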

Julien Reichel (SM 1999) was born in Geneva, Switzerland, on January 13, 1973. He received his diploma (M.S.) in electrical engineering from the Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland, in 1996. In 1996 he joined the Signal Processing Laboratory of Prof. Murat Kunt at the EPFL, where he is currently working towards his Ph.D. degree. His research interests are primarily in the area of image compression, algorithm complexity and human vision.

Gloria Menegaz received the M.Sc. degree in Electronic Engineering from the Polytechnic University of Milan, Italy, in July 1993, the postgraduate Master degree in Information Technology from the Research and Education Center in Information Technology (Cefriel) of Milan, Italy, in July 1995, and the Ph.D. degree from the Swiss Federal Institute of Technology of Lausanne (EPFL), Switzerland, in July 2000. She is currently a post-doctoral fellow at the Biomedical Imaging Group of the EPFL. Her research interests are primarily in the area of multi-dimensional signal processing and coding, modeling and visualization.

Marcus J. Nadenau (SM 1996) was born in Germany in 1971. He received his diploma (M.S.) in electrical engineering from the University of Technology, Aachen, Germany, in 1997. In 1997 he joined the Signal Processing Laboratory of Prof. Murat Kunt at the Swiss Federal Institute of Technology, Lausanne, Switzerland. He is currently working towards his Ph.D. degree. His research interests include image compression, color and human vision.

Murat Kunt (SM 1970, M 1974, SM 1980, F 1986) was born in Ankara, Turkey, on January 16, 1945. He received his M.S. in Physics and his Ph.D. in Electrical Engineering, both from the Swiss Federal Institute of Technology, Lausanne, Switzerland, in 1969 and 1974 respectively. From 1974 to 1976, he was a visiting scientist at the Research Laboratory of Electronics of the Massachusetts Institute of Technology, where he developed compression techniques for X-ray images and electronic image files. In 1976, he returned to the Swiss Federal Institute of Technology (EPFL) where, presently, he is Professor of Electrical Engineering and Director of the Signal Processing Laboratory, one of the largest laboratories at EPFL. He conducts teaching and research in digital signal and image processing with applications to modeling, coding, pattern recognition, scene analysis, industrial developments and biomedical engineering. His Laboratory participates in a large number of European projects under various Programmes such as Esprit, Eureka, Race, HCM, Commett and Cost. He is the author or co-author of more than two hundred research papers and fifteen books, and holds seven patents. He is the Editor-in-Chief of the Signal Processing Journal and a founding member of EURASIP, the European Association for Signal Processing. He serves as a chairman and/or a member of the Scientific Committees of several international conferences and on the editorial boards of the Proceedings of the IEEE, Pattern Recognition Letters and Traitement du Signal. He was the co-chairman of the first European Signal Processing Conference, which was held in Lausanne in 1980, and the general chairman of the International Image Processing Conference (ICIP'96) held in Lausanne in 1996. He was the President of the Swiss Association for Pattern Recognition from its creation till 1997. He consults for governmental offices including the French General Assembly and major IT companies. He received the gold medal of EURASIP for meritorious services, the IEEE ASSP technical achievement award and the IEEE Third Millennium Medal in 1983, 1997 and 2000 respectively.

References
[1] Jerome M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3445-3462, December 1993.
[2] Amir Said and William A. Pearlman, "A new fast and efficient image codec based on set partitioning in hierarchical trees," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, pp. 243-250, June 1996.
[3] D. Taubman and A. Zakhor, "Multirate 3-D subband coding of video," IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 572-588, September 1994.
[4] Zixiang Xiong, Kannan Ramchandran, and Michael T. Orchard, "Space-frequency quantization for wavelet image coding," IEEE Transactions on Image Processing, vol. 6, no. 5, pp. 677-693, May 1997.
[5] W. Sweldens, "The lifting scheme: A custom-design construction of biorthogonal wavelets," Appl. Comput. Harmon. Anal., vol. 3, no. 2, pp. 186-200, 1996.
[6] I. Daubechies and W. Sweldens, "Factoring wavelet transforms into lifting steps," J. Fourier Anal. Appl., vol. 4, no. 3, pp. 245-267, 1998.
[7] R. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo, "Lossless image compression using integer to integer wavelet transforms," in International Conference on Image Processing (ICIP), Vol. I, 1997, pp. 596-599, IEEE Press.
[8] R. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo, "Wavelet transforms that map integers to integers," Appl. Comput. Harmon. Anal., vol. 5, no. 3, pp. 332-369, 1998.
[9] Intel Corporation, "Intel architecture MMX(TM) technology, programmer's reference manual," Tech. Rep., Intel Corporation, March 1996.
[10] F. Sheng, A. Bilgin, P. J. Sementilli, and M. W. Marcellin, "Lossy and lossless image compression using reversible integer wavelet transforms," in Proceedings 1998 International Conference on Image Processing, IEEE, Ed., Los Alamitos, CA, USA, 1998, vol. 3, pp. 876-880.
[11] M. D. Adams and F. Kossentini, "Reversible integer-to-integer wavelet transforms for image compression: Performance evaluation and analysis," IEEE Transactions on Image Processing, vol. 9, no. 6, pp. 1010-1024, June 2000.
[12] A. Grzeszczak, M. Mandal, and S. Panchanathan, "VLSI implementation of discrete wavelet transform," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 4, no. 4, pp. 421-433, December 1996.
[13] E. Abdel-Raheem, A. Tawfik, and F. El-Guibaly, "Effects of finite word length in two-channel QMF banks," in IEEE Symposium on Advances in Digital Filtering and Signal Processing, New York, NY, USA, 1998, pp. 144-148, IEEE.
[14] J. Artigas, L. Barragan, J. Beltran, E. Laloya, J. Moreno, D. Navarro, and A. Roy, "Word length considerations on the hardware implementation of two-dimensional Mallat's wavelet transform," Optical Engineering, vol. 35, no. 4, pp. 1198-1212, April 1996.
[15] N. Uzun and R. Haddad, "Cyclostationary modeling, analysis, and optimal compensation of quantization errors in subband codecs," IEEE Transactions on Signal Processing, vol. 43, no. 9, pp. 2109-2119, September 1995.
[16] P. H. Westerink, J. Biemond, and D. Boekee, "Scalar quantization error analysis for image subband coding using QMFs," IEEE Transactions on Signal Processing, vol. 40, no. 2, pp. 421-428, February 1992.
[17] John W. Woods and T. Naveen, "A filter based bit allocation scheme for subband compression of HDTV," IEEE Transactions on Image Processing, vol. 1, no. 3, pp. 436-440, July 1992.
[18] O. Egger and M. Kunt, "Embedded zerotree based lossless image coding," in Proceedings of the International Conference on Image Processing ICIP, Washington, USA, September 1995, vol. III, pp. 616-619.
[19] Amir Said and William A. Pearlman, "An image multiresolution representation for lossless and lossy compression," IEEE Transactions on Image Processing, vol. 5, no. 9, pp. 1303-1310, September 1996.
[20] M. D. Adams and F. Kossentini, "Performance evaluation of reversible integer-to-integer wavelet transforms for image compression," in IEEE Data Compression Conference, Snowbird, UT, USA, March 1999, pp. 514-524, IEEE.
[21] S. Mallat and F. Falzon, "Analysis of low bit rate image transform coding," IEEE Transactions on Signal Processing, vol. 46, no. 4, pp. 1027-1042, April 1998.
[22] Allen Gersho and Robert M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, Boston/Dordrecht/London, 1992.
[23] A. Papoulis, Probability, Random Variables, and Stochastic Processes, pp. 373-376, McGraw-Hill, New York, USA, 1991.
[24] Marc Antonini, Michel Barlaud, Pierre Mathieu, and Ingrid Daubechies, "Image coding using wavelet transform," IEEE Transactions on Image Processing, vol. 1, no. 2, pp. 205-220, April 1992.
[25] Video Quality Experts Group, "Final report from the Video Quality Experts Group on the validation of objective models of video quality assessment," VQEG, 2000, available at ftp://ftp.its.bldrdoc.gov/dist/ituvidq/.
[26] C. J. van den Branden Lambrecht and O. Verscheure, "Perceptual quality measure using a spatio-temporal model of the human visual system," Proceedings of the SPIE, vol. 2668, pp. 450-461, January 1996.
[27] Christian J. van den Branden Lambrecht and Murat Kunt, "Characterization of human visual sensitivity for video imaging applications," IEEE Transactions on Image Processing, vol. 67, no. 3, pp. 255-269, June 1998.


List of Figures

1 Basic lifting-based wavelet decomposition. The filters $H_i(z)$ with an even index $i$ are called prediction steps, the ones with an odd index are called update steps.
2 IWT based on the lifting scheme. Each lifting step is followed by a rounding to the nearest integer.
3 Equivalent lifting scheme structure for the IWT. The non-linearity introduced by the rounding operations after each filter has been replaced by additive random noise (with power spectral density functions $\Gamma_{kd}(z)$ and $\Gamma_{kr}(z)$ for the decomposition and the reconstruction noise, respectively). The quantization due to the compression is also represented as an additive noise source.
4 Equivalent transfer function for the noise introduced at step $k$. The filters $H_i(z)$, $i \le k$, are combined to find the transfer functions $G_{ku}(z)$ and $G_{kl}(z)$. Those two functions are then upsampled and merged into $G_k(z)$.
5 Trajectories of the noises on the reconstruction side of the transform. The dashed lines represent the trajectories of the noises with spectral densities $\Gamma_{1r}$ and $\Gamma_{2r}$.
6 Comparison between the DWT and the IWT for (a) 1 and (b) 3 levels of decomposition. The graph presents the MSE differences between the IWT and the DWT. Both transforms are evaluated with the 5/3 and the 9/7 filters. Random white noise is used as input. A quantization factor Q = 0 signifies lossless compression (MSE ≈ 0), and a large Q implies lossy compression (MSE ≈ 100). The simulation curves are the average over 100 simulations. The error bars correspond to one standard deviation.
7 PSNR difference between the DWT and the IWT for (a) 1 and (b) 3 levels of decomposition. White noise is used as input. The absolute PSNR values vary approximately from 55 to 30 dB for a quantization factor Q between 0 and 5. For Q < 1 the IWT reaches the lossless mode and its PSNR is infinite.
8 Quantization step size as a function of the quantization factor Q for (a) the 9/7 and (b) the 5/3 filter. For lossless quantization (unit step size), Q must be equal to 1 and 2 for (a) and (b), respectively.
9 PSNR difference between the IWT and the DWT as a function of the quantization factor Q. (Average over 10 different natural images with three levels of transform.) Because of the scaling factor presented in Fig. 8, the IWT reaches lossless compression for Q = 1 for the 9/7 and Q = 2 for the 5/3 filter.
10 Bitrate as a function of the quantization step of the lowest subband. Average over 10 images.
11 Difference of CCIR quality between the IWT and the DWT as a function of the quantization factor Q.


[Block diagram: lifting structure with filters $H_1(z)$, $H_2(z)$, ..., $H_{K-1}(z)$, $H_K(z)$ alternating between the prediction and update branches.]

Fig. 1. Basic lifting-based wavelet decomposition. The filters $H_i(z)$ with an even index $i$ are called prediction steps, the ones with an odd index are called update steps.

[Block diagram: lifting structure of Fig. 1 with a Round() block after each filter $H_1(z)$, ..., $H_K(z)$.]

Fig. 2. IWT based on the lifting scheme. Each lifting step is followed by a rounding to the nearest integer.

[Block diagram: lifting structure with additive noise sources after each filter on the decomposition and reconstruction sides.]

Fig. 3. Equivalent lifting scheme structure for the IWT. The non-linearity introduced by the rounding operations after each filter has been replaced by additive random noise (with power spectral density functions $\Gamma_{kd}(z)$ and $\Gamma_{kr}(z)$ for the decomposition and the reconstruction noise, respectively). The quantization due to the compression is also represented as an additive noise source.
[Block diagram: reconstruction filters $-H_k(z)$, ..., $-H_2(z)$, $-H_1(z)$ acting on the noise $\Gamma_{kr}(z)$, and the equivalent transfer functions $G_{ku}(z)$, $G_{kl}(z)$ and $G_k(z)$.]

Fig. 4. Equivalent transfer function for the noise introduced at step $k$. The filters $H_i(z)$, $i \le k$, are combined to find the transfer functions $G_{ku}(z)$ and $G_{kl}(z)$. Those two functions are then upsampled and merged into $G_k(z)$.

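[Block diagram: reconstruction filters $-H_3(z)$, $-H_2(z)$, $-H_1(z)$ with the noise sources $\Gamma_{1r}(z)$, $\Gamma_{2r}(z)$ and $\Gamma_{3r}(z)$.]

Fig. 5. Trajectories of the noises on the reconstruction side of the transform. The dashed lines represent the trajectories of the noises with spectral densities $\Gamma_{1r}$ and $\Gamma_{2r}$.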

[Plots, panels (a) 1 level of decomposition and (b) 3 levels of decomposition: MSE IWT - MSE DWT versus quantization factor, with theory and simulation curves for the 9/7 and 5/3 filters.]

Fig. 6. Comparison between the DWT and the IWT for (a) 1 and (b) 3 levels of decomposition. The graph presents the MSE differences between the IWT and the DWT. Both transforms are evaluated with the 5/3 and the 9/7 filters. Random white noise is used as input. A quantization factor Q = 0 signifies lossless compression (MSE ≈ 0), and a large Q implies lossy compression (MSE ≈ 100). The simulation curves are the average over 100 simulations. The error bars correspond to one standard deviation.


[Plots, panels (a) 1 level of decomposition and (b) 3 levels of decomposition: PSNR DWT - PSNR IWT versus quantization factor, with theory and simulation curves for the 9/7 and 5/3 filters.]

Fig. 7. PSNR difference between the DWT and the IWT for (a) 1 and (b) 3 levels of decomposition. White noise is used as input. The absolute PSNR values vary approximately from 55 to 30 dB for a quantization factor Q between 0 and 5. For Q < 1 the IWT reaches the lossless mode and its PSNR is infinite.

[Diagrams (a) and (b): quantization step size assigned to each subband as a function of Q.]

Fig. 8. Quantization step size as a function of the quantization factor Q for (a) the 9/7 and (b) the 5/3 filter. For lossless quantization (unit step size), Q must be equal to 1 and 2 for (a) and (b), respectively.


[Plot "Compression: PSNR Difference": PSNR DWT - PSNR IWT versus quantization factor, with theory and simulation curves for the 9/7 and 5/3 filters.]

Fig. 9. PSNR difference between the IWT and the DWT as a function of the quantization factor Q. (Average over 10 different natural images with three levels of transform.) Because of the scaling factor presented in Fig. 8, the IWT reaches lossless compression for Q = 1 for the 9/7 and Q = 2 for the 5/3 filter.

[Plot "Bitrate": (bit/pix IWT) - (bit/pix DWT) versus quantization factor, for the 9/7 and 5/3 filters.]

Fig. 10. Bitrate as a function of the quantization step of the lowest subband. Average over 10 images.


[Plot "Visual Difference According to Recommendation CCIR 500": CCIR IWT - CCIR DWT versus quantization factor, for the 9/7 and 5/3 filters.]

Fig. 11. Difference of CCIR quality between the IWT and the DWT as a function of the quantization factor Q.
