
Indian Journal of Science and Technology, Vol 10(2), DOI: 10.17485/ijst/2017/v10i2/110615, January 2017
ISSN (Print): 0974-6846
ISSN (Online): 0974-5645

VLSI Implementation of Low Power and High Speed Architecture of DWT-IDWT using Lifting based Algorithm
Chetan H1 and Dr. Indumathi G2
1 Visvesvaraya Technological University, Jnana Sangama, Belagavi 590018, Karnataka, India; Chetan.h.gowda@gmail.com
2 Cambridge Institute of Technology, SR Layout, Chikkabasavanapura, Krishnarajapura, Bengaluru 560036, Karnataka, India; indumathi.ece@citech.edu.in

Abstract
Objective: The purpose of this study is to optimize the DWT1-IDWT2 architecture for different image compression techniques using lifting-based algorithms. Statistical Analysis: Data in the form of images and video are transmitted as signals. Because channel bandwidth is limited, the data has to be compressed, and this reduces the quality of the image. Wavelets provide an algorithmic way of encoding information in layers ordered by level of detail. The analysis of this implementation covers speed optimization, accuracy, and power reduction. The study uses a pipelined 1D-DWT architecture, combined in parallel with a second 1D-DWT module to obtain a 2D-DWT architecture, to analyze the speed. Findings: The study was carried out using VLSI CAD tools, with coding done in Verilog. By implementing the proposed algorithm with a pipelined-parallel architecture for image compression using DWT, we analyzed the timing wrt* clock speed, and we analyzed PSNR and SNR for different video and image compression techniques. Improvements: Our study shows that higher speed can be achieved by using DWT for image compression, and that the design can be optimized further by using a VLSI architecture.

Keywords: Compression, DWT, IDWT, Lifting Algorithm, Low Power

* wrt: with respect to
1 DWT: Discrete Wavelet Transform
2 IDWT: Inverse Discrete Wavelet Transform

1. Introduction
The demand for bandwidth and storage capacity during transmission in uncompressed multimedia devices is high. Though currently available technologies have shown good progress in speed, storage density and system performance, the requirements for storage capacity, transmission and bandwidth far outweigh them. Web applications demand efficient encoding techniques for signal compression, which is also key for storage in communication technology. Rich cellular data communication is in demand to meet high bandwidth requirements for various data formats such as voice, data, image and video. The significant challenge in multimedia information services is the need to process and transmit vast amounts of information. This places extreme demands on the battery resources and bandwidth of devices. Although improvements in data transmission are achievable, battery technology will not keep pace with future energy requirements. One strategy to overcome this problem is to reduce the volume of multimedia data transmitted over the wireless channel through compression methods such as JPEG, JPEG 2000 and MPEG1. These methodologies focus on enhancing

*Author for correspondence



compression without degrading picture quality when transmitting over a specified channel. When transmitting data we also consider power consumption: decreasing the amount of data sent over the wireless channel through compression strategies such as JPEG and MPEG is one way to deal with the above issue. The focus of these methodologies is to obtain a higher compression ratio without sacrificing picture quality. Here, processing power is also one of the parameters.

Figure 1. Discrete Wavelet Transform2.

The block diagram in Figure 1 describes a communication system that uses the discrete wavelet transform. The communication system represents orthogonal frequency division multiplexing, which effectively increases spectrum efficiency. The discrete wavelet transform used is designed without multiplication operations in order to improve the speed of the whole communication system.

2. Literature Analysis
The performance of OFDM and WPM over a multipath wireless channel is compared using a time-domain Minimum Mean Square Error (MMSE) equalizer. Compared with the OFDM system, WPM with a time-domain MMSE equalizer gives higher immunity to NBI for multipath wireless channels. The developments in ubiquitous connectivity technologies and the convergence of ICT (Information and Communication Technologies) have posed a wide array of challenges for emerging technologies and architectures, which must handle huge volumes of data under tight constraints of bandwidth and power.
Further, a wavelet-packet-based FFT is proposed3, and the application of wavelets to SNR estimation is discussed. The computed solution matches the exact results, and its computational complexity is of the same order as that of the FFT3.
In implementing wavelet packets, the effects of implementation and the requirements considered in the design of usable wavelets are studied4; the constraints imposed by lossless reconstruction lead to the use of biorthogonal wavelets. However, this affects performance. The frequency behavior of the wavelet packet transform is complex in practical use of this transform in a multicarrier system4. For our work, we need the DWT for data compression; hence Figure 1 shows the 2D-DWT and 2D-IDWT compression algorithm.
Wavelet coding schemes are better suited to the characteristics of the HVS (Human Visual System). Wavelet coding methodologies avoid block artifacts when used at higher compression rates.
Because the transformation can be applied repeatedly, wavelet compression is scalable, resulting in very high compression ratios. With wavelets, parametric gain control for image sharpening and softening is possible. Wavelet-based coding is also robust to transmission and decoding errors, and progressive transmission of images is feasible. Efficient compression is achieved at low bit rates, and wavelets facilitate efficient decomposition of signals prior to compression. Further, a DWT is designed using the lifting scheme2; its transform module includes two horizontal filters and two vertical filters working in parallel and in pipeline. The performance analysis of the results shows that it provides a better image compression ratio with simple steps, which makes it suitable for VLSI implementation.
Hence, in our paper we have used the lifting-based CDF 5/3 DWT architecture5, which is

Figure 2. Wavelet-based Compression Algorithms2.

2 Vol 10 (2) | January 2017 | www.indjst.org Indian Journal of Science and Technology

Figure 3. a, b Block Diagram of Lifting based Schema4.

Figure 4. Three-levels Wavelet Decomposition Tree (DWT)5.


parameterized to handle diverse word lengths and picture sizes. The model has low complexity because it is built from unit blocks, offers a simple approach to higher-level DWT modeling of the 2D-DWT outputs, and can operate at a frequency of 198 MHz.

3. Principle Method
A DWT based on the lifting scheme is a better approach for operational speed. The principle is, first, to factorize the polyphase matrix of a wavelet filter into a sequence of alternating upper triangular matrices, lower triangular matrices and a diagonal matrix. Second, banded-matrix multiplications are applied to implement the wavelets.

A. Two Lifting schema
Figure 3 shows that the factorization is not unique; here si(z) (primary lifting steps) and ti(z) (dual lifting steps) are filters and K is a constant. The computation yields several si(z), ti(z) and K values.
The wavelet transform provides a better time-frequency representation of the data signal than other existing transforms such as the FFT6. The following equation provides the Continuous Wavelet Transform (CWT).

The DWT is calculated by successive low-pass and high-pass filtering of the discrete time-domain signal, as shown in the figure. The low-pass filter is denoted by G0 and the high-pass filter by H0. At each level, the high-pass filter produces the detail information, d[n], while the low-pass filter associated with the scaling function produces the coarse approximation, a[n].
In the design of the transceiver for wavelet modulation and its implementation in an AWGN channel, it was shown that the bit error rate as a function of SNR in the AWGN channel is accurate. In another channel medium, it was concluded that DWT-OFDM performs much better than DFT-OFDM6.

4. Implementation
The architecture is developed using the lifting scheme of the wavelet filter with N=2. Here N=2 denotes that the lifting scheme uses two stages: predict_1 and update_1 in the first stage, and predict_2 and update_2 in the second stage. This scheme reduces computational complexity. Figure 5 depicts the block diagram of the forward lifting scheme transform. There are three basic steps: split, lifting and scaling.
The input sequence is split in order to decompose the input signal into even and odd samples. The split even and odd signals are then predicted and updated. In the predict stage, the predict value is selected from the decomposed input signal; this operation is successively iterated four times, with the values updated at each iteration. In the DWT the main operation is addition, and in the inverse DWT subtraction is used.
The constants used, alpha, beta, gamma and delta, should be rounded off. The multiplication operation plays a major role in performing the wavelet operation using the lifting scheme. Multiplication inherently consumes more clock cycles, which introduces latency. The latency increases with the number of stages in the lifting scheme. It can be decreased by replacing multiplication with shift operations, which increases the speed of the system.
The prime objective of the lifting scheme is to split the original 1D signal into odd- and even-indexed subsequences and compute a trivial wavelet. These values are then updated by the subsequent prediction and update steps.
The steps of the lifting-based algorithm are as follows:
Splitting Stage: Split the input (main) signal X(n) into odd and even samples.
Lifting Stage: This is executed in N sub-stages (depending on the type of filter). Here the prediction and update filters Pn(n) and Un(n) are used to filter the odd and even samples.
Scaling Stage: On completion of the N lifting stages, the scaling parameters K and 1/K are applied to the odd and even samples respectively to obtain the low-pass band YL(i) and the high-pass band YH(i).

Figure 5. Block Diagram for Forward LDWT5.

Figure 6. Diagram of LDWT (9, 7) Filter5.

The lifting scheme algorithm is applied as:
I. Split step
II. Lifting Steps, N=2
III. Scaling step
Where a=-1.586134342, b=-0.0529801185, c=0.882911076, d=-0.443506852 and K=1.149604398.

Figure 7. Block Diagram of Lifting Based.

Figure 7 shows the block diagram of the DWT and IDWT implementation. The design has clk and rst as control inputs along with the input data sequence. The input data is given to the lifting-based algorithm, which calculates the DWT and IDWT. The lifting-based algorithm is designed in Verilog and consists of split, predict and update operations, after which the output data is scaled. The scaling operation is done to obtain the constant value. The constant value represents the exact input signal, thereby verifying the design functionality. The DWT is performed by adding the predict and update values to the input sequence in the first four clock cycles, whereas the IDWT is implemented using subtraction in the following four clock cycles.
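As an illustration of the split, lifting (N=2) and scaling steps described in this section, the following Python sketch runs a one-level 1D lifting transform and its inverse in floating point. It is not the authors' Verilog design. The function names, the periodic boundary handling and the even-length input are assumptions made for the example. The fourth constant is taken as +0.443506852, the sign used in the widely published 9/7 lifting factorization, whereas the text lists it with a negative sign; the scaling follows the text in applying K to the odd samples and 1/K to the even samples.

```python
# Sketch of a one-level 1D lifting DWT/IDWT with two predict/update stages.
# Assumptions: periodic extension at the borders, even-length input,
# DELTA taken as positive (commonly published 9/7 lifting value).
ALPHA = -1.586134342
BETA = -0.0529801185
GAMMA = 0.882911076
DELTA = 0.443506852
K = 1.149604398


def lift(target, source, coeff, offset):
    """Add coeff * (two neighbouring source samples) to each target sample."""
    n = len(target)
    return [
        target[i] + coeff * (source[(i + offset) % n] + source[(i + offset + 1) % n])
        for i in range(n)
    ]


def forward_97(x):
    """Split, lift twice (predict/update), then scale."""
    even, odd = list(x[0::2]), list(x[1::2])   # split step
    odd = lift(odd, even, ALPHA, 0)            # predict_1
    even = lift(even, odd, BETA, -1)           # update_1
    odd = lift(odd, even, GAMMA, 0)            # predict_2
    even = lift(even, odd, DELTA, -1)          # update_2
    low = [s / K for s in even]                # scaling: 1/K on even samples
    high = [d * K for d in odd]                # scaling: K on odd samples
    return low, high


def inverse_97(low, high):
    """Undo the forward steps in reverse order with opposite signs."""
    even = [s * K for s in low]
    odd = [d / K for d in high]
    even = lift(even, odd, -DELTA, -1)
    odd = lift(odd, even, -GAMMA, 0)
    even = lift(even, odd, -BETA, -1)
    odd = lift(odd, even, -ALPHA, 0)
    x = [0.0] * (2 * len(even))
    x[0::2] = even                             # merge even and odd samples
    x[1::2] = odd
    return x
```

Because every forward lifting step only adds a function of the other branch, the inverse can subtract exactly the same quantity, which is why eight 3-bit input samples such as `[3, 1, 4, 1, 5, 2, 6, 5]` are recovered up to floating-point rounding by `inverse_97(*forward_97(x))`.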

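The replacement of multiplication by shift operations mentioned in this section can be sketched in Python as follows. The lifting constant is first rounded to a fixed-point integer, and the product is then built from one shift and one add per set bit of that integer. The 8 fractional bits and the helper names `to_fixed` and `shift_add_multiply` are assumptions for illustration; the text does not state the word length used in the hardware.

```python
# Sketch: multiply a sample by a rounded constant using only shifts and adds.
# FRAC_BITS is an assumed precision, not a value taken from the paper.
FRAC_BITS = 8  # constants are rounded to multiples of 1/256


def to_fixed(value, frac_bits=FRAC_BITS):
    """Round a real constant to an integer with frac_bits fractional bits."""
    return int(round(value * (1 << frac_bits)))


def shift_add_multiply(x, coeff_fixed, frac_bits=FRAC_BITS):
    """Return (x * coeff_fixed) >> frac_bits without using a multiplier."""
    negative = coeff_fixed < 0
    magnitude = -coeff_fixed if negative else coeff_fixed
    acc = 0
    bit = 0
    while magnitude:
        if magnitude & 1:
            acc += x << bit        # one shifter and one adder per set bit
        magnitude >>= 1
        bit += 1
    if negative:
        acc = -acc
    return acc >> frac_bits        # arithmetic shift drops the fractional bits
```

With 8 fractional bits, the constant -1.586134342 rounds to -406/256, so the product needs only as many adders as there are set bits in 406. Fewer fractional bits mean fewer adders but a larger rounding error, which is the trade-off behind rounding the constants.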

The optimized 1D-DWT (O1D-DWT) architecture that is implemented is shown in Figure 8. The architecture consists of 6 shifters, 6 adders and 6 delay elements. Four adders and four shifters are used to design the low-pass filter; two shifters and two adders are used to design the high-pass filter. Buffers are used to replace negative filter coefficient values with zeros while leaving positive values unchanged. A D flip-flop acts as a down-sampler to generate the high-pass and low-pass filter coefficients.
The original image of size 256x256 is converted into low-pass and high-pass filter coefficients of size 32768x1.

Figure 8. O1D-DWT Architecture7.

Figure 9. O2D-DWT Architecture7.

The 2D-DWT is derived from the 1D-DWT architecture shown above. Memory modules and controller modules are added to the architecture, as shown in Figure 9. The two memory modules of the 2D-DWT operate in parallel. The memory unit block stores the low-pass and high-pass filter coefficients obtained from the 1D-DWT: memory unit-1 stores the L band coefficients and memory unit-2 stores the H band coefficients. Both memory units are accessed by 1D-DWT block 1 and 1D-DWT block 2 simultaneously to obtain the HH, HL, LL and LH sub-bands via MUX and DEMUX. The HL and HH sub-bands are obtained from 1D-DWT block 1, and the LL and LH sub-bands are obtained from 1D-DWT block 2. The controller module is optimized as follows: (i) the low-pass and high-pass filters are designed using shifters and adders; (ii) the architecture does not use any multiplier, which increases speed and decreases hardware complexity; (iii) the four sub-bands are implemented in parallel.

A. Parallel Processing Memory Unit
The low-pass and high-pass coefficients of the O1D-DWT are stored in two memory unit blocks, as shown in Figure 10, which are used to obtain the transposition of the input image. The LPF and HPF output signals of the O1D-DWT blocks are connected to the parallel processing memory unit through the signals Data in1 and Data in2 respectively. The clock signals clk and clk_div are used to read and write the coefficients from both memory units at the same time. The clk_out of the O1D-DWT block is used as the input to the clk_div signal of the memory unit. The control signals rd addr, wr addr, rd wr, clk and clk_div are selected by a MUX. The memory unit is used to convert the O1D-DWT into the O2D-DWT with the help of the control unit. The LPF and HPF coefficients are passed in parallel using clk and clk_div in memory-1 and memory-2, which increases the speed compared with the existing technique, in which coefficients are handled serially.

Figure 10. Memory Unit8.

Figure 11. Controller Architecture8.
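The way a 2D-DWT is built from a 1D-DWT plus a transposing memory can be sketched in software as below. The sketch assumes the integer 5/3 lifting filter used in JPEG 2000, which needs only adds and shifts; the text does not spell out the exact filter equations of the O1D-DWT, so `dwt53_1d`, the mirrored borders and the sub-band naming (first letter for the row pass, second for the column pass) are all assumptions.

```python
# Sketch: separable 2D DWT from a 1D transform and a transposition step.
# The 5/3 integer lifting filter and mirror borders are assumed, not taken
# from the paper's hardware description.

def dwt53_1d(x):
    """One level of integer 5/3 lifting on an even-length list (adds/shifts only)."""
    n = len(x)
    half = n // 2
    high = []
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]   # mirror at the edge
        high.append(x[2 * i + 1] - ((x[2 * i] + right) >> 1))
    low = []
    for i in range(half):
        left = high[i - 1] if i > 0 else high[0]              # mirror at the edge
        low.append(x[2 * i] + ((left + high[i] + 2) >> 2))
    return low, high


def transpose(matrix):
    """Swap rows and columns, the role played by the memory units in the text."""
    return [list(col) for col in zip(*matrix)]


def column_pass(band):
    """Apply the 1D transform along columns by working on the transposed band."""
    lows, highs = [], []
    for col in transpose(band):
        low, high = dwt53_1d(col)
        lows.append(low)
        highs.append(high)
    return transpose(lows), transpose(highs)


def dwt53_2d(image):
    """Row pass into L and H bands, then a column pass on each band."""
    l_rows, h_rows = [], []
    for row in image:
        low, high = dwt53_1d(row)
        l_rows.append(low)     # would be written to memory unit-1
        h_rows.append(high)    # would be written to memory unit-2
    ll, lh = column_pass(l_rows)
    hl, hh = column_pass(h_rows)
    return ll, lh, hl, hh
```

For a 256x256 image the row pass yields 256 x 128 = 32768 coefficients in each of the L and H bands, matching the 32768x1 size quoted above, and the column pass then gives four 128x128 sub-bands. The two `column_pass` calls are independent of each other, which is what allows the hardware to run two 1D-DWT blocks in parallel.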


B. Proposed Controller Unit
The controller unit is used to read the row and column coefficients of the matrix. It comprises three counters, shown in Figure 11, which generate the read and write addresses used to fetch the filter coefficients. Counter-1 performs the write operation by resetting the rd, wr control signal to zero. Counter-2 and counter-3 are used to read the filter coefficients from memory by setting the rd, wr control signal to one. Counter-2 is incremented for every 256 counts of counter-3 to read the coefficient matrix column by column. Counter-2 continues until it reaches a count of 128, at which point all the image coefficients have been read.

5. Results
Figure 12 shows the input signal given to the lifting DWT, with the input values taken in 3-bit binary form. X0, X1, X2, X3 ... X7 are the input signals.

Figure 12. Input Waveform Results for Lifting Based Algorithm DWT using Multiplication Operator.

Figure 13. Waveform Results for Lifting Based Algorithm DWT-IDWT using Multiplication.

Figure 13 illustrates the output of the lifting-based DWT and IDWT algorithm, including the output signal values X8, X9....X15 obtained from the IDWT.

Figure 14. Input Values to DWT Block in Waveform for Lifting Based Algorithm DWT-IDWT using Shift Operator.

Figure 15. Output Values of DWT Block and Input Values for IDWT Block in Waveform for Lifting Based Algorithm DWT-IDWT using Shift Operator.

Figure 16. Output Values of IDWT Block in Waveform for Lifting Based Algorithm DWT-IDWT using Shift Operator.


Figure 17. Input Values to IDWT Block in Waveform for Lifting Based Algorithm IDWT-DWT using Shift Operator.

Figure 18. Output Values of IDWT Block and Input Values of DWT Block in Waveform for Lifting Based Algorithm IDWT-DWT using Shift Operator.

Figure 20. PSNR of Different Video Format8.

Figure 21. MSE of Different Video Format8.

Figure 22. Percentage of Co-efficient Threshold between Haar and CDF8.
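The MSE and PSNR metrics reported for the video formats are conventionally computed as in the Python sketch below. It uses the standard definitions, MSE as the mean squared pixel difference and PSNR = 10 log10(peak^2 / MSE). The peak value of 255 assumes 8-bit samples, and the function names are illustrative; the paper does not state how its figures were computed.

```python
# Sketch of the usual MSE and PSNR definitions for two equally sized frames.
# peak=255 assumes 8-bit samples; this is an assumption, not from the paper.
import math


def mse(original, reconstructed):
    """Mean squared error between two frames given as lists of rows."""
    diffs = [
        (a - b) ** 2
        for row_a, row_b in zip(original, reconstructed)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the frames are identical."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / error)
```

Since PSNR falls as MSE rises, a format with the highest PSNR necessarily has the lowest MSE for the same peak value, which is the inverse relationship noted between Figure 20 and Figure 21.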

Figure 19. Output Values DWT Block in Waveform for Lifting Based Algorithm IDWT-DWT using Shift Operator.

The PSNR and MSE of different video formats are compared in Figure 20 and Figure 21. HD video has a higher PSNR than all the other video formats and 3GP video has the lowest; the reverse holds for MSE. The percentage of thresholded coefficients for the Haar and CDF techniques, at the different thresholds that were set, is shown in Figure 22.

6. Conclusion
The optimized 2D-DWT algorithm has been used for image compression and provides a good compression ratio; the proposed architecture achieves a high compression ratio for high-bit-rate data, and the quality of the data matrix is not lost after compression. Only adders and shifters were used to develop the FIR filters in the O2D-DWT architecture, which reduces the memory and area required in the hardware implementation, and speed is increased by the parallel architecture and the use of shifters. The concept can be extended further in the image processing domain, where higher compression ratios are required, by using a highly parallel DWT architecture.

7. Acknowledgement
This research was supported by C M R Institute of Technology. We thank Naveen H for his comments and for his assistance in building the paper and improving the final manuscript. We would also like to express our gratitude to Dr. B. Narasimha Murthy, Vice Principal, CMRIT, for guiding us and giving valuable insights during this work.

8. References
1. Fei Wu B, Fu Lin C. A high-performance and memory-efficient pipeline architecture for the 5/3 and 9/7 discrete wavelet transform of JPEG2000 codec. Circuits and Systems for Video Technology, IEEE Transactions. 2005 Dec; 15(12):1615-28.
2. Cao P, Guo X, Wang C, Li J. Efficient architecture for two-dimensional discrete wavelet transform based on lifting scheme. 7th International Conference-ASICON -07. 2007 October.
3. Manzoor RS, Gani R, Jeoti V, Kamel N, Asif M. Implementation of FFT using discrete wavelet packet transform (DWPT) and its application to SNR estimation in OFDM systems. Kuala Lumpur, Malaysia: IEEE International Symposium on Information Technology. 2008.
4. Bouwel CV. Wavelet Packet Based Multicarrier Modulation. IEEE Communications and Vehicular Technology. 2000; p. 131-38.
5. Al-Azawi S, Abbas Y, Jidin R. Low complexity multidimensional CDF 5/3 DWT architecture. 2014 9th International Symposium Communication Systems, Networks & Digital Signal Processing (CSNDSP). 2014 July.
6. Yifan S, Xiao S, Xiong Y, Hao S, Zhao X. Time-frequency analysis system based on temporal Fourier transform. 2015 IEEE International Conference on Communication Problem-Solving (ICCP), Guilin. 2015.
7. Bhairannawar SS, Sarkar S, Raja KB, Venugopal KR. An efficient VLSI architecture for fingerprint recognition using O2D-DWT architecture and modified CORDIC FFT. Signal Processing, Informatics, Communication and Energy Systems (SPICES). 2015 Feb 19-21.
8. Chetan H, Indumathi G. Low power VLSI implementation of data compression for multimedia devices using CDF m/n DWT on to resource constrained dynamically reconfigurable memories. New Delhi: 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom). 2016.
9. Baig S, Rehman FU, Mughal MJ. Performance comparison of DFT, discrete wavelet packet and wavelet transforms. OFDM transceiver for multipath fading channel, Multitopic Conference. 2005.
10. Lakshmanan MK, Nikookar H. A review of wavelets for digital wireless communication. Springer: Journal on Wireless Personal Communication. 2006; 37(3-4):387-420.
11. Kang Lai Y, Fei Chen L, Chih Shih Y. A high-performance and memory efficient VLSI architecture with parallel scanning method for 2-D lifting-based discrete wavelet transform. Consumer Electronics, IEEE Transactions. 2009 May; 55(2):400-07.

