
Abstract:

This paper presents an adaptive encoding framework that reduces transition activity on high-capacitance off-chip data buses, where the associated power dissipation can be significant for high-speed communication. The technique observes data characteristics over fixed window sizes and forms clusters of bit lines with highly correlated switching patterns. The proposed method uses redundancy in space and time so that no information is lost when the data are recovered. We present analytical and experimental analyses that demonstrate the activity reduction achieved by our encoding scheme for various data. The extra power cost of the encoder and decoder circuitry, together with the redundancy, is offset by the reduced number of off-chip transitions.
Introduction:

As CMOS technology progresses into the nanometer regime, it poses many challenges to design and test engineers. The scaling of VLSI integrated circuits has made CMOS technology more susceptible to large power dissipation, propagation delays, and various noise mechanisms such as power supply noise, crosstalk noise, and leakage noise. Power consumption and crosstalk have become major concerns because of the continuing decrease in minimum feature size and the corresponding increase in chip density and operating frequency. Most of the power is wasted on data buses and long interconnects as dynamic power dissipation from charging and discharging internal node capacitances and inter-wire capacitances.

In non-deep-submicron technologies, the load (substrate) capacitance (CL) between a wire and the substrate is the dominant factor; the coupling capacitance (CC) between parallel wires is negligible compared with it. In nanometer technologies, unfortunately, the coupling capacitance dominates, and its magnitude is several times larger than the load capacitance. Characteristics of data buses and long interconnects such as wire spacing [9], wire length, wire material, wire width, driver strength, coupling length, and signal transition time influence the coupling effect. This increased coupling on on-chip buses and long interconnects not only increases power dissipation but also degrades signal integrity. As a result, these buses and interconnects become more sensitive and prone to errors caused by crosstalk and delay faults [17], [18], [19]. Reducing power-dissipating transitions can also reduce crosstalk and delay faults [12], [13]. The coupling effect depends on data-dependent transitions and increases or decreases with the relative switching activity between adjacent bus wires [14].
Off-chip data buses play an important role in reliable communication and high-performance chips. Power is consumed by charging and discharging the coupling and load capacitances whenever a signal on the data bus makes a transition. Reducing the transition (switching) activity on the data buses is therefore one of the most attractive ways of reducing power dissipation, and it can be achieved by bus encoding. Several bus encoding techniques have been proposed in the literature to reduce power consumption during bus transmission. These techniques mainly rely on reducing either the self transitions or the coupling transitions; encoding the data to eliminate power-dissipating transitions lowers the bus activity and hence the overall power consumption.

Over the past few years, a number of coding techniques have been proposed for reducing the
transitions on a data bus. For data buses, one popular coding scheme is the bus invert coding technique
proposed by Stan and Burleson [1]. In this method, successive data bus values are compared to determine whether inverting the data word would result in fewer bit transitions than transmitting it unchanged. The technique is suitable for uncorrelated data patterns and is based on the Hamming distance, i.e., the number of bits that change state from one data word to the next. If the number of bit transitions is greater than half the number of bus lines, the inverted data is transmitted over the bus; otherwise the original data is sent. This method limits the maximum number of transitions to one half of the number of data bits. Other variants of the bus invert coding
schemes include a decomposition approach [5] and a partial bus coding technique [6]. Both techniques incur area overhead to determine a suitable partition of the data bus, and the decomposition approach [5] can require up to p-1 extra bus lines, where p is the number of partitions of the original data bus. The energy dissipated due to coupling capacitance is analyzed in [7], [8]. For instruction buses, the Gray code [2], T0 code [3], and Beach code [4] have been proposed, which reduce transitions and thereby power dissipation. The dynamic coding method takes the even and odd lines as bus sub-groups, finds the coupling transitions, and then inverts the sub-group that decreases the coupling transitions [11]. The bus regrouping method divides the bus into small sub-groups and then regroups them by taking bits from different sub-groups [15]. In the novel coding technique of [16], the data bus is subdivided into even and odd bit groups; the Hamming distances of the even sub-group, odd sub-group, and inverted data with respect to the data currently on the bus are compared, and the sub-group with the smaller Hamming distance is inverted and transmitted with two redundant control bits for decoding. One disadvantage of this technique is that coupling transitions also occur on the redundant bits. In almost all the methods mentioned above, only coupling transitions are considered, and self transitions are either neglected or cannot be reduced. The proposed method, bus regrouping with Hamming distance, reduces both coupling and self transitions, which results in greater power savings.
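As an illustration of the bus-invert principle described above, the following Python sketch (function and variable names are ours, not from [1]) compares successive words by Hamming distance and inverts when more than half the lines would toggle:

```python
def bus_invert(prev_word, word, width=8):
    """Bus-invert coding (after Stan and Burleson): if sending `word`
    after `prev_word` would flip more than half the bus lines, send
    the bitwise inverse instead and assert an extra invert line."""
    mask = (1 << width) - 1
    hamming = bin((prev_word ^ word) & mask).count("1")
    if hamming > width // 2:
        return (~word) & mask, 1   # inverted data, invert line high
    return word, 0                 # original data, invert line low
```

For an 8-bit bus, sending 0xFE after 0x00 would toggle seven lines; the encoder instead sends 0x01 with the invert line asserted, so at most half the data lines ever switch.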
Switching activity:

Consider a typical pipelined circuit in which each stage consists of a combinational circuit between two latches.
At the beginning of a clock cycle, the processor first latches the input signals of the combinational
circuit in latch A. It then evaluates those signals in the combinational circuit and propagates them to
output. Output is latched in latch B during the next cycle and becomes the input signals for the next
pipeline stage. The switching activity of combinational circuits depends on the logic and structure of the
circuit and the switching at the output of the input latch. There is no general theory about the
relationship between the switching activity of the inputs and that of the internal nodes of a
combinational circuit. We believe that if switching is high at the inputs, the internal switching at the
combinational circuit also tends to be high and vice versa. Different instruction sequences can have
significantly different effects on the switching activity; the impact depends on the architecture. In a
CISC processor, the impact is not obvious since one instruction may need several cycles to execute. In
contrast, for a pipelined RISC-like embedded processor, most instructions execute in one cycle, and the
impact can be significant since the instruction scheduler schedules more instructions. To better
understand the impact of instruction sequence on switching activity, we selected RISC-like pipelined
processor, the VLSI-BAM (VLSI Berkeley Abstract Machine), as an experimental architecture. This processor is pipelined with data-stationary control, that is, each pipeline stage has separate controls: instruction fetch, instruction decode, instruction execution, memory access, and write back. The instruction set of the VLSI-BAM is similar to the MIPS 2000 with extensions for symbolic computation. Figure 2 shows the pipeline stages and the control path of the VLSI-BAM. Each pipeline stage has an instruction register, a programmable logic array (PLA), and a latch for control signals. The processor passes instructions through the instruction registers, and the PLA decodes them in each stage. The processor then generates control signals from the PLAs and usually latches them before sending them down the data path. (This may not always be true in an actual VLSI-BAM implementation.) We built a cycle-by-cycle instruction-level simulator for collecting the switching at the latches in the control path during execution of benchmark programs. The benchmarks, shown in Table 1, all come from the Aquarius suite, and we first put them through the Aquarius Prolog compiler. The compiler produces an intermediate BAM code that is target-machine independent. We then further compile the BAM code into code for the target machine, the VLSI-BAM.
Gray code addressing
Pipelined embedded processors produce an instruction address during each cycle by selecting the
address that the counter or address adder generates. Due to instruction locality during program
execution, the processor accesses instructions sequentially most of the time. Gray code (see What is
Gray code box, next page) characteristically changes by only one bit as it sequences from one number to
the next. Thus Gray code has an advantage over straight binary code since each memory access changes
the address by only one bit. Therefore we can eliminate a significant number of bit switches using Gray
code addressing.
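The single-bit-change property is easy to verify; this short Python sketch (helper names are ours) converts a binary address to Gray code and counts bus-line switches between consecutive addresses:

```python
def binary_to_gray(n):
    """Convert a binary number to its Gray-code equivalent."""
    return n ^ (n >> 1)

def transitions(a, b):
    """Number of bus lines that toggle between two address words."""
    return bin(a ^ b).count("1")

# Crossing from address 7 to 8 flips four bits in straight binary
# (0111 -> 1000) but only one bit in Gray code (0100 -> 1100).
```

Over a run of sequential addresses, every step costs exactly one bit switch in Gray code, which is the source of the savings claimed above.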

Miller Encoding Technique:


Miller encoding is also known as delay encoding. It can be used at higher operating frequencies and is similar to Manchester encoding, except that a transition occurs in the middle of a bit interval only when the bit is 1. Using Miller (delay) encoding, noise interference can be reduced.
The block diagram consists of a D flip-flop, a T flip-flop, a NOT gate, and an XOR gate; the inputs are A_in and CLK, and the output is the Miller-encoded signal. For example, when A_in is 0, the XOR of A_in and CLK simply follows the clock; this signal is clocked into the D flip-flop on the inverted clock and then drives the T flip-flop, whose toggling produces the Miller output waveform.
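A behavioral sketch of the Miller rule (ours, modeling the waveform rather than the flip-flop circuit): each bit occupies two half-intervals; a '1' toggles the level mid-interval, and a '0' toggles at the interval boundary only when the previous bit was also '0':

```python
def miller_encode(bits, level=0):
    """Behavioral Miller (delay) encoding: returns one
    (first_half, second_half) level pair per input bit."""
    out = []
    prev = None
    for b in bits:
        if b == 0 and prev == 0:
            level ^= 1          # transition at the interval boundary
        first = level
        if b == 1:
            level ^= 1          # mid-interval transition for a '1'
        out.append((first, level))
        prev = b
    return out
```

Every '1' bit yields a half-interval pair with differing levels (the mid-bit transition), which is exactly the property that distinguishes Miller from Manchester coding.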
Fig: Block Diagram Of Miller Encoding

Design of FMO, Manchester, and Miller in One Module:

Fig: RTL design flow of Proposed Encoder

The architecture of the previous FMO/Manchester-with-SLOS logic is retained; we add Miller encoding to it by adding another MUX, so the user can select the desired encoding type with the two selection lines. The selection truth table is shown below.

Mode1  Mode2  Clear  Output
0      0      1      FMO
1      0      0      Manchester
1      x      x      Miller

Chapter-2
Literature survey:

1. Dedicated Short-Range Communications (DSRC) Standards in the United States:


Wireless vehicular communication has the potential to enable a host of new
applications, the most important of which are a class of safety applications that can
prevent collisions and save thousands of lives. The automotive industry is working to
develop the dedicated short-range communication (DSRC) technology, for use in vehicle-
to-vehicle and vehicle-to-roadside communication. The effectiveness of this technology
is highly dependent on cooperative standards for interoperability. This paper explains the
content and status of the DSRC standards being developed for deployment in the United
States. Included in the discussion are the IEEE 802.11p amendment for wireless access in
vehicular environments (WAVE), the IEEE 1609.2, 1609.3, and 1609.4 standards for
Security, Network Services and Multi-Channel Operation, the SAE J2735 Message Set
Dictionary, and the emerging SAE J2945.1 Communication Minimum Performance
Requirements standard. The paper shows how these standards fit together to provide a
comprehensive solution for DSRC. Most of the key standards are either recently
published or expected to be completed in the coming year. A reader will gain a thorough
understanding of DSRC technology for vehicular communication, including insights into
why specific technical solutions are being adopted, and key challenges remaining for
successful DSRC deployment. The U.S. Department of Transportation is planning to
decide in 2013 whether to require DSRC equipment in new vehicles.

2. Design of 5.9 GHz DSRC-Based Vehicular Safety Communication:

The automotive industry is moving aggressively in the direction of advanced active


safety. Dedicated short-range communication (DSRC) is a key enabling technology for
the next generation of communication-based safety applications. One aspect of vehicular
safety communication is the routine broadcast of messages among all equipped vehicles.
Therefore, channel congestion control and broadcast performance improvement are of
particular concern and need to be addressed in the overall protocol design. Furthermore,
the explicit multichannel nature of DSRC necessitates a concurrent multichannel
operational scheme for safety and non-safety applications. This article provides an
overview of DSRC based vehicular safety communications and proposes a coherent set of
protocols to address these requirements.

3. A Manchester code generator running at 1 GHz:

A new Manchester code generator designed at transistor level is presented in this


paper. This generator uses 32 transistors and has the same complexity as a standard D
flip-flop. It is intended to be used in a complex optical communication system. The main
benefit of this design is to use a clock signal running at the same frequency as the data.
Output changes on the rising edge and falling edge of the clock. Simulations results show
a correct behavior up to 1 Gbit/s data rate with a 0.35 µm CMOS technology within a
commercial temperature range.

4. A 90nm Manchester Code Generator with CMOS Switches Running at 2.4GHz


and 5GHz:

A Manchester code generator designed at transistor level with NMOS switches is


presented. This generator uses 26 transistors and has the same complexity as a standard D
flip-flop. It is intended to be used in a complex optical communication system. The main
benefit of this design is the use of a clock signal running at the same frequency as the
data. Output changes on the rising edge and falling edge of the clock. The circuit has
been designed in a 90 nm UMC CMOS technology to evaluate the efficiency of the
proposed approach and experimental results show a correct behavior up to 5 Gbit/s data
rate.

5. High-Speed CMOS Chip Design for Manchester and Miller Encoder:


In this paper, we propose a modified Manchester and Miller encoder that can
operate in high frequency without a sophisticated circuit structure. Based on the previous
proposed architecture, the study has adopted the concept of parallel operation to improve
data throughput. In addition, the technique of hardware sharing is adopted in
this design to reduce the number of transistors. The study uses TSMC CMOS 0.35-µm 2P4M technology. The simulation result of HSPICE indicates that it functions successfully and works at 200-MHz speed. The average power consumption of the circuit at room temperature is 549 µW. The total core area is 70.7 µm × 72.2 µm. As
expected, the circuit can be easily integrated into radio frequency identification (RFID)
application.

6. FSM based Manchester encoder for UHF RFID tag emulator:

Radio frequency identification (RFID) is becoming one of the most popular wireless technologies. The UHF RFID tag emulator is part of the RFID testing toolset: it imitates the behavior of an RFID tag. The UHF RFID tag emulator (860 MHz to 960 MHz) is aimed at testing RFID systems and also acts as a general-purpose data transport device for other RFID systems. The tag emulator belongs to the EPC Class-III (semi-passive) tags, but it implements the Class-1 Generation-2 (C1G2) air interface protocol for communicating with the reader. In this work, we present the RTL design of a Manchester encoder. Finite state machine (FSM) and RTL implementations of the encoder are discussed, with particular focus on using the RFID emulator as a data transport device and debugging tool. The synthesis result shows that the FSM design is efficient (less area and higher speed) and operates at a maximum frequency of 256.54 MHz.

7. Top down design of joint MODEM and CODEC detection schemes for DSRC
coded-FSK systems over high mobility fading channels:
The joint detection and verification of frequency shift keying (FSK) modulation and demodulation (MODEM) and Manchester coding and decoding (CODEC) schemes are proposed for dedicated short range communication (DSRC) systems over high-mobility fading channels. The proposed joint coded-FSK detection scheme, with its low-complexity benefit, can outperform the conventional separate coded-FSK detection scheme, owing to the time diversity gain of the joint scheme, which enhances detection performance. Moreover, the proposed joint algorithms with floating-point and fixed-point designs are verified on a software-defined-radio (SDR) platform. Based on the measurement results via SDR equipment, it is confirmed that the VHDL hardware implementation of the proposed joint detection scheme provides robust performance over high-mobility Rician multipath fading channel environments.

8. Simultaneous Routing and Buffer Insertion algorithm for interconnect delay


optimization in VLSI layout design:

The design of VLSI circuits today has become very challenging indeed. The main
factor affecting system performance is the interconnect delay. Many algorithms have
been proposed to solve the interconnect timing optimization problem. Research has
shown that techniques like buffer insertion and wire-sizing have been proven to be very
effective in reducing interconnect delay. This paper describes a graph-based routing
algorithm to solve the interconnect delay optimization problem in a deep submicron
VLSI layout routing. The algorithm finds the optimal delay routing paths
with simultaneous consideration of buffer insertions and wire-sizing, while taking into
account wire or buffer obstacles. The proposed algorithm, called S-RABILA
(Simultaneous Routing and Buffer Insertion with Look-Ahead), utilizes a novel look-
ahead technique that significantly contributes to the computational efficiency of the
proposed algorithm. In this paper, the performance of S-RABILA is presented, which
shows the effectiveness of the look-ahead scheme. Experimental results also indicate that
the proposed algorithm provides significant improvements over similar existing VLSI
routing algorithms.

Chapter 3
VLSI DESIGN:

The company was founded in 1979, by a trio from Fairchild Semiconductor by


way of Synertek – Jack Balletto, Dan Floyd, and Gunnar Wetlesen – and by Doug
Fairbairn of Xerox PARC and Lambda (later VLSI Design) magazine. Alfred J. Stein
became the CEO of the company in 1982. Subsequently VLSI built its first fab in San
Jose; eventually a second fab was built in San Antonio, Texas. VLSI had its initial public
offering in 1983, and was listed on the stock market as (NASDAQ: VLSI). The company
was later acquired by Philips and survives to this day as part of NXP Semiconductors.

A VLSI VL82C106 Super I/O chip

The original business plan was to be a contract wafer fabrication company, but the
venture investors wanted the company to develop IC (Integrated Circuit) design tools to
help fill the foundry. Thanks to its Caltech and UC Berkeley students, VLSI was an
important pioneer in the electronic design automation (EDA) industry. It offered a
sophisticated package of tools, originally based on the 'lambda-based' design style
advocated by Carver Mead and Lynn Conway. VLSI became an early vendor of standard
cell (cell-based technology) to the merchant market in the early 1980s where the other
ASIC-focused company, LSI Logic, was a leader in gate arrays. Prior to VLSI's cell-
based offering, the technology had been primarily available only within large vertically
integrated companies with semiconductor units such as AT&T and IBM. VLSI's design
tools included not only design entry and simulation but eventually also cell-based routing
(chip compiler), a datapath compiler, SRAM and ROM compilers, and a state machine
compiler. The tools were an integrated design solution for IC design and not just point
tools, or more general purpose system tools. A designer could edit transistor-level
polygons and/or logic schematics, then run DRC and LVS, extract parasitics from the
layout and run Spice simulation, then back-annotate the timing or gate size changes into
the logic schematic database. Characterization tools were integrated to generate
FrameMaker Data Sheets for Libraries. VLSI eventually spun off the CAD and Library
operation into Compass Design Automation but it never reached IPO before it was
purchased by Avanti Corp. VLSI's physical design tools were critical not only to its ASIC
business, but also in setting the bar for the commercial electronic design
automation (EDA) industry. When VLSI and its main ASIC competitor, LSI Logic, were
establishing the ASIC industry, commercially-available tools could not deliver the
productivity necessary to support the physical design of hundreds of ASIC designs each
year without the deployment of a substantial number of layout engineers. The companies'
development of automated layout tools was a rational "make because there's nothing to
buy" decision. The EDA industry finally caught up in the late 1980s when Tangent
Systems released its TanCell and TanGate products. In 1989, Tangent was acquired by
Cadence Design Systems (founded in 1988).

Unfortunately, for all VLSI's initial competence in design tools, they were not
leaders in semiconductor manufacturing technology. VLSI had not been timely in
developing a 1.0 µm manufacturing process as the rest of the industry moved to that
geometry in the late 1980s. VLSI entered a long-term technology partnership
with Hitachi and finally released a 1.0 µm process and cell library (actually more of a
1.2 µm library with a 1.0 µm gate). As VLSI struggled to gain parity with the rest of the
industry in semiconductor technology, the design flow was moving rapidly to a Verilog
HDL and synthesis flow. Cadence acquired Gateway, the leader in Verilog hardware
design language (HDL) and Synopsys was dominating the exploding field of design
synthesis. As VLSI's tools were being eclipsed, VLSI waited too long to open the tools
up to other fabs and Compass Design Automation was never a viable competitor to
industry leaders. Meanwhile, VLSI entered the merchant high speed static RAM (SRAM)
market as they needed a product to drive the semiconductor process technology
development. All the large semiconductor companies built high speed SRAMs with cost
structures VLSI could never match. VLSI withdrew once it was clear that the Hitachi
process technology partnership was working. ARM Ltd was formed in 1990 as a
semiconductor intellectual property licensor, backed by Acorn, Apple and VLSI. VLSI
became a licensee of the powerful ARM processor and ARM finally funded processor
tools. Initial adoption of the ARM processor was slow. Few applications could justify the
overhead of an embedded 32-bit processor. In fact, despite the addition of further
licensees, the ARM processor enjoyed little market success until they developed the
novel 'thumb' extensions. Ericsson adopted the ARM processor in a VLSI chipset for its
GSM handset designs in the early 1990s. It was this GSM boost that laid the foundation for the ARM company and technology of today. Only in PC chipsets did VLSI dominate
in the early 1990s. This product was developed by five engineers using the 'Megacells' in
the VLSI library that led to a business unit at VLSI that almost equaled its ASIC business
in revenue. VLSI eventually ceded the market to Intel because Intel was able to package-
sell its processors, chipsets, and even board level products together. VLSI also had an
early partnership with PMC, a design group that had been nurtured out of British Columbia Bell. When PMC wanted to divest its semiconductor intellectual property venture, VLSI's
bid was beaten by a creative deal by Sierra Semiconductor. The telecom business unit
management at VLSI opted to go it alone. PMC Sierra became one of the most important
telecom ASSP vendors. Scientists and innovations from the 'design technology' part of
VLSI found their way to Cadence Design Systems (by way of Redwood Design
Automation). Compass Design Automation (VLSI's CAD and Library spin-off) was sold
to Avant! Corporation, which itself was acquired by Synopsys.

ENCODER AND DECODER:

DECODER:

In digital electronics, a decoder is a multiple-input, multiple-output logic circuit that converts coded inputs into coded outputs, where the input and output codes are different, e.g., n-to-2^n and binary-coded-decimal decoders. Decoding is necessary in applications such as data multiplexing, 7-segment displays, and memory address decoding. A simple example of a decoder circuit is an AND gate, because the output of an AND gate is high (1) only when all its inputs are high; such an output is called an active-high output. If a NAND gate is used instead, the output is low (0) only when all its inputs are high; such an output is called an active-low output. Slightly more complex are the n-to-2^n binary decoders. These are combinational circuits that convert binary information from n coded inputs to a maximum of 2^n unique outputs. If the n-bit coded information has unused bit combinations, the decoder may have fewer than 2^n outputs; 2-to-4, 3-to-8, and 4-to-16 decoders are examples.
The input to a decoder is a parallel binary number, and the decoder is used to detect the presence of a particular binary number at its input. The output indicates the presence or absence of a specific number at the decoder input.
Combining decoders:

Two or more small decoders with enable inputs can be combined to form a larger decoder; e.g., a 3-to-8-line decoder can be constructed from two 2-to-4-line decoders. A decoder with an enable input can also function as a demultiplexer.

3:8 decoder

It uses all AND gates, and therefore the outputs are active-high. For active-low outputs, NAND gates are used. It has 3 input lines and 8 output lines. It is also called a binary-to-octal decoder: it takes a 3-bit binary input code and activates the one of the 8 (octal) outputs corresponding to that code. The truth table is as follows:
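Behaviorally, the 3-to-8 decoder simply one-hot-selects an output. A minimal Python sketch (signal names are ours):

```python
def decoder_3to8(a, b, c):
    """3-to-8 line decoder: drives exactly one of eight active-high
    outputs, selected by the 3-bit input code (a is the MSB)."""
    index = (a << 2) | (b << 1) | c
    return [1 if i == index else 0 for i in range(8)]
```

For any input code, exactly one output is 1 and the remaining seven are 0, which is the active-high behavior described above.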
Octal to binary encoder

An octal-to-binary encoder takes 8 inputs and provides 3 outputs, thus doing the opposite of what the 3-to-8 decoder does. At any one time, only one input line has a value of 1. The figure below shows the truth table of an octal-to-binary encoder.

Table 3: Truth Table of octal to binary encoder

For an 8-to-3 binary encoder with inputs I0-I7 the logic expressions of the outputs Y0-Y2
are:
Y0 = I1 + I3 + I5 + I7
Y1 = I2 + I3 + I6 + I7
Y2 = I4 + I5 + I6 + I7

Fig 4: Logic Diagram of octal to binary encoder
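The three OR expressions above translate directly into a behavioral sketch (function name is ours; `I` is the one-hot input list):

```python
def octal_to_binary_encoder(I):
    """8-to-3 encoder implementing Y0 = I1+I3+I5+I7,
    Y1 = I2+I3+I6+I7, Y2 = I4+I5+I6+I7 (logical OR),
    where I is the one-hot input list [I0..I7]."""
    y0 = I[1] | I[3] | I[5] | I[7]
    y1 = I[2] | I[3] | I[6] | I[7]
    y2 = I[4] | I[5] | I[6] | I[7]
    return (y2, y1, y0)   # MSB first
```

Asserting only I5, for example, yields the binary code 101, matching the truth table; note that I0 produces 000, which is why the one-input-at-a-time assumption matters.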

Priority encoder

A priority encoder is a circuit or algorithm that compresses multiple binary inputs into a
smaller number of outputs. The output of a priority encoder is the binary representation
of the ordinal number starting from zero of the most significant input bit. They are often
used to control interrupt requests by acting on the highest-priority request. A priority function is included: if two or more inputs are equal to 1 at the same time, the input with the highest priority takes precedence; internal hardware checks this condition and sets the priority accordingly.
Table 4: Truth Table of 4-bit priority encoder

Fig 5: Logic Diagram of 4-bit priority encoder

The IC 74148 is an 8-input priority encoder; the 74147 is a 10-to-4-line priority encoder.
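The priority rule, scanning from the highest-numbered input down, can be sketched as follows (a behavioral model only, not the 74148's exact pinout or active-low conventions):

```python
def priority_encoder_8to3(inputs):
    """8-input priority encoder: returns (valid, code), where code is
    the index of the highest-priority asserted input in [I0..I7]."""
    for i in range(7, -1, -1):   # I7 has the highest priority
        if inputs[i]:
            return 1, i
    return 0, 0                  # no input asserted
```

With I3 and I6 asserted simultaneously, the encoder reports 6, since I6 outranks I3; the valid flag distinguishes "I0 asserted" from "nothing asserted".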

Multiplexer

In electronics, a multiplexer or mux is a device that selects one of several analog or


digital input signals and forwards the selected input into a single line. A multiplexer with 2^n inputs has n select lines, which are used to select which input line to send to the output. An electronic multiplexer can be considered a multiple-input, single-output switch, i.e., a digitally controlled multi-position switch. The digital code applied at the select inputs determines which data input is switched to the output.

A common example of multiplexing or sharing occurs when several peripheral devices


share a single transmission line or bus to communicate with a computer. Each device in
succession is allocated a brief time to send and receive data. At any given time, one and
only one device is using the line. This is an example of time multiplexing since each
device is given a specific time interval to use the line.

In frequency multiplexing, several devices share a common line by transmitting at


different frequencies.

Table 5: Truth Table of 8:1 MUX


Fig 6: Logic Diagram of 8:1 MUX
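Behaviorally, the 8:1 MUX reduces to an indexed selection (a minimal sketch; names are ours):

```python
def mux8(data, s2, s1, s0):
    """8-to-1 multiplexer: the three select lines form a 3-bit index
    that picks one of the eight data inputs."""
    return data[(s2 << 2) | (s1 << 1) | s0]
```

With select code 110 (decimal 6), for instance, the sixth data input appears at the output, exactly as in the truth table above.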

Demultiplexer

A demultiplexer (or demux) is a device that takes a single input signal and selects one of many data output lines to connect to that input. A multiplexer is often used with a complementary demultiplexer on the receiving end. A demultiplexer is a single-input, multiple-output switch: it takes one data input and a number of selection inputs and has several outputs, forwarding the data input to one of the outputs depending on the values of the selection inputs.

Demultiplexers are sometimes convenient for designing general-purpose logic, because if the demultiplexer's input is held true, it acts as a decoder. This means that any function of the selection bits can be constructed by logically OR-ing the correct set of outputs. A demultiplexer is also called a 'distributor', since it transmits the same data to different destinations.

Table 6: Truth Table of 1:8 DEMUX
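A behavioral sketch of the 1:8 DEMUX (names are ours); holding the data input at 1 reproduces the decoder behavior noted above:

```python
def demux_1to8(d, s2, s1, s0):
    """1-to-8 demultiplexer: routes the single data input d to the
    output chosen by the three selection lines; all others stay 0."""
    sel = (s2 << 2) | (s1 << 1) | s0
    return [d if i == sel else 0 for i in range(8)]
```

With d fixed at 1, demux_1to8 behaves exactly like a 3-to-8 decoder, which is why a demux plus OR gates can realize any function of the selection bits.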


Forward Error Correction (FEC) schemes are an essential component of wireless communication systems. Present wireless standards such as third-generation (3G) systems, GSM, 802.11a, and 802.16 utilize some configuration of convolutional coding. Convolutional encoding with Viterbi decoding is a powerful method for forward error correction. The Viterbi algorithm is the most extensively employed decoding algorithm for convolutional codes; it comprises minimum path and value calculation and retracing of the path. The efficiency of error detection and correction increases with constraint length. In this paper the convolutional encoder and Viterbi decoder are implemented on an FPGA for a constraint length of 9 and code rate 1/2. Forward error correction
(FEC) codes have long been a powerful tool in the advancement of information storage
and transmission. By introducing meaningful redundancy into a stream of information,
systems gain the ability not only to detect data errors, but also to correct them.
Convolutional coding is a popular error-correcting coding method used in digital communications, for example in satellite and space communication, to improve communication efficiency. To detect and correct errors that occur while transmitting digital data through a noisy channel, the original data is convolutionally encoded by a convolutional encoder. The encoder adds some redundancy to the
information and then transmits through a noisy channel. The transmitted data is received
at the receiver and is given to the Viterbi decoder. The Viterbi decoder evaluates the corrupted data and corrects the errors that occurred in the bit streams during transmission. Forward
Error Correction is a process of error control for data transmission by adding some
redundant symbols to the transmitted information to facilitate error detection and error
correction at receiver end. Forward Error Correction (FEC) in digital communication
system improves the error detection as well as error correction capability of the system at
the cost of increased system complexity. Using FEC the need for retransmission of data
can be avoided. Hence, it is applied in situations where applied in situations where
retransmissions are relatively costly or impossible. FEC codes can be classified into two
categories namely block codes and convolution codes. Block codes work on fixed size
blocks of bits where as convolution codes work on sequential and as well as blocks of
data. In this, the encoding operation may be viewed as discrete time convolution of input
sequence with the impulse response of the encoder. Error detection and correction or
error control is a technique that enables reliable delivery of digital data over unreliable
communication channels. Many communication channels are subject to channel noise,
and thus errors may be introduced during transmission from the source to the receiver.
Error detection techniques allow detecting such errors, while error correction enables
reconstruction of the original data. The Viterbi decoder consists of four blocks: the branch metric unit
(BMU), which computes the branch metrics; the path metric unit (PMU), which stores the path
metrics; the add–compare–select unit (ACSU), which selects the survivor path for each
trellis state and finds the minimum path metric among the survivor paths; and the survivor
management unit (SMU), which is responsible for selecting the output based on the
minimum path metric. The received data bits are given to the branch metric unit, which
calculates the possible branch metrics at each state. Any state from stage three
onward in the trellis diagram can be reached from two possible previous states, so two error
metrics are obtained. The add–compare–select unit computes both candidate path metrics and
compares them; whichever is smaller is chosen as the new path metric. The
new path metrics are stored in the path metric unit. The above two steps are repeated until the
trellis ends and the entire path metric and next-state metrics are obtained. Using these
metrics, the survivor management unit traces the optimum path back from the last values of the
next-state matrix, and the data is decoded.
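As an illustration of the encode/decode flow described above, here is a behavioural Python sketch for a small rate-1/2 code (constraint length 3, generators 7 and 5 octal, chosen so the four-state trellis stays readable, rather than the K = 9 code implemented in the paper). Function names are illustrative:

```python
G = (0b111, 0b101)  # generator polynomials (7, 5 in octal)
K = 3               # constraint length; 2**(K-1) = 4 trellis states

def conv_encode(bits):
    """Encode a bit list, two coded bits per input bit, zero-flushed."""
    state, out = 0, []
    for b in bits + [0] * (K - 1):            # flush the register with zeros
        reg = (b << (K - 1)) | state
        out += [bin(reg & g).count("1") & 1 for g in G]
        state = reg >> 1
    return out

def viterbi_decode(coded):
    """Branch metrics + add-compare-select + traceback (hard decision)."""
    n_states = 1 << (K - 1)
    INF = float("inf")
    pm = [0] + [INF] * (n_states - 1)         # path metrics, start in state 0
    history = []                              # survivor predecessor per state
    for t in range(0, len(coded), 2):
        rx = coded[t:t + 2]
        new_pm, pred = [INF] * n_states, [0] * n_states
        for s in range(n_states):             # previous trellis state
            if pm[s] == INF:
                continue
            for b in (0, 1):                  # hypothesised input bit
                reg = (b << (K - 1)) | s
                nxt = reg >> 1
                expect = [bin(reg & g).count("1") & 1 for g in G]
                bm = sum(r != e for r, e in zip(rx, expect))  # Hamming branch metric
                if pm[s] + bm < new_pm[nxt]:  # add-compare-select
                    new_pm[nxt], pred[nxt] = pm[s] + bm, s
        history.append(pred)
        pm = new_pm
    s = pm.index(min(pm))                     # minimum-metric final state
    bits = []
    for pred in reversed(history):            # trace the survivor path back
        bits.append((s >> (K - 2)) & 1)       # state MSB = bit that led here
        s = pred[s]
    bits.reverse()
    return bits[:len(bits) - (K - 1)]         # drop the flush bits
```

With the free distance of this small code, a single flipped channel bit is corrected by the decoder, which is the behaviour the text describes.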

FLIPFLOP:

In electronics, a flip-flop or latch is a circuit that has two stable states and can be
used to store state information. A flip-flop is a bistable multivibrator. The circuit can be
made to change state by signals applied to one or more control inputs and will have one
or two outputs. It is the basic storage element in sequential logic. Flip-flops and latches
are a fundamental building block of digital electronics systems used in computers,
communications, and many other types of systems. Flip-flops and latches are used as data
storage elements. A flip-flop stores a single bit (binary digit) of data; one of its two states
represents a "one" and the other represents a "zero". Such data storage can be used for
storage of state, and such a circuit is described as sequential logic. When used in a finite-
state machine, the output and next state depend not only on its current input, but also on
its current state (and hence, previous inputs). It can also be used for counting of pulses,
and for synchronizing variably-timed input signals to some reference timing signal. Flip-
flops can be either simple (transparent or opaque) or clocked (synchronous or edge-
triggered). Although the term flip-flop has historically referred generically to both simple
and clocked circuits, in modern usage it is common to reserve the term flip-
flop exclusively for discussing clocked circuits; the simple ones are commonly
called latches. Using this terminology, a latch is level-sensitive, whereas a flip-flop is
edge-sensitive. That is, when a latch is enabled it becomes transparent, while a flip-flop's
output only changes on a single type (positive-going or negative-going) of clock edge.

The D flip-flop is widely used. It is also known as a "data" or "delay" flip-flop.

The D flip-flop captures the value of the D-input at a definite portion of the clock cycle
(such as the rising edge of the clock). That captured value becomes the Q output. At other
times, the output Q does not change. The D flip-flop can be viewed as a memory
cell, a zero-order hold, or a delay line.

Truth table:

Clock        D   Qnext
Rising edge  0   0
Rising edge  1   1
Non-rising   X   Q

('X' denotes a Don't care condition, meaning the signal is irrelevant)
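The behaviour in the truth table can be modelled directly in a few lines of Python (a behavioural sketch, not hardware):

```python
# A behavioural sketch of the truth table above: Q changes only on a
# 0 -> 1 clock transition, and holds otherwise.
class DFlipFlop:
    def __init__(self):
        self.q = 0        # stored bit
        self._clk = 0     # previous clock level

    def tick(self, clk, d):
        if self._clk == 0 and clk == 1:   # rising edge: capture D
            self.q = d
        self._clk = clk                   # any other case: Q unchanged
        return self.q
```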

Most D-type flip-flops in ICs have the capability to be forced to the set or reset state
(which ignores the D and clock inputs), much like an SR flip-flop. Usually, the illegal
S = R = 1 condition is resolved in D-type flip-flops. By setting S = R = 0, the flip-flop
can be used as described above. Here is the truth table for the other possible S and R
configurations:

Inputs            Outputs
S  R  D  Clk  |  Q  Q'
0  1  X  X    |  0  1
1  0  X  X    |  1  0
1  1  X  X    |  1  1

These flip-flops are very useful, as they form the basis for shift registers, which are an
essential part of many electronic devices. The advantage of the D flip-flop over the D-
type "transparent latch" is that the signal on the D input pin is captured the moment the
flip-flop is clocked, and subsequent changes on the D input will be ignored until the next
clock event. An exception is that some flip-flops have a "reset" signal input, which will
reset Q (to zero), and may be either asynchronous or synchronous with the clock.

The above circuit shifts the contents of the register to the right, one bit position on each
active transition of the clock. The input X is shifted into the leftmost bit position.

Classical positive-edge-triggered D flip-flop


A positive-edge-triggered D flip-flop

This circuit consists of two stages implemented by SR NAND latches. The input stage
(the two latches on the left) processes the clock and data signals to ensure correct input
signals for the output stage (the single latch on the right). If the clock is low, both the
output signals of the input stage are high regardless of the data input; the output latch is
unaffected and it stores the previous state. When the clock signal changes from low to
high, only one of the output voltages (depending on the data signal) goes low and
sets/resets the output latch: if D = 0, the lower output becomes low; if D = 1, the upper
output becomes low. If the clock signal continues staying high, the outputs keep their
states regardless of the data input and force the output latch to stay in the corresponding
state as the input logical zero (of the output stage) remains active while the clock is high.
Hence the role of the output latch is to store the data only while the clock is low.

The circuit is closely related to the gated D latch as both the circuits convert the two D
input states (0 and 1) to two input combinations (01 and 10) for the output SR latch by
inverting the data input signal (both the circuits split the single D signal in two
complementary S and R signals). The difference is that in the gated D latch simple
NAND logical gates are used while in the positive-edge-triggered D flip-flop SR NAND
latches are used for this purpose. The role of these latches is to "lock" the active output
producing low voltage (a logical zero); thus the positive-edge-triggered D flip-flop can
also be thought of as a gated D latch with latched input gates.

Master–slave edge-triggered D flip-flop

A master–slave D flip-flop. It responds on the falling edge of the enable input (usually a
clock)
An implementation of a master–slave D flip-flop that is triggered on the rising edge of
the clock

A master–slave D flip-flop is created by connecting two gated D latches in series, and
inverting the enable input to one of them. It is called master–slave because the second
latch in the series only changes in response to a change in the first (master) latch.

For a positive-edge triggered master–slave D flip-flop, when the clock signal is low
(logical 0) the "enable" seen by the first or "master" D latch (the inverted clock signal) is
high (logical 1). This allows the "master" latch to store the input value when the clock
signal transitions from low to high. As the clock signal goes high (0 to 1) the inverted
"enable" of the first latch goes low (1 to 0) and the value seen at the input to the master
latch is "locked". Nearly simultaneously, the twice inverted "enable" of the second or
"slave" D latch transitions from low to high (0 to 1) with the clock signal. This allows the
signal captured at the rising edge of the clock by the now "locked" master latch to pass
through the "slave" latch. When the clock signal returns to low (1 to 0), the output of the
"slave" latch is "locked", and the value seen at the last rising edge of the clock is held
while the "master" latch begins to accept new values in preparation for the next rising
clock edge.
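The two-latch structure just described can be modelled behaviourally in a few lines of Python (a sketch of the logic, not a timing-accurate model):

```python
# A behavioural sketch: two gated D latches in series, with the master's
# enable driven by the inverted clock, give a positive-edge-triggered
# flip-flop.
class DLatch:
    def __init__(self):
        self.q = 0

    def update(self, enable, d):
        if enable:           # transparent while enabled
            self.q = d
        return self.q        # opaque (holds) otherwise

class MasterSlaveDFF:
    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def apply(self, clk, d):
        self.master.update(1 - clk, d)                # master open while clock is low
        return self.slave.update(clk, self.master.q)  # slave open while clock is high
```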

By removing the leftmost inverter in the circuit shown at the side, a D-type flip-flop that strobes on
the falling edge of a clock signal can be obtained. It has the following truth table:

D  Q  Clk      Qnext
0  X  Falling  0
1  X  Falling  1

A CMOS IC implementation of a "true single-phase edge-triggered flip-flop with reset"


Edge-triggered dynamic D storage element

An efficient functional alternative to a D flip-flop can be made with dynamic circuits
(where information is stored in a capacitance) as long as it is clocked often enough; while
not a true flip-flop, it is still called a flip-flop for its functional role. While the master–
slave D element is triggered on the edge of a clock, its components are each triggered by
clock levels. The "edge-triggered D flip-flop", as it is called even though it is not a true
flip-flop, does not have the master–slave properties.

Edge-triggered D flip-flops are often implemented in integrated high-speed operations
using dynamic logic. This means that the digital output is stored on parasitic device
capacitance while the device is not transitioning. This design of dynamic flip flops also
enables simple resetting since the reset operation can be performed by simply discharging
one or more internal nodes. A common dynamic flip-flop variety is the true single-phase
clock (TSPC) type which performs the flip-flop operation with little power and at high
speeds. However, dynamic flip-flops will typically not work at static or low clock speeds:
given enough time, leakage paths may discharge the parasitic capacitance enough to
cause the flip-flop to enter invalid states.
Adaptive Bus Encoding:

When data statistics are not known beforehand, and the transition probability of each bit line
changes over time, with probabilities among the bit lines varying from low to high, then
considering a fixed subgroup or cluster of bit lines reduces the savings margin, since the transition
correlation changes with time. The best way to enrich the transition reduction is to extract the signal
statistics before applying the encoding by observing the data over time. This ensures the establishment
of the transition correlation among bit lines adaptively and the dynamic formation of clusters of highly
correlated bit lines within a fixed observation window. This gives it an advantage over existing encoding
schemes, which cannot efficiently handle situations where the transmitted data characteristics change
abruptly.
Proposed Encoding Scheme: Theoretical Background
The proposed approach encodes the data to minimize the self-switching activity before they are
introduced into the off-chip bus with the objective of reducing average power dissipation. The main idea
is to evaluate the switching statistics for each bit line by observing data stream over an observation
window, and to establish the transition correlation among them. The highly correlated bit lines form a
cluster, which changes across different observation windows as local switching probability changes. In
each observation window, one bit line is designated as a basis line, which has maximum correlated
switching transitions with the other lines. The lines which have maximum correlation with the basis are
clustered together. In other words, the switching transitions of all the clustered lines of the bus in that
particular observation window have maximum projection component along the switching transitions of
the selected basis line. When all the clustered lines are XOR-ed with the basis, it leads to maximum
switching savings. The clustering information is sent using temporal redundancy between two adjacent
windows, while the basis is sent on an extra line as spatial redundancy. A sample of an encoded
observation window, with spatiotemporal redundancy, is shown in Fig. 1. The entire process can be
symbolically represented as follows.
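As a concrete illustration (not the paper's formal notation), the XOR-with-basis step and its inverse can be sketched in Python. Names are hypothetical, and the cluster membership is passed as a parameter here, whereas the actual scheme signals it through temporal redundancy between adjacent windows:

```python
# Sketch of the encoding step: clustered bit lines are XOR-ed with the
# basis line before driving the bus; the basis itself travels on one
# extra (spatially redundant) line so the receiver can undo the XOR.
def encode_window(words, basis, cluster):
    """words: list of N-bit lists; basis: index of the basis line;
    cluster: set of line indices XOR-ed with the basis."""
    encoded = []
    for w in words:
        b = w[basis]
        encoded.append([bit ^ b if i in cluster else bit
                        for i, bit in enumerate(w)] + [b])  # basis on extra line
    return encoded

def decode_window(encoded, cluster):
    """Recover the original words using the redundant basis line."""
    decoded = []
    for w in encoded:
        b = w[-1]                      # spatially redundant basis value
        decoded.append([bit ^ b if i in cluster else bit
                        for i, bit in enumerate(w[:-1])])
    return decoded
```

Because XOR is its own inverse, the round trip is lossless, which is the "no loss of information" property the scheme relies on.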

In this section, we demonstrate a statistical analysis of the proposed algorithm in a single observation
window. The same is also done for BIC and APBIC, and the advantage of our algorithm is clearly
revealed. Consider a data source which generates symbols S. At any time t, an N-bit word forms a symbol s ∈
S, where s = [b0, b1, . . . , bN−1]. An N-bit-wide bus carries the symbols over time, where each bit line exhibits
a different local transition probability p ∈ P, p = [p0, p1, . . . , pN−1]. The transition probability of each
bit line can be computed by dividing the occurrence frequency of S01 or S10 by the window size W, e.g., p0
= (S01/W)|b0, where S01 or S10 denotes the total number of 0-to-1 or 1-to-0 transitions. In our proposed
method, the expected savings from each window rely on the selection of the basis line. The probability of
being the basis line differs for different bit lines, since the expected savings contribution changes with the
bit line chosen as basis. Consider bi as the basis line for a bus of width N. The XOR operation between the
basis and another bit line bj at the final stage of our suggested encoding technique leads to a switching
reduction on bj. Table I shows the impact on bj of performing bi ⊕ bj (here ⊕ denotes the XOR operation)
and the probabilities of the different outcomes at any switching time. The combined probability of no
switching change on bj, represented here as χ5, takes the value (1 − pi). Extending the concept of the XOR
outcome to a transitional observation window of width W leads to the same three scenarios with different
probabilities, where the decrease or increase of the switching count can take any value between 1 and W.
The probability of k switching savings in any window, defined here as χ^{i,j}_{W,k}, from bit line bj is
expressed by (7), where the summation limits are from n = 1 to ((W/2) − ((k − 1)/2)) if k is odd, and n = 1
to ((W/2) − (k/2) + 1) if k is even. Equation (7) is obtained by considering all the scenarios for each
savings count. For example, when W = 8, the probability of six switching savings χ^{i,j}_{8,6} on bj comes
from two possible scenarios: six switching savings, or seven switching savings and one increase in
switching count within eight clock steps.
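The window statistics described above (per-line transition counts, probabilities p = S01∪S10 count over W, and the XOR savings on line bj) can be sketched in Python; function names are illustrative:

```python
# Sketch of the per-window statistics: toggle counts, transition
# probability over a window of W clock steps, and the switching savings
# on a line when it is XOR-ed with a candidate basis line.
def transitions(line):
    """Number of 0->1 or 1->0 toggles in a bit sequence."""
    return sum(a != b for a, b in zip(line, line[1:]))

def transition_probability(line, window):
    """Toggle count over the first `window` clock steps, divided by W."""
    return transitions(line[:window + 1]) / window

def xor_savings(basis_line, line):
    """Toggles removed from `line` by XOR-ing it with `basis_line`
    (positive when the two lines toggle in a correlated way)."""
    mixed = [a ^ b for a, b in zip(basis_line, line)]
    return transitions(line) - transitions(mixed)
```

When the two lines toggle together, the XOR-ed line is constant and the full toggle count is saved; uncorrelated lines yield little or no saving, which is why the cluster is restricted to highly correlated lines.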

Adaptive encoder:

In this section, we demonstrate a possible implementation of the proposed algorithm. Fig. 4 shows the basic block
diagram of the proposed encoding methodology for a bus of width N. It consists of a decision block, delay elements, a
set of XOR gates, and a multiplexer. The decision block consists of an eliminator, a cluster-formation unit, and a
basis-selection unit. It generates the control information corresponding to the cluster for each observation window,
and the multiplexer inserts the temporal redundancy. The sequence of operations in the decision block is shown in
Fig. 5. Each element of a row evaluates the savings contribution of a line and decides the presence of that bit line in
the cluster as per (5) for each observation window. The savings computation unit at the end of each row, which takes
the outputs of the bitwise savings computation units, computes the overall savings for each bit line if it were chosen
as basis. It can be implemented as a balanced carry-save adder tree. Since the basis line is an obvious candidate for
the cluster and incurs no savings from itself, the diagonal section of the matrix contributes no additional hardware
cost. Basis selection is simply an index-selection unit, which compares the overall savings contributions due to the
selection of each bit line as basis and finally determines the best basis among all as per (6). Later in this section,
we give a more detailed view of the bitwise savings computation unit and the eliminator block.
The presence of the eliminator block reduces the internal node switching of the encoder. It eliminates bit lines
from the cluster at an early stage without computing αi,j. It also groups the potential basis lines among all,
which further decreases the internal switching count. Fig. 6 shows the basic eliminator block when the window
size for transition observation is 16 clock cycles. The computation corresponds to the (bi, bj)
element of the Fig. 5 matrix where bi is chosen as basis. The hardware implementation is optimized by
removing the bit lines with switching probability less than 0.25 from basis consideration. This
simplification is justified since these lines have a very low probability of being chosen as the basis, and
eliminating them at this stage leads to power savings in the encoder. Regi or Regj stores the self-
switching count; the number of these registers is equal to the bus width. The hardware components of this
section are shared among all other elements of the matrix. A positive value of αi,j can be ensured if the
switching count of bj is greater than half of the switching count of the basis [see (3) and (4)]. The eliminator block
takes this scenario into account, using the comparator, to eliminate inessential computation. The comparator output
enables the computation of αi,j, as shown in Fig. 7. The diagram demonstrates one possible way to implement the
computation of bitwise switching savings for a particular observation window. One of the inputs of the final subtracter
is the switching count of the basis for that window; a 4-bit register stores the joint transition count of basis bi and
line bj. The hardware components (adder and register) and the stored value in the register are shared with the mirror
element of (bi, bj). This reduces the encoder dynamic power by minimizing internal node switching. The overall savings
computation unit is an adder stage which takes input from the bitwise savings computation blocks and computes the
total switching savings when a particular line is chosen as the basis. The basis selection unit takes the outputs of all
the overall savings computation units and finally selects one of the bit lines as basis [see (6)]. For the logical
interpretation of the basis selection unit, consider that t_n^i represents the nth bit from the left in the N-bit binary
representation of the number of switching savings obtained if the ith bit line is chosen as basis. P_n^i stores the
status of the ith line when the comparison has been done up to the nth bit from the left; if it is set, the ith line is
still in consideration for being the basis after the most significant n bits have been compared. The equation for the
basis selection and cluster information unit, for the MSB of the input to the basis selection unit, is given in (14),
where the summation represents logical OR. This has to be repeated for all N bits of the input. The gate-level
implementation of the Boolean expression in (14) is given in the Appendix. The decoder architecture is presented in
Fig. 8; it retrieves the data back to its original form. The cluster information is sent as a control signal at the
beginning of the encoded data for every observation window. Before decoding the data, the decoder extracts this
cluster information by observing the transitions between the control signal and the encoded data of the previous
observation window. This information is kept in a register for each bit line until the end of decoding for the
current window.
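A behavioural sketch of the decision flow may help: the eliminator threshold (p < 0.25) and the comparator condition on switching counts follow the text above, but the names and structure are illustrative, not a description of the hardware:

```python
# Sketch of the decision block: lines with transition probability below
# 0.25 are dropped from basis consideration (the eliminator), the
# comparator skips pairs whose savings cannot be positive, and the
# surviving line with the largest total XOR savings becomes the basis.
def toggles(line):
    return sum(a != b for a, b in zip(line, line[1:]))

def select_basis(lines, window):
    candidates = [i for i, ln in enumerate(lines)
                  if toggles(ln) / window >= 0.25]   # eliminator stage
    best, best_savings = None, -1
    for i in candidates:
        total = 0
        for j, ln in enumerate(lines):
            if j == i:
                continue                    # basis saves nothing from itself
            if toggles(ln) <= toggles(lines[i]) / 2:
                continue                    # comparator: savings cannot be positive
            mixed = [a ^ b for a, b in zip(lines[i], ln)]
            total += max(0, toggles(ln) - toggles(mixed))
        if total > best_savings:
            best, best_savings = i, total
    return best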
Software details:
6.1 Why (V) HDL?

 Interoperability
 Technology independence
 Design reuse
 Several levels of abstraction
 Readability
 Standard language
 Widely supported
What is Verilog:
Verilog, standardized as IEEE 1364, is a hardware description language (HDL)
used to model electronic systems. It is most commonly used in the design and verification
of digital circuits at the register-transfer level of abstraction. It is also used in the
verification of analog circuits and mixed-signal circuits.

Hardware description languages such as Verilog differ from software programming
languages because they include ways of describing the propagation time and signal
strengths (sensitivity). There are two types of assignment operators: a blocking
assignment (=) and a non-blocking (<=) assignment. The non-blocking assignment
allows designers to describe a state-machine update without needing to declare and use
temporary storage variables. Since these concepts are part of Verilog's language
semantics, designers could quickly write descriptions of large circuits in a relatively
compact and concise form. At the time of Verilog's introduction (1984), Verilog
represented a tremendous productivity improvement for circuit designers who were
already using graphical schematic capture software and specially written software
programs to document and simulate electronic circuits.

The designers of Verilog wanted a language with syntax similar to the C
programming language, which was already widely used in engineering software
development. Like C, Verilog is case-sensitive and has a basic preprocessor (though less
sophisticated than that of ANSI C/C++). Its control flow keywords (if/else, for, while,
case, etc.) are equivalent, and its operator precedence is compatible with C. Syntactic
differences include: required bit-widths for variable declarations, demarcation of
procedural blocks (Verilog uses begin/end instead of curly braces {}), and many other
minor differences. Verilog requires that variables be given a definite size. In C these sizes
are assumed from the 'type' of the variable (for instance an integer type may be 8 bits).
A Verilog design consists of a hierarchy of modules. Modules encapsulate design
hierarchy, and communicate with other modules through a set of declared input, output,
and bidirectional ports. Internally, a module can contain any combination of the
following: net/variable declarations (wire, reg, integer, etc.), concurrent and sequential
statement blocks, and instances of other modules (sub-hierarchies). Sequential statements
are placed inside a begin/end block and executed in sequential order within the block.
However, the blocks themselves are executed concurrently, making Verilog a dataflow
language.

Verilog's concept of 'wire' consists of both signal values (4-state: "1, 0, floating,
undefined") and signal strengths (strong, weak, etc.). This system allows abstract
modeling of shared signal lines, where multiple sources drive a common net. When a
wire has multiple drivers, the wire's (readable) value is resolved by a function of the
source drivers and their strengths. A subset of statements in the Verilog language
is synthesizable. Verilog modules that conform to a synthesizable coding style, known as
RTL (register-transfer level), can be physically realized by synthesis software. Synthesis
software algorithmically transforms the (abstract) Verilog source into a netlist, a logically
equivalent description consisting only of elementary logic primitives (AND, OR, NOT,
flip-flops, etc.) that are available in a specific FPGA or VLSI technology. Further
manipulations to the netlist ultimately lead to a circuit fabrication blueprint (such as a
photo mask set for an ASIC or a bitstream file for an FPGA).

Example:

module main;
  initial
    begin
      $display("Hello world!");
      $finish;
    end
endmodule
The Verilog Procedural Interface (VPI), originally known as PLI 2.0, is an
interface primarily intended for the C programming language. It allows behavioral
Verilog code to invoke C functions, and C functions to invoke standard Verilog system
tasks. The Verilog Procedural Interface is part of the IEEE 1364 Programming Language
Interface standard; the most recent edition of the standard is from 2005. VPI is sometimes
also referred to as PLI 2, since it replaces the deprecated Program Language Interface
(PLI).

While PLI 1 was deprecated in favor of VPI (a.k.a. PLI 2), PLI 1 is still commonly used
over VPI due to its much more widely documented tf_put, tf_get function interface that is
described in many Verilog reference books.

module toplevel(clock, reset);
  input clock;
  input reset;

  reg flop1;
  reg flop2;

  always @(posedge reset or posedge clock)
    if (reset)
      begin
        flop1 <= 0;
        flop2 <= 1;
      end
    else
      begin
        flop1 <= flop2;
        flop2 <= flop1;
      end
endmodule
The definition of constants in Verilog supports the addition of a width parameter. The
basic syntax is:

<Width in bits>'<base letter><number>

Examples:

 12'h123 - Hexadecimal 123 (using 12 bits)
 20'd44 - Decimal 44 (using 20 bits - 0 extension is automatic)
 4'b1010 - Binary 1010 (using 4 bits)
 6'o77 - Octal 77 (using 6 bits)

SOFTWARE INFORMATION:
Create a New Project

Create a new ISE project which will target the FPGA device on the Spartan-3 Startup Kit demo board. To create a new project:

1. Select File > New Project... The New Project Wizard appears.
2. Type tutorial in the Project Name field.
3. Enter or browse to a location (directory path) for the new project. A tutorial subdirectory is created automatically.
4. Verify that HDL is selected from the Top-Level Source Type list.
5. Click Next to move to the device properties page.
6. Fill in the properties in the table as shown below:
♦ Product Category: All
♦ Family: Spartan3
♦ Device: XC3S200
♦ Package: FT256
♦ Speed Grade: -4
♦ Top-Level Source Type: HDL
♦ Synthesis Tool: XST (VHDL/Verilog)
♦ Simulator: ISE Simulator (VHDL/Verilog)
♦ Preferred Language: Verilog (or VHDL)
♦ Verify that Enable Enhanced Design Summary is selected.
Leave the default values in the remaining fields. When the table is complete, your project
properties will look like the following:

Creating a Verilog Source:


Create the top-level Verilog source file for the project as follows:

1. Click New Source in the New Project dialog box.
2. Select Verilog Module as the source type in the New Source dialog box.
3. Type in the file name counter.
4. Verify that the Add to Project checkbox is selected.
5. Click Next.
6. Declare the ports for the counter design by filling in the port information as shown below:
The source file containing the counter module displays in the Workspace, and the counter displays in the
Sources tab, as shown below:

Using Language Templates (Verilog):

The next step in creating the new source is to add the behavioral description for the
counter. Use a simple counter code example from the ISE Language Templates and
customize it for the counter design.

1. Place the cursor on the line below the output [3:0] COUNT_OUT; statement.
2. Open the Language Templates by selecting Edit → Language Templates… Note: You can tile the Language Templates and the counter file by selecting Window → Tile Vertically to make them both visible.
3. Using the “+” symbol, browse to the following code example: Verilog → Synthesis Constructs → Coding Examples → Counters → Binary → Up/Down Counters → Simple Counter.
4. With Simple Counter selected, select Edit → Use in File, or select the Use Template in File toolbar button. This step copies the template into the counter source file.
5. Close the Language Templates.
Design Simulation:

Verifying Functionality using Behavioral Simulation

Create a test bench waveform containing input stimulus you can use to verify the functionality of the counter module. The test bench waveform is a graphical view of a test bench. Create the test bench waveform as follows:

1. Select the counter HDL file in the Sources window.
2. Create a new test bench source by selecting Project → New Source.
3. In the New Source Wizard, select Test Bench WaveForm as the source type, and type counter_tbw in the File Name field.
4. Click Next.
5. The Associated Source page shows that you are associating the test bench waveform with the source file counter. Click Next.
6. The Summary page shows that the source will be added to the project, and it displays the source directory, type, and name. Click Finish.
7. You need to set the clock frequency, setup time and output delay times in the Initialize Timing dialog box before the test bench waveform editing window opens. The requirements for this design are the following:

♦ The counter must operate correctly with an input clock frequency = 25 MHz.

♦ The DIRECTION input will be valid 10 ns before the rising edge of CLOCK.

♦ The output (COUNT_OUT) must be valid 10 ns after the rising edge of CLOCK.

The design requirements correspond with the values below. Fill in the fields in
the Initialize Timing dialog box with the following information:

♦ Clock High Time: 20 ns.

♦ Clock Low Time: 20 ns.

♦ Input Setup Time: 10 ns.


♦ Output Valid Delay: 10 ns.

♦ Offset: 0 ns.

♦ Global Signals: GSR (FPGA). Note: When GSR (FPGA) is enabled, 100 ns is
added to the Offset value automatically.

♦ Initial Length of Test Bench: 1500 ns.

8. Click Finish to complete the timing initialization.
9. The blue shaded areas that precede the rising edge of the CLOCK correspond to the Input Setup Time in the Initialize Timing dialog box. Toggle the DIRECTION port to define the input stimulus for the counter design as follows:

♦ Click on the blue cell at approximately 300 ns to assert DIRECTION high so that the counter will count up.

♦ Click on the blue cell at approximately 900 ns to assert DIRECTION low so that the counter will count down.
Simulation Results:
