Professional Documents
Culture Documents
ISSN: 2278-0181
Vol. 2 Issue 7, July - 2013
Design Of High Performance IEEE- 754 Single Precision (32 bit) Floating Point
Adder Using VHDL
Preethi Sudha Gollamudi, M. Kamaraju
to ieee 754 standard with optimal chip area synthesis report. VHDL code for floating point
and high performance using VHDL .The adder is written in Xilinx 8.1i and its synthesis
report isshown in Design process of Xilinx which
proposed architecture is implemented on will outline various parameters like number of slices
Xilinx ISE Simulator.Results of proposed used, number of slice flipflop used, number of 4
architecture are compared with the existed input LUTs, number of bonded IOBs,number of
architecture and have observed reduction in global CLKs. Floating point addition is most widely
area and delay . Further, this project can be used operation in DSP/Math processors, Robots, Air
extendable by using any other type of faster traffic controller, Digital computers because of its
adder in terms of area, speed and power. raising application the main emphasis is on the
implementation of floating point adder effectively
Keywords:Floatingpoint,Adder,FPGA,IEEE- such that it uses less chip area with more clock
754,VHDL,Single precision. speed.
Many fields of science, engineering Floating point numbers are one possible way
and finance require manipulating real of representing real numbers in binary format; the
numbers efficiently. Since the first computers IEEE 754 standard presents two different floating
appeared, many different ways of approximating point formats, Binary interchange format and
real numbers on it have been introduced. One of Decimal interchange format. Fig. 1 shows the IEEE
them, the floating point arithmetic, is clearly the 754 single precision binary format representation; it
most efficient way of representing real numbers consists of a one bit sign (S), an eight bit exponent
in computers. Representing an infinite, continuous (E), and a twenty three bit fraction (M or Mantissa).
set (real numbers) with a finite set (machine If the exponent is greater than 0 and smaller than
numbers) is not an easy task: some compromises 255, and there is 1 in the MSB of the significand
must be found between speed, accuracy, ease of
then the number is said to be a normalized number; 2.2 Conversion of decimal to floating point
in this case the real number is represented by (1) numbers
Conversion of Decimal to Floating point 32
bit format is explained with example. Let us take an
example of a decimal number that how could it will
be converted into floating format. Enter a decimal
”Figure 1. IEEE single precision floating point number suppose 129.85 before converting into
format” floating format this number is converted into binary
value which is 10000001.110111. After conversion
Z=(-1)S * 2(E-bias) * (1.M) (1) move the radix point to the left such that there will
be only one bit which is left of the radix point and
Where M= m22 2-1 + m21 2-2 + m20 2-3 +........+m1 this bit must be 1 this bit is known as hidden bit and
2-22+ m0 2-23; also made above number of 24 bit including hidden
bit which is like 1.00000011101110000000000 the
Bias=127 number which is after the radix point is called
mantissa which is of 23 bits and the whole number is
2.1 IEEE 754 Standard for Binary called significand which is of 24 bits. Count the
Floating-Point Arithmetic: number of times the radix point is shifted say „x‟.
But in above case there is 7 times shifting of radix
The IEEE 754 Standard for Floating-Point point to the left. This value must be added to 127 to
Arithmetic is the most widely-used standard for get the exponent value i.e. original exponent value is
floating-point computation, and is followed by 127 + „x‟. In above case exponent is 127 + 7 = 134
many hardware (CPU and FPU) and software which is 10000110. Sign bit i.e. MSB is „0‟ because
implementations. The standard specifies: number is +ve. Now assemble result into 32 bit
Basic and extended floating-point number
RT
format which is sign, exponent, mantissa.
formats
Add, subtract, multiply, divide, square 4. Existing Method
root, remainder, and compare operations
IJE
Conversions between integer and floating- Floating point addition algorithm can be
point formats explained in two cases with example. Case I is when
Conversions between different floating- both the numbers are of same sign i.e. when both the
point formats numbers are either +ve or –ve means the MSB of
The different format parameters for simple and both the numbers are either 1 or 0. Case II when
double precision are shown in table 1. both the numbers are of different sign i.e. when one
number is +ve and other number is –ve means the
MSB of one number is 1 and other is 0.
Step 4:- Amount of shifting i.e. „y‟ is added to 1.Some possible combinations of A and B have a
exponent of N2 value. New exponent value of E2= direct result where there is no need of adder or some
previous E2 + „y‟. Now result is in normalize form other resources,but in the existing work adder is
because E1 = E2. used for direct result.Hence it is a time taking
process.
Step 5:- Is N1 and N2 have different sign „no‟. In
this case N1 and N2 have same sign. 2. IEEE 754 does not say anything about the
operations between subnormal and normal numbers
Step 6:- Add the significands of 24 bits each i.e Mixed Numbers(one operand is normal and the
including hidden bit S=S1+S2. other is subnormal).
Step 10:- Sign of the result i.e. MSB = Sign of the “Figure 2.Black box view of single precision
larger number either MSB of N1or it can be MSB
Adder”
of N2. Step 11:- Assemble result into 32 bit format
excluding 24th bit of significand i.e. hidden bit .
5.1.2 Subnormal numbers case This is achieved by calculating the carry bits before
the sum which reduces the wait time to calculate the
The operation using subnormal numbers is the result of the large value bits. The code has been
easiest one.It is designed in just one block and the designed implementing a 1-bit CLA structure and
procedure is as follows: generating the other components up to 28 (the
1. Obtaining the two sign bits and both mantissas. number of bits of the adder) by the function
2. Making a comparison between both A and B generate.
numbers in order to acquire the largest number.
3. Fixing the result exponent in zero.
5.1.3 Mixed Numbers case
RT
IJE
RT
“Figure 8.Simulation Report for special cases-infinity”
IJE
RT
IJE
RT
IJE
RT
IJE
In this work VHDL code is written for single synthesized using xilinx 9.1.The proposed work
precision floating point adder and implemented in consumes less chip area when compared to the
Xilinx ISE simulator. existing one and has less combinationa delay.
• Maximum combinational path delay for [10] N. Shirazi, A. Walters, and P. Athanas,
proposed one:16.988ns “Quantitative Analysis of Floating Point Arithmetic
on FPGA Based Custom Computing Machines,”
6.Conclusion Proceedings of the IEEE Symposium on FPGAs