
[ DATA REPRESENTATION] Mr.

THOBIUS
May 8, 2019 JOSEPH +255717341960

NUMBER SYSTEM:

There are many number systems, including decimal, binary, octal and hexadecimal.

A. Decimal Number System:

The decimal number system is composed of 10 numerals or symbols, 0 to 9. Using these
symbols as digits we can express any quantity. It is also called the base-10 system. It is a positional value
system in which the value of a digit depends on its position.

These digits can represent any value, for example: 754.

The value is formed by the sum of each digit multiplied by the base (in this case 10, because there are
10 digits in the decimal system) raised to the power of the digit's position (counting from zero at the right):

754 = 7 * 10^2 + 5 * 10^1 + 4 * 10^0

Decimal numbers would be written like this: 127(10) 11(10) 5673(10)

B. Binary Number System:

In the binary number system there are only two digits, i.e. 0 and 1. It is a base-2 system. It can be used to
represent any quantity that can be represented in decimal or any other number system. It is a positional value
system, where each binary digit has its own value or weight expressed as a power of 2.

NB: some high-level programming languages denote binary numbers with the prefix 0b or 0B
(e.g., 0b1001000), or the prefix b with the bits quoted (e.g., b'10001111').

A binary digit is called a bit. Eight bits are called a byte (why an 8-bit unit? Probably because 8 = 2^3).

The following are some examples of binary numbers:

101101(2) 11(2) 10110(2)

Conversion from Decimal to Binary or Binary to Decimal

Convert from decimal to binary Χ(10)->Χ(2)

Integer

45(10)->Χ(2)

Div       Quotient   Remainder   Binary Number (X)
45 / 2    22         1           1
22 / 2    11         0           01
11 / 2    5          1           101
5 / 2     2          1           1101
2 / 2     1          0           01101
1 / 2     0          1           101101

45(10)->101101(2)
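
The repeated-division table above can be sketched in Python as follows. This is only an illustrative sketch: the function name to_base and the DIGITS lookup string are assumptions made for this example, not part of the notes.

```python
DIGITS = "0123456789ABCDEF"  # digit symbols for bases up to 16

def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer to a digit string in the given base (2 to 16)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)    # the quotient feeds the next row of the table
        digits.append(DIGITS[remainder])  # remainders appear least-significant digit first
    return "".join(reversed(digits))

print(to_base(45, 2))   # 101101
```

The same function reproduces the octal and hexadecimal integer conversions shown later when base is 8 or 16.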

Fractional Part

0.182(10)->Χ(2)

Mul          Product   Integer value   Binary Number (X)
0.182 * 2    0.364     0               0.0
0.364 * 2    0.728     0               0.00
0.728 * 2    1.456     1               0.001
0.456 * 2    0.912     0               0.0010
0.912 * 2    1.824     1               0.00101
0.824 * 2    1.648     1               0.001011
0.648 * 2    1.296     1               0.0010111

0.182(10)->0.0010111(2) (after we cut the number to 7 fractional bits)
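
A minimal sketch of the repeated-multiplication method for the fractional part is given below. It assumes exact arithmetic with Python's fractions.Fraction so that the digits match the hand-worked table; the name fraction_to_base is hypothetical.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"

def fraction_to_base(value: Fraction, base: int, places: int) -> str:
    """Return the first `places` digits after the radix point (cut, not rounded)."""
    digits = []
    for _ in range(places):
        value *= base
        integer_part = int(value)        # the digit produced by this step
        digits.append(DIGITS[integer_part])
        value -= integer_part            # carry only the fractional part forward
    return "0." + "".join(digits)

print(fraction_to_base(Fraction("0.182"), 2, 7))   # 0.0010111
```

Using a plain float instead of Fraction would usually give the same leading digits, but binary rounding error can change later digits, which is why exact fractions are used in this sketch.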

Conversion from Binary to Decimal


Convert from binary to decimal Χ(2)->Χ(10)

101101.0010111(2)->Χ(10)

Index the digits of the number

digit:   1   0   1   1   0   1  .  0   0   1   0   1   1   1
index:   5   4   3   2   1   0    -1  -2  -3  -4  -5  -6  -7

Multiply each digit by its weight and add

1 * 2^5 + 0 * 2^4 + 1 * 2^3 + 1 * 2^2 + 0 * 2^1 + 1 * 2^0 + 0 * 2^-1 + 0 * 2^-2 + 1 * 2^-3 + 0 * 2^-4 + 1 * 2^-5 + 1 * 2^-6 + 1 * 2^-7 =

32 + 0 + 8 + 4 + 0 + 1 + 0 + 0 + 0.125 + 0 + 0.03125 + 0.015625 + 0.007813

= 45.179688(10)

(The result is slightly below 45.182 because the fractional part was cut to 7 bits.)
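
The positional-weight sum can be checked with a short Python sketch. The helper from_base is an assumption of this example; it does not validate that each digit is smaller than the base, and it uses exact fractions so the result is not affected by rounding.

```python
from fractions import Fraction

def from_base(text: str, base: int) -> Fraction:
    """Evaluate a digit string such as '101101.0010111' in the given base (2 to 16)."""
    whole, _, frac = text.partition(".")
    total = Fraction(0)
    # Integer digits, taken from the right, carry weights base^0, base^1, ...
    for position, digit in enumerate(reversed(whole)):
        total += int(digit, 16) * Fraction(base) ** position
    # Fractional digits, taken from the left, carry weights base^-1, base^-2, ...
    for position, digit in enumerate(frac, start=1):
        total += int(digit, 16) * Fraction(base) ** -position
    return total

print(float(from_base("101101.0010111", 2)))   # 45.1796875
```

The same function can be applied to the octal and hexadecimal strings converted back to decimal later in these notes.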

C. Octal Number System:

It has eight unique symbols i.e. 0 to 7. It has base of 8. Each octal digit has its own value or weight
expressed as a power of 8.

Convert from decimal to octal Χ(10)->Χ(8)


Integer
45(10)->X(8)
Div       Quotient   Remainder   Octal Number (X)
45 / 8    5          5           5
5 / 8     0          5           55

45(10)->55(8)

Fractional Part

0.182(10)->Χ(8)

Mul          Product   Integer   Octal Number (X)
0.182 * 8    1.456     1         0.1
0.456 * 8    3.648     3         0.13
0.648 * 8    5.184     5         0.135
0.184 * 8    1.472     1         0.1351
0.472 * 8    3.776     3         0.13513
0.776 * 8    6.208     6         0.135136

0.182(10)->0.135136(8) (after we cut the number to 6 fractional digits)

D. Hexadecimal Number System:

The hexadecimal system uses base 16. It has 16 possible digit symbols. It uses the digits 0 through 9 plus
the letters A, B, C, D, E, and F as 16 digit symbols. Each hexadecimal digit has its own value or weight
expressed as a power of 16.


Some programming languages denote hex numbers with prefix 0x or 0X (e.g., 0x1A3C5F), or prefix x with
hex digits quoted (e.g., x'C3A4D98B').

Each hexadecimal digit is also called a hex digit. Most programming languages accept lowercase 'a' to 'f' as
well as uppercase 'A' to 'F'.

Computers use the binary system in their internal operations, as they are built from binary digital electronic
components with 2 states - on and off. However, writing or reading a long sequence of binary bits is
cumbersome and error-prone (try to read this binary string: 1011 0011 0100 0011 0001 1101 0001 1000B,
which is the same as hexadecimal B343 1D18H). The hexadecimal system is used as a compact form or
shorthand for binary bits. Each hex digit is equivalent to 4 binary bits.
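
As an illustration, Python is one of the languages that accepts the 0b and 0x prefixes described above, so the equivalence of the long bit string and its hexadecimal shorthand can be checked directly. The variable names below are chosen only for this example.

```python
bits = 0b1011_0011_0100_0011_0001_1101_0001_1000   # underscores only aid reading
hexa = 0xB343_1D18

print(bits == hexa)       # True
print(hex(bits))          # 0xb3431d18
print(bin(0x2D))          # 0b101101
print(int("101101", 2))   # 45
```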

Decimal   Binary   Hexadecimal   Octal
0         0000     0             0
1         0001     1             1
2         0010     2             2
3         0011     3             3
4         0100     4             4
5         0101     5             5
6         0110     6             6
7         0111     7             7
8         1000     8             10
9         1001     9             11
10        1010     A             12
11        1011     B             13
12        1100     C             14
13        1101     D             15
14        1110     E             16
15        1111     F             17
16        10000    10            20

Convert from decimal to hexadecimal Χ(10)->Χ(16)


Integer
45(10)->X(16)
Div       Quotient   Remainder   Hex Number (X)
45 / 16   2          13          D    (since 13 decimal is D in hexadecimal)
2 / 16    0          2           2D   (see the table)

45(10)->2D(16)

Fractional Number

0.182(10)->Χ(16)

Mul           Product   Integer   Hex Number (X)
0.182 * 16    2.912     2         0.2
0.912 * 16    14.592    14        0.2E
0.592 * 16    9.472     9         0.2E9
0.472 * 16    7.552     7         0.2E97
0.552 * 16    8.832     8         0.2E978
0.832 * 16    13.312    13        0.2E978D

0.182(10)->0.2E978D(16) (after we cut the number to 6 fractional digits)

Convert from octal to decimal Χ(8)->Χ(10)

55.135136(8)->Χ(10)

Index the digits of the number

digit:   5   5  .  1   3   5   1   3   6
index:   1   0    -1  -2  -3  -4  -5  -6

Multiply each digit by its weight and add

5 * 8^1 + 5 * 8^0 + 1 * 8^-1 + 3 * 8^-2 + 5 * 8^-3 + 1 * 8^-4 + 3 * 8^-5 + 6 * 8^-6 =

40 + 5 + 0.125 + 0.046875 + 0.009766 + 0.000244 + 0.0000916 + 0.0000229

≈ 45.181999(10)

Convert from hexadecimal to decimal Χ(16)->Χ(10)

2D.2E978D (16)->Χ(10)

Index the digits of the number

digit:   2   D  .  2   E   9   7   8   D
value:   2   13    2   14  9   7   8   13
index:   1   0    -1  -2  -3  -4  -5  -6

Multiply each digit by its weight and add

2 * 16^1 + 13 * 16^0 + 2 * 16^-1 + 14 * 16^-2 + 9 * 16^-3 + 7 * 16^-4 + 8 * 16^-5 + 13 * 16^-6 =

32 + 13 + 0.125 + 0.0546875 + 0.00219727 + 0.00010681 + 0.00000762 + 0.00000077

= 45.18199997(10)

Convert from binary to octal: group the bits in threes, working from right to left before the binary point
and from left to right after it (padding with zeros where needed), then write the octal digit for each group
(given in the table above).

110101000.101010(2)->X(8)

110  101  000 . 101  010
 ||   ||   ||    ||   ||
 \/   \/   \/    \/   \/
 6    5    0  .  5    2     (in the table, 110(2) corresponds to 6(8))

110101000.101010(2)->650.52(8)

Convert from binary to hexadecimal: group the bits in fours, working from right to left before the binary
point and from left to right after it (padding with zeros where needed), then write the hexadecimal digit for
each group (given in the table above).

110101000.101010(2)->X(16)

0001  1010  1000 . 1010  1000
 ||    ||    ||     ||    ||
 \/    \/    \/     \/    \/
 1     A     8   .  A     8

110101000.101010(2)->1A8.A8(16)

Other examples: 1001001010B = 0010 0100 1010B = 24AH

10001011001011B = 0010 0010 1100 1011B = 22CBH
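
The grouping rule for binary to octal and binary to hexadecimal can be sketched as one Python function. The name regroup and its group parameter (3 for octal, 4 for hexadecimal) are assumptions made for this example.

```python
def regroup(binary: str, group: int) -> str:
    """Convert a binary string to octal (group=3) or hexadecimal (group=4)."""
    whole, _, frac = binary.partition(".")
    # Pad the integer part on the left and the fraction on the right
    # so that both lengths are multiples of the group size.
    whole = whole.zfill(-(-len(whole) // group) * group)
    frac = frac.ljust(-(-len(frac) // group) * group, "0")

    def digits(bits: str) -> str:
        chunks = [bits[i:i + group] for i in range(0, len(bits), group)]
        return "".join(format(int(chunk, 2), "X") for chunk in chunks)

    result = digits(whole)
    return result + "." + digits(frac) if frac else result

print(regroup("110101000.101010", 3))   # 650.52
print(regroup("110101000.101010", 4))   # 1A8.A8
```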


NB: It is important to note that hexadecimal number provides a compact form or shorthand for representing
binary bits.

Convert from octal or hexadecimal to binary: in this conversion write out the binary pattern of each digit,
using three binary digits for each octal digit and four binary digits for each hexadecimal digit.

Convert from octal to binary

650.52(8)->X(2)

6 5 0 . 5 2

|| || || || ||

\/ \/ \/ \/ \/

110 101 000 .101 010

650.52(8)->110101000.101010(2)

Convert from hexadecimal to binary

1A8.A8(16)->X(2)

 1     A     8   .  A     8
 ||    ||    ||     ||    ||
 \/    \/    \/     \/    \/
0001  1010  1000 . 1010  1000

1A8.A8(16)->000110101000.10101000(2)
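
Going the other way, each octal or hexadecimal digit is simply expanded into its fixed-width bit pattern. The sketch below assumes a hypothetical helper named expand; leading and trailing zeros are kept exactly as the digit-by-digit expansion produces them.

```python
def expand(number: str, bits_per_digit: int) -> str:
    """Expand an octal (3 bits per digit) or hex (4 bits per digit) string into binary."""
    out = []
    for ch in number:
        if ch == ".":
            out.append(".")   # the radix point stays where it is
        else:
            out.append(format(int(ch, 16), f"0{bits_per_digit}b"))
    return "".join(out)

print(expand("650.52", 3))   # 110101000.101010
print(expand("1A8.A8", 4))   # 000110101000.10101000
```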

Convert the following:

i. (101001.0101)2 to Decimal
ii. (236)8 to Binary
iii. (266)10 to Hexadecimal
iv. (AF2)16 to Binary
v. (0101110.1010110)2 to Hexadecimal

BIT, BYTE AND WORDS

A computer uses a fixed number of bits to represent a piece of data, which could be a number, a character, or
something else. An n-bit storage location can represent up to 2^n distinct entities. For example, a 3-bit memory
location can hold one of these eight binary patterns: 000, 001, 010, 011, 100, 101, 110, or 111. Hence, it can
represent at most 8 distinct entities.


Integers, for example, can be represented in 8, 16, 32 or 64 bits. You, as the programmer,
choose an appropriate bit-length for your integers. Your choice will impose a constraint on the range of
integers that can be represented. Besides the bit-length, an integer can be represented in various
representation schemes, e.g., unsigned vs. signed integers. An 8-bit unsigned integer has a range of 0 to
255, while an 8-bit signed integer has a range of -128 to 127, both representing 256 distinct numbers.

It is important to note that a computer memory location merely stores a binary pattern. It is entirely up to you,
as the programmer, to decide how these patterns are to be interpreted. For example, the 8-bit binary
pattern "0100 0001B" can be interpreted as the unsigned integer 65, or the ASCII character 'A', or some
secret information known only to you. In other words, you have to first decide how to represent a piece of
data in a binary pattern before the binary patterns make sense. The interpretation of a binary pattern is called
data representation or encoding. Furthermore, it is important that the data representation schemes are
agreed upon by all the parties, i.e., industrial standards need to be formulated and strictly followed.
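
A small Python sketch of this point: the same 8-bit pattern gives different results depending on the interpretation chosen. The variable names are illustrative only.

```python
pattern = "01000001"              # the 8-bit pattern from the text

as_unsigned = int(pattern, 2)     # interpreted as an unsigned integer
as_character = chr(as_unsigned)   # interpreted as an ASCII code

print(as_unsigned)    # 65
print(as_character)   # A
```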

Once you have decided on the data representation scheme, certain constraints, in particular the precision and
range, will be imposed. Hence, it is important to understand data representation in order to write correct and
high-performance programs.

The smallest unit is the bit (short for binary digit), which is either 0 or 1.

Concentrating on binary numbers, a single binary digit is usually termed a bit. To represent numbers we
group together bits into larger sequences; we usually term these binary vectors. Hence we have an n-bit
binary vector or bit-vector, where n can be 2, 4, 8, 16, 32, 64, 128 ...

Example of n bit vector:

N=1 = 0, 1
N=2 = 00,01,10,11
N=3 = 000, 001,010,011,100,101,110,111
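
The lists above can be generated for any n. The sketch below uses itertools.product from the Python standard library; the function name bit_vectors is an assumption of this example.

```python
from itertools import product

def bit_vectors(n: int) -> list[str]:
    """Return all 2^n bit-vectors of length n, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

print(bit_vectors(2))        # ['00', '01', '10', '11']
print(len(bit_vectors(3)))   # 8
```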

We have special terms for vectors of certain lengths. A vector of eight bits is termed a byte (or octet) while a
vector of four bits is a nibble (or nybble). A word is the vector of n bits that the computer can fetch
in one cycle. For example, when you hear a computer processor described as being 32-bit or 64-bit
this is, roughly speaking, specifying that the word size is 32 or 64 bits. This implies that the computer
fetches 32 bits or 64 bits in one operation.

Endianness convention

We use the concatenation operator (written as x : y) to join together two or more bit-vectors; for example
111 : 1011 describes the bit-vector 1111011. We do, however, need some standard way to make sure we know
which bit in the vector is the first bit and which is the last bit. The technique used to identify the
first and last bit is called the endianness convention.

Types of Endianness convention

Little-endian convention

Usually, and as above, we use a little-endian convention by reading the bits from right to left, giving each bit
an index. So writing the vector 1111011 we have that

x0 = 1
x1 = 1
x2 = 0
x3 = 1
x4 = 1
x5 = 1
x6 = 1

If we interpret this vector as a binary number, it therefore represents

1111011(2) = 123(10).

In this case, the right-most bit (bit zero or x0) is termed the least-significant bit (LSB) while the left-most bit
(bit n−1) is the most-significant bit (MSB).

Big-endian convention

A big-endian naming convention reverses the indices so we now read them left to right. The left-most bit (bit
zero or x0) is now the most-significant and the right-most (bit n−1 or xn−1) is the least-significant. If the same
vector is interpreted in big-endian notation, we find that

x0 = 1
x1 = 1
x2 = 1
x3 = 1
x4 = 0
x5 = 1
x6 = 1

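and if the indices are used as weights in the same way as before (bit xi contributing xi * 2^i), the digits are
read in reverse order and the value now represents 1101111(2) = 111(10).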
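
The two index assignments, and the values obtained when index i is used as the power of two, can be reproduced with the sketch below. The helper name weighted_value is hypothetical.

```python
def weighted_value(bits_by_index: list[int]) -> int:
    """Sum x_i * 2^i, i.e. treat index i as the power of two."""
    return sum(bit << i for i, bit in enumerate(bits_by_index))

vector = "1111011"

little = [int(b) for b in reversed(vector)]   # x0 is the right-most bit
big = [int(b) for b in vector]                # x0 is the left-most bit

print(little, weighted_value(little))   # [1, 1, 0, 1, 1, 1, 1] 123
print(big, weighted_value(big))         # [1, 1, 1, 1, 0, 1, 1] 111
```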

Information Units:

1 bit = 0 or 1
1 Byte = 8 bits
1 Nibble = 4 bits
1 Kilo Byte = 1024 Bytes = 2^10 Bytes
1 Mega Byte = 1024 KB = 2^10 KB
1 Giga Byte = 1024 MB = 2^10 MB
1 Tera Byte = 1024 GB = 2^10 GB
1 Peta Byte = 1024 TB = 2^10 TB
1 Exa Byte = 1024 PB = 2^10 PB
1 Zetta Byte = 1024 EB = 2^10 EB
1 Yotta Byte = 1024 ZB = 2^10 ZB

REPRESENTING NUMBERS IN A COMPUTER

INTEGER REPRESENTATION

Integers are whole numbers or fixed-point numbers with the radix point fixed after the least-significant bit.
They are in contrast to real numbers or floating-point numbers, where the position of the radix point varies. It is
important to take note that integers and floating-point numbers are treated differently in computers. They
have different representations and are processed differently (e.g., floating-point numbers are processed in a
so-called floating-point processor). Floating-point numbers will be discussed later.

Computers use a fixed number of bits to represent an integer. The commonly-used bit-lengths for integers
are 8-bit, 16-bit, 32-bit or 64-bit. Besides bit-lengths, there are two representation schemes for integers:

1. Unsigned Integers: can represent zero and positive integers.
2. Signed Integers: can represent zero, positive and negative integers. Three representation schemes
have been proposed for signed integers:
   1. Sign-Magnitude representation
   2. 1's Complement representation
   3. 2's Complement representation

You, as the programmer, need to decide on the bit-length and representation scheme for your integers,
depending on your application's requirements. Suppose that you need a counter for counting a small quantity
from 0 up to 200; you might choose the 8-bit unsigned integer scheme as there are no negative numbers
involved.

Consider the following kinds of numbers:

Unsigned numbers: 0, 1, 2, ...

These are positive numbers including zero. Unsigned integers can represent zero and positive integers, but
not negative integers. One of the methods used to encode these numbers in binary is called Binary Coded
Decimal (BCD). The Binary Coded Decimal (BCD) system is one way of encoding decimal digits as binary
vectors. There are several standards for BCD. The simplest of these is called Simple Binary Coded
Decimal (SBCD) or BCD 8421: one represents a single decimal digit as a vector of four binary digits.

So for example, the number 123(10) would be represented (with spacing for clarity) by the vector 0001 0010
0011.
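
A minimal sketch of BCD 8421 encoding in Python, assuming a hypothetical helper named to_bcd that handles non-negative integers only:

```python
def to_bcd(number: int) -> str:
    """Encode each decimal digit of a non-negative integer as a 4-bit group (BCD 8421)."""
    return " ".join(format(int(digit), "04b") for digit in str(number))

print(to_bcd(123))   # 0001 0010 0011
```
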
An n-bit pattern can represent 2^n distinct integers. An n-bit unsigned integer can represent integers from 0
to (2^n)-1, as tabulated below:

n (bits)   Minimum   Maximum
8          0         (2^8)-1  (=255)
16         0         (2^16)-1 (=65,535)
32         0         (2^32)-1 (=4,294,967,295) (9+ digits)
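
The table can be reproduced with a one-line formula. In the sketch below, unsigned_range is an illustrative name, not a standard function.

```python
def unsigned_range(n: int) -> tuple[int, int]:
    """Return the (minimum, maximum) value of an n-bit unsigned integer."""
    return 0, 2**n - 1

for n in (8, 16, 32):
    print(n, unsigned_range(n))
```

For the counter example above, 200 lies inside unsigned_range(8), so 8 bits are enough.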

Signed numbers: ..., -4, -3, -2, -1, 0, 1, 2, 3, 4, ...

Sometimes, we need something extra at the start of the encoding to tell us if the number is positive or
negative. Signed integers can represent zero, positive integers, as well as negative integers. Three
representation schemes are available for signed integers:

• Sign-Magnitude representation
• 1's Complement representation
• 2's Complement representation

In all the above three schemes, the most-significant bit (msb) is called the sign bit. The sign bit is used to
represent the sign of the integer - with 0 for positive integers and 1 for negative integers. The magnitude of
the integer, however, is interpreted differently in different schemes.

Sign-magnitude

The sign-magnitude method represents a signed integer in n bits by allocating one bit to store the sign,
typically the most-significant, and n−1 to store the magnitude. The sign bit is set to one for a negative integer
and zero for a positive integer.

In sign-magnitude representation:

The most-significant bit (msb) is the sign bit, with value of 0 representing positive integer and 1 representing
negative integer.

The remaining n-1 bits represent the magnitude (absolute value) of the integer. The absolute value of the
integer is interpreted as "the magnitude of the (n-1)-bit binary pattern".

Hence we find that our example, 123(10), is represented in eight bits by

01111011 = +1 · (2^6 + 2^5 + 2^4 + 2^3 + 2^1 + 2^0) = +123(10)

while the negation is represented by

11111011 = −1 · (2^6 + 2^5 + 2^4 + 2^3 + 2^1 + 2^0) = −123(10).

For our 8-bit example, this means we can represent −127 ... +127.

Example 1: Suppose that n=8 and the binary representation is 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D

Example 2: Suppose that n=8 and the binary representation is 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is 000 0001B = 1D
Hence, the integer is -1D

Example 3: Suppose that n=8 and the binary representation is 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D

Example 4: Suppose that n=8 and the binary representation is 1 000 0000B.
Sign bit is 1 ⇒ negative
Absolute value is 000 0000B = 0D
Hence, the integer is -0D
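
The four examples can be checked with a short decoder. This is a sketch under the assumption that the pattern is given as a string of 0s and 1s (spaces allowed); the function name is hypothetical. Note that Python integers have no separate negative zero, so -0 is reported as 0.

```python
def decode_sign_magnitude(pattern: str) -> int:
    """Decode a sign-magnitude bit pattern such as '1 000 0001'."""
    bits = pattern.replace(" ", "")
    magnitude = int(bits[1:], 2)   # the remaining n-1 bits give the absolute value
    return -magnitude if bits[0] == "1" else magnitude

print(decode_sign_magnitude("0 100 0001"))   # 65
print(decode_sign_magnitude("1 000 0001"))   # -1
```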


Weaknesses of the sign-magnitude method

1. There are two representations (0000 0000B and 1000 0000B) for the number zero, which could lead
to inefficiency and confusion.
2. Positive and negative integers need to be processed separately.

Ones – complement method

In 1's complement representation:

• Again, the most significant bit (msb) is the sign bit, with value of 0 representing positive integers and
1 representing negative integers.
• The remaining n-1 bits represent the magnitude of the integer, as follows:
   • for positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit
   binary pattern".
   • for negative integers, the absolute value of the integer is equal to "the magnitude of the
   complement (inverse) of the (n-1)-bit binary pattern" (hence called 1's complement).

Example 1: Suppose that n=8 and the binary representation is 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D

Example 2: Suppose that n=8 and the binary representation is 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 000 0001B, i.e., 111 1110B = 126D
Hence, the integer is -126D

Example 3: Suppose that n=8 and the binary representation is 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D

Example 4: Suppose that n=8 and the binary representation is 1 111 1111B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 111 1111B, i.e., 000 0000B = 0D
Hence, the integer is -0D
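
A sketch of the same decoding rule in Python, assuming the pattern is supplied as a string of bits; the function name is illustrative. As before, Python reports -0 as 0.

```python
def decode_ones_complement(pattern: str) -> int:
    """Decode a 1's complement bit pattern such as '1 000 0001'."""
    bits = pattern.replace(" ", "")
    if bits[0] == "0":
        return int(bits[1:], 2)
    # Negative: invert the remaining n-1 bits to obtain the absolute value.
    inverted = "".join("1" if b == "0" else "0" for b in bits[1:])
    return -int(inverted, 2)

print(decode_ones_complement("1 000 0001"))   # -126
```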

Again, the weaknesses are:

1. There are two representations (0000 0000B and 1111 1111B) for zero.
2. The positive integers and negative integers need to be processed separately.

Twos – complement method

In 2's complement representation:

• Again, the most significant bit (msb) is the sign bit, with value of 0 representing positive integers and
1 representing negative integers.
• The remaining n-1 bits represent the magnitude of the integer, as follows:
   • for positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit
   binary pattern".
   • for negative integers, the absolute value of the integer is equal to "the magnitude of the
   complement of the (n-1)-bit binary pattern plus one" (hence called 2's complement).

Example 1: Suppose that n=8 and the binary representation is 0 100 0001B.
Sign bit is 0 ⇒ positive
Absolute value is 100 0001B = 65D
Hence, the integer is +65D

Example 2: Suppose that n=8 and the binary representation is 1 000 0001B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 000 0001B plus 1, i.e., 111 1110B + 1B = 127D
Hence, the integer is -127D

Example 3: Suppose that n=8 and the binary representation is 0 000 0000B.
Sign bit is 0 ⇒ positive
Absolute value is 000 0000B = 0D
Hence, the integer is +0D

Example 4: Suppose that n=8 and the binary representation is 1 111 1111B.
Sign bit is 1 ⇒ negative
Absolute value is the complement of 111 1111B plus 1, i.e., 000 0000B + 1B = 1D
Hence, the integer is -1D

Computers use 2's complement in representing signed integers. This is because:

1. There is only one representation for the number zero in 2's complement, instead of two
representations in sign-magnitude and 1's complement.
2. Positive and negative integers can be treated together in addition and subtraction. Subtraction can be
carried out using the "addition logic".
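
The second point can be illustrated with 8-bit arithmetic in Python. This is a sketch only: the names BITS, MASK, twos_complement and subtract are assumptions of this example, and the mask imitates an 8-bit register by discarding the carry out of the top bit.

```python
BITS = 8
MASK = (1 << BITS) - 1   # 0xFF keeps every result to 8 bits

def twos_complement(value: int) -> int:
    """Return the 8-bit pattern that represents -value (invert the bits, add one)."""
    return (~value + 1) & MASK

def subtract(a: int, b: int) -> int:
    """Compute a - b using only addition; the result is an 8-bit pattern."""
    return (a + twos_complement(b)) & MASK

print(format(twos_complement(5), "08b"))   # 11111011
print(subtract(65, 5))                     # 60
print(format(subtract(5, 65), "08b"))      # 11000100
```

The last pattern, 1100 0100B, is the 2's complement encoding of -60.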

DECODING Twos COMPLEMENT NUMBER

1. Check the sign bit (denoted as S).
2. If S=0, the number is positive and its absolute value is the binary value of the remaining n-1 bits.
3. If S=1, the number is negative. You could "invert the n-1 bits and add 1" to get the absolute value of
the negative number.
Alternatively, you could scan the remaining n-1 bits from the right (least-significant bit). Look for the
first occurrence of 1. Flip all the bits to the left of that first occurrence of 1. The flipped pattern gives
the absolute value.

For example:

n = 8, bit pattern = 1 100 0100B
S = 1 → negative
Scanning from the right and flipping all the bits to the left of the first occurrence of 1 ⇒ 011 1100B = 60D
Hence, the value is -60D
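
The decoding steps above can be sketched as follows, using the "invert the n-1 bits and add 1" variant. The function name is hypothetical and the pattern is assumed to be a string of bits.

```python
def decode_twos_complement(pattern: str) -> int:
    """Decode a 2's complement bit pattern such as '1 100 0100'."""
    bits = pattern.replace(" ", "")
    if bits[0] == "0":                 # S = 0: positive, read the remaining bits
        return int(bits[1:], 2)
    # S = 1: negative, invert the remaining n-1 bits and add 1
    inverted = "".join("1" if b == "0" else "0" for b in bits[1:])
    return -(int(inverted, 2) + 1)

print(decode_twos_complement("1 100 0100"))   # -60
```

An equivalent check is to read all n bits as an unsigned value v and subtract 2^n when the sign bit is set.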

REPRESENTING CHARACTERS IN A COMPUTER

In computer memory, characters are "encoded" (or "represented") using a chosen "character encoding
scheme" (aka "character set", "char set", "character map", or "code page").

For example, in ASCII (as well as Latin1, Unicode, and many other character sets):

• Code numbers 65D (41H) to 90D (5AH) represent 'A' to 'Z', respectively.
• Code numbers 97D (61H) to 122D (7AH) represent 'a' to 'z', respectively.
• Code numbers 48D (30H) to 57D (39H) represent '0' to '9', respectively.

It is important to note that the representation scheme must be known before a binary pattern can be
interpreted. E.g., the 8-bit pattern "0100 0010B" could represent anything under the sun, known only to the
person who encoded it.

The most commonly-used character encoding schemes are: 7-bit ASCII (ISO/IEC 646) and 8-bit Latin-x
(ISO/IEC 8859-x) for western European characters, and Unicode (ISO/IEC 10646) for internationalization
(i18n).

A 7-bit encoding scheme (such as ASCII) can represent 128 characters and symbols. An 8-bit character
encoding scheme (such as Latin-x) can represent 256 characters and symbols, whereas a 16-bit encoding
scheme (such as Unicode UCS-2) can represent 65,536 characters and symbols.

Imagine that we want to represent a sequence or string of letters or characters; perhaps as used in a call to
the printf function in a C program. We have already seen how to represent integers using binary vectors that
are suitable for digital computers; the question is how can we do the same with characters?

Of course, this is easy; we just need to translate from one to the other. More specifically, we need two
functions: ORD(x) which takes a character x and gives us back the corresponding integer representation,
and CHR(y) which takes an integer representation y and gives us back the corresponding character. But how
do we decide how the functions should work?

Fortunately, people have thought about this and provided standards we can use. One of the oldest and
simplest, from circa 1967, is the American Standard Code for Information Interchange (ASCII); another is
Unicode, and there are others.

Given the ASCII table, we can see that for example CHR(104) = 'h', i.e., if we see the integer 104 then this
represents the character 'h'.

Imagine we want to convert a character x from lower case into upper case. The lower case characters are
represented numerically as the contiguous range 97...122; the upper case characters as the contiguous
range 65...90. So we can convert from lower case into upper case simply by subtracting 32. For example:
CHR(ORD('a') − 32) = 'A'.
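
Python happens to provide built-in functions named ord and chr that play the roles of ORD and CHR, so the conversion can be tried directly. The helper to_upper is a sketch written for this example and only handles the letters 'a' to 'z'.

```python
def to_upper(ch: str) -> str:
    """Convert one lower case ASCII letter to upper case by subtracting 32."""
    if "a" <= ch <= "z":
        return chr(ord(ch) - 32)
    return ch   # anything else is returned unchanged

print(chr(104))        # h
print(ord("a"))        # 97
print(to_upper("a"))   # A
```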


ASCII TABLE

The above table can be rearranged as shown below. In the table below, code number 32D (20H) is
the blank or space character.


INSTRUCTION SET COMPUTING

CISC and RISC

The architecture of a central processing unit (CPU) is defined by the instruction set architecture (ISA) for
which it was designed. The two main architectural designs of CPUs are RISC (Reduced Instruction Set
Computing) and CISC (Complex Instruction Set Computing). A CISC processor can execute multi-step
operations or use several addressing modes within a single instruction. It is a CPU design in which one
instruction performs many low-level operations, for example loading from memory, an arithmetic operation
and a memory store. RISC is a CPU design strategy based on the insight that a simplified instruction set
gives higher performance when combined with a microprocessor architecture that can execute each
instruction in only a few clock cycles.


NB: Intel x86 processors are classed as Complex Instruction Set Computer (CISC) hardware, while the
ARM-based processors used in Apple devices are Reduced Instruction Set Computer (RISC) hardware.

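CPU performance (given as execution time) depends on the instruction count, the CPI (cycles per
instruction) and the clock cycle time. All three are affected by the instruction set architecture:

Execution time = Instruction count × CPI × Clock time

Since the clock time is the reciprocal of the clock rate, the same formula can also be written as:

Execution time = (Instruction count × CPI) / Clock rate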

Instruction Count (Instructions/program)

Computer architects can reduce the instruction count by adding more powerful instructions to the instruction
set. However, this can increase either CPI or clock time, or both.

Instruction count (IC) is a dynamic measure: the total number of instruction executions involved in a program.
It is dominated by repetitive operations such as loops and recursions.

Instruction count is affected by the power of the instruction set. Different instruction sets may do different
amounts of work in a single instruction. CISC processor instructions can often accomplish as much as two or
three RISC processor instructions. Some CISC processor instructions have built-in looping so that they can
accomplish as much as several hundred RISC instruction executions.


For predicting the effects of incremental changes, architects use execution traces of benchmark programs to
get instruction counts. If the incremental change does not change the instruction set then the instruction
count normally does not change. If there are small changes in the instruction set then trace information can
be used to estimate the change in the instruction count.

For comparison purposes, two machines with different instruction sets can be compared based on
compilations of the same high-level language code on the two machines.

Clocks per Instruction (CPI) (cycles/instruction)

Computer architects can reduce CPI by exploiting more instruction-level parallelism. If they add more
complex instructions it often increases CPI.

Clocks per instruction (CPI) is an effective average. It is averaged over all of the instruction executions in a
program.

CPI is affected by instruction-level parallelism and by instruction complexity. Without instruction-level
parallelism, simple instructions usually take 4 or more cycles to execute. Instructions that execute loops take
at least one clock per loop iteration. Pipelining (overlapping execution of instructions) can bring the average
for simple instructions down to near 1 clock per instruction. Superscalar pipelining (issuing multiple
instructions per cycle) can bring the average down to a fraction of a clock per instruction.

For computing clocks per instruction as an effective average, the cases are categories of instructions, such
as branches, loads, and stores. Frequencies for the categories can be extracted from execution traces.
Knowledge of how the architecture handles each category yields the clocks per instruction for that category.

Clock Time (seconds/cycle) (1/clock rate of the processor)

Clock time depends on transistor speed and the complexity of the work done in a single clock. Clock time
can be reduced when transistor sizes decrease. However, power consumption increases when clock time is
reduced. This increases the amount of heat generated.

Clock time (CT) is the period of the clock that synchronizes the circuits in a processor. It is the reciprocal of
the clock frequency.

For example, a 1 GHz processor has a cycle time of 1.0 ns and a 4 GHz processor has a cycle time of 0.25
ns.

Clock time is affected by circuit technology and the complexity of the work done in a single clock. Logic gates
do not operate instantly. A gate has a propagation delay that depends on the number of inputs to the gate
(fan in) and the number of other inputs connected to the gate's output (fan out). Increasing either the fan in or
the fan out slows down the propagation time. Cycle time is set to be the worst-case total propagation time
through gates that produce a signal required in the next cycle. The worst-case total propagation time occurs
along one or more signal paths through the circuitry. These paths are called critical paths.


For the past 35 years, integrated circuit technology has been greatly affected by scaling equations that tell
how individual transistor dimensions should be altered as the overall dimensions are decreased. The scaling
equations predict an increase in speed and a decrease in power consumption per transistor with decreasing
size. Technology has improved so that about every 3 years, linear dimensions have decreased by a factor of
2. Transistor power consumption has decreased by a similar factor. Speed increased by a similar factor until
about 2005. At that time, power consumption reached the point where air cooling was not sufficient to keep
processors cool if they ran at the highest possible clock speed.

Instruction Set: the group of instructions from which programs are built; they direct the computer by
manipulating data. Each instruction has the form Opcode (operation code) and Operand. The
opcode specifies the operation to apply (load, store, etc.). The operand is the register or memory location
to which the operation is applied.

Problem Statement

Suppose a program (or a program task) takes 1 billion instructions to execute on a processor running at 2
GHz. Suppose also that 50% of the instructions execute in 3 clock cycles, 30% execute in 4 clock cycles,
and 20% execute in 5 clock cycles. What is the execution time for the program or task?

We have the instruction count: 10^9 instructions. The clock time can be computed quickly from the clock rate
to be 0.5×10^-9 seconds. So we only need to compute clocks per instruction as an effective value:

Value   Frequency   Product
3       0.5         1.5
4       0.3         1.2
5       0.2         1.0

CPI = 3.7

Then we have

Execution time = 1.0×10^9 × 3.7 × 0.5×10^-9 sec = 1.85 sec.
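
The calculation above can be reproduced with a few lines of Python. The function names effective_cpi and execution_time, and the list-of-pairs format for the instruction mix, are assumptions of this sketch.

```python
def effective_cpi(mix):
    """mix is a list of (cycles, fraction of instructions) pairs."""
    return sum(cycles * fraction for cycles, fraction in mix)

def execution_time(instruction_count, cpi, clock_hz):
    """Execution time = instruction count x CPI x clock time (clock time = 1 / clock rate)."""
    return instruction_count * cpi / clock_hz

cpi = effective_cpi([(3, 0.5), (4, 0.3), (5, 0.2)])
print(round(cpi, 3))                             # 3.7
print(round(execution_time(1e9, cpi, 2e9), 3))   # 1.85
```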

Problem Statement

Suppose the processor in the previous example is redesigned so that all instructions that initially executed in
5 cycles now execute in 4 cycles. Due to changes in the circuitry, the clock rate has to be decreased from 2.0
GHz to 1.9 GHz. What is the overall percentage improvement?

For the clocks per instruction, we had a value of 3.7 before the change. We compute clocks per instruction
after the change as an effective value:


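Value   Frequency   Product
3       0.5         1.5
4       0.3         1.2
4       0.2         0.8

CPI = 3.5

Now, lower clocks per instruction means higher instruction throughput and thus better
performance. The clock time after the change is 1 / (1.9×10^9) sec ≈ 0.526×10^-9 sec, so

Execution time = 1.0×10^9 × 3.5 × 0.526×10^-9 sec ≈ 1.842 sec.

Then % improvement = 1 - performance ratio, where the performance ratio is the new execution time
divided by the old one:

1 - (1.842 / 1.85) ≈ 0.0043

This is a 0.43% improvement, which is probably not worth the effort.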
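
The comparison of the two designs can be checked numerically. The sketch below assumes the same execution-time formula as in the previous problem; the function and variable names are chosen for this example.

```python
def execution_time(instruction_count, mix, clock_hz):
    """mix is a list of (cycles, fraction of instructions) pairs."""
    cpi = sum(cycles * fraction for cycles, fraction in mix)
    return instruction_count * cpi / clock_hz

before = execution_time(1e9, [(3, 0.5), (4, 0.3), (5, 0.2)], 2.0e9)
after = execution_time(1e9, [(3, 0.5), (4, 0.3), (4, 0.2)], 1.9e9)

improvement = 1 - after / before   # 1 - performance ratio
print(f"{before:.3f} s -> {after:.3f} s, improvement {improvement:.2%}")
# 1.850 s -> 1.842 s, improvement 0.43%
```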


CISC Architecture

The CISC (Complex Instruction Set Computer) approach attempts to minimize the number of instructions
per program, sacrificing the number of cycles per instruction. Computers based on the CISC architecture
were designed to decrease memory cost: large programs need more storage, and large memories are
expensive. To address this, the number of instructions per program is reduced by embedding several
operations in a single instruction, thereby making the instructions more complex.

Examples of CISC PROCESSORS

IBM 370/168 – Introduced in 1970. It is a 32-bit CISC design with four 64-bit floating-point registers.
VAX 11/780 – A 32-bit CISC processor from Digital Equipment Corporation that supports a large number
of addressing modes and machine instructions.
Intel 80486 – Launched in 1989. It is a CISC processor with 235 instructions, whose lengths vary from
1 to 11 bytes.

CHARACTERISTICS OF CISC ARCHITECTURE

• Instruction-decoding logic is complex.
• A single instruction is required to support multiple addressing modes.
• Less chip space is needed for general-purpose registers, because instructions operate directly on
memory.
• Many CISC designs set up special registers for the stack pointer, interrupt handling, etc.
• MUL is referred to as a "complex instruction": it operates directly on memory, so the programmer
does not have to call separate loading or storing functions.
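
The point about MUL can be illustrated with a toy Python model. Everything here is hypothetical (the memory dictionary, the register names r1 and r2, and the function names); it only contrasts one complex memory-to-memory instruction with the equivalent sequence of simple load/store-style instructions.

```python
def cisc_mul(mem, dst, src):
    """One complex instruction: read both operands from memory,
    multiply them and write the result back to memory."""
    mem[dst] = mem[dst] * mem[src]
    return 1                               # instructions issued


def load_store_mul(mem, dst, src):
    """The same work done with simple register-based instructions."""
    regs = {}
    regs["r1"] = mem[dst]                  # LOAD  r1, dst
    regs["r2"] = mem[src]                  # LOAD  r2, src
    regs["r1"] = regs["r1"] * regs["r2"]   # PROD  r1, r2
    mem[dst] = regs["r1"]                  # STORE dst, r1
    return 4                               # instructions issued


mem_a = {"A": 6, "B": 7}
mem_b = {"A": 6, "B": 7}
print(cisc_mul(mem_a, "A", "B"), mem_a["A"])        # 1 42
print(load_store_mul(mem_b, "A", "B"), mem_b["A"])  # 4 42
```

Both versions leave the same result in memory; the complex instruction needs fewer instructions in the program, while each simple instruction does less work.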

RISC Architecture

RISC (Reduced Instruction Set Computer) is used in portable devices due to its power efficiency, for
example the Apple iPod and Nintendo DS. RISC is a type of microprocessor architecture that uses a small,
highly optimized set of instructions. RISC does the opposite of CISC, reducing the cycles per instruction at
the cost of the number of instructions per program. Pipelining is one of the characteristic features of RISC:
the execution of several instructions is overlapped in a pipeline fashion, which gives RISC a high
performance advantage over CISC. RISC instructions are simple and are executed within one clock cycle.
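
The benefit of overlapping instructions can be estimated with a short Python sketch. It assumes an ideal pipeline (every stage takes one clock cycle and there are no stalls); the 5-stage depth and the function names are assumptions for illustration only.

```python
def cycles_without_pipeline(n_instructions, stages):
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * stages


def cycles_with_pipeline(n_instructions, stages):
    """Ideal pipeline: the first instruction takes `stages` cycles,
    then one instruction completes every cycle."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


n, k = 100, 5                              # 100 instructions, 5 stages (assumed)
plain = cycles_without_pipeline(n, k)
piped = cycles_with_pipeline(n, k)

print(plain)                               # 500
print(piped)                               # 104
print(f"Speedup: {plain / piped:.2f}x")    # Speedup: 4.81x
```

For long instruction sequences the speedup approaches the number of stages, which is why simple, fixed-length instructions that pipeline easily are attractive.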

RISC ARCHITECTURE CHARACTERISTICS

• Simple instructions are used in RISC architecture.
• RISC supports a few simple data types and synthesizes complex data types from them.
• RISC uses simple addressing modes and fixed-length instructions, which suit pipelining.
• RISC permits any register to be used in any context.
• Instructions execute in one clock cycle.
• The amount of work that the computer must perform per instruction is reduced by separating "LOAD"
and "STORE" instructions.
• In RISC, more RAM is required to store assembly-level instructions.
• The reduced instruction set needs fewer transistors.
• RISC processors commonly use the Harvard memory model, i.e. a Harvard architecture.
• A compiler performs the conversion of high-level language statements into the equivalent machine
code.

The Advantages of RISC architecture

• RISC architecture has a small, simple set of instructions, so high-level language compilers can produce
more efficient code.
• Its simplicity leaves more freedom in how the space on the microprocessor is used.
• Many RISC processors use registers for passing arguments and holding local variables.
• RISC functions use only a few parameters, and RISC processors avoid complex call instructions; they
therefore use fixed-length instructions, which are easy to pipeline.
• The speed of operation can be maximized and the execution time can be minimized.
• Only a small number of instruction formats, a few instructions and a few addressing modes are
needed.

The Disadvantages of RISC architecture

• The performance of RISC processors depends largely on the programmer or compiler, because the
compiler's knowledge plays a vital role when converting CISC code to RISC code.
• Rearranging CISC code into RISC code, termed code expansion, increases the code size. The quality
of this code expansion again depends on the compiler and on the machine's instruction set.
• The first-level cache is also a disadvantage of RISC: these processors have large memory caches on
the chip itself, and they require very fast memory systems to feed them instructions.

Advantages of CISC architecture

• Microprogramming is as easy as assembly language to implement, and less expensive than hardwiring
a control unit.
• The ease of microcoding new instructions allowed designers to make CISC machines upwardly
compatible.
• As each instruction became more capable, fewer instructions could be used to implement a given
task.

Disadvantages of CISC architecture

• The performance of the machine slows down because different instructions take different amounts of
clock time.
• Only about 20% of the existing instructions are used in a typical program, even though there are
many specialized instructions, which are rarely used.
• CISC instructions set the condition codes as a side effect of each instruction, and this setting takes
time. Because the subsequent instruction changes the condition code bits, the compiler has to
examine the condition code bits before this happens.
