
MODULE I

WHAT IS AN ALGORITHM
Definition: An algorithm is a finite set of instructions that, if followed, accomplishes a
particular task.
PROPERTIES OF AN ALGORITHM
All algorithms must satisfy the following criteria:
1. Input. Zero or more quantities are externally supplied.
2. Output. At least one quantity is produced.
3. Definiteness. Each instruction is clear and unambiguous.
4. Finiteness. If we trace out the instructions of an algorithm, then for all cases, the
algorithm terminates after a finite number of steps.
5. Effectiveness. Every instruction must be very basic so that it can be carried out, in
principle, by a person using only pencil and paper. It is not enough that each operation be
definite as in criterion 3; it also must be feasible.
An algorithm is composed of a finite set of steps, each of which may require one or more
operations. The possibility of a computer carrying out these operations necessitates that
certain constraints be placed on the type of operations an algorithm can include.
Criteria 1 and 2 require that an algorithm produce one or more outputs and have zero or
more inputs that are externally supplied. According to criterion 3, each operation must be
definite, meaning that it must be perfectly clear what should be done.
The fourth criterion for algorithms is that they terminate after a finite number of
operations. A related consideration is that the time for termination should be reasonably
short.
Criterion 5 requires that each operation be effective; each step must be such that it can, at
least in principle, be done by a person using pencil and paper in a finite amount of time.
Performing arithmetic on integers is an example of an effective operation, but arithmetic
with real numbers is not, since some values may be expressible only by infinitely long
decimal expansions.
PSEUDOCODE CONVENTIONS
We can describe an algorithm in many ways. We can use a natural language like English,
although if we select this option, we must make sure that the resulting instructions are
definite. We present most of our algorithms using a pseudocode that resembles C.
1. Comments begin with // and continue until the end of the line.
Eg: count := count + 1; // count is global; it is initially zero.
2. Blocks are indicated with matching braces: { and }. A compound statement can be
represented as a block. The body of a procedure also forms a block. Statements are
delimited by ;
Eg: for j := 1 to n do
{
    count := count + 1;
    c[i,j] := a[i,j] + b[i,j];
    count := count + 1;
}
3. An identifier begins with a letter. The data types of variables are not explicitly
declared. The types will be clear from the context. Whether a variable is global or local to
a procedure will also be evident from the context. Compound data types can be formed
with records.
Eg: node=record
{
datatype_1 data_1;
:
datatype_n data_n;
node *link;
}
4. Assignment of values to variables is done using the assignment statement
<variable> := <expression>;
Eg: count:= count+1;
5. There are two Boolean values, true and false. In order to produce these values, the
logical operators and, or, and not and the relational operators <, <=, =, !=, >= and
> are provided.
Eg: if (j>1) then k:=i-1; else k:=n-1;
6. Elements of multidimensional arrays are accessed using [ and ].
For example, if A is a two-dimensional array, the (i,j)th element of the array is denoted as
A[i,j]. Array indices start at zero.
7. The following looping statements are employed: for, while and repeat-until. The while
loop takes the following form:
while (condition) do
{
<statement 1>
:
:
<statement n>
}
8. A conditional statement has the following forms:
If < condition > then <statement>
If<condition> then <statement 1> else <statement 2>
Here < condition > is a Boolean expression and <statement>,<statement 1>, and <
statement 2> are arbitrary statements.
9. Input and output are done using the instructions read and write. No format is used to
specify the size of input or output quantities.
Eg: write ("n is even");
10. There is only one type of procedure: Algorithm. An algorithm consists of a heading
and a body. The heading takes the form
Algorithm Name(<parameter list>)
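As a short sketch of how these conventions combine, here is a complete algorithm in
this pseudocode (finding the largest of n elements), using the heading form, a block, an
assignment, a for loop and a conditional:

Algorithm Max(a, n)
// a is an array of size n. Returns the largest element of a.
{
    Result := a[1];
    for i := 2 to n do
        if (a[i] > Result) then Result := a[i];
    return Result;
}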
DEVELOPMENT OF AN ALGORITHM
The basic steps in the development of an algorithm are,
1. Statement of the Problem
2. Development of a Model
3. Design of an Algorithm
4. Correctness of the Algorithm
5. Implementation
6. Analysis and Complexity of the Algorithm
7. Program Testing
8. Documentation
1.Statement of the Problem:- Developing a precise problem statement is usually a
matter of asking the right questions. Some good questions to ask upon encountering a
crudely formulated problem are:
- Do I understand the vocabulary used in the raw formulation?
- What information has been given?
- What do I want to find out?
- How would I recognize a solution?
- What information is missing, and will any of this information be of use?
- Is any of the given information worthless?
- What assumptions have been made?
Questions such as these need to be asked again after some of them have been given
answers or partial answers.
2. Development of a Model:- Once a problem has been clearly stated, it is time to
formulate it as a mathematical model.We must choose mathematical objects to represent
both what we know and what we want to find. It is impossible to provide a set of rules
which automates the modeling stage. Most problems must be given individual attention.
However, there are some useful guidelines. The best way to become proficient is by
acquiring the experience that comes from the study of successful models. There are at
least two basic questions to be asked in setting up a model:
- Which mathematical structures seem best-suited for the problem?
- Are there other problems that have been solved which resemble this one?
3. Design of an Algorithm:-Once a problem has been clearly stated and a model has been
developed, we must get down to the business of designing an algorithm for solving the
problem. The choice of a design technique, which is often highly dependent on the choice
of model, can greatly influence the effectiveness of a solution algorithm. Two different
algorithms may be correct, but may differ tremendously in their effectiveness.
4. Correctness of the Algorithm:One of the more difficult, and sometimes more
tedious, steps in the development of an algorithm is proving that the algorithm is correct.
The most common procedure followed to prove that a program is correct is to run it on a
variety of test cases. If the answers produced by the program can be verified against hand
calculations or known values, we are tempted to conclude that the program "works."
However, this technique rarely removes all doubts that there is some case for which the
program will fail.
We offer the following as a general guide for proving the correctness of an algorithm.
Suppose that an algorithm is given in terms of a series of steps, say, step 0 through step
m. Try to offer some justification for each step. In particular, this might involve a lemma
about conditions that exist before and after the step is executed. Then try to offer some
proof that the algorithm will terminate, and, in doing so, will have examined all the
appropriate input data and produced all the appropriate output data.
5 .Implementation:-Once an algorithm has been stated, say, in terms of a sequence of
steps, and one is convinced that it is correct, it is time to implement the algorithm, that is,
to code it into a computer program.This fundamental step can be quite hard. One reason
for the difficulty is that all too often a particular step of an algorithm will be stated in a
form that is not directly translatable into code. Another reason why implementation can
be a difficult process is that before we can even begin to write code, we must design an
entire system of computer data structures to represent important aspects of the model
being used. To do this, we must answer such questions as the following:
- What are the variables?
- What are their types?
- How many arrays, and of what size, are needed?
- Would it be worthwhile to use linked lists?
- What subroutines are needed?
- What programming language should be used?
6. Analysis and Complexity of the Algorithm:-There are a number of important
practical reasons for analyzing algorithms. One reason is that we need to obtain estimates
or bounds on the storage or run time which our algorithm will need to successfully
process a particular input. Computer time and memory are relatively scarce (and
expensive) resources which are often simultaneously sought by many users. It is to
everyone's advantage to avoid runs that are aborted because of an insufficient time
allocation on a job card. There are also important theoretical reasons for analyzing
algorithms. One would like to have some quantitative standard for comparing two
algorithms which claim to solve the same problem. The weaker algorithm should be
improved or discarded.
7. Program Testing:-Once a program has been coded, it is time to run it. Program testing
might be described as an experimental verification that the program is doing what it
should. It is also an experimental attempt to ascertain the usage limits of the
algorithm/program. The program must be verified for a broad spectrum of allowable
inputs. This process can be time-consuming, tedious, and complex. Programs should also
be tested to determine their computational limitations.
8. Documentation:-Documentation is not really the last step in the complete development
of an algorithm. The documentation process should be interwoven with the entire
development of the algorithm, and especially with the design and implementation steps.
The most obvious reason for documentation is to enable individuals to understand
programs which they did not write. Documentation includes every piece of information
that you produce which helps to explain what is going on, that is, such things as
flowcharts, records of the stages in your top-down development, supporting proofs of
correctness, test results, detailed descriptions of input-output requirements and format,
and so forth.
RECURSIVE ALGORITHMS
A recursive algorithm is an algorithm which calls itself with "smaller (or simpler)" input
values, and which obtains the result for the current input by applying simple operations to
the returned value for the smaller (or simpler) input. More generally if a problem can be
solved utilizing solutions to smaller versions of the same problem, and the smaller
versions reduce to easily solvable cases, then one can use a recursive algorithm to solve
that problem. For example, the elements of a recursively defined set, or the value of a
recursively defined function can be obtained by a recursive algorithm. Recursive
computer programs require more memory and computation compared with iterative
algorithms, but they are simpler and in many cases offer a natural way of thinking about
the problem.

For example, consider the factorial of a number n:
n! = n*(n-1)*(n-2)*...*2*1, and 0! = 1.
In other words,
n! = 1            if n = 0
n! = n*(n-1)!     if n > 0
Function to calculate the factorial can be written as
int factorial(int n)
{
if (n == 0)
return 1;
else
return (n * factorial(n-1));
}

factorial(0)=> 1

factorial(3)
3 * factorial(2)
3 * 2 * factorial(1)
3 * 2 * 1 * factorial(0)
3 * 2 * 1 * 1
=> 6



This corresponds very closely to what actually happens on the execution stack in the
computer's memory.


TYPES OF RECURSION:
Linear Recursion
A linear recursive function is a function that only makes a single call to itself each time
the function runs (as opposed to one that would call itself multiple times during its
execution). The factorial function is a good example of linear recursion.

int fact(int n)
{
    if (n == 1) return(1);
    else return(n * fact(n-1));
}
Tail recursion
Tail recursion is a form of linear recursion. In tail recursion, the recursive call is the last
thing the function does. Often, the value of the recursive call is returned. As such, tail
recursive functions can often be easily implemented in an iterative manner; by taking out
the recursive call and replacing it with a loop, the same effect can generally be achieved.
In fact, a good compiler can recognize tail recursion and convert it to iteration in order to
optimize the performance of the code. A good example of a tail recursive function is a
function to compute the GCD, or Greatest Common Divisor, of two numbers:
int gcd(int m, int n)
{
int r;
if (m < n) return gcd(n,m);
r = m%n;
if (r == 0) return(n);
else return(gcd(n,r));
}
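To see why tail recursion converts so readily to iteration, here is a sketch (not from the
original notes) of the same GCD computation with the recursive calls replaced by a loop,
essentially what a tail-call-optimizing compiler would produce:

/* Iterative gcd: the tail call gcd(n, r) becomes a reassignment
   of m and n followed by another pass through the loop. */
int gcd_iterative(int m, int n)
{
    int r;
    if (m < n) { r = m; m = n; n = r; }  /* ensure m >= n */
    while (n != 0) {
        r = m % n;  /* same computation as the recursive body */
        m = n;
        n = r;
    }
    return m;
}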
Binary Recursion
Some recursive functions don't just have one call to themselves; they have two (or more).
Functions with two recursive calls are referred to as binary recursive functions. The
mathematical combinations operation is a good example of a function that can quickly be
implemented as a binary recursive function. The number of combinations, often
represented as nCk, where we are choosing k elements out of a set of n elements, can be
implemented as follows:
int choose(int n, int k)
{
if (k == 0 || n == k) return(1);
else return(choose(n-1,k) + choose(n-1,k-1));
}
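For instance, tracing choose(4,2) makes the two recursive calls at each level visible:

choose(4,2) = choose(3,2) + choose(3,1)
            = [choose(2,2) + choose(2,1)] + [choose(2,1) + choose(2,0)]
            = [1 + (choose(1,1) + choose(1,0))] + [(choose(1,1) + choose(1,0)) + 1]
            = [1 + (1 + 1)] + [(1 + 1) + 1]
            = 6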
Exponential recursion
An exponential recursive function is one that, if you were to draw out a representation of
all the function calls, would have an exponential number of calls in relation to the size of
the data set (exponential meaning that if there were n elements, there would be O(a^n)
function calls, where a is a positive number). A good example of an exponentially
recursive function is a function to compute all the permutations of a data set. Let's write a
function to take an array of n integers and print out every permutation of it.
void print_array(int arr[], int n)
{
    int i;
    for (i = 0; i < n; i++) printf("%d ", arr[i]);
    printf("\n");
}

void print_permutations(int arr[], int n, int i)
{
    int j, swap;
    if (i == n) {                 /* a complete permutation has been built */
        print_array(arr, n);
        return;
    }
    for (j = i; j < n; j++) {     /* try each remaining element at position i */
        swap = arr[i]; arr[i] = arr[j]; arr[j] = swap;
        print_permutations(arr, n, i+1);
        swap = arr[i]; arr[i] = arr[j]; arr[j] = swap;
    }
}
To run this function on an array arr of length n, we'd do print_permutations(arr, n, 0)
where the 0 tells it to start at the beginning of the array.
Nested Recursion
In nested recursion, one of the arguments to the recursive function is the recursive
function itself! These functions tend to grow extremely fast. A good example is the
classic mathematical function, Ackermann's function. It grows very quickly (even for
small values of x and y, Ackermann(x,y) is extremely large) and it cannot be computed
with only definite iteration (a completely defined for() loop, for example); it requires
indefinite iteration (recursion, for example).
Ackermann's function
int ackerman(int m, int n)
{
if (m == 0) return(n+1);
else if (n == 0) return(ackerman(m-1,1));
else return(ackerman(m-1,ackerman(m,n-1)));
}
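To get a feel for this growth: with the definition above, ackerman(1,n) = n+2,
ackerman(2,n) = 2n+3 and ackerman(3,n) = 2^(n+3) - 3, so ackerman(3,3) is already 61,
while ackerman(4,2) = 2^65536 - 3, a number with 19,729 decimal digits.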
Mutual Recursion
A recursive function doesn't necessarily need to call itself. Some recursive functions work
in pairs or even larger groups. For example, function A calls function B which calls
function C which in turn calls function A. A simple example of mutual recursion is a pair
of functions to determine whether an integer is even or odd.
int is_even(unsigned int n)
{
if (n==0) return 1;
else return(is_odd(n-1));
}

int is_odd(unsigned int n)
{
return (!is_even(n));
}
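Tracing is_even(3) shows the two functions bouncing back and forth until the base case is
reached:

is_even(3) = is_odd(2) = !is_even(2)
is_even(2) = is_odd(1) = !is_even(1)
is_even(1) = is_odd(0) = !is_even(0) = !1 = 0

so is_even(2) = !0 = 1 and is_even(3) = !1 = 0; that is, 3 is not even.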
PERFORMANCE ANALYSIS
There are many criteria upon which we can judge an algorithm. Performance evaluation
can be loosely divided into two major phases:
1. a priori estimates (performance analysis), and
2. a posteriori testing (performance measurement).
Time and space are two criteria for judging algorithms that have a more direct
relationship to performance.
Space Complexity:- The space complexity of an algorithm is the amount of memory needed
by the program for its completion.
Time Complexity:- The time complexity of an algorithm is the amount of time needed by the
program for its completion.
Space Complexity
The amount of memory required by an algorithm to run to completion. The space needed
by an algorithm is the sum of the following components:
1. A fixed part that is independent of the characteristics of the inputs and outputs. This
part typically includes the instruction space (space for the code), space for simple
variables and fixed-size component variables, space for constants, etc.
2. A variable part that consists of the space needed by component variables whose size is
dependent on the particular problem instance being solved, the space needed by
referenced variables, and the recursion stack space.
The space requirement S(P) of any algorithm P may therefore be written as S(P) = c + Sp,
where c is a constant and Sp is the variable part.
Example 1:
void main()
{
    int a;
    int b;
    int c;
    return (a+b+c)/3;
}
Space for a, b, c = 3 * sizeof(int) = 3 * 2 = 6 bytes
Space for constant 3 = 2 bytes
Total = 8 bytes
Example 2:
int sequential_search ( int a[], int n, int key)
{
int i;
for ( i=0;i<n;i++)
if (a[i] == key)
return i;
return -1;
}
Space for constant 0 & -1 = 4 bytes
Space for array pointer = 2 bytes
Space for n ,i & key =6 bytes
Space for a = 2n
Total =2n+12 bytes
Example 3:
int binarysearch(int a[], int n, int key)
{
    int low, high, found, mid;
    low = 0; high = n-1; found = 0;
    while (low <= high && !found)
    {
        mid = (low + high)/2;
        if (a[mid] == key)
            found = 1;
        else if (key < a[mid])
            high = mid - 1;
        else
            low = mid + 1;
    }
    if (found) return mid;
    else return -1;
}
Space for constants 0, 1, 2 & -1 = 8 bytes
Space for local variables = 8 bytes
Space for parameters = 6 bytes
Space for a = 2n
Total = 2n + 22 bytes
Example 4:
Algorithm Rsum(a, n)
{
    if (n <= 0) then return 0;
    else return Rsum(a, n-1) + a[n];
}
Depth of recursion = n+1
Each activation record (n, return address and pointer to a[]) = 3 units
Total stack space = 3(n+1) units
Constants = 2 units
Space for array = n units
Total = 3(n+1) + 2 + n = 4n + 5
Example 5:
int Reclinearsearch(int a[], int n, int key)
{
    if (n <= 0) return -1;
    if (a[n-1] == key)
        return n;
    return Reclinearsearch(a, n-1, key);
}
Depth of recursion = n+1
Data space = 2+n
Stack space = 4(n+1)
Total = 4(n+1) + 2 + n units
Time Complexity
The time complexity of an algorithm is the amount of time needed by the program for
its completion. The time complexity of a problem is the number of steps that it takes to
solve an instance of the problem as a function of the size of the input (usually measured
in bits), using the most efficient algorithm. The exact number of steps will depend on
exactly what machine or language is being used. The time taken by a program P is the
sum of the compile time and the run time. The compile time does not depend on the
instance characteristics. For calculating the run time we count only the number of
program steps.
Program step
A program step is defined as a syntactically or semantically meaningful segment of a
program that has an execution time that is independent of the instance characteristics.
Eg: return (a+b+c)/3;
- Comments count as zero steps.
- An assignment statement that does not have any calls to other algorithms is counted as
one step.
- For an iterative statement the step count depends on the control part only.
Methods for calculating step counts
1. Introduce a new variable count into the program. It is a global variable with initial
value zero. Each time a statement in the original program is executed, count is
incremented by the step count of that statement.
Example 1:
Algorithm Sum(a,n)
{
s=0;
for i=1 to n do
{
s=s+a[i];
}
return s;
}

Step count version of algorithm Sum(a,n):

Algorithm Sum(a,n)
{
    s := 0;
    count := count + 1; // for the assignment
    for i := 1 to n do
    {
        count := count + 1; // for the for statement
        s := s + a[i];
        count := count + 1; // for the assignment
    }
    count := count + 1; // for the last execution of for
    count := count + 1; // for the return
    return s;
}

Total = 2n + 3

Example 2:
Algorithm Add(a,b,c,m,n)
{
    for i := 1 to m do
    {
        count := count + 1; // for i, executed m times
        for j := 1 to n do
        {
            count := count + 1; // for j, executed mn times
            c[i,j] := a[i,j] + b[i,j];
            count := count + 1; // for the assignment, executed mn times
        }
        count := count + 1; // for the last time of the j loop, executed m times
    }
    count := count + 1; // for the last time of the i loop, executed once
}

Total = 2mn + 2m + 1
Example 3:
Algorithm Rsum(a,n)
{
    count := count + 1; // for the if conditional
    if (n <= 0) then
    {
        count := count + 1; // for the return
        return 0;
    }
    else
    {
        count := count + 1; // for the addition, function invocation and return
        return Rsum(a,n-1) + a[n];
    }
}

The step count of Rsum is given by the recurrence
t_Rsum(n) = 2                   if n = 0
t_Rsum(n) = 2 + t_Rsum(n-1)     if n > 0
2. The second method to determine the step count of an algorithm is to build a table in
which we list the total number of steps contributed by each statement. First determine
the number of steps per execution (s/e) of each statement and the total number of
times (frequency) each statement is executed. The s/e of a statement is the amount by
which the count changes as a result of the execution of that statement. The frequency
tells us how many times a statement will be executed.


Statement                     s/e   frequency   Total steps
1. Algorithm Sum(a,n)          0        -           0
2. {                           0        -           0
3.   s := 0;                   1        1           1
4.   for i := 1 to n do        1       n+1         n+1
5.   {                         0        -           0
6.     s := s + a[i];          1        n           n
7.   }                         0        -           0
8.   return s;                 1        1           1
9. }                           0        -           0

Total steps = 2n + 3
Statement                            s/e   frequency   Total steps
1. Algorithm Add(a,b,c,m,n)           0        -            0
2. {                                  0        -            0
3.   for i := 1 to m do               1       m+1          m+1
4.   {                                0        -            0
5.     for j := 1 to n do             1      m(n+1)       mn+m
6.     {                              0        -            0
7.       c[i,j] := a[i,j] + b[i,j];   1       mn            mn
8.     }                              0        -            0
9.   }                                0        -            0
10. }                                 0        -            0

Total = 2mn + 2m + 1
Statement                        s/e    frequency      Total steps
                                        n=0    n>0     n=0    n>0
1. Algorithm Rsum(a,n)            0      -      -       0      0
2. {                              0      -      -       0      0
3.   if (n <= 0) then             1      1      1       1      1
4.     return 0;                  1      1      0       1      0
5.   else                         0      0      0       0      0
6.     return Rsum(a,n-1)+a[n];  1+x     0      1       0     1+x
7. }                              0      -      -       0      0

Total                                                   2     2+x
where x = t_Rsum(n-1)
Time Complexity vs Space Complexity
- Achieving both is difficult; there is always a trade-off.
- If the memory available is large, we need not compromise on time complexity.
- If fastness of execution is not the main concern and the memory available is small, we
cannot compromise on space complexity.
ASYMPTOTIC NOTATIONS
A problem may have numerous algorithmic solutions. In order to choose the best
algorithm for a particular task, you need to be able to judge how long a particular
solution will take to run. Step counts are used to compare the time complexity of two
programs that compute the same function and also to predict the growth in run time as
the instance characteristics change. Determining the exact step count is difficult, and it
is not necessary either. Since the values are not exact quantities, we need only
comparative statements like c1*n^2 <= tP(n) <= c2*n^2.
For example, consider two programs with complexities c1*n^2 + c2*n and c3*n
respectively. For small values of n, the complexity depends upon the values of c1, c2
and c3. But there will also be an n beyond which the complexity of c3*n is better than
that of c1*n^2 + c2*n. This value of n is called the break-even point. If this point is
zero, c3*n is always faster (or at least as fast).
The asymptotic efficiency of algorithms is concerned with how the running time of an
algorithm increases with the size of the input in the limit. The notations used to describe
the asymptotic efficiency of an algorithm are called asymptotic notations. Asymptotic
complexity is a way of expressing the main component of the cost of an algorithm,
using idealized units of computational work. Note that we are speaking about
bounds on the performance of algorithms, rather than giving exact speeds.
Different asymptotic Notations are,
- Big Oh
- Omega
- Theta
- Little oh
- Little omega

BIG-OH NOTATION (O)
Big-O is the formal method of expressing the upper bound of an algorithm's running time.
It's a measure of the longest amount of time it could possibly take for the algorithm to
complete.
Definition
Given functions f(n) and g(n), we say that f(n) = O(g(n)) if and only if there are
positive constants c and n0 such that f(n) <= c*g(n) for all n >= n0.

Eg: Say that f(n) = 2n + 8, and g(n) = n^2. Can we find a constant n0, so that 2n + 8 <= n^2
for all n >= n0? The number 4 works here, giving us 16 <= 16. For any number n greater
than 4, this will still work. Since we're trying to generalize this for large values of n, and
small values (1, 2, 3) aren't that important, we can say that f(n) is generally faster than
g(n); that is, f(n) is bounded by g(n), and will always be less than it. It could then be said
that f(n) runs in O(n^2) time: "f-of-n runs in Big-O of n-squared time".
To find the upper bound - the Big-O time - assuming we know that f(n) is equal to
(exactly) 2n + 8, we can take a few shortcuts. For example, we can remove all constants
from the runtime; eventually, at some value of c, they become irrelevant. This makes f(n)
= 2n. Also, for convenience of comparison, we remove constant multipliers; in this case,
the 2. This makes f(n) = n. It could also be said that f(n) runs in O(n) time; that lets us put
a tighter (closer) upper bound onto the estimate.
Omega Notation (Ω)
Definition
Given functions f(n) and g(n), we say that f(n) = Ω(g(n)) if and only if there are positive
constants c and n0 such that f(n) >= c*g(n) for all n >= n0.
It is the lower bound of any function. Hence it denotes the best-case complexity of any
algorithm.



Theta Notation (Θ)
Definition
Given functions f(n) and g(n), we say that f(n) = Θ(g(n)) if and only if there are positive
constants c1, c2 and n0 such that c1*g(n) <= f(n) <= c2*g(n) for all n >= n0.
The theta notation is more precise than both the big oh and big omega notations. The
function f(n) = Θ(g(n)) iff g(n) is both a lower and an upper bound of f(n).

Little Oh Notation (o)
o(g(n)) = { f(n) : for any positive constant c > 0, there exists n0 > 0 such that
0 <= f(n) < c*g(n) for all n >= n0 }
It defines an upper bound that is not asymptotically tight. The main difference from Big
Oh is that Big Oh requires the bound to hold for some constant c, while Little Oh requires
it to hold for all constants c.
Little Omega Notation (ω)
ω(g(n)) = { f(n) : for any positive constant c > 0, there exists n0 > 0 such that
0 <= c*g(n) < f(n) for all n >= n0 }
It defines a lower bound that is not asymptotically tight. The main difference from Ω is
that Ω requires the bound to hold for some constant c, while ω requires it to hold for all
constants c.

Comparison of Functions
The asymptotic relations between functions f and g correspond to relations between
numbers a and b:
f(n) = O(g(n))  ~  a <= b
f(n) = Ω(g(n))  ~  a >= b
f(n) = Θ(g(n))  ~  a = b
f(n) = o(g(n))  ~  a < b
f(n) = ω(g(n))  ~  a > b
Intuition for Asymptotic Notation
Big-Oh
- f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
big-Omega
- f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
big-Theta
- f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
little-oh
- f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
little-omega
- f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)







Asymptotic complexity of Sum algorithm

Statement                     s/e   frequency   Total steps
1. Algorithm Sum(a,n)          0        -          O(0)
2. {                           0        -          O(0)
3.   s := 0;                   1        1          O(1)
4.   for i := 1 to n do        1       n+1         O(n)
5.   {                         0        -          O(0)
6.     s := s + a[i];          1        n          O(n)
7.   }                         0        -          O(0)
8.   return s;                 1        1          O(1)
9. }                           0        -          O(0)

Total = O(n)

Asymptotic complexity of Add algorithm

Statement                            s/e   frequency   Total steps
1. Algorithm Add(a,b,c,m,n)           0        -          O(0)
2. {                                  0        -          O(0)
3.   for i := 1 to m do               1       m+1         O(m)
4.   {                                0        -          O(0)
5.     for j := 1 to n do             1      m(n+1)       O(mn)
6.     {                              0        -          O(0)
7.       c[i,j] := a[i,j] + b[i,j];   1       mn          O(mn)
8.     }                              0        -          O(0)
9.   }                                0        -          O(0)
10. }                                 0        -          O(0)

Total = O(mn)
Properties of asymptotic notations

Transitivity:
f(n) = Θ(g(n)) and g(n) = Θ(h(n)) imply f(n) = Θ(h(n)),
f(n) = O(g(n)) and g(n) = O(h(n)) imply f(n) = O(h(n)),
f(n) = Ω(g(n)) and g(n) = Ω(h(n)) imply f(n) = Ω(h(n)),
f(n) = o(g(n)) and g(n) = o(h(n)) imply f(n) = o(h(n)),
f(n) = ω(g(n)) and g(n) = ω(h(n)) imply f(n) = ω(h(n)).
Reflexivity:
f(n) = Θ(f(n)),
f(n) = O(f(n)),
f(n) = Ω(f(n)).
Symmetry:
f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
Transpose symmetry:
f(n) = O(g(n)) if and only if g(n) = Ω(f(n)),
f(n) = o(g(n)) if and only if g(n) = ω(f(n)).
Because these properties hold for asymptotic notations, one can draw an analogy between
the asymptotic comparison of two functions f and g and the comparison of two real
numbers a and b:
f(n) = O(g(n)) similar to a <= b,
f(n) = Ω(g(n)) similar to a >= b,
f(n) = Θ(g(n)) similar to a = b,
f(n) = o(g(n)) similar to a < b,
f(n) = ω(g(n)) similar to a > b.
We say that f(n) is asymptotically smaller than g(n) if f(n) = o(g(n)), and f(n) is
asymptotically larger than g(n) if f(n) = ω(g(n)).
One property of real numbers, however, does not carry over to asymptotic notations:
Trichotomy: For any two real numbers a and b, exactly one of the following must hold:
a < b, a = b, or a > b.
Although any two real numbers can be compared, not all functions are asymptotically
comparable. That is, for two functions f(n) and g(n), it may be the case that neither
f(n) = O(g(n)) nor f(n) = Ω(g(n)) holds. For example, the functions n and n^(1+sin n)
cannot be compared using asymptotic notation, since the value of the exponent in
n^(1+sin n) oscillates between 0 and 2, taking on all values in between.
COMMON COMPLEXITY FUNCTIONS
Typical Running Time Functions
- 1 (constant running time):
Instructions are executed once or a few times
- log N (logarithmic):
A big problem is solved by cutting the original problem into smaller sizes, by a
constant fraction at each step
- N (linear):
A small amount of processing is done on each input element
- N log N:
A problem is solved by dividing it into smaller problems, solving them
independently and combining the solutions
- N^2 (quadratic):
Typical for algorithms that process all pairs of data items (double nested loops)
- N^3 (cubic):
Processing of triples of data (triple nested loops)
- N^k (polynomial)
- 2^N (exponential):
Few exponential algorithms are appropriate for practical use
Worst, Best and Average Case Complexity
The worst-case complexity of the algorithm is the function defined by the maximum
number of steps taken on any instance of size n. It represents the curve passing through
the highest point of each column.
The best-case complexity of the algorithm is the function defined by the minimum
number of steps taken on any instance of size n. It represents the curve passing through
the lowest point of each column.
The average-case complexity of the algorithm is the function defined by the average
number of steps taken on any instance of size n.
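As a concrete illustration, consider the sequential_search function given earlier: its best
case is 1 comparison (the key at the first position), its worst case is n comparisons (the
key at the last position or absent), and, assuming each of the n positions is equally likely
for a successful search, its average case is (1 + 2 + ... + n)/n = (n+1)/2 comparisons.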



RECURRENCE RELATIONS

A recurrence is an equation or inequality that describes a function in terms of its value
on smaller inputs, and one or more base cases. Recurrences are useful for analyzing
recursive algorithms and make it easier to compare the complexity of two algorithms.

Different methods for solving recurrence relation are:-
- Substitution method
- Iteration method
- Changing variables method
- Recurrence tree
- Characteristic equation method
- Master theorem

Substitution Method
- Use mathematical induction to derive an answer
- Derive a function of n (or other variables used to express the size of the problem)
that is not a recurrence so we can establish an upper and/or lower bound on the
recurrence
- May get an exact solution or may just get upper or lower bounds on the solution

Steps
- Guess the form of the solution
- Use mathematical induction to find constants or show that they can be found
and to prove that the answer is correct

Example 1
Find the upper bound for the recurrence relation
T(n) = 2T(n/2) + n

Guess the solution as T(n) = O(n lg n).
Assume the bound holds for the smaller value, i.e. T(n/2) <= c*(n/2)*lg(n/2).
Substituting into the recurrence, we get
T(n) <= 2(c*(n/2)*lg(n/2)) + n
     = c*n*lg(n/2) + n
     = c*n*lg(n) - c*n*lg(2) + n
     = c*n*lg(n) - c*n + n
     <= c*n*lg(n), for c >= 1
Recursion Tree Method
- The main disadvantage of the substitution method is that it is often difficult to come up
with a good guess
- The recursion tree method allows you to make a good guess for the substitution method
- It allows us to visualize the process of iterating the recurrence

Steps
- Convert the recurrence into a tree.
- Each node represents the cost of a single sub problem somewhere in the set of recursive function
invocations
- Sum the costs within each level of the tree to obtain a set of per-level costs
- Sum all the per-level costs to determine the total cost of all levels of the recursion

Example 1
T(n) = 3T(n/4) + Θ(n^2), which we model as T(n) = 3T(n/4) + cn^2.

- The subproblem size for a node at depth i is n/4^i.
- When the subproblem size is 1, n/4^i = 1, i.e. i = log4(n).
- The tree therefore has log4(n) + 1 levels (0, 1, 2, ..., log4(n)).
- The number of nodes at depth i is 3^i.
- Each node at depth i (for i = 0, 1, ..., log4(n)-1) has a cost of c(n/4^i)^2.
- The total cost over all nodes at depth i is 3^i * c(n/4^i)^2 = (3/16)^i * cn^2.
- At depth log4(n) the number of nodes is 3^(log4 n) = n^(log4 3), each contributing
cost T(1), so the cost of the last level is Θ(n^(log4 3)).

Summing the per-level costs:

T(n) = cn^2 + (3/16)cn^2 + (3/16)^2 cn^2 + ... + (3/16)^(log4(n)-1) cn^2 + Θ(n^(log4 3))
     < cn^2 * sum for i = 0 to infinity of (3/16)^i + Θ(n^(log4 3))
     = cn^2 * 1/(1 - 3/16) + Θ(n^(log4 3))
     = (16/13)cn^2 + Θ(n^(log4 3))
     = O(n^2)

Example 2
W(n) = 2W(n/2) + Θ(n^2), modelled as W(n) = 2W(n/2) + cn^2.

- The subproblem size at level i is n/2^i.
- The subproblem size hits 1 when 1 = n/2^i, i.e. i = lg n.
- The cost of a node at level i is c(n/2^i)^2, and the number of nodes at level i is 2^i,
so the total cost at level i is 2^i * c(n/2^i)^2 = cn^2/2^i.

Total cost:

W(n) = sum for i = 0 to lg(n)-1 of cn^2/2^i + 2^(lg n) * W(1)
     <= cn^2 * sum for i = 0 to infinity of (1/2)^i + Θ(n)
     = 2cn^2 + Θ(n)
     = O(n^2)

W(n) = O(n^2)

Example 3
T(n) = T(n/4) + T(n/2) + n^2
Here level i of the recursion tree contributes at most (5/16)^i * n^2 (level 0 costs n^2,
level 1 costs (n/4)^2 + (n/2)^2 = (5/16)n^2, and so on), so the total is bounded by the
geometric series n^2/(1 - 5/16) = (16/11)n^2 = O(n^2).
Solving Recurrences with the Iteration Method
In the iteration method we iteratively unfold the recurrence until we see the pattern.
The iteration method does not require making a good guess like the substitution method
(but it is often more involved than using induction).
Example 1: Solve T(n) = 8T(n/2) + n^2 (T(1) = 1)
T(n) = n^2 + 8T(n/2)
     = n^2 + 8((n/2)^2 + 8T(n/4))
     = n^2 + 2n^2 + 8^2 T(n/4)
     = n^2 + 2n^2 + 8^2 ((n/4)^2 + 8T(n/8))
     = n^2 + 2n^2 + 4n^2 + 8^3 T(n/8)
     = ...
     = n^2 + 2n^2 + 4n^2 + ... + 2^(k-1) n^2 + 8^k T(n/2^k)
With n = 2^k, the base case is reached when k = lg n, and 8^k = n^3, so
T(n) = (2^k - 1)n^2 + n^3 T(1) = (n-1)n^2 + n^3 = Θ(n^3)
Example 2
Solve T(n) = T(n-1) + 2, with T(1) = 1.
T(n) = T(n-1) + 2
     = [T(n-2) + 2] + 2 = T(n-2) + 2*2
     = [T(n-3) + 2] + 2*2 = T(n-3) + 2*3
     = [T(n-4) + 2] + 2*3 = T(n-4) + 2*4

Now rewrite these equations and look for a pattern:
T(n) = T(n-1) + 2*1    1st step of recursion
T(n) = T(n-2) + 2*2    2nd step of recursion
T(n) = T(n-3) + 2*3    3rd step of recursion
T(n) = T(n-4) + 2*4    4th step of recursion
T(n) = T(n-5) + 2*5    5th step of recursion

Generalized recurrence relation at the kth step of the recursion:
T(n) = T(n-k) + 2*k

We want T(1). So we let n-k = 1. Solving for k, we get k = n-1. Now plug back in:
T(n) = T(n-k) + 2*k
T(n) = T(1) + 2*(n-1), and we know T(1) = 1
T(n) = 1 + 2*(n-1) = 2n-1
We are done: the right side does not have any T()s. This recurrence relation is now solved in
its closed form, and it runs in O(n) time.
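As a quick sanity check of the closed form, the following throwaway C sketch computes
T(n) directly from the recurrence and compares it with 2n - 1:

#include <stdio.h>

/* T(n) = T(n-1) + 2 with T(1) = 1, computed directly */
int T(int n)
{
    if (n == 1) return 1;
    return T(n - 1) + 2;
}

int main(void)
{
    int n;
    for (n = 1; n <= 5; n++)   /* each pair of values printed should match */
        printf("T(%d) = %d, 2n-1 = %d\n", n, T(n), 2 * n - 1);
    return 0;
}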
























PROFILING
Profiling or performance measurement is the process of executing a correct
program on data sets and measuring the time and space it takes to compute the
results. Profiling refers to the experimental measurement of the performance of
algorithms.
Profiling techniques fall into two main categories:
- Instruction counting - the number of times particular instruction(s) are
executed is measured.
- Clock-based timing - the time required for certain blocks of code to
execute is measured.
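A minimal clock-based timing sketch in C; the function work() here is only a placeholder
for whatever block of code is being profiled:

#include <stdio.h>
#include <time.h>

/* placeholder for the code block being profiled */
void work(void)
{
    volatile long s = 0;
    long i;
    for (i = 0; i < 10000000L; i++) s += i;
}

int main(void)
{
    clock_t start = clock();   /* processor time before the block */
    work();
    clock_t end = clock();     /* processor time after the block */
    printf("elapsed: %f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);
    return 0;
}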

AMORTIZED COMPLEXITY





























MODULE II

DIVIDE AND CONQUER

Given a function to compute on n inputs, the divide-and-conquer
strategy suggests splitting the inputs into k distinct
subsets, 1 < k <= n, yielding k subproblems. These subproblems must be
solved, and then a method must be found to combine the subsolutions into
a solution of the whole. If the subproblems are still relatively large, then
the divide-and-conquer strategy can possibly be reapplied. The principle
is naturally expressed using recursive algorithms, i.e. smaller and smaller
subproblems of the same kind are generated until eventually
subproblems that are small enough to be solved without splitting are
produced.

CONTROL ABSTRACTION

Control abstraction means a procedure whose flow of control is clear
but whose primary operations are specified by other procedures
whose precise meanings are left undefined. The algorithm DAndC below is
initially invoked as DAndC(P), where P is the problem to be solved.
Small(P) is a Boolean-valued function that determines whether
the input size is small enough that the answer can be computed
without splitting. If this is so, the function S is invoked. Otherwise the
problem P is divided into smaller subproblems. These subproblems
P1, P2, ..., Pk are solved by recursive applications of DAndC. Combine
is a function that determines the solution to P using the solutions to
the k subproblems.

ALGORITHM

Algorithm DAndC(P)
{
    if Small(P) then return S(P);
    else
    {
        divide P into smaller instances P1, P2, ..., Pk, k >= 1;
        apply DAndC to each of these subproblems;
        return Combine(DAndC(P1), DAndC(P2), ..., DAndC(Pk));
    }
}

FINDING THE MAXIMUM AND MINIMUM

This is a simple problem that can be solved by the divide-and-conquer
technique. The problem is to find the maximum and
minimum items in a set of n elements.

ALGORITHM-1 Straight forward algorithm for finding
maximum and minimum

Algorithm StraightMaxMin (a, n, max, min)
//Set max to the maximum and min to the minimum of a[1:n];
{
max:=min:=a[1];
for i:=2 to n do
{
if (a[i] > max) then max:= a[i];
if (a[i] < min) then min:= a[i];
}
}

This is a straightforward algorithm to accomplish the above
problem. In analyzing the time complexity of this algorithm, we
concentrate on the number of element comparisons. The best case
occurs when the elements are in increasing order; the number of
element comparisons is (n-1). The worst case occurs when the
elements are in decreasing order; in this case, the number of
element comparisons is 2(n-1). The maximum and minimum are a[1]
if n=1. If n=2, the problem can be solved by making one comparison.
If the list has more than two elements, P has to be divided
into smaller instances:
P1 = (n/2, a[1], ..., a[n/2]) and P2 = (n - n/2, a[n/2+1], ..., a[n])
Then recursively invoke the same divide-and-conquer algorithm
to solve these subproblems and combine their solutions.

ALGORITHM-2 Recursive algorithm for finding maximum and
minimum

Algorithm MaxMin ( i, j, max, min )
{
if (i=j) then max :=min :=a[i]; // Small(P)
else if (i=j-1) then //Another case of Small(P)
{
if( a[i] < a[j]) then
{
max:=a[j];
min:=a[i];
}
else
{
max:=a[i];
min:=a[j];
}
}
else
{
// If P is not small, divide P into subproblems. Find where to split
mid := (i+j)/2;
MaxMin(i, mid, max, min);
MaxMin(mid+1, j, max1, min1);
if (max<max1)then max:=max1;
if (min>min1)then min:=min1;
}
}

MaxMin is a recursive algorithm that finds the maximum and
minimum of the set of elements {a[i], a[i+1], ..., a[j]}. The
situations of set size one (i=j) and two (i=j-1) are handled separately.
For sets containing more than two elements, the midpoint is
determined (just as in binary search) and two new subproblems are
generated. When the maxima and minima of these subproblems are
generated, the two maxima are compared and the two minima are
compared to achieve the solution for the entire set.
Eg:- Suppose we simulate MaxMin on the following nine elements:
a: [1] [2] [3] [4] [5] [6] [7] [8] [9]
22 13 -5 -8 15 60 17 31 47
A good way of keeping track of recursive calls is to build a tree by
adding a node each time a new call is made. For this algorithm each
node has four items of information: i, j, max, and min. On the array
a[] above, the tree is produced.




Trees of recursive calls of MaxMin
We see that the root node contains 1 and 9 as the values of i and j
corresponding to the initial call to MaxMin. This execution produces
new call to MaxMin where i and j have the values 1 ,5 and 6,9,
respectively, and thus split the set into two subsets of approximately
the same size. From the tree, we can immediately see that the
maximum depth of recursion is four(including the first call).
The order in which max and min are assigned values is as follows:
[1,2,22,13] [3,3,-5,-5] [1,3,22,-5] [4,5,15,-8] [1,5,22,-8] [6,7,60,17]
[8,9,47,31] [6,9,60,17] [1,9,60,-8].




COMPLEXITY ANALYSIS OF FINDING MAXIMUM & MINIMUM

Consider the total number of element comparisons needed for MaxMin.
If T(n) represents this number, then the resulting recurrence relation is
T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 2    n > 2
T(n) = 1                          n = 2
T(n) = 0                          n = 1

When n is a power of 2, n = 2^k for some positive integer k, then
T(n) = 2T(n/2) + 2
     = 2(2T(n/4) + 2) + 2
     = 4T(n/4) + 4 + 2
     ...
     = 2^(k-1) T(2) + sum for i = 1 to k-1 of 2^i
     = 2^(k-1) + 2^k - 2
     = 3n/2 - 2
Therefore 3n/2 - 2 is the best, average and worst case number of
comparisons when n is a power of two.

Consider the count when element comparisons have the same cost as
comparisons between i and j. Let C(n) be this number.
Assuming n = 2^k for some positive integer k, we get
C(n) = 2C(n/2) + 3    n > 2
C(n) = 2              n = 2
C(n) = 2C(n/2) + 3
     = 4C(n/4) + 6 + 3
     ...
     = 2^(k-1) C(2) + 3*(2^(k-1) - 1)
     = 2^k + 3*2^(k-1) - 3
     = 5n/2 - 3
If comparisons between array elements are costlier than comparisons
of integer variables, then the divide-and-conquer method is more efficient.
In both cases mentioned above, the best case, average case and worst
case complexity is Θ(n).

BINARY SEARCH ALGORITHMS

Let a[i], 1 <= i <= n, be a list of elements that are sorted in nondecreasing
order. Consider the problem of determining whether a given element x
is present in the list. If x is present, we have to determine the value of j
such that a[j] = x. If x is not in the list, then j is to be set to zero.
P = (n, a[i], ..., a[l], x) is the problem, where
n - number of elements in the list
a[i], ..., a[l] - the list of elements
x - the element searched for
Divide and conquer can be applied to solve this problem. Let Small(P)
be true if n = 1. If P has more than one element, it can be divided into a
new subproblem: pick an index q in the range [i, l] and compare x with
a[q]. There are three possibilities:
1. x = a[q] - In this case the problem is immediately solved.
2. x < a[q] - In this case x has to be searched for only in the sublist
a[i], a[i+1], ..., a[q-1].
3. x > a[q] - In this case x has to be searched for only in the sublist
a[q+1], ..., a[l].
The answer to the new subproblem is the answer to the original
problem. There is no need for any combining.

Iterative Algorithm

Algorithm BinSearch(a, n, x)
{
    low := 1; high := n;
    while (low <= high) do
    {
        mid := (low+high)/2;
        if (x < a[mid]) then high := mid-1;
        else if (x > a[mid]) then low := mid+1;
        else return mid;
    }
    return 0;
}


Recursive algorithm

Algorithm BinSrch(a,i,l,x)
{
if(l=i) then
{
if(x=a[i]) then return i;
else return 0;
}
else
{
mid:=(i+l)/2;
if(x=a[mid]) then return mid;
else if(x<a[mid]) then
return BinSrch(a,i,mid-1,x);
else return BinSrch(a,mid+1,l,x);
}
}
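For reference, here is a direct C transcription of the iterative version (a sketch that uses
0-based array indexing and returns -1 instead of 0 when x is absent):

/* Iterative binary search on a sorted array a[0..n-1].
   Returns the index of x, or -1 if x is not present. */
int binsearch(int a[], int n, int x)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = (low + high) / 2;
        if (x < a[mid])       high = mid - 1;
        else if (x > a[mid])  low = mid + 1;
        else                  return mid;
    }
    return -1;
}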

Example

Consider 10 elements: 10, 20, 30, 40, 50, 60, 70, 80, 90, 100.
The number of comparisons needed to search for each element is given in
the table below.

Position:                1   2   3   4   5   6   7   8   9  10
Element:                10  20  30  40  50  60  70  80  90 100
Comparisons required:    3   2   3   4   1   3   4   2   3   4

The search tree for the above list of elements will be as given below.



If the element is present, then the search ends in a circular node (inner node); if
the element is not present, then it ends up in a square node (leaf node).
Now we will consider the worst case, average case and best case
complexities of the algorithm for a successful and an unsuccessful search.
Worst Case
Find k such that 2^(k-1) <= n < 2^k.
Then for a successful search, the search ends at one of the k levels of inner nodes.
Hence the complexity is O(k), which is equal to O(log2 n).
An unsuccessful search needs either k-1 or k comparisons. Hence the
complexity is Θ(k), which is equal to Θ(log2 n).

Average Case
Let I and E represent the sum of the distances of all internal nodes from the root and
the sum of the distances of all external nodes from the root, respectively. Then
E = I + 2n

Let As(n) and Au(n) represent the average-case complexity of a successful and
an unsuccessful search, respectively. Then
As(n) = 1 + I/n
Au(n) = E/(n+1)

Relating the two:
As(n) = 1 + I/n
      = 1 + (E - 2n)/n
      = 1 + (Au(n)(n+1) - 2n)/n
      = (n + Au(n)(n+1) - 2n)/n
      = Au(n)(1 + 1/n) - 1

As(n) and Au(n) are directly related, and both are proportional to log2 n.
Hence the average-case complexity for a successful and an unsuccessful search
is O(log2 n).
Best Case
The best case for a successful search is a single comparison, which happens when the
element is at the middle position. Hence the complexity is Θ(1).
The best case for an unsuccessful search is O(log2 n).






MERGE SORT
Assume a sequence of n elements a[1],a[2],...a[n]. Split them into two sets
like a[1],a[2],..a[n/2] and a[(n/2)+1],...a[n]. Each set is individually sorted,
and the resulting sorted sequences are merged to produce a single sorted
sequence of n elements. Merge-sort is based on the divide-and-conquer
paradigm. Three steps in this algorithm are,
Divide: Divide the n element sequence to be sorted into two sub
sequences of n/2 elements each
Conquer: Sort the two sub sequences recursively using merge sort
Combine: Merge the two sorted sub sequences to produce a sorted
answer

Summary of binary search complexities:

              Best Case    Average Case   Worst Case
Successful    Θ(1)         O(log2 n)      O(log2 n)
Unsuccessful  O(log2 n)    O(log2 n)      O(log2 n)
Idea of Mergesort
Divide: divide array A[0..n-1] in two about equal halves and make copies
of each half in arrays B and C.
Conquer:
If number of elements in B and C is 1, directly solve it
Sort arrays B and C recursively
Combine: Merge sorted arrays B and C into array A
Repeat the following until no elements remain in one of the arrays:
compare the first elements in the remaining unprocessed
portions of the arrays B and C
copy the smaller of the two into A, while incrementing the
index indicating the unprocessed portion of that array
Once all elements in one of the arrays are processed, copy the
remaining unprocessed elements from the other array into A.


Algorithm MergeSort(low,high)
//a[low:high] is a global array to be sorted. Small(P) is true if there
is only one element to sort. In this case the list is already sorted.
{
if(low<high) then //if there are more than one element
{
//Divide P into subproblems
// find where to split the set.

mid:= [(low+high)/2];
//Solve the subproblems

MergeSort(low,mid);
MergeSort(mid+1,high);
// Combine the solutions

Merge(low,mid,high)
}
}


Algorithm Merge(low, mid, high)
// a[low:high] is a global array containing two sorted subsets in
// a[low:mid] and in a[mid+1:high]. The goal is to merge these two
// sets into a single set residing in a[low:high]. b[] is an auxiliary
// global array.
{
    h := low; i := low; j := mid+1;
    while ((h <= mid) and (j <= high)) do
    {
        if (a[h] <= a[j]) then
        {
            b[i] := a[h]; h := h+1;
        }
        else
        {
            b[i] := a[j]; j := j+1;
        }
        i := i+1;
    }
    if (h > mid) then
        for k := j to high do
        {
            b[i] := a[k]; i := i+1;
        }
    else
        for k := h to mid do
        {
            b[i] := a[k]; i := i+1;
        }
    for k := low to high do
        a[k] := b[k];
}

Example consider the array a of ten elements
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
310 285 179 652 351 423 861 254 450 520

Algorithm MergeSort begins by splitting a[] into two subarrays each of size
five (a[1:5] and a[6:10]). The elements in a[1:5] are then split into two
subarrays of size three (a[1:3]) and two (a[4:5]). Then the items in a[1:3] are
split into subarrays of size two (a[1:2]) and one (a[3:3]). The two values in
a[1:2] are split a final time into one-element subarrays, and now the merging
begins. A record of the subarrays is implicitly maintained by the recursive
mechanism.

( 310 | 285 | 179 | 652 , 351 | 423 , 861 , 254 , 450 , 520 ) where vertical bars
indicate the boundaries of subarrays.
Elements a[1] and a[2] are merged to yield
( 285 , 310 | 179 | 652 , 351 | 423 , 861 , 254 , 450 , 520 )
Then a[3] is merged with a[1:2] and
( 179 , 285 , 310 | 652 , 351 | 423 , 861 , 254 , 450 , 520 ) is produced.
Next elements a[4] and a[5] are merged:
( 179 , 285 , 310 | 351 , 652 | 423 , 861 , 254 , 450 , 520 )
And then a[1:3] and a[4:5]:
( 179 , 285 , 310 , 351 , 652 | 423 , 861 , 254 , 450 , 520 )

At this point the algorithm has returned to the first invocation of MergeSort
and is about to process the second recursive call. Repeated recursive calls are
invoked producing the following subarrays:
( 179 , 285 , 310 , 351 , 652 | 423 | 861 | 254 | 450 , 520 )
Elements a[6] and a[7] are merged. Then a[8] is merged with a[6:7]:
( 179 , 285 , 310 , 351 , 652 | 254 , 423 , 861 | 450 , 520 )
Next a[9] and a[10] are merged and then a[6:8] and a[9:10]:
( 179 , 285 , 310 , 351 , 652 | 254 , 423 , 450 , 520 , 861 )

At this point there are two sorted subarrays and the final merge produces the
fully sorted result

( 179 , 254 , 285 , 310 , 351 , 423 , 450 , 520 , 652 , 861 )

Figure shows a tree that represents the sequence of recursive calls that are
produced by MergeSort when it is applied to ten elements. The pair of values
in each node are the values of parameters low and high. The splitting continues
until the sets containing a single element are produced.






Tree of calls of MergeSort



Tree of calls of Merge

Another example:

8 3 2 9 7 1 5 4
8 3 2 9 | 7 1 5 4
8 3 | 2 9 | 7 1 | 5 4
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4
3 8 | 2 9 | 1 7 | 4 5
2 3 8 9 | 1 4 5 7
1 2 3 4 5 7 8 9

Complexity of MergeSort
If T(n) represents the time for merge sort, where the time for the merging
operation is proportional to n, then the computing time is
T(n) = 2T(n/2) + cn    n > 1
T(n) = a               n = 1
where n is a power of 2 (n = 2^k for some integer k), and c and a are constants.
In the iterative substitution technique, we iteratively apply the recurrence
equation to itself and see if we can find a pattern:

T(n) = 2T(n/2) + cn
     = 2(2T(n/2^2) + c(n/2)) + cn = 2^2 T(n/2^2) + 2cn
     = 2^3 T(n/2^3) + 3cn
     = 2^4 T(n/2^4) + 4cn
     ...
     = 2^k T(n/2^k) + kcn

Note that the base case T(n) = a occurs when 2^k = n. That is, k = log n.
So,
T(n) = 2^k T(n/2^k) + kcn = na + cn log n

Thus, T(n) is O(n log n).

Concerning the MergeSort algorithm, there are two complaints:
1. It uses 2n locations. The additional n locations were needed because we
couldn't reasonably merge two sorted sets in place.
2. The stack space that is necessitated by the use of recursion.

For small set sizes most of the time will be spent processing the recursion
instead of sorting. This situation can be improved by not allowing the recursion
to go to the lowest level. We use a second sorting algorithm that works well on
small-sized set.
Insertion sort works exceedingly fast on small sized arrays.
Algorithm InsertionSort(a, n)
// Sort the array a[1:n] into nondecreasing order, n >= 1.
{
    for j := 2 to n do
    {
        // a[1:j-1] is already sorted
        item := a[j];
        i := j-1;
        while ((i >= 1) and (item < a[i])) do
        {
            a[i+1] := a[i];
            i := i-1;
        }
        a[i+1] := item;
    }
}

Then the MergeSort algorithm can be written as:

Algorithm MergeSort(low, high)
// a[low:high] is a global array to be sorted. Small(P) is true if there is only
// one element to sort. In this case the list is already sorted.
{
    if ((high - low) < 15) then
        return InsertionSort(a, low, high);
    else
    {
        mid := (low+high)/2;
        MergeSort(low, mid);
        MergeSort(mid+1, high);
        // Combine the solutions
        Merge(low, mid, high);
    }
}

QUICK SORT
Quick sort is one of the fastest and simplest sorting algorithms. It works
recursively by a divide-and-conquer strategy.
Divide: Partition array A[p..q] into two subarrays A[p..j-1] and A[j+1..q]
such that each element of the first subarray is <= A[j] and each element of the
second subarray is >= A[j].
Conquer: Sort the two subarrays A[p..j-1] and A[j+1..q] by recursive calls
to quicksort.
Combine: No work is needed, because A[j] is already in its correct place
after the partition is done, and the two subarrays have been sorted.
Idea of Quicksort
Select a pivot with respect to whose value we are going to divide the list
(e.g., the pivot p = A[p], the first element).
Rearrange the list so that it starts with the pivot followed by a <= sublist (a
sublist whose elements are all smaller than or equal to the pivot) and a
>= sublist (a sublist whose elements are all greater than or equal to the pivot).
Exchange the pivot with the last element in the first sublist (i.e., the <= sublist);
the pivot is now in its final position.
Sort the two sublists recursively using quicksort.
Algorithm QuickSort(p, q)
// Sorts the elements a[p], ..., a[q], which reside in the global array
// a[1:n], into ascending order; a[n+1] is considered to be defined and
// must be >= all the elements in a[1:n].
{
    if (p < q) then // if there are more than one element
    {
        // divide P into subproblems
        j := Partition(a, p, q+1);
        // j is the position of the partitioning element
        // solve the subproblems
        QuickSort(p, j-1);
        QuickSort(j+1, q);
        // there is no need for combining solutions
    }
}

Algorithm Partition(a, m, p)
// Within a[m], a[m+1], ..., a[p-1] the elements are rearranged in
// such a manner that if initially t = a[m], then after completion a[q] = t for
// some q between m and p-1, a[k] <= t for m <= k < q, and a[k] >= t for
// q < k < p. q is returned. Set a[p] = infinity.
{
    v := a[m]; i := m; j := p;
    repeat
    {
        repeat
            i := i+1;
        until (a[i] >= v);
        repeat
            j := j-1;
        until (a[j] <= v);
        if (i < j) then Interchange(a, i, j);
    } until (i >= j);
    a[m] := a[j]; a[j] := v; return j;
}

Algorithm Interchange(a,i,j)
// exchange a[i] with a[j]
{
p=a[i];
a[i]=a[j];
a[j]=p;
}

Examples
[1] [2] [3] [4] [5] [6] [7] [8] [9]
65 70 75 80 85 60 55 50 45
Initially, the first element 65 is taken as the pivot element. i and j are two
pointers pointing to the first and last elements respectively. The pointer i is
incremented until it finds an element greater than the pivot element. Similarly, the
pointer j is decremented until it finds an element that is less than the pivot
element.
In the above example i=2 and j=9. Then exchange these two elements.
The array becomes
65 45 75 80 85 60 55 50 70
This process continues until i > j. When i > j, the j-th element is exchanged with the
pivot element. This crude "sorting" around the pivot element yields two
subarrays: a left one and a right one. All the elements in the left subarray are less
than the pivot element and all those in the right subarray are greater than it.

[60 45 50 55] 65 [85 80 75 70]

The next step is for quicksort to call itself to have the left and right
subarrays sorted.
The solution is
65 70 75 80 85 60 55 50 45
65 45 75 80 85 60 55 50 70
65 45 50 80 85 60 55 75 70
65 45 50 55 85 60 80 75 70
i=6 j=5
65 45 50 55 60 85 80 75 70
60 45 50 55 65 85 80 75 70
55 45 50 60 65 70 80 75 85
50 45 55 60 65 70 75 80 85
45 50 55 60 65 70 75 80 85
Complexity of quick sort

The timing analysis of the QuickSort algorithm is given by the following recurrence
relation:

T(n) = cn + T(n/2) + T(n/2)    n > 1
T(n) = a                       n = 1
where
cn is the time required for partitioning,
T(n/2) is the time required to sort the left or right subarray,
n is a power of 2 (n = 2^k for some integer k),
c and a are constants.
Best case Analysis
The best case timing analysis is possible when the array is always
partitioned in half. The analysis then takes the following form:

T(n) = 2T(n/2) + cn
     = 2^2 T(n/2^2) + 2cn
     = 2^3 T(n/2^3) + 3cn
     ...
     = 2^k T(n/2^k) + kcn

Note that the base case T(n) = a occurs when 2^k = n. That is, k = log n.
So,
T(n) = na + cn log n
Thus, T(n) is O(n log n).

Worst Case Analysis
In this case, on every function call the given array is partitioned into two
subarrays: one of them contains n-1 elements and the other is empty. The
timing analysis is:

T(n) = cn + T(n-1) + T(0)
     = cn + T(n-1)
     = cn + c(n-1) + T(n-2)
     = cn + c(n-1) + c(n-2) + ... + c*2 + T(1)
     = c[n + (n-1) + (n-2) + ... + 2] + a
     = c[n(n+1)/2 - 1] + a
     = O(n^2)

Efficiency of Quicksort
Efficiency is based on whether the partitioning is balanced.
- Best case: split in the middle - O(n log n)
  C(n) = 2C(n/2) + Θ(n)    // 2 subproblems of size n/2 each
- Worst case: already-sorted array - O(n^2)
  C(n) = C(n-1) + n + 1    // 2 subproblems of size 0 and n-1 respectively
- Average case: random arrays - O(n log n)








MODULE III
GREEDY STRATEGY
Greedy method is another important algorithm design paradigm.
It can be applied to a variety of problems. All these problems have n
inputs and require us to obtain a subset that satisfies some
constraints. Any subset that satisfies these constraints is called a
feasible solution. We need to find a feasible solution that either
maximizes or minimizes a given objective function. A feasible
solution that does this is called an optimal solution.
In the greedy method we attempt to construct an optimal solution
in stages. At each stage we make a decision that appears to be the best
(under some criterion) at the time. A decision made at one stage is not
changed in a later stage, so each decision should assure feasibility.

CONTROL ABSTRACTION

Algorithm Greedy(a, n)
// a[1:n] contains the n inputs.
{
    solution := ∅; // initialize the solution to empty
    for i := 1 to n do
    {
        x := Select(a);
        if Feasible(solution, x) then
            solution := Union(solution, x);
    }
    return solution;
}
At each stage, a decision is made regarding whether a particular input is in an optimal solution.
The function Select selects an input from a[] and removes it; the selected input value is assigned to x.
The function Feasible is a Boolean-valued function that determines whether x can be included in the solution.
The function Union combines x with the solution and updates the objective function.
Subset paradigm

If the inclusion of the next input into the partially constructed optimal solution would result in an infeasible solution, this input is not added to the partial solution; otherwise it is added. This version of the greedy technique is called the subset paradigm.

Ordering paradigm

For problems that do not call for the selection of an optimal subset, we make decisions by considering the inputs in some order. Each decision is made using an optimization criterion that can be computed from the decisions already made. This version of the greedy technique is called the ordering paradigm.
GENERAL KNAPSACK PROBLEM

In the knapsack problem we have n objects and a knapsack or bag. Object i has a weight wi and the knapsack has a capacity m. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed into the knapsack, then a profit of pi·xi is earned. The objective is to obtain a filling of the knapsack that maximizes the total profit earned. Since the knapsack capacity is m, we require the total weight of all chosen objects to be at most m. Formally, given n objects, each with a weight wi > 0 and a profit pi > 0, and a knapsack of capacity m, the problem is

$$\text{maximize } \sum_{1 \le i \le n} p_i x_i
\quad\text{subject to } \sum_{1 \le i \le n} w_i x_i \le m,
\quad 0 \le x_i \le 1,\ 1 \le i \le n.$$

The profits and weights are positive numbers.
Algorithm GreedyKnapsack(m, n)
// p[1:n] and w[1:n] contain the profits and weights respectively of the
// n objects, ordered such that p[i]/w[i] ≥ p[i+1]/w[i+1]. m is the
// knapsack size and x[1:n] is the solution vector.
{
    for i := 1 to n do x[i] := 0.0; // initialize x
    U := m;
    for i := 1 to n do
    {
        if (w[i] > U) then break;
        x[i] := 1.0;
        U := U - w[i];
    }
    if (i ≤ n) then x[i] := U/w[i];
}
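The pseudocode translates directly into C. The following is a minimal sketch; it assumes, as the algorithm requires, that the objects are already sorted by nonincreasing profit-to-weight ratio, and the driver uses the data of the example below (objects reordered as 2, 3, 1).

#include <stdio.h>

/* Fractional knapsack, assuming p[]/w[] are already sorted so that
   p[i]/w[i] >= p[i+1]/w[i+1]. Fills x[] with the chosen fractions. */
void greedy_knapsack(double m, int n, const double p[], const double w[], double x[])
{
    double U = m;                 /* remaining capacity */
    int i;
    for (i = 0; i < n; i++) x[i] = 0.0;
    for (i = 0; i < n; i++) {
        if (w[i] > U) break;      /* next object no longer fits whole */
        x[i] = 1.0;
        U -= w[i];
    }
    if (i < n) x[i] = U / w[i];   /* take a fraction of the first misfit */
}

int main(void)
{
    /* The example below, already in ratio order: objects 2, 3, 1. */
    double p[] = {24, 15, 25}, w[] = {15, 10, 18}, x[3];
    greedy_knapsack(20, 3, p, w, x);
    double profit = 0;
    for (int i = 0; i < 3; i++) profit += p[i] * x[i];
    printf("x = (%.2f, %.2f, %.2f), profit = %.1f\n", x[0], x[1], x[2], profit);
    return 0;  /* prints: x = (1.00, 0.50, 0.00), profit = 31.5 */
}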



Example:
m = 20, n = 3, p = (25, 24, 15), w = (18, 15, 10).

Sl no.  (x1, x2, x3)                                       Σ wi·xi   Σ pi·xi
1       (1/2, 1/3, 1/4)                                    16.5      24.25
2       (1, 2/15, 0)   (largest profit first)              20        28.2
3       (0, 2/3, 1)    (smallest weight first)             20        31
4       (0, 1, 1/2)    (max profit per unit of capacity)   20        31.5
The greedy algorithm:
Step 1: Sort the objects into nonincreasing order of pi/wi.
Step 2: Put the objects into the knapsack according to the sorted sequence, as far as capacity allows.
Solution:
p1/w1 = 25/18 ≈ 1.39
p2/w2 = 24/15 = 1.6
p3/w3 = 15/10 = 1.5
After sorting by these ratios (objects 2, 3, 1), we have
(p1, p2, p3) = (24, 15, 25) and (w1, w2, w3) = (15, 10, 18).
i = 1, U = 20: w1 = 15 ≤ U, so x[1] := 1.0 and U := 20 - 15 = 5.
i = 2, U = 5: w2 = 10 > U, so the loop breaks.
Since i ≤ n, x[2] := U/w2 = 5/10 = 1/2.
The solution in sorted order is x1 = 1, x2 = 1/2, x3 = 0, with profit 24 + 7.5 = 31.5.
In terms of the original objects, the optimal solution is
x1 = 0, x2 = 1, x3 = 1/2, with total profit = 24 + 7.5 = 31.5.
Theorem
If objects are selected in order of nonincreasing pi/wi, then GreedyKnapsack finds an optimal solution.
Proof
The theorem is proved using the following technique: compare the greedy solution with any optimal solution. If the two solutions differ, find the first xi at which they differ; then show how to make the xi in the optimal solution equal to the one in the greedy solution without any loss in total value. Repeated use of this transformation shows that the greedy solution is optimal.
Assume that p1/w1 ≥ p2/w2 ≥ ... ≥ pn/wn, and let X = (x1, x2, ..., xn) be the solution generated by the greedy algorithm. If all the xi equal 1, the solution is clearly optimal. Otherwise, let j be the smallest index such that xj < 1. Then we know that
xi = 1 when i < j,
xi = 0 when i > j,
0 ≤ xj ≤ 1, and Σ wi·xi = m (the greedy algorithm stops only when the knapsack is full).
The value of this solution is P(X) = Σ pi·xi. Now let Y = (y1, y2, ..., yn) be any feasible solution. Since Y is feasible, Σ wi·yi ≤ m, and hence Σ wi·(xi - yi) ≥ 0. Let the value of the solution Y be P(Y) = Σ pi·yi. Then

P(X) - P(Y) = Σ pi·(xi - yi) = Σ (xi - yi)·wi·(pi/wi).

When i < j, xi = 1, so (xi - yi) is positive or zero, and pi/wi ≥ pj/wj.
When i > j, xi = 0, so (xi - yi) is negative or zero, and pi/wi ≤ pj/wj.
When i = j, pi/wi = pj/wj.
Hence in every case (xi - yi)·wi·(pi/wi) ≥ (xi - yi)·wi·(pj/wj), and summing over i (as the chain below makes explicit) gives P(X) - P(Y) ≥ 0. So we have proved that no feasible solution can have a value greater than P(X), and the solution X is optimal.
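The summation step can be written out as one chain of inequalities. A sketch in LaTeX, under the notation above (Σ wi·xi = m for the greedy solution X, Σ wi·yi ≤ m for any feasible Y):

$$\begin{aligned}
P(X) - P(Y) &= \sum_{i=1}^{n} p_i\,(x_i - y_i)
 = \sum_{i=1}^{n} (x_i - y_i)\, w_i\, \frac{p_i}{w_i} \\
&\ge \frac{p_j}{w_j} \sum_{i=1}^{n} (x_i - y_i)\, w_i
 = \frac{p_j}{w_j}\left(\sum_{i=1}^{n} w_i x_i - \sum_{i=1}^{n} w_i y_i\right) \ge 0.
\end{aligned}$$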

MINIMUM COST SPANNING TREE
An undirected graph G is a pair (V, E), where V is a finite set of points called vertices and E is a finite set of edges.
In a directed graph, an edge e is an ordered pair (u, v); the edge (u, v) is incident from vertex u and incident to vertex v.
The length of a path is defined as the number of edges in the path.
An undirected graph is connected if every pair of vertices is connected by a path.
A forest is an acyclic graph, and a tree is a connected acyclic graph.
A graph that has weights associated with each edge is called a weighted graph.
A spanning tree of an undirected graph G is a subgraph of G that is a tree containing all the vertices of G.
In a weighted graph, the weight of a subgraph is the sum of the weights of the edges in the subgraph.
A minimum spanning tree (MST) for a weighted undirected graph is a spanning tree with minimum weight.

[Figure: an undirected graph and its minimum spanning tree]

The number of edges needed to connect an undirected graph with n vertices is n-1. If more than n-1 edges are present, then there is a cycle, and we can remove an edge that is part of the cycle without disconnecting the spanning tree; this only reduces the cost. There are two classical algorithms for finding minimum spanning trees: Prim's algorithm and Kruskal's algorithm.

PRIM'S ALGORITHM

Prim's algorithm finds a minimum spanning tree for a connected weighted graph: it finds a subset of the edges that forms a tree including every vertex, such that the total weight of all the edges in the tree is minimized. The algorithm was discovered in 1930 by the mathematician Vojtěch Jarník, later independently by the computer scientist Robert C. Prim in 1957, and rediscovered by Edsger Dijkstra in 1959. It is therefore sometimes called the DJP algorithm, the Jarník algorithm, or the Prim-Jarník algorithm.
Steps
Build the tree edge by edge.
The next edge selected is the one that results in a minimum increase in the sum of the costs of the edges so far included.
Always verify that the result is a tree.
The efficiency of Prim's algorithm is O(n²).

Algorithm Prim(E, cost, n, t)
// E is the set of edges in G. cost[1:n, 1:n] is the cost adjacency
// matrix of an n-vertex graph such that cost[i,j] is either a positive
// real number or ∞ if no edge (i,j) exists. A minimum spanning tree is
// computed and stored as a set of edges in the array t[1:n-1, 1:2];
// (t[i,1], t[i,2]) is an edge in the minimum-cost spanning tree.
// The final cost is returned.
{
    Let (k, l) be an edge of minimum cost in E;
    mincost := cost[k, l];
    t[1,1] := k; t[1,2] := l;
    for i := 1 to n do // initialize near[]
        if (cost[i, l] < cost[i, k]) then near[i] := l;
        else near[i] := k;
    near[k] := near[l] := 0;
    for i := 2 to n-1 do
    { // find n-2 additional edges for t
        Let j be an index such that near[j] ≠ 0 and cost[j, near[j]] is minimum;
        t[i,1] := j; t[i,2] := near[j];
        mincost := mincost + cost[j, near[j]];
        near[j] := 0;
        for k := 1 to n do // update near[]
            if ((near[k] ≠ 0) and (cost[k, near[k]] > cost[k, j])) then
                near[k] := j;
    }
    return mincost;
}
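A C rendering of this O(n²) scheme follows, as a minimal sketch: the near[] bookkeeping mirrors the pseudocode (renamed closest[] here), 0-based indexing and the INF sentinel stand in for the ∞ entries, and the graph in main is assumed to be the 7-vertex example used for Kruskal's algorithm below, so the expected answer is 99.

#include <stdio.h>

#define N    7          /* number of vertices (illustrative) */
#define INF  1000000    /* stands in for "no edge" */

/* Prim's algorithm on an n x n cost matrix; returns the MST cost
   and stores the n-1 tree edges in t[][2]. */
int prim(int cost[N][N], int n, int t[][2])
{
    int closest[N], mincost;

    /* Find a minimum-cost edge (k,l) to start the tree. */
    int k = 0, l = 1;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (cost[i][j] < cost[k][l]) { k = i; l = j; }
    mincost = cost[k][l];
    t[0][0] = k; t[0][1] = l;

    for (int i = 0; i < n; i++)            /* closest[i]: tree vertex nearest to i */
        closest[i] = (cost[i][l] < cost[i][k]) ? l : k;
    closest[k] = closest[l] = -1;          /* -1 marks "already in the tree" */

    for (int e = 1; e < n - 1; e++) {      /* pick the n-2 remaining edges */
        int j = -1;
        for (int v = 0; v < n; v++)
            if (closest[v] != -1 && (j == -1 || cost[v][closest[v]] < cost[j][closest[j]]))
                j = v;
        t[e][0] = j; t[e][1] = closest[j];
        mincost += cost[j][closest[j]];
        closest[j] = -1;
        for (int v = 0; v < n; v++)        /* update closest[] for new vertex j */
            if (closest[v] != -1 && cost[v][j] < cost[v][closest[v]])
                closest[v] = j;
    }
    return mincost;
}

int main(void)
{
    int c[N][N], t[N - 1][2];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) c[i][j] = INF;
    /* Edge list of the example graph used for Kruskal below (vertices 1..7 -> 0..6). */
    int e[][3] = { {0,1,28},{0,5,10},{1,2,16},{1,6,14},{2,3,12},
                   {3,4,22},{3,6,18},{4,5,25},{4,6,24} };
    for (int i = 0; i < 9; i++) {
        c[e[i][0]][e[i][1]] = e[i][2];
        c[e[i][1]][e[i][0]] = e[i][2];
    }
    printf("MST cost = %d\n", prim(c, N, t));   /* prints: MST cost = 99 */
    return 0;
}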

Example 3.2
Consider the connected graph given below.

[Figure: example connected weighted graph]

The minimum spanning tree using Prim's algorithm is formed step by step as shown below.

[Figures: Step-I through Step-VI, each step adding the minimum-cost edge that extends the tree]

KRUSKAL'S ALGORITHM
Kruskal's algorithm is another algorithm that finds a minimum spanning tree for a connected weighted graph. Kruskal's algorithm builds the MST as a forest. Initially, each vertex is in its own tree in the forest. Then the algorithm considers each edge in turn, ordered by increasing weight.

Basics of Kruskal's algorithm:
Edges are initially sorted by increasing weight.
Start with an empty forest and grow the MST one edge at a time; intermediate stages usually have a forest of trees (not connected).
At each stage, add the minimum-weight edge among those not yet used that does not create a cycle. At each stage the edge may expand an existing tree, combine two existing trees into a single tree, or create a new tree.
An efficient way of detecting/avoiding cycles is needed.
The algorithm stops when all vertices are included.

Algorithm Kruskal(E, cost, n, t)
// E is the set of edges in G; G has n vertices. cost[u,v] is the
// cost of edge (u,v). t is the set of edges in the minimum-cost
// spanning tree. The final cost is returned.
{
    Construct a heap out of the edge costs using Heapify;
    for i := 1 to n do parent[i] := -1; // each vertex is in a different set
    i := 0; mincost := 0.0;
    while ((i < n-1) and (heap not empty)) do
    {
        Delete a minimum-cost edge (u,v) from the heap
        and reheapify using Adjust;
        j := Find(u); k := Find(v);
        if (j ≠ k) then
        {
            i := i + 1;
            t[i,1] := u; t[i,2] := v;
            mincost := mincost + cost[u,v];
            Union(j, k);
        }
    }
    if (i ≠ n-1) then write("no spanning tree");
    else return mincost;
}
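The following C sketch replaces the heap with a qsort of the edge list and uses a simple union-find in place of the Find/Union routines named in the pseudocode. The edge list in main is the example graph traced below; the endpoints of the two rejected heavier edges (costs 24 and 28) are assumed from the standard version of this example.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int u, v, cost; } Edge;

static int parent[8];                 /* union-find over vertices 1..7 */

static int find(int x) { while (parent[x] != x) x = parent[x]; return x; }

static int cmp(const void *a, const void *b)
{
    return ((const Edge *)a)->cost - ((const Edge *)b)->cost;
}

/* Kruskal: sort edges by cost (in place of the heap), then accept each
   edge whose endpoints lie in different trees. Returns the MST cost,
   or -1 if the graph is not connected. */
int kruskal(Edge e[], int m, int n)
{
    int mincost = 0, taken = 0;
    for (int i = 1; i <= n; i++) parent[i] = i;   /* each vertex its own set */
    qsort(e, m, sizeof(Edge), cmp);
    for (int i = 0; i < m && taken < n - 1; i++) {
        int j = find(e[i].u), k = find(e[i].v);
        if (j != k) {                             /* no cycle: accept the edge */
            parent[j] = k;                        /* Union(j, k) */
            mincost += e[i].cost;
            taken++;
            printf("accept edge (%d,%d) cost %d\n", e[i].u, e[i].v, e[i].cost);
        }
    }
    return (taken == n - 1) ? mincost : -1;
}

int main(void)
{
    /* The example graph traced below (vertices 1..7). */
    Edge e[] = { {1,2,28},{1,6,10},{2,3,16},{2,7,14},{3,4,12},
                 {4,5,22},{4,7,18},{5,6,25},{5,7,24} };
    printf("MST cost = %d\n", kruskal(e, 9, 7));  /* prints: MST cost = 99 */
    return 0;
}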


Example of Kruskal's Algorithm

[Figure: example weighted graph on vertices 1 to 7]

We need to construct a heap of the edge costs. The initial array is
{28, 16, 12, 22, 25, 10, 14, 24, 18}
and after Heapify the array becomes
A = {28, 25, 14, 24, 22, 10, 12, 16, 18}.

The initial sets are
{1} {2} {3} {4} {5} {6} {7}

The minimum spanning tree using Kruskal's algorithm is formed as follows.

The minimum-cost edge is (1,6) with weight 10: j = Find(1) = {1}, k = Find(6) = {6}, so the edge is accepted; i = 1 and mincost = 10.
The sets become {1 6} {2} {3} {4} {5} {7}.

Next smallest weight is 12, edge (3,4): j = {3}, k = {4}; accepted, i = 2, mincost = 10 + 12 = 22.
The sets become {1 6} {2} {3 4} {5} {7}.

Next smallest weight is 14, edge (2,7): j = {2}, k = {7}; accepted, i = 3, mincost = 22 + 14 = 36.
The sets become {1 6} {2 7} {3 4} {5}.

Next smallest cost is 16, edge (2,3): j = {2 7}, k = {3 4}; accepted, i = 4, mincost = 36 + 16 = 52.
The sets become {1 6} {2 3 4 7} {5}.

Next smallest cost is 18, edge (4,7). Since 4 and 7 are already in the same set, this edge would form a cycle and is discarded. We select the next smallest, 22, edge (4,5): j = {2 3 4 7}, k = {5}; accepted, i = 5, mincost = 52 + 22 = 74.
The sets become {1 6} {2 3 4 5 7}.

Next smallest cost is 24, but that edge would also form a cycle, so we select the next smallest, 25, edge (5,6): j = {2 3 4 5 7}, k = {1 6}; accepted, i = 6, mincost = 74 + 25 = 99.

So the MST found by Kruskal's algorithm has mincost = 99, and the final set is
{1 2 3 4 5 6 7}

JOB SEQUENCING WITH DEADLINES
Problem:
We are given a set of n jobs. Associated with job i are an integer deadline di > 0 and a profit pi > 0. For any job i, the profit pi is earned iff the job is completed by its deadline. To complete a job, one has to process it for one unit of time, and only one machine is available for processing jobs.

Given: n jobs 1, 2, ..., n, each taking one unit of time; deadlines di > 0; profits pi > 0; one machine available.
Find: a subset J ⊆ {1, 2, ..., n}.
Feasibility: the jobs in J can all be completed by their deadlines.
Optimality: maximize Σ_{i∈J} pi.

Example:
Let n = 4, (p1, p2, p3, p4) = (100, 10, 15, 27) and (d1, d2, d3, d4) = (2, 1, 2, 1).
The feasible solutions and their values are:

Feasible solution   Processing sequence   Value (profit)
(1, 2)              2, 1                  100 + 10 = 110
(1, 3)              1, 3 or 3, 1          100 + 15 = 115
(1, 4)              4, 1                  100 + 27 = 127
(2, 3)              2, 3                  10 + 15 = 25
(3, 4)              4, 3                  15 + 27 = 42
(1)                 1                     100
(2)                 2                     10
(3)                 3                     15
(4)                 4                     27

Solution (1, 4) is optimal, with value 127.

Algorithm GreedyJob(d, J, n)
// J is the set of jobs that can be completed by their deadlines.
{
    J := {1};
    for i := 2 to n do
    {
        if (all jobs in J ∪ {i} can be completed by their deadlines)
            then J := J ∪ {i};
    }
}

Algorithm JS(d, J, n)
// d[i] ≥ 1, 1 ≤ i ≤ n, are the deadlines, n ≥ 1. The jobs are ordered
// such that p[1] ≥ p[2] ≥ ... ≥ p[n]. J[i] is the ith job in an
// optimal solution, 1 ≤ i ≤ k.
{
    d[0] := J[0] := 0; // initialization
    J[1] := 1; // include job 1
    k := 1;
    for i := 2 to n do
    { // consider jobs in nonincreasing order of p[i];
      // find a position for i and check feasibility of insertion
        r := k;
        while ((d[J[r]] > d[i]) and (d[J[r]] ≠ r)) do r := r - 1;
        if ((d[J[r]] ≤ d[i]) and (d[i] > r)) then
        { // insert i into J[]
            for q := k to (r+1) step -1 do J[q+1] := J[q];
            J[r+1] := i;
            k := k + 1;
        }
    }
    return k;
}
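A C sketch of algorithm JS follows. It assumes the jobs are already sorted by nonincreasing profit and uses 1-based arrays with d[0] = 0 as the sentinel, mirroring the pseudocode; the driver applies it to the example data below, with jobs reordered by profit as 1, 4, 3, 2.

#include <stdio.h>

/* Job sequencing with deadlines (algorithm JS). Jobs are assumed sorted
   so that p[1] >= p[2] >= ... >= p[n]; arrays are 1-based and index 0
   holds a sentinel job with deadline 0. Returns k, the number of jobs
   in the solution, stored in J[1..k] in processing order. */
int js(const int d[], int J[], int n)
{
    int k = 1;
    J[0] = 0;                 /* sentinel: d[J[0]] = d[0] = 0 */
    J[1] = 1;                 /* job 1 (highest profit) is always taken */
    for (int i = 2; i <= n; i++) {
        int r = k;
        /* Slide left past scheduled jobs with later deadlines that cannot move. */
        while (d[J[r]] > d[i] && d[J[r]] != r) r--;
        if (d[J[r]] <= d[i] && d[i] > r) {      /* feasible: insert after position r */
            for (int q = k; q >= r + 1; q--) J[q + 1] = J[q];
            J[r + 1] = i;
            k++;
        }
    }
    return k;
}

int main(void)
{
    /* Example: profits (100,10,15,27) sorted to (100,27,15,10), i.e. the
       original jobs 1, 4, 3, 2; d[] holds their deadlines in that order. */
    int d[] = {0, 2, 1, 2, 1};
    int J[6];
    int k = js(d, J, 4);
    printf("%d jobs scheduled (sorted-order indices):", k);
    for (int i = 1; i <= k; i++) printf(" %d", J[i]);
    printf("\n");   /* prints indices 2 1, i.e. original jobs 4 then 1, profit 127 */
    return 0;
}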

Algorithm:
Step 1: Sort the jobs into nonincreasing order of pi. After sorting, p1 ≥ p2 ≥ ... ≥ pn.
Step 2: Add the next job i to the solution set if i can be completed by its deadline. Assign i to time slot [r-1, r], where r is the largest integer such that 1 ≤ r ≤ di and [r-1, r] is free.
Step 3: Stop if all jobs are examined. Otherwise, go to Step 2.
Time complexity: O(n²).
Example:
Let n= 4,
(p1,p2,p3,p4)=(100,10, 15, 27)
(d1,d2,d3,d4)=(2,1,2,1)
