
SHRI RAMSWAROOP MEMORIAL UNIVERSITY
Deva Road, Lucknow, U.P.
www.srmu.ac.in

LAB MANUAL
Artificial Intelligence
Table of Contents
Lab Activity 1
Lab Activity 2
Lab Activity 3
Lab Activity 4
Lab Activity 5
Lab Activity 6-7
Lab Activity 8
Lab Activity 9
Lab Activity 10
References
Lab Activity 1
Objective: To introduce Prolog fundamentals: constants, predicates, arguments, and variables.

Prolog Syntax
There are four kinds of term in Prolog: atoms, numbers, variables, and complex terms (or structures). Atoms and numbers are lumped together under the heading constants, and constants and variables together make up the simple terms of Prolog.
Let's take a closer look. To make things crystal clear, let's first be precise about the basic characters (that is, symbols) at our disposal. The upper-case letters are A, B, …, Z; the lower-case letters are a, b, …, z; the digits are 0, 1, 2, …, 9. In addition we have the _ symbol, which is called the underscore, and some special characters, which include +, -, *, /, <, >, =, :, ., &, and ~. The blank space is also a character, but a rather unusual one, being invisible. A string is an unbroken sequence of characters.
Atoms
An atom is either:
1. A string of characters made up of upper-case letters, lower-case letters, digits, and the underscore character, that begins with a lower-case letter. Here are some examples: butch, big_kahuna_burger, listens2Music and playsAirGuitar.
2. An arbitrary sequence of characters enclosed in single quotes. For example 'Vincent', 'The Gimp', 'Five_Dollar_Shake', '&^%&#@$ &*', and ' '. The sequence of characters between the single quotes is called the atom name. Note that we are allowed to use spaces in such atoms; in fact, a common reason for using single quotes is so we can do precisely that.
3. A string of special characters. Here are some examples: @=, ====>, ;, and :- are all atoms. As we have seen, some of these atoms, such as ; and :-, have a pre-defined meaning.
Numbers
Real numbers aren't particularly important in typical Prolog applications. So although most Prolog implementations do support floating point numbers or floats (that is, representations of real numbers such as 1657.3087 or π), we say little about them here.
But integers (that is: …, -2, -1, 0, 1, 2, 3, …) are useful for such tasks as counting the elements of a list, and we'll discuss how to manipulate them later. Their Prolog syntax is the obvious one: 23, 1001, 0, -365, and so on.
Variables
A variable is a string of upper-case letters, lower-case letters, digits and underscore characters that starts either with an upper-case letter or with an underscore. For example, X, Y, Variable, _tag, X_526, List, List24, _head, Tail, _input and Output are all Prolog variables.
The variable _ (that is, a single underscore character) is rather special. It's called the anonymous variable, and we discuss it later.
Complex terms
Constants (atoms and numbers) and variables are the building blocks: now we need to know how to fit them together to make complex terms. Recall that complex terms are often called structures.
Complex terms are built out of a functor followed by a sequence of arguments. The arguments are put in ordinary parentheses, separated by commas, and placed after the functor. Note that the functor has to be directly followed by the parenthesis; you can't have a space between the functor and the parenthesis enclosing the arguments. The functor must be an atom. That is, variables cannot be used as functors. On the other hand, arguments can be any kind of term.
Now, we've already seen lots of examples of complex terms when we looked at the knowledge bases KB1 to KB5. For example, playsAirGuitar(jody) is a complex term: its functor is playsAirGuitar and its argument is jody. Other examples are loves(vincent,mia) and, to give an example containing a variable, jealous(marsellus,W).
But the definition allows for more complex terms than this. In fact, it allows us to keep nesting complex terms inside complex terms indefinitely (that is, it allows recursive structure). For example
hide(X,father(father(father(butch))))

is a perfectly acceptable complex term. Its functor is hide, and it has two arguments: the variable X, and the complex term father(father(father(butch))). This complex term has father as its functor, and another complex term, namely father(father(butch)), as its sole argument. And the argument of this complex term, namely father(butch), is also complex. But then the nesting bottoms out, for the argument here is the constant butch.
As we shall see, such nested (or recursively structured) terms enable us to represent many problems naturally. In fact
the interplay between recursive term structure and variable unification is the source of much of Prolog’s power.
The number of arguments that a complex term has is called its arity. For example, woman(mia) is a complex term of
arity 1, and loves(vincent,mia) is a complex term of arity 2.
Arity is important to Prolog. Prolog would be quite happy for us to define two predicates with the same functor but
with a different number of arguments. For example, we are free to define a knowledge base that defines a two-place
predicate love (this might contain such facts as love(vincent,mia) ), and also a three-place love predicate (which
might contain such facts as love(vincent,marsellus,mia) ). However, if we did this, Prolog would treat the two-
place love and the three-place love as different predicates. Later (for example, when we introduce accumulators) we shall see that it can be useful to define two predicates with the same functor but different arity.
When we need to talk about predicates and how we intend to use them (for example, in documentation) it is usual to
use a suffix / followed by a number to indicate the predicate’s arity. To return to KB2, instead of saying that it
defines predicates
listens2Music
happy
playsAirGuitar

we should really say that it defines the predicates

listens2Music/1
happy/1
playsAirGuitar/1

And Prolog can’t get confused about a knowledge base containing the two different love predicates, for it regards
the love/2 predicate and the love/3 predicate as distinct.
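To see this concretely, here is a minimal sketch of such a knowledge base and a short query session (the facts are illustrative, invented for this example):

love(vincent, mia).              % a love/2 fact
love(marsellus, mia).            % another love/2 fact
love(vincent, marsellus, mia).   % a love/3 fact: a distinct predicate

?- love(vincent, mia).           % queries love/2 only
true.

?- love(X, Y, mia).              % queries love/3 only
X = vincent,
Y = marsellus.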

Lab Activity 2
Objective: To introduce generate-and-test and backtracking.

Backtracking is a powerful feature of Prolog that simplifies the development of many programs. It enables the program to try another alternative if the previous one fails. Thus, programming generate-and-test algorithms is natural in Prolog. It is also usually possible to find one solution, or all solutions, using the same program.

Suppose that we have the following database

eats(fred,pears).
eats(fred,t_bone_steak).
eats(fred,apples).
So far we have only been able to ask whether fred eats specific things. Suppose instead that we wish to answer the question, "What are all the things that fred eats?" To answer this we can use a variable again, typing the query
?- eats(fred,FoodItem).

As we have seen earlier, Prolog will answer with

FoodItem = pears

This is because it has found the first clause in the database. At this point Prolog allows us to ask if there are other
possible solutions. When we do so we get the following.

FoodItem = t_bone_steak

If we ask for another solution, Prolog then gives us

FoodItem = apples

If we ask for further solutions, Prolog will answer no, since there are only three ways to prove that fred eats something. The mechanism for finding multiple solutions is called backtracking. This is an essential mechanism in Prolog and we shall see more of it later.
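The same database also lets us collect every solution in one go with the standard built-in findall/3; a sketch of such a session:

?- findall(FoodItem, eats(fred, FoodItem), Foods).
Foods = [pears, t_bone_steak, apples].

Internally, findall/3 uses exactly the backtracking described above, gathering each successive binding of FoodItem into the list Foods.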

Backtracking in Rules
We can also have backtracking in rules. For example, consider the following program.

hold_party(X) :-
    birthday(X),
    happy(X).

birthday(tom).
birthday(fred).
birthday(helen).

happy(mary).
happy(jane).
happy(helen).
If we now pose the query

?- hold_party(Who).

In order to solve the above, Prolog first attempts to find a clause of birthday, it being the first subgoal of hold_party. This binds X to tom. We then attempt the goal happy(tom). This will fail, since it doesn't match the above database. As a result, Prolog backtracks. This means that Prolog goes back to its last choice point and sees if there is an alternative solution. In this case, this means going back and attempting to find another clause of birthday. This time we can use clause two, binding X to fred. This then causes us to try the goal happy(fred). Again this fails to match our database. As a result, we backtrack again. This time we find clause three of birthday, bind X to helen, and attempt the goal happy(helen). This goal matches clause three of our happy database. As a result, hold_party will succeed with X = helen.
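A sketch of the corresponding interactive session (output formatting follows SWI-Prolog; other systems differ slightly):

?- hold_party(Who).
Who = helen.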

Lab Activity 3
Objective: To introduce recursion in Prolog.

Now that we know something about what recursion in Prolog involves, it is time to ask why it is so important.
Actually, this is a question that can be answered on a number of levels, but for now, let’s keep things fairly practical.
So: when it comes to writing useful Prolog programs, are recursive definitions really so important? And if so, why?
Let’s consider an example. Suppose we have a knowledge base recording facts about the child relation:
child(bridget,caroline).
child(caroline,donna).

That is, Caroline is a child of Bridget, and Donna is a child of Caroline. Now suppose we wished to define the descendant relation; that is, the relation of being a child of, or a child of a child of, or a child of a child of a child of, and so on. Here's a first attempt to do this. We could add the following two non-recursive rules to the knowledge base:

descend(X,Y) :- child(X,Y).

descend(X,Y) :- child(X,Z),
                child(Z,Y).
Now, fairly obviously these definitions work up to a point, but they are clearly limited: they only define the concept
of descendant-of for two generations or less. That’s ok for the above knowledge base, but suppose we get some more
information about the child-of relation and we expand our list of child-of facts to this:
child(anne,bridget).
child(bridget,caroline).
child(caroline,donna).
child(donna,emily).
Now our two rules are inadequate. For example, if we pose the queries
?- descend(anne,donna).
or
?- descend(bridget,emily).
we get the answer no, which is not what we want. Sure, we could ‘fix’ this by adding the following two rules:
descend(X,Y) :- child(X,Z_1),
                child(Z_1,Z_2),
                child(Z_2,Y).

descend(X,Y) :- child(X,Z_1),
                child(Z_1,Z_2),
                child(Z_2,Z_3),
                child(Z_3,Y).

But, let's face it, this is clumsy and hard to read. Moreover, if we add further child-of facts, we could easily find ourselves having to add more and more rules as our list of child-of facts grows, rules like:

descend(X,Y) :- child(X,Z_1),
                child(Z_1,Z_2),
                child(Z_2,Z_3),
                ...
                child(Z_17,Z_18),
                child(Z_18,Z_19),
                child(Z_19,Y).
This is not a particularly pleasant (or sensible) way to go!
But we don’t need to do this at all. We can avoid having to use ever longer rules entirely. The following recursive
predicate definition fixes everything exactly the way we want:
descend(X,Y) :- child(X,Y).

descend(X,Y) :- child(X,Z),
                descend(Z,Y).
What does this say? The declarative meaning of the base clause is: if Y is a child of X , then Y is a descendant of X .
Obviously sensible. So what about the recursive clause? Its declarative meaning is: if Z is a child of X , and Y is a
descendant of Z , then Y is a descendant of X . Again, this is obviously true.
So let’s now look at the procedural meaning of this recursive predicate, by stepping through an example. What
happens when we pose the query:
descend(anne,donna)
Prolog first tries the first rule. The variable X in the head of the rule is unified with anne and Y with donna and the
next goal Prolog tries to prove is
child(anne,donna)
This attempt fails, however, since the knowledge base neither contains the fact child(anne,donna) nor any rules that
would allow to infer it. So Prolog backtracks and looks for an alternative way of proving descend(anne,donna) . It
finds the second rule in the knowledge base and now has the following subgoals:
child(anne,_633),
descend(_633,donna).
Prolog takes the first subgoal and tries to unify it with something in the knowledge base. It finds the
fact child(anne,bridget) and the variable _633 gets instantiated to bridget . Now that the first subgoal is satisfied,
Prolog moves to the second subgoal. It has to prove
descend(bridget,donna)
This is the first recursive call of the predicate descend/2 . As before, Prolog starts with the first rule, but fails,
because the goal
child(bridget,donna)
cannot be proved. Backtracking, Prolog finds that there is a second possibility to be checked
for descend(bridget,donna) , namely the second rule, which again gives Prolog two new subgoals:
child(bridget,_1785),
descend(_1785,donna).
The first one can be unified with the fact child(bridget,caroline) of the knowledge base, so that the variable _1785 is
instantiated with caroline . Next Prolog tries to prove
descend(caroline,donna).
This is the second recursive call of predicate descend/2 . As before, it tries the first rule first, obtaining the following
new goal:
child(caroline,donna)
This time Prolog succeeds, since child(caroline,donna) is a fact in the database. Prolog has found a proof for the
goal descend(caroline,donna) (the second recursive call). But this means that descend(bridget,donna) (the first
recursive call) is also true, which means that our original query descend(anne,donna) is true as well.
Drawing the search tree for the query descend(anne,donna) is a useful exercise. Make sure that you understand how it relates to the discussion in the text; that is, how Prolog traverses this search tree when trying to prove this query.

It should be obvious from this example that no matter how many generations of children we add, we will always be
able to work out the descendant relation. That is, the recursive definition is both general and compact: it
contains all the information in the non-recursive rules, and much more besides. The non-recursive rules only defined
the descendant concept up to some fixed number of generations: we would need to write down infinitely many non-
recursive rules if we wanted to capture this concept fully, and of course that’s impossible. But, in effect, that’s what
the recursive rule does for us: it bundles up the information needed to cope with arbitrary numbers of generations
into just three lines of code.
Recursive rules are really important. They enable us to pack an enormous amount of information into a compact form and to define predicates in a natural way. Most of the work you will do as a Prolog programmer will involve writing recursive rules.
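As a quick check, the recursive definition also enumerates every descendant through backtracking. With the four child/2 facts above, a session would look roughly like this (answer order follows clause order):

?- descend(anne,X).
X = bridget ;
X = caroline ;
X = donna ;
X = emily ;
false.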
Lab Activity 4
Objective: To implement DFS and BFS (state-space search).

The depth-first search receives its name from the fact that it continually expands a path: the distance between the last node and the start node gets increasingly larger; the search gets deeper. If necessary, backtracking allows parts of the path to be retraced so that alternative turn-offs can be followed. A tree would be searched branch by branch (from top to bottom).

In this part, we will implement a depth-first search that is more independent of the Prolog prover, as it saves alternative paths in a so-called agenda. This 'memory' makes it more independent because it can find alternatives without using backtracking and can select successor paths more independently of the prover's way of choosing clauses.

The agenda acts as a memory for paths that sooner or later have to be followed. The agenda is a list of paths. The depth-first search algorithm takes the first path out of the agenda and checks whether the path ends with the destination node. If so, the search has been successful. If not, all successor paths, i.e. all possibilities of going one node further from the current node, are generated. All successor paths that contain cycles are eliminated. The remaining paths are put at the beginning of the agenda. Then the algorithm starts again: the first path is taken out of the agenda, expanded and checked. As not all paths can always be expanded, the agenda becomes emptier with time. If the agenda becomes completely empty, then the whole graph has been searched without a path having been found from the start node to the destination node. The agenda is initialised with the search's start node as a path of length 1. The agenda keeps a list of (partial) paths; this work was previously (in the programs up to now) carried out by the prover.

The depth-first search algorithm can be characterised as follows. At the beginning of the search, the agenda is initialised with the start node, which forms a path of length 1.

If the agenda is empty, the search is exhausted: all possible paths have been explored.
If the agenda is not empty, the first path is taken out of it.
If this path leads to the chosen destination: show the result path.
If this path does not lead to the chosen destination, or if further solutions are required:
    generate all successor paths that do not contain cycles,
    insert these successor paths at the front of the agenda,
    continue the search with the new agenda.

A Prolog sketch of this procedure is given below, followed by a protocol of the search.
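Here is a minimal Prolog sketch of this agenda-based depth-first search, written in the same style (and with the same German predicate names) as the breadth-first program given later. The edge relation connected/2 and the definition of expandiere/2 are assumptions filled in for illustration; only the fragment of the graph needed for the protocol below is encoded:

% Assumed edge facts (partial graph, directed as the protocol suggests).
connected(a, e).   connected(a, b).
connected(e, d).   connected(e, b).   connected(e, h).
connected(d, c).   connected(d, b).   connected(d, i).

% expandiere(+Path, -SuccessorPaths): extend a path (stored in reverse,
% newest node first) by one node in every possible way, discarding
% extensions that would revisit a node already on the path (cycles).
expandiere([Node|Rest], SuccessorPaths) :-
    findall([Next, Node|Rest],
            ( connected(Node, Next),
              \+ member(Next, [Node|Rest]) ),
            SuccessorPaths).

tiefensuche(Start, Ziel, Weg) :-
    ts([[Start]], Ziel, Gew),
    reverse(Gew, Weg).

% Success: the first path on the agenda ends at the destination node.
ts([[Ziel|R]|_RestWege], Ziel, [Ziel|R]).
% Otherwise: expand the first path and put its successor paths at the
% FRONT of the agenda; this is what makes the search depth-first.
ts([Weg1|RestWege], Ziel, Weg) :-
    expandiere(Weg1, NachfolgeWege),
    append(NachfolgeWege, RestWege, NeueAgenda),
    ts(NeueAgenda, Ziel, Weg).

A call such as ?- tiefensuche(a, c, Weg). then reproduces the protocol below, answering Weg = [a, e, d, c].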
Protocol of depth-first search(a,c,Weg).

Depth-first search from a to c. Initialise agenda: [[a]]


Step 1

Contents of the agenda: [ [a] ]


Obtain and remove the first path from the agenda: [a].
The successor paths of this path are: [ [e, a], [b, a] ]
Append the successor paths to the front of the agenda.
The new agenda is: [ [e, a], [b, a] ]
Continue search using the new agenda.
Step 2
Contents of the agenda: [ [e, a], [b, a] ]
Obtain and remove the first path from the agenda: [e, a].
The successor paths of this path are: [ [d, e, a], [b, e, a], [h, e, a] ]
Append the successor paths to the front of the agenda.
The new agenda is: [ [d, e, a], [b, e, a], [h, e, a], [b, a] ]
Continue search using the new agenda.
Step 3

Contents of the agenda: [ [d, e, a], [b, e, a], [h, e, a], [b, a] ]
Obtain and remove the first path from the agenda: [d, e, a].
The successor paths of this path are: [ [c, d, e, a], [b, d, e, a], [i, d, e, a] ]
Append the successor paths to the front of the agenda.
The new agenda is: [ [c, d, e, a], [b, d, e, a], [i, d, e, a], [b, e, a], [h, e, a], [b, a] ]
Continue search using the new agenda.
Step 4

The path [c, d, e, a] reaches the destination node c.

Step 5

Contents of the agenda: [ [c, d, e, a], [b, d, e, a], [i, d, e, a], [b, e, a], [h, e, a], [b, a] ]
Obtain and remove the first path from the agenda: [c, d, e, a].
The successor paths of this path are: [ ]
Append the successor paths to the front of the agenda.
The new agenda is: [ [b, d, e, a], [i, d, e, a], [b, e, a], [h, e, a], [b, a] ]
Continue search using the new agenda.

BFS:
The breadth-first search starts by examining all paths that lead from the start node with length 0, then with length 1, ..., and finally all paths of length n. A tree would thus be searched widthways. Hence the name, and the feature that this search - measured in nodes - always finds the shortest path first.
The depth-first search above can easily be transformed into a breadth-first search. The order of the paths in the agenda has to be changed so that the paths are sorted according to length, with the shortest paths at the front of the agenda. This ordering can be brought about simply by appending the expanded paths to the end of the agenda (instead of to the front, as with the depth-first search).

The breadth-first search program is the same as the depth-first search program with the exception that the first two
arguments of the append-call have to be swapped:

breitensuche(Start, Ziel, Weg) :-
    bs([[Start]], Ziel, Gew),
    reverse(Gew, Weg).

bs([[Ziel|R]|_RestWege], Ziel, [Ziel|R]).

bs([Weg1|RestWege], Ziel, Weg) :-
    expandiere(Weg1, NachfolgeWege),
    append(RestWege, NachfolgeWege, NeueAgenda),
    bs(NeueAgenda, Ziel, Weg).
Example call:

?- breitensuche(d,g,Weg).

Weg = [d, i, h, g] ;

Weg = [d, e, h, g] ;

Weg = [d, b, e, h, g] ;

Weg = [d, b, a, e, h, g] ;

Weg = [d, i, h, k, n, l, g] ;

Weg = [d, e, h, k, n, l, g]

Yes

Protocol of breitensuche(a,c,Weg).

Breadth-first search from a to c. Initialise agenda: [[a]]


Step 1
Contents of the agenda: [ [a] ]
Obtain and remove the first path from the agenda: [a].
The successor paths of this path are: [ [e, a], [b, a] ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [e, a], [b, a] ]
Continue search using the new agenda.
Step 2
Contents of the agenda: [ [e, a], [b, a] ]
Obtain and remove the first path from the agenda: [e, a].
The successor paths of this path are: [ [d, e, a], [b, e, a], [h, e, a] ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [b, a], [d, e, a], [b, e, a], [h, e, a] ]
Continue search using the new agenda.
Step 3
Contents of the agenda: [ [b, a], [d, e, a], [b, e, a], [h, e, a] ]
Obtain and remove the first path from the agenda: [b, a].
The successor paths of this path are: [ [d, b, a], [e, b, a] ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [d, e, a], [b, e, a], [h, e, a], [d, b, a], [e, b, a] ]
Continue search using the new agenda.
Step 4
Contents of the agenda: [ [d, e, a], [b, e, a], [h, e, a], [d, b, a], [e, b, a] ]
Obtain and remove the first path from the agenda: [d, e, a].
The successor paths of this path are: [ [c, d, e, a], [b, d, e, a], [i, d, e, a] ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [b, e, a], [h, e, a], [d, b, a], [e, b, a], [c, d, e, a], [b, d, e, a], [i, d, e, a] ]
Continue search using the new agenda.
Step 5
Contents of the agenda: [ [b, e, a], [h, e, a], [d, b, a], [e, b, a], [c, d, e, a], [b, d, e, a], [i, d, e, a] ]
Obtain and remove the first path from the agenda: [b, e, a].
The successor paths of this path are: [ [d, b, e, a] ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [h, e, a], [d, b, a], [e, b, a], [c, d, e, a], [b, d, e, a], [i, d, e, a], [d, b, e, a] ]
Continue search using the new agenda.
Step 6
Contents of the agenda: [ [h, e, a], [d, b, a], [e, b, a], [c, d, e, a], [b, d, e, a], [i, d, e, a], [d, b, e, a] ]
Obtain and remove the first path from the agenda: [h, e, a].
The successor paths of this path are: [ [g, h, e, a], [k, h, e, a], [j, h, e, a], [i, h, e, a] ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [d, b, a], [e, b, a], [c, d, e, a], [b, d, e, a], [i, d, e, a], [d, b, e, a], [g, h, e, a], [k, h, e, a],
[j, h, e, a], [i, h, e, a] ]
Continue search using the new agenda.
Step 7
Contents of the agenda: [ [d, b, a], [e, b, a], [c, d, e, a], [b, d, e, a], [i, d, e, a], [d, b, e, a], [g, h, e, a], [k, h, e,
a], [j, h, e, a], [i, h, e, a] ]
Obtain and remove the first path from the agenda: [d, b, a].
The successor paths of this path are: [ [c, d, b, a], [i, d, b, a], [e, d, b, a] ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [e, b, a], [c, d, e, a], [b, d, e, a], [i, d, e, a], [d, b, e, a], [g, h, e, a], [k, h, e, a], [j, h, e,
a], [i, h, e, a], [c, d, b, a], [i, d, b, a], [e, d, b, a] ]
Continue search using the new agenda.
Step 8
Contents of the agenda: [ [e, b, a], [c, d, e, a], [b, d, e, a], [i, d, e, a], [d, b, e, a], [g, h, e, a], [k, h, e, a], [j, h,
e, a], [i, h, e, a], [c, d, b, a], [i, d, b, a], [e, d, b, a] ]
Obtain and remove the first path from the agenda: [e, b, a].
The successor paths of this path are: [ [d, e, b, a], [h, e, b, a] ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [c, d, e, a], [b, d, e, a], [i, d, e, a], [d, b, e, a], [g, h, e, a], [k, h, e, a], [j, h, e, a], [i, h, e,
a], [c, d, b, a], [i, d, b, a], [e, d, b, a], [d, e, b, a], [h, e, b, a] ]
Continue search using the new agenda.
Step 9
The path [c, d, e, a] reaches the destination node c.
Step 10
Contents of the agenda: [ [c, d, e, a], [b, d, e, a], [i, d, e, a], [d, b, e, a], [g, h, e, a], [k, h, e, a], [j, h, e, a], [i,
h, e, a], [c, d, b, a], [i, d, b, a], [e, d, b, a], [d, e, b, a], [h, e, b, a] ]
Obtain and remove the first path from the agenda: [c, d, e, a].
The successor paths of this path are: [ ]
Append the successor paths to the end of the agenda.
The new agenda is: [ [b, d, e, a], [i, d, e, a], [d, b, e, a], [g, h, e, a], [k, h, e, a], [j, h, e, a], [i, h, e, a], [c, d, b,
a], [i, d, b, a], [e, d, b, a], [d, e, b, a], [h, e, b, a] ]
Continue search using the new agenda.
Lab Activity 5
Objective: To implement supervised learning on the IRIS dataset using a Bayes classifier.

A Bayesian classifier is based on the idea that the role of a (natural) class is to predict the values of features for
members of that class. Examples are grouped in classes because they have common values for the features. Such
classes are often called natural kinds. In this section, the target feature corresponds to a discrete class, which is not
necessarily binary.
The idea behind a Bayesian classifier is that, if an agent knows the class, it can predict the values of the other
features. If it does not know the class, Bayes' rule can be used to predict the class given (some of) the feature values.
In a Bayesian classifier, the learning agent builds a probabilistic model of the features and uses that model to predict
the classification of a new example.
A latent variable is a probabilistic variable that is not observed. A Bayesian classifier is a probabilistic model where the classification is a latent variable that is probabilistically related to the observed variables. Classification then becomes inference in the probabilistic model.
The simplest case is the naive Bayesian classifier, which makes the independence assumption that the input features
are conditionally independent of each other given the classification. The independence of the naive Bayesian
classifier is embodied in a particular belief network where the features are the nodes, the target variable (the
classification) has no parents, and the classification is the only parent of each input feature. This belief network
requires the probability distributions P(Y) for the target feature Y and P(Xi|Y) for each input feature Xi. For each
example, the prediction can be computed by conditioning on observed values for the input features and by querying
the classification.
Given an example with inputs $X_1{=}v_1, \ldots, X_k{=}v_k$, Bayes' rule is used to compute the posterior probability distribution of the example's classification, $Y$:

$$P(Y \mid X_1{=}v_1,\ldots,X_k{=}v_k) = \frac{P(X_1{=}v_1,\ldots,X_k{=}v_k \mid Y)\; P(Y)}{P(X_1{=}v_1,\ldots,X_k{=}v_k)} = \frac{P(X_1{=}v_1 \mid Y)\cdots P(X_k{=}v_k \mid Y)\; P(Y)}{\sum_{Y} P(X_1{=}v_1 \mid Y)\cdots P(X_k{=}v_k \mid Y)\; P(Y)}$$

where the second equality uses the naive independence assumption, and the denominator is a normalizing constant that ensures the probabilities sum to 1. The denominator does not depend on the class and, therefore, is not needed to determine the most likely class.
To learn a classifier, the distributions P(Y) and P(Xi|Y) for each input feature can be learned from the data. The simplest case is to use the empirical frequency in the training data as the probability (i.e., use the proportion in the training data as the probability). However, as shown below, this approach is often not a good idea when it results in zero probabilities.
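A common remedy for such zero probabilities (not prescribed by this manual; stated here as a standard refinement) is Laplace smoothing, which adds one pseudo-count to every feature-value/class combination. With $n_c$ training examples of class $c$ and $|V_i|$ possible values of feature $X_i$:

$$P(X_i = v \mid Y = c) = \frac{\mathrm{count}(X_i = v,\ Y = c) + 1}{n_c + |V_i|}$$

This keeps every conditional probability strictly positive, so a single unseen feature value can no longer force a class posterior to zero.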

IRIS DATASET: This is perhaps the best known database to be found in the pattern recognition literature. Fisher's
paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set
contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable
from the other 2; the latter are NOT linearly separable from each other.

Predicted attribute: class of iris plant.

This is an exceedingly simple domain.

This data differs from the data presented in Fisher's article (as identified by Steve Chadwick, spchadwick '@' espeedaz.net). The 35th sample should be: 4.9,3.1,1.5,0.2,"Iris-setosa", where the error is in the fourth feature. The 38th sample should be: 4.9,3.6,1.4,0.1,"Iris-setosa", where the errors are in the second and third features.
Pseudocode:
Training:
1. Import data from iris.txt to the Matlab workspace.
   A = importdata('iris.txt', '\t');
2. Divide the data into a training set and a testing set.
   training_data = [A(1:35,:); A(51:85,:); A(101:135,:)];
   testing_data  = [A(36:50,:); A(86:100,:); A(136:150,:)];
3. For each feature, find the frequency of each value within each of the three classes.
4. Find the posterior probability of each sample point of the dataset with respect to each class.

Testing:
1. For each feature value of a test sample, look up its posterior probability in the posterior probability matrix obtained during training.
2. Assign the sample to whichever of the three classes has the maximum posterior probability.
3. Finally, compute the accuracy: of the 45 test samples, how many are categorized correctly?

Accuracy = No. of sample points categorized correctly / total no. of samples.

Lab Activity 6-7


Objective: To implement a genetic algorithm to find the optimal solutions of different equations.

The genetic algorithm is a method for solving both constrained and unconstrained optimization problems that is
based on natural selection, the process that drives biological evolution. The genetic algorithm repeatedly modifies a
population of individual solutions. At each step, the genetic algorithm selects individuals at random from the current
population to be parents and uses them to produce the children for the next generation. Over successive generations,
the population "evolves" toward an optimal solution. You can apply the genetic algorithm to solve a variety of
optimization problems that are not well suited for standard optimization algorithms, including problems in which the
objective function is discontinuous, nondifferentiable, stochastic, or highly nonlinear. The genetic algorithm can
address problems of mixed integer programming, where some components are restricted to be integer-valued.
The genetic algorithm uses three main types of rules at each step to create the next generation from the current
population:
• Selection rules select the individuals, called parents, that contribute to the population at the next generation.
• Crossover rules combine two parents to form children for the next generation.
• Mutation rules apply random changes to individual parents to form children.

The genetic algorithm differs from a classical, derivative-based optimization algorithm in two main ways, as summarized below.

Classical Algorithm: Generates a single point at each iteration. The sequence of points approaches an optimal solution.
Genetic Algorithm: Generates a population of points at each iteration. The best point in the population approaches an optimal solution.

Classical Algorithm: Selects the next point in the sequence by a deterministic computation.
Genetic Algorithm: Selects the next population by a computation which uses random number generators.

Pseudo Code:

1. [Start] Generate a random population of n chromosomes (suitable solutions for the problem).
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population.
3. [New population] Create a new population by repeating the following steps until the new population is complete:
   a. [Selection] Select two parent chromosomes from the population according to their fitness (the better the fitness, the bigger the chance of being selected; see the selection formula after this list).
   b. [Crossover] With a crossover probability, cross over the parents to form new offspring (children). If no crossover is performed, the offspring is an exact copy of the parents.
   c. [Mutation] With a mutation probability, mutate the new offspring at each locus (position in the chromosome).
   d. [Accepting] Place the new offspring in the new population.
4. [Replace] Use the newly generated population for a further run of the algorithm.
5. [Test] If the end condition is satisfied, stop, and return the best solution in the current population.
6. [Loop] Go to step 2.
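The fitness-proportionate selection in step 3a is commonly realised as roulette-wheel selection (an assumption here, since the manual does not fix a particular selection scheme): chromosome $i$ is selected with probability

$$p_i = \frac{f_i}{\sum_{j=1}^{n} f_j}$$

so a chromosome with twice the fitness of another is, on average, selected twice as often as a parent.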

Lab Activity 8
Objective: To implement the PCA algorithm for dimensionality reduction.

Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set
of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal
components. The number of principal components is less than or equal to the number of original variables. This
transformation is defined in such a way that the first principal component has the largest possible variance (that is,
accounts for as much of the variability in the data as possible), and each succeeding component in turn has the
highest variance possible under the constraint that it is orthogonal to the preceding components. The resulting
vectors are an uncorrelated orthogonal basis set. PCA is sensitive to the relative scaling of the original variables.

Pseudo code for principal component analysis:

- Take the whole dataset consisting of d-dimensional samples, ignoring the class labels.
- Compute the d-dimensional mean vector (i.e., the means for every dimension of the whole dataset).
- Compute the scatter matrix (alternatively, the covariance matrix) of the whole data set.
- Compute the eigenvectors (e_1, e_2, …, e_d) and corresponding eigenvalues (λ_1, λ_2, …, λ_d).
- Sort the eigenvectors by decreasing eigenvalue and choose the k eigenvectors with the largest eigenvalues to form a d×k-dimensional matrix W (where every column represents an eigenvector).
- Use this d×k eigenvector matrix to transform the samples onto the new subspace. This can be summarised by the equation y = W^T x (where x is a d×1-dimensional vector representing one sample, and y is the transformed k×1-dimensional sample in the new subspace). The formulas below spell these steps out.
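Written out with standard definitions (the symbols follow the list above, with $n$ samples $x_1, \ldots, x_n$), the mean, covariance, eigendecomposition and projection steps are:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \Sigma = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^{T}$$

$$\Sigma\, e_j = \lambda_j e_j, \qquad W = [\, e_1 \ \cdots \ e_k \,], \qquad y = W^{T} x$$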

Lab Activity 9
Objective: To implement the k-NN classification technique.

Categorizing query points based on their distance to points in a training dataset can be a simple yet effective way of classifying new points. Various metrics can be used to determine the distance; in Matlab, pdist2 finds the distances between a set of data points and query points.
k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is among the simplest of all machine learning algorithms.

Pseudo code:

1. Import data from iris.txt to the Matlab workspace.
   A = importdata('iris.txt', '\t');
2. Divide the data into a training set and a testing set.
   training_data = [A(1:35,:); A(51:85,:); A(101:135,:)];
   testing_data  = [A(36:50,:); A(86:100,:); A(136:150,:)];
3. Choose the value of K, which tells how many neighbours we have to consult for voting.
4. For each sample from testing_data:
   a. Calculate its distance [Euclidean distance; see the formula after this list] from every sample point of training_data, keeping track of the class of each training sample as well. The distance matrix contains two columns: one for the distance, another for the class.
   b. Sort the distance matrix in ascending order of distance.
   c. Take the first K rows of the distance matrix and find the mode of their class values.
   d. Assign the test sample to the class with the highest vote count.
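For reference, the Euclidean distance used in step 4a between a test sample $x$ and a training sample $z$, each with $d$ features, is

$$d(x, z) = \sqrt{\sum_{j=1}^{d} (x_j - z_j)^2}$$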

Lab Activity 10
Objective: To implement the k-means clustering technique.

The most common algorithm uses an iterative refinement technique. Due to its ubiquity it is often called the k-means algorithm; it is also referred to as Lloyd's algorithm, particularly in the computer science community.
Given an initial set of k means $m_1^{(1)}, \ldots, m_k^{(1)}$, the algorithm proceeds by alternating between two steps:
Assignment step: Assign each observation to the cluster whose mean yields the least within-cluster sum of squares (WCSS). Since the sum of squares is the squared Euclidean distance, this is intuitively the "nearest" mean. (Mathematically, this means partitioning the observations according to the Voronoi diagram generated by the means.)

$$S_i^{(t)} = \left\{ x_p : \left\| x_p - m_i^{(t)} \right\|^2 \le \left\| x_p - m_j^{(t)} \right\|^2 \ \forall\, j,\ 1 \le j \le k \right\}$$

where each $x_p$ is assigned to exactly one $S^{(t)}$, even if it could be assigned to two or more of them.
Update step: Calculate the new means to be the centroids of the observations in the new clusters:

$$m_i^{(t+1)} = \frac{1}{\left| S_i^{(t)} \right|} \sum_{x_j \in S_i^{(t)}} x_j$$

Since the arithmetic mean is a least-squares estimator, this also minimizes the within-cluster sum of squares (WCSS) objective.
The algorithm has converged when the assignments no longer change. Since both steps optimize the WCSS objective, and only finitely many such partitionings exist, the algorithm must converge to a (local) optimum. There is no guarantee that the global optimum is found using this algorithm.
The algorithm is often presented as assigning objects to the nearest cluster by distance. The standard algorithm aims at minimizing the WCSS objective, and thus assigns by "least sum of squares", which is exactly equivalent to assigning by the smallest squared Euclidean distance. Using a distance function other than (squared) Euclidean distance may stop the algorithm from converging. Various modifications of k-means such as spherical k-means and k-medoids have been proposed to allow the use of other distance measures.
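Explicitly, the WCSS objective that both steps decrease is

$$\mathrm{WCSS} = \sum_{i=1}^{k} \sum_{x \in S_i} \left\| x - m_i \right\|^2$$

where $S_i$ is the set of observations currently assigned to cluster $i$ and $m_i$ is its mean.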

Pseudo Code:

1. Import data from iris.txt to the Matlab workspace.
   A = importdata('iris.txt', '\t');
2. Take the whole dataset, which is to be divided into K clusters.
   data_set = A;
3. Choose the value of K, which tells into how many clusters you want to group your dataset.
4. Place centroids c_1, c_2, …, c_K.
5. Repeat until convergence:
   For each point x_i:
      a. Find the nearest centroid c_j, on the basis of the minimum [Euclidean] distance of the point from among the K centroid points.
      b. Assign the point x_i to cluster j.
   For each cluster j = 1, 2, 3, …, K:
      a. New centroid c_j = mean of all points x_i assigned to cluster j in the previous step.
6. Stop when none of the cluster assignments change.
References

1. https://www.cpp.edu/~jrfisher/www/prolog_tutorial/pt_framer.html
2. https://www.cs.unm.edu/~luger/ai-final/code/PROLOG.depth.html
3. http://en.wikipedia.org/wiki/K-means_clustering
4. http://in.mathworks.com/help/gads/what-is-the-genetic-algorithm.html
5. http://cogsci.uni-osnabrueck.de/cogsci/dirs/_Vorlesungsmaterialien/_Prolog/_english/node157.html
6. http://www.obitko.com/tutorials/genetic-algorithms/ga-basic-description.php
