
Rapid Miner Tutorial Creation Assignment for Data Mining
Objective:

The objective of this assignment is to deepen your understanding of how to
extract information from databases (of your choice). You must use the data
set you have already selected; make sure that I receive its details through
your CR by the end of this week. Data sets are available at the University
of California, Irvine (UCI) repository
(http://archive.ics.uci.edu/ml/datasets.html).

Goal of this exercise:


Understanding and learning data analysis tools
Gaining hands-on experience in extracting information from databases
Developing an understanding of how to make a system more intelligent, and thinking about its deployment
Improving your creativity and writing skills

Activity

You are assigned ONLY ONE data set and you have to work on that data.
Start with preprocessing the data; after preprocessing, implement
K-Nearest Neighbor, Naïve Bayes Classification, Decision Trees and
Association Rules. You are free to demonstrate your ability to create a
modern document and tutorial with better graphics, arrangement of text and
quality of text. (A sample document is attached at the end of this
assignment as a guideline. I am expecting a much better result from you.)

Deliverables:
Hard copy of the tutorial, printed in color
Soft copy of the tutorial on a CD (auto-run)

Assessment:
You will be assessed on:
Your results of data analysis 25%
Presentation of your deliverables (the tutorial) 15%
Quality of the text and guidelines you write for your tutorial 25%
Better arrangement of graphics 25%
Discussion of results 25%

Deadline:
May 20, 2011, Midnight 0000 Hrs

Dr. Muhammad Shahbaz


March 17, 2011
Sample Document
Naïve Bayesian in Rapid Miner
The following steps create a process that implements the Naïve Bayesian algorithm in
Rapid Miner, measures its performance on the input data and shows the predicted results.
1. Open the Start menu, click Programs, select Rapid Miner and click the Rapid Miner icon.
2. Select File, then click New. This creates a new process.
3. To add an input to the new process, right-click the Root node and select New Operator,
then IO, then Examples, then CSV Example Source. The data used in this experiment is in
CSV (Comma Separated Values) format, which is why CSV Example Source is selected.
4. Right-click the Root node and select New Operator, then Validation, then Cross
Validation.
5. Right-click the Cross Validation node and select New Operator, then Learner, then
Supervised, then Bayes, then Naïve Bayes.
6. Right-click the Cross Validation node and select New Operator, then Operator Chain.
7. Right-click the Operator Chain node and select New Operator, then Model Applier.
8. Right-click the Operator Chain node and select New Operator, then Performance.
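
The process assembled in steps 3 to 8 can also be sketched outside Rapid Miner. The following is a minimal Python sketch of the same idea, not Rapid Miner's own implementation: it reads CSV examples, trains a categorical Naïve Bayes learner inside a cross validation loop, applies the model to each held-out fold and reports accuracy as the performance measure. The function names, the Laplace smoothing, the round-robin fold assignment and the inline weather table are all assumptions made for illustration; the table is a well-known toy data set, not the data used in this tutorial.

```python
import csv
import io
import math
from collections import Counter, defaultdict

# Illustrative toy data (assumption): stands in for the CSV file read by CSV Example Source.
CSV_TEXT = """\
outlook,temperature,humidity,windy,play
sunny,hot,high,false,no
sunny,hot,high,true,no
overcast,hot,high,false,yes
rainy,mild,high,false,yes
rainy,cool,normal,false,yes
rainy,cool,normal,true,no
overcast,cool,normal,true,yes
sunny,mild,high,false,no
sunny,cool,normal,false,yes
rainy,mild,normal,false,yes
sunny,mild,normal,true,yes
overcast,mild,high,true,yes
overcast,hot,normal,false,yes
rainy,mild,high,true,no
"""


def load_examples(text, label_column="play"):
    """Input step: parse CSV text into attribute dicts and a list of labels."""
    rows = list(csv.DictReader(io.StringIO(text)))
    features = [{k: v for k, v in row.items() if k != label_column} for row in rows]
    labels = [row[label_column] for row in rows]
    return features, labels


def train_naive_bayes(features, labels):
    """Learner step: count classes and per-class attribute values."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute, label) -> value frequencies
    domains = defaultdict(set)           # attribute -> distinct values seen
    for row, label in zip(features, labels):
        for attr, value in row.items():
            value_counts[(attr, label)][value] += 1
            domains[attr].add(value)
    return class_counts, value_counts, domains


def predict(model, row):
    """Model application step: return the label with the highest log-posterior."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for attr, value in row.items():
            # Laplace (add-one) smoothing avoids zero probabilities.
            numerator = value_counts[(attr, label)][value] + 1
            denominator = count + len(domains[attr])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(features, labels, folds=7):
    """Validation step: train on all folds but one, test on the held-out fold."""
    correct = 0
    for fold in range(folds):
        test_idx = set(range(fold, len(labels), folds))  # round-robin split
        train_x = [x for i, x in enumerate(features) if i not in test_idx]
        train_y = [y for i, y in enumerate(labels) if i not in test_idx]
        model = train_naive_bayes(train_x, train_y)
        for i in sorted(test_idx):
            correct += predict(model, features[i]) == labels[i]
    return correct / len(labels)  # performance step: overall accuracy


if __name__ == "__main__":
    features, labels = load_examples(CSV_TEXT)
    print("Cross-validated accuracy:", cross_validate(features, labels))
```

Each function corresponds to one operator in the steps above, which is why the learner sits inside the validation loop and the model application and performance steps run only on the held-out examples.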
Selecting CSV Example Source as the input source for the Naïve Bayesian algorithm

Selecting the Cross Validation operator for the Naïve Bayesian algorithm

Selecting Naïve Bayes as the learning algorithm

Selecting the Operator Chain operator for the Naïve Bayesian algorithm

Selecting the Model Applier operator for the Naïve Bayesian algorithm

Selecting the Performance operator for the Naïve Bayesian algorithm

This completes the process creation activity. The next step is to set the parameters of
the operators. Rapid Miner creates the process in XML format and also saves the process
file in XML format, so the whole process can be viewed as XML. To see it, select XML in
Rapid Miner.
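
To make the nesting of the process concrete, the following minimal Python sketch builds an XML document whose structure mirrors the operator tree from steps 3 to 8. The element name `operator` and the attribute `name` are assumptions chosen for illustration and may not match what Rapid Miner actually writes; compare the output with the XML that Rapid Miner shows for your own process.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Operator tree from steps 3 to 8 as (name, children) pairs.
# The XML vocabulary used below is hypothetical, not Rapid Miner's schema.
PROCESS = ("Root", [
    ("CSV Example Source", []),
    ("Cross Validation", [
        ("Naïve Bayes", []),
        ("Operator Chain", [
            ("Model Applier", []),
            ("Performance", []),
        ]),
    ]),
])


def build(node, parent=None):
    """Recursively turn a (name, children) pair into nested XML elements."""
    name, children = node
    if parent is None:
        element = ET.Element("operator", {"name": name})
    else:
        element = ET.SubElement(parent, "operator", {"name": name})
    for child in children:
        build(child, element)
    return element


def operator_names(root):
    """List operator names in document order (parents before children)."""
    return [element.get("name") for element in root.iter("operator")]


if __name__ == "__main__":
    root = build(PROCESS)
    raw = ET.tostring(root, encoding="unicode")
    print(minidom.parseString(raw).toprettyxml(indent="  "))
    print(operator_names(root))
```

Reading the indented output top to bottom shows which operators are children of Root, which belong to Cross Validation and which are grouped inside Operator Chain.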
