
Educational Data Mining

Index of Contents
Team
Objectives
Motivation
Methodology/Data Mining
Expected Outcome
Task already completed
Task ongoing
Task Upcoming
Application
Challenges
Conclusion/Future Work
Project objectives

To apply data mining techniques to analyze students' academic activities.
To predict students' academic results by classification.
To identify the factors that help a student achieve a successful academic
result.
To detect the chance of a student's failure at an early stage.
To generate a model to improve students' success rate.
Motivation
Educational Sectors:
In the education sector it has been a challenging task to assess students
individually and take appropriate actions to get the best possible outcome
from them.
Students' Awareness:
On the other hand, a student pursuing higher education should know the
market demands and where their own weaknesses lie.
Availability of Educational Data:
It is possible to get data from students' academic records and their
perceptions of factors related to academic performance, which may help to
understand the reasons for success and failure.
Increasing Success Rate:
Such an analysis would be very useful for the educational environment and for
students' success rate.
Methodology
Data Mining:
Data mining is the process of gathering data (from a specific domain) and
analyzing it to extract hidden and useful knowledge.

Data Mining in Education:

Educational Data Mining is a research field concerned with applying data
mining methodology to information generated in educational environments.
CRISP-DM
Phases of CRISP-DM
Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
Business Understanding
Formulating as a data mining problem
Collecting data from students and analyzing it to learn how they are
performing and how to improve their academic and personal skills was
impossible before the idea of data mining. Data mining has contributed to
these objectives in the education sector, though getting the most out of it
remains a complex procedure.

Preliminary project plan to achieve the objectives

The plan is to collect data from students through a survey. The domain is
students' academic status and activities throughout a semester. We then form
a skill matrix of the crucial factors that affect students' academic
performance. After identifying those factors, we want to analyze them and
compare them with market demand, and then generate a successful model to
improve students' success rate.
Expected Outcome

Identify the factors that cause a student to lose academic standing due to
academic performance.
Increase students' success rate by identifying risk at an earlier stage.
Detect students who are expected to fail and need extra attention.
Inform students of market conditions and requirements.
Data Understanding
In order to understand the data we have to specify its domain.
The domain we have chosen is students' academic results and
their academic activities and participation.
Our objective is to identify the factors that help a student achieve
a successful academic result.
Data Description
In the present-day educational system, a student's performance is
determined by internal assessment and the semester result.
The internal assessment is carried out by the teacher based on the
student's performance in educational activities such as class tests,
assignments, general proficiency, attendance and lab work.
Data Set
Attribute | Survey Question | Data Type | Probable Values
CGPA (Class) | What is your last semester result? | Nominal | D (below 2.8), C (2.8-3.24), B (3.25-3.74), A (3.75-4.00)
Attendance | How regular are you in classes? | Nominal | Poor (below 75%), Average (75%-85%), Good (over 85%)
Class Test | What is your average class test performance? | Nominal | Poor (below 7), Average (7-10), Good (over 10)
Assignment | Did you submit the assignment? | Nominal | Yes, No
Lab | How was your lab work? | Nominal | Poor, Average, Good
General Proficiency | General proficiency (one's ability to handle all tasks skillfully and efficiently) | Nominal | Poor, Average, Good
Self-Study | How do you study on your own? | Nominal | Insufficient, Optimum, Sufficient
Group-Study | Do you attend or commence group study? | Nominal | Never, Rare, Often, Always
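The value ranges in the data set can be read as a discretization of raw numbers into nominal labels. The following is a minimal Python sketch of that mapping, not the project's actual procedure (the survey appears to have collected the categories directly). The function names are hypothetical, and the boundary handling is an assumption: CGPA is taken to be reported to two decimals, and the stated end points (75%, 85%, 7, 10) are assigned to "Average".

```python
def cgpa_class(cgpa: float) -> str:
    """Map a CGPA on a 4.00 scale to the class labels A-D from the data set."""
    if not 0.0 <= cgpa <= 4.0:
        raise ValueError(f"CGPA out of range: {cgpa}")
    if cgpa >= 3.75:
        return "A"
    if cgpa >= 3.25:
        return "B"
    if cgpa >= 2.8:
        return "C"
    return "D"


def attendance_level(percent: float) -> str:
    """Map attendance in percent to Poor / Average / Good."""
    if percent < 75:
        return "Poor"
    if percent <= 85:  # assumption: exactly 85% still counts as Average
        return "Average"
    return "Good"


def class_test_level(score: float) -> str:
    """Map an average class test score to Poor / Average / Good."""
    if score < 7:
        return "Poor"
    if score <= 10:  # assumption: exactly 10 still counts as Average
        return "Average"
    return "Good"
```

For example, a student with a CGPA of 3.40, 90% attendance and an average class test score of 6 would be recorded as B, Good and Poor.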
Data Preparation
Data source: students
Method: online survey (Google Forms)
Construct the dataset from the initial raw data.
This phase includes table, record and attribute selection, data
cleaning, construction of new attributes, and transformation of data
for modeling.
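The cleaning step can be illustrated with a short Python sketch. It is only an assumption about what cleaning might involve here (trimming whitespace, normalizing capitalization, and dropping records with missing or unexpected answers). The `ALLOWED` table and the `clean_records` helper are hypothetical names, and the allowed values are copied from the data set description.

```python
# Allowed nominal values per attribute, taken from the data set description.
ALLOWED = {
    "CGPA": {"A", "B", "C", "D"},
    "Attendance": {"Poor", "Average", "Good"},
    "Class Test": {"Poor", "Average", "Good"},
    "Assignment": {"Yes", "No"},
    "Lab": {"Poor", "Average", "Good"},
    "General Proficiency": {"Poor", "Average", "Good"},
    "Self-Study": {"Insufficient", "Optimum", "Sufficient"},
    "Group-Study": {"Never", "Rare", "Often", "Always"},
}


def clean_records(rows):
    """Return only complete, valid records with normalized values."""
    cleaned = []
    for row in rows:
        record = {}
        for attr, allowed in ALLOWED.items():
            # Normalize "  good " or "GOOD" to "Good"; missing values become "".
            value = str(row.get(attr) or "").strip().capitalize()
            if value not in allowed:
                break  # drop the whole record on a missing or invalid answer
            record[attr] = value
        else:
            cleaned.append(record)
    return cleaned
```

Dropping incomplete records is the simplest policy; imputing missing answers would be an alternative if the survey sample is small.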
Attribute Selection

InfoGainAttributeEvaluator             GainRatioAttributeEvaluator
Attribute              Rank            Attribute              Rank
Group Study            0.3226          Class Test             0.3086
Class Test             0.2895          General Proficiency    0.1949
Lab                    0.2053          Group Study            0.1722
Self-Study             0.2039          Lab                    0.1578
General Proficiency    0.1782          Self-Study             0.1382
Attendance             0.0707          Attendance             0.0517
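To make the two rankings above easier to interpret, here is a small Python sketch of the textbook definitions of information gain and gain ratio for a nominal attribute. It is not Weka's implementation, which may differ in details such as missing-value handling, and the four-student sample is invented purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of nominal labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(values, labels):
    """Reduction in class entropy after splitting on the attribute values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by the entropy of the attribute itself."""
    split_info = entropy(values)
    return info_gain(values, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy sample, NOT the survey data.
result = ["A", "A", "B", "C"]
class_test = ["Good", "Good", "Average", "Average"]
attendance = ["Good", "Poor", "Good", "Poor"]

for name, column in [("Class Test", class_test), ("Attendance", attendance)]:
    print(name, round(info_gain(column, result), 4), round(gain_ratio(column, result), 4))
```

Gain ratio penalizes attributes with many distinct values, which is one reason the two evaluators can order the same attributes differently.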


Decision Tree
DM Model
If ClassTest = Good and Self-Study = Optimum then Result = A
If ClassTest = Good and Self-Study = Sufficient and Group-Study = Rare then Result = B
If ClassTest = Good and Self-Study = Sufficient and Group-Study = Often or Always then Result = A
If ClassTest = Good and Self-Study = Insufficient then Result = B
If ClassTest = Average and Group-Study = Rare and GP = Good then Result = B
If ClassTest = Average and Group-Study = Rare and GP = Average then Result = C
If ClassTest = Average and Group-Study = Often and Self-Study = Sufficient then Result = B
If ClassTest = Average and Group-Study = Often and Self-Study = Insufficient then Result = C
If ClassTest = Average and Group-Study = Always and GP = Good then Result = B
If ClassTest = Average and Group-Study = Always and GP = Good then Result = C
If ClassTest = Average and Group-Study = Never then Result = D
(* GP = General Proficiency)
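The rules above can be applied to a single student with a plain function. The following Python sketch is one possible encoding; the function name `predict_result` is hypothetical. Two assumptions are made: combinations not covered by any listed rule return `None`, and because two rules are listed for Group-Study = Always with GP = Good, the sketch keeps the first one (Result = B).

```python
from typing import Optional


def predict_result(class_test: str, self_study: str,
                   group_study: str, gp: str) -> Optional[str]:
    """Return the predicted result class, or None if no listed rule applies."""
    if class_test == "Good":
        if self_study == "Optimum":
            return "A"
        if self_study == "Insufficient":
            return "B"
        if self_study == "Sufficient":
            if group_study == "Rare":
                return "B"
            if group_study in ("Often", "Always"):
                return "A"
        return None
    if class_test == "Average":
        if group_study == "Never":
            return "D"
        if group_study == "Rare":
            return {"Good": "B", "Average": "C"}.get(gp)
        if group_study == "Often":
            return {"Sufficient": "B", "Insufficient": "C"}.get(self_study)
        if group_study == "Always":
            # Assumption: of the two rules listed for GP = Good, use the first.
            return "B" if gp == "Good" else None
    return None  # e.g. ClassTest = Poor is not covered by the listed rules
```

For instance, `predict_result("Average", "Sufficient", "Often", "Good")` follows the Group-Study = Often branch and returns "B".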
Result

We will predict students' performance (CGPA) as one of a number of
classes: Excellent, Good, Average, Poor.
We will use the confusion matrix to get an idea of the accuracy of the
predictive model, where the rows and columns represent the actual class
and the predicted class respectively.
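As an illustration of how a confusion matrix yields an accuracy figure, here is a minimal Python sketch with rows as actual classes and columns as predicted classes. The six example predictions are invented for the demonstration and are not results from this project.

```python
CLASSES = ["Excellent", "Good", "Average", "Poor"]


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Count (actual, predicted) pairs; rows are actual, columns are predicted."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of instances on the diagonal, i.e. predicted correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical labels, NOT project results.
actual = ["Excellent", "Good", "Good", "Average", "Poor", "Poor"]
predicted = ["Excellent", "Good", "Average", "Average", "Poor", "Average"]

matrix = confusion_matrix(actual, predicted)
for name, row in zip(CLASSES, matrix):
    print(f"{name:<10}", row)
print("accuracy:", round(accuracy(matrix), 3))
```

Off-diagonal cells show which classes are confused with each other, which is more informative than the single accuracy number when the aim is to catch students at risk of failing.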
Task already completed
Selecting the data domain
Finding related data
Collecting data (Google survey)
Preliminary data pre-processing (cleansing)
Loading data into Weka (a data mining tool)
Applying attribute selection techniques
Applying a decision tree (J48) on the full training data set
Generating the DM model
Task ongoing
Data Validation
Background Study
Study of similar research
Task Upcoming

This project is planned as a pilot study to apply data mining capabilities
to educational data. Further research and implementation are required to
reach a useful outcome.
More factors that strongly affect students' performance should be
identified to get more accurate results.
Challenges

Data quality issues and the validation they require.
Uncollected data cannot be analyzed.
Choosing which data to mine and analyze is challenging.
Determining which data mining technique to apply.
Conclusion

Evaluating this knowledge will give us the status of students' academic
success, and the administrative and management levels can use that
information for better observation.
Students can also learn which shortcomings are keeping them from achieving
good results.
Thank You
