To be Submitted to
PROF. SREEDHARA R.
Presented by:
Anindya Biswas
1527605, M1
QUESTION:
ABC School of Business selects students for its MBA Program every year through a Written Test,
Group Discussion, and Interview. It then tracks their performance by means of Grade Point
Average during the 2-Year Program. The school has past data for 30 students admitted in an
earlier year, who were either successful (GPA greater than 3.00 on a 4-point scale) or
unsuccessful (GPA less than 3.00).
Develop a model using discriminant analysis for the ABC School of Business. The model should be
able to predict whether a student will be successful or unsuccessful based on the student's
Written Test Score, Group Discussion Score and Interview Score.
1. Explain the decision rule to be used for classifying the student as
i. Potentially Successful
ii. Unsuccessful
2. What is the classification accuracy level of the model?
3. Which of the three scores is the best predictor of a student's future success?
Dependent Variable:
Successful = 1
Unsuccessful = 2
Independent Variables:
X1 = Written Test Score
X2 = Group Discussion Score
X3 = Interview Score
Written Test Score  Group Discussion Score  Interview Score  Successful or Unsuccessful
200                 10                      15               2
270                 15                      20               1
300                 30                      30               1
250                 35                      30               1
260                 45                      35               1
220                 40                      45               2
210                 30                      40               1
200                 10                      25               2
240                 15                      30               1
230                 25                      25               2
200                 30                      35               2
300                 35                      40               1
280                 20                      20               2
290                 30                      25               2
275                 40                      40               2
263                 25                      30               1
285                 30                      25               1
291                 35                      35               1
300                 25                      30               1
205                 15                      20               2
220                 25                      40               2
230                 40                      25               2
240                 40                      15               2
270                 45                      25               1
290                 35                      30               1
280                 30                      35               2
250                 30                      28               2
255                 15                      23               2
260                 20                      40               1
260                 25                      15               2
Classification Results

                          Predicted Group Membership
Original  Actual Group    Successful  Unsuccessful  Total
Count     Successful      10          4             14
          Unsuccessful    4           12            16
%         Successful      71.4        28.6          100.0
          Unsuccessful    25.0        75.0          100.0
give a more accurate interpretation to determine whether a future candidate admitted into the
ABC School of Business will be potentially successful or unsuccessful.
Wilks' Lambda

Test of Function(s)  Wilks' Lambda  Chi-square  df  Sig.
1                    .705           9.255       3   .026
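The chi-square and significance figures in the Wilks' Lambda table can be sanity-checked by hand. The sketch below assumes the chi-square was obtained with Bartlett's approximation (the usual choice in statistical packages, but not stated in this report) and that the test has 3 degrees of freedom (3 predictors times 2 - 1 groups). The function names are illustrative only.

```python
import math


def wilks_chi_square(wilks_lambda, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for Wilks' lambda (assumed method)."""
    factor = n_cases - 1 - (n_predictors + n_groups) / 2
    return -factor * math.log(wilks_lambda)


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df.

    Closed form valid only for 3 degrees of freedom.
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Values taken from the report: lambda = .705, 30 students, 3 scores, 2 groups.
chi2 = wilks_chi_square(0.705, n_cases=30, n_predictors=3, n_groups=2)
p_value = chi_square_sf_df3(chi2)
print(round(chi2, 3), round(p_value, 3))
```

Because the lambda value is rounded to three decimals, the recomputed chi-square lands close to, but not exactly on, the reported 9.255.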
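Standardized canonical discriminant function coefficients

Written Test Score      .934
Group Discussion Score  -.174
Interview Score         .567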
The standardized canonical discriminant function coefficients can be used to rank the
importance of each variable. A standardized coefficient with a large absolute value suggests
that the groups differ considerably on that variable.
In our case, the Written Test Score (.934) is the most important predictor of whether a
potential candidate will be successful or not. It is followed by the Interview Score (.567),
which reflects the candidate's communication skills and ability to handle pressure.
The least important predictor is the Group Discussion Score (-.174), possibly because the
student might not have to participate in discussion-type scenarios in the future.
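The ranking described above can be reproduced by sorting the standardized coefficients by absolute value. This is a minimal sketch that assumes the three reported values (.934, -.174, .567) belong to X1, X2 and X3 in that order; the dictionary and variable names are illustrative.

```python
# Standardized coefficients as reported; the mapping to variables is an assumption
# based on the order X1, X2, X3 and on the ranking given in the text.
standardized = {
    "Written Test Score": 0.934,
    "Group Discussion Score": -0.174,
    "Interview Score": 0.567,
}

# Importance is judged by magnitude, so the sign is ignored when sorting.
ranking = sorted(standardized, key=lambda name: abs(standardized[name]), reverse=True)

for position, name in enumerate(ranking, start=1):
    print(position, name, standardized[name])
```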
Unstandardized coefficients

Written Test Score      .032
Group Discussion Score  -.017
Interview Score         .070
(Constant)              -9.729
In our case, the general form of the discriminant function is:
$Y = a + K_1 X_1 + K_2 X_2 + K_3 X_3$
Substituting the values from the above table, we get the equation as:
$Y = -9.729 + 0.032 X_1 - 0.017 X_2 + 0.070 X_3$
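Group centroids

Successful    .668
Unsuccessful  -.584

[Figure: number line of discriminant scores. The Unsuccessful centroid lies at -0.584 on the negative (left) side, the Successful centroid at +0.668 on the positive (right) side, and the centroid mean is 0.0415.]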
On the right side of the line there are positive values and on the left side there are negative
values. The right side of the line shows the centroid 0.668 (the mean discriminant score of all
the successful students) and the left side shows the centroid -0.584 (the mean discriminant
score of all the unsuccessful students). We know that the function scores have a mean of zero,
and we can check this by looking at the sum of the group means multiplied by the number of
cases in each group:
(14 * 0.668) + (16 * -0.584) = 9.352 - 9.344 = 0.008, which is approximately 0.
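The zero-mean check and the cutoff between the two groups can both be worked out from the centroids. The sketch below uses the centroid values 0.668 and -0.584 and the group sizes 14 and 16 from the report; taking the cutoff as the simple midpoint of the two centroids is an assumption that matches the quoted centroid mean of roughly 0.0415.

```python
# Group sizes and centroids as reported in the write-up.
group_sizes = {"Successful": 14, "Unsuccessful": 16}
centroids = {"Successful": 0.668, "Unsuccessful": -0.584}

# Sum of (group size * group centroid); should be close to zero
# because discriminant scores are centred on the overall mean.
weighted_sum = sum(group_sizes[g] * centroids[g] for g in group_sizes)

# Overall mean discriminant score implied by the rounded centroids.
grand_mean = weighted_sum / sum(group_sizes.values())

# Assumed cutoff: unweighted midpoint between the two centroids.
cutoff = (centroids["Successful"] + centroids["Unsuccessful"]) / 2

print(round(weighted_sum, 3), round(grand_mean, 4), round(cutoff, 4))
```

With the rounded centroids the midpoint comes out at about 0.042, in line with the 0.0415 used in the report.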
We shall now calculate the discriminant score and the predicted group for each student by
substituting the values of X1, X2 and X3 in the above equation and finding the value of Y. A
score lower than the centroid mean of 0.0415 is classified as unsuccessful (group 2), whereas a
score above it is classified as successful (group 1). The predictions can then be compared
with the original data.
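The decision rule can be sketched in a few lines. The example assumes the unstandardized coefficients reported earlier (.032, -.017, .070 and the constant -9.729) apply to X1, X2, X3 and the intercept in that order, and it uses the centroid mean of 0.0415 as the cutoff. The function and constant names are illustrative, not part of the original analysis.

```python
# Unstandardized coefficients as reported (rounded to three decimals).
# Assumption: listed in the order X1, X2, X3, constant.
CONSTANT = -9.729
K_WRITTEN = 0.032
K_GROUP_DISCUSSION = -0.017
K_INTERVIEW = 0.070
CUTOFF = 0.0415  # centroid mean quoted in the text


def discriminant_score(written, group_discussion, interview):
    """Return Y = a + K1*X1 + K2*X2 + K3*X3 for one student."""
    return (
        CONSTANT
        + K_WRITTEN * written
        + K_GROUP_DISCUSSION * group_discussion
        + K_INTERVIEW * interview
    )


def classify(written, group_discussion, interview):
    """Return 1 (successful) if Y is above the cutoff, otherwise 2 (unsuccessful)."""
    score = discriminant_score(written, group_discussion, interview)
    return 1 if score > CUTOFF else 2


# First student in the data: Written Test 200, Group Discussion 10, Interview 15.
print(round(discriminant_score(200, 10, 15), 3), classify(200, 10, 15))
```

Because the coefficients are rounded, scores computed this way differ slightly from the tabulated ones (about -2.45 here against -2.41213 in the table), so cases very close to the cutoff could in principle be classified differently.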
Written Test Score  Group Discussion Score  Interview Score  Successful or Unsuccessful  Discriminant Score  Predicted Group
200                 10                      15               2                           -2.41213            2
270                 15                      20               1                            0.10499            1
300                 30                      30               1                            1.51215            1
250                 35                      30               1                           -0.18397            2 *
260                 45                      35               1                            0.31521            1
220                 40                      45               2                           -0.18256            2
210                 30                      40               1                           -0.68174            2 *
200                 10                      25               2                           -1.70955            2
240                 15                      30               1                           -0.15791            2 *
230                 25                      25               2                           -1.00498            2
200                 30                      35               2                           -1.35486            2
300                 35                      40               1                            2.12776            1
280                 20                      20               2                            0.33985            1 *
290                 30                      25               2                            0.83903            1 *
275                 40                      40               2                            1.23622            1 *
263                 25                      30               1                            0.40835            1
285                 30                      25               1                            0.67812            1
291                 35                      35               1                            1.48683            1
300                 25                      30               1                            1.59912            1
205                 15                      20               2                           -1.98690            2
220                 25                      40               2                           -0.27294            2
230                 40                      25               2                           -1.26589            2
240                 40                      15               2                           -1.64664            2
270                 45                      25               1                           -0.06554            2 *
290                 35                      30               1                            1.10335            1
280                 30                      35               2                            1.21978            1 *
250                 30                      28               2                           -0.23751            2
255                 15                      23               2                           -0.16698            2
260                 20                      40               1                            1.10135            1
260                 25                      15               2                           -0.74207            2

* Misclassified case (predicted group differs from actual group).
As we can see from the table above, the marked cases are the ones where the recorded outcome
(successful or unsuccessful) differs from the group predicted by the model. There are 8 such
cases out of 30, so the accuracy level is 22/30, or about 73.33%, which agrees with the
classification table.
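The accuracy figure can be recomputed directly from the actual and predicted group columns of the table. This is a small sketch using only those two columns; the variable names are illustrative.

```python
from collections import Counter

# Actual and predicted groups for the 30 students, in table order
# (1 = successful, 2 = unsuccessful).
actual = [2, 1, 1, 1, 1, 2, 1, 2, 1, 2, 2, 1, 2, 2, 2,
          1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 2, 2, 2, 1, 2]
predicted = [2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1,
             1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 2, 2, 1, 2]

# Count each (actual, predicted) pair to build the classification table.
counts = Counter(zip(actual, predicted))

correct = counts[(1, 1)] + counts[(2, 2)]
accuracy = correct / len(actual)

print("Successful   ->", counts[(1, 1)], counts[(1, 2)])
print("Unsuccessful ->", counts[(2, 1)], counts[(2, 2)])
print("Accuracy: {:.2%}".format(accuracy))
```

The counts reproduce the 71.4% (10 of 14) and 75.0% (12 of 16) hit rates of the classification table.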
FINAL VERDICT
Thus, in conclusion we can say that:
1. The model needs more parameters to judge student potential, because its current
classification accuracy of about 73.33% is not high enough.
2. Although the model is statistically significant on the data provided so far (p-value of
.026), it has low discriminating power (Wilks' Lambda of .705) and may not guarantee
satisfactory and accurate predictions of a student's success potential in the institute.
3. Furthermore, as we saw in the table of standardized discriminant function coefficients,
the Written Test Score is the most important predictor, followed by the Personal
Interview and, finally, the Group Discussion Score. This ranking might not hold in the
future, once the student graduates from ABC School of Business, since being able to
participate successfully in a team meeting is often considered characteristic of a
good team player. Thus the judgement of the model need not always be true.