TEAM-1
• A.BHANU PRAKASH
• K.HANEETH
• R.SIREESHA
• N.ANJALI
Case Study:
The main objective of the case study is to retain the employees of the
organization.
The main objective of the prediction is to reduce the attrition rates of the
organization using analytical methods.
Analytics can help organizations control employee turnover through
predictive models which can be used for developing strategies.
Data Exploration:
First of all, let us find out the number of employees who left the company
and those who didn’t:
There are 3571 employees who left and 11428 employees who stayed in our
data. Let us get a sense of the numbers across these two classes:
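As a minimal sketch of what these two totals imply, the snippet below derives the dataset size and the overall attrition rate from the counts quoted above. The variable names are illustrative and not taken from the original analysis.

```python
# Class counts quoted in the text above.
left = 3571
stayed = 11428

total = left + stayed
attrition_rate = left / total  # share of employees who left

print(f"total employees: {total}")
print(f"attrition rate:  {attrition_rate:.2%}")
```

Roughly one employee in four left, so the two classes are imbalanced, which is worth keeping in mind when reading accuracy figures later on.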
Data Visualization:
Let us visualize our data to get a much clearer picture of the data and the
significant features.
Bar chart of the department each employee works for and the frequency of
turnover:
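The counts behind such a bar chart could be tabulated along the following lines. This is only a sketch: the records, the department names, and the helper turnover_by_department are hypothetical placeholders, not values from the case study data.

```python
from collections import Counter

# Hypothetical (department, left) records; 1 means the employee left.
records = [
    ("sales", 1), ("sales", 0), ("sales", 0),
    ("technical", 1), ("technical", 0),
    ("hr", 1), ("hr", 1), ("hr", 0),
]

def turnover_by_department(rows):
    """Return {department: (stayed_count, left_count)}."""
    counts = Counter(rows)
    departments = sorted({dept for dept, _ in rows})
    return {d: (counts[(d, 0)], counts[(d, 1)]) for d in departments}

table = turnover_by_department(records)
for dept, (stayed, left) in table.items():
    print(f"{dept:<10} stayed={stayed} left={left}")
```

Each pair of counts corresponds to one pair of bars (stayed, left) for a department.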
“Support Vector Machine” (SVM) is a supervised machine learning algorithm which can be used
for both classification and regression problems.
Tuning an SVM’s hyperparameters is computationally very expensive for two reasons:
• With big datasets, training becomes very slow.
• It has a good number of hyperparameters, which take a very long time to tune
on a CPU.
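To make the second reason concrete, here is a rough sketch that counts how many model fits an exhaustive grid search would need. The grid values and the fold count are assumptions chosen for illustration; the parameter names C, gamma and kernel mirror those of scikit-learn's SVC, but no particular library is implied by the case study.

```python
from itertools import product

# Hypothetical search grid for an SVM classifier; the values are
# illustrative and not taken from the case study.
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.001, 0.01, 0.1, 1],
    "kernel": ["rbf", "poly", "sigmoid"],
}
cv_folds = 5  # assumed number of cross-validation folds

# Every combination is trained once per fold.
combinations = list(product(*param_grid.values()))
total_fits = len(combinations) * cv_folds

print(f"{len(combinations)} combinations x {cv_folds} folds = {total_fits} fits")
```

Even this small grid requires 240 separate training runs, and each run is itself slow on a large dataset, so the two costs multiply.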
Benefits of Turnover Prediction:
The output depends on the chosen model. For instance, a ‘logistic model’
produces scorecards for employees based on their predicted ‘attrition risk’,
while a classification model sorts employees into broader categories, such as
more likely or less likely to quit, or high risk or low risk.
However, the bottom line is to keep the model simple enough to understand and
act on. Varying the input factors helps in assessing the impact of changes
and in making the right decisions.
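The difference between a per-employee risk score and a coarse risk category, as described in this section, can be sketched as follows. The coefficients, feature names, and the 0.5 threshold are all hypothetical; a real model would learn its weights from the data.

```python
import math

# Hypothetical coefficients; a real model would learn these from data.
INTERCEPT = -1.0
WEIGHTS = {"satisfaction": -3.0, "years_at_company": 0.3}

def attrition_risk(employee):
    """Logistic score in (0, 1): higher means more likely to leave."""
    z = INTERCEPT + sum(WEIGHTS[k] * employee[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def risk_band(score, threshold=0.5):
    """Collapse the continuous score into a coarse label."""
    return "high risk" if score >= threshold else "low risk"

# Hypothetical employee used to try "what if" changes.
employee = {"satisfaction": 0.2, "years_at_company": 6}
score = attrition_risk(employee)
print(f"risk score: {score:.2f}, band: {risk_band(score)}")

# Changing one factor shows its impact on the predicted risk.
improved = dict(employee, satisfaction=0.8)
print(f"after raising satisfaction: {attrition_risk(improved):.2f}")
```

Re-scoring the same employee with one factor changed is a simple way to assess the impact of a proposed intervention.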
Conclusion:
Let’s conclude by printing out the test accuracy rates for all classifiers
we’ve trained so far and plotting their ROC curves. Then we will pick the
classifier that has the highest area under the ROC curve.
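The selection step described above can be sketched in plain Python as follows. The labels, the predicted probabilities, and the names model_a and model_b are hypothetical placeholders rather than the classifiers or results of this case study; the area under the ROC curve is computed through its rank interpretation instead of by plotting.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """Area under the ROC curve: the probability that a random positive
    is scored above a random negative (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test labels (1 = left) and predicted probabilities.
y_test = [1, 0, 1, 0, 0, 1]
model_scores = {
    "model_a": [0.9, 0.2, 0.8, 0.4, 0.1, 0.7],
    "model_b": [0.6, 0.5, 0.4, 0.3, 0.7, 0.8],
}

results = {}
for name, scores in model_scores.items():
    preds = [1 if s >= 0.5 else 0 for s in scores]
    results[name] = (accuracy(y_test, preds), roc_auc(y_test, scores))
    print(f"{name}: accuracy={results[name][0]:.2f}, auc={results[name][1]:.2f}")

# Pick the classifier with the highest area under the ROC curve.
best = max(results, key=lambda name: results[name][1])
print("selected:", best)
```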
Random Forest has the highest accuracy and F1-score, at 99.27% and
99.44% respectively. Therefore, we can safely say that Random Forest
outperforms the rest of the classifiers. Now let’s look at the feature
importances of the random forest classifier.