Analytics in a Nutshell

Analytics can be broadly divided into three domains: 1. Descriptive Analytics, 2. Predictive Analytics, and
3. Prescriptive Analytics.
A. Descriptive Analytics: This is the first step, which consists of gathering data from an experiment and running
initial checks on it. "Experiment" is used here in a broad sense, covering both literal experiments and anything
else that generates data, e.g. a promotional campaign or website visit data. The collected data is checked for
the points below.
1. Detection of mistakes made during data entry.
2. Checking of assumptions and constraints.
3. Pattern recognition, such as correlation, linear relationships, and autoregressive behaviour.
4. Determining the relationships between explanatory variables.
5. A rough direction for how the later stages of the analysis will be designed.
The steps involved include arranging unstructured data into structured data. Unstructured data here means data
that is not properly arranged in well-defined tables, for example in Excel. It must be arranged in tables in such
a way that any mismatch can be found easily by visual inspection. The type of each variable (categorical, ordinal,
binary, etc.) must also be taken into account. The other steps are:
1. Finding central tendency: This includes identifying the type of distribution and finding potential outliers.
The most common measure of central tendency is the mean. For skewed distributions and for data with outliers
that cannot be removed, the median is preferred. The mode is used for grouping purposes (examples are
available online).
2. Skewness and Kurtosis
3. Histogram: The challenge is choosing the bin size. With a larger bin size the data may look normally
distributed, while the same data with a smaller bin size may not. The bin size must therefore be selected
carefully. Clustering can be used to place similar data under one bin.
4. Q-Q Plot: Used to check whether the data follows a normal distribution and to interpret departures from it.
5. Correlation and covariance: Cor(X,Y) = Cov(X,Y) / (SDx * SDy). Covariance is difficult to interpret because
its value depends on the units of X and Y. Correlation is unit-free and only varies between -1 and 1.
6. Box Plot: A very handy tool for data exploration, especially for non-normally distributed data.
Further analysis such as a t-test or ANOVA can be used for a more detailed initial report.
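A minimal Python sketch of points 1 and 5 above, using only the standard library. The spend and sales figures are
made up for illustration, and the helper names covariance and correlation are hypothetical. It shows how one
outlier pulls the mean away from the median, and how changing the units of X changes the covariance but not
the correlation.

```python
import statistics

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Cor(X,Y) = Cov(X,Y) / (SDx * SDy), using sample standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical data: campaign spend (in thousands) and weekly sales (in units).
spend = [10, 12, 15, 18, 20, 25]
sales = [110, 118, 135, 150, 160, 400]  # the last value is an outlier

# The outlier drags the mean upwards; the median is barely affected.
mean_sales = statistics.mean(sales)
median_sales = statistics.median(sales)

# Expressing spend in single units instead of thousands rescales the
# covariance by 1000 but leaves the correlation unchanged.
spend_in_units = [s * 1000 for s in spend]
cov_thousands = covariance(spend, sales)
cov_units = covariance(spend_in_units, sales)
cor_thousands = correlation(spend, sales)
cor_units = correlation(spend_in_units, sales)

print("mean:", round(mean_sales, 2), "median:", median_sales)
print("covariance:", round(cov_thousands, 2), "vs", round(cov_units, 2))
print("correlation:", round(cor_thousands, 4), "vs", round(cor_units, 4))
```

Because the mean is noticeably larger than the median here, the median is the safer summary of typical sales
for this sample.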
B. Predictive Analytics: This consists of:
1. Regression (Linear/Multiple): Can be used for forecasting, e.g. election prediction.
2. Logistic Regression: Used when the outcome is categorical. Highly useful in the medical and insurance
industries, for loan default prediction, and in sports such as cricket and basketball.
3. CART/Random Forest: Can be used for both categorical and numeric data. Useful because it produces rules,
and future observations can be assigned to the groups that those rules create. E.g. the medical field
(D2Hawkeye, a medical analytics company, uses such a model), Sensex data, vote prediction, etc.
4. Text Analytics: Used for sentiment analysis, understanding trends, etc. It is supported by specialized
libraries in R and can be used for Twitter and Facebook analytics. (Personally, I was able to extract the
Facebook IDs of the users who commented most on page posts from the page of The Hindu. Such data can be
used to create focus groups (connect with marketing terminology) and loyalty programs of some kind (connect
with IMC), helping us further understand what the customer wants (connect with consumer behaviour).)
Further, Google Trends can be accessed from R, which gives the scaled search interest for a particular word
or group of words on Google. Text analytics can also be used for stock price prediction based on news, as
news affects the sentiment of shareholders for a very short time.
5. Clustering: Can be hierarchical or k-means. Used for market segmentation, and by IMDb and Netflix to
suggest movies.
6. Time Series: Used in finance and economics, especially for stock price prediction. It captures
autoregression, i.e. how previous values affect new values. The effect of historic data may be exponential
in nature. Similarly, how past errors affect future values must be studied too. Examples are moving average,
weighted average, exponential smoothing, seasonally adjusted exponential smoothing, ARMA models, ARIMA
models, etc. R supports most of the above-mentioned models in the forecast package.
7. Affinity Analysis: Used by Amazon for product suggestions.
8. Conjoint Analysis: One use is to find willingness to pay ... I do not have much information on this yet.
9. Deep Learning: Consists of advanced neural networks that model very complicated interactions between the
predictors, with errors fed back into the network as input, loosely like how the brain works. Used for
image processing, voice recognition, etc. These are self-learning models which improve over time. Such
models are what people commonly mean by AI (Artificial Intelligence).
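As a small illustration of the time series item above, here is a Python sketch of a moving average and of
simple exponential smoothing. The price series, the window of 3, and alpha = 0.5 are made-up values, and the
function names are hypothetical; this is not the implementation used by R's forecast package. In exponential
smoothing an observation k steps in the past gets weight alpha * (1 - alpha)^k, which is one way the effect of
historic data can be "exponential in nature".

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    s[0] = series[0]
    s[t] = alpha * series[t] + (1 - alpha) * s[t - 1]
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical daily closing prices.
prices = [100, 102, 101, 105, 107, 106, 110]

averaged = moving_average(prices, window=3)
smoothed = exponential_smoothing(prices, alpha=0.5)

# With simple exponential smoothing, the one-step-ahead forecast is
# the last smoothed value.
forecast = smoothed[-1]

print("moving average:", averaged)
print("smoothed:", smoothed)
print("next-step forecast:", forecast)
```

A larger alpha reacts faster to recent values, while a smaller alpha gives a smoother series that leans more
on history.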
C. Prescriptive Analytics: Consists mainly of two types of models. As the literal meaning suggests, it provides a
prescription (medicine) for a higher-level business problem, the kind that management consulting firms solve,
e.g. "increase profit by 2%". The prescription is designed based on the above two types of analytics. It is an
iterative process, and in case of failure the entire cycle must be revisited, i.e. Descriptive -> Predictive ->
Prescriptive.
1. Optimization: Iteratively checks the possible combinations to find the best result. Excel Solver can be used
for this. Many sophisticated software packages are available, and R also has advanced packages for it.
2. Simulation Model: A model is built, and dry runs are made with various inputs to find the best input.
3. Artificial Neural Network Model: Yet to be explored.
Decision variables: These are gathered through the above predictive modeling techniques. E.g. with education
data, to minimize teacher recruitment, what are the factors which affect...we can somehow connect this.
Constraints: Gathered using PESTEL analysis, organizational constraints, and physical constraints.
The idea is to come up with the best combination (just like a doctor's medicine) which will cure the disease
(the objective function). This is the end objective of any analytics. Any type of data mining is ultimately
used for prescriptive analytics.
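The optimization idea above (decision variables, constraints, objective function) can be sketched in Python as
a brute-force search over all combinations. Everything in this example is a made-up assumption for
illustration: two products A and B, their profits per unit, and the machine-hour and raw-material limits. Real
solvers such as Excel Solver use far more efficient methods than checking every combination.

```python
from itertools import product

def best_plan(max_units=20):
    """Check every (units_a, units_b) combination and keep the most profitable
    one that satisfies all constraints. Returns (profit, units_a, units_b)."""
    best = None
    for units_a, units_b in product(range(max_units + 1), repeat=2):
        # Constraint 1 (hypothetical): at most 20 machine hours.
        # A needs 2 hours per unit, B needs 1 hour per unit.
        if 2 * units_a + 1 * units_b > 20:
            continue
        # Constraint 2 (hypothetical): at most 30 kg of raw material.
        # A needs 1 kg per unit, B needs 3 kg per unit.
        if 1 * units_a + 3 * units_b > 30:
            continue
        # Objective function (hypothetical): profit of 30 per unit of A
        # and 20 per unit of B.
        profit = 30 * units_a + 20 * units_b
        if best is None or profit > best[0]:
            best = (profit, units_a, units_b)
    return best

profit, units_a, units_b = best_plan()
print("best profit:", profit, "with A =", units_a, "and B =", units_b)
```

Here units_a and units_b are the decision variables, the two limits are the constraints, and profit is the
objective function. Checking every combination only works for small problems, because the number of
combinations grows quickly with each added decision variable.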
