
Quantitative methods for management
Descriptive statistics: Numerical measures
Recap
Day 1: Introduction, types of statistics, data and its types
Definition of statistics; terminology: population, sample, parameter,
statistic; qualitative and quantitative data; levels of measurement:
nominal, ordinal, interval and ratio; sources of data: primary and
secondary; applications of statistics in various functions of
management; data mining and data warehousing

Day 2: Classification of data: qualitative, quantitative, geographical and
chronological
Presentation of data: frequency distribution, relative and cumulative
frequencies; bivariate distributions
Diagrammatic: bar diagram, pie diagram
Graphical: histogram, frequency polygon, ogive
Exploratory data analysis: scatter diagram, stem and leaf plot
Summarization of data
Measures of central tendencies
AM, WM, GM
Positional averages median, percentiles, quartiles
Mode
Empirical formula
Measures of dispersion
Range
Quartile deviation
Mean deviation
Standard deviation
Variance
Coefficient of variation
Raw Data
Arithmetic Mean
Commonly called the mean
is the average of a group of numbers
Applicable for interval and ratio data
Not applicable for nominal or ordinal data
Affected by each value in the data set, including
extreme values
Computed by summing all values in the data set and
dividing the sum by the number of values in the data
set
The mean can be found from the total (aggregate) and the
number of items alone; the individual values need not be
known
Population Mean

$\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N}$
$= \frac{24 + 13 + 19 + 26 + 11}{5} = \frac{93}{5} = 18.6$
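The calculation above can be sketched in a few lines of Python. This is only an illustration of the definition (sum of the values divided by their count); the variable names are chosen here for readability and are not part of the slides.

```python
# Population mean: add up every value and divide by the number of values.
values = [24, 13, 19, 26, 11]  # data from the worked example

total = sum(values)   # aggregate of the values: 93
count = len(values)   # number of values: 5
population_mean = total / count

print(population_mean)  # 18.6
```

Only `total` and `count` are needed for the final step, which is why the mean can be recovered from the aggregate without the individual values.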
Properties of AM
Sum of deviations from AM is ZERO
Sum of squares of deviation taken from AM
will be minimum
The combined mean of several groups can be found from the
group means and group sizes
It is affected by change of scale and change of
origin
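The properties of the arithmetic mean can be checked numerically. The sketch below reuses the five values from the population mean example; the second group (`group_b`) and the constants used for the change of origin and scale are made up purely for illustration.

```python
import math

values = [24, 13, 19, 26, 11]
mean = sum(values) / len(values)  # 18.6

# Property 1: deviations from the mean sum to zero (up to rounding error).
sum_of_deviations = sum(x - mean for x in values)

# Property 2: the sum of squared deviations is smallest when taken
# about the mean rather than about any other value.
def sum_of_squares(about):
    return sum((x - about) ** 2 for x in values)

# Property 3: combined mean = size-weighted average of the group means.
group_a = values
group_b = [10, 20, 30]  # hypothetical second group
mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)
combined_mean = (len(group_a) * mean_a + len(group_b) * mean_b) / (
    len(group_a) + len(group_b)
)

# Property 4: a change of origin (+10) and scale (x2) changes the mean
# in exactly the same way.
transformed = [2 * x + 10 for x in values]
transformed_mean = sum(transformed) / len(transformed)

print(math.isclose(sum_of_deviations, 0, abs_tol=1e-9))
print(sum_of_squares(mean) < sum_of_squares(mean + 1))
print(combined_mean)
print(math.isclose(transformed_mean, 2 * mean + 10))
```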
Median
Middle value in an ordered array of numbers.
Applicable for ordinal, interval, and ratio data
Not applicable for nominal data
Unaffected by extremely large and extremely
small values.
Median: Computational Procedure
First Procedure
Arrange the observations in an ordered array.
If there is an odd number of terms, the median is
the middle term of the ordered array.
If there is an even number of terms, the median is
the average of the middle two terms.
Second Procedure
The median's position in an ordered array is given
by (n+1)/2.
Median: Example
with an Odd Number of Terms
Ordered Array
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22

There are 17 terms in the ordered array.


Position of median = (n+1)/2 = (17+1)/2 = 9
The median is the 9th term, 15.

If the 22 is replaced by 100, the median is 15.

If the 3 is replaced by -103, the median is 15.


Median: Example
with an Even Number of Terms
Ordered Array
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21

There are 16 terms in the ordered array.


Position of median = (n+1)/2 = (16+1)/2 = 8.5
The median is between the 8th and 9th terms,
14.5.
If the 21 is replaced by 100, the median is 14.5.
If the 3 is replaced by -88, the median is 14.5.
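Both median examples can be reproduced with a short function. This is a minimal sketch of the first procedure (sort, then take the middle term or the average of the two middle terms); the function name `median` is chosen here for illustration, and Python's 0-based indexing means the 1-based position (n+1)/2 appears as index n // 2.

```python
def median(values):
    """Median by the ordered-array procedure."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the single middle term, 1-based position (n+1)/2.
        return ordered[middle]
    # Even count: average of the two middle terms.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_array = [3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21, 22]
even_array = odd_array[:-1]  # the same array without the final 22

print(median(odd_array))   # 15
print(median(even_array))  # 14.5

# Replacing an extreme value leaves the median unchanged.
print(median(odd_array[:-1] + [100]))  # 15
print(median([-103] + odd_array[1:]))  # 15
```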
Percentiles
Positional measures that divide an ordered group
of data into 100 parts
At least n% of the data lie below the nth
percentile, and at most (100 - n)% of the data lie
above the nth percentile

Example: the 90th percentile indicates that at least
90% of the data lie below it, and at most 10% of
the data lie above it
The median and the 50th percentile have the same
value.
Applicable for ordinal, interval, and ratio data
Not applicable for nominal data
Percentiles: Computational Procedure
Organize the data into an ascending ordered
array.
Calculate the percentile location: $i = \frac{P}{100}(n)$,
where P is the percentile of interest and n is the
number of observations.
Determine the percentile's location and its value.
If i is a whole number, the percentile is the
average of the values at the i and (i+1) positions.
If i is not a whole number, the percentile is at
the position given by the whole-number part of
(i+1) in the ordered array.
Percentiles: Example
Raw Data: 14, 12, 19, 23, 5, 13, 28, 17
Ordered Array: 5, 12, 13, 14, 17, 19, 23, 28
Location of 30th percentile: $i = \frac{30}{100}(8) = 2.4$
The location index, i, is not a whole number; i+1 =
2.4+1=3.4; the whole number portion is 3; the
30th percentile is at the 3rd location of the array;
the 30th percentile is 13.
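The location-index rule can be sketched as a small function. This follows the procedure described above, not the interpolation methods used by spreadsheet or library percentile functions, so results may differ from those tools. The function name and the restriction to whole-number percentiles between 1 and 99 are assumptions made for this illustration.

```python
def percentile(values, p):
    """p-th percentile by the location-index rule (p a whole number, 1..99)."""
    ordered = sorted(values)
    n = len(ordered)
    # i = p * n / 100, split into its whole part and remainder so that the
    # "is i a whole number?" test avoids floating-point rounding.
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # i is whole: average the values at 1-based positions i and i+1
        # (0-based indexes i-1 and i).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # i is not whole: take the value at the whole-number part of i+1
    # (0-based index equal to the whole part of i).
    return ordered[whole]

raw_data = [14, 12, 19, 23, 5, 13, 28, 17]

print(percentile(raw_data, 30))  # 13
print(percentile(raw_data, 50))  # 15.5, the median of the eight values
```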
Quartiles
Positional measures that divide an ordered group of
data into four subgroups

Q1: 25% of the data set is below the first quartile


Q2: 50% of the data set is below the second quartile
Q3: 75% of the data set is below the third quartile

Q1 is equal to the 25th percentile


Q2 is located at 50th percentile and equals the
median
Q3 is equal to the 75th percentile
Quartile values are not necessarily members of the
data set
Quartiles

[Figure: Q1, Q2 and Q3 divide the ordered data into four groups, each containing 25% of the observations]


Quartiles: Example
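Ordered array: 106, 109, 114, 116, 121, 122, 125, 129
Q1: $i = \frac{25}{100}(8) = 2$, so $Q_1 = \frac{109 + 114}{2} = 111.5$
Q2: $i = \frac{50}{100}(8) = 4$, so $Q_2 = \frac{116 + 121}{2} = 118.5$
Q3: $i = \frac{75}{100}(8) = 6$, so $Q_3 = \frac{122 + 125}{2} = 123.5$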
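The quartile calculations in the example above can be checked by treating Q1, Q2 and Q3 as the 25th, 50th and 75th percentiles. The sketch below assumes the same location-index rule as the percentile procedure; the helper name `percentile` is hypothetical and is defined here so the block runs on its own.

```python
def percentile(values, p):
    """p-th percentile by the location-index rule (p a whole number, 1..99)."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        # Whole-number location: average positions i and i+1 (1-based).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Otherwise: the value at the whole-number part of i+1 (1-based).
    return ordered[whole]

ordered_array = [106, 109, 114, 116, 121, 122, 125, 129]

q1 = percentile(ordered_array, 25)
q2 = percentile(ordered_array, 50)
q3 = percentile(ordered_array, 75)

print(q1, q2, q3)  # 111.5 118.5 123.5
```

None of the three quartiles is itself a member of the data set, which illustrates the point that quartile values need not be observed values.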
Mode
The most frequently occurring value in a data
set
Applicable to all levels of data measurement
(nominal, ordinal, interval, and ratio)

Bimodal -- Data sets that have two modes


Multimodal -- Data sets that contain more
than two modes
Mode -- Example
The mode is 44.
There are more 44s than any other value.

35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48
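A frequency count makes the mode easy to verify. The sketch below uses `collections.Counter` and `statistics.multimode` from the Python standard library (the latter requires Python 3.8 or later); `multimode` returns every value tied for the highest frequency, so it also covers bimodal and multimodal data.

```python
from collections import Counter
from statistics import multimode

data = [
    35, 41, 44, 45,
    37, 41, 44, 46,
    37, 43, 44, 46,
    39, 43, 44, 46,
    40, 43, 44, 46,
    40, 43, 45, 48,
]

counts = Counter(data)   # frequency of each distinct value
modes = multimode(data)  # list of the most frequent value(s)

print(modes)       # [44]
print(counts[44])  # 5
```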
Problem
The costs of consumer purchases such as single-family
housing, gasoline, internet services, tax preparation,
and hospitalization were reported in The Wall Street
Journal. Sample data typical of the cost of tax return
preparation by services such as H&R Block are shown
below
120 230 110 115 160 130 150 105
195 155 105 360 120 120 140 100
115 180 235 255
- Compute the mean, median and mode
- Compute the first and third quartiles
- Compute and interpret the 90th percentile
Measures of Variability
It is often desirable to consider measures of variability
(dispersion), as well as measures of location.

For example, in choosing supplier A or supplier B we
might consider not only the average delivery time for
each, but also the variability in delivery time for each.
Variability
[Figure: cash flows with no variability around the mean compared with cash flows that vary around the mean]
Measures of Variability:
Ungrouped Data
Measures of variability describe the spread or the
dispersion of a set of data.
Common Measures of Variability
Range
Interquartile Range
Mean Absolute Deviation
Variance
Standard Deviation
Z scores
Coefficient of Variation
Range
The difference between the largest and the
smallest values in a set of data
Simple to compute
Ignores all data points except the two extremes
Example:
Range = Largest - Smallest = 48 - 35 = 13

35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48
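The range uses only the two extreme values, which a short sketch makes explicit. The data are the 24 values from the example; the variable names are illustrative.

```python
data = [
    35, 41, 44, 45,
    37, 41, 44, 46,
    37, 43, 44, 46,
    39, 43, 44, 46,
    40, 43, 44, 46,
    40, 43, 45, 48,
]

largest = max(data)
smallest = min(data)
data_range = largest - smallest  # every other value is ignored

print(data_range)  # 13
```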
Interquartile Range

Range of values between the first and third
quartiles
Range of the middle half
Less influenced by extremes
Interquartile Range = $Q_3 - Q_1$
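As a worked illustration, taking the quartile values from the earlier eight-value quartiles example (Q1 = 111.5 and Q3 = 123.5), the interquartile range would be:

```latex
\text{Interquartile Range} = Q_3 - Q_1 = 123.5 - 111.5 = 12.0
```

The smallest and largest observations (106 and 129) play no part in this result, which is why the measure is less influenced by extremes.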
Deviation from the Mean
Data set: 5, 9, 16, 17, 18
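Mean: $\mu = \frac{\sum X}{N} = \frac{65}{5} = 13$
Deviations from the mean: -8, -4, 3, 4, 5
[Figure: number line from 0 to 20 showing the deviations -8, -4, +3, +4 and +5 from the mean of 13]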


Mean Absolute Deviation
Average of the absolute deviations from the
mean
$\text{M.A.D.} = \frac{\sum |X - \mu|}{N} = \frac{24}{5} = 4.8$

X       X - μ    |X - μ|
5       -8       +8
9       -4       +4
16      +3       +3
17      +4       +4
18      +5       +5
Total    0       24
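The table above can be reproduced step by step. This is a minimal sketch of the mean absolute deviation for the five-value data set, with variable names chosen for illustration.

```python
data = [5, 9, 16, 17, 18]

mean = sum(data) / len(data)                      # 13.0
deviations = [x - mean for x in data]             # -8, -4, +3, +4, +5
absolute_deviations = [abs(d) for d in deviations]

# The signed deviations cancel out, so absolute values are averaged instead.
mad = sum(absolute_deviations) / len(data)

print(sum(deviations))           # 0.0
print(sum(absolute_deviations))  # 24.0
print(mad)                       # 4.8
```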
Population Variance
Average of the squared deviations from the
arithmetic mean
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$

X       X - μ    (X - μ)²
5       -8       64
9       -4       16
16      +3       9
17      +4       16
18      +5       25
Total    0       130
Population Standard Deviation
Square root of the
variance
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$
$\sigma = \sqrt{\sigma^2} = \sqrt{26.0} \approx 5.1$

X       X - μ    (X - μ)²
5       -8       64
9       -4       16
16      +3       9
17      +4       16
18      +5       25
Total    0       130
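The population variance and standard deviation can be computed directly from their definitions. The sketch below does the arithmetic by hand and then cross-checks it against `statistics.pvariance` and `statistics.pstdev` from the Python standard library, which divide by N as the population formulas do.

```python
import math
import statistics

data = [5, 9, 16, 17, 18]
n = len(data)
mean = sum(data) / n  # 13.0

# Sum of squared deviations from the mean: 64 + 16 + 9 + 16 + 25 = 130
sum_of_squares = sum((x - mean) ** 2 for x in data)

population_variance = sum_of_squares / n             # 26.0
population_std = math.sqrt(population_variance)      # about 5.1

print(population_variance)
print(round(population_std, 1))

# Cross-check against the standard library.
print(math.isclose(population_variance, statistics.pvariance(data)))
print(math.isclose(population_std, statistics.pstdev(data)))
```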
Sample Variance
Sum of the squared deviations from the sample
mean, divided by n - 1
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$

X        X - X̄    (X - X̄)²
2,398    625       390,625
1,844    71        5,041
1,539    -234      54,756
1,311    -462      213,444
Total
7,092    0         663,866
Sample Standard Deviation
Square root of the
sample variance
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$
$S = \sqrt{S^2} = \sqrt{221{,}288.67} \approx 470.41$

X        X - X̄    (X - X̄)²
2,398    625       390,625
1,844    71        5,041
1,539    -234      54,756
1,311    -462      213,444
Total
7,092    0         663,866
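The sample formulas differ from the population ones only in the divisor, n - 1 instead of N. The sketch below reproduces the four-value example by hand and cross-checks it against `statistics.variance` and `statistics.stdev`, which use the n - 1 divisor.

```python
import math
import statistics

sample = [2398, 1844, 1539, 1311]
n = len(sample)
sample_mean = sum(sample) / n  # 7,092 / 4 = 1,773

# Sum of squared deviations from the sample mean: 663,866
sum_of_squares = sum((x - sample_mean) ** 2 for x in sample)

sample_variance = sum_of_squares / (n - 1)  # divide by n - 1, not n
sample_std = math.sqrt(sample_variance)

print(round(sample_variance, 2))  # 221288.67
print(round(sample_std, 2))       # 470.41

# Cross-check against the standard library.
print(math.isclose(sample_variance, statistics.variance(sample)))
print(math.isclose(sample_std, statistics.stdev(sample)))
```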
Uses of Standard Deviation
Indicator of financial risk
Quality Control
construction of quality control charts
process capability studies
Comparing populations
household incomes in two cities
employee absenteeism at two plants
Standard Deviation as an
Indicator of Financial Risk
Financial Security    Annualized Rate of Return
                      Mean (μ)    Standard Deviation (σ)
A                     15%         3%
B                     15%         7%
Coefficient of Variation
Ratio of the standard deviation to the mean,
expressed as a percentage
Measurement of relative dispersion


$\text{C.V.} = \frac{\sigma}{\mu}(100)$

Coefficient of Variation
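$\mu_1 = 29, \quad \sigma_1 = 4.6$
$\mu_2 = 84, \quad \sigma_2 = 10$
$\text{C.V.}_1 = \frac{\sigma_1}{\mu_1}(100) = \frac{4.6}{29}(100) = 15.86$
$\text{C.V.}_2 = \frac{\sigma_2}{\mu_2}(100) = \frac{10}{84}(100) = 11.90$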
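The comparison above can be sketched as a one-line function. The function name `coefficient_of_variation` is chosen here for illustration; it simply applies the ratio of standard deviation to mean, expressed as a percentage.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

cv_1 = coefficient_of_variation(4.6, 29)
cv_2 = coefficient_of_variation(10, 84)

print(round(cv_1, 2))  # 15.86
print(round(cv_2, 2))  # 11.9
```

The second data set has the larger standard deviation (10 against 4.6) but the smaller coefficient of variation, so relative to its mean it is the less dispersed of the two.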
A home theatre in a box is the easiest and cheapest way to provide surround
sound for a home entertainment centre. A sample of prices is shown here
(Consumer Reports Buying Guide, 2013). The prices are for models with a
DVD player and for models without a DVD player.

Models with DVD Player    Price    Models without DVD Player    Price
Sony HT-1800DP            $450     Pioneer HTP-230              $300
Pioneer HTD-330DV          300     Sony HT-DDW750                300
Sony HT-C800DP             400     Kenwood HTB-306               360
Panasonic SC-HT900         500     RCA RT-2600                   290
Panasonic SC-MTI           400     Kenwood HTB-206               300

Compute the mean price for models with a DVD player and the mean price for
models without a DVD player. What is the additional price paid to have a DVD
player included in a home theatre unit?
Compute the range, variance, and standard deviation for the two samples. What
does this information tell you about the prices for models with and without a DVD
player?
Statistic             Price with DVD player    Price without DVD player
Mean                  410                      310
Standard Error        33.1662479               12.64911064
Median                400                      300
Mode                  400                      300
Standard Deviation    74.16198487              28.28427125
Sample Variance       5500                     800
Kurtosis              0.867768595              4.578125
Skewness              -0.551618069             2.099223257
Range                 200                      70
Minimum               300                      290
Maximum               500                      360
Sum                   2050                     1550
Count                 5                        5
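Several rows of the summary table above can be reproduced with the Python standard library. This is a sketch rather than the tool that generated the table; the helper name `summarize` is hypothetical, and the standard error is computed here as the sample standard deviation divided by the square root of the count, which is the usual definition of the standard error of the mean.

```python
import math
import statistics

with_dvd = [450, 300, 400, 500, 400]
without_dvd = [300, 300, 360, 290, 300]

def summarize(prices):
    """Selected sample statistics for a list of prices."""
    std_dev = statistics.stdev(prices)  # sample standard deviation (n - 1)
    return {
        "mean": statistics.mean(prices),
        "range": max(prices) - min(prices),
        "variance": statistics.variance(prices),  # sample variance (n - 1)
        "std_dev": std_dev,
        "std_error": std_dev / math.sqrt(len(prices)),
    }

summary_with = summarize(with_dvd)
summary_without = summarize(without_dvd)

# Additional price paid, on average, for a unit that includes a DVD player.
additional_price = summary_with["mean"] - summary_without["mean"]

print(summary_with)
print(summary_without)
print(additional_price)  # 100
```

In this sample the models with a DVD player are both more expensive on average and more variable in price.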
The following data were used to construct the histograms of the number of
days required to fill orders for Dawson Supply, Inc., and J.C. Clark
Distributors

Dawson Supply Days for Delivery: 11 10 9 10 11 11 10 11 10 10

Clark Distributors Days for Delivery: 8 10 13 7 10 11 10 7 15 12

Use the range and standard deviation to support that Dawson Supply
provides the more consistent and reliable delivery times.
Statistic                       Dawson          Clark
Mean                            10.3            10.3
Standard Error                  0.213437475     0.817176711
Median                          10              10
Mode                            10              10
Standard Deviation              0.674948558     2.584139659
Sample Variance                 0.455555556     6.677777778
Kurtosis                        -0.282994816    -0.350865189
Skewness                        -0.433637384    0.359288855
Range                           2               8
Minimum                         9               7
Maximum                         11              15
Sum                             103             103
Count                           10              10
Coefficient of variation (%)    6.552898619     25.08873455
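The range, standard deviation and coefficient of variation for the two suppliers can be recomputed from the delivery data. The sketch below uses the sample standard deviation (n - 1 divisor); the helper name `spread` is hypothetical.

```python
import statistics

dawson = [11, 10, 9, 10, 11, 11, 10, 11, 10, 10]
clark = [8, 10, 13, 7, 10, 11, 10, 7, 15, 12]

def spread(days):
    """Range, sample standard deviation and coefficient of variation (%)."""
    mean = statistics.mean(days)
    std_dev = statistics.stdev(days)
    return {
        "mean": mean,
        "range": max(days) - min(days),
        "std_dev": std_dev,
        "cv_percent": std_dev / mean * 100,
    }

dawson_spread = spread(dawson)
clark_spread = spread(clark)

print(dawson_spread)
print(clark_spread)
```

Both suppliers average 10.3 days, but Dawson's range (2 days) and standard deviation (about 0.67 days) are much smaller than Clark's (8 days and about 2.58 days), which supports the claim that Dawson is the more consistent supplier.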
Practice
The following times were recorded by the quarter-mile and mile runners of
a university track team (times are in minutes).
Quarter-Mile Times: .92 .98 1.04 .90 .99
Mile Times: 4.52 4.35 4.60 4.70 4.50
After viewing this sample of running times, one of the coaches commented
that the quarter milers turned in the more consistent times. Use the standard
deviation and the coefficient of variation to summarize the variability in the
data. Does the use of the coefficient of variation indicate that the coach's
statement should be qualified?
Chapter 3 ( 99 -123)
