
Workshop 2

Numerical Descriptive Measures

BUSS1020 | Quantitative Business Analysis


Scott Liu

12 March 2015

Overview

Objectives

Lecture Recap

Learning the Basics

Applying Techniques

Objectives

After this workshop, students should be able to:

1. Use Excel and STATCRUNCH to produce descriptive (or summary) statistics.
2. Analyse and explain the output from descriptive statistics:
a) To describe the central tendency in numerical data
b) To describe the variation in numerical data
c) To describe the shape of a data distribution
3. Understand the box-plot as another graphical tool for describing the characteristics of
numerical data.

Lecture Recap

1. Central Tendency
2. Quartiles and percentiles and interquartile range
3. Five-number summary and box-plot
4. Variation and Shape
5. Covariance and Correlation

Central Tendency

Central tendency measures where the distribution tends towards. In other words, it finds the
central/typical value.

There are three measures of central tendency:


Mean: The average value.
Median: The middle value.
Mode: The most frequent value.
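
As a rough illustration of the three measures, the sketch below uses Python's standard `statistics` module on a small made-up sample. The data values are hypothetical and are not taken from the workshop files.

```python
import statistics

# Hypothetical sample, made up for illustration only.
data = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(data)      # sum of values / number of observations
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequent value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

Because the sample has an even number of observations, the median is the average of the two middle values (3 and 5).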


Sample Mean
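The mean ($\bar{X}$) is mathematically shown by:

$$\bar{X} = \frac{\text{sum of values}}{\text{number of observations}} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$$

Where:
$\bar{X}$ represents the mean
$n$ denotes the number of observations
$X_i$ represents the $i$th observation of $X$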


Sample Median
The median ($M$) is the middle value of an ordered array.

Given a sorted (smallest to largest) observation sample:

$$X_1 \le X_2 \le \cdots \le X_{n-1} \le X_n$$

If $n$ is odd,

$$M = X_{\frac{n+1}{2}}$$

If $n$ is even,

$$M = \frac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right)$$
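
The odd and even cases of the median can be sketched directly in code. The function name `sample_median` is an assumption made for this example. Note that Python lists are 0-based, so the ranked position (n + 1) / 2 corresponds to index n // 2.

```python
def sample_median(values):
    """Median of a sample: sort first, then apply the odd/even rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th ranked value, i.e. 0-based index n // 2.
        return ordered[mid]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th ranked values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(sample_median([7, 1, 3]))     # odd number of observations
print(sample_median([4, 1, 3, 2]))  # even number of observations
```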


Sample Mode
The mode is the value that appears most often.

There may be no mode, or more than one mode.
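
The "no mode" and "more than one mode" cases can be made concrete with a small sketch. The helper `sample_modes` is hypothetical, and it assumes the convention that a sample in which no value repeats has no mode.

```python
from collections import Counter


def sample_modes(values):
    """Return a sorted list of modes; an empty list means 'no mode'."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # Assumed convention: if no value repeats, there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(sample_modes([4, 4, 5]))        # one mode
print(sample_modes([1, 1, 2, 2, 3]))  # two modes
print(sample_modes([1, 2, 3]))        # no mode
```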


Mean, Median or Mode?


Be reasonable.

Look at the nature of the distribution.


Is the distribution concentrated near the middle?
Does the distribution have any outliers?
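
One way to see why outliers matter for this choice is to add a single extreme value to a sample and compare how the mean and the median react. The salary figures below are hypothetical.

```python
import statistics

# Hypothetical salaries in thousands of dollars.
salaries = [48, 50, 52, 55, 60]
with_outlier = salaries + [400]  # add one extreme value

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

In this sketch the mean roughly doubles while the median barely moves, which suggests the median is the more reasonable summary when outliers are present.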


Quartiles and Percentiles


Quartiles divide the distribution into four equal parts:
$Q_1$ represents the first (lower) quartile: 25% of the distribution lies below it and 75%
above it.
$Q_2$ represents the second quartile (the median): 50% of the distribution lies below it and
50% above it.
$Q_3$ represents the third (upper) quartile: 75% of the distribution lies below it and 25%
above it.
Each of the four parts contains 25% of the observations.

Percentiles are a more general expression. A percentile states what percentage of the
distribution lies below a given value. E.g. the 25th percentile (first quartile) has 25% of
the distribution below it.


How to find quartiles?


Method 1 (lecture method): Think of the quartiles as:
$Q_1$ = the median of the lower half of the data
$Q_3$ = the median of the upper half of the data

If $n$ is an odd number, exclude the median from each half.
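
A minimal sketch of Method 1, assuming the halves are taken from the sorted data and the overall median is dropped from both halves when the number of observations is odd. The function name `quartiles_by_halves` is made up for this example.

```python
import statistics


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the 'median of each half' method."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle observation so it is in neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )


print(quartiles_by_halves([1, 2, 3, 4, 5, 6, 7]))     # odd n
print(quartiles_by_halves([1, 2, 3, 4, 5, 6, 7, 8]))  # even n
```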


How to find quartiles?


Method 2 (textbook method):

$$Q_1 = \frac{n+1}{4} \text{ ranked value}$$

$$Q_2 = \frac{n+1}{2} \text{ ranked value}$$

$$Q_3 = \frac{3(n+1)}{4} \text{ ranked value}$$

If the result is a whole number, then it is the ranked position to be used.
If it is a fractional half (e.g. 2.5, 4.5, etc.), average the two corresponding data values.
If it is neither a whole number nor a fractional half (e.g. 2.8, 3.3, etc.), round it to the
nearest integer and use that ranked position.
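
The three rounding rules of Method 2 can be sketched as follows. The positions are (n + 1)/4, (n + 1)/2 and 3(n + 1)/4, counted from 1 in the sorted data. The helper names and the sample values are assumptions made for illustration.

```python
import math


def ranked_value(ordered, position):
    """Value at a 1-based ranked position, applying the three rules."""
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:
        # Whole number: use that ranked position directly.
        return ordered[lower - 1]
    if fraction == 0.5:
        # Fractional half: average the two neighbouring ranked values.
        return (ordered[lower - 1] + ordered[lower]) / 2
    # Otherwise: round the position to the nearest integer.
    nearest = lower + 1 if fraction > 0.5 else lower
    return ordered[nearest - 1]


def quartiles_by_rank(values):
    """Return (Q1, Q2, Q3) using ranked positions based on n + 1."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    q1 = ranked_value(ordered, (n + 1) / 4)
    q2 = ranked_value(ordered, (n + 1) / 2)
    q3 = ranked_value(ordered, 3 * (n + 1) / 4)
    return q1, q2, q3


# Hypothetical sample with n = 10: positions are 2.75, 5.5 and 8.25.
times = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]
print(quartiles_by_rank(times))
```

With n = 10, position 2.75 rounds to the 3rd ranked value, 5.5 averages the 5th and 6th, and 8.25 rounds to the 8th.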


Interquartile Range
Measures the spread of the middle 50% of the data:

$$IQR = Q_3 - Q_1$$

The benefit of this measure is that it is not influenced by extreme values or outliers.
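
The robustness claim can be checked with a small sketch that compares the range and the interquartile range before and after the largest value is replaced by an extreme one. It uses `statistics.quantiles` from the Python standard library (Python 3.8+). Its default "exclusive" method interpolates between ranked values, so on other data it may differ slightly from the rounding rules used in the lecture or textbook. The data are hypothetical.

```python
import statistics


def interquartile_range(values):
    """Q3 - Q1, with quartiles from statistics.quantiles (default method)."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


base = [1, 2, 3, 4, 5, 6, 7]
extreme = [1, 2, 3, 4, 5, 6, 700]  # same data, but the largest value is an outlier

print("range:", value_range(base), "->", value_range(extreme))
print("IQR:  ", interquartile_range(base), "->", interquartile_range(extreme))
```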


Five-number Summary
Consists of the minimum, the maximum and the three quartiles:

$$X_{\text{smallest}},\; Q_1,\; \text{Median},\; Q_3,\; X_{\text{largest}}$$

Each of the four intervals between consecutive values contains 25% of the observations.
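
A five-number summary can be assembled in a few lines. This sketch again relies on `statistics.quantiles` with its default interpolation method, which is an assumption and not necessarily the quartile rule used in the lecture. The sample values are made up.

```python
import statistics


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a sample."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)


# Hypothetical sample with n = 11.
sample = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35, 40]
print(five_number_summary(sample))
```

These five values are exactly what a box-plot draws: the box spans the first to the third quartile, the line inside the box marks the median, and the whiskers reach to the smallest and largest values.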


Box-Plot
Graphical representation of the five-number summary.

Example from lecture: [box-plot figure not reproduced]


Variation
Answers this question: How much does my data scatter
around the central value?
Common measures of variation:

Range
Variance
Standard Deviation
Interquartile Range


Range
Range is the difference between the largest and smallest
values in a data set.

Sample Variance and Sample Standard Deviation


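Sample variance is mathematically represented by:

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$

Sample standard deviation is its square root:

$$S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}$$

Where:
$S$ represents the sample standard deviation.
$S^2$ represents the sample variance.
$\bar{X}$ represents the sample mean and $n$ the number of observations.

Note: Standard deviation has the same unit as $X$ and roughly shows the average distance
from an observation to the sample mean.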
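
The sample variance and standard deviation can be computed step by step as a sketch: take deviations from the mean, square them, sum them, divide by n - 1, then take the square root for the standard deviation. Function names and data are assumptions for illustration.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Square root of the sample variance (same unit as the data)."""
    return math.sqrt(sample_variance(values))


# Hypothetical sample: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print("variance:", sample_variance(data))
print("std dev: ", sample_std_dev(data))
```

Dividing by n - 1 instead of n is what distinguishes the sample formulas from the population ones.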


Coefficient of Variation
Sometimes, when you compare the variability of two sets of data with different means, you
can standardise the standard deviation by the mean:

$$CV = \frac{S}{\bar{X}} \cdot 100\%$$

where $S$ is the sample standard deviation and $\bar{X}$ is the sample mean.

Interpretation:
Higher $CV$ represents greater variation relative to the mean.
Lower $CV$ represents lesser variation relative to the mean.
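
To see why dividing by the mean helps, the sketch below compares two hypothetical samples that have the same standard deviation but very different means. The coefficient of variation is computed as the sample standard deviation divided by the sample mean, expressed as a percentage.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean * 100


small_values = [9, 10, 11]    # hypothetical: mean 10, standard deviation 1
large_values = [99, 100, 101] # hypothetical: mean 100, standard deviation 1

print(coefficient_of_variation(small_values))
print(coefficient_of_variation(large_values))
```

Both samples scatter by one unit around their mean, yet the first varies far more relative to its size, which is what the higher coefficient reflects.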


Shape
Shape describes the pattern of the distribution. Common measures:
Sample skewness
Sample kurtosis

These are hard concepts, and you will lose marks if you interpret them incorrectly. All you
need to know is how to differentiate these shapes:

mean < median: left-skewed (negatively skewed)

mean = median: symmetric

mean > median: right-skewed (positively skewed)
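
The comparison of mean and median can be turned into a simple rule-of-thumb check. This is only a sketch: the function `describe_shape` and its `tolerance` parameter are hypothetical, and an equal mean and median suggests symmetry without proving it.

```python
import statistics


def describe_shape(values, tolerance=0.0):
    """Classify skew by comparing the mean with the median (rule of thumb)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if abs(mean - median) <= tolerance:
        return "symmetric"
    return "left-skewed" if mean < median else "right-skewed"


print(describe_shape([1, 2, 3, 4, 5]))      # mean equals median
print(describe_shape([1, 2, 3, 4, 50]))     # large value pulls the mean up
print(describe_shape([1, 47, 48, 49, 50]))  # small value pulls the mean down
```

With real data the mean and median are rarely exactly equal, so a small positive `tolerance` is usually more sensible than the default of zero.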


Sample Covariance
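Measures how two variables vary together in a linear relationship:

$$\text{cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)$$

Sample Coefficient of Correlation

More robust: it standardises the covariance by the sample standard deviations of $X$ and
$Y$ ($S_X$ and $S_Y$), so it measures the relative strength of the linear relationship on a
scale from $-1$ to $+1$:

$$r = \frac{\text{cov}(X, Y)}{S_X S_Y}$$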
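
Both formulas can be sketched in plain Python: the sample covariance averages the products of paired deviations over n - 1, and the coefficient of correlation divides that covariance by the two sample standard deviations. Function names and data are assumptions for illustration.

```python
import math


def sample_covariance(xs, ys):
    """Sum of products of paired deviations, divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    # The covariance of a variable with itself is its sample variance.
    s_x = math.sqrt(sample_covariance(xs, xs))
    s_y = math.sqrt(sample_covariance(ys, ys))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical paired data with a perfect positive linear relationship.
hours = [1, 2, 3, 4]
scores = [2, 4, 6, 8]
print("covariance: ", sample_covariance(hours, scores))
print("correlation:", sample_correlation(hours, scores))
```

The covariance changes if the units of either variable change, whereas the correlation does not, which is why the latter is easier to interpret.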


Z-Score
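Transforms an observation onto a standardised scale:

$$Z = \frac{X - \bar{X}}{S}$$

where $\bar{X}$ is the sample mean and $S$ is the sample standard deviation.

Interpretation: $X$ is $Z$ standard deviations away from the mean ($\bar{X}$).
Note: We will explore this in more detail later.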
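
A Z-score calculation can be sketched as follows: subtract the sample mean from each observation and divide by the sample standard deviation. The function name `z_scores` and the data are hypothetical.

```python
import statistics


def z_scores(values):
    """Distance of each observation from the mean, in standard deviations."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1)
    if std_dev == 0:
        raise ValueError("Z-scores are undefined when all values are equal")
    return [(x - mean) / std_dev for x in values]


# Hypothetical sample: mean 10, sample standard deviation 1.
print(z_scores([9, 10, 11]))
```

A negative Z-score means the observation lies below the mean and a positive one means it lies above.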

Learning the Basics

1. Exercise 1
2. Exercise 2
3. Exercise 3

Do you know:
How to find the mean, median, variance and standard
deviation using Excel and STATCRUNCH?
What is the significance of the mean and median being close
in value?
How to identify skewness?
Remember:
Do not put figures in your tables that you are not going to
discuss.
Keep decimal places consistent and reasonable. E.g. if you
are dealing with integers, do not exceed 2 decimal places.


Do you know:
How to decide which summary measure to use?


Remember:
Think with common sense.


(a)
(i) T
(ii) F
(iii) F
(b)
(i) F
(ii) T
(iii) T
(iv) T
(c)
F


Applying Techniques

Exercise 4
Exercise 5
Exercise 6
Exercise 7
Exercise 8
Exercise 9

Question 12
The file SUV contains the overall miles per gallon (MPG) of
2013 small SUVs:
a) Compute the mean, median and mode
b) Compute the variance, standard deviation, range, coefficient
of variation, and Z scores
c) Are the data skewed? If so, how?


Question 15
Is there a difference in the variation of the yields of different
types of investments? The file CD Rate contains the yields for
one-year certificates of deposit (CDs) and five-year CDs for 23
banks in the United States, as of March 20, 2013.
a) For one-year and five-year CDs, separately compute the
variance, standard deviation, range, and coefficient of
variation.
b) Based on the results of (a), do one-year CDs or five-year
CDs have more variation in the yields offered? Explain.


Question 48
College football is big business, with coaches' pay and total revenues in
millions of dollars. The file College Football contains pay and
revenues for college football in the 124 schools that are part of
the Division 1 Football Subdivision.
a) Compute the covariance.
b) Compute the coefficient of correlation.
c) Based on (a) and (b), what conclusions can you reach
regarding the relationship between coaches' total pay and
revenue?
