
Measures of Central Tendency

Introduction
Measures of central tendency are measures of the location of the middle or the center
of a distribution. The definition of "middle" or "center" is purposely left somewhat
vague so that the term "central tendency" can refer to a wide variety of measures.
Measures of central tendency attempt to quantify what we mean when we
think of the "typical" or "average" score in a data set. The concept is extremely
important and we encounter it frequently in daily life. For example, before purchasing
a car we often want to know its average distance travelled per litre of petrol. Or before
accepting a job, you might want to know what a typical salary is for people in that
position so you will know whether or not you are going to be paid what you are
worth. Or, if you are a smoker, you might often think about how many cigarettes you
smoke "on average" per day. Statistics geared toward measuring central tendency all
focus on this concept of "typical" or "average." As we will see, we often ask questions
in psychological science revolving around how groups differ from each other "on
average". Answers to such a question tell us a lot about the phenomenon or process
we are studying.
The concept of Measures of Central Tendency:

Describe the center point of a data set with a single value

Valuable tool to help us summarize many pieces of data with a single number

The ways to measure the central tendency of our data are mean, median and
mode.

This chapter defines the three most common measures of central tendency: the
mean, the median, and the mode. Often it is desirable to have a certain number to
describe a set of data. In other words, this one number would be representative of the
data. Since a representative number should be close to the "middle" of the data, we
call these measures of central tendency. The weakest of these measures is
the mode, which is discussed last.

Mean (Average)
The mean is the most powerful, and usually the most accurate and reliable, measure of
central tendency. Usually, when we hear the word "average", what we are really
thinking about is the mean. To find the mean for a set of data, we take the sum of all
of the values, and divide the sum by how many values there are.
The mean, or "average", is the most widely used measure of central tendency. The
mean is defined technically as the sum of all the data scores divided by n (the number
of scores in the distribution). In a sample, we often symbolise the mean with a letter
with a line over it. If the letter is "X", then the mean is symbolised as $\bar{X}$, pronounced
"X-bar." If we use the letter X to represent the variable being measured, then
symbolically, the sample mean is defined as

$$\bar{X} = \frac{\sum X}{n}$$

where:

$\bar{X}$ is the mean of the sample,
$\sum X$ is the sum of all the values, and
n is the number of values in the set.

If we are looking for the mean of a population, we denote that mean by the Greek
letter $\mu$ (mu). The way to calculate this mean is the same. The difference in notation is
to tell a sample statistic, $\bar{X}$, from a population parameter, $\mu$. We will always use our
own alphabet when discussing a sample statistic, and the Greek alphabet to discuss a
population parameter.

The formula for the population mean is

$$\mu = \frac{\sum X}{N}$$

where:

$\mu$ is the mean of the population,
$\sum X$ is the sum of all the values, and
N is the number of values in the set.

Let's try an example.

Joe D. Student got the following scores on his 5 statistics exams: 89, 83, 71, 95, 73.
Find Joe's mean test score.

$$\mu = \frac{89 + 83 + 71 + 95 + 73}{5} = \frac{411}{5} = 82.2$$

So, why $\mu$ and not $\bar{X}$? Since these represent all of Joe's 5 tests, we treat them as a
population. But again the difference here is only in name. Later there will be
calculations that are done differently for samples and populations.
It will be to your advantage to be able to use your calculator to compute the mean.
Your calculator has a built-in way to calculate the mean of a set of data. Most non-graphing calculators use some or all of the following steps.
1. Put your calculator into statistics mode.
2. Make sure that your statistical registers are cleared. These are the memory locations
where your calculator stores the values.
3. Enter your numbers into the calculator by pressing the number and then hitting the
key that will "store" the number in the statistical registers. The key will either have
Σ+, M+, or Data on it.
4. Once all the numbers have been entered, push the key with $\bar{x}$ over it. Usually, you
will have to push the 2nd key or the Shift key or the Inv key. Voila! There it is. Try it
for the above set of data.
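
Where a calculator is not to hand, the same computation can be sketched in Python. This is only an illustration of the definition (sum of the values divided by how many there are), using Joe's five exam scores; the variable names are chosen here and are not part of the original notes.

```python
from statistics import mean

# Joe's five statistics exam scores, taken from the example above.
scores = [89, 83, 71, 95, 73]

# Mean by the definition: sum of all values divided by the number of values.
manual_mean = sum(scores) / len(scores)

# The standard library offers the same calculation.
library_mean = mean(scores)

print(manual_mean)   # 82.2
print(library_mean)  # 82.2
```

Both lines print the same figure because the population and sample means are calculated in exactly the same way; only the notation differs.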

Weighted Mean
Weighted mean allows you to assign more weight to certain values and less weight to
others. The weighted mean is similar to an arithmetic mean where instead of each of
the data points contributing equally to the final average, some data points contribute
more than others. The notion of weighted mean plays a role in descriptive statistics
and also occurs in a more general form in several other areas of mathematics.
Here's the formula, where each value X carries a weight w:

$$\bar{X}_w = \frac{\sum wX}{\sum w}$$

For example, let's say the Aviation Management College (AMC) recently listed the
names of people who had worked at the college for 15, 20, 25, and 30 years. There
were 8 people who had worked for 15 years, 5 people for 20 years, 4 people for 25
years, and 1 person for 30 years. If we wanted to find the mean length of service for
these 18 people, we could add 15 + 15 + ... + 15 + 20 + ... + 20 + 25 + ... + 25 + 30,
then divide the total by 18. But we could also use the weighted mean. If a value is
repeated, we multiply it by the number of times it appears in the list. We repeat this
for all values in the list, add these products, then divide by the total number of values.
The number 15 appears 8 times in the above list. 15 is a value, and 8 is its weight. 20
is a value, and 5 is its weight.
Hence, the problem can be solved using the weighted formula:

$$\bar{X}_w = \frac{(8)(15) + (5)(20) + (4)(25) + (1)(30)}{8 + 5 + 4 + 1} = \frac{350}{18} \approx 19.44 \text{ years}$$
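
The weighted calculation for the AMC example can also be sketched in Python. The function name `weighted_mean` and the variable names are illustrative choices, not something defined in these notes; the sketch simply multiplies each value by its weight, adds the products, and divides by the total weight.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("total weight must not be zero")
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    return weighted_sum / total_weight


# Years of service (values) and number of people (weights) from the AMC example.
years = [15, 20, 25, 30]
people = [8, 5, 4, 1]

service_mean = weighted_mean(years, people)
print(round(service_mean, 2))  # 19.44
```

Writing out all 18 values and taking the ordinary mean would give the same answer; the weights only save the repetition.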

Mean From a Frequency Distribution


Sometimes we are given a chart showing frequencies of certain groups instead of the
actual values. We can still come up with a good estimate of a typical value for the set
of data, provided that we make some assumptions. We assume that the values in each
class or group are spread evenly throughout the group. If this is the case, then the
mean for each class should be approximately equal to the midpoint for each class. The
midpoint is found by adding the lower class boundary to the higher class boundary,
then dividing that sum by 2. So for each class, we have a mean and a number of
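values. We now call on our friend the weighted mean. If we multiply each midpoint
by its frequency, add these products, and then divide by the total number of values in
the frequency distribution, we have an estimate of the mean. The formula for the mean
from a frequency distribution, where m is a class midpoint and f is its frequency, is:

$$\bar{X} \approx \frac{\sum mf}{\sum f}$$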

Let's try an example.


Ages of Statistics Students in Bandar Baru Bangi 2008

Ages      Frequency
17-21     12
22-26     15
27-31     7
32-36     4
37-41     2

Estimate the mean for this set of data.


Class     Midpoint (m)        Frequency (f)    mf
17-21     (17+21)/2 = 19      12               19 x 12 = 228
22-26     (22+26)/2 = 24      15               24 x 15 = 360
27-31     (27+31)/2 = 29      7                29 x 7 = 203
32-36     (32+36)/2 = 34      4                34 x 4 = 136
37-41     (37+41)/2 = 39      2                39 x 2 = 78
Total                         40               1005

The sum of the product of the midpoints and frequencies is 1005. Divide this number
by 40 and we estimate the mean to be 25.125.
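
The midpoint method can be sketched in Python as follows. The function name `grouped_mean` and the tuple layout (lower limit, upper limit, frequency) are assumptions made for this illustration; the class frequencies are those used in the worked table above.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) class tuples.

    Each class is represented by its midpoint, (lower + upper) / 2,
    which assumes the values are spread evenly within the class.
    """
    total_f = sum(f for _, _, f in classes)
    if total_f == 0:
        raise ValueError("total frequency must not be zero")
    total_mf = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return total_mf / total_f


# Ages of statistics students: (lower limit, upper limit, frequency).
ages = [(17, 21, 12), (22, 26, 15), (27, 31, 7), (32, 36, 4), (37, 41, 2)]

estimated_mean = grouped_mean(ages)
print(estimated_mean)  # 25.125
```

The result is an estimate only, because the actual ages inside each class are not known.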

Median
Technically, the median of a distribution is the value that cuts the distribution exactly
in half, such that an equal number of scores are larger than that value as there are
smaller than that value. The median is by definition what we call the 50th percentile.
This is an ideal definition, and often a distribution can't be cut exactly in half in this
way. Even so, we can still define the median of the distribution.
The median is most easily computed by sorting the data in the data set from smallest
to largest. The median is the "middle" score in the distribution. Suppose we have the
following scores in a data set: 5, 7, 6, 1, 8. Sorting the data, we have: 1, 5, 6, 7, 8. The
"middle score" is 6, so the median is 6. Half of the (remaining) scores are larger than
6 and half of the (remaining) scores are smaller than 6.
To derive the median, use the following rule. First, compute (n+1)/2, where n is the
number of data points. Here, there are 5, so n = 5. If (n+1)/2 is an integer, the median
is the value that is in the (n+1)/2 location in the sorted distribution. Here, (n+1)/2 =
6/2 or 3, which is an integer. So the median is the 3rd score in the sorted distribution,
which is 6. If (n+1)/2 is not an integer, then there is no "middle" score. In such a case,
the median is defined as one half of the sum of the two data points that hold the two
nearest locations to (n+1)/2. For example, suppose the data are 1, 4, 6, 5, 8, 0. The
sorted distribution is 0, 1, 4, 5, 6, 8. n = 6, and (n+1)/2 = 7/2 = 3.5. This is not an
integer. So the median is one half of the sum of the 3rd and 4th scores in the sorted
distribution. The 3rd score is 4 and the 4th score is 5. One half of 4 + 5 is 9/2 or 4.5.
So the median is 4.5. Here, notice that half of the scores are above 4.5 and half are
below. In this case, the ideal definition is satisfied. Also, notice that the median may
not be an actual value in the data set. Indeed, the median may not even be a possible
value.
For example, find the median of the following numbers: 98, 86, 46, 63, 66, 94, 31,
56, 51, 75, 48.
First put them in order: 31, 46, 48, 51, 56, 63, 66, 75, 86, 94, 98. Here n = 11.
Halfway point: (11+1)/2 = 6
Hence, the median is the 6th value in the sorted list.
The median is 63.
For example, find the median of: 93, 90, 62, 44, 75, 89, 74, 100, 78, 61, 78, 81, 57,
67.
First put them in order: 44, 57, 61, 62, 67, 74, 75, 78, 78, 81, 89, 90, 93, 100. Here n = 14.
Halfway point: (14+1)/2 = 7.5
Hence, the median is the mean of the 7th and 8th values in the sorted list: (75 + 78)/2 = 76.5.
The median is 76.5.
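
The (n+1)/2 rule described in this section can be sketched in Python. The function name `median_by_rule` is a label chosen for this illustration; the code sorts the data, finds the halfway position, and either picks the value at that position or averages its two neighbours.

```python
def median_by_rule(data):
    """Median via the (n + 1) / 2 rule applied to the sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    position = (n + 1) / 2  # 1-based location in the sorted list
    if position.is_integer():
        # Odd n: a single middle score exists.
        return ordered[int(position) - 1]
    # Even n: average the two scores on either side of the halfway point.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


print(median_by_rule([5, 7, 6, 1, 8]))     # 6
print(median_by_rule([1, 4, 6, 5, 8, 0]))  # 4.5
print(median_by_rule([98, 86, 46, 63, 66, 94, 31, 56, 51, 75, 48]))  # 63
print(median_by_rule([93, 90, 62, 44, 75, 89, 74, 100, 78, 61, 78, 81, 57, 67]))  # 76.5
```

The subtraction of 1 is needed only because Python lists are indexed from 0 while the rule counts positions from 1.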

Mode
By far the simplest, but also the least widely used, measure of central tendency is the
mode. The mode in a distribution of data is simply the score that occurs most
frequently. In statistics, the mode is the value that occurs the most frequently in a data
set or a probability distribution. In some fields, notably education, sample data are
often called scores, and the sample mode is known as the modal score.
Like the statistical mean and the median, the mode is a way of capturing important
information about a random variable or a population in a single quantity. The mode is
in general different from the mean and median, and may be very different for strongly
skewed distributions. A data set can have more than one mode if several values tie
for the highest frequency, and it can have no mode at all.
For example: Find the mode for the following set of data: 4, 6, 6, 7, 11, 11, 11, 12
Ans. The mode is 11, because it occurs more times (3) than any other number.
One weakness of the mode is that sometimes a set of data can have more than one
mode.
For example, find the mode for the following set of data: 4, 6, 6, 6, 7, 11, 11, 11, 12
Ans. The modes are 6 and 11, because each occurs 3 times.
A set of data with 2 modes is sometimes called bimodal.
Sometimes a set of data doesn't have a mode. This happens when no value is repeated
in the set.
For example, find the mode for the following set of data: 4, 5, 6, 7, 10, 11, 12, 13
Ans. This set of data has no mode.
So, sometimes a set of data has more than one mode, and sometimes a set of data
doesn't even have a mode. Another weakness is that the mode occasionally is not a
typical value for the set of data. Consider the set of values: 5, 5, 73, 75, 77, 78, 79, 80,
82, 83, 84. The mode is 5, but is 5 representative of this set of values? Of course not!
This set of values, with the exception of the two outliers of 5, is made up of values in
the 70's and 80's. If you were told that the mode for a set of data was 5, and you did
not see the actual values, would you guess that most of the numbers were in the 70's
and 80's? Probably not.
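
The three situations described in this section (one mode, several modes, no mode) can be sketched in Python. The function name `modes` is an illustrative choice, and the sketch adopts this chapter's convention that a data set in which no value is repeated has no mode.

```python
from collections import Counter


def modes(data):
    """Return a sorted list of the modes; empty if no value is repeated."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs once, so by this chapter's convention there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([4, 6, 6, 7, 11, 11, 11, 12]))     # [11]
print(modes([4, 6, 6, 6, 7, 11, 11, 11, 12]))  # [6, 11]
print(modes([4, 5, 6, 7, 10, 11, 12, 13]))     # []
print(modes([5, 5, 73, 75, 77, 78, 79, 80, 82, 83, 84]))  # [5]
```

The last call reproduces the unrepresentative mode of 5 discussed above. Note that Python's own `statistics.multimode` follows a different convention and returns every value when none is repeated.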

Exercises
1) In trying to estimate Faiz Aizat's mean bowling score, six of his games are selected
at random. The scores are 187, 169, 172, 209, 154, and 195. Find the mean for these
six scores.
2) During the first 5 weeks of the 1995 NFL season, the San Francisco Forty Niners
gained the following number of yards rushing: 154, 158, 90, 78, and 109. George
Seifert was interested in his team's rushing performance through the first 5 weeks of
the season. Find the mean rushing yardage.
3) "I wonder how many points, on average, are scored by a typical NFL team in a
game?" wonders Joe D. Sportsfan. Joe gets his local paper that shows all of this
weekend's results. Here are the points scored that week:
24, 21, 14, 17, 10, 3, 20, 23, 24, 22, 21, 6, 17, 14, 20, 23, 14, 52, 7, 17, 34, 10, 7, 27,
14, 31, 7, 22, 35, 0
Find the mean score for this data.
4) Below are the marks for Aviation History in 2007.

Calculate:
a) Mean
b) Median
c) Mode

5) Here are some test score data from the AMC Statistics class in the year 2008.

65    91    85    76    85    87    79    93
82    75    100   70    88    78    83    59
87    69    89    54    74    89    83    80
94    67    77    92    82    70    94    84
96    98    46    70    90    96    88    72

a) Construct a frequency distribution with 6 classes
b) Calculate the mean, median and mode of the scores

Answer
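1) This data is sample data.

$$\bar{X} = \frac{\sum X}{n} = \frac{1086}{6} = 181$$

The mean score is 181.

2) This data represents population data.

$$\mu = \frac{\sum X}{N} = \frac{589}{5} = 117.8$$

The mean rushing yardage is 117.8 yards.

3) This data is sample data.

$$\bar{X} = \frac{\sum X}{n} = \frac{556}{30} \approx 18.53$$

The mean score, rounded to 2 decimal places, is 18.53.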

4) a)

= 29.75
b) Half way point = (n+1)/2
= (40+1)/2
= 20.5
Hence, the median mark is in the class (20-29)
c) The modal mark is in the class (20-29)
5)
AMC Statistics Test scores in 2008

Statistics Test    No. of Students    RF       CFD      Class Midpoint    mf
Scores (x)         (f)                                  (m)
41-50              1                  2.5%     2.5%     45.5              45.5
51-60              2                  5%       7.5%     55.5              111
61-70              6                  15%      22.5%    65.5              393
71-80              8                  20%      42.5%    75.5              604
81-90              14                 35%      77.5%    85.5              1197
91-100             9                  22.5%    100%     95.5              859.5
Total              40                 100%                                3210

Hence, mean of Statistics Test Scores = 3210/40 = 80.25

Median of the Statistics Test Scores is in the class (81-90)
Mode of the Statistics Test Scores is in the class (81-90)
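
The answer to question 5 can be checked with a short Python sketch. It assumes the 40 scores listed in the exercise and the six classes used in the answer; all variable names are illustrative. The sketch counts how many scores fall in each class and then estimates the mean from the class midpoints.

```python
# The 40 test scores from exercise 5.
scores = [
    65, 82, 87, 94, 96, 91, 75, 69, 67, 98,
    85, 100, 89, 77, 46, 76, 70, 54, 92, 70,
    85, 88, 74, 82, 90, 87, 78, 89, 70, 96,
    79, 83, 83, 94, 88, 93, 59, 80, 84, 72,
]

# Six classes (lower limit, upper limit), as used in the answer.
classes = [(41, 50), (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]

# Frequency of each class: how many scores lie within its limits.
frequencies = [sum(lower <= s <= upper for s in scores) for lower, upper in classes]

# Class midpoints and the grouped estimate of the mean, sum(mf) / sum(f).
midpoints = [(lower + upper) / 2 for lower, upper in classes]
total_mf = sum(m * f for m, f in zip(midpoints, frequencies))
grouped_mean = total_mf / sum(frequencies)

print(frequencies)   # [1, 2, 6, 8, 14, 9]
print(total_mf)      # 3210.0
print(grouped_mean)  # 80.25
```

The class with the largest frequency, 81-90, is the modal class reported in the answer.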
