
AbouEl-Makarim A. Aboueissa, Ph.D.

Department of Mathematics and Statistics, USM

MAT 120D Introduction to Statistics
Descriptive Statistics (Organizing Data)
Complementary reading: Chapter 2, Weiss

Frequency Distribution

Organizing Data: Descriptive statistics involves collecting, organizing, and describing data so as to make the data more comprehensible. In this section we concentrate on how to organize, or summarize, data: presenting them in a form that is easy to comprehend, and representing them graphically.

Data in the original form in which they were collected are called raw data. Raw data on their own convey little useful information and can be confusing. They need to be organized so that they are easily comprehensible. A good way to do this is to construct a frequency distribution, which shows how the frequencies are distributed over the various categories.

Copyright 2011


Example 1:

A teacher gave a test in basic science to his Form 1 pupils. Their marks were:

3  3  4  5  8  5  5  8  6  4
0  5  5  4  7  7  6  3  6  4
6  5  7  7  6  6  8  7  5  1
1  3 10  7  5  7  5  6  6  4

It should be noted that the minimum mark is 0 and the maximum mark is 10. Construct a frequency distribution for these data.

Answer:

Frequency Distribution Table

Marks obtained (x)    Tally marks     Frequency (f)
 0                    /               1
 1                    //              2
 2                                    0
 3                    ////            4
 4                    /////           5
 5                    ///// ////      9
 6                    ///// ///       8
 7                    ///// //        7
 8                    ///             3
 9                                    0
10                    /               1
Total                                 Σf = 40


You can see that it is easier to gather the following types of information from the frequency table than the raw data.

(a) The highest mark is 10 and the lowest mark is 0.
(b) 4 pupils scored more than 7.
(c) 12 pupils scored less than 5.
(d) 8 pupils scored 6.
(e) Nobody scored 9.
(f) 40 pupils did the test.
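The tallying in Example 1 can also be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the list `marks` is the raw data of Example 1 typed in reading order, and the variable names are chosen here for convenience.

```python
from collections import Counter

# Raw marks of the 40 pupils from Example 1, in reading order.
marks = [
    3, 3, 4, 5, 8, 5, 5, 8, 6, 4,
    0, 5, 5, 4, 7, 7, 6, 3, 6, 4,
    6, 5, 7, 7, 6, 6, 8, 7, 5, 1,
    1, 3, 10, 7, 5, 7, 5, 6, 6, 4,
]

counts = Counter(marks)

# One row per possible mark from 0 to 10; marks nobody scored get frequency 0.
frequency = {x: counts.get(x, 0) for x in range(0, 11)}

# Print mark, tally marks (one slash per pupil) and frequency.
for x, f in frequency.items():
    print(f"{x:>2}  {'/' * f:<9}  {f}")

# Quantities read off the table in (a)-(f).
total = sum(frequency.values())
more_than_7 = sum(f for x, f in frequency.items() if x > 7)
less_than_5 = sum(f for x, f in frequency.items() if x < 5)
print("total:", total, "| more than 7:", more_than_7, "| less than 5:", less_than_5)
```

Building the dictionary over `range(0, 11)` rather than over the observed values keeps the rows for marks 2 and 9, which have frequency 0.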

Notes:
(1) The distribution of a variable (i.e. the characteristic under study, e.g. marks in the above example) together with the frequencies of its values is known as a frequency distribution.

(2) The frequency is the number of times a score (e.g. a mark in the example) is repeated.

(3) Sometimes we are interested in the relative frequency distribution. The relative frequency of a score is the number of times that score occurs relative to the total number of scores. It is obtained by:

$$\text{Relative frequency of a score} = \frac{\text{Frequency of the score}}{\text{Total frequency}} = \frac{f}{\sum f}$$

The relative frequency of a class is the class frequency, $f$, divided by the total number of data values, i.e., the overall sample size $n = \sum f$. For the first class, $f = 1$, $n = \sum f = 40$, and the relative frequency is

$$\frac{f}{\sum f} = \frac{1}{40} = 0.025.$$

If each relative frequency is multiplied by 100%, we have a percentage distribution.

Example 2:
Construct a relative frequency distribution for the data given in Example 1.

Answer:
The corresponding relative frequency distribution of these data is given below:

Relative Frequency Distribution Table

Marks obtained (x)    Frequency (f)    Relative Frequency (f/n)
 0                    1                1/40 = 0.025
 1                    2                2/40 = 0.050
 2                    0                0/40 = 0.000
 3                    4                4/40 = 0.100
 4                    5                5/40 = 0.125
 5                    9                9/40 = 0.225
 6                    8                8/40 = 0.200
 7                    7                7/40 = 0.175
 8                    3                3/40 = 0.075
 9                    0                0/40 = 0.000
10                    1                1/40 = 0.025
Total                 Σf = 40
We can see from this table that relative frequency of mark 4 is 0.125. It means that 12.5% of pupils got 4 marks.
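As a quick check of the relative frequencies and percentages, the calculation can be sketched as follows. The dictionary `frequency` simply restates the frequency table of Example 1; the names are illustrative.

```python
# Frequency table from Example 1: mark -> number of pupils.
frequency = {0: 1, 1: 2, 2: 0, 3: 4, 4: 5, 5: 9, 6: 8, 7: 7, 8: 3, 9: 0, 10: 1}

n = sum(frequency.values())  # total frequency, n = sum of f

# Relative frequency f / n and the corresponding percentage for each mark.
relative = {x: f / n for x, f in frequency.items()}
percent = {x: 100 * r for x, r in relative.items()}

for x in frequency:
    print(f"{x:>2}  f = {frequency[x]}  f/n = {relative[x]:.3f}  ({percent[x]:.1f}%)")
```

The relative frequencies always add up to 1 (and the percentages to 100), which is a useful check on a hand-built table.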

Categorical Frequency Distribution


This frequency distribution is used for categorical data.

Example 3:
Twenty-five army personnel were given a blood test to determine their blood type. The data set is:

A   B   B   A   O
AB  O   O   O   B
AB  AB  AB  B   B
B   O   A   O   B
A   A   O   O   O

Construct a frequency distribution and relative frequency distribution for these data.

Answer:

Blood Type (x)   Tally marks   Frequency (f)   Relative Frequency (f/n)   Percent ((f/n) × 100)
A                /////         5               5/25 = 0.20                20
B                ///// //      7               7/25 = 0.28                28
O                ///// ////    9               9/25 = 0.36                36
AB               ////          4               4/25 = 0.16                16
Total                          n = Σf = 25     1.00                       100

You can see that it is easier to gather the following types of information from the frequency table than the raw data.

(a) More people have type O blood than any other type.
(b) Very few have type AB.
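The same table can be produced for categorical data with a short Python sketch. The list `blood_types` is the data of Example 3 in reading order; everything else (variable names, output layout) is an illustrative choice.

```python
from collections import Counter

# Blood types of the 25 army personnel from Example 3, in reading order.
blood_types = [
    "A", "B", "B", "A", "O",
    "AB", "O", "O", "O", "B",
    "AB", "AB", "AB", "B", "B",
    "B", "O", "A", "O", "B",
    "A", "A", "O", "O", "O",
]

counts = Counter(blood_types)
n = len(blood_types)

# One row per category: frequency, relative frequency and percent.
for blood_type in ["A", "B", "O", "AB"]:
    f = counts[blood_type]
    print(f"{blood_type:<3} f = {f}  f/n = {f / n:.2f}  percent = {100 * f / n:.0f}")

# The most frequent category, as in observation (a).
most_common_type = counts.most_common(1)[0][0]
print("most common:", most_common_type)
```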


Grouped Frequency Distributions


If the number of distinct data values is large, the data must be grouped to make them more comprehensible as follows:

We divide an interval containing all the data into a small number of segments, usually of equal width. These segments are called classes (or class intervals).

Example 4:
The weights (in kg.) of 50 pieces of luggage are presented in a grouped frequency distribution with the class interval as follows:

Frequency Distribution Table

Weights (kg)    Number of pieces
7-9             2
10-12           8
13-15           14
16-18           19
19-21           7
Total           50

From the above frequency distribution we note the following:

(a) The intervals of weights, i.e. 7-9, 10-12, ..., 19-21, are known as class intervals.

(b) 7, 10, 13, 16, 19 are called the lower limits of the respective classes.

(c) 9, 12, 15, 18, 21 are called the upper limits of the respective classes.

(d) 6.5-9.5, 9.5-12.5, 12.5-15.5, 15.5-18.5 and 18.5-21.5 are known as class boundaries. These class boundaries are obtained by:

$$\text{Lower class boundary} = \text{lower class limit} - \frac{d}{2}, \qquad \text{Upper class boundary} = \text{upper class limit} + \frac{d}{2}$$

where $d$ is the difference between the upper limit of one class and the lower limit of the next class. For the above example, $d = 10 - 9 = 1$, so

$$\frac{d}{2} = \frac{1}{2} = 0.5.$$

(e) 2, 8, 14, 19 and 7 are called class frequencies.

(f) The class width is the difference between the upper and lower class boundaries of a class interval. Thus, the class width for the class interval 13-15 is

$$\text{Class width} = 15.5 - 12.5 = 3.$$

(g) The class mark (or midpoint), $x_m$, of a class interval is obtained by

$$x_m = \frac{\text{lower class boundary} + \text{upper class boundary}}{2}$$

or equivalently,

$$x_m = \frac{\text{lower class limit} + \text{upper class limit}}{2}.$$
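The definitions in (a) to (g) can be checked numerically. The following is a minimal sketch, assuming the luggage classes of Example 4; the list `classes` and the dictionary keys are names introduced here for illustration.

```python
# Class limits (lower, upper) of the luggage weights in Example 4.
classes = [(7, 9), (10, 12), (13, 15), (16, 18), (19, 21)]

# d = gap between the upper limit of one class and the lower limit of the next.
d = classes[1][0] - classes[0][1]  # 10 - 9 = 1

rows = []
for lower, upper in classes:
    lower_boundary = lower - d / 2
    upper_boundary = upper + d / 2
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower_boundary, upper_boundary),
        "width": upper_boundary - lower_boundary,  # difference of the boundaries
        "mark": (lower + upper) / 2,               # midpoint of the class limits
    })

for row in rows:
    print(row)
```

For the class 13-15 this gives boundaries 12.5 and 15.5, width 3 and class mark 14, in agreement with (d), (f) and (g).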


Constructing Frequency Distribution
When constructing a frequency distribution, we need to make the following three decisions:

(1) Number of Classes: The number of classes usually varies from 5 to 20. It depends on the number of observations in the data set. It is preferable to have more classes as the size of the data set increases.

Rule: The Sturges formula, given below, may help in deciding the number of classes:

$$c = 1 + 3.3 \log(n)$$

where $c$ is the number of classes, $n$ is the number of observations in the data set, and $\log$ denotes the base-10 logarithm.


(2) Class Width: When all classes have the same width, the class width can be determined by the following formula:

$$\text{Approximate class width} = \frac{\text{Largest value} - \text{Smallest value}}{\text{Number of classes}}$$

Then round the result up to the next whole number.

(3) Lower Limit of First Class: The smallest value in the data set, or a convenient value below it, can be used as the lower limit of the first class.
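The first two decisions can be sketched as small helper functions. This is an illustration only: the function names are hypothetical, the logarithm is taken to base 10, and the numbers used (n = 50; n = 70 with largest value 109 and smallest value 45) are those of Examples 5 and 8 in these notes.

```python
import math


def sturges_classes(n):
    """Suggested number of classes c = 1 + 3.3 * log10(n), before rounding."""
    return 1 + 3.3 * math.log10(n)


def class_width(largest, smallest, num_classes):
    """Approximate class width, rounded up to the next whole number."""
    return math.ceil((largest - smallest) / num_classes)


# n = 50 observations: c is about 6.61, which suggests 7 classes.
print(round(sturges_classes(50), 4), "->", round(sturges_classes(50)))

# n = 70 observations with values from 45 to 109, grouped into 7 classes.
num_classes = round(sturges_classes(70))
print(num_classes, class_width(109, 45, num_classes))
```

The Sturges value is only a guide; any number of classes between 5 and 20 that gives a readable table is acceptable.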

Relative Frequency:
The relative frequencies and percentages are obtained as follows:

$$\text{Relative frequency} = \frac{\text{Frequency of the class}}{\text{Total frequency}} = \frac{f}{\sum f}$$

and

$$\text{Percentage} = (\text{Relative frequency}) \times 100$$


Example 5:
The following are the marks (out of 100) obtained by 50 students of MAT120D in Spring Semester, 2009 Examination.

55 62 72 78 81

54 64 53 55 86

76 80 54 69 58

70 85 76 80 72

77 78 90 72 92

80 42 66 74 78

84 72 85 74 38

66 63 82 54 85

80 85 79 54 69

61 50 83 54 82

Construct a grouped frequency distribution. Use classes 30-39, 40-49, 50-59, etc.

Answer:
Number of classes:

$$c = 1 + 3.3 \log(n) = 1 + 3.3 \log(50) = 6.6066 \approx 7$$


Frequency Distribution Table

Class Intervals (Marks)   Tally marks          Number of Students in each class: Frequency (f)
30-39                     /                    1
40-49                     /                    1
50-59                     ///// /////          10
60-69                     ///// ///            8
70-79                     ///// ///// ////     14
80-89                     ///// ///// ////     14
90-99                     //                   2
Total                                          Σf = 50

Note: The lower class limit of the first class, 30, is less than the smallest data value, 38. The upper class limit of the last class, 99, is greater than the largest data value, 92.
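The grouping in Example 5 can be reproduced with a short Python sketch. The list `marks` is the raw data of Example 5 in reading order; the classes 30-39, ..., 90-99 are those prescribed in the example, and the variable names are illustrative.

```python
# Marks of the 50 students from Example 5, in reading order.
marks = [
    55, 62, 72, 78, 81, 54, 64, 53, 55, 86,
    76, 80, 54, 69, 58, 70, 85, 76, 80, 72,
    77, 78, 90, 72, 92, 80, 42, 66, 74, 78,
    84, 72, 85, 74, 38, 66, 63, 82, 54, 85,
    80, 85, 79, 54, 69, 61, 50, 83, 54, 82,
]

# Class limits 30-39, 40-49, ..., 90-99.
classes = [(lower, lower + 9) for lower in range(30, 100, 10)]

# Count how many marks fall within the limits of each class.
frequency = {c: 0 for c in classes}
for mark in marks:
    for lower, upper in classes:
        if lower <= mark <= upper:
            frequency[(lower, upper)] += 1
            break

for (lower, upper), f in frequency.items():
    print(f"{lower}-{upper}  {f}")
print("total:", sum(frequency.values()))
```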


Example 6:
The following is the distribution of the ages of new employees who joined a factory.

Frequency Distribution Table

Class Intervals (Ages)   Number of Employees: Frequency (f)
20-29                    7
30-39                    21
40-49                    4
50-59                    2
60-69                    1
Total                    Σf = 35

(a) Obtain the class boundaries and class marks of the class intervals.
(b) What is the upper class limit of the class 30-39?
(c) What is the lower class boundary of the class 50-59?
(d) What is the class mark of the class 40-49?
Answer:

(a) The class boundaries and class marks are given in the table below. It should be noted that $d = 30 - 29 = 1$, so

$$\frac{d}{2} = \frac{1}{2} = 0.5.$$

Therefore, for example, the lower class boundary of the first class is given by:

$$20 - \frac{d}{2} = 20 - 0.5 = 19.5$$

and the upper class boundary of the first class is given by:

$$29 + \frac{d}{2} = 29 + 0.5 = 29.5$$

The lower class boundary of the second class is given by:

$$30 - \frac{d}{2} = 30 - 0.5 = 29.5$$

and the upper class boundary of the second class is given by:

$$39 + \frac{d}{2} = 39 + 0.5 = 39.5$$

The lower class boundary of the third class is given by:

$$40 - \frac{d}{2} = 40 - 0.5 = 39.5$$

and the upper class boundary of the third class is given by:

$$49 + \frac{d}{2} = 49 + 0.5 = 49.5$$

The lower class boundary of the fourth class is given by:

$$50 - \frac{d}{2} = 50 - 0.5 = 49.5$$

and the upper class boundary of the fourth class is given by:

$$59 + \frac{d}{2} = 59 + 0.5 = 59.5$$

Finally, the lower class boundary of the fifth class is given by:

$$60 - \frac{d}{2} = 60 - 0.5 = 59.5$$

and the upper class boundary of the fifth class is given by:

$$69 + \frac{d}{2} = 69 + 0.5 = 69.5$$

Frequency Distribution Table

Class Intervals (Ages)   Class Boundaries   Class Mark (x_m)   Number of Employees: Frequency (f)
20-29                    19.5-29.5          24.5               7
30-39                    29.5-39.5          34.5               21
40-49                    39.5-49.5          44.5               4
50-59                    49.5-59.5          54.5               2
60-69                    59.5-69.5          64.5               1
Total                                                          Σf = 35

(b) 39

(c) 49.5

(d) $\frac{40 + 49}{2} = 44.5$


Example 7:
The following frequency distribution gives the lengths of 15 cucumbers.

Frequency Distribution Table

Length (cm)    Frequency (f)
5-10           3
10-15          4
15-20          5
20-25          2
25-30          1
Total          Σf = 15

(a) What is the upper class limit of the class interval 15-20?
(b) What is the lower class boundary of the class interval 15-20?
(c) What is the class width of the class interval 15-20?
(d) What is the class mark of the class interval 15-20?

Answer:

(a) 20

(b) 15

(c) 20 - 15 = 5

(d) $\frac{15 + 20}{2} = 17.5$


Example 8:
The following data represent glucose blood levels (mg/100 ml) after a 12-hour fast for a random sample of 70 women (Reference: American Journal of Clinical Nutrition, Vol. 19, pp. 345-351). These data are also available with other software on the statSpace CD-ROM. The data are:

45 85 81 93 65 101

66 77 76 85 89 71

83 82 96 83 70 109

71 90 83 80 80 73

76 87 67 78 84 73

64 72 94 80 77 80

59 79 101 85 65 72

59 69 94 83 46 81

76 83 89 84 80 63

82 71 94 74 70 74

80 87 73 81 75

81 69 99 70 45

(a) Find the class width.
(b) Make a frequency table showing class limits, class boundaries, midpoints, frequencies, relative frequencies, and cumulative frequencies.
(c) Draw a histogram.
(d) What is the shape of the data?


Answer:

(a)

For these data we have:

largest data value = 109
smallest data value = 45

Number of classes:

$$c = 1 + 3.3 \log(n) = 1 + 3.3 \log(70) = 7.0888 \approx 7$$

number of classes specified = 7 (the choice is optional, but it should be between 5 and 20)

$$\text{class width} = \frac{109 - 45}{7} = 9.143,$$

increased to the next whole number, 10.


(b)
The lower class limit of the first class is the smallest data value, 45. The lower class limit of the next (second) class is the previous (first) class's lower class limit plus the class width (10); for the second class, this is 45 + 10 = 55. The upper class limit is one value less than the lower class limit of the next class; for the first class, the upper class limit is 55 - 1 = 54. The class frequency is the number of data values that belong to that class; call this value f, as shown in the frequency table below.


Frequency Distribution Table (not complete)

Class Limits      Class Boundaries   Tally marks                        Frequency (f)   Class midpoint (x_m)   Relative frequency (f/Σf)   Cumulative Frequency
(lower - upper)   (lower - upper)
45-54                                ///                                3
55-64                                ////                               4
65-74                                ///// ///// ///// ////             19
75-84                                ///// ///// ///// ///// ///// //   27
85-94                                ///// ///// //                     12
95-104                               ////                               4
105-114                              /                                  1
Total                                                                   n = Σf = 70


The class boundaries are the halfway points between (i.e. the average of) the upper class limit of one class and the lower class limit of the next class. The lower class boundary of the first class is its lower class limit minus one-half unit. The upper class boundary of the last class is its upper class limit plus one-half unit. For the first class, the class boundaries are

$$45 - \frac{1}{2} = 44.5 \quad \text{and} \quad \frac{54 + 55}{2} = 54.5.$$

For the last class, the class boundaries are

$$\frac{104 + 105}{2} = 104.5 \quad \text{and} \quad 114 + \frac{1}{2} = 114.5.$$

OR:

For these data we have:

$$d = 55 - 54 = 1, \qquad \frac{d}{2} = \frac{1}{2} = 0.5$$

Therefore, the class boundaries of the first class are given by:

$$\text{lower class boundary of the first class} = \text{lower class limit of the first class} - \frac{d}{2} = 45 - 0.5 = 44.5$$

$$\text{upper class boundary of the first class} = \text{upper class limit of the first class} + \frac{d}{2} = 54 + 0.5 = 54.5$$

And so on.


The class mark or midpoint is the average of the class limits (lower and upper) for that class. For the first class, the class midpoint is

$$\frac{45 + 54}{2} = 49.5,$$

and so on for the other classes.

The relative frequency of a class is the class frequency, $f$, divided by the total number of data values, i.e., the overall sample size $n$. For the first class, $f = 3$, $n = 70$, and the relative frequency is

$$\frac{f}{n} = \frac{3}{70} = 0.0429.$$

The cumulative frequency of a class is the sum of the frequencies of all previous classes plus the frequency of that class. For the first and second classes, the cumulative frequencies are 3 and 3 + 4 = 7, respectively.

More on Cumulative Frequency Distribution


If someone wants to know how many scores are less than a particular score, or what percentage of scores is less than a particular score, these questions can be answered by means of the cumulative frequency or cumulative percentage.

Cumulative Frequency: The cumulative frequency of a score (or class) is the sum of the frequencies of the scores (or classes) up to and including the given score (or class). It gives the total number of scores that fall at or below the score (or below the upper boundary of the class).


Cumulative Relative Frequency and Cumulative Percentage: The cumulative relative frequency for any score (or class) is obtained by:

$$\text{Cumulative relative frequency} = \frac{\text{Cumulative frequency of a class (c.f.)}}{\text{Total number } (n = \sum f)}$$

and the cumulative percentage for any score (or class) is obtained by:

$$\text{Cumulative percentage} = \frac{\text{Cumulative frequency of a class (c.f.)}}{\text{Total number } (n = \sum f)} \times 100\%$$

It gives the percentage of the total number of scores that fall below a particular score (or below the upper boundary of a particular class).

The completed frequency distribution is given below.


Frequency Distribution Table

Class Limits      Class Boundaries   Frequency (f)   Class midpoint (x_m)   Relative frequency (f/Σf)   Cumulative Frequency
(lower - upper)   (lower - upper)
45-54             44.5-54.5          3               49.5                   0.0429                      3
55-64             54.5-64.5          4               59.5                   0.0571                      7
65-74             64.5-74.5          19              69.5                   0.2714                      26
75-84             74.5-84.5          27              79.5                   0.3857                      53
85-94             84.5-94.5          12              89.5                   0.1714                      65
95-104            94.5-104.5         4               99.5                   0.0571                      69
105-114           104.5-114.5        1               109.5                  0.0143                      70
Total                                n = Σf = 70                            1.000
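The whole table can be recomputed from the raw data as a check. The sketch below assumes the class width 10 and first lower limit 45 chosen in parts (a) and (b); the list `glucose` is the data of Example 8 in reading order, and the tuple layout of `rows` is an illustrative choice.

```python
# Glucose blood levels of the 70 women from Example 8, in reading order.
glucose = [
    45, 85, 81, 93, 65, 101, 66, 77, 76, 85, 89, 71,
    83, 82, 96, 83, 70, 109, 71, 90, 83, 80, 80, 73,
    76, 87, 67, 78, 84, 73, 64, 72, 94, 80, 77, 80,
    59, 79, 101, 85, 65, 72, 59, 69, 94, 83, 46, 81,
    76, 83, 89, 84, 80, 63, 82, 71, 94, 74, 70, 74,
    80, 87, 73, 81, 75, 81, 69, 99, 70, 45,
]

n = len(glucose)
width = 10

rows = []
cumulative = 0
for lower in range(45, 115, width):  # lower limits 45, 55, ..., 105
    upper = lower + width - 1        # upper limit is one less than the next lower limit
    f = sum(1 for x in glucose if lower <= x <= upper)
    cumulative += f                  # running total of the frequencies
    rows.append((
        lower, upper,                # class limits
        lower - 0.5, upper + 0.5,    # class boundaries
        f,                           # class frequency
        (lower + upper) / 2,         # class midpoint
        round(f / n, 4),             # relative frequency
        cumulative,                  # cumulative frequency
    ))

for row in rows:
    print(row)
```

Dividing the last entry of each row by `n` would give the cumulative relative frequency, and multiplying that by 100 the cumulative percentage.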


(c)
Histograms
The histogram plots the class frequencies on the y-axis and the class boundaries on the x-axis. Since adjacent classes share boundary values, the bars touch each other. [Alternatively, the bars may be centered over the class marks (midpoints).] A histogram is the most commonly used graphic representation of a frequency distribution. Here the horizontal axis (x-axis) represents the data and the vertical axis (y-axis) represents the frequency. Along the x-axis we display the classes by labeling the class boundaries. Above each class we draw a bar with a width equal to the class width and a height equal to the class frequency. The histogram is thus a graph that displays the data by using contiguous vertical bars of various heights to represent the frequencies of the classes.

[Figure: Histogram of Glucose Blood Levels. x-axis: Glucose Blood Levels, labeled at the class boundaries 44.5, 54.5, ..., 114.5; y-axis: Frequency; bar heights 3, 4, 19, 27, 12, 4, 1.]
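Without any plotting library, a rough text version of this histogram can be sketched as follows. It is only an approximation of the figure: the bars are drawn horizontally (the picture is rotated by 90 degrees), one `#` per observation, using the class boundaries and frequencies from the completed frequency table.

```python
# Class boundaries and class frequencies of the glucose data (Example 8).
boundaries = [44.5, 54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5]
frequencies = [3, 4, 19, 27, 12, 4, 1]

# One text line per class: "lower - upper | bar frequency".
lines = []
for lower, upper, f in zip(boundaries, boundaries[1:], frequencies):
    lines.append(f"{lower:>5.1f} - {upper:>5.1f} | {'#' * f} {f}")

print("\n".join(lines))
```

Because each class ends exactly where the next one begins, consecutive bars share a boundary, which is why the bars of a histogram touch.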


Shapes of Histograms
A histogram can have many shapes. The most common of these shapes are:

Symmetric
Skewed
Uniform or rectangular

Symmetric Histogram: It is approximately identical on both sides of a line running through the center. This type of distribution is known as a bell-shaped distribution.

[Figure: Symmetric Histogram (mean = median = mode)]


Skewed Histogram: A nonsymmetrical histogram is known as a skewed histogram. When the peak of a histogram is to the left and the longer tail is on the right side, the histogram is said to be right-skewed. When the histogram has its longer tail on the left side and its peak on the right side, it is said to be left-skewed.
[Figure: Positive (or right) skewed Histogram (mean > median > mode)]

[Figure: Negative (or left) skewed Histogram (mean < median < mode)]
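The relations noted with the figures (mean > median for a right-skewed shape, mean < median for a left-skewed shape) suggest a rough numerical check. The sketch below is a heuristic introduced here, not a method from these notes: the function name `describe_shape` and the `tolerance` parameter are hypothetical, and the small data sets are made up for illustration.

```python
import statistics


def describe_shape(values, tolerance=0.05):
    """Rough shape label from the mean-median comparison (a heuristic only).

    The mean and median are treated as equal when they differ by no more
    than `tolerance` standard deviations; this cut-off is an assumption.
    """
    mean = statistics.mean(values)
    median = statistics.median(values)
    spread = statistics.pstdev(values)
    if spread == 0 or abs(mean - median) <= tolerance * spread:
        return "roughly symmetric"
    return "right-skewed" if mean > median else "left-skewed"


# Made-up illustrative data sets.
print(describe_shape([1, 2, 2, 3, 3, 3, 4, 4, 5]))  # mean equals median
print(describe_shape([1, 1, 1, 2, 2, 3, 10]))       # long tail on the right
print(describe_shape([1, 8, 9, 9, 10, 10, 10]))     # long tail on the left
```

Such a comparison does not replace looking at the histogram; it only summarizes the direction of the longer tail.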


Uniform Histogram: If a histogram has the same frequency for each class, then it is said to be a uniform or rectangular histogram.

[Figure: Uniform Histogram]

(d) Based on the histogram of these data, the data set is almost symmetric.


Frequency Polygons
Another way to represent a frequency distribution is by using a frequency polygon. The frequency polygon is especially useful in conveying the shape of the distribution. It is a graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes. The frequencies are represented by the heights of the points. To construct a frequency polygon, first find the midpoint of each class. Draw a horizontal x-axis and a vertical y-axis. Label the midpoints on the x-axis and use a suitable scale on the y-axis for the frequency. Above each midpoint, place a dot at a height equal to the frequency of the class. Then connect adjacent dots with straight lines and extend the line at each end to the x-axis. The extended lines meet the x-axis at the midpoints of two hypothetical classes, one below the first class and one above the last class.

A frequency polygon connects the midpoints of each class (shown as a dot in the middle of the top of the histogram bar) with line segments. Place a dot on the x-axis one class width below the midpoint of the first class, and place another dot on the x-axis one class width above the last class's midpoint. Connect these dots to the adjacent midpoint dots with line segments. The relative frequency histogram has exactly the same shape as the frequency histogram, but the vertical scale is relative frequency, $f/n$, instead of actual frequency, $f$.
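The points of the frequency polygon for the glucose data can be listed with a short sketch. It assumes the class midpoints, frequencies and class width 10 of Example 8; the variable names are illustrative.

```python
# Class midpoints and frequencies of the glucose data (Example 8).
midpoints = [49.5, 59.5, 69.5, 79.5, 89.5, 99.5, 109.5]
frequencies = [3, 4, 19, 27, 12, 4, 1]
class_width = 10

# Anchor points on the x-axis, one class width beyond the first and last midpoints.
first_anchor = (midpoints[0] - class_width, 0)
last_anchor = (midpoints[-1] + class_width, 0)

# Polygon vertices in plotting order: anchor, (midpoint, frequency) pairs, anchor.
points = [first_anchor] + list(zip(midpoints, frequencies)) + [last_anchor]

# Same x-values with relative frequency f / n on the vertical scale.
n = sum(frequencies)
relative_points = [(x, f / n) for x, f in points]

print(points)
```

Joining consecutive entries of `points` with straight line segments gives the polygon; using `relative_points` instead changes only the vertical scale, not the shape.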


[Figure: Polygon of Glucose Blood Levels. x-axis: Glucose Blood Levels, from 39.5 to 119.5; y-axis: Frequency; plotted frequencies 0, 3, 4, 19, 27, 12, 4, 1, 0.]

Cumulative Frequency Graphs (Ogives)


A cumulative frequency graph (or ogive) is the graphical representation of a cumulative frequency distribution. The following steps are used to draw a cumulative frequency graph:

(1) Label the upper class boundaries on the x-axis.
(2) Label the cumulative frequency (or cumulative percent) on the y-axis.
(3) Plot the points corresponding to each upper boundary and its cumulative frequency (or cumulative percent).
(4) Join the points by straight lines.

The cumulative frequency graphs are used to locate visually how many values are below a certain upper class boundary.
To create the ogive, place a dot on the x axis at the lower class boundary of the first class and then, for each class, place a dot above the upper class boundary value at the height of the cumulative frequency for the class. Connect the dots with line segments.
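The points of the ogive for the glucose data can be computed as in the following sketch, which assumes the class boundaries and frequencies of Example 8; the variable names are illustrative.

```python
from itertools import accumulate

# Class boundaries and class frequencies of the glucose data (Example 8).
boundaries = [44.5, 54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5]
frequencies = [3, 4, 19, 27, 12, 4, 1]

# Running totals of the frequencies: one cumulative frequency per class.
cumulative = list(accumulate(frequencies))
n = cumulative[-1]

# Start at height 0 on the lower boundary of the first class, then pair each
# upper class boundary with the cumulative frequency of its class.
points = [(boundaries[0], 0)] + list(zip(boundaries[1:], cumulative))

# The same points on a cumulative percent scale.
percent_points = [(b, 100 * c / n) for b, c in points]

print(points)
```

Reading the graph at an upper class boundary gives the number of values below it; for example, the point (84.5, 53) says that 53 of the 70 values are below 84.5.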
[Figure: Ogive of Glucose Blood Levels. x-axis: Class Boundaries, from 40 to 120; y-axis: Cumulative Frequency, from 0 to 70.]


More on Statistical Graphs


Graphs are used for presenting statistical data in an attractive way. They enable us to visualize the whole meaning of complex data at a single glance.
Histogram: A histogram displays continuous data in ordered columns. Categories are of continuous measure such as time, inches, temperature, etc.
Advantages: Visually strong. Can compare to normal curve. Usually vertical axis is a frequency count of items falling into each category.
Disadvantages: Cannot read exact values because data is grouped into categories. More difficult to compare two data sets. Use only with continuous data.

Frequency Polygon: A frequency polygon can be made from a line graph by shading in the area beneath the graph. It can be made from a histogram by joining midpoints of each column.
Advantages: Visually appealing.
Disadvantages: Anchors at both ends may imply zero as data points. Use only with continuous data.

Scatterplot: A scatterplot displays the relationship between two factors of the experiment. A trend line is used to determine positive, negative, or no correlation.
Advantages: Shows a trend in the data relationship. Retains exact data values and sample size. Shows minimum/maximum and outliers.
Disadvantages: Hard to visualize results in large data sets. Flat trend line gives inconclusive results. Data on both axes should be continuous.

Boxplot: A boxplot is a concise graph showing the five point summary. Multiple boxplots can be drawn side by side to compare more than one data set.
Advantages: Shows 5-point summary and outliers. Easily compares two or more data sets. Handles extremely large data sets easily.
Disadvantages: Not as visually appealing as other graphs. Exact values not retained.

Stem and Leaf Plot: Stem and leaf plots record data values in rows, and can easily be made into a histogram. Large data sets can be accommodated by splitting stems.
Advantages: Concise representation of data. Shows range, minimum & maximum, gaps & clusters, and outliers easily. Can handle extremely large data sets.
Disadvantages: Not visually appealing. Does not easily indicate measures of centrality for large data sets.


Types of Distributions
When all of the scores in a set of data are considered together, the set is commonly called a distribution of scores or just a distribution. There are a number of specific types of distributions that deserve discussion. Here we discuss the normal distribution and two types of skewed distributions.

Normal Distributions
Perhaps the most common type of distribution in the social sciences is a Normal Distribution. This can also be called a bell-shaped distribution. In this type of distribution, most scores occur in the center of the distribution and fewer scores are present as you go further away from the mean. A normal distribution is symmetrical. This means that if you divide the distribution in the center, the area to the left (below) of the mean is a mirror image of the area to the right (above) of the mean. The normal distribution is shown in the following figure.

Skewed Distributions
Another type of distribution that can occur is a skewed distribution. In a skewed distribution, the majority of the scores are not in the center of the distribution of scores. This means that the distribution is not symmetrical. In a positively skewed distribution, the majority of the scores in the distribution are shifted to the left. Alternatively, in a negatively skewed distribution, the majority of the scores are shifted to the right side of the distribution.
