INTRODUCTION
Definitions of Statistics:
A branch of science which deals with the collection, presentation, analysis, and
interpretation of data.
Recorded data such as the number of business permits issued, number of customers
eating at a restaurant, the size of enrollment at USLS, and so on.
Numerical characteristics calculated for a set of data (e.g., mean, median, mode)
The backbone of Research
Two Branches of Statistics
1. Descriptive Statistics
- deals with organizing and summarizing observations so that they are easier to
comprehend
- used to describe the basic features of the data in a study
- provide simple summaries about the sample and the measures
2. Inferential Statistics
- deals with the formulation of inferences about conditions that exist in a population
from study of a sample drawn from a population.
- make inferences from the data to more general conditions
The Research Process:
Why do research?
Formulate the problem
Specific
Measurable
Attainable
Realistic
Time-bound
Define the population of the study
o Population - all subjects under investigation;
the set of all elements of interest in a particular study
o Sample - a subset of the population
Identify the variable/s of the study
o Variable - a measurable characteristic of the subject;
any entity that can take on different values
Example:
Problem: What is the average weekly allowance of a USLS BMath 2 student for the first semester
of AY 2012 - 2013?
Population of study:
All USLS BMath 2 students for the first semester, AY 2012 - 2013
Variable/s:
2.
Types of Variables:
Qualitative/Categorical
Attributes are in terms of categories
Examples:
a. sex:
Male /
Female
b. religious affiliation:
Roman Catholic / INC /
Quantitative/Numerical
Attributes are in terms of counts or measurements
Distinctions:
a. Discrete Variable
Functions of variables:
Important if the investigation is about cause and effect
Distinctions:
a. Independent Variable
what the researcher (or nature) manipulates -- a treatment or program or cause
b. Dependent Variable
what is affected by the independent variable -- the effects or outcomes
Example:
Study/Problem: the effects of a new educational program on student achievement
Independent variable - the program
Dependent variables - measures of achievement
Defn: Measurement - the process of assigning numbers to observations
Levels of Measurement
1. Nominal Level
Consists of numbers which indicate categories for purely classification or identification
purposes
The categories are mutually exclusive (the observations cannot fall into more than one
category)
The categories are exhaustive (there must be enough categories for all the
observations)
Examples: gender, religious affiliation, citizenship
2. Ordinal Level
Possesses rank order characteristics
the categories must still be mutually exclusive and exhaustive, but they also indicate
the order of magnitude of some variable
Examples: military rank, size of T-shirts (small, medium, large)
3. Interval Level
Has all the properties of the ordinal scale
A given interval (distance) between scores has the same meaning anywhere on the
scale
Intervals provide information about how much better one value is compared with
another
Has no absolute zero
Examples: temperature measured on Celsius or Fahrenheit, test scores
4. Ratio Level
Possesses all the characteristics of the interval scale
Has a true or absolute zero point
The ratio of two values is meaningful
Examples: distance, height, weight, time, cost of an automobile
EXERCISES
1.
a.
b.
c.
d.
Simple Random
Each element of the population is given an equal chance of being included in the sample
Most basic probability sampling procedure
Foundation of all probability sampling procedures
When to use:
The population is homogeneous
A sampling frame is available
Procedure:
Lottery
Use of random number generators
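The random-number-generator procedure can be sketched in Python as follows. This is only an illustration: the sampling frame of 250 ID numbers, the sample size, and the seed are hypothetical values, not part of the handout.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, each equally likely."""
    rng = random.Random(seed)      # seeded generator so the draw can be repeated
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame: ID numbers 1 to 250
frame = list(range(1, 251))
sample = simple_random_sample(frame, 15, seed=2012)
print(sorted(sample))
```

Fixing the seed is a convenience for checking work; leaving it as `None` gives a fresh draw each run.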
2. Systematic Random
3. Stratified Random
selecting random samples from mutually exclusive subpopulations, or strata, of the population.
When to use:
When the population is heterogeneous but can be subdivided into
homogeneous subgroups or strata
A sampling frame is available for each stratum
Procedure:
i. Determine the proportion of each stratum relative to the population
ii. Identify the stratum sample sizes using proportional allocation
iii. Select the samples from each stratum using either simple or
systematic random sampling
Example: Among the 250 employees of the local office of an international insurance
company, 182 are Filipinos, 51 are Chinese, and 17 are Americans. If we use
proportional allocation to select a stratified random grievance committee of 15
employees, how many employees must we take from each race?
Solution:
Race (i)     Ni     Proportion (%)     ni
Filipino     182
Chinese      51
American     17
Total        250    100                15
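One way to check the allocation is a short Python sketch of the rule n_i = (N_i / N) x n. The function name is illustrative, and rounding each stratum to the nearest whole person is an assumption; in other problems the rounded sizes may need a small adjustment so that they still add up to n.

```python
def proportional_allocation(stratum_sizes, n):
    """Return the sample size per stratum: n_i = (N_i / N) * n, rounded."""
    N = sum(stratum_sizes.values())
    return {name: round(N_i / N * n) for name, N_i in stratum_sizes.items()}

strata = {"Filipino": 182, "Chinese": 51, "American": 17}
allocation = proportional_allocation(strata, 15)
print(allocation)  # {'Filipino': 11, 'Chinese': 3, 'American': 1}
```

Here the unrounded values are 10.92, 3.06, and 1.02, so the rounded sizes sum to the required 15.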
4. Cluster Random
Selecting clusters of elements rather than individual elements
When to use:
when "natural" groupings are evident in a statistical population
a sampling frame is not available
Procedure:
i. Divide the population into clusters (M = total number of clusters)
ii. Randomly select m clusters
iii. Include all elements within the selected clusters to form the resulting
sample
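The three steps can be sketched in Python as below. The four class sections and their members are hypothetical clusters invented for the illustration.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Randomly pick m clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)                    # step ii: select m of the M clusters
    sample = [elem for c in chosen for elem in clusters[c]]     # step iii: take all their elements
    return chosen, sample

# Step i: hypothetical population already divided into M = 4 clusters
clusters = {
    "Section A": ["A1", "A2", "A3"],
    "Section B": ["B1", "B2"],
    "Section C": ["C1", "C2", "C3", "C4"],
    "Section D": ["D1", "D2"],
}
chosen, sample = cluster_sample(clusters, 2, seed=1)
print(chosen, sample)
```

Note that only the list of clusters is needed to draw the sample, which is why the method suits populations without a full sampling frame.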
5. Multi-stage random sampling
Repeated cluster sampling
B. Non-probability sampling
not all elements of the population are given a chance of being included in the sample
prone to selection bias
1.
2. Judgmental/Purposive
The researcher selects the sample based on his judgment as to who best fits
the established criteria
3. Quota
Selecting sample elements nonrandomly according to some fixed quota
4. Snowball
Especially useful when you are trying to reach populations that are inaccessible or hard to find
DATA COLLECTION PROCEDURES
1. Interview
There is interaction between interviewer and respondent
Most important method of data collection
Some advantages:
o Clarifications about ambiguous questions/answers can be made
o More in-depth information can be generated
Some disadvantages:
o Time-consuming
o Costly
o Responses may be influenced by the interviewer
2. Questionnaire
No interaction between facilitator and respondent about the subject matter
Respondent personally answers the questions on survey forms
Some advantages:
Less costly
Less time-consuming
Responses are not influenced by the interviewer
Respondents answer the questions with relative anonymity; may answer more truthfully
Some disadvantages:
Not effective if the respondent is illiterate
Clarifications about vague questions cannot be made
Respondents may misinterpret the questions
Intended respondents may not personally answer the forms; may request other people to
respond
Low rate of returns
3. Experimentation
4. Observation
Like experiments, observational studies attempt to understand cause-and-effect relationships
Unlike experiments, the researcher is not able to control (1) how subjects are assigned to groups and/or (2)
ORGANIZATION AND PRESENTATION OF DATA
Frequency Distribution - A tabular summary of data showing the number (frequency) of items
in each of several non-overlapping classes.
Example: The following data were obtained from a sample of 50 soft drink purchases. Construct
a frequency distribution to summarize the data.
Coke, Coke Zero, Pepsi, Pepsi Max, Pepsi Max
Sprite, Mountain Dew, Mountain Dew, Coke, Coke
Coke Zero, Coke Zero, Coke Zero, Sprite, Coke
Coke, Coke, Pepsi, Pepsi, Pepsi
Pepsi Max, Sprite, Pepsi Max, Sprite, Coke
Coke, Pepsi Max, Pepsi Max, Coke, Coke
Pepsi, Coke, Coke Zero, Coke Zero, Pepsi
Mountain Dew, Coke, Mountain Dew, Pepsi Max, Sprite
Pepsi, Coke, Pepsi Max, Pepsi Max, Coke
Mountain Dew, Pepsi, Pepsi Max, Sprite, Mountain Dew
Frequency
(f)
50
rf = f / n
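A minimal Python sketch of how the frequency (f) and relative frequency (rf = f / n) columns could be tallied for the 50 purchases listed above. `Counter` is from the standard library; the variable names are illustrative only.

```python
from collections import Counter

purchases = [
    "Coke", "Coke Zero", "Pepsi", "Pepsi Max", "Pepsi Max",
    "Sprite", "Mountain Dew", "Mountain Dew", "Coke", "Coke",
    "Coke Zero", "Coke Zero", "Coke Zero", "Sprite", "Coke",
    "Coke", "Coke", "Pepsi", "Pepsi", "Pepsi",
    "Pepsi Max", "Sprite", "Pepsi Max", "Sprite", "Coke",
    "Coke", "Pepsi Max", "Pepsi Max", "Coke", "Coke",
    "Pepsi", "Coke", "Coke Zero", "Coke Zero", "Pepsi",
    "Mountain Dew", "Coke", "Mountain Dew", "Pepsi Max", "Sprite",
    "Pepsi", "Coke", "Pepsi Max", "Pepsi Max", "Coke",
    "Mountain Dew", "Pepsi", "Pepsi Max", "Sprite", "Mountain Dew",
]

freq = Counter(purchases)          # f: number of purchases per soft drink
n = sum(freq.values())             # total number of purchases
for drink, f in freq.most_common():
    print(f"{drink:<13} f = {f:>2}   rf = {f / n:.2f}")
```

The relative frequencies always sum to 1, which is a quick check that no purchase was missed.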
2. Pie chart - a graphical device for presenting data summaries based on subdivision of a circle
into sectors that correspond to the relative frequency for each class
USING EXCEL: Watch Excel Statistics 15: Category Frequency Distribution w Pivot Table & Pie Chart by
ExcellsFun at http://www.youtube.com/watch?v=-ERARVSfeuw
SUMMARIZING QUANTITATIVE DATA
Constructing a Frequency Distribution for Quantitative Data
1. Determine the number of non-overlapping classes.
use between 5 and 20 classes.
use enough classes to show the variation in the data, but not so many that some contain only a
few items.
2. Determine the width of each class (also called interval size).
Class width (i) = range / no. of classes
Range = highest value - lowest value
3. Determine the class limits.
Lower class limit identifies the smallest possible data value assigned to the class
Upper class limit identifies the largest possible data value assigned to the class
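Steps 1 to 3 can be sketched in Python for whole-number data. The ten scores below are hypothetical and are not the handout's data; rounding the class width up, and starting the first class at the lowest value, are assumptions made for the sketch.

```python
import math

def frequency_distribution(data, num_classes):
    """Group whole-number data into equal-width classes; return (lower, upper, f) rows."""
    low, high = min(data), max(data)
    width = math.ceil((high - low) / num_classes)   # class width = range / no. of classes, rounded up
    if low + num_classes * width <= high:
        width += 1                                  # widen so the highest value fits in the last class
    rows = []
    for k in range(num_classes):
        lower = low + k * width                     # lower class limit
        upper = lower + width - 1                   # upper class limit; classes do not overlap
        f = sum(lower <= x <= upper for x in data)  # class frequency
        rows.append((lower, upper, f))
    return rows

# Hypothetical data, for illustration only
scores = [12, 15, 21, 33, 18, 27, 14, 30, 22, 19]
rows = frequency_distribution(scores, 5)
for lower, upper, f in rows:
    print(f"{lower}-{upper}: {f}")
```

With a range of 21 and 5 classes the width rounds up to 5, giving the classes 12-16, 17-21, 22-26, 27-31, and 32-36.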
4.
14
15
27
21
18
19
18
22
33
16
18
17
23
28
13
16
21
15
14
27
30
31
25
22
18
Total
In two to three sentences, describe how the audit time data is distributed.
__________________________________________________________________________________________________
__________________________________________________________________________________________________
__________________________________________________________________________________________________
Other Components of a Frequency Distribution
Class Boundaries - the true or real limits of an interval
the specific points that serve to separate adjoining classes along a measurement scale for
continuous variables
can be determined by identifying the points that are halfway between the upper and lower stated
class limits, respectively, of adjoining classes
1. Class Marks or Class Midpoints - the value halfway between the lower and upper class
limits
2. Relative frequencies - obtained by dividing the class frequency by the total frequency
3. Percentages - obtained by multiplying the relative frequencies by 100%
4. Cumulative frequencies - the number of data items with values less than or equal to the
upper class limit of each class; obtained by summing the frequencies
5. Cumulative percentages - obtained by dividing the cumulative frequencies by the total
number of cases and then multiplying the result by 100. Cumulative percentages provide
information on the percentage of values less than or equal to a specified value.
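The components defined above can be computed from the class limits and frequencies alone. The sketch below uses a hypothetical five-class distribution (not the audit time data) and assumes whole-number class limits, so each boundary lies 0.5 below the lower limit and 0.5 above the upper limit.

```python
def describe_classes(classes):
    """classes: list of (lower_limit, upper_limit, frequency) for whole-number data."""
    n = sum(f for _, _, f in classes)
    table, cum_f = [], 0
    for lower, upper, f in classes:
        cum_f += f                                     # running total of the frequencies
        table.append({
            "boundaries": (lower - 0.5, upper + 0.5),  # halfway to the adjoining class limits
            "mark": (lower + upper) / 2,               # class midpoint
            "rf": f / n,                               # relative frequency
            "pct": 100 * f / n,                        # percentage
            "cf": cum_f,                               # cumulative frequency
            "cpct": 100 * cum_f / n,                   # cumulative percentage
        })
    return table

# Hypothetical classes: (lower limit, upper limit, frequency)
classes = [(12, 16, 3), (17, 21, 3), (22, 26, 1), (27, 31, 2), (32, 36, 1)]
table = describe_classes(classes)
for row in table:
    print(row)
```

The last cumulative frequency should equal the total number of cases, and the last cumulative percentage should equal 100.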
Example: Using the audit time data, complete the following table.
Audit Time   Frequency   Class Boundaries   Class Marks   Relative Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
B. by ExcellsFun:
1. Excel Statistics 20: P1 Quantitative Freq. Dist. w Formulas, http://www.youtube.com/watch?v=ERARVSfeuw
2. Excel Statistics 21: P2 Quantitative Freq. Dist. w Formulas, http://www.youtube.com/watch?v=vCUMqHKwFn8&feature=BFa&list=ULx8ePdM9LquM
3. Excel Statistics 22: Histogram & Ogive Charts & % Cumulative Frequency, http://www.youtube.com/watch?v=x8ePdM9LquM&feature=BFa&list=ULvCUMqHKwFn8
72
92
128
104
108
76
141
119
98
85
69
76
118
132
96
91
81
113
115
94
97
86
127
134
100
102
80
98
106
106
10
73
124
83
92
81
106
75
95
119
Procedure:
1. Arrange the leading digits of each data value to the left of a vertical line.
2. To the right of the vertical line, record the last digit for each data value corresponding to its
first digit.
3. Sort the digits on each line in rank order in order to obtain a stem-and-leaf display.
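The procedure can be sketched in Python as below. The twelve scores are only an illustrative subset, and the function assumes whole-number data whose last digit serves as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: sorted leaves}; the leaf is the last digit, the stem is the rest."""
    plot = defaultdict(list)
    for v in values:
        plot[v // 10].append(v % 10)       # steps 1 and 2: split each value into stem and leaf
    return {stem: sorted(leaves) for stem, leaves in sorted(plot.items())}  # step 3: rank order

# Illustrative subset of scores
scores = [72, 92, 128, 104, 108, 76, 141, 119, 98, 85, 69, 76]
display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(f"{stem:>2} | {' '.join(map(str, leaves))}")
```

Stems with no data values (13 in this subset) are simply omitted by this sketch; on paper they are usually written with an empty leaf row.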
Stem and Leaf Plot
6
7
8
9
10
11
12
13
14
Shapes of Distributions
1. Symmetric - the shape of the left side of the distribution is a mirror image of the right side
2. Skewed - the two sides of the distribution are not mirror images of each other
a. Positively skewed (skewed to the right) - scores tend to cluster toward the lower end
of the scale (i.e., the smaller numbers) with increasingly fewer scores at the upper end
of the scale (the larger numbers)
b. Negatively skewed (skewed to the left) - most of the scores tend to occur toward the
upper end of the scale while increasingly fewer scores occur toward the lower end
EXERCISES
1. Maris Steakhouse uses a questionnaire to ask customers how they rate the server, food
quality, cocktails, prices, and atmosphere at the restaurant. Each characteristic is rated
on a scale of outstanding (O), very good (V), good (G), average (A), and poor (P).
Construct a frequency distribution, bar graph, and pie chart to summarize the following
data collected on food quality. What is your feeling about the food quality ratings at the
restaurant?
G
O
V
G
A
O
V
O
V
G
O
V
A
V
O
P
V
O
G
A
O
O
O
G
O
V
V
A
G
O
V
P
V
O
O
G
O
O
V
O
G
A
O
V
O
O
G
V
A
G
2. The following are the final examination test scores of 50 statistics students.
68
55
65
42
64
45
56
59
56
42
a.
b.
c.
d.
38
50
37
42
53
52
54
57
49
63
54
38
46
49
33
43
40
29
43
60
69
54
64
41
63
44
55
58
55
41
52
51
53
49
48
64
55
37
47
50
3. The following data are the scores of 50 individuals who answered a 150-item aptitude test
as a requirement for a job application.
112
73
126
82
92
115
95
107
73
124
83
92
81
106
97
86
127
134
100
102
80
69
76
118
132
96
91
81
72
92
128
104
108
76
141
100
119
106
94
85
68
95
115
98
84
75
98
113
119
106
Present this set of data in the form of a frequency distribution. Use 7 classes.
Plot a frequency polygon of the distribution. What is the shape of the distribution?
In not more than 5 sentences, describe the frequency distribution and polygon that you
BASIC SUMMATION NOTATION
In Statistics, it is frequently necessary to work with sums of numerical values. We use the
symbol $\Sigma$ (capital Greek letter sigma) to represent the sum of a set of numbers. Given a set of
n numbers $X_1, X_2, \dots, X_n$,
$\sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n$
Example 1. If $x_1 = 3$, $x_2 = 5$, and $x_3 = 7$, find
a) $\sum_{i=1}^{3} x_i = x_1 + x_2 + x_3 = 3 + 5 + 7 = 15$
b) $\sum_{i=1}^{3} x_i^2$
c) $\sum_{i=1}^{3} (x_i - 2)^2$
Example 2. Given
x1 2, x 2 3, x3 1, y1 4, y 2 2, and y 3 5 , evaluate
xy
a) i i
b)
c)
xi yi
x y
i
DATA ANALYSIS
Measure - a number that summarizes a particular characteristic of a given data set.
Parameter - a measure of the population; usually represented by lowercase Greek letters
Statistic - a measure of the sample; usually represented by lowercase letters of the English
alphabet
MEASURES FOR QUALITATIVE DATA
Summarized using the following measures:
proportions (relative frequencies)
percentages
Example: gender, coded as M = 0, F = 1
Population Mean: $\mu = \frac{\sum x_i}{N}$, where $x_i$ = ith score or observation; N = population size
Sample Mean: $\bar{X} = \frac{\sum x_i}{n}$, where $x_i$ = ith score or observation; n = sample size
Example 1: During a particular summer month, the eight salespeople in an appliance store sold
the following number of central air-conditioning units: 8, 11, 5, 14, 8, 11, 16, 11. Considering this
month as the statistical population of interest, the mean number of units sold is
$\mu = \frac{\sum x_i}{N} = \frac{84}{8} = 10.5$ units.
Note: For reporting purposes, one generally reports the measures of location to one additional
digit beyond the original level of measurement.
WEIGHTED MEAN
also called weighted average
an arithmetic mean in which each value is weighted according to its importance in the
overall group
formulas for the population and sample weighted means are identical:
$\mu_w$ or $\bar{X}_w = \frac{\sum (wX)}{\sum w}$
each value in the group (X) is multiplied by the appropriate weight factor (w), and
the products are then summed and divided by the sum of the weights.
Example 2: In a multiproduct company, the profit margins for the company's four product lines
during the past fiscal year were: line A, 4.2 percent; line B, 5.5 percent; line C, 7.4 percent; and line D, 10.1 percent.
The unweighted mean profit margin is
$\mu = \frac{\sum x}{N} = \frac{27.2}{4} = 6.8\%$
However, unless the four products are equal in sales, this unweighted average is incorrect.
Assuming the sales totals in the following table, the weighted mean correctly describes the
overall average.
Product Line     Profit Margin, X (%)     Sales, in Php (w)     wX
A                4.2                      30,000,000            126,000,000
B                5.5                      20,000,000            110,000,000
C                7.4                      5,000,000             37,000,000
D                10.1                     3,000,000             30,300,000
Total                                     Php58,000,000         Php303,300,000
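The weighted mean for Example 2 can be checked with a short Python sketch that divides the sum of the wX products by the sum of the weights. The function and variable names are illustrative.

```python
def weighted_mean(values, weights):
    """Sum of w * X divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

margins = [4.2, 5.5, 7.4, 10.1]                          # profit margin X, in percent
sales = [30_000_000, 20_000_000, 5_000_000, 3_000_000]   # weights w, in Php

wm = weighted_mean(margins, sales)
print(round(wm, 1))                              # 5.2
print(round(sum(margins) / len(margins), 1))     # 6.8 (unweighted mean)
```

The weighted mean of about 5.2 percent is lower than the unweighted 6.8 percent because the two lines with the largest sales have the smallest margins.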
MEDIAN
the value of the middle item of an array (arrangement of the values in either ascending
or descending order)
If N or n is odd, the median is the middle value of the array
If N or n is even, the median is the mean of the two middle values.
When N or n is large, the following procedure is used:
o Find the position of the median value in the array: $\frac{N+1}{2}$ or $\frac{n+1}{2}$
Population Median: $\tilde{\mu} = x_{(N+1)/2}$
Sample Median: $\tilde{x} = x_{(n+1)/2}$
Example 3: The eight salespeople described in Example 1 sold the following number of central
air-conditioning units,
in ascending order: 5, 8, 8, 11, 11, 11, 14, 16. The value of the median is
$\tilde{x} = x_{(n+1)/2} = x_{4.5}$
Remark: The value of the median is between the fourth and fifth values in the ordered
group. Since both these
values equal 11 in this case, the median equals 11.0.
MODE
the observation that occurs most frequently; in a frequency polygon, the value
corresponding to the highest peak
not necessarily unique, unlike the mean and the median
o does not always exist; in a rectangular distribution where all the frequencies are
equal, there is no mode
o may correspond to multiple values; there may be two or more scores with the
same highest frequency.
Unimodal - the distribution has a single mode
Bimodal - the distribution has two modes
Polymodal - the distribution has multiple modes
Example 4: The eight salespeople described in Example 1 sold the following number of central
air-conditioning units: 8, 11, 5, 14, 8, 11, 16, and 11. The mode for this group of values is the
value with the greatest frequency, or
mode = 11
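The three measures of central location for the salespeople data can be verified with Python's standard `statistics` module. This is just a check of the worked examples, not a required method.

```python
import statistics

units = [8, 11, 5, 14, 8, 11, 16, 11]   # air-conditioning units sold by the eight salespeople

mean = statistics.mean(units)
median = statistics.median(units)        # averages the two middle values when the count is even
modes = statistics.multimode(units)      # list of all values sharing the highest frequency

print(mean, median, modes)   # 10.5 11.0 [11]
```

`multimode` returns a list because the mode need not be unique; a bimodal data set would give two values.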
RELATIONSHIP BETWEEN THE MEAN AND THE MEDIAN
The Median: always an excellent measure by which to represent the typical level of
observed values, such as wage rates, in a population. This is true regardless of whether
there is more than one mode or whether the population distribution is skewed or
symmetrical. The lack of symmetry is no special problem because the median wage
rate, for example, is always the wage rate of the middle person when the wage rates
are listed in order of magnitude.
The Mean: also an excellent representative value for a population, but only if the
population is fairly symmetrical. For nonsymmetrical data, the extreme values (for
instance, a few very high wage rates for technical specialists) will serve to distort the
value of the mean as a representative value.
Thus, the median is generally the best measure of data location for describing
population data.
The mode is not a good measure of location with respect to sample data because its
value can vary greatly from sample to sample.
The median is better than the mode because its value is more stable from sample to
sample.
However, the value of the mean is the most stable of the three measures.
Thus, for sample data, the best measure of location generally is the arithmetic mean.
EXERCISES
1. The following are scores of 50 high school students in a 150-item achievement test in
Mathematics.
112
73
126
82
92
a.
b.
107
73
124
83
92
97
86
127
134
100
69
76
118
132
96
72
92
128
104
108
115
95
84
68
100
81
106
75
95
119
102
80
98
106
106
91
81
113
115
94
76
141
119
98
85
4.3
99.2
0.0
28.8
67.9
145.6
4.6
34.9
17.5
0.0
94.2
70.4
7.0
81.7
45.0
98.9
7.6
65.1
9.2
0.0
53.3
64.5
56.6
63.6
a. Compute the mean. Do these data appear to be consistent with the average reported
by the newspaper? Explain your answer.
b. Compute the median. Between the mean and the median, which measure do you think
is more appropriate to use for this data set? Why?
3. During a 30-day period, the daily number of cars rented of a car rental company are as
follows:
7
5
9
10
5
10
6
7
4
7
8
7
9
4
5
4
6
9
7
9
8
9
7
9
9
12
5
8
7
7
Percentage
defective
1.1
1.5
2.3
Number of Items,
in thousands
210
120
50
7. The average IQ of 10 students in a mathematics course is 114. If 9 of the students have IQs
of 101, 125, 118, 128, 106, 115, 99, 118, and 109, what must be the other IQ?
8. What is the average for a student who received grades of 85, 76, and 82 on 3 tests and a 79
on the final examination in a certain course if the final examination counts three times as
much as each of the 3 tests?
MEASURES OF NON-CENTRAL POSITION
describe or locate the position of certain noncentral pieces of data relative to the entire
set of data
often referred to as fractiles or quantiles
values below which a specific fraction or percentage of the observations in a given set
must fall
PERCENTILES
values that divide a set of observations into 100 equal parts
denoted by P1, P2, ..., P99, such that 1% of the data falls below P1, 2% falls below P2, ..., and 99%
falls below P99.
Steps in Finding Percentiles:
1. Rank the given data in increasing order of magnitude.
2. Find the position of the ith percentile:
$k = \frac{i}{100} \cdot n$, where k = the position of the ith percentile in the ordered data set
and n = the number of observations.
2.6
2.9
3.0
3.0
3.1
3.1
3.1
3.1
3.2
3.2
3.2
3.3
3.3
3.3
3.4
3.4
3.4
3.5
3.5
3.6
3.7
3.7
3.7
3.8
3.8
3.9
3.9
4.1
4.1
4.2
4.3
4.4
4.5
4.7
4.7
Find P85.
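A hedged Python sketch for locating P85 in the 35 ordered values listed above (the sixth entry taken to be 3.1). The position formula k = (i/100)n comes from step 2; the rule for turning k into a value is an assumption here, since the remaining steps are not shown: if k is not a whole number it is rounded up, and if k is a whole number the values at positions k and k + 1 are averaged.

```python
def percentile(sorted_data, i):
    """i-th percentile (1 <= i <= 99) using position k = (i / 100) * n on 1-based ranks."""
    n = len(sorted_data)
    if (i * n) % 100 == 0:                       # k is a whole number
        k = i * n // 100
        return (sorted_data[k - 1] + sorted_data[k]) / 2   # average of positions k and k + 1
    k = -(-(i * n) // 100)                       # otherwise round k up to the next whole number
    return sorted_data[k - 1]

values = [
    2.6, 2.9, 3.0, 3.0, 3.1, 3.1, 3.1, 3.1, 3.2, 3.2, 3.2, 3.3,
    3.3, 3.3, 3.4, 3.4, 3.4, 3.5, 3.5, 3.6, 3.7, 3.7, 3.7, 3.8,
    3.8, 3.9, 3.9, 4.1, 4.1, 4.2, 4.3, 4.4, 4.5, 4.7, 4.7,
]
p85 = percentile(values, 85)
print(p85)   # 4.2
```

For P85, k = 0.85 x 35 = 29.75, which rounds up to position 30. Other textbooks and software use slightly different interpolation rules, so results can differ a little.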
DECILES
QUARTILES
MEASURES OF VARIATION
3    3
4    7
5    7
6    7
8    8
9    8
10   8
12   9
15   15
RANGE
difference in value between the highest (maximum) and the lowest (minimum)
observation
can be computed very quickly but is not very useful
considers only the extremes and does not take into consideration the bulk of the
observations.
VARIANCE
a measure of variability that is based on the difference between the value of each
observation (xi) and the mean
deviation about the mean = the difference between each xi and the mean
Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum (x_i - \bar{X})^2}{n - 1}$
STANDARD DEVIATION
REMARKS:
The sample variance may be thought of as the average of the squared deviations from the
mean
The greater the deviations, the greater the variance
The variance is of little use in descriptive statistics because its calculated value is
expressed in square units of measurement
the standard deviation is more widely used; it has the same unit of measurement as the
raw data
Calculation of the Variance and Standard Deviation: Raw Score Method
$s^2 = \frac{n \sum x_i^2 - (\sum x_i)^2}{n(n-1)}$ (Raw score formula)
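xi        xi^2
32        1,024
71        5,041
64        4,096
50        2,500
48        2,304
63        3,969
38        1,444
41        1,681
47        2,209
52        2,704
$\sum x_i = 506$     $\sum x_i^2 = 26,972$
$s^2 = \frac{10(26,972) - (506)^2}{10(9)} = \frac{269,720 - 256,036}{90} = \frac{13,684}{90} = 152.04$
$s = \sqrt{152.04} = 12.33$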
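The raw score computation for the ten scores in this example can be checked in Python. The comparison with `statistics.variance` (which uses the deviation form with n - 1 in the denominator) is included only to show that the two forms agree.

```python
import math
import statistics

x = [32, 71, 64, 50, 48, 63, 38, 41, 47, 52]
n = len(x)
sum_x = sum(x)                       # sum of the scores: 506
sum_x2 = sum(v * v for v in x)       # sum of the squared scores: 26,972

s2 = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))   # raw score formula for the sample variance
s = math.sqrt(s2)                                 # sample standard deviation

print(round(s2, 2), round(s, 2))     # 152.04 12.33
```

The raw score form avoids computing each deviation from the mean, which is convenient for hand calculation.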
The standard deviation is used when:
1. the statistic having the greatest stability is desired.
2. coefficients of correlation and other statistics are to be computed later.
3. the mean is the preferred measure of central tendency.
APPLICATIONS OF THE STANDARD DEVIATION
COEFFICIENT OF VARIATION
a measure of relative variability
expresses the standard deviation as a percentage of the mean
expressed in percent
can be used to compare the variability of two or more distributions even when the
observations are expressed in different units of measurement: the smaller the CV the less
variable the values of a given set compared to another data set
formula: $CV = \frac{s}{\bar{X}} \times 100\%$
Remarks: In the investing world, the coefficient of variation allows you to determine how much
volatility (risk) you are assuming in comparison to the amount of return you can expect from
your investment. In simple language, the lower the ratio of standard deviation to mean return,
the better your risk-return tradeoff.
Example: Consider two investment proposals, A and B, with the following data:
Therefore, because the coefficient is a relative measure of risk, B is considered more risky than A.
STANDARD SCORE
tells the relative location of a particular raw score with regard to the mean of all the scores in
a series.
is a transformed raw score.
expressed in terms of standard deviation units from the mean.
Has a mean of zero.
o a positive standard score indicates that the transformed raw score is above or higher
than the mean
o a negative standard score shows that the given raw score is below or lower than the
mean.
The formula for transforming a raw score to a standard score, represented by z, is
$z = \frac{x - \bar{X}}{s}$
usually used to compare observations in two or more different distributions of raw scores
which have different means and/or different standard deviations.
Example: Ruben got a final grade of 85 in both English and Physics. The mean final grades of
his class in these two courses are 80 in English and 75 in Physics with standard deviations of 12
and 10, respectively. In which subject was his academic performance better in relation to his
class?
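One way to work this example is to convert each grade to a standard score and compare. The sketch below applies the z formula directly; the function name is illustrative.

```python
def z_score(x, mean, sd):
    """Standard score: number of standard deviations x lies from the mean."""
    return (x - mean) / sd

z_english = z_score(85, 80, 12)   # about 0.42
z_physics = z_score(85, 75, 10)   # 1.0

print(round(z_english, 2), round(z_physics, 2))
```

Although the raw grades are equal, the Physics grade lies a full standard deviation above its class mean while the English grade lies less than half a standard deviation above, so relative to the class the Physics performance was better.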
EMPIRICAL RULE
When the data are believed to approximate a bell-shaped distribution, the empirical rule can
be used to determine the percentage of data values that must be within a specified number of
standard deviations of the mean, that is,
Approximately 68% of the data values will be within 1 standard deviation of the mean.
Approximately 95% of the data values will be within 2 standard deviations of the mean.
Approximately 99.7% of the data values will be within 3 standard deviations of the mean.
Example: Liquid detergent cartons are filled automatically on a production line. Filling weights
frequently have a bell-shaped distribution. If the mean filling weight is 16 ounces and the
standard deviation is 0.25 ounces, use the empirical rule to draw conclusions about the
distribution of filling weights.
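The intervals for this example follow directly from the mean and standard deviation. A small Python sketch, with an illustrative function name, lists them:

```python
def empirical_intervals(mean, sd):
    """Return (percent, low, high) for 1, 2, and 3 standard deviations from the mean."""
    return [(pct, mean - k * sd, mean + k * sd)
            for k, pct in [(1, 68), (2, 95), (3, 99.7)]]

intervals = empirical_intervals(16.0, 0.25)
for pct, low, high in intervals:
    print(f"about {pct}% of cartons weigh between {low:.2f} and {high:.2f} ounces")
```

So approximately 68% of filled cartons weigh between 15.75 and 16.25 ounces, 95% between 15.50 and 16.50 ounces, and nearly all between 15.25 and 16.75 ounces, provided the bell-shaped assumption holds.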
EXERCISES
1. A goal of management is to help their company earn as much as possible relative to the
capital invested. One measure of success is return on equity, the ratio of net income to
stockholders' equity. Shown here are return on equity percentages for 25 companies.
Find the range, variance, and standard deviation.
9.0
15.8
17.3
12.8
5.0
2.
19.6
52.7
31.1
12.2
30.3
22.9
17.3
9.6
14.5
14.7
41.6
12.3
8.6
9.2
19.2
11.4
5.1
11.2
16.6
6.2
rented of a car rental company are as
9
7
9
9
12
5
8
7
7
3. Many national academic achievement and aptitude tests, such as the SAT, report
standardized test scores with the mean for the normative group used to establish scoring
standards converted to 500 with a standard deviation of 100. Suppose that the distribution
of scores for such a test is known to be approximately normally distributed. Determine the
approximate percentage of reported scores that would be
a. between 400 and 600
b. between 500 and 700
c. greater than 700
d. less than 200
4. A manufacturing firm regularly places orders with two different suppliers, A and B. The
following data are the number of days required to fill orders for these suppliers.
Supplier A: 11 10 9 10 11 11 10 11 10 10
Supplier B: 8 10 13 7 10 11 10 7 15 12
Use the range and standard deviation to determine which supplier provides the more
consistent and reliable delivery times.
5. A production department uses a sampling procedure to test the quality of newly produced
items. The department employs the following decision rule at an inspection station: If a
sample of 14 items has a variance of more than .005, the production line must be shut
down for repairs. Suppose the following data have been collected:
3.43 3.45 3.43 3.48 3.52 3.50 3.39
3.48 3.41 3.38 3.49 3.45 3.51 3.50
Should the production line be shut down? Why or why not?
6. Two friends want to take a summer holiday before going to college in the autumn. They
are looking for somewhere with plenty of clubs where they can party all night.
Unfortunately they have left it rather late to book and there are only two resorts, Medlena
and Bistry, available within their budget. When they ask about the ages of the holidaymakers at these resorts, their travel agent says the only thing he can tell them is that
the mean age of people going to Medlena is 19 whereas the mean age of visitors to Bistry
is 22. Just as they are about to book holidays in Medlena because it seems to attract the
sort of young crowd they want to be with, the travel agent says, "I've got some more
figures: the standard deviation of the ages of visitors to Medlena is 8 and the standard
deviation of the ages of visitors to Bistry is 2." Should they change their minds on the basis
of this new information, and if so, why?