Descriptive Statistics
Population - collection of things under study
Sample - a portion of the population
Parameter - a number computed to describe a feature of the population
Statistic - a number computed to describe a feature of the sample
Population vs Sample
Types of Data
Data
Categorical (Qualitative)
Numerical (Quantitative): Discrete or Continuous
Mean
Median
Mode
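Mean
Mean of data values
Sample mean (n = sample size):
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
Population mean (N = population size):
$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$
[Figure: two dot plots, one on a 0 to 10 axis with Mean = 5 and one on a 0 to 14 axis with Mean = 6]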
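As a minimal sketch of the sample-mean formula, the Python snippet below averages a small data set and shows how one extreme value pulls the mean upward. The data values and the helper name `mean_of` are illustrative assumptions, not taken from the slides.

```python
def mean_of(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical data, chosen only for illustration.
data = [3, 4, 5, 6, 7]
with_outlier = [3, 4, 5, 6, 12]  # same data with the largest value replaced

print(mean_of(data))          # 5.0
print(mean_of(with_outlier))  # 6.0
```

Changing a single observation from 7 to 12 moves the mean from 5 to 6, which is why the mean is described as sensitive to extreme values.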
Median
Not easily affected by outliers
[Figure: two dot plots, on a 0 to 10 axis and a 0 to 14 axis; Median = 5 in both]
Mode
A measure of central tendency
Value(s) that occur most often
Not easily affected by outliers
Used for both numerical and categorical data
There may be no mode or several modes
[Figure: two examples, one with Mode = 9, 12 and one with No Mode]
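The properties listed for the median and the mode can be checked with a short Python sketch. The data sets and the helper `modes` are assumptions made for illustration. In particular, treating "no value repeats" as "no mode" is one common convention, not necessarily the one the slides use.

```python
import statistics
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:  # every value occurs once: report no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)


# Hypothetical data, chosen only for illustration.
data = [3, 4, 5, 6, 7]
with_outlier = [3, 4, 5, 6, 12]

print(statistics.median(data))          # 5
print(statistics.median(with_outlier))  # 5, unchanged by the outlier

print(modes([2, 9, 9, 12, 12, 14]))     # [9, 12], two modes
print(modes([0, 1, 2, 3, 4, 5]))        # [], no mode
print(modes(["red", "blue", "red"]))    # ['red'], categorical data
```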
Measures of Variation
Variation
Range
Difference between the largest and the smallest observations: Range = X_largest - X_smallest
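The range definition translates directly into code. This is a small sketch; the function name `value_range` is an assumption, and the data is the ordered array used in this document's quartile example.

```python
def value_range(values):
    """Range: the largest observation minus the smallest observation."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(value_range(data))  # 11
```

Because only the two extreme observations enter the calculation, a single outlier changes the range directly.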
Quartiles
Quartiles split the ranked data into 4 segments with an equal number of values per segment
25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
Quartile Formulas
Find a quartile by determining the value in the appropriate position in the ranked data, where
Q1 position = (n+1)/4
Q2 position = (n+1)/2
Q3 position = 3(n+1)/4
The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are larger)
Only 25% of the observations are greater than the third quartile, Q3
Quartiles : Example
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22
(n = 9)
Q1 is in the (9+1)/4 = 2.5 position of the ranked data, so Q1 = (12 + 13)/2 = 12.5
Q2 is in the (9+1)/2 = 5th position of the ranked data, so Q2 = 16
Q3 is in the 3(9+1)/4 = 7.5 position of the ranked data, so Q3 = (18 + 21)/2 = 19.5
Interquartile Range
A measure of dispersion that is resistant to outliers, because it uses only the middle 50% of the data
Interquartile range = 3rd quartile - 1st quartile = Q3 - Q1
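The position rule can be sketched in Python as follows. The helper name `quartile` is an assumption, and so is the choice to interpolate linearly between neighbouring values when a position is fractional. That choice reproduces the averages used for the half positions in the example, but textbooks differ on how to treat other fractional positions.

```python
def quartile(ranked, k):
    """k-th quartile (k = 1, 2, 3) of ranked data, using position k(n+1)/4.

    Fractional positions are resolved by linear interpolation between the
    two neighbouring ranked values (an assumed convention).
    """
    n = len(ranked)
    pos = k * (n + 1) / 4        # 1-based position in the ranked data
    lower = int(pos)             # whole part of the position
    frac = pos - lower           # fractional part of the position
    if lower < 1:
        return ranked[0]
    if lower >= n:
        return ranked[-1]
    return ranked[lower - 1] + frac * (ranked[lower] - ranked[lower - 1])


data = sorted([11, 12, 13, 16, 16, 17, 18, 21, 22])

q1 = quartile(data, 1)
q2 = quartile(data, 2)
q3 = quartile(data, 3)

print(q1, q2, q3)  # 12.5 16.0 19.5
print(q3 - q1)     # 7.0, the interquartile range
```

For this sample the interquartile range is 19.5 - 12.5 = 7.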
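Variance
Important measure of variation
Shows variation about the mean
Sample variance:
$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Population variance:
$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
Standard Deviation
Another important measure of variation
Shows variation about the mean
Has the same units as the original data
Sample standard deviation:
$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
Population standard deviation:
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$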
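The difference between the sample and population formulas is only the divisor, which the following Python sketch makes explicit. The function names and the data set are assumptions chosen for illustration, not values from the slides.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    size = len(values)
    if size < 1:
        raise ValueError("population variance needs at least one observation")
    mu = sum(values) / size
    return sum((x - mu) ** 2 for x in values) / size


# Hypothetical data with mean 5 and squared deviations summing to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))             # 4.0
print(math.sqrt(population_variance(data)))  # 2.0
print(round(sample_variance(data), 4))       # 4.5714
print(round(math.sqrt(sample_variance(data)), 4))  # 2.1381
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the data.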
[Figure: dot plots of data sets on an 11 to 21 axis, comparing standard deviations; labels: Mean = 15.5, s = 3.338; s = .9258; Data C, Mean = 15.5, s = 4.57]
Shape of a Distribution
Symmetric or skewed
[Figure: three distribution shapes, Left-Skewed, Symmetric, and Right-Skewed, each marked with Q1, Q2, Q3]
Boxplot : Example
[Figure: boxplot marking Minimum, Q1, Median (Q2), Q3, and Maximum (Max); axis labels include 0, 2, 3, 5, 10, 27]