Normal Distribution

Review of Probability Distributions

Discrete Probability Distributions: Binomial and Poisson

The Binomial probability distribution:

$$P(X = r) = \frac{n!}{r!\,(n-r)!}\, p^{r} (1-p)^{n-r}$$

for r = 0, 1, 2, …, n,

where n! (n factorial) is n! = n(n-1)(n-2)…(1), and 0! = 1 by definition.

p may be known a priori or it may be estimated empirically

as p = r/n, the ratio of the number of successes observed in

the n trials to the number of trials.

Properties of the binomial distribution :

•Characterised by n and p.

•Mean = np

•Variance = np(1-p)

•Standard deviation = √(np(1-p))
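The properties above can be checked numerically. The following is a minimal sketch, assuming illustrative values n = 10 and p = 0.3 (these numbers are not from the slides); the helper name `binomial_pmf` is hypothetical. It sums the binomial probabilities directly and compares the resulting mean and variance with np and np(1-p); the standard deviation is taken as the square root of the variance.

```python
from math import comb, sqrt

def binomial_pmf(r, n, p):
    """P(X = r) for n independent trials with success probability p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Illustrative values only (assumed, not taken from the slides).
n, p = 10, 0.3

# Probabilities for r = 0, 1, ..., n.
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

total = sum(pmf)                                          # should be 1
mean = sum(r * pr for r, pr in enumerate(pmf))            # should equal np
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))  # np(1-p)
sd = sqrt(variance)

print(f"sum of probabilities = {total:.6f}")
print(f"mean     = {mean:.4f}   np      = {n * p:.4f}")
print(f"variance = {variance:.4f}   np(1-p) = {n * p * (1 - p):.4f}")
print(f"sd       = {sd:.4f}")
```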

The Poisson probability distribution :

$$P(X = r) = \frac{e^{-\lambda}\,\lambda^{r}}{r!}, \quad r = 0, 1, 2, \ldots$$

where λ may be known a priori or it may be estimated from a sample by X̄ (the arithmetic mean).

•Characterised by λ.

•Mean = λ.

•Variance = λ.

•Standard deviation = √λ.

•The Poisson distribution approximates the binomial distribution when the number of trials, n, is large and the probability of success in a single trial, p, is small, with λ = np.
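To see how close the Poisson approximation to the binomial can be, the sketch below compares the two sets of probabilities side by side. The values n = 1000 and p = 0.002 are assumed purely for illustration, and the function names are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """Binomial probability of exactly r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    """Poisson probability of exactly r events when the mean is lam."""
    return exp(-lam) * lam**r / factorial(r)

# Illustrative values only: many trials, small success probability.
n, p = 1000, 0.002
lam = n * p  # Poisson parameter matched to the binomial mean

print(" r   binomial   Poisson")
for r in range(6):
    print(f"{r:2d}   {binomial_pmf(r, n, p):.5f}    {poisson_pmf(r, lam):.5f}")

# Largest pointwise difference over r = 0..10.
max_diff = max(abs(binomial_pmf(r, n, p) - poisson_pmf(r, lam))
               for r in range(11))
print(f"largest difference for r = 0..10: {max_diff:.6f}")
```

With smaller n or larger p the two columns drift apart, which is why the approximation is stated only for large n and small p.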

The Normal Distribution

The Normal distribution (Gaussian distribution) is a very

important probability distribution in statistics.

The distributions of many medical measurements in populations approximate the Normal in shape (e.g. serum uric acid level, cholesterol level, blood pressure, height, and weight).

1. It is a distribution of a continuous variable, say X.

2. It is a bell-shaped curve.

3. It is symmetrical about its mean, μ.

4. The area under the curve is equal to 1.

5. It is determined by two quantities, its mean μ and its standard deviation σ.

Changing μ merely shifts the whole curve to the left or right. Increasing σ makes the curve flatter and more spread out.

6. The probability between the limits :

μ - σ and μ + σ is 0.68

μ - 2σ and μ + 2σ is 0.95

μ - 3σ and μ + 3σ is 0.997
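The three probabilities in item 6 can be reproduced with the error function from Python's standard library. This is a small sketch; the function name `prob_within` is hypothetical. It relies on the identity that, for any Normal distribution, the probability of lying within k standard deviations of the mean is erf(k/√2).

```python
from math import erf, sqrt

def prob_within(k):
    """Probability that a Normal variable lies within k standard
    deviations of its mean (the same for every mean and SD)."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {prob_within(k):.4f}")
```

The value for 2 standard deviations comes out slightly above 0.95; the limits that enclose exactly 0.95 are about 1.96 standard deviations either side of the mean.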

In practice the probability distributions of the variables we observe are not known, but if a distribution is bell-shaped and reasonably symmetrical about the mean, use can be made of the Normal distribution. Note, however, that observations made on normal (healthy?) individuals do not necessarily follow a Normal distribution.

Given a random variable X that can take on any

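value between negative and positive infinity (-∞ and +∞), the formula for the normal distribution is as follows:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$$

The distribution is specified by the mean μ and standard deviation σ because they are the only components that vary.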

Because the area under the curve is equal to 1, we can

use the curve for calculating probabilities. For example,

to find the probability that an observation falls between

a and b on the curve, we integrate the preceding equation between a and b, where the lower limit -∞ is given the value a and the upper limit +∞ is given the value b. (Integration is a mathematical technique in calculus used to find the area under a curve.)
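The integration described above can be sketched numerically. The example below uses the usual Normal density with mean μ and standard deviation σ, approximates the area between a and b with the trapezoidal rule, and compares it with the value from `statistics.NormalDist` in the Python standard library. The limits a = 0.5 and b = 1.5, the choice μ = 0 and σ = 1, and the helper names are all assumptions made for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the Normal curve at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def area_between(a, b, mu, sigma, steps=10_000):
    """Approximate the area under the Normal curve from a to b
    using the trapezoidal rule with the given number of strips."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

# Illustrative values only.
mu, sigma = 0.0, 1.0
a, b = 0.5, 1.5

numeric = area_between(a, b, mu, sigma)
dist = NormalDist(mu, sigma)
reference = dist.cdf(b) - dist.cdf(a)  # library value for comparison

print(f"trapezoidal estimate: {numeric:.6f}")
print(f"library value:        {reference:.6f}")
```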

Because μ and σ depend on the problem in hand, and tables of the Normal distribution cannot be published for all values of μ and σ, calculations are made by referring to the Standard Normal Distribution, which has μ = 0 and σ = 1.

$$\text{SND} = Z = \frac{X - \mu}{\sigma}$$

Table A-2 gives the area under the curve between 0 and +Z.

Example: If thermometers have an average (mean) reading of 0 degrees and a standard deviation of 1 degree for freezing water, and if one thermometer is randomly selected, find the probability that it reads between 0 degrees and 1.58 degrees for freezing water.
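The thermometer example can be worked through in code as a check on the table lookup. This is a sketch using `statistics.NormalDist` from the Python standard library in place of a printed table; the variable names are illustrative. The reading is first converted to a Z value, and the area between 0 and Z is then the difference of two cumulative probabilities.

```python
from statistics import NormalDist

# Values from the example: mean 0 degrees, standard deviation 1 degree.
mu, sigma = 0.0, 1.0
x = 1.58

# Standardise the reading: Z = (X - mean) / SD.
z = (x - mu) / sigma

# Area under the Standard Normal curve between 0 and z.
std_normal = NormalDist()  # mean 0, standard deviation 1
area = std_normal.cdf(z) - std_normal.cdf(0)

print(f"Z = {z:.2f}")
print(f"P(0 < Z < {z:.2f}) = {area:.4f}")
```

The result should agree with the tabulated area for Z = 1.58, which is about 0.4429.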

Using Symmetry to Find the Area to the Left of the Mean

Because of symmetry, these areas are equal.

Although a Z value can be negative, the area under the curve (or the corresponding probability) can never be negative.
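The symmetry argument can be illustrated with a short check. In this sketch the value z = 1.25 is an arbitrary choice for illustration; the code compares the area between 0 and +z with the area between -z and 0, and shows that the area to the left of a negative Z value is still a positive number.

```python
from statistics import NormalDist

std_normal = NormalDist()  # Standard Normal: mean 0, SD 1
z = 1.25                   # arbitrary illustrative value

right_area = std_normal.cdf(z) - std_normal.cdf(0)    # between 0 and +z
left_area = std_normal.cdf(0) - std_normal.cdf(-z)    # between -z and 0
lower_tail = std_normal.cdf(-z)                       # area below -z

print(f"area between 0 and +{z}: {right_area:.4f}")
print(f"area between -{z} and 0: {left_area:.4f}")
print(f"area below -{z}:         {lower_tail:.4f}")
```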

Exercises :

b. P(Z < -1.64)

c. P(1.0 < Z < 1.5)

d. P(-1.0 < Z < 2.0)

e. P(-2.0 < Z < 2.0)

f. P(0 < Z < 1.96)

2. Scores are normally distributed about a mean μ = 60 with a standard deviation σ = 20. What proportion of the scores:

a. Exceed 85 ?

b. Fall below 50 ?

3. Assume systolic blood pressure (BP) in normal healthy individuals is normally distributed with μ = 120 and σ = 10 mmHg. Make the appropriate transformations to answer the following questions. (Hint: Make sketches of the distribution to be sure you are finding the correct area.)

a. What area of the curve is above 130 mmHg ?

b. What area of the curve is above 140 mmHg ?

c. What area of the curve is between 100 and 140 mmHg?

d. What area of the curve is above 150 mmHg ?

e. What area of the curve is either below 90 mmHg or

above 150 mmHg ?

f. What is the value of the systolic blood pressure that

divides the area under the curve into the lower 95 %

and the upper 5 % ?

4. Assume that among diabetics the fasting blood level of

glucose is approximately normally distributed with a

mean μ = 105 mg per 100 ml and standard deviation σ = 9 mg per 100 ml.

a. What proportion of diabetics have levels between 90 and 125 mg per 100 ml?

b. What level cuts off the lower 10 percent of diabetics ?

5. The values of serum sodium in healthy adults approximately follow a normal distribution with a mean of 141 mEq/L and standard deviation of 3 mEq/L.

a. What is the probability that a normal healthy adult will

have a serum sodium value above 147 mEq/L ?

b. What is the probability that a normal healthy adult will

have a serum sodium value below 130 mEq/L ?

c. What is the probability that a normal healthy adult will

have a serum sodium value between 132 and 150

mEq/L ?

d. What serum sodium level is necessary to put someone

in the top 1 % of the distribution ?

e. What serum sodium level is necessary to put someone

in the bottom 10 % of the distribution ?
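Answers to exercises of this kind can be checked with a few lines of code once the table method is understood. The sketch below uses `statistics.NormalDist` from the Python standard library (version 3.8 or later). The mean of 100 and standard deviation of 15 are invented for illustration and do not correspond to any of the exercises above; they show the four kinds of question asked: area below a value, area above a value, area between two values, and the value that cuts off a given percentage.

```python
from statistics import NormalDist

# Invented illustrative distribution (not one of the exercises).
dist = NormalDist(mu=100, sigma=15)

# Area below a value: the cumulative probability.
below_85 = dist.cdf(85)

# Area above a value: one minus the cumulative probability.
above_130 = 1 - dist.cdf(130)

# Area between two values: difference of cumulative probabilities.
between_85_115 = dist.cdf(115) - dist.cdf(85)

# Value dividing the lower 95 % from the upper 5 %: inverse of the cdf.
cutoff_95 = dist.inv_cdf(0.95)

print(f"P(X < 85)        = {below_85:.4f}")
print(f"P(X > 130)       = {above_130:.4f}")
print(f"P(85 < X < 115)  = {between_85_115:.4f}")
print(f"95th percentile  = {cutoff_95:.2f}")
```

Each result corresponds to a Z transformation followed by a table lookup: for instance, 85 is one standard deviation below the mean, so the first probability is the area to the left of Z = -1.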

