Descriptive statistics
Dr. N Shiukashvili
What is biostatistics???
Almost every day, news portals report stories like these:
A new treatment for HIV disease works better than current therapies
High blood pressure is demonstrated to be associated with heart
disease
A study suggests that a certain pollutant may be harmful to
humans
Such results are the work of multidisciplinary teams of researchers, including
Physicians
public and environmental health specialists
BIOSTATISTICIANS
Biostatisticians play essential roles in
designing the studies
analyzing the data
creating new methods for addressing these problems.
Descriptive Statistics
Class A, IQs of 13 students: 102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110
Class B, IQs of 13 students: 127, 162, 131, 103, 96, 111, 80, 109, 93, 87, 120, 105, 109
Descriptive Statistics
Which group is smarter now?
Class A, average IQ: 110.54
Class B, average IQ: 110.23
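The two averages can be checked with a few lines of Python. This is only a sketch: it assumes the first 13 scores listed belong to Class A and the last 13 to Class B, and the variable names are chosen here for illustration.

```python
from statistics import mean

# IQ scores as listed on the slide (assumed split: first 13 = Class A, last 13 = Class B).
class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
class_b = [127, 162, 131, 103, 96, 111, 80, 109, 93, 87, 120, 105, 109]

mean_a = mean(class_a)
mean_b = mean(class_b)

print(f"Class A average IQ: {mean_a:.2f}")
print(f"Class B average IQ: {mean_b:.2f}")
```

The averages are almost identical even though the individual scores in the two classes differ considerably, so the mean alone does not settle the question.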
Descriptive Statistics
Population
Sample
Probability
Biostatistics is also used in modeling and hypothesizing.
Given a set of data, scientists combine biostatistics and probability theory to estimate
how likely a disease is to strike a population, how likely a drug is to cure that disease,
and how people are likely to react to the drug.
In this way, biostatistics promises to be as good at predicting the future as it is at
analyzing the past.
What does probability mean???
A physician may say that a patient has a 50-50 chance of surviving a certain
operation.
A physician may say that she is 95 percent certain that a patient has a particular
disease.
As these examples suggest, most people express probabilities in terms of
percentages.
Probability
We measure the probability (p) of the occurrence of some event by a number
between zero and one:
0 ≤ p ≤ 1
The less likely an event is to occur, the closer its probability is to 0;
the more likely an event is to occur, the closer its probability is to 1.
An event that cannot occur has a probability of zero, and an event that is certain to
occur has a probability of one.
Probability
Addition rule
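Two events are said to be dependent if they DO affect one another.
The addition rule applies to mutually exclusive events (events that cannot both occur): P(A or B) = P(A) + P(B).
If there are 4 cards, one of each suit, what is the probability of drawing a heart at random?
1/4 = 25%
What is the probability of drawing a red card (a heart or a diamond)?
25% + 25% = 50%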
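The card example can be reproduced with exact fractions. The sketch below assumes the four cards are one of each suit and that each card is equally likely to be drawn; the names used are illustrative only.

```python
from fractions import Fraction

# Assumed setup: four cards, one of each suit, each equally likely to be drawn.
cards = ["hearts", "diamonds", "clubs", "spades"]
red_suits = {"hearts", "diamonds"}

p_heart = Fraction(cards.count("hearts"), len(cards))
p_diamond = Fraction(cards.count("diamonds"), len(cards))

# One card cannot be both a heart and a diamond (mutually exclusive events),
# so the two probabilities simply add.
p_red = p_heart + p_diamond

# Cross-check by counting the red cards directly.
p_red_counted = Fraction(sum(card in red_suits for card in cards), len(cards))

print(f"P(heart) = {float(p_heart):.0%}")
print(f"P(red)   = {float(p_red):.0%}")
print(f"Counting red cards directly gives the same answer: {p_red == p_red_counted}")
```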
Probability
Multiplication rule
Two events are said to be independent if they do NOT affect one another.
The multiplication rule is a method for finding the probability that both of two independent events occur together: P(A and B) = P(A) × P(B).
A - blue eyes
B - high IQ
If the probability for a newborn girl to have blue eyes is 25%,
and the probability of a high IQ is 1%,
what is the probability that a newborn girl has both blue eyes and a high IQ?
Expressing the probabilities on the 0-1 scale:
0.25 × 0.01 = 0.0025 (0.25%)
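The same calculation as a short Python sketch. It assumes, as the slide does, that eye colour and IQ are independent; the variable names are illustrative.

```python
p_blue_eyes = 0.25  # probability of blue eyes, from the slide
p_high_iq = 0.01    # probability of a high IQ, from the slide

# Independence is assumed: knowing the eye colour tells us nothing about IQ,
# so the joint probability is the product of the two probabilities.
p_both = p_blue_eyes * p_high_iq

print(f"P(blue eyes and high IQ) = {p_both:.4f} ({p_both:.2%})")
```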
Binomial Distribution
Representation of descriptive statistics data:
Organize Data
Tables
Graphs
Summarize Data
Central Tendency
Variation
Binomial Distribution
The binomial distribution is a probability distribution.
It has discrete values. It counts the number of successes in yes/no-type
experiments.
There are two parameters:
the number of times an experiment is done (n)
the probability of a success (p).
Example:
Tossing a coin 10 times and counting the number of heads (n = 10, p = 1/2).
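For the coin example, the probability of each possible number of heads can be computed from the standard binomial formula, P(k) = C(n, k) p^k (1 - p)^(n - k). The helper name `binomial_pmf` below is chosen for this sketch and does not come from the slides.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent yes/no trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # ten tosses of a fair coin
for k in range(n + 1):
    print(f"P({k:2d} heads) = {binomial_pmf(k, n, p):.4f}")
```

The probabilities over all values of k sum to 1, and the most likely single outcome is 5 heads.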
Binomial Distribution
If a coin is tossed twice, the four possible outcomes are: heads-heads, heads-tails, tails-heads, and tails-tails.
Frequency Distributions
You are a researcher conducting a study of arterial tension in a normal population.
You have data from 500 people.
What's the next step?
Arrange the data in order from the highest to the lowest score, recording the
frequency (f) with which each score occurs.
[Figure: frequency plot; axis labels: Population (0, 200, 500) and Arterial tension]
Frequency Distribution
Grouped frequency
Frequency Distribution
RELATIVE FREQUENCY DISTRIBUTIONS
A relative frequency distribution transforms the data to show the percentage of all the elements that fall within
each class interval.
If 18 people out of 50 fall within the same class interval, the relative frequency of that interval is 36% (18/50 × 100).
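A minimal sketch of building a frequency table and converting it to relative frequencies. The readings below are invented purely for illustration (they are not study data), and `relative_frequency` is a helper name chosen here.

```python
from collections import Counter

def relative_frequency(count: int, total: int) -> float:
    """Share of all observations, expressed as a percentage."""
    return 100 * count / total

# Hypothetical readings (mmHg), made up for this example only.
readings = [120, 118, 120, 125, 130, 118, 120, 125, 120, 135]

frequency = Counter(readings)  # value -> how often it occurs (f)
total = len(readings)

print("value    f    relative f (%)")
for value in sorted(frequency, reverse=True):  # highest to lowest
    f = frequency[value]
    print(f"{value:5d}  {f:3d}  {relative_frequency(f, total):8.0f}")

# The example from the slide: 18 people out of 50.
print(relative_frequency(18, 50))  # 36.0
```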
Normal Distribution
Take the same example: arterial tension in a normal population,
gathered from 500 people.
Graphically, the data form a curve that is called:
symmetrical,
bell-shaped
Gaussian distribution
[Figure: bell-shaped curve; axis labels: Population (0, 200, 500) and Arterial tension]
Non-Normal Distribution
A distribution is not always symmetrical. Asymmetric frequency distributions
are called skewed distributions.
Depending on the location of the tail of the curve, a distribution can be:
positively (or right) skewed (the long "tail" is on the positive side of the peak)
negatively (or left) skewed (the long "tail" is on the negative side of the peak)
Non-Normal Distribution
There is also another non-normal distribution, called a bimodal distribution.
Ex: Goodpasture's syndrome, a very rare disease with a bimodal age distribution:
20-30 years and 60-70 years.
Measures of Variability
There are two normal distributions (A and B) with identical means, modes, and
medians.
Despite these similarities, the two distributions are obviously different.
This means that central tendency alone is not enough!!!
The scores forming distribution A are clearly more scattered than those
forming distribution B.
They differ in terms of their variability.
Measures of Variability
If these 2 graphs depict drug effects, which drug is more effective???
[Figure: two graphs; axis label: Patient number]
Measures of Variability
RANGE
The range is the difference between the highest and the lowest scores in the distribution.
13, 13, 13, 13, 14, 14, 16, 18, 21
The largest value in the list is 21, and the smallest is 13,
so the range is 21 - 13 = 8.
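The range calculation above takes one line of Python; the variable names are illustrative.

```python
scores = [13, 13, 13, 13, 14, 14, 16, 18, 21]

# Range = highest score minus lowest score.
score_range = max(scores) - min(scores)

print(f"Range: {max(scores)} - {min(scores)} = {score_range}")
```

Because the range depends only on the two extreme scores, a single unusual value can change it a great deal.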
Measures of Variability
VARIANCE
The variance measures how far the numbers in a set are from the mean.
You and your friends have just measured the heights of your dogs (in
millimeters).
The heights (at the shoulders) are: 600 mm, 470 mm, 170 mm, 430 mm and
300 mm.
The mean (average) height is 394 mm.
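Measures of Variability
To calculate the variance, take each difference from the mean, square it, and
then average the results:
Variance = (206² + 76² + (-224)² + 36² + (-94)²) / 5 = 108,520 / 5 = 21,704
A variance equal to zero indicates that all values within a set of numbers are
identical.
A large variance indicates that the numbers in the set are far from the mean and from each
other, while a small variance indicates the opposite.
On its own, the variance has limited use because it is expressed in squared units (here mm²).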
Measures of Variability
Standard Deviation
It is just the square root of the variance, so:
Standard deviation = √21,704 = 147.32... ≈ 147 mm
So, using the standard deviation we have a "standard" way of knowing what is
normal, and what is extra large or extra small.
Rottweilers are tall dogs, and Dachshunds are a bit short ...
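The dog-height figures can be verified step by step. This sketch treats the five dogs as the whole population (dividing by n = 5, as the slides do) and cross-checks the result against Python's `statistics.pvariance` and `statistics.pstdev`; the variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance

heights_mm = [600, 470, 170, 430, 300]  # shoulder heights from the slide

mean_height = mean(heights_mm)
differences = [h - mean_height for h in heights_mm]

# Population variance: the average of the squared differences from the mean.
variance = sum(d**2 for d in differences) / len(heights_mm)
std_dev = variance ** 0.5

print(f"Mean:               {mean_height} mm")
print(f"Differences:        {differences}")
print(f"Variance:           {variance:.0f} mm^2")
print(f"Standard deviation: {std_dev:.2f} mm")

# Cross-check with the standard library's population formulas.
print(pvariance(heights_mm), round(pstdev(heights_mm), 2))
```

If the five dogs were instead treated as a sample from a larger population, the usual choice would be to divide by n - 1 (`statistics.variance`), which gives a larger value.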
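Measures of Variability
Samples can be very uniform, with the data all clustered around the mean, or they
can be spread out a long way from the mean.
The standard deviation measures this spread.
68-95-99.7 rule: in a normal distribution, about 68% of values lie within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.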
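The percentages in this rule can be recovered from the normal distribution itself. The sketch below uses `statistics.NormalDist` from the Python standard library with a standard normal curve (mean 0, standard deviation 1); `share_within` is a helper name chosen here.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {share_within(k):.1%}")
```

The rule holds for normally distributed data only; skewed or bimodal distributions can give quite different percentages.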
What is biostatistics???
According to statistics, every 6th person on Earth is Chinese.