probability
Abdallah Ezat, Physics department, Sohag University
April 11, 2016
Supervisor: Dr. Mohamed Mekhemar, Sohag University
Abstract
Statistics, the study of the collection, analysis, interpretation, presentation, and
organization of data, is, together with probability, the way we deal with large populations and
approximations in experimental nuclear physics. In
this report, I discuss some of the basic concepts of statistics that are used in
specific nuclear experiments. At the end of the report, I discuss probability
theory and two probability distributions, namely the normal (Gaussian) distribution and
the Poisson distribution.
Contents
1. Statistics
1.1 Descriptive statistics
1.2 Inferential statistics: population and sample
1.3 Variable, observation, and data set
5. Probability
5.1 Experiment, outcomes, and sample space
5.2 Events, simple events, and compound events
5.3 Probability
5.4 Poisson's distribution
5.5 Normal distribution
References
1. Statistics
1.1 Descriptive statistics
The use of graphs, charts, and tables and the calculation of various statistical measures to organize
and summarize information is called descriptive statistics. Descriptive statistics help to
reduce our information to a manageable size and put it into focus.
Mass (kg)    Number of students
60-62        5
63-65        18
66-68        42
69-71        27
72-74        8
Total        100
The first class or category, for example, consists of masses from 60 to 62 kilograms. Since 5
students have masses belonging to this class, the corresponding class frequency is 5.
A symbol defining a class, such as 60-62 in the above table, is called a class interval. The end
numbers, 60 and 62, are the class limits (the lower class limit is 60 and the upper class limit is 62).
\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N} \]
If the numbers $X_1, X_2, \ldots, X_k$ occur $f_1, f_2, \ldots, f_k$ times respectively (i.e. occur with frequencies $f_1, f_2, \ldots, f_k$),
the arithmetic mean is:
\[ \bar{X} = \frac{f_1 X_1 + f_2 X_2 + \cdots + f_k X_k}{f_1 + f_2 + \cdots + f_k} \]
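As a rough illustration, the frequency-weighted mean can be evaluated for the mass table above. The sketch below assumes that every mass in a class can be represented by the class midpoint (61, 64, ... kg), an approximation that the report does not state; the function name `weighted_mean` is hypothetical.

```python
# Frequency table from the report: class limits (kg) and number of students.
classes = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]
frequencies = [5, 18, 42, 27, 8]

# Assumption: each class is represented by its midpoint.
midpoints = [(low + high) / 2 for low, high in classes]


def weighted_mean(values, weights):
    """Return sum(f_j * X_j) / sum(f_j) for values X_j with frequencies f_j."""
    return sum(f * x for x, f in zip(values, weights)) / sum(weights)


mean_mass = weighted_mean(midpoints, frequencies)
print(f"Approximate mean mass: {mean_mass:.2f} kg")
```

Under the midpoint assumption this gives a mean mass of 67.45 kg for the 100 students.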
For grouped data (like the frequency table above) the median is given by
\[ \text{Median} = L_1 + \left( \frac{\frac{N}{2} - \left(\sum f\right)_1}{f_{\text{median}}} \right) c \]
where
$L_1$ = lower class boundary of the median class (i.e. the class containing the median)
$N$ = number of items in the data (i.e. total frequency)
$(\sum f)_1$ = sum of frequencies of all classes lower than the median class
$f_{\text{median}}$ = frequency of the median class
$c$ = size of the median class
Example 2. The set of numbers 5, 5, 7, 9, 11, 12, 15, 18 has median
\[ \tfrac{1}{2}(9 + 11) = 10 \]
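Both median calculations can be sketched in Python. For the eight numbers of Example 2 the median is the mean of the two middle values, 9 and 11. For the grouped case, the sketch assumes that class boundaries lie halfway between adjacent class limits (so the class 66-68 has lower boundary 65.5 and size 3); this convention and the function name `grouped_median` are assumptions, not taken from the report.

```python
import statistics

# Ungrouped data from Example 2 (even count: mean of the two middle values).
data = [5, 5, 7, 9, 11, 12, 15, 18]
median_ungrouped = statistics.median(data)

# Grouped data: the mass table (class limits in kg, frequencies).
classes = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]
frequencies = [5, 18, 42, 27, 8]


def grouped_median(classes, frequencies):
    """Median = L1 + ((N/2 - (sum f)_1) / f_median) * c."""
    total = sum(frequencies)
    # Assumption: boundaries lie halfway between adjacent class limits.
    gap = classes[1][0] - classes[0][1]
    cumulative = 0  # sum of frequencies of all classes below the current one
    for (low, high), f in zip(classes, frequencies):
        if cumulative + f >= total / 2:
            lower_boundary = low - gap / 2
            class_size = (high - low) + gap
            return lower_boundary + (total / 2 - cumulative) / f * class_size
        cumulative += f
    raise ValueError("empty frequency table")


median_grouped = grouped_median(classes, frequencies)
print(median_ungrouped, round(median_grouped, 2))
```

With these assumptions the median class is 66-68, and the grouped median comes out at about 67.43 kg.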
\[ s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}} = \sqrt{\frac{\sum x^2}{N}} \]
where $x$ represents the deviation of each of the numbers $X_i$ from the mean $\bar{X}$.
Thus $s$ is the root mean square¹ of the deviations from the mean, or the root mean square deviation.
1- As its name suggests, the root mean square is the square root of the mean of the squares of the data. This type of average is
frequently used in physical applications.
The variance of a set of data is defined as the square of the standard deviation and is thus given by
$s^2$ in the above equation.
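The definitions of the standard deviation and variance can be checked numerically. The following sketch reuses the eight numbers from Example 2 purely as illustrative data and divides by N, as in the definition above, rather than by N - 1; the variable names are hypothetical.

```python
import math
import statistics

# Illustrative data: the eight numbers used in the median example.
data = [5, 5, 7, 9, 11, 12, 15, 18]

mean = sum(data) / len(data)
deviations = [x - mean for x in data]                  # x = X_i - mean
variance = sum(d * d for d in deviations) / len(data)  # s^2, dividing by N
std_dev = math.sqrt(variance)                          # s

# Cross-check against the standard library's population (divide-by-N) variance.
variance_check = statistics.pvariance(data)

print(f"mean = {mean}, s^2 = {variance}, s = {std_dev:.3f}")
```

For this data set the mean is 10.25 and the variance is 19.1875. `statistics.pvariance` and `statistics.pstdev` use the same divide-by-N convention, whereas `statistics.variance` and `statistics.stdev` divide by N - 1.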
4.3.1 The properties of the standard deviation
1. For normal distributions² it turns out that:
a) 68.3% of the cases are included between $\bar{X} - s$ and $\bar{X} + s$
(i.e. one standard deviation on either side of the mean) (see the figure below)
b) 95.45% of the cases are included between $\bar{X} - 2s$ and $\bar{X} + 2s$
(i.e. two standard deviations on either side of the mean)
c) 99.73% of the cases are included between $\bar{X} - 3s$ and $\bar{X} + 3s$
(i.e. three standard deviations on either side of the mean)
2. Suppose that two sets consisting of $N_1$ and $N_2$ numbers have variances given by $s_1^2$ and $s_2^2$
respectively and the same mean $\bar{X}$. Then the combined variance of both sets is given by
\[ s^2 = \frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2} \]
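The combined-variance property can be verified on a small example. The two data sets below are hypothetical and were chosen only because they share the same mean (4), as the property requires; the sketch compares the formula with the variance computed directly from the pooled data.

```python
import statistics

# Hypothetical data sets with the same mean (4.0), as the property requires.
set_1 = [2.0, 4.0, 6.0]
set_2 = [1.0, 4.0, 7.0, 4.0]

n1, n2 = len(set_1), len(set_2)
var_1 = statistics.pvariance(set_1)  # s_1^2, dividing by N_1
var_2 = statistics.pvariance(set_2)  # s_2^2, dividing by N_2

# Combined variance from the formula (N1*s1^2 + N2*s2^2) / (N1 + N2).
combined_formula = (n1 * var_1 + n2 * var_2) / (n1 + n2)

# Combined variance computed directly from the pooled data.
combined_direct = statistics.pvariance(set_1 + set_2)

print(combined_formula, combined_direct)
```

Both routes give 26/7 for this data. If the two means differed, the formula would underestimate the pooled variance, because it ignores the spread between the means.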
2- A distribution that describes many statistical processes having a continuously varying magnitude.
5. Probability
5.3 Probability
Probability is a measure of the likelihood of the occurrence of some event. There are several
different definitions of probability. Three definitions are discussed in the next section. The
particular definition that is utilized depends upon the nature of the event under consideration.
However, all the definitions satisfy the following two specific properties and obey the rules of
probability.
The probability of any event E is represented by the symbol P(E), which is read as "P of
E" or as "the probability of event E". P(E) is a real number between zero and one, as indicated in the
following inequality:
\[ 0 \le P(E) \le 1 \]
The sum of the probabilities for all the simple events of an experiment must equal one. That is, if
E1, E2, ..., En are the simple events for an experiment, then the following equality must be true:
P(E1) + P(E2) + ... + P(En) = 1
This equation is also sometimes expressed as
P(S) = 1
The last equation states that the probability that some outcome in the sample space will occur is
one.
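The two properties can be illustrated with a simple experiment. The sketch below assumes a fair six-sided die, which is not an example from the report, and uses exact fractions so that the sum of the probabilities is exactly one.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so each simple event has probability 1/6.
sample_space = [1, 2, 3, 4, 5, 6]
probabilities = {
    outcome: Fraction(1, len(sample_space)) for outcome in sample_space
}

# Property 1: 0 <= P(E) <= 1 for every simple event.
all_in_range = all(0 <= p <= 1 for p in probabilities.values())

# Property 2: P(E1) + P(E2) + ... + P(En) = P(S) = 1.
total_probability = sum(probabilities.values())

# A compound event ("the roll is even") as a sum over its simple events.
p_even = sum(probabilities[o] for o in sample_space if o % 2 == 0)

print(all_in_range, total_probability, p_even)
```

Here the total probability is exactly 1 and the probability of an even roll is 1/2.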
The curve on the left is shorter and wider than the curve on the right, because the curve on the left
has a bigger standard deviation.
Additionally, every normal curve (regardless of its mean or standard deviation) conforms to the
following "rule".
About 68% of the area under the curve falls within 1 standard deviation of the mean.
About 95% of the area under the curve falls within 2 standard deviations of the mean.
About 99.7% of the area under the curve falls within 3 standard deviations of the mean.
Collectively, these points are known as the empirical rule or the 68-95-99.7 rule. Clearly, given a
normal distribution, most outcomes will be within 3 standard deviations of the mean.
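The three percentages of the rule can be reproduced from the error function, since for any normal distribution the area within k standard deviations of the mean equals erf(k/√2). The helper name `area_within` is hypothetical.

```python
import math


def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


coverage = {k: area_within(k) for k in (1, 2, 3)}
for k, area in coverage.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The computed areas are approximately 0.6827, 0.9545, and 0.9973, which round to the 68%, 95%, and 99.7% of the rule.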
To find the probability associated with a normal random variable, use a graphing calculator, an
online normal distribution calculator, or a normal distribution table. In the examples below, we
illustrate the use of Stat Trek's Normal Distribution Calculator, a free online tool.
Example 1
An average light bulb manufactured by the Acme Corporation lasts 300 days with a standard
deviation of 50 days. Assuming that bulb life is normally distributed, what is the probability that an
Acme light bulb will last at most 365 days?
Solution: Given a mean of 300 days and a standard deviation of 50 days, we want to find the
cumulative probability that bulb life is less than or equal to 365 days. Thus, we know the following:
The value of the normal random variable is 365 days.
The mean is equal to 300 days.
The standard deviation is equal to 50 days.
We enter these values into the Normal Distribution Calculator and compute the cumulative
probability. The answer is: P( X ≤ 365 ) ≈ 0.90. Hence, there is about a 90% chance that a light bulb will
burn out within 365 days.
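The same cumulative probability can be obtained without an online calculator. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) as a stand-in for the calculator mentioned above; the variable names are hypothetical.

```python
from statistics import NormalDist

# Bulb life modelled as normal with mean 300 days and standard deviation 50 days.
bulb_life = NormalDist(mu=300, sigma=50)

# Cumulative probability that a bulb lasts at most 365 days.
p_at_most_365 = bulb_life.cdf(365)

print(f"P(X <= 365) = {p_at_most_365:.4f}")
```

The value is about 0.9032, which corresponds to 1.3 standard deviations above the mean and rounds to the 0.90 quoted in the solution.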
Example 2
Suppose scores on an IQ test are normally distributed. If the test has a mean of 100 and a standard
deviation of 10, what is the probability that a person who takes the test will score between 90 and
110?
Solution: Here, we want to know the probability that the test score falls between 90 and 110. The
"trick" to solving this problem is to realize the following:
P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 )
We use the Normal Distribution Calculator to compute both probabilities on the right side of the
above equation.
To compute P( X < 110 ), we enter the following inputs into the calculator: The value of the
normal random variable is 110, the mean is 100, and the standard deviation is 10. We find
that P( X < 110 ) is 0.84.
To compute P( X < 90 ), we enter the following inputs into the calculator: The value of the
normal random variable is 90, the mean is 100, and the standard deviation is 10. We find that
P( X < 90 ) is 0.16.
We use these findings to compute our final answer as follows:
P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 )
P( 90 < X < 110 ) = 0.84 - 0.16
P( 90 < X < 110 ) = 0.68
Thus, about 68% of the test scores will fall between 90 and 110.
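The subtraction of the two cumulative probabilities can be sketched in the same way, again using `statistics.NormalDist` (Python 3.8 or later) in place of the online calculator; the variable names are hypothetical.

```python
from statistics import NormalDist

# IQ scores modelled as normal with mean 100 and standard deviation 10.
iq = NormalDist(mu=100, sigma=10)

p_below_110 = iq.cdf(110)             # P(X < 110)
p_below_90 = iq.cdf(90)               # P(X < 90)
p_between = p_below_110 - p_below_90  # P(90 < X < 110)

print(f"{p_below_110:.4f} - {p_below_90:.4f} = {p_between:.4f}")
```

The two cumulative probabilities are about 0.8413 and 0.1587, so their difference is about 0.6827. Because 90 and 110 lie exactly one standard deviation on either side of the mean, this agrees with the 68% of the empirical rule.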
References
[1] Spiegel, Murray R. Schaum's Outline of Theory and Problems of Statistics.
New York: Schaum Pub., 1961. Print.
[2] Dodge, Yadolah. The Oxford Dictionary of Statistical Terms. Oxford: Oxford
UP, 2003. Web.
[3] "Normal Distribution" <http://stattrek.com/probabilitydistributions/normal.aspx>.
[4] "Poisson Distribution" <http://stattrek.com/probabilitydistributions/poisson.aspx>.