
Basics on Probability

Jingrui He
09/11/2007
Coin Flips
 You flip a coin
 Heads with probability 0.5
 You flip 100 coins
 How many heads would you expect?
Coin Flips cont.
 You flip a coin
 Heads with probability p
 Binary random variable
 Bernoulli trial with success probability p
 You flip k coins
 How many heads would you expect?
 Number of heads X: discrete random variable
 Binomial distribution with parameters k and p
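A minimal sketch of this experiment (not from the slides), using NumPy: simulate flipping k coins many times and compare the average number of heads with the Binomial mean k·p.

```python
import numpy as np

rng = np.random.default_rng(0)
k, p = 100, 0.5

# each entry is the number of heads in one run of "flip k coins"
heads = rng.binomial(n=k, p=p, size=10_000)

print("simulated average number of heads:", heads.mean())  # close to 50
print("expected number of heads (k*p):   ", k * p)
```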
Discrete Random Variables
 Random variables (RVs) which may take on
only a countable number of distinct values
 E.g. the total number of heads X you get if you
flip 100 coins

 X is a RV with arity k if it can take on exactly one value out of {x_1, …, x_k}
 E.g. the possible values that X can take on are 0,
1, 2,…, 100
Probability of Discrete RV
 Probability mass function (pmf): P(X = x_i)
 Easy facts about pmf
 Σ_i P(X = x_i) = 1
 P(X = x_i ∩ X = x_j) = 0 if i ≠ j
 P(X = x_i ∪ X = x_j) = P(X = x_i) + P(X = x_j) if i ≠ j
 P(X = x_1 ∪ X = x_2 ∪ … ∪ X = x_k) = 1
Common Distributions
 Uniform: X ~ U[1, …, N]
 X takes values 1, 2, …, N
 P(X = i) = 1/N
 E.g. picking balls of different colors from a box
 Binomial: X ~ Bin(n, p)
 X takes values 0, 1, …, n
 P(X = i) = C(n, i) p^i (1 − p)^(n−i)
 E.g. coin flips
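A small sketch (not part of the slides) evaluating both pmfs; the parameter values N = 6, n = 10, p = 0.5 are arbitrary choices, and SciPy supplies the Binomial pmf.

```python
import numpy as np
from scipy import stats

# Discrete uniform on {1, ..., N}: P(X = i) = 1/N
N = 6
uniform_pmf = np.full(N, 1 / N)
print(uniform_pmf.sum())                        # 1.0

# Binomial(n, p): P(X = i) = C(n, i) p^i (1 - p)^(n - i)
n, p = 10, 0.5
i = np.arange(n + 1)
binom_pmf = stats.binom.pmf(i, n, p)
print(binom_pmf.sum())                          # 1.0
print("P(X = 5) =", stats.binom.pmf(5, n, p))   # most likely number of heads
```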
Coin Flips of Two Persons
 Your friend and you both flip coins
 Heads with probability 0.5
 You flip 50 times; your friend flips 100 times
 How many heads will each of you get?
Joint Distribution
 Given two discrete RVs X and Y, their joint
distribution is the distribution of X and Y
together
 E.g. P(you get 21 heads AND your friend gets 70 heads)
 Σ_x Σ_y P(X = x ∩ Y = y) = 1
 E.g. Σ_{i=0}^{50} Σ_{j=0}^{100} P(you get i heads AND your friend gets j heads) = 1
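A sketch of the coin-flip example (my own illustration, not from the slides): since your 50 flips and your friend's 100 flips are independent, the joint pmf is the outer product of the two Binomial pmfs, and it sums to 1.

```python
import numpy as np
from scipy import stats

px = stats.binom.pmf(np.arange(51), 50, 0.5)    # P(you get i heads), i = 0..50
py = stats.binom.pmf(np.arange(101), 100, 0.5)  # P(friend gets j heads), j = 0..100
joint = np.outer(px, py)                        # P(X = i AND Y = j), by independence

print(joint.sum())                              # ~1.0
print("P(X = 21, Y = 70) =", joint[21, 70])
```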
Conditional Probability
 P(X = x | Y = y) is the probability of X = x, given the occurrence of Y = y
 E.g. you get 0 heads, given that your friend gets 61 heads
 P(X = x | Y = y) = P(X = x ∩ Y = y) / P(Y = y)
Law of Total Probability
 Given two discrete RVs X and Y, which take values in {x_1, …, x_m} and {y_1, …, y_n}, we have
 P(X = x_i) = Σ_j P(X = x_i ∩ Y = y_j)
            = Σ_j P(X = x_i | Y = y_j) P(Y = y_j)
Marginalization

 P(X = x_i) = Σ_j P(X = x_i ∩ Y = y_j)        (marginal probability = sum of joint probabilities)
            = Σ_j P(X = x_i | Y = y_j) P(Y = y_j)        (conditional probability × marginal probability)
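A sketch of marginalization on a small made-up joint table (the numbers are illustrative only):

```python
import numpy as np

# Joint pmf P(X = xi, Y = yj): 2 values of X (rows), 3 values of Y (columns)
joint = np.array([[0.10, 0.20, 0.10],
                  [0.05, 0.15, 0.40]])

p_x = joint.sum(axis=1)              # P(X = xi) = sum_j P(X = xi, Y = yj)
p_y = joint.sum(axis=0)              # P(Y = yj)
print("P(X):", p_x)                  # [0.4 0.6]
print("P(Y):", p_y)                  # [0.15 0.35 0.5]

# Same marginal via P(X = xi) = sum_j P(X = xi | Y = yj) P(Y = yj)
cond_x_given_y = joint / p_y         # column j holds P(X | Y = yj)
print("P(X) again:", cond_x_given_y @ p_y)
```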


Bayes Rule
 X and Y are discrete RVs…
 P(X = x | Y = y) = P(X = x ∩ Y = y) / P(Y = y)
 P(X = x_i | Y = y_j) = P(Y = y_j | X = x_i) P(X = x_i) / [ Σ_k P(Y = y_j | X = x_k) P(X = x_k) ]
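A sketch of Bayes rule on made-up numbers: a prior over two values of X and a likelihood table for Y, combined into the posterior P(X | Y = y_1).

```python
import numpy as np

prior = np.array([0.3, 0.7])              # P(X = x1), P(X = x2)
likelihood = np.array([[0.9, 0.1],        # row i: P(Y = y1 | X = xi), P(Y = y2 | X = xi)
                       [0.2, 0.8]])

j = 0                                     # we observe Y = y1
evidence = np.sum(likelihood[:, j] * prior)        # sum_k P(Y = y1 | X = xk) P(X = xk)
posterior = likelihood[:, j] * prior / evidence    # P(X = xi | Y = y1)
print(posterior, posterior.sum())                  # posterior sums to 1
```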
Independent RVs
 Intuition: X and Y are independent means that knowing X = x makes it neither more nor less probable that Y = y
 Definition: X and Y are independent iff
P (X = x ∩ Y = y) = P (X = x) P (Y = y)
More on Independence
 P(X = x ∩ Y = y) = P(X = x) P(Y = y)
 P(X = x | Y = y) = P(X = x)
 P(Y = y | X = x) = P(Y = y)
 E.g. no matter how many heads you get, your friend will not be affected, and vice versa
Conditionally Independent RVs
 Intuition: X and Y are conditionally
independent given Z means that once Z is
known, the value of X does not add any
additional information about Y
 Definition: X and Y are conditionally
independent given Z iff

P(X = x ∩ Y = y | Z = z) = P(X = x | Z = z) P(Y = y | Z = z)
More on Conditional Independence

 P(X = x ∩ Y = y | Z = z) = P(X = x | Z = z) P(Y = y | Z = z)
 P(X = x | Y = y, Z = z) = P(X = x | Z = z)
 P(Y = y | X = x, Z = z) = P(Y = y | Z = z)
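A sketch (with made-up tables) that builds a joint distribution in which X and Y are conditionally independent given Z, and then verifies the definition for every value of z.

```python
import numpy as np

p_z = np.array([0.4, 0.6])                  # P(Z = z)
p_x_given_z = np.array([[0.2, 0.8],         # row z, column x: P(X = x | Z = z)
                        [0.7, 0.3]])
p_y_given_z = np.array([[0.5, 0.5],         # row z, column y: P(Y = y | Z = z)
                        [0.1, 0.9]])

# P(X = x, Y = y, Z = z) = P(X = x | Z = z) P(Y = y | Z = z) P(Z = z)
joint = np.einsum('zx,zy,z->xyz', p_x_given_z, p_y_given_z, p_z)

for z in range(2):
    cond_xy = joint[:, :, z] / p_z[z]       # P(X = x, Y = y | Z = z)
    assert np.allclose(cond_xy, np.outer(p_x_given_z[z], p_y_given_z[z]))
print("factorization holds for every z")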
Monty Hall Problem
 You're given the choice of three doors: Behind one
door is a car; behind the others, goats.
 You pick a door, say No. 1
 The host, who knows what's behind the doors, opens
another door, say No. 3, which has a goat.
 Do you want to pick door No. 2 instead?
[Diagram: depending on where the car is, the host either reveals Goat A or Goat B, must reveal Goat B, or must reveal Goat A]
Monty Hall Problem: Bayes Rule
 C_i: the car is behind door i, i = 1, 2, 3
 P(C_i) = 1/3
 H_ij: the host opens door j after you pick door i
 P(H_ij | C_k) =
   0,   if i = j
   0,   if j = k
   1/2, if i = k
   1,   if i ≠ k and j ≠ k
Monty Hall Problem: Bayes Rule cont.
 WLOG, i = 1, j = 3
 P(C_1 | H_13) = P(H_13 | C_1) P(C_1) / P(H_13)
 P(H_13 | C_1) P(C_1) = (1/2) · (1/3) = 1/6
Monty Hall Problem: Bayes Rule cont.
 P(H_13) = P(H_13, C_1) + P(H_13, C_2) + P(H_13, C_3)
         = P(H_13 | C_1) P(C_1) + P(H_13 | C_2) P(C_2)      (the C_3 term vanishes since P(H_13 | C_3) = 0)
         = 1/6 + 1 · (1/3)
         = 1/2
 P(C_1 | H_13) = (1/6) / (1/2) = 1/3
Monty Hall Problem: Bayes Rule cont.
 P(C_1 | H_13) = (1/6) / (1/2) = 1/3
 P(C_2 | H_13) = 1 − 1/3 = 2/3 > P(C_1 | H_13)
 You should switch!
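A quick simulation (my own sketch, not from the slides) that confirms the 1/3 vs. 2/3 result above:

```python
import random

def play(switch: bool, trials: int = 100_000) -> float:
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)                    # door hiding the car
        pick = 0                                     # WLOG you pick door 1
        # the host opens a goat door that is not your pick
        host = random.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            pick = next(d for d in range(3) if d != pick and d != host)
        wins += (pick == car)
    return wins / trials

print("stay:  ", play(switch=False))   # ~1/3
print("switch:", play(switch=True))    # ~2/3
```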
Continuous Random Variables
 What if X is continuous?
 Probability density function (pdf) instead of
probability mass function (pmf)
 A pdf is any function f ( x ) that describes the
probability density in terms of the input
variable x.
PDF
 Properties of pdf
 f(x) ≥ 0, ∀x
 ∫_{−∞}^{+∞} f(x) dx = 1
 f(x) ≤ 1 ???
 Actual probability can be obtained by taking the integral of the pdf
 E.g. the probability of X being between 0 and 1 is
 P(0 ≤ X ≤ 1) = ∫_0^1 f(x) dx
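A sketch using a standard normal as the concrete pdf (my own choice of example): the total integral is 1, P(0 ≤ X ≤ 1) is the integral over [0, 1], and a pdf value itself can exceed 1.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

f = stats.norm(loc=0, scale=1).pdf

total, _ = quad(f, -np.inf, np.inf)   # integrates to 1
p01, _ = quad(f, 0, 1)                # P(0 <= X <= 1)
print(total, p01)                     # ~1.0, ~0.3413

# A pdf is a density, not a probability: it may exceed 1
narrow = stats.norm(loc=0, scale=0.1).pdf
print(narrow(0))                      # ~3.99 > 1
```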
Cumulative Distribution Function
 F_X(v) = P(X ≤ v)
 Discrete RVs
 F_X(v) = Σ_{v_i ≤ v} P(X = v_i)
 Continuous RVs
 F_X(v) = ∫_{−∞}^{v} f(x) dx
 d/dx F_X(x) = f(x)
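A sketch checking both relations for a standard normal (again my own choice of distribution): the cdf equals the integral of the pdf, and the pdf is the derivative of the cdf.

```python
from scipy import stats
from scipy.integrate import quad

dist = stats.norm(0, 1)
v = 0.5

via_integral, _ = quad(dist.pdf, -float("inf"), v)
print(dist.cdf(v), via_integral)                      # both ~0.6915

h = 1e-5                                              # numerical derivative of the cdf
print((dist.cdf(v + h) - dist.cdf(v - h)) / (2 * h))  # ~pdf(v)
print(dist.pdf(v))
```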
Common Distributions
 Normal: X ~ N(µ, σ²)
 f(x) = 1 / (√(2π) σ) · exp(−(x − µ)² / (2σ²)), x ∈ ℝ
 E.g. the height of the entire population
[Figure: plot of the standard normal pdf f(x) for x from −5 to 5]
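A sketch that evaluates the pdf formula above directly and checks it against SciPy's implementation (the parameter values are arbitrary):

```python
import numpy as np
from scipy import stats

mu, sigma = 1.5, 2.0
x = np.linspace(-5, 5, 11)

manual = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
print(np.allclose(manual, stats.norm.pdf(x, loc=mu, scale=sigma)))   # True
```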
Common Distributions cont.
 Beta: X ~ Beta(α, β)
 f(x; α, β) = x^(α−1) (1 − x)^(β−1) / B(α, β), x ∈ [0, 1]
 α = β = 1: uniform distribution between 0 and 1
 E.g. the conjugate prior for the parameter p in the Binomial distribution
[Figure: plot of a Beta pdf f(x) for x from 0 to 1]
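A sketch that evaluates the Beta pdf formula directly and illustrates the conjugacy mentioned above: a Beta(α, β) prior on p updated with (heads, tails) coin-flip counts gives a Beta(α + heads, β + tails) posterior. The specific parameters and counts are made up.

```python
import numpy as np
from scipy import stats
from scipy.special import beta as beta_fn

a, b = 2.0, 3.0
x = np.linspace(0.01, 0.99, 9)
manual = x ** (a - 1) * (1 - x) ** (b - 1) / beta_fn(a, b)
print(np.allclose(manual, stats.beta.pdf(x, a, b)))          # True

heads, tails = 7, 3                                           # observed coin flips
posterior = stats.beta(a + heads, b + tails)                  # conjugate update
print("posterior mean of p:", posterior.mean())
```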
Joint Distribution
 Given two continuous RVs X and Y, the joint
pdf can be written as f_{X,Y}(x, y)
 ∫_x ∫_y f_{X,Y}(x, y) dx dy = 1
Multivariate Normal
 Generalization to higher dimensions of the
one-dimensional normal
 f_X(x_1, …, x_d) = 1 / ((2π)^(d/2) |Σ|^(1/2)) · exp(−(1/2) (x − µ)^T Σ^(−1) (x − µ))
 Σ: covariance matrix; µ: mean vector
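A sketch of a 2-dimensional multivariate normal with an assumed mean vector and covariance matrix, comparing SciPy's pdf with the formula above evaluated directly.

```python
import numpy as np
from scipy import stats

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])      # symmetric positive definite covariance

mvn = stats.multivariate_normal(mean=mu, cov=Sigma)
x = np.array([0.5, 0.0])
print(mvn.pdf(x))

d = len(mu)
diff = x - mu
direct = np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / (
    (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma)))
print(direct)                        # same value
```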
Moments
 Mean (Expectation): µ = E(X)
 Discrete RVs: E(X) = Σ_{v_i} v_i P(X = v_i)
 Continuous RVs: E(X) = ∫_{−∞}^{+∞} x f(x) dx
 Variance: V(X) = E((X − µ)²)
 Discrete RVs: V(X) = Σ_{v_i} (v_i − µ)² P(X = v_i)
 Continuous RVs: V(X) = ∫_{−∞}^{+∞} (x − µ)² f(x) dx
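A sketch computing the mean and variance of a small discrete RV straight from the definitions above; the values and probabilities are made up.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])
probs = np.array([0.2, 0.5, 0.3])            # a valid pmf (sums to 1)

mean = np.sum(values * probs)                # E(X) = sum_i v_i P(X = v_i)
var = np.sum((values - mean) ** 2 * probs)   # V(X) = sum_i (v_i - mean)^2 P(X = v_i)
print(mean, var)                             # 2.1, 0.49
```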
Properties of Moments
 Mean
 E(X + Y) = E(X) + E(Y)
 E(aX) = aE(X)
 If X and Y are independent, E(XY) = E(X) · E(Y)
 Variance
 V(aX + b) = a² V(X)
 If X and Y are independent, V(X + Y) = V(X) + V(Y)
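A sketch that checks each property by Monte Carlo simulation with two independent samples (the distributions and constants are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(2.0, 1.0, size=1_000_000)
Y = rng.exponential(3.0, size=1_000_000)     # drawn independently of X
a, b = 4.0, 7.0

print(np.mean(X + Y), np.mean(X) + np.mean(Y))    # E(X + Y) = E(X) + E(Y)
print(np.mean(a * X), a * np.mean(X))             # E(aX) = aE(X)
print(np.mean(X * Y), np.mean(X) * np.mean(Y))    # E(XY) = E(X)E(Y) for independent X, Y
print(np.var(a * X + b), a ** 2 * np.var(X))      # V(aX + b) = a^2 V(X)
print(np.var(X + Y), np.var(X) + np.var(Y))       # V(X + Y) = V(X) + V(Y) for independent X, Y
```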
Moments of Common Distributions
 Uniform: X ~ U[1, …, N]
 Mean (1 + N)/2; variance (N² − 1)/12
 Binomial: X ~ Bin(n, p)
 Mean np; variance np(1 − p)
 Normal: X ~ N(µ, σ²)
 Mean µ; variance σ²
 Beta: X ~ Beta(α, β)
 Mean α/(α + β); variance αβ / ((α + β)² (α + β + 1))
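A sketch checking the closed forms above against SciPy's built-in mean() and var() for arbitrary parameter values (note that SciPy's randint(1, N + 1) is the discrete uniform on 1, …, N).

```python
from scipy import stats

N, n, p = 10, 20, 0.3
mu, sigma = 1.5, 2.0
a, b = 2.0, 5.0

print(stats.randint(1, N + 1).mean(), (1 + N) / 2)                   # uniform mean
print(stats.randint(1, N + 1).var(), (N ** 2 - 1) / 12)              # uniform variance
print(stats.binom(n, p).mean(), n * p)                               # binomial mean
print(stats.binom(n, p).var(), n * p * (1 - p))                      # binomial variance
print(stats.norm(mu, sigma).mean(), mu)                              # normal mean
print(stats.norm(mu, sigma).var(), sigma ** 2)                       # normal variance
print(stats.beta(a, b).mean(), a / (a + b))                          # beta mean
print(stats.beta(a, b).var(), a * b / ((a + b) ** 2 * (a + b + 1)))  # beta variance
```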
Probability of Events
 X denotes an event that could possibly happen
 E.g. X=“you will fail in this course”
 P(X) denotes the likelihood that X happens,
or X=true
 What’s the probability that you will fail in this
course?
 Ω denotes the entire event set
 Ω = {X, X̄}
The Axioms of Probabilities
 0 ≤ P(X) ≤ 1
 P(Ω) = 1
 P(X_1 ∪ X_2 ∪ …) = Σ_i P(X_i), where the X_i are disjoint events
 Useful rules
 P(X_1 ∪ X_2) = P(X_1) + P(X_2) − P(X_1 ∩ X_2)
 P(X̄) = 1 − P(X)
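A sketch checking the two useful rules by enumerating a fair six-sided die; the two events are made up for illustration.

```python
import numpy as np

omega = np.arange(1, 7)            # the six equally likely outcomes
X1 = omega <= 3                    # event "roll is 1, 2, or 3"
X2 = omega % 2 == 0                # event "roll is even"
P = lambda event: event.mean()     # probability under the uniform distribution

print(P(X1 | X2), P(X1) + P(X2) - P(X1 & X2))   # inclusion-exclusion
print(P(~X1), 1 - P(X1))                        # complement rule
```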
Interpreting the Axioms

[Venn diagram: events X1 and X2 shown as overlapping regions]
