
3 October 2005

We summarize samples with quantities like the sample mean, sample variance, etc.

If the population has a certain distribution, and we take a sample/collect data, we are drawing multiple random variables. Summaries are functions of samples. In general, we call a function of the sample a statistic.

We try to generate samples so that each measurement is independent, because this maximizes the information we get about the population.

Sampling distribution = distribution of the summary statistic, when the observations are drawn independently from a fixed distribution. All of the machinery of probability, random variables, etc., we have developed so far is to let us model this mathematically.

For instance, $X_1, X_2, \ldots, X_n \sim P$ is a sample of size $n$.

Now let’s consider $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$. $\bar{X}_n$ is the random variable which represents the sample mean.

We want to know what happens to the sampling distributions with large samples.

There are two major results here: the law of large numbers, and the central limit theorem. Together, these are the Law and the Prophets of probability and statistics.

The law of large numbers says that large random samples are almost always representative of the population.

Specifically, for any $\epsilon > 0$,

$$\Pr\left(|\bar{X}_n - \mu| > \epsilon\right) \to 0$$

or

$$\Pr\left(\bar{X}_n \to \mu\right) = 1$$

Let’s prove this. Assume $\mu$ and $\sigma$ are both finite.

$$E\left[\bar{X}_n\right] = E\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \frac{1}{n} E\left[\sum_{i=1}^n X_i\right] = \frac{1}{n}\sum_{i=1}^n E[X_i] = \frac{1}{n}\, n\mu = \mu$$

$$\mathrm{Var}\left(\bar{X}_n\right) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$

Now let’s use Chebyshev’s inequality: for any $\epsilon > 0$,

$$\Pr\left(|\bar{X}_n - E[\bar{X}_n]| > \epsilon\right) \le \frac{\mathrm{Var}(\bar{X}_n)}{\epsilon^2}$$

$$\Pr\left(|\bar{X}_n - \mu| > \epsilon\right) \le \frac{\sigma^2}{n\epsilon^2}$$

$$\Pr\left(|\bar{X}_n - \mu| > \epsilon\right) \to 0 \text{ as } n \to \infty$$

So we can make sure that the sample average $\bar{X}_n$ is probably a good approximation to the population average $\mu$ just by taking enough samples. We choose a degree of approximation ($\epsilon$) and a probability level ($\delta$), and that sets the number of samples: taking

$$n = \frac{\sigma^2}{\epsilon^2 \delta}$$
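The law of large numbers is easy to watch in action with a short simulation. This is a sketch under my own choices (Exp(1) draws, whose population mean is $\mu = 1$, and a few sample sizes; none of these specifics are from the notes):

```python
import random
import statistics

random.seed(0)  # for reproducibility

def sample_mean(n):
    # Average n draws from Exp(1), whose population mean is mu = 1.
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

mu = 1.0
results = {n: sample_mean(n) for n in (10, 100, 10_000)}
for n, xbar in results.items():
    print(n, xbar, abs(xbar - mu))
```

As $n$ grows, $|\bar{X}_n - \mu|$ shrinks, just as the Chebyshev bound $\sigma^2/(n\epsilon^2)$ predicts.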

The central limit theorem (CLT) is the most important result in statistics. $X_1, \ldots, X_n$ are independent, again, with mean $\mu$ and variance $\sigma^2$. Then for large $n$

$$\sum_{i=1}^n X_i \sim N(n\mu, n\sigma^2)$$

$$\bar{X}_n \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$$

and more exactly,

$$\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$

Moral: the sampling distribution of the mean will be Gaussian, if you just take enough independent samples.
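A quick numerical check of the standardized form: draw many samples, standardize each sample mean, and look at the result. The Exp(1) population and the sizes here are my illustrative choices, not from the notes:

```python
import math
import random
import statistics

random.seed(1)

n = 50            # size of each sample
reps = 2000       # how many samples to draw
mu = sigma = 1.0  # mean and standard deviation of Exp(1)

# Standardized sample means: (Xbar_n - mu) / (sigma / sqrt(n)).
z = [
    (statistics.fmean(random.expovariate(1.0) for _ in range(n)) - mu)
    / (sigma / math.sqrt(n))
    for _ in range(reps)
]

# The CLT says these should look like draws from N(0, 1).
print(statistics.fmean(z), statistics.stdev(z))
```

Even though Exp(1) is quite skewed, the standardized means come out with mean near 0 and standard deviation near 1.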

Why? Basically: the Gaussian distribution is “stable”:

• the sum of two independent Gaussians is another Gaussian;
• the product of a Gaussian and a constant is a Gaussian.

So averaging, which is adding independent variables and dividing by constants, leaves Gaussians alone. So non-Gaussian distributions tend to be “attracted” to Gaussians, when you add up enough of them.[1] Notice that, because of the two facts I just mentioned, the central limit theorem always holds exactly if $X$ itself is Gaussian, rather than just being a large-sample approximation.
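The first stability fact is easy to verify numerically. A minimal sketch (the particular means and variances are arbitrary choices of mine):

```python
import random
import statistics

random.seed(2)

# X ~ N(1, 2^2) and Y ~ N(-3, 1^2), independent.
# Stability says X + Y ~ N(1 + (-3), 2^2 + 1^2) = N(-2, 5).
s = [random.gauss(1, 2) + random.gauss(-3, 1) for _ in range(100_000)]

# Empirical mean and variance of the sum, to compare with -2 and 5.
print(statistics.fmean(s), statistics.variance(s))
```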

Rule of thumb: the CLT ought to hold pretty well by n = 30, or at the worst n = 100. Something’s probably wrong if it’s not working by that point.

Application: the Gaussian approximation to the binomial, which we saw before, is an instance of the CLT. The binomial, remember, is a sum of $n$ independent Bernoulli variables, each with mean $p$ and variance $p(1-p)$. So if $B \sim \mathrm{Bin}(n, p)$, by the central limit theorem, if $n$ is large

$$\frac{B}{n} \sim N(p, p(1-p)/n)$$

$$B \sim N(np, np(1-p))$$

$$B \sim N(E[B], \mathrm{Var}(B))$$
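We can see this approximation numerically by simulating binomial draws as sums of Bernoulli trials and comparing the empirical mean and variance to $np$ and $np(1-p)$. The values n = 400, p = 0.3 below are my illustrative choices:

```python
import random
import statistics

random.seed(3)

n, p = 400, 0.3
reps = 5000

# Each Bin(n, p) draw is a sum of n independent Bernoulli(p) trials.
draws = [sum(random.random() < p for _ in range(n)) for _ in range(reps)]

# CLT prediction: mean near np = 120, variance near np(1-p) = 84.
print(statistics.fmean(draws), statistics.variance(draws))
```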

More generally, whenever the outcome is a sum of many independent factors, each of which makes some small, but not necessarily equal contribution to the outcome, you should expect the result to be Gaussian. We’ll see more about this next time, under the heading of “propagation of errors”.

In the bad old days before computers, you had to work out the sampling distribution by hand, using clever math. A lot of old-fashioned statistics assumes things are Gaussian, or Poisson, etc., because in those special cases, you can compute the sampling distribution of important statistics, like the median or the variance. Part of the importance of the central limit theorem was that it gave people a way around this, by providing a general mathematical result about the sampling distribution of an especially important statistic, namely the sample mean. You could then devote your time to turning the statistic you cared about into some kind of sample mean.

These days, however, it’s comparatively simple to just simulate sampling from whatever distribution we like, and calculate the statistic for each of the simulated samples. We can make the simulated distribution come arbitrarily close to the real sampling distribution just by simulating often enough, which is generally fast. We’ll see some examples of this later on. Even with simulation, however, we’ll still see that the central limit theorem is incredibly important to statistics.
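As a sketch of this simulation approach (the choice of the median as the statistic, the Exp(1) population, and the sample size are mine, for illustration): draw many samples, compute the statistic on each, and the collection of computed values approximates the sampling distribution.

```python
import random
import statistics

random.seed(4)

def sampling_distribution(statistic, draw, n, reps=2000):
    # Approximate the sampling distribution of `statistic` for samples
    # of size n by simulating `reps` independent samples.
    return [statistic([draw() for _ in range(n)]) for _ in range(reps)]

# Sampling distribution of the median, for samples of size 25 from
# Exp(1); the true median of Exp(1) is ln 2, about 0.693.
meds = sampling_distribution(statistics.median,
                             lambda: random.expovariate(1.0), 25)
print(statistics.fmean(meds), statistics.stdev(meds))
```

The same function works for any statistic and any distribution we can draw from, which is exactly the point: no clever math about medians required.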

[1] There are stable non-Gaussian distributions, but they have infinite variance, so you won’t see them in this course.
