Simple random sampling and systematic sampling provide the foundation for almost all of the more complex sampling designs based on probability sampling. They are also usually the easiest designs to implement. These two designs highlight a tradeoff inherent in selecting a sampling design: to select sample units at random to minimize the risk of introducing biases into the sample, or to select samples systematically to ensure that sample units are well distributed throughout the population.
Both designs involve selecting n sample units from the N units available in the population and can be implemented with or without replacement.
Simple Random Sampling
When the population of interest is relatively homogeneous, simple random sampling works well, which means it provides estimates that are unbiased and have high precision. When little is known about a population in advance, such as in a pilot study, simple random sampling is a common design choice.
Advantages:
Easy to implement
Requires little knowledge of the population in advance
Disadvantages:
Imprecise relative to other designs if the population is heterogeneous
More expensive than other designs if entities are clumped and the cost to travel among units is appreciable
How it is implemented:
Select n sample units at random from N available in the population
All units within the sampling universe must have the same probability of being selected; therefore, each and every sample of size n drawn from the population has an equal chance of being selected.
There are many strategies available for selecting a random sample. For large populations, this often involves generating pseudorandom numbers with a computer, and for small populations it might involve using a table of random numbers or even writing a unique identifier for every sample unit in the population on a scrap of paper, placing those scraps in a jar, shaking it, then selecting n scraps of paper from the jar blindly. The approach used for selecting the sample matters little provided there are no constraints on how the sample units are selected and all units have an equal chance of being selected.
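The computer-based selection described above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample`, the frame of 286 numbered units, and the seed are assumptions made for the example, not part of the original notes.

```python
import random

def simple_random_sample(unit_ids, n, seed=None):
    """Draw n units without replacement so that every unit is equally likely."""
    rng = random.Random(seed)          # seeded only so the draw can be repeated
    return rng.sample(list(unit_ids), n)

# Hypothetical sampling frame: units labelled 1..286
frame = range(1, 287)
sample = simple_random_sample(frame, 15, seed=1)
print(sorted(sample))
```

Sampling without replacement is used here; `random.choices` would be the analogous standard-library call for sampling with replacement.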
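Estimating the Population Mean
The population mean ($\mu$) is the true average number of entities per sample unit and is estimated with the sample mean ($\hat{\mu}$ or $\bar{y}$), which has an unbiased estimator:
$$\hat{\mu} = \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$$
where $y_i$ is the value from each unit in the sample and n is the number of units in the sample.
The population variance ($\sigma^2$) is estimated with the sample variance ($s^2$), which has an unbiased estimator:
$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$
Variance of the estimate $\hat{\mu}$ is:
$$\mathrm{var}(\hat{\mu}) = \frac{s^2}{n} \left( \frac{N - n}{N} \right)$$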
The standard error of the estimate is the square root of the variance of the estimate, which, as always, is the standard deviation of the sampling distribution of the estimate. Standard error is a useful gauge of how precisely a parameter has been estimated.
Standard error of $\hat{\mu}$ is:
$$SE(\hat{\mu}) = \sqrt{\frac{s^2}{n} \left( \frac{N - n}{N} \right)}$$
The quantity $\frac{N - n}{N}$ is the finite population correction factor (FPC), which adjusts variance of the estimator (not variance of the population, which does not change with n) to reflect the amount of information that is known about the population through the sample. Practically, the correction factor reflects the proportion of the population that remains unknown. Therefore, as the sample size n approaches the population size N, the finite population correction factor approaches zero, so the amount of variation associated with the estimate also approaches zero.
[Figure: FPC with N = 100; horizontal axis n from 0 to 100, vertical axis from 0 to 1]
When the sample size n is small relative to the population size N, the fraction of the population being sampled, n/N, is small, so the correction factor has little effect on the estimate of variance (Fig. 2, FPC.xls). If the finite population correction factor is ignored, including those cases where N is unknown, the effect on the variance of the estimator is slight when N is large. When N is small, however, the variance of the estimator can be overestimated appreciably.
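The behaviour of the correction factor can be checked numerically. The sketch below assumes N = 100, matching the figure title, and uses a hypothetical helper name `fpc`; it simply evaluates (N - n)/N at a few sample sizes.

```python
def fpc(n, N):
    """Finite population correction factor, (N - n) / N."""
    return (N - n) / N

N = 100
for n in (0, 20, 40, 60, 80, 100):
    # The factor falls linearly from 1 (nothing sampled) to 0 (a full census)
    print(n, fpc(n, N))
```

Because the factor multiplies the variance of the estimator, ignoring it (treating it as 1) can only inflate the estimated variance, which is why the overestimate matters most when n is a large share of N.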
Estimating the Population Total
Like the population mean, the total number of entities in the population is another attribute estimated commonly. Unlike the population mean or proportion, estimating the population total requires that we know the number of sampling units in a population, N.
The population total $\tau = N\mu$ is estimated with the sample total ($\hat{\tau}$), which has an unbiased estimator:
$$\hat{\tau} = N \hat{\mu} = \frac{N}{n} \sum_{i=1}^{n} y_i$$
where N is the total number of sample units in a population, n is the number of units in the sample, and $y_i$ is the value measured from each sample unit.
In studies of wildlife populations, the total number of entities in a population is often referred to as abundance and is traditionally represented with the symbol N. Consequently, there is real potential for confusing the number of entities in the population with the number of sampling units in the sampling frame. Therefore, in the context of sampling theory, we'll use $\tau$ to represent the population total and N to represent the number of sampling units in a population. Later, when addressing wildlife populations specifically, we'll use N to represent abundance to remain consistent with the literature in that field.
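Because the estimator $\hat{\tau}$ is simply the number of sample units in the population N times the mean number of entities per sample unit, $\hat{\mu}$, the variance of the estimate $\hat{\tau}$ reflects both the number of units in the sampling universe N and the variance associated with $\hat{\mu}$. An unbiased estimate for the variance of the estimate $\hat{\tau}$ is:
$$\mathrm{var}(\hat{\tau}) = N^2 \, \mathrm{var}(\hat{\mu}) = N^2 \, \frac{s^2}{n} \left( \frac{N - n}{N} \right)$$
where $s^2$ is the estimated population variance.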
Example: Estimating a caribou population in Alaska.
Caribou were counted in strip transects that were 1 mile wide. A simple random sample of 15 transects (n) was chosen from the 286 transects potentially available (N). The numbers of caribou counted were 1, 50, 21, 98, 2, 36, 4, 29, 7, 15, 86, 10, 21, 5, 4.
The sample mean number of caribou counted per transect: $\hat{\mu} = 25.93$
The sample variance: $s^2 = 919.0667$
The estimated variance of the sample mean:
$$\widehat{\mathrm{var}}(\bar{y}) = \left( \frac{286 - 15}{286} \right) \frac{919.07}{15} = 58.0576$$
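The caribou figures can be reproduced with Python's standard library. The sketch below uses the 15 counts and N = 286 from the example; the variable names are illustrative, and the last lines extend the calculation to the population total using the estimators given earlier (total = N times the mean, variance of the total = N squared times the variance of the mean).

```python
from statistics import mean, variance

counts = [1, 50, 21, 98, 2, 36, 4, 29, 7, 15, 86, 10, 21, 5, 4]
N = 286                      # transects available in the population
n = len(counts)              # transects sampled (15)

y_bar = mean(counts)                     # sample mean, about 25.93
s2 = variance(counts)                    # sample variance (n - 1 denominator), about 919.07
var_mean = (s2 / n) * (N - n) / N        # variance of the mean with FPC, about 58.06
se_mean = var_mean ** 0.5

tau_hat = N * y_bar                      # estimated total number of caribou
var_tau = N ** 2 * var_mean              # variance of the estimated total
se_tau = var_tau ** 0.5

print(y_bar, s2, var_mean, tau_hat, se_tau)
```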
Estimating a Population Proportion
If there is interest in the composition of a population, we could use a simple random sample to estimate the proportion of the population p that is composed of elements with a particular trait, such as the proportion of plants that flower in a given year, the proportion of juvenile animals captured, the proportion of females in estrus, and so on. We will consider only classifications that follow binomial trials, which means that an element in the population either has the trait of interest (flowering) or does not (not flowering), although extending this idea to more complex settings is straightforward.
In the case of simple random sampling, the population proportion follows the mean exactly; that is, $p = \mu$. If this idea is new to you, convince yourself by working through an example. Say we generate a sample of 10 elements, where 4 have a value of 1 and 6 have a value of 0 (1 = presence of a trait, 0 = absence of a trait). The proportion of the sample with the trait is 4/10 or 0.40 and so is the arithmetic mean, which = 0.40 ([1+1+1+1+0+0+0+0+0+0]/10 = 4/10). Cosmic.
It follows that the population proportion (p) is estimated with the sample proportion ($\hat{p}$), which has an unbiased estimator:
$$\hat{p} = \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} y_i$$
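Because we are dealing with dichotomous proportions (a sample unit does or does not have the trait), the population variance $\sigma^2$ is computed based on variance for a binomial, which is the proportion of the population with the trait ($p$) times the proportion that does not have that trait ($1 - p$), or $p(1 - p)$. The estimate of the population variance $s^2$ is: $\hat{p}(1 - \hat{p})$.
Variance of the estimate $\hat{p}$ is:
$$\mathrm{var}(\hat{p}) = \left( \frac{N - n}{N} \right) \frac{s^2}{n - 1} = \left( \frac{N - n}{N} \right) \frac{\hat{p}(1 - \hat{p})}{n - 1}$$
Standard error of $\hat{p}$ is:
$$SE(\hat{p}) = \sqrt{\left( \frac{N - n}{N} \right) \frac{s^2}{n - 1}} = \sqrt{\left( \frac{N - n}{N} \right) \frac{\hat{p}(1 - \hat{p})}{n - 1}}$$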
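As a rough check of the proportion estimators, the sketch below applies them to the 10-element sample of ones and zeros from the text. The population size N = 200 and the function name `proportion_estimate` are assumptions made only for illustration.

```python
def proportion_estimate(y, N):
    """Return (p_hat, var_p, se_p) for 0/1 data y from a simple random sample."""
    n = len(y)
    p_hat = sum(y) / n                                   # sample proportion = sample mean
    var_p = ((N - n) / N) * p_hat * (1 - p_hat) / (n - 1)
    return p_hat, var_p, var_p ** 0.5

y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]      # 4 units with the trait, 6 without
N = 200                                  # hypothetical population size
p_hat, var_p, se_p = proportion_estimate(y, N)
print(p_hat, var_p, se_p)
```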
Systematic Sampling
Occasionally, selecting sample units at random can introduce logistical challenges that preclude collecting data efficiently. If the chance of introducing a bias is low, or if ideal dispersion of sample units in the population is a higher priority than a strictly random sample, then it might be appropriate to choose samples nonrandomly. Like simple random sampling, systematic sampling is a type of probability sampling where each element in the population has a known and equal probability of being selected. The probabilistic framework is maintained through selection of one or more random starting points. Although sometimes more convenient, systematic sampling provides less protection against introducing biases in the sample compared to random sampling. Estimators for systematic sampling and simple random sampling are identical; only the method of sample selection differs. Therefore, systematic sampling is used to simplify the process of selecting a sample or to ensure ideal dispersion of sample units throughout the population.
Advantages:
Easy to implement
Maximum dispersion of sample units throughout the population
Requires minimum knowledge of the population
Disadvantages:
Less protection from possible biases
Can be imprecise and inefficient relative to other designs if the population being sampled is heterogeneous
How it is implemented:
Choose a starting point at random
Select samples at uniform intervals thereafter
1-in-k systematic sample
Most commonly, a systematic sample is obtained by randomly selecting 1 unit from the first k units in the population and every kth element thereafter. This approach is called a 1-in-k systematic sample with a random start. To choose k so that a sample of appropriate size is selected, calculate:
k = Number of units in population / Number of sample units required
For example, if we plan to choose 40 plots from a field of 400 plots, k = 400/40 = 10, so this design would be a 1-in-10 systematic sample. The example in the figure is a 1-in-8 sample drawn from a population of N = 300; this yields n = 28. Note that the sample size drawn will vary and depends on the location of the first unit drawn.
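A 1-in-k draw with a random start might be sketched as follows, assuming units are numbered 1 to N in the order they would be visited. The function names and the small N = 25, k = 4 case are hypothetical and serve only to show that the sample size can depend on the starting unit.

```python
import random

def one_in_k_from_start(N, k, start):
    """Units start, start + k, start + 2k, ... up to N (units numbered 1..N)."""
    return list(range(start, N + 1, k))

def one_in_k_sample(N, k, seed=None):
    """1-in-k systematic sample with a random start among the first k units."""
    rng = random.Random(seed)
    start = rng.randint(1, k)            # randint includes both endpoints
    return one_in_k_from_start(N, k, start)

# The 400-plot field from the text: k = 400 / 40 = 10
plots = one_in_k_sample(400, 10, seed=7)

# When N is not a multiple of k, the sample size depends on the start
sizes = {len(one_in_k_from_start(25, 4, s)) for s in range(1, 5)}
print(len(plots), sizes)
```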
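Estimating the Population Mean
The population mean ($\mu$) is estimated with:
$$\hat{\mu} = \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$$
The population variance ($\sigma^2$) is estimated with:
$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$
Variance of the estimate $\hat{\mu}$ is:
$$\mathrm{var}(\hat{\mu}) = \frac{s^2}{n} \left( \frac{N - n}{N} \right)$$
Standard error of $\hat{\mu}$ is:
$$SE(\hat{\mu}) = \sqrt{\frac{s^2}{n} \left( \frac{N - n}{N} \right)}$$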
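Estimating the Population Total
The population total $\tau$ is estimated with:
$$\hat{\tau} = N \hat{\mu} = \frac{N}{n} \sum_{i=1}^{n} y_i$$
Variance of the estimate $\hat{\tau}$ is:
$$\mathrm{var}(\hat{\tau}) = N^2 \, \frac{s^2}{n} \left( \frac{N - n}{N} \right)$$
Standard error of $\hat{\tau}$ is:
$$SE(\hat{\tau}) = \sqrt{N^2 \, \frac{s^2}{n} \left( \frac{N - n}{N} \right)}$$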
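Estimating the Population Proportion
The population proportion (p) is estimated with the sample proportion ($\hat{p}$), which has an unbiased estimator:
$$\hat{p} = \frac{1}{n} \sum_{i=1}^{n} y_i$$
Because we are estimating a dichotomous proportion, the population variance $\sigma^2$ is again computed with a binomial, which is the proportion of the population with the trait ($p$) times the proportion without that trait ($1 - p$), or $p(1 - p)$. The estimate of the population variance $s^2$ is: $\hat{p}(1 - \hat{p})$.
Variance of the estimate $\hat{p}$ is:
$$\mathrm{var}(\hat{p}) = \left( \frac{N - n}{N} \right) \frac{s^2}{n - 1} = \left( \frac{N - n}{N} \right) \frac{\hat{p}(1 - \hat{p})}{n - 1}$$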
How Many Samples?
Optimal allocation is an approach to maximize sampling efficiency; that is, to provide high precision for a given amount of sampling effort.
A different question is: how many samples should we take from the population?
First, establish the degree of precision required, B, the bound on the error of estimation, which is the half-width of the confidence interval we wish to attain from sampling. Determine the sample size n required by setting $Z \times SE(\bar{y})$ equal to B and solving this expression for n.
Z is a constant that denotes the upper $\alpha/2$ point of the standard normal distribution, where $\alpha$ is the same value used to establish the width of confidence intervals.
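Population Mean
For simple random sampling, set:
$$B = Z \sqrt{\frac{\sigma^2}{n} \left( \frac{N - n}{N} \right)}$$
and solve for n to get:
$$n = \frac{1}{\dfrac{B^2}{Z^2 \sigma^2} + \dfrac{1}{N}} \quad \text{or} \quad n = \frac{1}{\dfrac{1}{n_0} + \dfrac{1}{N}}, \qquad n_0 = \frac{Z^2 \sigma^2}{B^2}$$
Note that if n will be small relative to N, the population correction factor can be ignored, and the formula for sample size reduces to $n_0$.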
Example: Estimate the average body mass of male freshmen on campus.
Assume that no prior information exists with which to estimate population variance $\sigma^2$, but we know that the mass of most male freshmen is within a range of about 100 pounds and there are N = 1000 students. How many samples are needed to estimate $\mu$ with a bound on the error of estimation B = 3 pounds using simple random sampling?
Although it is best to have data with which to estimate $\sigma^2$, perhaps from a small pilot study, the range is often approximately equal to $4\sigma$, so one-fourth of the range might be used as an approximate value of $\sigma$:
$$\sigma \approx \frac{\text{range}}{4} = \frac{100}{4} = 25$$
Substituting, with Z = 2:
$$n = \frac{1}{\dfrac{3^2}{2^2 \cdot 25^2} + \dfrac{1}{1000}} = \frac{1}{\dfrac{1}{277.78} + \dfrac{1}{1000}} = \frac{1}{0.0036 + 0.001} = 217.4$$
Therefore, about 218 samples are needed to estimate $\mu$ with a bound on the error of estimation B = 3 pounds.
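The body-mass calculation can be written as a small function. This is a sketch under the example's own assumptions (standard deviation taken as one-fourth of the range, Z = 2, N = 1000, B = 3); the name `sample_size_mean` is hypothetical.

```python
import math

def sample_size_mean(sigma, B, N, z=2.0):
    """Sample size needed to estimate a mean to within B, with and without FPC."""
    n0 = (z * sigma / B) ** 2            # ignores the finite population correction
    n = 1 / (1 / n0 + 1 / N)             # adjusted for a population of N units
    return n0, n

sigma = 100 / 4                          # range / 4 as a rough standard deviation
n0, n = sample_size_mean(sigma, B=3, N=1000, z=2)
n_required = math.ceil(n)                # round up to a whole number of samples
print(n0, n, n_required)
```

Rounding up rather than to the nearest integer keeps the achieved bound at or below B.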
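Population Total
For simple random sampling, solve for n from:
$$B = Z \sqrt{N^2 \, \frac{\sigma^2}{n} \left( \frac{N - n}{N} \right)}$$
to get:
$$n = \frac{1}{\dfrac{B^2}{N^2 Z^2 \sigma^2} + \dfrac{1}{N}} \quad \text{or} \quad n = \frac{1}{\dfrac{1}{n_0} + \dfrac{1}{N}}, \qquad n_0 = \frac{N^2 Z^2 \sigma^2}{B^2}$$
Again, if N is large relative to n, the population correction factor can be ignored, and the formula for sample size reduces to $n_0$.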
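Example: What sample size is necessary to estimate the caribou population we examined to within B = 2000 animals of the true total with 90% confidence ($\alpha = 0.10$)?
Using $s^2 = 919$ from earlier and Z = 1.645, which is the upper $\alpha/2 = 0.10/2 = 0.05$ point of the normal distribution:
$$n_0 = \frac{286^2 \, (1.645)^2 \, (919)}{2000^2} \approx 51$$
To adjust for finite population size:
$$n = \frac{1}{\dfrac{1}{51} + \dfrac{1}{286}} \approx 44 \text{ (rounding up)}$$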
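The same steps for the caribou total can be sketched in Python. The inputs (s2 = 919, Z = 1.645, N = 286, B = 2000) come from the example; the function name `sample_size_total` is an assumption for illustration.

```python
import math

def sample_size_total(s2, B, N, z):
    """Sample size needed to estimate a population total to within B."""
    n0 = N ** 2 * z ** 2 * s2 / B ** 2   # ignores the finite population correction
    n = 1 / (1 / n0 + 1 / N)             # adjusted for a population of N units
    return n0, n

n0, n = sample_size_total(s2=919, B=2000, N=286, z=1.645)
print(math.ceil(n0), math.ceil(n))       # both rounded up to whole transects
```

Carrying the unrounded n0 into the second step gives a slightly smaller n than substituting 51, but both round up to the same whole number of transects.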
Stratified Random Sampling
The way we have selected sample units thus far has required that we know little about the population of interest in advance of selecting the sample. This approach works best only when the characteristic of interest is relatively homogeneous across the population. If, however, the characteristic is heterogeneous, then estimates based on these designs will be imprecise. If we have ancillary information that is associated with the heterogeneity in the population, we can use alternate designs to select samples, which will yield increased precision for a fixed amount of effort. The first of these designs is stratified random sampling.
A stratified random sample is one obtained by dividing the population elements into mutually exclusive, nonoverlapping groups (strata), then selecting a simple random sample from within each stratum (stratum is the singular of strata). Every potential sample unit can be assigned to only one stratum and no units can be excluded.
Stratifying involves classifying sampling units of the population into relatively homogeneous groups, usually before selecting sample units. Strata are based on information other than the characteristic being measured that is known or thought to vary with the characteristic of interest in such a way that the characteristic is more homogeneous within strata than among strata. Therefore, any feature that explains variation in the characteristic of interest can be used as a basis for stratifying. For example, if our goal is to estimate the total number of agaves in an area and we know from previous work that agave abundance varies with soil type, we might choose to stratify the population by soil type. Because samples within strata are likely to be more similar than those among strata, sampling error will be lower and estimates generated within strata will have higher precision than estimates from simple random samples drawn from the same population.
As most ecological systems are heterogeneous, stratifying is a common approach for increasing precision in ecological studies. Common strata in ecological studies include elevation, aspect, or other geographic features for studying plant communities, and vegetation communities for studying animal communities. When choosing among potential strata, you should seek to minimize variation within strata and maximize variation among strata.
Stratified random sampling is appropriate whenever there is heterogeneity in a population that can be classified with ancillary information; the more distinct the strata, the higher the gains in precision. The same population can be stratified multiple times simultaneously.
Advantages:
Higher precision of estimates
More efficient
Separate estimates for each stratum
Disadvantages:
Requires ancillary information
Can be more time consuming to plan and implement
How it is implemented:
Divide the entire population into nonoverlapping strata
Select a simple random sample from within each stratum
L = number of strata
$N_i$ = number of sample units within stratum i
N = number of sample units in the population
Estimating the Population Mean
Estimates from stratified random samples are simply the weighted sum of estimates from a series of simple random samples, each generated within a unique stratum. This should be apparent in the estimators below, such as that for the population mean, which is an average of the means from each stratum weighted by the number of sample units within each stratum ($N_i$). With only one stratum, stratified random sampling reduces to simple random sampling.
The population mean ($\mu$) is estimated with:
$$\hat{\mu} = \frac{1}{N} \left( N_1 \hat{\mu}_1 + N_2 \hat{\mu}_2 + \cdots + N_L \hat{\mu}_L \right) = \frac{1}{N} \sum_{i=1}^{L} N_i \hat{\mu}_i$$
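Variance of the estimate $\hat{\mu}$ is again just a weighted average of estimates from a series of random samples, although it looks a bit cumbersome:
$$\mathrm{var}(\hat{\mu}) = \frac{1}{N^2} \left[ N_1^2 \left( \frac{N_1 - n_1}{N_1} \right) \frac{s_1^2}{n_1} + N_2^2 \left( \frac{N_2 - n_2}{N_2} \right) \frac{s_2^2}{n_2} + \cdots + N_L^2 \left( \frac{N_L - n_L}{N_L} \right) \frac{s_L^2}{n_L} \right] = \frac{1}{N^2} \sum_{i=1}^{L} N_i^2 \left( \frac{N_i - n_i}{N_i} \right) \frac{s_i^2}{n_i}$$
Standard error of $\hat{\mu}$ is:
$$SE(\hat{\mu}) = \sqrt{\frac{1}{N^2} \sum_{i=1}^{L} N_i^2 \left( \frac{N_i - n_i}{N_i} \right) \frac{s_i^2}{n_i}}$$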
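Estimating the Population Total
Like the mean, estimating a total for a stratified random sample is a matter of summing individual estimates of the total from each stratum, $N_i \hat{\mu}_i$.
The population total $\tau$ is estimated with:
$$\hat{\tau} = N_1 \hat{\mu}_1 + N_2 \hat{\mu}_2 + \cdots + N_L \hat{\mu}_L = \sum_{i=1}^{L} N_i \hat{\mu}_i = N \hat{\mu}$$
Variance of the estimate $\hat{\tau}$ is:
$$\mathrm{var}(\hat{\tau}) = N^2 \, \mathrm{var}(\hat{\mu}) = \sum_{i=1}^{L} N_i^2 \left( \frac{N_i - n_i}{N_i} \right) \frac{s_i^2}{n_i}$$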
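Estimating the Population Proportion
Estimating the proportion of the population with a particular trait (p) using stratified random sampling involves combining estimates from multiple simple random samples, each generated within a stratum.
The population proportion is estimated with the sample proportion:
$$\hat{p} = \frac{1}{N} \left( N_1 \hat{p}_1 + N_2 \hat{p}_2 + \cdots + N_L \hat{p}_L \right) = \frac{1}{N} \sum_{i=1}^{L} N_i \hat{p}_i$$
Variance of the estimate $\hat{p}$ is:
$$\mathrm{var}(\hat{p}) = \frac{1}{N^2} \sum_{i=1}^{L} N_i^2 \, \mathrm{var}(\hat{p}_i) = \frac{1}{N^2} \sum_{i=1}^{L} N_i^2 \left( \frac{N_i - n_i}{N_i} \right) \frac{\hat{p}_i (1 - \hat{p}_i)}{n_i - 1}$$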
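Example: Simple example of 12 samples taken from a population of 41 entities.
Stratum (i)   N_i   n_i   ybar_i   s_i^2
1             20    5     1.6      3.3
2             9     3     2.8      4.0
3             12    4     0.6      2.2
Estimate of the population mean:
$$\bar{y} = \frac{1}{41} \left[ 20(1.6) + 9(2.8) + 12(0.6) \right] = \frac{64.4}{41} = 1.57$$
Estimate of the population total = $41 \times 1.57 = 64.4$.
Estimated variance of the estimated population mean is:
$$\widehat{\mathrm{var}}(\bar{y}) = \frac{1}{41^2} \left[ 20^2 \left( \frac{20 - 5}{20} \right) \frac{3.3}{5} + 9^2 \left( \frac{9 - 3}{9} \right) \frac{4.0}{3} + 12^2 \left( \frac{12 - 4}{12} \right) \frac{2.2}{4} \right] = \frac{322.8}{41^2} = 0.192$$
Estimated variance of the estimated population total = $41^2 \times 0.192 = 322.8$.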
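The stratified estimators can be checked against this example with a short script. The stratum sizes, means, and variances come from the example; the per-stratum sample sizes of 5, 3, and 4 are an assumption here, chosen because they sum to the 12 samples stated and reproduce the reported variance of 322.8.

```python
# One tuple per stratum: (N_i, n_i, ybar_i, s2_i)
# n_i = 5, 3, 4 are assumed values (see the note above)
strata = [
    (20, 5, 1.6, 3.3),
    (9, 3, 2.8, 4.0),
    (12, 4, 0.6, 2.2),
]

N = sum(N_i for N_i, _, _, _ in strata)                   # 41 units in the population

tau_hat = sum(N_i * ybar for N_i, _, ybar, _ in strata)   # estimated total
mu_hat = tau_hat / N                                      # estimated mean

# Each stratum contributes N_i^2 * FPC_i * s_i^2 / n_i to the variance of the total
var_tau = sum(
    N_i ** 2 * ((N_i - n_i) / N_i) * s2 / n_i
    for N_i, n_i, _, s2 in strata
)
var_mu = var_tau / N ** 2                                 # variance of the estimated mean

print(mu_hat, tau_hat, var_mu, var_tau)
```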
Allocating Sampling Effort among Strata
After deciding to use stratified random sampling, we need to decide how to divide sampling effort among different strata; that process is called allocation. When deciding where to expend effort, the question becomes how best to allocate sampling effort among strata so that the sampling process will be the most efficient balance of effort, cost, and precision. Should we allocate the same sampling effort to each stratum? If strata are of different sizes, as is usually the case, should we allocate more effort to larger strata?
There are many strategies for allocating sampling effort, and the more information available about the population of interest, the more efficient the allocation strategy can be. Information on the variability of samples within each stratum, the relative cost of obtaining a sample from each stratum, and the number of sample units in each stratum can all help to increase sampling efficiency. Some of the most common allocation strategies are uniform; proportional to size, variation, or cost; and optimal, which simultaneously considers size, variation, and cost or whichever combination of those is available. All strategies function by creating a simple proportional multiplier by which a fixed number of samples can be allocated among strata.
Uniform Allocation
The simplest allocation strategy is to select the same number of samples from each stratum, which is an ideal approach if there is no information available about variability of units within strata, the cost of sampling is similar for all strata, and strata are of similar size.
Allocation Proportional to Size or Variation
The number of sample units to select from each stratum can be made proportional to the number of sample units (or size) within each stratum. Variation in a stratum often increases with the size of a stratum, so in some cases this approach can be considered a rough way of allocating more effort to strata that are likely to be more variable. To allocate in proportion to stratum size:
$$n_i = n \left( \frac{N_i}{\sum_{k=1}^{L} N_k} \right) = n \, \frac{N_i}{N}$$
To allocate in proportion to the amount of variation among elements within each stratum, as measured by the estimated standard deviation within each stratum:
$$n_i = n \left( \frac{s_i}{\sum_{k=1}^{L} s_k} \right)$$
This approach relies on estimates generated from a previous study or, alternatively, on the ability to gauge relative differences in variation among strata, such as expecting one stratum to have 1.5 times the variation of another stratum.
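Both proportional rules share the same multiplier form, which can be sketched with one helper. The function name `allocate` is hypothetical, and the stratum sizes and variances are borrowed from the 41-unit example purely for illustration; the results are fractional, and how to round them to whole samples is left open.

```python
def allocate(n, weights):
    """Split n samples among strata in proportion to the given weights."""
    total = sum(weights)
    return [n * w / total for w in weights]

n = 12
stratum_sizes = [20, 9, 12]                      # N_i
stratum_sds = [3.3 ** 0.5, 4.0 ** 0.5, 2.2 ** 0.5]   # s_i from the variances 3.3, 4.0, 2.2

by_size = allocate(n, stratum_sizes)             # proportional to N_i
by_sd = allocate(n, stratum_sds)                 # proportional to s_i
print(by_size, by_sd)
```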
Optimal Allocation
Both allocation approaches above are special cases of the optimal allocation strategy, which estimates the population mean or total with the lowest variance for a given sample size in stratified random sampling. The number of samples selected from each stratum depends on the size and variation of the stratum as well as the cost ($c_i$) of sampling in it. More sampling effort is allocated to larger and more variable strata, and less to strata that are more costly to sample.
$$n_i = n \left( \frac{N_i s_i / \sqrt{c_i}}{\sum_{k=1}^{L} N_k s_k / \sqrt{c_k}} \right)$$
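A minimal sketch of the optimal rule follows, assuming cost enters through its square root, which is the usual textbook form; the radical is not legible in the formula as extracted, so treat that as an assumption. The function name `optimal_allocation` and the cost values are hypothetical.

```python
import math

def optimal_allocation(n, sizes, sds, costs):
    """Allocate n samples in proportion to N_i * s_i / sqrt(c_i)."""
    weights = [N_i * s_i / math.sqrt(c_i)
               for N_i, s_i, c_i in zip(sizes, sds, costs)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [20, 9, 12]

# With equal variation and equal cost, the rule reduces to allocation by size
equal_case = optimal_allocation(12, sizes, [1, 1, 1], [1, 1, 1])

# Making the first stratum four times as costly (hypothetical) shifts effort away from it
costly_first = optimal_allocation(12, sizes, [1, 1, 1], [4, 1, 1])
print(equal_case, costly_first)
```

Setting all costs and standard deviations equal recovers allocation proportional to size, which is one way to see that the simpler rules are special cases.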