You are on page 1of 6

CPSC 490

Graph Theory: MST's and Euler cycles

Union/Find (Disjoint Sets datastr u c t u r e )


Let'sstepawayfromgraphsforamomentandsolvethefollowingproblem.Consideracollectionofn
people,eachofwhom belongstoaparticularpoliticalparty.Weneedadatastructuretostorethe
assignmentofpeopletoparties.Thedatastructureneedstosupportjustthese2operations:
intFIND(intx):
Givenaperson,x,returnstheleaderofx'sparty.
voidUNION(intx,inty)
Giventwopersons,xandy,mergesx'sandy'spartiestogetherunderasingleleader.
Wearegoingtorepresentthestructurebyanarrayofsizen.Let'sassigneachpersonanumberinthe
range[0,n)andimposesomeextrastructureupontheparties.Eachpartymemberwillhaveadirect
superior. The leader will be her own superior. The array, uf[], will contain, for each person, the
numberofthatperson'sdirectsuperior.Thiseffectivelygivesusa"partytree",rootedattheleader,
whereeachnode hasalinktoitsparent.
WecanusethistoimplementbothFIND()andUNION()inlineartime.
Example1:SlowUNION/FIND
intFIND(intx){
if(uf[x]==x)returnx;//xistheleader
returnFIND(uf[x]);
}
voidUNION(intx,inty){
uf[FIND(x)]=FIND(y);
}

FIND()simplyfollowstheuf[]linksupuntilitreachestheleaderandreturns.UNIONpointsx'sleader
toy'sleader,effectivelymergingthetwotrees.Fromnowon,callingFIND(z)onanymemberofx's
partywillreturntheleaderofy'sparty,asdesired.Thechoiceofleaderforthenew,mergedpartyis
unimportant;wecouldswitchxandyandgetanequivalentresult.
Therunningtimeisnotsogreat.UNION()isaoneliner,butitcallsFIND()twice,soweneedto
optimizeFIND()toimprovetheoverallperformance.Clearly,thelimitingfactorhereisthedepthof
thesetrees.Wewouldliketokeepthetreesasshallowaspossibletominimizethenumberofrecursive
callsinFIND().
Onetechniquefordoingthisiscalled"pathcompression". SupposethatwecallFIND(x),anditreturns
ztheleaderofx'sparty.Whatwecoulddoisresetuf[x]toz,sothatthenexttimewecallFIND(x),it
wouldreturninjust2iterations.Infact,whenexecutingFIND(x),wewillmeetallthenodesalongthe
pathfromxtoz,sowecanresetalloftheirsuperiorstobetheroot,z.
Example2:FINDwithpathcompression
intFIND(intx){
if(uf[x]!=uf[uf[x]])uf[x]=FIND(uf[x]);
returnuf[x];
}

Thefirstlinecheckswhetherthepathfromxtotheleaderhaslengthatleast2.Ifitdoes,thenwe
resetuf[x]topointto theleader(otherwise,it'salreadypointingtotheleader).Finally,wesimply
returnuf[x].

CPSC 490

Graph Theory: MST's and Euler cycles

Well,thisdoesn'tseemlikeanimprovementFIND(x)stilltakeslineartimebecausewehavetofollow
thepathfromxtotheleader,andwemightmeetalloftheothernodesalongtheway.Thisisn'tafair
analysis though because if we call FIND(x) again, its running time will be constant. We need
amortizedanalysis.Supposewehavenpeopleandndifferentpartiestostartwith(everyoneisina
separatepartyofone).ThenweexecutekUNIONandFINDoperations,insomeunknownorder,with
someunknownparameters.Itcanbeshownthatwithpathcompression,thiswillonlyrequireO(k*log
(n))time.Themainideafortheproofistocountthenumberoftimesagivennodecanbereassigned
topointtotherootandthenumberofnodesatagivenlevelinthetrees.
Ifweaddanotherimprovement,wecanreducethisto"almostlinear"time.Thisoneiscalled"union
byrank".Wewillneedanextraarraythat,foreachnode,willstorearank.Theranksstartoffat1,
andrank[x]simplyequalstothedepthof thetreerootedatx.SinceinUNION(x,y),wehaveachoice
whethertopointx'sleaderaty'sory'sleaderatx's,wecannowpicktheshallowertreeandpointitat
thedeeperone,usingranks.Thiswillpreventanytreefrombecomingdeeperthanlog(n).Therefore,
usingunionbyrankbyitself,withoutpathcompression,willalsoguaranteearunningtimeofO(k*log
(n))forkoperations.
Ifwecombinethetwo,theanalysisgetsratherhairybecausepathcompressioncanmakedeeptrees
shallow again without changing the rank of the root element. Nevertheless, the effort is certainly
worthitbecausewegetarunningtimeofO((k+n)log*(n)),wherethelog*()(logstar)functionis
definedasfollows:
log*(x)=0ifx<=1;
log*(x)=1+log*(log(x))otherwise
log*(x)isthenumberoftimesyouneedtotakethelog()ofxrepeatedlyuntilyougetitdownto1.
Forall"reasonable"valuesofn(nsmallerthan2^64),log*(n)islessthan5.Oneproofofthistime
boundcanbefoundintheBigWhitebook(Cormenetal.,"IntroductiontoAlgorithms"),chapter21in
the2ndedition.AnimplementationofUNION()thatusesbothimprovementsmaylooklikethis.
Example3:UNIONwithunionbyrank
boolUNION(intx,inty){
intxx=FIND(x),yy=FIND(y);
if(xx==yy)returnfalse;

//makesurerank[xx]issmaller
if(rank[xx]>rank[yy]){intt=xx;xx=yy;yy=t;}

//ifbothareequal,thecombinedtreebecomes1deeper
if(rank[xx]==rank[yy])rank[yy]++;

uf[xx]=yy;
returntrue;
}

Asanextrabonus,UNION()nowreturnstruewhenthepartiesofxandyaretrulymergedduringthe
operation,anditreturnsfalseiftheystartedoffinthesamepartyandthiscallhadnoeffect(except
forsomepathcompression).

CPSC 490

Graph Theory: MST's and Euler cycles

Minimu m Spanni n g Trees (MST)


Anunrootedtree(orsimply,atree)isanundirected,connected,acyclicgraph.Thismeansthatthere
isexactlyonepathbetweeneverypairofverticesinatree.Aspanningtreeofagivengraph,G=(V,
E),isatreeT=(V,E'),whereE'isasubsetofE.
AsimplewaytothinkaboutanMSTisthefollowingproblem.Youhaveanumberoftowns(vertices)
connectedbyroads(edges)ofvariouslengths.It's expensivetomaintainroads,soyourtaskisto
eliminatesomeoftheroadsinsuchawaythatitisstillpossibletodrivefromanytowntoanyother
town.Furthermore,thetotallengthoftheremainingroadsisminimal.
Thefollowinggreedyalgorithmworksforthisproblem.Startwithnoedgesinthespanningtree.Take
theshortestedgeandaddittothetreewewilluseit.Nowtaketherestoftheedgesinorderof
increasinglengthandaddthemtothetreeaslongastheyconnecttwoverticesnotalreadyconnected
bythepreviousedges.Repeatuntilallverticesareconnected.Thinkofthisasgrowingaforestoftrees
whichgetjoinedtogetheruntiltheybecomeonebigspanningtree.Hereisanimplementation.
Example4:Kruskal'salgorithm
intuf[128];
//astructuretorepresentanedge
structEdge{
//thetwoendpointsandtheweight
intu,v,w;

//acomparatorthatsortsbyleastweight
booloperator<(constEdge&e)const{returnw<e.w;}
};
//thegraphrepresentedasalistofedges
Edgeedges[100000];
//thenumberofverticesandthenumberofedges
intn,m;
intkruskal(){
//sortthefirstmentriesinthe'edges'array
sort(edges,edges+m);
//initializetheunionfindarray
for(inti=0;i<n;i++)uf[i]=i;
//thenumberoftrees(parties),andthetotalweight
inttrees=n,sum=0;

for(inti=0;i<m&&trees>1;i++){
if(UNION(edges[i].u,edges[i].v)){
//useedgeiinthetree
trees;
sum+=edges[i].w;
}
}
returnsum;
}

CPSC 490

Graph Theory: MST's and Euler cycles

Weare using the last version of UNION(), the onethat returns true whenthe call results in two
differentpartiesbeingmerged.First,wesorttheedgesbyincreasingweight.Notethatsort()workson
anarray,too.Insteadofusinganadjacencylistormatrix,wesimplyneedalistofedges.Then,we
initializetheunionfindstructuretohaveeachvertexinitsown"party".Finally,wescanthelistof
edgesandseeifwecanuseanygivenedgetomergetwodifferentparties.Ifyes,thenwedecrement
thenumberoftrees(or"parties")andaddtheedge'sweighttothetotalsum.Thefinalsumisreturned
astheweightoftheminimumspanningtree.
Inthefinalloop,thealgorithmselectssomeoftheedgestobuildaforest,F,thateventuallybecomes
theminimumspanningtree.Toprovecorrectness,wewillshowthefollowinginvariant:"Aftereach
iterationofthefinalloop,FcontainsedgesthatarepresentinsomeminimumspanningtreeofG."In
thebeginning,Fisempty,sotheinvariantistriviallytrue.Supposetheinvariantistrueafteriteration
i1.Thenduringiterationi,ifedgenumberiisnotaddedbythealgorithm,wehavenothingtoprove.
Ifitisadded,thenthefactthatedges[]issortedensuresusthatthenewedge(u,v)isthesmallest
edgethatconnectstwodisconnectedverticesinV.Considersplittingthewholesetofverticesinthe
graphintotwoparts,AandB.LetAcontainu,andletBcontainv.Furthermore,lettherebenoedge
inFthatcrossesfromAtoB.Thisispossiblebecauseweareensuredthatthereisnopathfromutov
inF.
Byassumption,theinvariantwastrueatiterationi1,sothereexistssomeminimumspanningtree
thatcontainsF.Inthistree (callitT'),theremustbeanedgethatconnectssomevertexu'inAtosome
vertexv'inB.Sinceu'andv'aredisconnectedinF,theedge(u,v)mustbenoheavierthan(u',v')
(becauseofsorting).Nowlet'sremovetheedge(u',v')fromT'andinserttheedge(u,v)instead.This
givesusatreeT,whichisalsoaspanningtree.Itmustbeaminimumspanningtreebecauseitstotal
weightisnolargerthantheweightofT'.Hence,addingtheedge(u,v)toFsatisfiestheinvariant;
namely,thereexistssomeminimumspanningtree(T)thatcontainsalltheedgesinFandtheedge
(u,v).
Byinduction,theinvariantistruewhentheloopterminates.Atthatpoint,assumingtheoriginalgraph
wasconnected,wehavetrees=1,whichmeansthatwehaveasetofedges,F,thatformatreeand
arepartofsomeminimumspanningtree.Hence,Fisaminimumspanningtree.

Euler cycles (tours)


RecallthatanEulercycleisaclosedwalkthatvisitseachedgeexactlyonce.Itturnsoutthatthereisa
remarkably short, linear time algorithm for finding an Euler cycle, if one exists. Suppose G is a
connectedgraph.ThenGhasanEulercycleifthedegreeofeachvertexiseven.Thedegreeofavertex,v,
is the number of times v appears as an endpoint of an edge in G. To prove this claim, we will
demonstrateanalgorithmthatcomputesanEulercycleinsuchagraph.Theconverseoftheclaimis
alsotrueandisasimpleexercisetoprove.
First,wewillneedtointruduceafunctionthatfindsacycle(anycycle)inG,startingatagivenvertex,
u.Itstartswithanemptycycleandagraphwhereeachvertexhasanevendegreeandreturnsacycle,
removingallofitsedgesfromthegraph.
Example5:GreedyCycleDetector
voidgreedyCycle(intu){
while(true){
intv;
for(v=0;v<n;v++)if(graph[u][v])break;

CPSC 490

Graph Theory: MST's and Euler cycles

if(v<n){
graph[u][v]=graph[v][u]=false;
//addtheedge(u,v)tothecycle
u=v;
}elsebreak;
}
}

Thefunctionisverysimplefindanedge(u,v)anduseit.Usinganedgemeansremovingitfromthe
graphandaddingittothecycle.Hereisanoutlineofaproof.Notethatifuhasanodddegree,then
wewillalwaysfindsomeedge(u,v)leavingu(becausethereisatleastoneedge).Otherwise,wewill
exitthemainloopifthedegreeofuiszero.Atfirst,uhasevendegree,justlikeeveryothervertexin
thegraph.Afterthefirstiteration,whenwefindanedge(u,v)anderaseit,therewillbeexactlytwo
verticeswithodddegreeuandv,andwewillbeatv.Ineverysubsequentiteration,therewillbe
exactly2verticeswithodddegreesthestartingvertexandthecurrentvertex.Eventually,theloop
mustreturntothestartingvertexbecauseitclearlyterminates(wecan'tkeepremovingedgesforever),
anditmustterminateattheoriginalstartingvertextheonlyvertexthatcanhaveevendegreewhen
reachedinthemainloop.
TosolveEulerCycle,pickanystartingvertexandcallgreedyCycle()tocomputesomecycleinthe
graph.Ifthecyclehasusedalltheedges,it'sanEulercyclebydefinition.Otherwise,thereissome
vertex,v,onthecyclethathaspositiveevendegree.CallgreedyCycle(v)onthatvertexandinsertthe
newfoundcycleinplaceofvintheoriginalcycletogetalargercombinedcycle.Repeatuntilwehave
nounerasededgesleft.Notethatremovingacyclefromagraphwhoseverticesallhaveevendegree
preserves the property of every vertex having even degree. It may, however, make the graph
disconnected,butthatisnotaproblembecauseeachconnectedcomponentoftheremaininggraph
mustshareavertexwiththecurrentcyclethatwearebuilding.
Hereisanimplementationthatusesalinkedlistandbuildsallofthecyclesrecursivelyandatthe
sametime.Usinganadjacencylistandsomecleveriteratormanipulations,wecanreduceitsrunning
timetoO(m).
Example6:Eulercyclein O m n
list<int>cyc;
voideuler(list<int>::iteratori,intu){
for(intv=0;v<n;v++)if(graph[u][v]){
graph[u][v]=graph[v][u]=false;
euler(cyc.insert(i,u),v);
}
}
intmain(){
//readgraphintograph[][]andsetntothenumberofvertices
euler(cyc.begin(),0);
//cyccontainsaneulercyclestartingat0
}

Thefunctioneuler()operatesonalinkedlist.Ithasaniteratorthatpointstotheplaceinthelist
wherewewanttoinsertthenextvertex.uisthevertexwehavejustarrivedat.Ifuhasnoneighbours,
we are done. Otherwise, we insertu into thelinked list, erase the edge(u,v) and recurseon the
neighbour,v.Thisrecursivecallcouldeitherleadustothestartingvertex(thinkofthefirstcallto
greedyCycle()) or back to u (giving a cycle to beinserted in place of u). These are the only two
possibilitiesbecauseonlyuandthestartingvertexhaveodddegreeatanygivenmoment.

CPSC 490

Graph Theory: MST's and Euler cycles

Howwouldyouchangeeuler()ifthegraphweredirected?Whatifitcontainedduplicateedges?

You might also like