
Agglomerative Hierarchical Algorithm

Hierarchical algorithms can be either agglomerative or divisive, that is, bottom-up or top-down, respectively. All agglomerative hierarchical clustering algorithms begin with each object as a separate group. These groups are successively combined based on similarity until only one group remains or a specified termination condition is satisfied. For n objects, n-1 merges are performed. Hierarchical algorithms are rigid in that once a merge has been done, it cannot be undone. Although this keeps computational costs low, it can also cause problems if an erroneous merge occurs, so merge points need to be chosen carefully. Here we describe a simple agglomerative clustering algorithm. More complex algorithms, such as BIRCH and CURE, have been developed in an attempt to improve the clustering quality of hierarchical algorithms.

Figure 3: Sample Dendrogram

In the context of hierarchical clustering, the hierarchy graph is called a dendrogram. Figure 3 shows a sample dendrogram that could be produced by a hierarchical clustering algorithm. Unlike with the k-means algorithm, the number of clusters (k) is not specified in advance. After the hierarchy is built, the user can select any number of clusters from 1 to n: the top level of the hierarchy represents one cluster (k=1), and to examine more clusters we simply traverse down the hierarchy.

Agglomerative Hierarchical Algorithm:

Given: A set X of objects {x1,...,xn}
       A distance function dis(c1,c2)

1. for i = 1 to n
       ci = {xi}
   end for
2. C = {c1,...,cn}
3. l = n + 1
4. while C.size > 1 do
       a) (cmin1,cmin2) = the pair ci,cj in C with minimum dis(ci,cj)
       b) remove cmin1 and cmin2 from C
       c) cl = {cmin1,cmin2}; add cl to C
       d) l = l + 1
   end while

Figure 4: Agglomerative Hierarchical Algorithm

Figure 4 shows a simple hierarchical algorithm. The distance function in this algorithm can measure the similarity of clusters in several ways, including single link and group-average. Single link defines the distance between two clusters as the shortest distance between any two objects contained in those clusters. Group-average first finds the average value of all objects in each group (i.e., cluster) and then takes the distance between clusters to be the distance between those average values. Each object in X initially forms a cluster containing a single object. These clusters are successively merged into new clusters, which are added to the set of clusters, C. When a pair of clusters is merged, the original clusters are removed from C, so the number of clusters in C decreases until only one cluster remains, containing all the objects from X. The hierarchy of clusters is implicitly represented in the nested sets of C.
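The pseudocode in Figure 4 translates almost directly into Python. The sketch below is our own illustration, not part of the original notes: the names manhattan, single_link, and agglomerative are assumptions, points are taken to be 2-D tuples, and a list of merges stands in for the nested sets of C.

from itertools import combinations

def manhattan(p, q):
    # Manhattan (L1) distance between two points.
    return sum(abs(a - b) for a, b in zip(p, q))

def single_link(c1, c2):
    # Single link: shortest distance between any two objects,
    # one drawn from each cluster.
    return min(manhattan(p, q) for p in c1 for q in c2)

def agglomerative(points, dis=single_link):
    # Step 1: each object starts as its own cluster.
    C = [frozenset([p]) for p in points]
    merges = []  # merge history; records the hierarchy

    # Step 4: merge the closest pair until one cluster remains.
    while len(C) > 1:
        cmin1, cmin2 = min(combinations(C, 2),
                           key=lambda pair: dis(pair[0], pair[1]))
        C.remove(cmin1)
        C.remove(cmin2)
        C.append(cmin1 | cmin2)  # cl = cmin1 U cmin2
        merges.append((cmin1, cmin2))
    return merges

# Hypothetical data; the coordinates of Figure 5 are not reproduced here.
points = [(0, 0), (1, 0), (2, 1), (8, 8), (9, 8)]
for a, b in agglomerative(points):
    print(sorted(a), "+", sorted(b))

For n points this performs n-1 merges, exactly as the loop in Figure 4 does; a production implementation would cache pairwise distances rather than recomputing them on every iteration.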

Example: Suppose the input to the simple agglomerative algorithm described above is the set X, shown in Figure 5 in both matrix and graph form. We will use the Manhattan distance function and the single link method for calculating the distance between clusters. The set X contains n=10 elements, x1 to x10, where x1=(0,0).

Figure 5: Sample Data

Step 1. Initially, each element xi of X is placed in its own cluster ci, where ci is a member of the set of clusters C.

C = {{x1},{x2},{x3},{x4},{x5},{x6},{x7},{x8},{x9},{x10}}

Step 2. Set l = 11.

Step 3. (First iteration of while loop) C.size = 10

The minimum single link distance between two clusters is 1. This occurs in two places: between c2 and c10, and between c3 and c10. Depending on how our minimum function works, we can choose either pair of clusters; arbitrarily, we choose the first.

(cmin1,cmin2) = (c2,c10)
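As a concrete aside (our own illustration, not from the notes), Python's built-in min returns the first of several tied candidates, which matches the arbitrary choice made here:

# Candidate pair distances for the first iteration; (c2,c10) and
# (c3,c10) are tied at the minimum of 1 (the third entry is a
# hypothetical non-minimal pair, shown for contrast).
candidates = {("c2", "c10"): 1, ("c3", "c10"): 1, ("c1", "c4"): 3}
best = min(candidates, key=candidates.get)
print(best)  # ('c2', 'c10') -- ties broken by insertion order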

Since l = 11, c11 = c2 U c10 = {{x2},{x10}}. Remove c2 and c10 from C. Add c11 to C.

C = {{x1},{x3},{x4},{x5},{x6},{x7},{x8},{x9},{{x2},{x10}}}
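This nesting can be kept explicitly in code. A tiny sketch of our own, with string labels standing in for the actual points:

# The merged cluster keeps its children as members, so the set
# structure itself records the hierarchy: c11 = {{x2},{x10}}.
c2, c10 = frozenset({"x2"}), frozenset({"x10"})
c11 = frozenset({c2, c10})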

Set l = l + 1 = 12

Step 3. (Second iteration) C.size = 9

The minimum single link distance between two clusters is 1. This occurs between c3 and c11 because the distance between x3 and x10 is 1, where x10 is in c11.

(cmin1,cmin2) = (c3,c11)

c12 = c3 U c11 = {{{x2},{x10}},{x3}}. Remove c3 and c11 from C. Add c12 to C.

C = {{x1},{x4},{x5},{x6},{x7},{x8},{x9},{{{x2},{x10}},{x3}}}

Set l = 13

Step 3. (Third iteration) C.size = 8


(cmin1,cmin2) = (c1,c12), so c13 = c1 U c12
C = {{x4},{x5},{x6},{x7},{x8},{x9},{{{{x2},{x10}},{x3}},{x1}}}

Step 3. (Fourth iteration) C.size = 7


(cmin1,cmin2) = (c4,c8), so c14 = c4 U c8
C = {{x5},{x6},{x7},{x9},{{{{x2},{x10}},{x3}},{x1}},{{x4},{x8}}}

Step 3. (Fifth iteration) C.size = 6


(cmin1,cmin2) = (c5,c7), so c15 = c5 U c7
C = {{x6},{x9},{{{{x2},{x10}},{x3}},{x1}},{{x4},{x8}},{{x5},{x7}}}

Step 3. (Sixth iteration) C.size = 5


(cmin1,cmin2) = (c9,c13), so c16 = c9 U c13
C = {{x6},{{x4},{x8}},{{x5},{x7}},{{{{{x2},{x10}},{x3}},{x1}},{x9}}}

Step 3. (Seventh iteration) C.size = 4


(cmin1,cmin2) = (c6,c15), so c17 = c6 U c15
C = {{{x4},{x8}},{{{{{x2},{x10}},{x3}},{x1}},{x9}},{{x6},{{x5},{x7}}}}

Step 3. (Eighth iteration) C.size = 3


(cmin1,cmin2) = (c14,c16), so c18 = c14 U c16
C = {{{x6},{{x5},{x7}}},{{{x4},{x8}},{{{{{x2},{x10}},{x3}},{x1}},{x9}}}}

Step 3. (Ninth iteration) C.size = 2

(cmin1,cmin2) = (c17,c18), so c19 = c17 U c18
C = {{{{{x4},{x8}},{{{{{x2},{x10}},{x3}},{x1}},{x9}}},{{x6},{{x5},{x7}}}}}

Step 3. (Loop condition check) C.size = 1, so the while loop terminates and the algorithm is done. The clusters created by this algorithm can be seen in Figure 6, and the corresponding dendrogram formed from the hierarchy in C is shown in Figure 7. The points that appeared closest together in the graph of input data in Figure 5 are grouped most closely in the hierarchy.
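For comparison, an off-the-shelf library reproduces the same single link, Manhattan distance procedure and the cut-at-k behavior discussed for the dendrogram. A sketch of our own using SciPy, with hypothetical coordinates since the text only specifies x1=(0,0):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical 2-D points standing in for Figure 5's data.
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5]])

# Single-link clustering with Manhattan ("cityblock") distance.
Z = linkage(X, method="single", metric="cityblock")

# Cut the hierarchy at any k from 1 to n, as described for the dendrogram.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 1 2 2]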

Figure 6: Graph with Clusters

Figure 7: Sample Dendrogram

* Agglomerative Hierarchical Algorithm adapted from: http://www.cs.columbia.edu/~regina/cs4999/notes/lec04/lec04-index.html


http://www2.cs.uregina.ca/~dbd/cs831/notes/clustering/clustering.html
