(ITEC 3040)
Clustering
(Modified from Aijun An's slides)
Outline
What Is Clustering?
What Is Good Clustering?
Data Representation
Similarity (or Dissimilarity) Measures
Major Clustering Approaches
K-means
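A minimal sketch of the standard k-means iteration (Lloyd's algorithm), offered as an illustration rather than the slides' own formulation. It assumes Euclidean distance, points given as tuples of floats, and random initial centroids drawn from the data; the function name `kmeans` and its parameters are hypothetical.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points (a list of float tuples) into k groups.

    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    # Assumption: initialise the centroids with k points sampled from the data.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(data, k=2)
    print(centroids, labels)
```

The result depends on the initial centroids, so in practice the procedure is often run several times with different seeds and the best partition kept.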
K-means example
K-means example #2
A Limitation of K-means: Differing Sizes
A Limitation of K-means: Non-globular (Non-convex) Shapes
A Problem of the K-means Method
Typical K-medoids Algorithm (PAM)
Pros and Cons of PAM
Hierarchical Clustering
A Dendrogram Shows How the Clusters Are Merged Hierarchically
Inter-cluster Distances in Hierarchical Clustering
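A short sketch of four inter-cluster distance definitions commonly used in hierarchical clustering: single link, complete link, average link, and centroid distance. Which of these the slides cover is an assumption; Euclidean distance between points is also assumed, and the function names are hypothetical.

```python
import math
from itertools import product


def single_link(a, b):
    """Smallest distance between any point of a and any point of b."""
    return min(math.dist(p, q) for p, q in product(a, b))


def complete_link(a, b):
    """Largest distance between any point of a and any point of b."""
    return max(math.dist(p, q) for p, q in product(a, b))


def average_link(a, b):
    """Mean distance over all cross-cluster pairs of points."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def centroid_distance(a, b):
    """Distance between the mean points of the two clusters."""
    def mean(cluster):
        return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
    return math.dist(mean(a), mean(b))


if __name__ == "__main__":
    left = [(0, 0), (0, 2)]
    right = [(3, 0), (3, 2)]
    for measure in (single_link, complete_link, average_link, centroid_distance):
        print(measure.__name__, measure(left, right))
```

An agglomerative method merges, at each step, the pair of clusters with the smallest distance under whichever definition is chosen, so the choice changes the shape of the resulting dendrogram.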
Strengths and Limitations of Hierarchical Methods