Segmentation
Dariusz Małyszko (a), Sławomir T. Wierzchoń (b)
(a) Faculty of Computer Science, Technical University of Białystok, Wiejska 45A, 15-351 Białystok, Poland, malyszko@ii.pb.bialystok.pl
(b) Faculty of Mathematics, Physics and Informatics, University of Gdańsk, Wita Stwosza 57, 80-952 Gdańsk-Oliwa
(b) Institute of Computer Sciences, Polish Academy of Sciences, Ordona 21, 01-267 Warszawa, stw@ipipan.waw.pl
Abstract: Clustering, or data grouping, is a key initial procedure in image processing. This paper deals with the application of standard and genetic k-means clustering algorithms in the area of image segmentation. In order to assess and compare the standard and genetic versions of the k-means algorithm and its variants, appropriate procedures and software have been designed and implemented. Experimental results indicate that genetically optimized k-means algorithms are useful in the area of image analysis, yielding comparable or even better segmentation results.

... For this reason, combining clustering techniques with the robustness of genetic algorithms in optimization should yield high-quality performance and results [3, 4].

Section 2 of the present paper briefly reviews a family of k-means clustering algorithms. Genetic algorithms in the context of k-means clustering techniques are outlined in Section 3. Cluster validation indices are described in Section 4. Section 5 describes the experiments performed and the results obtained. Section 6 concludes the paper and points to future research.
$$\mathrm{KHM}(X, C) = \sum_{i=1}^{n} \frac{k}{\sum_{j=1}^{k} \dfrac{1}{\lVert x_i - c_j \rVert^{p}}}$$

where p is a parameter with the value $p \ge 2$. Zhang [7] proposes the value 3.5 as yielding the best results. Membership and weight functions are calculated as described in [5] and [7].

2.4 Fuzzy k-means algorithm

A fuzzy partition of the input data allows each point to be assigned to multiple clusters. Therefore, the optimized objective function has the following form:

$$\mathrm{FKM}(X, C) = \sum_{i=1}^{n} \sum_{j=1}^{k} m_{ij}^{r} \, \lVert x_i - c_j \rVert^{2} \qquad (3)$$

Details can be found in [4, 5]. The value of the parameter r should be constrained to $r \ge 1$. Larger ...

Chromosomes represent solutions consisting of the centers of k clusters. Each cluster center is a d-dimensional vector of values in the range between 0 and 255, representing the intensity of a gray or color component.

Population initialization and fitness computation
The cluster centers are initialized randomly to k d-dimensional points with values in the range 0 to 255. A fitness value is calculated for each chromosome in the population according to the rules given in Section 2.

Selection
The selection operation tries to choose the best-suited chromosomes from the parent population. These enter the mating pool and, after the crossover and mutation operations, create the child chromosomes of the child population. Most frequently, genetic algorithms make use of tournament selection, which selects into the mating pool the best individual from a predefined number of randomly chosen population chromosomes. This process is repeated for each parental chromosome.
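The initialization, fitness, and tournament selection steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy data, the tournament size of 3, and the use of the plain k-means sum of squared distances as the fitness (treated here as a cost, so lower is better) are all assumptions.

```python
import random


def init_population(pop_size, k, d, rng):
    """Create pop_size chromosomes; each is a list of k cluster centers,
    and each center is a d-dimensional list drawn uniformly from 0..255."""
    return [[[rng.uniform(0, 255) for _ in range(d)] for _ in range(k)]
            for _ in range(pop_size)]


def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center.
    Assumed stand-in for the Section 2 objective; lower is better."""
    total = 0.0
    for x in points:
        total += min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers)
    return total


def tournament_select(population, costs, tournament_size, rng):
    """Build a mating pool: once per parental chromosome, draw
    tournament_size distinct random competitors and keep the cheapest."""
    pool = []
    for _ in range(len(population)):
        contestants = rng.sample(range(len(population)), tournament_size)
        winner = min(contestants, key=lambda i: costs[i])
        pool.append(population[winner])
    return pool


rng = random.Random(0)  # fixed seed so the sketch is repeatable
points = [[10.0], [12.0], [200.0], [205.0]]  # toy 1D gray-level "pixels"
population = init_population(pop_size=6, k=2, d=1, rng=rng)
costs = [kmeans_cost(points, chrom) for chrom in population]
mating_pool = tournament_select(population, costs, tournament_size=3, rng=rng)
```

A larger tournament size increases selection pressure; when it equals the population size, every slot in the mating pool is filled by a chromosome with the lowest cost.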
TABLE 1 BEST AND AVERAGE FITNESS VALUES AND CLUSTER VALIDITY INDICES (DI, DBI, SDBI, QE)
OF THE K-MEANS POPULATIONS IN 5 TRIALS. THE FIRST NUMBER IS THE AVERAGE VALUE, THE
SECOND NUMBER IS THE BEST VALUE
[Plot: Fitness (vertical axis, 0 to 1,800,000) versus Iteration (horizontal axis).]
Fig. 1. Fitness values of the best solutions in the genetic k-means algorithm (upper line) and the standard k-means algorithm (lower line) during 200 iterations.
Exemplary segmentations
After the k-means clustering algorithm completes, the required cluster centers are obtained. The input images are then segmented in order to partition each image into meaningful regions. Segmentation quality can be assessed and compared for particular clustering techniques. Fig. 3 presents two exemplary segmentations, of the 1D Lena image (Fig. 1a) and of the 3D image (Fig. 1c), obtained in a run of genetic KM. Pixels assigned to a given cluster are displayed in the mean color of all the pixels belonging to that cluster.

6. Conclusion and summary

Results obtained in the performed experiments suggest that genetic versions of k-means clustering techniques are as robust as the standard versions. Segmentation results show that, in the long run, both types of techniques applied to image clustering lead to comparable values of fitness, with slightly better values in the case of the standard versions of the k-means algorithm (see Fig. 1), although some authors (for example [3]) suggest the contrary. However, this observation is similar to the results presented in [4]. Standard versions of k-means algorithms seem to be better at finding high-fitness solutions. At the same time, the results obtained by the standard and genetic versions of k-means algorithms are also comparable with respect to the validity indices. During an extensive search of the solution space, genetic versions of k-means algorithms most often find solutions with slightly worse fitness values (see Fig. 1) but at the same time with exceptionally good values of individual validity indices. Further investigation into this matter could be a starting point for the improvement of k-means based image clustering techniques.

Acknowledgement
This work was supported by Białystok Technical University grant S/WI/5/03.
Fig. 3. Exemplary segmentations of a 1D image (3a) and a 3D image (3b).
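The rendering used for segmentations such as those in Fig. 3, where every pixel is shown in the mean color of its cluster, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' software: the function name `segment`, the flat list of pixel tuples (instead of a 2D image), and the toy centers are hypothetical, and nearest-center assignment by squared Euclidean distance is assumed.

```python
def segment(pixels, centers):
    """Assign each pixel to its nearest center, then recolor each pixel
    with the mean color of all pixels assigned to the same cluster."""
    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    # Hard assignment: index of the closest cluster center for every pixel.
    labels = [min(range(len(centers)), key=lambda j: sqdist(p, centers[j]))
              for p in pixels]

    # Accumulate per-cluster channel sums and pixel counts.
    d = len(pixels[0])
    sums = [[0.0] * d for _ in centers]
    counts = [0] * len(centers)
    for p, j in zip(pixels, labels):
        counts[j] += 1
        for t in range(d):
            sums[j][t] += p[t]

    # Mean color per cluster; an empty cluster falls back to its center.
    means = [[s / counts[j] for s in sums[j]] if counts[j] else list(centers[j])
             for j in range(len(centers))]
    return labels, [means[j] for j in labels]


# Toy 3-channel example: two dark pixels and two orange pixels.
pixels = [(10, 10, 10), (20, 20, 20), (200, 100, 0), (220, 120, 20)]
centers = [(0, 0, 0), (255, 128, 0)]
labels, recolored = segment(pixels, centers)
```

For a gray-level (1D) image the same function applies with one-element tuples as pixels.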
[3] U. Maulik, S. Bandyopadhyay, "Genetic algorithm-based clustering technique", Pattern Recognition, 33, 2000, 1455-1465.
[4] O. Hall, I. Barak, J.C. Bezdek, "Clustering with a genetically optimized approach", IEEE Trans. Evo. Computation, 3, 1999, 103-112.
[5] G. Hamerly, C. Elkan, "Alternatives to the k-means algorithm that find better clusterings", Proc. of the ACM Conference on Information and Knowledge Management, CIKM-2002, 2002, 600-607.
[8] M. Halkidi, M. Vazirgiannis, I. Batistakis, "Quality scheme assessment in the clustering process", Proc. of the 4th European Conf. on Principles of Data Mining and Knowledge Discovery, LNCS 1910, 2000, 265-267.
[9] M. Halkidi et al., "Clustering validity checking methods: Part II", SIGMOD Rec., 31, No. 3, 2002, 19-27.
[10] R.H. Turi, "Clustering-based color image segmentation", PhD Thesis, Monash University, Australia, 2001.
[11] J.C. Bezdek, N.R. Pal, "Some new indexes of cluster validity", IEEE Trans. Sys. Man. Cyb., 28, 1998, 301-315.