
International Journal of Computational Intelligence and Information Security, December 2012, Vol. 3, No. 10, ISSN: 1837-7823

A New Supervised Data Classification Method for Convex and Non Convex Classes
O. El Melhaoui, M. El Hitmy, and F. Lekhal
LABO LETAS, FS, University of Mohammed I, Oujda, Morocco. wafa19819@gmail.com

Abstract
The present paper proposes a supervised data classification technique for convex and non convex classes. The technique is based on two phases: the first phase deals with the creation of prototypes and the elimination of noisy objects, and the second phase consists of merging the nearest prototypes into classes. Each prototype is created using the K nearest neighbours approach and is represented by its gravity center. Objects that are very far from their neighbours are eliminated. The merge phase is inspired by the hierarchical method; it regroups the closest prototypes into the same class incrementally until the number of created classes equals the number of classes set out initially. This new technique is compared to the C-means, fuzzy C-means, competitive neural network and fuzzy Min-Max classification methods through a number of simulations. The proposed technique has obtained good results.
Keywords: Supervised classification, Convex and non convex classes, Creation of prototypes, Elimination of noisy objects, Merge, C-means, Fuzzy C-means, Competitive neural networks, Fuzzy Min-Max classification.

1. Introduction
Data classification is an active subject; it plays a crucial and highly beneficial role in hard tasks such as data analysis, quality control, biometrics (face recognition), medicine, geology (soil texture recognition), automatic categorization of satellite pictures, etc. Classification is an abstraction and synthesis technique: it consists of partitioning a set of data entities into separate classes according to a similarity criterion. Objects are as similar as possible within a class (intra-class homogeneity), while objects of different classes are as dissimilar as possible (inter-class heterogeneity). This process yields a simplified representation of the initial data.

There are two main types of classification, unsupervised and supervised. We are interested in supervised methods, which assume that the number of classes is known initially. Among these we find C-means (CM), fuzzy C-means (FCM), support vector machines (SVM), K nearest neighbours (KNN), neural networks (NN), etc. C-means, fuzzy C-means and standard competitive neural network methods have proved a real ability to solve nonlinear problems, and they are very popular because of their simplicity and theoretical elegance. However, they have several drawbacks, the most important being the initialization of the centers, which can lead to local solutions [4, 2, 5], and the fact that they are not convenient for non convex or complex types of classes. The diversity and complexity of the problem has given rise to many methods, among them the fuzzy min-max classification (FMMC), which is well suited to convex or non convex types of classes. It consists of creating prototypes, or hyperboxes, iteratively until their stabilization; each iteration involves three stages: expansion, overlap test and contraction.

The present paper proposes a new technique for data classification suitable for convex and non convex types of classes. The learning process is made in two steps: the first step consists of creating prototypes and eliminating noisy objects, and the second step consists of merging the prototypes according to some criteria. The creation of prototypes is based on the K nearest neighbours method, while the elimination of isolated objects considers whether the object belongs to a very weakly compact class or lies on the periphery of a class. Throughout this work a parameter D is used to measure the isolation of an object. Each object is represented by its attribute vector extracted from the diverse features associated with the object, and the position of each object to be classified is assumed to be known in the attribute space. The objects are initially stored in a matrix whose size varies through the iterations, and they are extracted in an orderly way from this matrix. The second step consists of regrouping the nearest prototypes into the same class iteratively.

This paper is organised as follows: section 2 describes the different classification techniques, including C-means, fuzzy C-means, competitive neural networks and fuzzy Min-Max classification. The proposed method is described in section 3. The results of simulations and comparisons are presented in section 4. Finally we give a conclusion.


2. Classification methods

2.1. Descriptive element


Let us consider a set of M objects {O_i}, characterized by N parameters regrouped in a line vector V_att = (a_1, a_2, ..., a_N). Let R_i = (a_{in})_{1≤n≤N} be a line vector of ℝ^N whose n-th component a_{in} is the value taken by a_n for the object O_i. R_i is called the observation associated with O_i; it is also called the realization of the attribute vector for this object. ℝ^N is the observation space, also known as the parameter space. The observations (R_i)_{1≤i≤M} are associated with C different classes (CL_s)_{1≤s≤C} with respective centers (c_s)_{1≤s≤C}; each observation R_i is associated with its membership degree u_{i,s} to the class CL_s.

2.2. C-means (CM)


The C-means method was introduced by MacQueen in 1967. CM is very popular and widely used in scientific and industrial applications because of its great utility in classification and its simplicity. It looks, through an iterative process, for good centers of the C classes which minimize the intra-class variance and maximize the distance between the classes. The CM method determines the class centers which minimize the optimization criterion defined by equation (1) [13]:

J_m = \sum_{i=1}^{M} \sum_{s=1}^{C} u_{i,s} \, \|R_i - c_s\|^2        (1)

where ||.|| is the Euclidean distance and u_{i,s} is equal to 1 if R_i belongs to CL_s, 0 if not.
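For readers who prefer code, the following minimal Python/NumPy sketch illustrates the iterative process described above (hard assignment to the closest center, then center update). The function name, the random initialization of the centers and the stopping test are our own illustrative choices, not taken from the paper; as noted above, the result depends strongly on that initialization.

```python
import numpy as np

def c_means(R, C, n_iter=100, seed=0):
    """Minimal C-means sketch: R is an (M, N) matrix of observations, C the number of classes."""
    rng = np.random.default_rng(seed)
    centers = R[rng.choice(len(R), C, replace=False)]            # random initialization (sensitive)
    for _ in range(n_iter):
        # hard assignment: u_{i,s} = 1 for the closest center, 0 otherwise
        d = np.linalg.norm(R[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update each center as the mean of its assigned observations
        new_centers = np.array([R[labels == s].mean(axis=0) if np.any(labels == s) else centers[s]
                                for s in range(C)])
        if np.allclose(new_centers, centers):                     # stop when the centers no longer move
            break
        centers = new_centers
    return centers, labels
```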

2.3. Fuzzy C-means (FCM)


The fuzzy C-means (FCM) method was introduced by Dunn in 1973 and was improved by Bezdek in 1981. This technique, inspired by the C-means algorithm, introduces the concept of fuzzy sets: every object in the data set belongs to each class with a certain degree of membership rather than belonging entirely to one class. The underlying principle of the FCM is to form, from M objects defined by their realizations (R_i)_{1≤i≤M}, C classes (CL_s)_{1≤s≤C} by minimizing the criterion given by equation (2), which takes into account the membership degree of each object to the different classes. The criterion to be optimized in the FCM is given by [1, 2, 14]:

J_m = \sum_{i=1}^{M} \sum_{s=1}^{C} (u_{i,s})^{df} \, \|R_i - c_s\|^2        (2)

under the constraints:

\sum_{s=1}^{C} u_{i,s} = 1  for i = 1, ..., M    and    0 < \sum_{i=1}^{M} u_{i,s} < M  for s = 1, ..., C

df is the fuzzy degree, often taken equal to 2, and u_{i,s} is the membership degree of the object O_i to the class CL_s.

For each i, s, u_{i,s} belongs to the interval [0, 1]. In order to minimize J_m, u_{i,s} and c_s must be updated at each iteration according to [2]:

u_{i,s} = \frac{\left(\|R_i - c_s\|^2\right)^{\frac{1}{1-df}}}{\sum_{k=1}^{C} \left(\|R_i - c_k\|^2\right)^{\frac{1}{1-df}}}    and    c_s = \frac{\sum_{i=1}^{M} (u_{i,s})^{df} R_i}{\sum_{i=1}^{M} (u_{i,s})^{df}}        (3)
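The update equations (3) can be written compactly with NumPy. The sketch below is one illustrative implementation; the random initialization of the membership matrix, the fixed iteration count and the small constant eps used to avoid division by zero are assumptions on our part.

```python
import numpy as np

def fcm(R, C, df=2.0, n_iter=100, eps=1e-9, seed=0):
    """Minimal fuzzy C-means sketch implementing the u_{i,s} and c_s updates of equation (3)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(R), C))
    U /= U.sum(axis=1, keepdims=True)                              # enforce sum_s u_{i,s} = 1
    for _ in range(n_iter):
        c = (U**df).T @ R / (U**df).sum(axis=0)[:, None]           # centers c_s
        d2 = ((R[:, None, :] - c[None, :, :])**2).sum(axis=2) + eps  # ||R_i - c_s||^2
        inv = d2 ** (1.0 / (1.0 - df))                             # (||R_i - c_s||^2)^(1/(1-df))
        U = inv / inv.sum(axis=1, keepdims=True)                   # membership update
    return U, c
```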

2.4. Standard competitive neural networks (SCNN)


The competitive neural network has been widely used in various fields, including classification and compression [7, 8, 9, 10]. It divides the input data into a number of disjoint clusters; it lets the output neurons compete so that one neuron is excited and all the others are inhibited at a given time. The competitive network has two fully interconnected layers: an input layer of N neurons and an output layer of C neurons, where C represents the number of classes. The input layer receives the observation vector of dimension N, X = (x_1, x_2, ..., x_N); each neuron j of the output layer computes the Euclidean distance between its weight vector W_j = (w_{j1}, w_{j2}, ..., w_{jN}) and the observation vector X, and the final result of the network gives the index of the winner neuron, whose weight vector is the closest to the observation. The component w_{jn} of the vector W_j is the synaptic weight of the connection between the input neuron n and the output neuron j. The standard competitive neural network adjusts the synaptic weights of the winner neuron by minimizing the criterion given by [6, 15]:


E = \frac{1}{2} \sum_{j=1}^{C} \delta_j \, \|X - W_j\|^2        (4)

where δ_j = 1 if W_j is the closest to X, and δ_j = 0 otherwise. Using the gradient algorithm [13], also called the Winner-Takes-All rule, the weights of the winner neuron are updated at each step as:

W_j(t+1) = W_j(t) + \eta \, (X - W_j(t))        (5)

where η is a learning rate, usually a small real number between 0 and 1.
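A minimal sketch of the Winner-Takes-All update of equation (5) is given below, assuming the weight vectors are initialized on randomly chosen observations and that the whole training set is presented for a fixed number of epochs; both choices are ours and are only illustrative.

```python
import numpy as np

def scnn_train(X, C, eta=0.05, epochs=50, seed=0):
    """Winner-Takes-All training sketch: only the winning weight vector moves, as in equation (5)."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), C, replace=False)].astype(float)  # one weight vector per output neuron
    for _ in range(epochs):
        for x in X:
            j = np.argmin(np.linalg.norm(W - x, axis=1))       # winner: closest weight vector
            W[j] += eta * (x - W[j])                           # W_j(t+1) = W_j(t) + eta (X - W_j(t))
    return W
```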

2.5. Fuzzy Min-Max classification (FMMC)

2.5.1. Principle


FMMC is a classification method introduced by Simpson in 1992 [3], based on a neural network architecture. FMMC contains three layers: input, hidden and output. The number of neurons in the input layer is equal to the dimension of the data representation space. The number of neurons in the hidden layer increases over time as prototypes are created. The number of neurons in the output layer is equal to the number of classes known initially. The synaptic weights associated with the connections between the input and hidden layers are formed by two matrices V and W representing the characteristics of the different prototypes in the hidden layer. The synaptic weights associated with the connections between the hidden and output layers are formed by a matrix Z characterizing the association of the prototypes with the classes. The learning process is made of three steps, expansion, overlap test and contraction, repeated for each training input pattern. These phases are controlled by two parameters, the sensitivity γ and the vigilance factor θ, which control the maximum size of the created hyperboxes. The fuzzy min-max classification neural network is built using hyperbox fuzzy sets. A hyperbox defines a region of the N-dimensional pattern space. A hyperbox B_j is defined by its min point V_j = (v_{jn})_{1≤n≤N} and its max point W_j = (w_{jn})_{1≤n≤N}. The fuzzy membership function of an observation O_i to a hyperbox B_j is defined as follows [3, 11, 12]:

b_j(O_i) = \frac{1}{2N} \sum_{n=1}^{N} \Big[ \max\big(0,\, 1 - \max(0,\, \gamma \min(1,\, a_{in} - w_{jn}))\big) + \max\big(0,\, 1 - \max(0,\, \gamma \min(1,\, v_{jn} - a_{in}))\big) \Big]        (6)

where O_i = (a_{i1}, a_{i2}, ..., a_{iN}) is the i-th input pattern and γ is the sensitivity parameter that regulates how fast the membership value decreases as the distance between O_i and B_j increases [6]. The combination of the min-max points and the hyperbox membership function defines a fuzzy set.
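Equation (6) can be evaluated directly, as in the following illustrative NumPy sketch; the vectorization over all hyperboxes, the function name and the default value of γ are our own choices.

```python
import numpy as np

def hyperbox_membership(a, V, W, gamma=0.5):
    """Membership of pattern a (length N) in each of H hyperboxes, following equation (6).
    V and W are (H, N) arrays of min and max points; gamma is the sensitivity parameter."""
    right = np.maximum(0, 1 - np.maximum(0, gamma * np.minimum(1, a - W)))   # upper-bound term
    left  = np.maximum(0, 1 - np.maximum(0, gamma * np.minimum(1, V - a)))   # lower-bound term
    return (right + left).sum(axis=1) / (2 * a.size)                          # average over the N dimensions
```

A pattern lying inside a hyperbox gets membership 1; the membership decreases smoothly, at a rate set by γ, as the pattern moves away from the box.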

2.5.2. Learning
Let us consider a training set A = {(O_i, d_k) | i = 1, 2, ..., M, k ∈ {1, 2, ..., C}}, where O_i is the input pattern and d_k is the index of one of the C classes. The fuzzy min-max learning is an expansion/contraction process which goes through the following steps.
Initialization: choose initial values for γ and θ. The first input pattern forms a hyperbox B_1, defined by its min point V_1 = (v_{1n})_{1≤n≤N} and its max point W_1 = (w_{1n})_{1≤n≤N}, where V_1 = W_1 = O_1.
Repeat:
1. Select a new input pattern (O_i, d_k) from the set A and identify the hyperbox of the same class that provides the highest degree of membership. If such a hyperbox cannot be found, a new hyperbox is formed and added to the neural network.
2. Expand the selected hyperbox B_j, defined by the couple of points (v_{jn}, w_{jn}), into (v*_{jn}, w*_{jn}), where w*_{jn} = max(w_{jn}, a_{in}) and v*_{jn} = min(v_{jn}, a_{in}), 1 ≤ n ≤ N.
3. Compute the size T, where
T = \frac{1}{N} \sum_{n=1}^{N} \big( \max(w_{jn}, a_{in}) - \min(v_{jn}, a_{in}) \big).
If T > θ, a new hyperbox is created. Otherwise, check whether there is an overlap between the hyperbox B_j expanded in the last expansion step and the hyperboxes which represent classes other than that of B_j.
4. If an overlap between two hyperboxes of two different classes has been detected, a contraction of the two hyperboxes is carried out. Four possible cases for the overlap and contraction procedures are discussed in [3, 11].
Until stabilization of the hyperboxes.
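A small sketch of steps 2 and 3 of this loop (tentative expansion followed by the size test) is given below in Python. The function name and the return convention are ours, and the overlap/contraction step of point 4 is not shown.

```python
import numpy as np

def try_expand(Vj, Wj, a, theta):
    """Tentatively expand the hyperbox (Vj, Wj) to include pattern a, then apply the size test of step 3.
    Returns the expanded box if its mean side length stays within theta, otherwise None
    (meaning a new hyperbox must be created for this pattern)."""
    v_new = np.minimum(Vj, a)        # v*_{jn} = min(v_{jn}, a_{in})
    w_new = np.maximum(Wj, a)        # w*_{jn} = max(w_{jn}, a_{in})
    T = (w_new - v_new).mean()       # T = (1/N) * sum_n (max(w_{jn}, a_{in}) - min(v_{jn}, a_{in}))
    if T > theta:
        return None
    return v_new, w_new
```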



3. Proposed method

3.1. Descriptive element


Let us consider a set of M objects {O_1, O_2, ..., O_i, ..., O_M} characterized by N parameters, let R_i be the observation associated with the object O_i, and let mat_va be a matrix of M lines (representing the objects O_i) and N columns (representing the parameters a_j), defined by:

mat_va = (a_{ij})_{1≤i≤M, 1≤j≤N}

3.2 Principle
The proposed method is a supervised classification method. It is based on two steps: the first step creates the prototypes and removes the noisy objects, and the second step, called the merge step, creates the classes from the created prototypes. The objects to be classified are initially stored in a matrix called mat_va, of dimension [M, N]. The mat_va matrix varies with the iteration index i and is then called mat_va^i. The first object of mat_va^i, denoted O_1^i, is taken. The K nearest neighbours of O_1^i are found, and the farthest object from O_1^i among its K nearest neighbours is obtained; this object is called the K-th object. A test is then carried out to decide whether O_1^i is noisy or not. If the Euclidean distance between the K-th object and O_1^i is greater than a threshold D set out initially, then O_1^i is noisy or isolated and is removed; if not, O_1^i and its K nearest neighbours constitute a consistent prototype. All the elements of the obtained prototype are removed from mat_va^i and mat_va^{i+1} is obtained. The merge step is inspired by the hierarchical clustering method; it consists of grouping the most similar prototypes into a class iteratively.

3.2.1. Creating the prototypes and removing the noisy objects


The phase of creating prototypes is based on the cloud of points situated in the attribute space. The objects are initially stored in the realizations matrix mat_va and are extracted from this matrix. The algorithm for this phase is:
Iteration t = 1: mat_va^1 = mat_va.
1. Find the K nearest neighbours of the object O_1 from the cloud of points in the attribute space, where O_1 is the first element of mat_va^1. Let R_k^1 (k = 1, ..., K) be the observations associated with the K nearest neighbours of O_1.
2. Let d(R_1, R_K^1) = max_{k=1,...,K} d(R_1, R_k^1).
If d(R_1, R_K^1) ≤ D, the object O_1 and its K nearest neighbours form the prototype P_1 of gravity center
g_1 = \frac{R_1 + \sum_{k=1}^{K} R_k^1}{K + 1}.
The (K+1) objects of P_1 are in this case compact and similar to each other. They are removed from the mat_va^1 matrix, and mat_va^2 is created with M-(K+1) rows; the new indexation of mat_va^2 considers rows from 1 to M-(K+1): each empty row of mat_va^1 is filled with the first non-empty row coming after it, so that mat_va^2 is formed with no empty lines.
If d(R_1, R_K^1) > D, then the object O_1 of observation R_1 is considered to be isolated and noisy, and it is removed from mat_va^1. In that case mat_va^2 has dimension [M-1, N]: the first line of mat_va^1 is removed and the lines of mat_va^2 are re-indexed from 1 to M-1.

Example

Let mat_va^1 = [O_1; O_2; O_3; O_4; O_5] be a matrix of five objects (one object per line) and let K = 2. Suppose that O_2 and O_4 are the two nearest neighbours of O_1. We assume first that the distance between O_1 and the second nearest neighbour is smaller than D:


then O_1, O_2 and O_4 form a prototype P whose elements are eliminated from mat_va^1, and mat_va^2 becomes
mat_va^2 = [O_3; O_5].
If instead the distance between O_1 and the second nearest neighbour is bigger than D, then O_1 is a noisy object and is eliminated from mat_va^1; mat_va^2 becomes
mat_va^2 = [O_2; O_3; O_4; O_5].


The general algorithm corresponding to the creation of prototypes is:
Iteration t = J: Let S be the number of prototypes created so far and mat_va^{J-1} the matrix formed at iteration J-1; it consists of the objects not yet assigned to the prototypes P_1, ..., P_S. The matrix of observations mat_va^J is formed in the following way:
1. Get the K nearest neighbours of O_1^{J-1}, the first object of mat_va^{J-1}. The K-th nearest neighbour of O_1^{J-1} is O_{K+1}^{J-1}; it is the farthest from O_1^{J-1}.
2. Let d(O_{K+1}^{J-1}, O_1^{J-1}) be the Euclidean distance between O_1^{J-1} and O_{K+1}^{J-1}.
If d(O_{K+1}^{J-1}, O_1^{J-1}) > D, O_1^{J-1} is considered to be isolated and is eliminated.
If d(O_{K+1}^{J-1}, O_1^{J-1}) ≤ D, a new prototype is created and its (K+1) objects are removed from mat_va^{J-1}.
3. Create the new matrix mat_va^J.
4. Repeat until the last matrix mat_va^e becomes empty.
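A possible Python sketch of this prototype-creation phase is given below. It searches the K nearest neighbours among the objects still present in the matrix (one possible reading of the algorithm), discards isolated objects using the threshold D, and returns the gravity centers of the prototypes. The function name and the handling of a single leftover object are illustrative assumptions, not part of the paper.

```python
import numpy as np

def create_prototypes(R, K, D):
    """Phase 1 sketch: take the first remaining object, find its K nearest neighbours among the
    remaining objects, and either discard it as noisy or form a prototype with gravity center."""
    remaining = list(range(len(R)))              # indices still stored in mat_va
    centers = []
    while remaining:
        i = remaining[0]
        others = remaining[1:]
        if not others:                           # a single leftover object cannot form a prototype
            break
        d = np.linalg.norm(R[others] - R[i], axis=1)
        order = np.argsort(d)[:K]                # K nearest neighbours of the first object
        kth_dist = d[order[-1]]                  # distance to the farthest of the selected neighbours
        if kth_dist > D:
            remaining.pop(0)                     # isolated / noisy object: remove it
        else:
            members = [i] + [others[j] for j in order]
            centers.append(R[members].mean(axis=0))   # gravity center of the (K+1) objects
            remaining = [m for m in remaining if m not in members]
    return np.array(centers)
```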

3.2.2. Merge phase


We assume that phase 1 ended by forming H prototypes P_1, ..., P_H. The task of the merge step is to aggregate the most similar prototypes into the appropriate class iteratively until the number of classes created equals the number of classes fixed initially. This step is inspired by the hierarchical clustering method. We choose the Euclidean distance as the similarity criterion between the prototypes. Each prototype initially represents a class, so we have H classes (Figure A).
Figure A: the H prototypes P_1, P_2, P_3, ..., P_i, ..., P_H, each represented by its own circle (H classes).

We compute the Euclidean distance between the gravity centers of the H prototypes taken two by two, so we have H(H-1)/2 distances. The different distances are stored in a matrix mat = [d_1, d_2, ..., d_{H(H-1)/2}] after being ordered in an increasing way. The algorithm for the merge phase is:
Iteration t = 1: Let d_1(g_s, g_r) = min_{1≤i<j≤H} d(g_i, g_j), where g_i and g_j are the gravity centers associated with the prototypes P_i and P_j respectively. P_s and P_r are considered similar prototypes and are grouped into one class CL_1. The two circles associated with P_s and P_r are merged into a single circle, so the number of circles used to represent the classes is reduced by one unit (Figure B). The number of classes becomes H-1.
Figure B: the circles of P_s and P_r are merged into one circle CL_1(P_s, P_r); H-1 classes remain.


Iteration t = 2: Let d_2(g_m, g_n) be the second minimum distance. P_m and P_n are considered to be similar prototypes, and a test is then carried out:
1. If one of the two prototypes is already assigned to CL_1, the other prototype is also assigned to CL_1, and the circle representing this other prototype is removed (Figure C). The number of classes becomes H-2.
Figure C: P_m joins the existing class, whose circle becomes CL_1(P_s, P_r, P_m); H-2 classes remain.

2. If the two prototypes P_m and P_n are not assigned to the class CL_1, they are grouped into a new class CL_2 (Figure D); the circles associated with P_m and P_n are combined to form one circle associated with CL_2. The number of classes becomes H-2.
Figure D: P_m and P_n form a new class CL_2(P_m, P_n) alongside CL_1(P_s, P_r); H-2 classes remain.

Iteration t = T: We assume that the current number of classes is S. Let d_T(g_c, g_b) be the T-th minimum distance. P_c and P_b are considered to be similar prototypes, and three cases are distinguished:
1. If neither P_c nor P_b is assigned to any class already created during the first T-1 iterations, a new class CL_v is produced. The number of classes becomes S-1.
2. If one of the prototypes (for example P_c) is already assigned to a class CL_i created during the first T-1 iterations, then the other prototype P_b is assigned to the same class CL_i. The number of classes becomes S-1.
3. If P_c and P_b are already assigned to two different classes CL_e and CL_f (e < f) respectively, then all the prototypes already assigned to CL_f are assigned to CL_e and the class CL_f is eliminated; in other words, CL_f is combined with CL_e to form one class, still called CL_e. The number of classes becomes S-1.
This procedure runs iteratively until the remaining number of circles associated with the classes is equal to the number of classes fixed initially.
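The whole merge phase can be sketched as an agglomerative relabelling loop over the sorted pairwise distances, as below; the function name and the relabelling bookkeeping are our own, but the three cases above are all handled uniformly by merging class labels. As a check, applied to the nine gravity centers of the example in section 3.2.3 with n_classes = 3, this sketch reproduces the three groups {P1, P2, P3}, {P4, P5, P6, P7} and {P8, P9}.

```python
import numpy as np

def merge_prototypes(centers, n_classes):
    """Merge phase sketch: repeatedly take the smallest remaining center-to-center distance and
    merge the two prototypes' classes, until the requested number of classes is reached."""
    H = len(centers)
    label = list(range(H))                       # class label of each prototype (initially its own class)
    # all pairwise distances, sorted in increasing order
    pairs = sorted((np.linalg.norm(centers[i] - centers[j]), i, j)
                   for i in range(H - 1) for j in range(i + 1, H))
    for _, i, j in pairs:
        if len(set(label)) <= n_classes:         # stop once the requested number of classes is reached
            break
        a, b = label[i], label[j]
        if a != b:                               # merge the two classes: relabel class b as class a
            label = [a if l == b else l for l in label]
    # renumber the classes 1..n_classes in order of first appearance
    mapping = {}
    return [mapping.setdefault(l, len(mapping) + 1) for l in label]
```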

3.2.3. Example
This example has three classes of spherical shape. Figure 1 shows the distribution of Gaussian randomly generated data in the (X, Y) space, and Figure 2 shows the distribution of the gravity centers of the created prototypes for K = 10 and D = 20.

Figure 1: Distribution of the Gaussian randomly generated data in the (X, Y) space.

Figure 2: Prototypes with corresponding centers obtained by the proposed method.



The first step of the proposed technique creates prototypes and eliminates noisy objects. 9 prototypes are created (Figure E).
Figure E: the nine created prototypes P_1, P_2, P_3, ..., P_5, ..., P_9, each initially representing its own class (9 classes).

We notice from Figure 2 that the prototypes of a class are all neighbours of each other. The gravity centers of the created prototypes are:
g1 = [16.0000, 60.6364], g2 = [16.1818, 70.8182], g3 = [26.7273, 64.1818],
g4 = [35.7273, 147.7273], g5 = [42.1818, 151.4545], g6 = [44.7273, 146.4545],
g7 = [52.0909, 143.4545], g8 = [99.0909, 120.3636], g9 = [105.636, 117.4545].
Table A gives the mutual distances between the gravity centers of all the prototypes.
Table A: Distances between the gravity centers of the different prototypes.

        g1      g2      g3      g4      g5      g6      g7      g8      g9
g1      0       10.2    11.3    89.3    94.5    90.5    90.3    102     106
g2      10.2    0       12.4    79.3    84.7    80.8    81.0    96.6    100
g3      11.3    12.4    0       84.0    88.6    84.2    83.2    91.6    95.2
g4      89.3    79.3    84.0    0       7.45    9.08    16.9    69.0    76.2
g5      94.5    84.7    88.6    7.45    0       5.61    12.7    64.8    71.9
g6      90.5    80.8    84.2    9.08    5.61    0       7.95    60.3    67.5
g7      90.3    81.0    83.2    16.9    12.7    7.95    0       52.3    59.5
g8      102     96.6    91.6    69.0    64.8    60.3    52.3    0       7.16
g9      106     100     95.2    76.2    71.9    67.5    59.5    7.16    0
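As a quick numerical cross-check of Table A, the pairwise distances can be recomputed from the gravity centers listed above; this snippet is ours and is added only for verification.

```python
import numpy as np

# gravity centers g1..g9 as listed above
g = np.array([[16.0000, 60.6364], [16.1818, 70.8182], [26.7273, 64.1818],
              [35.7273, 147.7273], [42.1818, 151.4545], [44.7273, 146.4545],
              [52.0909, 143.4545], [99.0909, 120.3636], [105.636, 117.4545]])

dist = np.linalg.norm(g[:, None, :] - g[None, :, :], axis=2)   # 9x9 distance matrix
print(np.round(dist, 2))   # e.g. dist[4, 5] is about 5.61, the smallest non-zero entry
```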

1st iteration: d_1(g_5, g_6) = min_{1≤i<j≤9} d(g_i, g_j) = 5.61, so P_5 and P_6 are grouped into one class CL_1; the number of classes is 8.
2nd iteration: d_2(g_8, g_9) = 7.16, so P_8 and P_9 are grouped into the same class CL_2; the number of classes is 7.
3rd iteration: d_3(g_5, g_4) = 7.45, so P_4 and P_5 are grouped into the same class, CL_1 becoming CL_1(P_4, P_5, P_6); the number of classes is 6.
4th iteration: d_4(g_6, g_7) = 7.95, so P_6 and P_7 are grouped into the same class, CL_1 becoming CL_1(P_4, P_5, P_6, P_7); the number of classes is 5.
5th iteration: d_5(g_6, g_4) = 9.08; P_4 and P_6 are already grouped in the class CL_1, so the number of classes remains 5.
6th iteration: d_6(g_1, g_2) = 10.2, so P_1 and P_2 are grouped into one class CL_3; the number of classes is 4. The classes are now CL_3(P_1, P_2), P_3, CL_1(P_4, P_5, P_6, P_7) and CL_2(P_8, P_9).



7th iteration: d_7(g_1, g_3) = 11.3, so P_1 and P_3 are grouped into the class CL_3; the number of classes is 3.
End of algorithm: the number of classes is three, which corresponds to the real existing classes. CL_1 contains P_4, P_5, P_6 and P_7; CL_2 contains P_8 and P_9; CL_3 contains P_1, P_2 and P_3.

4. Simulations and comparisons


We consider in this section classes of convex and non convex types. All the data used for the simulations have been generated by a Gaussian distribution routine in Matlab. The various simulations differ in the number of data, the form of the classes, the degree of overlap between the classes and the class compactness. All the data used in the simulations are two dimensional, which allows the data to be plotted and the results to be viewed clearly.
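The paper does not give the Matlab generation code; the following Python sketch only illustrates how two-dimensional Gaussian classes of the kind used here could be produced. The class centers, spreads and sizes in the example call are placeholder values, not the ones used in the simulations.

```python
import numpy as np

def make_gaussian_classes(centers, sigmas, sizes, seed=0):
    """Generate 2-D Gaussian clusters: one (size_k, 2) block per class, plus the class labels."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for k, (c, s, m) in enumerate(zip(centers, sigmas, sizes)):
        X.append(rng.normal(loc=c, scale=s, size=(m, 2)))
        y.append(np.full(m, k))
    return np.vstack(X), np.concatenate(y)

# e.g. three well-separated spherical classes (illustrative values only)
X, y = make_gaussian_classes(centers=[(20, 65), (45, 148), (100, 120)],
                             sigmas=[3, 3, 3], sizes=[30, 30, 30])
```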

4.1. Simulation 1
This first simulation has three classes with no overlap between them. The data are divided into two sets: a set of 90 objects used for learning and a set of 70 objects used for testing. Figures 3 and 4 show the classification results in the (A1, A2) space for the CM, FCM and SCNN methods with bad and good initializations respectively.

Figure 3: Results of the classification in the (A1,A2) space for bad initialization.

Figure 4: Results of the classification in the (A1,A2) space for good initialization.

Figures 3 and 4 show that the C-means (CM), fuzzy C-means (FCM) and standard competitive neural network (SCNN) methods converge to good solutions when a good initialization is used, but when a bad initialization is used these methods are trapped in local solutions. Figures 5 and 6 show the classification results using the fuzzy Min-Max classification (FMMC) method and the proposed method respectively. For FMMC we set γ = 0.5 and θ = 2. For the proposed method, D is set to 1 and K to 8.


Figure 5: Results of the classification in (A1,A2) space for FMMC.

Figure 6: Prototypes with corresponding centers obtained by the proposed method.



We have also studied in this simulation the impact of K on the classification results. We found that if K is less than 8, the classification results are good but the number of prototypes is higher. If K is greater than 10, the classification results may not be good: the prototypes within a class whose compactness is very low may not be formed at all, all the objects of this class may be considered as noisy, and the class is then ignored. Table 1 gives the classification rates for the C-means, fuzzy C-means and standard competitive neural network methods with a bad initialization, the fuzzy Min-Max classification and the proposed method.
Table 1: The classification rates obtained by the various methods considered in this work.
Classification method                   Number of prototypes   Number of misclassified objects   Classification rate (%)   Learning time (s)
C-means                                 3 classes              30                                57.14                     0.188
Fuzzy C-means                           3 classes              30                                57.14                     0.047
Standard competitive neural network     3 classes              30                                57.14                     0.016
Fuzzy Min-Max classification            48                     0                                 100                       0.422
Proposed method                         9                      0                                 100                       0.0780

Table 1 shows that the CM, FCM and SCNN methods obtain bad classification results with a bad initialization: the algorithms converge to local solutions. The FMMC method and the proposed method converge to better solutions, but the running time of FMMC is much larger than that required by the proposed method.

4.2 Simulation 2
This simulation has five classes of spherical shape which slightly overlap. The data are divided into two parts, a training set containing 200 objects and a test set containing 100 objects. Figure 7 shows the classification results in the (B1, B2) space for the CM, FCM and SCNN methods with a good initialization. Figure 8 shows the classification results of the FMMC method, with γ set to 0.5 and θ to 1.2.

Figure 7: Results of the classification in the (B1,B2) space for good initialization.

Figure 8: Results of the classification in the (B1,B2) space for FMMC.



Figure 7 shows the good convergence of the C-means (CM), fuzzy C-means (FCM) and standard competitive neural network (SCNN) methods; when a bad initialization is used, however, all these methods are trapped in local solutions. Figure 9 shows the classification results for the proposed method, with D = 1 and K = 15.

Figure 9: Prototypes with corresponding centers obtained by the proposed method.

We have varied the value of K in this case and obtained similar results as before. If K is less than 15, good classification results are obtained, but at the expense of a higher number of prototypes and an increased running time. If K is greater than 15, the classification result may not be good: the risk that the algorithm ignores less compact classes is always present. Table 2 gives the classification rates for C-means, fuzzy C-means and standard competitive neural networks with a good initialization, the fuzzy Min-Max classification and the proposed method.
Table 2: The classification rates obtained by the various methods considered in this work.
Classification method                   Number of prototypes   Number of misclassified objects   Classification rate (%)   Learning time (s)
C-means                                 5 classes              3                                 97                        0.609
Fuzzy C-means                           5 classes              3                                 97                        0.343
Standard competitive neural network     5 classes              2                                 98                        0.031
Fuzzy Min-Max classification            101                    5                                 95                        1.266
Proposed method                         7                      0                                 100                       0.766

From this table, the proposed method has the highest classification rate, 100%. The other methods, CM, FCM, SCNN and FMMC, have classification rates equal to 97%, 97%, 98% and 95% respectively.

4.3 Simulation 3
This simulation has three classes of spherical form. The data are divided into two parts, a training set containing 450 objects and a test set containing 300 objects. Figure 10 shows the classification results in (C1, C2) space using CM, FCM and SCNN methods for good initialization. Figure 11 shows the result of data classification for the FMMC method.

Figure 10: Results of the classification in the (C1,C2) space for bad initialization.

Figure 11: Result of data classification in the (C1,C2) for the FMMC method.



Figure 12 shows the data and the prototype gravity centers in the (C1, C2) space for the proposed method, with D set to 1 and K to 23.

Figure 12: Prototypes with corresponding centers obtained by the proposed method.

It is found from this experiment that if K is less than 14, noisy objects can be selected into prototypes, which may result in a bad classification. If K is greater than 47, there is a risk that the algorithm ignores less compact classes. In this test, K may take a higher value than in the previous tests because all the classes are more compact and the number of objects in each class is higher. Table 3 gives the classification rates obtained by CM, FCM, SCNN, FMMC and the proposed method.
Table 3: The classification rates obtained by the various methods considered in this work.
Classification method                   Number of prototypes   Number of misclassified objects   Classification rate (%)   Learning time (s)
C-means                                 3 classes              9                                 97                        0.7500
Fuzzy C-means                           3 classes              7                                 97.66                     0.3280
Standard competitive neural network     3 classes              8                                 97.33                     0.094
Fuzzy Min-Max classification            240                    14                                95.33                     4.9531
Proposed method                         25                     5                                 98.33                     2.8430

A higher classification rate is obtained by the proposed method; it is equal to 98.33%, which means that five objects out of 300 were misclassified. For the CM, FCM, SCNN and FMMC methods, the classification rates are 97%, 97.66%, 97.33% and 95.33% respectively.

4.4 Simulation 4
This simulation has two classes of complex shape. The data are divided into two parts, a training set containing 360 objects and a test set containing 180 objects. The data have been synthesized by the authors using the rand routine in Matlab. Figure 13 shows the data classification results in the (D1, D2) space for CM, FCM and SCNN. Figure 14 shows the data classification results for FMMC, with γ = 0.3 and θ = 1.

Figure 13: Results of the classification in the (D1,D2) space for CM, FCM and SCNN.

Figure 14: Results of the classification in the (D1, D2) space for FMMC.

Figure 15 shows the data and the prototype gravity centers in the (D1, D2) space for the proposed method, with D set to 1 and K to 30.


Figure 15: Prototype gravity centers obtained by the proposed method.

If K is less than 30, good classification results are obtained, but the computing time and the number of prototypes increase. Table 4 gives the classification rates obtained by CM, FCM, SCNN, FMMC and the proposed method.

Table 4: The classification rates obtained by the various methods considered in this work.
Classification method                   Number of prototypes   Number of misclassified objects   Classification rate (%)   Learning time (s)
C-means                                 2 classes              28                                84.4                      2.25
Fuzzy C-means                           2 classes              28                                84.4                      1.1560
Standard competitive neural network     2 classes              26                                85.56                     1.67
Fuzzy Min-Max classification            118                    0                                 100                       2.1880
Proposed method                         19                     0                                 100                       0.8910

From this table, we notice that the performance of the proposed method is higher than that of the others: a good classification rate is obtained in a minimum running time.

4.5. Simulation 5
This simulation has two classes of complex shape. The data are divided into two parts, a training set containing 556 objects and a test set containing 500 objects. The data have been synthesized by the authors using the rand routine in Matlab. Figure 16 shows the data classification results in the (E1, E2) space for CM, FCM and SCNN. Figure 17 shows the data classification results for FMMC, with γ = 0.6 and θ = 1.

Figure 16: Results of the classification in the (E1, E2) space for CM, FCM and SCNN.

Figure 17: Results of the classification in the (E1, E2) space for FMMC

Figure 18 shows the data and the prototype gravity centers in the (E1, E2) space for the proposed method, with D set to 1 and K to 30.



Figure 18: Prototype gravity centers obtained by the proposed method.

If K is less than 30, there will always be a good classification, but the computing time and the number of prototypes increase. Table 5 gives the classification rates obtained by CM, FCM, SCNN, FMMC and the proposed method.
Table 5: The classification rates obtained by the various methods considered in this work.
Classification method                   Number of prototypes   Number of misclassified objects   Classification rate (%)   Learning time (s)
C-means                                 2 classes              214                               57.2                      1.0150
Fuzzy C-means                           2 classes              214                               57.2                      0.9220
Standard competitive neural network     2 classes              214                               57.2                      0.17
Fuzzy Min-Max classification            141                    0                                 100                       3.04
Proposed method                         28                     0                                 100                       1.84

A very high classification rate is obtained by the proposed method and by FMMC, but the execution time is smaller for the proposed method. The other methods, CM, FCM and SCNN, failed to converge to the correct solution; they have a very low classification rate (57%).

5. Conclusion
The present paper proposes a supervised data classification technique for convex and non convex classes. The technique is carried out in two steps: step one creates the prototypes and eliminates the noisy objects, and step two merges the prototypes into classes. The technique is based on the tuning of two parameters, K and D. The prototypes are formed by the K nearest neighbours technique, and the second step merges the closest prototypes into classes; the line-of-circles representation has been used to assist this task. In this study we have shown that larger values of K may lead to bad classification results, while smaller values of K lead to a higher number of prototypes and a longer running time, so a good choice of K must be made. Different simulations were performed to validate the proposed method; they differ in the number of data, the class compactness, the form of the classes and the degree of overlap between them. The proposed method was compared to various techniques such as C-means, fuzzy C-means, competitive neural networks and the fuzzy min-max classification. For complex classes, the traditional methods CM, FCM and SCNN failed to converge to the true solution, whereas the proposed method obtained good results with a smaller convergence time. In all the simulations performed in this work, the proposed method always obtained better results in less running time.

6. References
[1] Ouariachi, H., (2001). Classification non supervisée de données par réseaux de neurones et une approche évolutionniste: application à la classification d'images. Thèse de doctorat, Université Mohamed I, Maroc.
[2] Nasri, M., (2004). Contribution à la classification des données par approches évolutionnistes: simulation et application aux images de textures. Thèse de doctorat, Université Mohammed I, Oujda, Maroc.
[3] Simpson, P.K., (1992). Fuzzy min-max neural networks. Part 1: Classification. IEEE Transactions on Neural Networks, Vol. 3, pp. 776-786.
[4] Zalik, K.R. and Zalik, B., (2010). Validity index for clusters of different sizes and densities. Pattern Recognition Letters, pp. 221-234.
[5] Bouguelid, M.S., (2007). Contribution à l'application de la reconnaissance des formes et de la théorie des possibilités au diagnostic adaptatif et prédictif des systèmes dynamiques. Thèse de doctorat, Université de Reims Champagne-Ardenne.
[6] Borne, P., Benrejeb, M., Haggège, J. Les réseaux de neurones, présentation et applications. Editions Technip.


[7] Diab, M., (2007). Classification des signaux EMG utérins afin de détecter les accouchements prématurés. Thèse.
[8] Stéphane, D., Postaire, J.-G., (1996). Classification interactive non supervisée de données multidimensionnelles par réseaux de neurones à apprentissage compétitif. Thèse, France.
[9] Hao, Y., Régis, L., (1992). Etude des réseaux de neurones en mode non supervisé: application à la reconnaissance des formes. Thèse.
[10] Kopcso, D., Pipino, L., Rybolt, W., (1992). Classifying the uncertainty arithmetic of individuals using competitive learning neural networks. Expert Systems with Applications, Vol. 4, pp. 157-169.
[11] Gabrys, B. and Bargiela, A., (2000). General Fuzzy Min-Max Neural Network for Clustering and Classification. IEEE Transactions on Neural Networks, Vol. 11, No. 3.
[12] Chowhan, S.S., Shinde, G.N., (2011). Iris Recognition Using Fuzzy Min-Max Neural Network. International Journal of Computer and Electrical Engineering, Vol. 3, No. 5.
[13] Khan, S., Ahmad, A., (2004). Cluster center initialization algorithm for K-means clustering. Pattern Recognition Letters, Vol. 25, pp. 1293-1302.
[14] Ahmed, M.N., Yamany, S.M., Mohamed, N., Farag, A.A. and Moriarty, T., (2002). A Modified Fuzzy C-Means Algorithm for Bias Field Estimation and Segmentation of MRI Data. IEEE Transactions on Medical Imaging, Vol. 21, No. 3.
[15] Park, D.-C., (2000). Centroid Neural Network for Unsupervised Competitive Learning. IEEE Transactions on Neural Networks, Vol. 11, No. 2.

