Implementation of clustering for categorical data using Concept-Drift.

In most real situations, data changes over time. Clustering such data without accounting for the change not only lowers the quality of the clusters but also disappoints users, who usually require recent clustering results. To achieve scalability, we label the unlabeled data into the proper clusters, which is harder in the categorical domain than with numerical data. In this project we represent the initial clustering using the Node Importance Representative (NIR). We then consider the data in the next time frame, one item at a time, and assign each an appropriate cluster label or mark it as an outlier by treating it as unknown data and applying MARDL (Maximum Resemblance Data Labeling). We also discuss drifting-concept detection, which decides whether the concept has drifted in the new data. Labels are then found for the drifted data, and the data is inserted into the appropriate clusters.
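
The labeling and drift check described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's implementation: the importance weight is a simplified frequency-based stand-in for the NIR weighting, and the function names (`build_nir`, `label_record`, `drift_detected`) and both thresholds are hypothetical.

```python
from collections import Counter


def build_nir(clusters):
    """Build a simplified node-importance table for each cluster.

    clusters maps a cluster label to a list of records (tuples of
    categorical values). A node is an (attribute_index, value) pair.
    Assumed weight: frequency of the node inside the cluster multiplied
    by the share of the node's occurrences that fall in that cluster.
    """
    counts = {
        label: Counter((i, v) for rec in recs for i, v in enumerate(rec))
        for label, recs in clusters.items()
    }
    totals = Counter()
    for c in counts.values():
        totals.update(c)
    return {
        label: {
            node: (n / len(clusters[label])) * (n / totals[node])
            for node, n in counts[label].items()
        }
        for label in clusters
    }


def resemblance(record, weights):
    """Sum the importance of the record's nodes within one cluster."""
    return sum(weights.get((i, v), 0.0) for i, v in enumerate(record))


def label_record(record, nir, outlier_threshold=0.5):
    """Return the cluster with maximum resemblance, or None for an outlier."""
    best = max(nir, key=lambda label: resemblance(record, nir[label]))
    if resemblance(record, nir[best]) < outlier_threshold:
        return None
    return best


def drift_detected(window, nir, outlier_threshold=0.5, drift_threshold=0.5):
    """Assumed rule: flag drift when the outlier ratio in the window is high."""
    outliers = sum(
        label_record(rec, nir, outlier_threshold) is None for rec in window
    )
    return outliers / len(window) > drift_threshold


# Toy initial clustering with two attributes per record.
initial = {
    "c1": [("a", "x"), ("a", "y"), ("a", "x")],
    "c2": [("b", "z"), ("b", "z")],
}
nir = build_nir(initial)
next_window = [("q", "w"), ("q", "r"), ("a", "x")]
labels = [label_record(rec, nir) for rec in next_window]
drifted = drift_detected(next_window, nir)
```

In this toy run, records whose values never appear in any cluster get zero resemblance and are treated as outliers. When the window is flagged as drifted, the window would be re-clustered; otherwise the labeled records would be added to their clusters and the table rebuilt.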
