Optimizing Fuzzy C Means Clustering Algorithm: Challenges and Applications

Authors

  • Amrita Bhattacherjee, Department of Statistics, St. Xavier’s College, Kolkata – 700016, India
  • Sugata Sanyal, School of Technology & Computer Science, Tata Institute of Fundamental Research, Mumbai – 400005, India
  • Ajith Abraham, Machine Intelligence Research Labs (MIR Labs), Scientific Network for Innovation and Research Excellence, Auburn, Washington 98071, USA

Keywords:

Clustering, Fuzzy partitions, Time complexity, Fuzzy C-Means algorithm, Unsupervised Machine Learning

Abstract

The Fuzzy C-Means (FCM) clustering technique is one of the most popular soft clustering algorithms in the field of data segmentation. However, its high time complexity makes it computationally expensive on very large datasets. Kolen and Hutcheson [1] proposed a modification of the FCM algorithm that dramatically reduces its runtime, making it linear in the number of clusters, whereas the original algorithm is quadratic in the number of clusters. This paper proposes a further modification of the algorithm by Kolen et al.: effective seed initialisation (using Fuzzy C-Means++, proposed by Stetco et al. [2]) is applied before the initial cluster centers are fed to the algorithm. The resulting model converges even faster. Empirical findings are illustrated using synthetic and real-world datasets. Finally, we check the algorithm’s robustness to perturbations in the data.
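The two ideas combined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the seeding exponent `p`, the fuzzifier default `m = 2`, and the stopping rule are all assumptions made for illustration. The seeding step picks each new center with probability proportional to a power of its distance from the nearest existing center, in the spirit of Fuzzy C-Means++. The membership step computes the normalising sum once per data point, so each iteration costs time linear rather than quadratic in the number of clusters.

```python
import numpy as np

def seed_centers(X, c, p=2.0, rng=None):
    """Distance-weighted seeding (assumed form; `p` is a hypothetical spread exponent)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    centers = [X[rng.integers(n)]]  # first center: uniform at random
    for _ in range(1, c):
        # squared distance from each point to its nearest chosen center
        diff = X[:, None, :] - np.array(centers)[None, :, :]
        d2 = (diff ** 2).sum(-1).min(axis=1)
        w = d2 ** (p / 2.0)
        # fall back to uniform sampling if all points coincide with a center
        probs = w / w.sum() if w.sum() > 0 else None
        centers.append(X[rng.choice(n, p=probs)])
    return np.array(centers)

def memberships(X, centers, m=2.0, eps=1e-12):
    """Fuzzy memberships in O(n * c): one normalising sum per point."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + eps
    inv = d2 ** (-1.0 / (m - 1.0))  # equals d ** (-2 / (m - 1))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm(X, c, m=2.0, tol=1e-6, max_iter=300, rng=None):
    """Seeded fuzzy c-means; returns (centers, memberships, iterations used)."""
    centers = seed_centers(X, c, rng=rng)
    for it in range(max_iter):
        w = memberships(X, centers, m) ** m
        # weighted mean of the data for each cluster
        new_centers = (w.T @ X) / w.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:  # assumed stopping rule: centers stop moving
            break
    return centers, memberships(X, centers, m), it + 1
```

Calling `fcm(X, c)` on an `(n, d)` array returns the cluster centers, an `(n, c)` membership matrix whose rows sum to one, and the number of iterations used. Comparing the iteration count against a run started from uniformly random centers is one simple way to observe the effect of seeding on convergence.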

Published

2022-01-01

How to Cite

Amrita Bhattacherjee, Sugata Sanyal, & Ajith Abraham. (2022). Optimizing Fuzzy C Means Clustering Algorithm: Challenges and Applications. International Journal of Computer Information Systems and Industrial Management Applications, 14, 13. Retrieved from https://cspub-ijcisim.org/index.php/ijcisim/article/view/498

Section

Original Articles