Application of the Filter approach and the Clustering algorithm on Cancer datasets

Authors

  • SARA HADDOU BOUAZZA Department of Physics, Faculty of Sciences Semlalia, Cadi Ayyad University, Marrakech, Morocco
  • KHALID AUHMANI Department of Industrial Engineering, National School of Applied Sciences, Cadi Ayyad, Safi, Morocco
  • ABDELOUHAB ZEROUAL Department of Physics, Faculty of Sciences Semlalia, Cadi Ayyad University, Marrakech, Morocco

Keywords:

DNA Microarray; Feature selection; Supervised Classification; Clustering; Image processing; Cancer classification

Abstract

In this paper, we compare classification accuracy across different cancers using gene expression microarray data. To this end, we combine filter selection methods with clustering algorithms to select the relevant features in each cancer dataset for gene classification. The work is carried out in two steps. First, we examine the effect of the selection methods on cancer classification accuracy by comparing the performance obtained with different classifiers. The selection methods considered are SNR, ReliefF, Correlation Coefficient, Mutual Information, T-Statistics, Fisher, Max relevance Min redundancy, and Principal component analysis. We evaluate each selection method on a supervised classification task using the K Nearest Neighbor, Support Vector Machine, Linear Discriminant Analysis, Decision Tree, and Naïve Bayes classifiers. In the second step, we preceded the selection step with a k-means and k-medians clustering operation. The results show that, for all cancers, the best classification accuracies were reached with a minimal subset of selected genes when k-means clustering was applied to the genes selected by the filter methods.
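The first step described in the abstract (filter-based gene ranking followed by a supervised classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define its SNR score, so the commonly used form |mean0 - mean1| / (sd0 + sd1) is assumed here, and the function names (`snr_scores`, `top_genes`, `knn_predict`) and the toy expression values are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def snr_scores(X, y):
    """Score each gene (column) by an assumed signal-to-noise ratio:
    |mean0 - mean1| / (sd0 + sd1), for binary labels 0 and 1."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        denom = stdev(a) + stdev(b)
        scores.append(abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0)
    return scores


def top_genes(X, y, k):
    """Return the indices of the k highest-scoring genes (filter selection)."""
    scores = snr_scores(X, y)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]


def knn_predict(train_X, train_y, x, genes, k=3):
    """Classify sample x by majority vote among its k nearest training
    samples, using squared Euclidean distance on the selected genes only."""
    dists = sorted(
        (sum((row[g] - x[g]) ** 2 for g in genes), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


# Toy data (made up): 6 samples x 4 genes; only gene 0 separates the classes.
X = [
    [1.0, 5.0, 0.2, 3.0],
    [1.2, 4.8, 0.9, 3.1],
    [0.9, 5.1, 0.5, 2.9],
    [3.0, 5.0, 0.4, 3.0],
    [3.2, 4.9, 0.8, 3.2],
    [2.9, 5.2, 0.3, 2.8],
]
y = [0, 0, 0, 1, 1, 1]

selected = top_genes(X, y, k=1)
prediction = knn_predict(X, y, [3.1, 5.0, 0.5, 3.0], selected)
```

On this toy data the filter keeps only the gene whose class means differ strongly relative to its within-class spread, and the classifier then operates on that reduced subset. Any of the other filters or classifiers named in the abstract could be substituted in the same two-stage arrangement.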

Published

2018-01-01

How to Cite

SARA HADDOU BOUAZZA, KHALID AUHMANI, & ABDELOUHAB ZEROUAL. (2018). Application of the Filter approach and the Clustering algorithm on Cancer datasets. International Journal of Computer Information Systems and Industrial Management Applications, 10, 19. Retrieved from https://cspub-ijcisim.org/index.php/ijcisim/article/view/370

Section

Original Articles