Application of the Filter approach and the Clustering algorithm on Cancer datasets
Keywords:
DNA Microarray; Feature selection; Supervised Classification; Clustering; image processing; Cancer classification
Abstract
In this paper, we compare classification accuracy for different cancers based on gene expression microarray data. To this end, we combine filter selection methods with clustering algorithms to select the relevant features (genes) in each cancer dataset. The work is carried out in two steps. First, we study the effect of the selection methods on cancer classification accuracy by comparing the performance obtained with different classifiers. The selection methods considered are Signal-to-Noise Ratio (SNR), ReliefF, Correlation Coefficient, Mutual Information, T-Statistics, Fisher, Max-Relevance Min-Redundancy, and Principal Component Analysis. We evaluate each selection method on a supervised classification task using the K-Nearest Neighbor, Support Vector Machine, Linear Discriminant Analysis, Decision Tree, and Naïve Bayes classifiers. Second, we precede the selection step with a k-means and a k-medians clustering operation. The results show that, for all cancers, the best classification accuracies were reached with a minimal subset of selected genes when k-means clustering was applied together with the filter-based gene selection.
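The filter-then-classify step described in the abstract can be illustrated with a minimal sketch. It assumes a two-class problem, the common SNR definition |mean_0 - mean_1| / (std_0 + std_1) (the paper may use a different variant), and a plain k-nearest-neighbor vote. The function names and the toy expression matrix are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev

def snr_scores(X, y):
    """Signal-to-noise ratio per gene for a two-class problem.

    X: list of samples, each a list of expression values.
    y: list of 0/1 class labels.
    Assumed definition: |mean_0 - mean_1| / (std_0 + std_1).
    """
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        denom = pstdev(a) + pstdev(b)
        scores.append(abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0)
    return scores

def top_k_genes(scores, k):
    """Indices of the k highest-scoring genes (the filter step)."""
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]

def knn_predict(X_train, y_train, x, genes, k=1):
    """Majority vote among the k nearest training samples,
    using squared Euclidean distance on the selected genes only."""
    dists = sorted(
        (sum((row[g] - x[g]) ** 2 for g in genes), lab)
        for row, lab in zip(X_train, y_train)
    )
    votes = [lab for _, lab in dists[:k]]
    return max(set(votes), key=votes.count)

# Toy expression matrix (made up for illustration): 6 samples x 3 genes.
# Only gene 0 separates the two classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 4.8, 0.9],
    [0.9, 5.1, 0.4],
    [3.0, 5.0, 0.3],
    [3.1, 4.9, 0.8],
    [2.9, 5.2, 0.5],
]
y = [0, 0, 0, 1, 1, 1]

scores = snr_scores(X, y)
selected = top_k_genes(scores, 1)
pred = knn_predict(X, y, [2.8, 5.0, 0.6], selected, k=3)
```

In this sketch the other filters listed above (ReliefF, Mutual Information, and so on) would replace `snr_scores`, and the other classifiers would replace `knn_predict`; the clustering step from the second stage is not shown.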
License
Copyright (c) 2023 International Journal of Computer Information Systems and Industrial Management Applications
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.