A Novel Ensemble Approach to Enhance the Performance of Web Server Logs Classification

Authors

  • Mohammed Hamed Ahmed Elhebir, Faculty of Mathematical and Computer Sciences, University of Gezira, P.O. Box 20, Wad Medani, Sudan
  • Ajith Abraham, Machine Intelligence Research Labs (MIR Labs), Scientific Network for Innovation and Research Excellence, P.O. Box 2259, WA, USA

Keywords:

Web Usage Mining, Base Classifiers, Meta Base Classifiers, Ensemble Methods, Voting.

Abstract

As the World Wide Web (WWW) grows in both traffic volume and website complexity, it has become important to classify web traffic and website usage according to predetermined attributes. Web Usage Mining (WUM) is the process of extracting knowledge from the access data generated by web users. Classifying web users’ sessions provides valuable information that lets web designers respond to users' individual needs in a timely manner. The main objective of this paper is to classify users' sessions. Most classification algorithms perform well on specific problems but are not robust enough for all kinds of problems. Combining multiple classifiers can be considered a general solution method for pattern discovery. It has been shown that a combination of classifiers obtains better results than a single classifier, provided that its components are independent or produce diverse outputs. This paper compares the accuracy of ensemble models, which take advantage of groups of learners to yield better results. The base classifiers used in this approach are a decision tree algorithm, k-Nearest Neighbor, Naive Bayes, and BayesNet. Stacking and Voting are used as meta classifiers. The performance of our approach is measured and compared using Sudan University of Science and Technology (SUST) web log data with session-based timing. Comparative analysis and evaluation were carried out using various metrics, such as error rate, ROC curves, the confusion matrix, F-measure, and the Matthews correlation coefficient. The results show that ensemble machine learning models using the Voting meta classifier can significantly improve the classification of users' sessions, achieving higher accuracy than all of the other base and meta classifiers considered.
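
To make the voting idea and three of the metrics named above concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the three hypothetical base classifiers, and the toy binary labels are all assumptions chosen for illustration (three voters are used so that a binary vote cannot tie), and the numbers are not results from the SUST data.

```python
from collections import Counter
from math import sqrt


def majority_vote(predictions_per_classifier):
    """Combine label predictions from several base classifiers by plurality vote.

    predictions_per_classifier: list of equal-length label lists, one per classifier.
    Ties are broken in favour of the label voted by the earliest classifier.
    """
    combined = []
    for votes in zip(*predictions_per_classifier):
        counts = Counter(votes)
        best = max(counts.values())
        # Pick the first label, in classifier order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == best))
    return combined


def confusion_counts(y_true, y_pred, positive):
    """Return (tp, tn, fp, fn) for a binary problem with the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def error_rate(tp, tn, fp, fn):
    """Fraction of misclassified instances."""
    return (fp + fn) / (tp + tn + fp + fn)


def f_measure(tp, tn, fp, fn):
    """F1 score: harmonic mean of precision and recall (0.0 if undefined)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient (0.0 if the denominator is zero)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (hypothetical): true session labels and three base classifiers'
# predictions. Each classifier makes two errors, but on different sessions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
clf_a = [1, 0, 0, 1, 0, 1, 1, 0]
clf_b = [1, 1, 1, 0, 0, 0, 1, 0]
clf_c = [0, 0, 1, 1, 0, 0, 1, 1]

ensemble = majority_vote([clf_a, clf_b, clf_c])

for name, pred in [("clf_a", clf_a), ("ensemble", ensemble)]:
    counts = confusion_counts(y_true, pred, positive=1)
    print(name, error_rate(*counts), f_measure(*counts), mcc(*counts))
```

In this toy setup the base classifiers' errors do not overlap, so the vote corrects every one of them; this is the diversity condition mentioned in the abstract. When base classifiers make the same mistakes, voting offers no such gain.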

Published

2015-01-01

How to Cite

Mohammed Hamed Ahmed Elhebir, & Ajith Abraham. (2015). A Novel Ensemble Approach to Enhance the Performance of Web Server Logs Classification. International Journal of Computer Information Systems and Industrial Management Applications, 7, 7. Retrieved from https://cspub-ijcisim.org/index.php/ijcisim/article/view/301

Section

Original Articles