Explainable PSO-Optimized Transformer Framework for Email Spam Detection Using Textual, Behavioural, and Temporal Intelligence

Kalyani Shivaji Ubale; Kamini Ashutosh Shirsath

doi:10.70917/ijcisim-2026-2018

Authors

Kalyani Shivaji Ubale, Computer Engineering Department, K. K. Wagh Institute of Engineering Education and Research, Savitribai Phule Pune University (SPPU), Nashik, Maharashtra, India.
Kamini Ashutosh Shirsath, Computer Engineering Department, Sandip Institute of Engineering and Management (SIEM), Savitribai Phule Pune University (SPPU), Nashik, Maharashtra, India.

DOI:

https://doi.org/10.70917/ijcisim-2026-2018

Keywords:

Email Spam Detection, Transformer Models, TinyBERT, ALBERT, ELECTRA, Particle Swarm Optimization, Explainable AI, SHAP, LIME, TF-IDF, Behavioural Feature Engineering, Temporal Feature Engineering, CEAS 2008 Dataset, Deep Learning, Natural Language Processing

Abstract

Despite sustained efforts to eliminate it, email spam remains a major cybersecurity threat: it causes productivity losses and serves as a gateway for phishing, malware, and fraud. Many spam detection techniques based on rule-based heuristics or traditional machine learning algorithms cannot adapt to the sophistication of modern spam campaigns. This paper proposes a novel Explainable PSO-Optimized Transformer Framework that combines multi-dimensional features (textual, behavioural, and temporal) with transformer-based deep learning models, Particle Swarm Optimization (PSO) for hyperparameter tuning, and Explainable Artificial Intelligence (XAI) techniques. The framework is evaluated on the CEAS 2008 Email Spam Dataset using three lightweight transformer models (TinyBERT, ALBERT, and ELECTRA), each tuned with PSO over learning rate and weight decay. The framework uses TF-IDF textual features; behavioural features engineered from word count, link count, HTML tag count, exclamation count, uppercase count, and punctuation count; and temporal features engineered from hour of day, day of week, and month. SHAP-based explanations provide global transparency of model decisions, and LIME provides local interpretability. In stratified 5-fold cross-validation, TinyBERT achieves the highest mean F1 score (0.9919), followed by ALBERT (0.9914) and ELECTRA (0.9813), outperforming the baseline models Naive Bayes (0.9800), Logistic Regression (0.9130), LightGBM (0.9803), and XGBoost (0.9820). The combination of PSO optimization, explainability components, and multi-feature engineering yields an interpretable, high-performance spam detection system suitable for real-world email security deployments.
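
The behavioural and temporal features listed in the abstract could be computed along the following lines. This is a minimal sketch, not the authors' implementation: the function names `behavioural_features` and `temporal_features`, the regular expressions used to detect links and HTML tags, and the choice to count individual characters for the uppercase and punctuation features are all assumptions made for illustration, since the abstract names the features but does not define them.

```python
import re
import string
from datetime import datetime


def behavioural_features(body: str) -> dict:
    """Count-based features named in the abstract.

    The exact definitions below (regexes, per-character counting)
    are assumptions; the abstract only lists the feature names.
    """
    return {
        "word_count": len(body.split()),
        # Assumed definition: any http:// or https:// URL counts as a link.
        "link_count": len(re.findall(r"https?://\S+", body)),
        # Assumed definition: anything shaped like <...> counts as a tag.
        "html_tag_count": len(re.findall(r"<[^>]+>", body)),
        "exclamation_count": body.count("!"),
        "uppercase_count": sum(ch.isupper() for ch in body),
        "punctuation_count": sum(ch in string.punctuation for ch in body),
    }


def temporal_features(sent_at: datetime) -> dict:
    """Hour-of-day, day-of-week, and month features from a send timestamp."""
    return {
        "hour_of_day": sent_at.hour,
        "day_of_week": sent_at.weekday(),  # Monday = 0, Sunday = 6
        "month": sent_at.month,
    }


if __name__ == "__main__":
    sample = "WIN now! Visit http://x.io <b>free</b>!!"
    print(behavioural_features(sample))
    print(temporal_features(datetime(2008, 8, 5, 23, 31)))
```

In a pipeline like the one described, these numeric features would sit alongside the TF-IDF vector for the same message. How the paper combines them with the transformer inputs is not stated in the abstract.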
