A Deep Experimental Study of Ensemble-Based Phishing Detection in Centralised and Federated Settings
2025 (English) In: Proceedings - 2025 IEEE Conference on Pervasive and Intelligent Computing, PICom 2025, Institute of Electrical and Electronics Engineers (IEEE), 2025, p. 145-152
Conference paper, Published paper (Refereed)
Abstract [en]
Phishing attacks remain a major cybersecurity threat in today's increasingly interconnected world, exploiting human vulnerabilities to bypass traditional defenses. This study evaluates a range of Machine Learning (ML) and Deep Learning (DL) models for phishing detection using three datasets: phishing-only, legitimate-only, and a balanced merged dataset. The experiments were carried out using centralized and Federated Learning (FL) approaches, with further robustness evaluation under adversarial scenarios such as phishing injection and label-flipping attacks. The results demonstrate that DL models, particularly LSTM and MLP, delivered strong performance in both stand-alone and ensemble setups. Notably, FL models preserved comparable performance, and even slightly better accuracy, relative to centralized models, while enhancing privacy, scalability, and robustness. Under adversarial scenarios, FL models maintained high robustness, validating the practical effectiveness of privacy-preserving phishing detection in real-world environments. In centralized settings, LSTM achieved up to 96.00% accuracy on phishing data, while heterogeneous DL ensembles (LSTM, MLP, CNN) further enhanced performance to 97.13%. This advantage was especially evident during cross-distribution evaluations, where LSTM bagging showed strong generalization, achieving up to 85.73% accuracy. In federated settings, LSTM models reached up to 86.75% accuracy and an F1-score around 86%. Even under phishing injection and data poisoning attacks, the 10-client LSTM model showed the best balance with 83.33% accuracy, 85.67% F1-score, and the highest AUC of 91.40%. The 10-client CNN and 15-client MLP models also performed well, with accuracies around 83.7% and F1-scores above 85.5%, and the MLP reaching an AUC of 91.13%.
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025. p. 145-152
Keywords [en]
cybersecurity, federated learning, machine learning, phishing, privacy-preserving, social engineering, Behavioral research, Computer crime, Human computer interaction, Learning systems, Network security, Privacy-preserving techniques, Centralised, Cyber security, F1 scores, Machine-learning, Performance, Phishing attacks, Phishing detections, Privacy preserving, Long short-term memory
National Category
Security, Privacy and Cryptography
Identifiers
URN: urn:nbn:se:bth-29303
DOI: 10.1109/PICom68402.2025.00025
Scopus ID: 2-s2.0-105032923445
ISBN: 9798331590925 (print)
OAI: oai:DiVA.org:bth-29303
DiVA, id: diva2:2050876
Conference
23rd IEEE Conference on Pervasive and Intelligent Computing, PICom 2025, Hakodate City, Nov 21-24
2026-04-07 2026-04-07 2026-04-07 Bibliographically approved