Nirwana and Muhammad, Azhar and Mehwish, Usman (2026) Federated Learning for Privacy-Preserving Data Science: Performance, Efficiency, and Scalability Analysis. Journal of Data Science, 2026 (01), pp. 81-101. ISSN 2805-5160
Text: jods2026_05.pdf - Published Version. Available under License Creative Commons Attribution. (364kB)
Text: 855 - Published Version. Available under License Creative Commons Attribution. (45kB)
Abstract
The rapid growth of distributed and privacy-sensitive data environments has intensified the need for collaborative machine learning approaches that preserve confidentiality without sacrificing performance. Traditional centralized learning requires data aggregation, creating regulatory, ethical, and security risks. Although federated learning (FL) addresses this limitation by enabling decentralized training, existing implementations suffer from performance degradation under non-IID data distributions, unstable convergence, and high communication overhead. Moreover, many studies focus primarily on accuracy comparisons without systematically evaluating scalability and efficiency trade-offs. This study proposes an Adaptive Federated Learning (AFL) framework that integrates divergence-aware aggregation and intelligent client selection to enhance convergence stability and communication efficiency in heterogeneous environments. A comprehensive experimental evaluation was conducted across IID and non-IID data partitions, varying participation rates, and communication constraints. Performance was assessed using predictive accuracy, F1-score, convergence rounds, communication volume, and scalability metrics, with comparisons against centralized learning and standard FedAvg. Results demonstrate that, compared with FedAvg under highly non-IID settings, AFL improves accuracy by up to 5.3% and macro F1-score by 6.5%, while reducing convergence rounds by approximately 23% and communication overhead by up to 28%. Statistical analysis confirms significant performance gains (p < 0.01). The findings indicate that adaptive orchestration mechanisms substantially enhance federated robustness without compromising privacy advantages. This study provides a system-level evaluation framework for privacy-preserving distributed learning and offers actionable guidance for deploying scalable federated systems in healthcare, finance, and other data-sensitive domains.
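The abstract names two mechanisms, divergence-aware aggregation and intelligent client selection, without giving their form. The paper's AFL implementation is not reproduced on this page, so the following is only an illustrative sketch of what such mechanisms can look like: clients whose updates diverge further from the global model are down-weighted during averaging, and a simple top-k rule picks participants. All function names, the exponential weighting scheme, and the parameters (`temperature`, `k`) are assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch of divergence-aware aggregation and client selection.
# Not the paper's AFL algorithm; a toy FedAvg-style variant for intuition only.
import math

def l2_distance(a, b):
    """Euclidean distance between two parameter vectors (plain lists)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def divergence_aware_aggregate(global_model, client_models, client_sizes,
                               temperature=1.0):
    """Average client models, weighting each client by its data size and
    down-weighting clients whose update diverges from the global model."""
    divergences = [l2_distance(m, global_model) for m in client_models]
    raw = [n * math.exp(-d / temperature)          # outliers get exp-small weight
           for n, d in zip(client_sizes, divergences)]
    total = sum(raw)
    weights = [r / total for r in raw]             # normalize to sum to 1
    dim = len(global_model)
    return [sum(w * m[i] for w, m in zip(weights, client_models))
            for i in range(dim)]

def select_clients(scores, k):
    """Pick the k clients with the highest scores (e.g. recent local loss)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Toy round: three clients, the third produced an outlier update.
global_model = [0.0, 0.0]
client_models = [[0.1, 0.1], [0.2, 0.0], [5.0, 5.0]]
client_sizes = [100, 100, 100]
new_global = divergence_aware_aggregate(global_model, client_models, client_sizes)
chosen = select_clients([0.9, 0.2, 0.5], k=2)
```

With equal data sizes, plain FedAvg would pull the global model strongly toward the outlier; here the exponential divergence penalty makes the outlier's weight negligible, which is the kind of stabilization under non-IID data the abstract attributes to AFL.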
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Federated Learning; Privacy-Preserving AI; Distributed Data Science; Scalability; Secure Analytics |
| Subjects: | Q Science > Q Science (General); Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
| Depositing User: | Unnamed user with email masilah.mansor@newinti.edu.my |
| Date Deposited: | 27 Feb 2026 00:47 |
| Last Modified: | 27 Feb 2026 00:47 |
| URI: | http://eprints.intimal.edu.my/id/eprint/2302 |
