Relaxing Core Assumptions: the Impact of Data, Model and Participation Heterogeneity on Performance, Privacy and Fairness in Federated Learning

Authors: Németh, G. D.

Publication: PhD Thesis, University of Alicante, 2025

Federated Learning (FL) enables decentralized training of machine learning models on distributed data while preserving privacy by design. In an FL system, clients train models on their private data and a central server aggregates a global model based on the consensus among clients. In an ideal scenario, the training data and computing resources are independent and identically distributed (i.i.d.) among clients, so clients can work together in agreement to reach a global optimum. However, in realistic FL settings, heterogeneity arises between clients in terms of both data and resource availability. This research focuses on such scenarios, with a special interest in how the server can adapt the aggregation method beyond simple averaging to address the clients’ diversity.
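
As a point of reference for the "simple averaging" baseline mentioned above, the following is a minimal sketch of a FedAvg-style server step, in which each client's parameters are weighted by its local dataset size. It is an illustration under stated assumptions and not the aggregation method proposed in the thesis: the function name `federated_average`, the flat-list representation of model parameters, and the example values are all hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Return the dataset-size-weighted average of client parameter vectors.

    Assumption: every client reports its model as a flat list of floats of
    the same length, plus the number of local training samples it used.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client, and at least one client")

    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must report vectors of the same length")

    global_weights = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        share = n_samples / total  # this client's contribution to the average
        for i, value in enumerate(weights):
            global_weights[i] += share * value
    return global_weights


# Hypothetical round: two clients, the second holding three times more data.
example = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

With i.i.d. clients this weighting is a natural choice. Under heterogeneity, clients with large or skewed datasets dominate the average, which is one reason the server may need to manage participation and aggregation more actively.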

The first research direction reviews existing client selection methods and proposes a novel taxonomy of FL methods in which the server actively manages client participation to achieve a global objective in the presence of client heterogeneity. This research direction is presented in [NLQO22].

The next chapter focuses on model heterogeneity as an inclusion policy for low-resource clients. It investigates how client resource constraints affect privacy when low-resource clients train models of reduced complexity. This work has been presented in [NLQO25].

The final area provides a solution to the data heterogeneity problem with distribution-aware client selection. Applying this solution can mitigate spurious correlations and improve algorithmic fairness in FL. This research line has been described in [NFN+25].

[NLQO22] Németh, G. D., Lozano, M. A., Quadrianto, N., and Oliver, N. (2022). A Snapshot of the Frontiers of Client Selection in Federated Learning. Transactions on Machine Learning Research.

[NLQO25] Németh, G. D., Lozano, M. A., Quadrianto, N., and Oliver, N. (2025). Privacy and Accuracy Implications of Model Complexity and Integration in Heterogeneous Federated Learning. IEEE Access, 13, 40258-40274.

[NFN+25] Németh, G. D., Fani, E., Ng, Y. J., Caputo, B., Lozano, M. A., Oliver, N., and Quadrianto, N. (2025). FedDiverse: Tackling Data Heterogeneity in Federated Learning with Diversity-Driven Client Selection. FLTA2025.