Below is our work recently accepted for publication in IEEE Transactions on Network and Service Management, entitled “StatAvg: Mitigating Data Heterogeneity in Federated Learning for Intrusion Detection Systems” and authored by P. S. Bouzinis, P. Radoglou-Grammatikis, I. Makris, T. Lagkas, V. Argyriou, G. Th. Papadopoulos, P. Sarigiannidis, and G. K. Karagiannidis.
Link: https://ieeexplore.ieee.org/document/10977044
Federated learning (FL) enables devices to collaboratively build a shared machine learning (ML) or deep learning (DL) model without exposing raw data. Its privacy-preserving nature has made it popular for intrusion detection systems (IDS) in the field of cybersecurity. However, data heterogeneity across participants poses challenges for FL-based IDS.
This study proposes a statistical averaging (StatAvg) method to alleviate non-independent and identically distributed (non-IID) features across clients’ local data in FL. In particular, StatAvg lets each FL client share its local data statistics with the server. These statistics include the mean and variance of each client’s feature vector. The server then aggregates this information to produce global statistics, which are sent back to the clients and used for universal data normalization, i.e., common scaling of the input features by all clients. Because StatAvg runs before the actual FL training process, it can integrate seamlessly with any FL aggregation strategy.
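The flow described above (local statistics, server-side aggregation, common normalization) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `local_stats`, `aggregate_stats`, and `normalize` are hypothetical, and the aggregation rule is an assumption, namely sample-count-weighted pooling of means and variances via the law of total variance. The paper may define the aggregation differently.

```python
from typing import List, Sequence, Tuple

# (number of samples, per-feature mean, per-feature variance)
Stats = Tuple[int, List[float], List[float]]


def local_stats(data: Sequence[Sequence[float]]) -> Stats:
    """Client side: per-feature mean and population variance of local data."""
    n = len(data)
    dim = len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    var = [sum((row[j] - mean[j]) ** 2 for row in data) / n for j in range(dim)]
    return n, mean, var


def aggregate_stats(client_stats: Sequence[Stats]) -> Tuple[List[float], List[float]]:
    """Server side: combine local statistics into global mean and variance.

    Assumption (not taken from the paper): clients are weighted by sample
    count and variances are pooled with the law of total variance.
    """
    total = sum(n for n, _, _ in client_stats)
    dim = len(client_stats[0][1])
    g_mean = [sum(n * m[j] for n, m, _ in client_stats) / total for j in range(dim)]
    g_var = [
        sum(n * (v[j] + (m[j] - g_mean[j]) ** 2) for n, m, v in client_stats) / total
        for j in range(dim)
    ]
    return g_mean, g_var


def normalize(
    data: Sequence[Sequence[float]],
    g_mean: Sequence[float],
    g_var: Sequence[float],
    eps: float = 1e-12,
) -> List[List[float]]:
    """Client side: scale local features with the shared global statistics."""
    return [
        [(x - mu) / (var + eps) ** 0.5 for x, mu, var in zip(row, g_mean, g_var)]
        for row in data
    ]


# Two clients with differently distributed features (toy data).
client_a = [[1.0, 10.0], [3.0, 30.0]]
client_b = [[5.0, 50.0], [7.0, 70.0], [9.0, 90.0]]

# Only the statistics leave the clients, never the raw rows.
global_mean, global_var = aggregate_stats([local_stats(client_a), local_stats(client_b)])
scaled_a = normalize(client_a, global_mean, global_var)
scaled_b = normalize(client_b, global_mean, global_var)
```

Under this pooling assumption, the global statistics equal those computed on the union of all clients' data, so every client applies the same scaling without exchanging raw samples. FL training with any aggregation strategy would then proceed on the normalized data.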