Analyzing the Impact of Non-IID Data on IoT-Enabled Federated Learning for ECG Arrhythmia Detection
Davide Cantoro; Angela-Tafadzwa Shumba; Gianluigi Semeraro; Teodoro Montanaro; Ilaria Sergi; Massimo De Vittorio; Luigi Patrono
2025-01-01
Abstract
Integrating Federated Learning (FL) into the Internet of Medical Things (IoMT) enables Artificial Intelligence (AI) models to be trained directly on edge devices without sharing sensitive patient information. This approach enhances privacy while preserving the quality and effectiveness of clinical analysis. In real-world scenarios, however, physical devices often generate data that is not independent and identically distributed (Non-IID), which creates significant challenges for the training process. This study proposes an experimental method for generating realistic data distributions from existing centralized datasets, capturing the real-world heterogeneity of IoT-driven federated learning infrastructures. The proposed infrastructure uses statistical techniques to transform IID datasets into Non-IID distributions, which enables a systematic evaluation of the impact of Non-IID data on Federated Learning for ECG arrhythmia detection. On the MIT-BIH Arrhythmia dataset, an accuracy drop of only 0.31% was observed in an extreme Non-IID scenario. Execution time, however, varied considerably: by up to 50% across clients in the extreme scenario, compared with 15.5% under medium Non-IID and 0.63% under IID conditions. This observation implies that Non-IID data leads to substantial disparities in computational workload across clients, which can slow down and destabilize convergence, consistent with theoretical expectations.

| File | Description | Type | License | Size | Format |
|---|---|---|---|---|---|
| PDFExpress_Splitech2025_ECG_Non_IID_DavideCantoro.pdf (authorized users only) | Pre-print | Publisher's version | Publisher's copyright | 1.51 MB | Adobe PDF |
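The abstract describes transforming an IID dataset into Non-IID client partitions but does not name the statistical technique used. As an illustration only, the sketch below uses Dirichlet-based label skew, a common choice in the federated learning literature; it is an assumption, not necessarily the authors' method. The function name `dirichlet_partition`, the parameter `alpha`, and the toy labels are all hypothetical.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with label skew controlled by alpha.

    Illustrative sketch (assumed technique, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Share of this class that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn the shares into cut points inside the shuffled index array.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(sorted(ix), dtype=int) for ix in client_indices]

# Toy labels standing in for heartbeat classes (hypothetical, not MIT-BIH data).
toy_labels = np.repeat(np.arange(5), 200)
parts = dirichlet_partition(toy_labels, num_clients=4, alpha=0.1)
for client, ix in enumerate(parts):
    # Per-client class histogram shows how skewed each partition is.
    print(client, np.bincount(toy_labels[ix], minlength=5))
```

In this sketch, a small `alpha` concentrates each class on a few clients (strong skew), while a large `alpha` gives every client a similar class mix, approaching the IID case. Unequal partition sizes are also a plausible source of the per-client execution time differences the abstract reports, although the abstract itself does not state the cause.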
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


