A Simple Data Augmentation for Feature Distribution Skewed Federated Learning. (arXiv:2306.09363v1 [cs.LG])

Federated learning (FL) facilitates collaborative learning among multiple
clients in a distributed manner while preserving privacy. However, its
performance inevitably degrades in the presence of data heterogeneity, i.e.,
non-IID data. In this paper, we focus on the feature distribution skewed FL
scenario, which is widespread in real-world applications. The main challenge
lies in the feature shift caused by the different underlying distributions of
local datasets. While previous attempts have made progress, few studies pay
attention to the data itself, which is the root of this issue. Therefore, the primary
goal of this paper is to develop a general data augmentation technique at the
input level, to mitigate the feature shift. To achieve this goal, we propose
FedRDN, a simple yet remarkably effective data augmentation method for feature
distribution skewed FL, which randomly injects the statistics of the dataset
from the entire federation into the client’s data. In this way, our method
effectively improves the generalization of features, thereby mitigating the
feature shift. Moreover, FedRDN is a plug-and-play component that can be
seamlessly integrated into the data augmentation flow with only a few lines of
code. Extensive experiments on several datasets show that the performance of
various representative FL works can be further improved by combining them with
FedRDN, which demonstrates the strong scalability and generalizability of
FedRDN. The source code will be released.

Source: https://arxiv.org/abs/2306.09363
