Federated Learning: A Key to Privacy-Preserving AI in 6G

At ENSURE-6G’s 3rd event, “Advanced Wireless Network Security and Privacy,” Dr. Shen Wang from University College Dublin presented a talk titled “Federated learning for privacy-preserving 6G AI systems.” The presentation positioned Federated Learning (FL) as a key approach to building secure, privacy-aware AI for emerging 6G networks.

Understanding Federated Learning

Federated Learning is a distributed machine learning paradigm in which data owners at different locations jointly train an AI model without directly sharing their raw data [05:19]. This matters most for sensitive information held by end-users, banks, or hospitals, because the data stays on the owner’s device or premises [08:02]. Only model updates, such as gradients or parameters, travel between the clients (the data owners) and an aggregation server [08:22].

A common FL algorithm, Federated Averaging (FedAvg), operates as follows:

A central server sends an initial statistical model to multiple clients [10:02].
Clients train this model locally using their own data in parallel [11:04].
Clients then send their updated models back to the server [11:41].
The server aggregates these updates (e.g., by averaging them) to create an improved global model for the next iteration [11:57].
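
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: it assumes a toy linear model y = w*x + b, synthetic client data, and hypothetical helper names (local_train, fed_avg, run_federated_training, make_client). Real deployments train neural networks and exchange the updates over a network.

```python
import random

def local_train(model, data, lr=0.1, epochs=5):
    """Step 2: a client refines the received model on its own private data."""
    w, b = model
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def fed_avg(client_models, client_sizes):
    """Step 4: average the client models, weighted by local sample count."""
    total = sum(client_sizes)
    w = sum(m[0] * n for m, n in zip(client_models, client_sizes)) / total
    b = sum(m[1] * n for m, n in zip(client_models, client_sizes)) / total
    return (w, b)

def run_federated_training(clients, rounds=50):
    global_model = (0.0, 0.0)  # Step 1: the server creates the initial model
    for _ in range(rounds):
        # Steps 2 and 3: every client trains locally and returns its model.
        # Only the (w, b) pairs leave the client, never the data itself.
        local_models = [local_train(global_model, data) for data in clients]
        global_model = fed_avg(local_models, [len(d) for d in clients])
    return global_model

def make_client(n, rng):
    """Synthetic private dataset drawn from y = 3x + 1 plus small noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [make_client(n, rng) for n in (20, 50, 80)]
w, b = run_federated_training(clients)
print(f"global model after training: w={w:.2f}, b={b:.2f}")
```

Weighting by sample count means a client with more data has more influence on the global model. A plain unweighted mean is the simpler alternative when client dataset sizes are similar.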

Privacy Risks and Challenges in FL for 6G

Although FL keeps raw data local, it is not risk-free. An attacker who observes the model updates transmitted over multiple iterations can, in principle, reconstruct the original training data from them [14:08]. Communication overhead can also be substantial because model updates are large, although 6G’s high bandwidth is expected to mitigate this [15:01].
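
To see why model updates can leak data, consider the simplest possible case, which is an assumption chosen for illustration and not the attack discussed in the talk: a single linear neuron with a bias, trained with squared loss on one private sample. The weight gradient is the input scaled by the error term, and the bias gradient is that same error term, so dividing one by the other returns the input exactly. The function names below are hypothetical.

```python
def gradients(weights, bias, x, y):
    """Gradients of the squared loss (pred - y)**2 for one sample."""
    pred = sum(w * xi for w, xi in zip(weights, x)) + bias
    err = 2.0 * (pred - y)                 # d(loss)/d(pred)
    grad_w = [err * xi for xi in x]        # d(loss)/d(weights)
    grad_b = err                           # d(loss)/d(bias)
    return grad_w, grad_b

def reconstruct_input(grad_w, grad_b):
    """An observer who sees only the gradients recovers x as grad_w / grad_b."""
    if grad_b == 0:
        raise ValueError("zero bias gradient: this sample leaks nothing here")
    return [g / grad_b for g in grad_w]

private_x = [0.2, -1.5, 3.0]               # never leaves the client
grad_w, grad_b = gradients([0.5, 0.1, -0.3], 0.05, private_x, y=1.0)
recovered = reconstruct_input(grad_w, grad_b)
print(recovered)
```

Attacks on deep networks and batched updates are considerably harder and typically rely on iterative optimization, but the principle is the same: the update is a function of the private data and can sometimes be inverted.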

The heterogeneity inherent in 6G networks also presents significant challenges:

Data Heterogeneity: Clients will possess diverse types and quantities of data [15:55].
Device Heterogeneity: Varying computational capabilities among devices can lead to delays, as faster devices might have to wait for slower ones. While asynchronous FL can help, the large performance gaps in 6G make it difficult to manage effectively [16:24, 17:03].
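
The trade-off between waiting and not waiting can be made concrete with a back-of-the-envelope sketch. The device names and per-update training times below are invented for illustration, and the two functions are hypothetical. They only count how many updates each client contributes within a fixed time budget.

```python
def synchronous_updates(client_times, budget):
    """Every round lasts as long as the slowest client, so all clients
    contribute the same number of updates."""
    rounds = int(budget // max(client_times.values()))
    return {cid: rounds for cid in client_times}

def asynchronous_updates(client_times, budget):
    """Each client reports as soon as it finishes, so fast clients
    contribute far more updates than slow ones."""
    return {cid: int(budget // t) for cid, t in client_times.items()}

# Seconds needed for one local training pass (made-up values).
client_times = {"phone": 1.0, "laptop": 2.0, "sensor": 10.0}

sync_counts = synchronous_updates(client_times, budget=60.0)
async_counts = asynchronous_updates(client_times, budget=60.0)
print("synchronous :", sync_counts)
print("asynchronous:", async_counts)
```

In the synchronous case the fast devices sit idle most of the time. In the asynchronous case nothing waits, but the global model is dominated by the fast devices and each update from the slow device was computed against a model that has since moved on. The wider the speed gap, the more pronounced both effects become.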

UCD NetLab’s Pioneering Research

Dr. Wang showcased several research initiatives from UCD NetLab addressing these challenges:

Rackdeaf: Focused on defending against data reconstruction attacks in VR-based IoT scenarios by proposing a recommendation framework for privacy updates that minimize the impact on model performance [18:05].
Shield: Addressed security in hierarchical federated learning architectures (relevant for 6G), using outlier detection to defend against poisoning attacks at various layers [20:11].
Shepra: Utilized eXplainable AI (XAI) to generate justifications for model decisions, helping to identify malicious clients even if they constitute a majority, by comparing their behavior to a trusted sample [23:31].
Peer-to-Peer FL: Explored a fully distributed FL network without a central server, where each node performs training and aggregation. This work showed that communicating with fewer neighbors could lead to faster convergence, and differential privacy had minimal impact on model accuracy [26:34].
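
As a rough illustration of the serverless aggregation described in the last item, the sketch below shows one common way to do it, gossip averaging on a ring. This is an assumption for illustration, not necessarily the scheme used in the UCD NetLab work. Each node's model is reduced to a single number, local training between rounds is omitted, and the optional Gaussian noise only hints at where a differential-privacy mechanism would sit; it is not calibrated to any privacy budget. All names are hypothetical.

```python
import random

def ring(n):
    """Each node talks only to its left and right neighbor."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def gossip_round(models, neighbors, noise_std=0.0, rng=None):
    """Every node averages its own model with the copies its neighbors share.
    Shared copies can be perturbed with noise before they leave the node."""
    rng = rng or random.Random()
    shared = [m + rng.gauss(0.0, noise_std) if noise_std > 0 else m
              for m in models]
    new_models = []
    for i, own in enumerate(models):
        received = [shared[j] for j in neighbors[i]]
        new_models.append((own + sum(received)) / (1 + len(received)))
    return new_models

models = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]    # one toy "model" per node
topology = ring(len(models))
for _ in range(50):
    models = gossip_round(models, topology)
print(models)
```

Without noise, repeated rounds drive every node toward the average of the initial models, which is the value a central server would have computed. Passing a positive noise_std shows how perturbing the shared copies affects that agreement.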

Barriers to Real-World 6G Implementation

Despite the advancements, real-world deployment of FL in 6G faces several barriers:

Scale and Data Owner Management: Managing a vast number of diverse data owners is challenging, as is explaining to end-users the nuance that their data remains local but is still “used” for training [32:28].
Heterogeneity at Scale: Scaling FL to manage extreme data and device heterogeneity makes asynchronous FL tuning increasingly complex [33:28].

Currently, successful FL deployments are often seen in smaller, enclosed scenarios, such as collaborations between a limited number of banks or hospitals where data sharing is beneficial but direct data exchange is restricted [34:10].

Dr. Wang’s talk underscored that although Federated Learning significantly enhances privacy compared to traditional machine learning methods, considerable challenges remain in adapting it to the vast, diverse, and complex landscape of 6G. FL for 6G therefore remains an active area of research.

You can watch the full recorded talk here: Federated learning for privacy-preserving 6G AI systems