Distributed and Federated Learning
Professors
Prerequisites:
Students are required to have taken an introductory machine-learning course.
Good knowledge of supervised learning.
Some knowledge of gradient descent.
A basic grounding in unsupervised learning is recommended but not required.
Pedagogical objectives:
This course provides an overview of federated and distributed learning in terms of performance and security. It covers both theoretical and practical aspects in depth so that students acquire solid expertise in each. By the end of the course, students should:
- Understand the major difference between centralized and decentralized learning
- Understand the methods most commonly used in federated learning
- Understand the performance of federated learning when data is heterogeneous (Non-IID data).
- Be able to build and scale a simple federated system
- Have acquired competence in implementing federated learning in the fields of networks, security, health or others.
Evaluation modalities:
The evaluation will be based on a final exam, lab reports and/or project activity.
For the project, students may also conduct research in the field of federated learning and write a short paper.
Description:
This course is designed to extend students’ knowledge of learning in a decentralized setting. Decentralized learning techniques, such as federated learning, are set to deliver a new generation of machine learning applications by enabling efficient and reliable learning between multiple parties and from diverse data sources. This course will cover different aspects of federated learning, focusing on recent research developments and exploring important applications in different fields such as security, networks and healthcare.
Lectures
- Course overview. Introduction to machine learning and federated learning.
- Decentralized optimization and gradient descent
- Federated learning: FedSGD and FedAvg
- Variations of federated aggregation
- Federated averaging with heterogeneous data
- Communication-efficient learning of deep networks in federated learning
- Federated multi-task learning
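As a rough illustration of the aggregation step covered in the FedSGD and FedAvg lecture, the sketch below averages client model parameters weighted by local dataset size. It assumes NumPy and represents a model as a list of arrays; the names `fedavg`, `client_weights` and `client_sizes` are hypothetical and are not part of any framework used in the course.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client parameters (FedAvg-style aggregation).

    client_weights: one entry per client, each a list of np.ndarray layers.
    client_sizes: number of local training examples held by each client.
    """
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    aggregated = []
    for layer in range(n_layers):
        # Each client's layer is weighted by its share of the total data.
        layer_avg = sum(
            (n / total) * w[layer]
            for w, n in zip(client_weights, client_sizes)
        )
        aggregated.append(layer_avg)
    return aggregated

# Toy example (made-up numbers): two clients, two "layers" each.
clients = [
    [np.array([1.0, 1.0]), np.array([0.0])],
    [np.array([3.0, 3.0]), np.array([4.0])],
]
sizes = [1, 3]
global_weights = fedavg(clients, sizes)
```

In a full training loop this aggregation would run once per communication round, after each selected client has trained locally starting from the current global weights.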
Lab sessions
- Build and scale a simple federated learning system with MNIST, CIFAR-10, Fashion-MNIST, MedMNIST, Shakespeare, and BCN Open Data, using open-source federated learning tools (PyTorch, Flower, etc.).
- Federated learning with Non-IID data.
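For the Non-IID lab, one common way to simulate heterogeneous clients from a centralized dataset is to split each class across clients with Dirichlet-distributed proportions. The sketch below is an illustration under that assumption, not the prescribed lab method; the function name `dirichlet_partition` and its parameters are hypothetical. Smaller values of `alpha` give more skewed label distributions per client.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients with label skew.

    For each class, the share given to each client is drawn from a
    Dirichlet(alpha) distribution. Returns one index list per client.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        # Turn the cumulative proportions into split points for this class.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices

# Toy example (synthetic labels): 3 classes, 20 samples each, 4 clients.
toy_labels = np.repeat([0, 1, 2], 20)
parts = dirichlet_partition(toy_labels, n_clients=4, alpha=0.5)
```

Every sample is assigned to exactly one client, so the resulting index lists can be used directly to build per-client training subsets.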
Complementary content:
- Threats, attacks, and defenses to federated learning
- Designing an attack and setting up a defense for federated learning.
- Applications to images, networks, health, and vehicle-to-vehicle communications
- Labs
- Applications of federated learning to network anomaly detection: use of 5G and LoRaWAN testbeds and datasets, with lab [by CNAM].
- Applications of federated learning to medical equipment: use of aggregated and anonymized field data [by NTUU].
- Applications of federated learning to vehicle-to-vehicle communications: routing and content offloading [by UPC].
Flower: A Friendly Federated Learning Framework, https://flower.ai/
Datasets:
- https://keras.io/api/datasets/
- https://medmnist.com/
- https://opendata-ajuntament.barcelona.cat/en
- https://github.com/cedric-cnam/5G3E-dataset/
- Intel & MobileODT Cervical Cancer Screening Dataset: https://www.kaggle.com/competitions/intel-mobileodt-cervical-cancer-screening
Reference papers:
- Original paper on federated learning: H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Agüera y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data", https://arxiv.org/abs/1602.05629
- Heterogeneous data: H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Agüera y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data", https://arxiv.org/pdf/1602.05629.pdf
- Computational heterogeneity: FedProx and SCAFFOLD
- Computational heterogeneity: FedNova
- Security in federated learning: Krum, Backdoor Federated Learning, SVFed
- Client selection: Optimal sampling, Client Selection in FL
- Fairness: q-FFL, AgnosticFL
Books:
- Kiyoshi Nakayama, George Jeno, Federated Learning with Python, O’Reilly, October 2022
- Lam M. Nguyen, Trong Nghia Hoang, Pin-Yu Chen, Federated Learning: Theory and Practice, Elsevier, 2024. ISBN: 9780443190384
Devices:
- Laboratory-Based Course Structure
- Open-Source Software Requirements