Team members
Rachel Phinnemore, Yufei Kang, Tianyu Wang
Description
Federated Learning (FL) offers the benefits of machine learning without compromising user privacy. However, the nature and distribution of the training data in federated learning make it harder to optimize model accuracy and to converge to an optimum. Namely, three attributes of federated data impede the optimization process: massively distributed data, Non-IID data, and unbalanced data [2]. Nevertheless, progress has been made by exploring different adaptive server aggregation algorithms [1]. One finding from that study was that model accuracy was more sensitive to tuning the client update scheme than to tuning the server aggregation method. Our CSC 2228 course project takes this finding as inspiration and explores how to improve model accuracy on unbalanced Non-IID data by implementing different client update schemes. Although “Optimization in Federated Learning” implemented and tested different client update schemes to determine which gives the greatest accuracy gains, that work focused solely on Non-IID data. Unbalanced Non-IID data makes competitive model accuracy significantly harder to reach than Non-IID data alone. To our knowledge, our project will be the first to explore whether federated learning model accuracy on unbalanced Non-IID data can be improved by implementing different client update schemes.
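To make the distinction between a client update scheme and server aggregation concrete, the following is a minimal sketch of one federated round. It assumes a toy linear model trained with plain SGD on each client and a sample-size-weighted average on the server, in the style of federated averaging. The function names `client_update`, `server_aggregate`, and `run_round` are hypothetical and are not taken from the cited works.

```python
def client_update(weights, data, lr=0.1, epochs=1):
    """Client update scheme (assumed here: plain SGD on squared error).

    `data` is a list of (features, target) pairs for a linear model
    y ~ w . x. Swapping this function is how a different client
    update scheme (e.g. one with momentum) would be plugged in.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def server_aggregate(client_weights, client_sizes):
    """Server aggregation: average client models weighted by sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]


def run_round(global_w, client_datasets, lr=0.1, epochs=1):
    """One round: every client trains locally, then the server averages."""
    updates = [client_update(global_w, d, lr, epochs) for d in client_datasets]
    sizes = [len(d) for d in client_datasets]
    return server_aggregate(updates, sizes)


if __name__ == "__main__":
    # Two clients with unequal amounts of data drawn from y = 2x.
    clients = [
        [([1.0], 2.0)],
        [([1.0], 2.0), ([2.0], 4.0), ([0.5], 1.0)],
    ]
    w = [0.0]
    for _ in range(50):
        w = run_round(w, clients)
    print(w)
```

In this sketch the unbalanced client sizes enter only through the aggregation weights, while everything the project intends to vary lives in `client_update`.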
Project goals
- Baseline Goals
  - Simulate a federated learning setting in which local data is unbalanced and Non-IID across remote clients.
  - Implement and refine different client update schemes and evaluate them in the simulated unbalanced, Non-IID setting.
  - Reimplement state-of-the-art algorithms to serve as comparison groups.
  - Compare the proposed methods with the comparison groups in different scenarios and draw final conclusions.
- Reach Goals
  - Moderate Reach Goal: Investigate ways to improve model performance on unbalanced Non-IID data by fine-tuning client update hyperparameters (e.g., learning rate, momentum).
  - Advanced Reach Goal: Investigate ways to further improve federated learning performance on unbalanced Non-IID data, possibly by proposing a novel client update optimization method or an advanced server aggregation approach.
  - Note: Some or all of the reach goals will be attempted if the baseline goals are met.
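As an illustration of the first baseline goal, the following is a minimal sketch of one way to split a labeled dataset into unbalanced, Non-IID client shards. It assumes label skew (each client sees only a few classes) combined with power-law client sizes. The function `partition_unbalanced_non_iid` and its parameters `classes_per_client` and `size_skew` are hypothetical choices for this sketch, not the partitioning scheme the project has committed to.

```python
import random


def partition_unbalanced_non_iid(labels, num_clients, classes_per_client=2,
                                 size_skew=1.5, seed=0):
    """Return one list of sample indices per client.

    Non-IID: each client draws from only `classes_per_client` classes.
    Unbalanced: target client sizes follow a power law in the client rank.
    Shards are disjoint; a client may receive fewer samples than its
    target if the pools for its classes run out.
    """
    rng = random.Random(seed)

    # Group sample indices by class and shuffle within each class.
    pools = {}
    for idx, y in enumerate(labels):
        pools.setdefault(y, []).append(idx)
    for pool in pools.values():
        rng.shuffle(pool)
    classes = sorted(pools)

    # Power-law target sizes: client 0 is the largest.
    weights = [1.0 / (rank + 1) ** size_skew for rank in range(num_clients)]
    total = sum(weights)
    targets = [max(1, int(len(labels) * w / total)) for w in weights]

    clients = []
    for k in range(num_clients):
        # Assign classes cyclically so every class is used when possible.
        own = [classes[(k * classes_per_client + j) % len(classes)]
               for j in range(classes_per_client)]
        picked = []
        for j in range(targets[k]):
            pool = pools[own[j % len(own)]]
            if pool:
                picked.append(pool.pop())
        clients.append(picked)
    return clients


if __name__ == "__main__":
    labels = [i % 10 for i in range(1000)]  # toy labels, 10 classes
    for k, shard in enumerate(partition_unbalanced_non_iid(labels, 5)):
        print(k, len(shard), sorted({labels[i] for i in shard}))
```

Raising `size_skew` makes the split more unbalanced, and lowering `classes_per_client` makes it more strongly Non-IID, so the two properties can be varied independently in experiments.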
Timeline
Date | Milestone |
---|---|
Week 1, Feb 2nd - Feb 9th | Find existing code to build on so that we can focus on implementing new client update schemes; set up the environment and verify that the existing code works |
Week 2, Feb 9th - Feb 16th | Create the Non-IID and unbalanced Non-IID datasets |
Week 2-3, Feb 9th - Feb 23rd | Research client update schemes and implement them |
Week 3, Feb 16th - Feb 23rd | Implement a method to measure the accuracy of the client models and the global model; start writing the progress report |
Week 4, Feb 23rd - Mar 2nd | Run experiments to tune the hyperparameters of the client update schemes; begin the paper outline |
Week 5-7, Mar 2nd - Mar 16th | Run experiments; get feedback from the TA on the paper outline; write the paper |
Week 7-8, Mar 17th - Mar 31st | Prepare the presentation; get feedback from the TA on the presentation outline; polish the paper |
Week 9, Apr 1st - Apr 7th | Polish the presentation; do a practice dry run |
Week 10, Apr 7th - Apr 16th | Submit the final report |