Federated Learning Frameworks for Collaborative Software Development That Protect Privacy

Adeyemi Praise
Ladoke Akintola University of Technology


Abstract

Geographically distributed teams that build software together often work on a shared codebase. This raises serious concerns about the privacy of proprietary code, sensitive development data, and intellectual property. Conventional centralized approaches to training shared models risk exposing private data. This research proposes a federated learning framework that enables multiple organizations to collaboratively train machine learning models on their proprietary code data while keeping that data private. To prevent information leakage, we adopt privacy-enhancing methods such as secure aggregation and differential privacy. Experimental results indicate that the proposed method is well suited to collaborative development settings where privacy matters, since it performs comparably to other models while protecting privacy. It allows remote software engineering teams to cooperate safely and efficiently.
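To make the combination of secure aggregation and differential privacy more concrete, the following is a minimal sketch of one aggregation round, not the implementation evaluated in this work. It assumes each organization sends a model update as a plain list of floats; the function names (`clip_update`, `mask_updates`, `private_federated_average`) and the parameters `max_norm` and `noise_std` are hypothetical. The pairwise masks only illustrate why a server can recover the sum without seeing individual updates. A practical protocol would derive the masks through key agreement and handle client dropout, and the noise scale would need to be calibrated to a chosen privacy budget.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def mask_updates(updates, rng):
    """Add pairwise random masks that cancel when all updates are summed.

    Toy stand-in for secure aggregation: each pair of clients (i, j)
    shares one mask, which client i adds and client j subtracts.
    """
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] += mask[k]
                masked[j][k] -= mask[k]
    return masked


def private_federated_average(updates, max_norm, noise_std, rng):
    """Clip each update, mask it, sum, add Gaussian noise, then average."""
    clipped = [clip_update(u, max_norm) for u in updates]
    masked = mask_updates(clipped, rng)
    dim = len(updates[0])
    # The server only ever sums masked updates; the masks cancel in the sum.
    total = [sum(m[k] for m in masked) for k in range(dim)]
    # Gaussian noise on the sum (assumed, uncalibrated scale).
    noisy = [t + rng.gauss(0.0, noise_std) for t in total]
    return [v / len(updates) for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    client_updates = [[3.0, 4.0], [0.0, 1.0], [-0.5, 0.2]]
    print(private_federated_average(client_updates, 1.0, 0.1, rng))
```

Clipping bounds how much any single participant can shift the aggregate, which is what allows the added noise to mask an individual contribution. Setting `noise_std` to zero in this sketch recovers the plain average of the clipped updates.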

Keywords

Federated Learning, Privacy-Preserving, Collaborative Software Development, Differential Privacy, Secure Aggregation, Distributed Machine Learning, Software Engineering, Data Privacy.
