- Title: MAIB-Talk-027: Knowledge Distillation for Efficient Learning in Heterogeneous Federated Systems
- Date: 10:00 pm US Eastern time, 08/26/2023
- Date: 10:00 am Beijing time, 08/27/2023
- Zoom ID: 933 1613 9423
- Zoom PWD: 416262
- Zoom: https://uwmadison.zoom.us/meeting/register/tJcudu-prTIuGNda1MsF8PKyRQlnGn06TP2E
MAIB: Manifold Learning, Artificial Intelligence, and Biology Forum
Presentation Record (a previous presentation will be shown here if the video for this talk has not been released)
Dr. Zhuangdi Zhu, https://zhuangdizhu.github.io/
Zhuangdi is currently a senior data and application scientist at Microsoft. She received her Ph.D. in Computer Science from Michigan State University in August 2022. Zhuangdi has broad research interests in machine learning theory and its applications; her current research focuses on federated machine learning and reinforcement learning. She also has rich industry experience at both IT and financial technology companies, where she has applied machine learning to practical problems in human-computer interaction, wireless networks, the Internet of Things, user recommendation, and digital markets.
The rise of federated learning (FL) has brought machine learning to edge computing by leveraging data scattered across edge devices. However, the heterogeneous capacities of edge devices and differences in data distribution are two major obstacles to the widespread application of FL, leading to long convergence times and high communication costs. The research presented in this talk addresses these two fundamental challenges, commonly referred to as system heterogeneity and data heterogeneity. In particular, to address data heterogeneity, we propose a data-free knowledge distillation FL algorithm. Our method benefits and calibrates on-device training by distilling inductive bias learned in a data-free manner. Next, to address system heterogeneity, we propose learning self-distilling neural networks for arbitrary FL devices that can be flexibly pruned to different sizes and that capture domain knowledge in a nested, progressive manner. Empirical studies echo the theoretical implications, showing the significant advantages of our method in achieving efficient and effective FL.
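The abstract does not spell out the algorithms, so the following is only a minimal illustrative sketch under two assumptions: the global model is aggregated FedAvg-style (McMahan et al., 2017, below), and calibration uses a temperature-scaled knowledge-distillation loss of the kind data-free distillation methods employ. The function names (`fed_avg`, `kd_loss`) are ours, not the authors'.

```python
import numpy as np

def fed_avg(client_params, client_sizes):
    # Weighted average of client parameter vectors, as in FedAvg:
    # each client's weight is proportional to its local data size.
    total = float(sum(client_sizes))
    return sum(p * (n / total) for p, n in zip(client_params, client_sizes))

def softmax(logits, temperature=1.0):
    z = logits / temperature
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, temperature=3.0):
    # KL divergence between temperature-softened teacher and student
    # distributions. In a data-free setting, the inputs producing these
    # logits would come from a learned generator, not raw client data.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

# Toy usage: three clients, parameters flattened to vectors.
clients = [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)]
sizes = [10, 10, 20]
global_params = fed_avg(clients, sizes)  # -> [2.25, 2.25, 2.25, 2.25]
```

The temperature softens the teacher's distribution so the student also learns from the relative probabilities of non-target classes, which is the "inductive bias" a distillation loss transfers.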
McMahan, B., Moore, E., Ramage, D., Hampson, S., and y Arcas, B. A. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pp. 1273–1282. PMLR, 2017.
Lin, T., Kong, L., Stich, S. U., and Jaggi, M. Ensemble distillation for robust model fusion in federated learning. arXiv preprint arXiv:2006.07242, 2020.
Horvath, S., Laskaridis, S., Almeida, M., Leontiadis, I., Venieris, S. I., and Lane, N. D. FjORD: Fair and accurate federated learning under heterogeneous targets with ordered dropout. 35th Conference on Neural Information Processing Systems (NeurIPS), 2021.
Diao, E., Ding, J., and Tarokh, V. HeteroFL: Computation and communication efficient federated learning for heterogeneous clients. arXiv preprint arXiv:2010.01264, 2020.
How will this work be useful for drug discovery and development?
The work on addressing system heterogeneity and data heterogeneity in federated learning (FL) can be highly beneficial for drug discovery and development in several ways:
Efficient Data Utilization: Drug discovery often involves analyzing vast amounts of data from various sources, including clinical trials, genetic databases, and chemical libraries. FL allows pharmaceutical companies and researchers to collaborate and leverage data from multiple sources without centrally aggregating sensitive patient data. By addressing data heterogeneity in FL, the proposed data-free knowledge distillation algorithm can effectively utilize diverse datasets from different sources without the need to share raw data. This ensures privacy and data security while improving the overall data utilization efficiency.
Edge Computing for Drug Discovery: The use of edge devices in FL enables distributed data processing and model training closer to the data sources, reducing the need for data transfer to a central server. This is particularly useful in drug discovery, where data from hospitals, research labs, and clinics can be spread across different geographic locations. The self-distilling neural network approach tailored for arbitrary FL devices allows each edge device to train models that suit its computational capacity, optimizing the learning process for individual devices and potentially speeding up drug discovery efforts.
Accelerated Model Training: Drug discovery requires the development of complex machine learning models for various tasks, such as predicting drug-protein interactions, analyzing molecular structures, or identifying potential drug candidates. System heterogeneity in FL can cause slow convergence and high communication costs. The proposed self-distilling neural network approach addresses this issue by dynamically adapting model complexity to the capabilities of each edge device. This accelerates the model training process, leading to faster iterations and quicker results.
Generalization across Diverse Data: Drug discovery often involves dealing with data from different patient populations, disease types, and genetic variations. Addressing data heterogeneity through the data-free knowledge distillation algorithm can help improve model generalization across diverse datasets. By transferring distilled knowledge across devices without sharing raw data, the FL system can learn from the collective experience of different sources and produce models that perform well on various data distributions.
Privacy and Security: Drug discovery involves sensitive information about patients, molecular structures, and potential drug compounds. Centralizing data for model training can raise privacy and security concerns. FL, coupled with the proposed methods, allows different parties to collaborate without sharing raw data, protecting individual privacy while still benefiting from a collective learning process.
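The capacity-adaptive training described in the points above can be illustrated with a minimal sketch. This is not the authors' implementation; it simply slices the leading units of a dense layer's weight matrix so that every smaller model is nested inside the full one (in the spirit of HeteroFL and FjORD's ordered dropout, cited above), and averages each parameter only over the clients whose subnetwork contains it. The helper names are hypothetical.

```python
import numpy as np

def nested_slice(W, width_ratio):
    # Keep the leading fraction of output and input units, so every
    # smaller subnetwork shares its parameters with all larger ones.
    rows = max(1, int(round(W.shape[0] * width_ratio)))
    cols = max(1, int(round(W.shape[1] * width_ratio)))
    return W[:rows, :cols].copy()

def aggregate_nested(global_W, client_updates):
    # Average each entry of the global weight matrix over the clients
    # whose subnetwork actually trained it.
    acc = np.zeros_like(global_W)
    count = np.zeros_like(global_W)
    for W_c in client_updates:
        r, c = W_c.shape
        acc[:r, :c] += W_c
        count[:r, :c] += 1
    new_W = global_W.copy()
    mask = count > 0
    new_W[mask] = acc[mask] / count[mask]
    return new_W

# Toy usage: a 4x4 layer, one full-width client and one half-width client.
W = np.zeros((4, 4))
full = np.ones((4, 4))                       # high-capacity device
half = nested_slice(3 * np.ones((4, 4)), 0.5)  # low-capacity device, 2x2
W_new = aggregate_nested(W, [full, half])
# Top-left 2x2 block averages both clients: (1 + 3) / 2 = 2; the rest
# comes only from the full-width client.
```

Because the subnetworks are nested, a hospital with modest hardware can train a narrow slice while a well-resourced research lab trains the full model, and both contribute to the same shared parameters without sharing raw data.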
Overall, the work on addressing system heterogeneity and data heterogeneity in FL can contribute significantly to drug discovery and development efforts. It enables more efficient and secure collaboration among researchers, accelerates model training, improves model generalization, and ensures the privacy of sensitive data. These advancements can lead to faster and more effective drug development processes, potentially accelerating the discovery of new treatments and therapies for various diseases.