DOI: 10.1109/COMST.2021.3058573
Published 10 February 2021 in IEEE Communications Surveys and Tutorials

Federated Machine Learning: Survey, Multi-Level Classification, Desirable Criteria and Future Directions in Communication and Networking Systems

O. A. Wahab A. Mourad T. Taleb + 1 author

Abstract

The communication and networking field is hungry for machine learning decision-making solutions to replace the traditional model-driven approaches that proved to be not rich enough for capturing the ever-growing complexity and heterogeneity of the modern systems in the field. Traditional machine learning solutions assume the existence of (cloud-based) central entities that are in charge of processing the data. Nonetheless, the difficulty of accessing private data, together with the high cost of transmitting raw data to the central entity, gave rise to a decentralized machine learning approach called Federated Learning. The main idea of federated learning is to perform an on-device collaborative training of a single machine learning model without having to share the raw training data with any third-party entity. Although a few survey articles on federated learning already exist in the literature, the motivation of this survey stems from three essential observations. The first one is the lack of a fine-grained multi-level classification of the federated learning literature, where the existing surveys base their classification on only one criterion or aspect. The second observation is that the existing surveys focus only on some common challenges, but disregard other essential aspects such as reliable client selection, resource management and training service pricing. The third observation is the lack of explicit and straightforward directives for researchers to help them design future federated learning solutions that overcome the state-of-the-art research gaps. To address these points, we first provide a comprehensive tutorial on federated learning and its associated concepts, technologies and learning approaches. We then survey and highlight the applications and future directions of federated learning in the domain of communication and networking.
Thereafter, we design a three-level classification scheme that first categorizes the federated learning literature based on the high-level challenge that they tackle. Then, we classify each high-level challenge into a set of specific low-level challenges to foster a better understanding of the topic. Finally, we provide, within each low-level challenge, a fine-grained classification based on the technique used to address this particular challenge. For each category of high-level challenges, we provide a set of desirable criteria and future research directions that are aimed to help the research community design innovative and efficient future solutions. To the best of our knowledge, our survey is the most comprehensive in terms of challenges and techniques it covers and the most fine-grained in terms of the multi-level classification scheme it presents.
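
The core idea stated in the abstract (clients train on their own devices and share only model parameters, never raw data) can be illustrated with a minimal sketch. The abstract does not name an aggregation rule, so this example assumes sample-weighted parameter averaging (commonly called federated averaging) on a toy one-dimensional linear model. The function names `local_update`, `federated_average` and `run_rounds`, the learning rate and the epoch count are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Weights = List[float]               # [slope, intercept] of the toy model y = w*x + b
Dataset = List[Tuple[float, float]]  # local (x, y) pairs that never leave the client


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Run a few gradient-descent steps on one client's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error over the local data only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Assumed aggregation rule: average parameters weighted by sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[Dataset], rounds: int = 100) -> Weights:
    """Server loop: broadcast the global model, collect updates, aggregate."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        # Only parameters travel between clients and server, not the data.
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    # Two clients hold disjoint samples of the same relation y = 2x + 1.
    client_a = [(x, 2 * x + 1) for x in (0.0, 0.5, 1.0)]
    client_b = [(x, 2 * x + 1) for x in (-1.0, -0.5, 0.25, 0.75)]
    print(run_rounds([client_a, client_b]))
```

In this toy setting both clients share the same underlying relation, so the averaged model approaches a slope of 2 and an intercept of 1; real deployments with non-identically distributed client data generally behave less cleanly.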

Related Scientific Articles

Federated Learning Systems: Architecture Alternatives

J. Bosch Hongyi Zhang H. H. Olsson

1 December 2020

Machine Learning (ML) and Artificial Intelligence (AI) have increasingly gained attention in research and industry. Federated Learning, as an approach to distributed learning, shows its potential with the increasing number of devices on the edge and the development of computing power. However, most of the current Federated Learning systems apply a single-server centralized architecture, which may cause several critical problems, such as the single-point of failure as well as scaling and performance problems. In this paper, we propose and compare four architecture alternatives for a Federated Learning system, i.e. centralized, hierarchical, regional and decentralized architectures. We conduct the study by using two well-known data sets and measuring several system performance metrics for all four alternatives. Our results suggest scenarios and use cases which are suitable for each alternative. In addition, we investigate the trade-off between communication latency, model evolution time and the model classification performance, which is crucial to applying the results into real-world industrial systems.

FedML: A Research Library and Benchmark for Federated Machine Learning

R. Raskar Praneeth Vepakomma Yan Kang + 13 others

27 July 2020

Federated learning is a rapidly growing research field in the machine learning domain. Although considerable research efforts have been made, existing libraries cannot adequately support diverse algorithmic development (e.g., diverse topology and flexible message exchange), and inconsistent dataset and model usage in experiments make fair comparisons difficult. In this work, we introduce FedML, an open research library and benchmark that facilitates the development of new federated learning algorithms and fair performance comparisons. FedML supports three computing paradigms (distributed training, mobile on-device training, and standalone simulation) for users to conduct experiments in different system environments. FedML also promotes diverse algorithmic research with flexible and generic API design and reference baseline implementations. A curated and comprehensive benchmark dataset for the non-I.I.D setting aims at making a fair comparison. We believe FedML can provide an efficient and reproducible means of developing and evaluating algorithms for the federated learning research community. We maintain the source code, documents, and user community at this https URL.

Federated Learning for Internet of Things: Recent Advances, Taxonomy, and Open Challenges

L. U. Khan E. Hossain Zhu Han + 2 others

28 September 2020

The Internet of Things (IoT) will be ripe for the deployment of novel machine learning algorithms for both network and application management. However, given the presence of massively distributed and private datasets, it is challenging to use classical centralized learning algorithms in the IoT. To overcome this challenge, federated learning can be a promising solution that enables on-device machine learning without the need to migrate the private end-user data to a central cloud. In federated learning, only learning model updates are transferred between end-devices and the aggregation server. Although federated learning can offer better privacy preservation than centralized machine learning, it still has privacy concerns. In this paper, first, we present the recent advances of federated learning towards enabling federated learning-powered IoT applications. A set of metrics, such as sparsification, robustness, quantization, scalability, security, and privacy, is delineated in order to rigorously evaluate the recent advances. Second, we devise a taxonomy for federated learning over IoT networks. Finally, we present several open research challenges with their possible solutions.

Hybrid Learning: When Centralized Learning Meets Federated Learning in the Mobile Edge Computing Systems

Siye Wang Zhongyuan Zhao Howard H. Yang + 2 others

1 December 2023

Federated learning is a new artificial intelligence technology with which an edge server can orchestrate with multiple end users to train a global model collaboratively. Under this setting, users only upload the locally trained parameters instead of their local data, substantially reducing communication costs and boosting data privacy. Nonetheless, federated learning mainly relies on users’ local training, overlooking the abundant computing resources owned by the edge server. To exploit the edge server’s processing power, we propose a hybrid learning paradigm that consists of centralized and federated learning components. This scheme uploads a portion of users’ data for centralized learning when the local model is trained under federated learning. We derive a theoretical upper bound for the model accuracy, which can be used to assess the performance of the proposed new learning paradigm. To balance the computation and communication resources for a good model accuracy performance, we establish a joint optimization problem of model accuracy, latency, and energy consumption. We also devise the corresponding joint optimization algorithm to solve the problem. Experiment results show that compared with centralized and federated learning, the proposed hybrid learning algorithm can effectively improve the model accuracy and significantly reduce computation and communication resources.

A survey on federated learning in data mining

Yihan Lv Chen Zhang Bin Yu + 2 others

9 December 2021

Data mining is a process to extract unknown, hidden, and potentially useful information from data. But the problem of data islands makes it arduous for people to collect and analyze scattered data, and there is also a privacy security issue when mining data. A collaborative, decentralized approach called federated learning unites multiple participants to generate a shareable global optimal model and keeps privacy‐sensitive data on local devices, which offers a promising way to solve the problems of decentralized data and privacy protection. Though federated learning has been widely used, few systematic studies have been conducted on the subject of federated learning in data mining. Hence, different from prior reviews in this field, we make a comprehensive summary and provide a novel taxonomy of the application of federated learning in data mining. This article starts by providing a thorough description of the relevant definitions and concepts, followed by an in‐depth investigation of the challenges faced by federated learning. In this context, we elaborate four taxonomies of major applications of federated learning in data mining, including education, healthcare, IoT, and intelligent transportation, and discuss them comprehensively. Finally, we discuss four promising research directions for further research, that is, privacy enhancement, improvement of communication efficiency, heterogeneous system processing, and reducing economic costs.

Citing Articles

2 citations

Blockchain for federated learning toward secure distributed machine learning systems: a systemic survey

Dezhi Han Kuan Ching Li + 6 others

20 November 2021

Federated learning (FL) is a promising decentralized deep learning technology, which allows users to update models cooperatively without sharing their data. FL is reshaping existing industry paradigms for mathematical modeling and analysis, enabling an increasing number of industries to build privacy-preserving, secure distributed machine learning models. However, the inherent characteristics of FL have led to problems such as privacy protection, communication cost, systems heterogeneity, and unreliable model uploads in actual operation. Interestingly, the integration with Blockchain technology provides an opportunity to further improve the FL security and performance, besides increasing its scope of applications. Therefore, we denote this integration of Blockchain and FL as the Blockchain-based federated learning (BCFL) framework. This paper introduces an in-depth survey of BCFL and discusses the insights of such a new paradigm. In particular, we first briefly introduce the FL technology and discuss the challenges faced by such technology. Then, we summarize the Blockchain ecosystem. Next, we highlight the structural design and platform of BCFL. Furthermore, we present the attempts at improving FL performance with Blockchain and several combined applications of incentive mechanisms in FL. Finally, we summarize the industrial application scenarios of BCFL.

Hierarchical Federated Learning With Social Context Clustering-Based Participant Selection for Internet of Medical Things Applications

Shohei Shimizu Xiaokang Zhou + 6 others

1 August 2023

The proliferation of embedded and communication technologies made the concept of the Internet of Medical Things (IoMT) a reality. Individuals’ physical and physiological status can be constantly monitored, and large volumes of data can be collected through wearable and mobile devices. However, the silo of individual data brings limitations to existing machine learning approaches to correctly identify a user’s health status. Distributed machine learning paradigms, such as federated learning, offer a potential solution for privacy-preserving knowledge sharing without sending raw personal data. However, federated learning is vulnerable to harmful participants that can degrade the overall model quality by sharing low-quality data. Therefore, it is critical to select suitable participants to ensure the accuracy and efficiency of federated learning. In this article, a unique clustering-based approach is proposed to use social context data for participant selection. Different edge participant groups will be established, and group-specific federated learning will be performed. The models of various edge groups will be further aggregated to strengthen the robustness of the global model. The experimental results demonstrated that through participant selection, clustering-based hierarchical federated learning can achieve better results with fewer participants in two different IoMT applications for ECG and human motion monitoring. This shows the efficacy of the proposed method in improving federated learning performance and efficiency in various IoMT applications.
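
The two-stage aggregation described in this abstract (group-specific federated learning, followed by aggregation of the group models) can be sketched roughly as below. This is not the paper's method: the clustering on social context data is not implemented, the group assignment is simply given as input, and weighting by sample count at both levels is an assumption. The names `weighted_mean` and `hierarchical_aggregate` are hypothetical.

```python
from typing import Dict, List, Tuple

Model = List[float]              # flat parameter vector
Member = Tuple[Model, int]       # (locally trained model, number of local samples)


def weighted_mean(models: List[Model], sizes: List[int]) -> Model:
    """Average parameter vectors, weighting each by its sample count."""
    total = sum(sizes)
    dim = len(models[0])
    return [
        sum(m[i] * n for m, n in zip(models, sizes)) / total
        for i in range(dim)
    ]


def hierarchical_aggregate(groups: Dict[str, List[Member]]) -> Model:
    """Aggregate within each edge group first, then across groups.

    `groups` maps a group id to its members; how participants are assigned
    to groups (e.g. by clustering) is outside the scope of this sketch.
    """
    group_models: List[Model] = []
    group_sizes: List[int] = []
    for members in groups.values():
        models = [model for model, _ in members]
        sizes = [size for _, size in members]
        group_models.append(weighted_mean(models, sizes))  # edge-level model
        group_sizes.append(sum(sizes))
    # Global model: assumed to weight each group by its total sample count.
    return weighted_mean(group_models, group_sizes)


if __name__ == "__main__":
    groups = {
        "group_a": [([1.0], 1), ([3.0], 1)],
        "group_b": [([5.0], 2)],
    }
    print(hierarchical_aggregate(groups))
```

With sample-count weights at both levels, the result equals a flat weighted average over all participants; a hierarchical scheme only departs from that when groups are weighted, filtered or trained differently, which is where participant selection would enter.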