Risk and Advantages of Federated Learning for Health Care Data Collaboration
Publication: ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering
Volume 6, Issue 3
Abstract
This paper explores the problem of data collaboration in health care, which is one of the critical infrastructure sectors designated by the Department of Homeland Security. Limitations on data sharing in health care obstruct the development of a new generation of medical technology powered by artificial intelligence (AI). Collaborative machine learning helps to overcome these limitations by training models on distributed data sets without sharing the data. Among the approaches to collaborative machine learning, federated learning has demonstrated multiple advantages in recent years. However, it was developed and tested in a highly distributed data environment, which differs from the typical cases of health care data collaboration. The objective of this paper is to validate the known advantages of federated learning and to assess possible risks in a small multiparty setting. The experiments show that federated learning can be successfully applied in a multiparty collaboration setting. However, with a small number of parties, it becomes easier to overfit to each local data set, so the averaging steps have to occur more frequently. In addition, for the first time, the risks of a membership inference attack were assessed for different methods of collaborative machine learning.
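The averaging step mentioned in the abstract can be illustrated with a minimal sketch of federated averaging for a small number of parties. This is not the authors' implementation (their code is in the repository listed under Data Availability Statement); the one-parameter linear model, the synthetic data, and the names `local_sgd`, `federated_average`, and `make_party` are all assumptions made for illustration.

```python
import random

def local_sgd(w, data, lr, epochs):
    """Run `epochs` passes of SGD on one party's private data for y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # gradient of the squared error
            w -= lr * grad
    return w

def federated_average(parties, rounds, local_epochs, lr=0.05, w0=0.0):
    """Alternate local training with size-weighted averaging of the weights.

    Only the model weight leaves each party; the raw (x, y) pairs do not.
    """
    w = w0
    total = sum(len(d) for d in parties)
    for _ in range(rounds):
        local = [local_sgd(w, d, lr, local_epochs) for d in parties]
        w = sum(len(d) * wi for d, wi in zip(parties, local)) / total
    return w

def make_party(slope, n, rng):
    """Hypothetical synthetic data set: y = slope * x plus Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x + rng.gauss(0.0, 0.1)) for x in xs]

rng = random.Random(0)
# Three parties whose local data differ slightly (slopes 1.8, 2.0, 2.2).
parties = [make_party(s, 50, rng) for s in (1.8, 2.0, 2.2)]
w = federated_average(parties, rounds=20, local_epochs=1)
print(round(w, 2))
```

In this sketch, `local_epochs` controls how far each party's model drifts toward its own data between averaging steps. Lowering it corresponds to averaging more frequently, which is the adjustment the abstract reports as necessary when the number of parties is small.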
Data Availability Statement
All data, models, and code generated or used during the study as well as experiment instances are available online at https://github.com/abogdanova/FL-MIA.
Acknowledgments
This study is funded by NEDO (New Energy and Industrial Technology Development Organization), the funding agency of the Japan Ministry of Economy, Trade and Industry (METI) for US–Japan Collaborative Research and Development of Next Generation AI Technology.
Copyright
©2020 American Society of Civil Engineers.
History
Received: Jul 1, 2019
Accepted: Apr 7, 2020
Published online: Jun 23, 2020
Published in print: Sep 1, 2020
Discussion open until: Nov 23, 2020