Federated Multimodal Learning for Predicting HIV Care Retention and Viral Suppression: Integrating EHR Phenotypes, Social Determinants of Health, and Explainable AI

Petri D. Jones; Anand Brivastava; Xavier Howard

Authors

Petri D. Jones, Department of Computer Science, University of Alabama at Birmingham, Birmingham, AL, USA.
Anand Brivastava, Department of Computer Science, Binghamton University, Binghamton, NY, USA.
Xavier Howard, Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA.

Keywords

federated learning, multimodal, HIV care retention, viral suppression, social determinants of health, explainable AI, EHR phenotyping, health equity, differential privacy, model governance

Abstract

The HIV care continuum remains marked by persistent disparities in retention and viral suppression, particularly among marginalized populations where structural barriers intersect with clinical phenotypes. Existing predictive models often rely on centralized electronic health record (EHR) data, which raises privacy concerns; they also fail to capture social determinants of health at scale and lack the transparency needed for clinical adoption. This paper proposes a federated multimodal learning framework that integrates structured EHR phenotypes with geocoded and survey-based social determinants of health while embedding explainable artificial intelligence techniques to ensure model interpretability. We examine the architectural trade-offs inherent in federated learning for heterogeneous health data sources, including communication efficiency, non-IID data distributions, and differential privacy budgets. The framework further incorporates fairness-aware aggregation to mitigate biases that could propagate health inequities. We discuss infrastructure requirements for deployment across safety-net clinics and public health agencies, emphasizing sustainability through continuous learning and model governance. The integration of explainability methods such as feature attribution and counterfactual reasoning enables clinicians and policymakers to interrogate predictions and intervene appropriately. Through a comparative analysis of centralized, federated, and hybrid architectures, we demonstrate that federated multimodal learning can achieve predictive performance comparable to centralized training while preserving data sovereignty and providing actionable insights. Policy implications for data sharing, consent models, and regulatory oversight are considered. This work contributes a systems-level design for ethically responsible, privacy-preserving, and interpretable machine learning in HIV care.
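The aggregation step at the core of a federated design like the one described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each clinic sends a model update as a NumPy vector, that the server weights clinics by local sample count (as in federated averaging), and that updates are norm-clipped with optional Gaussian noise in the spirit of differentially private training. The names `clip_update` and `federated_average`, the clinic values, and all parameters are hypothetical, and no privacy accounting or fairness-aware weighting is performed.

```python
import numpy as np


def clip_update(update, max_norm):
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = np.linalg.norm(update)
    if norm <= max_norm or norm == 0.0:
        return update
    return update * (max_norm / norm)


def federated_average(updates, n_samples, max_norm=1.0, noise_std=0.0, rng=None):
    """Sample-size-weighted mean of clipped client updates.

    Gaussian noise is added only when noise_std > 0. The noise scale here is
    an illustrative assumption and is not calibrated to a privacy budget.
    """
    rng = rng if rng is not None else np.random.default_rng()
    clipped = np.stack(
        [clip_update(np.asarray(u, dtype=float), max_norm) for u in updates]
    )
    weights = np.asarray(n_samples, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    aggregate = weights @ clipped      # weighted average across clients
    if noise_std > 0.0:
        aggregate = aggregate + rng.normal(0.0, noise_std, size=aggregate.shape)
    return aggregate


# Three hypothetical clinics with different cohort sizes.
clinic_updates = [
    np.array([3.0, 4.0]),   # norm 5.0, will be clipped to norm 1.0
    np.array([0.3, 0.4]),   # norm 0.5, left unchanged
    np.array([0.0, -0.5]),  # norm 0.5, left unchanged
]
clinic_sizes = [100, 300, 600]

global_update = federated_average(clinic_updates, clinic_sizes, max_norm=1.0)
print(global_update)
```

Clipping bounds the influence any single site can have on the global model, which is also what makes added noise meaningful; only aggregated updates, never patient-level records, leave a site in this sketch.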

