Digital Twin-Driven Risk Assessment and Security Optimization of Large Language Model Agents in Personalized Healthcare

Mikkel Bell; Gerald L. Becker; Clifford Perry

Authors

Mikkel Bell, Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, USA.
Gerald L. Becker, Department of Computer Science, University of Central Florida, Orlando, FL, USA.
Clifford Perry, Department of Computer Science and Engineering, University of Nevada, Reno, Reno, NV, USA.

Keywords:

digital twin; large language model agents; risk assessment; security optimization; personalized healthcare; adversarial robustness; system-level governance

Abstract

The integration of large language model agents into personalized healthcare promises transformative improvements in clinical decision support, patient engagement, and treatment customization. However, these autonomous software entities introduce unprecedented risks stemming from adversarial manipulation, data biases, and systemic failures that can compromise patient safety and privacy. This paper presents a novel framework that leverages digital twin technology to systematically assess and optimize the security profile of LLM agents operating in healthcare environments. The digital twin constructs a high-fidelity virtual replica of the agent, its interaction context, and the underlying care ecosystem, enabling continuous simulation of threat scenarios without endangering real patients. Through a dual-loop architecture, risk assessment outputs inform proactive security optimization, while the optimized configurations are fed back into the digital twin for validation. We examine the structural trade-offs between fidelity, computational cost, and real-time responsiveness, and discuss the architectural requirements for integrating digital twin pipelines with clinical data infrastructures. The governance implications of such simulation-driven risk management are analyzed, including questions of regulatory accountability, cross-institutional data sharing, and algorithmic fairness. By situating the discussion at the system level, we avoid narrow technical fixes and instead advocate for a holistic socio-technical approach that treats LLM agent security as an emergent property of continuous, evidence-based co-adaptation between the digital and physical realms. The paper concludes with a forward-looking perspective on how federated digital twin ecosystems could underpin a new class of resilient healthcare AI infrastructures, balancing innovation with rigorous safety guarantees.
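The dual-loop idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `AgentConfig` fields, the `Threat` scenarios, their integer severity weights, and the greedy optimization rule are all hypothetical placeholders chosen only to show how assessment, optimization, and validation in the twin might feed into one another.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical security controls for an LLM agent (illustrative only)."""
    input_filtering: bool = False
    tool_allowlist: bool = False


@dataclass(frozen=True)
class Threat:
    """Hypothetical threat scenario replayed inside the digital twin."""
    name: str
    severity: int      # assumed weight, not taken from the paper
    blocked_by: str    # AgentConfig field assumed to mitigate this threat


THREATS = [
    Threat("indirect_prompt_injection", 6, "input_filtering"),
    Threat("unauthorized_tool_call", 4, "tool_allowlist"),
]


def assess_risk(config, threats):
    """Assessment loop: sum the severity of every threat the config leaves open."""
    return sum(t.severity for t in threats if not getattr(config, t.blocked_by))


def optimize(config, threats):
    """Optimization loop: enable the control for the most severe open threat."""
    open_threats = [t for t in threats if not getattr(config, t.blocked_by)]
    if not open_threats:
        return config
    worst = max(open_threats, key=lambda t: t.severity)
    return replace(config, **{worst.blocked_by: True})


def dual_loop(config, threats, tolerance=0, max_rounds=10):
    """Alternate assessment and optimization, validating each candidate in the twin."""
    history = [assess_risk(config, threats)]
    for _ in range(max_rounds):
        if history[-1] <= tolerance:
            break
        candidate = optimize(config, threats)
        risk = assess_risk(candidate, threats)  # validation step in the twin
        if risk >= history[-1]:
            break  # reject candidates that do not reduce simulated risk
        config = candidate
        history.append(risk)
    return config, history


final_config, risk_history = dual_loop(AgentConfig(), THREATS)
print(final_config, risk_history)
```

In this sketch a candidate configuration is adopted only after the simulated risk drops, which mirrors the feedback of optimized configurations into the twin for validation. A realistic twin would replace the lookup in `assess_risk` with actual simulation of the agent and its care context.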
