Protein Mutation Effect Prediction via Graph-Based Modeling of Residue pKa Shifts and Local Electrostatic Rewiring

Authors

  • Leon Satkings, Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA.
  • Lummy Lawrence, Department of Computer Science, University of North Texas, Denton, TX, USA.

Keywords

protein mutation effect prediction, pKa shifts, electrostatic networks, graph neural networks, socio-technical systems, biomedical AI governance

Abstract

Predicting the functional and stability consequences of amino acid substitutions remains a central challenge in computational biology, with implications for precision medicine, protein engineering, and the interpretation of genomic variation. Significant progress has come from leveraging evolutionary sequence conservation and global structural features, but the role of localized electrostatic perturbations, particularly those arising from residue-specific pKa shifts, has been comparatively underexplored in large-scale mutation effect prediction systems. This paper presents an interdisciplinary perspective on the design, deployment, and governance of graph-based modeling frameworks that explicitly encode pKa shifts and the resulting electrostatic rewiring at protein interfaces. We articulate a system-level architecture in which proteins are represented as heterogeneous graphs with ionizable residues as charge-carrying nodes and with edge attributes that capture Coulombic coupling and solvent exposure. By integrating physically inspired feature engineering with message-passing neural networks, such a system can capture the propagation of local charge disruptions across the contact network. The discussion extends beyond algorithmic design to infrastructure demands, data provenance, robustness under distribution shift, fairness across protein families and human populations, model interpretability, and the policy frameworks required for clinical translation. We explore trade-offs between model granularity and computational tractability, the sustainability of training pipelines, and the ethical dimensions of embedding biophysical models into decision-support infrastructures. The analysis underscores that the value of such predictive platforms lies not solely in accuracy metrics but in their capacity to yield mechanistically transparent and socially accountable insights for variant interpretation at scale.
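
To make the graph representation described above concrete, the following is a minimal sketch, not the authors' implementation. It treats ionizable residues as nodes whose mean charge follows the Henderson-Hasselbalch relation, attaches a screened Coulomb energy to each edge, and performs one unweighted sum aggregation as a stand-in for a learned message-passing layer. The names `Residue`, `mean_charge`, `build_edges`, and `aggregate`, the 10 Å cutoff, the linear dielectric interpolation between 4 and 80, and the coordinates and pKa values in the demo are all illustrative assumptions that do not come from the paper.

```python
import math
from dataclasses import dataclass, replace

ACIDIC = {"ASP", "GLU"}   # side chains that carry -1 when deprotonated
COULOMB_K = 332.06        # kcal*Angstrom/(mol*e^2), standard conversion factor


@dataclass
class Residue:
    name: str        # three-letter residue code
    pka: float       # pKa of the side chain in its protein environment
    xyz: tuple       # position of the charged group, in Angstrom
    exposure: float  # relative solvent exposure in [0, 1] (assumed feature)


def mean_charge(res, ph):
    """Henderson-Hasselbalch mean charge of one titratable site.

    Residues not listed in ACIDIC are treated as basic (+1 when protonated).
    """
    if res.name in ACIDIC:
        return -1.0 / (1.0 + 10.0 ** (res.pka - ph))
    return 1.0 / (1.0 + 10.0 ** (ph - res.pka))


def build_edges(residues, ph, cutoff=10.0):
    """Return {(i, j): screened Coulomb energy} for pairs within the cutoff."""
    edges = {}
    for i in range(len(residues)):
        for j in range(i + 1, len(residues)):
            a, b = residues[i], residues[j]
            r = math.dist(a.xyz, b.xyz)
            if r > cutoff:
                continue
            # Assumed dielectric: 4 when both sites are buried, 80 when exposed.
            eps = 4.0 + 76.0 * 0.5 * (a.exposure + b.exposure)
            q_a, q_b = mean_charge(a, ph), mean_charge(b, ph)
            edges[(i, j)] = COULOMB_K * q_a * q_b / (eps * r)
    return edges


def aggregate(n_nodes, edges):
    """One round of sum aggregation: each node collects its edge energies."""
    node_energy = [0.0] * n_nodes
    for (i, j), energy in edges.items():
        node_energy[i] += energy
        node_energy[j] += energy
    return node_energy


# Hypothetical three-residue site at pH 7 (coordinates and pKa values invented).
site = [
    Residue("ASP", 3.9, (0.0, 0.0, 0.0), 0.2),
    Residue("LYS", 10.5, (4.0, 0.0, 0.0), 0.2),
    Residue("HIS", 6.0, (0.0, 5.0, 0.0), 0.5),
]
wild_type = aggregate(len(site), build_edges(site, ph=7.0))

# Emulate a mutation elsewhere that shifts the Asp pKa upward by 2.6 units.
shifted_site = [replace(site[0], pka=6.5)] + site[1:]
shifted = aggregate(len(site), build_edges(shifted_site, ph=7.0))

for res, before, after in zip(site, wild_type, shifted):
    print(f"{res.name}: {before:+.3f} -> {after:+.3f} kcal/mol")
```

Comparing the per-node energies before and after the pKa shift gives a crude picture of electrostatic rewiring: raising the Asp pKa toward the working pH reduces its mean negative charge, which weakens its attraction to the neighbouring Lys. A trained message-passing network would replace the fixed sum in `aggregate` with learned update functions and richer node and edge features.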

References

1. Ingraham, J., Garg, V., Barzilay, R., & Jaakkola, T. (2019). Generative models for graph-based protein design. Advances in Neural Information Processing Systems, 32.

2. Riesselman, A. J., Ingraham, J. B., & Marks, D. S. (2018). Deep generative models of genetic variation capture the effects of mutations. Nature Methods, 15(10), 816–822.

3. Søndergaard, C. R., Olsson, M. H. M., Rostkowski, M., & Jensen, J. H. (2011). Improved treatment of ligands and coupling effects in empirical calculation and rationalization of pKa values. Journal of Chemical Theory and Computation, 7(7), 2284–2295.

4. Bas, D. C., Rogers, D. M., & Jensen, J. H. (2008). Very fast prediction and rationalization of pKa values for protein-ligand complexes. Proteins: Structure, Function, and Bioinformatics, 73(3), 765–783.

5. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., ... & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589.

6. Song, Z., Wang, R., Jiao, X., & Huang, Z. (2026). Graph-Based Deep Learning Models for Predicting pKa Values of Protein-Ionizable Residues via Physically Inspired Feature Engineering. Journal of Chemical Information and Modeling.

7. Bashford, D. (2004). Macroscopic electrostatic models for protonation states in proteins. Frontiers in Bioscience, 9, 1082–1099.

8. Baldassarre, F., Menéndez Hurtado, D., Elofsson, A., & Azizpour, H. (2021). GraphQA: protein model quality assessment using graph convolutional networks. Bioinformatics, 37(3), 360–366.

9. Chen, I. Y., Pierson, E., Rose, S., Joshi, S., Ferryman, K., & Ghassemi, M. (2021). Ethical machine learning in healthcare. Annual Review of Biomedical Data Science, 4, 123–144.

10. wwPDB consortium. (2019). Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Research, 47(D1), D520–D526.

11. Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645–3650.

12. Kendall, A., & Gal, Y. (2017). What uncertainties do we need in Bayesian deep learning for computer vision? Advances in Neural Information Processing Systems, 30.

13. Popejoy, A. B., & Fullerton, S. M. (2016). Genomics is failing on diversity. Nature, 538(7624), 161–164.

14. Schuhmacher, A., Gatto, A., Hinder, M., Kuss, M., & Gassmann, O. (2022). The state of artificial intelligence in biopharma 2022. Drug Discovery Today, 27(9), 2522–2529.

15. Kumar, S., & Nussinov, R. (2002). Close-range electrostatic interactions in proteins. ChemBioChem, 3(7), 604–617.

16. Reeb, J., Hecht, M., Mahlich, Y., Bromberg, Y., & Rost, B. (2020). Variant effect predictions incorporate both sequence and structure information. Bioinformatics, 36(12), 3637–3644.

17. Ying, R., Bourgeois, D., You, J., Zitnik, M., & Leskovec, J. (2019). GNNExplainer: Generating explanations for graph neural networks. Advances in Neural Information Processing Systems, 32.

18. World Health Organization. (2021). Ethics and governance of artificial intelligence for health: WHO guidance. World Health Organization.

19. Karpov, P., Godin, G., & Tetko, I. V. (2020). Protein function prediction using graph neural networks and sequence embeddings. In 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 463–468). IEEE.

20. Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454.

Published

2026-05-21

How to Cite

Leon Satkings, & Lummy Lawrence. (2026). Protein Mutation Effect Prediction via Graph-Based Modeling of Residue pKa Shifts and Local Electrostatic Rewiring. International Journal of Clinical and Translational Medicine, 1(1). Retrieved from https://ijctmed.org/index.php/home/article/view/158