Graph Neural Networks for Immune Gene–Disease Association Discovery Using Long-Read Sequencing and Population Genomics Data
Keywords:
Graph Neural Networks, Immune Gene Variation, Long-Read Sequencing, Population Genomics, Disease Association, Scalable Frameworks, Ethical Governance
Abstract
The growing availability of long-read sequencing data and large-scale population genomics resources presents unprecedented opportunities for characterizing the highly polymorphic immune gene regions that underlie susceptibility to infectious and autoimmune diseases. However, the complexity of immune gene families, including the major histocompatibility complex and killer-cell immunoglobulin-like receptors, demands computational frameworks capable of integrating heterogeneous genomic signals through structured relational learning. This paper proposes a graph neural network approach for immune gene–disease association discovery that leverages long-read sequencing calls and population-level variation data within a unified graph representation. We discuss the architectural trade-offs between inductive and transductive learning paradigms, the scalability of message-passing schemes over genome-scale interaction graphs, and the integration of multi-omic layers such as expression quantitative trait loci and epigenetic marks. A central emphasis is placed on the system-level design choices that govern model robustness, including graph construction from phased haplotypes, handling of missing data in rare alleles, and the incorporation of clinical covariates to reduce confounding. Infrastructure considerations for deploying such models across distributed computing environments are examined, along with strategies for ensuring fairness when training data are drawn from ancestrally diverse cohorts. The paper also addresses policy and governance challenges related to data privacy, consent for long-read repositories, and the ethical deployment of predictive models in clinical decision support. By situating graph neural networks within the broader socio-technical infrastructure of immune genomics, we aim to provide a roadmap for future research that balances analytical power with responsible translation.
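To make the message-passing idea in the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy undirected graph in which two gene nodes connect to one disease node; the node names (`gene_A`, `gene_B`, `disease_X`), the edges, and the two-dimensional feature vectors are all hypothetical and do not represent real associations. One round of mean aggregation replaces each node's vector with the average of its own vector and those of its neighbours.

```python
# Toy sketch of one message-passing round with mean aggregation.
# All node names, edges, and feature values are hypothetical placeholders.

def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def message_pass(features, adj):
    """Average each node's own vector with its neighbours' vectors."""
    updated = {}
    for node, vec in features.items():
        # Include the node itself, which acts like a self-loop.
        group = [vec] + [features[n] for n in sorted(adj.get(node, ()))]
        updated[node] = [
            sum(g[i] for g in group) / len(group) for i in range(len(vec))
        ]
    return updated


# Two gene nodes linked to one disease node (illustrative only).
edges = [("gene_A", "disease_X"), ("gene_B", "disease_X")]
features = {
    "gene_A": [1.0, 0.0],
    "gene_B": [0.0, 1.0],
    "disease_X": [0.0, 0.0],
}

adj = build_adjacency(edges)
hidden = message_pass(features, adj)
```

After one round, the disease node's vector mixes signal from both gene nodes, while each gene node is pulled toward the disease node. A practical model would add learned weight matrices and nonlinearities, and would stack several such rounds; those choices are left open here.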
License
Copyright (c) 2026 International Journal of Clinical and Translational Medicine

This work is licensed under a Creative Commons Attribution 4.0 International License.



