Cross-Modal Deep Hashing Framework for Wearable PPG Signal Retrieval with Self-Supervised Semantic Representation Learning

Authors

  • Henri Gaylor, Department of Computer Science, University of New Hampshire, Durham, NH, USA.
  • Gavide Tmith, Department of Computer Science, University of Houston, Houston, TX, USA.
  • Hirant Weshington, Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA.
  • Bereld Wegner, Department of Computer Science, University of Central Florida, Orlando, FL, USA.

Keywords:

deep hashing, photoplethysmography, self-supervised learning, cross-modal retrieval, wearable computing, semantic representation

Abstract

The proliferation of wearable photoplethysmography sensors has generated vast streams of cardiovascular data, creating an urgent need for efficient, semantic-aware retrieval mechanisms that can operate across heterogeneous contextual modalities. This paper presents a cross-modal deep hashing framework designed for PPG signal retrieval that integrates self-supervised semantic representation learning to extract robust, modality-invariant features from unlabeled physiological time series and associated metadata. The framework maps PPG segments and their corresponding semantic descriptors into a shared binary Hamming space, enabling fast approximate nearest neighbor search while preserving clinically meaningful similarities. A comprehensive system-level analysis is conducted, addressing architectural choices that balance quantization error against retrieval precision, the integration of contrastive and masked reconstruction objectives for representation learning, and the trade-offs inherent in deploying such models on resource-constrained wearable edge devices. The discussion extends to governance and policy considerations, including data privacy, fairness across demographic groups, and the sustainability of large-scale health retrieval infrastructures. By emphasizing structural robustness, bias mitigation, and cross-modal alignment, the proposed framework offers a principled pathway toward scalable, privacy-preserving, and equitable health monitoring systems. The paper concludes with an examination of deployment scenarios, evaluation benchmarks, and future directions for cross-modal biosignal retrieval in real-world healthcare ecosystems.
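
As a rough illustration of the retrieval step described in the abstract, the sketch below binarizes real-valued embeddings into hash codes and ranks stored PPG segments by Hamming distance to a semantic-descriptor query. It is a minimal sketch under stated assumptions, not the paper's implementation: the encoders are replaced by hand-written toy embeddings, sign thresholding is assumed as the binarization rule, and every name (binarize, quantization_error, hamming, rank_by_hamming) is hypothetical.

```python
def binarize(vec):
    """Sign-threshold a real-valued embedding into a tuple of 0/1 bits (assumed rule)."""
    return tuple(1 if x >= 0 else 0 for x in vec)


def quantization_error(vec):
    """Squared distance between an embedding and its nearest {-1, +1} code."""
    return sum((x - (1.0 if x >= 0 else -1.0)) ** 2 for x in vec)


def hamming(a, b):
    """Number of bit positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered from nearest to farthest in Hamming space."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )


# Toy stand-ins for encoder outputs; the values are invented for illustration.
ppg_embeddings = [
    [0.9, -0.8, 0.7, -0.6],
    [-0.9, 0.8, -0.7, 0.6],
    [0.9, -0.8, -0.1, -0.6],
]
descriptor_embedding = [0.8, -0.9, 0.6, -0.7]

# Both modalities are mapped into the same 4-bit Hamming space.
ppg_codes = [binarize(v) for v in ppg_embeddings]
query_code = binarize(descriptor_embedding)

# Cross-modal retrieval: a descriptor query ranks the stored PPG segments.
ranking = rank_by_hamming(query_code, ppg_codes)
print(ranking)
print([round(quantization_error(v), 2) for v in ppg_embeddings])
```

Embeddings whose components sit close to -1 or +1 lose little when binarized, whereas a component near zero (the third value of the last toy segment) contributes most of that segment's quantization error. This is the kind of quantization-versus-precision trade-off the abstract refers to.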

Published

2026-06-12

How to Cite

Henri Gaylor, Gavide Tmith, Hirant Weshington, & Bereld Wegner. (2026). Cross-Modal Deep Hashing Framework for Wearable PPG Signal Retrieval with Self-Supervised Semantic Representation Learning. International Journal of Clinical and Translational Medicine, 1(1). Retrieved from https://ijctmed.org/index.php/home/article/view/148