Cross-Modal Deep Hashing Framework for Wearable PPG Signal Retrieval with Self-Supervised Semantic Representation Learning

Henri Gaylor; Gavide Tmith; Hirant Weshington; Bereld Wegner

Authors

Henri Gaylor, Department of Computer Science, University of New Hampshire, Durham, NH, USA.
Gavide Tmith, Department of Computer Science, University of Houston, Houston, TX, USA.
Hirant Weshington, Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA.
Bereld Wegner, Department of Computer Science, University of Central Florida, Orlando, FL, USA.

Keywords:

deep hashing, photoplethysmography, self-supervised learning, cross-modal retrieval, wearable computing, semantic representation

Abstract

The proliferation of wearable photoplethysmography (PPG) sensors has generated vast streams of cardiovascular data, creating a need for efficient, semantics-aware retrieval mechanisms that can operate across heterogeneous contextual modalities. This paper presents a cross-modal deep hashing framework for PPG signal retrieval that integrates self-supervised semantic representation learning to extract robust, modality-invariant features from unlabeled physiological time series and their associated metadata. The framework maps PPG segments and their corresponding semantic descriptors into a shared binary Hamming space, enabling fast approximate nearest neighbor search while preserving clinically meaningful similarities. A system-level analysis addresses the architectural choices that balance quantization error against retrieval precision, the integration of contrastive and masked reconstruction objectives for representation learning, and the trade-offs inherent in deploying such models on resource-constrained wearable edge devices. The discussion extends to governance and policy considerations, including data privacy, fairness across demographic groups, and the sustainability of large-scale health retrieval infrastructures. By emphasizing structural robustness, bias mitigation, and cross-modal alignment, the proposed framework offers a principled pathway toward scalable, privacy-preserving, and equitable health monitoring systems. The paper concludes with an examination of deployment scenarios, evaluation benchmarks, and future directions for cross-modal biosignal retrieval in real-world healthcare ecosystems.
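The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that some pair of trained encoders (not shown) already produces real-valued embeddings for PPG segments and for semantic descriptors, and it substitutes random vectors for those outputs. The function names `binarize`, `hamming_distances`, and `retrieve`, the 32-bit code length, and the sign-threshold quantizer are all hypothetical choices made for illustration.

```python
import numpy as np


def binarize(embeddings):
    """Quantize real-valued embeddings to {0, 1} codes by sign thresholding.

    Assumption: a simple sign threshold stands in for whatever quantizer
    the framework actually learns.
    """
    return (np.asarray(embeddings) > 0).astype(np.uint8)


def hamming_distances(query_code, db_codes):
    """Number of differing bits between one query code and each database code."""
    return np.count_nonzero(db_codes != query_code, axis=1)


def retrieve(query_code, db_codes, k=3):
    """Return indices and distances of the k database codes closest to the query."""
    distances = hamming_distances(query_code, db_codes)
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]


# Hypothetical stand-ins for encoder outputs: 100 PPG segments, 32 dimensions.
rng = np.random.default_rng(0)
ppg_embeddings = rng.standard_normal((100, 32))

# Hypothetical descriptor embedding, assumed to lie near PPG segment 7
# in the shared space after cross-modal alignment.
descriptor_embedding = ppg_embeddings[7] + 0.1 * rng.standard_normal(32)

db_codes = binarize(ppg_embeddings)          # PPG side of the shared Hamming space
query_code = binarize(descriptor_embedding)  # descriptor side, same space

top_idx, top_dist = retrieve(query_code, db_codes, k=3)
print("nearest PPG segments:", top_idx, "Hamming distances:", top_dist)
```

Because both modalities are quantized into the same code space, a query from either side can be compared against a database from the other using only bit comparisons, which is what makes the search cheap. A production system would more likely pack the bits into machine words and use popcount instructions rather than compare byte arrays.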

