TY - GEN
T1 - POPAR: Patch Order Prediction and Appearance Recovery for Self-supervised Medical Image Analysis
T2 - 4th MICCAI Workshop on Domain Adaptation and Representation Transfer, DART 2022, held in conjunction with the 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2022
AU - Pang, Jiaxuan
AU - Haghighi, Fatemeh
AU - Ma, Dong Ao
AU - Islam, Nahid Ul
AU - Hosseinzadeh Taher, Mohammad Reza
AU - Gotway, Michael B.
AU - Liang, Jianming
N1 - Funding Information:
Acknowledgments. This research has been supported in part by ASU and Mayo Clinic through a Seed Grant and an Innovation Grant, and in part by the NIH under Award Number R01HL128785. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. This work has utilized the GPUs provided in part by the ASU Research Computing and in part by the Extreme Science and Engineering Discovery Environment (XSEDE) funded by the National Science Foundation (NSF) under grant numbers: ACI-1548562, ACI-1928147, and ACI-2005632. The content of this paper is covered by patents pending.
Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Vision transformer-based self-supervised learning (SSL) approaches have recently shown substantial success in learning visual representations from unannotated photographic images. However, their acceptance in medical imaging is still lukewarm, due to the significant discrepancy between medical and photographic images. Consequently, we propose POPAR (patch order prediction and appearance recovery), a novel vision transformer-based self-supervised learning framework for chest X-ray images. POPAR leverages the benefits of vision transformers and unique properties of medical imaging, aiming to simultaneously learn patch-wise high-level contextual features by correcting shuffled patch orders and fine-grained features by recovering patch appearance. We transfer POPAR pretrained models to diverse downstream tasks. The experimental results suggest that (1) POPAR outperforms state-of-the-art (SoTA) self-supervised models with a vision transformer backbone; (2) POPAR achieves significantly better performance than all three SoTA contrastive learning methods; and (3) POPAR also outperforms fully-supervised pretrained models across architectures. In addition, our ablation study suggests that to achieve better performance on medical imaging tasks, both fine-grained and global contextual features are preferred. All code and models are available at GitHub.com/JLiangLab/POPAR.
AB - Vision transformer-based self-supervised learning (SSL) approaches have recently shown substantial success in learning visual representations from unannotated photographic images. However, their acceptance in medical imaging is still lukewarm, due to the significant discrepancy between medical and photographic images. Consequently, we propose POPAR (patch order prediction and appearance recovery), a novel vision transformer-based self-supervised learning framework for chest X-ray images. POPAR leverages the benefits of vision transformers and unique properties of medical imaging, aiming to simultaneously learn patch-wise high-level contextual features by correcting shuffled patch orders and fine-grained features by recovering patch appearance. We transfer POPAR pretrained models to diverse downstream tasks. The experimental results suggest that (1) POPAR outperforms state-of-the-art (SoTA) self-supervised models with a vision transformer backbone; (2) POPAR achieves significantly better performance than all three SoTA contrastive learning methods; and (3) POPAR also outperforms fully-supervised pretrained models across architectures. In addition, our ablation study suggests that to achieve better performance on medical imaging tasks, both fine-grained and global contextual features are preferred. All code and models are available at GitHub.com/JLiangLab/POPAR.
KW - Medical image analysis
KW - Self-supervised learning
KW - Transfer learning
KW - Vision transformer
UR - http://www.scopus.com/inward/record.url?scp=85140446751&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85140446751&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-16852-9_8
DO - 10.1007/978-3-031-16852-9_8
M3 - Conference contribution
AN - SCOPUS:85140446751
SN - 9783031168512
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 77
EP - 87
BT - Domain Adaptation and Representation Transfer - 4th MICCAI Workshop, DART 2022, Held in Conjunction with MICCAI 2022, Proceedings
A2 - Kamnitsas, Konstantinos
A2 - Koch, Lisa
A2 - Islam, Mobarakol
A2 - Xu, Ziyue
A2 - Cardoso, Jorge
A2 - Dou, Qi
A2 - Rieke, Nicola
A2 - Tsaftaris, Sotirios
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 22 September 2022 through 22 September 2022
ER -