Diabetic Retinopathy Classification Using Swin Transformer with Multi-Wavelet
Keywords: Diabetic retinopathy, Swin Transformer, multi-Wavelet, APTOS 2019, Vision Transformers.
Diabetic retinopathy (DR) affects more than a third of people diagnosed with diabetes and is the leading cause of vision loss among working-age adults worldwide. Early detection and treatment of DR are therefore crucial for minimizing vision loss. This paper proposes a technique that combines Wavelet and multi-Wavelet transforms with a Swin Transformer to automatically identify the progression level of diabetic retinopathy. A key contribution of this study is the use of the multi-Wavelet transform to extract relevant features: the transformed images are fed into the Swin Transformer model, introducing a distinctive step in the feature-extraction stage. Experiments were conducted on the publicly available Kaggle APTOS 2019 dataset, which comprises 3662 images. For binary classification, the model achieved a training accuracy of 97.78% and a test accuracy of 97.54%, with a peak training accuracy of 98.09%. For multiclass classification, the multi-Wavelet approach yielded training and validation accuracies of 91.60% and 82.42%, respectively, with a test accuracy of 82%. These results indicate that the multi-Wavelet approach outperforms the alternative methods examined in the study. While the model performed strongly on binary classification, its accuracy dropped in the multiclass setting, highlighting the need for further investigation and refinement to handle more diverse classification scenarios.
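The pipeline described above decomposes each fundus image with a wavelet transform and passes the resulting subband images to a Swin Transformer for classification. The sketch below illustrates only the subband-extraction step, using a single-level 2-D Haar transform implemented in NumPy; the paper's actual multi-Wavelet filters, preprocessing, and Swin configuration are not given in the abstract, so the function names, the Haar basis, and the channel-stacking convention here are all assumptions for illustration.

```python
import numpy as np

def haar_dwt2(img: np.ndarray):
    """Single-level 2-D Haar wavelet decomposition.

    Returns the (LL, LH, HL, HH) subbands, each half the input
    size in both dimensions. Height and width must be even.
    """
    # Row transform: average and difference of adjacent column pairs
    lo = (img[:, ::2] + img[:, 1::2]) / 2.0
    hi = (img[:, ::2] - img[:, 1::2]) / 2.0
    # Column transform applied to both row-transform outputs
    ll = (lo[::2, :] + lo[1::2, :]) / 2.0   # approximation
    lh = (lo[::2, :] - lo[1::2, :]) / 2.0   # horizontal detail
    hl = (hi[::2, :] + hi[1::2, :]) / 2.0   # vertical detail
    hh = (hi[::2, :] - hi[1::2, :]) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def wavelet_features(img: np.ndarray) -> np.ndarray:
    """Stack the four subbands as channels — one plausible form in
    which wavelet-domain images could be fed to a Swin backbone."""
    return np.stack(haar_dwt2(img), axis=0)  # shape (4, H/2, W/2)

# Toy grayscale "fundus image" at a Swin-friendly resolution
img = np.random.rand(224, 224)
feats = wavelet_features(img)
print(feats.shape)  # (4, 112, 112)
```

In practice the subband tensor would replace (or augment) the RGB channels of the input to a pretrained Swin Transformer, e.g. via a library such as `timm`, with the patch-embedding layer adapted to the new channel count.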
Copyright (c) 2023 Rasha Ali Dihin, Ebtesam AlShemmary, Waleed Al-Jawher
This work is licensed under a Creative Commons Attribution 4.0 International License.