Multimodal deep learning based on the combination of EfficientNetV2 and ViT for Alzheimer’s disease early diagnosis enhanced by SAGAN data augmentation
Keywords:
Alzheimer's early diagnosis, CNN, Transformer, Vision transformer, Self-attention
Abstract
The digitalization of health data and innovative eHealth technologies have created a paradigm shift from traditional medicine toward predictive, individualized medicine built on patient-centric approaches. The emerging fields of predictive and precision medicine treat disease based on patient characteristics such as lifestyle, genetic profile, and environment. Early detection of Alzheimer's disease (AD) remains a challenging task. Researchers adopt advanced imaging techniques such as magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET) to gain a deeper understanding of AD. Extracting insights from these data is the key step toward early prediction of the disease and prevention of its progression. Recently, deep learning methods have shown unparalleled success and made significant headway in brain disease detection. Convolutional neural networks (CNNs) have achieved state-of-the-art performance on AD detection and early diagnosis; however, their application has several limitations. More recent architectures such as transformers enable efficient image recognition and feature extraction with lower complexity. In this study, we investigate and evaluate different CNN and transformer models for early diagnosis of AD. Further, we introduce a multimodal method for AD detection based on the MRI and PET modalities, combining EfficientNetV2 and the Vision Transformer (ViT) and enhanced by a new data augmentation scheme based on self-attention generative adversarial networks (SAGAN). We validated the proposed method on the Alzheimer's Disease Neuroimaging Initiative (ADNI) and Open Access Series of Imaging Studies (OASIS) datasets. The proposed method combines the main advantages of ViT and EfficientNetV2, achieving 96% accuracy. It outperforms various CNN and transformer baselines and ensures robust feature extraction and representation.
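As a concrete illustration of the fusion the abstract describes, the following is a minimal PyTorch sketch, not the authors' implementation: an EfficientNetV2 backbone encodes the MRI input, a ViT backbone encodes the PET input, and the pooled features of the two branches are concatenated and classified. The backbone names come from the timm library; the class name MultimodalADNet, the fusion head, and all hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch of a two-branch multimodal classifier:
# EfficientNetV2 on the MRI slice, ViT on the PET slice,
# late fusion by feature concatenation. Requires: torch, timm.
import torch
import torch.nn as nn
import timm


class MultimodalADNet(nn.Module):  # hypothetical name, not from the paper
    def __init__(self, num_classes: int = 2, pretrained: bool = False):
        super().__init__()
        # num_classes=0 makes timm models return pooled features, not logits
        self.mri_branch = timm.create_model(
            "tf_efficientnetv2_s", pretrained=pretrained, num_classes=0)
        self.pet_branch = timm.create_model(
            "vit_base_patch16_224", pretrained=pretrained, num_classes=0)
        fused_dim = self.mri_branch.num_features + self.pet_branch.num_features
        self.classifier = nn.Sequential(
            nn.LayerNorm(fused_dim),
            nn.Dropout(0.3),
            nn.Linear(fused_dim, num_classes),
        )

    def forward(self, mri: torch.Tensor, pet: torch.Tensor) -> torch.Tensor:
        f_mri = self.mri_branch(mri)   # (B, 1280) pooled CNN features
        f_pet = self.pet_branch(pet)   # (B, 768) pooled ViT features
        return self.classifier(torch.cat([f_mri, f_pet], dim=1))


if __name__ == "__main__":
    model = MultimodalADNet(num_classes=2)
    mri = torch.randn(1, 3, 224, 224)  # MRI slice replicated to 3 channels
    pet = torch.randn(1, 3, 224, 224)  # PET slice replicated to 3 channels
    print(model(mri, pet).shape)       # torch.Size([1, 2])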
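```

The SAGAN augmentation mentioned in the abstract builds on the self-attention mechanism of Zhang et al. (2019), which lets a GAN generator capture long-range dependencies across the image rather than only local convolutional context. Below is a minimal sketch of that self-attention block as it is commonly implemented; the channel reduction factor of 8 and the learned gamma gate follow the original SAGAN paper, while the class name and tensor shapes here are illustrative.

```python
# Minimal SAGAN-style self-attention block (after Zhang et al., 2019).
# Query/key channels are reduced by 8x; gamma starts at 0 so the
# attention output is blended into the residual path gradually.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SAGANSelfAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned blend weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, C//8)
        k = self.key(x).flatten(2)                    # (B, C//8, HW)
        attn = F.softmax(q @ k, dim=-1)               # (B, HW, HW)
        v = self.value(x).flatten(2)                  # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                   # residual connection


if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)  # a feature map inside a GAN generator
    print(SAGANSelfAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```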
License
Copyright (c) 2023 International Journal of Computer Information Systems and Industrial Management Applications

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.