Knee Disease Detection and Severity Classification Using Deep Learning Models: A Vision Transformer Approach

Sakshi M Gunde; Pranoti P Mane

doi:10.70917/ijcisim-2026-2673

Authors

Sakshi M Gunde, Assistant Professor, AISSMS Institute of Information Technology, Pune (Vishwakarma Institute of Technology, Pune)
Pranoti P Mane, HOD, Dept of Computer Engineering, MES's Wadia College of Engineering, Pune

DOI:

https://doi.org/10.70917/ijcisim-2026-2673

Keywords:

Knee Osteoarthritis, Vision Transformer, Deep Learning, Kellgren-Lawrence Grading, Medical Image Analysis, Transfer Learning

Abstract

Knee diseases, particularly Knee Osteoarthritis (KOA), are among the most prevalent musculoskeletal disorders worldwide, affecting hundreds of millions of individuals and imposing a substantial burden on healthcare systems. Early and accurate diagnosis is essential for preventing disease progression and enabling timely therapeutic intervention. Conventional diagnosis relies on radiologist interpretation of plain radiographs, which is inherently subjective, time-consuming, and susceptible to inter-observer variability. This motivates robust automated systems capable of consistent, reproducible, and clinically meaningful assessments. The automated analysis of knee X-ray images presents several challenges: significant class imbalance across Kellgren-Lawrence (KL) grading categories, subtle radiographic distinctions between adjacent severity grades, heterogeneity in imaging acquisition protocols, and the limited availability of large-scale annotated clinical datasets. Furthermore, existing deep learning models exhibit limitations in interpretability, which constrains their translational utility in clinical practice. This research proposes a novel deep learning framework integrating the Vision Transformer (ViT) architecture with transfer learning and ensemble strategies for automated KOA detection and severity grading. The methodology encompasses comprehensive image preprocessing, augmentation pipelines, patch-based spatial tokenization, multi-head self-attention mechanisms, and a multi-class classification head calibrated for five-grade KL severity assessment.
The system is trained and validated on the publicly available Knee Osteoarthritis Dataset (KOA Dataset) sourced from Kaggle, comprising radiographic images representative of all KL grades. Three core algorithmic strategies are employed: a Vision Transformer (ViT) for global feature extraction via self-attention, Adaptive Learning Rate Scheduling for optimization stability, and a Focal Loss mechanism for addressing class imbalance. Mathematical formulations are rigorously derived for each algorithmic component. The proposed model achieves a peak validation accuracy of 95.2%, a macro-averaged F1-score of 95.0%, and AUC values exceeding 0.94 across all five KL grades, significantly outperforming ResNet-50, VGG-16, EfficientNet, and baseline ViT configurations. These results demonstrate the superior capacity of the proposed framework to capture fine-grained spatial features critical for reliable KOA severity stratification. The proposed ViT-based deep learning system offers a clinically viable, scalable, and highly accurate solution for automated knee disease diagnosis. It constitutes a significant contribution toward AI-assisted orthopedic radiology, with potential for direct deployment in clinical decision support systems and telemedicine platforms.
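
The abstract names Focal Loss as the mechanism for handling class imbalance but does not reproduce its formula or hyperparameters. As a rough illustration only, the sketch below assumes the commonly used form FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t); the function names, the value gamma = 2.0, and the example probability vectors are hypothetical and are not taken from the paper.

```python
import math

EPS = 1e-12  # guards against log(0); an assumption of this sketch


def cross_entropy(probs, target):
    """Standard cross-entropy for one sample: -log(p_t)."""
    return -math.log(max(probs[target], EPS))


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (here, five KL grades 0-4)
    target : index of the true class
    gamma  : focusing parameter (2.0 is illustrative, not the paper's value)
    alpha  : optional per-class weights; None means no class weighting
    """
    p_t = max(probs[target], EPS)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical predictions for two samples whose true grade is KL 0.
easy = [0.90, 0.04, 0.03, 0.02, 0.01]  # confidently correct
hard = [0.30, 0.40, 0.15, 0.10, 0.05]  # model favors the adjacent grade

for name, probs in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(probs, 0)
    fl = focal_loss(probs, 0)
    print(f"{name}: CE={ce:.4f}  FL={fl:.4f}  FL/CE={fl / ce:.2f}")
```

Under these assumptions the modulating factor (1 - p_t)^gamma shrinks the loss of the confidently classified sample far more than that of the ambiguous one, so training signal concentrates on hard cases such as confusions between adjacent KL grades. Setting gamma to 0 recovers ordinary cross-entropy.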
