ABSTRACT
Atmospheric fog poses critical challenges for computer vision systems in autonomous driving, surveillance, and robotics, where reliable object classification is essential. Under severe fog, classification accuracy can degrade by over 50%, and most existing approaches rely on separate defogging steps that limit real-time applicability. This study introduces the Enhanced Density-Aware Cross-Scale Transformer (EDCST), a novel architecture for direct object classification under foggy conditions without prior defogging. To support comprehensive evaluation, we developed a physics-based simulation framework that generates four fog types (uniform, gradient, patchy, adaptive) across nine intensity levels (0-80% scattering). EDCST uses 384-dimensional embeddings, eight transformer layers, and twelve attention heads, and is trained using curriculum learning with OneCycleLR scheduling. On CODaN-Fog (15,500 images at 224×224 resolution), EDCST achieves 84.4% accuracy on clean images and retains 74.2% accuracy under severe fog (80% intensity), outperforming baseline transformers by 15.8 percentage points. Class-wise sensitivity analysis reveals that larger objects such as vehicles and animals maintain over 75% classification performance, while smaller objects are more affected. Patchy fog causes the greatest accuracy drop (19.1%), followed by adaptive fog (8.9%) and uniform fog (6.8%). The model converges in 100 epochs within 513 minutes on a Tesla V100 GPU. This work introduces a real-time-capable classification framework that eliminates the need for defogging and maintains strong performance under diverse fog conditions, making it well suited to safety-critical vision applications.
Keywords: Object classification, atmospheric fog, robust computer vision, atmospheric scattering, curriculum learning, fog simulation.
