Comparative Evaluation of MFCC and Mel-spectrogram Features for CNN-Based Respiratory Abnormality Detection

Automated respiratory sound analysis addresses critical limitations of traditional
clinical auscultation, particularly high inter-observer variability and limited
specialist access in resource-constrained settings. This study compares
Mel-Frequency Cepstral Coefficient (MFCC) and Mel-spectrogram representations for
classifying respiratory abnormalities with convolutional neural networks (CNNs).
Using the ICBHI 2017 dataset (920 recordings, 6,898 cycles from 126 patients), we
implemented identical CNN architectures that differ only in their input features.
Class imbalance was addressed with the Synthetic Minority Over-sampling Technique
(SMOTE), applied exclusively to the training data. The MFCC model achieved 83%
accuracy with superior sensitivity for normal sounds (97% recall), while the
Mel-spectrogram model reached 82% accuracy with higher precision (95%). The MFCC
model also showed better crackle detection (76% vs 73% recall) and wheeze precision
(75% vs 71%), which we attribute to enhanced capture of transient spectral
structure through the discrete cosine transform. Both models showed strong
discrimination (AUC > 0.90). MFCC offers computational efficiency advantages for
screening applications, while Mel-spectrograms provide interpretability for
diagnostic contexts. This controlled comparison provides evidence-based guidance
for the design of computer-aided respiratory diagnostic systems, particularly in
resource-limited healthcare environments.
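
To make the relationship between the two input representations concrete, the following is a minimal NumPy sketch, not the pipeline used in this study. It computes a log Mel-spectrogram and then derives MFCCs from it by applying a discrete cosine transform (DCT-II) along the Mel axis and keeping the first few coefficients. All function names and parameter values (4 kHz sample rate, 512-point FFT, hop of 256, 40 Mel bands, 13 coefficients, synthetic test tone) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters whose edges are evenly spaced on the Mel scale."""
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    """Return a (n_mels, n_frames) log Mel-spectrogram (assumed settings)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T      # (n_frames, n_mels)
    return np.log(mel + 1e-10).T                           # small offset avoids log(0)

def dct_ii_matrix(n_out, n_in):
    """First n_out rows of the orthonormal DCT-II basis for length-n_in input."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    basis = np.sqrt(2.0 / n_in) * np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))
    basis[0] /= np.sqrt(2.0)
    return basis

def mfcc_from_log_mel(log_mel, n_mfcc=13):
    """MFCCs are the leading DCT-II coefficients of each log-Mel frame."""
    return dct_ii_matrix(n_mfcc, log_mel.shape[0]) @ log_mel

# Synthetic one-second tone standing in for a respiratory cycle (assumption).
sr = 4000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 400 * t)

log_mel = log_mel_spectrogram(signal, sr)   # Mel-spectrogram-style CNN input
mfcc = mfcc_from_log_mel(log_mel)           # MFCC-style CNN input
print(log_mel.shape, mfcc.shape)
```

Under these assumed settings the MFCC input has 13 rows per frame against 40 for the log Mel-spectrogram, which illustrates where a computational saving of the kind described above can come from; the Mel-spectrogram rows remain directly tied to frequency bands, which is what makes that representation easier to inspect.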