Technical Report · UIC Spring 2026

Responsible AI for Scientific Discovery: Evaluating Explainability Methods for Galaxy Morphology Classification across Multiple Architectures and Datasets

Gargi Sathe, Archit Rathod
abstract

Deep learning classifiers are now standard tools for automated galaxy morphology classification at the scale demanded by next-generation astronomical surveys. Their black-box nature, however, undermines scientific trust. We present a comparative study of four post-hoc explainability methods — Grad-CAM, LIME, Integrated Gradients, and GradientSHAP — applied to four convolutional architectures (ResNet-18, VGG-16, EfficientNet-B0, and a lightweight Custom CNN) on the Galaxy10 DECaLS dataset, with a secondary case study on the Galaxy Zoo Evo tiny subset. The secondary GZ Evo experiment surfaces a key lesson: under strict vote-fraction filtering and small-sample conditions, XAI evaluation can become unstable in ways that practitioners must report honestly. Together the experiments argue that XAI faithfulness rankings depend on both architecture and dataset conditions, and that no single explainer dominates universally.

keywords
Galaxy Morphology, Explainability, XAI, Grad-CAM, LIME, Integrated Gradients, GradientSHAP, CNN, ResNet, Astronomy, Rubin LSST