GLEAM Image Learner - Validating Skin Lesion Classification on HAM10000

Published: Jan 28, 2026
Last Updated: Jan 28, 2026

Introduction to GLEAM Image Learner and Galaxy

GLEAM Image Learner simplifies deep learning by automating tasks like data preprocessing, model training with transfer learning, and comprehensive evaluation. In this tutorial, we will explore how Image Learner can be used within Galaxy to build reliable image classifiers, using the HAM10000 skin lesion dataset as a case study.


Use Case: Skin Lesion Classification with HAM10000

Workflow overview for HAM10000 classification


Dataset Preprocessing (Balanced Subset)


Balanced Dataset Composition

| Lesion Type | Count | Percentage |
|---|---|---|
| Melanocytic nevus (nv) | 200 | 14.3% |
| Melanoma (mel) | 200 | 14.3% |
| Basal cell carcinoma (bcc) | 200 | 14.3% |
| Actinic keratosis (akiec) | 200 | 14.3% |
| Benign keratosis (bkl) | 200 | 14.3% |
| Dermatofibroma (df) | 200 | 14.3% |
| Vascular lesion (vasc) | 200 | 14.3% |
| Total | 1,400 | 100% |
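
The slides do not spell out how the balanced subset was assembled, so the following is only a minimal sketch of one plausible approach: sample a fixed number of items per class and report how many are still missing for classes that have too few originals. The function name `balanced_subset` and the toy label counts are hypothetical and do not reflect the real HAM10000 distribution.

```python
import random
from collections import defaultdict


def balanced_subset(labels, per_class, seed=0):
    """Pick up to `per_class` item indices for every class label.

    Returns (selected, shortfall): `selected` maps class -> chosen indices,
    `shortfall` maps class -> how many items are still missing and would
    have to come from elsewhere (for example, augmentation).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    selected, shortfall = {}, {}
    for label, indices in by_class.items():
        take = min(per_class, len(indices))
        selected[label] = sorted(rng.sample(indices, take))
        shortfall[label] = per_class - take
    return selected, shortfall


# Toy labels with made-up counts (assumption, for illustration only).
toy_labels = ["nv"] * 500 + ["mel"] * 300 + ["df"] * 120
selected, shortfall = balanced_subset(toy_labels, per_class=200)

# Share of each class in a 7 x 200 = 1,400-image subset.
share = 200 / (7 * 200)
print({label: len(idx) for label, idx in selected.items()}, shortfall, f"{share:.1%}")
```

In this toy run the `df` class has fewer than 200 items, so its shortfall is non-zero. The per-class share of 200 / 1,400 rounds to the 14.3% shown in the table.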

Data Augmentation Strategy

Example of horizontal flip augmentation

Adapted from Shetty et al., 2022 (Scientific Reports 12, 18134)
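
A horizontal flip can be illustrated with a minimal sketch. Here an image is assumed to be a plain list of pixel rows; this is an illustration of the operation itself, not the code used by Image Learner or by Shetty et al. The helper name `horizontal_flip` is hypothetical.

```python
def horizontal_flip(image):
    """Mirror an image left-to-right.

    `image` is a list of rows, each row a list of pixels. Reversing every
    row swaps left and right while keeping the top-to-bottom row order.
    """
    return [row[::-1] for row in image]


tiny = [
    [1, 2, 3],
    [4, 5, 6],
]
flipped = horizontal_flip(tiny)
print(flipped)  # [[3, 2, 1], [6, 5, 4]]
```

Flipping twice restores the original image, and the class label is unchanged by the flip. For NumPy arrays, `numpy.fliplr` performs the same operation.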


Transfer Learning with Image Learner


Image Learner in Galaxy


Model Configuration

| Parameter | Value | Rationale |
|---|---|---|
| Task Type | Multi-class classification | Seven lesion classes |
| Model | CAFormer S18 384 | Efficient transformer-based architecture |
| Epochs | 30 | Sufficient for convergence |
| Batch Size | 32 | Balances memory and stability |
| Data Split | Stratified 70/10/20 | Train/validation/test |
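
To make the stratified 70/10/20 split concrete, here is a minimal sketch that splits each class separately so every subset keeps the same class proportions. Image Learner performs the split internally; the function `stratified_split`, its seed, and the rounding rule are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=42):
    """Split item indices into train/validation/test, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


classes = ["nv", "mel", "bcc", "akiec", "bkl", "df", "vasc"]
labels = [c for c in classes for _ in range(200)]  # 200 images per class
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

With 200 images per class, each class contributes 140 training, 20 validation, and 40 test images, giving 980, 140, and 280 images overall.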

Running Image Learner

  1. Upload Data:
    • images_96.zip (1,400 images - 200 per class) from Zenodo
    • image_metadata_new.csv (class labels and metadata)
    • Zenodo link
  2. Run Image Learner:
    • Input images: images_96.zip
    • Input metadata: image_metadata_new.csv
    • Task: Classification
    • Model: CAFormer S18 384
    • Configure the remaining parameters as listed under Model Configuration
  3. Evaluate Model:
    • Use Image Learner’s report to assess performance
    • Analyze ROC-AUC curves and confusion matrix

Image Learner Model Report

Model and training summary


Training Performance

Test performance summary showing training progression


Test Results and Diagnostics

Per-class metrics heatmap by lesion class


Confusion Matrix

Confusion matrix showing classification results
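
For readers unfamiliar with how a confusion matrix is built, the following minimal sketch counts (true class, predicted class) pairs. It assumes rows are true classes and columns are predicted classes, which may differ from the orientation in the Image Learner report. The labels and predictions are toy values, not results from this tutorial.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Return matrix[i][j] = number of class-i samples predicted as class j."""
    position = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(y_true, y_pred):
        matrix[position[truth]][position[guess]] += 1
    return matrix


# Toy example with three of the seven lesion classes (made-up predictions).
classes = ["nv", "mel", "bcc"]
y_true = ["nv", "nv", "mel", "mel", "bcc", "bcc"]
y_pred = ["nv", "mel", "mel", "mel", "bcc", "nv"]

cm = confusion_matrix(y_true, y_pred, classes)
for label, row in zip(classes, cm):
    print(f"{label:>4}", row)
```

Diagonal entries are correct predictions; off-diagonal entries show which classes are confused with each other.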


Comparison with Shetty et al. (2022)

Reference: Shetty et al., 2022 (Scientific Reports 12, 18134)

| Metric | Shetty et al., 2022 (CNN) | Image Learner (this tutorial) |
|---|---|---|
| Accuracy | 0.94 (94%) | 0.87 (87%) |
| Weighted Precision | 0.88 (88%) | 0.87 (87%) |
| Weighted Recall | 0.85 (85%) | 0.87 (87%) |
| Weighted F1-Score | 0.86 (86%) | 0.87 (87%) |
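
The weighted metrics in the table above can be illustrated with a minimal sketch. It assumes the common support-weighted definition, in which each class's score is weighted by its share of true samples; the slides do not state the exact definition used. The confusion matrix is a toy example, not data from either study, and `weighted_metrics` is a hypothetical helper.

```python
def weighted_metrics(cm):
    """Accuracy and support-weighted precision/recall/F1.

    cm[i][j] = number of class-i samples predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    precision = recall = f1 = 0.0
    for i in range(n):
        tp = cm[i][i]
        support = sum(cm[i])                         # true samples of class i
        predicted = sum(cm[r][i] for r in range(n))  # samples predicted as class i
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


toy_cm = [[8, 2, 0], [1, 7, 2], [0, 1, 9]]
scores = weighted_metrics(toy_cm)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Under this definition, weighted recall equals accuracy whenever both are computed from the same predictions, which is consistent with the Image Learner column, where both are 0.87.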

Conclusion


Thank you!

This material is the result of collaborative work. Thanks to the Galaxy Training Network and all the contributors! Galaxy Training Network tutorial content is licensed under the Creative Commons Attribution 4.0 International License.