DeepLabV3+ Binary Segmentation Model
This repository provides a DeepLabV3+ model for binary semantic segmentation (background vs. foreground), packaged for deployment with Cog / Replicate. It is designed for high-accuracy foreground detection tasks such as roof segmentation in aerial imagery, but can be retrained for any binary segmentation dataset.
Features
- DeepLabV3+ architecture
- Two output classes (0 = background, 1 = foreground)
- Combined Cross-Entropy + Dice loss (sketched after this list)
- IoU-based validation and model selection
- Mixed-precision training (AMP)
- Data augmentation with Albumentations
- Automatic best-checkpoint saving
- GPU and CPU inference support
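The repository does not spell out the loss here, but a rough sketch of a combined Cross-Entropy + Dice loss for two-class logits could look like the following. The `CombinedLoss` name, the weighting, and the smoothing constant are illustrative assumptions, not the exact code in train.py:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombinedLoss(nn.Module):
    """Illustrative Cross-Entropy + Dice loss for two-class segmentation."""

    def __init__(self, ce_weight=1.0, dice_weight=1.0, smooth=1.0):
        super().__init__()
        self.ce = nn.CrossEntropyLoss()
        self.ce_weight = ce_weight
        self.dice_weight = dice_weight
        self.smooth = smooth

    def forward(self, logits, targets):
        # logits: (N, 2, H, W); targets: (N, H, W) with values {0, 1}
        ce_loss = self.ce(logits, targets)

        # Dice term computed on the foreground channel
        probs = F.softmax(logits, dim=1)[:, 1]
        targets_f = targets.float()
        intersection = (probs * targets_f).sum(dim=(1, 2))
        union = probs.sum(dim=(1, 2)) + targets_f.sum(dim=(1, 2))
        dice = (2.0 * intersection + self.smooth) / (union + self.smooth)
        dice_loss = 1.0 - dice.mean()

        return self.ce_weight * ce_loss + self.dice_weight * dice_loss
```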
Training
Training is handled via train.py. The script:
- Loads paired images and masks
- Applies normalization and augmentation (an example pipeline follows this list)
- Trains using mixed precision
- Logs training and validation loss
- Computes validation IoU
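The exact augmentation pipeline is defined in train.py; a minimal Albumentations pipeline in the same spirit might look like this. The specific transforms and probabilities are assumptions, chosen only to illustrate the pattern:

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2

# Illustrative training pipeline; the transforms in train.py may differ.
train_transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomRotate90(p=0.5),
    A.RandomBrightnessContrast(p=0.3),
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ToTensorV2(),
])

# Albumentations applies matching spatial transforms to image and mask:
# augmented = train_transform(image=image, mask=mask)
# image_tensor, mask_tensor = augmented["image"], augmented["mask"]
```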
The best-performing checkpoint is saved to:
models/checkpoint.pth
Training curves and a sample prediction image are also exported.
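To illustrate the mixed-precision step and IoU-based checkpoint selection described above, a simplified training/validation sketch is shown below. The function names and the exact IoU reduction are assumptions, not the actual code in train.py:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()
best_iou = 0.0

def train_one_epoch(model, loader, optimizer, criterion, device):
    model.train()
    for images, masks in loader:
        images, masks = images.to(device), masks.to(device)
        optimizer.zero_grad()
        with autocast():                   # mixed-precision forward pass
            logits = model(images)
            loss = criterion(logits, masks)
        scaler.scale(loss).backward()      # scaled backward to avoid underflow
        scaler.step(optimizer)
        scaler.update()

@torch.no_grad()
def validate_iou(model, loader, device, eps=1e-7):
    model.eval()
    inter, union = 0.0, 0.0
    for images, masks in loader:
        images, masks = images.to(device), masks.to(device)
        preds = model(images).argmax(dim=1)
        inter += ((preds == 1) & (masks == 1)).sum().item()
        union += ((preds == 1) | (masks == 1)).sum().item()
    return (inter + eps) / (union + eps)

# After each epoch, keep only the best-scoring weights:
# iou = validate_iou(model, val_loader, device)
# if iou > best_iou:
#     best_iou = iou
#     torch.save(model.state_dict(), "models/checkpoint.pth")
```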
Inference
predict.py loads the saved checkpoint and performs inference by:
1. Normalizing the input image using ImageNet mean and standard deviation
2. Running the model on GPU if available, otherwise CPU
3. Applying softmax and argmax to produce class predictions
4. Exporting a grayscale PNG mask where:
- 0 = background
- 255 = foreground
Output is written to:
/tmp/output.png
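Putting the four steps together, a simplified version of the inference path could look like the following. The `predict_mask` helper and the way the model is passed in are illustrative; see predict.py for the actual implementation:

```python
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def predict_mask(model, image_path, output_path="/tmp/output.png"):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device).eval()

    # 1. Load the image and normalize with ImageNet statistics
    image = np.asarray(Image.open(image_path).convert("RGB"), dtype=np.float32) / 255.0
    image = (image - IMAGENET_MEAN) / IMAGENET_STD
    tensor = torch.from_numpy(image).permute(2, 0, 1).unsqueeze(0).to(device)

    # 2./3. Forward pass, softmax over the two classes, per-pixel argmax
    with torch.no_grad():
        logits = model(tensor)
        probs = F.softmax(logits, dim=1)
        pred = probs.argmax(dim=1).squeeze(0).cpu().numpy()

    # 4. Write a grayscale PNG: 0 = background, 255 = foreground
    mask = pred.astype(np.uint8) * 255
    Image.fromarray(mask, mode="L").save(output_path)
    return output_path
```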