This mini-project extracts the conditional encoder–decoder branch from CDSegNet and turns it into a light-weight point-cloud semantic segmentation trainer for the Replica dataset. Diffusion components are deliberately omitted; the model focuses on hierarchical point feature encoding and decoding with skip connections.
- cdsegnet_autoencoder_only/ – reusable Python package exposing the Replica dataset loader, point autoencoder model, and utility helpers.
- train.py – entry point for launching training/evaluation runs.
- cache/ – autogenerated directory (created when the training script runs) that hosts cached .npz copies of the Replica point clouds to avoid repeatedly parsing the large ASCII .pcd files.
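The parse-once, cache-as-.npz idea behind the cache/ directory can be sketched as follows. This is a minimal illustration, not the package's actual loader: the function names `parse_ascii_pcd` and `load_cached`, the `points` array key, and the assumption that every column after the `DATA` header line is numeric are all hypothetical.

```python
from pathlib import Path

import numpy as np


def parse_ascii_pcd(path):
    """Parse a simple ASCII .pcd file into an (N, C) float32 array.

    Assumes every row after the "DATA" header line holds only numbers.
    """
    with open(path, "r") as fh:
        lines = fh.read().splitlines()
    # The PCD header ends at the line that starts with "DATA".
    data_start = next(i for i, ln in enumerate(lines) if ln.startswith("DATA")) + 1
    rows = [[float(v) for v in ln.split()] for ln in lines[data_start:] if ln.strip()]
    return np.array(rows, dtype=np.float32)


def load_cached(pcd_path, cache_dir):
    """Return the point array, parsing the .pcd only if no .npz copy exists."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / (Path(pcd_path).stem + ".npz")
    if cache_file.exists():
        # Fast path: read the binary copy instead of the ASCII file.
        with np.load(cache_file) as data:
            return data["points"]
    points = parse_ascii_pcd(pcd_path)
    np.savez_compressed(cache_file, points=points)
    return points
```

The first call for a scene pays the ASCII parsing cost; later calls read only the compressed binary copy.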
cd /usr/project/CDSegNet_autoencoder_only
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python train.py \
--data-root ../E_data/replica_dataset/preprocessed \
--output-dir ./outputs \
--train-scenes apartment_0,apartment_1 \
--val-scenes apartment_2 \
--epochs 50 \
--batch-size 4

Key artefacts are written to outputs/:
- train_config.json – frozen copy of the CLI arguments.
- dataset_meta.pt – label mapping and class-name metadata.
- logs/metrics.txt – per-epoch training/validation metrics.
- checkpoints/latest.pt and checkpoints/best.pt – PyTorch checkpoints with model/optimizer state.
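As an illustration of how the CLI arguments could end up in train_config.json, here is a minimal sketch using only the standard library. The flag names come from the command shown earlier; the helper names (`parse_scene_list`, `build_parser`, `freeze_config`), the default values, and the JSON layout are assumptions and may differ from what train.py actually does.

```python
import argparse
import json
from pathlib import Path


def parse_scene_list(value):
    """Turn 'apartment_0,apartment_1' into ['apartment_0', 'apartment_1']."""
    return [s.strip() for s in value.split(",") if s.strip()]


def build_parser():
    # Flag names mirror the example command; defaults are assumed.
    p = argparse.ArgumentParser(description="Point autoencoder trainer (sketch)")
    p.add_argument("--data-root", required=True)
    p.add_argument("--output-dir", default="./outputs")
    p.add_argument("--train-scenes", type=parse_scene_list, required=True)
    p.add_argument("--val-scenes", type=parse_scene_list, required=True)
    p.add_argument("--epochs", type=int, default=50)
    p.add_argument("--batch-size", type=int, default=4)
    return p


def freeze_config(args, output_dir):
    """Write the parsed arguments to <output_dir>/train_config.json."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / "train_config.json"
    path.write_text(json.dumps(vars(args), indent=2, sort_keys=True))
    return path
```

Storing the parsed values rather than the raw command line means a run can be reproduced later even if defaults change.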
- The dataset class loads Replica point clouds once, caches them under cache/, and samples random 4,096-point subsets per __getitem__ call.
- Basic augmentations include random rotation around the gravity axis and Gaussian jitter. Validation runs without augmentation.
- The network implements a four-stage encoder/decoder with farthest-point sampling, local k-NN aggregation, and nearest-neighbour feature interpolation, mirroring the structure of CDSegNet's conditional branch.
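The operations named in the list above can be sketched in plain NumPy. This is an illustration only: the real model presumably implements them on PyTorch tensors, the function names are hypothetical, and treating z as the gravity axis and 0.01 as the jitter standard deviation are assumptions.

```python
import numpy as np


def sample_subset(points, labels, n=4096, rng=None):
    """Draw a random n-point subset (with replacement only if the cloud is smaller)."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(points), size=n, replace=len(points) < n)
    return points[idx], labels[idx]


def augment(xyz, rng=None, jitter_sigma=0.01):
    """Rotate about the z axis (assumed gravity axis) and add Gaussian jitter."""
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return xyz @ rot_z.T + rng.normal(0.0, jitter_sigma, size=xyz.shape)


def farthest_point_sampling(xyz, m, start=0):
    """Greedily pick m indices, each maximising distance to those already chosen."""
    chosen = np.empty(m, dtype=np.int64)
    min_d2 = np.full(xyz.shape[0], np.inf)  # squared distance to nearest chosen point
    cur = start
    for i in range(m):
        chosen[i] = cur
        d2 = np.sum((xyz - xyz[cur]) ** 2, axis=1)
        min_d2 = np.minimum(min_d2, d2)
        cur = int(np.argmax(min_d2))
    return chosen


def knn_indices(query, support, k):
    """Indices of the k nearest support points for every query point."""
    d2 = np.sum((query[:, None, :] - support[None, :, :]) ** 2, axis=-1)
    return np.argsort(d2, axis=1)[:, :k]


def nearest_neighbour_interpolate(dense_xyz, coarse_xyz, coarse_feat):
    """Upsample features by copying from the nearest coarse point."""
    nn = knn_indices(dense_xyz, coarse_xyz, k=1)[:, 0]
    return coarse_feat[nn]
```

In an encoder stage, `farthest_point_sampling` would pick the downsampled centroids and `knn_indices` would gather each centroid's local neighbourhood; in a decoder stage, `nearest_neighbour_interpolate` would carry coarse features back to the denser level before they are combined with the skip connection.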
To run evaluation on a custom split, point the --val-scenes flag to the
desired Replica scene names. The dataset automatically shares the label index
mapping between training and validation splits to keep predictions aligned.
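One way such a shared mapping could work is sketched below: the mapping is built once and the same dictionary is applied to both splits. The function names, the choice to build the mapping from the training labels, and the `-1` ignore index for labels unseen during training are all assumptions, not the package's documented behaviour.

```python
import numpy as np

IGNORE_INDEX = -1  # assumed placeholder for raw labels absent from the mapping


def build_label_mapping(raw_labels):
    """Map each distinct raw label id to a contiguous index 0..K-1."""
    return {int(raw): idx for idx, raw in enumerate(np.unique(raw_labels))}


def remap(raw_labels, mapping, ignore_index=IGNORE_INDEX):
    """Apply a label mapping, sending unknown raw ids to ignore_index."""
    return np.array(
        [mapping.get(int(r), ignore_index) for r in raw_labels], dtype=np.int64
    )
```

Because both splits go through the same dictionary, class index `k` in a validation prediction refers to the same raw Replica label as index `k` during training.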