@Abdul-Mukit Abdul-Mukit commented Nov 13, 2025

Description

Closes: #449
Add args.validation_interval to TrainConfig.
During training, run evaluate() and log stats every args.validation_interval epochs.

Setting args.validation_interval to a value larger than args.epochs (e.g. args.epochs + 1) effectively disables evaluate() for the entire training run.
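
For clarity, here is a minimal, self-contained sketch of the gating behaviour described above; the loop structure and the train_one_epoch/evaluate placeholders are illustrative assumptions, not the actual rf-detr internals:

from dataclasses import dataclass

@dataclass
class TrainConfig:
    epochs: int = 5
    validation_interval: int = 1  # new field added by this PR

def train_one_epoch(epoch: int) -> None:
    print(f"training epoch {epoch}")

def evaluate(epoch: int) -> dict:
    print(f"validating after epoch {epoch}")
    return {"epoch": epoch, "mAP": 0.0}

def train(args: TrainConfig) -> None:
    for epoch in range(1, args.epochs + 1):
        train_one_epoch(epoch)
        # Validation (and the stats logging tied to it) only runs on epochs
        # that are multiples of validation_interval.
        if epoch % args.validation_interval == 0:
            print(evaluate(epoch))

train(TrainConfig(epochs=5, validation_interval=2))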

Related:
#416

Type of change

  • New feature (non-breaking change which adds functionality)

How has this change been tested? Please provide a test case or example of how you tested the change.

Trained locally on the basketball-player-detection-2.v13i.coco.zip dataset using the scripts below.

Test 1:

Train for 5 epochs with validation_interval=2.

from rfdetr import RFDETRNano

model = RFDETRNano()

model.train(
    dataset_dir="/home/abdul/projects/rf-detr/datasets_downloads/basketball-player-detection-2",
    epochs=5,
    batch_size=2,
    grad_accum_steps=1,
    lr=1e-4,
    num_workers=2,
    device='cuda',
    checkpoint_interval=1,
    validation_interval=2,
)

Expected behaviour:

  • Validation runs every 2 epochs.
  • All weight checkpoints are still generated according to args.checkpoint_interval.
  • log.txt contains an entry every 2 epochs (see the check snippet after this list).
  • metrics_plot.png shows data points every 2 epochs.
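
One way to sanity-check the logging cadence for Test 1 is a small script like the one below; it assumes log.txt is written as one JSON object per logged epoch with an "epoch" field, and that the run's output directory is ./output (both are assumptions that may need adjusting):

import json
from pathlib import Path

# Hypothetical check: with epochs=5 and validation_interval=2, only epochs
# 2 and 4 should appear in log.txt (exact epoch indexing may differ).
log_path = Path("output/log.txt")  # adjust to the run's output_dir
epochs_logged = [
    json.loads(line)["epoch"]
    for line in log_path.read_text().splitlines()
    if line.strip()
]
print("epochs with validation/logging:", epochs_logged)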

Test 2:

Train for 5 epochs without any validation. Set validation_interval=6.

from rfdetr import RFDETRNano

model = RFDETRNano()

model.train(
    dataset_dir="/home/abdul/projects/rf-detr/datasets_downloads/basketball-player-detection-2",
    epochs=5,
    batch_size=2,
    grad_accum_steps=1,
    lr=1e-4,
    num_workers=2,
    device='cuda',
    checkpoint_interval=1,
    validation_interval=6,
)

Expected behaviour:

  • No validation is run throughout the training process.
  • All weight checkpoints are still generated according to args.checkpoint_interval.
  • No log.txt, metrics_plot.png, checkpoint_best_regular.pth, checkpoint_best_ema.pth, checkpoint_best_total.pth, or results.json files are generated. These are generated only once at least one validation call finishes (see the check snippet after this list).
  • No test is run even if args.run_test == True, as checkpoint_best_total.pth doesn't exist.
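
Similarly, the absence of validation artifacts in Test 2 can be verified with something like the following; the output directory name is an assumption and should be adjusted to the run's actual output_dir:

from pathlib import Path

# Hypothetical check: none of the validation-dependent artifacts should exist.
output_dir = Path("output")  # assumption: adjust to the actual output_dir
for name in [
    "log.txt",
    "metrics_plot.png",
    "checkpoint_best_regular.pth",
    "checkpoint_best_ema.pth",
    "checkpoint_best_total.pth",
    "results.json",
]:
    assert not (output_dir / name).exists(), f"{name} should not be generated"
print("OK: no validation artifacts were produced")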

Any specific deployment considerations

None

Docs

None
