This repository was archived by the owner on Mar 21, 2024. It is now read-only.
Merged
2 changes: 1 addition & 1 deletion .idea/InnerEye-DeepLearning.iml

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -96,6 +96,7 @@ in inference-only runs when using lightning containers.
 - ([#553](https://github.com/microsoft/InnerEye-DeepLearning/pull/553)) Fix incomplete test data module setup in Lightning inference.
 - ([#557](https://github.com/microsoft/InnerEye-DeepLearning/pull/557)) Fix issue where learning rate was not set
 correctly in the SimCLR module
+- ([#622](https://github.com/microsoft/InnerEye-DeepLearning/pull/622)) Fix issue with multi-GPU jobs on a VM: each process tries to create a folder structure
 - ([#558](https://github.com/microsoft/InnerEye-DeepLearning/pull/558)) Fix issue with the CovidModel config where model
 weights from a finetuning run were incompatible with the model architecture created for non-finetuning runs.
 - ([#604](https://github.com/microsoft/InnerEye-DeepLearning/pull/604)) Fix issue where runs on a VM would download the dataset even when a local dataset is provided.
11 changes: 9 additions & 2 deletions InnerEye/ML/deep_learning_config.py
@@ -22,6 +22,7 @@
 from InnerEye.ML.common import CHECKPOINT_FOLDER, DATASET_CSV_FILE_NAME, \
     ModelExecutionMode, VISUALIZATION_FOLDER, \
     create_unique_timestamp_id, get_best_checkpoint_path
+from health_azure.utils import is_global_rank_zero


 @unique
@@ -135,8 +136,14 @@ def create(project_root: Path,
             else:
                 logging.info("All results will be written to a subfolder of the project root folder.")
                 root = project_root.absolute() / DEFAULT_AML_UPLOAD_DIR
-            timestamp = create_unique_timestamp_id()
-            run_folder = root / f"{timestamp}_{model_name}"
+            if is_global_rank_zero():
+                timestamp = create_unique_timestamp_id()
+                run_folder = root / f"{timestamp}_{model_name}"
+            else:
+                # Handle the case where there are multiple DDP processes on the same machine outside AML.
+                # Each child process will be started with the current working directory set to be the output
+                # folder of the rank 0 process. We want all other processes to write to that same folder.
+                run_folder = Path.cwd().absolute()
             outputs_folder = run_folder
             logs_folder = run_folder / DEFAULT_LOGS_DIR_NAME
         else:
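For reviewers who want to see the folder-selection logic in isolation, here is a minimal standalone sketch of the idea in this change. It is not the InnerEye or `health_azure` implementation: `rank_is_zero` and `choose_run_folder` are hypothetical helpers, the `RANK` and `LOCAL_RANK` environment variable names are an assumption about what the launcher sets, and the timestamp format is illustrative rather than what `create_unique_timestamp_id` produces.

```python
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Mapping, Optional, Sequence


def rank_is_zero(env: Optional[Mapping[str, str]] = None,
                 rank_vars: Sequence[str] = ("RANK", "LOCAL_RANK")) -> bool:
    """Hypothetical stand-in for is_global_rank_zero.

    Assumption: the launcher exports the rank in the given environment
    variables. A process counts as rank 0 when each variable is unset or "0".
    """
    env = os.environ if env is None else env
    return all(int(env.get(name, "0")) == 0 for name in rank_vars)


def choose_run_folder(root: Path,
                      model_name: str,
                      is_rank_zero: bool,
                      cwd: Optional[Path] = None) -> Path:
    """Pick the run folder the way the diff does.

    Rank 0 creates a fresh timestamped folder under `root`. Every other
    process reuses its current working directory, which is assumed to have
    been set to the rank 0 output folder when the process was started.
    """
    if is_rank_zero:
        # Illustrative timestamp format only.
        timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        return root / f"{timestamp}_{model_name}"
    if cwd is None:
        cwd = Path.cwd()
    return cwd.absolute()
```

Passing `cwd` explicitly is only there to make the non-zero-rank branch testable without changing the working directory of the calling process. The sketch also makes the underlying assumption visible: the scheme only works if child processes really are started inside the rank 0 output folder.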