This repository was archived by the owner on Mar 21, 2024. It is now read-only.

Automatically and linearly scale the learning rate of the SSL encoder to the number of GPUs #667

Merged
maxilse merged 10 commits into main from maxilse/scale_lr on Mar 22, 2022


@maxilse maxilse commented Feb 22, 2022


The learning rate is now scaled linearly with the number of available GPUs. For example, with lr = 0.001 and 8 GPUs, the effective learning rate becomes lr = 0.008.
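For readers who want the rule spelled out, here is a minimal sketch of the linear scaling described above. The function name `scale_learning_rate` and the choice to leave the rate unchanged when no GPU is present are assumptions made for illustration; this is not the code in `ssl_container.py`.

```python
def scale_learning_rate(base_lr: float, num_gpus: int) -> float:
    """Scale a base learning rate linearly with the number of GPUs.

    Hypothetical helper for illustration only. Assumption: with zero GPUs
    (CPU-only run) the base learning rate is returned unchanged.
    """
    if base_lr <= 0:
        raise ValueError("base_lr must be positive")
    if num_gpus < 0:
        raise ValueError("num_gpus must be non-negative")
    # Treat a CPU-only run like a single device so the rate is never scaled to 0.
    scale_factor = max(1, num_gpus)
    return base_lr * scale_factor


if __name__ == "__main__":
    # The example from the description: lr = 0.001 on 8 GPUs.
    print(scale_learning_rate(0.001, 8))
```

With this rule, a configuration tuned on one GPU keeps the same base value in the config, and the multiplier is applied at start-up once the number of devices is known.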

I tested it for SimCLR and BYOL (https://ml.azure.com/experiments/id/81fa8775-1a25-47ae-9fe7-c13a6b91a421?wsid=/subscriptions/db9fc1d1-b44e-45a8-902d-8c766c255568/resourceGroups/innereyerg/providers/Microsoft.MachineLearningServices/workspaces/innereye4ws&tid=72f988bf-86f1-41af-91ab-2d7cd011db47).

In both cases the results are as expected:

  • SimCLR: More GPUs give better representations, because larger batches provide more negative examples.
  • BYOL: More GPUs speed up the training but do not give better results.
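To make the SimCLR point concrete, the sketch below counts negatives per anchor as the number of GPUs grows. It assumes two augmented views per image and that embeddings are gathered across all GPUs before the contrastive loss is computed; if the loss is computed per GPU, the count depends only on the per-GPU batch size. The function name is hypothetical and not part of this PR.

```python
def simclr_negatives_per_anchor(per_gpu_batch_size: int, num_gpus: int) -> int:
    """Count negatives per anchor in a SimCLR-style contrastive loss.

    Illustrative assumptions: two views per image, and embeddings from all
    GPUs are gathered before the loss, so the effective batch is
    per_gpu_batch_size * num_gpus.
    """
    if per_gpu_batch_size < 1:
        raise ValueError("per_gpu_batch_size must be at least 1")
    effective_batch = per_gpu_batch_size * max(1, num_gpus)
    # 2 * effective_batch views in total; exclude the anchor and its positive.
    return 2 * effective_batch - 2


if __name__ == "__main__":
    for gpus in (1, 2, 4, 8):
        print(gpus, simclr_negatives_per_anchor(32, gpus))
```

BYOL uses no negative pairs, which is consistent with the observation that extra GPUs speed up its training without improving the final result.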

I did not add any new tests, since these two tests already exist: test_simclr_num_gpus() and test_simclr_num_nodes().

@maxilse maxilse requested a review from ant0nsc February 22, 2022 15:47

@fepegar fepegar left a comment


LGTM in general, I just added some questions.

Comment thread InnerEye/ML/SSL/lightning_containers/ssl_container.py Outdated
Comment thread InnerEye/ML/SSL/lightning_containers/ssl_container.py Outdated
Comment thread CHANGELOG.md
Comment thread InnerEye/ML/SSL/lightning_containers/ssl_container.py Outdated
@maxilse maxilse enabled auto-merge (squash) February 22, 2022 17:44
fepegar previously approved these changes Feb 22, 2022
Comment thread InnerEye/ML/SSL/lightning_containers/ssl_container.py
ant0nsc previously approved these changes Mar 16, 2022
@maxilse maxilse merged commit 6791dce into main Mar 22, 2022
@maxilse maxilse deleted the maxilse/scale_lr branch March 22, 2022 10:23

4 participants