Updating the hyperparameter search space #154

Closed
ArlindKadra wants to merge 1 commit into refactor_development from refactor_development_ss_enhancement

@ArlindKadra

Given the recent experiments with the cocktails, I think the hyperparameter search space needs this update for the case where the architecture is not restricted. This is a general update, not one specific to the cocktails.

https://arxiv.org/abs/2006.13799

Matches the search space from the paper above and adds log sampled weight decay values.

beta1: Tuple[Tuple, float] = ((0.85, 0.999), 0.9),
beta2: Tuple[Tuple, float] = ((0.9, 0.9999), 0.9),
- weight_decay: Tuple[Tuple, float] = ((0.0, 0.1), 0.0)
+ weight_decay: Tuple[Tuple, float, bool] = ((0.0, 0.1), 0.0, True)
Contributor

In the paper, I think weight decay is not on a log scale.

Author

@ravinkohli You are right, which is why I also wrote that this PR adds log sampled weight decay values. Sampling weight decay on a log scale is common practice, for example in:
https://ml.informatik.uni-freiburg.de/papers/16-AUTOML-AutoNet.pdf
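
To illustrate the difference being discussed, here is a minimal sketch in plain Python that compares uniform and log-uniform sampling. The bounds `(1e-5, 1e-1)` are hypothetical and chosen only for illustration; the PR itself uses `(0.0, 0.1)`, and a log scale needs a strictly positive lower bound because `log(0)` is undefined. The helper names `sample_uniform` and `sample_log_uniform` are assumptions and not part of the code under review.

```python
import math
import random


def sample_uniform(rng, lower, upper):
    """Draw one value uniformly from [lower, upper]."""
    return rng.uniform(lower, upper)


def sample_log_uniform(rng, lower, upper):
    """Draw one value whose logarithm is uniform on [log(lower), log(upper)]."""
    # Requires lower > 0, because log(0) is undefined.
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


# Hypothetical bounds for illustration; the PR itself uses (0.0, 0.1).
LOWER, UPPER = 1e-5, 1e-1
N_SAMPLES = 10_000
THRESHOLD = 1e-3  # geometric midpoint of the hypothetical range

rng = random.Random(0)
uniform_draws = [sample_uniform(rng, LOWER, UPPER) for _ in range(N_SAMPLES)]
log_draws = [sample_log_uniform(rng, LOWER, UPPER) for _ in range(N_SAMPLES)]

# Fraction of draws that fall below the threshold under each scheme.
frac_uniform_small = sum(v < THRESHOLD for v in uniform_draws) / N_SAMPLES
frac_log_small = sum(v < THRESHOLD for v in log_draws) / N_SAMPLES

print(f"uniform:     {frac_uniform_small:.3f} of draws below {THRESHOLD}")
print(f"log-uniform: {frac_log_small:.3f} of draws below {THRESHOLD}")
```

With log-uniform sampling, about half of the draws are expected to land below the geometric midpoint `1e-3`, whereas uniform sampling puts roughly 1% of draws there, so small weight decay values are explored far more often on a log scale.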


weight_decay = UniformFloatHyperparameter('weight_decay', lower=weight_decay[0][0], upper=weight_decay[0][1],
- default_value=weight_decay[1])
+ default_value=weight_decay[1], log=weight_decay[2])
Contributor

here also


weight_decay = UniformFloatHyperparameter('weight_decay', lower=weight_decay[0][0], upper=weight_decay[0][1],
- default_value=weight_decay[1])
+ default_value=weight_decay[1], log=weight_decay[2])
Contributor

same here

def get_hyperparameter_search_space(dataset_properties: Optional[Dict] = None,
lr: Tuple[Tuple, float, bool] = ((1e-5, 1e-1), 1e-2, True),
- weight_decay: Tuple[Tuple, float] = ((0.0, 0.1), 0.0),
+ weight_decay: Tuple[Tuple, float, bool] = ((0.0, 0.1), 0.0, True),
Contributor

also here
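
The signature in this diff bundles each float hyperparameter as a tuple `((lower, upper), default, log)`. As a rough sketch of how such a tuple could be unpacked and sanity-checked before it is passed on as `lower`, `upper`, `default_value` and `log`, consider the following. The helper `unpack_float_range` is hypothetical and does not exist in the code under review, and the positivity check reflects only the mathematical requirement of a log scale, not confirmed behaviour of any library.

```python
from typing import Any, Dict, Tuple


def unpack_float_range(
    name: str,
    spec: Tuple[Tuple[float, float], float, bool],
) -> Dict[str, Any]:
    """Turn ((lower, upper), default, log) into keyword arguments.

    Hypothetical helper for illustration only.
    """
    (lower, upper), default, log = spec
    if not lower <= default <= upper:
        raise ValueError(f"{name}: default {default} outside [{lower}, {upper}]")
    if log and lower <= 0:
        # log(lower) is undefined for lower <= 0.
        raise ValueError(f"{name}: log scale needs lower > 0, got {lower}")
    return {
        "name": name,
        "lower": lower,
        "upper": upper,
        "default_value": default,
        "log": log,
    }


# The learning-rate tuple from the signature unpacks cleanly.
lr_kwargs = unpack_float_range("lr", ((1e-5, 1e-1), 1e-2, True))

# The proposed weight_decay tuple combines lower=0.0 with log=True,
# which this sketch rejects.
try:
    unpack_float_range("weight_decay", ((0.0, 0.1), 0.0, True))
    weight_decay_error = None
except ValueError as exc:
    weight_decay_error = str(exc)

print(lr_kwargs)
print(weight_decay_error)
```

Under this sketch's assumptions, a log-sampled `weight_decay` range would need a small positive lower bound instead of `0.0`.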

Contributor

@ravinkohli left a comment

Thank you for this PR. We'll need these changes when we want to compare this to the paper version. I don't think the weight decay is on a log scale; other than that, these changes look good.

@ArlindKadra
Author

> Thank you for this PR. We'll need these changes when we want to compare this to the paper version. I don't think the weight decay is on a log scale; other than that, these changes look good.

@ravinkohli I would say it is common practice to sample the L2 regularization term on a log scale as well, as is done here: https://ml.informatik.uni-freiburg.de/papers/16-AUTOML-AutoNet.pdf
So I would suggest keeping this addition, even though it deviates from the paper.

@ArlindKadra
Author

Closing this pull request since I added these changes to the cocktails branch; we can merge them into development later.

@ArlindKadra closed this on Apr 6, 2021