Updating the hyperparameter search space #154
ArlindKadra wants to merge 1 commit into refactor_development
  beta1: Tuple[Tuple, float] = ((0.85, 0.999), 0.9),
  beta2: Tuple[Tuple, float] = ((0.9, 0.9999), 0.9),
- weight_decay: Tuple[Tuple, float] = ((0.0, 0.1), 0.0)
+ weight_decay: Tuple[Tuple, float, bool] = ((0.0, 0.1), 0.0, True)
In the paper, I think weight decay is not on a log scale.
@ravinkohli You are right, which is why the description also says it "adds log sampled weight decay". Common practice samples it that way:
https://ml.informatik.uni-freiburg.de/papers/16-AUTOML-AutoNet.pdf
  weight_decay = UniformFloatHyperparameter('weight_decay', lower=weight_decay[0][0], upper=weight_decay[0][1],
-                                           default_value=weight_decay[1])
+                                           default_value=weight_decay[1], log=weight_decay[2])
  weight_decay = UniformFloatHyperparameter('weight_decay', lower=weight_decay[0][0], upper=weight_decay[0][1],
-                                           default_value=weight_decay[1])
+                                           default_value=weight_decay[1], log=weight_decay[2])
  def get_hyperparameter_search_space(dataset_properties: Optional[Dict] = None,
                                      lr: Tuple[Tuple, float, bool] = ((1e-5, 1e-1), 1e-2, True),
-                                     weight_decay: Tuple[Tuple, float] = ((0.0, 0.1), 0.0),
+                                     weight_decay: Tuple[Tuple, float, bool] = ((0.0, 0.1), 0.0, True),
ravinkohli left a comment
Thank you for this PR. We'll need these changes when we want to compare this to the paper version. I don't think the weight decay is on a log scale; other than that, these changes look good.
@ravinkohli I would say that common practice is to sample the L2 regularization term on a log scale too, as is done here: https://ml.informatik.uni-freiburg.de/papers/16-AUTOML-AutoNet.pdf
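For reference, the difference between the two sampling schemes under discussion can be sketched without any AutoML library. The snippet below is a minimal illustration using only the Python standard library; the function names and the bounds `1e-5` and `0.1` are assumptions chosen for the example, not values taken from this PR. A log scale is only defined for strictly positive bounds, so the example uses a small positive lower bound rather than `0.0`.

```python
import math
import random


def sample_uniform(lower: float, upper: float, rng: random.Random) -> float:
    """Draw a value uniformly from [lower, upper]."""
    return rng.uniform(lower, upper)


def sample_log_uniform(lower: float, upper: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(lower), log(upper)]."""
    if lower <= 0:
        raise ValueError("log-uniform sampling needs a strictly positive lower bound")
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


rng = random.Random(0)
n = 10_000
uniform_draws = [sample_uniform(1e-5, 0.1, rng) for _ in range(n)]
log_draws = [sample_log_uniform(1e-5, 0.1, rng) for _ in range(n)]

# Share of draws that fall below 1e-3 under each scheme.
frac_uniform = sum(v < 1e-3 for v in uniform_draws) / n
frac_log = sum(v < 1e-3 for v in log_draws) / n
print(f"uniform: {frac_uniform:.3f}, log-uniform: {frac_log:.3f}")
```

With these bounds, uniform sampling places roughly 1% of its draws below `1e-3`, while log-uniform sampling places roughly half of them there, because `1e-3` is the midpoint of the range in log space. That is the practical argument for sampling a regularization strength on a log scale.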
Closing the pull request since I added it in the cocktails branch and we can later merge it into development.
Given the recent experiments with the cocktails, I think the hyperparameter search space needs this update when the architecture is not restricted. This is a general update, not one related only to the cocktails.
https://arxiv.org/abs/2006.13799
Matches the search space from the paper above and adds log sampled weight decay values.
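The diff encodes each range as a tuple of the form `((lower, upper), default, log)`. As a rough sketch of how such a tuple might be unpacked and checked before it is handed to a hyperparameter constructor, the snippet below uses a hypothetical `FloatRange` dataclass and `parse_range` helper. Neither exists in the code base, and the positive lower bound `1e-5` for `weight_decay` is an assumption made so that the log scale is well defined.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FloatRange:
    """Hypothetical stand-in for a float hyperparameter definition."""
    name: str
    lower: float
    upper: float
    default: float
    log: bool


def parse_range(name: str, spec: Tuple[Tuple[float, float], float, bool]) -> FloatRange:
    """Unpack a ((lower, upper), default, log) tuple and validate it."""
    (lower, upper), default, log = spec
    if not lower <= default <= upper:
        raise ValueError(f"{name}: default {default} outside [{lower}, {upper}]")
    if log and lower <= 0:
        raise ValueError(f"{name}: log scale needs a strictly positive lower bound")
    return FloatRange(name, lower, upper, default, log)


# The lr tuple is taken from the diff; the weight_decay bounds are assumed.
lr = parse_range("lr", ((1e-5, 1e-1), 1e-2, True))
weight_decay = parse_range("weight_decay", ((1e-5, 0.1), 1e-5, True))
print(lr)
print(weight_decay)
```

Under this sketch's check, a spec such as `((0.0, 0.1), 0.0, True)` would be rejected, since the logarithm of zero is undefined. It may be worth confirming how the hyperparameter library in use treats a zero lower bound on a log scale.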