This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Round and sign straight-through-estimators C operators. #16373

Merged
sxjscience merged 2 commits into apache:master from igolan:straight_through_estimators_operators
Oct 8, 2019

Conversation

@igolan
Contributor

@igolan igolan commented Oct 4, 2019

Description

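Implemented sign and round straight-through-estimator (STE) operators in C.
The true derivative of sign and round is 0 almost everywhere, so no gradient flows through them. Straight-through estimators instead use a derivative of 1 everywhere in the backward pass; this is required for quantized training.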
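
To illustrate the idea, here is a minimal pure-Python sketch of the forward and backward behaviour described above. It is not the C implementation from this PR: the helper names (`round_half_away`, `sign`, `plain_backward`, `ste_backward`) are hypothetical, and the half-away-from-zero rounding rule is an assumption made for the example.

```python
import math

def round_half_away(x):
    # Round to the nearest integer, ties away from zero (assumed rounding rule).
    return math.copysign(math.floor(abs(x) + 0.5), x)

def sign(x):
    # Returns -1, 0 or 1 depending on the sign of x.
    return (x > 0) - (x < 0)

def plain_backward(upstream_grads):
    # Ordinary round/sign: derivative is 0 almost everywhere, so gradients vanish.
    return [0.0 for _ in upstream_grads]

def ste_backward(upstream_grads):
    # Straight-through estimator: derivative treated as 1, gradient passes unchanged.
    return list(upstream_grads)

xs = [-1.5, -0.4, 0.4, 2.6]
rounded = [round_half_away(x) for x in xs]   # forward pass of round
signs = [float(sign(x)) for x in xs]         # forward pass of sign
upstream = [0.1, 0.2, 0.3, 0.4]              # gradient arriving from the next layer

print(rounded, signs)
print(plain_backward(upstream))  # all zeros: nothing to learn from
print(ste_backward(upstream))    # identical to upstream
```

The forward values are the same in both cases; only the backward pass differs, which is what lets a quantized layer still receive a training signal.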

Checklist

Essentials

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are not affected by this change.

Changes

  • contrib.round_ste() including test and API doc
  • contrib.sign_ste() including test and API doc

Comments

N/A

@sxjscience
Member

@szhengac @xidulu I think the stochastic version of the straight-through estimator is related to the distribution package.

@igolan
Contributor Author

igolan commented Oct 4, 2019

@sxjscience, just to clarify: this is not a stochastic version of round/sign; it is just round and sign with a derivative of 1 (instead of 0).

@sxjscience
Member

@igolan Yes, I just pinged the others to see whether we could merge the efforts.

@xidulu
Contributor

xidulu commented Oct 5, 2019

@sxjscience
You are right, a straight-through estimator could be useful when performing a hard assignment on a relaxed discrete variable (e.g. section 2.2 of https://arxiv.org/pdf/1611.01144.pdf).
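
As a rough illustration of that use case, the sketch below takes a hard one-hot assignment in the forward pass while computing the gradient as if the relaxed softmax probabilities had been used. It is a simplified, deterministic example (no Gumbel noise, temperature fixed at 1), and every function name in it is hypothetical rather than part of MXNet or this PR.

```python
import math

def softmax(logits):
    # Numerically stable softmax: the relaxed (continuous) assignment.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_one_hot(probs):
    # Forward pass: discretize the relaxed variable by taking the argmax.
    k = max(range(len(probs)), key=lambda i: probs[i])
    return [1.0 if i == k else 0.0 for i in range(len(probs))]

def straight_through_backward(probs, grad_out):
    # Backward pass: treat the argmax as identity and differentiate the softmax,
    # i.e. grad_logits[i] = p[i] * (g[i] - sum_j g[j] * p[j]).
    dot = sum(g * p for g, p in zip(grad_out, probs))
    return [p * (g - dot) for p, g in zip(probs, grad_out)]

logits = [1.0, 3.0, 0.5]
probs = softmax(logits)
y_hard = hard_one_hot(probs)                       # what downstream layers see
grad_logits = straight_through_backward(probs, [0.0, 1.0, 0.0])

print(y_hard)
print(grad_logits)
```

Downstream layers see only the discrete `y_hard`, yet the logits still receive a nonzero gradient, which is the point of combining the relaxation with a straight-through estimator.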

@sxjscience sxjscience merged commit d5666ed into apache:master Oct 8, 2019