Fix performance bugs in scalar reductions (#509) #543

Merged
marcinz merged 1 commit into nv-legate:branch-22.07 from marcinz:cherry-pick-e65032b
Aug 17, 2022
@marcinz marcinz commented Aug 17, 2022

  • Unify the template for device reduction tree and do some cleanup

  • Fix performance bugs in scalar reduction kernels:

      • Use unsigned 64-bit integers instead of signed integers wherever
        possible; CUDA hasn't added an atomic intrinsic for the latter yet.

      • Move reduction buffers from zero-copy memory to framebuffer. This
        makes the slow atomic update code path in reduction operators
        run much more efficiently.

  • Use the new scalar reduction buffer in binary reductions as well

  • Use only the RHS type in the reduction buffer as we never call apply

  • Minor cleanup per review

  • Rename the buffer class and method to make the intent explicit

  • Flip the polarity of reduce's template parameter
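
For readers unfamiliar with the signed versus unsigned point above: CUDA's `atomicAdd` has an overload for `unsigned long long` but none for a signed 64-bit integer. The sketch below is a host-side C++ illustration only, not code from this PR. `std::atomic<std::uint64_t>` stands in for the device atomic, and the helper names `add_signed` and `sum_signed_via_unsigned` are hypothetical. It shows why an unsigned 64-bit accumulator can still carry signed sums: unsigned addition wraps modulo 2^64, which yields the same bit pattern as two's-complement signed addition.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical helper: fold a signed 64-bit value into an unsigned 64-bit
// accumulator. The cast reinterprets the value modulo 2^64, so the wrapped
// unsigned sum matches two's-complement signed addition bit for bit.
inline void add_signed(std::atomic<std::uint64_t>& acc, std::int64_t value)
{
  acc.fetch_add(static_cast<std::uint64_t>(value), std::memory_order_relaxed);
}

// Hypothetical helper: sum signed values through the unsigned accumulator.
// Single-threaded here for simplicity; the atomic only mirrors the shape of
// a device-side reduction.
inline std::int64_t sum_signed_via_unsigned(const std::vector<std::int64_t>& values)
{
  std::atomic<std::uint64_t> acc{0};
  for (std::int64_t v : values) add_signed(acc, v);
  // Converting back to signed is modulo 2^64 as of C++20; earlier standards
  // leave it implementation-defined, though mainstream compilers wrap.
  return static_cast<std::int64_t>(acc.load(std::memory_order_relaxed));
}
```

When the quantity being reduced is never negative (a count, for example), simply declaring it unsigned, as the description suggests, avoids the casts altogether.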

@marcinz marcinz requested a review from magnatelee August 17, 2022 21:23
@marcinz marcinz merged commit 2959f0a into nv-legate:branch-22.07 Aug 17, 2022
manopapad pushed a commit that referenced this pull request Feb 18, 2025
* Update reshape() implementation to avoid unnecessary deepcopy

* Optimize reshape function and update tests

* Refactor reshaping logic and add type hints to test