[Executorch][LLM] Use caching allocator for runner #15710
kimishpatel wants to merge 1 commit into gh/kimishpatel/210/base
We observed that on iOS this improves performance by 6%, because the SDPA op makes temporary allocations. There was no significant difference on Android. Differential Revision: [D86120038](https://our.internmc.facebook.com/intern/diff/D86120038/)
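To illustrate why caching helps an op that requests scratch memory on every call, here is a minimal sketch of a caching temp allocator. It is not the allocator this PR wires into the runner: the class name `SketchCachingAllocator` and its `allocate`/`reset`/`cache_hits` methods are hypothetical, and details such as alignment, size limits, and thread safety are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical illustration only; not the ExecuTorch implementation.
// Blocks handed out during one step are returned to a per-size free list
// on reset(), so the next step can reuse them without calling malloc.
class SketchCachingAllocator {
 public:
  SketchCachingAllocator() = default;
  SketchCachingAllocator(const SketchCachingAllocator&) = delete;
  SketchCachingAllocator& operator=(const SketchCachingAllocator&) = delete;

  ~SketchCachingAllocator() {
    reset();  // Move any live blocks into the cache, then release everything.
    for (auto& entry : free_blocks_) {
      for (void* block : entry.second) {
        std::free(block);
      }
    }
  }

  // Returns a buffer of `size` bytes, reusing a cached block of the same
  // size when one is available. Returns nullptr if malloc fails.
  void* allocate(std::size_t size) {
    assert(size > 0);
    void* block = nullptr;
    auto it = free_blocks_.find(size);
    if (it != free_blocks_.end() && !it->second.empty()) {
      block = it->second.back();
      it->second.pop_back();
      ++cache_hits_;
    } else {
      block = std::malloc(size);
      if (block == nullptr) {
        return nullptr;
      }
    }
    in_use_.push_back({block, size});
    return block;
  }

  // Called when the temporaries of a step are no longer needed: blocks go
  // back to the cache instead of being freed.
  void reset() {
    for (const auto& b : in_use_) {
      free_blocks_[b.size].push_back(b.ptr);
    }
    in_use_.clear();
  }

  std::size_t cache_hits() const { return cache_hits_; }

 private:
  struct Block {
    void* ptr;
    std::size_t size;
  };
  std::map<std::size_t, std::vector<void*>> free_blocks_;
  std::vector<Block> in_use_;
  std::size_t cache_hits_ = 0;
};
```

In this sketch, an op that asks for the same scratch size on every decode step pays for `malloc` only on the first step; later steps are served from the free list after `reset()`. Whether that translates into a measurable gain depends on the platform allocator, which is consistent with the iOS versus Android difference reported above.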
Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15710
Dr. CI: 127 new failures, 1 unrelated failure (likely flaky on trunk), as of commit c826be1 with merge base aba44fd.
We observed that on iOS it improves perf by 6% because SDPA op does temp allocations. No significant difference on android though.
Differential Revision: [D86120038](https://our.internmc.facebook.com/intern/diff/D86120038/)
ghstack-source-id: 322129671
Pull Request resolved: #15710
    model_path, data_files, Module::LoadMode::File);
    model_path,
    data_files,
    Module::LoadMode::File,
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |