Use to_edge_transform_and_lower for XNNPack #8624
Conversation
See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8624. Note: links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit 951d91e with merge base 77589c6. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
|
Have you tested the performance of a model (like Llama 3B) before and after? Asking because export_llama is used by different users to prepare .pte files for CPU. Please make sure there's no perf regression. |
|
Running on demand perf benchmark here: https://github.com/pytorch/executorch/actions/runs/13505211116 |
    )
    if args.verbose:
        print_delegation_info(builder.edge_manager.exported_program().graph_module)
    if args.num_sharding > 0 and args.qnn:
This code shouldn't be here right?
Oh yeah, technically it should be removed since it's QNN-specific. Will remove after benchmarking finishes.
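To illustrate the point being made above: the guard in the diff fires only when the QNN backend is enabled, so in an XNNPack-only export flow the branch is dead code. The helper below is a hypothetical sketch mirroring the condition, not actual export_llama code.

```python
# Hypothetical sketch of the guard flagged in the review: it mirrors
# `args.num_sharding > 0 and args.qnn` from the diff. With XNNPack
# selected (qnn=False) the branch can never run, hence "should remove".
from types import SimpleNamespace


def qnn_sharding_applies(args) -> bool:
    # True only when sharding is requested AND the QNN backend is on.
    return args.num_sharding > 0 and args.qnn


xnnpack_args = SimpleNamespace(num_sharding=4, qnn=False)
qnn_args = SimpleNamespace(num_sharding=4, qnn=True)

print(qnn_sharding_applies(xnnpack_args))  # False: dead code for XNNPack
print(qnn_sharding_applies(qnn_args))      # True: branch is QNN-specific
```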
|
@iseeyuan please see the performance benchmark graph I posted in the pr description |
|
Can you run internal CI tests before merging? Otherwise looks good. |
|
@jackzhxng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: Use `to_edge_transform_and_lower` in `export_llama` for XNNPack. As part of these changes, this also means that you cannot specify multiple backends in `export_llama` in the args, although I'm not sure if that is happening anywhere at the moment. Closes #8621

Performance regression benchmarking for XNNPack (on Android) vs. the past 3 days:

<img width="1427" alt="Screenshot 2025-02-24 at 11 39 52 AM" src="https://github.com/user-attachments/assets/1640cf2c-a579-491f-8940-7ccfbe464903" />

These benchmark numbers normally fluctuate a bit across runs, and these differences are within the usual fluctuation ranges.

Test Plan: See if CI passes

Differential Revision: D70124742
Pulled By: jackzhxng
Force-pushed from b30733d to 1b0c5e4.
|
This pull request was exported from Phabricator. Differential Revision: D70124742 |
|
Force-pushed from 930ec0e to 951d91e.
|
Differential Revision: D70221944
Pull Request resolved: #8717
Summary
Use `to_edge_transform_and_lower` in `export_llama` for XNNPack. As part of these changes, this also means that you cannot specify multiple backends in `export_llama` in the args, although I'm not sure if that is happening anywhere at the moment.

Closes #8621
Performance regression benchmarking for XNNPack (on Android) vs. the past 3 days:

These benchmark numbers normally fluctuate a bit across runs, and these differences are within the usual fluctuation ranges.
Test plan
See if CI passes
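The single-backend restriction described in the summary could be enforced with a check along the following lines. This is a hedged sketch: `pick_backend` and the flag names are hypothetical illustrations, not `export_llama`'s actual argument handling. The underlying idea is that `to_edge_transform_and_lower` is called with the partitioner list for one backend, so the args must resolve to exactly one.

```python
# Hypothetical sketch of the "one backend only" restriction: given a
# mapping of backend flags (names are illustrative), exactly one must
# be enabled, since the lowering call targets a single backend's
# partitioner. Raises if zero or multiple backends are selected.
def pick_backend(args: dict) -> str:
    enabled = [name for name in ("xnnpack", "qnn", "coreml", "mps") if args.get(name)]
    if len(enabled) != 1:
        raise ValueError(f"exactly one backend must be selected, got {enabled}")
    return enabled[0]


print(pick_backend({"xnnpack": True}))  # xnnpack
```

A check like this fails fast at argument-parsing time instead of producing a confusing partially-lowered program later in the export pipeline.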