ETCoreML crashes in the FB app. During debugging, we traced the issue down to this. cc @kimishpatel @YifanShenSZ @cymbalrush
HF version bump. Ensure `optimum-executorch` can work with new `transformers` models on `executorch==0.6.0`.

### Test plan
CI to test HF models

Co-authored-by: Guang Yang <guangyang@fb.com>
Summary: . Differential Revision: D71752750
Differential Revision: D71761219 Pull Request resolved: pytorch#9554
Summary: . Reviewed By: bsoyluoglu Differential Revision: D71752749
### Summary
Seeing this error in Linux wheel building jobs:
```
Collecting numpy (from torchvision==0.22.0.dev20250311)
Downloading numpy-2.2.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB)
INFO: pip is looking at multiple versions of torchvision to determine which version is compatible with other requirements. This could take a while.
The conflict is caused by:
The user requested torch==2.7.0.dev20250311
torchvision 0.22.0.dev20250311+cpu depends on torch==2.7.0.dev20250310
```
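The conflict above is a nightly-date mismatch: the job pins the `dev20250311` torch nightly, but that day's torchvision wheel was built against the previous day's torch nightly (`dev20250310`), so pip can never satisfy both. A minimal sketch of the check (the helper names `nightly_date` and `pins_match` are hypothetical, for illustration only):

```python
import re

def nightly_date(version):
    """Extract the YYYYMMDD date from a PyTorch nightly version string,
    e.g. "2.7.0.dev20250311" or "0.22.0.dev20250311+cpu" -> "20250311"."""
    m = re.search(r"\.dev(\d{8})", version)
    return m.group(1) if m else None

def pins_match(torch_pin, torchvision_torch_dep):
    """True when the requested torch nightly and the torch nightly that
    torchvision depends on were built on the same date."""
    return nightly_date(torch_pin) == nightly_date(torchvision_torch_dep)

# The exact situation from the log: the user pinned the 0311 torch nightly,
# but that day's torchvision wheel depends on the 0310 torch nightly.
print(pins_match("2.7.0.dev20250311", "2.7.0.dev20250310"))  # False
```

When the dates agree, pip's resolver succeeds; any CI pin script needs to keep the torch and torchvision dates in lockstep.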
### Test plan
CI
https://github.com/pytorch/executorch/actions/runs/14047575373/job/39331644423 There seems to be some CI issues with: ``` torch._dynamo.exc.FailOnRecompileLimitHit: recompile_limit reached with one_graph=True. Excessive recompilations can degrade performance due to the compilation overhead of each recompilation. To monitor recompilations, enable TORCH_LOGS=recompiles. If recompilations are expected, consider increasing ``` To help resolve this we reset dynamo at setup for all unittests. Let's see if this helps
Differential Revision: D70329890 Pull Request resolved: pytorch#8772
### Summary
We seem to be using a combination of CMAKE_ARGS and environment variables when creating wheels. Ultimately, CMake only uses the CMake args; however, we redefine some of these flags as env vars to help `setup.py` determine whether a certain feature is turned on. Specifically, it looks for pybinding vars to decide whether to bundle pybindings. Let's remove this redundancy and use CMAKE_ARGS as the single source of truth. For more details and other considerations, see pytorch#9494 (abandoned).

Note that even in the wheel building jobs, we use CMake args instead of environment variables to control features:
https://github.com/pytorch/executorch/blob/644b7ddf14180d97e348faa627f576e13d367d69/.ci/scripts/wheel/envvar_base.sh#L20
https://github.com/pytorch/executorch/blob/644b7ddf14180d97e348faa627f576e13d367d69/.ci/scripts/wheel/envvar_macos.sh#L14-L15

### Test plan
Build, then check CMakeCache.txt to ensure the flags are set:

```bash
# Expected: EXECUTORCH_BUILD_PYBIND=OFF EXECUTORCH_BUILD_XNNPACK=OFF EXECUTORCH_BUILD_COREML=OFF
$ rm -rf pip-out dist && ./install_executorch.sh --pybind off

# Expected: EXECUTORCH_BUILD_PYBIND=ON EXECUTORCH_BUILD_XNNPACK=ON EXECUTORCH_BUILD_COREML=OFF
$ rm -rf pip-out dist && ./install_executorch.sh

# Expected: EXECUTORCH_BUILD_PYBIND=ON EXECUTORCH_BUILD_XNNPACK=OFF EXECUTORCH_BUILD_COREML=ON
$ rm -rf pip-out dist && ./install_executorch.sh --pybind coreml

# Expected: EXECUTORCH_BUILD_PYBIND=ON EXECUTORCH_BUILD_XNNPACK=ON EXECUTORCH_BUILD_COREML=ON
$ rm -rf pip-out dist && ./install_executorch.sh --pybind xnnpack coreml

# Throws an error
$ rm -rf pip-out dist && ./install_executorch.sh --pybind coreml off
```

cc @larryliu0820 @lucylq
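A sketch of what "CMAKE_ARGS as the single source of truth" means for the build script: instead of mirroring each feature into its own env var, `setup.py` can parse the `-D<FLAG>=ON/OFF` entries directly out of CMAKE_ARGS. The `cmake_flag` helper below is a hypothetical illustration, not the actual executorch `setup.py` code:

```python
import os
import shlex

def cmake_flag(name, default=False):
    """Read a -D<name>=ON/OFF flag out of the CMAKE_ARGS environment
    variable; the last occurrence wins, matching CMake's behavior."""
    value = default
    for arg in shlex.split(os.environ.get("CMAKE_ARGS", "")):
        if arg.startswith(f"-D{name}="):
            value = arg.split("=", 1)[1].upper() in ("ON", "1", "TRUE")
    return value

os.environ["CMAKE_ARGS"] = (
    "-DEXECUTORCH_BUILD_PYBIND=ON -DEXECUTORCH_BUILD_COREML=OFF"
)
print(cmake_flag("EXECUTORCH_BUILD_PYBIND"))   # True
print(cmake_flag("EXECUTORCH_BUILD_COREML"))   # False
print(cmake_flag("EXECUTORCH_BUILD_XNNPACK"))  # False (unset -> default)
```

With this shape, CMake and `setup.py` can never disagree about whether a feature is enabled, since both read the same string.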
Differential Revision: D71752746 Pull Request resolved: pytorch#9597
Differential Revision: D71752743 Pull Request resolved: pytorch#9606
Differential Revision: D71752747 Pull Request resolved: pytorch#9608
Differential Revision: D71752748 Pull Request resolved: pytorch#9609
### Summary
We want `EXECUTORCH_BUILD_PYBIND` enabled whenever the user builds the bindings, so let's just do it, unless they explicitly opt out by defining the arg themselves.

### Test plan
CI

cc @larryliu0820 @lucylq
Summary: Support for a few SoCs has been added recently; update the documentation accordingly. Differential Revision: D71827272 cc @mergennachin @byjlw
The number of delegates and the tolerance have changed, so update them.
This was (accidentally) removed during a refactoring. Also takes the chance to use the new XfailIf decorator. Signed-off-by: Erik Lundell <erik.lundell@arm.com>
…rch#9158)

### Summary
Add a proxy for an `export_llama` performance regression test by comparing the ops in the graph before and after the PR. The export happens without loading a checkpoint or params file, which means that all of the base `ModelArgs` values for `llama_transformer` will be used.

### Test plan
N/A
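The op-comparison proxy can be sketched as comparing operator histograms of the exported graph before and after a change; any op that appears or disappears flags a potential regression. The helper names and the `aten.*` strings below are illustrative, not the actual test code:

```python
from collections import Counter

def op_histogram(graph_ops):
    """Count how many times each operator name appears in an exported graph."""
    return Counter(graph_ops)

def op_diff(before, after):
    """Proxy regression check: return (added, removed) op counts between two
    exports. An empty diff means the PR left the graph's ops unchanged."""
    b, a = op_histogram(before), op_histogram(after)
    return dict(a - b), dict(b - a)  # Counter subtraction keeps positives

added, removed = op_diff(
    ["aten.linear", "aten.linear", "aten.sdpa"],
    ["aten.linear", "aten.linear", "aten.sdpa", "aten.clone"],
)
print(added)    # {'aten.clone': 1} -- a new op snuck into the graph
print(removed)  # {}
```

This is only a proxy: it catches graph-structure drift cheaply, without needing checkpoints or an actual latency benchmark.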
Differential Revision: D71833608 Pull Request resolved: pytorch#9603
Add conv3d tests, though most are skipped since conv3d support is not yet implemented. Signed-off-by: Erik Lundell <erik.lundell@arm.com>
Seems to succeed in rare instances due to randomness. Signed-off-by: Erik Lundell <erik.lundell@arm.com>
### Summary
This is stage 1 of Mimi enablement. Stage 2 will consist of actual model enablement.

Supported ops:
- exp
- expm1
- elu
- transpose conv1d
- bitwise_and
- scalar_tensor
- stack
- unbind
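As an aside on why `expm1` is worth supporting as its own op rather than lowering it to `exp(x) - 1`: for small inputs the naive form loses precision to cancellation, while a dedicated `expm1` keeps full precision. A quick stdlib illustration:

```python
import math

# For tiny x, exp(x) rounds to a double very close to 1.0, and subtracting
# 1.0 amplifies that rounding error; expm1 computes e**x - 1 directly.
x = 1e-10
naive = math.exp(x) - 1.0
stable = math.expm1(x)

# Reference value of e**x - 1 for tiny x via the Taylor series x + x^2/2;
# the omitted terms are ~x^3/6, far below double precision here.
reference = x + x * x / 2.0

print(abs(naive - reference) > abs(stable - reference))  # True
```

The same precision argument motivates keeping other fused ops (like `elu`) as single kernels instead of decomposing them.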
Summary:
To support embedded system builds, which treat the warning about a left shift
by 32 on a 32-bit dtype as an error, the code was modified to:
```
memory_offset |= static_cast<size_t>(memory_offset_high)
<< (sizeof(size_t) - sizeof(uint32_t));
```
However, this fails for the OSS qwen example build.
Instead of changing the computation, we add a check for
```
sizeof(size_t) > sizeof(uint32_t)
```
in the conditional.
In our builds of interest, this compiles away the if branch.
Reviewed By: digantdesai, dpalmasan
Differential Revision: D71488571
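To make the guarded-conditional shape concrete: the point is that reconstructing a wide offset from 32-bit halves must only touch the high half when `size_t` is actually wider than 32 bits, because in C++ `high << 32` on a 32-bit type is undefined behavior. The sketch below expresses that logic in Python (where ints are arbitrary precision, so no UB exists); `combine_offset` is a hypothetical illustration of the pattern, not the actual executorch code:

```python
UINT32_BITS = 32

def combine_offset(high, low, size_t_bits):
    """Merge the high and low 32-bit halves of a memory offset, only using
    the high half when size_t is wider than 32 bits. In C++ the equivalent
    `if (sizeof(size_t) > sizeof(uint32_t))` is a compile-time constant, so
    the compiler deletes the whole branch on 32-bit targets and the
    problematic shift is never even emitted."""
    offset = low
    if size_t_bits > UINT32_BITS:
        offset |= high << UINT32_BITS
    return offset

print(hex(combine_offset(0x1, 0x10, size_t_bits=64)))  # 0x100000010
print(hex(combine_offset(0x1, 0x10, size_t_bits=32)))  # 0x10
```

Guarding the branch rather than rewriting the shift keeps the 64-bit computation untouched, which is why it avoids the breakage the earlier shift-amount change caused.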
### Summary
Context: pytorch#9481
* Include the `executorchcoreml` pybinding in the builds
* Remove the separate installation option
* Turn on CoreML by default for macOS builds
* Add a dependency on coremltools for macOS

### Test plan
CI
```
$ rm -rf cmake-out pip-out dist && ./install_executorch.sh
$ ./examples/models/llama/install_requirements.sh
$ .ci/scripts/test_llama.sh -model stories110M -build_tool cmake -dtype fp32 -mode coreml
$ .ci/scripts/test_llama.sh -model stories110M -build_tool cmake -dtype fp32 -mode xnnpack+custom+quantize_kv
```
cc @larryliu0820 @lucylq
…ntization for Example Models (pytorch#9634)

### Summary
Changes:
1. When initializing Llama2 for aot_compiler, since checkpoints can only be downloaded from Hugging Face, we initialize Llama2 with uninitialized weights. The problem is that when running quantization, we can hit errors in the histogram if the uninitialized values are NaN. We fix this by initializing the weights with zeros if no checkpoint is provided, which ensures the quantization step can still work.
2. Quant type in the AoT compiler. Looking at the model options available to XNNPACK, everything is quantized with per-tensor static quantization. This isn't the best option for all of the available models. For example, transformer-based models like Llama and MobileBert would likely prefer dynamically quantized per-channel weights, whereas CNNs like MobileNet would prefer statically quantized per-channel weights. We add this kind of quant type to the existing model options. This also helps with test timeouts: per-tensor static quantization on a model like Llama can take a long time due to the introduction of many q/dq nodes and the complex partitions it creates. As a result, proposing partitions can take a long time due to the constant BFS to find the largest possible partition. By specifying a more apt quantization scheme, like dynamic per-channel quantization, we can avoid this complexity.

Overall, this should help with the flaky [nan, nan] errors in the quantization histogram, and it should also help with CI timing out.

### Test plan
OSS XNNPACK CI for all model delegation

cc @digantdesai @cbilgin
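The first fix can be sketched with plain floats: a histogram observer needs a finite [min, max] range, so NaN weights from an uninitialized, checkpoint-less model poison calibration, while zero-initialized weights keep it well defined. The helper names below are hypothetical, and real code operates on tensors, not lists:

```python
import math

def sanitize_weights(weights):
    """If no checkpoint was loaded, uninitialized values may be NaN, which
    breaks the observer's histogram. Zero them out so calibration can run
    (sketch of the zero-init fix described above)."""
    return [0.0 if math.isnan(w) else w for w in weights]

def histogram_range(weights):
    """A histogram observer needs a finite [min, max]; NaNs poison both."""
    lo, hi = min(weights), max(weights)
    if math.isnan(lo) or math.isnan(hi):
        raise ValueError("histogram observer saw [nan, nan]")
    return lo, hi

raw = [float("nan"), 0.5, -1.25]  # uninitialized, checkpoint-less weights
print(histogram_range(sanitize_weights(raw)))  # (-1.25, 0.5)
```

The values themselves are meaningless for accuracy, but that's fine here: the aot_compiler export only needs quantization to run to completion, not to produce a calibrated model.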
### Summary
* After pytorch#9483, we should have CoreML support out of the box for macOS
* Unfortunately, we still need `backends/apple/coreml/scripts/install_requirements.sh` to use the [coreml_executorch_runner](https://github.com/pytorch/executorch/tree/main/examples/apple/coreml/executor_runner) (used for testing)
* I should have caught all the usages

### Test plan
Read
I have no idea what this file actually does, but it seems like we are supposed to have this?
…ch#9509) Disable one, fix the other. Testing: built internally
pytorch#9511) I planned to do this everywhere and forgot. Clean it all up, leave a note, enforce the note with visibility. This makes sure everything in buck-land gets ET_USE_THREADPOOL. Test Plan: Profiled run on internal model, no longer seeing parallel_for_no_threadpool
…AME_AS_COMPUTE (pytorch#9613) As the title says, this is mostly a few related find-replaces, plus marking SupportedTensorDtypes::SAME_AS_COMPUTE deprecated.
As the code comment says, these APIs are undergoing development (see e.g. pytorch#9613) and it's pretty inconvenient that they're incidentally committed-to externally. Mark them deprecated so we have the option to drop that commitment in (IIUC) 0.7.
The previous attempt to bump the HF transformers version to latest was reverted due to llava model incompatibility. This PR just ensures the CI is able to test `optimum-executorch` with the latest version of HF transformers and the upcoming `executorch==0.6.0`. Note: this change is purely for CI and only for optimum-executorch; it should not affect other models like llava. Co-authored-by: Guang Yang <guangyang@fb.com>
Differential Revision: D69994481 Pull Request resolved: pytorch#8703