ETCoreML crashes in the FB app. During debugging, we traced the issue down to this. cc @kimishpatel @YifanShenSZ @cymbalrush
HF version bump. Ensure `optimum-executorch` can work with new `transformers` models on `executorch==0.6.0`.

### Test plan
CI to test HF models

Co-authored-by: Guang Yang <guangyang@fb.com>
Summary: . Differential Revision: D71752750
Differential Revision: D71761219 Pull Request resolved: pytorch#9554
Summary: . Reviewed By: bsoyluoglu Differential Revision: D71752749
### Summary
Seeing this error in Linux wheel building jobs:
```
Collecting numpy (from torchvision==0.22.0.dev20250311)
Downloading numpy-2.2.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB)
INFO: pip is looking at multiple versions of torchvision to determine which version is compatible with other requirements. This could take a while.
The conflict is caused by:
The user requested torch==2.7.0.dev20250311
torchvision 0.22.0.dev20250311+cpu depends on torch==2.7.0.dev20250310
```
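The conflict above is a nightly-date mismatch: the job pins the `dev20250311` torch nightly, but that day's torchvision wheel was built against the previous day's torch nightly (`dev20250310`), so pip can never satisfy both. A minimal sketch of the check (the helper names `nightly_date` and `pins_match` are hypothetical, for illustration only):

```python
import re

def nightly_date(version):
    """Extract the YYYYMMDD date from a PyTorch nightly version string,
    e.g. "2.7.0.dev20250311" or "0.22.0.dev20250311+cpu" -> "20250311"."""
    m = re.search(r"\.dev(\d{8})", version)
    return m.group(1) if m else None

def pins_match(torch_pin, torchvision_torch_dep):
    """True when the requested torch nightly and the torch nightly that
    torchvision depends on were built on the same date."""
    return nightly_date(torch_pin) == nightly_date(torchvision_torch_dep)

# The exact situation from the log: the user pinned the 0311 torch nightly,
# but that day's torchvision wheel depends on the 0310 torch nightly.
print(pins_match("2.7.0.dev20250311", "2.7.0.dev20250310"))  # False
```

When the dates agree, pip's resolver succeeds; any CI pin script needs to keep the torch and torchvision dates in lockstep.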
### Test plan
CI
https://github.com/pytorch/executorch/actions/runs/14047575373/job/39331644423 There seems to be some CI issues with: ``` torch._dynamo.exc.FailOnRecompileLimitHit: recompile_limit reached with one_graph=True. Excessive recompilations can degrade performance due to the compilation overhead of each recompilation. To monitor recompilations, enable TORCH_LOGS=recompiles. If recompilations are expected, consider increasing ``` To help resolve this we reset dynamo at setup for all unittests. Let's see if this helps
Differential Revision: D70329890 Pull Request resolved: pytorch#8772
### Summary
We seem to be using a combination of CMAKE_ARGS and environment variables when creating wheels. Ultimately, CMake only uses the CMake args; however, we redefine some of these flags as env vars to help `setup.py` determine whether a certain feature is turned on. Specifically, it looks for pybinding vars to decide whether to bundle pybindings. Let's remove this redundancy and use CMAKE_ARGS as the single source of truth. For more details and other considerations, see pytorch#9494 (abandoned).

Note that even in the wheel building jobs, we use CMake args instead of environment variables to control features:
https://github.com/pytorch/executorch/blob/644b7ddf14180d97e348faa627f576e13d367d69/.ci/scripts/wheel/envvar_base.sh#L20
https://github.com/pytorch/executorch/blob/644b7ddf14180d97e348faa627f576e13d367d69/.ci/scripts/wheel/envvar_macos.sh#L14-L15

### Test plan
Build, then check CMakeCache.txt to ensure the flags are set:

```bash
# Expected: EXECUTORCH_BUILD_PYBIND=OFF EXECUTORCH_BUILD_XNNPACK=OFF EXECUTORCH_BUILD_COREML=OFF
$ rm -rf pip-out dist && ./install_executorch.sh --pybind off

# Expected: EXECUTORCH_BUILD_PYBIND=ON EXECUTORCH_BUILD_XNNPACK=ON EXECUTORCH_BUILD_COREML=OFF
$ rm -rf pip-out dist && ./install_executorch.sh

# Expected: EXECUTORCH_BUILD_PYBIND=ON EXECUTORCH_BUILD_XNNPACK=OFF EXECUTORCH_BUILD_COREML=ON
$ rm -rf pip-out dist && ./install_executorch.sh --pybind coreml

# Expected: EXECUTORCH_BUILD_PYBIND=ON EXECUTORCH_BUILD_XNNPACK=ON EXECUTORCH_BUILD_COREML=ON
$ rm -rf pip-out dist && ./install_executorch.sh --pybind xnnpack coreml

# Throws an error
$ rm -rf pip-out dist && ./install_executorch.sh --pybind coreml off
```

cc @larryliu0820 @lucylq
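A sketch of what "CMAKE_ARGS as the single source of truth" means for the build script: instead of mirroring each feature into its own env var, `setup.py` can parse the `-D<FLAG>=ON/OFF` entries directly out of CMAKE_ARGS. The `cmake_flag` helper below is a hypothetical illustration, not the actual executorch `setup.py` code:

```python
import os
import shlex

def cmake_flag(name, default=False):
    """Read a -D<name>=ON/OFF flag out of the CMAKE_ARGS environment
    variable; the last occurrence wins, matching CMake's behavior."""
    value = default
    for arg in shlex.split(os.environ.get("CMAKE_ARGS", "")):
        if arg.startswith(f"-D{name}="):
            value = arg.split("=", 1)[1].upper() in ("ON", "1", "TRUE")
    return value

os.environ["CMAKE_ARGS"] = (
    "-DEXECUTORCH_BUILD_PYBIND=ON -DEXECUTORCH_BUILD_COREML=OFF"
)
print(cmake_flag("EXECUTORCH_BUILD_PYBIND"))   # True
print(cmake_flag("EXECUTORCH_BUILD_COREML"))   # False
print(cmake_flag("EXECUTORCH_BUILD_XNNPACK"))  # False (unset -> default)
```

With this shape, CMake and `setup.py` can never disagree about whether a feature is enabled, since both read the same string.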
Differential Revision: D71752746 Pull Request resolved: pytorch#9597
Differential Revision: D71752743 Pull Request resolved: pytorch#9606
Differential Revision: D71752747 Pull Request resolved: pytorch#9608
Differential Revision: D71752748 Pull Request resolved: pytorch#9609
### Summary
We want `EXECUTORCH_BUILD_PYBIND` enabled whenever the user builds the bindings, so let's just do it, unless they explicitly opt out by defining the arg themselves.

### Test plan
CI

cc @larryliu0820 @lucylq
Summary: Support for a few SoCs has been added recently; update the documentation accordingly. Differential Revision: D71827272 cc @mergennachin @byjlw
The number of delegates and the tolerance have changed, so update them.
This was (accidentally) removed during a refactoring. Also takes the chance to use the new XfailIf decorator. Signed-off-by: Erik Lundell <erik.lundell@arm.com>
…rch#9158)

### Summary
Add a proxy for an `export_llama` performance regression test by comparing the ops in the graph before and after the PR. The export happens without loading a checkpoint or params file, which means that all of the base `ModelArgs` values for `llama_transformer` will be used.

### Test plan
N/A
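The op-comparison proxy can be sketched as comparing operator histograms of the exported graph before and after a change; any op that appears or disappears flags a potential regression. The helper names and the `aten.*` strings below are illustrative, not the actual test code:

```python
from collections import Counter

def op_histogram(graph_ops):
    """Count how many times each operator name appears in an exported graph."""
    return Counter(graph_ops)

def op_diff(before, after):
    """Proxy regression check: return (added, removed) op counts between two
    exports. An empty diff means the PR left the graph's ops unchanged."""
    b, a = op_histogram(before), op_histogram(after)
    return dict(a - b), dict(b - a)  # Counter subtraction keeps positives

added, removed = op_diff(
    ["aten.linear", "aten.linear", "aten.sdpa"],
    ["aten.linear", "aten.linear", "aten.sdpa", "aten.clone"],
)
print(added)    # {'aten.clone': 1} -- a new op snuck into the graph
print(removed)  # {}
```

This is only a proxy: it catches graph-structure drift cheaply, without needing checkpoints or an actual latency benchmark.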
Differential Revision: D71833608 Pull Request resolved: pytorch#9603
Add conv3d tests, though most are skipped since conv3d support is not yet implemented. Signed-off-by: Erik Lundell <erik.lundell@arm.com>
Seems to succeed in rare instances due to randomness. Signed-off-by: Erik Lundell <erik.lundell@arm.com>
### Summary
This is stage 1 of Mimi enablement. Stage 2 will consist of actual model enablement.

Supported ops:
- exp
- expm1
- elu
- transpose conv1d
- bitwise_and
- scalar_tensor
- stack
- unbind
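As an aside on why `expm1` is worth supporting as its own op rather than lowering it to `exp(x) - 1`: for small inputs the naive form loses precision to cancellation, while a dedicated `expm1` keeps full precision. A quick stdlib illustration:

```python
import math

# For tiny x, exp(x) rounds to a double very close to 1.0, and subtracting
# 1.0 amplifies that rounding error; expm1 computes e**x - 1 directly.
x = 1e-10
naive = math.exp(x) - 1.0
stable = math.expm1(x)

# Reference value of e**x - 1 for tiny x via the Taylor series x + x^2/2;
# the omitted terms are ~x^3/6, far below double precision here.
reference = x + x * x / 2.0

print(abs(naive - reference) > abs(stable - reference))  # True
```

The same precision argument motivates keeping other fused ops (like `elu`) as single kernels instead of decomposing them.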
Summary:
To support embedded system builds, which treat the warning about a left shift
by 32 on a 32-bit dtype as an error, the code was modified to:
```
memory_offset |= static_cast<size_t>(memory_offset_high)
<< (sizeof(size_t) - sizeof(uint32_t));
```
However, this fails for the OSS qwen example build.
Instead of changing the computation, we add a check for
```
sizeof(size_t) > sizeof(uint32_t)
```
in the conditional.
In our builds of interest, this compiles away the if branch.
Reviewed By: digantdesai, dpalmasan
Differential Revision: D71488571
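To make the guarded-conditional shape concrete: the point is that reconstructing a wide offset from 32-bit halves must only touch the high half when `size_t` is actually wider than 32 bits, because in C++ `high << 32` on a 32-bit type is undefined behavior. The sketch below expresses that logic in Python (where ints are arbitrary precision, so no UB exists); `combine_offset` is a hypothetical illustration of the pattern, not the actual executorch code:

```python
UINT32_BITS = 32

def combine_offset(high, low, size_t_bits):
    """Merge the high and low 32-bit halves of a memory offset, only using
    the high half when size_t is wider than 32 bits. In C++ the equivalent
    `if (sizeof(size_t) > sizeof(uint32_t))` is a compile-time constant, so
    the compiler deletes the whole branch on 32-bit targets and the
    problematic shift is never even emitted."""
    offset = low
    if size_t_bits > UINT32_BITS:
        offset |= high << UINT32_BITS
    return offset

print(hex(combine_offset(0x1, 0x10, size_t_bits=64)))  # 0x100000010
print(hex(combine_offset(0x1, 0x10, size_t_bits=32)))  # 0x10
```

Guarding the branch rather than rewriting the shift keeps the 64-bit computation untouched, which is why it avoids the breakage the earlier shift-amount change caused.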
### Summary
Context: pytorch#9481
* Include the `executorchcoreml` pybinding in the builds
* Remove the separate installation option
* Turn on CoreML by default for macOS builds
* Add a dependency on coremltools for macOS

### Test plan
CI
```
$ rm -rf cmake-out pip-out dist && ./install_executorch.sh
$ ./examples/models/llama/install_requirements.sh
$ .ci/scripts/test_llama.sh -model stories110M -build_tool cmake -dtype fp32 -mode coreml
$ .ci/scripts/test_llama.sh -model stories110M -build_tool cmake -dtype fp32 -mode xnnpack+custom+quantize_kv
```
cc @larryliu0820 @lucylq
…ntization for Example Models (pytorch#9634)

### Summary
Changes:
1. When initializing Llama2 for aot_compiler, since checkpoints can only be downloaded from Hugging Face, we initialize Llama2 with uninitialized weights. The problem is that when running quantization, we can hit errors in the histogram if the uninitialized values are NaN. We fix this by initializing the weights with zeros if no checkpoint is provided, which ensures the quantization step can still work.
2. Quant type in the AoT compiler. Looking at the model options available to XNNPACK, everything is quantized with per-tensor static quantization. This isn't the best option for all of the available models. For example, transformer-based models like Llama and MobileBert would likely prefer dynamically quantized per-channel weights, whereas CNNs like MobileNet would prefer statically quantized per-channel weights. We add this kind of quant type to the existing model options. This also helps with test timeouts: per-tensor static quantization on a model like Llama can take a long time due to the introduction of many q/dq nodes and the complex partitions it creates. As a result, proposing partitions can take a long time due to the constant BFS to find the largest possible partition. By specifying a more apt quantization scheme, like dynamic per-channel quantization, we can avoid this complexity.

Overall, this should help with the flaky [nan, nan] errors in the quantization histogram, and it should also help with CI timing out.

### Test plan
OSS XNNPACK CI for all model delegation

cc @digantdesai @cbilgin
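The first fix can be sketched with plain floats: a histogram observer needs a finite [min, max] range, so NaN weights from an uninitialized, checkpoint-less model poison calibration, while zero-initialized weights keep it well defined. The helper names below are hypothetical, and real code operates on tensors, not lists:

```python
import math

def sanitize_weights(weights):
    """If no checkpoint was loaded, uninitialized values may be NaN, which
    breaks the observer's histogram. Zero them out so calibration can run
    (sketch of the zero-init fix described above)."""
    return [0.0 if math.isnan(w) else w for w in weights]

def histogram_range(weights):
    """A histogram observer needs a finite [min, max]; NaNs poison both."""
    lo, hi = min(weights), max(weights)
    if math.isnan(lo) or math.isnan(hi):
        raise ValueError("histogram observer saw [nan, nan]")
    return lo, hi

raw = [float("nan"), 0.5, -1.25]  # uninitialized, checkpoint-less weights
print(histogram_range(sanitize_weights(raw)))  # (-1.25, 0.5)
```

The values themselves are meaningless for accuracy, but that's fine here: the aot_compiler export only needs quantization to run to completion, not to produce a calibrated model.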
### Summary
* After pytorch#9483, we should have CoreML support out of the box for macOS
* Unfortunately, we still need `backends/apple/coreml/scripts/install_requirements.sh` to use the [coreml_executorch_runner](https://github.com/pytorch/executorch/tree/main/examples/apple/coreml/executor_runner) (used for testing)
* I should have caught all the usages

### Test plan
Read
I have no idea what this file actually does, but it seems like we are supposed to have this?
…ch#9509) Disable one, fix the other. Testing: built internally
pytorch#9511) I planned to do this everywhere and forgot. Clean it all up, leave a note, enforce the note with visibility. This makes sure everything in buck-land gets ET_USE_THREADPOOL. Test Plan: Profiled run on internal model, no longer seeing parallel_for_no_threadpool
…AME_AS_COMPUTE (pytorch#9613) As the title says, this is mostly a few related find-replaces, plus marking SupportedTensorDtypes::SAME_AS_COMPUTE deprecated.
As the code comment says, these APIs are undergoing development (see e.g. pytorch#9613) and it's pretty inconvenient that they're incidentally committed-to externally. Mark them deprecated so we have the option to drop that commitment in (IIUC) 0.7.
The previous attempt to bump the HF transformers version to latest was reverted due to llava model incompatibility. This PR just ensures the CI is able to test `optimum-executorch` with the latest version of HF transformers and the upcoming `executorch==0.6.0`. Note: this change is purely for CI and only for optimum-executorch; it should not affect other models like llava. Co-authored-by: Guang Yang <guangyang@fb.com>
Differential Revision: D69994481 Pull Request resolved: pytorch#8703