Arm backend: Int16 linear support #14258
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14258
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures as of commit 0bdb35c with merge base 0329a8a.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed e170183 to 0ab797f
Unrelated failures.
self.add_pass(FuseConstantArgsPass(exported_program))
self.add_pass(InsertTableOpsPass(exported_program))
# If we have a conv2d with int16 activation split up into a convolution
# and an addition, to work-around the lack of support for int48 in torch
Or can it be done by using torch.dtype.int64 instead and then detecting and lowering it as int48 downstream?
I was starting off in that direction, but it interferes a bit with the int64->int32 handling, so I'd rather keep it separate.
yeah given int64 is treated as radioactive :P
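To make the workaround concrete: int16 activations times int8 weights give products of up to roughly 23 bits, and summing them across the reduction dimension can overflow int32, which is why TOSA wants an int48 bias/accumulator here. Below is a minimal sketch of the split, using a hypothetical rescale helper rather than the actual pass added in this PR:

```python
# Sketch only (not the actual ArmPassManager pass): run conv2d without a
# bias, bring the wide accumulator back into int32 range, then add the bias
# as a separate node so torch never has to hold an int48 tensor.
import torch


def rescale_to_int32(acc: torch.Tensor) -> torch.Tensor:
    # Placeholder for a TOSA RESCALE; the real scale factors come from the
    # activation/weight quantization parameters. Identity keeps it runnable.
    return acc


def conv2d_then_bias_add(x, weight, bias, stride=1, padding=0):
    # Convolution emitted without a bias.
    acc = torch.nn.functional.conv2d(x, weight, bias=None,
                                     stride=stride, padding=padding)
    acc32 = rescale_to_int32(acc)
    # Bias added after the rescale, matching the conv + add decomposition
    # described in the code comment above.
    return acc32 + bias.view(1, -1, 1, 1)
```

After the rescale the separate addition only needs to see an int32-range bias, so the torch-side graph never has to carry an int48 tensor.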
digantdesai left a comment:
This is awesome. No conv tests for 16a8w? Just curious.
Force-pushed 5b9c54e to ac4203c
)

@common.parametrize("test_data", test_data_all_16a8w)
No, not yet. Tests will be added for Ethos-U with a Vela update once it's in place.

I have added tests for U55 and U85.
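For illustration only, a plain-pytest sketch of how such a parametrized 16a8w test might look, including skipping a known-flaky case; the real tests use common.parametrize and the Arm TOSA/Ethos-U test pipelines, and the data set below is made up:

```python
import pytest
import torch

# Hypothetical stand-in for test_data_all_16a8w in test_linear.py.
test_data_16a8w = {
    "rank1_small": lambda: (torch.randn(1, 32),),
    # Skipped while the flaky large_randn case is being investigated.
    "rank1_large_randn": pytest.param(
        lambda: (torch.randn(1, 512) * 10,),
        marks=pytest.mark.skip(reason="flaky, under investigation"),
    ),
}


@pytest.mark.parametrize(
    "make_inputs", test_data_16a8w.values(), ids=test_data_16a8w.keys()
)
def test_linear_16a8w(make_inputs):
    inputs = make_inputs()
    model = torch.nn.Linear(inputs[0].shape[-1], 16)
    # The real test quantizes (int16 activations, int8 weights), lowers to
    # TOSA, runs on the reference model or Ethos-U FVP, and compares the
    # result against this eager-mode reference.
    reference = model(*inputs)
    assert reference.shape == (1, 16)
```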
@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.

I see this test failure :( FAILED backends/arm/test/ops/test_linear.py::test_linear_16a8w_tosa_INT[model_linear_rank1_large_randn,per_channel_quant=True] - AssertionError: Output 0 does not match reference output.
Force-pushed ac4203c to 54ad8e2
I removed the 1.0 milestone for now. Vela does not support this anyway, so let's take this a bit calmer.

@digantdesai, this has come out of sync with the Meta internal version and I'm not allowed to merge it; would you like to assist?

Yeah, let me try to merge this.

@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.

If it doesn't work I can give you a patch.
Signed-off-by: Per Åstrand <per.astrand@arm.com> Change-Id: I2b189b559f699c7eda6921ed515c0e8a849226ca
Support quantization to 16a8w. Since the resulting TOSA operator needs to have the bias in int48, which isn't available as a type in torch, the conv2d needs to be decomposed into a conv + add, where the conv result is scaled down to 32 bits before the addition of the bias is done. Signed-off-by: Per Åstrand <per.astrand@arm.com> Change-Id: Ib8cae694035796374a55a9909e501596e983abf5
For the case when the activation is 16 bit, the bias in TOSA must be an int48_t tensor. Since that can't be represented with a torch.dtype, the corresponding node.meta is set with the key 'tosa_dtype_48bit' to pass the information through to the creation of the TOSA tensor. Also make sure to distinguish between int32 and int48 tensors in the fuse constant ops pass. Signed-off-by: Per Åstrand <per.astrand@arm.com> Change-Id: Iefe64f2b02f388c905c9c818ee7d2a6af40bc9e3
Signed-off-by: Per Åstrand <per.astrand@arm.com> Change-Id: Ibe158d8d35a632547290f1b9a055d061ae267d77
Enable tests of int16 activation and int8 weight quantization. The test for large_randn is disabled while we sort out why it is flaky. Signed-off-by: Per Åstrand <per.astrand@arm.com> Change-Id: I9de5d472f8862edebcf82c140399985db930c069
Add an enum class to handle special dtypes that can't be represented in torch (i.e. int48_t), to avoid leaking serializer types into the pass handling of the backend. Signed-off-by: Per Åstrand <per.astrand@arm.com> Change-Id: I3388cec3c8a26f28790eedc3f124c336b6724cb4
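Putting the last two commits together, here is a rough sketch of what an out-of-torch dtype enum plus a node.meta tag could look like. The class and helper names are assumptions; only the 'tosa_dtype_48bit' key comes from the commit message above.

```python
# Sketch only: illustrates the idea of an out-of-torch dtype enum and a
# node.meta tag, not the actual classes used by the Arm backend.
from enum import Enum, auto

import torch
from torch.fx import GraphModule, Node


class SpecialDtype(Enum):
    """Dtypes needed by TOSA that have no torch.dtype counterpart."""
    INT48 = auto()


def tag_int48_bias(gm: GraphModule) -> GraphModule:
    # Walk the graph and mark convolution bias inputs whose activations are
    # int16, so the serializer can later emit an int48 TOSA tensor for them.
    for node in gm.graph.nodes:
        if node.op == "call_function" and "conv" in str(node.target):
            activation = node.args[0]
            if (
                isinstance(activation, Node)
                and activation.meta.get("val", None) is not None
                and activation.meta["val"].dtype == torch.int16
                and len(node.args) > 2
                and isinstance(node.args[2], Node)
            ):
                node.args[2].meta["tosa_dtype_48bit"] = SpecialDtype.INT48
    return gm
```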
Force-pushed dc7a0a7 to 18c9985
Failures are unrelated.

@digantdesai, we had to rebase because a file was changed; since we changed it again, we got back to the same problem and need your help again. Sorry about that.

@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D82552402.
### Summary
Adds support for a16w8 for linear when targeting a backend with +int16 extension.
Fixes pytorch#13729
### Test plan
Tested through unit tests.
Signed-off-by: Per Åstrand <per.astrand@arm.com>
Co-authored-by: Digant Desai <digantdesai@meta.com>
Summary
Adds support for a16w8 for linear when targeting a backend with +int16 extension.
Fixes #13729
Test plan
Tested through unit tests.
cc @digantdesai @freddan80 @zingo @oscarandersson8218
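For readers unfamiliar with the shorthand, a16w8/16a8w means 16-bit activations with 8-bit weights. A sketch in PT2E quantization-spec terms follows; the observer choices and exact ranges are assumptions for illustration, not the Arm quantizer's actual configuration:

```python
import torch
from torch.ao.quantization.observer import (
    MinMaxObserver,
    PerChannelMinMaxObserver,
)
from torch.ao.quantization.quantizer import QuantizationSpec

# 16-bit symmetric activations ("16a"); ranges are illustrative.
act_qspec = QuantizationSpec(
    dtype=torch.int16,
    quant_min=-32768,
    quant_max=32767,
    qscheme=torch.per_tensor_symmetric,
    observer_or_fake_quant_ctr=MinMaxObserver,
)

# 8-bit symmetric, per-channel weights ("8w").
weight_qspec = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-127,
    quant_max=127,
    qscheme=torch.per_channel_symmetric,
    ch_axis=0,
    observer_or_fake_quant_ctr=PerChannelMinMaxObserver,
)
```

The unit tests mentioned above wire an equivalent configuration into the Arm/TOSA quantizer before export; see test_linear.py in the PR for the real setup.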