Skip to content

AArch64 LSE atomic_rmw support#3322

Merged
cfallin merged 3 commits intobytecodealliance:mainfrom
sparker-arm:aarch64-lse-ops
Sep 22, 2021
Merged

AArch64 LSE atomic_rmw support#3322
cfallin merged 3 commits intobytecodealliance:mainfrom
sparker-arm:aarch64-lse-ops

Conversation

@sparker-arm
Copy link
Contributor

Rename the existing AtomicRMW to AtomicRMWLoop and directly lower
atomic_rmw operations, without a loop if LSE support is available.

Copyright (c) 2021, Arm Limited

@afonso360
Copy link
Contributor

It'd be a good idea add a target aarch64 has_lse to runtests/atomic-rmw.clif and runtests/atomic-rmw-2.clif, if QEMU supports LSE.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. labels Sep 9, 2021
@akirilov-arm
Copy link
Contributor

Currently we force QEMU to emulate a processor that lacks LSE support.

@cfallin
Copy link
Member

cfallin commented Sep 9, 2021

I was curious if QEMU has any -cpu options between A72 (the current) and max (which we disabled to avoid pointer-auth issues on Linux, per #3183); the latest source appears to have a configuration option for the A64FX as well, which is ARMv8.2 (with LSE, without PAC). Maybe we could update the CI config and try that?

@cfallin
Copy link
Member

cfallin commented Sep 9, 2021

(This was added 9 days ago so I expect we'd have to pull down a git checkout of qemu rather than build a release; but that's probably fine IMHO.)

@sparker-arm
Copy link
Contributor Author

I was curious if QEMU has any -cpu options between A72 (the current) and max (which we disabled to avoid pointer-auth issues on Linux, per #3183); the latest source appears to have a configuration option for the A64FX as well, which is ARMv8.2 (with LSE, without PAC). Maybe we could update the CI config and try that?

What might be a more scalable, though less pretty option, would be to use the max configuration and turn off the features that we don't want (only PAC?) @akirilov-arm is working on that feature currently so I imagine we'd need testing at some point in the near future. But I haven't ever had to disable features with QEMU, and briefly looking at the code, I'm wondering if all the features have a nice user facing option...

@afonso360
Copy link
Contributor

afonso360 commented Sep 10, 2021

Reading this, it looks like we can just -cpu max,pauth=off. My local qemu 6.0.0 build recognizes this as a valid feature, I'll try opening a PR to see if CI fails.

Rename the existing AtomicRMW to AtomicRMWLoop and directly lower
atomic_rmw operations, without a loop if LSE support is available.

Copyright (c) 2021, Arm Limited
@alexcrichton
Copy link
Member

As a side note, this is somewhat unlikely to get used much in practice at least with Wasmtime itself due to this block which is gated on the off-by-default stdsimd feature (which requires nightly). If y'all are interseted in getting this supported in Wasmtime by default that'd need to be updated with a works-on-stable implementation. (I'm not sure of the status of stabilizing the required macro from libstd itself)

@akirilov-arm
Copy link
Contributor

Yes, we are aware of that - my colleague @adamgemmell from Arm's Rust team has proposed stabilising the ISA extension test macro in issue rust-lang/rust#86941 independently of the stdsimd feature. We are going to update the block after the issue is closed, so that it just checks for the target architecture. In the meantime I think it is fine to continue with the code generation optimizations, and @afonso360's suggestion enables us to do some testing.

@alexcrichton
Copy link
Member

Nice! And yeah of course I think this is fine to land (as well as any other improvements). Just because Wasmtime doesn't use something by default doesn't mean it's not useful!

@sparker-arm
Copy link
Contributor Author

@cfallin, would you be able to look at this now the QEMU changes have landed?

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@cfallin cfallin merged commit 3474965 into bytecodealliance:main Sep 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:aarch64 Issues related to AArch64 backend. cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants