Filing this mostly for visibility. We've been chasing a deterministic General Protection Exception on x86_64-unknown-linux-gnu that only reproduces when a downstream consumer builds leanMultisig under opt-level = "z" and codegen-units = 1 simultaneously. Either knob on its own is fine. The crash lands inside rec_aggregation::xmss_aggregate → lean_prover::prove_execution → prove_generic_logup → Poseidon16Precompile::bus.
Our downstream (blockblaz/zeam) already shipped a workaround in #759 by switching to opt-level = "s", so we're unblocked. But the interesting part sits in leanMultisig, so I wanted to write it up here.
What actually faults
Debug symbols point at Poseidon16Precompile::bus, but the real body at the faulting PC is a monomorphization of lean_vm::tables::utils::eval_virtual_bus_column that got inlined into lean_prover's CGU:
pub(crate) fn eval_virtual_bus_column<AB: AirBuilder, EF: ExtensionField<PF<EF>>>(
    extra_data: &ExtraDataForBuses<EF>,
    flag: AB::IF,
    data: &[AB::IF],
) -> AB::EF {
    let (logup_alphas_eq_poly, bus_beta) = extra_data.transmute_bus_data::<AB::EF>();
    assert!(data.len() < logup_alphas_eq_poly.len());
    (logup_alphas_eq_poly.iter().zip(data).map(|(c, d)| *c * *d).sum::<AB::EF>()
        + *logup_alphas_eq_poly.last().unwrap() * AB::F::from_usize(LOGUP_PRECOMPILE_DOMAINSEP))
        * *bus_beta
        + flag
}
Twelve lines of safe Rust, but at the crash site objdump shows a single basic block keeping all 16 YMM registers live through vpblendd/vpbroadcastd/vpmuludq/vinserti128/vpaddq, and dying on a vpinsrd $0x1, 0x14(%rsp), %xmm5, %xmm5 — a stack-relative reload of a spilled vector lane. The whole iterator chain has been inlined into one monstrous SIMD fold, and at -Oz the frame layout for that fold looks wrong. Bumping CGU to ≥ 2 keeps the function outlined at the crate boundary, which is why that mitigation works.
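For readers who don't want to unpack the generics, here is a scalar model of the arithmetic that fold performs: a dot product of the alphas with the data, plus a domain-separator term on the last alpha, all scaled by beta and offset by the flag. This is only a sketch: the modulus P, the DOMAINSEP value and the function name are placeholders I made up for illustration, and the real code works over the prover's field and extension-field types rather than u64.

```rust
/// Illustrative prime modulus (2^31 - 1). An assumption for this sketch,
/// not necessarily the field leanMultisig uses.
const P: u64 = 2_147_483_647;

/// Placeholder standing in for LOGUP_PRECOMPILE_DOMAINSEP; the real value
/// is not shown in this issue.
const DOMAINSEP: u64 = 4;

// Reduce both operands first so the product stays below 2^62 and cannot overflow.
fn mul(a: u64, b: u64) -> u64 {
    (a % P) * (b % P) % P
}

fn add(a: u64, b: u64) -> u64 {
    (a % P + b % P) % P
}

/// Scalar model of eval_virtual_bus_column:
/// (sum_i alphas[i] * data[i] + alphas.last() * DOMAINSEP) * beta + flag
fn eval_virtual_bus_column_model(alphas: &[u64], beta: u64, flag: u64, data: &[u64]) -> u64 {
    // Same precondition as the original: the last alpha is reserved for the separator.
    assert!(data.len() < alphas.len());
    let dot = alphas
        .iter()
        .zip(data)
        .fold(0, |acc, (c, d)| add(acc, mul(*c, *d)));
    let tagged = add(dot, mul(*alphas.last().unwrap(), DOMAINSEP));
    add(mul(tagged, beta), flag)
}
```

With alphas = [5, 7, 11], data = [2, 3], beta = 13 and flag = 1 this gives (10 + 21 + 44) * 13 + 1 = 976. The point is that nothing in the source-level computation needs more than a simple accumulate loop, which is what makes the faulting vector spill look like a codegen problem rather than a logic one.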
(For what it's worth: the unsafe transmute in ExtraDataForBuses::transmute_bus_data isn't the culprit here — at this monomorphization AB::EF == EF so it's an identity transmute. nm confirms the affected symbol only exists in lean_vm's CGU. Still a pattern I'd love to see go away for readability reasons, but it's sound.)
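As a hedged illustration of what a more readable version of that identity transmute could look like, here is a minimal sketch of a slice cast that checks type equality at runtime and only then reinterprets. The helper name cast_slice_if_same is hypothetical, it is not how transmute_bus_data is written, and it assumes the types involved are 'static so that TypeId is available.

```rust
use std::any::TypeId;

/// Reinterpret `&[Src]` as `&[Dst]` only when the two are the same type.
/// Hypothetical helper for illustration; not part of leanMultisig.
fn cast_slice_if_same<Src: 'static, Dst: 'static>(src: &[Src]) -> Option<&[Dst]> {
    if TypeId::of::<Src>() == TypeId::of::<Dst>() {
        // SAFETY: Src and Dst are the same type, so size, alignment and
        // validity invariants are identical and the length is unchanged.
        Some(unsafe { std::slice::from_raw_parts(src.as_ptr().cast::<Dst>(), src.len()) })
    } else {
        None
    }
}
```

Called as cast_slice_if_same::<u32, u32>(&xs) it returns Some with the same elements, and with mismatched types it returns None instead of relying on the caller to uphold the equality.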
What we tried
Bisected on an AMD Zen 4 VM with stable rustc 1.95.0. Everything below runs against leanMultisig rev 2eb4b9d983171139af36749f127dd9890c9109e6:
- Per-crate opt-level = "s" overrides across mt-*, rec_aggregation, backend, lean_prover, lean_vm, utils, sub_protocols: none of those combos fixed it. A lean_compiler-only override did, but it turned out to be a CGU-partitioning side effect that's unstable under small cache changes.
- RUSTFLAGS="-Cllvm-args=-enable-machine-outliner=never": no effect.
- #[inline(never)] on eval_virtual_bus_column itself: no effect.
- codegen-units = 16 with opt-level = "z": clean.
- codegen-units = 1 with opt-level = "s": clean.
So the condition is genuinely the conjunction of { "z", 1 }, not either alone.
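For anyone repeating the per-crate experiment, the overrides were of the following shape, using Cargo's per-package profile tables. Treat this as a sketch: the profile name and the package name are assumptions, and the key after "package." must match the crate's actual [package] name.

```toml
# Failing baseline: size-optimised, single codegen unit.
[profile.release]
opt-level = "z"
codegen-units = 1

# Per-package override of the kind tried above (did NOT fix the crash).
# "lean_vm" is assumed to be the package name; adjust to the real one.
[profile.release.package.lean_vm]
opt-level = "s"
```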
Reproducer
I don't have a standalone leanMultisig-only repro yet — it needs a realistic witness flowing into prove_execution. Via zeam it's:
git clone https://github.com/blockblaz/zeam.git
cd zeam && git checkout 17f1083 # pre-#759, opt-level="z" still present
./zig build run -Dprover=risc0 -- prove -z risc0
# crashes in ~25s with "General protection exception" on Linux x86_64
Flipping either opt-level to "s" or codegen-units to anything ≥ 2 in rust/Cargo.toml's [profile.risc0-release] clears it. Also reproduces on GitHub's 2-core ubuntu-latest runners.
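Concretely, either of the two changes looks roughly like this in the profile. This is a sketch showing only the two relevant keys; everything else in the real [profile.risc0-release] table is elided here and should be left as it is.

```toml
[profile.risc0-release]
# ... other keys unchanged ...

# Option A: keep a single codegen unit, drop from "z" to "s".
opt-level = "s"
codegen-units = 1

# Option B (alternative, use instead of A): keep "z", allow more than one codegen unit.
# opt-level = "z"
# codegen-units = 2
```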
If it would help, I'm happy to try to carve a standalone #[test] inside leanMultisig that builds under { "z", 1 } and triggers the same codegen path — something around prove_generic_logup with a trivial witness. Let me know if you'd want that shape or something different.
Probably worth doing regardless of a fix
A short note somewhere in the README or a CODEGEN.md saying "consumers using opt-level = "z" need codegen-units >= 2, or should use opt-level = "s"; the { "z", 1 } combo miscompiles on x86_64 (observed with rustc 1.95.0)" would save anyone else hitting this a lot of bisecting. Happy to PR that once you confirm the preferred wording/location.
A proper rustc/LLVM upstream issue is the long-term fix, but needs a minimized reproducer first.
Environment
- rustc 1.95.0 (59807616e 2026-04-14) stable
- AMD EPYC-Genoa (Zen 4), also on GitHub ubuntu-latest
- Ubuntu 24.04, Linux 6.8
- -Ctarget-cpu=x86-64-v3 (so AVX2, not AVX-512)
- leanMultisig 2eb4b9d, zeam pre-fix ref 17f1083, downstream fix blockblaz/zeam#759