MTP batched-verify (BatchForward2) crashes when SnapKV eviction is active: _kvCache.Length != startPos #130

## Symptom

When SnapKV prefill eviction is active on an MTP model (e.g. `Qwen3.6-27B-MTP`), the first decode iteration throws:

```
BatchForward2: _kvCache.Length=128 != startPos=<promptLen>. Caches must be at startPos before the batched verify call.
```

(`128` is the SnapKV budget; `<promptLen>` is the original, pre-eviction prompt length.)

## Root cause

SnapKV eviction is a prefill-only compaction. After it runs, `PagedKvCache` deliberately splits its two length notions (see `PagedKvCache.cs`):

- `_length` (physical) shrinks to the kept-slot count `K` (= budget, e.g. 128) — the slot index of the next append.
- `_logicalLength` stays at the original prompt length `N` — the absolute position the next decode token sits at, used for RoPE so cached (already-RoPE'd) keys and the incoming query share a reference frame.
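The split above can be modeled with two counters. The following is a minimal sketch, not the real `PagedKvCache`: the class name `KvLengthModel` and its methods are hypothetical, and it assumes that each decoded token advances both lengths by one.

```csharp
using System;

// Hypothetical stand-in for the bookkeeping described above; not the real PagedKvCache.
public sealed class KvLengthModel
{
    // Physical length: number of occupied slots, i.e. the slot index of the next append.
    public int Length { get; private set; }

    // Logical length: absolute position of the next decode token (the RoPE position).
    public int LogicalLength { get; private set; }

    // True once eviction has made the two lengths diverge.
    public bool IsEvicted => Length != LogicalLength;

    // Prefill appends promptLen tokens; both lengths advance together.
    public void Prefill(int promptLen)
    {
        if (promptLen < 0) throw new ArgumentOutOfRangeException(nameof(promptLen));
        Length += promptLen;
        LogicalLength += promptLen;
    }

    // SnapKV-style compaction: keep at most `budget` slots and leave the logical position alone.
    public void Evict(int budget)
    {
        if (budget < 0) throw new ArgumentOutOfRangeException(nameof(budget));
        Length = Math.Min(Length, budget);
    }

    // One decoded token occupies one more slot and advances one position (assumption).
    public void Append()
    {
        Length++;
        LogicalLength++;
    }
}
```

With an illustrative prompt of 4096 tokens and a budget of 128, `Prefill(4096)` followed by `Evict(128)` leaves `Length == 128` and `LogicalLength == 4096`, which is exactly the state the batched-verify precondition rejects.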

The single-token `Forward` decode path already handles this `Length != LogicalLength` split correctly (covered by `CudaHybridGdnSnapKv_LongPrompt_CacheShrinksToBudget_DecodeStaysWellFormed`).

`BatchForward2` (the MTP N=2 batched-verify primitive in both `HybridGdnForwardPass` and `CudaHybridGdnForwardPass`) opens with the precondition `if (_kvCache.Length != startPos) throw ...`. `MtpDecoder.DecodeBatched` passes `startPos = _nextPos`, which is the logical position. After eviction, `_kvCache.Length == K` while `startPos == N`, so the precondition fails on the very first batched-verify call. Nothing reconciles eviction with batched verify.

## Fix (this PR)

**Gate batched verify off when the cache has been evicted**: if `_kvCache.Length != _kvCache.LogicalLength`, fall back to the sequential MTP decode path, which uses `Forward` and is already eviction-safe. SnapKV and MTP then coexist correctly, but without the N=2 verify speedup while eviction is active. This PR also adds a decode-after-eviction coherence test.
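A rough sketch of the gate, reduced to plain integers so it stands alone. `MtpVerifyGate`, `MtpVerifyPath`, and both method names are hypothetical and do not come from the codebase; the real check lives wherever `MtpDecoder` chooses between its batched and sequential paths.

```csharp
using System;

// Hypothetical names throughout; this only illustrates the decision, not the PR's code.
public enum MtpVerifyPath { Batched, Sequential }

public static class MtpVerifyGate
{
    // The precondition quoted in this issue, expressed over plain integers.
    public static void RequireCacheAtStartPos(int physicalLength, int startPos)
    {
        if (physicalLength != startPos)
            throw new InvalidOperationException(
                $"BatchForward2: _kvCache.Length={physicalLength} != startPos={startPos}.");
    }

    // Evicted cache (physical != logical) takes the sequential path;
    // otherwise the batched path's precondition can hold.
    public static MtpVerifyPath Choose(int physicalLength, int logicalLength)
        => physicalLength != logicalLength ? MtpVerifyPath.Sequential : MtpVerifyPath.Batched;
}
```

Because `startPos` is the logical position, `Choose` returns `Batched` only in the case where `RequireCacheAtStartPos` would pass, so the throwing branch becomes unreachable from the decoder.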

## Follow-up (not this PR)

Teach `BatchForward2` to operate on an evicted cache directly: change the precondition to key off `LogicalLength` (the RoPE position) and make the two-token append/attention use physical `Length` for storage slots and `LogicalLength(+0/+1)` for RoPE — mirroring what single-token `Forward` already does. That restores the verify speedup under eviction. Requires re-running the MTP byte-parity oracle (`MtpDecoder_GreedyParity_LlamaCpp`) since MTP FP parity is fragile.
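The index arithmetic for that follow-up might look like the sketch below. It is an assumption-laden illustration only: `EvictedBatchPlan` and `PlanTwoTokens` are hypothetical names, and the real change would also have to touch the append and attention kernels.

```csharp
using System;

// Hypothetical helper: where two verify tokens would be stored and which RoPE positions they get.
public static class EvictedBatchPlan
{
    public static (int Slot, int RopePosition)[] PlanTwoTokens(
        int physicalLength, int logicalLength, int startPos)
    {
        // Relaxed precondition: compare against the logical (RoPE) position, not the slot count.
        if (logicalLength != startPos)
            throw new InvalidOperationException(
                $"LogicalLength={logicalLength} != startPos={startPos}.");

        return new (int Slot, int RopePosition)[]
        {
            (physicalLength, logicalLength),         // first token: next free slot, position +0
            (physicalLength + 1, logicalLength + 1), // second token: following slot, position +1
        };
    }
}
```

On an unevicted cache the two lengths are equal, so slots and positions coincide and the plan reduces to what `BatchForward2` already assumes.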

