Switch remote backend from polling to streaming SSE by JadenFiotto-Kaufman · Pull Request #112 · ndif-team/workbench

JadenFiotto-Kaufman · 2026-04-20T19:01:23Z

Summary

Replace the old two-step polling flow (HTTP submit → browser polls NDIF → HTTP fetch-results) with a single SSE endpoint per interpretability tool. The FastAPI route opens one stream to the client, forwards each NDIF status update as it arrives, downloads the final result inside the backend, formats it, and emits a single terminal `data` event. The browser no longer talks to NDIF directly.

Depends on ndif-team/nnsight#648 (splits `submit_request` / `handle_response` and adds `async_submit_request`) and AdamBelfki3/nnsightful#2 (tools return the backend when provided).

How it works

`workbench/_api/streaming_backend.py::StreamingRemoteBackend` subclasses nnsight's `RemoteBackend`:

`__call__(tracer)` (sync, fires from `trace` / `session` `__exit__`) captures the tracer and serializes the request, with no I/O.
`__aiter__()` (async) opens a `socketio.AsyncSimpleClient`, stamps its session id into the headers, async-POSTs via the parent's new `async_submit_request`, and yields `ResponseModel`s as they come in. On `COMPLETED` it downloads the result via the parent's `async_get_result`, replaces `response.data` with the decoded dict, yields once more, and stops. On `ERROR` it yields and raises `RemoteException`.
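
The split between a synchronous capture step and an asynchronous iteration step can be illustrated with a toy class. This is a minimal sketch, not the code in this PR: `ToyDeferredBackend`, `FakeResponse`, `RemoteError`, and the scripted status list are hypothetical stand-ins for `StreamingRemoteBackend`, nnsight's `ResponseModel`, `RemoteException`, and the socket feed. It also simplifies the `COMPLETED` path to a single yield that already carries the data.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for nnsight's ResponseModel."""
    status: str
    data: Optional[Any] = None


class RemoteError(Exception):
    """Hypothetical stand-in for RemoteException."""


class ToyDeferredBackend:
    """Captures a request synchronously; performs all I/O during async iteration."""

    def __init__(self, scripted_statuses: list) -> None:
        # Canned statuses replace the real socket feed (assumption for the demo).
        self._scripted = scripted_statuses
        self.request: Optional[str] = None

    def __call__(self, tracer: Any) -> None:
        # Sync phase: serialize only, no network traffic.
        self.request = repr(tracer)

    async def __aiter__(self) -> AsyncIterator[FakeResponse]:
        if self.request is None:
            raise RuntimeError("backend was never primed via __call__")
        for status in self._scripted:
            await asyncio.sleep(0)  # placeholder for awaiting the next socket message
            if status == "COMPLETED":
                # The real backend would download and decode the result here.
                yield FakeResponse(status, data={"echo": self.request})
                return
            yield FakeResponse(status)
            if status == "ERROR":
                # The error status is yielded first, then surfaced as an exception.
                raise RemoteError("remote job failed")


async def collect(backend: ToyDeferredBackend) -> list:
    """Drain the backend into a list of responses."""
    return [response async for response in backend]
```

Because nothing touches the network until iteration starts, the route can prime the backend inside an ordinary synchronous `trace` context and hand the same object to an async generator afterwards.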

`workbench/_api/sse.py` holds the shared SSE helpers (`sse_event`, `stream_backend`, `stream_value`, `stream_error`). Each route then reduces to:

```python
backend = state.make_streaming_backend(model=model)
tool._run(model, ..., remote=True, backend=backend)  # primes backend

def process(raw):
    raw["tokenizer"] = ...  # local context
    return tool._format(raw, ...)  # -> ToolData / BaseModel

return StreamingResponse(stream_backend(backend, process), media_type=MEDIA_TYPE)
```
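
For readers unfamiliar with the wire format, a frame formatter along the lines of `sse_event` might look like the sketch below. The signature and the choice of JSON for the payload are assumptions; the PR does not show the helper's body.

```python
import json
from typing import Any


def sse_event(event: str, payload: Any) -> str:
    """Format one Server-Sent Events frame (hypothetical signature)."""
    # json.dumps without indent emits no raw newlines, so one data line suffices.
    body = json.dumps(payload)
    # A blank line terminates the frame.
    return f"event: {event}\ndata: {body}\n\n"
```

For example, `sse_event("status", {"status": "QUEUED"})` produces an `event: status` line, a `data:` line with the JSON object, and a terminating blank line.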

For non-`Tool` routes (predictions, generation, lens line/grid) the same pattern applies — each route factors its inline `model.trace` into a `trace` and `format` pair.

Endpoints

Collapsed pairs into single `/run*` SSE routes:

| Before | After |
| --- | --- |
| `POST /logit_lens/start` + `POST /logit_lens/results/{job_id}` | `POST /logit_lens/run` |
| `POST /activation_patching/start` + `.../results/{job_id}` | `POST /activation_patching/run` |
| `POST /models/start-prediction` + `.../results-prediction/{job_id}` | `POST /models/run-prediction` |
| `POST /models/start-generate` + `.../results-generate/{job_id}` | `POST /models/run-generate` |
| `POST /lens/start-line` + `.../results-line/{job_id}` | `POST /lens/run-line` |
| `POST /lens/start-grid` + `.../results-grid/{job_id}` | `POST /lens/run-grid` |

Each stream emits N×`event: status` frames (with the `ResponseModel` minus its `data` field) followed by one terminal frame — either `event: data` with the JSON-encoded payload or `event: error` with an `{error}` object.
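
The frame sequence described above can be sketched as an async generator that wraps any stream of responses. This is an illustration under assumptions, not the PR's `stream_backend`: responses are modelled as plain dicts, and `frame`, `stream_frames`, and `fake_responses` are hypothetical names.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Callable


def frame(event: str, payload: Any) -> str:
    """Format one SSE frame with a JSON payload."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


async def stream_frames(
    responses: AsyncIterator[dict],
    process: Callable[[dict], Any],
) -> AsyncIterator[str]:
    """Yield status frames, then a terminal data or error frame."""
    try:
        async for response in responses:
            if response["status"] == "COMPLETED":
                # Terminal frame: the formatted payload.
                yield frame("data", process(response["data"]))
                return
            # Status frames carry the response without its data field.
            status = {k: v for k, v in response.items() if k != "data"}
            yield frame("status", status)
    except Exception as exc:
        # Terminal frame: any failure upstream or in process() becomes an error event.
        yield frame("error", {"error": str(exc)})


async def fake_responses() -> AsyncIterator[dict]:
    """Scripted responses standing in for the streaming backend."""
    for status in ("RECEIVED", "RUNNING"):
        yield {"id": "job-1", "status": status, "data": None}
    yield {"id": "job-1", "status": "COMPLETED", "data": {"logits": [0.1, 0.9]}}


async def collect_frames() -> list:
    return [f async for f in stream_frames(fake_responses(), lambda raw: raw)]
```

Running `asyncio.run(collect_frames())` yields two `status` frames followed by one `data` frame.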

Frontend

New `workbench/_web/src/lib/runAndStream.ts` — a `fetch` + `ReadableStream` SSE parser. Dispatches `status` frames to `useWorkspace.setJobStatus`, resolves the promise on the `data` frame, throws on `error` or a premature end-of-stream.
`useLens2`, `useActivationPatching`, `usePrediction`, `useGenerate`, `useLensLine`, `useLensGrid` all now call `runAndStream` against the new endpoints.
`config.ts` collapses the old `start*` + `results*` entries into single `run*` entries and drops `ndifStatusUrl` — the browser no longer polls NDIF.
`lib/startAndPoll.ts` is deleted.
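
The real client is TypeScript (`runAndStream.ts`) and is not reproduced here. To keep the examples in one language, the sketch below mirrors the parsing logic in Python: buffer chunks, split frames on blank lines, dispatch `status`, return on `data`, and raise on `error` or a premature end. All names (`run_and_stream`, `parse_frame`, `StreamError`) are hypothetical.

```python
import json
from typing import Any, Callable, Iterable, Tuple


class StreamError(Exception):
    """Raised on an error frame or a stream that ends without a data frame."""


def parse_frame(block: str) -> Tuple[str, str]:
    """Split one SSE frame into (event name, raw data string)."""
    event, data_lines = "message", []
    for line in block.split("\n"):
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    return event, "\n".join(data_lines)


def run_and_stream(chunks: Iterable[str], on_status: Callable[[Any], None]) -> Any:
    """Consume text chunks, dispatch status frames, return the data payload."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Frames end with a blank line; a chunk may stop in the middle of a frame.
        while "\n\n" in buffer:
            block, buffer = buffer.split("\n\n", 1)
            event, raw = parse_frame(block)
            if not raw:
                continue  # comment or keep-alive frame with no data
            payload = json.loads(raw)
            if event == "status":
                on_status(payload)
            elif event == "data":
                return payload
            elif event == "error":
                raise StreamError(payload.get("error", "unknown error"))
    raise StreamError("stream ended without a data frame")
```

The buffering matters because network chunk boundaries need not line up with frame boundaries; a frame is only parsed once its terminating blank line has arrived.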

Commit layout

Add StreamingRemoteBackend + SSE helpers — `streaming_backend.py`, `sse.py`, `state.make_streaming_backend`.
Migrate logit_lens + activation_patching routes to SSE — tool-based routes.
Migrate models + lens routes to SSE — non-tool routes.
Switch frontend from polling to SSE — `runAndStream` + rewire + delete `startAndPoll`.
StreamingRemoteBackend: use parent async_submit_request — cleanup once ndif-team/nnsight#648 (Split submit_request from handle_response; add async submit + get_response) is in.

Live verification

Smoke-tested end-to-end against api.ndif.us with `openai-community/gpt2`: `POST /logit_lens/run` streams RECEIVED → QUEUED → DISPATCHED → RUNNING status frames, then one `data` frame carrying the full `LogitLensData`. No errors.

Test plan

Local mode (`REMOTE=false`): logit-lens, activation-patching, prediction, generate, lens line, lens grid each return their expected payload via a single `data` SSE event.
Remote mode (`REMOTE=true`): logit-lens on gpt2 streams status frames and delivers a final data frame matching the pre-change output.
Other five remote flows verified end-to-end.
Remote mode: simulated NDIF error propagates as an `event: error` frame and surfaces on the client.
Browser network tab shows one long-lived request per run (no polling loop, no direct calls to `api.ndif.us`).
Chart thumbnails still upload on success (Lens2 path doesn't require the capture; V1 lens path does).

StreamingRemoteBackend is a subclass of nnsight's RemoteBackend that defers submission and status-waiting so the caller can forward each status update to the browser over Server-Sent Events.

- `__call__` (sync, fired from trace/session `__exit__`) captures the tracer and serializes the request; no I/O.
- `__aiter__` opens an async WebSocket, stamps the session id into the headers, async-POSTs the submit via httpx.AsyncClient, and yields ResponseModels from the socket as they arrive.
- On COMPLETED: download the result (replacing response.data with the decoded dict), yield once more, stop.
- On ERROR: yield, then raise RemoteException.

workbench/_api/sse.py factors out the shared frame formatter, the async generator that drives the backend, and small helpers for single-value local-mode streams and error-only streams.

state.py exposes the new backend via make_streaming_backend(model); make_backend is kept so any remaining legacy callers still compile.

Replace the old /start + /results/{job_id} pair for each tool with a single POST /run endpoint that streams status events from NDIF directly to the client and emits one final data event carrying the formatted ToolData. The browser no longer polls NDIF for status.

The endpoint flow:

    backend = state.make_streaming_backend(model=model)
    tool._run(model, ..., remote=True, backend=backend)
    async for response in backend:
        if response.status == COMPLETED:
            raw = response.data  # dict of save-keyed tensors
            raw.update({tokenizer, input_tokens, model_name, ...})
            yield data event (tool._format(raw, ...))
        else:
            yield status event

Convert the remaining polling endpoints — /models/start-prediction, /models/start-generate, /lens/start-line, /lens/start-grid — and their companion /results/{job_id} routes into single /run-* SSE endpoints using the same pattern as the tool routes. These routes don't use the nnsightful Tool class, so each one splits its existing function into a _trace_* (runs model.trace / model.generate and saves outputs) and a _format_* (builds the response from the raw dict or the live tensors returned locally). In remote mode, the routes iterate the streaming backend; in local mode they call the trace directly and emit a single data event. Existing telemetry milestones are preserved at STARTED / 403 ERROR; the READY / COMPLETE milestones previously logged against the job id are dropped because the id isn't assigned until iteration begins.
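
The trace/format split and the local-versus-remote branch can be sketched as follows. Everything here is a hypothetical stand-in: `_trace_demo` and `_format_demo` take the place of a route's real `_trace_*` / `_format_*` pair, responses are plain dicts, and `fake_backend` replaces the primed streaming backend.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Optional


def _frame(event: str, payload: Any) -> str:
    """Format one SSE frame with a JSON payload."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def _trace_demo(prompt: str) -> dict:
    """Hypothetical trace step; a real route would run model.trace here."""
    return {"tokens": prompt.split()}


def _format_demo(raw: dict) -> dict:
    """Hypothetical format step; builds the response from the raw outputs."""
    return {"num_tokens": len(raw["tokens"])}


async def run_demo(
    prompt: str,
    remote: bool,
    backend: Optional[AsyncIterator[dict]] = None,
) -> AsyncIterator[str]:
    if not remote:
        # Local mode: call the trace directly and emit a single data event.
        yield _frame("data", _format_demo(_trace_demo(prompt)))
        return
    # Remote mode: iterate the (already primed) streaming backend.
    async for response in backend:
        if response["status"] == "COMPLETED":
            yield _frame("data", _format_demo(response["data"]))
            return
        yield _frame("status", {"status": response["status"]})


async def fake_backend() -> AsyncIterator[dict]:
    """Scripted stand-in for the streaming backend."""
    yield {"status": "QUEUED", "data": None}
    yield {"status": "COMPLETED", "data": {"tokens": ["a", "b", "c"]}}


async def collect(stream: AsyncIterator[str]) -> list:
    return [f async for f in stream]
```

Keeping the format step independent of where the raw outputs came from is what lets both modes share one code path for building the payload.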

Add lib/runAndStream.ts, a fetch + ReadableStream SSE client that parses status/data/error frames, forwards job status to useWorkspace.setJobStatus, resolves on the data frame, and throws on the error frame or a stream that ends without one.

Repoint every API hook at the new /run-* endpoints and remove the polling helper:

- lensApi.ts: useLens2 → /logit_lens/run
- activationPatchingApi.ts: useActivationPatching → /activation_patching/run
- modelsApi.ts: usePrediction → /models/run-prediction, useGenerate → /models/run-generate
- chartApi.ts: useLensLine → /lens/run-line, useLensGrid → /lens/run-grid

config.ts: collapse the start* + results* endpoint pairs into single run* entries and drop ndifStatusUrl; the browser no longer talks to NDIF directly.

lib/startAndPoll.ts is deleted.


The duplicated _async_submit in StreamingRemoteBackend existed only because the parent RemoteBackend didn't expose an async submit. With ndif-team/nnsight#648 merged, we can delete it and call super().async_submit_request() directly — which also means the job_id bookkeeping (self.job_id = response.id) is handled by the parent, not open-coded here. Depends on ndif-team/nnsight#648.

JadenFiotto-Kaufman added 6 commits April 17, 2026 11:18

allow localhost origin

4fbc18d

Merge branch 'main' of github.com:ndif-team/workbench into main

ebb70f7

JadenFiotto-Kaufman mentioned this pull request Apr 20, 2026

Split submit_request from handle_response; add async submit + get_response ndif-team/nnsight#648

Merged

5 tasks

JadenFiotto-Kaufman wants to merge 7 commits into main from feat/sse-remote-backend
