feat(engine): PersonDetector.detect_batch — batched inference primitive (P4 foundation)#97

Merged
TCVinNYC merged 1 commit into main from feat/p4-detect-batch on Jun 24, 2026
@TCVinNYC

Foundation for cross-camera batching (P4). Adds PersonDetector.detect_batch(frames), which letterboxes N frames into one [N,3,S,S] tensor, makes a single session.run call, and parses each output slice exactly as the single-frame detect() does. It returns one list[Detection] per frame, and detect_batch(frames)[i] is guaranteed equal to detect(frames[i]): batching is purely a performance optimization.

  • Fixed-batch-1 fallback: if the ONNX model was exported with a fixed batch dimension of 1, detect_batch loops over per-frame detect() calls (saving and restoring the counter), so the method stays correct regardless of how the model was exported. The true N>1 batched path runs on dynamic-batch detectors.
  • detect() (single-frame) is untouched; detect_interval gating is left to callers (documented).
  • Tests: batched==per-frame equivalence on 3 frames, empty/single-frame, and the fixed-batch-1 fallback. Full suite green.
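
The contract described above can be illustrated with a minimal sketch. This is not the code from this PR: `FakeSession`, `Detector`, `letterbox`, the `fixed_batch` attribute, the `frame_count` counter, and the per-channel-max "model" are all hypothetical stand-ins, chosen so the example runs with NumPy alone. It assumes the counter mentioned in the fallback bullet is a per-call frame counter.

```python
import numpy as np


def letterbox(frame, size):
    """Nearest-neighbour resize that keeps aspect ratio, then pad to size x size.

    Returns a CHW float32 array in [0, 1]. Hypothetical helper, not the PR's code.
    """
    h, w = frame.shape[:2]
    scale = min(size / h, size / w)
    nh, nw = max(1, int(round(h * scale))), max(1, int(round(w * scale)))
    ys = np.minimum((np.arange(nh) / scale).astype(int), h - 1)
    xs = np.minimum((np.arange(nw) / scale).astype(int), w - 1)
    canvas = np.full((size, size, 3), 114, dtype=np.uint8)  # grey padding
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = frame[ys][:, xs]
    return canvas.transpose(2, 0, 1).astype(np.float32) / 255.0


class FakeSession:
    """Stand-in for an inference session (NOT onnxruntime's API)."""

    def __init__(self, fixed_batch=None):
        self.fixed_batch = fixed_batch  # None means dynamic batch dimension
        self.calls = 0

    def run(self, batch):
        if self.fixed_batch is not None and batch.shape[0] != self.fixed_batch:
            raise ValueError("batch size does not match the fixed batch dim")
        self.calls += 1
        return batch.max(axis=(2, 3))  # toy "model": per-channel max, shape [N, 3]


class Detector:
    def __init__(self, session, size=64):
        self.session = session
        self.size = size
        self.frame_count = 0  # assumed meaning of "counter" in the description

    def _parse(self, row):
        # One shared parser, so batched and single-frame results cannot diverge.
        return [float(v) for v in row]

    def detect(self, frame):
        self.frame_count += 1
        out = self.session.run(letterbox(frame, self.size)[None])
        return self._parse(out[0])

    def detect_batch(self, frames):
        if not frames:
            return []
        if self.session.fixed_batch == 1:
            # Fallback: per-frame loop, leaving the counter as the caller saw it.
            saved = self.frame_count
            try:
                return [self.detect(f) for f in frames]
            finally:
                self.frame_count = saved
        batch = np.stack([letterbox(f, self.size) for f in frames])  # [N,3,S,S]
        out = self.session.run(batch)  # single run for all N frames
        return [self._parse(row) for row in out]


rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8)
          for h, w in [(48, 64), (100, 30), (64, 64)]]
detector = Detector(FakeSession())
per_frame = [detector.detect(f) for f in frames]
batched = detector.detect_batch(frames)
print(batched == per_frame)
```

Routing both paths through the same parser is one way to make the equivalence hold by construction rather than by test alone; whether the PR does exactly this is not shown here.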

Scope: the primitive only. The cross-thread coalescing scheduler (collect concurrent per-camera detect() calls within a small window, then issue one batched run; behind a flag) is the follow-up. It targets GPU throughput rather than CPU and needs a real multi-camera GPU rig to validate. Noted, not built here.
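
For readers unfamiliar with the coalescing idea, a rough sketch of the pattern follows. It is purely illustrative and is not part of this PR or the planned design: the `Coalescer` class and its `window_s` and `max_batch` parameters are hypothetical names, and the batched function is passed in as a plain callable.

```python
import queue
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class Coalescer:
    """Collects concurrent detect() calls for a short window, then runs one batch."""

    def __init__(self, detect_batch, window_s=0.005, max_batch=8):
        self._detect_batch = detect_batch  # callable: list of frames -> list of results
        self._window_s = window_s
        self._max_batch = max_batch
        self._q = queue.Queue()
        self._worker = threading.Thread(target=self._loop, daemon=True)
        self._worker.start()

    def detect(self, frame):
        """Called from any camera thread; blocks until the batched result is ready."""
        fut = Future()
        self._q.put((frame, fut))
        return fut.result()

    def close(self):
        self._q.put(None)  # sentinel: stop the worker after pending work
        self._worker.join()

    def _loop(self):
        while True:
            item = self._q.get()
            if item is None:
                return
            pending, stop = [item], False
            deadline = time.monotonic() + self._window_s
            # Keep collecting until the window closes or the batch is full.
            while len(pending) < self._max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    nxt = self._q.get(timeout=remaining)
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                pending.append(nxt)
            try:
                results = self._detect_batch([frame for frame, _ in pending])
            except Exception as exc:  # propagate the failure to every waiting caller
                for _, fut in pending:
                    fut.set_exception(exc)
            else:
                for (_, fut), res in zip(pending, results):
                    fut.set_result(res)
            if stop:
                return


def demo():
    # Toy batched function standing in for detect_batch: doubles each "frame".
    coalescer = Coalescer(lambda frames: [f * 2 for f in frames], window_s=0.05)
    with ThreadPoolExecutor(max_workers=4) as pool:
        out = list(pool.map(coalescer.detect, range(4)))
    coalescer.close()
    return out
```

The window trades a small amount of per-call latency for fewer, larger runs, which is why the description frames it as a GPU-throughput concern that needs validation on real hardware.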

🤖 Generated with Claude Code

…tive (P4)

Letterboxes N frames into one [N,3,S,S] tensor, runs a single session.run, and parses
each output slice with the same logic as detect(), so detect_batch(frames)[i] == detect(frames[i]).
Degrades gracefully to a per-frame detect() loop for fixed-batch-1 ONNX exports.
9 new tests; 1015 total passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
TCVinNYC merged commit 76ade94 into main on Jun 24, 2026
3 checks passed
TCVinNYC mentioned this pull request on Jun 24, 2026
TCVinNYC added a commit that referenced this pull request on Jun 24, 2026
Second **pre-release** of the 2.2.0 reliability + performance roadmap,
for real-hardware validation before the stable 2.2.0. Bumps
`__version__` to `2.2.0-rc2`; on merge, tag `v2.2.0-rc2` triggers the
pre-release installer build.

**New since rc1 (all merged, CI-green + multi-agent-reviewed):**
- #92 CPU subservice governor — fewer CPU spikes (phase-stagger) +
cadence throttle under load
- #93 preview-rate-cap (~20fps)
- #90 live GPU-acceleration verdict in the Services panel
- #91 OpenVINO auto-install for Intel
- #94 scene-adaptive ReID threshold
- #95/#96 opt-in fused TargetAssociator (off by default)
- #97 batched detect_batch primitive

**Validate especially:** CPU usage/spikes with all subservices running
(your priority), and `python -m autoptz --bench` / the Services-panel
verdict on Intel-Mac+AMD. To try the new tracking logic, set
`tracking.use_target_associator = true`.

Follow-ups after your validation: P4 coalescing scheduler, wire the
associator ReID/pose cues + flip its default, int8 CPU quant, stable
Win/Linux device binding.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>