✨ Add --fail-on-diff option and improve SDK resilience#169
Conversation
Visual diffs no longer break tests by default - SDKs can now choose whether
to fail on diffs via CLI flag, env var, or per-snapshot option.
**Server changes:**
- Visual diffs return 200 with `status: 'diff'` instead of 422
- Consistent response format: `status` field is 'match', 'diff', 'new', or 'baseline-updated'
- SDKs can decide whether to fail tests based on the status
**CLI changes:**
- Add `--fail-on-diff` flag to `vizzly tdd start` and `vizzly tdd run`
- Store `failOnDiff` in server.json for SDK discovery
**Ember SDK changes:**
- Read `failOnDiff` from server.json, env var (VIZZLY_FAIL_ON_DIFF), or per-snapshot option
- Gracefully skip snapshots when no server is running (like Percy)
- Add `failOnDiff` option to `vizzlySnapshot()` for per-snapshot control
**Priority order for failOnDiff:**
1. Per-snapshot option: `vizzlySnapshot('name', { failOnDiff: true })`
2. Environment variable: `VIZZLY_FAIL_ON_DIFF=true`
3. CLI flag: `vizzly tdd start --fail-on-diff`
4. Default: false (diffs logged as warnings, tests pass)
Code Review: Add --fail-on-diff option and improve SDK resilienceOverall, this is a well-designed and well-tested PR that successfully decouples visual diff detection from test failure. The implementation is clean, consistent, and follows good engineering practices. ✅ Strengths1. Excellent API Design
2. Strong Test Coverage
3. Good Documentation
4. Backwards Compatibility
🔍 Issues & Suggestions1. Missing Test: Per-Snapshot failOnDiff OverrideSeverity: Medium The Ember SDK supports vizzlySnapshot('name', { failOnDiff: true }) but there's no test verifying this actually works. This is a critical feature that should be tested. Location: clients/ember/src/test-support/index.js:278 2. Inconsistent Error Handling for Missing diffPercentageSeverity: Low In clients/ember/src/test-support/index.js:282, you use optional chaining for result.diffPercentage?.toFixed(2), but the server always returns diffPercentage when status === 'diff'. The optional chaining suggests the field might be missing, which could hide bugs. Location: clients/ember/src/test-support/index.js:282 3. Potential Race Condition in Server Discovery CacheSeverity: Low The cachedServerInfo in snapshot-server.js is cached globally but only cleared in tests. If the TDD server is restarted with different failOnDiff settings during a long-running test session, the cache won't reflect the new value. Location: clients/ember/src/launcher/snapshot-server.js:37-40 4. Missing Environment Variable DocumentationSeverity: Low The PR adds VIZZLY_FAIL_ON_DIFF environment variable, but it's not documented in CLAUDE.md (which lists other env vars around line 120). 5. Function Name Could Be ClearerSeverity: Very Low (nitpick) The function shouldFailOnDiff() in test-support/index.js:142 only checks the environment variable, not the full priority chain. Consider renaming to getEnvFailOnDiff() for clarity. 🎯 Code QualityExcellent:
🔒 Security & PerformanceNo security concerns identified. The changes don't introduce any new attack surfaces or handle sensitive data. Performance impact: Negligible. Minimal I/O impact and trivial CPU cost for boolean flag checks. ✨ Final VerdictThis is production-ready with minor improvements recommended. The core design is sound, the implementation is clean, and test coverage is good. Address Issue #1 (missing per-snapshot test) before merging, and consider the other suggestions for future polish. Estimated risk: Low - Changes are well-isolated, backwards-compatible, and thoroughly tested. Nice work on this feature! The API design (especially the priority chain and the 200-not-422 decision) shows thoughtful engineering. |
Fixes #179 - Vitest client was checking for 'passed'/'failed' status values but PR #169 changed the TDD handler to return 'match'/'diff'. **Vitest SDK fix:** - Changed status checks from 'passed' to 'match' - Changed status checks from 'failed' to 'diff' - Added handling for 'baseline-updated' status **CI improvements:** - Added `test-sdk-e2e` job that always runs (not conditional on path changes) - Tests Core JS Client, Vitest, Ember, and Ruby SDKs end-to-end - Removed redundant `test-site` job (now part of test-sdk-e2e) - Shared Playwright browser cache across SDK tests This ensures CLI changes that break SDK integrations are caught before merge.
## Summary - Fix Vitest SDK to use correct status values from TDD handler (`match`/`diff` instead of `passed`/`failed`) - Add `test-sdk-e2e` CI job that always runs to catch CLI changes that break SDK integrations - Remove redundant `test-site` job (consolidated into `test-sdk-e2e`) - Add shared Playwright browser cache across SDK tests for faster CI Fixes #179 ## Changes **Vitest SDK (`clients/vitest/src/setup.js`):** - Changed `'passed'` → `'match'` status check - Changed `'failed'` → `'diff'` status check - Added handling for `'baseline-updated'` status **CI Workflow (`.github/workflows/ci.yml`):** - Added `test-sdk-e2e` job that runs on every PR/push (not conditional on SDK file changes) - Tests Core JS Client, Vitest SDK, Ember SDK, and Ruby SDK end-to-end - Consolidated `test-site` into `test-sdk-e2e` - Shared Playwright browser cache (chromium + firefox) across all SDK tests ## Context PR #169 changed the TDD handler to return `'match'`/`'diff'` instead of `'passed'`/`'failed'`, but the Vitest client wasn't updated. This caused the error "Unknown comparison status: match" reported in #179. The SDK E2E tests would have caught this - they now run on every PR to prevent similar regressions. See #180 for follow-up work to add Storybook and Static-site SDK E2E tests. ## Test plan - [x] CLI lint passes - [x] CLI tests pass (1784/1784) - [x] Vitest client lint passes - [x] Vitest client unit tests pass (12/12) - [x] CI workflow YAML is valid - [ ] CI `test-sdk-e2e` job passes (will verify in this PR)
Summary
status: 'diff'instead of 422 error--fail-on-diffCLI flag forvizzly tdd startandvizzly tdd runChanges
Server:
200withstatus: 'diff'instead of422statusfield:'match','diff','new', or'baseline-updated'CLI:
--fail-on-diffflag onvizzly tdd startandvizzly tdd runfailOnDiffstored inserver.jsonfor SDK discoveryEmber SDK:
failOnDifffrom server.json, env var, or per-snapshot optionfailOnDiffoption onvizzlySnapshot()for per-snapshot controlUsage
Test plan
tdd-handler.test.jsfor new 200 response formatgetServerInfo()readingfailOnDiff