feat(external-api): round-trip dashboard containers, tabs, and tile container/tab refs #2201

Merged
kodiakhq[bot] merged 11 commits into main from alex/HDX-2150-external-api-containers-tabs on May 11, 2026

Conversation

@alex-fedotyev
Contributor

@alex-fedotyev alex-fedotyev commented May 5, 2026

Summary

PR #2015 added a dashboard organization layer (containers with optional tabs, plus per-tile containerId and tabId) but the v2 external API was not updated to round-trip the new fields. External integrations that build dashboards programmatically had no way to use the new layer.

This wires the full set of fields through CREATE / GET / LIST / UPDATE on /api/v2/dashboards. Dashboards saved without containers round-trip unchanged.

Closes #2150. Follow-up to #2015 (commit 7665fbe).
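
As a rough illustration of the round-trip guarantee in the summary, the read-side conversion can guard on the array length so that a missing or empty `containers` array never shows up in the response. This is a minimal sketch, not the code in this PR: `StoredDashboard`, `ExternalDashboard`, and `toExternalDashboard` are hypothetical stand-ins for the real types and for `convertToExternalDashboard`.

```typescript
// Hypothetical, trimmed-down shapes used only for this illustration.
interface Container {
  id: string;
}

interface StoredDashboard {
  id: string;
  name: string;
  // Per the PR description, Mongoose yields [] when the field was never saved.
  containers?: Container[];
}

interface ExternalDashboard {
  id: string;
  name: string;
  containers?: Container[];
}

// Emits `containers` only when at least one container is present, so a
// dashboard saved without the layer round-trips with the field absent.
function toExternalDashboard(stored: StoredDashboard): ExternalDashboard {
  const external: ExternalDashboard = { id: stored.id, name: stored.name };
  if (stored.containers !== undefined && stored.containers.length > 0) {
    external.containers = stored.containers;
  }
  return external;
}
```

With this guard, an explicit `containers: []` and a missing field both serialize the same way.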

What's in scope

  • Dashboard body Zod schema gains containers: DashboardContainer[]? (imported from @hyperdx/common-utils) and the tile schema gains containerId? and tabId?.
  • convertToExternalDashboard now emits containers (only when at least one is present, so dashboards without the layer round-trip with the field absent).
  • convertTileToExternalChart and convertToInternalTileConfig propagate containerId and tabId. The legacy series-format translator in externalApi.ts also propagates them so both code paths preserve the fields.
  • The containers: 1 projection is added to the Mongoose find and findOne calls.
  • New cross-field validation on the body schema:
    • container ids unique within a dashboard
    • tab ids unique within a container
    • tile containerId resolves to a real container
    • tile tabId resolves to a tab inside that container
    • tile tabId requires containerId to be set
  • OpenAPI JSDoc additions for DashboardContainer, DashboardContainerTab, the new tile fields, and the new dashboard field on Dashboard / CreateDashboardRequest / UpdateDashboardRequest. openapi.json regenerated.
  • A changeset entry.
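
The five cross-field rules listed above can be sketched as a plain function that collects issues. This is an illustrative sketch only: the PR implements the checks inside a Zod schema, and the type names, the `validateLayout` helper, and the message wording below are assumptions made for this example.

```typescript
// Hypothetical, simplified shapes; the real types live in @hyperdx/common-utils.
interface Tab {
  id: string;
}
interface Container {
  id: string;
  tabs?: Tab[];
}
interface Tile {
  id: string;
  containerId?: string;
  tabId?: string;
}
interface DashboardBody {
  tiles: Tile[];
  containers?: Container[];
}

// Returns one message per violated rule; an empty array means the layout is valid.
function validateLayout(body: DashboardBody): string[] {
  const issues: string[] = [];
  const containerById = new Map<string, Container>();

  for (const container of body.containers ?? []) {
    // Rule 1: container ids are unique within a dashboard.
    if (containerById.has(container.id)) {
      issues.push(`Duplicate container id "${container.id}"`);
    } else {
      containerById.set(container.id, container);
    }
    // Rule 2: tab ids are unique within a container.
    const seenTabIds = new Set<string>();
    for (const tab of container.tabs ?? []) {
      if (seenTabIds.has(tab.id)) {
        issues.push(`Duplicate tab id "${tab.id}" in container "${container.id}"`);
      }
      seenTabIds.add(tab.id);
    }
  }

  for (const tile of body.tiles) {
    if (tile.containerId === undefined) {
      // Rule 5: tabId requires containerId to be set.
      if (tile.tabId !== undefined) {
        issues.push(`Tile "${tile.id}": tabId requires containerId`);
      }
      continue;
    }
    // Rule 3: containerId resolves to a real container.
    const container = containerById.get(tile.containerId);
    if (container === undefined) {
      issues.push(`Tile "${tile.id}": unknown containerId "${tile.containerId}"`);
      continue;
    }
    // Rule 4: tabId resolves to a tab inside that container.
    const tabId = tile.tabId;
    if (tabId !== undefined && !(container.tabs ?? []).some((t) => t.id === tabId)) {
      issues.push(`Tile "${tile.id}": unknown tabId "${tabId}"`);
    }
  }
  return issues;
}
```

A tile in a tabbed container that omits `tabId` passes all five rules in this sketch.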

Out of scope

Each item below has a tracking issue so the gap is visible after merge.

Tier

The triage classifier marks packages/api/src/routers/external-api/v2/* as critical-path, so this lands as Tier 4 by directory rule, even though the diff is small (~284 prod lines) and additive. Splitting further would separate the body schema, the conversion utilities, and the route wiring from each other and not actually reduce review burden. Happy to break this up if there's a preferred way to slice it.

Test plan

  • yarn ci:lint (lint + tsc + spectral) on @hyperdx/common-utils, @hyperdx/api, @hyperdx/app
  • yarn knip (no new unused exports)
  • Integration: yarn jest dashboards.test.ts -t "Containers and tabs", all 8 new tests pass
  • Integration: full yarn jest dashboards.test.ts, 86/86 tests pass (no regressions in old or new format suites)
  • Integration: yarn jest src/mcp/__tests__/dashboards.test.ts, 19/19 MCP dashboard tests pass (the MCP body schema is shared with the external API body schema, so this confirms the new validations don't break the MCP path)
  • openapi.json regenerated and committed; spectral lint passes

@vercel

vercel Bot commented May 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| hyperdx-oss | Ready | Preview, Comment | May 11, 2026 7:56pm |


@changeset-bot

changeset-bot Bot commented May 5, 2026

🦋 Changeset detected

Latest commit: 77aa4a2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
@hyperdx/api Minor
@hyperdx/common-utils Patch
@hyperdx/app Minor
@hyperdx/otel-collector Minor

@vercel

vercel Bot commented May 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| hyperdx-oss | Building | Preview, Comment | May 5, 2026 8:19pm |


@github-actions github-actions Bot added the review/tier-4 Critical — deep review + domain expert sign-off label May 5, 2026
@github-actions
Contributor

github-actions Bot commented May 5, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (2):
    • packages/api/src/routers/external-api/v2/dashboards.ts
    • packages/api/src/routers/external-api/v2/utils/dashboards.ts
  • Cross-layer change: touches backend (packages/api) + shared utils (packages/common-utils)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 7
  • Production lines changed: 674 (+ 1069 in test files, excluded from tier calculation)
  • Branch: alex/HDX-2150-external-api-containers-tabs
  • Author: alex-fedotyev

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@github-actions
Contributor

github-actions Bot commented May 5, 2026

PR Review

Thorough PR with comprehensive validation, good bounds, backward-compat handling, and 26+ new integration/unit tests covering happy paths, error paths, legacy-doc self-heal, and PUT preserve-on-omit. The schema/handler split for container-ref validation is correct: structure checks stay at the schema level, but tile-ref resolution moves to the handler so PUT can fall back to existing containers when the body omits the field.

✅ No critical issues found.

Minor observations (non-blocking)

  • ⚠️ Behavior change in convertToInternalTileConfig: removes the stale top-level name field from internal tile shape (now only on config.name). The comment notes the renderer never reads it, but this is a tile-write shape change beyond the stated PR scope — worth calling out explicitly in the changeset in case downstream consumers (e.g., Mongo-direct readers, dashboard exports) ever relied on the duplicate. Not a regression since the test confirms config.name is preserved.
  • ⚠️ Orphan-ref read-time warnings are unthrottled: convertTileToExternalChart calls logger.warn per offending tile per GET. A legacy dashboard with many orphan tiles will spam logs on every list/get. Consider aggregating to one log line per dashboard, or sampling. Not a correctness issue.
  • ⚠️ Read-path silently dedupes duplicate container ids via last-write-wins when building containerById in convertToExternalDashboard. Comment acknowledges this, but the failure mode (one container "wins" and the other's tiles get their refs dropped as "orphan") is asymmetric and lossy on legacy data. A warn-log when containers.length !== containerById.size would make this debuggable.
  • ⚠️ min(1) tightening on TileSchema.containerId/tabId in common-utils/types.ts flows into the canonical DashboardSchema used by the internal API too, not just the external API. Any legacy Mongo doc with containerId: "" would now fail parse through that schema. The external read path heals this, but if DashboardSchema.parse(...) is ever called against a legacy doc on the internal API side, it'd throw. The PR's test plan covers external-API round-trip but doesn't explicitly cover internal-API ingestion of legacy empty-string refs — worth a spot-check that no internal route parses raw Mongo docs through this schema.
  • ⚠️ collectTileContainerRefIssues builds a no-op Zod schema just to reuse the path-formatting. Functional, but indirect — directly invoking validateDashboardTileContainerRefs against a synthetic RefinementCtx (or formatting issues without Zod) would be more readable. Pure ergonomics, not a defect.
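
The size comparison suggested in the third observation could look roughly like the sketch below. It assumes a hypothetical `indexContainers` helper and an injected `warn` callback standing in for the real logger. The actual map construction in `convertToExternalDashboard` may differ.

```typescript
// Hypothetical minimal container shape for this illustration.
interface Container {
  id: string;
}

// Builds the id -> container lookup with last-write-wins semantics and reports
// when that dedupe actually dropped entries, so legacy data is debuggable.
function indexContainers(
  containers: Container[],
  warn: (message: string) => void,
): Map<string, Container> {
  const containerById = new Map<string, Container>();
  for (const container of containers) {
    containerById.set(container.id, container); // a later duplicate overwrites an earlier one
  }
  if (containers.length !== containerById.size) {
    const dropped = containers.length - containerById.size;
    warn(`dashboard has ${dropped} duplicate container id(s); last occurrence wins`);
  }
  return containerById;
}
```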

Strengths

  • DoS caps (DASHBOARD_MAX_TILES = 500, DASHBOARD_MAX_CONTAINERS = 50, DASHBOARD_CONTAINER_MAX_TABS = 20) are appropriately sized and documented.
  • EXTERNAL_DASHBOARD_PROJECTION constant nicely DRYs the two find call sites.
  • setPayload typed as Partial<IDashboard> instead of Record<string, unknown> is a real improvement.
  • Concurrency limitation is documented in OpenAPI rather than hidden.
  • The P0/P1/P2 negative test cases (PUT preserve-on-omit, tab-only orphan, container-only orphan, empty-string heal) read like they're driven by a deep-review punch list; they cover the exact subtle bugs this kind of refactor would normally introduce.

@github-actions
Contributor

github-actions Bot commented May 5, 2026

E2E Test Results

All tests passed • 168 passed • 3 skipped • 1192s

Status Count
✅ Passed 168
❌ Failed 0
⚠️ Flaky 5
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

alex-fedotyev added a commit that referenced this pull request May 5, 2026
After reading notes/principles/external-api-audit.md and walking
through the UI surface (useDashboardContainers.tsx, DashboardContainer.tsx,
GroupTabBar.tsx), three gaps were caught that the initial implementation
missed.

- OpenAPI parity: TileBase.containerId / tabId now declare minLength: 1
  to match the Zod schema's z.string().min(1).optional(). The Zod fix
  landed in the previous commit but the OpenAPI didn't pick up the
  constraint until JSDoc was updated and openapi.json regenerated.

- Test gap: explicit empty containers: [] now has its own round-trip
  test. The conversion normalizes [] back to absent on read (the
  existing length-guard makes this work), but the behavior wasn't
  asserted.

- Test gap: tile.containerId or tile.tabId set to an empty string is
  now explicitly rejected. Previously this would have failed
  cross-field validation only because no real container has id "",
  not because the tile-level rule fired.

UI invariants the API stays permissive about (auto-fixed by the UI
rather than rejected) are documented in the per-feature code map
under notes/repo-conventions/hyperdx/dashboards-containers-tabs.md
in the workspace.
@github-actions
Contributor

github-actions Bot commented May 6, 2026

Compound Engineering Review

Five specialist reviewers (TypeScript-quality, security, performance, architecture, simplicity) ran in parallel against the diff vs. base 2785597a. No P0s. Five P1s, mostly clustered around missing input caps, a missed projection cost, and an under-typed Mongoose update.

P1 — should resolve before merge

  • P1 packages/api/src/utils/zod.ts:423-424 + :474 tile.containerId/tabId have only min(1) (no .max) and externalDashboardTileListSchema has no .max on the array. With the 32MB body limit (api-app.ts:55) and tiles: Schema.Types.Mixed, a client can persist a huge tile list with multi-MB id strings that round-trip back through every GET → cap ids at DASHBOARD_CONTAINER_ID_MAX (256) to mirror DashboardContainerSchema, and add .max(N) (e.g. 500) to the tile list schema.
  • P1 packages/api/src/routers/external-api/v2/dashboards.ts:1398 — LIST endpoint now selects containers for every dashboard. Worst case ≈310KB × 200 dashboards = ~62MB JSON per LIST → drop containers from EXTERNAL_DASHBOARD_PROJECTION and require GET-by-id to read it (or paginate LIST).
  • P1 packages/api/src/routers/external-api/v2/dashboards.ts:1948 setPayload: Record<string, unknown> opts out of Mongoose's UpdateQuery<IDashboard> typing; a typo like setPayload.contianers = … compiles and silently no-ops → type as mongoose.UpdateQuery<IDashboard>['$set'] (or Partial<IDashboard>); the conditional-assign idiom still works.
  • P1 packages/common-utils/src/types.ts:1005-1015 DashboardSchema.containers.refine(...) only checks duplicate container ids; tab-uniqueness and tile→container/tab resolution live only in the API-side superRefine. Internal write paths (DashboardWithoutIdSchema) and any future MCP write tool will accept dashboards the external API rejects → lift the four invariants into a shared validateDashboardLayout(data) helper in common-utils and call from both DashboardSchema.superRefine and buildDashboardBodySchema.
  • P1 packages/api/src/routers/external-api/v2/utils/dashboards.ts:776 — Early return on hasDuplicateContainerId silences disjoint tile-level errors (unknown containerId, tabId requires containerId), forcing a two-step fix loop → drop the early return; if kept, gate only the tile→container resolution branch (the tabId requires containerId check is independent).
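
To illustrate the `setPayload` typing item above, here is a minimal sketch of the conditional-assign idiom with `Partial<IDashboard>`. The `IDashboard` interface below is a hypothetical, trimmed-down stand-in for the real Mongoose document type, and `buildSetPayload` is a name invented for this example.

```typescript
// Hypothetical stand-in for the real IDashboard model interface.
interface IDashboard {
  name: string;
  tiles: unknown[];
  containers: unknown[];
  tags: string[];
}

// Hypothetical PUT body shape: containers is optional (preserve-on-omit).
interface UpdateBody {
  name: string;
  tiles: unknown[];
  containers?: unknown[];
}

function buildSetPayload(body: UpdateBody): Partial<IDashboard> {
  const setPayload: Partial<IDashboard> = { name: body.name, tiles: body.tiles };
  // Conditional assign: when the body omits containers, the stored value is
  // left out of the $set payload and therefore untouched by the update.
  if (body.containers !== undefined) {
    setPayload.containers = body.containers;
  }
  // With Record<string, unknown>, a misspelling such as
  // `setPayload.contianers = body.containers` would compile; with
  // Partial<IDashboard> the compiler rejects the unknown property.
  return setPayload;
}
```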

P2 — worth addressing

  • P2 packages/common-utils/src/types.ts:899 — Tightening containerId/tabId to .min(1).optional() will reject any persisted tile with containerId: '' on next round-trip → run db.dashboards.find({"tiles.containerId":""}) before merge; if any rows exist, backfill or use .transform(s => s || undefined).
  • P2 packages/api/src/models/dashboard.ts:36 containers: Schema.Types.Array is free-form; defense-in-depth rests on Zod alone → declare a typed Mongoose subdoc mirroring DashboardContainerSchema, or add a Mongoose validator.
  • P2 packages/api/src/routers/external-api/v2/dashboards.ts:39 EXTERNAL_DASHBOARD_PROJECTION as const is not tied to IDashboard keys; dropping a model field leaves the projection silently stale → add satisfies mongoose.ProjectionType<IDashboard>.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:760 seenTabIds.add(tab.id) runs unconditionally, inconsistent with the container loop's else-branch placement; functionally identical but reads as a bug → move into the else.
  • P2 packages/api/src/utils/externalApi.ts:36-208 vs v2/utils/dashboards.ts:303-313, 490-516 — Two parallel external↔internal tile translators each got containerId/tabId plumbed through; future tile-layout fields will need both touched → extract pickTileLayoutFields(tile) returning {id, x, y, w, h, name, containerId?, tabId?} and call from both.
  • P2 packages/api/src/utils/externalApi.ts:206 — No test exercises the legacy series-format translator with containers/tabs → add one fixture in the legacy suite to lock parity.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:806 container.tabs?.find(t => t.id === tile.tabId) is O(T) per tile (T capped at 20 so OK now); since containerById is already built, attach a tabsById map per container during the first pass to make tile lookups O(1) → trivial restructure.
  • P2 packages/api/src/routers/external-api/v2/dashboards.ts:1191-1195, 1242-1246, 1293-1297 containers JSDoc duplicated verbatim across Dashboard/CreateDashboardRequest/UpdateDashboardRequest → extract a DashboardContainersField component and $ref it (mirrors what DashboardContainerTab already does).
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:786, 807 — Error messages echo client-supplied ids verbatim into 400 responses and Pino logs; with ids capped (see P1) the bloat risk reduces, but consider truncating echoed ids to ~64 chars regardless.
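
The `tabsById` restructure suggested in the P2 list can be sketched as follows. The shapes and the `indexContainers` / `tileRefResolves` helpers are hypothetical names for this example, not identifiers from the PR.

```typescript
// Hypothetical minimal shapes for this illustration.
interface Tab {
  id: string;
}
interface Container {
  id: string;
  tabs?: Tab[];
}
interface IndexedContainer {
  container: Container;
  tabsById: Map<string, Tab>;
}

// First pass: index containers by id and attach a per-container tab index.
function indexContainers(containers: Container[]): Map<string, IndexedContainer> {
  const byId = new Map<string, IndexedContainer>();
  for (const container of containers) {
    const tabsById = new Map<string, Tab>();
    for (const tab of container.tabs ?? []) {
      tabsById.set(tab.id, tab);
    }
    byId.set(container.id, { container, tabsById });
  }
  return byId;
}

// Per-tile lookup is now two constant-time map reads instead of a tabs scan.
function tileRefResolves(
  index: Map<string, IndexedContainer>,
  containerId: string,
  tabId?: string,
): boolean {
  const entry = index.get(containerId);
  if (entry === undefined) return false;
  return tabId === undefined || entry.tabsById.has(tabId);
}
```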

Verified non-issues

Reviewers run

kieran-typescript-reviewer, security-sentinel, performance-oracle, architecture-strategist, code-simplicity-reviewer.

alex-fedotyev pushed a commit that referenced this pull request May 6, 2026
Compound-review feedback on #2201:

- Tighten internal `TileSchema.containerId`/`tabId` to `min(1).optional()`
  so an empty string isn't a valid id (would otherwise silently pass
  `tile.containerId !== undefined` checks).
- Add `.max()` bounds on internal schemas: `id`/`title` capped at 256
  chars (`DashboardContainerSchema`, `DashboardContainerTabSchema`),
  `tabs` capped at `DASHBOARD_CONTAINER_MAX_TABS = 20`, and
  dashboard-level `containers` capped at `DASHBOARD_MAX_CONTAINERS =
  50`. The external API body schema now also caps `containers` so a
  client can't submit thousands of containers and trigger O(n*m) refine
  cost.
- Collapse the three sequential `containers.forEach` passes
  (container-id uniqueness, tab-id uniqueness, container-by-id map)
  into a single pass. The map is now built INSIDE the duplicate-id
  guard so duplicates aren't masked by last-write-wins. A new
  short-circuit returns before tile-resolution if container ids
  weren't unique, so the user fixes the container layer first instead
  of getting cascading "unknown containerId" errors on top.
- Extract `EXTERNAL_DASHBOARD_PROJECTION` constant in v2/dashboards.ts
  so the GET-list and GET-by-id projections stay in sync (this PR
  added `containers: 1` to both, the next field shouldn't have to).
- Add three missing test cases:
    - PUT-path duplicate-container-id rejection.
    - Tile with `containerId` set when the dashboard omits the
      `containers` field entirely (was previously a NPE-by-coincidence
      on `data.containers ?? []`).
    - Tile in a tabbed container that omits `tabId` (renders in the
      container shell, not under any tab); guards that the schema
      doesn't accidentally force `tabId` onto every tile in a tabbed
      container.

Cross-schema invariant lifting (the largest item the bot raised) is
deferred to a follow-up so this PR stays scoped to the external API
plus narrow internal-schema tightening.
@alex-fedotyev
Contributor Author

P2 items addressed in 574ee8e:

  • Tightened internal TileSchema.containerId/tabId to min(1).optional()
  • .max() bounds added: 256-char ids/titles, 20 tabs/container, 50 containers/dashboard. The external API caps containers so a client cannot trigger O(n*m) refine cost
  • Single-pass over containers (container-id uniqueness + per-container tab-id uniqueness + map build), short-circuit if container-id duplicates exist before attempting tile resolution
  • EXTERNAL_DASHBOARD_PROJECTION constant in v2/dashboards.ts so the GET-list and GET-by-id projections stay in sync
  • Three missing tests added: PUT-path duplicate-container-id rejection, tile with containerId set when containers is omitted entirely (was a NPE-by-coincidence), tile in a tabbed container that omits tabId

Cross-schema invariant lifting + applyCommonTileFields deferred to follow-up #2225 (the lift would tighten DashboardSchema validation against existing internal writers, which needs its own scoped review and a production-data scan).

@github-actions
Contributor

github-actions Bot commented May 7, 2026

Deep Review

🔴 P0/P1 — must fix

✅ No critical issues found.

🟡 P2 — recommended

  • packages/api/src/routers/external-api/v2/dashboards.ts:~1996 — PUT against a non-existent dashboard with containers omitted and tiles carrying containerId returns 400 "Tile references unknown containerId" instead of 404, because effectiveContainers = containers ?? existingDashboard?.containers ?? [] collapses the null-doc case into an empty-containers validation against tiles that reference real (but unfindable) refs.
    • Fix: Short-circuit with 404 when existingDashboard == null before running collectTileContainerRefIssues, mirroring the post-update null check.
    • adversarial, reliability
  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:~333-352 convertTileToExternalChart emits one logger.warn per orphan containerId/tabId per tile per dashboard; the GET-list endpoint multiplies this by dashboard count, so a tenant with many legacy docs produces unbounded warn volume on every poll and the chronic-vs-new orphan signal gets lost.
    • Fix: Aggregate to one warn per dashboard with a count, or downgrade to debug/info since the read path silently self-heals anyway.
    • adversarial, reliability
  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:~316 — Round-trip breaks for legacy Mongo docs that exceed the new write-time caps: GET emits a tile with containerId longer than 256 chars or a containers array with duplicate ids (or tiles array > DASHBOARD_MAX_TILES) verbatim, and the next PUT of that same response is rejected by min(1).max(256) / uniqueness / .max(500).
    • Fix: Drop oversized/duplicate refs on read (mirror the empty-string treatment) and dedup the emitted containers array, or document that round-trip is best-effort for pre-cap data and provide a clearly-distinguished error code so clients can detect the case.
    • adversarial, api-contract
  • packages/api/openapi.json:~1932 — The spec declares tabId and containerId as independent optional strings; the cross-field rule tabId requires containerId lives only in the prose description, so generated SDKs (Stainless, openapi-generator) compile happily for invalid combinations and the constraint surfaces only as a runtime 400.
    • Fix: Add dependentRequired: { tabId: ['containerId'] } (OpenAPI 3.1) or allOf with if/then/required (3.0) on TileBase so generated clients enforce the dependency.
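
For the orphan-ref logging item in the P2 list, one way to aggregate to a single warn per dashboard is sketched below. Everything here is an assumption for illustration: the shapes, the `countOrphanTiles` and `warnOrphansOnce` helpers, the injected `warn` callback, and the choice to treat a `tabId` without a `containerId` as an orphan.

```typescript
// Hypothetical minimal shapes for this illustration.
interface Tab {
  id: string;
}
interface Container {
  id: string;
  tabs?: Tab[];
}
interface Tile {
  id: string;
  containerId?: string;
  tabId?: string;
}
interface StoredDashboard {
  id: string;
  tiles: Tile[];
  containers?: Container[];
}

// Counts tiles whose containerId or tabId does not resolve.
function countOrphanTiles(dashboard: StoredDashboard): number {
  const containers = dashboard.containers ?? [];
  let orphans = 0;
  for (const tile of dashboard.tiles) {
    if (tile.containerId === undefined) {
      if (tile.tabId !== undefined) orphans += 1; // tab ref with no container
      continue;
    }
    const container = containers.find((c) => c.id === tile.containerId);
    if (container === undefined) {
      orphans += 1;
      continue;
    }
    const tabId = tile.tabId;
    if (tabId !== undefined && !(container.tabs ?? []).some((t) => t.id === tabId)) {
      orphans += 1;
    }
  }
  return orphans;
}

// Emits at most one log line per dashboard, carrying the count.
function warnOrphansOnce(dashboard: StoredDashboard, warn: (message: string) => void): void {
  const orphans = countOrphanTiles(dashboard);
  if (orphans > 0) {
    warn(`dashboard ${dashboard.id}: ignored ${orphans} tile(s) with orphan container/tab refs on read`);
  }
}
```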

🔵 P3 nitpicks (2)
  • packages/api/src/routers/external-api/v2/dashboards.ts:~1700,2002 — Tile-ref validation returns a flat { message: tileRefIssues.join('; ') }, while every other body-validation error on these endpoints flows through validateRequestWithEnhancedErrors and produces the structured enhanced-errors envelope; clients that pattern-match the structured shape fall through to a generic error branch for these two paths and lose the per-field path metadata that collectTileContainerRefIssues already builds.
    • Fix: Route tile-ref issues through the enhanced-errors envelope (or have collectTileContainerRefIssues produce the same shape) so the response shape is uniform across body and cross-ref validation.
    • correctness, reliability
  • packages/api/src/routers/external-api/v2/dashboards.ts:~1995 — GET self-heals orphan tile refs silently; PUT rejects them. A client that wants to PUT { containers: [], tiles: [{ id, containerId: 'old' }] } to clear containers in one shot is forced into a two-step rewrite (null out every affected tile.containerId first), even though the end-state is exactly what GET would have rendered.
    • Fix: Either strip orphan refs on PUT when containers is explicitly empty (mirror GET behavior) or document the required tile-rewrite step in the OpenAPI concurrency callout.

Reviewers (10): correctness, security, adversarial, api-contract, reliability, testing, maintainability, kieran-typescript, performance, project-standards.

Testing gaps:

  • No test for PUT against a non-existent dashboard with tiles carrying containerId — would catch the 400-vs-404 regression.
  • No boundary test for DASHBOARD_MAX_CONTAINERS (51 containers should 400) or DASHBOARD_CONTAINER_MAX_TABS (21 tabs in one container should 400).
  • No test for PUT with explicit containers: [] while a tile still references an existing containerId (clear-while-referring contract is unpinned).
  • No test for a legacy Mongo doc whose stored containerId is > 256 chars or whose containers array contains duplicate ids — these survive the read path but the response is no longer a valid PUT body.
  • No test asserts the OpenAPI-documented default: true for collapsible/bordered matches the runtime Zod schema (which has no .default(true), so the field round-trips as absent rather than true).
  • No test covers the legacy series-tile path (translateExternalChartToTileConfig) with a tile carrying containerId/tabId that resolves against the request body's containers.

alex-fedotyev added a commit that referenced this pull request May 7, 2026
Deep-review feedback on #2201, mechanical items from the May 7 pass:

- Cap external tile `containerId`/`tabId` at 256 chars to mirror the
  internal `DashboardContainer` schema. The constant
  `DASHBOARD_CONTAINER_ID_MAX` is now exported from
  `@hyperdx/common-utils` so the external schema and the internal one
  pull from one source of truth.
- Cap a single dashboard payload at 500 tiles via the new
  `DASHBOARD_MAX_TILES` constant. Without the cap, an external API
  caller could push a payload tens of MB into Mongo in one request.
- Type the PUT setPayload as `Partial<IDashboard>` instead of
  `Record<string, unknown>` so a misnamed field fails at compile time.
- Treat empty-string `containerId`/`tabId` on legacy Mongo docs as
  absent on read so dashboards predating the containers feature still
  round-trip through the now-stricter external schema (which enforces
  `min(1)`). Added a regression test that mutates Mongo directly to
  simulate the legacy state.
- Replace `pick(externalTile, [...])` in `convertToInternalTileConfig`
  with explicit destructuring (mirroring the pattern in
  `convertTileToExternalChart`). The picked `name` was a stale top-
  level field on the resulting Tile (Tile has no top-level `name`);
  the rendered config still carries the name on `config.name`.
- Extract `validateDashboardContainersConsistency` into
  `@hyperdx/common-utils/dist/types` so the canonical schema and the
  external-API request body schema agree on what a valid
  `{containers, tiles}` payload is. The external body's `superRefine`
  now delegates to the helper.
- Drop the export on `DASHBOARD_CONTAINER_MAX_TABS` (used only by the
  schema definition next to it).
- OpenAPI now publishes matching `maxLength: 256` on container/tab
  ids, `maxItems: 20` on `DashboardContainer.tabs`, `maxItems: 50` on
  the request `containers` array, and `maxItems: 500` on the request
  `tiles` array. Regenerated `openapi.json`.

Boundary tests cover 256-char ids vs 257, 500-tile payloads vs 501,
and the legacy empty-string read path. Helper has standalone unit
tests in `v2/utils/__tests__/dashboards.test.ts`.
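
The read-side treatment of legacy empty-string refs described in this commit message could look roughly like the sketch below. The `StoredTile` shape and the `readTileRefs` helper are hypothetical. Dropping `tabId` whenever `containerId` is dropped is an assumption made here so the result still satisfies the "tabId requires containerId" rule.

```typescript
// Hypothetical shape of a tile as read from a legacy Mongo document.
interface StoredTile {
  id: string;
  containerId?: string | null;
  tabId?: string | null;
}

interface ExternalTileRefs {
  containerId?: string;
  tabId?: string;
}

// Treats empty-string (or null) refs as absent so the emitted tile passes the
// stricter min(1) write-time schema on a subsequent PUT.
function readTileRefs(tile: StoredTile): ExternalTileRefs {
  const refs: ExternalTileRefs = {};
  if (typeof tile.containerId === 'string' && tile.containerId.length > 0) {
    refs.containerId = tile.containerId;
    // Assumption: a tab ref is only meaningful alongside a container ref.
    if (typeof tile.tabId === 'string' && tile.tabId.length > 0) {
      refs.tabId = tile.tabId;
    }
  }
  return refs;
}
```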
alex-fedotyev and others added 5 commits May 8, 2026 00:19
…ontainer/tab refs (#2150)

PR #2015 added a dashboard organization layer (containers with optional
tabs, tiles join a container via containerId and a tab via tabId) but
the v2 external API was not updated to round-trip the new fields.
External integrations that build dashboards programmatically had no way
to use the new layer.

This wires the full set of fields through CREATE / GET / LIST / UPDATE.
Dashboards saved without containers round-trip unchanged (Mongoose
returns an empty array for missing containers, so the conversion only
emits the field when at least one container is present).

The body schema validates that:
- container ids are unique within the dashboard
- tab ids are unique within their container
- tile.containerId resolves to a real container
- tile.tabId resolves to a tab inside the tile's container
- tile.tabId requires tile.containerId to be set

Tests cover create + get round-trip, update round-trip with re-homing
tiles and dropping a container, optional-field defaults, all five
validation rejections, and the no-containers backward-compat case.

The conversion utilities also pick up containerId / tabId on the tile
itself: convertToInternalTileConfig now extends its pick list (this was
the specific bug that v2 of the plan missed), and the legacy series
translator in externalApi.ts also propagates the fields so both code
paths preserve them.

Refs #2150, follows up on #2015 (7665fbe).
Empty-string values previously passed per-field validation and only
hit the cross-field check (no container has id ''). Adding .min(1)
matches the shared DashboardContainerSchema pattern and surfaces a
field-level error instead.
After reading notes/principles/external-api-audit.md and walking
through the UI surface (useDashboardContainers.tsx, DashboardContainer.tsx,
GroupTabBar.tsx), three gaps were caught that the initial implementation
missed.

- OpenAPI parity: TileBase.containerId / tabId now declare minLength: 1
  to match the Zod schema's z.string().min(1).optional(). The Zod fix
  landed in the previous commit but the OpenAPI didn't pick up the
  constraint until JSDoc was updated and openapi.json regenerated.

- Test gap: explicit empty containers: [] now has its own round-trip
  test. The conversion normalizes [] back to absent on read (the
  existing length-guard makes this work), but the behavior wasn't
  asserted.

- Test gap: tile.containerId or tile.tabId set to an empty string is
  now explicitly rejected. Previously this would have failed
  cross-field validation only because no real container has id "",
  not because the tile-level rule fired.

UI invariants the API stays permissive about (auto-fixed by the UI
rather than rejected) are documented in the per-feature code map
under notes/repo-conventions/hyperdx/dashboards-containers-tabs.md
in the workspace.
Compound-review feedback on #2201:

- Tighten internal `TileSchema.containerId`/`tabId` to `min(1).optional()`
  so an empty string isn't a valid id (would otherwise silently pass
  `tile.containerId !== undefined` checks).
- Add `.max()` bounds on internal schemas: `id`/`title` capped at 256
  chars (`DashboardContainerSchema`, `DashboardContainerTabSchema`),
  `tabs` capped at `DASHBOARD_CONTAINER_MAX_TABS = 20`, and
  dashboard-level `containers` capped at `DASHBOARD_MAX_CONTAINERS =
  50`. The external API body schema now also caps `containers` so a
  client can't submit thousands of containers and trigger O(n*m) refine
  cost.
- Collapse the three sequential `containers.forEach` passes
  (container-id uniqueness, tab-id uniqueness, container-by-id map)
  into a single pass. The map is now built INSIDE the duplicate-id
  guard so duplicates aren't masked by last-write-wins. A new
  short-circuit returns before tile-resolution if container ids
  weren't unique, so the user fixes the container layer first instead
  of getting cascading "unknown containerId" errors on top.
- Extract `EXTERNAL_DASHBOARD_PROJECTION` constant in v2/dashboards.ts
  so the GET-list and GET-by-id projections stay in sync (this PR
  added `containers: 1` to both, the next field shouldn't have to).
- Add three missing test cases:
    - PUT-path duplicate-container-id rejection.
    - Tile with `containerId` set when the dashboard omits the
      `containers` field entirely (was previously a NPE-by-coincidence
      on `data.containers ?? []`).
    - Tile in a tabbed container that omits `tabId` (renders in the
      container shell, not under any tab); guards that the schema
      doesn't accidentally force `tabId` onto every tile in a tabbed
      container.

Cross-schema invariant lifting (the largest item the bot raised) is
deferred to a follow-up so this PR stays scoped to the external API
plus narrow internal-schema tightening.
Deep-review feedback on #2201, mechanical items from the May 7 pass:

- Cap external tile `containerId`/`tabId` at 256 chars to mirror the
  internal `DashboardContainer` schema. The constant
  `DASHBOARD_CONTAINER_ID_MAX` is now exported from
  `@hyperdx/common-utils` so the external schema and the internal one
  pull from one source of truth.
- Cap a single dashboard payload at 500 tiles via the new
  `DASHBOARD_MAX_TILES` constant. Without the cap, an external API
  caller could push a payload tens of MB into Mongo in one request.
- Type the PUT setPayload as `Partial<IDashboard>` instead of
  `Record<string, unknown>` so a misnamed field fails at compile time.
- Treat empty-string `containerId`/`tabId` on legacy Mongo docs as
  absent on read so dashboards predating the containers feature still
  round-trip through the now-stricter external schema (which enforces
  `min(1)`). Added a regression test that mutates Mongo directly to
  simulate the legacy state.
- Replace `pick(externalTile, [...])` in `convertToInternalTileConfig`
  with explicit destructuring (mirroring the pattern in
  `convertTileToExternalChart`). The picked `name` was a stale top-
  level field on the resulting Tile (Tile has no top-level `name`);
  the rendered config still carries the name on `config.name`.
- Extract `validateDashboardContainersConsistency` into
  `@hyperdx/common-utils/dist/types` so the canonical schema and the
  external-API request body schema agree on what a valid
  `{containers, tiles}` payload is. The external body's `superRefine`
  now delegates to the helper.
- Drop the export on `DASHBOARD_CONTAINER_MAX_TABS` (used only by the
  schema definition next to it).
- OpenAPI now publishes matching `maxLength: 256` on container/tab
  ids, `maxItems: 20` on `DashboardContainer.tabs`, `maxItems: 50` on
  the request `containers` array, and `maxItems: 500` on the request
  `tiles` array. Regenerated `openapi.json`.
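
The caps in this list can be pictured as a small standalone check. This is a minimal sketch rather than the PR's Zod schema: the constant names and values come from the commit message above, while `ExternalTileRefs`, `checkCaps`, and the message and path format are hypothetical and used only for illustration.

```typescript
// Values taken from the commit message; the real constants live in
// @hyperdx/common-utils and are redeclared here only to keep the sketch
// self-contained.
const DASHBOARD_CONTAINER_ID_MAX = 256;
const DASHBOARD_MAX_TILES = 500;

// Hypothetical, trimmed-down tile shape: only the two ref fields matter here.
interface ExternalTileRefs {
  containerId?: string;
  tabId?: string;
}

// Hypothetical helper: returns one message per violated cap.
function checkCaps(tiles: ExternalTileRefs[]): string[] {
  const issues: string[] = [];
  if (tiles.length > DASHBOARD_MAX_TILES) {
    issues.push(`tiles: at most ${DASHBOARD_MAX_TILES} tiles allowed`);
  }
  tiles.forEach((tile, i) => {
    for (const key of ["containerId", "tabId"] as const) {
      const value = tile[key];
      if (value === undefined) continue;
      // Mirrors the min(1) rule and the 256-char cap on ids.
      if (value.length < 1 || value.length > DASHBOARD_CONTAINER_ID_MAX) {
        issues.push(
          `tiles.${i}.${key}: length must be between 1 and ${DASHBOARD_CONTAINER_ID_MAX}`,
        );
      }
    }
  });
  return issues;
}
```

In this sketch a 256-character id and a 500-tile array yield no issues, while a 257-character id or a 501st tile each yield exactly one.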

Boundary tests cover 256-char ids vs 257, 500-tile payloads vs 501,
and the legacy empty-string read path. The helper has standalone unit
tests in `v2/utils/__tests__/dashboards.test.ts`.
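
The legacy read path mentioned above amounts to treating an empty string as a missing value. A minimal sketch, assuming a hypothetical `normalizeRef` helper and a simplified stored-tile shape (neither name is taken from the PR):

```typescript
// Hypothetical simplified shape of the ref fields on a stored tile.
interface StoredTileRefs {
  containerId?: string | null;
  tabId?: string | null;
}

// Hypothetical helper: legacy docs may carry "" where newer docs omit the field.
function normalizeRef(value: string | null | undefined): string | undefined {
  return value ? value : undefined;
}

function readTileRefs(tile: StoredTileRefs): {
  containerId?: string;
  tabId?: string;
} {
  const containerId = normalizeRef(tile.containerId);
  const tabId = normalizeRef(tile.tabId);
  // Emit only the keys that are present, so the response omits the field
  // instead of returning "" (which the min(1) rule would reject on a later PUT).
  return {
    ...(containerId !== undefined ? { containerId } : {}),
    ...(tabId !== undefined ? { tabId } : {}),
  };
}
```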
@alex-fedotyev force-pushed the alex/HDX-2150-external-api-containers-tabs branch from 2e727a9 to 7b8b8ad on May 8, 2026 00:19
Deep-review feedback on #2201, P0/P1 + critical P2 items:

- Move tile-level container/tab ref resolution out of the request
  body schema and into the POST and PUT handlers. The schema-level
  superRefine called the helper with `data.containers ?? []`, which
  rejected any tile that referenced a real container when the PUT
  body omitted `containers` (the documented preserve-on-omit
  branch). The handler now resolves against the effective container
  set (body containers OR existing dashboard containers), so a PUT
  that updates only `tiles` and keeps a tile homed in a preserved
  container succeeds.
- Split `validateDashboardContainersConsistency` into a
  structure-only pass and a tile-ref-only pass; keep the composite
  for backward compatibility. The body schema now calls the
  structure-only helper; handlers run the tile-ref pass via a new
  `collectTileContainerRefIssues` wrapper that returns formatted
  `path: message` strings consistent with
  `validateRequestWithEnhancedErrors`.
- Self-heal orphan tile.containerId / tile.tabId on read. A doc may
  carry a tile pointing at a container that has since been removed,
  or a tab that no longer exists in its container; round-trip these
  as if the ref were absent so a subsequent PUT validates instead
  of failing schema validation with "Tile references unknown containerId".
  Each drop is logged with the dashboard id, tile id, and offending
  ref. The PUT projection now fetches `containers` from Mongo so
  the fallback can resolve.
- Document in OpenAPI that PUT does not support optimistic
  concurrency; concurrent PUTs may silently overwrite each other.
  Adding ETag-style concurrency would be a breaking change to the
  request shape and is left for a follow-up.
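
The handler-level resolution described in the first two bullets might look roughly like the following. This is a sketch under stated assumptions, not the PR's code: the real `collectTileContainerRefIssues` wraps Zod-based helpers, whereas `effectiveContainers` and `collectRefIssues` here are hypothetical stand-ins with simplified types, and the path format and most message strings are guesses.

```typescript
interface Tab { id: string }
interface Container { id: string; tabs?: Tab[] }
interface Tile { id: string; containerId?: string; tabId?: string }

// PUT preserve-on-omit: when the body has no `containers`, validate tiles
// against the containers already stored on the dashboard.
function effectiveContainers(
  bodyContainers: Container[] | undefined,
  existingContainers: Container[] | undefined,
): Container[] {
  return bodyContainers ?? existingContainers ?? [];
}

// Hypothetical stand-in for the tile-ref pass; returns `path: message` strings.
function collectRefIssues(containers: Container[], tiles: Tile[]): string[] {
  const byId = new Map<string, Container>();
  for (const c of containers) byId.set(c.id, c);
  const issues: string[] = [];
  tiles.forEach((tile, i) => {
    const { containerId, tabId } = tile;
    if (containerId === undefined) {
      if (tabId !== undefined) {
        issues.push(`tiles.${i}.tabId: tabId requires containerId to be set`);
      }
      return;
    }
    const container = byId.get(containerId);
    if (container === undefined) {
      issues.push(`tiles.${i}.containerId: Tile references unknown containerId`);
      return;
    }
    const tabs = container.tabs ?? [];
    if (tabId !== undefined && !tabs.some(t => t.id === tabId)) {
      issues.push(`tiles.${i}.tabId: Tile references unknown tabId`);
    }
  });
  return issues;
}
```

A PUT handler in this shape would call `collectRefIssues(effectiveContainers(body.containers, existing.containers), body.tiles)` and respond with 400 when the returned list is non-empty.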

Tests:
- 4 integration tests at the request layer:
  - PUT that omits `containers` with tiles homed in preserved
    containers; expects 200 + containers preserved on response.
  - PUT that omits `containers` and references an unknown
    containerId; expects 400.
  - GET on a doc whose tile.containerId no longer matches; expects
    `containerId` and `tabId` absent on response.
  - GET on a doc whose tile.tabId no longer matches the container's
    tabs; expects `tabId` absent, `containerId` preserved.
- 3 unit tests on `collectTileContainerRefIssues` for the empty,
  unknown-containerId, and tabId-without-containerId paths.
- 4 unit tests on `convertToExternalDashboard` for each orphan-heal
  branch plus the full-resolution case.
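
The orphan-heal behaviour exercised by these tests can be sketched as a pure function over one tile and the dashboard's containers. The name `healTileRefs` and the `dropped` list are hypothetical; the PR performs this inside `convertToExternalDashboard` and logs each drop, which the sketch only imitates by returning the dropped refs. Dropping a `tabId` that has no `containerId` is an assumption of this sketch.

```typescript
interface Tab { id: string }
interface Container { id: string; tabs?: Tab[] }
interface TileRefs { containerId?: string; tabId?: string }

function healTileRefs(
  tile: TileRefs,
  containers: Container[],
): { refs: TileRefs; dropped: string[] } {
  const { containerId, tabId } = tile;
  const dropped: string[] = [];
  const container = containers.find(c => c.id === containerId);

  if (containerId === undefined || container === undefined) {
    // Unknown (or missing) container: drop both refs, because a tab ref
    // cannot be resolved without its container.
    if (containerId !== undefined) dropped.push(`containerId=${containerId}`);
    if (tabId !== undefined) dropped.push(`tabId=${tabId}`);
    return { refs: {}, dropped };
  }

  if (tabId === undefined) return { refs: { containerId }, dropped };

  const tabKnown = (container.tabs ?? []).some(t => t.id === tabId);
  if (!tabKnown) {
    // Unknown tab: keep the container ref, drop only the tab ref.
    dropped.push(`tabId=${tabId}`);
    return { refs: { containerId }, dropped };
  }

  // Full resolution: both refs survive unchanged.
  return { refs: { containerId, tabId }, dropped };
}
```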
Comment thread packages/common-utils/src/types.ts Outdated
Comment on lines +1129 to +1166
/**
 * Composite validation of dashboard containers + tile container/tab
 * references. Runs the structure pass and (if container ids were
 * unique) the tile-ref pass. Kept for callers that have both
 * containers and tiles in one schema scope.
 *
 * The external API splits these passes across a schema-level
 * structure check and a handler-level tile-ref check so the PUT path
 * can fall back to the existing dashboard's containers when the
 * request body omits the field; see
 * `routers/external-api/v2/utils/dashboards.ts` and the handlers in
 * `routers/external-api/v2/dashboards.ts`.
 */
export function validateDashboardContainersConsistency<
  T extends TileForValidation,
>(
  containers: ContainerForValidation[],
  tiles: T[],
  ctx: z.RefinementCtx,
  paths?: {
    containersPath?: (string | number)[];
    tilesPath?: (string | number)[];
  },
): void {
  const { containerById, hasDuplicateContainerId } =
    validateDashboardContainersStructure(containers, ctx, {
      containersPath: paths?.containersPath,
    });

  // If container ids weren't unique, tile-level resolution would
  // produce confusing errors on top of the duplicate-id ones; skip
  // the tile pass and let the user fix the container layer first.
  if (hasDuplicateContainerId) return;

  validateDashboardTileContainerRefs(containerById, tiles, ctx, {
    tilesPath: paths?.tilesPath,
  });
}
Contributor

This seems unused? Can we remove it?

Suggested change: remove the `validateDashboardContainersConsistency` block quoted above (doc comment and function).

Contributor Author

Dropped. The composite was only used by its own unit test; production code splits the two passes across schema-level and handler-level so PUT can fall back to the existing dashboard. Fixed in 5db7141.

Comment thread packages/common-utils/src/types.ts Outdated
* - Duplicate tab ids within a container (path
* `<containersPath>[i].tabs[j].id`).
*/
export function validateDashboardContainersStructure(
Contributor

nit: I'd suggest putting these in a dashboard utils or validations file, instead of this types file which is intended only to include types and type guards.

Contributor Author

Good call. Moved both helpers and their two helper types to a new @hyperdx/common-utils/dist/dashboardValidation module; types.ts is back to types and type guards only. Fixed in 5db7141.

Comment on lines +989 to +993
* description: Whether the user can collapse the group. Defaults to true.
* example: true
* bordered:
* type: boolean
* description: Whether to show a visual border around the group. Defaults to true.
Contributor

Please use default: true instead of just noting the default in the description.

Suggested change
- * description: Whether the user can collapse the group. Defaults to true.
- * example: true
- * bordered:
- * type: boolean
- * description: Whether to show a visual border around the group. Defaults to true.
+ * description: Whether the user can collapse the group.
+ * default: true
+ * example: true
+ * bordered:
+ * type: boolean
+ * description: Whether to show a visual border around the group.
+ * default: true

Contributor Author

Done. JSDoc switched to explicit default: true in a4214d3 / 9ef8699; the regenerated openapi.json carries it in b48b837.

Address review feedback from pulpdrew: use OpenAPI `default` field
instead of noting the default in the description string.
… and bordered

Address review feedback from pulpdrew: use OpenAPI default field
instead of noting the default in the description string.

Review feedback from pulpdrew on #2201:

- `validateDashboardContainersStructure` and
  `validateDashboardTileContainerRefs` (and the two helper types
  `ContainerForValidation` / `TileForValidation`) move out of
  `@hyperdx/common-utils/dist/types` into a new
  `@hyperdx/common-utils/dist/dashboardValidation` module. The types
  file is back to just types and type guards, matching the rest of the
  codebase.
- Drop `validateDashboardContainersConsistency`. It was only referenced
  by its own unit test; production code in the v2 dashboards router
  calls the two underlying helpers in sequence directly (and has to,
  because the structure pass runs at the schema level while the
  tile-ref pass runs at the handler level so PUT can fall back to the
  existing dashboard's containers).
- Test file now exercises the two helpers in the same order production
  does, with a `runHelper` that wires them into one `z.superRefine`.
  All 19 existing assertions hold unchanged.

No behaviour change for external API callers.
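
The structure pass that stays at the schema level can be illustrated without Zod. This sketch assumes a hypothetical `checkContainerStructure` function that returns issues as plain objects, whereas the real `validateDashboardContainersStructure` reports through a `z.RefinementCtx`. The returned `containerById` and `hasDuplicateContainerId` mirror the names used in the review thread; the message strings are invented for illustration.

```typescript
interface Tab { id: string }
interface Container { id: string; tabs?: Tab[] }
interface Issue { path: (string | number)[]; message: string }

function checkContainerStructure(containers: Container[]): {
  containerById: Map<string, Container>;
  hasDuplicateContainerId: boolean;
  issues: Issue[];
} {
  const containerById = new Map<string, Container>();
  const issues: Issue[] = [];
  let hasDuplicateContainerId = false;

  containers.forEach((container, i) => {
    // Container ids must be unique within a dashboard.
    if (containerById.has(container.id)) {
      hasDuplicateContainerId = true;
      issues.push({ path: ["containers", i, "id"], message: "Duplicate container id" });
    } else {
      containerById.set(container.id, container);
    }

    // Tab ids must be unique within their container.
    const seenTabs = new Set<string>();
    (container.tabs ?? []).forEach((tab, j) => {
      if (seenTabs.has(tab.id)) {
        issues.push({
          path: ["containers", i, "tabs", j, "id"],
          message: "Duplicate tab id",
        });
      }
      seenTabs.add(tab.id);
    });
  });

  return { containerById, hasDuplicateContainerId, issues };
}
```

A caller following the order described above would run the tile-ref pass only when `hasDuplicateContainerId` is false, so duplicate-id errors are not buried under follow-on resolution errors.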
…ults

The JSDoc fix in 9ef8699 / a4214d3 updated the OpenAPI source on the
v2 dashboards router but didn't refresh the committed `openapi.json`
artifact. Regenerate so the published spec carries the explicit
`default: true` for `collapsible` and `bordered` instead of leaving
the default note inline in the description.
kodiakhq[bot] merged commit 41395ca into main on May 11, 2026
19 checks passed
kodiakhq[bot] deleted the alex/HDX-2150-external-api-containers-tabs branch on May 11, 2026 20:01

Labels

automerge review/tier-4 Critical — deep review + domain expert sign-off

Development

Successfully merging this pull request may close these issues.

External Dashboards API: expose container + tab fields

3 participants