Skip to content

[HDX-4120] feat(api): support heatmap tiles in external dashboards API#2200

Open
alex-fedotyev wants to merge 11 commits intomainfrom
alex/HDX-4120-external-api-heatmap
Open

[HDX-4120] feat(api): support heatmap tiles in external dashboards API#2200
alex-fedotyev wants to merge 11 commits intomainfrom
alex/HDX-4120-external-api-heatmap

Conversation

@alex-fedotyev
Copy link
Copy Markdown
Contributor

@alex-fedotyev alex-fedotyev commented May 5, 2026

Summary

Heatmap was the only builder-mode display type that did not round-trip through the external dashboards API. The serializer dropped it into the "unsupported" fall-through, so creating, fetching, and updating heatmap tiles via /api/v2/dashboards lost the config.

This wires up heatmap end-to-end on the external API: a dedicated select-item schema, an explicit case in both serialization directions, OpenAPI JSDoc, and tests.

Follow-up to #2107 (review feedback from @pulpdrew, who asked whether we had a follow-up ticket to update the external API for the new visualization type).

What's in the diff

  • Zod schemas (packages/api/src/utils/zod.ts): a heatmap select-item schema that exposes the literal aggFn: "heatmap" plus valueExpression, optional countExpression, alias, heatmapScaleType; and a heatmap chart-config schema with optional groupBy and numberFormat. Heatmap is added to the builder discriminated union only.
  • Conversion utilities (packages/api/src/routers/external-api/v2/utils/dashboards.ts):
    • Builder serializer: replaces the old case DisplayType.Heatmap: fall-through with an explicit case that reads heatmap-specific fields off config.select[0] and emits aggFn: "heatmap" on the external surface.
    • Builder deserializer: new case 'heatmap': mirroring the Pie pattern; maps the external aggFn: "heatmap" back to the internal aggFn: "count" that the editor form persists, while preserving countExpression, alias, and heatmapScaleType on the select item.
    • The raw-SQL switch is intentionally left untouched: heatmap stays in the unsupported fall-through there because heatmap rendering requires isBuilderChartConfig.
  • OpenAPI JSDoc (packages/api/src/routers/external-api/v2/dashboards.ts): HeatmapSelectItem and HeatmapBuilderChartConfig components, and heatmap added to the TileConfig oneOf and discriminator mapping.
  • Tests (packages/api/src/routers/external-api/__tests__/dashboards.test.ts): heatmap added to the existing "round-trip all supported chart types" tests for both POST and PUT, plus an explicit rejection test confirming raw-SQL heatmap tiles return 400.
  • packages/api/openapi.json: regenerated.

Notes for review

  • No raw-SQL heatmap variant. PR feat: Wire heatmap chart into dashboard editor and tile rendering #2107 made heatmap builder-only and DBDashboardPage.tsx requires isBuilderChartConfig for heatmap rendering, so the raw-SQL fall-through stays.
  • heatmapScaleType and countExpression are persisted on the per-select-item level (via DerivedColumnSchema in packages/common-utils/src/types.ts), not on the chart config root. The form binds them as series.0.heatmapScaleType / series.0.countExpression. The schema and conversion utilities follow that.
  • The aggFn translation (external "heatmap" ↔ internal "count") keeps the saved Mongo document identical to what the editor form produces, so heatmap tiles created via the API render the same way as ones created via the UI.

Tier

Lands as review/tier-4 because anything under packages/api/src/routers/external-api/ is on the critical-path list. Diff is ~250 prod lines (most of it OpenAPI JSDoc and Zod boilerplate); no schema migrations or auth changes.

Test plan

  • make ci-lint (yarn lint, tsc --noEmit, OpenAPI lint)
  • make ci-unit (common-utils + app)
  • CI runs the full integration suite (heatmap round-trip POST + PUT + raw-SQL rejection); local make dev-int requires Docker BuildKit which isn't available on this host.

Deep-review carryover (2026-05-07)

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 5, 2026

🦋 Changeset detected

Latest commit: a217900

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@hyperdx/api Minor
@hyperdx/app Minor
@hyperdx/otel-collector Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel
Copy link
Copy Markdown

vercel Bot commented May 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperdx-oss Ready Ready Preview, Comment May 8, 2026 1:16am

Request Review

@github-actions github-actions Bot added the review/tier-4 Critical — deep review + domain expert sign-off label May 5, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (2):
    • packages/api/src/routers/external-api/v2/dashboards.ts
    • packages/api/src/routers/external-api/v2/utils/dashboards.ts
  • Cross-layer change: touches frontend (packages/app) + backend (packages/api) + shared utils (packages/common-utils)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 9
  • Production lines changed: 756 (+ 505 in test files, excluded from tier calculation)
  • Branch: alex/HDX-4120-external-api-heatmap
  • Author: alex-fedotyev

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

E2E Test Results

All tests passed • 164 passed • 3 skipped • 1138s

Status Count
✅ Passed 164
❌ Failed 0
⚠️ Flaky 5
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

<!-- claude-code-review -->

PR Review

✅ No critical issues found.

Implementation is clean and well-scoped:

  • Explicit DisplayType.Heatmap case added in both serializer and deserializer (packages/api/src/routers/external-api/v2/utils/dashboards.ts), replacing the previous unsupported fall-through.
  • Dedicated Zod schemas (externalDashboardHeatmapSelectItemSchema, externalDashboardHeatmapChartConfigSchema) with valueExpression.min(1) and select.length(1) matching the editor's validateChartForm rule.
  • Heatmap added only to the builder discriminated union (raw-SQL stays rejected at the zod boundary), matching DBDashboardPage.tsx's isBuilderChartConfig requirement.
  • Trace-source enforcement via getHeatmapTilesWithNonTraceSources mirrors the UI's ChartEditorControls source-picker gating.
  • Test coverage is thorough: full round-trip (POST + PUT), minimal-fields round-trip, raw-SQL rejection, non-trace-source rejection, empty-valueExpression rejection, multi-select rejection.
  • OpenAPI JSDoc and openapi.json regeneration are in sync.

Minor (non-blocking) observations:

  • getSources() is now called 3× in parallel inside Promise.all (was 2×) since getHeatmapTilesWithNonTraceSources does its own fetch rather than receiving a shared existingSources map. Consistent with the pre-existing pattern (getMissingSources and getSourceConnectionMismatches already each fetch independently), so no regression — but a future refactor could deduplicate.
  • The PR description mentions an optional groupBy on the heatmap chart config schema, but the actual Zod/OpenAPI schema correctly omits it (matching HeatmapSeriesEditor). The code is right; the description copy is slightly out of date.
  • In convertToExternalTileChartConfig, numberFormat: config.numberFormat is emitted unconditionally where elsewhere in the file the conditional-spread pattern (...(x !== undefined ? { x } : {})) is used for optional fields. JSON serialization strips the undefined so behavior is correct, just stylistically inconsistent.

@alex-fedotyev
Copy link
Copy Markdown
Contributor Author

Thanks for the review.

  • Optional field consistency (point 1): Good catch. Fixed in 9e09229convertToExternalHeatmapSelectItem now uses !== undefined to match convertToInternalTileConfig, so empty-string round-trips don't silently drop fields.

  • numberFormat: config.numberFormat (point 2): This is actually the prevailing pattern in the file (10+ uses across Line, StackedBar, Table, Number, Pie, and the raw-SQL variants). Leaving it for consistency with the surrounding code; happy to swap to the spread form here AND across the file in a follow-up if we want a sweep.

  • knip: previously addressed in c232564 (extracted convertToExternalHeatmapSelectItem helper that types its return as ExternalDashboardHeatmapSelectItem).

alex-fedotyev added a commit that referenced this pull request May 6, 2026
The external API surface in PR #2200 exposes where/whereLanguage at the
chart-config level for heatmaps (UI HeatmapSeriesEditor doesn't render
per-series filters; the persisted shape stores them once via
SelectSQLStatementSchema). The MCP heatmap schema was dropping them, so
once #2200 lands a save through hyperdx_save_dashboard would silently
discard the filter on read-back.

Add `where` and `whereLanguage` to mcpHeatmapTileSchema.config with the
same defaults the search tile uses (where: '', whereLanguage: 'lucene').
Extend the round-trip test with a Lucene filter, add an explicit
whereLanguage="sql" round-trip, and a negative test for an invalid
whereLanguage. Update the example string and changeset to document the
new fields.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

PR Review

✅ No critical issues found.

This PR cleanly fills the heatmap gap in the external dashboards API. End-to-end coverage (Zod, REST + MCP, OpenAPI, tests) is solid and the design choices are well-reasoned:

  • Silent-downgrade prevention is correct. The convertToExternalTileChartConfig heatmap arm emits a placeholder {aggFn: 'heatmap', valueExpression: ''} instead of falling through to the defaultTileConfig line tile. This forces re-PUT to fail loudly via the input schema's min(1) rule, with test coverage at dashboards.test.ts (the corrupted-heatmap GET test).
  • PUT scoping via filterChangedHeatmapTiles is the right call. Re-validating only heatmap tiles whose sourceId/displayType actually changed (or that were freshly added) avoids wedging unrelated edits when a source's kind was mutated post-acceptance — exercised by the "does not re-validate ... for unchanged heatmap tiles" test.
  • Shared validation between REST and MCP. Hoisting fetchSourcesForValidation and reusing getHeatmapTilesWithIncompatibleSources / filterChangedHeatmapTiles in both saveDashboard.ts and dashboards.ts keeps both surfaces in lockstep.
  • HEATMAP_ALLOWED_SOURCE_KINDS centralization. Single source of truth for [Trace] shared by ChartEditorControls.tsx and the API gate; expanding heatmap to a new source kind is now a one-line change.

Minor observations (non-blocking):

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

Compound Engineering Review

✅ No critical issues found. Security audit returned no findings (tenant scoping, IDOR, info-leak, auth all sound). Type safety, test parity, and adherence to sibling patterns (Pie/Search/Markdown) are clean.

P2 — Important

  • P2 packages/app/src/components/SourceSelect.tsx:85 + ChartEditorControls.tsx:108allowedSourceKinds prop typed as mutable SourceKind[] forces [...HEATMAP_ALLOWED_SOURCE_KINDS] spread (no other call site spreads); allocates a new array each render → widen prop to ReadonlyArray<SourceKind> and drop the spread.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:103-109, 530-536 — conditional-spread ...(item.x !== undefined ? { x: item.x } : {}) repeated 6× for countExpression/alias/heatmapScaleType → replace with pick(item, ['countExpression','alias','heatmapScaleType']) (lodash already imported); preserves empty-string round-trip and removes the comment that motivated the verbose form.
  • P2 packages/api/src/routers/external-api/v2/dashboards.ts:1683-1722, 1918-1957 — POST/PUT each run 4 sibling validators in Promise.all, several re-querying sources; create+update branches are now byte-identical → extract validateDashboardReferences(teamId, tiles, filters) that fetches sources/connections once and returns a structured result, plus a respondIfInvalidReferences(res, result) helper to dedupe the 4 error blocks.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:514-528 — internal aggFn: 'count' + aggCondition: '' are encoding applyHeatmapDefaults's UI-form quirk in the API converter; if the editor default ever shifts (e.g. histogram), API-built and UI-built tiles diverge silently → share the default-builder via common-utils (buildInternalHeatmapSelectItem) imported from both the form and the converter, or at minimum add a unit test asserting parity so the next editor change fails loudly.
  • P2 packages/common-utils/src/guards.ts:13-35HEATMAP_ALLOWED_SOURCE_KINDS is a 1-element array wrapped in a ReadonlyArray<SourceKind> const + 4-line predicate + 9-line JSDoc with two consumers using different shapes (UI spreads array, API calls predicate); when metric-heatmaps need a richer "kind + metricType" rule, the kinds-array contract breaks → simplify to a single as const constant and let the API call .includes(...) inline, or expose only isHeatmapCompatibleSource(source) + a getHeatmapAllowedSourceKinds() accessor so the predicate can grow without the UI ossifying around the array.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:95-110convertToExternalHeatmapSelectItem param typed as Exclude<BuilderSavedChartConfig['select'][number], string> reads heatmap-only fields (countExpression, heatmapScaleType) that only exist on DerivedColumn → tighten the param to DerivedColumn (or a Pick<DerivedColumn, …>) so the contract reflects what the body requires.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:291-319, 524-538 — aggFn round-trip is silently lossy: serializer always emits 'heatmap', deserializer always writes 'count', regardless of the internal value; legacy/manually-written docs with aggFn: 'avg' would be relabeled without notice → either log a warning when item.aggFn !== 'count' && item.aggFn !== 'heatmap' in the serializer, or document the lossiness explicitly in the existing comment.
  • P2 packages/api/src/routers/external-api/v2/utils/dashboards.ts:303, 644-651 — two dead defensive guards: (a) typeof item === 'string' on a select element where the schema guarantees objects (string case is already caught one level up by Array.isArray), and (b) && tile.config.sourceId truthiness check on a field that objectIdSchema already validates as a non-empty string → drop both for clarity (or leave a one-line "belt-and-suspenders" comment).

alex-fedotyev pushed a commit that referenced this pull request May 6, 2026
Compound-review feedback on #2200, all P2:

- Drop redundant `whereLanguage.optional()`; the underlying
  `SearchConditionLanguageSchema` (alias `whereLanguageSchema`) is
  already `.optional()` and the sibling
  `externalDashboardSearchChartConfigSchema` writes it without the
  outer wrap.
- Drop the `where: externalConfig.where ?? ''` redundancy in the
  deserializer; the Zod schema declares `.default('')` so the field is
  always a string post-parse.
- Rename schema `HeatmapBuilderChartConfig` -> `HeatmapChartConfig`
  to match the sibling `SearchChartConfig` / `MarkdownChartConfig`
  naming. The `Builder` suffix is reserved for cases that disambiguate
  from a sibling `RawSqlChartConfig`; heatmap has no raw-SQL variant.
- `convertToExternalHeatmapSelectItem` now takes a non-optional item.
  The case-arm caller checks for missing/string/empty-valueExpression
  items and falls through to `defaultTileConfig` with a logger.warn
  instead of silently emitting an out-of-contract payload that
  violates the external schema's `min(1)` rule.
- Extract `HEATMAP_ALLOWED_SOURCE_KINDS` and
  `isHeatmapCompatibleSource` into common-utils. Both
  `ChartEditorControls.tsx` (UI) and the API's
  `getHeatmapTilesWithIncompatibleSources` (renamed from
  `...WithNonTraceSources`) reference the same set so UI and API
  gates move together.
- Add a parity comment on the deserializer pointing to
  `applyHeatmapDefaults` so the `aggCondition`/`aggConditionLanguage`
  defaults stay coupled to the editor's behaviour.
- Consolidate three pure-Zod schema-rejection tests
  (raw-SQL-heatmap absent, empty valueExpression, multiple select
  items) into a single parametrized `it.each` block; the runtime
  non-Trace-source rejection stays as its own test because it
  exercises the new `getHeatmapTilesWithIncompatibleSources` path.
@alex-fedotyev
Copy link
Copy Markdown
Contributor Author

P2 cleanups addressed in e89989d:

  • Dropped redundant whereLanguage.optional() and where: ?? '' to match sibling chart-config arms
  • Renamed HeatmapBuilderChartConfigHeatmapChartConfig (the Builder suffix is reserved for cases that disambiguate from a sibling RawSql type; heatmap has no raw-SQL variant)
  • convertToExternalHeatmapSelectItem is now non-optional; the case-arm caller falls through to defaultTileConfig with logger.warn for legacy/corrupted Mongo docs that lack valueExpression
  • Extracted HEATMAP_ALLOWED_SOURCE_KINDS and isHeatmapCompatibleSource to common-utils so ChartEditorControls.tsx and getHeatmapTilesWithIncompatibleSources (renamed) reference the same set; UI and API gates now move together
  • Parity comment on the deserializer pointing at applyHeatmapDefaults for aggCondition/aggConditionLanguage
  • Three pure-Zod rejection tests consolidated into a single parametrized it.each; the runtime non-Trace-source rejection stays as its own test because it exercises the new code path

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

Deep Review

🔴 P0/P1 -- must fix

  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:899 -- cleanupDashboardAlerts only deletes alerts on raw-SQL tiles with unsupported display types or removed tile ids, so converting an alerted line tile to heatmap (same tile id) strands the existing builder alert against a displayType that does not support alerts.
    • Fix: Extend the cleanup filter to also delete alerts when a builder tile's new displayType is not in displayTypeSupportsBuilderAlerts(...).
    • reliability
  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:726 -- filterChangedHeatmapTiles has four branches (new tile, existing-was-raw-SQL, displayType-changed-to-heatmap, sourceId-changed) but only the same-source no-op branch is exercised by integration tests, so a regression in any other branch passes silently.
    • Fix: Add PUT tests covering sourceId change to a non-Trace source, line→heatmap displayType change, raw-SQL→heatmap conversion, and a tile id whose previous config was heatmap but is now non-heatmap.
    • testing
  • packages/api/src/mcp/tools/dashboards/saveDashboard.ts:299 -- The MCP updateDashboard flow calls filterChangedHeatmapTiles and getHeatmapTilesWithIncompatibleSources but no MCP test ever reaches them, so the entire MCP heatmap-update gate is untested.
    • Fix: Mirror the REST PUT scoping tests at the MCP layer -- unchanged-heatmap bypass, sourceId-change rejection, and displayType-change-to-heatmap validation.
    • testing

🟡 P2 -- recommended

  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:315 -- The corrupted-heatmap placeholder emits valueExpression: '', which violates the minLength: 1 declared on HeatmapSelectItem in the OpenAPI spec shipped in this same PR; strict-validating SDK clients will throw on any GET that returns the placeholder.
    • Fix: Either drop minLength: 1 on the response-side schema (split request and response variants) or emit a non-empty sentinel like '<corrupted>' so the response stays in contract while still failing a re-PUT.
    • api-contract, adversarial, kieran-typescript
  • packages/api/src/utils/zod.ts:309 -- countExpression: z.string().max(10000).optional() accepts ''; the deserializer's !== undefined write persists empty string, the renderer's ?? 'count()' only nullish-coalesces, and the heatmap bucket_calc CTE then emits invalid SELECT AS "value_count" FROM ... SQL when a user views the tile.
    • Fix: Tighten the schema to countExpression: z.string().min(1).max(10000).optional(), or change the renderer fallback to || for the heatmap path so empty strings also default to count().
    • adversarial
  • packages/api/src/routers/external-api/__tests__/dashboards.test.ts:2560 -- The corrupted-heatmap GET test asserts only config.displayType === 'heatmap', so a regression that drops select, flips aggFn to 'count', or changes the where-defaults still passes.
    • Fix: Assert the placeholder shape structurally -- select length 1, select[0].aggFn === 'heatmap', select[0].valueExpression === '', where === '', and whereLanguage === 'lucene'.
    • testing
  • packages/api/src/mcp/__tests__/dashboards.test.ts:462 -- REST rejects raw-SQL heatmap tiles via an it.each table, but MCP has no parallel test, so a regression that lets heatmap leak into mcpSqlTileSchema's displayType enum would silently pass.
    • Fix: Add an MCP test calling hyperdx_save_dashboard with { configType: 'sql', displayType: 'heatmap', ... } and asserting isError: true.
    • testing
  • packages/api/src/routers/external-api/v2/dashboards.ts:1933 -- All four 400-class validators (missingSources, missingConnections, sourceConnectionMismatches, heatmapNonTraceSources) now run before the findOneAndUpdate 404 path, so a PUT to a non-existent dashboard id with a non-Trace heatmap returns the misleading Heatmap tiles require a Trace source instead of 404.
    • Fix: Return 404 immediately after the Promise.all when existingDashboard == null so the existence check precedes the validation gates.
    • reliability
  • packages/api/src/mcp/tools/dashboards/schemas.ts:249 -- mcpHeatmapSelectItemSchema is a hand-maintained near-duplicate of externalDashboardHeatmapSelectItemSchema (the inline comment even concedes "stay in lockstep"), so a future field addition must be made in two places or the surfaces silently diverge.
    • Fix: Export the heatmap select-item shape from utils/zod.ts and have the MCP variant wrap it with .describe(...) overlays for the LLM hints, so there is a single source of truth.
    • maintainability
🔵 P3 nitpicks (4)
  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:14 -- The file imports lodash twice (import { pick } from 'lodash' and import _ from 'lodash'); only pick, _.omitBy, and _.isNil are used, so the dual import reads as accidental and trips lint setups that flag duplicate imports.
    • Fix: Collapse to import { pick, omitBy, isNil } from 'lodash' and replace _.omitBy(internalConfig, _.isNil) with omitBy(internalConfig, isNil).
  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:13 -- SearchConditionLanguageSchema as whereLanguageSchema is already aliased in utils/zod.ts; re-aliasing it locally just for savedQueryLanguage and the SavedQueryLanguage type adds a second renaming layer for no benefit.
    • Fix: Drop the local alias and use SearchConditionLanguageSchema directly, or re-export the alias from utils/zod.ts and import from there.
  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:454 -- Both exhaustiveness default: arms (after externalConfig satisfies never) silently fall through to internalConfig = {} as SavedChartConfig; if any future schema drift ever lets the assertion compile, the converter persists {} to Mongo as a tile config.
    • Fix: Replace the empty-cast assignments with throw new Error('unreachable: unhandled displayType') so an exhaustiveness escape becomes a loud failure instead of silent corruption.
  • packages/api/src/routers/external-api/v2/utils/dashboards.ts:312 -- The placeholder logger.warn records tileId: sourceId, but the converter's signature is (config: SavedChartConfig) and has no access to the actual tile id; operators searching logs by tile id will never find these warnings.
    • Fix: Either thread the tile id into the converter so the log carries a real id, or rename the log field to sourceId and add a note that the tile id is unavailable at this layer.

Reviewers (10): correctness, testing, maintainability, project-standards, api-contract, security, performance, adversarial, kieran-typescript, reliability.

Testing gaps:

  • No coverage for the line→heatmap stale-alert path now reachable via this PR's writable heatmap surface.
  • No test exercises the renderer with the placeholder's empty valueExpression to verify the dashboard degrades gracefully rather than crashing the whole render.
  • The convertToExternalSelectItem safeParse-with-fallback silently downgrades unknown aggFn values (e.g. quantileMerge) to 'none' and drops out-of-anchor level values on read; no test asserts or warns on this drift.

alex-fedotyev added a commit that referenced this pull request May 7, 2026
Deep-review on #2200 flagged the previous comment as overstating
parity. The editor's `applyHeatmapDefaults` writes `numberFormat:
{ output: 'duration', factor: 0.001 }`, `series.0.countExpression:
'count()'`, and `aggConditionLanguage: getStoredLanguage() ?? 'lucene'`,
while this converter hardcodes `'lucene'` for the language and passes
`numberFormat`/`countExpression` through verbatim from the external
payload (or leaves them absent).

Comment now enumerates the divergences and the rendering implications:
the renderer does not read `aggConditionLanguage` for heatmap tiles
(heatmap has no per-select where), and an API-built tile renders
without duration formatting unless the caller specifies `numberFormat`.
No behavior change.
alex-fedotyev added a commit that referenced this pull request May 7, 2026
Deep-review on #2200 flagged that buildCreateDashboardPrompt and
buildQueryGuidePrompt enumerate every supported displayType but
omit heatmap, hiding the new capability from agents reading the
prompt.

Adds heatmap to:

  - TILE TYPE GUIDE (buildCreateDashboardPrompt): one-line description
    plus the Trace-source-only and aggFn:"heatmap" + valueExpression
    requirements.
  - COMMON MISTAKES TO AVOID: a single bullet covering the source-kind
    and select-item rules.
  - PER-TILE TYPE CONSTRAINTS (buildQueryGuidePrompt): 1 select item,
    no groupBy, Trace sources only.

Not in this commit: the MCP source-kind runtime gate
(getHeatmapTilesWithIncompatibleSources) flagged as P0/P1 #2 in the
same deep-review. The helper is added in #2200 and is not yet
importable from this branch; the gate will land in the rebase commit
after #2200 merges.
alex-fedotyev and others added 10 commits May 8, 2026 00:19
Heatmap is the only builder-mode display type that did not round-trip
through the external dashboards API. The serializer dropped it into the
"unsupported" fall-through, so creating, fetching, and updating heatmap
tiles via /api/v2/dashboards lost the config.

This adds:

- A heatmap select-item schema in zod.ts that exposes the literal
  aggFn 'heatmap', valueExpression, optional countExpression, alias,
  and heatmapScaleType. Heatmap is included in the builder
  discriminated union only; it is intentionally not added to the raw
  SQL union since heatmap rendering requires isBuilderChartConfig.
- A heatmap chart config schema with optional groupBy and numberFormat.
- Builder serialization that reads the heatmap-specific fields off
  config.select[0] and emits aggFn 'heatmap' on the external surface,
  passing through groupBy and numberFormat.
- Builder deserialization that maps the external aggFn 'heatmap' back
  to the internal aggFn 'count' the editor form persists, preserving
  countExpression, alias, and heatmapScaleType on the select item.
- OpenAPI JSDoc with HeatmapSelectItem and HeatmapBuilderChartConfig
  components, and heatmap added to the TileConfig discriminator.
- Integration tests: heatmap added to the existing "round-trip all
  supported chart types" tests for both POST and PUT, plus an explicit
  rejection test for raw SQL heatmap tiles.
- Regenerated openapi.json.

Follow-up to #2107 (review feedback from @pulpdrew).
knip flagged the exported type as unused. Extracted the inline heatmap
select-item construction into a small convertToExternalHeatmapSelectItem
helper that mirrors the existing convertToExternalSelectItem pattern and
returns the typed shape, kills the knip warning, and removes the
duplicated nullish-spread bookkeeping at the call site.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reviewer flagged that convertToExternalHeatmapSelectItem used truthy
checks (countExpression ? ...) while convertToInternalTileConfig used
!== undefined. An empty-string countExpression/alias round-trip would
silently drop the field. Use !== undefined consistently in both
directions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The HeatmapSeriesEditor in the UI binds a single SearchWhereInput to the
top-level `where` / `whereLanguage` and does not render any groupBy
input. The previous schema had it backwards: it carried groupBy (not
exposed in the UI, never set by the form) and was missing
where / whereLanguage (the only filter heatmap actually uses).

- zod: drop groupBy, add where + whereLanguage to
  externalDashboardHeatmapChartConfigSchema
- conversion utils: serializer emits chart-level where + whereLanguage;
  deserializer maps them onto BuilderSavedChartConfig.where /
  whereLanguage instead of stuffing them into select-item aggCondition
- OpenAPI: HeatmapBuilderChartConfig has where + whereLanguage instead
  of groupBy; description notes the row-level filter is at chart level
- tests: POST round-trip exercises sql where ("ServiceName = 'api'")
  with heatmapScaleType: 'log'; PUT round-trip exercises lucene where
  ("service:api") with heatmapScaleType: 'linear'. Both languages and
  both scale types are now covered end-to-end.
- regenerated openapi.json
…ression

Two more gaps found while auditing the schema against the heatmap UI surface:

1. The heatmap UI's source picker is restricted to SourceKind.Trace
   (ChartEditorControls.tsx:103-107: allowedSourceKinds={[SourceKind.Trace]}).
   The external API previously accepted any source kind, so a metric or log
   source would round-trip cleanly through /api/v2/dashboards and produce a
   tile that does not render in the UI. Adds getHeatmapTilesWithNonTraceSources
   and wires it into both POST and PUT alongside the existing source /
   connection validators, returning 400 with a descriptive message.

2. validateChartForm in the editor rejects empty valueExpression on heatmap
   ("Value expression is required for heatmap charts"). The Zod schema
   accepted empty string. Tightens to z.string().min(1).max(10000) and
   updates the OpenAPI minLength to match.

Tests added: rejection on non-Trace source (with assertion on the error
message), rejection on empty valueExpression, rejection on multi-item
select array.

Regenerated openapi.json.
The existing realistic-payload round-trip exercises every optional field
populated. The minimal path (no countExpression / alias /
heatmapScaleType / where / whereLanguage / numberFormat) is what the
deserializer's !== undefined guards in v2/utils/dashboards.ts exist to
protect, and was uncovered until now.

The Zod schema applies where: z.string().optional().default(''), so the
round-trip surfaces an empty string for where while the other optional
fields stay undefined. The expected response keeps explicit on the
applied default rather than asserting strict equality with the request.
When the external API request omits whereLanguage on a heatmap tile,
the deserializer at v2/utils/dashboards.ts:514 fills it with 'lucene'
so the Mongo doc stays consistent across heatmap and non-heatmap chart
types. The minimal-fields round-trip needs to expect whereLanguage on
the response, not undefined.

The schema-level constraint stays optional (matching the OpenAPI
shape); the default lives at the persistence boundary, not the
contract boundary.
Compound-review feedback on #2200, all P2:

- Drop redundant `whereLanguage.optional()`; the underlying
  `SearchConditionLanguageSchema` (alias `whereLanguageSchema`) is
  already `.optional()` and the sibling
  `externalDashboardSearchChartConfigSchema` writes it without the
  outer wrap.
- Drop the `where: externalConfig.where ?? ''` redundancy in the
  deserializer; the Zod schema declares `.default('')` so the field is
  always a string post-parse.
- Rename schema `HeatmapBuilderChartConfig` -> `HeatmapChartConfig`
  to match the sibling `SearchChartConfig` / `MarkdownChartConfig`
  naming. The `Builder` suffix is reserved for cases that disambiguate
  from a sibling `RawSqlChartConfig`; heatmap has no raw-SQL variant.
- `convertToExternalHeatmapSelectItem` now takes a non-optional item.
  The case-arm caller checks for missing/string/empty-valueExpression
  items and falls through to `defaultTileConfig` with a logger.warn
  instead of silently emitting an out-of-contract payload that
  violates the external schema's `min(1)` rule.
- Extract `HEATMAP_ALLOWED_SOURCE_KINDS` and
  `isHeatmapCompatibleSource` into common-utils. Both
  `ChartEditorControls.tsx` (UI) and the API's
  `getHeatmapTilesWithIncompatibleSources` (renamed from
  `...WithNonTraceSources`) reference the same set so UI and API
  gates move together.
- Add a parity comment on the deserializer pointing to
  `applyHeatmapDefaults` so the `aggCondition`/`aggConditionLanguage`
  defaults stay coupled to the editor's behaviour.
- Consolidate three pure-Zod schema-rejection tests
  (raw-SQL-heatmap absent, empty valueExpression, multiple select
  items) into a single parametrized `it.each` block; the runtime
  non-Trace-source rejection stays as its own test because it
  exercises the new `getHeatmapTilesWithIncompatibleSources` path.
Deep-review on #2200 flagged the previous comment as overstating
parity. The editor's `applyHeatmapDefaults` writes `numberFormat:
{ output: 'duration', factor: 0.001 }`, `series.0.countExpression:
'count()'`, and `aggConditionLanguage: getStoredLanguage() ?? 'lucene'`,
while this converter hardcodes `'lucene'` for the language and passes
`numberFormat`/`countExpression` through verbatim from the external
payload (or leaves them absent).

Comment now enumerates the divergences and the rendering implications:
the renderer does not read `aggConditionLanguage` for heatmap tiles
(heatmap has no per-select where), and an API-built tile renders
without duration formatting unless the caller specifies `numberFormat`.
No behavior change.
Six findings from the bot review on #2200:

P0/P1
- MCP schema rejected heatmap tiles. Adds `mcpHeatmapTileSchema`
  (mirroring `externalDashboardHeatmapChartConfigSchema`) and includes
  it in the `mcpTilesParam` union so `hyperdx_save_dashboard` accepts
  the same heatmap shape the REST endpoint accepts.
- MCP create/update did not run the heatmap source-kind gate. Both
  paths now call `getHeatmapTilesWithIncompatibleSources` after schema
  validation so an MCP-issued create cannot persist a heatmap that
  the REST PUT would reject with 400 on the next round-trip. The
  update path scopes the check to heatmap tiles whose `sourceId` or
  `displayType` changed in this request, matching the REST PUT.
- `buildQueryGuidePrompt` did not document heatmap. Adds entries to
  TILE TYPE GUIDE, PER-TILE TYPE CONSTRAINTS, AGGREGATION FUNCTIONS,
  and COMMON MISTAKES so the LLM cannot skip the Trace-source and
  non-empty-`valueExpression` rules when generating tiles.

P2
- `convertTileToExternalChart` returned `displayType: 'line'` via
  `defaultTileConfig` for any heatmap tile whose internal doc lacked a
  non-empty `valueExpression`. A GET / mutate / PUT round-trip would
  then silently overwrite the heatmap with a line chart in Mongo
  (data loss). The heatmap branch now emits a heatmap-shaped
  placeholder with empty `valueExpression`, so the response preserves
  `displayType` and a re-PUT surfaces the breakage as a clear schema
  error from the input schema's `min(1)` rule rather than dropping
  the heatmap.
- The PUT-time `getHeatmapTilesWithIncompatibleSources` ran on every
  request regardless of whether heatmap tiles changed, so a source
  whose `kind` was changed to non-Trace after the dashboard was
  originally accepted wedged every PUT. The check now scopes to
  heatmap tiles whose `sourceId` or `displayType` changed compared
  to the existing dashboard via a new `filterChangedHeatmapTiles`
  helper.
- POST and PUT each ran three separate `getSources(team)` queries
  (one per validation helper). The helpers now accept a pre-fetched
  `SourceForValidation[]`, hoisted into a single
  `fetchSourcesForValidation` call shared across
  `getMissingSources`, `getSourceConnectionMismatches`, and
  `getHeatmapTilesWithIncompatibleSources`. Same change applied to
  the MCP create/update paths. Drops two redundant DB round-trips
  per request.

Tests cover the MCP schema accept/reject paths, the heatmap
source-kind gate at the MCP layer, the GET placeholder behavior, the
PUT scoping (no 400 on unrelated edits when the underlying source's
`kind` was changed later), and that the prompt documents heatmap in
all four sections.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@alex-fedotyev
Copy link
Copy Markdown
Contributor Author

Pushed a fix-pack addressing the deep-review findings in commit a217900. Mapping:

P0/P1

  1. schemas.ts:312 MCP heatmap schema gap. Adds mcpHeatmapTileSchema mirroring externalDashboardHeatmapChartConfigSchema and includes it in the mcpTilesParam union.
  2. saveDashboard.ts:115 MCP source-kind gate. Both the create and update paths now run getHeatmapTilesWithIncompatibleSources. The update path scopes the check to heatmap tiles whose sourceId or displayType changed in the request, matching the REST PUT.
  3. content.ts:27 heatmap missing from prompt. Added entries to TILE TYPE GUIDE, PER-TILE TYPE CONSTRAINTS, AGGREGATION FUNCTIONS, and COMMON MISTAKES (the bot called out three sections; AGGREGATION FUNCTIONS was added too because aggFn: "heatmap" is heatmap-only).

P2

  1. utils/dashboards.ts:309 silent line downgrade. convertTileToExternalChart now emits a heatmap-shaped placeholder with empty valueExpression instead of falling through to defaultTileConfig. A re-PUT surfaces the breakage as a clear schema error rather than persisting displayType: 'line'.
  2. dashboards.ts:1921 PUT wedging on unrelated edits. New filterChangedHeatmapTiles helper restricts the source-kind check to heatmap tiles whose sourceId or displayType changed. Same scoping applied to the MCP update path.
  3. dashboards.ts:1691 triple getSources(team). Hoisted into a single fetchSourcesForValidation call shared across all three helpers (REST POST, REST PUT, MCP create, MCP update). Drops two redundant DB round-trips per request.

Tests cover the MCP schema accept/reject paths, the heatmap source-kind gate at the MCP layer, the GET placeholder behavior, the PUT scoping (no 400 on unrelated edits when the underlying source's kind was changed later), and the prompt content.

Skipping the P2/P3 items the task spec marked as deferred (renames, cosmetic TS, aggFn translation tracked separately, duplicate POST/PUT validation ladder, /api/v2/charts/series heatmap gap, P3 nitpicks).

Copy link
Copy Markdown
Contributor

@pulpdrew pulpdrew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small thoughts on the API spec, otherwise LGTM

aggFn: z.literal('heatmap'),
valueExpression: z.string().min(1).max(10000),
countExpression: z.string().max(10000).optional(),
alias: z.string().max(10000).optional(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alias isn't relevant to heatmaps, is it?

// asks for it.
select: [
{
aggFn: 'count',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're just always saving count here, could we just remove the aggFn property from the OpenAPI spec / zod schema for the heatmap select item? Why make the user provide it if we ignore it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge review/tier-4 Critical — deep review + domain expert sign-off

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants