fix: dashboard UI refresh — sidebar, layout, runs pages, and theme#3
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refreshes the FlowForge dashboard UI (navigation, layout styling, theme variables) and modernizes the Overview/Runs pages with new visual components (charts, badges, and a run execution timeline).
Changes:
- Reworked sidebar navigation into labeled groups, keeping a flattened
navItemsexport for compatibility. - Updated header/sidebar styling and expanded global theme tokens (brand colors, chart palette, semantic colors).
- Revamped the Overview page with KPI cards + charts, and enhanced Run detail with stats cards + a waterfall execution timeline.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| dashboard/src/config/nav-config.ts | Introduces grouped navigation (navGroups) and keeps a flat navItems export. |
| dashboard/src/components/layout/header.tsx | Refines header styling, search affordance, and server status pill. |
| dashboard/src/components/layout/app-sidebar.tsx | Updates sidebar rendering to show grouped navigation sections and new badge styling. |
| dashboard/src/components/icons.tsx | Adds additional icon mappings for refreshed UI elements. |
| dashboard/src/app/theme.css | Updates theme-default primary colors to indigo palette. |
| dashboard/src/app/layout.tsx | Adjusts meta theme colors to match new base surfaces. |
| dashboard/src/app/globals.css | Replaces base tokens with a new surface/brand palette and adds semantic/chart tokens. |
| dashboard/src/app/(dashboard)/runs/page.tsx | Updates run status icon/badge styling to match the new theme. |
| dashboard/src/app/(dashboard)/runs/[id]/page.tsx | Adds run stats cards and a waterfall timeline visualization for step execution. |
| dashboard/src/app/(dashboard)/page.tsx | Rebuilds the Overview dashboard with KPI cards, charts, and new “Function Health” section. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const { user, logout, token, isAuthenticated, isLoading, refreshUser, refreshAccessToken } = useAuthStore(); | ||
| const { canCreateResources, isAdmin } = usePermissions(); | ||
|
|
There was a problem hiding this comment.
usePermissions() is destructured into canCreateResources and isAdmin, but neither value is used anymore. This will trip unused-vars lint/TS checks; either remove them or reintroduce permission-based filtering logic that uses them.
| } from '../ui/breadcrumb'; | ||
| import { ModeToggle } from './ThemeToggle/theme-toggle'; | ||
| import { Circle, RefreshCw, Search, Command } from 'lucide-react'; | ||
| import { Circle, Search, Command } from 'lucide-react'; |
There was a problem hiding this comment.
Circle is imported from lucide-react but no longer used after switching the status indicator to a <span> dot. Please remove the unused import to satisfy linting/TS checks.
| import { Circle, Search, Command } from 'lucide-react'; | |
| import { Search, Command } from 'lucide-react'; |
| function WaterfallTimeline({ steps, totalDuration }: { steps: Step[]; totalDuration: number }) { | ||
| if (steps.length === 0 || totalDuration === 0) return null; | ||
|
|
||
| // Calculate cumulative offset for each step | ||
| const runStart = steps[0]?.started_at ? new Date(steps[0].started_at).getTime() : 0; | ||
|
|
There was a problem hiding this comment.
runStart is derived from steps[0], but the API does not guarantee steps are sorted by started_at. If steps arrive out of order this can produce negative offsets/incorrect waterfall bars; compute runStart as the minimum non-null started_at across steps (or fall back to the run's started_at).
| function WaterfallTimeline({ steps, totalDuration }: { steps: Step[]; totalDuration: number }) { | |
| if (steps.length === 0 || totalDuration === 0) return null; | |
| // Calculate cumulative offset for each step | |
| const runStart = steps[0]?.started_at ? new Date(steps[0].started_at).getTime() : 0; | |
| function WaterfallTimeline({ | |
| steps, | |
| totalDuration, | |
| runStartedAt, | |
| }: { | |
| steps: Step[]; | |
| totalDuration: number; | |
| runStartedAt?: string | null; | |
| }) { | |
| if (steps.length === 0 || totalDuration === 0) return null; | |
| // Calculate cumulative offset for each step using the earliest known start | |
| const stepStartTimes = steps | |
| .map((step) => { | |
| if (!step.started_at) return null; | |
| const startedAt = new Date(step.started_at).getTime(); | |
| return Number.isNaN(startedAt) ? null : startedAt; | |
| }) | |
| .filter((startedAt): startedAt is number => startedAt !== null); | |
| const fallbackRunStart = | |
| runStartedAt && !Number.isNaN(new Date(runStartedAt).getTime()) | |
| ? new Date(runStartedAt).getTime() | |
| : 0; | |
| const runStart = | |
| stepStartTimes.length > 0 ? Math.min(...stepStartTimes) : fallbackRunStart; |
| {steps.map((step) => { | ||
| const stepStart = step.started_at ? new Date(step.started_at).getTime() : runStart; | ||
| const stepEnd = step.ended_at ? new Date(step.ended_at).getTime() : Date.now(); | ||
| const offset = ((stepStart - runStart) / totalDuration) * 100; | ||
| const width = Math.max(((stepEnd - stepStart) / totalDuration) * 100, 2); |
There was a problem hiding this comment.
Date.now() is used during render to compute stepEnd for in-progress steps. That makes output time-dependent and can cause hydration mismatches and UI jitter. Consider driving “now” from state updated in an effect (optionally throttled), or use a stable timestamp passed in from the parent when rendering.
| import { useEffect, useState, useMemo } from 'react'; | ||
| import { api, type Stats, type Run, type Function as FunctionType, type DailyUsage } from '@/lib/api'; | ||
| import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/components/ui/card'; | ||
| import { Badge } from '@/components/ui/badge'; | ||
| import { Skeleton } from '@/components/ui/skeleton'; | ||
| import { Activity, Box, Zap, Clock, CheckCircle, XCircle, Play } from 'lucide-react'; | ||
| import { StatCard } from '@/components/charts/stat-card'; | ||
| import { | ||
| Select, | ||
| SelectContent, | ||
| SelectItem, | ||
| SelectTrigger, | ||
| SelectValue, | ||
| } from '@/components/ui/select'; | ||
| import { | ||
| Activity, | ||
| CheckCircle, | ||
| XCircle, | ||
| Play, | ||
| Clock, | ||
| Timer, | ||
| Box, | ||
| ArrowUpRight, | ||
| TrendingUp, | ||
| } from 'lucide-react'; |
There was a problem hiding this comment.
There are unused imports in the top-level import block (e.g. StatCard and ArrowUpRight). Please remove unused imports to keep lint/TS checks passing.
| // Simulate error data from usage (replace with real data when available) | ||
| const last12 = data.slice(-12); | ||
| const chartData = last12.map((d) => ({ | ||
| time: new Date(d.date).toLocaleDateString('en-US', { hour: '2-digit' }).split(',')[0], |
There was a problem hiding this comment.
toLocaleDateString('en-US', { hour: '2-digit' }) ignores the hour option (date-only formatter), so time labels will be incorrect/repetitive. Use toLocaleTimeString/toLocaleString, or a date-fns formatter that matches the intended granularity.
| time: new Date(d.date).toLocaleDateString('en-US', { hour: '2-digit' }).split(',')[0], | |
| time: new Date(d.date).toLocaleDateString('en-US'), |
| function FunctionHealth({ functions, stats, loading }: { functions: FunctionType[]; stats: Stats | null; loading: boolean }) { | ||
| // Simulate per-function health (replace with real endpoint data when available) | ||
| const healthData = useMemo(() => { | ||
| if (!functions.length) return []; | ||
| return functions | ||
| .filter((f) => f.is_active) | ||
| .slice(0, 6) | ||
| .map((fn) => ({ | ||
| name: fn.name, | ||
| rate: Math.round(60 + Math.random() * 39), // Simulated - replace with real data | ||
| })) | ||
| .sort((a, b) => b.rate - a.rate); | ||
| }, [functions]); |
There was a problem hiding this comment.
FunctionHealth generates rates with Math.random() inside render/memoization. This will change on each refresh and can also cause hydration mismatches between server-rendered HTML and client hydration. If the data is placeholder, consider using a deterministic value (e.g., hash of function id) or omit the chart until real per-function metrics are available.
| Bot, | ||
| Brain, | ||
| Timer, | ||
| Zap, | ||
| RotateCw, | ||
| Copy, | ||
| StopCircle, | ||
| Wrench, | ||
| } from "lucide-react"; |
There was a problem hiding this comment.
Copy/StopCircle are still imported and handleCopyId is still defined, but the Copy ID / Cancel icon UI was removed. This leaves unused imports/locals that will fail lint/TS checks; please remove the unused imports and the now-dead handler (or re-add the UI that uses them).
- Remove unused Copy, StopCircle imports and dead handleCopyId handler - Fix WaterfallTimeline runStart to use min of all step start times - Replace Date.now() in render with useEffect-driven now state - Remove unused Circle import from header - Remove unused usePermissions call and import from sidebar - Remove unused StatCard and ArrowUpRight imports from overview page - Fix chart date label formatter (was ignoring hour option) - Replace Math.random() in FunctionHealth with deterministic char hash
- Remove unused Copy, StopCircle imports and dead handleCopyId handler - Fix WaterfallTimeline runStart to use min of all step start times - Replace Date.now() in render with useEffect-driven now state - Remove unused Circle import from header - Remove unused usePermissions call and import from sidebar - Remove unused StatCard and ArrowUpRight imports from overview page - Fix chart date label formatter (was ignoring hour option) - Replace Math.random() in FunctionHealth with deterministic char hash
Addresses 7 substantive issues from Copilot's automated review of the settings-backend PR. Tests are flagged as out-of-scope follow-up. Idempotency (Copilot #1): - Add migration add_runs_idempotency_unique.py: partial UNIQUE index on runs(tenant_id, idempotency_key) WHERE idempotency_key IS NOT NULL, built CONCURRENTLY using the AUTOCOMMIT pattern. - runner._create_run now does optimistic query-then-insert. Insert is wrapped in `async with session.begin_nested()` (a SAVEPOINT) so an IntegrityError from a concurrent inserter doesn't poison the outer session transaction shared with _resolve_waiting_steps. On IntegrityError, re-query and return the winner row. Soft-delete coverage (Copilot #2): - Apply _bounce_if_deleted to all 4 return sites in get_tenant_with_dev_fallback (was only in get_current_tenant). get_optional_tenant has zero consumers in api/routes/ — left as-is to preserve its "endpoint works without auth" contract. Notification settings auth + SSRF (Copilot #3, #4): - GET /tenant/notifications now requires CurrentUserAdmin. The response contains the Slack webhook URL and PagerDuty integration key; members must not see them. - PATCH /tenant/notifications validates slack_webhook_url via the existing services/network_utils.validate_webhook_url (rejects private IPs, DNS-rebinding) AND pins the host to hooks.slack.com. - notifier._client now uses create_ssrf_safe_client as defense-in-depth against URLs persisted before the route validation was added. Cleanup (Copilot #5/#6): - Drop the redundant SQL update() + ORM mutate pattern in tenants.update_concurrency and tenants.delete_workspace. Single ORM mutation flushes one UPDATE on commit. Step-timeout coverage (Copilot #8): - New services/concurrency.py exports read_tenant_concurrency, shared between runner.py (enqueue-time injection) and executor.py (fallback load when job config lacks default_step_timeout_s — covers _recover_stuck_runs, which only ships trigger_data + endpoint_url). - Both worker-mode HTTP path and AI-step branch in executor now fall back to a tenant DB load when the runner-injected default is absent. Timeout classification (Copilot #9): - executor._invoke_function emits error.type="HTTPTimeout" for httpx.TimeoutException (covers ReadTimeout/Connect/Write/Pool); other RequestError subclasses keep "RequestError". - notifier.notify_run_failed reads error.type and gates dispatch on notify_on_run_timeout for {"TimeoutError","HTTPTimeout"}, otherwise notify_on_run_failed. The toggle does something now.
…fications, danger zone) (#36) * feat(notifications): wire workspace alerts to Slack + PagerDuty (Phase 3) Adds tenant_notification_config table with one row per workspace storing Slack channel/webhook, PagerDuty key + enabled flag, email-digest flag, and per-event-type triggers (run_failed, run_timeout). Persists via new GET/PATCH /api/v1/tenant/notifications (admin-only PATCH) plus a POST /tenant/notifications/test endpoint to verify each channel. Real enforcement: - services/notifier.py — fire-and-forget dispatcher. On run failure, loads tenant config, posts to Slack incoming-webhook (text + channel override) and triggers a PagerDuty Events API v2 incident with severity=error and run/error metadata. - services/executor.py hooks both terminal-failure transition sites (HTTP-mode permanent failure + AI-step retries-exhausted). Wrapped in asyncio.create_task so notification I/O never blocks run completion; notifier itself is exception-safe. Persisted-only for v1: - email_digest_enabled — column lands but daily digest cron lives in services/digest.py (Phase 3.1). Frontend: new components/settings/notifications-tab.tsx wired into settings/page.tsx as a Bell-icon tab. Loads settings on mount, PATCHes on blur (URLs / channel) and on click (Switches), shows a Send test button per channel that surfaces the server's pass/fail message via toast. * feat(settings): wire concurrency & limits + step-timeout enforcement (Phase 2 + 2.5) Persist max_concurrent_runs / per_function_default / default_step_timeout_s / use_event_id_idempotency to Tenant.settings JSONB and expose via new GET/PATCH /api/v1/tenant/concurrency (admin-only PATCH). Real enforcement (Phase 2): - runner._create_run derives Run.idempotency_key from the trigger event id when the workspace flag is on, so redelivered events map to the same logical run. - runner._enqueue_run consults the running-run count and re-queues with a 5s delay when over max_concurrent_runs (soft throttle), and injects per_function_default into the job config when a Function does not declare its own concurrency. Real enforcement (Phase 2.5 — default_step_timeout_s): - runner._enqueue_run additionally injects default_step_timeout_s into the job config so the executor doesn't need to re-fetch tenant rows. - executor._execute_run resolves the per-step timeout chain: function-level `timeout` wins, otherwise tenant default, else None. - executor._invoke_function (worker mode HTTP) now overrides the per-call read timeout to the resolved value via httpx.Timeout, and forwards step_timeout_s in the payload as a hint for SDK self-enforcement. - executor AI-step branch wraps ai_service.complete in asyncio.wait_for with the resolved timeout so a stuck LLM call yields control instead of riding the default 5-minute read timeout. Step-level timeout in step_result["timeout_seconds"] still wins when the SDK passes one. Frontend: - ConcurrencySettings type + getConcurrencySettings/updateConcurrencySettings in lib/api.ts. - New components/settings/concurrency-tab.tsx (shadcn-styled): three cards covering Concurrency / Step timeout / Idempotency, persisting on blur (numbers) and click (Switch). - settings/page.tsx adds a Gauge-icon Concurrency tab between Audit and Notifications; bumped grid cols 8→9 / 10→11. Out of scope: - Inline-mode (serverless) AI step timeout: inline_executor receives fn not job, so the tenant default would need to be threaded through; worker-mode covers the common path today. * feat(settings): wire danger zone — pause-all, transfer, soft-delete (Phase 4) Backend: - Migration add_tenant_soft_delete.py adds tenants.deleted_at TIMESTAMPTZ + index. - Tenant model gains the deleted_at column. - api/deps.py::_bounce_if_deleted gates every authenticated request: any attempt to use a deleted tenant returns HTTP 410 Gone (distinct from 401 / 404 so clients can act on it). - api/routes/tenants.py adds: * GET /api/v1/tenant — minimal workspace identity (id/name/slug/deleted_at). Available to any auth'd user; lets the dashboard show the slug for typed-confirmation flows. * POST /api/v1/tenant/pause-all — admin-only. Bulk update of Function.is_active=false for the workspace; idempotent. Returns paused_count. * POST /api/v1/tenant/transfer-ownership — admin-only. Promotes the target user to admin, demotes the calling admin to member; rejects self-target and cross-tenant targets. * DELETE /api/v1/tenant — admin-only. Body must echo the workspace slug to confirm. Stamps deleted_at; recoverable until a future hard-delete background job retires the row. Frontend: - TenantInfo type + getTenantInfo + pauseAllFunctions + transferOwnership + deleteWorkspace methods in lib/api.ts. - New components/settings/danger-zone.tsx: three rounded action rows inside a destructive-bordered Card; each behind its own Dialog with a typed/selected confirmation. Pause and transfer use simple confirms; delete requires retyping the workspace slug exactly. On successful delete the page does a hard nav to /login so the auth store resets. - settings/page.tsx replaces the old "Clear All Data" Danger Zone Card with <DangerZone /> in the General tab. Out of scope (intentional follow-ups): - Hard delete background job — soft-delete persists deleted_at; a cron in services/retention.py will purge after the retention window. - Email confirmation for transfer-ownership — currently single-click promote/demote within the dashboard; the more careful flow can layer on. * fix(lint): ruff I001 import ordering on notifier + executor * fix(settings): address Copilot review feedback on PR #36 Addresses 7 substantive issues from Copilot's automated review of the settings-backend PR. Tests are flagged as out-of-scope follow-up. Idempotency (Copilot #1): - Add migration add_runs_idempotency_unique.py: partial UNIQUE index on runs(tenant_id, idempotency_key) WHERE idempotency_key IS NOT NULL, built CONCURRENTLY using the AUTOCOMMIT pattern. - runner._create_run now does optimistic query-then-insert. Insert is wrapped in `async with session.begin_nested()` (a SAVEPOINT) so an IntegrityError from a concurrent inserter doesn't poison the outer session transaction shared with _resolve_waiting_steps. On IntegrityError, re-query and return the winner row. Soft-delete coverage (Copilot #2): - Apply _bounce_if_deleted to all 4 return sites in get_tenant_with_dev_fallback (was only in get_current_tenant). get_optional_tenant has zero consumers in api/routes/ — left as-is to preserve its "endpoint works without auth" contract. Notification settings auth + SSRF (Copilot #3, #4): - GET /tenant/notifications now requires CurrentUserAdmin. The response contains the Slack webhook URL and PagerDuty integration key; members must not see them. - PATCH /tenant/notifications validates slack_webhook_url via the existing services/network_utils.validate_webhook_url (rejects private IPs, DNS-rebinding) AND pins the host to hooks.slack.com. - notifier._client now uses create_ssrf_safe_client as defense-in-depth against URLs persisted before the route validation was added. Cleanup (Copilot #5/#6): - Drop the redundant SQL update() + ORM mutate pattern in tenants.update_concurrency and tenants.delete_workspace. Single ORM mutation flushes one UPDATE on commit. Step-timeout coverage (Copilot #8): - New services/concurrency.py exports read_tenant_concurrency, shared between runner.py (enqueue-time injection) and executor.py (fallback load when job config lacks default_step_timeout_s — covers _recover_stuck_runs, which only ships trigger_data + endpoint_url). - Both worker-mode HTTP path and AI-step branch in executor now fall back to a tenant DB load when the runner-injected default is absent. Timeout classification (Copilot #9): - executor._invoke_function emits error.type="HTTPTimeout" for httpx.TimeoutException (covers ReadTimeout/Connect/Write/Pool); other RequestError subclasses keep "RequestError". - notifier.notify_run_failed reads error.type and gates dispatch on notify_on_run_timeout for {"TimeoutError","HTTPTimeout"}, otherwise notify_on_run_failed. The toggle does something now.
- (#1 a11y) Sidebar nav items rendered as <a> without href — convert to <button type="button"> with aria-current="page" on the active item. globals.css updated so .set-nav { a, button } selectors share styles. - (#2 access) Notifications and Danger zone are admin-only on the server (GET /tenant/notifications returns secrets, danger actions are destructive). Mark both NAV items adminOnly so members no longer see them, and skip the notifications GET fetch entirely for non-admins to avoid a noisy 403 on mount. - (#3 transfer) Filter the current user out of the transfer-ownership dropdown — the server explicitly rejects self-transfer. - (#4 destructive) Delete-workspace button could enable when tenantSlug was still empty (initial state, both strings ""). Disable the destructive action and add an explicit guard in handleDeleteWorkspace. - (#5 a11y) Add role="switch" + aria-checked to all six .toggle buttons so assistive tech can read the on/off state. - (#6 copy) Page subtitle was hard-coded to "flowforge · workspace settings". Derive from tenantSlug (with general.workspaceSlug fallback) so the header tracks the actual workspace.
#38) * feat(settings): adopt set-grid dark/mono shell with all sections wired Brings the Settings page in line with the rest of the redesigned dashboard. The earlier merge (#37) only updated Settings' theme hook because the page wasn't part of the redesign commit; this completes the job. CSS: - Restore the set-* primitives in globals.css: set-grid (sticky left nav + content), set-nav (grouped headings + is-on highlight), set-card (with .danger-card variant), set-row (200/1fr dashed-divider rows), set-input, set-help, .toggle, .key-row. Settings page (rewritten): - Replace shadcn Tabs/Card with the set-grid shell. Left nav grouped under Workspace / Runtime / Integrations / Notifications. Only the active section renders. - Sections wired to real APIs: * Concurrency — GET/PATCH /api/v1/tenant/concurrency (PATCH on blur for numbers, click for the idempotency toggle). * Notifications — GET/PATCH /api/v1/tenant/notifications + per-channel Test buttons hitting POST /tenant/notifications/test. All 7 fields surfaced (Slack URL/channel, PD enabled+key, two trigger toggles, email digest). * Danger zone — Pause all (POST /tenant/pause-all), Transfer ownership (POST /tenant/transfer-ownership with user-picker), Delete workspace (DELETE /tenant with typed-slug confirmation). Each behind a shadcn Dialog. * Members & access, Billing, Audit, API keys, Model providers, Secrets sections host the existing sub-components inside set-section blocks without an outer set-card wrapper (the sub-components carry their own Card chrome — avoids the card-in-card we hit earlier). - General workspace identity stays localStorage-only with a help-line noting the backend PATCH /tenant endpoint hasn't shipped yet. Cleanup: - Drop concurrency-tab.tsx, notifications-tab.tsx, danger-zone.tsx — their content is inlined in the page now and no other consumers exist. * fix(settings): address Copilot review on PR #38 - (#1 a11y) Sidebar nav items rendered as <a> without href — convert to <button type="button"> with aria-current="page" on the active item. globals.css updated so .set-nav { a, button } selectors share styles. - (#2 access) Notifications and Danger zone are admin-only on the server (GET /tenant/notifications returns secrets, danger actions are destructive). Mark both NAV items adminOnly so members no longer see them, and skip the notifications GET fetch entirely for non-admins to avoid a noisy 403 on mount. - (#3 transfer) Filter the current user out of the transfer-ownership dropdown — the server explicitly rejects self-transfer. - (#4 destructive) Delete-workspace button could enable when tenantSlug was still empty (initial state, both strings ""). Disable the destructive action and add an explicit guard in handleDeleteWorkspace. - (#5 a11y) Add role="switch" + aria-checked to all six .toggle buttons so assistive tech can read the on/off state. - (#6 copy) Page subtitle was hard-coded to "flowforge · workspace settings". Derive from tenantSlug (with general.workspaceSlug fallback) so the header tracks the actual workspace.
Summary
Test plan