
Add developer guide explaining how NextAurora works #1

Merged
emeraldleaf merged 2 commits into main from copilot/review-repo-functionality on May 9, 2026

Copilot AI commented Mar 6, 2026

Contributor

The repo lacked a single entry point for developers to understand how the codebase is structured and how requests flow through the system end-to-end.

Changes

  • docs/how-it-works.md — new developer walkthrough covering:

    • Clean Architecture layer responsibilities and dependency rules
    • DDD entity patterns (private constructors, Create() factory guards, encapsulated collections, domain methods)
    • CQRS + MediatR pipeline: ValidationBehavior → LoggingBehavior → handler
    • Full traced walkthrough of a POST /api/orders request — HTTP endpoint → gRPC product validation + stock reservation → Order.Create() → repository → OrderPlacedEvent publish
    • Service Bus publish/consume pattern with correlation ID injection and DLQ abandonment
    • Choreography-based order lifecycle saga (Placed → Paid → Shipped via events)
    • Cross-cutting concerns: three-layer validation, GlobalExceptionHandler exception-to-status mapping, correlation ID propagation across HTTP/gRPC/Service Bus
    • Aspire AppHost.cs — what it wires up and why hardcoded URLs are absent
    • Test naming convention (MethodName_Condition_ExpectedResult) and builder pattern
    • "Where to look for what" quick-reference table
  • README.md — adds a Documentation section linking all five guides (how-it-works, architecture, observability, event-replay, BRD) so readers can navigate the docs without hunting through the repo.
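The CQRS + MediatR pipeline bullet above describes a request passing through a validation behavior and then a logging behavior before it reaches the handler. As a rough, language-neutral illustration (Python rather than the repo's C#, and not MediatR's actual API), the nesting could be sketched as follows. `PlaceOrder`, `validation_behavior`, `make_logging_behavior`, `build_pipeline`, and `place_order_handler` are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable

Handler = Callable[[Any], Any]
Behavior = Callable[[Any, Handler], Any]


@dataclass(frozen=True)
class PlaceOrder:
    """Hypothetical command; the real request types live in the repo."""
    product_id: str
    quantity: int


def validation_behavior(request: Any, next_step: Handler) -> Any:
    # Reject invalid requests before anything downstream runs.
    if request.quantity <= 0:
        raise ValueError("quantity must be positive")
    return next_step(request)


def make_logging_behavior(log: list[str]) -> Behavior:
    def logging_behavior(request: Any, next_step: Handler) -> Any:
        log.append(f"handling {type(request).__name__}")
        result = next_step(request)
        log.append(f"handled {type(request).__name__}")
        return result
    return logging_behavior


def build_pipeline(behaviors: list[Behavior], handler: Handler) -> Handler:
    # Wrap the handler from the inside out so behaviors[0] runs first.
    pipeline = handler
    for behavior in reversed(behaviors):
        pipeline = (lambda b, nxt: lambda req: b(req, nxt))(behavior, pipeline)
    return pipeline


def place_order_handler(request: PlaceOrder) -> str:
    return f"order for {request.quantity} x {request.product_id}"
```

Because validation sits outermost in this sketch, an invalid request raises before the logging behavior or the handler is ever invoked.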



Co-authored-by: emeraldleaf <5404190+emeraldleaf@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Explain functionality of the repository" to "Add developer guide explaining how NextAurora works" Mar 6, 2026
@emeraldleaf emeraldleaf marked this pull request as ready for review May 9, 2026 15:22
@emeraldleaf emeraldleaf merged commit 8ece53d into main May 9, 2026
Copilot stopped work on behalf of emeraldleaf due to an error May 9, 2026 22:53
emeraldleaf added a commit that referenced this pull request May 23, 2026
CodeRabbit (PR #17) flagged that the "in-flight PR" phrasing goes stale the
moment this PR merges. Reword to describe the configured state (upload is in
ci.yml) and the pending event (badge reflects aggregate after the next green
CI run), not the temporal state of this PR.

Addresses CodeRabbit finding #2 of 2 on the PR. Finding #1 (pin codecov-action
to commit SHA) deferred — would create inconsistency with the other six actions
in the same workflow (actions/checkout@v6, actions/setup-dotnet@v5,
actions/cache@v5, dorny/test-reporter@v3, github/codeql-action/*@v4, plus the
two codecov-action calls), all of which use version-pinned mutable tags by
existing repo convention. If we want SHA-pinning, it's a workflow-wide
hardening pass, not a one-off here.
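
The distinction drawn above, between actions pinned to a mutable version tag and actions pinned to a commit SHA, can be checked mechanically across a whole workflow. A minimal Python sketch, assuming a full 40-character hex ref counts as SHA-pinned and anything else counts as a mutable tag or branch. `classify_uses` is a hypothetical helper, not something in the repo.

```python
import re

# Matches "uses: owner/repo[/path]@ref" lines in a workflow file.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def classify_uses(workflow_text: str) -> dict[str, list[str]]:
    """Split action references into SHA-pinned and tag-pinned groups."""
    result: dict[str, list[str]] = {"sha": [], "tag": []}
    for action, ref in USES_RE.findall(workflow_text):
        kind = "sha" if FULL_SHA_RE.match(ref) else "tag"
        result[kind].append(f"{action}@{ref}")
    return result
```

Run against a workflow file's text, a non-empty `"tag"` list alongside a non-empty `"sha"` list would indicate the kind of mixed pinning the commit message wants to avoid.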

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
emeraldleaf added a commit that referenced this pull request May 28, 2026
Three small edits:

1. CodeRabbit #1 (full-saga-deployment-plan.md:335): the "Related docs"
   pointer said STATUS.md references this doc "under 'Open issues'" but
   the entry actually lives under "Next." Fixed to match.

2. CodeRabbit #2 (STATUS.md:103): the "Next" entry still said "four open
   decisions (D1-D4) block Phase 1" but the plan doc now lists D1-D4 as
   resolved 2026-05-27. Updated to reflect resolved state + name the
   chosen options + point at Phase 1A as the next concrete work.

3. Seq named as the Phase 3 telemetry choice. Sourced from an article on
   structured logging + distributed tracing for microservices with Seq.
   Seq fits this deployment shape better than App Insights — unified logs
   + traces in one UI, self-hostable on Fly with persistent volume
   (matches D2 Keycloak pattern), free tier covers demo scope, OTLP-native
   (slots into existing OpenTelemetry export with one config-line change).
   Gotcha captured: pin OpenTelemetry.Instrumentation.* packages
   explicitly in Directory.Packages.props, since non-stable RC versions
   for instrumentations like StackExchangeRedis differ across major
   bumps.

No new rule encoding — the article reinforces existing CLAUDE.md rules
(structured logging with message templates, OpenTelemetry instrumentation
patterns) without adding net-new lessons worth encoding.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
emeraldleaf added a commit that referenced this pull request May 28, 2026
* plan(deploy): add full-saga demo deployment plan + STATUS pointer

Cross-session tracking artifact for the multi-PR, multi-week effort to
stand up NextAurora as a portfolio-grade demo deployment running the
full Order → Payment → Shipping → Notification saga over real cloud
infrastructure with the Stripe gateway stubbed.

Structure:
- Three phases, each independently shippable:
  1. Order saga visible (Catalog + Order + minimal Storefront). Order
     persists, OrderPlacedEvent stages, saga stalls because PaymentService
     isn't deployed yet — itself a teaching demo.
  2. Full saga (Payment + Shipping + Notification). End-to-end flow over
     real infrastructure with stubbed Stripe.
  3. Polish (observability + ops + minimal UX) — demoable to humans.

- Four open decisions (D1-D4) called out explicitly: SQL Server hosting,
  identity provider, messaging transport, cost ceiling. All block Phase 1.

- Out-of-scope explicitly listed: real Stripe SDK (stub retirement),
  PaymentRecoveryJob retry-with-key (gated on stub retirement),
  production-grade DR/SLAs. Honest framing of what this is and isn't.

- Cost ledger structure ready for first entries once provisioning begins.

STATUS.md "Next" section now points at the plan so future sessions
can pick up coherently.

Following the IDSD/intent-driven discipline encoded in the recent
articles: explicit intent + expectations + context as a tracked
artifact BEFORE implementation, not after. The plan is the "ICE"
artifact for this multi-PR effort.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* plan(deploy): resolve D1-D4 decisions + split Phase 1 into 1A + 1B

User-resolved decisions captured in the plan doc:

- D1 (SQL Server hosting): Postgres-only-for-demo with provider swap.
  Dev environment keeps two-engine split (the architectural story is
  still genuine for dev/learning). Deployed environment uses Postgres
  for all four services with state. README footnote will explain the
  demo-deployment exception.

- D2 (Identity): Auth0 free tier. Hosted, simplest setup. Keycloak
  stays for local dev. ServiceDefaults JWT config is already
  config-driven — points at Auth0 issuer in deployed environment.

- D3 (Messaging): AWS SQS+SNS free tier. Wolverine reconfig from
  UseAzureServiceBus to UseAmazonSqs. 1M req/mo free covers demo
  volume comfortably.

- D4 (Cost ceiling): $30/mo target, $50/mo hard ceiling. Phase 1 ≤$15,
  Phase 2 ≤$30. Fly.io spend cap to be set at $50 before any
  provisioning.

Phase 1 now splits naturally into two sub-PRs given D1:

- Phase 1A: Postgres provider swap (code only). Add `DatabaseProvider`
  config setting, branch Order + Payment Program.cs on it, regenerate
  Postgres migrations, branch concurrency-token config (RowVersion vs
  xmin). Integration tests cover both paths. README footnote.

- Phase 1B: Deploy Order + minimal Storefront. Original Phase 1 goal,
  now sitting on top of 1A's plumbing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* plan(deploy): switch D2 to Keycloak self-hosted + restructure Phase 1 + Phase 3 rate-limit audit

User reconsidered D2 after seeing an article on Keycloak's value as an
identity provider. Switched from Auth0 to Keycloak self-hosted on Fly.

Rationale:
- Same IdP in dev and prod. NextAurora already runs Keycloak in local
  Aspire dev with the realm at realms/nextaurora-realm.json. Deployed
  Keycloak imports the same realm export on boot — one IdP, one realm,
  two environments. Removes "works in dev, breaks in prod" risk around
  realm shape, claim names, test users.
- Stronger portfolio story than vendor-managed Auth0.
- Cost is roughly comparable (~$5-10/mo for a Fly Machine vs. Auth0's
  free but vendor-locked tier).

Trade-off: real ops surface (Postgres DB for Keycloak state, persistent
volume, realm-import-on-boot, two-stage readiness probe, ~30-60s cold
start for Java + realm import).

Phase 1 now splits into three sub-PRs:
- 1A: Postgres provider swap (code only) — unchanged
- 1B: Deploy Keycloak self-hosted (NEW infrastructure-only sub-phase)
- 1C: Deploy Order + minimal Storefront against deployed Keycloak

Phase 3 deliverables grow by one item: rate-limit audit + Redis-backed
swap if any service runs 2+ Fly Machines. The in-memory ASP.NET Core
limiter on Catalog search + Payment process is correct for single-
instance Phase 2 but silently weakens under scale-out. The rule
encoding lands in a separate PR (CLAUDE.md / .coderabbit.yaml /
architecture-reviewer.md).
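
Why an in-memory limiter weakens under scale-out follows from simple arithmetic: each instance keeps its own counter, so N instances behind a load balancer can admit up to N times the configured limit. A small Python simulation, assuming a fixed-window limiter and round-robin routing (both simplifications). The class and function names are hypothetical, and this is not the ASP.NET Core limiter itself.

```python
class FixedWindowLimiter:
    """Per-process counter: knows nothing about other instances."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.count = 0

    def try_acquire(self) -> bool:
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


def admitted_in_one_window(instances: int, limit: int, requests: int) -> int:
    """Route requests round-robin and count how many get through."""
    limiters = [FixedWindowLimiter(limit) for _ in range(instances)]
    return sum(limiters[i % instances].try_acquire() for i in range(requests))
```

With a limit of 10 and 100 incoming requests, `admitted_in_one_window(1, 10, 100)` returns 10 while `admitted_in_one_window(3, 10, 100)` returns 30. A shared store such as Redis gives every instance the same counter, which is the swap the audit item describes.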

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* plan(deploy): CodeRabbit review fixes + Seq for Phase 3 telemetry

Three small edits:

1. CodeRabbit #1 (full-saga-deployment-plan.md:335): the "Related docs"
   pointer said STATUS.md references this doc "under 'Open issues'" but
   the entry actually lives under "Next." Fixed to match.

2. CodeRabbit #2 (STATUS.md:103): the "Next" entry still said "four open
   decisions (D1-D4) block Phase 1" but the plan doc now lists D1-D4 as
   resolved 2026-05-27. Updated to reflect resolved state + name the
   chosen options + point at Phase 1A as the next concrete work.

3. Seq named as the Phase 3 telemetry choice. Sourced from an article on
   structured logging + distributed tracing for microservices with Seq.
   Seq fits this deployment shape better than App Insights — unified logs
   + traces in one UI, self-hostable on Fly with persistent volume
   (matches D2 Keycloak pattern), free tier covers demo scope, OTLP-native
   (slots into existing OpenTelemetry export with one config-line change).
   Gotcha captured: pin OpenTelemetry.Instrumentation.* packages
   explicitly in Directory.Packages.props, since non-stable RC versions
   for instrumentations like StackExchangeRedis differ across major
   bumps.

No new rule encoding — the article reinforces existing CLAUDE.md rules
(structured logging with message templates, OpenTelemetry instrumentation
patterns) without adding net-new lessons worth encoding.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@emeraldleaf emeraldleaf deleted the copilot/review-repo-functionality branch June 4, 2026 00:40
emeraldleaf added a commit that referenced this pull request Jun 5, 2026
…n-CLAUDE.md) (#114)

* chore: prevention pass — hook + CLAUDE.md rule + CodeRabbit instruction (parts 2-4 of file-move discipline)

Three of four layers of the file-move drift prevention loop. Part #1
(extend the CI broken-link guard to scan Dockerfile*) lands in a
follow-up commit on this same PR once #112 is merged and rebased.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: extend prevention to Dockerfile COPY + diagrams + topology docs

Three more layers on top of cf27628 (parts 2-4 of the file-move
prevention loop). Same compounding-loop principle, broader coverage:

(a) Broken-COPY audit — Dockerfile* COPY/ADD source paths that don't
    exist in the build context. Catches the kind of drift that left
    Dockerfile.catalog referencing the pre-VSA-collapse 4-project layout
    for months after PR #31. Skips `--from=<stage>` cross-stage copies
    and wildcards (resolved at build time, not against the repo tree).

(b) Diagram-pair audit — every docs/*.excalidraw must have its sibling
    docs/*.svg and vice versa. Reviewers look at the .svg on github.com
    to understand the system; .excalidraw is the editable source. If one
    exists without the other, the diagram review surface is broken.
    Does NOT verify the .svg matches what would be regenerated from the
    .excalidraw source (that needs Playwright in CI — separate, heavier
    gate). The pair-existence check is the cheap mechanical floor.

(c) CLAUDE.md "Doc-and-diagram discipline" rule. Sibling to the file-move
    rule. Encodes that docs and diagrams are the REVIEW SURFACE, not
    byproducts — when reviewers look at the system, they read
    docs/architecture.md and look at docs/nextaurora-architecture.svg.
    If those are stale, every review reasons against a fiction. Names
    concrete pairings: AppHost.cs ↔ architecture.md/svg, Extensions.cs
    middleware order ↔ service-request-flow.svg, EF/cache/outbox changes
    ↔ perf-doc + their sibling diagrams.

(d) CodeRabbit path_instructions for NextAurora.AppHost/AppHost.cs and
    NextAurora.ServiceDefaults/Extensions.cs — when topology or
    middleware-order code changes, flag missing paired doc/diagram
    updates at review time. PR-description waivers acceptable when
    the deferred update is named in a tracking issue.
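
The broken-COPY audit in (a) could be approximated along these lines. A minimal Python sketch, assuming single-line, shell-form `COPY`/`ADD` instructions whose last argument is the destination. It skips `--from=` copies and wildcard sources as described above, and additionally skips URL sources (an assumption of this sketch). `find_broken_copies` is a hypothetical name; the repo's real guard may work differently.

```python
from pathlib import Path

WILDCARD_CHARS = set("*?[")


def find_broken_copies(dockerfile: Path, context: Path) -> list[str]:
    """Return COPY/ADD source paths that do not exist in the build context."""
    missing = []
    for line in dockerfile.read_text().splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0].upper() not in ("COPY", "ADD"):
            continue
        flags = [p for p in parts[1:] if p.startswith("--")]
        if any(f.startswith("--from=") for f in flags):
            continue  # cross-stage copy: source is another build stage
        args = [p for p in parts[1:] if not p.startswith("--")]
        for src in args[:-1]:  # last argument is the destination
            if WILDCARD_CHARS & set(src) or "://" in src:
                continue  # resolved at build time, not against the repo tree
            if not (context / src).exists():
                missing.append(src)
    return missing
```

A CI step could exit non-zero whenever the returned list is non-empty.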

All three new mechanical guards (broken-link from #112, broken-COPY,
diagram-pair) smoke-tested locally on the current tree → exit 0.
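
The diagram-pair guard is the simplest of the three to picture. A minimal Python sketch of a pair-existence check, assuming diagrams sit directly under one docs directory and are matched by file stem. `find_unpaired_diagrams` is a hypothetical name, and like the check described above it does not compare the `.svg` content against the `.excalidraw` source.

```python
from pathlib import Path


def find_unpaired_diagrams(docs_dir: Path) -> list[str]:
    """Return diagram files under docs_dir that lack their sibling."""
    sources = {p.stem for p in docs_dir.glob("*.excalidraw")}
    renders = {p.stem for p in docs_dir.glob("*.svg")}
    unpaired = [f"{stem}.excalidraw" for stem in sources - renders]
    unpaired += [f"{stem}.svg" for stem in renders - sources]
    return sorted(unpaired)
```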

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: trim CLAUDE.md (5 surfaces, lean discipline, perf + observability)

Third commit on this branch — the lean-CLAUDE.md pass on top of the
file-move (cf27628) + doc-and-diagram (4e1ba94) prevention layers.

CLAUDE.md drops from 358 → 301 lines (~16% smaller) by moving deep-dive
content out and leaving headlines + links. Every rule preserved; nothing
the AI needs to know got dropped.

Continuous Rule Encoding — 6 surfaces → 5:
- GitHub Issues moved out of the encoding loop (it's a deferral mechanism,
  not encoding — closed issues aren't re-read in future sessions)
- Mirror update in docs/dev-loop.md (table + prose)
- Diagram redesigned with 5 surfaces in a clean row, simpler tier boxes
  (no crammed tool lists), CLAUDE.md annotated "(kept lean)" to reinforce
  the discipline visually

New "one-paragraph max per rule" discipline on CLAUDE.md surface 1:
- "If a rule needs more than ~6 lines, the rule stays as a bolded headline
   + one-paragraph summary in CLAUDE.md; detail moves to docs/ or skills/"
- Test: "could this rule + its rationale fit on one screen?"

CI size guard (build job):
- CLAUDE.md soft-warning at 400 lines, hard-fail at 500
- Mechanical floor on bloat regression
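
A minimal Python sketch of such a two-threshold guard, assuming both thresholds are inclusive (the commit message does not say whether exactly 400 or 500 lines trips the check). `check_size` is a hypothetical name; the actual CI step may be implemented differently.

```python
def check_size(line_count: int, soft: int = 400, hard: int = 500) -> tuple[int, str]:
    """Return (exit_code, message) for a file of the given length."""
    if line_count >= hard:
        return 1, f"FAIL: {line_count} lines (hard limit {hard})"
    if line_count >= soft:
        return 0, f"WARN: {line_count} lines (soft limit {soft})"
    return 0, f"OK: {line_count} lines"
```

For the 301-line file reported in this commit, `check_size(301)` returns `(0, "OK: 301 lines")`. Only the hard limit produces a non-zero exit code, so the soft limit warns without failing the build.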

Performance Rules trim (~30 dense bullets → 22 headlines + links):
- Rules themselves unchanged (project-not-map, Task.WhenAll, outbox-atomic,
  Guid v7, AsSpan, Dapper escape hatch, etc.)
- Deep-dive paragraphs → docs/performance-and-data-correctness.md +
  dotnet-performance skill (both already exist)

Observability section trim (~78 lines → 21):
- Kept 3 always-on traps: HTTP middleware order, Wolverine middleware
  instance-methods, outbox-outside-handler atomicity
- Moved "how it works" detail → docs/architecture.md "Cross-Cutting
  Concerns" and "Event-Driven Architecture" (both already cover this)

Verification:
- wc -l CLAUDE.md: 301 (well under the 400/500 size budget)
- broken-link audit on CLAUDE.md: exit 0
- diagram-pair audit: exit 0
- broken-COPY audit: exit 0
- All rule headlines preserved; rules with paraphrases in code comments
  ("See CLAUDE.md") still align with canonical wording

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: actually move CLAUDE.md trim content to destination docs

The prior commit (5c938b2) trimmed CLAUDE.md from 358→301 lines and
added "See docs/..." links. An audit found the content I trimmed wasn't
actually in those destination docs — I'd deleted it, not moved it.

This commit does the real moves.

New file: docs/observability-and-context-propagation.md
- Section-aligned with CLAUDE.md "Observability & Context Propagation"
- Full mechanism + headers/baggage mapping table + sources
- HTTP middleware order with the canonical 4-line code example
- Wolverine pipeline scope detail (pipeline order: validation → context
  → handler → AutoApplyTransactions)
- Wolverine envelope context extraction mechanism
- Transactional Outbox config + outbox-outside-handler atomicity trap
  with the canonical safe wrapper code block
- Structured logging scope hygiene
- Event Replay note

Additions to docs/performance-and-data-correctness.md:
- New "## Additional always-on patterns" section after "The 14 always-on
  rules" — patterns that don't fit a single-rule shape
- Non-sargable predicates + EmailNormalized normalize-at-write-time pattern
- Task.WhenAll parallel awaits with three caveats (dependent ops, shared
  DbContext, multi-failure observability)
- Long-running work / 202 Accepted pattern with two atomicity paths and
  cloud-managed alternatives (Durable Functions, Step Functions, Temporal)
- Fan-out on message bus + MaxDegreeOfParallelism throttle
- Guid.CreateVersion7 with the time-decodable trade-off
- AsSpan ref-struct + async-boundary constraint
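
The time-decodable trade-off of version 7 identifiers comes from the layout defined in RFC 9562: the first 48 bits hold a Unix timestamp in milliseconds, so anyone holding the ID can read its creation time. A Python sketch that builds a v7-shaped UUID by hand and decodes it. `make_uuid7` and `created_at` are hypothetical helpers for illustration, not the .NET implementation.

```python
import os
import uuid
from datetime import datetime, timezone


def make_uuid7(unix_ms: int) -> uuid.UUID:
    """Build a version 7 UUID: 48-bit ms timestamp, then random bits."""
    rand = int.from_bytes(os.urandom(10), "big")  # 80 bits; 6 get overwritten
    value = (unix_ms & ((1 << 48) - 1)) << 80 | rand
    value = (value & ~(0xF << 76)) | (0x7 << 76)  # version nibble = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)  # variant bits = 10
    return uuid.UUID(int=value)


def created_at(identifier: uuid.UUID) -> datetime:
    """Recover the creation time embedded in a version 7 UUID."""
    unix_ms = identifier.int >> 80
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)
```

The same prefix that makes these IDs sort by creation time also exposes that time to anyone who sees the ID, which is the trade-off to weigh before using them in public URLs.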

CLAUDE.md link redirects (no rule text changes):
- "Long-running work" → docs/performance-and-data-correctness.md
  "Long-running work belongs on the message bus" (new section)
- "Observability & Context Propagation" mechanism →
  docs/observability-and-context-propagation.md (new file)
- "Outbox outside a Wolverine handler" wrapper →
  docs/observability-and-context-propagation.md anchor (new file)

Naming convention: section-aligned filenames where new docs are created
(observability-and-context-propagation.md mirrors the CLAUDE.md heading).
Existing docs kept where they already align by topic
(performance-and-data-correctness.md).

Verification:
- wc -l CLAUDE.md: 301 (unchanged; under 400 soft / 500 hard budget)
- broken-link audit: exit 0 (full repo)
- All trimmed content now lives in either the new doc or the existing
  perf doc; nothing the AI needs to know got dropped

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
