Add developer guide explaining how NextAurora works #1
Merged
Co-authored-by: emeraldleaf <5404190+emeraldleaf@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Explain functionality of the repository" to "Add developer guide explaining how NextAurora works" on Mar 6, 2026
Copilot stopped work on behalf of emeraldleaf due to an error on May 9, 2026 22:53
emeraldleaf added a commit that referenced this pull request on May 23, 2026
CodeRabbit (PR #17) flagged that the "in-flight PR" phrasing goes stale the moment this PR merges. Reword to describe the configured state (upload is in ci.yml) and the pending event (badge reflects aggregate after the next green CI run), not the temporal state of this PR.
Addresses CodeRabbit finding #2 of 2 on the PR. Finding #1 (pin codecov-action to commit SHA) is deferred: it would create inconsistency with the other six actions in the same workflow (actions/checkout@v6, actions/setup-dotnet@v5, actions/cache@v5, dorny/test-reporter@v3, github/codeql-action/*@v4, plus the two codecov-action calls), all of which use version-pinned mutable tags by existing repo convention. If we want SHA-pinning, it's a workflow-wide hardening pass, not a one-off here.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
emeraldleaf added a commit that referenced this pull request on May 28, 2026
Three small edits:
1. CodeRabbit #1 (full-saga-deployment-plan.md:335): the "Related docs" pointer said STATUS.md references this doc "under 'Open issues'" but the entry actually lives under "Next." Fixed to match.
2. CodeRabbit #2 (STATUS.md:103): the "Next" entry still said "four open decisions (D1-D4) block Phase 1" but the plan doc now lists D1-D4 as resolved 2026-05-27. Updated to reflect resolved state + name the chosen options + point at Phase 1A as the next concrete work.
3. Seq named as the Phase 3 telemetry choice. Sourced from an article on structured logging + distributed tracing for microservices with Seq. Seq fits this deployment shape better than App Insights: unified logs + traces in one UI, self-hostable on Fly with persistent volume (matches D2 Keycloak pattern), free tier covers demo scope, OTLP-native (slots into existing OpenTelemetry export with one config-line change). Gotcha captured: pin OpenTelemetry.Instrumentation.* packages explicitly in Directory.Packages.props, since non-stable RC versions for instrumentations like StackExchangeRedis differ across major bumps.
No new rule encoding: the article reinforces existing CLAUDE.md rules (structured logging with message templates, OpenTelemetry instrumentation patterns) without adding net-new lessons worth encoding.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
emeraldleaf added a commit that referenced this pull request on May 28, 2026
* plan(deploy): add full-saga demo deployment plan + STATUS pointer
Cross-session tracking artifact for the multi-PR, multi-week effort to
stand up NextAurora as a portfolio-grade demo deployment running the
full Order → Payment → Shipping → Notification saga over real cloud
infrastructure with the Stripe gateway stubbed.
Structure:
- Three phases, each independently shippable:
  1. Order saga visible (Catalog + Order + minimal Storefront). Order
     persists, OrderPlacedEvent stages, saga stalls because PaymentService
     isn't deployed yet, which is itself a teaching demo.
  2. Full saga (Payment + Shipping + Notification). End-to-end flow over
     real infrastructure with stubbed Stripe.
  3. Polish (observability + ops + minimal UX), demoable to humans.
- Four open decisions (D1-D4) called out explicitly: SQL Server hosting,
  identity provider, messaging transport, cost ceiling. All block Phase 1.
- Out-of-scope explicitly listed: real Stripe SDK (stub retirement),
  PaymentRecoveryJob retry-with-key (gated on stub retirement),
  production-grade DR/SLAs. Honest framing of what this is and isn't.
- Cost ledger structure ready for first entries once provisioning begins.
STATUS.md "Next" section now points at the plan so future sessions
can pick up coherently.
Following the IDSD/intent-driven discipline encoded in the recent
articles: explicit intent + expectations + context as a tracked
artifact BEFORE implementation, not after. The plan is the "ICE"
artifact for this multi-PR effort.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* plan(deploy): resolve D1-D4 decisions + split Phase 1 into 1A + 1B
User-resolved decisions captured in the plan doc:
- D1 (SQL Server hosting): Postgres-only-for-demo with provider swap.
Dev environment keeps two-engine split (the architectural story is
still genuine for dev/learning). Deployed environment uses Postgres
for all four services with state. README footnote will explain the
demo-deployment exception.
- D2 (Identity): Auth0 free tier. Hosted, simplest setup. Keycloak
stays for local dev. ServiceDefaults JWT config is already
config-driven — points at Auth0 issuer in deployed environment.
- D3 (Messaging): AWS SQS+SNS free tier. Wolverine reconfig from
UseAzureServiceBus to UseAmazonSqs. 1M req/mo free covers demo
volume comfortably.
- D4 (Cost ceiling): $30/mo target, $50/mo hard ceiling. Phase 1 ≤$15,
Phase 2 ≤$30. Fly.io spend cap to be set at $50 before any
provisioning.
Phase 1 now splits naturally into two sub-PRs given D1:
- Phase 1A: Postgres provider swap (code only). Add `DatabaseProvider`
config setting, branch Order + Payment Program.cs on it, regenerate
Postgres migrations, branch concurrency-token config (RowVersion vs
xmin). Integration tests cover both paths. README footnote.
- Phase 1B: Deploy Order + minimal Storefront. Original Phase 1 goal,
now sitting on top of 1A's plumbing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* plan(deploy): switch D2 to Keycloak self-hosted + restructure Phase 1 + Phase 3 rate-limit audit
User reconsidered D2 after seeing an article on Keycloak's value as an
identity provider. Switched from Auth0 to Keycloak self-hosted on Fly.
Rationale:
- Same IdP in dev and prod. NextAurora already runs Keycloak in local
Aspire dev with the realm at realms/nextaurora-realm.json. Deployed
Keycloak imports the same realm export on boot — one IdP, one realm,
two environments. Removes "works in dev, breaks in prod" risk around
realm shape, claim names, test users.
- Stronger portfolio story than vendor-managed Auth0.
- Cost is roughly comparable (~$5-10/mo Fly Machine vs Auth0 free
vendor-locked).
Trade-off: real ops surface (Postgres DB for Keycloak state, persistent
volume, realm-import-on-boot, two-stage readiness probe, ~30-60s cold
start for Java + realm import).
Phase 1 now splits into three sub-PRs:
- 1A: Postgres provider swap (code only) — unchanged
- 1B: Deploy Keycloak self-hosted (NEW infrastructure-only sub-phase)
- 1C: Deploy Order + minimal Storefront against deployed Keycloak
Phase 3 deliverables grow by one item: rate-limit audit + Redis-backed
swap if any service runs 2+ Fly Machines. The in-memory ASP.NET Core
limiter on Catalog search + Payment process is correct for single-
instance Phase 2 but silently weakens under scale-out. The rule
encoding lands in a separate PR (CLAUDE.md / .coderabbit.yaml /
architecture-reviewer.md).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
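The scale-out concern in the rate-limit audit item above can be illustrated with a small, language-neutral sketch. NextAurora itself is .NET and uses the ASP.NET Core limiter; the Python below is not the repo's code. The class name `FixedWindowLimiter`, the limit of 5, the round-robin routing, and the shared dict standing in for Redis are all assumptions made for illustration.

```python
from itertools import cycle


class FixedWindowLimiter:
    """Hypothetical fixed-window limiter counting requests per key in one window."""

    def __init__(self, limit, store=None):
        self.limit = limit
        # A private dict models in-process memory; passing a shared dict
        # stands in for an external store such as Redis.
        self.store = store if store is not None else {}

    def allow(self, key):
        count = self.store.get(key, 0)
        if count >= self.limit:
            return False
        self.store[key] = count + 1
        return True


def admitted(limiters, key, requests):
    """Send `requests` calls round-robin across instances; return how many pass."""
    targets = cycle(limiters)
    return sum(1 for _ in range(requests) if next(targets).allow(key))


LIMIT = 5

# Single instance: the configured limit is enforced exactly.
single = admitted([FixedWindowLimiter(LIMIT)], "client-a", 20)

# Two instances, each with its own memory: each one admits LIMIT requests.
isolated = admitted(
    [FixedWindowLimiter(LIMIT), FixedWindowLimiter(LIMIT)], "client-a", 20
)

# Two instances sharing one store: the limit holds across both.
shared_store = {}
shared = admitted(
    [FixedWindowLimiter(LIMIT, shared_store), FixedWindowLimiter(LIMIT, shared_store)],
    "client-a",
    20,
)

print(single, isolated, shared)  # 5 10 5
```

With two isolated instances the effective limit doubles without any error or warning, which is the "silently weakens" behavior the plan calls out; a shared backing store restores the intended ceiling.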
* plan(deploy): CodeRabbit review fixes + Seq for Phase 3 telemetry
Three small edits:
1. CodeRabbit #1 (full-saga-deployment-plan.md:335): the "Related docs"
pointer said STATUS.md references this doc "under 'Open issues'" but
the entry actually lives under "Next." Fixed to match.
2. CodeRabbit #2 (STATUS.md:103): the "Next" entry still said "four open
decisions (D1-D4) block Phase 1" but the plan doc now lists D1-D4 as
resolved 2026-05-27. Updated to reflect resolved state + name the
chosen options + point at Phase 1A as the next concrete work.
3. Seq named as the Phase 3 telemetry choice. Sourced from an article on
structured logging + distributed tracing for microservices with Seq.
Seq fits this deployment shape better than App Insights — unified logs
+ traces in one UI, self-hostable on Fly with persistent volume
(matches D2 Keycloak pattern), free tier covers demo scope, OTLP-native
(slots into existing OpenTelemetry export with one config-line change).
Gotcha captured: pin OpenTelemetry.Instrumentation.* packages
explicitly in Directory.Packages.props, since non-stable RC versions
for instrumentations like StackExchangeRedis differ across major
bumps.
No new rule encoding — the article reinforces existing CLAUDE.md rules
(structured logging with message templates, OpenTelemetry instrumentation
patterns) without adding net-new lessons worth encoding.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
emeraldleaf
added a commit
that referenced
this pull request
Jun 5, 2026
…n-CLAUDE.md) (#114)
* chore: prevention pass - hook + CLAUDE.md rule + CodeRabbit instruction (parts 2-4 of file-move discipline)
Three of four layers of the file-move drift prevention loop. Part #1
(extend the CI broken-link guard to scan Dockerfile*) lands in a
follow-up commit on this same PR once #112 is merged and rebased.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore: extend prevention to Dockerfile COPY + diagrams + topology docs
Three more layers on top of cf27628 (parts 2-4 of the file-move
prevention loop). Same compounding-loop principle, broader coverage:
(a) Broken-COPY audit: Dockerfile* COPY/ADD source paths that don't
exist in the build context. Catches the kind of drift that left
Dockerfile.catalog referencing the pre-VSA-collapse 4-project layout
for months after PR #31. Skips `--from=<stage>` cross-stage copies and
wildcards (resolved at build time, not against the repo tree).
(b) Diagram-pair audit: every docs/*.excalidraw must have its sibling
docs/*.svg and vice versa. Reviewers look at the .svg on github.com to
understand the system; .excalidraw is the editable source. If one
exists without the other, the diagram review surface is broken. Does
NOT verify the .svg matches what would be regenerated from the
.excalidraw source (that needs Playwright in CI, a separate, heavier
gate). The pair-existence check is the cheap mechanical floor.
(c) CLAUDE.md "Doc-and-diagram discipline" rule. Sibling to the
file-move rule. Encodes that docs and diagrams are the REVIEW SURFACE,
not byproducts: when reviewers look at the system, they read
docs/architecture.md and look at docs/nextaurora-architecture.svg. If
those are stale, every review reasons against a fiction. Names concrete
pairings: AppHost.cs ↔ architecture.md/svg, Extensions.cs middleware
order ↔ service-request-flow.svg, EF/cache/outbox changes ↔ perf-doc +
their sibling diagrams.
(d) CodeRabbit path_instructions for NextAurora.AppHost/AppHost.cs and
NextAurora.ServiceDefaults/Extensions.cs: when topology or
middleware-order code changes, flag missing paired doc/diagram updates
at review time. PR-description waivers acceptable when the deferred
update is named in a tracking issue.
All three new mechanical guards (broken-link from #112, broken-COPY,
diagram-pair) smoke-tested locally on the current tree → exit 0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore: trim CLAUDE.md (5 surfaces, lean discipline, perf + observability)
Third commit on this branch: the lean-CLAUDE.md pass on top of the
file-move (cf27628) + doc-and-diagram (4e1ba94) prevention layers.
CLAUDE.md drops from 358 → 301 lines (~16% smaller) by moving deep-dive
content out and leaving headlines + links. Every rule preserved;
nothing the AI needs to know got dropped.
Continuous Rule Encoding, 6 surfaces → 5:
- GitHub Issues moved out of the encoding loop (it's a deferral
  mechanism, not encoding; closed issues aren't re-read in future
  sessions)
- Mirror update in docs/dev-loop.md (table + prose)
- Diagram redesigned with 5 surfaces in a clean row, simpler tier boxes
  (no crammed tool lists), CLAUDE.md annotated "(kept lean)" to
  reinforce the discipline visually
New "one-paragraph max per rule" discipline on CLAUDE.md surface 1:
- "If a rule needs more than ~6 lines, the rule stays as a bolded
  headline + one-paragraph summary in CLAUDE.md; detail moves to docs/
  or skills/"
- Test: "could this rule + its rationale fit on one screen?"
CI size guard (build job):
- CLAUDE.md soft-warning at 400 lines, hard-fail at 500
- Mechanical floor on bloat regression
Performance Rules trim (~30 dense bullets → 22 headlines + links):
- Rules themselves unchanged (project-not-map, Task.WhenAll,
  outbox-atomic, Guid v7, AsSpan, Dapper escape hatch, etc.)
- Deep-dive paragraphs → docs/performance-and-data-correctness.md +
  dotnet-performance skill (both already exist)
Observability section trim (~78 lines → 21):
- Kept 3 always-on traps: HTTP middleware order, Wolverine middleware
  instance-methods, outbox-outside-handler atomicity
- Moved "how it works" detail → docs/architecture.md "Cross-Cutting
  Concerns" and "Event-Driven Architecture" (both already cover this)
Verification:
- wc -l CLAUDE.md: 301 (well under the 400/500 size budget)
- broken-link audit on CLAUDE.md: exit 0
- diagram-pair audit: exit 0
- broken-COPY audit: exit 0
- All rule headlines preserved; rules with paraphrases in code comments
  ("See CLAUDE.md") still align with canonical wording
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs: actually move CLAUDE.md trim content to destination docs
The prior commit (5c938b2) trimmed CLAUDE.md from 358→301 lines and
added "See docs/..." links. An audit found the content I trimmed wasn't
actually in those destination docs: I'd deleted it, not moved it. This
commit does the real moves.
New file: docs/observability-and-context-propagation.md
- Section-aligned with CLAUDE.md "Observability & Context Propagation"
- Full mechanism + headers/baggage mapping table + sources
- HTTP middleware order with the canonical 4-line code example
- Wolverine pipeline scope detail (pipeline order: validation → context
  → handler → AutoApplyTransactions)
- Wolverine envelope context extraction mechanism
- Transactional Outbox config + outbox-outside-handler atomicity trap
  with the canonical safe wrapper code block
- Structured logging scope hygiene
- Event Replay note
Additions to docs/performance-and-data-correctness.md:
- New "## Additional always-on patterns" section after "The 14
  always-on rules": patterns that don't fit a single-rule shape
- Non-sargable predicates + EmailNormalized normalize-at-write-time
  pattern
- Task.WhenAll parallel awaits with three caveats (dependent ops, shared
  DbContext, multi-failure observability)
- Long-running work / 202 Accepted pattern with two atomicity paths and
  cloud-managed alternatives (Durable Functions, Step Functions,
  Temporal)
- Fan-out on message bus + MaxDegreeOfParallelism throttle
- Guid.CreateVersion7 with the time-decodable trade-off
- AsSpan ref-struct + async-boundary constraint
CLAUDE.md link redirects (no rule text changes):
- "Long-running work" → docs/performance-and-data-correctness.md
  "Long-running work belongs on the message bus" (new section)
- "Observability & Context Propagation" mechanism →
  docs/observability-and-context-propagation.md (new file)
- "Outbox outside a Wolverine handler" wrapper →
  docs/observability-and-context-propagation.md anchor (new file)
Naming convention: section-aligned filenames where new docs are created
(observability-and-context-propagation.md mirrors the CLAUDE.md
heading). Existing docs kept where they already align by topic
(performance-and-data-correctness.md).
Verification:
- wc -l CLAUDE.md: 301 (unchanged; under 400 soft / 500 hard budget)
- broken-link audit: exit 0 (full repo)
- All trimmed content now lives in either the new doc or the existing
  perf doc; nothing the AI needs to know got dropped
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
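The diagram-pair audit described in the commit above is a pure existence check, so it can be sketched in a few lines. The Python below is a hypothetical illustration under stated assumptions, not the script that runs in the repo's CI: the function names `unpaired_diagrams` and `main`, the flat `docs/` layout, and the demo file names are all invented for the example. Like the real guard, it does not verify that an `.svg` matches its `.excalidraw` source.

```python
import tempfile
from pathlib import Path


def unpaired_diagrams(docs_dir):
    """Return sorted names of .excalidraw/.svg files in docs_dir lacking a sibling."""
    docs = Path(docs_dir)
    sources = {p.stem for p in docs.glob("*.excalidraw")}
    renders = {p.stem for p in docs.glob("*.svg")}
    missing_svg = [f"{stem}.excalidraw" for stem in sources - renders]
    missing_src = [f"{stem}.svg" for stem in renders - sources]
    return sorted(missing_svg + missing_src)


def main(docs_dir):
    """Exit-code style wrapper: 0 when every diagram is paired, 1 otherwise."""
    orphans = unpaired_diagrams(docs_dir)
    for name in orphans:
        print(f"unpaired diagram: {name}")
    return 1 if orphans else 0


# Demo on a throwaway directory (file names are made up for the example).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("flow.excalidraw", "flow.svg", "orphan.excalidraw", "legacy.svg"):
        (Path(tmp) / name).touch()
    demo_orphans = unpaired_diagrams(tmp)
    demo_exit = main(tmp)
```

In a CI job the return value of `main` would be passed to `sys.exit`, so a missing sibling fails the build while a fully paired tree exits 0.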
The repo lacked a single entry point for developers to understand how the codebase is structured and how requests flow through the system end-to-end.
Changes
- `docs/how-it-works.md`: new developer walkthrough covering:
  - … `Create()` factory guards, encapsulated collections, domain methods)
  - `ValidationBehavior` → `LoggingBehavior` → handler
  - `POST /api/orders` request: HTTP endpoint → gRPC product validation + stock reservation → `Order.Create()` → repository → `OrderPlacedEvent` publish
  - `GlobalExceptionHandler` exception-to-status mapping, correlation ID propagation across HTTP/gRPC/Service Bus
  - `AppHost.cs`: what it wires up and why hardcoded URLs are absent
  - … `MethodName_Condition_ExpectedResult`) and builder pattern
- `README.md`: adds a Documentation section linking all five guides (`how-it-works`, `architecture`, `observability`, `event-replay`, `BRD`) so readers can navigate the docs without hunting through the repo.
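The `ValidationBehavior` → `LoggingBehavior` → handler ordering listed above can be sketched as nested wrappers. This is a language-neutral Python illustration only: the repo's pipeline is .NET code that is not shown in this PR, and the `PlaceOrder` command, the quantity rule, the `trace` list, and the `send` and `wrap` helpers are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class PlaceOrder:
    """Hypothetical command; the real request type is not shown in the PR."""
    product_id: str
    quantity: int


def validation_behavior(request, next_step, trace):
    # Runs first and rejects bad input before anything downstream executes.
    trace.append("validation")
    if request.quantity <= 0:
        raise ValueError("quantity must be positive")
    return next_step(request)


def logging_behavior(request, next_step, trace):
    # Wraps the handler so it can record both entry and exit.
    trace.append("logging:start")
    result = next_step(request)
    trace.append("logging:end")
    return result


def handler(request, trace):
    trace.append("handler")
    return f"order for {request.quantity} x {request.product_id}"


def wrap(behavior, next_step, trace):
    """Bind one behavior around the next step in the chain."""
    return lambda req: behavior(req, next_step, trace)


def send(request, trace):
    """Compose behaviors so the first one listed runs outermost."""
    behaviors = [validation_behavior, logging_behavior]
    step = lambda req: handler(req, trace)
    for behavior in reversed(behaviors):
        step = wrap(behavior, step, trace)
    return step(request)


ok_trace = []
result = send(PlaceOrder("sku-1", 2), ok_trace)

bad_trace = []
try:
    send(PlaceOrder("sku-1", 0), bad_trace)
except ValueError:
    pass  # validation short-circuits; logging and handler never run
```

Because validation sits outermost, an invalid request never reaches the logging wrapper or the handler, which is what `bad_trace` shows in this sketch.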