feat(repo-memory): add format-json option to pretty-print JSON files before commit#39540
Conversation
Co-authored-by: dsyme <7204669+dsyme@users.noreply.github.com>
|
❌ Test Quality Sentinel failed to deliver outputs during test quality analysis. |
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
There was a problem hiding this comment.
Pull request overview
This PR adds an opt-in format-json: true setting to the repo-memory tool so that .json files are pretty-printed (2-space indent, trailing newline) before being committed to the memory branch, improving readability and reviewability of agent-written JSON.
Changes:
- Added
format-jsontoRepoMemoryEntryand parsing logic for both object and array tool config forms. - Wired
FORMAT_JSON=trueinto the generated push step environment only when enabled. - Implemented recursive
.jsonpretty-printing inpush_repo_memory.cjs, plus schema/docs/tests updates.
Show a summary per file
| File | Description |
|---|---|
pkg/workflow/repo_memory.go |
Adds FormatJSON config parsing and conditionally emits FORMAT_JSON to the push step env. |
pkg/workflow/repo_memory_test.go |
Adds coverage for parsing defaults and push-step env-var emission. |
pkg/parser/schemas/main_workflow_schema.json |
Extends the workflow schema to allow format-json in both repo-memory config variants. |
docs/src/content/docs/reference/repo-memory.md |
Documents the new option in the advanced configuration section. |
actions/setup/js/push_repo_memory.cjs |
Implements the JSON formatting behavior when FORMAT_JSON is enabled. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 5/5 changed files
- Comments generated: 1
| try { | ||
| const raw = fs.readFileSync(fullPath, "utf8"); | ||
| if (!raw.trim()) { | ||
| continue; | ||
| } | ||
| const parsed = JSON.parse(raw); | ||
| const formatted = JSON.stringify(parsed, null, 2) + "\n"; | ||
| if (raw !== formatted) { | ||
| fs.writeFileSync(fullPath, formatted, "utf8"); | ||
| core.info(`Formatted JSON: ${path.relative(destMemoryPath, fullPath)}`); | ||
| } | ||
| } catch (/** @type {any} */ error) { | ||
| core.warning(`Skipping JSON formatting for ${path.relative(destMemoryPath, fullPath)}: ${error.message}`); | ||
| } |
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (153 new lines in business-logic directories, above the 100-line threshold) but does not have a linked Architecture Decision Record (ADR).
📄 Draft ADR — copy into
|
🧪 Test Quality Sentinel Report✅ Test Quality Score: 90/100 — Excellent
📊 Metrics & Test Classification (4 tests analyzed)
Go: 1 ( Score breakdown:
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /tdd and /grill-with-docs — requesting changes on JavaScript test coverage.
📋 Key Themes & Highlights
Blocking concern
- JS test gap:
push_repo_memory.test.cjshas 1566 lines of coverage for this file, but the entireFORMAT_JSONfeature — the runtime logic that actually mutates files before git commit — has zero tests. All four issues flagged in inline comments point back to this single gap.
Positive highlights
- ✅ Go compiler side is thoroughly tested: 127 lines covering object config, array config, default-false, and env-var emission/omission.
- ✅ The runtime defaults are safe: invalid JSON files are skipped with a
core.warningrather than failing the push; theFORMAT_JSONenv var is only emitted when explicitly enabled, so all existing users are unaffected. - ✅ The
if (raw !== formatted)guard avoids gratuitous writes — good for idempotency. - ✅
.gitdirectory exclusion is correct and deliberate. - ✅ Schema change covers both the object-config and array-item variants, consistent with the
allowed-extensionspattern.
Summary of inline comments
| File | Line | Skill | Issue |
|---|---|---|---|
push_repo_memory.cjs |
372 | /tdd |
formatJSONFilesInDir untestable in current location; suggest extracting to module or adding source-content tests |
push_repo_memory.cjs |
387 | /grill-with-docs |
Integer-keyed objects get key-reordered by JSON.stringify; worth a doc note |
push_repo_memory.cjs |
380 | /tdd |
.jsonl exclusion is documented but untested |
push_repo_memory.cjs |
388 | /tdd |
raw !== formatted no-op path is untested — hot path for repeat runs |
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer
| * Recursively find and format all .json files under a directory | ||
| * @param {string} dirPath - Directory to scan | ||
| */ | ||
| function formatJSONFilesInDir(dirPath) { |
There was a problem hiding this comment.
[/tdd] formatJSONFilesInDir is defined inside the if (formatJSON) block inside main() — making it untestable in isolation. push_repo_memory.test.cjs already has 1566 lines of coverage for this file, but FORMAT_JSON handling has zero test coverage despite being the core runtime behaviour of this feature.
💡 Two paths to fix
Option A — extract to a module (same pattern as glob_pattern_helpers.cjs):
// format_json_helpers.cjs
function formatJSONFilesInDir(dirPath, { destMemoryPath, fs, path, core }) { ... }
module.exports = { formatJSONFilesInDir };This makes the function importable and testable with a mock fs.
Option B — source-content checks (already used at lines 1502–1565):
expect(scriptContent).toContain('FORMAT_JSON === "true"');
expect(scriptContent).toContain('.endsWith(".json")');
expect(scriptContent).toContain('core.warning');Lower fidelity but consistent with existing test style.
Behaviours worth covering:
- Compact JSON → 2-space indent + trailing
\n - Already-formatted → no
writeFileSynccall (no-op) - Empty file → skipped
- Invalid JSON →
core.warning, notcore.setFailed .jsonl/.txt→ untouched.git/→ not traversed
| continue; | ||
| } | ||
| const parsed = JSON.parse(raw); | ||
| const formatted = JSON.stringify(parsed, null, 2) + "\n"; |
There was a problem hiding this comment.
[/grill-with-docs] JSON.stringify sorts integer-like string keys before other keys — {"1":"a","name":"b"} round-trips to {"1":"a","name":"b"} (fine), but {"name":"b","1":"a"} becomes {"1":"a","name":"b"} (key-order change). Agents that happen to use numeric keys would see a one-time noisy diff the first time format-json runs.
This is a V8 spec behaviour, not a bug in this code, but the docs (repo-memory.md) should mention it so users aren't surprised by unexpected diffs.
| if (entry.name !== ".git") { | ||
| formatJSONFilesInDir(fullPath); | ||
| } | ||
| } else if (entry.isFile() && entry.name.endsWith(".json")) { |
There was a problem hiding this comment.
[/tdd] The docs state format-json has no effect on .jsonl files, but this is untested. A source-content test (consistent with the existing style at lines 1502–1565) could guard this:
it('should only format .json files, not .jsonl or other types', () => {
expect(scriptContent).toContain('.endsWith(".json")');
// Verify .jsonl is not in the same condition
expect(scriptContent).not.toMatch(/endsWith\(".jsonl"\)/);
});This also documents the intended exclusion for future maintainers.
| } | ||
| const parsed = JSON.parse(raw); | ||
| const formatted = JSON.stringify(parsed, null, 2) + "\n"; | ||
| if (raw !== formatted) { |
There was a problem hiding this comment.
[/tdd] The raw !== formatted early-exit is a smart optimization (avoids a needless write + dirty git status entry), but it's entirely untested. The already-formatted no-op path is the hot path for users who run the workflow repeatedly — skipping it means each run silently rewrites every file.
💡 Suggested test (source-content check)
it('should skip writeFileSync when JSON is already formatted', () => {
// Verify the guard exists in source — protects against accidental removal
expect(scriptContent).toContain('if (raw !== formatted)');
});For a more behavioural test, extract formatJSONFilesInDir to a module (see comment on line 372) and drive it with a mock fs that asserts writeFileSync is not called when the file content already matches the formatted output.
|
@copilot run pr-finisher skill |
There was a problem hiding this comment.
REQUEST_CHANGES — Four correctness issues in the JS formatter that need fixing before merge.
### Blocking issues summary
-
Scope expansion → guaranteed limit violations on first enable (line 400):
formatJSONFilesInDirwalks the entire memory branch root, not just newly copied files. First enable on a branch with ≥100 pre-existing JSON files will always hitmaxFileCountormaxPatchSize, failing the push with an opaque error. Either restrict formatting to just-copied files, or clearly document and warn about this behavior. -
No file-size guard in the formatter (line 382): Pre-existing large JSON files on the branch bypass the
maxFileSizecheck that protects the copy phase, risking OOM on the runner. -
readdirSyncfailure in subdirectory is fatal (line 373): A single unreadable subdirectory kills the entire push. Per-file errors are warnings; per-directory errors should be too. -
Integer-like string keys silently reordered (line 387): V8 sorts integer-indexed keys numerically before string keys on stringify, causing an irreversible one-time mutation to key ordering that is not logged.
Non-blocking notes
FORMAT_JSONis not listed in the module-level JSDoc env-var comment block (lines 13–40). Should be added betweenFILE_GLOB_FILTERandGH_TOKEN.- The JavaScript formatting logic has no unit tests at the CJS level; the Go tests only cover YAML parsing and env-var emission, not actual JSON reformatting behavior.
🔎 Code quality review by PR Code Quality Reviewer
| } | ||
|
|
||
| try { | ||
| formatJSONFilesInDir(destMemoryPath); |
There was a problem hiding this comment.
First-enable will always exceed maxFileCount/maxPatchSize limits on a populated branch: formatJSONFilesInDir(destMemoryPath) walks the entire memory branch root, not just the files that were just copied. When format-json: true is first activated on a branch that already has ≥ maxFileCount (default 100) pre-existing JSON files, all of them are reformatted, changedFileCount exceeds the limit, and the push is hard-failed — even though the agent only wrote a single new file. The error message (Too many changed files (150 > 100)) gives no hint that format-json caused it.
💡 Suggested fix
Option A (preferred) — only format files that were just copied:
const copiedPaths = new Set(
filesToCopy.map(f => path.resolve(path.join(destMemoryPath, f.relativePath)))
);
// inside formatJSONFilesInDir, change the condition to:
} else if (entry.isFile() && entry.name.endsWith('.json') && copiedPaths.has(path.resolve(fullPath))) {Option B (documentation-only) — add a note in the docs that first enable on a populated branch reformats ALL pre-existing JSON files and may require increasing maxFileCount/maxPatchSize.
The same scope expansion inflates git diff --cached size, so maxPatchSize is equally at risk of being triggered.
| continue; | ||
| } | ||
| const parsed = JSON.parse(raw); | ||
| const formatted = JSON.stringify(parsed, null, 2) + "\n"; |
There was a problem hiding this comment.
JSON.stringify silently reorders integer-like string keys on first format pass: V8 enumerates integer-indexed properties (non-negative integer string keys ≤ 232−2) before string keys, sorted numerically. A file like {"b":1,"10":3,"2":4} becomes {"2":4,"10":3,"b":1} after one round-trip. This mutation is silent (no warning, no log line), one-time, and irreversible. For config schemas, OpenAPI specs, or ordered key–value stores stored in memory, this is unexpected data corruption.
💡 Details and fix
Reproduction:
const o = JSON.parse('{"b":1,"10":3,"2":4}');
console.log(JSON.stringify(o, null, 2));
// Output:
// {
// "2": 4,
// "10": 3,
// "b": 1
// }This is a one-time normalization (idempotent after first write), but the silent reordering may break downstream consumers that rely on key order.
Consider logging a per-file warning when the stringified form differs from the original to give visibility into what was mutated, not just that it was mutated:
core.info(`Formatted JSON (structural change): ${path.relative(destMemoryPath, fullPath)}`);| * @param {string} dirPath - Directory to scan | ||
| */ | ||
| function formatJSONFilesInDir(dirPath) { | ||
| const entries = fs.readdirSync(dirPath, { withFileTypes: true }); |
There was a problem hiding this comment.
readdirSync failure in a subdirectory is fatal while per-file failures are warnings — asymmetric error handling: If readdirSync throws for any subdirectory (e.g. EACCES, a broken directory symlink, or a race-condition ENOENT), the exception propagates through the recursive call stack and is caught by the outer try/catch at line 399, which calls core.setFailed() and aborts the push. Individual file read/parse errors at line 392 are caught and emitted as core.warning, allowing the run to continue. The asymmetry means a single unreadable subdirectory kills the entire push while an unreadable file does not.
💡 Suggested fix
Wrap the recursive call and the readdirSync in per-directory error handling:
function formatJSONFilesInDir(dirPath) {
let entries;
try {
entries = fs.readdirSync(dirPath, { withFileTypes: true });
} catch (err) {
core.warning(`Skipping directory (cannot read): ${path.relative(destMemoryPath, dirPath)}: ${err.message}`);
return;
}
for (const entry of entries) {
// ...
if (entry.isDirectory() && entry.name !== ".git") {
formatJSONFilesInDir(fullPath); // inner failures now self-contained
}
}
}| } | ||
| } else if (entry.isFile() && entry.name.endsWith(".json")) { | ||
| try { | ||
| const raw = fs.readFileSync(fullPath, "utf8"); |
There was a problem hiding this comment.
No file-size guard before readFileSync — OOM risk for large pre-existing JSON files: The copy phase enforces maxFileSize per file (line 289), but the formatter reads every .json file on the branch into memory with no size check. A 200 MB accumulated-history file that was committed to the branch before maxFileSize was lowered, or that was once allowed but now exceeds the limit, will be fully loaded, parsed, and re-serialized. On a standard GitHub Actions runner (7 GB RAM shared with git and Node), a handful of large files can exhaust the heap and crash the runner process with an OOM error that is caught by the outer try/catch and surfaces as a confusing core.setFailed failure.
💡 Suggested fix
Add a stat check before reading, consistent with the copy-phase guard:
const statResult = fs.statSync(fullPath);
if (statResult.size > maxFileSize) {
core.warning(`Skipping large JSON file (> ${maxFileSize} bytes): ${path.relative(destMemoryPath, fullPath)}`);
continue;
}
const raw = fs.readFileSync(fullPath, "utf8");Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Ran a pr-finisher pass and pushed |
AI agents frequently write JSON memory files as single-line blobs even when instructed otherwise, making them hard to review or manually correct in the repository. A new
format-json: trueoption onrepo-memoryentries forces all.jsonfiles to be pretty-printed (2-space indent, trailing newline) before they are committed to the memory branch.Changes
RepoMemoryEntrystruct — newFormatJSON boolfield (format-jsonYAML key)extractRepoMemoryConfig— parsesformat-jsonin both object and array config pathsbuildPushRepoMemoryJob— emitsFORMAT_JSON: 'true'env var on the push step only when enabled (omitted otherwise, no-op default)push_repo_memory.cjs— after files are copied but beforegit add, recursively formats.jsonfiles viaJSON.stringify(parsed, null, 2); invalid JSON files are skipped with a warning rather than failing the pushformat-jsonboolean added to both the object and array-itemrepo-memoryschema variantsformat-jsondocumented in the advanced config example and option descriptionsExample
Or per-entry in array notation: