fix: populate missing tarball checksums (#84) + init --provider=postgresql (#85) #86
renecannao wants to merge 5 commits into master
Conversation
91 tarball entries (all MySQL 8.4.x/9.x, MariaDB, Percona 5.7+/8.x, TiDB) had no checksum, so CompareTarballChecksum silently returned nil and users downloaded without any integrity verification.

Scraped 89 checksums from upstream sources:
- MySQL (70 entries): MD5 from dev.mysql.com/downloads/mysql pages
- MariaDB (5 entries): SHA256 from archive.mariadb.org sha256sums.txt
- Percona Server (14 entries): SHA256 from per-tarball .sha256sum sidecars

The two remaining uncheckable entries are TiDB 3.0.0 "master" rolling tarballs, which CompareTarballChecksum already exempts.

Also:
- CompareTarballChecksum now prints a warning to stderr when the checksum is empty instead of silently skipping verification
- Scraper scripts committed under scripts/checksums/ for repeatability when new versions are added

Verified one MySQL checksum end-to-end (mysql-8.4.8-linux-glibc2.17-x86_64.tar.xz):
  scraped: MD5:43cd4e14b688e9364ec4dbd58dd37437
  actual:  43cd4e14b688e9364ec4dbd58dd37437
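The verification flow described above can be sketched in Python. The `ALGO:<hexdigest>` checksum format follows the `MD5:...` convention shown in the commit message; the function name is illustrative, not the actual Go API:

```python
import hashlib
import sys

def verify_tarball(data: bytes, expected: str) -> bool:
    """Verify tarball bytes against a checksum string like 'MD5:<hex>' or 'SHA256:<hex>'."""
    if not expected:
        # mirror the fix above: warn on stderr instead of silently skipping
        print("WARNING: no checksum recorded; skipping verification", file=sys.stderr)
        return True
    algo, _, digest = expected.partition(":")
    h = hashlib.new(algo.lower())  # raises ValueError for unknown algorithm names
    h.update(data)
    return h.hexdigest() == digest.lower()
```

The empty-checksum branch returning True (with a warning) matches the TiDB exemption described above: downloads proceed, but the user is told verification was impossible.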
dbdeployer init was hardcoded to download a MySQL tarball, even though
PostgreSQL is a first-class provider for deploy/unpack/downloads. Add
a --provider flag so users can initialize a PostgreSQL environment:
dbdeployer init --provider=postgresql
For PostgreSQL, init:
- creates $SANDBOX_HOME as usual
- points $SANDBOX_BINARY at ~/opt/postgresql (unless overridden)
- does NOT try to auto-download MySQL
- prints instructions for obtaining the PG .deb files and running
`dbdeployer unpack --provider=postgresql` to populate the binary dir
Auto-downloading PostgreSQL .debs (to make init fully hands-off for
PG) is left for a follow-up — this commit resolves #85 by making init
not blow up when the user's target is PostgreSQL.
Also adds a CI step that:
- runs init --provider=postgresql in an isolated dir
- verifies directories are created
- verifies the PG instructions are printed
- verifies init does not reference a MySQL tarball
- verifies init --provider=oracle errors out
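The provider check that the last CI assertion exercises can be sketched as follows (illustrative Python; the real validation lives in the Go `init` code):

```python
VALID_PROVIDERS = ("mysql", "postgresql")

def validate_provider(provider: str) -> str:
    """Reject providers that init does not support, as the CI sanity check expects."""
    if provider not in VALID_PROVIDERS:
        raise ValueError(
            f"unknown provider {provider!r}; valid providers: {', '.join(VALID_PROVIDERS)}"
        )
    return provider
```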
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 10 minutes and 11 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 Walkthrough
Adds PostgreSQL support to `dbdeployer init`.
Changes
Sequence Diagrams

```mermaid
sequenceDiagram
    participant User
    participant CLI as dbdeployer (init)
    participant Cmd as cmd/init.go
    participant Ops as ops/init.go
    participant Installer as providers/postgresql (macOS/Linux)
    participant Filesystem as FS
    User->>CLI: dbdeployer init --provider=postgresql
    CLI->>Cmd: parse flags (--provider)
    Cmd->>Ops: InitEnvironment(provider)
    Ops->>Ops: validate provider (mysql|postgresql)
    alt provider == postgresql
        Ops->>Installer: choose platform-specific installer
        Installer->>FS: create/modify sandbox-binary (~/opt/postgresql/<X.Y>)
        Installer-->>Ops: success / instructions (Linux)
    else provider == mysql
        Ops->>Downloads: find/download/unpack tarball
        Downloads-->>Ops: tarball extracted
    end
    Ops-->>User: init complete (or printed instructions)
```

```mermaid
sequenceDiagram
    participant Downloads as dbdeployer downloads
    participant Registry as downloads/tarball_list.json
    participant Checksum as CompareTarballChecksum
    participant User
    User->>Downloads: request tarball
    Downloads->>Registry: read metadata (checksum?)
    Registry-->>Downloads: checksum present/empty
    Downloads->>Checksum: CompareTarballChecksum(...)
    alt checksum present
        Checksum-->>Downloads: verify hashes
    else checksum missing
        Checksum-->>User: WARN: checksum missing (stderr)
    end
    Downloads-->>User: deliver tarball
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Code Review
This pull request introduces a --provider flag to the init command, enabling support for PostgreSQL alongside the default MySQL. It updates the initialization logic to handle PostgreSQL-specific paths and instructions, adds checksums for various database tarballs, and includes Python scripts to automate checksum scraping from official sources. Review feedback recommends enhancing the MySQL scraper's error handling and portability, and suggests making the PostgreSQL setup instructions more generic regarding version numbers.
```python
def fetch(url: str, cookies: str | None = None) -> str:
    """Fetch a URL via curl and return the body."""
    cmd = ["curl", "-sL"]
    if cookies:
        cmd += ["-b", cookies]
    cmd.append(url)
    return subprocess.check_output(cmd, text=True)
```
The fetch function does not handle errors from curl. If curl fails (e.g., with a 404 error), subprocess.check_output will raise an exception and crash the script. It would be more robust to handle this error, similar to how it's done in scrape_mariadb_percona_checksums.py.
```diff
 def fetch(url: str, cookies: str | None = None) -> str:
     """Fetch a URL via curl and return the body."""
-    cmd = ["curl", "-sL"]
+    cmd = ["curl", "-fsL"]
     if cookies:
         cmd += ["-b", cookies]
     cmd.append(url)
-    return subprocess.check_output(cmd, text=True)
+    try:
+        return subprocess.check_output(cmd, text=True, stderr=subprocess.DEVNULL)
+    except subprocess.CalledProcessError:
+        return ""
```
ops/init.go (Outdated)
```go
fmt.Println("PostgreSQL binaries are not auto-downloaded.")
fmt.Println("Fetch the server + client .deb packages from either:")
fmt.Println("  - https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-16/")
fmt.Println("  - your distribution's PostgreSQL package mirror")
fmt.Println()
fmt.Println("Then unpack them into the sandbox-binary directory:")
fmt.Printf("  dbdeployer unpack --provider=postgresql \\\n")
fmt.Printf("  postgresql-16_*.deb postgresql-client-16_*.deb\n")
```
The setup instructions for PostgreSQL hardcode version 16 in the URL and example commands. This could be confusing for users wanting to install a different version. It would be better to clarify that this is an example and provide more generic guidance.
```diff
 fmt.Println("PostgreSQL binaries are not auto-downloaded.")
-fmt.Println("Fetch the server + client .deb packages from either:")
-fmt.Println("  - https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-16/")
+fmt.Println("Fetch the server + client .deb packages for your desired version (e.g. 16) from either:")
+fmt.Println("  - https://apt.postgresql.org/pub/repos/apt/ (e.g. .../pool/main/p/postgresql-16/)")
 fmt.Println("  - your distribution's PostgreSQL package mirror")
 fmt.Println()
-fmt.Println("Then unpack them into the sandbox-binary directory:")
+fmt.Println("Then unpack them into the sandbox-binary directory (example for version 16):")
 fmt.Printf("  dbdeployer unpack --provider=postgresql \\\n")
 fmt.Printf("  postgresql-16_*.deb postgresql-client-16_*.deb\n")
```
```python
def prime_cookies() -> str:
    """Prime cookies by fetching the archives page once."""
    jar = "/tmp/mysql-scrape-cookies.txt"
```
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/integration_tests.yml:
- Around line 382-425: The CI currently exercises PostgreSQL init but lacks the
PR-time gate to reject changes that add non-TiDB tarballs without checksums; add
a new job step in .github/workflows/integration_tests.yml (near the existing
"Test init --provider=postgresql" step) that runs the checksum-completeness
validator against downloads/tarball_list.json and fails the job if any non-TiDB
entry lacks a checksum; invoke the same validation logic used at runtime (the
code in downloads/remote_registry.go) — e.g., call the repository's checksum
validation command or a small Go script that loads downloads/tarball_list.json
and enforces checksum presence for new tarballs — so PRs are blocked when a
missing checksum is detected.
In `@ops/init.go`:
- Around line 70-79: The code currently overwrites the global default
sandboxBinary when provider == "postgresql" causing unintended persistence;
change the behavior so init only uses ~/opt/postgresql as a local runtime value
(sandboxBinary) and does not mark it as an explicit global default unless the
user actually passed --sandbox-binary. Concretely: in the provider switch where
sandboxBinary is set for PostgreSQL, do not modify persistence-related defaults
or globals; instead set only the local sandboxBinary variable, and in the
defaults-update/persistence block (the later defaults-update logic) consult
flags.Changed(globals.SandboxBinaryLabel) (populate
globals.SandboxBinaryExplicit in cmd/init.go) before writing the sandbox-binary
into defaults so that only an explicit flag causes persistence. Ensure the same
fix is applied to the second occurrence referenced (lines ~166-173) to avoid
accidental global updates.
In `@scripts/checksums/scrape_mysql_checksums.py`:
- Around line 30-36: fetch currently returns only the body and hides HTTP
errors; change fetch(url, cookies) to capture and return the HTTP status
alongside the response body (e.g., return (status_code, body)) instead of
swallowing non-200 responses, by invoking curl to output the status code (or
using --write-out) and parsing it; then update scrape_page to check the returned
status_code from fetch and bail (raise an error or return early) on any non-200
response so the script fails explicitly rather than treating error pages as “no
checksum found.”
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2fd7a9d1-b635-4fec-9a2b-4a91e86d8790
📒 Files selected for processing (8)
- .github/workflows/integration_tests.yml
- cmd/init.go
- downloads/remote_registry.go
- downloads/tarball_list.json
- ops/init.go
- scripts/checksums/README.md
- scripts/checksums/scrape_mariadb_percona_checksums.py
- scripts/checksums/scrape_mysql_checksums.py
```yaml
- name: Test init --provider=postgresql
  run: |
    # Run init against isolated dirs, verify it creates them and
    # prints the PostgreSQL setup instructions.
    OUT=$(./dbdeployer init --provider=postgresql \
      --sandbox-binary=/tmp/init-pg-bin \
      --sandbox-home=/tmp/init-pg-home \
      --skip-shell-completion 2>&1)
    echo "$OUT"

    # Verify directories were created
    [ -d /tmp/init-pg-bin ] || { echo "FAIL: sandbox-binary not created"; exit 1; }
    [ -d /tmp/init-pg-home ] || { echo "FAIL: sandbox-home not created"; exit 1; }

    # Verify the PostgreSQL setup instructions were printed
    echo "$OUT" | grep -q "PostgreSQL binaries are not auto-downloaded" || {
      echo "FAIL: expected PostgreSQL setup instructions in output"
      exit 1
    }
    echo "$OUT" | grep -q "dbdeployer unpack --provider=postgresql" || {
      echo "FAIL: expected unpack instructions in output"
      exit 1
    }

    # Verify it didn't try to download a MySQL tarball
    echo "$OUT" | grep -q "mysql-" && {
      echo "FAIL: output references mysql tarball"
      exit 1
    }

    # Sanity: invalid provider should fail
    ./dbdeployer init --provider=oracle --dry-run \
      --sandbox-binary=/tmp/init-bad \
      --sandbox-home=/tmp/init-bad-home 2>&1 | grep -q "unknown provider" || {
      echo "FAIL: invalid provider did not error as expected"
      exit 1
    }

    # Cleanup init side-effects so the rest of the job uses its own dirs
    rm -rf /tmp/init-pg-bin /tmp/init-pg-home /tmp/init-bad /tmp/init-bad-home
    # Restore sandbox defaults so subsequent steps find ~/opt/postgresql
    ./dbdeployer defaults update sandbox-binary "$HOME/opt/mysql"
    ./dbdeployer defaults update sandbox-home "$HOME/sandboxes"
    echo "OK: dbdeployer init --provider=postgresql works"
```
Add the checksum-completeness gate to CI.
This workflow now covers the PostgreSQL init path, but it still doesn’t implement #84’s requirement to fail PRs when a new non-TiDB tarball is added without a checksum. downloads/tarball_list.json can still regress if it’s edited directly, because the runtime validation in downloads/remote_registry.go is not a PR-time guard.
Suggested CI step
```diff
+      - name: Verify tarball checksums are populated
+        run: |
+          missing=$(jq -r '
+            .Tarballs[]
+            | select((.flavor != "tidb") and (.checksum == null or .checksum == ""))
+            | .name
+          ' downloads/tarball_list.json)
+          [ -z "$missing" ] || {
+            echo "Missing checksums for:"
+            echo "$missing"
+            exit 1
+          }
```
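The same PR-time gate could be sketched in Python for CI images without jq. The `Tarballs`/`flavor`/`checksum` field names mirror the jq filter in the suggested step and are assumptions about tarball_list.json's schema:

```python
import json

def missing_checksums(registry_json: str) -> list[str]:
    """Return names of non-TiDB tarball entries whose checksum is null or empty."""
    registry = json.loads(registry_json)
    return [
        t["name"]
        for t in registry.get("Tarballs", [])
        if t.get("flavor") != "tidb" and not t.get("checksum")
    ]
```

A CI step would load downloads/tarball_list.json, call this, and exit non-zero if the returned list is non-empty.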
```go
// For PostgreSQL, point sandbox-binary at ~/opt/postgresql instead of
// the MySQL default, unless the user explicitly overrode it via
// --sandbox-binary.
if provider == "postgresql" && sandboxBinary == defaults.Defaults().SandboxBinary {
    home, herr := os.UserHomeDir()
    if herr != nil {
        return fmt.Errorf("cannot determine home directory: %s", herr)
    }
    sandboxBinary = path.Join(home, "opt", "postgresql")
}
```
Don’t persist the PostgreSQL path as the new global default unless the user explicitly asked for it.
Right now init --provider=postgresql rewrites sandboxBinary to ~/opt/postgresql, and the later defaults-update block persists that as the global sandbox-binary. That changes subsequent MySQL commands too, which is why the workflow has to manually restore defaults afterward. It also can’t distinguish an inherited custom MySQL path from an explicit --sandbox-binary override.
Suggested direction

```diff
 type InitOptions struct {
 	SandboxBinary         string
 	SandboxHome           string
 	DryRun                bool
 	SkipDownloads         bool
 	SkipTarballDownload   bool
 	SkipCompletion        bool
 	Provider              string
+	SandboxBinaryExplicit bool
 }
```

```diff
-	if provider == "postgresql" && sandboxBinary == defaults.Defaults().SandboxBinary {
+	if provider == "postgresql" && !options.SandboxBinaryExplicit {
 		home, herr := os.UserHomeDir()
 		if herr != nil {
 			return fmt.Errorf("cannot determine home directory: %s", herr)
 		}
 		sandboxBinary = path.Join(home, "opt", "postgresql")
 	}
```

```diff
-	if sandboxBinary != defaults.Defaults().SandboxBinary {
+	if options.SandboxBinaryExplicit && sandboxBinary != defaults.Defaults().SandboxBinary {
 		// update defaults
 	}
```

cmd/init.go can populate `SandboxBinaryExplicit` via `flags.Changed(globals.SandboxBinaryLabel)`.
Also applies to: 166-173
```python
def fetch(url: str, cookies: str | None = None) -> str:
    """Fetch a URL via curl and return the body."""
    cmd = ["curl", "-sL"]
    if cookies:
        cmd += ["-b", cookies]
    cmd.append(url)
    return subprocess.check_output(cmd, text=True)
```
Fail explicitly on HTTP fetch errors instead of treating them as “no checksum found.”
curl -sL will happily return HTML for 404/interstitial pages, so scrape_page() can silently interpret a fetch failure as an empty result set and under-populate the registry. The MariaDB/Percona scraper in scripts/checksums/scrape_mariadb_percona_checksums.py:19-33 already preserves HTTP status; this one should do the same.
Suggested direction

```diff
-def fetch(url: str, cookies: str | None = None) -> str:
-    """Fetch a URL via curl and return the body."""
-    cmd = ["curl", "-sL"]
+def fetch(url: str, cookies: str | None = None) -> tuple[int, str]:
+    """Fetch a URL via curl and return (http_status, body)."""
+    cmd = ["curl", "-sL", "-w", "\n%{http_code}"]
     if cookies:
         cmd += ["-b", cookies]
     cmd.append(url)
-    return subprocess.check_output(cmd, text=True)
+    try:
+        out = subprocess.check_output(cmd, text=True, stderr=subprocess.DEVNULL)
+    except subprocess.CalledProcessError:
+        return (0, "")
+    body, _, code = out.rpartition("\n")
+    try:
+        return (int(code), body)
+    except ValueError:
+        return (0, "")
```

Then make scrape_page() bail out on non-200 responses.
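The body/status split this suggestion relies on (curl's `-w "\n%{http_code}"` appends the status as a final line) can be exercised without any network access; the helper below isolates just that parsing step:

```python
def split_status(curl_output: str) -> tuple[int, str]:
    """Split curl output whose final line is the HTTP status (from -w '%{http_code}')
    into (status, body). Returns (0, '') if no numeric status line is present."""
    body, _, code = curl_output.rpartition("\n")
    try:
        return (int(code), body)
    except ValueError:
        return (0, "")
```

Note the edge case: for output with no newline at all, `rpartition` leaves the whole string in `code`, so a non-numeric response correctly falls into the (0, "") branch.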
🧰 Tools
🪛 Ruff (0.15.9)
[error] 36-36: subprocess call: check for execution of untrusted input
(S603)
On macOS there is no official PostgreSQL tarball distribution. The
closest to a drop-in binary package is Postgres.app, which bundles
full PG major versions inside a .dmg with the same bin/lib/share
layout that dbdeployer already uses.
This commit wires Postgres.app into `dbdeployer init --provider=postgresql`
for darwin builds:
1. Query the Postgres.app GitHub releases API for the latest release
2. Pick the newest single-major-version .dmg asset
(skipping the large multi-version bundle)
3. Download to a tempdir
4. Mount headlessly with `hdiutil attach -nobrowse -readonly -noverify`
5. Run the bundled `postgres --version` to detect the exact X.Y
version (deploy postgresql requires X.Y format)
6. Copy bin/lib/share/include into ~/opt/postgresql/<X.Y>/
7. Detach the DMG
No sudo, no GUI, no Homebrew. Postgres.app is a universal binary so
the same code path works on both Apple Silicon and Intel Macs.
Linux behavior is unchanged — `init --provider=postgresql` still
prints instructions to download .deb files manually, since the PGDG
pool structure (distro-specific, architecture-specific, multiple
build variants per release) is too messy to resolve generically.
Tests:
- Unit test for the GitHub release asset filtering
(offline, using httptest.NewServer)
- CI job `postgresql-macos-test` on macos-14 and macos-15:
* runs `init --provider=postgresql`
* verifies binaries land at ~/opt/postgresql/<X.Y>/bin
* verifies the directory name is in X.Y format
* runs `deploy postgresql <X.Y>`
* verifies CREATE TABLE / INSERT / SELECT roundtrip
🧹 Nitpick comments (4)
providers/postgresql/macos_install_test.go (1)

54-57: Consider handling the JSON encoding error.

The error from `json.NewEncoder(w).Encode(fakeGithubRelease)` is silently discarded. While unlikely to fail in a test with a static struct, explicitly handling it would prevent silent test corruption if the fixture becomes invalid.

♻️ Suggested improvement

```diff
 server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
 	w.Header().Set("Content-Type", "application/json")
-	_ = json.NewEncoder(w).Encode(fakeGithubRelease)
+	if err := json.NewEncoder(w).Encode(fakeGithubRelease); err != nil {
+		http.Error(w, err.Error(), http.StatusInternalServerError)
+	}
 }))
```

providers/postgresql/macos_install.go (3)
121-129: Sorting assumes major versions remain ≤2 digits.

The string comparison `assets[j].Major > assets[i].Major` works correctly for PostgreSQL majors 14-18 (and up to 99), but would sort incorrectly if PostgreSQL ever reaches major version 100+ (e.g., "100" < "99" lexicographically). This is unlikely to be an issue in practice for many years, but consider parsing to integers if future-proofing is desired.
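The lexicographic-vs-numeric difference is easy to demonstrate; a minimal illustration of the integer-keyed comparison the comment suggests (Python here, while the actual fix would be in the Go sort loop):

```python
def newest_major(majors: list[str]) -> str:
    """Pick the numerically largest major-version string; key=int avoids '9' > '17'."""
    return max(majors, key=int)
```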
279-286: File close error is discarded.

If `out.Close()` fails (e.g., due to disk full on final flush), the error is silently ignored because `defer out.Close()` doesn't capture the return value. For large DMG downloads, this could mask data corruption.

♻️ Handle close error

```diff
 	out, err := os.Create(dst) //nolint:gosec // dst is a path we constructed in a tempdir
 	if err != nil {
 		return err
 	}
-	defer out.Close()
-
-	_, err = io.Copy(out, resp.Body)
-	return err
+
+	_, copyErr := io.Copy(out, resp.Body)
+	closeErr := out.Close()
+	if copyErr != nil {
+		return copyErr
+	}
+	return closeErr
```
169-177: Consider the security implications of `-noverify`.

The `-noverify` flag skips DMG checksum verification. While this speeds up mounting, it bypasses macOS's built-in integrity check for the downloaded image. Since the DMG is downloaded over HTTPS from GitHub, the transport is secure, but an attacker who compromises the GitHub release could provide a corrupted/malicious DMG that would not be detected. If performance is acceptable, consider removing `-noverify` to let hdiutil verify the DMG's internal checksum.

♻️ Consider removing -noverify

```diff
 	mountCmd := exec.Command("hdiutil", "attach",
-		"-nobrowse", "-readonly", "-noverify",
+		"-nobrowse", "-readonly",
 		"-mountpoint", mountPoint, dmgPath)
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@providers/postgresql/macos_install_test.go`:
- Around line 54-57: The test's HTTP handler discards the error from
json.NewEncoder(w).Encode(fakeGithubRelease), which can silently mask encoding
failures; update the httptest.NewServer handler function used to construct
server to capture the error returned by
json.NewEncoder(w).Encode(fakeGithubRelease) and call t.Fatalf or t.Errorf (or
write an appropriate HTTP 500) when encoding fails so the test fails loudly;
locate the anonymous handler passed to httptest.NewServer and modify the json
encoding call there to check and handle the returned error.
In `@providers/postgresql/macos_install.go`:
- Around line 121-129: The current sorting loop compares assets[j].Major and
assets[i].Major as strings which can misorder majors with three+ digits; modify
the sort to compare numeric majors by parsing assets[].Major into integers
(handle parse errors safely, e.g., treat non-numeric or parse failure as 0 or
skip) and then use the integer comparison in the nested loop where assets are
swapped; update any references in this block that use assets and the Major field
to use the parsed int for sorting.
- Around line 279-286: The file currently defers out.Close() and returns the
io.Copy error, discarding any Close error; change the logic around the writer
"out" so you capture the result of io.Copy (the current "err"), then call
out.Close() explicitly, capture its error (e.g., closeErr) and if closeErr is
non-nil and the copy error is nil, return closeErr (or combine/annotate both
errors), ensuring errors from io.Copy and out.Close() (final flush) are both
handled before returning; locate the block that creates "out" and calls
io.Copy(out, resp.Body) to implement this change.
- Around line 169-177: The code creates mountCmd using exec.Command("hdiutil",
"attach", "-nobrowse", "-readonly", "-noverify", "-mountpoint", mountPoint,
dmgPath) which uses -noverify (skips DMG checksum verification); remove the
"-noverify" argument so hdiutil performs integrity checks when mounting. Update
the argument list passed to exec.Command in the mountCmd creation (where hdiutil
attach is invoked) to exclude "-noverify" and keep "-nobrowse" and "-readonly"
as before, then run tests/flows to ensure mounting still works under expected
performance constraints.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 753b1084-1ab9-47fd-8678-b03f4737d330
📒 Files selected for processing (4)
- .github/workflows/integration_tests.yml
- ops/init.go
- providers/postgresql/macos_install.go
- providers/postgresql/macos_install_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- ops/init.go
dbdeployer invokes initdb with `-L <basedir>/share`, which expects
postgres.bki directly at that path. Postgres.app keeps datadir
contents at share/postgresql/ (nested), so initdb failed with:
initdb: error: file "share/postgres.bki" does not exist
Flatten share/postgresql/* into share/ after the main copy, in
addition to the nested copy that the postgres runtime needs (via
compiled-in sharedir relocation). Mirrors what the Linux .deb
extractor already does.
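The flatten step can be sketched with stdlib calls, assuming a plain `share/postgresql/` directory layout (paths and function name are illustrative, not the actual Go extractor):

```python
import shutil
from pathlib import Path

def flatten_share(share_dir: Path) -> None:
    """Copy share/postgresql/* directly into share/ so `initdb -L <basedir>/share`
    finds postgres.bki, while leaving the nested copy in place for the
    compiled-in sharedir the postgres runtime expects."""
    nested = share_dir / "postgresql"
    if not nested.is_dir():
        return
    for entry in nested.iterdir():
        target = share_dir / entry.name
        if entry.is_dir():
            shutil.copytree(entry, target, dirs_exist_ok=True)
        elif not target.exists():
            shutil.copy2(entry, target)
```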
The unauthenticated GitHub API allows 60 req/hour per IP. Shared CI runners on GitHub Actions routinely exhaust this: the first macOS init run got HTTP 403. Send GITHUB_TOKEN as a Bearer token if set in the environment (automatic on every GitHub Actions runner). This raises the limit to 5000 req/hour and is harmless for unauthenticated local use where the token is simply absent. Also expose GITHUB_TOKEN to the macOS init CI step.
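The conditional-auth behavior described above can be sketched like this (function name illustrative; the real code is Go). The request carries a Bearer token only when GITHUB_TOKEN is present in the environment:

```python
import os
import urllib.request

def github_api_request(url: str) -> urllib.request.Request:
    """Build a GitHub API request, adding Authorization only when GITHUB_TOKEN
    is set (automatic on GitHub Actions runners, absent in local use)."""
    headers = {"Accept": "application/vnd.github+json"}
    token = os.environ.get("GITHUB_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)
```

No request is sent here; the sketch only shows header construction, which is the part the commit changes.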
Summary
Addresses both issues #84 and #85, plus adds macOS PostgreSQL support as a follow-up to the #85 fix.
#84 — Missing checksums for 91 tarballs
91 out of 254 tarball entries had no `checksum` field, so `CompareTarballChecksum` silently skipped verification and users downloaded without any integrity check.

Populated 89 checksums from upstream sources:

- MySQL: MD5 from `dev.mysql.com/downloads/mysql` archive pages
- MariaDB: SHA256 from `archive.mariadb.org` `sha256sums.txt` files
- Percona Server: SHA256 from `.sha256sum` sidecars on `downloads.percona.com`

The 2 remaining uncheckable entries are TiDB 3.0.0 "master" rolling tarballs, which the code already exempts.

Verified end-to-end for one MySQL entry.

Other changes:

- `CompareTarballChecksum` now prints a warning to stderr when called with an empty checksum, instead of silently skipping
- Scraper scripts committed under `scripts/checksums/` with a README so this can be repeated when new versions are added

#85 — `dbdeployer init` doesn't support PostgreSQL (Linux + macOS)

`init` was hardcoded to download a MySQL tarball even though PostgreSQL is a first-class provider for `deploy`/`unpack`/`downloads`.

Added a `--provider` flag to `dbdeployer init` with full provider support:

- Linux — prints clear instructions for obtaining PG `.deb` files and running `dbdeployer unpack --provider=postgresql`. (Automating .deb download from PGDG is non-trivial — multiple distro/arch variants per release — so it's deferred to a follow-up.)
- macOS — fully automated via Postgres.app:
  - queries the Postgres.app GitHub releases API and picks the newest single-major-version `.dmg` asset
  - mounts it headlessly with `hdiutil attach -nobrowse -readonly`
  - runs the bundled `postgres --version` to detect the exact `X.Y` version
  - copies `bin/lib/share/include` into `~/opt/postgresql/<X.Y>/`

No sudo, no GUI, no Homebrew. Postgres.app is a universal binary so the same path works on Apple Silicon and Intel.

For both: `--provider` is validated (`mysql`, `postgresql`) and errors on unknown values.

CI coverage

- New `postgresql-macos-test` job (matrix: macos-14, macos-15):
  - `init --provider=postgresql` → downloads real Postgres.app DMG
  - verifies binaries land at `~/opt/postgresql/<X.Y>/bin`
  - verifies the directory name is in `X.Y` format
  - `deploy postgresql <X.Y>` → real sandbox
  - `CREATE TABLE / INSERT / SELECT` roundtrip

Test plan

- `init --provider=postgresql` on Linux creates dirs, prints instructions
- `init --provider=oracle` errors
- `init --provider=mysql --dry-run` still works
- `postgresql-macos-test` job passes on macos-14 and macos-15

Closes #84
Closes #85
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation
Chores