
feat: add auto-update mechanism for CLI #825

Open
m-abdelwahab wants to merge 61 commits into master from mahmoud/autoupdates


m-abdelwahab commented Mar 27, 2026

Summary

Adds a complete auto-update system to the Railway CLI that detects the install method, picks the right update strategy, and applies updates in the background — transparently and safely. Users no longer need to manually run railway upgrade to stay current.

New commands

  • railway autoupdate enable — Enable auto-updates (preserves any rollback skip; prints a notice if one is active)
  • railway autoupdate disable — Disable auto-updates and clean up staged binaries (returns instantly, no blocking wait)
  • railway autoupdate status — Show current state: enabled/disabled reason, install method, update strategy, latest known version, staged update, last check time, in-flight background PID, and skipped version
  • railway autoupdate skip — Skip the current pending version so auto-update stops trying to install it (essential for package-manager installs where --rollback isn't available)
  • railway upgrade --rollback — Revert to a previous CLI version from local backups (shell installs only)

Design

Two-phase update: stage now, apply next run

The core mechanism is stage → apply:

  1. A detached child process downloads the release asset and writes it to ~/.railway/staged-update/. This process is fully independent: it survives after the parent exits, so slow downloads never block the user. Background downloads time out after 30 seconds; interactive railway upgrade allows 120 seconds.
  2. On the next interactive invocation, the CLI atomically replaces itself with the staged binary before running the command. The new version takes effect on the invocation after that.

Why not download-and-exec in one shot? Replacing a running binary and exec'ing into it is fragile, especially on Windows where the OS locks the running executable. The two-phase approach avoids self-replacement races, keeps each invocation's behavior predictable (you always run the binary you launched), and makes rollback straightforward since the old binary is backed up before replacement.

Why detached child processes? A CLI command should return in milliseconds. Downloads can take seconds. Spawning a detached process (process_group(0) on Unix, CREATE_NEW_PROCESS_GROUP on Windows) means the download runs independently of the parent's lifetime. A 1-second timeout on the version-check task ensures even railway --help returns instantly — the only work gated on that timeout is the fast GitHub API call, never the download.
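
A minimal sketch of the detached spawn described above, using only the standard library. The helper names detached_command and spawn_detached are illustrative here (the PR's own spawn_detached() may look different), and redirecting stdout and stderr to null is an assumption; the PR only states that stdin is nulled.

```rust
use std::process::{Command, Stdio};

/// Windows process-creation flag value for CREATE_NEW_PROCESS_GROUP.
#[cfg(windows)]
const CREATE_NEW_PROCESS_GROUP: u32 = 0x0000_0200;

/// Build a command that is detached from the parent's terminal and
/// process group. Hypothetical helper, not the PR's actual signature.
pub fn detached_command(program: &str, args: &[&str]) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null()) // assumption: output is discarded
        .stderr(Stdio::null());

    #[cfg(unix)]
    {
        use std::os::unix::process::CommandExt;
        // 0 means "put the child in a new group whose ID is its own PID".
        cmd.process_group(0);
    }
    #[cfg(windows)]
    {
        use std::os::windows::process::CommandExt;
        cmd.creation_flags(CREATE_NEW_PROCESS_GROUP);
    }
    cmd
}

/// Spawn the child and return its PID without ever waiting on it,
/// so the parent can exit while the child keeps running.
pub fn spawn_detached(program: &str, args: &[&str]) -> std::io::Result<u32> {
    let child = detached_command(program, args).spawn()?;
    Ok(child.id())
}
```

Because the parent never calls wait() on the child, the download's duration has no effect on how quickly the foreground command returns.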

12-hour API check gate

The CLI queries the GitHub Releases API at most once per 12 hours. The check timestamp is stored in ~/.railway/version.json and compared as a fixed duration from the last check.

Why 12 hours? It balances hotfix discovery speed (a critical fix is picked up within half a day) against API call volume. Users who work in morning and afternoon sessions get a fresh check each session. The fixed-duration approach avoids the timezone-dependent edge cases of a calendar-day gate (e.g., checking at 11:59pm UTC and not rechecking until the next calendar day).

The version check runs in all environments (including non-TTY and piped output) to keep the cache fresh. Only the staged-binary apply and update banner are suppressed in non-interactive contexts.

If the user rolls back from a bad version, the check gate is explicitly cleared so the CLI re-checks on every invocation until a newer (non-skipped) release appears — ensuring a hotfix is discovered promptly.
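
The fixed-duration gate can be sketched as a pure function. The name should_check and the use of Option<SystemTime> for the stored timestamp are assumptions for illustration; the real field lives in version.json. A cleared gate (for example after a rollback) is modeled as None.

```rust
use std::time::{Duration, SystemTime};

/// Minimum time between GitHub Releases API checks.
const CHECK_INTERVAL: Duration = Duration::from_secs(12 * 60 * 60);

/// Decide whether a fresh API check is due.
pub fn should_check(last_check: Option<SystemTime>, now: SystemTime) -> bool {
    match last_check {
        // Never checked, or the gate was explicitly cleared.
        None => true,
        Some(last) => match now.duration_since(last) {
            Ok(elapsed) => elapsed >= CHECK_INTERVAL,
            // The stored timestamp is in the future (clock moved
            // backwards); re-check rather than wait indefinitely.
            Err(_) => true,
        },
    }
}
```

Comparing elapsed time rather than calendar dates means the result does not depend on the user's timezone or on where midnight falls.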

Install method detection

The CLI resolves current_exe(), follows symlinks, and matches the path against known patterns:

| Install Method | Detection Heuristic | Update Strategy |
| --- | --- | --- |
| Shell (writable) | Binary in ~/.railway/bin, /usr/local/bin, generic */bin | Background download + atomic binary swap |
| Shell (non-writable) | Same paths, write probe fails | Notification only; user guided to sudo / Administrator |
| npm / Bun / Scoop | Binary under node_modules, .bun, or scoop paths | Detached package manager upgrade |
| Homebrew | Binary under homebrew or Cellar | Notification only (brew too slow for background) |
| Cargo | .cargo/bin or custom CARGO_INSTALL_ROOT (detected via .crates.toml marker) | Notification only (cargo install too slow) |
| Unknown | System paths (/usr/bin, /nix/, /snap/), version managers (.asdf, .mise, .volta, etc.) | Notification only |
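
The path heuristics in the table can be sketched as an ordered series of substring checks. This is a simplified illustration, not the PR's InstallMethod::detect(): the function name classify, the exact substrings, and passing the .crates.toml lookup in as a boolean are all assumptions. The ordering follows the fixes in this PR's commits (pnpm before npm, Cargo marker before the shell-path catch-all); checking .bun before node_modules is an additional assumption, made because Bun's global install directory also contains a node_modules folder.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum InstallMethod {
    Shell,
    Npm,
    Bun,
    Scoop,
    Homebrew,
    Cargo,
    Unknown,
}

/// Classify an already symlink-resolved executable path.
/// `has_crates_toml` stands in for "a .crates.toml marker exists in
/// the parent of bin/", which needs filesystem access to determine.
pub fn classify(resolved_path: &str, has_crates_toml: bool) -> InstallMethod {
    use InstallMethod::*;
    // Normalize separators and case so one set of patterns covers all OSes.
    let p = resolved_path.replace('\\', "/").to_lowercase();

    // pnpm paths contain "npm" as a substring, so rule them out first.
    if p.contains("pnpm") {
        return Unknown;
    }
    if p.contains("/.bun/") {
        return Bun;
    }
    if p.contains("/node_modules/") {
        return Npm;
    }
    if p.contains("/scoop/") {
        return Scoop;
    }
    if p.contains("/homebrew/") || p.contains("/cellar/") {
        return Homebrew;
    }
    // The Cargo marker is checked before any shell-path heuristic.
    if has_crates_toml || p.contains("/.cargo/bin/") {
        return Cargo;
    }
    if p.starts_with("/usr/bin/") || p.starts_with("/nix/") || p.starts_with("/snap/") {
        return Unknown;
    }
    // Partial, illustrative list of version-manager directories.
    if ["/.asdf/", "/.mise/", "/.volta/"].iter().any(|m| p.contains(m)) {
        return Unknown;
    }
    // Catch-all: any binary whose parent directory is named "bin".
    if p.rsplit('/').nth(1) == Some("bin") {
        Shell
    } else {
        Unknown
    }
}
```

Whether a Shell install is writable is a separate question that this sketch leaves to the write probe.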

Why notification-only for Homebrew and Cargo? Both package managers can take minutes to complete (Homebrew resolves the full dependency graph; Cargo recompiles from source). Spawning them in the background would leave a long-running process that's invisible to the user, consuming CPU and potentially conflicting with other brew/cargo operations. npm, Bun, and Scoop are fast enough for background use.

Why a write probe instead of checking file permissions? Unix file permissions don't account for ACLs, mount flags, or SELinux policies. Creating a temp file in the binary's directory and immediately removing it is the most reliable cross-platform writability check.
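
A write probe along these lines can be written with the standard library alone. The function name can_write_dir and the probe file name are hypothetical; the PR's can_write_binary() may differ (one commit mentions scopeguard for cleanup, which this sketch replaces with a plain remove).

```rust
use std::fs::{self, OpenOptions};
use std::path::Path;
use std::process;

/// Return true if a file can actually be created in `dir`.
/// This exercises the real permission path (ACLs, read-only mounts,
/// SELinux) instead of inspecting mode bits.
pub fn can_write_dir(dir: &Path) -> bool {
    // Include the PID so concurrent probes do not collide.
    let probe = dir.join(format!(".railway-write-probe-{}", process::id()));
    match OpenOptions::new().write(true).create_new(true).open(&probe) {
        Ok(file) => {
            // Close the handle before removing (required on Windows).
            drop(file);
            let _ = fs::remove_file(&probe);
            true
        }
        Err(_) => false,
    }
}
```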

Version skipping

When a release is broken, users need a way to say "stop trying to install this version" without disabling all future updates. The skipped_version field in version.json serves this purpose:

  • Shell installs: railway upgrade --rollback swaps the binary back and sets skipped_version automatically.
  • Package-manager installs (npm/Bun/Scoop): railway autoupdate skip marks the currently cached pending version as skipped. This is essential because --rollback isn't available for package-manager installs — without skip, the next CLI invocation would re-run the package manager and put the user right back on the broken version.

The skip clears automatically when a newer release is successfully applied (via auto-update or railway upgrade). autoupdate enable does not clear it — the user can explicitly take the skipped version with railway upgrade if they choose.
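
The skip lifecycle can be modeled as a small state machine. This is a sketch under assumptions: the struct below keeps only two of version.json's fields, the method names are hypothetical, versions are treated as plain MAJOR.MINOR.PATCH (no pre-release tags), and "clears when a newer release is applied" is interpreted as "the applied version is newer than the skipped one".

```rust
#[derive(Debug, Default, PartialEq)]
pub struct UpdateCheck {
    pub latest_version: Option<String>,
    pub skipped_version: Option<String>,
}

/// Parse "1.2.3" or "v1.2.3" into a comparable tuple.
fn parse(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.trim_start_matches('v').split('.').map(|p| p.parse::<u64>());
    let major = parts.next()?.ok()?;
    let minor = parts.next()?.ok()?;
    let patch = parts.next()?.ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

impl UpdateCheck {
    /// Mark the cached pending version as skipped.
    /// Returns false when there is nothing pending.
    pub fn skip_pending(&mut self) -> bool {
        match self.latest_version.clone() {
            Some(v) => {
                self.skipped_version = Some(v);
                true
            }
            None => false,
        }
    }

    /// Should auto-update try to move from `current` to `candidate`?
    pub fn should_install(&self, candidate: &str, current: &str) -> bool {
        if self.skipped_version.as_deref() == Some(candidate) {
            return false;
        }
        match (parse(candidate), parse(current)) {
            (Some(c), Some(cur)) => c > cur,
            _ => false,
        }
    }

    /// Called after a successful apply; drops the skip once the user
    /// is on something newer than the skipped release.
    pub fn on_applied(&mut self, applied: &str) {
        let skipped = self.skipped_version.as_deref().and_then(parse);
        if let (Some(a), Some(s)) = (parse(applied), skipped) {
            if a > s {
                self.skipped_version = None;
            }
        }
    }
}
```

Enabling auto-updates deliberately has no method here: per the description above, it leaves skipped_version untouched.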

Failure recovery

Shell-download failures are tracked by a download_failures counter. After 3 consecutive failures, the cached version is cleared to force a fresh API check — preventing a stale or unreachable version from blocking discovery of newer releases.

Package-manager spawns are time-gated: at most one spawn per hour per cached version. The gate resets when the GitHub API discovers a new release. This prevents rapid-fire retries when multiple CLI invocations happen before the update finishes (e.g., 5 quick commands during a slow npm update), while still retrying periodically.

| Scenario | Behavior |
| --- | --- |
| Download times out or fails | Cached version preserved for retry; after 3 consecutive failures, cache clears to force fresh API check |
| Package manager spawn too recent | Skipped; retries after 1 hour |
| Staged binary is stale (>7 days) | Discarded automatically |
| Staged binary missing but metadata exists | Metadata cleaned; next invocation re-downloads |
| Staged binary is wrong architecture | Cleaned up, not applied |
| Binary directory not writable | Detected via write probe; user guided to sudo / Administrator |
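
The two retry policies above (a failure counter for shell downloads, a one-hour gate for package-manager spawns) can be sketched together. The struct and method names are hypothetical, and resetting the counter to zero when the cache clears is an assumption; only download_failures, the threshold of 3, and the one-hour interval come from the description.

```rust
use std::time::{Duration, SystemTime};

const MAX_DOWNLOAD_FAILURES: u32 = 3;
const SPAWN_INTERVAL: Duration = Duration::from_secs(60 * 60);

#[derive(Debug, Default)]
pub struct RetryState {
    pub latest_version: Option<String>,
    pub download_failures: u32,
    pub last_spawn: Option<SystemTime>,
}

impl RetryState {
    /// Record a newly discovered release. A different version resets
    /// both the failure counter and the spawn gate.
    pub fn set_latest(&mut self, version: &str) {
        if self.latest_version.as_deref() != Some(version) {
            self.latest_version = Some(version.to_string());
            self.download_failures = 0;
            self.last_spawn = None;
        }
    }

    /// After the third consecutive failure, forget the cached version
    /// so the next invocation performs a fresh API check.
    pub fn record_download_failure(&mut self) {
        self.download_failures += 1;
        if self.download_failures >= MAX_DOWNLOAD_FAILURES {
            self.latest_version = None;
            self.download_failures = 0; // assumption: counter restarts
        }
    }

    pub fn record_download_success(&mut self) {
        self.download_failures = 0;
    }

    /// At most one package-manager spawn per hour per cached version.
    pub fn may_spawn_package_manager(&self, now: SystemTime) -> bool {
        match self.last_spawn {
            None => true,
            Some(last) => now
                .duration_since(last)
                .map(|elapsed| elapsed >= SPAWN_INTERVAL)
                .unwrap_or(true),
        }
    }
}
```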

Concurrency & safety

  • Exclusive file locking (fs2) on ~/.railway/update.lock prevents concurrent processes from racing on download/stage/apply. A separate package-update.lock serializes package manager spawns.
  • PID file guard with 10-minute TTL prevents duplicate package manager spawns, with platform-specific liveness checks (Unix kill(pid, 0), Windows OpenProcess).
  • Detached children re-check preferences — background processes verify is_auto_update_disabled() before downloading and again after acquiring the lock, so autoupdate disable from another terminal is respected immediately.
  • Atomic file operations on all persistent state — version cache, preferences, and binary replacement all use temp-file-then-rename (MoveFileExW with MOVEFILE_REPLACE_EXISTING on Windows).
  • Lock file dropped, not deleted — releasing the handle avoids a TOCTOU race where another process creates a new file at the same path between unlink and open.
  • Read-only commands stay read-only — bare railway, railway help, railway upgrade, railway autoupdate, and check_updates never trigger staged-binary apply or background spawns. Non-TTY environments stage updates but never apply. CI environments never self-mutate.
  • Shared validation: validate_staged() is the single source of truth for staged-update safety checks (stale, wrong platform, not newer, skipped version), used by both the silent startup apply and the interactive railway upgrade fallback path.
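
As an illustration of the PID file guard, the sketch below parses a "PID TIMESTAMP" entry (the format named in one of this PR's commits) and applies the 10-minute TTL. Treating the timestamp as Unix seconds is an assumption, the function update_in_flight is hypothetical, and the liveness check is passed in as a closure so the platform-specific part (kill(pid, 0) or OpenProcess) stays out of the example.

```rust
/// Entries older than this are treated as stale regardless of liveness.
const PID_TTL_SECS: u64 = 10 * 60;

/// Parse "PID TIMESTAMP" into its two numeric fields.
pub fn parse_pid_file(contents: &str) -> Option<(u32, u64)> {
    let mut parts = contents.split_whitespace();
    let pid: u32 = parts.next()?.parse().ok()?;
    let started: u64 = parts.next()?.parse().ok()?;
    Some((pid, started))
}

/// True only when the recorded updater is both recent and still alive.
/// A malformed or missing entry never blocks a new spawn.
pub fn update_in_flight(
    contents: &str,
    now_secs: u64,
    is_alive: impl Fn(u32) -> bool,
) -> bool {
    match parse_pid_file(contents) {
        Some((pid, started)) => {
            let fresh = now_secs.saturating_sub(started) < PID_TTL_SECS;
            fresh && is_alive(pid)
        }
        None => false,
    }
}
```

The TTL is what prevents a permanent block when the liveness check is unreliable, for example if a PID has been reused by an unrelated process.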

Telemetry

Auto-update emits an autoupdate_apply telemetry event when a staged update is silently applied at startup. The event includes the old CLI version (cli_version) and the new version (sub_command), enabling tracking of update adoption rates and success.

All other auto-update actions (enable, disable, skip, status, upgrade, rollback) are tracked through the existing command telemetry system.

User controls

  • railway autoupdate [enable|disable|status|skip] for persistent preferences
  • RAILWAY_NO_AUTO_UPDATE=1 env var override — fully silent (no banner, no updates)
  • Auto-updates disabled automatically in CI environments — fully silent
  • railway autoupdate disable (preference) — stops installation but still shows "new version available" banner
  • All update notifications sent to stderr to avoid corrupting piped JSON output

Why different banner behavior for env var vs. preference? The env var and CI are scripted environments where any extra output is noise. The preference is for cautious users who want to know about releases but control when they install — they still benefit from the notification.
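
The three resulting behaviors can be summarized in a small decision function. The enum and function names are hypothetical, and treating only the literal value "1" as enabling the env var override is an assumption; how CI is detected is left to the caller.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum UpdateMode {
    /// No banner, no updates (env var override or CI).
    Silent,
    /// Banner on stderr, but nothing is installed (preference disabled).
    NotifyOnly,
    /// Banner plus background install.
    AutoUpdate,
}

pub fn update_mode(env_disabled: bool, is_ci: bool, preference_disabled: bool) -> UpdateMode {
    if env_disabled || is_ci {
        UpdateMode::Silent
    } else if preference_disabled {
        UpdateMode::NotifyOnly
    } else {
        UpdateMode::AutoUpdate
    }
}

/// Assumption: only RAILWAY_NO_AUTO_UPDATE=1 counts as "set".
pub fn env_disabled() -> bool {
    std::env::var("RAILWAY_NO_AUTO_UPDATE")
        .map(|v| v == "1")
        .unwrap_or(false)
}

/// Notifications go to stderr so piped stdout (e.g. JSON) stays clean.
pub fn notify(message: &str) {
    eprintln!("{message}");
}
```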

Backup & rollback

  • The 3 most recent binary versions are kept in ~/.railway/backups/, sorted by semver
  • Backups include the target triple to handle shared home directories across architectures
  • railway upgrade --rollback presents an interactive picker if multiple candidates exist
  • Rollback is only available for shell installs; package-manager users downgrade via their package manager and use autoupdate skip to prevent re-upgrade
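
A sketch of semver-ordered backup selection follows. The file name layout railway-<version>-<target-triple> is an assumption made for this example (the PR only states that names embed the version and the target triple), as is applying the limit of 3 per target triple. The function plan_prune is hypothetical.

```rust
const MAX_BACKUPS: usize = 3;

/// Split an assumed "railway-<major.minor.patch>-<triple>" name.
fn parse_backup(name: &str) -> Option<((u64, u64, u64), &str)> {
    let rest = name.strip_prefix("railway-")?;
    let (version, triple) = rest.split_once('-')?;
    let mut nums = version.split('.').map(|n| n.parse::<u64>().ok());
    let v = (nums.next()??, nums.next()??, nums.next()??);
    if nums.next().is_some() {
        return None;
    }
    Some((v, triple))
}

/// Return (keep, prune) for backups matching `current_triple`,
/// newest first. Files for other architectures are left alone.
pub fn plan_prune<'a>(names: &[&'a str], current_triple: &str) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut matching: Vec<((u64, u64, u64), &'a str)> = names
        .iter()
        .filter_map(|&name| {
            let (v, triple) = parse_backup(name)?;
            (triple == current_triple).then_some((v, name))
        })
        .collect();
    // Numeric comparison, so 4.10.0 sorts above 4.2.0.
    matching.sort_by(|a, b| b.0.cmp(&a.0));

    let keep: Vec<&'a str> = matching.iter().take(MAX_BACKUPS).map(|(_, n)| *n).collect();
    let prune: Vec<&'a str> = matching.iter().skip(MAX_BACKUPS).map(|(_, n)| *n).collect();
    (keep, prune)
}
```

Sorting on the parsed version rather than file modification time keeps the order stable when backups are copied or restored, which is the motivation given in the commit that introduced semver sorting.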

Platform support

  • macOS (x86_64, aarch64), Linux (x86_64, aarch64 — musl), Windows (x86_64) — full self-update support
  • FreeBSD (x86_64) — target triple detected, gated until release pipeline publishes assets
  • Windows-specific: .old.exe rename-then-copy strategy (OS locks running binaries), cleanup on next run, CREATE_NEW_PROCESS_GROUP for detached processes, platform-appropriate elevation instructions
  • Compile-time BUILD_TARGET env var (set by build.rs) ensures the self-updater fetches the exact ABI variant (gnu vs musl, msvc vs gnu)
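
The compile-time target lookup can be illustrated as below. The sketch uses option_env! so that it compiles even without a build script; the real code presumably requires the variable. The archive rule (Windows targets ship as .zip except i686-pc-windows-gnu, everything else as .tar.gz) is inferred from a commit message in this PR, and the asset naming used in asset_matches_target is purely hypothetical.

```rust
/// Target triple baked in at compile time. A build script could set it
/// with something like:
///   println!("cargo:rustc-env=BUILD_TARGET={}", std::env::var("TARGET").unwrap());
/// Without such a script this returns None.
pub fn build_target() -> Option<&'static str> {
    option_env!("BUILD_TARGET")
}

/// Archive format for a target's release asset (inferred rule).
pub fn archive_extension(target: &str) -> &'static str {
    if target.contains("windows") && target != "i686-pc-windows-gnu" {
        "zip"
    } else {
        "tar.gz"
    }
}

/// Hypothetical matcher: an asset is acceptable only if it names the
/// exact triple, so a gnu build never picks up a musl asset.
pub fn asset_matches_target(asset_name: &str, target: &str) -> bool {
    asset_name.contains(target) && asset_name.ends_with(archive_extension(target))
}
```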

Test plan

  • Shell install: verify background download stages binary, next invocation applies it
  • Shell install: verify railway upgrade downloads and applies interactively
  • Shell install: verify railway upgrade --rollback reverts and skips the rolled-back version
  • npm install: verify background package manager spawn runs npm update -g
  • npm install: verify railway autoupdate skip stops re-upgrading after manual downgrade
  • npm install: verify spawn is time-gated (no re-spawn within 1 hour)
  • Homebrew install: verify notification-only (no background spawn)
  • railway autoupdate disable returns instantly and cleans staged binaries
  • railway autoupdate enable re-enables without clearing rollback skip
  • railway autoupdate status shows install method, strategy, latest version, staged update, last check, and in-flight PID
  • Non-TTY: verify version check runs but no staged apply or banner
  • CI / RAILWAY_NO_AUTO_UPDATE=1: verify fully silent (no banner, no updates)
  • autoupdate_apply telemetry event fires after silent startup apply
  • Update banner appears on stderr, not stdout

🤖 Generated with Claude Code

m-abdelwahab and others added 13 commits March 27, 2026 07:24
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ck issues

- Don't await download_and_stage on exit; spawn it as fire-and-forget so
  commands like --help return instantly even when an update is available
- Add writability check before attempting self-update download, skipping
  the download entirely for root-owned paths like /usr/local/bin
- Replace acquire-then-drop file lock with PID file guard for package
  manager updates to prevent duplicate concurrent spawns

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Move download_and_stage back into the awaited task so the tokio
  runtime doesn't cancel it on exit, but cap handle_update_task with
  a 5-second timeout so short commands like --help aren't blocked
- Write "PID TIMESTAMP" to the lock file and treat entries older than
  10 minutes as stale, fixing the permanent block on Windows where
  is_pid_alive always returned true

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… control flow

- Write last_update_check even when CLI is up-to-date, preventing a
  GitHub API call on every invocation
- Use separate file paths for self-update flock (update.lock) and
  package-manager PID guard (package-update.pid)
- Return Ok(None) instead of bail! for "already checked today" since
  it's normal control flow, not an error
- Make Preferences::write atomic via temp file + rename

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…date

- Bypass same-day gate for known-pending versions so timed-out downloads
  retry on the next invocation
- Skip spawning update task when running `autoupdate` subcommand to avoid
  racing with preference changes
- Use nix crate for Unix PID liveness check; add Windows implementation
  via winapi instead of conservative fallback
- Detach child update process from console Ctrl+C on Windows
- Add Windows CREATE_NEW_PROCESS_GROUP flag to package manager spawning
- Use scopeguard for write-probe cleanup in install method detection
- Warn when checksums.txt is missing rather than silently skipping
- Improve rollback: back up current binary first, support multi-candidate
  selection via inquire prompt
- Remove freebsd target triple (no release asset published)
- Use nanosecond-precision tmp file names to reduce collision risk

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…on-writable shell installs, and preserve retry signal

- Skip try_apply_staged() for both `upgrade` and `autoupdate` subcommands
  so `railway autoupdate disable` doesn't swap the binary before disabling
- Add can_write_binary() check in `railway upgrade` for shell installs in
  non-writable locations (e.g. /usr/local/bin with sudo), guiding users to
  use sudo or reinstall instead of failing with a permission error
- Move clear_latest() from eager call in main to post-success in the
  background task, so a timed-out download preserves the retry signal for
  the next invocation instead of losing it behind the same-day gate

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ownload_and_stage, and add FreeBSD target

- For installs that can't self-update or auto-run a package manager
  (Homebrew, Cargo, Unknown, non-writable Shell), clear latest_version
  after the notification so the next day's check_update() can discover
  newer releases instead of freezing on the first cached version
- Wrap download_and_stage() with an exclusive file lock (update.lock)
  using double-checked locking to prevent concurrent CLI processes from
  racing on the staged-update directory
- Add FreeBSD x86_64 to detect_target_triple() to match install.sh
  support, preventing shell-installed FreeBSD users from hitting an
  "Unsupported platform" error on upgrade

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fe rename, custom bin dir detection, and rollback target tracking

- download_and_stage returns Result<bool> so callers distinguish "lock
  held" (no work done) from a real staging; failed package-manager
  spawns no longer fall through to the notification-only branch that
  clears the cache
- Add rename_replacing() helper that removes the destination on Windows
  before renaming, fixing silent write failures for preferences.json
  and update.json after the first successful write
- Shell install detection now falls back to any parent directory named
  "bin", covering custom --bin-dir installs (~/bin, /opt/bin, etc.)
- Backup filenames include the target triple; rollback filters
  candidates by current architecture to prevent cross-arch restoration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…g, preference persistence, and FreeBSD self-update

- Use rename_replacing() in UpdateCheck::write() so Windows cache clears work
- Add interact_or! and can_write_binary() guards to rollback path
- Report CI-disabled state in autoupdate status
- Create ~/.railway dir in Preferences::write() for clean HOME directories
- Remove FreeBSD from self-update targets until release pipeline publishes assets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…D self-update, and preserve notifications when auto-updates disabled

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, and use atomic Windows rename

- Notification-only installs (Homebrew, Cargo, Unknown) now preserve
  latest_version in the cache so the "new version available" notice
  actually shows on the next invocation instead of being cleared before
  the user sees it.
- Detached package-manager updates no longer clear the retry signal on
  spawn — the cached version persists until the user is actually on the
  new version.
- Failed staged-update applies preserve the staged payload for retry
  instead of deleting it; the 7-day staleness TTL handles permanent
  failures.
- `railway upgrade` now clears the version cache after a successful
  update so the next invocation doesn't redundantly re-download.
- Preferences::write() returns Result so `railway autoupdate disable`
  and `railway telemetry disable` report failures instead of silently
  succeeding.
- Standardize timestamp_nanos_opt() on unwrap_or_default() everywhere.
- Windows rename_replacing uses MoveFileExW(MOVEFILE_REPLACE_EXISTING)
  for a single-syscall atomic replace instead of remove+rename.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lper

Consolidate repeated UpdateCheck write blocks in spawn_update_task into
a single persist_latest() method, and skip redundant writes when
check_update() already persisted the timestamp.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
m-abdelwahab added the release/minor label Mar 27, 2026
m-abdelwahab and others added 16 commits March 27, 2026 18:22
list_backups relied on filesystem modification times, which required
write access to set in tests and is fragile across platforms. Sort by
the semver version embedded in the backup filename instead.

Also removes low-value tests that only assert trivial string formatting
or compiler-guaranteed match exhaustiveness.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, and clarify update message

- Spawn background downloads in a detached child process so they survive
  beyond the parent's exit timeout instead of restarting from zero
- Add download_failures counter to version.json; after 3 consecutive
  failures, clear the cached version to force a fresh API re-check
  (fixes infinite retry loop on yanked/stale releases)
- Reduce exit timeout from 5s to 2s since it now only gates the fast
  API version check, not the download
- Append "(active on next run)" to the auto-update message since the
  current process still executes the old binary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ndant spawns

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rollback prune timing, and check_update timeout

- Correct "5 s" → "2 s" in download_and_stage comment to match handle_update_task
- Simplify can_write_binary probe to unconditional create-then-remove
- Skip spawn_update_task in non-TTY when auto-update is disabled
- Clean up leftover .old.exe on Windows at the start of try_apply_staged
- Defer backup pruning until after rollback succeeds so candidates aren't removed before the picker
- Add 30 s timeout to check_update reqwest client to prevent indefinite hangs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…output

println! in the auto-update and version-banner paths wrote to stdout,
which breaks JSON parsing when the CLI is invoked programmatically
(e.g. by Claude Code piping `railway status --json`).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Once the installed binary reaches or surpasses the cached latest_version,
clear the cache so spawn_update_task falls through to a fresh check_update().
This prevents repeated package-manager spawns on every invocation after an
update lands and allows discovery of newer releases on manual install paths.

Also null out stdin on the detached package-manager update process to match
spawn_background_download and fully detach from the terminal.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… permanent

Previously spawn_update_task skipped check_update() entirely when a
cached version existed, and re-persisted the timestamp on every
invocation. This prevented the same-day gate from ever expiring, so the
CLI would advertise a stale cached version forever and never discover
newer releases.

Now we always call check_update() first (gated to once per UTC day),
falling back to the cached version only within the same day for download
retries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
background_stage_update() was clearing the cached latest_version
immediately after staging succeeded, but the binary isn't replaced until
try_apply_staged() runs on a later invocation. If that apply step fails,
the user loses both the upgrade banner and future download retries until
the 7-day stale TTL expires. The cache is already cleared on successful
apply in try_apply_staged(), so the staging path should leave it alone.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
self_update_interactive() dropped the update lock after staging but
before applying, letting a concurrent try_apply_staged() race the binary
replacement. Now the lock is held across both staging and apply.

rollback() was mutating the binary and cleaning staged state without
acquiring update.lock at all, which could interleave with a background
auto-updater. Now it acquires the same lock before replacing the binary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ose stage-apply race

update_strategy() now checks can_self_update() and can_write_binary()
so `autoupdate status` no longer claims "Background download + auto-swap"
when the binary directory is not writable or the platform is unsupported.

self_update_interactive() now holds a single lock across both
download_and_stage_inner() and apply_staged_update(), eliminating the
window where a concurrent try_apply_staged() could consume the staged
binary between staging and applying.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lock race

Removing the lock file while the handle is still held unlinks the inode
on Unix, allowing a concurrent process to create a new file at the same
path and acquire its own "exclusive" lock. Replacing remove_file with
drop releases the lock via the OS and leaves the file as an inert
sentinel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract shared spawn_detached() into util/mod.rs, deduplicating the
  detached-process setup from spawn_background_download and
  spawn_package_manager_update
- Remove redundant staged-update check in spawn_background_download
  (child process already checks in download_and_stage)
- Consolidate scattered from_cache/persist_latest branches into a
  single needs_persist flag
- Cache is_auto_update_disabled() result in main() instead of calling
  it twice
- Use persist_latest(None) in check_update's up-to-date branch
- Trim narrating comments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rely

After rollback, record the rolled-back-from version as skipped in
version.json. Auto-update skips only that version and resumes normally
once a newer release is published. This avoids silent staleness from
a full pause while still respecting the user's rollback intent.

- Add skipped_version field to UpdateCheck (serde-default for compat)
- Record skip in rollback(), guard try_apply_staged() against the race
  where a pre-rollback detached download stages the skipped version
- Clear skip on successful update (clear_after_update) and autoupdate enable
- Suppress "new version available" notification for skipped version
- Show skipped version in autoupdate status

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
m-abdelwahab and others added 30 commits April 1, 2026 16:18
…ope)

The missing checksums.txt in the release pipeline predates this PR.
Reverting the workflow change to keep this PR focused on auto-update fixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rmissions

- Gate background updates and staged-apply on is_tty so cron jobs,
  scripts, and piped invocations never silently self-mutate
- Resolve symlinks in InstallMethod::detect() to prevent Intel Homebrew
  misclassification when current_exe() returns the symlink path
- Replace catch-all bin/ Shell detection with explicit allowlist of
  known shell-installer directories to avoid overwriting version-manager
  binaries
- Probe binary directory writability in can_auto_run_package_manager()
  to prevent repeated doomed npm/Bun/Scoop updates on sudo installs
- Acquire package-update lock in autoupdate disable to wait for
  in-flight package manager updates before returning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, fix status accuracy

- autoupdate disable now polls the actual child PID instead of
  acquiring a lock the detached process never holds, so it truly
  waits for in-flight npm/Bun/Scoop updates (30s timeout with warning)
- Replace install-method allowlist with version-manager exclusion list
  (asdf, mise, rtx, proto, volta, fnm, nodenv, rbenv, pyenv) and
  restore the */bin/ catch-all so custom --bin-dir installs from
  install.sh are correctly classified as Shell again
- update_strategy() now reflects the writability check for
  npm/Bun/Scoop, showing "Notification only" when the binary
  directory is not writable instead of overpromising

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…inary, quiesce detached child on disable

- `railway autoupdate enable` no longer clears skipped_version; the
  rollback guard persists until a newer release is applied
- Staged-update fast paths now verify the binary exists before
  short-circuiting; missing binary cleans stale metadata immediately
- Detached background child re-checks auto-update preferences before
  downloading and again after acquiring the update lock

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
check_updates is a read-only command but was not in the
update-management guard, so it could apply staged binaries and
spawn background downloads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ter rollback, fix Windows upgrade instructions

- spawn_package_manager_update re-checks is_auto_update_disabled()
  after acquiring its lock, preventing a concurrent invocation from
  spawning an updater after the user ran autoupdate disable
- skip_version() now clears last_update_check so the next invocation
  performs a fresh API check and can discover a newer release published
  the same day as the rollback
- Upgrade/rollback non-writable instructions now show Windows-appropriate
  guidance (run as Administrator) instead of sudo/bash commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ire package-update lock in disable

- check_update() no longer arms the daily gate when the discovered
  version matches skipped_version, so a fix release published later
  the same day is discovered on the next invocation
- autoupdate disable now acquires package_update_lock (blocking) before
  reading the PID file, closing the race where spawn_package_manager_update
  writes the PID after disable already looked

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eout

disable was unconditionally removing the PID file even when the
detached updater was still alive. This left no in-flight marker,
so re-enabling auto-updates could launch a duplicate updater.
Now the PID file is only removed once the child has actually exited.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… file

On a fresh install where ~/.railway has never been written,
rollback would fail with "Failed to create update lock file"
instead of reporting no backups available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The */bin catch-all could misclassify a custom Cargo install root
(e.g. CARGO_INSTALL_ROOT=~/tools) as a shell install, allowing the
auto-updater to self-replace a Cargo-managed binary. Now checks for
Cargo's .crates.toml marker in the parent of bin/ before falling
through to Shell.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…p loss, check Cargo marker before shell-path heuristic

check_update() loaded version.json, did a network call (up to 30s), then
wrote the stale snapshot back — silently overwriting a skipped_version set
by a concurrent rollback.  Now re-reads from disk after the network call.

InstallMethod::detect() matched /usr/local/bin before the .crates.toml
marker check, so CARGO_INSTALL_ROOT=/usr/local installs were misclassified
as Shell.  Moved the marker check above the shell-path heuristic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… from npm detection

clear_latest() called persist_latest(None) which set last_update_check=now,
preventing the background task from discovering a hotfix published later the
same day after a manual upgrade.  Now clears the cached version without
stamping the gate.

pnpm global paths contain "npm" as a substring, causing misclassification
as InstallMethod::Npm and driving updates through the wrong package manager.
Added an early pnpm check that falls back to Unknown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dates

run_upgrade_command() now acquires the same file lock and checks the PID
file used by spawn_package_manager_update(), preventing concurrent
package-manager processes against the same global install when a user
runs `railway upgrade` while a background auto-update is in flight.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d version on API failure

Use raw args to detect update-management subcommands so that clap
DisplayHelp/DisplayVersion error paths (e.g. `railway upgrade --help`)
no longer trigger try_apply_staged() as a side effect.

Match on the Result from check_update() instead of using `?` so that
API errors fall back to the cached known_version for retry rather than
short-circuiting the entire update task.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…aths

--help, --version, and invalid-input paths now bypass both
try_apply_staged() and the background update spawn, ensuring they are
truly read-only with zero added latency.
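
A minimal sketch of a raw-argument guard for the help and version flags. The function name is hypothetical, and the short forms `-h`/`-V` are assumed from clap's defaults; invalid-input detection would need the parse result and is not shown.

```rust
/// Returns true when the invocation should skip update side effects.
/// `raw_args` includes the program name at index 0.
fn is_read_only_invocation(raw_args: &[&str]) -> bool {
    raw_args
        .iter()
        .skip(1) // ignore argv[0]
        .any(|a| matches!(*a, "--help" | "-h" | "--version" | "-V"))
}
```

Checking raw arguments before parsing means the guard runs even when the parser exits early to print help or a version string.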

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…check

- Extract `parse_pid_file()` in check_update.rs to deduplicate PID file
  parsing across autoupdate, upgrade, and check_update modules
- Remove unreachable `.crates.toml` guard in install_method.rs catch-all
  (the same check already fires earlier in detect())

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… idempotent

- After rollback, an API failure would fall back to the cached (skipped)
  version and call persist_latest(), arming the daily gate. This prevented
  re-checking for a newer release until the next day. Now skipped versions
  skip persist_latest so the API is re-checked on every invocation.
- Add --clobber to gh release upload so re-running the checksums job
  doesn't fail on an existing asset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…get triple

- Bare `railway` (no subcommand) now skips try_apply_staged() and the
  background updater, matching the read-only behavior of --help/--version.
  Previously a first-time user typing `railway` to explore would trigger
  update side effects before seeing help.

- detect_target_triple() now uses the compile-time BUILD_TARGET from
  build.rs instead of runtime OS/arch guessing. This ensures the
  self-updater fetches the correct ABI variant (e.g. a binary built for
  x86_64-unknown-linux-gnu will not be replaced with a musl build).
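
The compile-time target mechanism from the second item can be sketched as below. Cargo sets `TARGET` for build scripts, and `cargo:rustc-env` exposes a value to the crate at compile time. The build script is shown as a comment so the snippet stays a single file, and `option_env!` with an `"unknown"` fallback is an assumption made so it also compiles without that script.

```rust
// build.rs (sketch): forward Cargo's TARGET to the crate as BUILD_TARGET.
//
// fn main() {
//     let target = std::env::var("TARGET").expect("Cargo sets TARGET for build scripts");
//     println!("cargo:rustc-env=BUILD_TARGET={target}");
// }

/// Returns the triple this binary was compiled for, so the updater can
/// request the same ABI variant (gnu stays gnu, musl stays musl).
fn detect_target_triple() -> &'static str {
    // Evaluated at compile time; no runtime OS/arch guessing involved.
    option_env!("BUILD_TARGET").unwrap_or("unknown")
}
```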

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… format

- Add "help" to the read-only invocation guard so `railway help` skips
  try_apply_staged() and the background updater, matching the behavior
  of --help and bare `railway`.

- i686-pc-windows-gnu is cross-compiled on Linux and only ships as
  .tar.gz (no .zip). The asset name logic now accounts for this, and
  extraction derives the format from the asset name rather than
  re-checking the target.
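
The asset-name handling from the second item can be sketched as follows. The `railway-v{version}-{target}` naming pattern and the rule that other Windows targets ship as `.zip` are assumptions made for illustration; only the i686-pc-windows-gnu exception comes from the commit message.

```rust
#[derive(Debug, PartialEq)]
enum ArchiveFormat {
    TarGz,
    Zip,
}

/// Builds the release asset name for a target (naming pattern is hypothetical).
fn asset_name(version: &str, target: &str) -> String {
    // i686-pc-windows-gnu is cross-compiled on Linux and ships only as .tar.gz.
    let is_zip = target.contains("windows") && target != "i686-pc-windows-gnu";
    let ext = if is_zip { "zip" } else { "tar.gz" };
    format!("railway-v{version}-{target}.{ext}")
}

/// Derives the extraction format from the asset name, not from the target,
/// so the two decisions cannot disagree.
fn format_of(asset: &str) -> Option<ArchiveFormat> {
    if asset.ends_with(".tar.gz") {
        Some(ArchiveFormat::TarGz)
    } else if asset.ends_with(".zip") {
        Some(ArchiveFormat::Zip)
    } else {
        None
    }
}
```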

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, eager download spawn

- clear_latest() now resets last_update_check so that same-day hotfixes
  are discovered after the user catches up to a cached version.

- Shell installs on unsupported self-update platforms (e.g. FreeBSD) now
  get a dedicated match arm showing the reinstall command instead of
  falling through to the vague catch-all message.

- spawn_update_task now starts the background download from the cached
  version before the API call, so the 2s exit timeout cannot strand the
  spawn on slow networks. The API check still runs to refresh the cache.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The eager download spawn could race with a newer version discovered by
the API: the cached-version child holds the update lock, so the
newer-version child exits immediately, and try_apply_staged would apply
the stale cached release on the next run.

Fix: only eagerly spawn when the same-day gate is armed (last_update_check
is today). In that case check_update(false) returns Ok(None) instantly
without a network request, so the API cannot discover a newer version that
would conflict. When the gate is NOT armed, the API call goes over the
network and may return a newer release — defer the spawn to after the
check completes.
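
The spawn decision described above reduces to a small rule, sketched here with hypothetical names (`SpawnTiming`, `spawn_timing`); the real logic lives in `spawn_update_task`.

```rust
#[derive(Debug, PartialEq)]
enum SpawnTiming {
    /// Start the download from the cached version right away.
    Eager,
    /// Wait for the version check, then spawn for whatever it reports.
    Deferred,
}

fn spawn_timing(gate_armed: bool, cached_version: Option<&str>) -> SpawnTiming {
    match (gate_armed, cached_version) {
        // Gate armed: the check makes no network request, so it cannot
        // discover a release newer than the cached one.
        (true, Some(_)) => SpawnTiming::Eager,
        // Gate not armed (or nothing cached): the API may return a newer
        // release, so an eager child could stage a stale version.
        _ => SpawnTiming::Deferred,
    }
}
```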

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…log file

spawn_detached() creates the log file without ensuring the parent
directory exists. On a fresh install where ~/.railway doesn't exist yet,
this fails silently (the spawn result is ignored), preventing the
background download from starting.
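
A minimal sketch of the fix, assuming a hypothetical `open_log` helper: create the parent directory before opening the log file, and return the error so the caller can decide whether to ignore it.

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::path::Path;

/// Opens (or creates) a log file, creating missing parent directories first.
fn open_log(path: &Path) -> io::Result<File> {
    if let Some(parent) = path.parent() {
        // No-op when the directory already exists.
        fs::create_dir_all(parent)?;
    }
    OpenOptions::new().create(true).append(true).open(path)
}
```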

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ne, add PID TTL to disable

- Version discovery and the "New version available" banner now run
  regardless of auto-update preference. Disabling auto-update stops
  automatic installation, not release awareness.

- `railway upgrade` now falls back to an already-staged update when the
  network check fails, so offline users can apply a previously
  downloaded binary.

- `autoupdate disable` now applies the 10-minute PID file TTL before
  trusting the stored PID, consistent with upgrade and
  spawn_package_manager_update. Prevents blocking 30s on a recycled PID.
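
The PID TTL rule from the last item can be sketched as below. `PID_FILE_TTL` and `should_trust_pid` are hypothetical names; the liveness flag stands in for a platform-specific process check.

```rust
use std::time::Duration;

/// 10-minute TTL for PID files, as stated in the commit message.
const PID_FILE_TTL: Duration = Duration::from_secs(600);

/// A stored PID is trusted only if the file is recent AND the process is alive.
/// An old file with a live PID is treated as a recycled PID and ignored.
fn should_trust_pid(file_age: Duration, process_alive: bool) -> bool {
    file_age < PID_FILE_TTL && process_alive
}
```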

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The fallback path in self_update_interactive now applies the same guards
as try_apply_staged: rejects stale entries, wrong-platform binaries,
versions <= current, and versions the user rolled back from. Prevents
offline `railway upgrade` from downgrading or re-applying a skipped
version.
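
The four guards listed above can be sketched as a single validation function. All names here are hypothetical, the maximum age is a caller-supplied assumption (the commit does not state the staleness threshold), and version parsing handles only plain `MAJOR.MINOR.PATCH` strings.

```rust
#[derive(Debug, PartialEq)]
enum Reject {
    Stale,
    WrongPlatform,
    NotNewer,
    Skipped,
    BadVersion,
}

struct Staged<'a> {
    version: &'a str,
    target: &'a str,
    age_secs: u64,
}

/// Parses "1.2.3" or "v1.2.3" into a tuple that compares in version order.
fn parse(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.trim_start_matches('v').split('.').map(|p| p.parse::<u64>().ok());
    let out = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() { None } else { Some(out) }
}

fn validate(
    staged: &Staged<'_>,
    current: &str,
    this_target: &str,
    skipped: Option<&str>,
    max_age_secs: u64,
) -> Result<(), Reject> {
    if staged.age_secs > max_age_secs {
        return Err(Reject::Stale);
    }
    if staged.target != this_target {
        return Err(Reject::WrongPlatform);
    }
    let (s, c) = match (parse(staged.version), parse(current)) {
        (Some(s), Some(c)) => (s, c),
        _ => return Err(Reject::BadVersion),
    };
    // Equal or older versions would be a no-op or a downgrade.
    if s <= c {
        return Err(Reject::NotNewer);
    }
    // Never re-apply a version the user rolled back from.
    if skipped == Some(staged.version) {
        return Err(Reject::Skipped);
    }
    Ok(())
}
```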

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Switch version check gate from calendar-day UTC to fixed 12h window
- Remove checksums.txt machinery (code, tests, workflow job) — co-located
  checksums don't add security; TLS handles integrity in transit
- Reduce exit-delay timeout from 2s to 1s for version check
- Simplify `autoupdate disable` — set flag and clean staged, no PID polling
- Run version check in non-TTY to keep cache fresh for script-heavy users
- Suppress update banner when disabled via env var or CI, keep for preference
- Extract `validate_staged()` to deduplicate safety checks
- Enrich `autoupdate status` with pipeline state (latest version, staged,
  last check time, in-flight PID)
- Add `autoupdate skip` subcommand for package-manager installs
- Split download timeout: 30s background, 120s interactive
- Replace count-based package-manager retry (5 attempts) with time gate (1/hr)
- Add telemetry event for silent auto-update apply
- Clean up orphaned generate-checksums workflow job
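
The fixed 12-hour gate from the first item in this list can be sketched as follows. `CHECK_WINDOW` and `gate_armed` are hypothetical names; the real code stores `last_update_check` in the version cache.

```rust
use std::time::{Duration, SystemTime};

const CHECK_WINDOW: Duration = Duration::from_secs(12 * 60 * 60);

/// Returns true when a check ran within the last 12 hours, meaning the
/// next network check should be skipped.
fn gate_armed(last_check: Option<SystemTime>, now: SystemTime) -> bool {
    match last_check {
        // duration_since fails if last_check is in the future (clock moved
        // back); treat that as "not armed" so a fresh check runs.
        Some(t) => now
            .duration_since(t)
            .map(|elapsed| elapsed < CHECK_WINDOW)
            .unwrap_or(false),
        None => false,
    }
}
```

Unlike a calendar-day gate, a fixed window does not reopen at midnight UTC, so two checks a few minutes apart on either side of the date boundary still count as one.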

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract `is_package_update_running()` helper to replace 3 identical
  PID-file-parse + age-check + liveness-check blocks
- Extract `PID_STALENESS_TTL_SECS` constant (was magic number 600)
- Remove unnecessary clone in `autoupdate skip`
- Remove TOCTOU `.exists()` check before `hard_link` in backup
- Remove unnecessary comment
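
The TOCTOU item above (dropping the `.exists()` check before `hard_link`) can be illustrated with a minimal sketch. `backup_binary` is a hypothetical name, and treating an existing backup as success is an assumption about the intended behavior.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::Path;

/// Creates a hard-link backup of the current binary.
/// Instead of checking `dest.exists()` first (another process could create
/// or delete the file between the check and the link), attempt the link and
/// interpret the error.
fn backup_binary(current: &Path, dest: &Path) -> io::Result<()> {
    match fs::hard_link(current, dest) {
        Ok(()) => Ok(()),
        // A backup is already present; nothing more to do.
        Err(e) if e.kind() == ErrorKind::AlreadyExists => Ok(()),
        Err(e) => Err(e),
    }
}
```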

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Move validate_staged() inside the exclusive lock in try_apply_staged()
  to close TOCTOU window where another process could delete the staged
  binary between validation and apply
- Use std::mem::forget on detached Child handles to avoid leaking OS
  resources on Windows (both package-manager and background-download paths)
- Use rename_replacing() in replace_binary() on Unix for consistency
- Replace silent `let _ = write()` in cache mutation methods with
  try_write() that logs warnings on failure
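
The last item, replacing a silent `let _ = write()` with a logging helper, might look like the sketch below. The signature and the `bool` return value are assumptions; the PR's `try_write()` may differ.

```rust
use std::fs;
use std::path::Path;

/// Writes `contents` to `path`, logging a warning instead of silently
/// discarding the error. Returns whether the write succeeded.
fn try_write(path: &Path, contents: &str) -> bool {
    match fs::write(path, contents) {
        Ok(()) => true,
        Err(e) => {
            eprintln!("warning: failed to write {}: {e}", path.display());
            false
        }
    }
}
```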

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves conflicts:
- src/telemetry.rs: keep auto-update additions, drop Notices struct
  and show_notice_if_needed (moved to install script in #832)
- src/main.rs: keep auto-update startup logic, drop show_notice call

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Labels

release/minor Author minor release
