From c6f1e4f1292a45a411091082051f10a8224de70a Mon Sep 17 00:00:00 2001 From: Kristin Martin Date: Mon, 15 Jun 2026 21:32:23 +0000 Subject: [PATCH 1/6] Add long-running tasks and machine lifecycle blueprint New guide covering the interaction between auto_stop_machines and long-running in-process work: how the proxy decides to stop a machine, why background tasks are invisible to that decision, and two patterns (disable autostop with an in-app drain; split web and worker into separate process groups) to keep work from getting killed. Also covers kill_signal/kill_timeout semantics under autostop and other stop pathways. Adds the entry to the Background Jobs & Automation section of the blueprints index (with a NEW!! tag) and to the corresponding sidebar nav group. --- blueprints/index.html.md | 1 + blueprints/long-running-tasks.html.md | 238 ++++++++++++++++++++++++++ partials/_guides_nav.html.erb | 3 +- 3 files changed, 241 insertions(+), 1 deletion(-) create mode 100644 blueprints/long-running-tasks.html.md diff --git a/blueprints/index.html.md b/blueprints/index.html.md index db15174a35..987668d875 100644 --- a/blueprints/index.html.md +++ b/blueprints/index.html.md @@ -66,6 +66,7 @@ How to run periodic tasks, long-running jobs, infrastructure automation, and the - [Deferring long-running tasks to a distributed work queue](/docs/blueprints/work-queues/) - [Task scheduling guide with Cron Manager and friends](/docs/blueprints/task-scheduling/) - [Crontab with Supercronic](/docs/blueprints/supercronic/) +- [Long-running tasks and machine lifecycle](/docs/blueprints/long-running-tasks/) NEW!! 
diff --git a/blueprints/long-running-tasks.html.md b/blueprints/long-running-tasks.html.md new file mode 100644 index 0000000000..cf35d8f5f7 --- /dev/null +++ b/blueprints/long-running-tasks.html.md @@ -0,0 +1,238 @@ +--- +title: Long-running tasks and machine lifecycle +layout: docs +nav: guides +author: kcmartin +date: 2026-06-15 +--- + +This page covers what happens when your machine is busy doing work, but Fly thinks it's idle. Specifically: how `auto_stop_machines` decides what to stop, why a background task is invisible to that decision, and the two patterns that keep work from getting killed. + +If you're picking a queue technology or a cron runner, start with the [work queues](/docs/blueprints/work-queues/) or [task scheduling](/docs/blueprints/task-scheduling/) blueprints instead. This page is about the machine behavior underneath them. + +## The problem + +A typical setup: a FastAPI endpoint accepts a request, spawns an async task to generate a report, returns `202 Accepted`, and closes the connection. The proxy sees no active connections. A few minutes later, it stops the machine. The report dies half-finished. + +This isn't a bug. It's `auto_stop_machines` working exactly as documented. The proxy looks at inbound traffic. It does not look inside the container. From the proxy's point of view, a machine running a 20-minute job and a machine doing nothing look identical. + +There are two ways to fix it. Pick one based on whether your work is bursty or steady, request-triggered or queue-driven. + +## How autostop actually decides + +The Fly proxy evaluates machines every few minutes. The exact rule depends on how many machines you have: + +**Multiple machines.** The proxy uses your `soft_limit` concurrency setting to compute excess capacity: + +``` +excess = num_machines − (num_machines_over_soft_limit + 1) +``` + +If `excess ≥ 1`, the proxy stops one machine. The `+ 1` keeps a buffer of one idle machine for incoming traffic. 
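To make the multi-machine rule concrete, here is a minimal sketch of the arithmetic in Python. The helper names and the sample numbers are illustrative assumptions, not part of the proxy or any Fly API; only the formula itself comes from the guide text.

```python
def excess_machines(num_machines: int, num_over_soft_limit: int) -> int:
    """Excess capacity per the guide's formula (hypothetical helper, not a Fly API)."""
    return num_machines - (num_over_soft_limit + 1)


def proxy_would_stop_one(num_machines: int, num_over_soft_limit: int) -> bool:
    """True when the multi-machine rule says one machine gets stopped."""
    return excess_machines(num_machines, num_over_soft_limit) >= 1


# Illustrative numbers only.
print(excess_machines(4, 1))       # 4 - (1 + 1) = 2
print(proxy_would_stop_one(4, 1))  # excess is 2, so one machine is stopped
print(proxy_would_stop_one(2, 1))  # excess is 0, so nothing is stopped
```

Note that nothing in these inputs describes background work: a machine running a long job with no inbound traffic is simply not counted as over its soft limit.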
+ +**Single machine.** Simpler: if load is zero, the proxy stops the machine. + +In both cases, "load" means traffic the proxy can see. Background work running inside the machine, whether that's async workers, cron-style loops, or anything else not driven by an inbound request, doesn't count. There's also no way for your application to tell the proxy, "I'm busy, leave me alone." + +This is the central fact for the rest of the page. Everything below is a way to work around it. + +### Stop vs. suspend + +`auto_stop_machines` takes three values: `"off"`, `"stop"`, and `"suspend"`. + +- **stop** shuts the machine down cold. A restart takes seconds (about 2s for a Rails app, less for a small binary). +- **suspend** dumps the entire VM state (memory, CPU, network) to disk. Resume takes a few hundred milliseconds. + +**Stop** is the simpler default: the machine shuts down when it's idle and cold-starts when it's needed again. For most apps, that's the right tradeoff. + +**Suspend** is the right choice when cold start is too painful (slow framework boot, heavy initialization, large in-memory state) and you'd still like to idle when inactive. The tradeoff: suspend is rougher on the underlying platform and has more constraints: + +- Machines must have 4 GB of RAM or less. +- Swap and schedules are not supported. +- Machines updated before June 20, 2024 cannot be suspended. +- Suspend is not durable. Fly does not guarantee that a suspended machine will resume. Host migration, maintenance, or capacity pressure can turn what would have been a resume into a cold start. Treat suspend as a faster version of stop, not a guaranteed warm restart. +- A few log lines may be lost across a suspend/resume cycle, and the system clock can take a second or two to re-synchronize after resume. See "[Suspend vs. Stop](/docs/getting-started/troubleshooting/#suspend-vs-stop)" for details on clock skew. + +Billing is the same for both: you pay for stopped machines like you pay for suspended ones. 
+ +For the rest of this page, "stop" and "suspend" are interchangeable. The patterns work the same way for both. + +## Pattern A: disable autostop, manage shutdown in the app + +**Use this when** your app has long-lived workers, in-process job runners, or any background work that the application itself can track. + +Turn autostop off in `fly.toml`: + +```toml +[http_service] + internal_port = 8080 + auto_stop_machines = "off" + auto_start_machines = true +``` + +With autostop off, the proxy never stops your machines for being idle. They stay up until something else stops them (a deploy, `fly machine stop`, or a host migration). You're paying for every machine 24/7, in every region you've scaled into, so make sure that's the right tradeoff before adopting this pattern. + +When deploys, manual stops, or host migrations _do_ stop the machine, your app gets `SIGTERM` and has `kill_timeout` seconds to clean up. The default of 5 seconds is almost certainly too short. Bump it. These are top-level keys in `fly.toml`: + +```toml +kill_signal = "SIGTERM" +kill_timeout = "30s" +``` + +The maximum is 300 seconds. `kill_timeout` is a drain window, not a "let the job finish" knob. If your jobs run longer than 5 minutes, either checkpoint them so they can resume, or stop accepting new work and let in-flight jobs drain before the timeout. Don't wait for everything to finish. 
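The "checkpoint them so they can resume" option can be sketched as follows. This is a minimal illustration under stated assumptions, not the guide's prescribed method: the function names, the JSON checkpoint format, and the use of a local file are all hypothetical, and on a real machine the checkpoint would need to live somewhere that survives a restart (a volume or an external store).

```python
import json
import os
import tempfile


def load_checkpoint(path: str) -> int:
    """Return the index of the next unprocessed item, or 0 if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path: str, next_index: int) -> None:
    """Write the checkpoint atomically so a kill mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)  # atomic rename over the old checkpoint


def run_job(items, path, handle, should_stop=lambda: False):
    """Process items starting at the last checkpoint; return True if the job finished."""
    start = load_checkpoint(path)
    for i in range(start, len(items)):
        if should_stop():  # hypothetical hook, e.g. a flag set on SIGTERM
            return False   # everything before item i is already checkpointed
        handle(items[i])
        save_checkpoint(path, i + 1)
    return True
```

A job killed between `handle` and `save_checkpoint` repeats that one item on resume, so this sketch gives at-least-once processing and assumes `handle` is safe to run twice.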
+ +A minimal shutdown pattern in Node: + +```javascript +let activeJobs = 0 +let shuttingDown = false + +async function runJob(payload) { + if (shuttingDown) throw new Error("shutting down") + activeJobs++ + try { + await doWork(payload) + } finally { + activeJobs-- + } +} + +process.on("SIGTERM", () => { + shuttingDown = true + const start = Date.now() + const deadline = 25_000 // 5s under kill_timeout + const tick = setInterval(() => { + if (activeJobs === 0 || Date.now() - start > deadline) { + clearInterval(tick) + process.exit(0) + } + }, 200) +}) +``` + +In Python with asyncio: + +```python +import asyncio, signal + +active = 0 +shutting_down = asyncio.Event() + +async def run_job(payload): + global active + if shutting_down.is_set(): + raise RuntimeError("shutting down") + active += 1 + try: + await do_work(payload) + finally: + active -= 1 + +async def shutdown(): + shutting_down.set() + try: + await asyncio.wait_for(_drain(), timeout=25) + except asyncio.TimeoutError: + pass + +async def _drain(): + while active > 0: + await asyncio.sleep(0.2) + +loop = asyncio.get_event_loop() +loop.add_signal_handler(signal.SIGTERM, lambda: asyncio.create_task(shutdown())) +``` + +Both patterns refuse new work as soon as `SIGTERM` arrives, then wait for in-flight jobs up to a deadline a few seconds shorter than `kill_timeout`. The safety margin matters: if you wait the full 30s, Fly's `SIGKILL` arrives before your `exit(0)` runs. + +## Pattern B: split web and worker into separate process groups + +**Use this when** web traffic is bursty (good candidate for autostop) but background work is steady or long-running (bad candidate for autostop). 
+ +Split with `processes` in `fly.toml`: + +```toml +[processes] + web = "bundle exec puma" + worker = "bundle exec sidekiq" + +[http_service] + internal_port = 8080 + auto_stop_machines = "suspend" + auto_start_machines = true + processes = ["web"] +``` + +The worker process group has no `[http_service]` attached, so the proxy never touches its machines. Autostop applies only to the web tier. + +Scale them independently: + +```cmd +fly scale count web=2 worker=1 +``` + +This is the pattern Sidekiq, Celery, and BullMQ workers actually want. The web tier scales to zero off-hours; the worker tier runs whenever there's work in the queue. + +Tradeoff: you're paying for at least one worker machine continuously. If your work is batchy enough that on-demand workers make sense, use the [work queues blueprint's on-demand worker pattern](/docs/blueprints/work-queues/) instead, as that spins up a fresh machine per job and lets it stop when done. + +## Graceful shutdown: what Fly sends + +When something stops your machine, whether that's `auto_stop_machines`, `fly machine stop`, a deploy, or a host migration, Fly sends `kill_signal` (default: `SIGTERM`) to PID 1. After waiting `kill_timeout` seconds, it sends `SIGKILL`. + +The defaults are conservative: + +| Option | Default | Max | Notes | +| --- | --- | --- | --- | +| `kill_signal` | `SIGTERM` | — | Also accepts `SIGQUIT`, `SIGUSR1`, `SIGUSR2`, `SIGKILL`, `SIGSTOP` | +| `kill_timeout` | `5s` | `300s` | The drain window before `SIGKILL` | + +Five seconds is enough for an HTTP server to close keepalives. It is not enough for a long-running job to finish. If you have any background work, set `kill_timeout` to a value that allows your typical job to complete. You'll need to determine this on your side. Both keys are top-level in `fly.toml`: + +```toml +kill_signal = "SIGTERM" +kill_timeout = "30s" +``` + +PID 1 receives the signal. In a Docker container running your app directly, that's your process. 
In a container running a shell wrapper (`CMD ["sh", "-c", "..."]`), the shell is PID 1 and `SIGTERM` doesn't propagate. Use the exec form: `CMD ["myapp"]`, or `exec myapp` inside the wrapper. + +`kill_timeout` is not a "finish your work" timer. It's a drain window. Inside it, you should: + +1. Stop accepting new work +1. Let in-flight work finish, or checkpoint it +1. Exit cleanly + +If your jobs take longer than 5 minutes, you can't drain them inside `kill_timeout`. You need either Pattern A with checkpoint/resume, or Pattern B with a worker tier that's never autostopped. + +Run `fly config validate --strict` before relying on any of this. By default, `fly config validate` silently accepts unrecognized sections and keys. A typo or outdated section name can pass validation and then do nothing at runtime. Strict mode catches those errors. + +## Picking a pattern + +| Situation | Pattern | +| --- | --- | +| Jobs are short (< 30 seconds) | Increase `kill_timeout`; everything else can stay as default | +| Long-running jobs, steady web traffic | A — disable autostop, in-app drain | +| Long-running jobs, bursty web traffic | B — split web/worker processes | +| Cron-style scheduled jobs | See [task scheduling](/docs/blueprints/task-scheduling/) | +| Queue-driven workers | B — combine with [work queues](/docs/blueprints/work-queues/) | +| One-off jobs (fire and forget per request) | On-demand workers — see [work queues](/docs/blueprints/work-queues/) | +| Can't restructure right now | A — accept the continuous machine cost | + +## Common problems + +**My `SIGTERM` handler runs but the job still gets killed.** `kill_timeout` is shorter than your handler needs. Bump it (max 300s) and set your handler's deadline a few seconds under that. + +**The machine stops mid-job even with `auto_stop_machines = "off"`.** Autostop is only one of several things that stop machines. Deploys, `fly machine stop`, scale-down, and host migrations all do too. 
Check `fly logs` for the `instance refused` or `host migration` events. Pattern A still applies. The only difference is that autostop is no longer the trigger. + +**A self-ping doesn't keep my machine alive.** The [autostop reference](/docs/reference/fly-proxy-autostop-autostart/) defines idle as "a load of 0" but doesn't specify what counts as load. Empirically, sending a successful HTTP request every 60 seconds from a machine to its own `.fly.dev` hostname does not prevent autostop. The proxy still stops the machine after 5 to 10 minutes. To keep a machine running through idle periods, turn off autostop (Pattern A) or move the work into a process group without `[http_service]` (Pattern B). + +**Worker machines won't stop when I deploy.** A process group with no `[http_service]`, such as the worker tier in Pattern B, is invisible to the proxy. Deploys still update those machines because `flyctl` talks to them directly, but the proxy does not manage their lifecycle and cannot autostop them. To stop them gracefully, send a signal with `fly machine stop` or let `fly deploy` replace them during a deployment. + +**Suspend resumes are slower than the docs say.** Suspend isn't durable. If Fly can't restore the snapshot (host migration, capacity pressure), you get a cold start. There's no flag to tell you which happened; check the first-request latency. If cold starts matter, run with `min_machines_running = 1`. 
+ +## Where to go next + +- [Work queues blueprint](/docs/blueprints/work-queues/): Picking a queue technology +- [Task scheduling blueprint](/docs/blueprints/task-scheduling/): cron-style triggers and scheduled machines +- [Autostart and autostop reference](/docs/reference/fly-proxy-autostop-autostart/): The proxy's full decision logic +- [Configuration reference](/docs/reference/configuration/): `kill_signal`, `kill_timeout`, `processes`, `auto_stop_machines` +- [Machine states](/docs/machines/machine-states/): what `stopping`, `stopped`, and `suspended` actually mean diff --git a/partials/_guides_nav.html.erb b/partials/_guides_nav.html.erb index 5837c74f67..0391ceb27b 100644 --- a/partials/_guides_nav.html.erb +++ b/partials/_guides_nav.html.erb @@ -69,7 +69,8 @@ { text: "Building Infrastructure Automation without Terraform", path: "/docs/blueprints/infra-automation-without-terraform/" }, { text: "Deferring long-running tasks to a distributed work queue", path: "/docs/blueprints/work-queues/" }, { text: "Task scheduling guide with Cron Manager and friends", path: "/docs/blueprints/task-scheduling/" }, - { text: "Crontab with Supercronic", path: "/docs/blueprints/supercronic/" } + { text: "Crontab with Supercronic", path: "/docs/blueprints/supercronic/" }, + { text: "Long-running tasks and machine lifecycle", path: "/docs/blueprints/long-running-tasks/" } ] } ] From dbf86a90d9e5b7b4c7e1ec23db66d2cb8ca782e2 Mon Sep 17 00:00:00 2001 From: Kristin Martin Date: Mon, 15 Jun 2026 21:37:28 +0000 Subject: [PATCH 2/6] Sweep em dashes in long-running-tasks blueprint tables Replace em dash separators with colons in the Picking a pattern table, and replace the em dash placeholder in the kill_signal Max column with n/a. 
--- blueprints/long-running-tasks.html.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/blueprints/long-running-tasks.html.md b/blueprints/long-running-tasks.html.md index cf35d8f5f7..8ab24ecfb9 100644 --- a/blueprints/long-running-tasks.html.md +++ b/blueprints/long-running-tasks.html.md @@ -183,7 +183,7 @@ The defaults are conservative: | Option | Default | Max | Notes | | --- | --- | --- | --- | -| `kill_signal` | `SIGTERM` | — | Also accepts `SIGQUIT`, `SIGUSR1`, `SIGUSR2`, `SIGKILL`, `SIGSTOP` | +| `kill_signal` | `SIGTERM` | n/a | Also accepts `SIGQUIT`, `SIGUSR1`, `SIGUSR2`, `SIGKILL`, `SIGSTOP` | | `kill_timeout` | `5s` | `300s` | The drain window before `SIGKILL` | Five seconds is enough for an HTTP server to close keepalives. It is not enough for a long-running job to finish. If you have any background work, set `kill_timeout` to a value that allows your typical job to complete. You'll need to determine this on your side. Both keys are top-level in `fly.toml`: @@ -210,12 +210,12 @@ Run `fly config validate --strict` before relying on any of this. 
By default, `f | Situation | Pattern | | --- | --- | | Jobs are short (< 30 seconds) | Increase `kill_timeout`; everything else can stay as default | -| Long-running jobs, steady web traffic | A — disable autostop, in-app drain | -| Long-running jobs, bursty web traffic | B — split web/worker processes | +| Long-running jobs, steady web traffic | A: disable autostop, in-app drain | +| Long-running jobs, bursty web traffic | B: split web/worker processes | | Cron-style scheduled jobs | See [task scheduling](/docs/blueprints/task-scheduling/) | -| Queue-driven workers | B — combine with [work queues](/docs/blueprints/work-queues/) | -| One-off jobs (fire and forget per request) | On-demand workers — see [work queues](/docs/blueprints/work-queues/) | -| Can't restructure right now | A — accept the continuous machine cost | +| Queue-driven workers | B: combine with [work queues](/docs/blueprints/work-queues/) | +| One-off jobs (fire and forget per request) | On-demand workers; see [work queues](/docs/blueprints/work-queues/) | +| Can't restructure right now | A: accept the continuous machine cost | ## Common problems From 19025b2e5d2a8095cdcde6aa97db481777a5ef77 Mon Sep 17 00:00:00 2001 From: Kristin Martin Date: Mon, 15 Jun 2026 21:45:58 +0000 Subject: [PATCH 3/6] Move long-running tasks blueprint to top of its section on overview page --- blueprints/index.html.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blueprints/index.html.md b/blueprints/index.html.md index 987668d875..9b6e8fb5aa 100644 --- a/blueprints/index.html.md +++ b/blueprints/index.html.md @@ -62,11 +62,11 @@ Make it fast. Make it reliable. Monitor what's happening. How to run periodic tasks, long-running jobs, infrastructure automation, and the things that run when you’re asleep. +- [Long-running tasks and machine lifecycle](/docs/blueprints/long-running-tasks/) NEW!! 
- [Building Infrastructure Automation without Terraform](/docs/blueprints/infra-automation-without-terraform/) - [Deferring long-running tasks to a distributed work queue](/docs/blueprints/work-queues/) - [Task scheduling guide with Cron Manager and friends](/docs/blueprints/task-scheduling/) - [Crontab with Supercronic](/docs/blueprints/supercronic/) -- [Long-running tasks and machine lifecycle](/docs/blueprints/long-running-tasks/) NEW!! From b79a2bb918461db867074e97cca9707d35d4f888 Mon Sep 17 00:00:00 2001 From: Kristin Martin Date: Mon, 15 Jun 2026 21:47:09 +0000 Subject: [PATCH 4/6] Mirror sidebar order: long-running tasks at top of section --- partials/_guides_nav.html.erb | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/partials/_guides_nav.html.erb b/partials/_guides_nav.html.erb index 0391ceb27b..429db9d396 100644 --- a/partials/_guides_nav.html.erb +++ b/partials/_guides_nav.html.erb @@ -66,11 +66,11 @@ title: "Background Jobs & Automation", open: true, links: [ + { text: "Long-running tasks and machine lifecycle", path: "/docs/blueprints/long-running-tasks/" }, { text: "Building Infrastructure Automation without Terraform", path: "/docs/blueprints/infra-automation-without-terraform/" }, { text: "Deferring long-running tasks to a distributed work queue", path: "/docs/blueprints/work-queues/" }, { text: "Task scheduling guide with Cron Manager and friends", path: "/docs/blueprints/task-scheduling/" }, - { text: "Crontab with Supercronic", path: "/docs/blueprints/supercronic/" }, - { text: "Long-running tasks and machine lifecycle", path: "/docs/blueprints/long-running-tasks/" } + { text: "Crontab with Supercronic", path: "/docs/blueprints/supercronic/" } ] } ] From 55181179083f4d42c75ee52d8fdaeec5588f5f44 Mon Sep 17 00:00:00 2001 From: Kristin Martin Date: Mon, 15 Jun 2026 21:54:12 +0000 Subject: [PATCH 5/6] Use 'guide' instead of 'blueprint' in prose Replace prose references to 'blueprint(s)' with 'guide(s)' throughout the doc. 
Link paths under /docs/blueprints/ are unchanged. --- blueprints/long-running-tasks.html.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/blueprints/long-running-tasks.html.md b/blueprints/long-running-tasks.html.md index 8ab24ecfb9..f1500b3ed7 100644 --- a/blueprints/long-running-tasks.html.md +++ b/blueprints/long-running-tasks.html.md @@ -8,7 +8,7 @@ date: 2026-06-15 This page covers what happens when your machine is busy doing work, but Fly thinks it's idle. Specifically: how `auto_stop_machines` decides what to stop, why a background task is invisible to that decision, and the two patterns that keep work from getting killed. -If you're picking a queue technology or a cron runner, start with the [work queues](/docs/blueprints/work-queues/) or [task scheduling](/docs/blueprints/task-scheduling/) blueprints instead. This page is about the machine behavior underneath them. +If you're picking a queue technology or a cron runner, start with the [work queues](/docs/blueprints/work-queues/) or [task scheduling](/docs/blueprints/task-scheduling/) guides instead. This page is about the machine behavior underneath them. ## The problem @@ -173,7 +173,7 @@ fly scale count web=2 worker=1 This is the pattern Sidekiq, Celery, and BullMQ workers actually want. The web tier scales to zero off-hours; the worker tier runs whenever there's work in the queue. -Tradeoff: you're paying for at least one worker machine continuously. If your work is batchy enough that on-demand workers make sense, use the [work queues blueprint's on-demand worker pattern](/docs/blueprints/work-queues/) instead, as that spins up a fresh machine per job and lets it stop when done. +Tradeoff: you're paying for at least one worker machine continuously. If your work is batchy enough that on-demand workers make sense, use the [work queues guide's on-demand worker pattern](/docs/blueprints/work-queues/) instead, as that spins up a fresh machine per job and lets it stop when done. 
## Graceful shutdown: what Fly sends @@ -231,8 +231,8 @@ Run `fly config validate --strict` before relying on any of this. By default, `f ## Where to go next -- [Work queues blueprint](/docs/blueprints/work-queues/): Picking a queue technology -- [Task scheduling blueprint](/docs/blueprints/task-scheduling/): cron-style triggers and scheduled machines +- [Work queues guide](/docs/blueprints/work-queues/): Picking a queue technology +- [Task scheduling guide](/docs/blueprints/task-scheduling/): cron-style triggers and scheduled machines - [Autostart and autostop reference](/docs/reference/fly-proxy-autostop-autostart/): The proxy's full decision logic - [Configuration reference](/docs/reference/configuration/): `kill_signal`, `kill_timeout`, `processes`, `auto_stop_machines` - [Machine states](/docs/machines/machine-states/): what `stopping`, `stopped`, and `suspended` actually mean From d474af0af340d31ab40ae77b3ca3426bdf3f55e6 Mon Sep 17 00:00:00 2001 From: Kristin Martin Date: Tue, 16 Jun 2026 10:30:54 -0700 Subject: [PATCH 6/6] add link and change page to guide --- blueprints/long-running-tasks.html.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/blueprints/long-running-tasks.html.md b/blueprints/long-running-tasks.html.md index f1500b3ed7..35e969a024 100644 --- a/blueprints/long-running-tasks.html.md +++ b/blueprints/long-running-tasks.html.md @@ -22,7 +22,7 @@ There are two ways to fix it. Pick one based on whether your work is bursty or s The Fly proxy evaluates machines every few minutes. The exact rule depends on how many machines you have: -**Multiple machines.** The proxy uses your `soft_limit` concurrency setting to compute excess capacity: +**Multiple machines.** The proxy uses your `soft_limit` [concurrency setting](/docs/blueprints/setting-concurrency-limits/) to compute excess capacity: ``` excess = num_machines − (num_machines_over_soft_limit + 1) @@ -34,7 +34,7 @@ If `excess ≥ 1`, the proxy stops one machine. 
The `+ 1` keeps a buffer of one In both cases, "load" means traffic the proxy can see. Background work running inside the machine, whether that's async workers, cron-style loops, or anything else not driven by an inbound request, doesn't count. There's also no way for your application to tell the proxy, "I'm busy, leave me alone." -This is the central fact for the rest of the page. Everything below is a way to work around it. +This is the central fact for the rest of the guide. Everything below is a way to work around it. ### Stop vs. suspend @@ -55,7 +55,7 @@ This is the central fact for the rest of the page. Everything below is a way to Billing is the same for both: you pay for stopped machines like you pay for suspended ones. -For the rest of this page, "stop" and "suspend" are interchangeable. The patterns work the same way for both. +For the rest of this guide, "stop" and "suspend" are interchangeable. The patterns work the same way for both. ## Pattern A: disable autostop, manage shutdown in the app @@ -147,7 +147,7 @@ Both patterns refuse new work as soon as `SIGTERM` arrives, then wait for in-fli ## Pattern B: split web and worker into separate process groups -**Use this when** web traffic is bursty (good candidate for autostop) but background work is steady or long-running (bad candidate for autostop). +**Use this when** web traffic is bursty (a good candidate for autostop) but background work is steady or long-running (a bad candidate for autostop). Split with `processes` in `fly.toml`: