Prevent duplicate edge workers unless existing worker is offline or unknown #58586
…nknown Add validation to the edge worker registration endpoint to prevent launching multiple workers with the same hostname. If a worker with the same name already exists in an active state (running, idle, starting, terminating, or maintenance), the registration will fail with HTTP 409 CONFLICT. Workers can only reuse a name if the existing worker is in OFFLINE, UNKNOWN, or OFFLINE_MAINTENANCE state.
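The rule described above can be sketched as a small state check. This is a minimal illustration, not the provider's actual code: `WorkerState`, `REUSABLE_STATES`, `RegistrationConflict`, and `check_registration` are hypothetical names introduced here, and only the state names themselves come from the description.

```python
from enum import Enum
from http import HTTPStatus
from typing import Optional


class WorkerState(str, Enum):
    """Local stand-in for the worker states named in the description."""

    STARTING = "starting"
    RUNNING = "running"
    IDLE = "idle"
    TERMINATING = "terminating"
    MAINTENANCE = "maintenance"
    OFFLINE = "offline"
    UNKNOWN = "unknown"
    OFFLINE_MAINTENANCE = "offline maintenance"


# States in which an existing worker's name may be reused.
REUSABLE_STATES = frozenset(
    {WorkerState.OFFLINE, WorkerState.UNKNOWN, WorkerState.OFFLINE_MAINTENANCE}
)


class RegistrationConflict(Exception):
    """Hypothetical error that an endpoint would translate into HTTP 409."""

    status_code = HTTPStatus.CONFLICT


def check_registration(worker_name: str, existing_state: Optional[WorkerState]) -> None:
    """Raise if a worker with this name exists and is still active.

    existing_state is None when no worker with this name is known.
    """
    if existing_state is None or existing_state in REUSABLE_STATES:
        return
    raise RegistrationConflict(
        f"Worker {worker_name!r} already exists in state {existing_state.value!r}"
    )
```

Treating the reusable states as an explicit allow-list means any state added later is rejected by default, which errs on the side of preventing duplicates.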
jscheffl
left a comment
Thanks! This is a very good point, and worth fixing to prevent deployment errors caused by mixed-up hostnames.
One nit only:
When I start a second worker, the pure HTTP error is in the logs like:
root@bbd99465f97f:/opt/airflow# airflow edge worker --pid another
2025-11-23T09:31:39.223750Z [info ] Starting worker with API endpoint http://localhost:8080/edge_worker/v1/rpcapi [airflow.providers.edge3.cli.edge_command] loc=edge_command.py:80
____________ _____________
____ |__( )_________ __/__ /________ __
____ /| |_ /__ ___/_ /_ __ /_ __ \_ | /| / /
___ ___ | / _ / _ __/ _ / / /_/ /_ |/ |/ /
_/_/ |_/_/ /_/ /_/ /_/ \____/____/|__/
____ __ _ __ __
/ __/__/ /__ ____ | | /| / /__ ____/ /_____ ____
/ _// _ / _ `/ -_) | |/ |/ / _ \/ __/ '_/ -_) __/
/___/\_,_/\_, /\__/ |__/|__/\___/_/ /_/\_\\__/_/
/___/
409 Client Error: Conflict for url: http://localhost:8080/edge_worker/v1/worker/bbd99465f97f
Can you explicitly handle the exception on the client and generate a better log message in the console? For example, as we do for version conflicts in providers/edge3/src/airflow/providers/edge3/cli/api_client.py:133
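One way the requested client-side handling could look is sketched below. It is an assumption-laden illustration, not the code in `api_client.py`: `register_worker` and the `send_register` callable are hypothetical, standing in for whatever performs the HTTP call and returns its status code.

```python
from http import HTTPStatus
from typing import Callable


def register_worker(send_register: Callable[[str], int], worker_name: str) -> None:
    """Register a worker and turn a 409 into a readable console message.

    send_register is a hypothetical callable that performs the request
    and returns the HTTP status code.
    """
    status = send_register(worker_name)
    if status == HTTPStatus.CONFLICT:
        # SystemExit with a string prints it to stderr and exits with code 1.
        raise SystemExit(
            f"A worker named {worker_name!r} is already registered and active. "
            "Stop the existing worker or start this one under a different name."
        )
    if status >= 400:
        raise RuntimeError(f"Worker registration failed with HTTP {status}")
```

Exiting with a plain message keeps the console free of the raw `409 Client Error` text while other failures still surface as exceptions.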
Just realized: this PR will now make things a bit less convenient, though, when using breeze and then …
Does this look good to you? I made this update.
That's true, it looks like the state does not get cleared. You would need to wait a few minutes for the API server to determine that the worker heartbeat is missing and change the worker to the unknown state; then you will be able to launch. Or just launch with … I don't think this is a big deal, though.
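The heartbeat behaviour described in this comment can be pictured roughly as follows. This is only a sketch: `effective_state` is a hypothetical helper, and the five-minute `HEARTBEAT_TIMEOUT` is an assumed value, since the thread only says "a few minutes".

```python
from datetime import datetime, timedelta, timezone

# Assumed timeout; the actual value used by the API server is not stated here.
HEARTBEAT_TIMEOUT = timedelta(minutes=5)

# States that already allow a worker name to be reused.
INACTIVE_STATES = frozenset({"offline", "unknown", "offline maintenance"})


def effective_state(
    state: str,
    last_heartbeat: datetime,
    now: datetime,
    timeout: timedelta = HEARTBEAT_TIMEOUT,
) -> str:
    """Return "unknown" for an active worker whose heartbeat is overdue."""
    if state in INACTIVE_STATES:
        return state
    if now - last_heartbeat > timeout:
        return "unknown"
    return state


now = datetime(2025, 11, 23, 9, 40, tzinfo=timezone.utc)
stale = now - timedelta(minutes=10)
fresh = now - timedelta(seconds=30)
```

With these sample timestamps, a worker last seen at `stale` would be treated as unknown and its name could be reused, while one last seen at `fresh` would still block a second registration.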
…nkown (apache#58586) * Prevent duplicate edge workers unless existing worker is offline or unknown Add validation to the edge worker registration endpoint to prevent launching multiple workers with the same hostname. If a worker with the same name already exists in an active state (running, idle, starting, terminating, or maintenance), the registration will fail with HTTP 409 CONFLICT. Workers can only reuse a name if the existing worker is in OFFLINE, UNKNOWN, or OFFLINE_MAINTENANCE state. * Jens Suggestions