LCORE-493: auth e2e tests hardening#494
Conversation
WalkthroughAdds wait_for_container_health(container_name, max_attempts=3) to E2E test utilities and updates switch_config_and_restart to accept container_name and cleanup; after restart it now polls Docker health until "healthy" (with timeouts/retries and error handling) instead of a fixed sleep. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Tester
participant Switch as switch_config_and_restart
participant Docker as Docker Engine
participant Health as wait_for_container_health
Tester->>Switch: switch_config_and_restart(orig, repl, container_name, cleanup?)
Switch->>Docker: Copy replacement config & restart container
Switch->>Health: wait_for_container_health(container_name, max_attempts=3)
loop up to 3 attempts
Health->>Docker: docker inspect --format {{.State.Health.Status}} (10s timeout)
alt status == "healthy"
Health-->>Switch: healthy → return
else status != "healthy" or error/timeout
Health-->>Health: log attempt, sleep 5s, retry
end
end
Note over Health,Switch: After final failed attempt, logs timeout/error and returns
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/e2e/utils/utils.py (2)
68-74: Fix backup semantics; align docstring and logic. Always create backup; let cleanup only delete it.Skipping backup creation when cleanup=True risks losing the ability to restore and contradicts the docstring.
def switch_config_and_restart( original_file: str, replacement_file: str, container_name: str, cleanup: bool = False, ) -> str: @@ - cleanup: If True, remove the backup file after restoration (default: False) + cleanup: If True, remove the backup file at the end of this call (default: False) @@ - if not cleanup and not os.path.exists(backup_file): + if not os.path.exists(backup_file): try: shutil.copy(original_file, backup_file)Also applies to: 80-81, 87-93
100-111: Add a timeout to docker restart and separate exception handling.Without a timeout, tests can hang indefinitely; combine with the new health wait for robust readiness.
# Restart container try: subprocess.run( ["docker", "restart", container_name], capture_output=True, text=True, - check=True, + check=True, + timeout=60, ) - except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e: - print(f"Failed to restart container {container_name}: {e.stderr}") + except subprocess.CalledProcessError as e: + print(f"Failed to restart container {container_name}: {e.stderr}") + raise + except subprocess.TimeoutExpired as e: + print(f"Timed out restarting container {container_name} after {e.timeout}s") raise @@ - # Wait for container to be healthy + # Wait for container to be healthy (raises on timeout) wait_for_container_health(container_name)Also applies to: 112-114
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
tests/e2e/utils/utils.py(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
dfcbed0 to
0a946cd
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/e2e/utils/utils.py (1)
70-75: Add missingcontainer_name(and optionalcleanup) to allswitch_config_and_restartcallsThe invocations in tests/e2e/features/environment.py at lines 90 and 100 are still using the old two-arg signature and must be updated to include the required
container_nameparameter (andcleanupif needed).
♻️ Duplicate comments (1)
tests/e2e/utils/utils.py (1)
35-68: Fail fast on unhealthy/timeout; fix double-sleep; support containers without HEALTHCHECK.Current loop swallows failures, never raises on timeout, and sleeps twice per attempt (once inside try, once after), making timing/logs inaccurate. Tests may proceed while the container isn’t ready.
Apply this hardened rewrite:
-def wait_for_container_health(container_name: str, max_attempts: int = 3) -> None: - """Wait for container to be healthy.""" - for attempt in range(max_attempts): - try: - result = subprocess.run( - [ - "docker", - "inspect", - "--format={{.State.Health.Status}}", - container_name, - ], - capture_output=True, - text=True, - check=True, - timeout=10, - ) - if result.stdout.strip() == "healthy": - break - else: - if attempt < max_attempts - 1: - time.sleep(5) - else: - print( - f"{container_name} not healthy after {max_attempts * 5} seconds" - ) - except (subprocess.CalledProcessError, subprocess.TimeoutExpired): - pass - - if attempt < max_attempts - 1: - print(f"⏱ Attempt {attempt + 1}/{max_attempts} - waiting...") - time.sleep(5) - else: - print(f"Could not check health status for {container_name}") +def wait_for_container_health( + container_name: str, + max_attempts: int = 6, + interval_sec: int = 5, + inspect_timeout_sec: int = 10, +) -> None: + """Wait for container to be healthy (or running if no HEALTHCHECK); raise on timeout.""" + start = time.monotonic() + last_status = "unknown" + for attempt in range(1, max_attempts + 1): + try: + result = subprocess.run( + [ + "docker", + "inspect", + "--format={{if .State.Health}}{{.State.Health.Status}}{{else}}no-healthcheck{{end}}", + container_name, + ], + capture_output=True, + text=True, + check=True, + timeout=inspect_timeout_sec, + ) + status = result.stdout.strip() + last_status = status + if status == "healthy": + return + if status == "no-healthcheck": + # Fallback: consider "running" as ready when no HealthCheck is defined. + state = subprocess.run( + ["docker", "inspect", "--format={{.State.Status}}", container_name], + capture_output=True, + text=True, + check=True, + timeout=inspect_timeout_sec, + ) + if state.stdout.strip() == "running": + return + except (subprocess.CalledProcessError, subprocess.TimeoutExpired): + # ignore and retry + pass + + if attempt < max_attempts: + print(f"⏱ Attempt {attempt}/{max_attempts} - waiting...") + time.sleep(interval_sec) + else: + elapsed = int(time.monotonic() - start) + raise TimeoutError(f"{container_name} not ready after {elapsed}s (last status: {last_status})")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
tests/e2e/utils/utils.py(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
| # Wait for container to be healthy | ||
| wait_for_container_health(container_name) | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion
Surface restart failures with a timeout and clearer error reporting.
TimeoutExpired won’t fire without a timeout; also bubble up a concise message.
# Restart container
try:
subprocess.run(
["docker", "restart", container_name],
capture_output=True,
text=True,
check=True,
+ timeout=60,
)
- except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e:
- print(f"Failed to restart container {container_name}: {e.stderr}")
+ except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e:
+ err = getattr(e, "stderr", None) or str(e)
+ print(f"Failed to restart container {container_name}: {err}")
raise📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Wait for container to be healthy | |
| wait_for_container_health(container_name) | |
| # Restart container | |
| try: | |
| subprocess.run( | |
| ["docker", "restart", container_name], | |
| capture_output=True, | |
| text=True, | |
| check=True, | |
| timeout=60, | |
| ) | |
| except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as e: | |
| err = getattr(e, "stderr", None) or str(e) | |
| print(f"Failed to restart container {container_name}: {err}") | |
| raise |
🤖 Prompt for AI Agents
In tests/e2e/utils/utils.py around lines 114 to 116, the call to
wait_for_container_health(container_name) currently can hang indefinitely and
any TimeoutExpired won't be raised because no timeout is provided; update the
call to pass a reasonable timeout (e.g., timeout_seconds or a constant) and wrap
the call in a try/except that catches TimeoutExpired (and optionally
subprocess.TimeoutExpired) and re-raises a concise, informative exception (or
raise RuntimeError) that includes the container name and that the container
failed to become healthy within the timeout; ensure the timeout value is
configurable or clearly documented.
Description
auth e2e tests hardening by adding health check after docker restart
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit