fix(glue): Fix GlueJobOperator verbose logs not showing in deferrable mode #63086

Merged
o-nikolas merged 8 commits into apache:main from shivaam:fix/glue-deferrable-verbose-56535 on Mar 27, 2026

@shivaam shivaam commented Mar 8, 2026

Why

When using GlueJobOperator(deferrable=True, verbose=True), CloudWatch logs are silently ignored. The verbose flag is stored in the trigger but never read — the inherited AwsBaseWaiterTrigger.run() only polls job status via a boto3 waiter and has no log-fetching logic.

This means users who switch from deferrable=False to deferrable=True lose all verbose CloudWatch log output with no warning.

closes: #56535

What

Added a run() override and _forward_logs() helper to GlueJobCompleteTrigger in triggers/glue.py:

  • When verbose=False: delegates to super().run() — zero behavior change.
  • When verbose=True: custom async poll loop that checks job state and streams logs from both /output and /error CloudWatch log groups using get_log_events with continuation tokens.
  • Log format matches the sync path (GlueJobHook.print_job_logs): tab-indented lines prefixed with Glue Job Run <log_group> Logs:, and No new log from the Glue Job in <log_group> when idle.
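
The delegation described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from the PR: the class names `StatusOnlyTrigger` and `VerboseTrigger`, the `resolve` helper, the injected `get_state` and `forward_logs` callables, and the choice of terminal states are all assumptions made for the example. A real Airflow trigger's `run()` is an async generator that yields a `TriggerEvent`; here it simply returns a dict to keep the sketch runnable without Airflow.

```python
import asyncio

# Terminal Glue job-run states used by this sketch (an illustrative subset).
SUCCESS_STATES = {"SUCCEEDED"}
FAILURE_STATES = {"FAILED", "STOPPED", "TIMEOUT"}


def resolve(state, attempt):
    """Return a result dict on success, raise on failure, None while running."""
    if state in SUCCESS_STATES:
        return {"status": "success", "attempts": attempt}
    if state in FAILURE_STATES:
        raise RuntimeError(f"Glue job ended in state {state}")
    return None


class StatusOnlyTrigger:
    """Hypothetical stand-in for the inherited trigger: polls status, no logs."""

    def __init__(self, get_state, *, interval=0.0, max_attempts=5):
        self.get_state = get_state
        self.interval = interval
        self.max_attempts = max_attempts

    async def run(self):
        for attempt in range(1, self.max_attempts + 1):
            result = resolve(await self.get_state(), attempt)
            if result is not None:
                return result
            await asyncio.sleep(self.interval)
        raise TimeoutError(f"Still running after {self.max_attempts} attempts")


class VerboseTrigger(StatusOnlyTrigger):
    """Hypothetical override: same polling, plus log forwarding when verbose."""

    def __init__(self, get_state, forward_logs, *, verbose, **kwargs):
        super().__init__(get_state, **kwargs)
        self.forward_logs = forward_logs
        self.verbose = verbose

    async def run(self):
        if not self.verbose:
            # Non-verbose path is untouched: defer entirely to the parent.
            return await super().run()
        for attempt in range(1, self.max_attempts + 1):
            state = await self.get_state()
            # Forward logs before resolving the state, so lines written just
            # before the job finished (or failed) are still emitted.
            await self.forward_logs()
            result = resolve(state, attempt)
            if result is not None:
                return result
            await asyncio.sleep(self.interval)
        raise TimeoutError(f"Still running after {self.max_attempts} attempts")
```

Keeping the non-verbose branch as a plain call to the parent is what makes the "zero behavior change" claim easy to reason about: only callers that opt in to `verbose=True` ever reach the new loop.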

How

Follows the same pattern as the ECS TaskDoneTrigger._forward_logs(), which already does async CloudWatch log tailing in this codebase. Uses get_log_events (async, token-based) instead of the sync path's filter_log_events (paginator-based), but produces identical user-facing output.
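
For readers unfamiliar with token-based tailing, here is a rough sketch of the idea. It is not the PR's `_forward_logs()` implementation. The function `fetch_new_events` and the in-memory `FakeLogsClient` are hypothetical names invented for this example; only the request parameters (`logGroupName`, `logStreamName`, `startFromHead`, `nextToken`) and the response keys (`events`, `nextForwardToken`) mirror the CloudWatch Logs `get_log_events` API. The sketch relies on the documented behavior that the service returns the same forward token once the end of the stream is reached.

```python
import asyncio


async def fetch_new_events(client, log_group, log_stream, token=None):
    """Drain events after `token`; return (messages, token to resume from)."""
    messages = []
    while True:
        kwargs = {
            "logGroupName": log_group,
            "logStreamName": log_stream,
            "startFromHead": True,
        }
        if token is not None:
            kwargs["nextToken"] = token
        response = await client.get_log_events(**kwargs)
        messages.extend(event["message"] for event in response["events"])
        new_token = response["nextForwardToken"]
        # An unchanged token (or an empty page) means we have caught up.
        if new_token == token or not response["events"]:
            return messages, new_token
        token = new_token


class FakeLogsClient:
    """In-memory stand-in for an async CloudWatch Logs client (hypothetical)."""

    def __init__(self, page_size=2):
        self.events = []
        self.page_size = page_size

    async def get_log_events(self, *, logGroupName, logStreamName,
                             startFromHead, nextToken=None):
        # The fake token encodes the index of the next unread event.
        start = int(nextToken.split("/")[1]) if nextToken else 0
        page = self.events[start:start + self.page_size]
        return {
            "events": [{"message": message} for message in page],
            "nextForwardToken": f"f/{start + len(page)}",
        }
```

The caller keeps the returned token between poll cycles and passes it back in, so each cycle only sees lines written since the previous one. That is also why a slower polling interval yields larger batches per cycle rather than missing lines.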

Testing

  • 7 unit tests covering: success, failure, max attempts exceeded, ResourceNotFoundException, pagination, log format verification, and no-new-events case. All pass.
  • Manually tested with a real Glue Python Shell job (20 steps, 15s apart) running both sync and deferrable tasks side by side in Breeze.

Sync task output:

INFO - Glue Job Run /aws-glue/python-jobs/output Logs:
	Processing step 4/20...
INFO - No new log from the Glue Job in /aws-glue/python-jobs/error
INFO - Polling for AWS Glue Job test-verbose-logging-sync current run state with status RUNNING
INFO - No new log from the Glue Job in /aws-glue/python-jobs/output
INFO - Glue Job Run /aws-glue/python-jobs/output Logs:
	Processing step 5/20...

Deferrable task output (after fix):

INFO - Glue Job Run /aws-glue/python-jobs/output Logs:
	Processing step 4/20...
	Processing step 5/20...
WARNING - No new Glue driver logs so far.
INFO - Polling for AWS Glue Job test-verbose-logging-deferrable current run state: RUNNING
INFO - Glue Job Run /aws-glue/python-jobs/output Logs:
	Processing step 6/20...
	Processing step 7/20...

The deferrable path batches more steps per poll cycle (30s vs 6s polling interval) but the format is now consistent.


Was generative AI tooling used to co-author this PR?
  • Yes — Kiro (Claude Opus 4.6)

@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Mar 8, 2026
@shivaam shivaam force-pushed the fix/glue-deferrable-verbose-56535 branch from 1ece990 to ed4abf2 Compare March 8, 2026 21:54
@shivaam shivaam force-pushed the fix/glue-deferrable-verbose-56535 branch from ed4abf2 to 3d66e21 Compare March 8, 2026 22:00
@shivaam shivaam marked this pull request as ready for review March 8, 2026 22:07
@shivaam shivaam requested a review from o-nikolas as a code owner March 8, 2026 22:07
@eladkal eladkal requested review from ferruzzi and vincbeck March 10, 2026 01:44
@Srabasti
Contributor

Static checks are failing for the glue.py file. Please run prek locally on your branch and commit the fixes.

@potiuk potiuk marked this pull request as draft March 10, 2026 23:48
@potiuk
Member

potiuk commented Mar 10, 2026

@shivaam This PR has been converted to draft because it does not yet meet our Pull Request quality criteria.

Issues found:

  • Pre-commit / static checks: Failing: CI image checks / Static checks. Run prek run --from-ref main locally to find and fix issues. See Pre-commit / static checks docs.

Note: Your branch is 57 commits behind main. Some check failures may be caused by changes in the base branch rather than by your PR. Please rebase your branch and push again to get up-to-date CI results.

What to do next:

  • The comment informs you what you need to do.
  • Fix each issue, then mark the PR as "Ready for review" in the GitHub UI - but only after making sure that all the issues are fixed.
  • Maintainers will then proceed with a normal review.

Converting a PR to draft is not a rejection — it is an invitation to bring the PR up to the project's standards so that maintainer review time is spent productively. If you have questions, feel free to ask on the Airflow Slack.

@shivaam shivaam force-pushed the fix/glue-deferrable-verbose-56535 branch from 4b06866 to 03dce1c Compare March 11, 2026 01:08
@shivaam shivaam marked this pull request as ready for review March 11, 2026 01:11
shivaam and others added 4 commits March 19, 2026 07:22
…bOperator

When using GlueJobOperator with deferrable=True and verbose=True, CloudWatch
logs were silently ignored because the trigger inherited the base waiter's
run() method which only polls job status. This adds a run() override and
_forward_logs() helper to the GlueJobCompleteTrigger that streams logs from
both output and error CloudWatch log groups, matching the format used by the
synchronous path.

closes: apache#56535
Removed docstring from fetch_logs method and added a comment.
@shivaam shivaam force-pushed the fix/glue-deferrable-verbose-56535 branch from 4d9c451 to 211a3d5 Compare March 19, 2026 14:22
shivaam added 4 commits March 19, 2026 08:43
Extract get_glue_log_group_names() and format_glue_logs() as module-level
helpers in hooks/glue.py so that GlueJobHook.print_job_logs (sync) and
GlueJobCompleteTrigger._forward_logs (async) share identical log formatting
and log group name extraction logic.
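
As a rough illustration of why shared helpers keep the two paths in step, the sketch below formats a batch of messages the way the PR description describes: a `Glue Job Run <log_group> Logs:` header followed by tab-indented lines, or a `No new log from the Glue Job in <log_group>` message when the batch is empty. The function names `log_group_names_sketch` and `format_logs_sketch`, and their signatures, are assumptions for this example; the actual signatures of get_glue_log_group_names() and format_glue_logs() are not shown in this conversation.

```python
def log_group_names_sketch(prefix):
    """Return the (output, error) log group names under a common prefix."""
    return f"{prefix}/output", f"{prefix}/error"


def format_logs_sketch(messages, log_group):
    """Format a batch of log messages for one log group."""
    if not messages:
        return f"No new log from the Glue Job in {log_group}"
    # Strip trailing newlines so each message occupies one tab-indented line.
    body = "\n".join("\t" + message.rstrip("\n") for message in messages)
    return f"Glue Job Run {log_group} Logs:\n{body}"
```

With one formatter called from both the hook and the trigger, a future change to the log layout only has to be made in one place.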
@o-nikolas o-nikolas merged commit b086a22 into apache:main Mar 27, 2026
176 of 177 checks passed
o-nikolas added a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Mar 28, 2026
o-nikolas added a commit that referenced this pull request Mar 28, 2026
nailo2c pushed a commit to nailo2c/airflow that referenced this pull request Mar 30, 2026
… mode (apache#63086)

* fix(glue): Add verbose CloudWatch log streaming for deferrable GlueJobOperator

When using GlueJobOperator with deferrable=True and verbose=True, CloudWatch
logs were silently ignored because the trigger inherited the base waiter's
run() method which only polls job status. This adds a run() override and
_forward_logs() helper to the GlueJobCompleteTrigger that streams logs from
both output and error CloudWatch log groups, matching the format used by the
synchronous path.

Extract get_glue_log_group_names() and format_glue_logs() as module-level
helpers in hooks/glue.py so that GlueJobHook.print_job_logs (sync) and
GlueJobCompleteTrigger._forward_logs (async) share identical log formatting
and log group name extraction logic.

closes: apache#56535

---------

Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>
Suraj-kumar00 pushed a commit to Suraj-kumar00/airflow that referenced this pull request Apr 7, 2026
Labels

area:providers provider:amazon AWS/Amazon - related issues

Development

Successfully merging this pull request may close these issues.

GlueJobOperator in verbose mode do not pull logs correctly

6 participants