fix(glue): Fix GlueJobOperator verbose logs not showing in deferrable mode #64342

Merged
o-nikolas merged 3 commits into apache:main from shivaam:fix/glue-deferrable-verbose-56535 on Apr 10, 2026

Conversation

@shivaam
Contributor

@shivaam shivaam commented Mar 28, 2026

Re-land of #63086 (reverted by #64340) with corrected error handling.

Changes

When GlueJobOperator runs with deferrable=True and verbose=True, CloudWatch
logs were silently ignored because the trigger inherited the base waiter's run()
method, which only polls job status.

Improvements over the original PR:

  • Yield TriggerEvent with error status instead of raising AirflowException.
    This matches the AwsBaseWaiterTrigger.run() pattern, so execute_complete()
    handles errors in the worker, not the triggerer. It also eliminates the
    AirflowException import from both the trigger and test files, which was the
    root cause of the revert.
  • Extract get_glue_log_group_names() and format_glue_logs() as shared helpers in
    hooks/glue.py to eliminate sync/async duplication with GlueJobHook.print_job_logs.
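
A minimal sketch of the first bullet's pattern, not the code from this PR. It assumes a stand-in `TriggerEvent` dataclass in place of `airflow.triggers.base.TriggerEvent`; `run_sketch`, `execute_complete_sketch`, `first_event`, and the `get_state` callable are hypothetical names, and the set of failure states is illustrative. The point is only that the polling coroutine reports failure as an event payload, and the exception is raised later by the code that consumes the event.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, Callable


@dataclass
class TriggerEvent:
    """Stand-in for airflow.triggers.base.TriggerEvent (payload only)."""

    payload: dict[str, Any]


async def run_sketch(
    get_state: Callable[[], str],  # hypothetical: returns the current job run state
    max_attempts: int = 3,
    delay: float = 0.0,
) -> AsyncIterator[TriggerEvent]:
    """Poll until a terminal state; report failures as events, never raise."""
    for _ in range(max_attempts):
        state = get_state()
        if state == "SUCCEEDED":
            yield TriggerEvent({"status": "success"})
            return
        if state in {"FAILED", "STOPPED", "TIMEOUT"}:  # illustrative failure states
            yield TriggerEvent({"status": "error", "message": f"Glue job ended in state {state}"})
            return
        await asyncio.sleep(delay)
    yield TriggerEvent({"status": "error", "message": "max attempts exceeded"})


def execute_complete_sketch(event: dict[str, Any]) -> None:
    """Worker-side handler: this is where an error event becomes an exception."""
    if event["status"] != "success":
        raise RuntimeError(f"Error in glue job: {event}")


async def first_event(states: list[str]) -> dict[str, Any]:
    """Drive run_sketch with a scripted list of states and return the first payload."""
    remaining = iter(states)
    async for event in run_sketch(lambda: next(remaining)):
        return event.payload
    raise AssertionError("run_sketch yielded no event")
```

Because the coroutine only yields, the process hosting it needs no exception class from the task-execution side; in this sketch `RuntimeError` stands in for whatever exception the operator raises in `execute_complete()`.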

closes: #56535
related: #63086, #64340


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (Opus 4.6)

Generated-by: Claude Code (Opus 4.6) following the guidelines

@shivaam shivaam force-pushed the fix/glue-deferrable-verbose-56535 branch from 85cb82e to 945c8a3 Compare March 28, 2026 02:22
@o-nikolas
Contributor

I see that all the checks are green, but this is still marked draft. Is this ready for review @shivaam?

@shivaam shivaam force-pushed the fix/glue-deferrable-verbose-56535 branch from 945c8a3 to 763a0f0 Compare April 1, 2026 05:06
@shivaam shivaam marked this pull request as ready for review April 1, 2026 05:09
@shivaam shivaam requested a review from o-nikolas as a code owner April 1, 2026 05:09
@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label Apr 1, 2026
@shivaam
Contributor Author

shivaam commented Apr 1, 2026

> I see that all the checks are green, but this is still marked draft. Is this ready for review @shivaam?

Yes, it is ready to be reviewed. @o-nikolas

Contributor

Copilot AI left a comment

Pull request overview

This PR restores verbose CloudWatch log forwarding for GlueJobOperator(deferrable=True, verbose=True) by adding an async polling/log-forwarding path to the Glue trigger and by extracting shared log formatting/helpers into the Glue hook.

Changes:

  • Override GlueJobCompleteTrigger.run() to poll Glue job state and forward CloudWatch logs when verbose=True, yielding TriggerEvent errors instead of raising in the triggerer.
  • Extract get_glue_log_group_names() and format_glue_logs() into hooks/glue.py and reuse them from both sync and async paths.
  • Add unit tests covering verbose deferrable behavior and _forward_logs() pagination / no-new-events cases.
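
The pagination and no-new-events cases mentioned in the last bullet can be sketched as follows. This is an illustration under stated assumptions, not the PR's `_forward_logs()`: `FakeLogsClient` is a hypothetical in-memory stand-in for an async CloudWatch Logs client, and `forward_logs` is a hypothetical helper. The one behavior borrowed from the real `get_log_events` API is that it returns a `nextForwardToken`, and returns the same token back when the end of the stream has been reached.

```python
import asyncio
from typing import Any, Optional


class FakeLogsClient:
    """Hypothetical in-memory stand-in for an async CloudWatch Logs client."""

    def __init__(self, pages: dict[Optional[str], dict[str, Any]]):
        self.pages = pages  # maps the incoming token to a canned response

    async def get_log_events(self, **kwargs: Any) -> dict[str, Any]:
        token = kwargs.get("nextToken")
        # Unknown token: no events, same token echoed back (end of stream).
        return self.pages.get(token, {"events": [], "nextForwardToken": token})


async def forward_logs(
    client: Any,
    log_group: str,
    log_stream: str,
    next_token: Optional[str],
) -> tuple[list[str], Optional[str]]:
    """Drain new events from one stream; return (messages, token to resume from)."""
    messages: list[str] = []
    while True:
        kwargs: dict[str, Any] = {
            "logGroupName": log_group,
            "logStreamName": log_stream,
            "startFromHead": True,
        }
        if next_token:
            kwargs["nextToken"] = next_token
        response = await client.get_log_events(**kwargs)
        messages.extend(event["message"] for event in response["events"])
        new_token = response["nextForwardToken"]
        # Stop when the page is empty or the token did not advance.
        if not response["events"] or new_token == next_token:
            return messages, new_token
        next_token = new_token
```

Returning the token lets the caller keep it between polling iterations, so each poll forwards only events that arrived since the previous one.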

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| providers/amazon/src/airflow/providers/amazon/aws/triggers/glue.py | Adds verbose-mode async polling + CloudWatch log forwarding to the Glue job completion trigger. |
| providers/amazon/src/airflow/providers/amazon/aws/hooks/glue.py | Extracts shared helpers to unify log group naming and log formatting between sync/async paths. |
| providers/amazon/tests/unit/amazon/aws/triggers/test_glue.py | Adds unit tests for verbose deferrable Glue trigger behavior and log forwarding. |
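
To illustrate why shared helpers remove the sync/async duplication, here is a rough sketch. The signatures and output format below are assumptions and are not taken from `hooks/glue.py`; the `_sketch` suffix marks both functions as hypothetical, and the `/aws-glue/jobs` default prefix is assumed. The idea is that both code paths derive the output and error log group names the same way and render fetched messages with one pure function.

```python
def get_glue_log_group_names_sketch(log_group_prefix: str = "/aws-glue/jobs") -> tuple[str, str]:
    """Return (output_group, error_group) for a prefix; default prefix is an assumption."""
    return f"{log_group_prefix}/output", f"{log_group_prefix}/error"


def format_glue_logs_sketch(messages: list[str], log_group: str) -> str:
    """Render fetched messages as one block; the exact wording here is illustrative."""
    if not messages:
        return f"No new log from the Glue Job in {log_group}"
    indented = "\n".join(f"\t{message.rstrip()}" for message in messages)
    return f"Glue Job Run {log_group} Logs:\n{indented}"
```

Since neither function performs I/O, a synchronous hook method and an async trigger can both call them after fetching events with their own client.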

shivaam added 2 commits April 7, 2026 22:03
…bOperator

When using GlueJobOperator with deferrable=True and verbose=True,
CloudWatch logs were silently ignored because the trigger inherited
the base waiter's run() method which only polls job status.

This adds a run() override and _forward_logs() helper to
GlueJobCompleteTrigger that streams logs from both output and error
CloudWatch log groups, matching the format used by the synchronous path.

Key changes:
- Extract get_glue_log_group_names() and format_glue_logs() as shared
  helpers in hooks/glue.py to eliminate sync/async duplication
- Override run() in GlueJobCompleteTrigger for verbose log streaming
- Yield TriggerEvent with error status on failure (matching base class
  pattern) instead of raising AirflowException in the triggerer process
- Add tests for verbose success, failure, max attempts, pagination,
  ResourceNotFoundException, and no-new-events scenarios

closes: apache#56535
…logger

- Use logs_client.meta.region_name instead of self.region_name for the
  CloudWatch URL in ResourceNotFoundException handler, avoiding invalid
  URLs when region_name is None (resolved from connection)
- Replace caplog with mock.patch.object(trigger.log, ...) in tests per
  Airflow community convention
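
Both points in this commit message can be sketched with the standard library alone. `SketchTrigger`, `warn_missing_group`, and `check_logging` are hypothetical names, and the fake client is a `SimpleNamespace` that mimics only the `meta.region_name` attribute boto3 clients expose. The sketch reads the region from the client rather than from a possibly unset `region_name` attribute, and asserts on the logger with `mock.patch.object` instead of `caplog`.

```python
import logging
from types import SimpleNamespace
from typing import Any, Optional
from unittest import mock


class SketchTrigger:
    """Hypothetical trigger-like object with a `log` attribute."""

    log = logging.getLogger("sketch.trigger")

    def __init__(self, region_name: Optional[str] = None):
        # May stay None when the region is resolved from the connection.
        self.region_name = region_name

    def warn_missing_group(self, logs_client: Any, log_group: str) -> None:
        # Use the region the client actually resolved, not self.region_name.
        region = logs_client.meta.region_name
        self.log.warning("No log group %s found in region %s", log_group, region)


def check_logging() -> tuple:
    """Patch the trigger's logger method and return the arguments it received."""
    trigger = SketchTrigger(region_name=None)
    fake_client = SimpleNamespace(meta=SimpleNamespace(region_name="us-east-1"))
    with mock.patch.object(trigger.log, "warning") as mock_warning:
        trigger.warn_missing_group(fake_client, "/aws-glue/jobs/error")
    mock_warning.assert_called_once()
    return mock_warning.call_args.args
```

Patching the logger method directly keeps the assertion independent of how log records propagate to handlers.
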
@shivaam shivaam force-pushed the fix/glue-deferrable-verbose-56535 branch from da8c6c1 to a2fc628 Compare April 8, 2026 05:03
@o-nikolas o-nikolas merged commit e55d90f into apache:main Apr 10, 2026
95 checks passed

Development

Successfully merging this pull request may close these issues.

GlueJobOperator in verbose mode do not pull logs correctly

4 participants