Fix flaky tests in test_otel.py::TestOtelIntegration #52936
Merged
potiuk merged 1 commit into apache:main on Jul 6, 2025
Contributor (Author):
@potiuk @amoghrajesh Can you take a look please?
Member:
You never know with flaky tests :D. That's their nature...
amoghrajesh (Contributor) reviewed on Jul 6, 2025 and left a comment:
Nice, fixing a flaky test isn't easy!
Contributor (Author):
Thank you for the quick merge! Let's hope there are no more random failures.
HsiuChuanHsu pushed a commit to HsiuChuanHsu/airflow that referenced this pull request on Jul 10, 2025
stephen-bracken pushed a commit to stephen-bracken/airflow that referenced this pull request on Jul 15, 2025
Related issue: #52906
Although the test always passes locally, it has a 20-30% failure rate on the remote CI.
To investigate, I created a custom CI workflow that runs the test class on repeat.
I was able to determine that `test_scheduler_change_after_the_first_task_finishes` was the flaky test method and that, once it failed, it caused every test running after it to fail as well. After commenting it out, the class passed 40/40 runs, whereas before it passed only 25/40 runs.
https://github.com/xBis7/airflow/actions/runs/16096527407
https://github.com/xBis7/airflow/actions/runs/16096660040
https://github.com/xBis7/airflow/actions/runs/16096742809
https://github.com/xBis7/airflow/actions/runs/16096823166
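A minimal local sketch of the repeat-run idea, not the actual CI workflow linked above. The helper names `count_passes` and `pytest_cmd` are hypothetical, and the test path in the comment is a placeholder that would need to point at the real location of `test_otel.py`.

```python
import subprocess
import sys


def count_passes(cmd, runs):
    """Run `cmd` `runs` times and return how many runs exited with code 0."""
    passes = 0
    for _ in range(runs):
        # Each run is a fresh process, so state cannot leak between runs.
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode == 0:
            passes += 1
    return passes


def pytest_cmd(test_id):
    """Build a pytest invocation for a single test class or method."""
    return [sys.executable, "-m", "pytest", "-q", test_id]


# Example usage (placeholder path, adjust to the real test file):
#   passed = count_passes(pytest_cmd("test_otel.py::TestOtelIntegration"), 40)
#   print(f"{passed}/40 runs passed")
```

Running the whole class and then the class with one method deselected gives two pass counts that can be compared, which is how a single flaky method can be isolated.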
The actual issue causing the flakiness is the value of the `scheduler_health_check_threshold` flag.
The test
The rest of the tests need `scheduler_health_check_threshold` to have a low value so that scheduler2 can mark scheduler1 as unhealthy quickly. But the opposite is needed for this test.
In the 20-30% of runs where the test fails, scheduler2 marks scheduler1 as unhealthy and therefore recreates the older spans because they are considered lost. The test then times out waiting for the span status to change from `ENDED` to `SHOULD_END`, which will never happen.
After increasing the flag just for this test, the flakiness is gone. I've run the test successfully 99 out of 100 times. The remaining run was canceled because the workflow didn't have enough resources and the test exceeded the 30-minute threshold.
https://github.com/xBis7/airflow/actions/runs/16098256324
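One way to scope a higher threshold to a single test is sketched below. This is an illustration only and not necessarily how the change in this PR is implemented. It assumes the flag lives in the `[scheduler]` config section and is overridden through Airflow's `AIRFLOW__{SECTION}__{KEY}` environment variable convention; the helper name `health_check_threshold` and the value `90` are hypothetical.

```python
import os
from contextlib import contextmanager
from unittest import mock

# Assumption: the flag is in the [scheduler] section, so the override
# variable follows the AIRFLOW__{SECTION}__{KEY} naming convention.
THRESHOLD_ENV = "AIRFLOW__SCHEDULER__SCHEDULER_HEALTH_CHECK_THRESHOLD"


@contextmanager
def health_check_threshold(seconds):
    """Temporarily override the threshold in the process environment.

    Yields a copy of the patched environment that can be passed to a
    scheduler subprocess via `subprocess.Popen(..., env=env)`.
    The previous value is restored when the block exits.
    """
    with mock.patch.dict(os.environ, {THRESHOLD_ENV: str(seconds)}):
        yield dict(os.environ)


# Example usage inside the one test that needs a high value
# (90 is an illustrative number, not the value used in this PR):
#   with health_check_threshold(90) as env:
#       ...start the schedulers with env=env and run the assertions...
```

Because the override is restored on exit, the remaining tests in the class keep the low value they rely on.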