Skip to content

Remove Database Isolation feature flag and run DB isolation tests#41067

Closed
potiuk wants to merge 1 commit intoapache:mainfrom
potiuk:remove-database-isolation-feature-flag
Closed

Remove Database Isolation feature flag and run DB isolation tests#41067
potiuk wants to merge 1 commit intoapache:mainfrom
potiuk:remove-database-isolation-feature-flag

Conversation

@potiuk
Copy link
Copy Markdown
Member

@potiuk potiuk commented Jul 27, 2024

This PR removes AIP-44 feature flag and replaces "in-progress-disabled" test with dedicated "DatabaseIsolation" one.

The DatabaseIsolation test will run all "db-tests" with enabled DB isolation mode and running internal-api component - groups of tests marked with "skip-if-database-isolation" will be skipped.

  • tests/models/test_taskinstance.py (32 tests failing)
  • tests/operators/test_python.py (2 tests failing)
  • tests/sensors/test_external_task_sensor.py (2 tests failing)

Some of those tests might need to be excluded as they make no sense to be run (for example if those are tests that are testing scheduler or webserver code). It might also be that we need to improve our test harness for isolation mode in some cases.


How to run the tests? It's fairly easy to run the tests with breeeze (you can checkout this pr with gh pr checkout 41067 and work on top of it.

You can run tests using DB isolation already using database isolation. It works in the way that you need to enter breeze shell --database-isolation, split the terminals with tmux, run internal-api component in one of them and then you can run pytest tests in the other terminal. Again - sources are mounted to inside the container so you can easily iterate on tests (ctrl-c restart internal-api to pick up changes).

See: https://github.com/apache/airflow/blob/main/contributing-docs/testing/unit_tests.rst#running-tests-with-database-isolation for detailed description.

One tests that fully works for now is dag_processing/test_job_runner.py for example and a good way to see how it works - you will see that the test initializes DB direcly, but then the dag_processing code calls the internal-api server and you will see messages going back-forth in the "internal-api" output.

You should make sure that you use pytest --run-db-tests-only flag - and all non-db tests will be skipped, that should speed up your tests when you run whole module or directory.


How to submit fixes?

After you make a fix - submit a PR explaining which tests are fixed with Related: #41067 in the commit message, That's it. I will be continuously rebasing this PR on top of main and keeping the inventory of what has been fixed.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk potiuk added the full tests needed We need to run full set of tests for this PR to merge label Jul 27, 2024
@potiuk potiuk marked this pull request as draft July 27, 2024 18:18
@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Jul 27, 2024

This PR will have quite a number of failing tests still - some of them because they should just be disabled - on a module level in most cases (with adding pytest.mark.skuip-if-database-isolation to pytest.mark.db-test.

Some of them because there are still some missing pieces to fix in the internal API.

The "special" DatabaseIsolation test that replaces "InProgressDisabled" runs the special "Database Isolation" mode that allows all our test fixtures and generally "test code" to access the database, while any code that is being tested from "airflow" should get special TracingSession assigned that will fail if the "airflow" code will attempt to use the database directly rather than via internal API.

I will make an extra pass today - once it completes - and disable all the "big" modules that should be disabled from testing, and maybe fix some obvious cases. But anyone (especially @jscheffl @vincbeck and @dstandish ) are most welcome to sign-up to fix some of those in parallel (please comment here when you pick something - I will be merging my fixes individually as separate PRs and will keep on rebasing that PR until we will get it hopefully green before 2.10rc1.

@potiuk potiuk force-pushed the remove-database-isolation-feature-flag branch from d2fc06f to 78db57a Compare July 27, 2024 18:29
@aritra24
Copy link
Copy Markdown
Collaborator

I'll try taking a look at some failures tomorrow my time

@potiuk potiuk force-pushed the remove-database-isolation-feature-flag branch from 78db57a to f1ef44e Compare July 27, 2024 18:41
@potiuk potiuk linked an issue Jul 27, 2024 that may be closed by this pull request
@potiuk potiuk force-pushed the remove-database-isolation-feature-flag branch from f1ef44e to 7454874 Compare July 27, 2024 18:59
@potiuk potiuk force-pushed the remove-database-isolation-feature-flag branch from 7454874 to f51a5de Compare July 27, 2024 19:57
@jscheffl
Copy link
Copy Markdown
Contributor

I LOVE pull-requests which delete more lines than adding. But as being DRAFT, looking forward for review (if family time permits tomorrow)

@potiuk potiuk force-pushed the remove-database-isolation-feature-flag branch from f51a5de to c641e40 Compare July 27, 2024 22:24
@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 11, 2024

Rebased with latest changes to see where we are in main

@jscheffl
Copy link
Copy Markdown
Contributor

I fix the follow-up of test_variable.py

@bugraoz93
Copy link
Copy Markdown
Contributor

tests/decorators/test_python.py seems like an easy one. Creating the PR shortly

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 14, 2024

I think this one is likely to have only two remaining issues - I fixed the "regular" failures - by excluding serialization for the tests that failed with it in "decorator/test_python"

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 14, 2024

OK @vincbeck @jscheffl @aritra24 @eladkal others - this one seem to be finally green ! Made it "ready for review" and once we merge it - we can attempt to get it into v2-10-test branch (we need to add a little documentation to it- but I'd say we should aim to either get it for rc2 of 2.10 if we have one or maybe aim to get it in for 2.10.1

Copy link
Copy Markdown
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work from this PR and the others already merged which fixed tests one by one! 🎉

Copy link
Copy Markdown
Collaborator

@aritra24 aritra24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! This looks great! Amazing work all!

Copy link
Copy Markdown
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow. What a stretch. Cool Teamwork (felt like a bit of distributed hackaton) getting this green! I feel a bit exhaiusted after this fix series and LGTM... then I can build on top of this and make the WIP of AIP-69 to main (hopefully soon!)

@bugraoz93
Copy link
Copy Markdown
Contributor

Awesome to see this green! Amazing work!

@ferruzzi
Copy link
Copy Markdown
Contributor

Sorry I missed out on this one, looks like a cool project. It had green tests and two approvals, but also over 200 "hidden items" in the discussion thread so I'm afraid to hit the merge button on it without more context. Congrats on getting it done.

@jscheffl
Copy link
Copy Markdown
Contributor

@potiuk good to merge or still hold this back?

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 16, 2024

@potiuk good to merge or still hold this back?

I am struggling with it a bit - because it will likely make it more difficult for contributors to add their tests to main - while we are anyhow targetting the AIP-44 to be ripped off from Airflow 3 eventually.

Looking at the kind of issues we had to go through, it will be a bit of a pain to merge that one to main - only to remove it afterwards, so I'd rather cherry-pick and merge that one to v2-10-test, however technically speaking this is a "new" feature enabled there, so if we folow it literally, it should go to 2.11.0. Probably the same kind of issue that @uranusjr mentioned in the [RESULT][VOTE] discussion: https://lists.apache.org/thread/56b7bdrzszkqtd6kgy360okw72568ld2

So I think maybe we should have a v2-11-test branch already created and we should start cherry-picking thing there that do not belong to 2.10.0 (and periodically rebasing it - or use cherry-picker to simultaneously cherry-pick things to both v2-10-test and v2-11-test when cherrypicking stuff to v2-10-test.

@utkarsharma2 @ephraimbuddy @kaxil @uranusjr @jscheffl - would love to hear your thoughts about it.

@ephraimbuddy
Copy link
Copy Markdown
Contributor

@potiuk good to merge or still hold this back?

I am struggling with it a bit - because it will likely make it more difficult for contributors to add their tests to main - while we are anyhow targetting the AIP-44 to be ripped off from Airflow 3 eventually.

Looking at the kind of issues we had to go through, it will be a bit of a pain to merge that one to main - only to remove it afterwards, so I'd rather cherry-pick and merge that one to v2-10-test, however technically speaking this is a "new" feature enabled there, so if we folow it literally, it should go to 2.11.0. Probably the same kind of issue that @uranusjr mentioned in the [RESULT][VOTE] discussion: https://lists.apache.org/thread/56b7bdrzszkqtd6kgy360okw72568ld2

So I think maybe we should have a v2-11-test branch already created and we should start cherry-picking thing there that do not belong to 2.10.0 (and periodically rebasing it - or use cherry-picker to simultaneously cherry-pick things to both v2-10-test and v2-11-test when cherrypicking stuff to v2-10-test.

@utkarsharma2 @ephraimbuddy @kaxil @uranusjr @jscheffl - would love to hear your thoughts about it.

Yeah. This is a concern. Should we have a branch called v2-dev where all airflow 2 PRs will go into, then the release manager would cherry-pick to v2-10-test when releasing? I think that having v2-11-test might make us miss PRs or having multiple places to PR to

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 16, 2024

Yeah. This is a concern. Should we have a branch called v2-dev where all airflow 2 PRs will go into, then the release manager would cherry-pick to v2-10-test when releasing? I think that having v2-11-test might make us miss PRs or having multiple places to PR to

Shall we spin it off to a devlist discussion?

This PR removes AIP-44 feature flag and replaces "in-progress-disabled"
test with dedicated "DatabaseIsolation" one.

The DatabaseIsolation test will run all "db-tests" with enabled
DB isolation mode and running `internal-api` component - groups
of tests marked with "skip-if-database-isolation" will be skipped.
@jscheffl
Copy link
Copy Markdown
Contributor

Yeah. This is a concern. Should we have a branch called v2-dev where all airflow 2 PRs will go into, then the release manager would cherry-pick to v2-10-test when releasing? I think that having v2-11-test might make us miss PRs or having multiple places to PR to

Shall we spin it off to a devlist discussion?

Devlist discussion might be fair.

Options that I see contributing above:

  1. adding to main to then rip it out just to bring it to 2.x branch... beneficial would be at least having less differences between v2 and v3 for a moment... but benefit is small
  2. adding to 2.10-test would be is you account it precise not a bugfix. But not merging it woul dmean that in 2.10 line the chance is high that the existing internal API is degrading very fast - that is why we had the tests added. Whereas the feature flag removal is just a feature flag removal. The PR is not adding functionality per-se but just enabling it.

My main concern and pain is the high probability that the internal API degrades very fast in 2.x line.

How about... we split the PR into:

  • Feature flag --> Airflow 2.11 to be cherry-picked - this will probably not have risk to degrade
  • Internal API Tests --> Merge to v2-10-test to ensure API works - is just adding tests, not "features"

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 18, 2024

How about... we split the PR into:

  • Feature flag --> Airflow 2.11 to be cherry-picked - this will probably not have risk to degrade
  • Internal API Tests --> Merge to v2-10-test to ensure API works - is just adding tests, not "features"

I like this. Let's do it.

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 19, 2024

cc: @utkarsharma2 - I am merging this one as with some of the test changes (only tests) I believe it brings stability to the running tests and you need it for Python client relase.

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 19, 2024

cc: @utkarsharma2 - I am merging this one as with some of the test changes (only tests) I believe it brings stability to the running tests and you need it for Python client relase.

Not this one - the other one for v2-10-test

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 19, 2024

@jscheffl -> shall we close that one then after #41567 merged? Are you good with that we are not planning to merge it and essentially we will remove AIP-44 from main - is that fine for you ?

@utkarsharma2
Copy link
Copy Markdown
Contributor

cc: @utkarsharma2 - I am merging this one as with some of the test changes (only tests) I believe it brings stability to the running tests and you need it for Python client relase.

Thanks @potiuk, much appreciated :)

@jscheffl
Copy link
Copy Markdown
Contributor

@jscheffl -> shall we close that one then after #41567 merged? Are you good with that we are not planning to merge it and essentially we will remove AIP-44 from main - is that fine for you ?

I think to be realistic and as we are riding an almost dead horse here... yes. I can accept if we are not removing the feature flag in Airflow 2.11. Is too much discussion for a short living improvement. Attempting to make AIP-69 with Airflow 2.10 anyway will assume the feature flag must be set.

@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented Aug 20, 2024

We still might remove the feature flag in 2.11. This will be easy - we do not need to keep that PR around for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:CLI area:dev-tools area:serialization area:webserver Webserver related Issues full tests needed We need to run full set of tests for this PR to merge

Projects

No open projects
Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

Make sure all tests are passing in DB Isolation mode

8 participants