Immediately persist new TaskInstanceHistory records #48780
sean-rose wants to merge 1 commit into apache:v2-10-test
18094a7 to a612488
|
This is not the right place to do this -- functions that take a session should not commit; that is either handled by the decorator or should be the responsibility of the caller. So this commit needs to be handled somewhere else. |
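To make the convention concrete, here is a minimal sketch of the pattern. It is not Airflow's actual implementation: `FakeSession`, `provide_session_sketch`, and `record_history` are hypothetical stand-ins. The decorator commits only when it created the session itself; a function that is handed a session adds to it and leaves the commit to whoever owns the transaction.

```python
import functools


class FakeSession:
    """Hypothetical stand-in for a database session; it only tracks state."""

    def __init__(self):
        self.pending = []
        self.committed = []
        self.commit_count = 0

    def add(self, obj):
        self.pending.append(obj)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()
        self.commit_count += 1

    def rollback(self):
        self.pending.clear()


def provide_session_sketch(func):
    """Create and commit a session only if the caller did not pass one."""

    @functools.wraps(func)
    def wrapper(*args, session=None, **kwargs):
        if session is not None:
            # The caller owns the transaction, so never commit here.
            return func(*args, session=session, **kwargs)
        own_session = FakeSession()
        try:
            result = func(*args, session=own_session, **kwargs)
            own_session.commit()
            return result
        except Exception:
            own_session.rollback()
            raise

    return wrapper


@provide_session_sketch
def record_history(entry, session=None):
    # Adds to the session but leaves committing to the decorator or caller.
    session.add(entry)
    return session


# Case 1: the caller passes its own session, so nothing is committed yet.
caller_session = FakeSession()
record_history("try 1", session=caller_session)

# Case 2: no session is passed, so the decorator creates one and commits it.
standalone_session = record_history("try 2")
```

In the first case the entry stays pending until the caller commits, which is exactly the window in which a later failure can discard it.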
It would be ideal if the session commit could be handled by the decorator or caller, but in the scenario we're running into that isn't happening successfully, so IMO Airflow ought to be more proactive in persisting the task instance history records. Also, from looking at other Airflow code it doesn't seem uncommon for functions that take a session to commit (e.g. there are eight such
The Or we could create a separate session there specifically for saving the task instance history record, like:

    with create_session() as ti_history_session:
        TaskInstanceHistory.record_ti(ti, session=ti_history_session)

However, I'm not familiar enough with the Airflow codebase to know what drawbacks creating a session like that might have. I generally try to follow the conventions/precedents of the surrounding code when making changes in unfamiliar codebases, and at least in @ashb what approach would you recommend? |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
|
We've run into this same issue due to the performing a database transaction inside |
Since TaskInstanceHistory records are now used as the basis for showing previous task tries in the "Details", "Gantt", and "Logs" tabs (#40304), it's important they are recorded reliably.

After upgrading our Airflow instance from 2.9.3 to 2.10.5 we discovered that links to previous task tries which failed and were retried per their retries setting aren't showing up in the Airflow UI (similar to #43739), and I've confirmed this is due to the associated TaskInstanceHistory records not being successfully persisted to the database despite the TaskInstanceHistory.record_ti() call for retries happening (in our case I suspect it's due to unintended side effects of the on_retry_callback that acryl-datahub-airflow-plugin 0.13.2.4 is adding to all our tasks, as disabling that plugin makes the problem go away).

This change is meant to ensure that TaskInstanceHistory records are always persisted, despite things potentially going awry later in the retry handling.

related: #43739