Convert DagFileProcessor.execute_callbacks to Internal API #28900
```diff
@@ -88,7 +88,7 @@
     TaskNotFound,
 )
 from airflow.jobs.job import run_job
-from airflow.models.abstractoperator import AbstractOperator
+from airflow.models.abstractoperator import AbstractOperator, TaskStateChangeCallback
 from airflow.models.base import Base, StringID
 from airflow.models.baseoperator import BaseOperator
 from airflow.models.dagcode import DagCode
```
```diff
@@ -139,6 +139,7 @@
     from airflow.models.operator import Operator
     from airflow.models.slamiss import SlaMiss
     from airflow.serialization.pydantic.dag import DagModelPydantic
+    from airflow.serialization.pydantic.dag_run import DagRunPydantic
     from airflow.typing_compat import Literal
     from airflow.utils.task_group import TaskGroup
```
```diff
@@ -888,7 +889,7 @@ def get_next_data_interval(self, dag_model: DagModel) -> DataInterval | None:
         # infer from the logical date.
         return self.infer_automated_data_interval(dag_model.next_dagrun)

-    def get_run_data_interval(self, run: DagRun) -> DataInterval:
+    def get_run_data_interval(self, run: DagRun | DagRunPydantic) -> DataInterval:
         """Get the data interval of this run.

         For compatibility, this method infers the data interval from the DAG's
```
```diff
@@ -1383,8 +1384,43 @@ def normalized_schedule_interval(self) -> ScheduleInterval:
             _schedule_interval = self.schedule_interval
         return _schedule_interval

+    @staticmethod
+    @internal_api_call
     @provide_session
-    def handle_callback(self, dagrun, success=True, reason=None, session=NEW_SESSION):
+    def fetch_callback(
```
Contributor

@vincbeck @potiuk @mhenc @uranusjr just curious what your thoughts are on backward compatibility on this one. technically it ( that said, considering it as part of the public API also seems absurd. Do you think we should put in our "public API" some "cover your ass" type of language that expresses that methods which are clearly not for public use, even if not marked internal, are internal? Maybe we could add some language that explains what that means, like methods not related to the DAG authoring interface, etc. Not sure, but in any case, we should mark
Contributor
Author

It looks backward compatible to me; GitHub makes it confusing, but
Member

Agree, it's not "disappearing" now, so not an issue. But I would also like to have a bit of a philosophical rant here. I think we are approaching backwards compatibility here as "0 / 1", and I keep on repeating that this is totally wrong. Hyrum's Law is totally right here: https://www.hyrumslaw.com/. In a sufficiently complex system ANY change is breaking. You would have to stop changing things to stop things from breaking. There is no way around it. And we cannot describe super-precise rules about it upfront. We cannot really say:

That would make us slow, and very shortly we would lose any flexibility. I think we will never achieve 100% correctness and a set of rules that will clearly say, for each change, "breaking/not breaking" in an automated way. We can approach it and get closer by adding more and more rules and description, but we will at most asymptotically approach certainty, never achieving it; and at some point, adding more and more rules will make things more, not less, confusing and contradictory. So we should strive for a "good enough" and "pretty correct" set of rules, but also accept that there will be exceptions, room for interpretation, and even arbitrary decisions that others (including some of our users) might not agree with. As I see it (and what I think SemVer also explains), backwards compatibility and SemVer are NOT about following certain rules ("adding a parameter is breaking, renaming any method not marked as private is breaking"). IMHO this is about three things:

And yes, it means we will sometimes have to make arbitrary decisions based on gut feelings, not data nor precise rules. And yes, it means that sometimes there will be individual angry users who will tell us "but you promised backwards compatibility, bring it back NOW", and there will be cases where we disagree among ourselves, the maintainers, about what is backwards compatible and what is not, and we will have to vote on it eventually. And yes, sometimes it will mean we make a wrong decision, break too many workflows of too many users, and have to quickly release a bugfix that reverts it. All this, and more. And we will remain humans making sometimes flawed and imperfect decisions based on our instincts, intentions, and gut feelings, not data and strict rules, rather than robots following precise rules and prescribed algorithms. I think this is why we, as maintainers, are still needed in the project: to make such decisions. Sorry if I've gotten a bit too philosophical, but I do think we quite often try to make things crystal clear so that we don't have to, well, make decisions. Rules are needed in many cases; that's why I am also adding a lot of rules on how we approach things, for example the providers' maintenance lifecycle. But I treat them more as a communication tool: write down our intentions and, where possible, leave enough room for interpretation and decision making. Where we can, yes, we should make clear rules. But when we can't, we should state our intentions, communicate general principles, and simply try, as best as we can, to fulfill those stated intentions (and we should communicate those intentions so that our users are aware of them).
```diff
+        dag: DAG,
+        dag_run_id: str,
+        success: bool = True,
+        reason: str | None = None,
+        *,
+        session: Session = NEW_SESSION,
+    ) -> tuple[list[TaskStateChangeCallback], Context] | None:
+        """
+        Fetch the appropriate callbacks depending on the value of success.
+
+        This method gets the context of a single TaskInstance part of this DagRun
+        and returns it along with the list of callbacks.
+
+        :param dag: DAG object
+        :param dag_run_id: The DAG run ID
+        :param success: Flag to specify if failure or success callback should be called
+        :param reason: Completion reason
+        :param session: Database session
+        """
+        callbacks = dag.on_success_callback if success else dag.on_failure_callback
+        if callbacks:
+            dagrun = DAG.fetch_dagrun(dag_id=dag.dag_id, run_id=dag_run_id, session=session)
+            callbacks = callbacks if isinstance(callbacks, list) else [callbacks]
+            tis = dagrun.get_task_instances(session=session)
+            ti = tis[-1]  # get the last TaskInstance of the DagRun
+            ti.task = dag.get_task(ti.task_id)
+            context = ti.get_template_context(session=session)
+            context["reason"] = reason
+            return callbacks, context
+        return None
+
+    @provide_session
+    def handle_callback(self, dagrun: DagRun, success=True, reason=None, session=NEW_SESSION):
         """
         Triggers on_failure_callback or on_success_callback as appropriate.
```
```diff
@@ -1400,21 +1436,29 @@ def handle_callback(self, dagrun, success=True, reason=None, session=NEW_SESSION
         :param reason: Completion reason
         :param session: Database session
         """
-        callbacks = self.on_success_callback if success else self.on_failure_callback
-        if callbacks:
-            callbacks = callbacks if isinstance(callbacks, list) else [callbacks]
-            tis = dagrun.get_task_instances(session=session)
-            ti = tis[-1]  # get the last TaskInstance of the DagRun
-            ti.task = self.get_task(ti.task_id)
-            context = ti.get_template_context(session=session)
-            context.update({"reason": reason})
+        callbacks, context = DAG.fetch_callback(
+            dag=self, dag_run_id=dagrun.run_id, success=success, reason=reason, session=session
+        ) or (None, None)
+
+        DAG.execute_callback(callbacks, context, self.dag_id)
+
+    @classmethod
+    def execute_callback(cls, callbacks: list[Callable] | None, context: Context | None, dag_id: str):
+        """
+        Triggers the callbacks with the given context.
+
+        :param callbacks: List of callbacks to call
+        :param context: Context to pass to all callbacks
+        :param dag_id: The dag_id of the DAG to find.
+        """
+        if callbacks and context:
             for callback in callbacks:
-                self.log.info("Executing dag callback function: %s", callback)
+                cls.logger().info("Executing dag callback function: %s", callback)
                 try:
                     callback(context)
                 except Exception:
-                    self.log.exception("failed to invoke dag state update callback")
-                    Stats.incr("dag.callback_exceptions", tags={"dag_id": dagrun.dag_id})
+                    cls.logger().exception("failed to invoke dag state update callback")
+                    Stats.incr("dag.callback_exceptions", tags={"dag_id": dag_id})

     def get_active_runs(self):
         """
```
```diff
@@ -1452,16 +1496,19 @@ def get_num_active_runs(self, external_trigger=None, only_running=True, session=

         return session.scalar(query)

+    @staticmethod
+    @internal_api_call
     @provide_session
-    def get_dagrun(
-        self,
+    def fetch_dagrun(
+        dag_id: str,
         execution_date: datetime | None = None,
         run_id: str | None = None,
         session: Session = NEW_SESSION,
-    ):
+    ) -> DagRun | DagRunPydantic:
         """
         Return the dag run for a given execution date or run_id if it exists, otherwise none.

         :param dag_id: The dag_id of the DAG to find.
         :param execution_date: The execution date of the DagRun to find.
         :param run_id: The run_id of the DagRun to find.
         :param session:
```
@@ -1471,11 +1518,22 @@ def get_dagrun( | |
| raise TypeError("You must provide either the execution_date or the run_id") | ||
| query = select(DagRun) | ||
| if execution_date: | ||
| query = query.where(DagRun.dag_id == self.dag_id, DagRun.execution_date == execution_date) | ||
| query = query.where(DagRun.dag_id == dag_id, DagRun.execution_date == execution_date) | ||
| if run_id: | ||
| query = query.where(DagRun.dag_id == self.dag_id, DagRun.run_id == run_id) | ||
| query = query.where(DagRun.dag_id == dag_id, DagRun.run_id == run_id) | ||
| return session.scalar(query) | ||
|
|
||
| @provide_session | ||
| def get_dagrun( | ||
| self, | ||
| execution_date: datetime | None = None, | ||
| run_id: str | None = None, | ||
| session: Session = NEW_SESSION, | ||
| ) -> DagRun | DagRunPydantic: | ||
| return DAG.fetch_dagrun( | ||
| dag_id=self.dag_id, execution_date=execution_date, run_id=run_id, session=session | ||
| ) | ||
|
vincbeck marked this conversation as resolved.
|
||
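The `get_dagrun` change above follows the same conversion pattern: the query logic moves into a static `fetch_dagrun` that takes only plain, serializable arguments (an explicit `dag_id` instead of `self`), and the old instance method becomes a thin wrapper. The sketch below shows why `self` has to go; the `internal_api_call` decorator and `_rpc_call` transport here are hypothetical stand-ins for illustration, not Airflow's actual AIP-44 implementation:

```python
from __future__ import annotations

import functools
import json
from typing import Any, Callable

# Toggle standing in for Airflow's database-isolation mode (hypothetical).
USE_INTERNAL_API = False


def _rpc_call(method_name: str, kwargs: dict[str, Any]) -> Any:
    # Stand-in transport: a real implementation would send the serialized
    # payload to the internal API server and deserialize the response.
    payload = json.dumps({"method": method_name, "kwargs": kwargs})
    raise NotImplementedError(f"would send: {payload}")


def internal_api_call(func: Callable) -> Callable:
    """Route the call over RPC when DB isolation is on, else run it locally."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        if USE_INTERNAL_API:
            # Only keyword arguments survive serialization, which is why the
            # wrapped function must not take `self` or live ORM objects.
            return _rpc_call(func.__qualname__, kwargs)
        return func(*args, **kwargs)

    return wrapper


class Dag:
    def __init__(self, dag_id: str) -> None:
        self.dag_id = dag_id

    @staticmethod
    @internal_api_call
    def fetch_dagrun(dag_id: str, run_id: str) -> dict[str, str]:
        # In real Airflow this is where the DB query would run.
        return {"dag_id": dag_id, "run_id": run_id}

    def get_dagrun(self, run_id: str) -> dict[str, str]:
        # Thin instance wrapper preserving the old call site.
        return Dag.fetch_dagrun(dag_id=self.dag_id, run_id=run_id)
```

With `USE_INTERNAL_API = False` everything runs in-process; flipping it on would transparently turn the same call sites into RPCs.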
```diff
     @provide_session
     def get_dagruns_between(self, start_date, end_date, session=NEW_SESSION):
         """
```