Apache Airflow version
2.8.2
If "Other Airflow 2 version" selected, which one?
2.8.3rc1
What happened?
I'm running a spark-pi example using the SparkKubernetesOperator:
spark_pi_submit = SparkKubernetesOperator(
    task_id='spark_pi_submit',
    namespace='lot1-spark-jobs',
    application_file="/example_spark_kubernetes_operator_pi.yaml",
    kubernetes_conn_id="kubernetes_default",
    do_xcom_push=True,
    in_cluster=True,
    delete_on_termination=True,
    dag=dag,
)
It was running fine on 2.8.1. After upgrading to Airflow 2.8.2, I got the following error:
    kube_client=self.client,
                ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/functools.py", line 1001, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 250, in client
    return self.hook.core_v1_client
           ^^^^^^^^^
  File "/usr/local/lib/python3.11/functools.py", line 1001, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 242, in hook
    or self.template_body.get("kubernetes", {}).get("kube_config_file", None),
       ^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 198, in template_body
    return self.manage_template_specs()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 127, in manage_template_specs
    template_body = _load_body_to_dict(open(self.application_file))
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'apiVersion: "sparkoperator.k8s.io/v1beta2"\nkind: SparkApplication\nmetadata:\n name: spark-pi\n namespace: lot1-spark-jobs\ns
[2024-03-10T10:29:15.613+0000] {taskinstance.py:1149} INFO - Marking task as UP_FOR_RETRY. dag_id=spark_pi, task_id=spark_pi_submit, execution_date=20240310T102910, start_date=20240310T
It looks like self.application_file ends up containing the content of the file it points to.
I suspect this was caused by the changes introduced in PR-22253. I'm quite new to Airflow and Python, but my guess is that application_file should no longer be treated as a template field, since the template representations were moved to template_body.
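To illustrate the suspected mechanism: if application_file is rendered as a template field, the rendering replaces the path string with the file's contents, so the later open() call in manage_template_specs receives YAML text instead of a path. A minimal stdlib-only sketch of that failure mode (a simplification, not the provider's actual code; the YAML string is abbreviated from the traceback above):

```python
# Sketch of the suspected failure mode (simplified, illustrative only):
# after template rendering, application_file holds the rendered YAML
# text rather than the original file path.
rendered = 'apiVersion: "sparkoperator.k8s.io/v1beta2"\nkind: SparkApplication\n'

# manage_template_specs then effectively does open(self.application_file),
# which fails because the YAML content is not a valid path.
try:
    open(rendered)
except FileNotFoundError as err:
    print(err.errno)  # 2, i.e. "No such file or directory", as in the traceback
```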
What you think should happen instead?
No response
How to reproduce
Given my understanding of the issue, any simple SparkKubernetesOperator example that uses the application_file property should reproduce it.
Operating System
kind kubernetes
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct