Provide an alternative OpenTelemetry implementation for traces that follows standard otel practices #43941
ashb merged 98 commits into apache:main
Conversation
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)

I'm confirming, but otel traces might still have been experimental, and if that's the case we are free to change them for a good reason. Your description certainly sounds like one!

If I may, can you add some indication to the title that this is related to OTel Traces specifically? We also have OTel Metrics implemented and it would be nice to minimize confusion.

@ferruzzi I adjusted the title. The only change in this patch that is related to metrics is https://github.com/apache/airflow/pull/43941/files#diff-1cca954ec0be1aaf2c212e718c004cb0902a96ac60043bf0c97a782dee52cc32R85-R86 If you think that it's out of scope, then I can remove it.
My initial approach wasn't considering scheduler HA. I've updated the patch accordingly. There were two main challenges.
To avoid breaking things, each scheduler will have a list of the spans that it started and will be solely responsible for ending them. Two new attributes will be saved as new columns.
Possible scenarios with 2 schedulers (this can easily work with more):
Note that with the current airflow otel implementation, you can't create sub-spans from tasks. All dagrun spans are visualized like this. I'll do some testing to make sure nothing has been broken from merging with main, and then I'll move the PR out of draft.
```python
ti: TaskInstance = session.scalars(
    select(TaskInstance).where(
        TaskInstance.dag_id == key.dag_id,
        TaskInstance.task_id == key.task_id,
        TaskInstance.run_id == key.run_id,
    )
).one()
if ti.state in State.finished:
    self.set_ti_span_attrs(span=span, state=ti.state, ti=ti)
    span.end(end_time=datetime_to_nano(ti.end_date))
    ti.span_status = SpanStatus.ENDED
else:
    span.end()
    ti.span_status = SpanStatus.NEEDS_CONTINUANCE
```
Follow up work: This should probably look at ti.try_id instead of TaskInstanceKey
airflow-core/src/airflow/migrations/versions/0065_3_0_0_add_new_otel_span_fields.py
ashb left a comment:
There's some follow-on work we can/should do, but this is worth getting in as it is for 3.0rc1
…w_otel_span_fields.py
The perils of working in an un-configured web editor
Just the known uv+lowerdeps issue. Let's merge this!

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

And a hell of a first contribution too!
…ollows standard otel practices (apache#43941)

There are some OpenTelemetry standard practices that help keep the usage consistent across multiple projects. According to those:

* when a root span starts, the trace id and the span id are generated randomly
* while the span is active, the context is captured and injected into a carrier object, which is a map or a python dictionary
* the carrier with the captured context is propagated across services and used to create sub-spans
* the new sub-span extracts the parent context from the carrier and uses it to set the parent; the trace id is accessed from the carrier and the span id is generated randomly

To explain more, this is what the flow of operations should be:

1. Dag run starts - start `dagrun` span
   i. Get the `dagrun` span context
2. Start task - with the `dagrun` context, start `ti` span
   i. Get the `ti` span context
3. With the `ti` span context, create a task sub-span
4. Task finishes - `ti` span ends
5. Dag run finishes - `dagrun` span ends

Airflow follows a different approach:

* trace and span ids are generated in a deterministic way from various properties of the `dag_run` and the `task_instance`
* the spans for a dag and its tasks are generated after the run has finished

This is the flow of the current implementation:

1. Dag run starts
2. Tasks start
3. Tasks finish
4. Create spans for the tasks
5. Dag run finishes
6. Create span for the dag run

The current approach makes it impossible to create spans from under tasks while using the existing airflow code. To achieve that, you need to use https://github.com/howardyoo/airflow_otel_provider, which has to generate the same trace id and span id as airflow, otherwise the spans won't be properly associated. This isn't easily maintainable, and it also makes it hard for people who are familiar with otel but new to airflow to start using the feature.

These are some references to the OpenTelemetry docs:

* https://opentelemetry.io/docs/concepts/context-propagation/
* https://opentelemetry.io/docs/languages/python/propagation/
* https://www.w3.org/TR/trace-context/

## Implementation description

For the dagrun and ti spans, I've reused the attributes from the original implementation. In my testing I found that the span timings were occasionally off; the conversion to nanoseconds wasn't accounting for the fact that the timezone isn't always present.

To be able to propagate the context of a span and use it to create a sub-span, the span must be active. For example, for a dag run, the span can't be created at the end; instead it

* needs to start with the run
* stays active so that the context can be captured and propagated
* ends once the run also ends

The same goes for a task and any sub-spans.

With this approach, we can use the new otel methods for creating spans directly from under a task, without the need for the `airflow_otel_provider`. These spans will be children of the task span. Check `test_otel_dag.py` for an example of usage.

In scheduler HA, multiple schedulers might process a dag that is running for a long time. The OpenTelemetry philosophy is that spans can't be shared between processes; the process that starts a span should also be the only one that can end it. To work around that while staying consistent with the otel spec, each scheduler will keep track of the spans that it starts and will also be solely responsible for ending them.

Let's assume that we are expecting these spans to be created and exported:

```
dag_span
|_ task1_span
  |_ task1_sub1_span
    |_ task1_sub1_sub_span
  |_ task1_sub2_span
|_ task2_span
```

With the existing airflow implementation, the spans will be:

```
dag_span
|_ task1_span
|_ task2_span
```

Here are some possible scenarios and how the dag spans will look, assuming that the dag creates the same spans as above:

1. Scheduler1 processes a dag from start to finish.

   ```
   dag_span
   |_ task1_span
     |_ task1_sub1_span
       |_ task1_sub1_sub_span
     |_ task1_sub2_span
   |_ task2_span
   ```

2. Scheduler1 starts a dag, and sometime during the run, Scheduler2 picks up the processing and finishes the dag. Through a db update, Scheduler2 will notify whichever scheduler started the spans to end them.

   ```
   dag_span
   |_ task1_span
     |_ task1_sub1_span
       |_ task1_sub1_sub_span
     |_ task1_sub2_span
   |_ task2_span
   ```

3. Scheduler1 starts a dag and, while the dag is running, exits gracefully. Scheduler2 picks up the processing and finishes the dag while the initial scheduler is no longer running. While exiting, Scheduler1 will end all active spans that it has a record of and update the db. Scheduler2 will create continued spans. This will result in 2 or more spans in place of one long span, but the total duration will be the same. The continued span will be the new parent.

   ```
   dag_span
   |_ task1_span
   |_ scheduler_exits_span
   |_ new_scheduler_span
     |_ dag_span_continued
       |_ task1_span_continued
         |_ task1_sub1_span
           |_ task1_sub1_sub_span
         |_ task1_sub2_span
       |_ task2_span
   ```

4. Scheduler1 starts a dag and, while the dag is running, exits forcefully. Scheduler2 picks up the processing and finishes the dag while the initial scheduler is no longer running. Scheduler1 didn't have time to end any spans. Once the new scheduler detects that the scheduler that started the spans is unhealthy, it will recreate the lost spans.

   ```
   dag_span_recreated
   |_ task1_span_recreated
     |_ task1_sub1_span
       |_ task1_sub1_sub_span
     |_ task1_sub2_span
   |_ task2_span
   ```

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Indeed. Thanks for doing it. It's great that we now have a way "better" OTEL integration in place.
This is my first time contributing to Airflow and I'm not sure if there should be an AIP or a mailing list discussion for such changes. I'd appreciate any feedback.
Issue description
related: #40802
There are some OpenTelemetry standard practices that help keep the usage consistent across multiple projects. According to those:

* when a root span starts, the trace id and the span id are generated randomly
* while the span is active, the context is captured and injected into a carrier object, which is a map or a python dictionary
* the carrier with the captured context is propagated across services and used to create sub-spans
* the new sub-span extracts the parent context from the carrier and uses it to set the parent; the trace id is accessed from the carrier and the span id is generated randomly
To explain more, this is what the flow of operations should be:

1. Dag run starts - start `dagrun` span
   i. Get the `dagrun` span context
2. Start task - with the `dagrun` context, start `ti` span
   i. Get the `ti` span context
3. With the `ti` span context, create a task sub-span
4. Task finishes - `ti` span ends
5. Dag run finishes - `dagrun` span ends
Airflow follows a different approach:

* trace and span ids are generated in a deterministic way from various properties of the `dag_run` and the `task_instance`
* the spans for a dag and its tasks are generated after the run has finished

This is the flow of the current implementation:

1. Dag run starts
2. Tasks start
3. Tasks finish
4. Create spans for the tasks
5. Dag run finishes
6. Create span for the dag run
The current approach makes it impossible to create spans from under tasks while using the existing airflow code. To achieve that, you need to use https://github.com/howardyoo/airflow_otel_provider, which has to generate the same trace id and span id as airflow, otherwise the spans won't be properly associated. This isn't easily maintainable, and it also makes it hard for people who are familiar with otel but new to airflow to start using the feature.
These are some references to the OpenTelemetry docs:
https://opentelemetry.io/docs/concepts/context-propagation/
https://opentelemetry.io/docs/languages/python/propagation/
https://www.w3.org/TR/trace-context/
Implementation description
For the dagrun and ti spans, I've reused the attributes from the original implementation. In my testing I found that the span timings were occasionally off; the conversion to nanoseconds wasn't accounting for the fact that the timezone isn't always present.
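The timing fix can be sketched like this (a hypothetical helper, not Airflow's actual `datetime_to_nano`): treat a naive datetime as UTC instead of assuming `tzinfo` is always present.

```python
from datetime import datetime, timezone

def to_nano(dt):
    """Convert a datetime to nanoseconds since epoch; naive values are assumed UTC."""
    if dt is None:
        return None
    if dt.tzinfo is None:
        # A naive datetime could otherwise be interpreted in local time,
        # skewing span start/end timings; pin it to UTC instead.
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1_000_000_000)

aware = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
naive = datetime(2025, 1, 1, 12, 0)
```

With this, `to_nano(aware)` and `to_nano(naive)` agree, regardless of the host's local timezone.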
To be able to propagate the context of a span and use it to create a sub-span, the span must be active. For example, for a dag run, the span can't be created at the end; instead it

* needs to start with the run
* stays active so that the context can be captured and propagated
* ends once the run also ends

The same goes for a task and any sub-spans.
With this approach, we can use the new otel methods for creating spans directly from under a task, without the need for the `airflow_otel_provider`. These spans will be children of the task span. Check `test_otel_dag.py` for an example of usage.

In scheduler HA, multiple schedulers might process a dag that is running for a long time. The OpenTelemetry philosophy is that spans can't be shared between processes. The process that starts a span should also be the only one that can end it.
To work around that while staying consistent with the otel spec, each scheduler will keep track of the spans that it starts and will also be solely responsible for ending them.
Let's assume that we are expecting these spans to be created and exported:

```
dag_span
|_ task1_span
  |_ task1_sub1_span
    |_ task1_sub1_sub_span
  |_ task1_sub2_span
|_ task2_span
```

With the existing airflow implementation, the spans will be:

```
dag_span
|_ task1_span
|_ task2_span
```
Here are some possible scenarios and how the dag spans will look, assuming that the dag creates the same spans as above:
1. Scheduler1 processes a dag from start to finish.

   ```
   dag_span
   |_ task1_span
     |_ task1_sub1_span
       |_ task1_sub1_sub_span
     |_ task1_sub2_span
   |_ task2_span
   ```

2. Scheduler1 starts a dag, and sometime during the run, Scheduler2 picks up the processing and finishes the dag. Through a db update, Scheduler2 will notify whichever scheduler started the spans to end them.

   ```
   dag_span
   |_ task1_span
     |_ task1_sub1_span
       |_ task1_sub1_sub_span
     |_ task1_sub2_span
   |_ task2_span
   ```

3. Scheduler1 starts a dag and, while the dag is running, exits gracefully. Scheduler2 picks up the processing and finishes the dag while the initial scheduler is no longer running. While exiting, Scheduler1 will end all active spans that it has a record of and update the db. Scheduler2 will create continued spans. This will result in 2 or more spans in place of one long span, but the total duration will be the same. The continued span will be the new parent.

   ```
   dag_span
   |_ task1_span
   |_ scheduler_exits_span
   |_ new_scheduler_span
     |_ dag_span_continued
       |_ task1_span_continued
         |_ task1_sub1_span
           |_ task1_sub1_sub_span
         |_ task1_sub2_span
       |_ task2_span
   ```

4. Scheduler1 starts a dag and, while the dag is running, exits forcefully. Scheduler2 picks up the processing and finishes the dag while the initial scheduler is no longer running. Scheduler1 didn't have time to end any spans. Once the new scheduler detects that the scheduler that started the spans is unhealthy, it will recreate the lost spans.

   ```
   dag_span_recreated
   |_ task1_span_recreated
     |_ task1_sub1_span
       |_ task1_sub1_sub_span
     |_ task1_sub2_span
   |_ task2_span
   ```
Testing
* Added new `otel_tracer` methods and updated the existing ones
* Ran with `PythonVirtualenvOperator` without issues

The integration test can be used with otel and jaeger as well. To do that, follow the steps on this comment.
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in newsfragments.