You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@airflow.apache.org by Jarek Potiuk <ja...@potiuk.com> on 2022/04/01 13:25:45 UTC

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Or maybe that people are so stunned by the beauty and usefulness of it
that they cannot even say a word :)

On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com> wrote:
>
> I think it may mean all is well, perhaps :-)
> Howard
>
> On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> Hello everyone,
>>
>> Would be great to get some comments and reviews  - especially from
>> those who are users and are doing monitoring. Howard made a lot of
>> effort to show examples of how OTEL might help in this.
>>
>> Otherwise, is the silence sign that all is good ?
>>
>> J.
>>
>> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> >
>> > And let me add to it - we laid some foundations for it with Melodie
>> > Ezeani - the Outreachy intern where we did some internal integration
>> > work that let us understand the challenges and state of
>> > open-telemetry.
>> >
>> > I am super excited about what open-telemetry can bring to Airflow -
>> > both short term (in metrics and instrumentation) and longer term - in
>> > logging when logging is mature enough.
>> >
>> > The open-telemetry mov is at the heart of the principles we have -
>> > making Airflow "modern" but at the same time delegating what's not
>> > "core" to those who can do it better. Open Telemetry is precisely
>> > about that,
>> >
>> > Howard particularly brought a great experience from using Open
>> > Telemetry before and making some good judgements and POC on the
>> > metrics produced by Airflow and how they can be useful from the
>> > "maintainer value" side. I looked at it from the technical integration
>> > POV - and Melodie helped to validate some of the assumptions and
>> > expose some of the technical challenges.
>> > The composite result is good, but we are looking with Howard on some
>> > insightful comments and critique - especially from the users of
>> > Airflow!
>> >
>> > I look forward to more cool stuff on Airflow!
>> >
>> > J.
>> >
>> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <ho...@gmail.com> wrote:
>> > >
>> > > Hi all,
>> > >
>> > > I am pleased to announce the start of the discussion for the new AIP draft that was recently been published: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>> > >
>> > > Jarek Potuik and I have been discussing about this proposal since early this year. During that time, we worked together on drafting this proposal, as well as doing another round of mini-POC to refresh and test on feasibility of OpenTelemetry on Apache Airflow.
>> > >
>> > > As the POC was successful in terms of testing the OpenTelemetry on Airflow, we would like to expand the discussion to a wider user community here this mailing list to gather more consensus, comments, and feedbacks regarding this AIP.
>> > >
>> > > Sincerely,
>> > > Howard and Jarek

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Howard Yoo <ho...@gmail.com>.
Finally had some time to go over the changes and wrote summary of it in AIP
for better understanding, and updated the proposal doc:
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow#AIP49OpenTelemetrySupportforApacheAirflow-CodeChangesSummary


Hope this helps!
Howard

On Fri, Apr 29, 2022 at 11:49 AM Howard Yoo <ho...@gmail.com> wrote:

> Got it. Will describe the changes that might be subjected to be done -
> however, please note that the changes were the things that I did on my own
> during the POC, so the final changes may obviously be different.
>
> On Fri, Apr 29, 2022 at 3:57 AM Elad Kalif <el...@apache.org> wrote:
>
>> I'd prefer that the Code Changes section would be more detailed. This is
>> a very important part.
>> I'm not sure referring to a private repo is the way to do it but even if
>> you should at least list the components which are subject to changes.
>> You don't need to specify every change you are going to do - only the
>> critical ones where it couples with airflow core.
>> You have changes for dag_proccessing, executors, scheduler_job and even
>> migration to add columns.
>>
>> https://github.com/howardyoo/airflow/commit/cc83c2b377ac22f0e7ef82e7f59784df972037fd
>> These are important
>>
>>
>> On Wed, Apr 27, 2022 at 10:09 PM Howard Yoo <ho...@gmail.com> wrote:
>>
>>> Hi Elad, I have updated the AIP-49 with the appropriate changes to
>>> contain what you requested in your previous comments. I think the AIP is
>>> fairly comprehensive and ready to be voted, unless there is any objections.
>>>
>>> Sincerely,
>>> Howard
>>>
>>> On Mon, Apr 25, 2022 at 11:03 PM Howard Yoo <ho...@gmail.com> wrote:
>>>
>>>> Hi Elad,
>>>> Will take a look at it and let you know!
>>>>
>>>> Thanks,
>>>> Howard
>>>>
>>>> On Mon, Apr 25, 2022 at 6:35 AM Elad Kalif <el...@apache.org> wrote:
>>>>
>>>>> Hi Howard,
>>>>> Have you made progress with addressing the comments?
>>>>> I think once points are addressed we can maybe start a vote?
>>>>>
>>>>> On Tue, Apr 12, 2022 at 12:39 PM Jarek Potiuk <ja...@potiuk.com>
>>>>> wrote:
>>>>>
>>>>>> > 1. I would assume the path for StatsD (dogstatsd) would be for
>>>>>> deprecation - we will perhaps comment or mark it as deprecation - and
>>>>>> should follow the established (or usual?) process of feature deprecation of
>>>>>> Airflow, once the opentelemetry is in place.
>>>>>>
>>>>>> Yep. Deprecation should be. And maybe accompanied with a "statsd
>>>>>> open-telemetry exporter" ? I think eventually we should have just OTEL
>>>>>> stats and nothing else. The current "configurable" metrics class
>>>>>> should be fully replaced with "OTEL configurable" metrics. Same as
>>>>>> (but that's a much longer and uncertain path) logging configuration
>>>>>> should get replaced if/when OTEL logging lives to its promises.
>>>>>>
>>>>>> > 2. Airflow will provide its own list of 'instrumented metrics' out
>>>>>> of the box - with the list specified in its documentation so that users
>>>>>> would be aware of them. However, with opentelemetry, the users will also
>>>>>> have the ability to add their 'custom' metrics / traces / logs to be
>>>>>> collected via opentelemetry as needed. That option and how-to's should also
>>>>>> be documented.
>>>>>>
>>>>>> Good point. Actually the nice thing is that adding new metrics could
>>>>>> then be done as part of task execution for example - so we should
>>>>>> indeed have some libraries/tools or maybe even operator's /task
>>>>>> interface should have some built-in capabilities and allow for
>>>>>> declarative ways of adding metrics (but this can be added as a
>>>>>> follow-up - it does not need to be described and hashed out yet IMHO -
>>>>>> but we can add it to the docs as "future improvement") .
>>>>>>
>>>>>> > 3. A little clarification - I believe there were two POC's - which
>>>>>> at the time during the first POC, it had lacked some of the features (e.g.
>>>>>> traces and logs). However, in the second POC, there is a stable release of
>>>>>> Traces, and beta release of logs, so things have been progressing.
>>>>>> Opentelemetry is highly evolving and many things do change and get added
>>>>>> relatively quickly. During my second POC (the one mentioned in the attached
>>>>>> PDF) I was able to validate that all of the 'key' features that we needed
>>>>>> to implement opentelemetry for airflow has been released and available.
>>>>>>
>>>>>> Yeah. I concur with that Howard wrote.  What was there a few months
>>>>>> ago a little "shaky", becomes more and more solid as time passes.  And
>>>>>> I also spoke with a few people who are involved in OpenTelemetry
>>>>>> standard and development (one of my friends is site-lead for Sumo
>>>>>> Logic Poland and they take active part in that effort and I just spoke
>>>>>> to him about it). Also AWS is very much vested in it as far as I
>>>>>> understand - I also spoke to Google and they are very much supporting
>>>>>> OTEL as industry standard. I think there is a firm industry backing
>>>>>> behind OTEL and there is no way it "won't progress" or "falter". This
>>>>>> is a little bit of a "leap of faith" that it will become fully
>>>>>> featured for our needs, but I think what is already there is "enough"
>>>>>> to justify the move and anything that comes out of Beta is a bonus.
>>>>>> Eventually if things will not go fast enough for us - we can also
>>>>>> actually contribute there on the "collection" level.  It will likely
>>>>>> have a much better outcome than if we try to integrate airflow with
>>>>>> multiple services ourselves - we will just have to make sure things
>>>>>> get collected "well" - and then all the different services will more
>>>>>> likely than not write exporters for it.
>>>>>>
>>>>>> > 4. This is a debatable topic - and I believe those may not be a
>>>>>> part of the core airflow code base, but would suggest perhaps we could
>>>>>> create a 'contribution' repo which may maintain those third party assets
>>>>>> (e.g. Grafana dashboard, Datadog dashboard, etc.) that users may use and
>>>>>> even participate in maintaining it.
>>>>>>
>>>>>> Yeah. I think  this is also an opportunity for someone who is
>>>>>> specializing in those or even let Grafana add it to their "portfolio".
>>>>>> I can imagine there might be some basic dashboards generated by
>>>>>> companies which do some kinds of integrations (and with an option of
>>>>>> "come to us when you want more customizability". While we won't be
>>>>>> able to endorse those, we can easily let it land in the "ecosystem"
>>>>>> page of ours. I think also we can easily reach out (for example to
>>>>>> Grafana to do it - in the case of Grafana, Myrle Krantz, ex Treasurer
>>>>>> from ASF who I know well is a Senior Manager in Grafana responsible
>>>>>> for Cloud team).
>>>>>>
>>>>>>
>>>>>> > 5. POC may not have all the changes, as the work was to 'prototype'
>>>>>> and answer questions like 'what if' when opentelemetry is in place. The
>>>>>> proposal actually has link to a GIT repo that contains the changes that
>>>>>> were done during the POC:
>>>>>> https://github.com/howardyoo/airflow/tree/opentelemetry-poc-1 .
>>>>>> Since the details of the changes would make the existing PDF even more
>>>>>> subjected to TL;DR, I linked this git branch for anyone interested in the
>>>>>> changes to take a look. I hope this would be sufficient.
>>>>>>
>>>>>> I think Elad is right that some  more details need to "surface" from
>>>>>> the POC to AIP. While the changes are very little - they don't change
>>>>>> any flows or logic in Airflow, it's more to "selectively" add
>>>>>> collections and make sure common "Span" id is shared throughout the
>>>>>> code (which I think is the biggest change).
>>>>>> I think Howard, it might make sense to extract parts of the POC and
>>>>>> put them as an outline of changes to implement in the "AIP".
>>>>>>
>>>>>> The POC is more trace of what you've done, but I think simply
>>>>>> translating this into "this is what we need to do" for those who will
>>>>>> just read AIP is important. Our AIP's are much more "Technical" in
>>>>>> nature than most of the
>>>>>>
>>>>>> >
>>>>>> > 6. Yes, I agree - I will update the AIP proposal to make the scope
>>>>>> more clearer. Thanks for the feedback!
>>>>>>
>>>>>> Yeah. Adding "future improvements" and especially "This is what we are
>>>>>> not going to do and leave for later" is important part of every AIP. I
>>>>>> think it's sometimes much more important to state what we are NOT
>>>>>> going to vs. what we are going to do.
>>>>>>
>>>>>> >
>>>>>> > Howard
>>>>>> >
>>>>>> > On Sun, Apr 3, 2022 at 1:27 PM Elad Kalif <el...@apache.org>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> Thanks Howard!
>>>>>> >> Sorry for the delay, this was a long read. The PDF alone is 19
>>>>>> pages  :)
>>>>>> >> looks very good!
>>>>>> >>
>>>>>> >>  I have 6 questions/points to raise:
>>>>>> >>
>>>>>> >> 1. I'm not clear about what is to happen with StatsD .
>>>>>> >> It states "Make OpenTelemetry and StatsD optional and
>>>>>> interchangeable."
>>>>>> >> But do we want to support both in the long run?
>>>>>> >> It doesn't specify if we are deprecating statsD along with
>>>>>> completion of this AIP.
>>>>>> >>
>>>>>> >> 2. regarding adding metrics.
>>>>>> >> Do we intend to let users define their own KPIs/metrics to be
>>>>>> measured or it will be a closed list set by Airflow?
>>>>>> >>
>>>>>> >> 3. The POC specifies it uses a feature of open-telemetry (add
>>>>>> metrics) which was not yet released. Do we know the timeline for the
>>>>>> feature to be released?
>>>>>> >> Can we vote on a plan to use something which is not yet publicly
>>>>>> available?
>>>>>> >>
>>>>>> >> 4. Are the Grafana dashboards / other dashboards to be part of the
>>>>>> Airflow core code base?
>>>>>> >> I wonder if this should be in a dedicated repo?
>>>>>> >>
>>>>>> >> 5. Could you please clarify in the AIP page what changes are
>>>>>> required to the Airflow code base?
>>>>>> >> Some of them appear in the PDF but I'm not sure if that is all of
>>>>>> them?
>>>>>> >>
>>>>>> >> 6. The PDF has several open questions/ideas.
>>>>>> >> I think it would be best to add to the AIP a scope paragraph
>>>>>> listing what will be included in the first phase and what is left for other
>>>>>> phases.
>>>>>> >>
>>>>>> >> On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>> Or maybe that people are so stunned by the beauty and usefulness
>>>>>> of it
>>>>>> >>> that they cannot even say a word :)
>>>>>> >>>
>>>>>> >>> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com>
>>>>>> wrote:
>>>>>> >>> >
>>>>>> >>> > I think it may mean all is well, perhaps :-)
>>>>>> >>> > Howard
>>>>>> >>> >
>>>>>> >>> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com>
>>>>>> wrote:
>>>>>> >>> >>
>>>>>> >>> >> Hello everyone,
>>>>>> >>> >>
>>>>>> >>> >> Would be great to get some comments and reviews  - especially
>>>>>> from
>>>>>> >>> >> those who are users and are doing monitoring. Howard made a
>>>>>> lot of
>>>>>> >>> >> effort to show examples of how OTEL might help in this.
>>>>>> >>> >>
>>>>>> >>> >> Otherwise, is the silence sign that all is good ?
>>>>>> >>> >>
>>>>>> >>> >> J.
>>>>>> >>> >>
>>>>>> >>> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com>
>>>>>> wrote:
>>>>>> >>> >> >
>>>>>> >>> >> > And let me add to it - we laid some foundations for it with
>>>>>> Melodie
>>>>>> >>> >> > Ezeani - the Outreachy intern where we did some internal
>>>>>> integration
>>>>>> >>> >> > work that let us understand the challenges and state of
>>>>>> >>> >> > open-telemetry.
>>>>>> >>> >> >
>>>>>> >>> >> > I am super excited about what open-telemetry can bring to
>>>>>> Airflow -
>>>>>> >>> >> > both short term (in metrics and instrumentation) and longer
>>>>>> term - in
>>>>>> >>> >> > logging when logging is mature enough.
>>>>>> >>> >> >
>>>>>> >>> >> > The open-telemetry mov is at the heart of the principles we
>>>>>> have -
>>>>>> >>> >> > making Airflow "modern" but at the same time delegating
>>>>>> what's not
>>>>>> >>> >> > "core" to those who can do it better. Open Telemetry is
>>>>>> precisely
>>>>>> >>> >> > about that,
>>>>>> >>> >> >
>>>>>> >>> >> > Howard particularly brought a great experience from using
>>>>>> Open
>>>>>> >>> >> > Telemetry before and making some good judgements and POC on
>>>>>> the
>>>>>> >>> >> > metrics produced by Airflow and how they can be useful from
>>>>>> the
>>>>>> >>> >> > "maintainer value" side. I looked at it from the technical
>>>>>> integration
>>>>>> >>> >> > POV - and Melodie helped to validate some of the assumptions
>>>>>> and
>>>>>> >>> >> > expose some of the technical challenges.
>>>>>> >>> >> > The composite result is good, but we are looking with Howard
>>>>>> on some
>>>>>> >>> >> > insightful comments and critique - especially from the users
>>>>>> of
>>>>>> >>> >> > Airflow!
>>>>>> >>> >> >
>>>>>> >>> >> > I look forward to more cool stuff on Airflow!
>>>>>> >>> >> >
>>>>>> >>> >> > J.
>>>>>> >>> >> >
>>>>>> >>> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <
>>>>>> howardyoo@gmail.com> wrote:
>>>>>> >>> >> > >
>>>>>> >>> >> > > Hi all,
>>>>>> >>> >> > >
>>>>>> >>> >> > > I am pleased to announce the start of the discussion for
>>>>>> the new AIP draft that was recently been published:
>>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>>>>>> >>> >> > >
>>>>>> >>> >> > > Jarek Potuik and I have been discussing about this
>>>>>> proposal since early this year. During that time, we worked together on
>>>>>> drafting this proposal, as well as doing another round of mini-POC to
>>>>>> refresh and test on feasibility of OpenTelemetry on Apache Airflow.
>>>>>> >>> >> > >
>>>>>> >>> >> > > As the POC was successful in terms of testing the
>>>>>> OpenTelemetry on Airflow, we would like to expand the discussion to a wider
>>>>>> user community here this mailing list to gather more consensus, comments,
>>>>>> and feedbacks regarding this AIP.
>>>>>> >>> >> > >
>>>>>> >>> >> > > Sincerely,
>>>>>> >>> >> > > Howard and Jarek
>>>>>>
>>>>>

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Howard Yoo <ho...@gmail.com>.
Got it. Will describe the changes that might be subjected to be done -
however, please note that the changes were the things that I did on my own
during the POC, so the final changes may obviously be different.

On Fri, Apr 29, 2022 at 3:57 AM Elad Kalif <el...@apache.org> wrote:

> I'd prefer that the Code Changes section would be more detailed. This is a
> very important part.
> I'm not sure referring to a private repo is the way to do it but even if
> you should at least list the components which are subject to changes.
> You don't need to specify every change you are going to do - only the
> critical ones where it couples with airflow core.
> You have changes for dag_proccessing, executors, scheduler_job and even
> migration to add columns.
>
> https://github.com/howardyoo/airflow/commit/cc83c2b377ac22f0e7ef82e7f59784df972037fd
> These are important
>
>
> On Wed, Apr 27, 2022 at 10:09 PM Howard Yoo <ho...@gmail.com> wrote:
>
>> Hi Elad, I have updated the AIP-49 with the appropriate changes to
>> contain what you requested in your previous comments. I think the AIP is
>> fairly comprehensive and ready to be voted, unless there is any objections.
>>
>> Sincerely,
>> Howard
>>
>> On Mon, Apr 25, 2022 at 11:03 PM Howard Yoo <ho...@gmail.com> wrote:
>>
>>> Hi Elad,
>>> Will take a look at it and let you know!
>>>
>>> Thanks,
>>> Howard
>>>
>>> On Mon, Apr 25, 2022 at 6:35 AM Elad Kalif <el...@apache.org> wrote:
>>>
>>>> Hi Howard,
>>>> Have you made progress with addressing the comments?
>>>> I think once points are addressed we can maybe start a vote?
>>>>
>>>> On Tue, Apr 12, 2022 at 12:39 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>
>>>>> > 1. I would assume the path for StatsD (dogstatsd) would be for
>>>>> deprecation - we will perhaps comment or mark it as deprecation - and
>>>>> should follow the established (or usual?) process of feature deprecation of
>>>>> Airflow, once the opentelemetry is in place.
>>>>>
>>>>> Yep. Deprecation should be. And maybe accompanied with a "statsd
>>>>> open-telemetry exporter" ? I think eventually we should have just OTEL
>>>>> stats and nothing else. The current "configurable" metrics class
>>>>> should be fully replaced with "OTEL configurable" metrics. Same as
>>>>> (but that's a much longer and uncertain path) logging configuration
>>>>> should get replaced if/when OTEL logging lives to its promises.
>>>>>
>>>>> > 2. Airflow will provide its own list of 'instrumented metrics' out
>>>>> of the box - with the list specified in its documentation so that users
>>>>> would be aware of them. However, with opentelemetry, the users will also
>>>>> have the ability to add their 'custom' metrics / traces / logs to be
>>>>> collected via opentelemetry as needed. That option and how-to's should also
>>>>> be documented.
>>>>>
>>>>> Good point. Actually the nice thing is that adding new metrics could
>>>>> then be done as part of task execution for example - so we should
>>>>> indeed have some libraries/tools or maybe even operator's /task
>>>>> interface should have some built-in capabilities and allow for
>>>>> declarative ways of adding metrics (but this can be added as a
>>>>> follow-up - it does not need to be described and hashed out yet IMHO -
>>>>> but we can add it to the docs as "future improvement") .
>>>>>
>>>>> > 3. A little clarification - I believe there were two POC's - which
>>>>> at the time during the first POC, it had lacked some of the features (e.g.
>>>>> traces and logs). However, in the second POC, there is a stable release of
>>>>> Traces, and beta release of logs, so things have been progressing.
>>>>> Opentelemetry is highly evolving and many things do change and get added
>>>>> relatively quickly. During my second POC (the one mentioned in the attached
>>>>> PDF) I was able to validate that all of the 'key' features that we needed
>>>>> to implement opentelemetry for airflow has been released and available.
>>>>>
>>>>> Yeah. I concur with that Howard wrote.  What was there a few months
>>>>> ago a little "shaky", becomes more and more solid as time passes.  And
>>>>> I also spoke with a few people who are involved in OpenTelemetry
>>>>> standard and development (one of my friends is site-lead for Sumo
>>>>> Logic Poland and they take active part in that effort and I just spoke
>>>>> to him about it). Also AWS is very much vested in it as far as I
>>>>> understand - I also spoke to Google and they are very much supporting
>>>>> OTEL as industry standard. I think there is a firm industry backing
>>>>> behind OTEL and there is no way it "won't progress" or "falter". This
>>>>> is a little bit of a "leap of faith" that it will become fully
>>>>> featured for our needs, but I think what is already there is "enough"
>>>>> to justify the move and anything that comes out of Beta is a bonus.
>>>>> Eventually if things will not go fast enough for us - we can also
>>>>> actually contribute there on the "collection" level.  It will likely
>>>>> have a much better outcome than if we try to integrate airflow with
>>>>> multiple services ourselves - we will just have to make sure things
>>>>> get collected "well" - and then all the different services will more
>>>>> likely than not write exporters for it.
>>>>>
>>>>> > 4. This is a debatable topic - and I believe those may not be a part
>>>>> of the core airflow code base, but would suggest perhaps we could create a
>>>>> 'contribution' repo which may maintain those third party assets (e.g.
>>>>> Grafana dashboard, Datadog dashboard, etc.) that users may use and even
>>>>> participate in maintaining it.
>>>>>
>>>>> Yeah. I think  this is also an opportunity for someone who is
>>>>> specializing in those or even let Grafana add it to their "portfolio".
>>>>> I can imagine there might be some basic dashboards generated by
>>>>> companies which do some kinds of integrations (and with an option of
>>>>> "come to us when you want more customizability". While we won't be
>>>>> able to endorse those, we can easily let it land in the "ecosystem"
>>>>> page of ours. I think also we can easily reach out (for example to
>>>>> Grafana to do it - in the case of Grafana, Myrle Krantz, ex Treasurer
>>>>> from ASF who I know well is a Senior Manager in Grafana responsible
>>>>> for Cloud team).
>>>>>
>>>>>
>>>>> > 5. POC may not have all the changes, as the work was to 'prototype'
>>>>> and answer questions like 'what if' when opentelemetry is in place. The
>>>>> proposal actually has link to a GIT repo that contains the changes that
>>>>> were done during the POC:
>>>>> https://github.com/howardyoo/airflow/tree/opentelemetry-poc-1 . Since
>>>>> the details of the changes would make the existing PDF even more subjected
>>>>> to TL;DR, I linked this git branch for anyone interested in the changes to
>>>>> take a look. I hope this would be sufficient.
>>>>>
>>>>> I think Elad is right that some  more details need to "surface" from
>>>>> the POC to AIP. While the changes are very little - they don't change
>>>>> any flows or logic in Airflow, it's more to "selectively" add
>>>>> collections and make sure common "Span" id is shared throughout the
>>>>> code (which I think is the biggest change).
>>>>> I think Howard, it might make sense to extract parts of the POC and
>>>>> put them as an outline of changes to implement in the "AIP".
>>>>>
>>>>> The POC is more trace of what you've done, but I think simply
>>>>> translating this into "this is what we need to do" for those who will
>>>>> just read AIP is important. Our AIP's are much more "Technical" in
>>>>> nature than most of the
>>>>>
>>>>> >
>>>>> > 6. Yes, I agree - I will update the AIP proposal to make the scope
>>>>> more clearer. Thanks for the feedback!
>>>>>
>>>>> Yeah. Adding "future improvements" and especially "This is what we are
>>>>> not going to do and leave for later" is important part of every AIP. I
>>>>> think it's sometimes much more important to state what we are NOT
>>>>> going to vs. what we are going to do.
>>>>>
>>>>> >
>>>>> > Howard
>>>>> >
>>>>> > On Sun, Apr 3, 2022 at 1:27 PM Elad Kalif <el...@apache.org>
>>>>> wrote:
>>>>> >>
>>>>> >> Thanks Howard!
>>>>> >> Sorry for the delay, this was a long read. The PDF alone is 19
>>>>> pages  :)
>>>>> >> looks very good!
>>>>> >>
>>>>> >>  I have 6 questions/points to raise:
>>>>> >>
>>>>> >> 1. I'm not clear about what is to happen with StatsD .
>>>>> >> It states "Make OpenTelemetry and StatsD optional and
>>>>> interchangeable."
>>>>> >> But do we want to support both in the long run?
>>>>> >> It doesn't specify if we are deprecating statsD along with
>>>>> completion of this AIP.
>>>>> >>
>>>>> >> 2. regarding adding metrics.
>>>>> >> Do we intend to let users define their own KPIs/metrics to be
>>>>> measured or it will be a closed list set by Airflow?
>>>>> >>
>>>>> >> 3. The POC specifies it uses a feature of open-telemetry (add
>>>>> metrics) which was not yet released. Do we know the timeline for the
>>>>> feature to be released?
>>>>> >> Can we vote on a plan to use something which is not yet publicly
>>>>> available?
>>>>> >>
>>>>> >> 4. Are the Grafana dashboards / other dashboards to be part of the
>>>>> Airflow core code base?
>>>>> >> I wonder if this should be in a dedicated repo?
>>>>> >>
>>>>> >> 5. Could you please clarify in the AIP page what changes are
>>>>> required to the Airflow code base?
>>>>> >> Some of them appear in the PDF but I'm not sure if that is all of
>>>>> them?
>>>>> >>
>>>>> >> 6. The PDF has several open questions/ideas.
>>>>> >> I think it would be best to add to the AIP a scope paragraph
>>>>> listing what will be included in the first phase and what is left for other
>>>>> phases.
>>>>> >>
>>>>> >> On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com>
>>>>> wrote:
>>>>> >>>
>>>>> >>> Or maybe that people are so stunned by the beauty and usefulness
>>>>> of it
>>>>> >>> that they cannot even say a word :)
>>>>> >>>
>>>>> >>> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com>
>>>>> wrote:
>>>>> >>> >
>>>>> >>> > I think it may mean all is well, perhaps :-)
>>>>> >>> > Howard
>>>>> >>> >
>>>>> >>> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com>
>>>>> wrote:
>>>>> >>> >>
>>>>> >>> >> Hello everyone,
>>>>> >>> >>
>>>>> >>> >> Would be great to get some comments and reviews  - especially
>>>>> from
>>>>> >>> >> those who are users and are doing monitoring. Howard made a lot
>>>>> of
>>>>> >>> >> effort to show examples of how OTEL might help in this.
>>>>> >>> >>
>>>>> >>> >> Otherwise, is the silence sign that all is good ?
>>>>> >>> >>
>>>>> >>> >> J.
>>>>> >>> >>
>>>>> >>> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com>
>>>>> wrote:
>>>>> >>> >> >
>>>>> >>> >> > And let me add to it - we laid some foundations for it with
>>>>> Melodie
>>>>> >>> >> > Ezeani - the Outreachy intern where we did some internal
>>>>> integration
>>>>> >>> >> > work that let us understand the challenges and state of
>>>>> >>> >> > open-telemetry.
>>>>> >>> >> >
>>>>> >>> >> > I am super excited about what open-telemetry can bring to
>>>>> Airflow -
>>>>> >>> >> > both short term (in metrics and instrumentation) and longer
>>>>> term - in
>>>>> >>> >> > logging when logging is mature enough.
>>>>> >>> >> >
>>>>> >>> >> > The open-telemetry mov is at the heart of the principles we
>>>>> have -
>>>>> >>> >> > making Airflow "modern" but at the same time delegating
>>>>> what's not
>>>>> >>> >> > "core" to those who can do it better. Open Telemetry is
>>>>> precisely
>>>>> >>> >> > about that,
>>>>> >>> >> >
>>>>> >>> >> > Howard particularly brought a great experience from using Open
>>>>> >>> >> > Telemetry before and making some good judgements and POC on
>>>>> the
>>>>> >>> >> > metrics produced by Airflow and how they can be useful from
>>>>> the
>>>>> >>> >> > "maintainer value" side. I looked at it from the technical
>>>>> integration
>>>>> >>> >> > POV - and Melodie helped to validate some of the assumptions
>>>>> and
>>>>> >>> >> > expose some of the technical challenges.
>>>>> >>> >> > The composite result is good, but we are looking with Howard
>>>>> on some
>>>>> >>> >> > insightful comments and critique - especially from the users
>>>>> of
>>>>> >>> >> > Airflow!
>>>>> >>> >> >
>>>>> >>> >> > I look forward to more cool stuff on Airflow!
>>>>> >>> >> >
>>>>> >>> >> > J.
>>>>> >>> >> >
>>>>> >>> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <
>>>>> howardyoo@gmail.com> wrote:
>>>>> >>> >> > >
>>>>> >>> >> > > Hi all,
>>>>> >>> >> > >
>>>>> >>> >> > > I am pleased to announce the start of the discussion for
>>>>> the new AIP draft that was recently been published:
>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>>>>> >>> >> > >
>>>>> >>> >> > > Jarek Potuik and I have been discussing about this proposal
>>>>> since early this year. During that time, we worked together on drafting
>>>>> this proposal, as well as doing another round of mini-POC to refresh and
>>>>> test on feasibility of OpenTelemetry on Apache Airflow.
>>>>> >>> >> > >
>>>>> >>> >> > > As the POC was successful in terms of testing the
>>>>> OpenTelemetry on Airflow, we would like to expand the discussion to a wider
>>>>> user community here this mailing list to gather more consensus, comments,
>>>>> and feedbacks regarding this AIP.
>>>>> >>> >> > >
>>>>> >>> >> > > Sincerely,
>>>>> >>> >> > > Howard and Jarek
>>>>>
>>>>

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Elad Kalif <el...@apache.org>.
I'd prefer that the Code Changes section would be more detailed. This is a
very important part.
I'm not sure referring to a private repo is the way to do it but even if
you should at least list the components which are subject to changes.
You don't need to specify every change you are going to do - only the
critical ones where it couples with airflow core.
You have changes for dag_proccessing, executors, scheduler_job and even
migration to add columns.
https://github.com/howardyoo/airflow/commit/cc83c2b377ac22f0e7ef82e7f59784df972037fd
These are important


On Wed, Apr 27, 2022 at 10:09 PM Howard Yoo <ho...@gmail.com> wrote:

> Hi Elad, I have updated the AIP-49 with the appropriate changes to contain
> what you requested in your previous comments. I think the AIP is fairly
> comprehensive and ready to be voted, unless there is any objections.
>
> Sincerely,
> Howard
>
> On Mon, Apr 25, 2022 at 11:03 PM Howard Yoo <ho...@gmail.com> wrote:
>
>> Hi Elad,
>> Will take a look at it and let you know!
>>
>> Thanks,
>> Howard
>>
>> On Mon, Apr 25, 2022 at 6:35 AM Elad Kalif <el...@apache.org> wrote:
>>
>>> Hi Howard,
>>> Have you made progress with addressing the comments?
>>> I think once points are addressed we can maybe start a vote?
>>>
>>> On Tue, Apr 12, 2022 at 12:39 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>
>>>> > 1. I would assume the path for StatsD (dogstatsd) would be for
>>>> deprecation - we will perhaps comment or mark it as deprecation - and
>>>> should follow the established (or usual?) process of feature deprecation of
>>>> Airflow, once the opentelemetry is in place.
>>>>
>>>> Yep. Deprecation should be. And maybe accompanied with a "statsd
>>>> open-telemetry exporter" ? I think eventually we should have just OTEL
>>>> stats and nothing else. The current "configurable" metrics class
>>>> should be fully replaced with "OTEL configurable" metrics. Same as
>>>> (but that's a much longer and uncertain path) logging configuration
>>>> should get replaced if/when OTEL logging lives to its promises.
>>>>
>>>> > 2. Airflow will provide its own list of 'instrumented metrics' out of
>>>> the box - with the list specified in its documentation so that users would
>>>> be aware of them. However, with opentelemetry, the users will also have the
>>>> ability to add their 'custom' metrics / traces / logs to be collected via
>>>> opentelemetry as needed. That option and how-to's should also be documented.
>>>>
>>>> Good point. Actually the nice thing is that adding new metrics could
>>>> then be done as part of task execution for example - so we should
>>>> indeed have some libraries/tools or maybe even operator's /task
>>>> interface should have some built-in capabilities and allow for
>>>> declarative ways of adding metrics (but this can be added as a
>>>> follow-up - it does not need to be described and hashed out yet IMHO -
>>>> but we can add it to the docs as "future improvement") .
>>>>
>>>> > 3. A little clarification - I believe there were two POC's - which at
>>>> the time during the first POC, it had lacked some of the features (e.g.
>>>> traces and logs). However, in the second POC, there is a stable release of
>>>> Traces, and beta release of logs, so things have been progressing.
>>>> Opentelemetry is highly evolving and many things do change and get added
>>>> relatively quickly. During my second POC (the one mentioned in the attached
>>>> PDF) I was able to validate that all of the 'key' features that we needed
>>>> to implement opentelemetry for airflow has been released and available.
>>>>
>>>> Yeah. I concur with that Howard wrote.  What was there a few months
>>>> ago a little "shaky", becomes more and more solid as time passes.  And
>>>> I also spoke with a few people who are involved in OpenTelemetry
>>>> standard and development (one of my friends is site-lead for Sumo
>>>> Logic Poland and they take active part in that effort and I just spoke
>>>> to him about it). Also AWS is very much vested in it as far as I
>>>> understand - I also spoke to Google and they are very much supporting
>>>> OTEL as industry standard. I think there is a firm industry backing
>>>> behind OTEL and there is no way it "won't progress" or "falter". This
>>>> is a little bit of a "leap of faith" that it will become fully
>>>> featured for our needs, but I think what is already there is "enough"
>>>> to justify the move and anything that comes out of Beta is a bonus.
>>>> Eventually if things will not go fast enough for us - we can also
>>>> actually contribute there on the "collection" level.  It will likely
>>>> have a much better outcome than if we try to integrate airflow with
>>>> multiple services ourselves - we will just have to make sure things
>>>> get collected "well" - and then all the different services will more
>>>> likely than not write exporters for it.
>>>>
>>>> > 4. This is a debatable topic - and I believe those may not be a part
>>>> of the core airflow code base, but would suggest perhaps we could create a
>>>> 'contribution' repo which may maintain those third party assets (e.g.
>>>> Grafana dashboard, Datadog dashboard, etc.) that users may use and even
>>>> participate in maintaining it.
>>>>
>>>> Yeah. I think  this is also an opportunity for someone who is
>>>> specializing in those or even let Grafana add it to their "portfolio".
>>>> I can imagine there might be some basic dashboards generated by
>>>> companies which do some kinds of integrations (and with an option of
>>>> "come to us when you want more customizability". While we won't be
>>>> able to endorse those, we can easily let it land in the "ecosystem"
>>>> page of ours. I think also we can easily reach out (for example to
>>>> Grafana to do it - in the case of Grafana, Myrle Krantz, ex Treasurer
>>>> from ASF who I know well is a Senior Manager in Grafana responsible
>>>> for Cloud team).
>>>>
>>>>
>>>> > 5. POC may not have all the changes, as the work was to 'prototype'
>>>> and answer questions like 'what if' when opentelemetry is in place. The
>>>> proposal actually has link to a GIT repo that contains the changes that
>>>> were done during the POC:
>>>> https://github.com/howardyoo/airflow/tree/opentelemetry-poc-1 . Since
>>>> the details of the changes would make the existing PDF even more subjected
>>>> to TL;DR, I linked this git branch for anyone interested in the changes to
>>>> take a look. I hope this would be sufficient.
>>>>
>>>> I think Elad is right that some  more details need to "surface" from
>>>> the POC to AIP. While the changes are very little - they don't change
>>>> any flows or logic in Airflow, it's more to "selectively" add
>>>> collections and make sure common "Span" id is shared throughout the
>>>> code (which I think is the biggest change).
>>>> I think Howard, it might make sense to extract parts of the POC and
>>>> put them as an outline of changes to implement in the "AIP".
>>>>
>>>> The POC is more trace of what you've done, but I think simply
>>>> translating this into "this is what we need to do" for those who will
>>>> just read AIP is important. Our AIP's are much more "Technical" in
>>>> nature than most of the
>>>>
>>>> >
>>>> > 6. Yes, I agree - I will update the AIP proposal to make the scope
>>>> more clearer. Thanks for the feedback!
>>>>
>>>> Yeah. Adding "future improvements" and especially "This is what we are
>>>> not going to do and leave for later" is important part of every AIP. I
>>>> think it's sometimes much more important to state what we are NOT
>>>> going to vs. what we are going to do.
>>>>
>>>> >
>>>> > Howard
>>>> >
>>>> > On Sun, Apr 3, 2022 at 1:27 PM Elad Kalif <el...@apache.org> wrote:
>>>> >>
>>>> >> Thanks Howard!
>>>> >> Sorry for the delay, this was a long read. The PDF alone is 19
>>>> pages  :)
>>>> >> looks very good!
>>>> >>
>>>> >>  I have 6 questions/points to raise:
>>>> >>
>>>> >> 1. I'm not clear about what is to happen with StatsD .
>>>> >> It states "Make OpenTelemetry and StatsD optional and
>>>> interchangeable."
>>>> >> But do we want to support both in the long run?
>>>> >> It doesn't specify if we are deprecating statsD along with
>>>> completion of this AIP.
>>>> >>
>>>> >> 2. regarding adding metrics.
>>>> >> Do we intend to let users define their own KPIs/metrics to be
>>>> measured or it will be a closed list set by Airflow?
>>>> >>
>>>> >> 3. The POC specifies it uses a feature of open-telemetry (add
>>>> metrics) which was not yet released. Do we know the timeline for the
>>>> feature to be released?
>>>> >> Can we vote on a plan to use something which is not yet publicly
>>>> available?
>>>> >>
>>>> >> 4. Are the Grafana dashboards / other dashboards to be part of the
>>>> Airflow core code base?
>>>> >> I wonder if this should be in a dedicated repo?
>>>> >>
>>>> >> 5. Could you please clarify in the AIP page what changes are
>>>> required to the Airflow code base?
>>>> >> Some of them appear in the PDF but I'm not sure if that is all of
>>>> them?
>>>> >>
>>>> >> 6. The PDF has several open questions/ideas.
>>>> >> I think it would be best to add to the AIP a scope paragraph listing
>>>> what will be included in the first phase and what is left for other phases.
>>>> >>
>>>> >> On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com>
>>>> wrote:
>>>> >>>
>>>> >>> Or maybe that people are so stunned by the beauty and usefulness of
>>>> it
>>>> >>> that they cannot even say a word :)
>>>> >>>
>>>> >>> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com>
>>>> wrote:
>>>> >>> >
>>>> >>> > I think it may mean all is well, perhaps :-)
>>>> >>> > Howard
>>>> >>> >
>>>> >>> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com>
>>>> wrote:
>>>> >>> >>
>>>> >>> >> Hello everyone,
>>>> >>> >>
>>>> >>> >> Would be great to get some comments and reviews  - especially
>>>> from
>>>> >>> >> those who are users and are doing monitoring. Howard made a lot
>>>> of
>>>> >>> >> effort to show examples of how OTEL might help in this.
>>>> >>> >>
>>>> >>> >> Otherwise, is the silence sign that all is good ?
>>>> >>> >>
>>>> >>> >> J.
>>>> >>> >>
>>>> >>> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com>
>>>> wrote:
>>>> >>> >> >
>>>> >>> >> > And let me add to it - we laid some foundations for it with
>>>> Melodie
>>>> >>> >> > Ezeani - the Outreachy intern where we did some internal
>>>> integration
>>>> >>> >> > work that let us understand the challenges and state of
>>>> >>> >> > open-telemetry.
>>>> >>> >> >
>>>> >>> >> > I am super excited about what open-telemetry can bring to
>>>> Airflow -
>>>> >>> >> > both short term (in metrics and instrumentation) and longer
>>>> term - in
>>>> >>> >> > logging when logging is mature enough.
>>>> >>> >> >
>>>> >>> >> > The open-telemetry mov is at the heart of the principles we
>>>> have -
>>>> >>> >> > making Airflow "modern" but at the same time delegating what's
>>>> not
>>>> >>> >> > "core" to those who can do it better. Open Telemetry is
>>>> precisely
>>>> >>> >> > about that,
>>>> >>> >> >
>>>> >>> >> > Howard particularly brought a great experience from using Open
>>>> >>> >> > Telemetry before and making some good judgements and POC on the
>>>> >>> >> > metrics produced by Airflow and how they can be useful from the
>>>> >>> >> > "maintainer value" side. I looked at it from the technical
>>>> integration
>>>> >>> >> > POV - and Melodie helped to validate some of the assumptions
>>>> and
>>>> >>> >> > expose some of the technical challenges.
>>>> >>> >> > The composite result is good, but we are looking with Howard
>>>> on some
>>>> >>> >> > insightful comments and critique - especially from the users of
>>>> >>> >> > Airflow!
>>>> >>> >> >
>>>> >>> >> > I look forward to more cool stuff on Airflow!
>>>> >>> >> >
>>>> >>> >> > J.
>>>> >>> >> >
>>>> >>> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <
>>>> howardyoo@gmail.com> wrote:
>>>> >>> >> > >
>>>> >>> >> > > Hi all,
>>>> >>> >> > >
>>>> >>> >> > > I am pleased to announce the start of the discussion for the
>>>> new AIP draft that was recently been published:
>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>>>> >>> >> > >
>>>> >>> >> > > Jarek Potuik and I have been discussing about this proposal
>>>> since early this year. During that time, we worked together on drafting
>>>> this proposal, as well as doing another round of mini-POC to refresh and
>>>> test on feasibility of OpenTelemetry on Apache Airflow.
>>>> >>> >> > >
>>>> >>> >> > > As the POC was successful in terms of testing the
>>>> OpenTelemetry on Airflow, we would like to expand the discussion to a wider
>>>> user community here this mailing list to gather more consensus, comments,
>>>> and feedbacks regarding this AIP.
>>>> >>> >> > >
>>>> >>> >> > > Sincerely,
>>>> >>> >> > > Howard and Jarek
>>>>
>>>

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Howard Yoo <ho...@gmail.com>.
Hi Elad, I have updated the AIP-49 with the appropriate changes to contain
what you requested in your previous comments. I think the AIP is fairly
comprehensive and ready to be voted, unless there is any objections.

Sincerely,
Howard

On Mon, Apr 25, 2022 at 11:03 PM Howard Yoo <ho...@gmail.com> wrote:

> Hi Elad,
> Will take a look at it and let you know!
>
> Thanks,
> Howard
>
> On Mon, Apr 25, 2022 at 6:35 AM Elad Kalif <el...@apache.org> wrote:
>
>> Hi Howard,
>> Have you made progress with addressing the comments?
>> I think once points are addressed we can maybe start a vote?
>>
>> On Tue, Apr 12, 2022 at 12:39 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>>> > 1. I would assume the path for StatsD (dogstatsd) would be for
>>> deprecation - we will perhaps comment or mark it as deprecation - and
>>> should follow the established (or usual?) process of feature deprecation of
>>> Airflow, once the opentelemetry is in place.
>>>
>>> Yep. Deprecation should be. And maybe accompanied with a "statsd
>>> open-telemetry exporter" ? I think eventually we should have just OTEL
>>> stats and nothing else. The current "configurable" metrics class
>>> should be fully replaced with "OTEL configurable" metrics. Same as
>>> (but that's a much longer and uncertain path) logging configuration
>>> should get replaced if/when OTEL logging lives to its promises.
>>>
>>> > 2. Airflow will provide its own list of 'instrumented metrics' out of
>>> the box - with the list specified in its documentation so that users would
>>> be aware of them. However, with opentelemetry, the users will also have the
>>> ability to add their 'custom' metrics / traces / logs to be collected via
>>> opentelemetry as needed. That option and how-to's should also be documented.
>>>
>>> Good point. Actually the nice thing is that adding new metrics could
>>> then be done as part of task execution for example - so we should
>>> indeed have some libraries/tools or maybe even operator's /task
>>> interface should have some built-in capabilities and allow for
>>> declarative ways of adding metrics (but this can be added as a
>>> follow-up - it does not need to be described and hashed out yet IMHO -
>>> but we can add it to the docs as "future improvement") .
>>>
>>> > 3. A little clarification - I believe there were two POC's - which at
>>> the time during the first POC, it had lacked some of the features (e.g.
>>> traces and logs). However, in the second POC, there is a stable release of
>>> Traces, and beta release of logs, so things have been progressing.
>>> Opentelemetry is highly evolving and many things do change and get added
>>> relatively quickly. During my second POC (the one mentioned in the attached
>>> PDF) I was able to validate that all of the 'key' features that we needed
>>> to implement opentelemetry for airflow has been released and available.
>>>
>>> Yeah. I concur with that Howard wrote.  What was there a few months
>>> ago a little "shaky", becomes more and more solid as time passes.  And
>>> I also spoke with a few people who are involved in OpenTelemetry
>>> standard and development (one of my friends is site-lead for Sumo
>>> Logic Poland and they take active part in that effort and I just spoke
>>> to him about it). Also AWS is very much vested in it as far as I
>>> understand - I also spoke to Google and they are very much supporting
>>> OTEL as industry standard. I think there is a firm industry backing
>>> behind OTEL and there is no way it "won't progress" or "falter". This
>>> is a little bit of a "leap of faith" that it will become fully
>>> featured for our needs, but I think what is already there is "enough"
>>> to justify the move and anything that comes out of Beta is a bonus.
>>> Eventually if things will not go fast enough for us - we can also
>>> actually contribute there on the "collection" level.  It will likely
>>> have a much better outcome than if we try to integrate airflow with
>>> multiple services ourselves - we will just have to make sure things
>>> get collected "well" - and then all the different services will more
>>> likely than not write exporters for it.
>>>
>>> > 4. This is a debatable topic - and I believe those may not be a part
>>> of the core airflow code base, but would suggest perhaps we could create a
>>> 'contribution' repo which may maintain those third party assets (e.g.
>>> Grafana dashboard, Datadog dashboard, etc.) that users may use and even
>>> participate in maintaining it.
>>>
>>> Yeah. I think  this is also an opportunity for someone who is
>>> specializing in those or even let Grafana add it to their "portfolio".
>>> I can imagine there might be some basic dashboards generated by
>>> companies which do some kinds of integrations (and with an option of
>>> "come to us when you want more customizability". While we won't be
>>> able to endorse those, we can easily let it land in the "ecosystem"
>>> page of ours. I think also we can easily reach out (for example to
>>> Grafana to do it - in the case of Grafana, Myrle Krantz, ex Treasurer
>>> from ASF who I know well is a Senior Manager in Grafana responsible
>>> for Cloud team).
>>>
>>>
>>> > 5. POC may not have all the changes, as the work was to 'prototype'
>>> and answer questions like 'what if' when opentelemetry is in place. The
>>> proposal actually has link to a GIT repo that contains the changes that
>>> were done during the POC:
>>> https://github.com/howardyoo/airflow/tree/opentelemetry-poc-1 . Since
>>> the details of the changes would make the existing PDF even more subjected
>>> to TL;DR, I linked this git branch for anyone interested in the changes to
>>> take a look. I hope this would be sufficient.
>>>
>>> I think Elad is right that some  more details need to "surface" from
>>> the POC to AIP. While the changes are very little - they don't change
>>> any flows or logic in Airflow, it's more to "selectively" add
>>> collections and make sure common "Span" id is shared throughout the
>>> code (which I think is the biggest change).
>>> I think Howard, it might make sense to extract parts of the POC and
>>> put them as an outline of changes to implement in the "AIP".
>>>
>>> The POC is more trace of what you've done, but I think simply
>>> translating this into "this is what we need to do" for those who will
>>> just read AIP is important. Our AIP's are much more "Technical" in
>>> nature than most of the
>>>
>>> >
>>> > 6. Yes, I agree - I will update the AIP proposal to make the scope
>>> more clearer. Thanks for the feedback!
>>>
>>> Yeah. Adding "future improvements" and especially "This is what we are
>>> not going to do and leave for later" is important part of every AIP. I
>>> think it's sometimes much more important to state what we are NOT
>>> going to vs. what we are going to do.
>>>
>>> >
>>> > Howard
>>> >
>>> > On Sun, Apr 3, 2022 at 1:27 PM Elad Kalif <el...@apache.org> wrote:
>>> >>
>>> >> Thanks Howard!
>>> >> Sorry for the delay, this was a long read. The PDF alone is 19 pages
>>> :)
>>> >> looks very good!
>>> >>
>>> >>  I have 6 questions/points to raise:
>>> >>
>>> >> 1. I'm not clear about what is to happen with StatsD .
>>> >> It states "Make OpenTelemetry and StatsD optional and
>>> interchangeable."
>>> >> But do we want to support both in the long run?
>>> >> It doesn't specify if we are deprecating statsD along with completion
>>> of this AIP.
>>> >>
>>> >> 2. regarding adding metrics.
>>> >> Do we intend to let users define their own KPIs/metrics to be
>>> measured or it will be a closed list set by Airflow?
>>> >>
>>> >> 3. The POC specifies it uses a feature of open-telemetry (add
>>> metrics) which was not yet released. Do we know the timeline for the
>>> feature to be released?
>>> >> Can we vote on a plan to use something which is not yet publicly
>>> available?
>>> >>
>>> >> 4. Are the Grafana dashboards / other dashboards to be part of the
>>> Airflow core code base?
>>> >> I wonder if this should be in a dedicated repo?
>>> >>
>>> >> 5. Could you please clarify in the AIP page what changes are required
>>> to the Airflow code base?
>>> >> Some of them appear in the PDF but I'm not sure if that is all of
>>> them?
>>> >>
>>> >> 6. The PDF has several open questions/ideas.
>>> >> I think it would be best to add to the AIP a scope paragraph listing
>>> what will be included in the first phase and what is left for other phases.
>>> >>
>>> >> On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>> >>>
>>> >>> Or maybe that people are so stunned by the beauty and usefulness of
>>> it
>>> >>> that they cannot even say a word :)
>>> >>>
>>> >>> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com>
>>> wrote:
>>> >>> >
>>> >>> > I think it may mean all is well, perhaps :-)
>>> >>> > Howard
>>> >>> >
>>> >>> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com>
>>> wrote:
>>> >>> >>
>>> >>> >> Hello everyone,
>>> >>> >>
>>> >>> >> Would be great to get some comments and reviews  - especially from
>>> >>> >> those who are users and are doing monitoring. Howard made a lot of
>>> >>> >> effort to show examples of how OTEL might help in this.
>>> >>> >>
>>> >>> >> Otherwise, is the silence sign that all is good ?
>>> >>> >>
>>> >>> >> J.
>>> >>> >>
>>> >>> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com>
>>> wrote:
>>> >>> >> >
>>> >>> >> > And let me add to it - we laid some foundations for it with
>>> Melodie
>>> >>> >> > Ezeani - the Outreachy intern where we did some internal
>>> integration
>>> >>> >> > work that let us understand the challenges and state of
>>> >>> >> > open-telemetry.
>>> >>> >> >
>>> >>> >> > I am super excited about what open-telemetry can bring to
>>> Airflow -
>>> >>> >> > both short term (in metrics and instrumentation) and longer
>>> term - in
>>> >>> >> > logging when logging is mature enough.
>>> >>> >> >
>>> >>> >> > The open-telemetry mov is at the heart of the principles we
>>> have -
>>> >>> >> > making Airflow "modern" but at the same time delegating what's
>>> not
>>> >>> >> > "core" to those who can do it better. Open Telemetry is
>>> precisely
>>> >>> >> > about that,
>>> >>> >> >
>>> >>> >> > Howard particularly brought a great experience from using Open
>>> >>> >> > Telemetry before and making some good judgements and POC on the
>>> >>> >> > metrics produced by Airflow and how they can be useful from the
>>> >>> >> > "maintainer value" side. I looked at it from the technical
>>> integration
>>> >>> >> > POV - and Melodie helped to validate some of the assumptions and
>>> >>> >> > expose some of the technical challenges.
>>> >>> >> > The composite result is good, but we are looking with Howard on
>>> some
>>> >>> >> > insightful comments and critique - especially from the users of
>>> >>> >> > Airflow!
>>> >>> >> >
>>> >>> >> > I look forward to more cool stuff on Airflow!
>>> >>> >> >
>>> >>> >> > J.
>>> >>> >> >
>>> >>> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <ho...@gmail.com>
>>> wrote:
>>> >>> >> > >
>>> >>> >> > > Hi all,
>>> >>> >> > >
>>> >>> >> > > I am pleased to announce the start of the discussion for the
>>> new AIP draft that was recently been published:
>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>>> >>> >> > >
>>> >>> >> > > Jarek Potuik and I have been discussing about this proposal
>>> since early this year. During that time, we worked together on drafting
>>> this proposal, as well as doing another round of mini-POC to refresh and
>>> test on feasibility of OpenTelemetry on Apache Airflow.
>>> >>> >> > >
>>> >>> >> > > As the POC was successful in terms of testing the
>>> OpenTelemetry on Airflow, we would like to expand the discussion to a wider
>>> user community here this mailing list to gather more consensus, comments,
>>> and feedbacks regarding this AIP.
>>> >>> >> > >
>>> >>> >> > > Sincerely,
>>> >>> >> > > Howard and Jarek
>>>
>>

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Howard Yoo <ho...@gmail.com>.
Hi Elad,
Will take a look at it and let you know!

Thanks,
Howard

On Mon, Apr 25, 2022 at 6:35 AM Elad Kalif <el...@apache.org> wrote:

> Hi Howard,
> Have you made progress with addressing the comments?
> I think once points are addressed we can maybe start a vote?
>
> On Tue, Apr 12, 2022 at 12:39 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>> > 1. I would assume the path for StatsD (dogstatsd) would be for
>> deprecation - we will perhaps comment or mark it as deprecation - and
>> should follow the established (or usual?) process of feature deprecation of
>> Airflow, once the opentelemetry is in place.
>>
>> Yep. Deprecation should be. And maybe accompanied with a "statsd
>> open-telemetry exporter" ? I think eventually we should have just OTEL
>> stats and nothing else. The current "configurable" metrics class
>> should be fully replaced with "OTEL configurable" metrics. Same as
>> (but that's a much longer and uncertain path) logging configuration
>> should get replaced if/when OTEL logging lives to its promises.
>>
>> > 2. Airflow will provide its own list of 'instrumented metrics' out of
>> the box - with the list specified in its documentation so that users would
>> be aware of them. However, with opentelemetry, the users will also have the
>> ability to add their 'custom' metrics / traces / logs to be collected via
>> opentelemetry as needed. That option and how-to's should also be documented.
>>
>> Good point. Actually the nice thing is that adding new metrics could
>> then be done as part of task execution for example - so we should
>> indeed have some libraries/tools or maybe even operator's /task
>> interface should have some built-in capabilities and allow for
>> declarative ways of adding metrics (but this can be added as a
>> follow-up - it does not need to be described and hashed out yet IMHO -
>> but we can add it to the docs as "future improvement") .
>>
>> > 3. A little clarification - I believe there were two POC's - which at
>> the time during the first POC, it had lacked some of the features (e.g.
>> traces and logs). However, in the second POC, there is a stable release of
>> Traces, and beta release of logs, so things have been progressing.
>> Opentelemetry is highly evolving and many things do change and get added
>> relatively quickly. During my second POC (the one mentioned in the attached
>> PDF) I was able to validate that all of the 'key' features that we needed
>> to implement opentelemetry for airflow has been released and available.
>>
>> Yeah. I concur with that Howard wrote.  What was there a few months
>> ago a little "shaky", becomes more and more solid as time passes.  And
>> I also spoke with a few people who are involved in OpenTelemetry
>> standard and development (one of my friends is site-lead for Sumo
>> Logic Poland and they take active part in that effort and I just spoke
>> to him about it). Also AWS is very much vested in it as far as I
>> understand - I also spoke to Google and they are very much supporting
>> OTEL as industry standard. I think there is a firm industry backing
>> behind OTEL and there is no way it "won't progress" or "falter". This
>> is a little bit of a "leap of faith" that it will become fully
>> featured for our needs, but I think what is already there is "enough"
>> to justify the move and anything that comes out of Beta is a bonus.
>> Eventually if things will not go fast enough for us - we can also
>> actually contribute there on the "collection" level.  It will likely
>> have a much better outcome than if we try to integrate airflow with
>> multiple services ourselves - we will just have to make sure things
>> get collected "well" - and then all the different services will more
>> likely than not write exporters for it.
>>
>> > 4. This is a debatable topic - and I believe those may not be a part of
>> the core airflow code base, but would suggest perhaps we could create a
>> 'contribution' repo which may maintain those third party assets (e.g.
>> Grafana dashboard, Datadog dashboard, etc.) that users may use and even
>> participate in maintaining it.
>>
>> Yeah. I think  this is also an opportunity for someone who is
>> specializing in those or even let Grafana add it to their "portfolio".
>> I can imagine there might be some basic dashboards generated by
>> companies which do some kinds of integrations (and with an option of
>> "come to us when you want more customizability". While we won't be
>> able to endorse those, we can easily let it land in the "ecosystem"
>> page of ours. I think also we can easily reach out (for example to
>> Grafana to do it - in the case of Grafana, Myrle Krantz, ex Treasurer
>> from ASF who I know well is a Senior Manager in Grafana responsible
>> for Cloud team).
>>
>>
>> > 5. POC may not have all the changes, as the work was to 'prototype' and
>> answer questions like 'what if' when opentelemetry is in place. The
>> proposal actually has link to a GIT repo that contains the changes that
>> were done during the POC:
>> https://github.com/howardyoo/airflow/tree/opentelemetry-poc-1 . Since
>> the details of the changes would make the existing PDF even more subjected
>> to TL;DR, I linked this git branch for anyone interested in the changes to
>> take a look. I hope this would be sufficient.
>>
>> I think Elad is right that some  more details need to "surface" from
>> the POC to AIP. While the changes are very little - they don't change
>> any flows or logic in Airflow, it's more to "selectively" add
>> collections and make sure common "Span" id is shared throughout the
>> code (which I think is the biggest change).
>> I think Howard, it might make sense to extract parts of the POC and
>> put them as an outline of changes to implement in the "AIP".
>>
>> The POC is more trace of what you've done, but I think simply
>> translating this into "this is what we need to do" for those who will
>> just read AIP is important. Our AIP's are much more "Technical" in
>> nature than most of the
>>
>> >
>> > 6. Yes, I agree - I will update the AIP proposal to make the scope more
>> clearer. Thanks for the feedback!
>>
>> Yeah. Adding "future improvements" and especially "This is what we are
>> not going to do and leave for later" is important part of every AIP. I
>> think it's sometimes much more important to state what we are NOT
>> going to vs. what we are going to do.
>>
>> >
>> > Howard
>> >
>> > On Sun, Apr 3, 2022 at 1:27 PM Elad Kalif <el...@apache.org> wrote:
>> >>
>> >> Thanks Howard!
>> >> Sorry for the delay, this was a long read. The PDF alone is 19 pages
>> :)
>> >> looks very good!
>> >>
>> >>  I have 6 questions/points to raise:
>> >>
>> >> 1. I'm not clear about what is to happen with StatsD .
>> >> It states "Make OpenTelemetry and StatsD optional and interchangeable."
>> >> But do we want to support both in the long run?
>> >> It doesn't specify if we are deprecating statsD along with completion
>> of this AIP.
>> >>
>> >> 2. regarding adding metrics.
>> >> Do we intend to let users define their own KPIs/metrics to be measured
>> or it will be a closed list set by Airflow?
>> >>
>> >> 3. The POC specifies it uses a feature of open-telemetry (add metrics)
>> which was not yet released. Do we know the timeline for the feature to be
>> released?
>> >> Can we vote on a plan to use something which is not yet publicly
>> available?
>> >>
>> >> 4. Are the Grafana dashboards / other dashboards to be part of the
>> Airflow core code base?
>> >> I wonder if this should be in a dedicated repo?
>> >>
>> >> 5. Could you please clarify in the AIP page what changes are required
>> to the Airflow code base?
>> >> Some of them appear in the PDF but I'm not sure if that is all of them?
>> >>
>> >> 6. The PDF has several open questions/ideas.
>> >> I think it would be best to add to the AIP a scope paragraph listing
>> what will be included in the first phase and what is left for other phases.
>> >>
>> >> On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> >>>
>> >>> Or maybe that people are so stunned by the beauty and usefulness of it
>> >>> that they cannot even say a word :)
>> >>>
>> >>> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com>
>> wrote:
>> >>> >
>> >>> > I think it may mean all is well, perhaps :-)
>> >>> > Howard
>> >>> >
>> >>> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>> >>> >>
>> >>> >> Hello everyone,
>> >>> >>
>> >>> >> Would be great to get some comments and reviews  - especially from
>> >>> >> those who are users and are doing monitoring. Howard made a lot of
>> >>> >> effort to show examples of how OTEL might help in this.
>> >>> >>
>> >>> >> Otherwise, is the silence sign that all is good ?
>> >>> >>
>> >>> >> J.
>> >>> >>
>> >>> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>> >>> >> >
>> >>> >> > And let me add to it - we laid some foundations for it with
>> Melodie
>> >>> >> > Ezeani - the Outreachy intern where we did some internal
>> integration
>> >>> >> > work that let us understand the challenges and state of
>> >>> >> > open-telemetry.
>> >>> >> >
>> >>> >> > I am super excited about what open-telemetry can bring to
>> Airflow -
>> >>> >> > both short term (in metrics and instrumentation) and longer term
>> - in
>> >>> >> > logging when logging is mature enough.
>> >>> >> >
>> >>> >> > The open-telemetry mov is at the heart of the principles we have
>> -
>> >>> >> > making Airflow "modern" but at the same time delegating what's
>> not
>> >>> >> > "core" to those who can do it better. Open Telemetry is precisely
>> >>> >> > about that,
>> >>> >> >
>> >>> >> > Howard particularly brought a great experience from using Open
>> >>> >> > Telemetry before and making some good judgements and POC on the
>> >>> >> > metrics produced by Airflow and how they can be useful from the
>> >>> >> > "maintainer value" side. I looked at it from the technical
>> integration
>> >>> >> > POV - and Melodie helped to validate some of the assumptions and
>> >>> >> > expose some of the technical challenges.
>> >>> >> > The composite result is good, but we are looking with Howard on
>> some
>> >>> >> > insightful comments and critique - especially from the users of
>> >>> >> > Airflow!
>> >>> >> >
>> >>> >> > I look forward to more cool stuff on Airflow!
>> >>> >> >
>> >>> >> > J.
>> >>> >> >
>> >>> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <ho...@gmail.com>
>> wrote:
>> >>> >> > >
>> >>> >> > > Hi all,
>> >>> >> > >
>> >>> >> > > I am pleased to announce the start of the discussion for the
>> new AIP draft that was recently been published:
>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>> >>> >> > >
>> >>> >> > > Jarek Potuik and I have been discussing about this proposal
>> since early this year. During that time, we worked together on drafting
>> this proposal, as well as doing another round of mini-POC to refresh and
>> test on feasibility of OpenTelemetry on Apache Airflow.
>> >>> >> > >
>> >>> >> > > As the POC was successful in terms of testing the
>> OpenTelemetry on Airflow, we would like to expand the discussion to a wider
>> user community here this mailing list to gather more consensus, comments,
>> and feedbacks regarding this AIP.
>> >>> >> > >
>> >>> >> > > Sincerely,
>> >>> >> > > Howard and Jarek
>>
>

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Elad Kalif <el...@apache.org>.
Hi Howard,
Have you made progress with addressing the comments?
I think once points are addressed we can maybe start a vote?

On Tue, Apr 12, 2022 at 12:39 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> > 1. I would assume the path for StatsD (dogstatsd) would be for
> deprecation - we will perhaps comment or mark it as deprecation - and
> should follow the established (or usual?) process of feature deprecation of
> Airflow, once the opentelemetry is in place.
>
> Yep. Deprecation should be. And maybe accompanied with a "statsd
> open-telemetry exporter" ? I think eventually we should have just OTEL
> stats and nothing else. The current "configurable" metrics class
> should be fully replaced with "OTEL configurable" metrics. Same as
> (but that's a much longer and uncertain path) logging configuration
> should get replaced if/when OTEL logging lives to its promises.
>
> > 2. Airflow will provide its own list of 'instrumented metrics' out of
> the box - with the list specified in its documentation so that users would
> be aware of them. However, with opentelemetry, the users will also have the
> ability to add their 'custom' metrics / traces / logs to be collected via
> opentelemetry as needed. That option and how-to's should also be documented.
>
> Good point. Actually the nice thing is that adding new metrics could
> then be done as part of task execution for example - so we should
> indeed have some libraries/tools or maybe even operator's /task
> interface should have some built-in capabilities and allow for
> declarative ways of adding metrics (but this can be added as a
> follow-up - it does not need to be described and hashed out yet IMHO -
> but we can add it to the docs as "future improvement") .
>
> > 3. A little clarification - I believe there were two POC's - which at
> the time during the first POC, it had lacked some of the features (e.g.
> traces and logs). However, in the second POC, there is a stable release of
> Traces, and beta release of logs, so things have been progressing.
> Opentelemetry is highly evolving and many things do change and get added
> relatively quickly. During my second POC (the one mentioned in the attached
> PDF) I was able to validate that all of the 'key' features that we needed
> to implement opentelemetry for airflow has been released and available.
>
> Yeah. I concur with that Howard wrote.  What was there a few months
> ago a little "shaky", becomes more and more solid as time passes.  And
> I also spoke with a few people who are involved in OpenTelemetry
> standard and development (one of my friends is site-lead for Sumo
> Logic Poland and they take active part in that effort and I just spoke
> to him about it). Also AWS is very much vested in it as far as I
> understand - I also spoke to Google and they are very much supporting
> OTEL as industry standard. I think there is a firm industry backing
> behind OTEL and there is no way it "won't progress" or "falter". This
> is a little bit of a "leap of faith" that it will become fully
> featured for our needs, but I think what is already there is "enough"
> to justify the move and anything that comes out of Beta is a bonus.
> Eventually if things will not go fast enough for us - we can also
> actually contribute there on the "collection" level.  It will likely
> have a much better outcome than if we try to integrate airflow with
> multiple services ourselves - we will just have to make sure things
> get collected "well" - and then all the different services will more
> likely than not write exporters for it.
>
> > 4. This is a debatable topic - and I believe those may not be a part of
> the core airflow code base, but would suggest perhaps we could create a
> 'contribution' repo which may maintain those third party assets (e.g.
> Grafana dashboard, Datadog dashboard, etc.) that users may use and even
> participate in maintaining it.
>
> Yeah. I think  this is also an opportunity for someone who is
> specializing in those or even let Grafana add it to their "portfolio".
> I can imagine there might be some basic dashboards generated by
> companies which do some kinds of integrations (and with an option of
> "come to us when you want more customizability". While we won't be
> able to endorse those, we can easily let it land in the "ecosystem"
> page of ours. I think also we can easily reach out (for example to
> Grafana to do it - in the case of Grafana, Myrle Krantz, ex Treasurer
> from ASF who I know well is a Senior Manager in Grafana responsible
> for Cloud team).
>
>
> > 5. POC may not have all the changes, as the work was to 'prototype' and
> answer questions like 'what if' when opentelemetry is in place. The
> proposal actually has link to a GIT repo that contains the changes that
> were done during the POC:
> https://github.com/howardyoo/airflow/tree/opentelemetry-poc-1 . Since the
> details of the changes would make the existing PDF even more subjected to
> TL;DR, I linked this git branch for anyone interested in the changes to
> take a look. I hope this would be sufficient.
>
> I think Elad is right that some  more details need to "surface" from
> the POC to AIP. While the changes are very little - they don't change
> any flows or logic in Airflow, it's more to "selectively" add
> collections and make sure common "Span" id is shared throughout the
> code (which I think is the biggest change).
> I think Howard, it might make sense to extract parts of the POC and
> put them as an outline of changes to implement in the "AIP".
>
> The POC is more trace of what you've done, but I think simply
> translating this into "this is what we need to do" for those who will
> just read AIP is important. Our AIP's are much more "Technical" in
> nature than most of the
>
> >
> > 6. Yes, I agree - I will update the AIP proposal to make the scope more
> clearer. Thanks for the feedback!
>
> Yeah. Adding "future improvements" and especially "This is what we are
> not going to do and leave for later" is important part of every AIP. I
> think it's sometimes much more important to state what we are NOT
> going to vs. what we are going to do.
>
> >
> > Howard
> >
> > On Sun, Apr 3, 2022 at 1:27 PM Elad Kalif <el...@apache.org> wrote:
> >>
> >> Thanks Howard!
> >> Sorry for the delay, this was a long read. The PDF alone is 19 pages  :)
> >> looks very good!
> >>
> >>  I have 6 questions/points to raise:
> >>
> >> 1. I'm not clear about what is to happen with StatsD .
> >> It states "Make OpenTelemetry and StatsD optional and interchangeable."
> >> But do we want to support both in the long run?
> >> It doesn't specify if we are deprecating statsD along with completion
> of this AIP.
> >>
> >> 2. regarding adding metrics.
> >> Do we intend to let users define their own KPIs/metrics to be measured
> or it will be a closed list set by Airflow?
> >>
> >> 3. The POC specifies it uses a feature of open-telemetry (add metrics)
> which was not yet released. Do we know the timeline for the feature to be
> released?
> >> Can we vote on a plan to use something which is not yet publicly
> available?
> >>
> >> 4. Are the Grafana dashboards / other dashboards to be part of the
> Airflow core code base?
> >> I wonder if this should be in a dedicated repo?
> >>
> >> 5. Could you please clarify in the AIP page what changes are required
> to the Airflow code base?
> >> Some of them appear in the PDF but I'm not sure if that is all of them?
> >>
> >> 6. The PDF has several open questions/ideas.
> >> I think it would be best to add to the AIP a scope paragraph listing
> what will be included in the first phase and what is left for other phases.
> >>
> >> On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >>>
> >>> Or maybe that people are so stunned by the beauty and usefulness of it
> >>> that they cannot even say a word :)
> >>>
> >>> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com>
> wrote:
> >>> >
> >>> > I think it may mean all is well, perhaps :-)
> >>> > Howard
> >>> >
> >>> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com>
> wrote:
> >>> >>
> >>> >> Hello everyone,
> >>> >>
> >>> >> Would be great to get some comments and reviews  - especially from
> >>> >> those who are users and are doing monitoring. Howard made a lot of
> >>> >> effort to show examples of how OTEL might help in this.
> >>> >>
> >>> >> Otherwise, is the silence sign that all is good ?
> >>> >>
> >>> >> J.
> >>> >>
> >>> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com>
> wrote:
> >>> >> >
> >>> >> > And let me add to it - we laid some foundations for it with
> Melodie
> >>> >> > Ezeani - the Outreachy intern where we did some internal
> integration
> >>> >> > work that let us understand the challenges and state of
> >>> >> > open-telemetry.
> >>> >> >
> >>> >> > I am super excited about what open-telemetry can bring to Airflow
> -
> >>> >> > both short term (in metrics and instrumentation) and longer term
> - in
> >>> >> > logging when logging is mature enough.
> >>> >> >
> >>> >> > The open-telemetry mov is at the heart of the principles we have -
> >>> >> > making Airflow "modern" but at the same time delegating what's not
> >>> >> > "core" to those who can do it better. Open Telemetry is precisely
> >>> >> > about that,
> >>> >> >
> >>> >> > Howard particularly brought a great experience from using Open
> >>> >> > Telemetry before and making some good judgements and POC on the
> >>> >> > metrics produced by Airflow and how they can be useful from the
> >>> >> > "maintainer value" side. I looked at it from the technical
> integration
> >>> >> > POV - and Melodie helped to validate some of the assumptions and
> >>> >> > expose some of the technical challenges.
> >>> >> > The composite result is good, but we are looking with Howard on
> some
> >>> >> > insightful comments and critique - especially from the users of
> >>> >> > Airflow!
> >>> >> >
> >>> >> > I look forward to more cool stuff on Airflow!
> >>> >> >
> >>> >> > J.
> >>> >> >
> >>> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <ho...@gmail.com>
> wrote:
> >>> >> > >
> >>> >> > > Hi all,
> >>> >> > >
> >>> >> > > I am pleased to announce the start of the discussion for the
> new AIP draft that was recently been published:
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
> >>> >> > >
> >>> >> > > Jarek Potuik and I have been discussing about this proposal
> since early this year. During that time, we worked together on drafting
> this proposal, as well as doing another round of mini-POC to refresh and
> test on feasibility of OpenTelemetry on Apache Airflow.
> >>> >> > >
> >>> >> > > As the POC was successful in terms of testing the OpenTelemetry
> on Airflow, we would like to expand the discussion to a wider user
> community here this mailing list to gather more consensus, comments, and
> feedbacks regarding this AIP.
> >>> >> > >
> >>> >> > > Sincerely,
> >>> >> > > Howard and Jarek
>

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Jarek Potiuk <ja...@potiuk.com>.
> 1. I would assume the path for StatsD (dogstatsd) would be for deprecation - we will perhaps comment or mark it as deprecation - and should follow the established (or usual?) process of feature deprecation of Airflow, once the opentelemetry is in place.

Yep. Deprecation should be. And maybe accompanied with a "statsd
open-telemetry exporter" ? I think eventually we should have just OTEL
stats and nothing else. The current "configurable" metrics class
should be fully replaced with "OTEL configurable" metrics. Same as
(but that's a much longer and uncertain path) logging configuration
should get replaced if/when OTEL logging lives to its promises.

> 2. Airflow will provide its own list of 'instrumented metrics' out of the box - with the list specified in its documentation so that users would be aware of them. However, with opentelemetry, the users will also have the ability to add their 'custom' metrics / traces / logs to be collected via opentelemetry as needed. That option and how-to's should also be documented.

Good point. Actually the nice thing is that adding new metrics could
then be done as part of task execution for example - so we should
indeed have some libraries/tools or maybe even operator's /task
interface should have some built-in capabilities and allow for
declarative ways of adding metrics (but this can be added as a
follow-up - it does not need to be described and hashed out yet IMHO -
but we can add it to the docs as "future improvement") .

> 3. A little clarification - I believe there were two POC's - which at the time during the first POC, it had lacked some of the features (e.g. traces and logs). However, in the second POC, there is a stable release of Traces, and beta release of logs, so things have been progressing. Opentelemetry is highly evolving and many things do change and get added relatively quickly. During my second POC (the one mentioned in the attached PDF) I was able to validate that all of the 'key' features that we needed to implement opentelemetry for airflow has been released and available.

Yeah. I concur with that Howard wrote.  What was there a few months
ago a little "shaky", becomes more and more solid as time passes.  And
I also spoke with a few people who are involved in OpenTelemetry
standard and development (one of my friends is site-lead for Sumo
Logic Poland and they take active part in that effort and I just spoke
to him about it). Also AWS is very much vested in it as far as I
understand - I also spoke to Google and they are very much supporting
OTEL as industry standard. I think there is a firm industry backing
behind OTEL and there is no way it "won't progress" or "falter". This
is a little bit of a "leap of faith" that it will become fully
featured for our needs, but I think what is already there is "enough"
to justify the move and anything that comes out of Beta is a bonus.
Eventually if things will not go fast enough for us - we can also
actually contribute there on the "collection" level.  It will likely
have a much better outcome than if we try to integrate airflow with
multiple services ourselves - we will just have to make sure things
get collected "well" - and then all the different services will more
likely than not write exporters for it.

> 4. This is a debatable topic - and I believe those may not be a part of the core airflow code base, but would suggest perhaps we could create a 'contribution' repo which may maintain those third party assets (e.g. Grafana dashboard, Datadog dashboard, etc.) that users may use and even participate in maintaining it.

Yeah. I think  this is also an opportunity for someone who is
specializing in those or even let Grafana add it to their "portfolio".
I can imagine there might be some basic dashboards generated by
companies which do some kinds of integrations (and with an option of
"come to us when you want more customizability". While we won't be
able to endorse those, we can easily let it land in the "ecosystem"
page of ours. I think also we can easily reach out (for example to
Grafana to do it - in the case of Grafana, Myrle Krantz, ex Treasurer
from ASF who I know well is a Senior Manager in Grafana responsible
for Cloud team).


> 5. POC may not have all the changes, as the work was to 'prototype' and answer questions like 'what if' when opentelemetry is in place. The proposal actually has link to a GIT repo that contains the changes that were done during the POC: https://github.com/howardyoo/airflow/tree/opentelemetry-poc-1 . Since the details of the changes would make the existing PDF even more subjected to TL;DR, I linked this git branch for anyone interested in the changes to take a look. I hope this would be sufficient.

I think Elad is right that some  more details need to "surface" from
the POC to AIP. While the changes are very little - they don't change
any flows or logic in Airflow, it's more to "selectively" add
collections and make sure common "Span" id is shared throughout the
code (which I think is the biggest change).
I think Howard, it might make sense to extract parts of the POC and
put them as an outline of changes to implement in the "AIP".

The POC is more trace of what you've done, but I think simply
translating this into "this is what we need to do" for those who will
just read AIP is important. Our AIP's are much more "Technical" in
nature than most of the

>
> 6. Yes, I agree - I will update the AIP proposal to make the scope more clearer. Thanks for the feedback!

Yeah. Adding "future improvements" and especially "This is what we are
not going to do and leave for later" is important part of every AIP. I
think it's sometimes much more important to state what we are NOT
going to vs. what we are going to do.

>
> Howard
>
> On Sun, Apr 3, 2022 at 1:27 PM Elad Kalif <el...@apache.org> wrote:
>>
>> Thanks Howard!
>> Sorry for the delay, this was a long read. The PDF alone is 19 pages  :)
>> looks very good!
>>
>>  I have 6 questions/points to raise:
>>
>> 1. I'm not clear about what is to happen with StatsD .
>> It states "Make OpenTelemetry and StatsD optional and interchangeable."
>> But do we want to support both in the long run?
>> It doesn't specify if we are deprecating statsD along with completion of this AIP.
>>
>> 2. regarding adding metrics.
>> Do we intend to let users define their own KPIs/metrics to be measured or it will be a closed list set by Airflow?
>>
>> 3. The POC specifies it uses a feature of open-telemetry (add metrics) which was not yet released. Do we know the timeline for the feature to be released?
>> Can we vote on a plan to use something which is not yet publicly available?
>>
>> 4. Are the Grafana dashboards / other dashboards to be part of the Airflow core code base?
>> I wonder if this should be in a dedicated repo?
>>
>> 5. Could you please clarify in the AIP page what changes are required to the Airflow code base?
>> Some of them appear in the PDF but I'm not sure if that is all of them?
>>
>> 6. The PDF has several open questions/ideas.
>> I think it would be best to add to the AIP a scope paragraph listing what will be included in the first phase and what is left for other phases.
>>
>> On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>
>>> Or maybe that people are so stunned by the beauty and usefulness of it
>>> that they cannot even say a word :)
>>>
>>> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com> wrote:
>>> >
>>> > I think it may mean all is well, perhaps :-)
>>> > Howard
>>> >
>>> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>> >>
>>> >> Hello everyone,
>>> >>
>>> >> Would be great to get some comments and reviews  - especially from
>>> >> those who are users and are doing monitoring. Howard made a lot of
>>> >> effort to show examples of how OTEL might help in this.
>>> >>
>>> >> Otherwise, is the silence sign that all is good ?
>>> >>
>>> >> J.
>>> >>
>>> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>> >> >
>>> >> > And let me add to it - we laid some foundations for it with Melodie
>>> >> > Ezeani - the Outreachy intern where we did some internal integration
>>> >> > work that let us understand the challenges and state of
>>> >> > open-telemetry.
>>> >> >
>>> >> > I am super excited about what open-telemetry can bring to Airflow -
>>> >> > both short term (in metrics and instrumentation) and longer term - in
>>> >> > logging when logging is mature enough.
>>> >> >
>>> >> > The open-telemetry mov is at the heart of the principles we have -
>>> >> > making Airflow "modern" but at the same time delegating what's not
>>> >> > "core" to those who can do it better. Open Telemetry is precisely
>>> >> > about that,
>>> >> >
>>> >> > Howard particularly brought a great experience from using Open
>>> >> > Telemetry before and making some good judgements and POC on the
>>> >> > metrics produced by Airflow and how they can be useful from the
>>> >> > "maintainer value" side. I looked at it from the technical integration
>>> >> > POV - and Melodie helped to validate some of the assumptions and
>>> >> > expose some of the technical challenges.
>>> >> > The composite result is good, but we are looking with Howard on some
>>> >> > insightful comments and critique - especially from the users of
>>> >> > Airflow!
>>> >> >
>>> >> > I look forward to more cool stuff on Airflow!
>>> >> >
>>> >> > J.
>>> >> >
>>> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <ho...@gmail.com> wrote:
>>> >> > >
>>> >> > > Hi all,
>>> >> > >
>>> >> > > I am pleased to announce the start of the discussion for the new AIP draft that was recently been published: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>>> >> > >
>>> >> > > Jarek Potuik and I have been discussing about this proposal since early this year. During that time, we worked together on drafting this proposal, as well as doing another round of mini-POC to refresh and test on feasibility of OpenTelemetry on Apache Airflow.
>>> >> > >
>>> >> > > As the POC was successful in terms of testing the OpenTelemetry on Airflow, we would like to expand the discussion to a wider user community here this mailing list to gather more consensus, comments, and feedbacks regarding this AIP.
>>> >> > >
>>> >> > > Sincerely,
>>> >> > > Howard and Jarek

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Howard Yoo <ho...@gmail.com>.
Thank you, Elad!
Yeah, it might have been TL;DR; moment on the PDF, but the description of
the work required such amount, unfortunately (thanks for bearing with it!).

I may not have all the answers, but do have some opinions on your 6
questions/points, so if I may add a little of them:

1. I would assume the path for StatsD (dogstatsd) would be for deprecation
- we will perhaps comment or mark it as deprecation - and should follow the
established (or usual?) process of feature deprecation of Airflow, once the
opentelemetry is in place.

2. Airflow will provide its own list of 'instrumented metrics' out of the
box - with the list specified in its documentation so that users would be
aware of them. However, with opentelemetry, the users will also have the
ability to add their 'custom' metrics / traces / logs to be collected via
opentelemetry as needed. That option and how-to's should also be documented.

3. A little clarification - I believe there were two POC's - which at the
time during the first POC, it had lacked some of the features (e.g. traces
and logs). However, in the second POC, there is a stable release of Traces,
and beta release of logs, so things have been progressing. Opentelemetry is
highly evolving and many things do change and get added relatively quickly.
During my second POC (the one mentioned in the attached PDF) I was able to
validate that all of the 'key' features that we needed to implement
opentelemetry for airflow has been released and available.

4. This is a debatable topic - and I believe those may not be a part of the
core airflow code base, but would suggest perhaps we could create a
'contribution' repo which may maintain those third party assets (e.g.
Grafana dashboard, Datadog dashboard, etc.) that users may use and even
participate in maintaining it.

5. POC may not have all the changes, as the work was to 'prototype' and
answer questions like 'what if' when opentelemetry is in place. The
proposal actually has link to a GIT repo that contains the changes that
were done during the POC:
https://github.com/howardyoo/airflow/tree/opentelemetry-poc-1 . Since the
details of the changes would make the existing PDF even more subjected to
TL;DR, I linked this git branch for anyone interested in the changes to
take a look. I hope this would be sufficient.

6. Yes, I agree - I will update the AIP proposal to make the scope more
clearer. Thanks for the feedback!

Howard

On Sun, Apr 3, 2022 at 1:27 PM Elad Kalif <el...@apache.org> wrote:

> Thanks Howard!
> Sorry for the delay, this was a long read. The PDF alone is 19 pages  :)
> looks very good!
>
>  I have 6 questions/points to raise:
>
> 1. I'm not clear about what is to happen with StatsD .
> It states "Make OpenTelemetry and StatsD optional and interchangeable."
> But do we want to support both in the long run?
> It doesn't specify if we are deprecating statsD along with completion of
> this AIP.
>
> 2. regarding adding metrics.
> Do we intend to let users define their own KPIs/metrics to be measured or
> it will be a closed list set by Airflow?
>
> 3. The POC specifies it uses a feature of open-telemetry (add metrics)
> which was not yet released. Do we know the timeline for the feature to be
> released?
> Can we vote on a plan to use something which is not yet publicly available?
>
> 4. Are the Grafana dashboards / other dashboards to be part of the Airflow
> core code base?
> I wonder if this should be in a dedicated repo?
>
> 5. Could you please clarify in the AIP page what changes are required to
> the Airflow code base?
> Some of them appear in the PDF but I'm not sure if that is all of them?
>
> 6. The PDF has several open questions/ideas.
> I think it would be best to add to the AIP a scope paragraph listing what
> will be included in the first phase and what is left for other phases.
>
> On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>> Or maybe that people are so stunned by the beauty and usefulness of it
>> that they cannot even say a word :)
>>
>> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com> wrote:
>> >
>> > I think it may mean all is well, perhaps :-)
>> > Howard
>> >
>> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>> >>
>> >> Hello everyone,
>> >>
>> >> Would be great to get some comments and reviews  - especially from
>> >> those who are users and are doing monitoring. Howard made a lot of
>> >> effort to show examples of how OTEL might help in this.
>> >>
>> >> Otherwise, is the silence sign that all is good ?
>> >>
>> >> J.
>> >>
>> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>> >> >
>> >> > And let me add to it - we laid some foundations for it with Melodie
>> >> > Ezeani - the Outreachy intern where we did some internal integration
>> >> > work that let us understand the challenges and state of
>> >> > open-telemetry.
>> >> >
>> >> > I am super excited about what open-telemetry can bring to Airflow -
>> >> > both short term (in metrics and instrumentation) and longer term - in
>> >> > logging when logging is mature enough.
>> >> >
>> >> > The open-telemetry mov is at the heart of the principles we have -
>> >> > making Airflow "modern" but at the same time delegating what's not
>> >> > "core" to those who can do it better. Open Telemetry is precisely
>> >> > about that,
>> >> >
>> >> > Howard particularly brought a great experience from using Open
>> >> > Telemetry before and making some good judgements and POC on the
>> >> > metrics produced by Airflow and how they can be useful from the
>> >> > "maintainer value" side. I looked at it from the technical
>> integration
>> >> > POV - and Melodie helped to validate some of the assumptions and
>> >> > expose some of the technical challenges.
>> >> > The composite result is good, but we are looking with Howard on some
>> >> > insightful comments and critique - especially from the users of
>> >> > Airflow!
>> >> >
>> >> > I look forward to more cool stuff on Airflow!
>> >> >
>> >> > J.
>> >> >
>> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <ho...@gmail.com>
>> wrote:
>> >> > >
>> >> > > Hi all,
>> >> > >
>> >> > > I am pleased to announce the start of the discussion for the new
>> AIP draft that was recently been published:
>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
>> >> > >
>> >> > > Jarek Potuik and I have been discussing about this proposal since
>> early this year. During that time, we worked together on drafting this
>> proposal, as well as doing another round of mini-POC to refresh and test on
>> feasibility of OpenTelemetry on Apache Airflow.
>> >> > >
>> >> > > As the POC was successful in terms of testing the OpenTelemetry on
>> Airflow, we would like to expand the discussion to a wider user community
>> here this mailing list to gather more consensus, comments, and feedbacks
>> regarding this AIP.
>> >> > >
>> >> > > Sincerely,
>> >> > > Howard and Jarek
>>
>

Re: [DISCUSSION] AIP-49 OpenTelemetry Support for Apache Airflow

Posted by Elad Kalif <el...@apache.org>.
Thanks Howard!
Sorry for the delay, this was a long read. The PDF alone is 19 pages  :)
looks very good!

 I have 6 questions/points to raise:

1. I'm not clear about what is to happen with StatsD .
It states "Make OpenTelemetry and StatsD optional and interchangeable."
But do we want to support both in the long run?
It doesn't specify if we are deprecating statsD along with completion of
this AIP.

2. regarding adding metrics.
Do we intend to let users define their own KPIs/metrics to be measured or
it will be a closed list set by Airflow?

3. The POC specifies it uses a feature of open-telemetry (add metrics)
which was not yet released. Do we know the timeline for the feature to be
released?
Can we vote on a plan to use something which is not yet publicly available?

4. Are the Grafana dashboards / other dashboards to be part of the Airflow
core code base?
I wonder if this should be in a dedicated repo?

5. Could you please clarify in the AIP page what changes are required to
the Airflow code base?
Some of them appear in the PDF but I'm not sure if that is all of them?

6. The PDF has several open questions/ideas.
I think it would be best to add to the AIP a scope paragraph listing what
will be included in the first phase and what is left for other phases.

On Fri, Apr 1, 2022 at 4:26 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Or maybe that people are so stunned by the beauty and usefulness of it
> that they cannot even say a word :)
>
> On Thu, Mar 31, 2022 at 11:37 PM Howard Yoo <ho...@gmail.com> wrote:
> >
> > I think it may mean all is well, perhaps :-)
> > Howard
> >
> > On Thu, Mar 31, 2022 at 9:29 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >>
> >> Hello everyone,
> >>
> >> Would be great to get some comments and reviews  - especially from
> >> those who are users and are doing monitoring. Howard made a lot of
> >> effort to show examples of how OTEL might help in this.
> >>
> >> Otherwise, is the silence sign that all is good ?
> >>
> >> J.
> >>
> >> On Fri, Mar 25, 2022 at 9:53 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> >> >
> >> > And let me add to it - we laid some foundations for it with Melodie
> >> > Ezeani - the Outreachy intern where we did some internal integration
> >> > work that let us understand the challenges and state of
> >> > open-telemetry.
> >> >
> >> > I am super excited about what open-telemetry can bring to Airflow -
> >> > both short term (in metrics and instrumentation) and longer term - in
> >> > logging when logging is mature enough.
> >> >
> >> > The open-telemetry mov is at the heart of the principles we have -
> >> > making Airflow "modern" but at the same time delegating what's not
> >> > "core" to those who can do it better. Open Telemetry is precisely
> >> > about that,
> >> >
> >> > Howard particularly brought a great experience from using Open
> >> > Telemetry before and making some good judgements and POC on the
> >> > metrics produced by Airflow and how they can be useful from the
> >> > "maintainer value" side. I looked at it from the technical integration
> >> > POV - and Melodie helped to validate some of the assumptions and
> >> > expose some of the technical challenges.
> >> > The composite result is good, but we are looking with Howard on some
> >> > insightful comments and critique - especially from the users of
> >> > Airflow!
> >> >
> >> > I look forward to more cool stuff on Airflow!
> >> >
> >> > J.
> >> >
> >> > On Fri, Mar 25, 2022 at 9:39 PM Howard Yoo <ho...@gmail.com>
> wrote:
> >> > >
> >> > > Hi all,
> >> > >
> >> > > I am pleased to announce the start of the discussion for the new
> AIP draft that was recently been published:
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-49+OpenTelemetry+Support+for+Apache+Airflow
> >> > >
> >> > > Jarek Potuik and I have been discussing about this proposal since
> early this year. During that time, we worked together on drafting this
> proposal, as well as doing another round of mini-POC to refresh and test on
> feasibility of OpenTelemetry on Apache Airflow.
> >> > >
> >> > > As the POC was successful in terms of testing the OpenTelemetry on
> Airflow, we would like to expand the discussion to a wider user community
> here this mailing list to gather more consensus, comments, and feedbacks
> regarding this AIP.
> >> > >
> >> > > Sincerely,
> >> > > Howard and Jarek
>