Posted to dev@nifi.apache.org by Joe Witt <jo...@gmail.com> on 2023/05/17 17:05:27 UTC

Re: OpenTelemetry Integration

Brian Putt, All

Are you aware of any good tools/services that can ingest the traces and
provide an interesting view/story/reporting on them?

I could see us emitting OTEL events instead of our current provenance
mechanism and using them both internally, to do what we already do, and as
a clear, spec-friendly way of exporting that data to others.

Thanks

On Sat, Jul 30, 2022 at 7:43 AM Uwe@Moosheimer.com <Uw...@moosheimer.com>
wrote:

> Hello Brian, Bryan, Greg, NiFi devs,
>
> Integrating OpenTelemetry is a very good idea, especially since the major
> cloud providers also rely on it. This could also be interesting for
> Stateless NiFi.
>
> I have a suggestion that I would like to put up for discussion.
>
> Would it be useful to make a list of what extensions or new development
> would be helpful for a complete integration of OpenTelemetry?
>
> I'm thinking of ConsumeMQTT and PublishMQTT, for example. Currently these
> support at most MQTT version 3.1.1, but version 5 introduced User
> Properties, which are similar to HTTP header fields.
> Thus one could implement OpenTelemetry in the MQTT processors in much the
> same way as in HTTP.
>
> With a list we could make an overview of the "necessary" adjustments and
> advertise for support.
>
> If what I write is nonsense, then I may not have understood something and
> I take it all back :)
>
> Best regards
> Kay-Uwe Moosheimer
>
> > On 29.07.2022 at 05:09, Brian Putt <pu...@gmail.com> wrote:
> >
> > Hello Bryan / Greg / NiFi devs,
> >
> > Distributed tracing (DT) is similar to provenance in that it shows the
> > path a particular flowfile travels, but its core selling point is that it
> > supports tracing across multiple systems/services regardless of what's
> > receiving the data. Provenance is a fantastic feature, but there are
> > instances where one might want to draw the bigger picture and identify
> > bottlenecks as data flows from one system to another, where that system
> > may or may not be using NiFi.
> >
> > DT utilizes three ids: traceId, parentId, and spanId. While a tree can be
> > built using two ids, the third id (traceId) helps bring all of the
> > relevant information out of a datastore more easily.
> > DT is focused more on performance and identifying bottlenecks in one or
> > more systems. Imagine if NiFi were receiving data from various sources
> > (e.g. HTTP, Kafka, SQS) and NiFi egressed to other destinations (HTTP,
> > Kafka, NiFi).
> > DT provides a spec that we'd be able to follow to correlate the data as
> > it traverses from system to system. Each system that participates in the
> > DT ecosystem would simply emit information (a trace is made up of one or
> > more spans) and there'd be a collection system which would aggregate all
> > of these spans, draw a bigger picture of the path that data went
> > through, and help identify key bottlenecks.
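
To make the role of the three ids concrete, here is a minimal sketch in Java of how a collector might group spans by traceId and then link them into a tree via parentId. The class names SpanRecord and TraceAssembler are hypothetical and exist only for illustration; real OTEL spans carry many more fields (timestamps, attributes, status), and a real collector would not work this way internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified span holding only the three ids and a name. */
final class SpanRecord {
    final String traceId;
    final String spanId;
    final String parentId; // null for a root span
    final String name;

    SpanRecord(String traceId, String spanId, String parentId, String name) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
        this.name = name;
    }
}

/** Hypothetical helper that reassembles traces from a flat list of spans. */
final class TraceAssembler {
    /** Groups spans by traceId so that one trace can be fetched in a single lookup. */
    static Map<String, List<SpanRecord>> groupByTrace(List<SpanRecord> spans) {
        Map<String, List<SpanRecord>> byTrace = new HashMap<>();
        for (SpanRecord s : spans) {
            byTrace.computeIfAbsent(s.traceId, k -> new ArrayList<>()).add(s);
        }
        return byTrace;
    }

    /** Maps each parent spanId to its children; root spans are keyed under "". */
    static Map<String, List<SpanRecord>> childrenByParent(List<SpanRecord> spansOfOneTrace) {
        Map<String, List<SpanRecord>> children = new HashMap<>();
        for (SpanRecord s : spansOfOneTrace) {
            String key = s.parentId == null ? "" : s.parentId;
            children.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
        }
        return children;
    }
}
```

The parent/child links alone are enough to build the tree; the traceId only saves the collector from walking those links to find every span that belongs together.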
> >
> > OpenTelemetry (OTEL) provides clients (across many languages, including
> > Java) with which developers can instrument their libraries' APIs and
> > participate in a DT ecosystem, as it adheres to the tracing spec.
> > Egressing trace data is possible without using OTEL, but then we may find
> > ourselves reinventing the wheel, although the result could be optimized
> > for NiFi.
> >
> > Creating a reporting task could certainly be a path, but I have a few
> > concerns with that:
> >
> > 1. If provenance is disabled, will provenance events still be emitted and
> > be collected by a new reporting task?
> > 2. There'll be an impact on performance; how much is unknown. OTEL is
> > gaining traction across industry and there are ways to mitigate the
> > performance impact, mainly sampling and the fact that *tracing is best
> > effort*. Spans would be emitted from NiFi via UDP to a collector on the
> > same network.
> > 3. Would there be any issues with appending a flowfile attribute that is
> > carried throughout the flow and maintains the traceId, parentSpanId,
> > and trace flags? See below for more details.
> >
> > There's a W3C spec (Trace Context) which includes a formatted string that
> > would be propagated to services (HTTP, Kafka, etc.). So if NiFi were to
> > put information onto Kafka, any consumers of that data would be able to
> > continue the trace and help draw the bigger picture.
> >
> > W3C Spec: https://www.w3.org/TR/trace-context/#traceparent-header
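
As an illustration of the formatted string mentioned above, the following Java sketch parses and re-serializes a version-00 traceparent value (version, trace-id, parent-id and trace-flags as lowercase hex, separated by dashes). The class name TraceParent is hypothetical, and the strict four-field check is a simplification: the spec allows future versions to append further fields, which this sketch simply rejects.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value class for a W3C traceparent string (version 00 layout only). */
final class TraceParent {
    // version(2 hex)-trace-id(32 hex)-parent-id(16 hex)-trace-flags(2 hex)
    private static final Pattern FORMAT =
        Pattern.compile("^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String version;
    final String traceId;
    final String parentId;
    final int flags;

    private TraceParent(String version, String traceId, String parentId, int flags) {
        this.version = version;
        this.traceId = traceId;
        this.parentId = parentId;
        this.flags = flags;
    }

    /** Returns the parsed value, or empty if the string is malformed or uses forbidden ids. */
    static Optional<TraceParent> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String version = m.group(1);
        String traceId = m.group(2);
        String parentId = m.group(3);
        // Version ff and all-zero ids are invalid according to the spec.
        if (version.equals("ff")
                || traceId.equals("0".repeat(32))
                || parentId.equals("0".repeat(16))) {
            return Optional.empty();
        }
        return Optional.of(new TraceParent(version, traceId, parentId,
                Integer.parseInt(m.group(4), 16)));
    }

    /** The lowest bit of trace-flags is the "sampled" flag. */
    boolean isSampled() {
        return (flags & 0x01) != 0;
    }

    /** Serializes back to the dash-separated wire format. */
    String format() {
        return String.format("%s-%s-%s-%02x", version, traceId, parentId, flags);
    }
}
```

A value like this could travel as an HTTP header, a Kafka record header, or a flowfile attribute; which carrier NiFi would use is left open by the thread.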
> >
> > For #2, since DT is focused on performance, sampling can help alleviate
> > chatter over the wire and ideally, 0.01% would draw the same picture as
> > 1% or 10%+. This is certainly different from provenance, as DT is focused
> > on performance over quality of the data and should not be thought of as
> > auditing.
> >
> > https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#sampler
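
A rough sketch of ratio-based sampling, in Java, may help here. It derives the decision from the trace id itself, so every system that sees the same trace reaches the same verdict without coordination. The class name RatioSampler is hypothetical, and the exact arithmetic is an assumption made for illustration; it is written in the spirit of the sampler described at the link above, not copied from the OTEL SDK.

```java
/** Hypothetical deterministic sampler: keeps roughly `ratio` of all traces. */
final class RatioSampler {
    private final double ratio;
    private final long threshold;

    RatioSampler(double ratio) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be within [0, 1]");
        }
        this.ratio = ratio;
        // Scale the ratio onto the range of non-negative 63-bit values.
        this.threshold = (long) (ratio * Long.MAX_VALUE);
    }

    /** Decides from the low 64 bits of a 32-hex-character trace id. */
    boolean shouldSample(String traceIdHex) {
        if (traceIdHex == null || traceIdHex.length() != 32) {
            throw new IllegalArgumentException("trace id must be 32 hex characters");
        }
        if (ratio >= 1.0) {
            return true; // sample everything
        }
        long low = Long.parseUnsignedLong(traceIdHex.substring(16), 16);
        long low63 = low & Long.MAX_VALUE; // drop the sign bit
        return low63 < threshold;
    }
}
```

With `new RatioSampler(0.0001)` about 0.01% of traces would be kept, assuming trace ids are uniformly random. Spans of an unsampled trace are simply never sent, which is why this is unsuitable for auditing.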
> >
> >> On Thu, Jul 28, 2022 at 5:01 PM Bryan Bende <bb...@gmail.com> wrote:
> >>
> >> Hi Greg,
> >>
> >> I don't really know anything about OpenTelemetry, but from the
> >> perspective of integrating something into the framework, some things
> >> to consider...
> >>
> >> Is there some way to piggy-back on provenance and use a ReportingTask
> >> to process provenance events and report something to OpenTelemetry?
> >>
> >> If something new does need to be added, it should probably be an
> >> extension point where there is an interface in the framework-api and
> >> different implementations can be plugged in.
> >> Ideally the framework itself wouldn't have any knowledge of
> >> OpenTelemetry specifically, it would only be reporting some
> >> information, which could then be used in some way by the OpenTelemetry
> >> implementation.
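
The extension-point idea above could look roughly like the following Java sketch. Everything here is hypothetical: FlowFileEventListener is not an existing interface in NiFi's framework-api, and FrameworkStub merely stands in for the framework to show that it would depend on the interface only, never on OpenTelemetry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical extension point; the framework would know only this interface. */
interface FlowFileEventListener {
    void onProcessed(String componentName, Map<String, String> flowFileAttributes, long durationNanos);
}

/** One pluggable implementation; a tracing plug-in would emit spans here instead. */
final class RecordingListener implements FlowFileEventListener {
    final List<String> seen = new ArrayList<>();

    @Override
    public void onProcessed(String componentName, Map<String, String> flowFileAttributes, long durationNanos) {
        seen.add(componentName + ":" + flowFileAttributes.getOrDefault("traceId", "none"));
    }
}

/** Stand-in for the framework side of the contract. */
final class FrameworkStub {
    private final List<FlowFileEventListener> listeners;

    FrameworkStub(List<FlowFileEventListener> listeners) {
        this.listeners = new ArrayList<>(listeners);
    }

    /** Notifies every listener; a failing listener must not break the flow. */
    void afterTrigger(String componentName, Map<String, String> attributes, long durationNanos) {
        for (FlowFileEventListener listener : listeners) {
            try {
                listener.onProcessed(componentName, attributes, durationNanos);
            } catch (RuntimeException e) {
                // Best effort: reporting problems are swallowed in this sketch.
            }
        }
    }
}
```

Whether listeners run inline like this or are handed events asynchronously would largely determine the performance impact raised below.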
> >>
> >> How does NiFi actually communicate with OpenTelemetry? Are you
> >> expecting to send data to OpenTelemetry in this new method you are
> >> suggesting?
> >> That would likely have a significant impact on the performance of the
> >> flow.
> >>
> >> Thanks,
> >>
> >> Bryan
> >>
> >>> On Thu, Jul 28, 2022 at 3:17 PM glmars3@uwe.nsa.gov
> >>> <glmars3@uwe.nsa.gov> wrote:
> >>>
> >>> NiFi Devs,
> >>>
> >>> My team and I are looking for guidance on how we can extend Apache
> >>> NiFi's capabilities. Specifically, we're looking to include distributed
> >>> tracing. We'll approach this effort as if we're the tracing experts and
> >>> simply seeking implementation guidance. Our developers have good
> >>> exposure to working with NiFi and creating custom processors. We plan to
> >>> fork the project to begin this effort but want to make sure we approach
> >>> this with the best possible direction for community adoption.
> >>>
> >>> Our initial thoughts on this approach would be to piggyback on how
> >>> Provenance was implemented. We essentially want to include a subroutine
> >>> or method that gets implicitly invoked upon a processor's 'onTrigger'
> >>> method. From there we would analyze the FlowFile's attributes to check
> >>> for the existence of 'traceId' and propagate it if found.
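
The attribute check described above might be sketched as follows in Java, using a plain attribute map in place of a real FlowFile so the example has no NiFi dependency. The class TraceAttributePropagator is hypothetical, and generating a fresh id when none is present is an assumption; the proposal itself only says that an existing 'traceId' would be propagated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper operating on a FlowFile-like attribute map. */
final class TraceAttributePropagator {
    static final String TRACE_ID_ATTR = "traceId"; // attribute name taken from the proposal

    /**
     * Returns a copy of the attributes that is guaranteed to carry a traceId.
     * An existing id is kept; otherwise a new one is generated (an assumption).
     */
    static Map<String, String> ensureTraceId(Map<String, String> attributes) {
        Map<String, String> out = new HashMap<>(attributes);
        String existing = out.get(TRACE_ID_ATTR);
        if (existing == null || existing.isEmpty()) {
            out.put(TRACE_ID_ATTR, newTraceId());
        }
        return out;
    }

    /** Produces a random, non-zero 128-bit id as 32 lowercase hex characters. */
    static String newTraceId() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        long hi;
        long lo;
        do {
            hi = random.nextLong();
            lo = random.nextLong();
        } while (hi == 0 && lo == 0);
        return String.format("%016x%016x", hi, lo);
    }
}
```

Returning a copy mirrors the fact that FlowFile attributes are not mutated in place; in NiFi itself the update would go through the process session rather than a map.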
> >>>
> >>> We can expound upon all of these tracing/observability details if that
> >>> helps. We're able to provide a more detailed scope of this task as
> >>> well, but for now we just want to get feedback on our overall goal and
> >>> proposed approach.
> >>>
> >>> Thanks,
> >>> Greg Marshall
> >>
>
>

Re: OpenTelemetry Integration

Posted by "Uwe@Moosheimer.com" <Uw...@moosheimer.com>.
Hello Brian,

Jaeger would be a good choice because it is very common (almost the standard with OpenTelemetry).
Have you looked at OpenLineage (https://openlineage.io/)? Possibly interesting?!

Thanks
Uwe

> On 23.05.2023 at 04:57, Brian Putt <pu...@gmail.com> wrote:
> 
> Hello Joe / All,
> 
> Jaeger and Grafana (w/ Tempo) offer comparable tools to visualize the trace
> data. I believe additional tools will be needed to get the most out of the
> trace data. We've been experimenting with a number of open source products
> to see what works best for the amount of trace data that NiFi emits. So
> far, Grafana Tempo, VictoriaMetrics, and ClickHouse seem to offer a good
> set of features to cover searching / viewing the traces along with
> summarizing certain flowfile attributes. As long as the trace data is in
> OTEL's format, the collector offers flexibility in exporting the data to a
> number of services with ease.
> 
> I would expect a PR to OTEL's java auto instrumentation project over the
> next few months that adds NiFi to its list of instrumentations. If the NiFi
> committers would like a demo / tech exchange to go over the current state
> of the tracing agent, we'd be happy to accommodate. As it stands, the agent
> utilizes flowfile attributes to pass along the tracestate so trace
> propagation can occur across NiFi to NiFi boundaries.
> 
> Thanks,
> 
> Brian
> 


Re: OpenTelemetry Integration

Posted by Pierre Villard <pi...@gmail.com>.
Brian,

> I would expect a PR to OTEL's java auto instrumentation project over the
> next few months that adds NiFi to its list of instrumentations. If the NiFi
> committers would like a demo / tech exchange to go over the current state
> of the tracing agent, we'd be happy to accommodate. As it stands, the agent
> utilizes flowfile attributes to pass along the tracestate so trace
> propagation can occur across NiFi to NiFi boundaries.
>

I think a lot of us would love to get on a call for a demo and discuss this
work. It sounds great!
How would you like to proceed?


On Tue, May 23, 2023 at 12:19, Michael Hogue <mi...@gmail.com> wrote:

> Related & complementary to Brian's tracing work would be modeling
> provenance/lineage in OTEL. I chatted with some OTEL maintainers about
> modeling provenance natively as a top-level Signal [1] at KubeCon and they
> advised raising an issue in the opentelemetry-specification repo for
> discussion:
> https://github.com/open-telemetry/opentelemetry-specification/issues/3447
>
> Their in-person response was largely that OTEL events could be used to
> model provenance, but I argued that the model should be standardized for
> use across tooling. This way, the provenance model wouldn't need to be
> maintained separately.
>
> My original thought was that we could have a standard provenance model in
> OTEL and shift NiFi to emit these instead of its own provenance events.
> Then we can tie provenance together across heterogeneous tooling using log
> visualization through something like Loki & Grafana, but there are many
> options in this space.
>
> Thanks,
> Mike
>
> [1] https://opentelemetry.io/docs/concepts/signals/
>

Re: OpenTelemetry Integration

Posted by Michael Hogue <mi...@gmail.com>.
Related & complementary to Brian's tracing work would be modeling
provenance/lineage in OTEL. I chatted with some OTEL maintainers about
modeling provenance natively as a top-level Signal [1] at KubeCon and they
advised raising an issue in the opentelemetry-specification repo for
discussion:
https://github.com/open-telemetry/opentelemetry-specification/issues/3447

Their in-person response was largely that OTEL events could be used to
model provenance, but I argued that the model should be standardized for
use across tooling. This way, the provenance model wouldn't need to be
maintained separately.

My original thought was that we could have a standard provenance model in
OTEL and shift NiFi to emit these instead of its own provenance events.
Then we can tie provenance together across heterogeneous tooling using log
visualization through something like Loki & Grafana, but there are many
options in this space.

Thanks,
Mike

[1] https://opentelemetry.io/docs/concepts/signals/

On Tue, May 23, 2023 at 3:56 AM Brian Putt <pu...@gmail.com> wrote:

> Hello Joe / All,
>
> Jaeger or Grafana (w/ tempo) offer comparable tools to visualize the trace
> data. I believe additional tools will be needed to get the most out of the
> trace data. We've been experimenting with a number of open source products
> to see what works best for the amount of trace data that NiFi emits. So
> far, Grafana Tempo, Victoria Metrics, and Clickhouse seem to offer a good
> set of features to cover searching / viewing the traces along with
> summarizing certain flowfile attributes. As long as the trace data is in
> OTEL's format, the collector offers flexibility in exporting the data to a
> number of services with ease.
>
> I would expect a PR to OTEL's java auto instrumentation project over the
> next few months that adds NiFi to its list of instrumentations. If the NiFi
> committers would like a demo / tech exchange to go over the current state
> of the tracing agent, we'd be happy to accommodate. As it stands, the agent
> utilizes flowfile attributes to pass along the tracestate so trace
> propagation can occur across NiFi to NiFi boundaries.
>
> Thanks,
>
> Brian
>

Re: OpenTelemetry Integration

Posted by Brian Putt <pu...@gmail.com>.
Hello Joe / All,

Jaeger and Grafana (w/ Tempo) offer comparable tools for visualizing trace
data, though I believe additional tools will be needed to get the most out
of it. We've been experimenting with a number of open source products to
see what works best for the amount of trace data that NiFi emits. So far,
Grafana Tempo, VictoriaMetrics, and ClickHouse seem to offer a good set of
features for searching / viewing the traces and summarizing certain
flowfile attributes. As long as the trace data is in OTEL's format, the
collector can export it to a number of services with ease.

I would expect a PR to OTEL's Java auto-instrumentation project over the
next few months that adds NiFi to its list of instrumentations. If the NiFi
committers would like a demo / tech exchange to go over the current state
of the tracing agent, we'd be happy to accommodate. As it stands, the agent
uses flowfile attributes to pass along the tracestate so that trace
propagation can occur across NiFi-to-NiFi boundaries.
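
To illustrate propagation through flowfile attributes, here is a minimal
Java sketch. It is not the agent's actual code: the attribute name
"traceparent", the class FlowFileTracePropagation, and the method
withChildSpan are hypothetical, a plain Map stands in for a flowfile's
attributes, and it assumes the value follows the W3C traceparent layout
(version-traceid-parentid-flags).

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

final class FlowFileTracePropagation {
    // Hypothetical attribute key; the agent's real key is not stated in this thread.
    static final String ATTR = "traceparent";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns 2 * bytes lowercase hex characters of random data.
    static String randomHex(int bytes) {
        byte[] buf = new byte[bytes];
        RANDOM.nextBytes(buf);
        StringBuilder sb = new StringBuilder(bytes * 2);
        for (byte b : buf) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Copies the attributes and rewrites the trace attribute so that it names
    // a new span in the same trace. If no usable value is present, a new
    // trace is started and marked as sampled (an arbitrary choice here).
    static Map<String, String> withChildSpan(Map<String, String> attributes) {
        Map<String, String> out = new HashMap<>(attributes);
        String incoming = attributes.get(ATTR);
        String[] parts = incoming == null ? new String[0] : incoming.split("-");
        boolean usable = parts.length == 4
                && parts[1].length() == 32
                && parts[2].length() == 16;
        String traceId = usable ? parts[1] : randomHex(16);
        String flags = usable ? parts[3] : "01";
        out.put(ATTR, "00-" + traceId + "-" + randomHex(8) + "-" + flags);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> received = new HashMap<>();
        received.put(ATTR, "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
        System.out.println("received: " + received.get(ATTR));
        System.out.println("emitted:  " + withChildSpan(received).get(ATTR));
    }
}
```

The trace id and flags are carried over unchanged while the span id is
replaced, which is what lets a downstream NiFi instance attach its own
spans to the same trace.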

Thanks,

Brian

On Wed, May 17, 2023 at 1:05 PM Joe Witt <jo...@gmail.com> wrote:

> Brian Putt, All
>
> Are you aware of any good tools/services that can ingest the traces and
> provide an interesting view/story/reporting on it?
>
> I could see us emitting otel events instead of our current provenance
> mechanism and using that both internally to do what we already do but also
> have a clear/spec friendly way of exporting it to others.
>
> Thanks
>
> On Sat, Jul 30, 2022 at 7:43 AM Uwe@Moosheimer.com <Uw...@moosheimer.com>
> wrote:
>
> > Hello Brian, Bryan, Greg, NiFi devs,
> >
> > Integrating OpenTelemetry is a very good idea, especially since the major
> > cloud providers also rely on it. This could also be interesting for
> > Stateless NiFi.
> >
> > I have a suggestion that I would like to put up for discussion.
> >
> > Would it be useful to make a list of what extensions or new development
> > would be helpful for a complete integration of OpenTelemetry?
> >
> > I'm thinking of ConsumeMQTT and PublishMQTT, for example. Currently these
> > support at most MQTT version 3.1.1, but version 5 introduced User
> > Properties, which are similar to HTTP header fields. Thus one could
> > implement OpenTelemetry in the MQTT processors in much the same way as
> > for HTTP.
> >
> > With a list we could make an overview of the "necessary" adjustments and
> > advertise for support.
> >
> > If what I write is nonsense, then I may not have understood something and
> > I take it all back :)
> >
> > Mit freundlichen Grüßen / best regards
> > Kay-Uwe Moosheimer
> >
> > > Am 29.07.2022 um 05:09 schrieb Brian Putt <pu...@gmail.com>:
> > >
> > > Hello Bryan / Greg / NiFi devs,
> > >
> > > Distributed tracing (DT) is similar to provenance in that it shows the
> > > path a particular flowfile travels, but its core selling point is that
> > > it supports tracing across multiple systems/services regardless of
> > > what's receiving the data. Provenance is a fantastic feature, and there
> > > are instances where one might want to draw the bigger picture and
> > > identify bottlenecks as data flows from one system to another, where
> > > that other system may or may not be using NiFi.
> > >
> > > DT utilizes three ids: traceId, parentId, and spanId. While a tree can
> > > be built using two ids, the third id (traceId) makes it easier to pull
> > > all of the relevant information out of a datastore.
> > > DT is focused more on performance and identifying bottlenecks in one or
> > > more systems. Imagine if NiFi were receiving data from various sources
> > > (e.g. HTTP, Kafka, SQS) and egressing to other destinations (HTTP,
> > > Kafka, NiFi).
> > > DT provides a spec that we'd be able to follow to correlate the data as
> > > it traverses from system to system. Each system that participates in
> > > the DT ecosystem would simply emit information (a trace is made up of
> > > one or more spans), and a collection system would aggregate all of
> > > these spans, draw a bigger picture of the path the data went through,
> > > and help identify key bottlenecks.
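
As a rough illustration of how a collector could use the three ids, the
Java sketch below selects the spans of one trace by traceId and links them
into a tree via parentId. The Span class, the method names, the ids, and the
durations are all made up for the example; only the processor names are
real NiFi processors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SpanTree {
    // Illustrative span record; real OTEL spans carry many more fields.
    static final class Span {
        final String traceId;
        final String spanId;
        final String parentId; // null for the root span
        final String name;
        final long durationMs;

        Span(String traceId, String spanId, String parentId, String name, long durationMs) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentId = parentId;
            this.name = name;
            this.durationMs = durationMs;
        }
    }

    // Groups the spans of one trace by parent id; root spans go under the key "".
    static Map<String, List<Span>> childrenByParent(List<Span> spans, String traceId) {
        Map<String, List<Span>> index = new LinkedHashMap<>();
        for (Span s : spans) {
            if (!s.traceId.equals(traceId)) {
                continue; // the traceId filter is what makes retrieval cheap
            }
            String key = s.parentId == null ? "" : s.parentId;
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
        }
        return index;
    }

    // Renders the subtree below parentId, indenting two spaces per level.
    static String render(Map<String, List<Span>> index, String parentId, int depth) {
        StringBuilder sb = new StringBuilder();
        for (Span s : index.getOrDefault(parentId, List.of())) {
            sb.append("  ".repeat(depth)).append(s.name)
              .append(" (").append(s.durationMs).append(" ms)\n");
            sb.append(render(index, s.spanId, depth + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Span> spans = List.of(
                new Span("t1", "a", null, "ConsumeKafka", 12),
                new Span("t1", "b", "a", "UpdateAttribute", 3),
                new Span("t2", "x", null, "ListenHTTP", 5),
                new Span("t1", "c", "b", "InvokeHTTP", 240));
        System.out.print(render(childrenByParent(spans, "t1"), "", 0));
    }
}
```

With the sample data, the span from trace "t2" is filtered out and the slow
InvokeHTTP span shows up at the deepest level of trace "t1", which is the
kind of bottleneck a trace view is meant to surface.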
> > >
> > > OpenTelemetry (OTEL) provides clients (across many languages, including
> > > Java) with which developers can instrument their library's APIs and
> > > participate in a DT ecosystem, as it adheres to the tracing spec.
> > > Egressing trace data is possible without using OTEL; that could be
> > > optimized for NiFi, but we may find ourselves having to reinvent the
> > > wheel.
> > >
> > > Creating a reporting task could certainly be a path, but I have a few
> > > concerns with that:
> > >
> > > 1. If provenance is disabled, will provenance events still be emitted
> > > and be collected by a new reporting task?
> > > 2. There'll be an impact on performance; how much is unknown. OTEL is
> > > gaining traction across industry and there are ways to mitigate the
> > > performance impact, mainly sampling and the fact that *tracing is best
> > > effort*. Spans would be emitted from NiFi via UDP to a collector on the
> > > same network.
> > > 3. Would there be any issues with appending a flowfile attribute that
> > > is carried throughout the flow and maintains the traceId,
> > > parentSpanId, and trace flags? See below for more details.
> > >
> > > There's a W3C spec (Trace Context) which includes a formatted string
> > > that would be propagated to services (HTTP, Kafka, etc...). So if NiFi
> > > were to put information onto Kafka, any consumers of that data would be
> > > able to continue the trace and help draw the bigger picture.
> > >
> > > W3C Spec: https://www.w3.org/TR/trace-context/#traceparent-header
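
For reference, a version "00" traceparent value has four dash-separated
lowercase hex fields: a 2-character version, a 32-character trace-id, a
16-character parent-id, and 2-character trace flags whose lowest bit means
"sampled". The Java sketch below parses one such value. The TraceParent
class and its parse method are hypothetical names, not part of any
library, and later versions of the header are deliberately not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TraceParent {
    // Layout for version "00": version-traceid-parentid-flags, lowercase hex.
    private static final Pattern FORMAT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String traceId;
    final String parentId;
    final boolean sampled;

    private TraceParent(String traceId, String parentId, boolean sampled) {
        this.traceId = traceId;
        this.parentId = parentId;
        this.sampled = sampled;
    }

    // Parses a version "00" header value, rejecting malformed input.
    static TraceParent parse(String header) {
        Matcher m = FORMAT.matcher(header);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a version 00 traceparent: " + header);
        }
        String traceId = m.group(1);
        String parentId = m.group(2);
        // The spec treats all-zero trace and parent ids as invalid.
        if (traceId.equals("0".repeat(32)) || parentId.equals("0".repeat(16))) {
            throw new IllegalArgumentException("all-zero id in traceparent: " + header);
        }
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) != 0;
        return new TraceParent(traceId, parentId, sampled);
    }

    public static void main(String[] args) {
        TraceParent tp = parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
        System.out.println("traceId  = " + tp.traceId);
        System.out.println("parentId = " + tp.parentId);
        System.out.println("sampled  = " + tp.sampled);
    }
}
```

A Kafka consumer that reads this value from a message header could keep
the trace-id, use the parent-id as the parent of its own span, and so
continue the trace.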
> > >
> > > For #2, since DT is focused on performance, sampling can help alleviate
> > > chatter over the wire and ideally, 0.01% would draw the same picture as
> > > 1% or 10%+. This is certainly different from provenance as DT is
> > > focused on performance over quality of the data and should not be
> > > thought of as auditing.
> > >
> > > https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#sampler
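
To make the sampling idea concrete, here is a minimal Java sketch of a
ratio-based sampler that derives its decision from the trace id, so every
service that sees the same trace id reaches the same decision. It is
written in the spirit of OTEL's trace-id-ratio sampler but is not the SDK's
implementation; the RatioSampler class and shouldSample method are
hypothetical names.

```java
import java.security.SecureRandom;

final class RatioSampler {
    private final long upperBound;

    // ratio is the fraction of traces to keep, from 0.0 to 1.0.
    RatioSampler(double ratio) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be within [0, 1]");
        }
        // The cast saturates at Long.MAX_VALUE when ratio is 1.0.
        this.upperBound = (long) (ratio * Long.MAX_VALUE);
    }

    // Decides from the low 64 bits of a 32-character hex trace id.
    boolean shouldSample(String traceIdHex) {
        long low = Long.parseUnsignedLong(traceIdHex.substring(16), 16);
        long nonNegative = low & Long.MAX_VALUE; // drop the sign bit
        return upperBound == Long.MAX_VALUE || nonNegative < upperBound;
    }

    public static void main(String[] args) {
        RatioSampler onePercent = new RatioSampler(0.01);
        SecureRandom random = new SecureRandom();
        int total = 100_000;
        int kept = 0;
        for (int i = 0; i < total; i++) {
            String traceId = String.format("%016x%016x", random.nextLong(), random.nextLong());
            if (onePercent.shouldSample(traceId)) {
                kept++;
            }
        }
        System.out.println("kept " + kept + " of " + total + " traces");
    }
}
```

Because the decision is a pure function of the trace id, no coordination
between NiFi and downstream services is needed for them to keep or drop the
same traces.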
> > >
> > >> On Thu, Jul 28, 2022 at 5:01 PM Bryan Bende <bb...@gmail.com> wrote:
> > >>
> > >> Hi Greg,
> > >>
> > >> I don't really know anything about OpenTelemetry, but from the
> > >> perspective of integrating something into the framework, some things
> > >> to consider...
> > >>
> > >> Is there some way to piggy-back on provenance and use a ReportingTask
> > >> to process provenance events and report something to OpenTelemetry?
> > >>
> > >> If something new does need to be added, it should probably be an
> > >> extension point where there is an interface in the framework-api and
> > >> different implementations can be plugged in.
> > >> Ideally the framework itself wouldn't have any knowledge of
> > >> OpenTelemetry specifically, it would only be reporting some
> > >> information, which could then be used in some way by the OpenTelemetry
> > >> implementation.
> > >>
> > >> How does NiFi actually communicate with OpenTelemetry? Are you
> > >> expecting to send data to OpenTelemetry in this new method you are
> > >> suggesting?
> > >> That would likely have a significant impact on the performance of the
> > >> flow.
> > >>
> > >> Thanks,
> > >>
> > >> Bryan
> > >>
> > >>> On Thu, Jul 28, 2022 at 3:17 PM glmars3@uwe.nsa.gov
> > >>> <glmars3@uwe.nsa.gov> wrote:
> > >>>
> > >>> NiFi Devs,
> > >>>
> > >>> My team and I are looking for guidance on how we can extend Apache
> > >>> NiFi's capabilities. Specifically, we're looking to include
> > >>> distributed tracing. We'll approach this effort as if we're the
> > >>> tracing experts and simply seeking implementation guidance. Our
> > >>> developers have good exposure to working with NiFi and creating
> > >>> custom processors. We plan to fork the project to begin this effort
> > >>> but want to make sure we approach this with the best possible
> > >>> direction for community adoption.
> > >>>
> > >>> Our initial thoughts on this approach would be to piggyback on how
> > >>> Provenance was implemented. We essentially want to include a
> > >>> subroutine or method that gets implicitly invoked upon a processor's
> > >>> 'onTrigger' method. From there we would analyze the FlowFile's
> > >>> attributes to check for the existence of a 'traceId' and propagate
> > >>> it if found.
> > >>>
> > >>> We can expound upon all of these tracing/observability details if
> > >>> that helps. We're able to provide a more detailed scope of this task
> > >>> as well, but for now we just want to get feedback on our overall goal
> > >>> and proposed approach.
> > >>>
> > >>> Thanks,
> > >>> Greg Marshall
> > >>
> >
> >
>