Posted to dev@devlake.apache.org by David Van Couvering <da...@gmail.com> on 2022/12/16 01:30:30 UTC

Open Telemetry?

Hello! A colleague just pointed me to your project, looks awesome! There is
definitely a need for this in our industry.

I have been working on something similar, but rather than have multiple
projects doing similar things, I am looking at maybe jumping on your
bandwagon.

OpenTelemetry is focused on visualizing and improving the flow of requests
and messages through a software system. But the same goals exist for a
devops system, except that in this case the work flowing through is not
requests and messages but features and stories.

So what I have been looking at is taking advantage of the industry support,
APIs and tooling around OpenTelemetry and leveraging it for devops process
visibility and tuning.

Looking at your system, you have an excellent framework for connecting to
various data sources, collecting data, storing the data and then building
dashboards around it.

I was imagining we could find a way to plug that data into OpenTelemetry
through an OpenTelemetry collector. Then you can pick and choose from the
many OpenTelemetry vendors to help you query, visualize and build
dashboards around what I call "Flow Telemetry".

I would love your thoughts. Have you looked at this and decided it wasn't a
good match? Do you have concerns or questions? Thoughts on how to best
create an integration?  I was imagining creating a new data store
implementation that pushes data to an OTel Collector, but I am not sure how
tied to SQL your current implementation is.

Thanks,

David
https://www.linkedin.com/in/davidvc/
(also a member of the Apache DB PMC although I haven't been active for like
20 years :) ).

Re: Open Telemetry?

Posted by David Van Couvering <da...@gmail.com>.
Hi, Hezheng! Thanks for getting back.

Yes, I saw how your architecture works.

As you may know, OpenTelemetry defines its own data model
<https://opentelemetry.io/docs/reference/specification/overview/>. The core
concept is a "signal", which provides information about your system. There
are four main types of signals: traces, metrics, logs and baggage (key-value
pair data). For each of these, the common feature is *context propagation*:
each signal is associated with a particular context, and signals have
both time-based and hierarchical relationships with each other. For example,
a request has a trace which consists of a hierarchy of spans, and the same
request has logs and events.  Using this data we can discover, inspect and
learn about the behavior and performance of the request, as well as
aggregate insights about many requests.

This same concept of context and context propagation is essential to what I
call "feature flow" (this term is derived from the Flow Framework and their
Flow Metrics). Every event in JIRA, GitHub, Jenkins, etc., is associated
with a particular feature (what you are calling a story in your model).  An
epic is a hierarchy of stories and tasks, and each story may have a series
of pull requests and Jenkins jobs associated with it.  Everything ties back
to the primary feature/story that is being built. Using this data we can
discover, inspect and learn about the behavior and performance of the
feature, as well as aggregate insights about many features.
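
To make the mapping concrete, here is a minimal sketch in plain Python (no
OpenTelemetry library involved). It models a story as a parent span with a
pull request and a CI job as child spans, then computes the story's total
flow time and the part of it covered by child work. The class name, the
"kind" labels and the idea of treating child coverage as "active time" are
assumptions made for illustration; they are not taken from DevLake or from
the OpenTelemetry specification.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class FlowSpan:
    """One unit of work (story, pull request, CI job) in a feature's flow."""
    name: str
    kind: str  # hypothetical labels: "story", "pull_request", "ci_job"
    start: datetime
    end: datetime
    children: List["FlowSpan"] = field(default_factory=list)

    def duration(self) -> timedelta:
        return self.end - self.start

    def walk(self, depth: int = 0):
        """Yield (depth, span) pairs, parent before children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def active_time(span: FlowSpan) -> timedelta:
    """Time covered by at least one direct child (overlaps counted once)."""
    total = timedelta(0)
    cur_start = cur_end = None
    for start, end in sorted((c.start, c.end) for c in span.children):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged interval.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping interval: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Hypothetical example data: one story with a pull request and a CI job.
base = datetime(2022, 12, 12, 9, 0)
story = FlowSpan(
    name="PROJ-1 Add export button",
    kind="story",
    start=base,
    end=base + timedelta(days=4),
    children=[
        FlowSpan("PR #42", "pull_request",
                 base + timedelta(days=1), base + timedelta(days=2)),
        FlowSpan("build 1001", "ci_job",
                 base + timedelta(days=2, hours=-1),
                 base + timedelta(days=2, hours=1)),
    ],
)

for depth, span in story.walk():
    print("  " * depth + f"{span.kind}: {span.name} ({span.duration()})")
print("flow time:", story.duration(), "active:", active_time(story))
```

The point of the sketch is only that a story and its related work items
form the same kind of tree that a trace and its spans do, so per-feature
and aggregate timing questions become span queries.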

The opportunity I am seeing is, if we generate data using the OpenTelemetry
model, then, rather than build our own data collection and analysis
systems, we can take advantage of a vast ecosystem of tools that supports
collection, discovery and analysis of OpenTelemetry signals to uncover
deep insights about the flow of features/stories through our software
development systems. These tools include Datadog
<https://www.datadoghq.com/blog/ingest-opentelemetry-traces-metrics-with-datadog-exporter/>,
New Relic <https://newrelic.com/solutions/opentelemetry>, Grafana
<https://grafana.com/oss/opentelemetry/>, Elastic
<https://www.elastic.co/guide/en/apm/guide/current/open-telemetry.html>,
Honeycomb
<https://docs.honeycomb.io/getting-data-in/opentelemetry-overview/>, Jaeger
<https://www.jaegertracing.io/docs/1.21/opentelemetry/>, Zipkin
<https://opentelemetry.io/docs/reference/specification/trace/sdk_exporters/zipkin/>
and so on.

So what I had been working on was building code to pull data, starting with
JIRA and GitHub, and then export that data out through the OpenTelemetry
APIs (they have SDKs for almost every language, including Go
<https://opentelemetry.io/docs/instrumentation/go/manual/>) into an
OpenTelemetry collector, and then use one of these tools to help build
insights and dashboards for the flow of feature work through our software
development systems.
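
As a rough illustration of the export step, the sketch below builds an
OTLP/JSON trace payload by hand with the Python standard library instead of
using an OpenTelemetry SDK. The record fields (id, story_key, parent_id and
so on), the "flow.kind" attribute and the "devlake-flow" service name are
all hypothetical, and the endpoint assumes a collector running locally with
its OTLP/HTTP receiver on the default port. Deriving the trace id from the
story key is one possible way to make every record for a story land in the
same trace; it is an assumption, not an established convention.

```python
import hashlib
import json
import urllib.request


def trace_id_for(story_key: str) -> str:
    """Derive a 16-byte trace id (32 hex chars) from the story key."""
    return hashlib.sha256(story_key.encode("utf-8")).hexdigest()[:32]


def span_id_for(record_id: str) -> str:
    """Derive an 8-byte span id (16 hex chars) from a record id."""
    return hashlib.sha256(record_id.encode("utf-8")).hexdigest()[:16]


def to_otlp_span(record: dict) -> dict:
    """Convert one hypothetical work-item record into an OTLP/JSON span."""
    span = {
        "traceId": trace_id_for(record["story_key"]),
        "spanId": span_id_for(record["id"]),
        "name": record["name"],
        "startTimeUnixNano": str(record["start_unix_s"] * 1_000_000_000),
        "endTimeUnixNano": str(record["end_unix_s"] * 1_000_000_000),
        "attributes": [
            {"key": "flow.kind", "value": {"stringValue": record["kind"]}},
        ],
    }
    if record.get("parent_id"):
        span["parentSpanId"] = span_id_for(record["parent_id"])
    return span


def build_payload(records, service_name="devlake-flow") -> dict:
    """Wrap spans in the resourceSpans/scopeSpans envelope used by OTLP."""
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name",
                 "value": {"stringValue": service_name}},
            ]},
            "scopeSpans": [{
                "scope": {"name": "flow-telemetry-sketch"},
                "spans": [to_otlp_span(r) for r in records],
            }],
        }]
    }


def send(payload: dict, endpoint="http://localhost:4318/v1/traces") -> int:
    """POST the payload to a collector; not called in this sketch."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Hypothetical records: a story and a pull request that belongs to it.
records = [
    {"id": "issue:PROJ-1", "story_key": "PROJ-1", "kind": "story",
     "name": "PROJ-1 Add export button",
     "start_unix_s": 1670835600, "end_unix_s": 1671181200},
    {"id": "pr:42", "story_key": "PROJ-1", "parent_id": "issue:PROJ-1",
     "kind": "pull_request", "name": "PR #42",
     "start_unix_s": 1670922000, "end_unix_s": 1671008400},
]
payload = build_payload(records)
print(json.dumps(payload, indent=2))
```

In practice an SDK exporter would handle batching, retries and id
generation; the hand-built payload is only meant to show how little is
needed to express a story and its pull request as parent and child spans.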

Then I ran into what you're doing. You have already built a vast suite of
plugins that grab data from all these various DevOps systems.

And I was thinking, what if we could marry that with the vast suite of
collectors and analysis tools that are already out there with OpenTelemetry,
rather than build our own?

It may not be a good marriage. It may be that the concepts we care about
just don't map well into the OpenTelemetry world. But I thought it was an
idea worth exploring.

Since OpenTelemetry already has a collector and all these tools already
have their own data stores, I thought the point of integration should be at
your storage layer. Rather than storing in a relational DB, we could
instead push the data to the OpenTelemetry collector.  And then we could
experiment using the OpenTelemetry visualization and analysis tools to try
and get the insights we are looking for.

Would love your thoughts!

DVC (my nickname)

On Fri, Dec 16, 2022 at 5:45 PM Hezheng Yin <yi...@gmail.com>
wrote:

> Hi David,
>
> Thanks for reaching out and welcome to the DevLake community! I have basic
> knowledge of OpenTelemetry but am definitely no expert.  Right now, DevLake
> takes care of data collection and transformation while leaving
> visualization to Grafana. DevLake's interface to visualization tools is its
> domain layer schema [1].  So technically, users can go with any
> analysis/visualization tools that can connect to DevLake's data store and
> query data described by the domain layer schema. We chose Grafana mainly
> for its wide adoption and the ability to provision pre-built dashboards.
>
> I'd love to get your perspective on how the integration with OpenTelemetry
> may help DevLake users. And I'm curious to learn more about your "Flow
> Telemetry" concept, which sounds super interesting. Looking forward to
> diving deeper into your ideas!
>
> [1]
> https://devlake.apache.org/docs/next/DataModels/DevLakeDomainLayerSchema
>
> Best,
> Hezheng
> Apache DevLake PPMC

Re: Open Telemetry?

Posted by Hezheng Yin <yi...@gmail.com>.
Hi David,

Thanks for reaching out and welcome to the DevLake community! I have basic
knowledge of OpenTelemetry but am definitely no expert.  Right now, DevLake
takes care of data collection and transformation while leaving
visualization to Grafana. DevLake's interface to visualization tools is its
domain layer schema [1].  So technically, users can go with any
analysis/visualization tools that can connect to DevLake's data store and
query data described by the domain layer schema. We chose Grafana mainly
for its wide adoption and the ability to provision pre-built dashboards.

I'd love to get your perspective on how the integration with OpenTelemetry
may help DevLake users. And I'm curious to learn more about your "Flow
Telemetry" concept, which sounds super interesting. Looking forward to
diving deeper into your ideas!

[1] https://devlake.apache.org/docs/next/DataModels/DevLakeDomainLayerSchema

Best,
Hezheng
Apache DevLake PPMC


On Fri, Dec 16, 2022 at 8:01 AM Rob Basham <ro...@us.ibm.com> wrote:

> David,
> Thanks for reaching out.  I appreciate your email as I had not been
> following OpenTelemetry.  My primary job is automation and I can see how
> this project could really improve our test automation state/response
> verification.  I looked over your LinkedIn profile and presentations and
> couldn't agree with you more on well-defined acceptance criteria.
> Regards,
> Rob Basham

Re: Open Telemetry?

Posted by Rob Basham <ro...@us.ibm.com>.
David,
Thanks for reaching out.  I appreciate your email as I had not been following OpenTelemetry.  My primary job is automation and I can see how this project could really improve our test automation state/response verification.  I looked over your LinkedIn profile and presentations and couldn't agree with you more on well-defined acceptance criteria.
Regards,
Rob Basham