Posted to user@spark.apache.org by Andriy Redko <dr...@gmail.com> on 2019/03/21 00:16:07 UTC

[HELP WANTED] Apache Zipkin (incubating) needs Spark gurus

Hello Dear Spark Community!

The popularity of Apache Spark has made it a de-facto choice for many
projects that need data processing capabilities. One of those is Zipkin,
a widely used distributed tracing framework currently incubating at Apache [1].
The small but amazing team behind the project maintains around 40 different
integrations and components, including [2], a set of Spark jobs that
reconstruct the service dependency graph over time from the collected traces.
The current maintainers are not yet savvy with Spark, and the team struggles
to address ongoing issues and answer user questions. For example, users are
reporting concerns about job distribution that the Zipkin team doesn't know
how to answer. It is really difficult to keep this particular component up
and running without Spark expertise.

Hence this message to the community: is anyone interested in distributed
tracing (a fascinating area in itself!) enough to step in, help with Spark
expertise, and contribute? Please feel free to reach out on Gitter [3]
or at dev@zipkin.apache.org!

[1] http://incubator.apache.org/projects/zipkin.html
[2] https://github.com/openzipkin/zipkin-dependencies
[3] https://gitter.im/openzipkin/zipkin

Best Regards,
    Apache Zipkin (incubating) Team


---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: [HELP WANTED] Apache Zipkin (incubating) needs Spark gurus

Posted by Adrian Cole <ad...@gmail.com>.
Thanks for the reply, Xin!

The question that has been haunting us recently is that people say when they
run our job on a cluster, it only ends up running on one node. We may have
something wrong, but it would be nice to know how to verify whether this is
the case, e.g. with the simplest possible test setup. Ideally we would end up
with a Testcontainers test that proves all jobs run distributed; see the
sketch below for the kind of check I mean. For now, having someone who knows
what they are doing help validate this issue would be of great benefit.

For example, I made a pull request, as I only really know how to run a simple
setup. The user claims the job doesn't work on multiple nodes, and it would be
nice to know whether that is true, or whether it is a setup or version
mismatch or something else:
https://github.com/openzipkin/zipkin-dependencies/pull/133

-A

On Thu, Mar 21, 2019 at 2:56 PM Reynold Xin <rx...@databricks.com> wrote:
>
> Are there specific questions you have? Might be easier to post them here
> also.
>
> On Wed, Mar 20, 2019 at 5:16 PM Andriy Redko <dr...@gmail.com> wrote:
>
> > Hello Dear Spark Community!
> >
> > The popularity of Apache Spark has made it a de-facto choice for many
> > projects that need data processing capabilities. One of those is Zipkin,
> > a widely used distributed tracing framework currently incubating at
> > Apache [1]. The small but amazing team behind the project maintains
> > around 40 different integrations and components, including [2], a set of
> > Spark jobs that reconstruct the service dependency graph over time from
> > the collected traces. The current maintainers are not yet savvy with
> > Spark, and the team struggles to address ongoing issues and answer user
> > questions. For example, users are reporting concerns about job
> > distribution that the Zipkin team doesn't know how to answer. It is
> > really difficult to keep this particular component up and running
> > without Spark expertise.
> >
> > Hence this message to the community: is anyone interested in distributed
> > tracing (a fascinating area in itself!) enough to step in, help with
> > Spark expertise, and contribute? Please feel free to reach out on
> > Gitter [3] or at dev@zipkin.apache.org!
> >
> > [1] http://incubator.apache.org/projects/zipkin.html
> > [2] https://github.com/openzipkin/zipkin-dependencies
> > [3] https://gitter.im/openzipkin/zipkin
> >
> > Best Regards,
> >     Apache Zipkin (incubating) Team
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: user-unsubscribe@spark.apache.org
> >
> >

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@zipkin.apache.org
For additional commands, e-mail: dev-help@zipkin.apache.org


Re: [HELP WANTED] Apache Zipkin (incubating) needs Spark gurus

Posted by Reynold Xin <rx...@databricks.com>.
Are there specific questions you have? Might be easier to post them here
also.

On Wed, Mar 20, 2019 at 5:16 PM Andriy Redko <dr...@gmail.com> wrote:

> Hello Dear Spark Community!
>
> The popularity of Apache Spark has made it a de-facto choice for many
> projects that need data processing capabilities. One of those is Zipkin,
> a widely used distributed tracing framework currently incubating at
> Apache [1]. The small but amazing team behind the project maintains around
> 40 different integrations and components, including [2], a set of Spark
> jobs that reconstruct the service dependency graph over time from the
> collected traces. The current maintainers are not yet savvy with Spark,
> and the team struggles to address ongoing issues and answer user questions.
> For example, users are reporting concerns about job distribution that the
> Zipkin team doesn't know how to answer. It is really difficult to keep
> this particular component up and running without Spark expertise.
>
> Hence this message to the community: is anyone interested in distributed
> tracing (a fascinating area in itself!) enough to step in, help with Spark
> expertise, and contribute? Please feel free to reach out on Gitter [3]
> or at dev@zipkin.apache.org!
>
> [1] http://incubator.apache.org/projects/zipkin.html
> [2] https://github.com/openzipkin/zipkin-dependencies
> [3] https://gitter.im/openzipkin/zipkin
>
> Best Regards,
>     Apache Zipkin (incubating) Team
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>
>
