Posted to dev@beam.apache.org by Anton Kedin <ke...@google.com> on 2018/04/27 16:55:57 UTC

Pipeline Debugging in IntelliJ

Hi,

Java developers have probably already seen the stream debugging plugin
<https://github.com/JetBrains/intellij-community/tree/master/plugins/stream-debugger>
that was added to IntelliJ some time ago, or similar tools for debugging
Java 8 streams:

[image: image.png]

The plugin allows you to see how actual data propagates through the stream.
This way it is much easier to trace what happens to each input element
after each transformation.
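
As a rough illustration of the kind of per-stage view described above, here
is a minimal sketch that uses only the standard Stream.peek() call to record
which elements reach each stage. StreamTrace, tap() and demo() are
hypothetical names made up for this example; this is not how the plugin
itself is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the IntelliJ plugin: records which
// elements reach each stage of a stream so they can be inspected afterwards.
final class StreamTrace {
    // Stage name -> elements observed at that stage, in encounter order.
    private final Map<String, List<Object>> stages = new LinkedHashMap<>();

    // Returns the same stream with a peek() that records elements under `stage`.
    <T> Stream<T> tap(String stage, Stream<T> stream) {
        List<Object> seen = stages.computeIfAbsent(stage, k -> new ArrayList<>());
        return stream.peek(seen::add);
    }

    Map<String, List<Object>> stages() {
        return stages;
    }

    // Builds a small filter/map chain and traces every stage of it.
    static StreamTrace demo() {
        StreamTrace trace = new StreamTrace();
        Stream<String> input = trace.tap("input", Stream.of("a", "bb", "ccc", "dd"));
        Stream<String> filtered = trace.tap("filter", input.filter(s -> s.length() > 1));
        Stream<Integer> mapped = trace.tap("map", filtered.map(String::length));
        // Streams are lazy: nothing is recorded until a terminal operation runs.
        mapped.collect(Collectors.toList());
        return trace;
    }

    public static void main(String[] args) {
        demo().stages().forEach((stage, seen) -> System.out.println(stage + ": " + seen));
    }
}
```

Running main prints one line per stage, which makes it visible that "a" is
dropped by the filter before the map step ever sees it.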

I haven't seen this proposed on the list before, so I am wondering whether
it's worth looking into building a similar tool for Beam pipelines to help
with pipeline authoring, testing, and debugging.

A couple of thoughts:
 - intellij-community is open source under the Apache license, so it should
be straightforward to poke around and prototype;
 - wiring it up to Beam tests on the DirectRunner looks feasible.
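
To make the second point a bit more concrete, below is a plain-Java model of
the data such a tool might capture from a small test run: the materialized
output of every named step. This is only a sketch under assumptions. It
deliberately does not use the Beam SDK, and TracedPipeline, apply() and
snapshots() are hypothetical names that merely mimic the shape of a pipeline
applied to a small in-memory input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a pipeline run in a local test; not Beam API.
final class TracedPipeline<T> {
    private final List<T> current;
    // Step name -> copy of the elements that step produced, in step order.
    private final Map<String, List<?>> snapshots;

    private TracedPipeline(List<T> current, Map<String, List<?>> snapshots) {
        this.current = current;
        this.snapshots = snapshots;
    }

    // Starts a pipeline from in-memory test input and snapshots that input.
    static <T> TracedPipeline<T> create(List<T> input) {
        Map<String, List<?>> snapshots = new LinkedHashMap<>();
        snapshots.put("Create", new ArrayList<>(input));
        return new TracedPipeline<>(new ArrayList<>(input), snapshots);
    }

    // Applies a one-to-many step to every element and snapshots the output.
    <R> TracedPipeline<R> apply(String name, Function<T, List<R>> step) {
        List<R> out = new ArrayList<>();
        for (T element : current) {
            out.addAll(step.apply(element));
        }
        snapshots.put(name, new ArrayList<>(out));
        return new TracedPipeline<>(out, snapshots);
    }

    Map<String, List<?>> snapshots() {
        return snapshots;
    }

    // Small example: split lines into words, then upper-case each word.
    static TracedPipeline<String> demo() {
        TracedPipeline<String> lines = TracedPipeline.create(Arrays.asList("a b", "c"));
        TracedPipeline<String> words =
            lines.apply("Split", line -> Arrays.asList(line.split(" ")));
        return words.apply("Upper", w -> Collections.singletonList(w.toUpperCase()));
    }

    public static void main(String[] args) {
        demo().snapshots().forEach((step, out) -> System.out.println(step + ": " + out));
    }
}
```

Keeping a full copy of every intermediate result is only reasonable for the
small inputs typical of unit tests, which is the scope suggested above.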

Any opinions/comments?

Regards,
Anton

Re: Pipeline Debugging in IntelliJ

Posted by Kenneth Knowles <kl...@google.com>.
Trying it out on the DirectRunner to understand unit test failures seems
useful and well-scoped. For larger pipelines, I imagine it would need some
way to focus on a smaller part of the pipeline. For larger data, I've seen
a few interesting papers [1] [2] [3] [4] [5], some of which have very
similar diagrams.

Kenn

[1] http://www.vldb.org/pvldb/vol9/p1137-chothia.pdf
[2] https://www.researchgate.net/profile/Matteo_Interlandi/publication/320069490_Automated_debugging_in_data-intensive_scalable_computing/links/59dbc2b90f7e9b1460fc26c2/Automated-debugging-in-data-intensive-scalable-computing.pdf
[3] https://arxiv.org/pdf/1801.07237.pdf
[4] https://md2k.org/images/papers/methods/adding-provenance_interlandi.pdf
[5] http://www.cs.columbia.edu/~fotis/pubs/papers/smokedemo-sigmod18.pdf

On Fri, Apr 27, 2018 at 10:06 AM Reuven Lax <re...@google.com> wrote:

> Someone wrote up a similar UI for Dataflow (back before Beam), also based
> on DirectRunner. I don't think that ended up going anywhere, but I think
> this would make a cool debugging tool for Beam! Would have to make sure
> that it was readable for big pipelines though.
>
> Reuven

Re: Pipeline Debugging in IntelliJ

Posted by Reuven Lax <re...@google.com>.
Someone wrote up a similar UI for Dataflow (back before Beam), also based
on DirectRunner. I don't think that ended up going anywhere, but I think
this would make a cool debugging tool for Beam! We would have to make sure
that it stays readable for big pipelines, though.

Reuven
