Posted to dev@beam.apache.org by Aljoscha Krettek <al...@apache.org> on 2016/07/04 06:27:20 UTC

Re: Display Data Runner Support

Thanks Scott for this compilation of information! I'll look into how this
can be incorporated into the Flink runner once I have some time on my hands.

On Thu, 30 Jun 2016 at 17:05 Scott Wegner <sw...@google.com.invalid>
wrote:

> Hi Beam Dev community,
>
> I wanted to circle back on a recent Beam feature, Display Data, which we
> proposed back in March [1] and which is now implemented in the Beam SDK.
> Display Data provides a way for Runners to collect additional metadata
> about a pipeline during construction, suitable for display in a UI.
> PipelineOptions, PTransforms, and user-defined function types (DoFn,
> CombineFn, WindowFn) register their display data, and the SDK provides
> hooks for users to integrate display data from their own components.
>
> Alex Amato and I wrote a blog post describing how Google Dataflow is now
> surfacing display data in its monitoring interface [2]. I encourage other
> Runner authors to take a look and consider how display data could fit into
> your runner. Integrating display data is relatively straightforward, as most
> of the heavy lifting is done in the SDK. The Dataflow Runner collects
> display data during pipeline translation for PipelineOptions [3] and
> PTransforms [4].
>
> Please have a look at the display data API docs [5] and let me know if you
> have any questions.
>
> - Scott
>
> [1] http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201603.mbox/raw/%3CCAN-7FgbR%3DyXPHZj-GrPO3aGSkkj11NXwAoyOGEzWc9r3ApnOpg%40mail.gmail.com%3E/1
> [2] https://cloud.google.com/blog/big-data/2016/06/dataflow-updates-see-more-details-about-your-pipelines
> [3] https://github.com/apache/incubator-beam/blob/7d767056a90e769eff68d4347e1b3a7bc43f415c/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java#L406
> [4] https://github.com/apache/incubator-beam/blob/7d767056a90e769eff68d4347e1b3a7bc43f415c/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java#L548
> [5] http://beam.incubator.apache.org/javadoc/0.1.0-incubating/org/apache/beam/sdk/transforms/display/HasDisplayData.html#populateDisplayData-org.apache.beam.sdk.transforms.display.DisplayData.Builder-
>
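
For readers new to the feature, the pattern Scott describes (a component
registers key/value metadata through a hook, and a runner collects it during
pipeline construction) can be sketched roughly as follows. This is only an
illustration and does not use the Beam SDK: the nested HasDisplayData and
Builder types are simplified stand-ins modeled on the populateDisplayData
hook linked in [5], and FilterByPrefix, collect, and the add(key, value)
method are hypothetical names invented for this sketch.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class DisplayDataSketch {

    /** Stand-in for the SDK's builder (assumption): accumulates key/value items. */
    static final class Builder {
        private final Map<String, String> items = new LinkedHashMap<>();

        /** Hypothetical helper: registers one item, stored as a string. */
        Builder add(String key, Object value) {
            items.put(key, String.valueOf(value));
            return this;
        }
    }

    /** Stand-in for the hook a component implements to describe itself. */
    interface HasDisplayData {
        void populateDisplayData(Builder builder);
    }

    /** Hypothetical user-defined component that exposes its configuration. */
    static final class FilterByPrefix implements HasDisplayData {
        private final String prefix;
        private final int maxMatches;

        FilterByPrefix(String prefix, int maxMatches) {
            this.prefix = prefix;
            this.maxMatches = maxMatches;
        }

        @Override
        public void populateDisplayData(Builder builder) {
            builder.add("prefix", prefix).add("maxMatches", maxMatches);
        }
    }

    /** What a runner might do at translation time: ask the component for its items. */
    static Map<String, String> collect(HasDisplayData component) {
        Builder builder = new Builder();
        component.populateDisplayData(builder);
        return Collections.unmodifiableMap(builder.items);
    }

    public static void main(String[] args) {
        // Print the metadata a monitoring UI could show for this component.
        System.out.println(collect(new FilterByPrefix("error", 10)));
    }
}
```

In this sketch the component knows nothing about how its metadata is shown;
the runner-side collect step alone decides what to do with the items, which
is roughly why most of the work can live in the SDK rather than in each
runner.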

Re: Display Data Runner Support

Posted by Frances Perry <fj...@google.com.INVALID>.
Perhaps it's worth filing JIRA issues to investigate this integration for
other runners? I'm guessing those might be good starter tasks for folks
with the right background.

On Sun, Jul 3, 2016 at 11:27 PM, Aljoscha Krettek <al...@apache.org>
wrote:

> Thanks Scott for this compilation of information! I'll look into how this
> can be incorporated into the Flink runner once I have some time on my
> hands.