Posted to dev@beam.apache.org by Lukasz Cwik <lc...@google.com> on 2018/03/26 18:51:42 UTC

All SDKs on all Runners (portability virtual team)

A few people from the Beam community have been steadily working toward the
portability goal (all SDKs on all Runners).

Last week on an experimental branch we were able to get Apache Flink to run
a Go pipeline and also a Python pipeline using those respective SDKs. The
pipelines were limited to running ParDo and GBK (no combiners, no state, no
timers, no ...).
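
For readers unfamiliar with the two primitives mentioned here, the following
is a rough illustration of what ParDo and GBK (GroupByKey) compute. It is a
toy model in plain Python, not the Beam API and not the code from the
experimental branch; the helper names par_do, group_by_key and split_words
are hypothetical and chosen only for this sketch.

```python
from collections import defaultdict

# Toy model only: these helpers are hypothetical and are NOT the Beam API.


def par_do(elements, fn):
    """Element-wise step: fn may emit zero or more outputs per input."""
    for element in elements:
        yield from fn(element)


def group_by_key(pairs):
    """Shuffle-like step: collect all values that share a key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def split_words(line):
    """Example user function: emit a (word, 1) pair for each word."""
    for word in line.split():
        yield (word, 1)


lines = ["go python", "python java"]
grouped = group_by_key(par_do(lines, split_words))
print(grouped)
```

In this model a pipeline restricted to ParDo and GBK can still express
per-element transforms followed by a grouping, which is enough for simple
end-to-end tests.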

To continue this effort, I have started a video conference[1] and a
document[2] so that people can join and help hack on the portability effort
in a tighter development cycle. Because of time differences, nobody may be
on the call at a given moment; if that is the case, feel free to use the
Slack channel[3]. Notable questions and discussions will be brought back to
the dev@ list so that the information reaches a wider audience.

Note that this meet up is ONLY about getting all SDKs to work on all
Runners.

1: https://s.apache.org/beam-portability-team-meet
2: https://s.apache.org/beam-portability-team-doc
3: https://the-asf.slack.com/messages/C9W769ZJ7/

Re: All SDKs on all Runners (portability virtual team)

Posted by Romain Manni-Bucau <rm...@gmail.com>.
On 26 March 2018 at 22:34, "Lukasz Cwik" <lc...@google.com> wrote:

> Both.
>
> Some parts live outside the runner, such as fusion and other pipeline
> manipulation steps (it is up to each runner whether to use these components
> or implement its own).

It is fine to have replacements/optimizations, but by default using a 100%
Java runner shouldn't require anything extra, IMHO.

> There is still runner-specific code that integrates those components into
> the runner's native execution plan.


Re: All SDKs on all Runners (portability virtual team)

Posted by Lukasz Cwik <lc...@google.com>.
Both.

Some parts live outside the runner, such as fusion and other pipeline
manipulation steps (it is up to each runner whether to use these components
or implement its own).

There is still runner-specific code that integrates those components into
the runner's native execution plan.
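
To make the split concrete, here is a minimal sketch of what a
runner-independent fusion step could look like. It is a toy in plain Python
under stated assumptions, not Beam's actual fusion code: the pipeline is
assumed to be a flat list of ("pardo", fn) and ("gbk", None) steps, and the
names compose and fuse are hypothetical.

```python
# Hypothetical toy representation: a step is ("pardo", fn) or ("gbk", None).
# This is NOT how Beam represents or fuses pipelines; it only shows the idea.


def compose(f, g):
    """Chain two element-wise functions into a single one."""
    def fused(element):
        for intermediate in f(element):
            yield from g(intermediate)
    return fused


def fuse(steps):
    """Merge runs of adjacent element-wise steps; a GBK stays a boundary."""
    fused_steps = []
    for kind, fn in steps:
        if kind == "pardo" and fused_steps and fused_steps[-1][0] == "pardo":
            previous_fn = fused_steps[-1][1]
            fused_steps[-1] = ("pardo", compose(previous_fn, fn))
        else:
            fused_steps.append((kind, fn))
    return fused_steps


def tokenize(line):
    yield from line.split()


def pair_with_one(word):
    yield (word, 1)


def count(key_and_values):
    key, values = key_and_values
    yield (key, sum(values))


pipeline = [
    ("pardo", tokenize),
    ("pardo", pair_with_one),
    ("gbk", None),
    ("pardo", count),
]
stages = fuse(pipeline)
kinds = [kind for kind, _ in stages]
```

Nothing in fuse depends on a particular runner; in this picture the
runner-specific part is only how each resulting stage is mapped onto the
runner's native execution plan.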


Re: All SDKs on all Runners (portability virtual team)

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Hi Lukasz,

Did you start doing this generically (i.e. outside a particular runner
implementation), using a pipeline pre-visitor or similar, or is it still
hardcoded in the runner?

It is key not to bind it to runners, since this is a cross-cutting feature
that doesn't require more than the primitives a runner already supports to
implement the Beam model. That lets portability be maintained in one place,
reliably, and without impacting the runtime.

Romain
