Posted to dev@beam.apache.org by Maximilian Michels <mx...@apache.org> on 2018/09/11 13:42:19 UTC

[Discuss] Upgrade story for Beam's execution engines

Hi Beamers,

In light of the discussion about Beam LTS releases, I'd like to kick 
off a thread about how often we upgrade the execution engine of each 
Runner. By upgrade, I mean major/minor versions, which typically break 
the binary compatibility of Beam pipelines.

For the Flink Runner, we try to track the latest stable version. Some 
users reported that this can be problematic, as it requires them to 
potentially upgrade their Flink cluster with a new version of Beam.

From a developer's perspective, it makes sense to migrate to the newest 
version of the execution engine as early as possible, e.g. to leverage 
the newest features. From a user's perspective, you don't care about the 
latest features if your use case still works with Beam.

We have to please both parties. So I'd suggest upgrading the execution 
engine whenever necessary (e.g. critical new features, end of life of 
the current version). On the other hand, the upcoming Beam LTS releases 
will contain a longer-supported version.

Maybe we don't need much discussion about this, but I wanted to hear what 
the community has to say about it. Particularly, I'd be interested in 
how the other Runner authors intend to do it.

As far as I understand, once portability is stable, we could 
theoretically upgrade the SDK without upgrading the runtime components. 
That would allow us to defer the upgrade for a longer time.
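
For illustration, a minimal sketch of that decoupling (hedged: it assumes
a Beam release where the Java PortableRunner is available on the classpath
and a job server for the target engine listening at the placeholder
endpoint below):

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.options.PipelineOptions;
  import org.apache.beam.sdk.options.PipelineOptionsFactory;

  public class PortableSubmit {
    public static void main(String[] args) {
      // The SDK only talks to the job endpoint; the job server, which is
      // matched to the cluster's engine version, translates and submits
      // the pipeline. Upgrading the SDK then doesn't force a cluster
      // upgrade.
      PipelineOptions options = PipelineOptionsFactory.fromArgs(
          "--runner=PortableRunner",
          "--jobEndpoint=localhost:8099").create();
      Pipeline p = Pipeline.create(options);
      // ... build the pipeline here ...
      p.run().waitUntilFinish();
    }
  }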

Best,
Max

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Thomas Weise <th...@apache.org>.
It would be good to engage with the Flink community and attempt to find
a stable API that the runner can depend on.

Thanks,
Thomas


On Mon, Sep 17, 2018 at 6:31 AM Maximilian Michels <mx...@apache.org> wrote:

> [Copying this also to the dev list]
>
> +1. A version compatibility table would be great!
>
>  > I don't know if Flink could do something like this (become a provided
>  > dep), in particular for the current case, where there seem to be no
>  > API-breaking changes.
>
> That doesn't work. The Flink Runner is too tightly integrated with Flink
> internals, and these internals are not always optimally decoupled. This
> already fails on the client side, e.g. when submitting a Flink job via
> Beam to a Flink cluster, though it should be better now with the new
> REST-based clients.
>
> On 17.09.18 09:48, Ismaël Mejía wrote:
> > In the Spark runner, the user provides the core Spark dependencies at
> > runtime, and we assume that backwards compatibility is kept (in
> > upstream Spark). We support the whole 2.x line, but we try to keep the
> > version close to the latest stable release.
> >
> > Notice, however, that we lack tests to validate that all versions do
> > work; I remember some issues with metrics during the migration to
> > Spark 2.x with older versions of Spark (<= 2.1). Those worked
> > flawlessly with more recent versions.
> >
> > I don't know if Flink could do something like this (become a provided
> > dep), in particular for the current case, where there seem to be no
> > API-breaking changes.
> >
> > In any case, +1 to getting our act together a bit on this.
> >
> > On Mon, Sep 17, 2018 at 9:31 AM Robert Bradshaw <ro...@google.com>
> wrote:
> >>
> >> On Mon, Sep 17, 2018 at 2:02 AM Austin Bennett <
> whatwouldaustindo@gmail.com> wrote:
> >>>
> >>> Do we currently maintain a finer-grained list of compatibility between
> execution/runner versions and Beam versions?  Is this only really a concern
> with recent Flink (it sounded like at least the Spark jump, too)?  I see the
> capability matrix:
> https://beam.apache.org/documentation/runners/capability-matrix/, but
> some sort of compatibility mapping between runner versions and Beam
> releases might be useful.
> >>>
> >>> I see a compatibility matrix for Beam features, but not for the
> underlying runners.  E.g., something like this would save a user trying to
> get Beam working on recent Flink 1.6 and then subsequently hitting a
> (potentially not well documented) wall given known issues.
> >>
> >>
> >> +1. I was bitten by this as well.
> >>
> >> I don't know if it's worth having a compatibility matrix for each
> version (as the overlap is likely to be all or nothing in most cases), but
> it should be prominently displayed here and elsewhere. Want to send out a
> PR?
>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Maximilian Michels <mx...@apache.org>.
[Copying this also to the dev list]

+1. A version compatibility table would be great!

 > I don't know if Flink could do something like this (become a provided
 > dep), in particular for the current case, where there seem to be no
 > API-breaking changes.

That doesn't work. The Flink Runner is too tightly integrated with Flink 
internals, and these internals are not always optimally decoupled. This 
already fails on the client side, e.g. when submitting a Flink job via 
Beam to a Flink cluster, though it should be better now with the new 
REST-based clients.
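
For context, a minimal sketch of that client-side submission path (hedged:
the master address is a placeholder, and the exact options depend on the
Beam/Flink versions in use):

  import org.apache.beam.runners.flink.FlinkPipelineOptions;
  import org.apache.beam.runners.flink.FlinkRunner;
  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.options.PipelineOptionsFactory;

  public class SubmitToFlink {
    public static void main(String[] args) {
      FlinkPipelineOptions options =
          PipelineOptionsFactory.as(FlinkPipelineOptions.class);
      options.setRunner(FlinkRunner.class);
      // The Flink client classes bundled with the Beam Flink runner do
      // the actual submission here, which is why their version has to
      // line up with the version of the cluster at this address.
      options.setFlinkMaster("localhost:8081");
      Pipeline p = Pipeline.create(options);
      // ... build the pipeline here ...
      p.run();
    }
  }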

On 17.09.18 09:48, Ismaël Mejía wrote:
> In the Spark runner, the user provides the core Spark dependencies at runtime,
> and we assume that backwards compatibility is kept (in upstream Spark). We
> support the whole 2.x line, but we try to keep the version close to the latest
> stable release.
> 
> Notice, however, that we lack tests to validate that all versions do work; I
> remember some issues with metrics during the migration to Spark 2.x with older
> versions of Spark (<= 2.1). Those worked flawlessly with more recent versions.
> 
> I don't know if Flink could do something like this (become a provided
> dep), in particular for the current case, where there seem to be no
> API-breaking changes.
> 
> In any case, +1 to getting our act together a bit on this.
> 
> On Mon, Sep 17, 2018 at 9:31 AM Robert Bradshaw <ro...@google.com> wrote:
>>
>> On Mon, Sep 17, 2018 at 2:02 AM Austin Bennett <wh...@gmail.com> wrote:
>>>
>>> Do we currently maintain a finer-grained list of compatibility between execution/runner versions and Beam versions?  Is this only really a concern with recent Flink (it sounded like at least the Spark jump, too)?  I see the capability matrix:  https://beam.apache.org/documentation/runners/capability-matrix/, but some sort of compatibility mapping between runner versions and Beam releases might be useful.
>>>
>>> I see a compatibility matrix for Beam features, but not for the underlying runners.  E.g., something like this would save a user trying to get Beam working on recent Flink 1.6 and then subsequently hitting a (potentially not well documented) wall given known issues.
>>
>>
>> +1. I was bitten by this as well.
>>
>> I don't know if it's worth having a compatibility matrix for each version (as the overlap is likely to be all or nothing in most cases), but it should be prominently displayed here and elsewhere. Want to send out a PR?

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Vishwas Bm <bm...@gmail.com>.
Hi,

As part of our POC, we are testing the Spark runner. We tried to submit a
job to a Spark (2.2) cluster running on Kubernetes.
As per our findings during this POC, the Beam capability matrix
https://beam.apache.org/documentation/runners/capability-matrix/ is not
up to date.

The features below are actually supported in the Spark runner (a sketch exercising them follows the list):
1) GroupByKey
2) Event-time triggers (watermark crosses the end of window)
3) Count triggers
4) Allowed lateness
5) Accumulating fired panes.
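
A minimal, hedged sketch exercising these features (it assumes the Spark
runner is on the classpath; the bounded input and the window/lateness
durations are arbitrary placeholders):

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.options.PipelineOptionsFactory;
  import org.apache.beam.sdk.transforms.Count;
  import org.apache.beam.sdk.transforms.Create;
  import org.apache.beam.sdk.transforms.windowing.AfterPane;
  import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
  import org.apache.beam.sdk.transforms.windowing.FixedWindows;
  import org.apache.beam.sdk.transforms.windowing.Window;
  import org.joda.time.Duration;

  public class SparkFeatureCheck {
    public static void main(String[] args) {
      Pipeline p = Pipeline.create(
          PipelineOptionsFactory.fromArgs("--runner=SparkRunner").create());
      p.apply(Create.of("a", "b", "a"))
       // Event-time trigger at the end of the window, count-based early
       // firings, allowed lateness, and accumulating fired panes.
       .apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(1)))
           .triggering(AfterWatermark.pastEndOfWindow()
               .withEarlyFirings(AfterPane.elementCountAtLeast(100)))
           .withAllowedLateness(Duration.standardMinutes(5))
           .accumulatingFiredPanes())
       // Count.perElement() performs the GroupByKey.
       .apply(Count.perElement());
      p.run().waitUntilFinish();
    }
  }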

Thanks & Regards,

Vishwas



On Mon, Sep 17, 2018 at 1:18 PM Ismaël Mejía <ie...@gmail.com> wrote:

> In the Spark runner, the user provides the core Spark dependencies at
> runtime, and we assume that backwards compatibility is kept (in upstream
> Spark). We support the whole 2.x line, but we try to keep the version
> close to the latest stable release.
>
> Notice, however, that we lack tests to validate that all versions do work;
> I remember some issues with metrics during the migration to Spark 2.x with
> older versions of Spark (<= 2.1). Those worked flawlessly with more recent
> versions.
>
> I don't know if Flink could do something like this (become a provided
> dep), in particular for the current case, where there seem to be no
> API-breaking changes.
>
> In any case, +1 to getting our act together a bit on this.
>
> On Mon, Sep 17, 2018 at 9:31 AM Robert Bradshaw <ro...@google.com>
> wrote:
> >
> > On Mon, Sep 17, 2018 at 2:02 AM Austin Bennett <
> whatwouldaustindo@gmail.com> wrote:
> >>
> >> Do we currently maintain a finer-grained list of compatibility between
> execution/runner versions and Beam versions?  Is this only really a concern
> with recent Flink (it sounded like at least the Spark jump, too)?  I see the
> capability matrix:
> https://beam.apache.org/documentation/runners/capability-matrix/, but
> some sort of compatibility mapping between runner versions and Beam
> releases might be useful.
> >>
> >> I see a compatibility matrix for Beam features, but not for the
> underlying runners.  E.g., something like this would save a user trying to
> get Beam working on recent Flink 1.6 and then subsequently hitting a
> (potentially not well documented) wall given known issues.
> >
> >
> > +1. I was bitten by this as well.
> >
> > I don't know if it's worth having a compatibility matrix for each
> version (as the overlap is likely to be all or nothing in most cases), but
> it should be prominently displayed here and elsewhere. Want to send out a
> PR?
>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Ismaël Mejía <ie...@gmail.com>.
In the Spark runner, the user provides the core Spark dependencies at runtime,
and we assume that backwards compatibility is kept (in upstream Spark). We
support the whole 2.x line, but we try to keep the version close to the latest
stable release.

Notice, however, that we lack tests to validate that all versions do work; I
remember some issues with metrics during the migration to Spark 2.x with older
versions of Spark (<= 2.1). Those worked flawlessly with more recent versions.

I don't know if Flink could do something like this (become a provided
dep), in particular for the current case, where there seem to be no
API-breaking changes.

In any case, +1 to getting our act together a bit on this.
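
To make the "provided dep" idea concrete, a hedged sketch of what the
launcher looks like when Spark comes from the cluster rather than from the
Beam artifact (the master value is a placeholder for your setup):

  import org.apache.beam.runners.spark.SparkPipelineOptions;
  import org.apache.beam.runners.spark.SparkRunner;
  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.options.PipelineOptionsFactory;

  public class RunOnProvidedSpark {
    public static void main(String[] args) {
      // The Spark runner compiles against a provided spark-core, so the
      // Spark classes used at runtime are the ones on the cluster's
      // classpath; the same Beam artifact can run on any compatible 2.x.
      SparkPipelineOptions options =
          PipelineOptionsFactory.fromArgs(args).as(SparkPipelineOptions.class);
      options.setRunner(SparkRunner.class);
      options.setSparkMaster("local[4]"); // placeholder
      Pipeline p = Pipeline.create(options);
      // ... build the pipeline here ...
      p.run().waitUntilFinish();
    }
  }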

On Mon, Sep 17, 2018 at 9:31 AM Robert Bradshaw <ro...@google.com> wrote:
>
> On Mon, Sep 17, 2018 at 2:02 AM Austin Bennett <wh...@gmail.com> wrote:
>>
>> Do we currently maintain a finer grained list of compatibility between execution/runner versions and beam versions?  Is this only really a concern with recent Flink (sounded like at least Spark jump, too)?  I see the capability matrix:  https://beam.apache.org/documentation/runners/capability-matrix/, but some sort of compatibility between runner versions with beam releases might be useful.
>>
>> I see compatibility matrix as far as beam features, but not for underlying runners.  Ex: something like this would save a user trying to get Beam working on recent Flink 1.6 and then subsequently hitting a (potentially not well documented) wall given known issues.
>
>
> +1. I was bitten by this as well.
>
> I don't know if it's worth having a compatibility matrix for each version (as the overlap is likely to be all or nothing in most cases), but it should be prominently displayed here and elsewhere. Want to send out a PR?

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Maximilian Michels <mx...@apache.org>.
FYI, I opened a PR with a compatibility table for the Flink Runner page: 
https://github.com/apache/beam-site/pull/553
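
To illustrate the shape of such a table (the versions here are illustrative,
not the PR's actual content; this thread only confirms that Beam currently
tracks Flink 1.5.x, with a 1.6 upgrade proposed):

  Beam version    Supported Flink version
  2.6.x           1.5.x
  next release    1.6.x (if the proposed upgrade lands)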

On 17.09.18 09:31, Robert Bradshaw wrote:
> On Mon, Sep 17, 2018 at 2:02 AM Austin Bennett 
> <whatwouldaustindo@gmail.com> wrote:
> 
>     Do we currently maintain a finer-grained list of compatibility
>     between execution/runner versions and Beam versions?  Is this only
>     really a concern with recent Flink (it sounded like at least the
>     Spark jump, too)?  I see the capability matrix:
>     https://beam.apache.org/documentation/runners/capability-matrix/,
>     but some sort of compatibility mapping between runner versions and
>     Beam releases might be useful.
> 
>     I see a compatibility matrix for Beam features, but not for the
>     underlying runners.  E.g., something like this would save a user
>     trying to get Beam working on recent Flink 1.6 and then subsequently
>     hitting a (potentially not well documented) wall given known issues.
> 
> 
> +1. I was bitten by this as well.
> 
> I don't know if it's worth having a compatibility matrix for each 
> version (as the overlap is likely to be all or nothing in most cases), 
> but it should be prominently displayed here and elsewhere. Want to send 
> out a PR?

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Robert Bradshaw <ro...@google.com>.
On Mon, Sep 17, 2018 at 2:02 AM Austin Bennett <wh...@gmail.com>
wrote:

> Do we currently maintain a finer-grained list of compatibility between
> execution/runner versions and Beam versions?  Is this only really a concern
> with recent Flink (it sounded like at least the Spark jump, too)?  I see the
> capability matrix:
> https://beam.apache.org/documentation/runners/capability-matrix/, but
> some sort of compatibility mapping between runner versions and Beam
> releases might be useful.
>
> I see a compatibility matrix for Beam features, but not for the underlying
> runners.  E.g., something like this would save a user trying to get Beam
> working on recent Flink 1.6 and then subsequently hitting a (potentially
> not well documented) wall given known issues.
>

+1. I was bitten by this as well.

I don't know if it's worth having a compatibility matrix for each version
(as the overlap is likely to be all or nothing in most cases), but it
should be prominently displayed here and elsewhere. Want to send out a PR?

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Austin Bennett <wh...@gmail.com>.
Do we currently maintain a finer-grained list of compatibility between
execution/runner versions and Beam versions?  Is this only really a concern
with recent Flink (it sounded like at least the Spark jump, too)?  I see the
capability matrix:
https://beam.apache.org/documentation/runners/capability-matrix/, but some
sort of compatibility mapping between runner versions and Beam releases
might be useful.

I see a compatibility matrix for Beam features, but not for the underlying
runners.  E.g., something like this would save a user trying to get Beam
working on recent Flink 1.6 and then subsequently hitting a (potentially
not well documented) wall given known issues.



On Sun, Sep 16, 2018 at 3:59 AM Maximilian Michels <mx...@apache.org> wrote:

> > If I understand the LTS proposal correctly, then it will be a release
> line that continues to receive patches (as in semantic versioning), but no
> new features as that would defeat the purpose (stability).
>
> It matters insofar as execution engine upgrades could be performed on
> master, but the LTS version won't receive them. So LTS is the go-to
> if you want to ensure compatibility with your existing setup.
>
> > To limit the pain of dealing with incompatible runner changes and copies
> within Beam, we should probably also work with the respective community to
> improve the compatibility story.
>
> Absolutely. If we find that we can improve compatibility with upstream
> changes, we should go that path. Even if we don't have a dedicated
> compatibility layer upstream yet.
>
> On 13.09.18 19:34, Thomas Weise wrote:
> >
> > On Thu, Sep 13, 2018 at 9:49 AM Maximilian Michels <mxm@apache.org> wrote:
> >
> >     Thank you for your comments. Let me try to summarize what has been
> >     discussed so far:
> >
> >     1. The Beam LTS version will ensure a stable execution engine for
> >     the duration of the LTS life span.
> >
> >
> > If I understand the LTS proposal correctly, then it will be a release
> > line that continues to receive patches (as in semantic versioning), but
> > no new features as that would defeat the purpose (stability).
> >
> > If so, then I don't think LTS matters for this discussion.
> >
> >     2. We agree that pushing updates to the execution engine for the
> >     Runners
> >     is only desirable if it results in a better integration with the Beam
> >     model or if it is necessary due to security or performance reasons.
> >
> >     3. We might have to consider adding additional build targets for a
> >     Runner whenever the execution engine gets upgraded. This might be
> >     really easy if the engine's API remains stable. It might also be
> >     desirable if the upgrade path is not easy and not completely
> >     foreseeable, e.g. Etienne mentioned Spark 1.x vs Spark 2.x Runner.
> The
> >     Beam feature set could vary depending on the version.
> >
> >
> > To limit the pain of dealing with incompatible runner changes and copies
> > within Beam, we should probably also work with the respective community
> > to improve the compatibility story.
> >
> >
> >     4. In the long run, we want a stable abstraction layer for each
> Runner
> >     that, ideally, is maintained by the upstream of the execution
> >     engine. In
> >     the short run, this is probably not realistic, as the shared
> libraries
> >     of Beam are not stable enough.
> >
> >
> > Yes, that will only become an option once we reach interface stability.
> > Similar to how the runner projects maintain their IO connectors.
> >
> >     On 13.09.18 14:39, Robert Bradshaw wrote:
> >      > The ideal long-term solution is, as Romain mentions, pushing the
> >      > runner-specific code up to be maintained by each runner with a
> >     stable
> >      > API to use to talk to Beam. Unfortunately, I think we're still a
> >     long
> >      > way from having this Stable API, or having the clout for
> >      > non-beam-developers to maintain these bindings externally (though
> >      > hopefully we'll get there).
> >      >
> >      > In the short term, we're stuck with either hurting users that
> >     want to
> >      > stick with Flink 1.5, hurting users that want to upgrade to Flink
> >     1.6,
> >      > or supporting both. Is Beam's interaction with Flink such that we
> >     can't
> >      > simply have separate targets linking the same Beam code against
> >     one or
> >      > the other? (I.e. are code changes needed?) If so, we'll probably
> >     need a
> >      > flink-runner-1.5 module, a flink-runner-1.6, and a
> >     flink-runner-common
> >      > module. Or we hope that all users are happy with 1.5 until a
> certain
> >      > point in time when they all want to jump to 1.6
> >     and Beam
> >      > at the same time. Maybe that's enough in the short term, but
> >     longer term
> >      > we need a more sustainable solution.
> >      >
> >      >
> >      > On Thu, Sep 13, 2018 at 7:13 AM Romain Manni-Bucau
> >      > <rmannibucau@gmail.com> wrote:
> >      >
> >      >     Hi guys,
> >      >
> >      >     Isn't the issue "only" that Beam has this code instead of
> >      >     the engines?
> >      >
> >      >     Assuming Beam's runner-facing API is stable - which must be
> >      >     the case anyway - and that each engine has its own integration
> >      >     (flink-beam instead of beam-runners-flink), then this issue
> >      >     disappears by construction.
> >      >
> >      >     It also has the advantage of better maintenance.
> >      >
> >      >     Side note: this is what happened with Arquillian; originally
> >      >     the community did all the adapter implementations, then each
> >      >     vendor took them back in house to make them better.
> >      >
> >      >     Any way to work in that direction, maybe?
> >      >
> >      >     On Thu, Sep 13, 2018, 00:49 Thomas Weise <thw@apache.org> wrote:
> >      >
> >      >         The main problem here is that users are forced to upgrade
> >      >         infrastructure to obtain new features in Beam, even when
> >     those
> >      >         features actually don't require such changes. As an
> example,
> >      >         another update to Flink 1.6.0 was proposed (without
> >     supporting
> >      >         new functionality in Beam) and we already know that it
> breaks
> >      >         compatibility (again).
> >      >
> >      >         I think that upgrading to a Flink X.Y.0 version isn't a
> good
> >      >         idea to start with. But besides that, if we want to grow
> >      >         adoption, then we need to focus on stability and
> delivering
> >      >         improvements to Beam without disrupting users.
> >      >
> >      >         In the specific case, ideally the surface of Flink would
> be
> >      >         backward compatible, allowing us to stick to a minimum
> >     version
> >      >         and be able to submit pipelines to Flink endpoints of
> higher
> >      >         versions. Some work in that direction is underway (like
> >      >         versioning the REST API). FYI, lowest common version is
> what
> >      >         most projects that depend on Hadoop 2.x follow.
> >      >
> >      >         Since Beam with Flink 1.5.x client won't talk to Flink
> >     1.6 and
> >      >         there are code changes required to make it compile, we
> would
> >      >         need to come up with a more involved strategy to support
> >      >         multiple Flink versions. Till then, I would prefer we
> favor
> >      >         existing users over short-lived experiments, which would
> >      >         mean sticking with 1.5.x and not supporting 1.6.0.
> >      >
> >      >         Thanks,
> >      >         Thomas
> >      >
> >      >
> >      >         On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <lcwik@google.com> wrote:
> >      >
> >      >             As others have already suggested, I also believe LTS
> >      >             releases is the best we can do as a community right
> now
> >      >             until portability allows us to decouple what a user
> >     writes
> >      >             with and how it runs (the SDK and the SDK
> >     environment) from
> >      >             the runner (job service + shared common runner libs +
> >      >             Flink/Spark/Dataflow/Apex/Samza/...).
> >      >
> >      >             Dataflow would be highly invested in having the
> >     appropriate
> >      >             tooling within Apache Beam to support multiple SDK
> >     versions
> >      >             against a runner. This in turn would allow people to
> >     use any
> >      >             SDK with any runner and as Robert had mentioned,
> certain
> >      >             optimizations and features would be disabled
> depending on
> >      >             the capabilities of the runner and the capabilities
> >     of the SDK.
> >      >
> >      >
> >      >
> >      >             On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw <robertwb@google.com> wrote:
> >      >
> >      >                 The target audience is people who want to use the
> >     latest
> >      >                 Beam but do not want to use the latest version of
> the
> >      >                 runner, right?
> >      >
> >      >                 I think this will be somewhat (though not
> entirely)
> >      >                 addressed by Beam LTS releases, where those not
> >     wanting
> >      >                 to upgrade the runner at least have a
> well-supported
> >      >                 version of Beam. In the long term, we have the
> >     division
> >      >
> >      >                      Runner <-> BeamRunnerSpecificCode <->
> >      >                 CommonBeamRunnerLibs <-> SDK.
> >      >
> >      >                 (which applies to the job submission as well as
> >     execution).
> >      >
> >      >                 Insomuch as the BeamRunnerSpecificCode uses the
> >     public
> >      >                 APIs of the runner, hopefully upgrading the
> >     runner for
> >      >                 minor versions should be a no-op, and we can
> >     target the
> >      >                 lowest version of the runner that makes sense,
> >     allowing
> >      >                 the user to link against higher versions at his
> >     or her
> >      >                 discretion. We should provide build targets that
> >     allow
> >      >                 this. For major versions, it may make sense to
> >     have two
> >      >                 distinct BeamRunnerSpecificCode libraries (which
> >     may or
> >      >                 may not share some common code). I hope these
> >     wrappers
> >      >                 are not too thick.
> >      >
> >      >                 There is a tight coupling at
> >     the BeamRunnerSpecificCode
> >      >                 <-> CommonBeamRunnerLibs layer, but hopefully the
> >     bulk
> >      >                 of the code lives on the right hand side and can
> be
> >      >                 updated as needed independent of the runner.
> >     There may
> >      >                 be code of the form "if the runner supports X, do
> >     this
> >      >                 fast path, otherwise, do this slow path (or
> >     reject the
> >      >                 pipeline)".
> >      >
> >      >                 I hope the CommonBeamRunnerLibs <-> SDK coupling
> is
> >      >                 fairly loose, to the point that one could use
> >     SDKs from
> >      >                 different versions of Beam (or even developed
> >     outside of
> >      >                 Beam) with an older/newer runner. We may need to
> add
> >      >                 versioning to the Fn/Runner/Job API itself to
> support
> >      >                 this. Right now of course we're still in a
> pre-1.0,
> >      >                 rapid-development phase wrt this API.
> >      >
> >      >
> >      >
> >      >
> >      >                 On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot <echauchot@apache.org> wrote:
> >      >
> >      >                     Hi Max,
> >      >
> >      >                     I totally agree with your points, especially
> >      >                     the users' priorities (stick to the already
> >      >                     working version), and the need to leverage
> >      >                     important new features. It is indeed a
> >      >                     difficult balance to find.
> >      >
> >      >                     I can talk for a part I know: for the Spark
> >     runner,
> >      >                     the aim was to support the native Spark
> >      >                     Dataset API (in place of RDDs). For that we
> >      >                     needed to upgrade to Spark 2.x (and we will
> >      >                     probably leverage Beam Row as well).
> >      >                     But such an upgrade is a good amount of work
> >     which
> >      >                     makes it difficult to commit to a schedule
> >     such as
> >      >                     "if there is a major new feature on an
> execution
> >      >                     engine that we want to leverage, then the
> >     upgrade in
> >      >                     Beam will be done within x months".
> >      >
> >      >                     Regarding your point on portability:
> >      >                     decoupling the SDK from the runner with a
> >      >                     runner harness and an SDK harness might make
> >      >                     pipeline authors' work easier regarding
> >      >                     pipeline maintenance. But, still, if we
> upgrade
> >      >                     runner libs, then the users might have their
> >     runner
> >      >                     harness not work with their engine version.
> >      >                     If such SDK/runner decoupling is 100%
> functional,
> >      >                     then we could imagine having multiple runner
> >      >                     harnesses shipping different versions of the
> >     runner
> >      >                     libs to solve this problem.
> >      >                     But we would need to support more than one
> >     version
> >      >                     of the runner libs. We chose not to do this
> >      >                     on the Spark runner.
> >      >
> >      >                     WDYT?
> >      >
> >      >                     Best
> >      >                     Etienne
> >      >
> >      >

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Maximilian Michels <mx...@apache.org>.
 > To clarify, there are two classes of OldRunner users here: those that 
 > want new features in Beam, and those that simply want to run a supported
 > version of Beam. The LTS proposal helps the latter, which is going to be
 > biased towards those not upgrading their runners. The former are worth
 > supporting as well.

Agree. Apart from the LTS, we want to minimize upgrade pain for new Beam 
versions as much as possible.
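
One way to contain that pain is the version-specific module split suggested
earlier in the thread. A rough, hedged sketch (all module and class names
hypothetical): flink-runner-common codes against a narrow shim that each
version-specific module implements.

  // FlinkClientShim.java (hypothetical flink-runner-common module)
  public interface FlinkClientShim {
    // Submit an already-translated Beam job to the cluster.
    void submitJob(String jobName) throws Exception;
  }

  // Flink16ClientShim.java (hypothetical flink-runner-1.6 module, compiled
  // against the Flink 1.6 client; flink-runner-1.5 would ship its own
  // implementation compiled against the 1.5 client)
  public class Flink16ClientShim implements FlinkClientShim {
    @Override
    public void submitJob(String jobName) throws Exception {
      // Version-specific submission logic using the Flink 1.6 APIs.
    }
  }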

On 17.09.18 09:30, Robert Bradshaw wrote:
> On Sun, Sep 16, 2018 at 12:59 PM Maximilian Michels <mxm@apache.org> wrote:
> 
>      > If I understand the LTS proposal correctly, then it will be a
>     release line that continues to receive patches (as in semantic
>     versioning), but no new features as that would defeat the purpose
>     (stability).
> 
>     It matters insofar as execution engine upgrades could be performed on
>     master, but the LTS version won't receive them. So LTS is the go-to
>     if you want to ensure compatibility with your existing setup.
> 
> 
> To clarify, there are two classes of OldRunner users here: those that 
> want new features in Beam, and those that simply want to run a supported 
> version of Beam. The LTS proposal helps the latter, which is going to be 
> biased towards those not upgrading their runners. The former are worth 
> supporting as well.
> 
>      > To limit the pain of dealing with incompatible runner changes and
>     copies within Beam, we should probably also work with the respective
>     community to improve the compatibility story.
> 
>     Absolutely. If we find that we can improve compatibility with upstream
>     changes, we should go that path. Even if we don't have a dedicated
>     compatibility layer upstream yet.
> 

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Robert Bradshaw <ro...@google.com>.
On Sun, Sep 16, 2018 at 12:59 PM Maximilian Michels <mx...@apache.org> wrote:

> > If I understand the LTS proposal correctly, then it will be a release
> > line that continues to receive patches (as in semantic versioning), but
> > no new features, as that would defeat the purpose (stability).
>
> It matters insofar as execution engine upgrades could be performed on
> master, but the LTS version won't receive them. So LTS is the go-to
> if you want to ensure compatibility with your existing setup.
>

To clarify, there are two classes of users on older runners here: those who
want new features in Beam, and those who simply want to run a supported
version of Beam. The LTS proposal helps the latter, a group that is biased
towards those not upgrading their runners. The former are worth supporting
as well.


> > To limit the pain of dealing with incompatible runner changes and
> > copies within Beam, we should probably also work with the respective
> > community to improve the compatibility story.
>
> Absolutely. If we find that we can improve compatibility with upstream
> changes, we should go down that path, even if we don't have a dedicated
> compatibility layer upstream yet.
>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Austin Bennett <wh...@gmail.com>.
Do we currently maintain a finer-grained list of compatibility between
execution-engine/runner versions and Beam versions? Is this only really a
concern with recent Flink (it sounded like at least the Spark jump, too)? I
see the capability matrix:
https://beam.apache.org/documentation/runners/capability-matrix/, but some
sort of compatibility matrix between runner versions and Beam releases
might be useful.

The capability matrix covers Beam features, but not the underlying
runners. For example, such a table would save a user from trying to get
Beam working on the recent Flink 1.6 and then hitting a (potentially not
well documented) wall caused by known issues.
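
For illustration only, here is a sketch of what such a table could look
like. The entries are taken from statements in this thread, not verified
against actual releases:

    Beam release     Flink runner             Spark runner
    --------------   ----------------------   ---------------
    current master   1.5.x (1.6.0 breaks)     2.x line
    LTS line         1.5.x                    2.x line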




Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Maximilian Michels <mx...@apache.org>.
> If I understand the LTS proposal correctly, then it will be a release line that continues to receive patches (as in semantic versioning), but no new features as that would defeat the purpose (stability).

It matters insofar as execution engine upgrades could be performed on 
master, but the LTS version won't receive them. So LTS is the go-to 
if you want to ensure compatibility with your existing setup.

> To limit the pain of dealing with incompatible runner changes and copies within Beam, we should probably also work with the respective community to improve the compatibility story.

Absolutely. If we find that we can improve compatibility with upstream 
changes, we should go down that path, even if we don't have a dedicated 
compatibility layer upstream yet.


Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Thomas Weise <th...@apache.org>.
On Thu, Sep 13, 2018 at 9:49 AM Maximilian Michels <mx...@apache.org> wrote:

> Thank you for your comments. Let me try to summarize what has been
> discussed so far:
>
> 1. The Beam LTS version will ensure a stable execution engine for as
> long as the LTS life span lasts.
>

If I understand the LTS proposal correctly, then it will be a release line
that continues to receive patches (as in semantic versioning), but no new
features, as that would defeat the purpose (stability).

If so, then I don't think LTS matters for this discussion.

> 2. We agree that pushing updates to the execution engine for the Runners
> is only desirable if it results in a better integration with the Beam
> model or if it is necessary due to security or performance reasons.
>
> 3. We might have to consider adding additional build targets for a
> Runner whenever the execution engine gets upgraded. This might be
> really easy if the engine's API remains stable. It might also be
> desirable if the upgrade path is not easy and not completely
> foreseeable, e.g. Etienne mentioned Spark 1.x vs Spark 2.x Runner. The
> Beam feature set could vary depending on the version.
>

To limit the pain of dealing with incompatible runner changes and copies
within Beam, we should probably also work with the respective community to
improve the compatibility story.
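
To make point 3 concrete: one low-tech way to let a single runner artifact
tolerate more than one engine release is to probe at runtime for classes
that only exist in newer engine versions and gate optional features on the
result. Below is a hypothetical Java sketch; the class name probed for is
purely illustrative, not a vetted compatibility marker:

    /**
     * Illustrative only: decide at runtime whether the engine on the
     * classpath provides a newer API, instead of failing at class-load
     * time when it does not.
     */
    public final class EngineCapabilities {

      /** True if the (illustrative) marker class for a newer API exists. */
      public static boolean hasNewSubmissionApi() {
        try {
          Class.forName("org.apache.flink.client.program.rest.RestClusterClient");
          return true;
        } catch (ClassNotFoundException e) {
          return false;
        }
      }

      public static void main(String[] args) {
        if (hasNewSubmissionApi()) {
          System.out.println("Using the REST-based submission path.");
        } else {
          System.out.println("Falling back to the legacy submission path.");
        }
      }
    }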


> 4. In the long run, we want a stable abstraction layer for each Runner
> that, ideally, is maintained by the upstream of the execution engine. In
> the short run, this is probably not realistic, as the shared libraries
> of Beam are not stable enough.
>

Yes, that will only become an option once we reach interface stability.
Similar to how the runner projects maintain their IO connectors.
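
As a sketch of what such an abstraction layer might look like (all names
below are invented for illustration; no such interface exists in Beam
today), the runner-facing surface could shrink to a small adapter that
each engine community implements and versions on its own schedule:

    /**
     * Hypothetical adapter: the only engine surface the shared Beam
     * runner libraries would touch. One implementation per engine (and
     * per engine major version, if needed) could live upstream with
     * that engine, next to its IO connectors.
     */
    public interface ExecutionEngineAdapter {

      /** Human-readable engine version, e.g. "1.5.3". */
      String engineVersion();

      /** Submits a fully constructed portable pipeline; returns a job id. */
      String submit(byte[] portablePipeline) throws Exception;

      /** Cancels a previously submitted job. */
      void cancel(String jobId) throws Exception;
    }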


> On 13.09.18 14:39, Robert Bradshaw wrote:
> > The ideal long-term solution is, as Romain mentions, pushing the
> > runner-specific code up to be maintained by each runner with a stable
> > API to use to talk to Beam. Unfortunately, I think we're still a long
> > way from having this Stable API, or having the clout for
> > non-beam-developers to maintain these bindings externally (though
> > hopefully we'll get there).
> >
> > In the short term, we're stuck with either hurting users that want to
> > stick with Flink 1.5, hurting users that want to upgrade to Flink 1.6,
> > or supporting both. Is Beam's interaction with Flink such that we can't
> > simply have separate targets linking the same Beam code against one or
> > the other? (I.e. are code changes needed?) If so, we'll probably need a
> > flink-runner-1.5 module, a flink-runner-1.6, and a flink-runner-common
> > module. Or we hope that all users are happy with 1.5 until a certain
> > point in time when they all want to jump to 1.6 and a new Beam
> > simultaneously. Maybe that's enough in the short term, but longer term
> > we need a more sustainable solution.
> >
> >
> > On Thu, Sep 13, 2018 at 7:13 AM Romain Manni-Bucau
> > <rmannibucau@gmail.com> wrote:
> >
> >     Hi guys,
> >
> >     Isn't the issue "only" that Beam has this code instead of the engines?
> >
> >     Assuming Beam's runner-facing API is stable - which must be the case
> >     anyway - and that each engine has its own integration (flink-beam
> >     instead of beam-runners-flink), then this issue disappears by
> >     construction.
> >
> >     It also has the advantage of better maintenance.
> >
> >     Side note: this is what happened with Arquillian: originally the
> >     community did all the adapter impls, then each vendor took them back
> >     in house to make them better.
> >
> >     Any way to work in that direction maybe?
> >
> >     On Thu, Sep 13, 2018 at 00:49, Thomas Weise <thw@apache.org> wrote:
> >
> >         The main problem here is that users are forced to upgrade
> >         infrastructure to obtain new features in Beam, even when those
> >         features actually don't require such changes. As an example,
> >         another update to Flink 1.6.0 was proposed (without supporting
> >         new functionality in Beam) and we already know that it breaks
> >         compatibility (again).
> >
> >         I think that upgrading to a Flink X.Y.0 version isn't a good
> >         idea to start with. But besides that, if we want to grow
> >         adoption, then we need to focus on stability and delivering
> >         improvements to Beam without disrupting users.
> >
> >         In the specific case, ideally the surface of Flink would be
> >         backward compatible, allowing us to stick to a minimum version
> >         and be able to submit pipelines to Flink endpoints of higher
> >         versions. Some work in that direction is underway (like
> >         versioning the REST API). FYI, lowest common version is what
> >         most projects that depend on Hadoop 2.x follow.
> >
> >         Since Beam with the Flink 1.5.x client won't talk to Flink 1.6
> >         and there are code changes required to make it compile, we would
> >         need to come up with a more involved strategy to support
> >         multiple Flink versions. Till then, I would prefer we favor
> >         existing users over short-lived experiments, which would mean
> >         sticking with 1.5.x and not supporting 1.6.0.
> >
> >         Thanks,
> >         Thomas
> >
> >
> >         On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <lcwik@google.com> wrote:
> >
> >             As others have already suggested, I also believe LTS
> >             releases are the best we can do as a community right now
> >             until portability allows us to decouple what a user writes
> >             with and how it runs (the SDK and the SDK environment) from
> >             the runner (job service + shared common runner libs +
> >             Flink/Spark/Dataflow/Apex/Samza/...).
> >
> >             Dataflow would be highly invested in having the appropriate
> >             tooling within Apache Beam to support multiple SDK versions
> >             against a runner. This in turn would allow people to use any
> >             SDK with any runner and as Robert had mentioned, certain
> >             optimizations and features would be disabled depending on
> >             the capabilities of the runner and the capabilities of the
> SDK.
> >
> >
> >
> >             On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw
> >             <robertwb@google.com> wrote:
> >
> >                 The target audience is people who want to use the latest
> >                 Beam but do not want to use the latest version of the
> >                 runner, right?
> >
> >                 I think this will be somewhat (though not entirely)
> >                 addressed by Beam LTS releases, where those not wanting
> >                 to upgrade the runner at least have a well-supported
> >                 version of Beam. In the long term, we have the division
> >
> >                      Runner <-> BeamRunnerSpecificCode <->
> >                 CommonBeamRunnerLibs <-> SDK.
> >
> >                 (which applies to the job submission as well as
> execution).
> >
> >                 Inasmuch as the BeamRunnerSpecificCode uses the public
> >                 APIs of the runner, hopefully upgrading the runner for
> >                 minor versions should be a no-op, and we can target the
> >                 lowest version of the runner that makes sense, allowing
> >                 the user to link against higher versions at his or her
> >                 discretion. We should provide build targets that allow
> >                 this. For major versions, it may make sense to have two
> >                 distinct BeamRunnerSpecificCode libraries (which may or
> >                 may not share some common code). I hope these wrappers
> >                 are not too thick.
> >
> >                 There is a tight coupling at the BeamRunnerSpecificCode
> >                 <-> CommonBeamRunnerLibs layer, but hopefully the bulk
> >                 of the code lives on the right hand side and can be
> >                 updated as needed independent of the runner. There may
> >                 be code of the form "if the runner supports X, do this
> >                 fast path, otherwise, do this slow path (or reject the
> >                 pipeline)."
> >
> >                 I hope the CommonBeamRunnerLibs <-> SDK coupling is
> >                 fairly loose, to the point that one could use SDKs from
> >                 different versions of Beam (or even developed outside of
> >                 Beam) with an older/newer runner. We may need to add
> >                 versioning to the Fn/Runner/Job API itself to support
> >                 this. Right now of course we're still in a pre-1.0,
> >                 rapid-development phase wrt this API.
> >
> >
> >
> >
> >                 On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot
> >                 <echauchot@apache.org> wrote:
> >
> >                     Hi Max,
> >
> >                     I totally agree with your points, especially the
> >                     users' priorities (stick to the already-working
> >                     version) and the need to leverage important new
> >                     features. It is indeed a difficult balance to find.
> >
> >                     I can talk for a part I know: for the Spark runner,
> >                     the aim was to support Spark's native Dataset API (in
> >                     place of RDDs). For that we needed to upgrade to
> >                     Spark 2.x (and we will probably leverage Beam Row as
> >                     well).
> >                     But such an upgrade is a good amount of work, which
> >                     makes it difficult to commit to a schedule such as
> >                     "if there is a major new feature on an execution
> >                     engine that we want to leverage, then the upgrade in
> >                     Beam will be done within x months".
> >
> >                     Regarding your point on portability: decoupling the
> >                     SDK from the runner with a runner harness and an SDK
> >                     harness might make pipeline authors' work easier
> >                     regarding pipeline maintenance. But, still, if we
> >                     upgrade the runner libs, then the users might have
> >                     their runner harness not work with their engine
> >                     version.
> >                     If such SDK/runner decoupling is 100% functional,
> >                     then we could imagine having multiple runner
> >                     harnesses shipping different versions of the runner
> >                     libs to solve this problem.
> >                     But we would need to support more than one version
> >                     of the runner libs. We chose not to do this in the
> >                     Spark runner.
> >
> >                     WDYT ?
> >
> >                     Best
> >                     Etienne
> >
> >
> >                     On Tuesday, September 11, 2018 at 15:42 +0200,
> >                     Maximilian Michels wrote:
> >>                     Hi Beamers,
> >>
> >>                     In the light of the discussion about Beam LTS
> >>                     releases, I'd like to kick off a thread about how
> >>                     often we upgrade the execution engine of each
> >>                     Runner. By upgrade, I mean major/minor versions
> >>                     which typically break the binary compatibility of
> >>                     Beam pipelines.
> >>
> >>                     For the Flink Runner, we try to track the latest
> >>                     stable version. Some users reported that this can
> >>                     be problematic, as it requires them to potentially
> >>                     upgrade their Flink cluster with a new version of
> >>                     Beam.
> >>
> >>                     From a developer's perspective, it makes sense to
> >>                     migrate as early as possible to the newest version
> >>                     of the execution engine, e.g. to leverage the
> >>                     newest features. From a user's perspective, you
> >>                     don't care about the latest features if your use
> >>                     case still works with Beam.
> >>
> >>                     We have to please both parties. So I'd suggest to
> >>                     upgrade the execution engine whenever necessary
> >>                     (e.g. critical new features, end of life of
> >>                     current version). On the other hand, the upcoming
> >>                     Beam LTS releases will contain a longer-supported
> >>                     version.
> >>
> >>                     Maybe we don't need to discuss much about this but
> >>                     I wanted to hear what the community has to say
> >>                     about it. Particularly, I'd be interested in how
> >>                     the other Runner authors intend to do it.
> >>
> >>                     As far as I understand, with the portability being
> >>                     stable, we could theoretically upgrade the SDK
> >>                     without upgrading the runtime components. That
> >>                     would allow us to defer the upgrade for a longer
> >>                     time.
> >>
> >>                     Best,
> >>                     Max
> >>
>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Maximilian Michels <mx...@apache.org>.
Thank you for your comments. Let me try to summarize what has been 
discussed so far:

1. The Beam LTS version will ensure a stable execution engine for the 
duration of the LTS life span.

2. We agree that pushing updates to the execution engine for the Runners 
is only desirable if it results in a better integration with the Beam 
model or if it is necessary due to security or performance reasons.

3. We might have to consider adding additional build targets for a 
Runner whenever the execution engine gets upgraded. This might be 
really easy if the engine's API remains stable. It might also be 
desirable if the upgrade path is not easy and not completely 
foreseeable, e.g. Etienne mentioned Spark 1.x vs Spark 2.x Runner. The 
Beam feature set could vary depending on the version.

4. In the long run, we want a stable abstraction layer for each Runner 
that, ideally, is maintained by the upstream of the execution engine. In 
the short run, this is probably not realistic, as the shared libraries 
of Beam are not stable enough.


On 13.09.18 14:39, Robert Bradshaw wrote:
> The ideal long-term solution is, as Romain mentions, pushing the 
> runner-specific code up to be maintained by each runner with a stable 
> API to use to talk to Beam. Unfortunately, I think we're still a long 
> way from having this Stable API, or having the clout for 
> non-beam-developers to maintain these bindings externally (though 
> hopefully we'll get there).
> 
> In the short term, we're stuck with either hurting users that want to 
> stick with Flink 1.5, hurting users that want to upgrade to Flink 1.6, 
> or supporting both. Is Beam's interaction with Flink such that we can't 
> simply have separate targets linking the same Beam code against one or 
> the other? (I.e. are code changes needed?) If so, we'll probably need a 
> flink-runner-1.5 module, a flink-runner-1.6, and a flink-runner-common 
> module. Or we hope that all users are happy with 1.5 until a certain 
> point in time when they all want to jump to 1.6 and Beam at the same 
> time. Maybe that's enough in the short term, but longer term 
> we need a more sustainable solution.
> 
> 
> On Thu, Sep 13, 2018 at 7:13 AM Romain Manni-Bucau 
> <rmannibucau@gmail.com> wrote:
> 
>     Hi guys,
> 
>     Isn't the issue "only" that Beam has this code instead of the engines?
> 
>     Assuming Beam's runner-facing API is stable - which must be the case
>     anyway - and that each engine has its own integration (flink-beam
>     instead of beam-runners-flink), then this issue disappears by
>     construction.
> 
>     It also has the advantage of better maintenance.
> 
>     Side note: this is what happened with Arquillian: originally the
>     community did all the adapter impls, then each vendor took them back
>     in house to make them better.
> 
>     Any way to work in that direction maybe?
> 
>     On Thu, Sep 13, 2018 at 00:49, Thomas Weise <thw@apache.org> wrote:
> 
>         The main problem here is that users are forced to upgrade
>         infrastructure to obtain new features in Beam, even when those
>         features actually don't require such changes. As an example,
>         another update to Flink 1.6.0 was proposed (without supporting
>         new functionality in Beam) and we already know that it breaks
>         compatibility (again).
> 
>         I think that upgrading to a Flink X.Y.0 version isn't a good
>         idea to start with. But besides that, if we want to grow
>         adoption, then we need to focus on stability and delivering
>         improvements to Beam without disrupting users.
> 
>         In the specific case, ideally the surface of Flink would be
>         backward compatible, allowing us to stick to a minimum version
>         and be able to submit pipelines to Flink endpoints of higher
>         versions. Some work in that direction is underway (like
>         versioning the REST API). FYI, lowest common version is what
>         most projects that depend on Hadoop 2.x follow.
> 
>         Since Beam with the Flink 1.5.x client won't talk to Flink 1.6
>         and there are code changes required to make it compile, we would
>         need to come up with a more involved strategy to support
>         multiple Flink versions. Till then, I would prefer we favor
>         existing users over short-lived experiments, which would mean
>         sticking with 1.5.x and not supporting 1.6.0.
> 
>         Thanks,
>         Thomas
> 
> 
>         On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <lcwik@google.com> wrote:
> 
>             As others have already suggested, I also believe LTS
>             releases are the best we can do as a community right now
>             until portability allows us to decouple what a user writes
>             with and how it runs (the SDK and the SDK environment) from
>             the runner (job service + shared common runner libs +
>             Flink/Spark/Dataflow/Apex/Samza/...).
> 
>             Dataflow would be highly invested in having the appropriate
>             tooling within Apache Beam to support multiple SDK versions
>             against a runner. This in turn would allow people to use any
>             SDK with any runner and as Robert had mentioned, certain
>             optimizations and features would be disabled depending on
>             the capabilities of the runner and the capabilities of the SDK.
> 
> 
> 
>             On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw
>             <robertwb@google.com> wrote:
> 
>                 The target audience is people who want to use the latest
>                 Beam but do not want to use the latest version of the
>                 runner, right?
> 
>                 I think this will be somewhat (though not entirely)
>                 addressed by Beam LTS releases, where those not wanting
>                 to upgrade the runner at least have a well-supported
>                 version of Beam. In the long term, we have the division
> 
>                      Runner <-> BeamRunnerSpecificCode <->
>                 CommonBeamRunnerLibs <-> SDK.
> 
>                 (which applies to the job submission as well as execution).
> 
>                 Inasmuch as the BeamRunnerSpecificCode uses the public
>                 APIs of the runner, hopefully upgrading the runner for
>                 minor versions should be a no-op, and we can target the
>                 lowest version of the runner that makes sense, allowing
>                 the user to link against higher versions at his or her
>                 discretion. We should provide build targets that allow
>                 this. For major versions, it may make sense to have two
>                 distinct BeamRunnerSpecificCode libraries (which may or
>                 may not share some common code). I hope these wrappers
>                 are not too thick.
> 
>                 There is a tight coupling at the BeamRunnerSpecificCode
>                 <-> CommonBeamRunnerLibs layer, but hopefully the bulk
>                 of the code lives on the right hand side and can be
>                 updated as needed independent of the runner. There may
>                 be code of the form "if the runner supports X, do this
>                 fast path, otherwise, do this slow path (or reject the
>                 pipeline)."
> 
>                 I hope the CommonBeamRunnerLibs <-> SDK coupling is
>                 fairly loose, to the point that one could use SDKs from
>                 different versions of Beam (or even developed outside of
>                 Beam) with an older/newer runner. We may need to add
>                 versioning to the Fn/Runner/Job API itself to support
>                 this. Right now of course we're still in a pre-1.0,
>                 rapid-development phase wrt this API.
> 
> 
> 
> 
>                 On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot
>                 <echauchot@apache.org> wrote:
> 
>                     Hi Max,
> 
>                     I totally agree with your points, especially the
>                     users' priorities (stick to the already-working
>                     version) and the need to leverage important new
>                     features. It is indeed a difficult balance to find.
> 
>                     I can talk for a part I know: for the Spark runner,
>                     the aim was to support Spark's native Dataset API (in
>                     place of RDDs). For that we needed to upgrade to
>                     Spark 2.x (and we will probably leverage Beam Row as
>                     well).
>                     But such an upgrade is a good amount of work, which
>                     makes it difficult to commit to a schedule such as
>                     "if there is a major new feature on an execution
>                     engine that we want to leverage, then the upgrade in
>                     Beam will be done within x months".
> 
>                     Regarding your point on portability: decoupling the
>                     SDK from the runner with a runner harness and an SDK
>                     harness might make pipeline authors' work easier
>                     regarding pipeline maintenance. But, still, if we
>                     upgrade the runner libs, then the users might have
>                     their runner harness not work with their engine
>                     version.
>                     If such SDK/runner decoupling is 100% functional,
>                     then we could imagine having multiple runner
>                     harnesses shipping different versions of the runner
>                     libs to solve this problem.
>                     But we would need to support more than one version
>                     of the runner libs. We chose not to do this in the
>                     Spark runner.
> 
>                     WDYT ?
> 
>                     Best
>                     Etienne
> 
> 
>                     On Tuesday, September 11, 2018 at 15:42 +0200,
>                     Maximilian Michels wrote:
>>                     Hi Beamers,
>>
>>                     In the light of the discussion about Beam LTS releases, I'd like to kick
>>                     off a thread about how often we upgrade the execution engine of each
>>                     Runner. By upgrade, I mean major/minor versions which typically break
>>                     the binary compatibility of Beam pipelines.
>>
>>                     For the Flink Runner, we try to track the latest stable version. Some
>>                     users reported that this can be problematic, as it requires them to
>>                     potentially upgrade their Flink cluster with a new version of Beam.
>>
>>                     From a developer's perspective, it makes sense to migrate as early as
>>                     possible to the newest version of the execution engine, e.g. to leverage
>>                     the newest features. From a user's perspective, you don't care about the
>>                     latest features if your use case still works with Beam.
>>
>>                     We have to please both parties. So I'd suggest to upgrade the execution
>>                     engine whenever necessary (e.g. critical new features, end of life of
>>                     current version). On the other hand, the upcoming Beam LTS releases will
>>                     contain a longer-supported version.
>>
>>                     Maybe we don't need to discuss much about this but I wanted to hear what
>>                     the community has to say about it. Particularly, I'd be interested in
>>                     how the other Runner authors intend to do it.
>>
>>                     As far as I understand, with the portability being stable, we could
>>                     theoretically upgrade the SDK without upgrading the runtime components.
>>                     That would allow us to defer the upgrade for a longer time.
>>
>>                     Best,
>>                     Max
>>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Robert Bradshaw <ro...@google.com>.
The ideal long-term solution is, as Romain mentions, pushing the
runner-specific code up to be maintained by each runner with a stable API
to use to talk to Beam. Unfortunately, I think we're still a long way from
having this Stable API, or having the clout for non-beam-developers to
maintain these bindings externally (though hopefully we'll get there).

In the short term, we're stuck with either hurting users that want to stick
with Flink 1.5, hurting users that want to upgrade to Flink 1.6, or
supporting both. Is Beam's interaction with Flink such that we can't simply
have separate targets linking the same Beam code against one or the other?
(I.e. are code changes needed?) If so, we'll probably need a
flink-runner-1.5 module, a flink-runner-1.6, and a flink-runner-common
module. Or we hope that all users are happy with 1.5 until a certain point
in time when they all want to jump to 1.6 and Beam at the same time.
Maybe that's enough in the short term, but longer term we need a
more sustainable solution.
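
To make the separate-targets idea concrete, here is a minimal sketch (all
names are hypothetical, not actual Beam code) of how a flink-runner-common
module could own a narrow interface while flink-runner-1.5 and
flink-runner-1.6 each compile a thin shim against their own Flink
dependency:

    // in flink-runner-common: the surface the shared Beam code uses
    interface FlinkClientCompat {
      void submitJob(String jobName) throws Exception;
    }

    // in flink-runner-1.5, compiled against the Flink 1.5.x client
    class Flink15ClientCompat implements FlinkClientCompat {
      @Override
      public void submitJob(String jobName) throws Exception {
        // the 1.5.x submission calls would live here
        System.out.println("submitting " + jobName + " via the 1.5.x client");
      }
    }

    // in flink-runner-1.6, compiled against the Flink 1.6.x client
    class Flink16ClientCompat implements FlinkClientCompat {
      @Override
      public void submitJob(String jobName) throws Exception {
        // the 1.6.x submission calls would live here
        System.out.println("submitting " + jobName + " via the 1.6.x client");
      }
    }

The shared code would only ever see FlinkClientCompat, so only the two thin
shim modules would need to track Flink's API changes.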


On Thu, Sep 13, 2018 at 7:13 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> Hi guys,
>
> Isn't the issue "only" that Beam has this code instead of the engines?
>
> Assuming Beam's runner-facing API is stable - which must be the case anyway
> - and that each engine has its own integration (flink-beam instead of
> beam-runners-flink), then this issue disappears by construction.
>
> It also has the advantage of better maintenance.
>
> Side note: this is what happened with Arquillian: originally the community
> did all the adapter impls, then each vendor took them back in house to make
> them better.
>
> Any way to work in that direction maybe?
>
> On Thu, Sep 13, 2018 at 00:49, Thomas Weise <th...@apache.org> wrote:
>
>> The main problem here is that users are forced to upgrade infrastructure
>> to obtain new features in Beam, even when those features actually don't
>> require such changes. As an example, another update to Flink 1.6.0 was
>> proposed (without supporting new functionality in Beam) and we already know
>> that it breaks compatibility (again).
>>
>> I think that upgrading to a Flink X.Y.0 version isn't a good idea to
>> start with. But besides that, if we want to grow adoption, then we need to
>> focus on stability and delivering improvements to Beam without disrupting
>> users.
>>
>> In the specific case, ideally the surface of Flink would be backward
>> compatible, allowing us to stick to a minimum version and be able to submit
>> pipelines to Flink endpoints of higher versions. Some work in that
>> direction is underway (like versioning the REST API). FYI, lowest common
>> version is what most projects that depend on Hadoop 2.x follow.
>>
>> Since Beam with the Flink 1.5.x client won't talk to Flink 1.6 and there
>> are code changes required to make it compile, we would need to come up
>> with a more involved strategy to support multiple Flink versions. Till
>> then, I would prefer we favor existing users over short-lived experiments,
>> which would mean sticking with 1.5.x and not supporting 1.6.0.
>>
>> Thanks,
>> Thomas
>>
>>
>> On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <lc...@google.com> wrote:
>>
>>> As others have already suggested, I also believe LTS releases are the
>>> best we can do as a community right now until portability allows us to
>>> decouple what a user writes with and how it runs (the SDK and the SDK
>>> environment) from the runner (job service + shared common runner libs +
>>> Flink/Spark/Dataflow/Apex/Samza/...).
>>>
>>> Dataflow would be highly invested in having the appropriate tooling
>>> within Apache Beam to support multiple SDK versions against a runner. This
>>> in turn would allow people to use any SDK with any runner and as Robert had
>>> mentioned, certain optimizations and features would be disabled depending
>>> on the capabilities of the runner and the capabilities of the SDK.
>>>
>>>
>>>
>>> On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> The target audience is people who want to use the latest Beam but do
>>>> not want to use the latest version of the runner, right?
>>>>
>>>> I think this will be somewhat (though not entirely) addressed by Beam
>>>> LTS releases, where those not wanting to upgrade the runner at least have a
>>>> well-supported version of Beam. In the long term, we have the division
>>>>
>>>>     Runner <-> BeamRunnerSpecificCode <-> CommonBeamRunnerLibs <-> SDK.
>>>>
>>>> (which applies to the job submission as well as execution).
>>>>
>>>> Inasmuch as the BeamRunnerSpecificCode uses the public APIs of the
>>>> runner, hopefully upgrading the runner for minor versions should be a
>>>> no-op, and we can target the lowest version of the runner that makes sense,
>>>> allowing the user to link against higher versions at his or her discretion.
>>>> We should provide build targets that allow this. For major versions, it may
>>>> make sense to have two distinct BeamRunnerSpecificCode libraries (which may
>>>> or may not share some common code). I hope these wrappers are not too
>>>> thick.
>>>>
>>>> There is a tight coupling at the BeamRunnerSpecificCode <->
>>>> CommonBeamRunnerLibs layer, but hopefully the bulk of the code lives on the
>>>> right hand side and can be updated as needed independent of the runner.
>>>> There may be code of the form "if the runner supports X, do this fast path,
>>>> otherwise, do this slow path (or reject the pipeline)."
>>>>
>>>> I hope the CommonBeamRunnerLibs <-> SDK coupling is fairly loose, to
>>>> the point that one could use SDKs from different versions of Beam (or even
>>>> developed outside of Beam) with an older/newer runner. We may need to add
>>>> versioning to the Fn/Runner/Job API itself to support this. Right now of
>>>> course we're still in a pre-1.0, rapid-development phase wrt this API.
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot <ec...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Max,
>>>>>
>>>>> I totally agree with your points, especially the users' priorities
>>>>> (stick to the already-working version) and the need to leverage important
>>>>> new features. It is indeed a difficult balance to find.
>>>>>
>>>>> I can talk for a part I know: for the Spark runner, the aim was to
>>>>> support Spark's native Dataset API (in place of RDDs). For that we needed
>>>>> to upgrade to Spark 2.x (and we will probably leverage Beam Row as well).
>>>>> But such an upgrade is a good amount of work, which makes it difficult
>>>>> to commit to a schedule such as "if there is a major new feature on an
>>>>> execution engine that we want to leverage, then the upgrade in Beam will be
>>>>> done within x months".
>>>>>
>>>>> Regarding your point on portability: decoupling the SDK from the runner
>>>>> with a runner harness and an SDK harness might make pipeline authors' work
>>>>> easier regarding pipeline maintenance. But, still, if we upgrade the runner
>>>>> libs, then the users might have their runner harness not work with their
>>>>> engine version.
>>>>> If such SDK/runner decoupling is 100% functional, then we could
>>>>> imagine having multiple runner harnesses shipping different versions of the
>>>>> runner libs to solve this problem.
>>>>> But we would need to support more than one version of the runner libs.
>>>>> We chose not to do this in the Spark runner.
>>>>>
>>>>> WDYT ?
>>>>>
>>>>> Best
>>>>> Etienne
>>>>>
>>>>>
>>>>> On Tuesday, September 11, 2018 at 15:42 +0200, Maximilian Michels wrote:
>>>>>
>>>>> Hi Beamers,
>>>>>
>>>>>
>>>>> In the light of the discussion about Beam LTS releases, I'd like to kick
>>>>>
>>>>> off a thread about how often we upgrade the execution engine of each
>>>>>
>>>>> Runner. By upgrade, I mean major/minor versions which typically break
>>>>>
>>>>> the binary compatibility of Beam pipelines.
>>>>>
>>>>>
>>>>> For the Flink Runner, we try to track the latest stable version. Some
>>>>>
>>>>> users reported that this can be problematic, as it requires them to
>>>>>
>>>>> potentially upgrade their Flink cluster with a new version of Beam.
>>>>>
>>>>>
>>>>> From a developer's perspective, it makes sense to migrate as early as
>>>>>
>>>>> possible to the newest version of the execution engine, e.g. to leverage
>>>>>
>>>>> the newest features. From a user's perspective, you don't care about the
>>>>>
>>>>> latest features if your use case still works with Beam.
>>>>>
>>>>>
>>>>> We have to please both parties. So I'd suggest to upgrade the execution
>>>>>
>>>>> engine whenever necessary (e.g. critical new features, end of life of
>>>>>
>>>>> current version). On the other hand, the upcoming Beam LTS releases will
>>>>>
>>>>> contain a longer-supported version.
>>>>>
>>>>>
>>>>> Maybe we don't need to discuss much about this but I wanted to hear what
>>>>>
>>>>> the community has to say about it. Particularly, I'd be interested in
>>>>>
>>>>> how the other Runner authors intend to do it.
>>>>>
>>>>>
>>>>> As far as I understand, with the portability being stable, we could
>>>>>
>>>>> theoretically upgrade the SDK without upgrading the runtime components.
>>>>>
>>>>> That would allow us to defer the upgrade for a longer time.
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Max
>>>>>
>>>>>
>>>>>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Hi guys,

Isn't the issue "only" that Beam has this code instead of the engines?

Assuming Beam's runner-facing API is stable - which must be the case anyway -
and that each engine has its own integration (flink-beam instead of
beam-runners-flink), then this issue disappears by construction.

It also has the advantage of better maintenance.

Side note: this is what happened with Arquillian: originally the community
did all the adapter impls, then each vendor took them back in house to make
them better.

Any way to work in that direction maybe?

On Thu, Sep 13, 2018 at 00:49, Thomas Weise <th...@apache.org> wrote:

> The main problem here is that users are forced to upgrade infrastructure
> to obtain new features in Beam, even when those features actually don't
> require such changes. As an example, another update to Flink 1.6.0 was
> proposed (without supporting new functionality in Beam) and we already know
> that it breaks compatibility (again).
>
> I think that upgrading to a Flink X.Y.0 version isn't a good idea to start
> with. But besides that, if we want to grow adoption, then we need to focus
> on stability and delivering improvements to Beam without disrupting users.
>
> In the specific case, ideally the surface of Flink would be backward
> compatible, allowing us to stick to a minimum version and be able to submit
> pipelines to Flink endpoints of higher versions. Some work in that
> direction is underway (like versioning the REST API). FYI, lowest common
> version is what most projects that depend on Hadoop 2.x follow.
>
> Since Beam with the Flink 1.5.x client won't talk to Flink 1.6 and there
> are code changes required to make it compile, we would need to come up
> with a more involved strategy to support multiple Flink versions. Till
> then, I would prefer we favor existing users over short-lived experiments,
> which would mean sticking with 1.5.x and not supporting 1.6.0.
>
> Thanks,
> Thomas
>
>
> On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <lc...@google.com> wrote:
>
>> As others have already suggested, I also believe LTS releases are the best
>> we can do as a community right now until portability allows us to decouple
>> what a user writes with and how it runs (the SDK and the SDK environment)
>> from the runner (job service + shared common runner libs +
>> Flink/Spark/Dataflow/Apex/Samza/...).
>>
>> Dataflow would be highly invested in having the appropriate tooling
>> within Apache Beam to support multiple SDK versions against a runner. This
>> in turn would allow people to use any SDK with any runner and as Robert had
>> mentioned, certain optimizations and features would be disabled depending
>> on the capabilities of the runner and the capabilities of the SDK.
>>
>>
>>
>> On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> The target audience is people who want to use the latest Beam but do not
>>> want to use the latest version of the runner, right?
>>>
>>> I think this will be somewhat (though not entirely) addressed by Beam
>>> LTS releases, where those not wanting to upgrade the runner at least have a
>>> well-supported version of Beam. In the long term, we have the division
>>>
>>>     Runner <-> BeamRunnerSpecificCode <-> CommonBeamRunnerLibs <-> SDK.
>>>
>>> (which applies to the job submission as well as execution).
>>>
>>> Inasmuch as the BeamRunnerSpecificCode uses the public APIs of the
>>> runner, hopefully upgrading the runner for minor versions should be a
>>> no-op, and we can target the lowest version of the runner that makes sense,
>>> allowing the user to link against higher versions at his or her discretion.
>>> We should provide build targets that allow this. For major versions, it may
>>> make sense to have two distinct BeamRunnerSpecificCode libraries (which may
>>> or may not share some common code). I hope these wrappers are not too
>>> thick.
>>>
>>> There is a tight coupling at the BeamRunnerSpecificCode <->
>>> CommonBeamRunnerLibs layer, but hopefully the bulk of the code lives on the
>>> right hand side and can be updated as needed independent of the runner.
>>> There may be code of the form "if the runner supports X, do this fast path,
>>> otherwise, do this slow path (or reject the pipeline)."
>>>
>>> I hope the CommonBeamRunnerLibs <-> SDK coupling is fairly loose, to the
>>> point that one could use SDKs from different versions of Beam (or even
>>> developed outside of Beam) with an older/newer runner. We may need to add
>>> versioning to the Fn/Runner/Job API itself to support this. Right now of
>>> course we're still in a pre-1.0, rapid-development phase wrt this API.
>>>
>>>
>>>
>>>
>>> On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot <ec...@apache.org>
>>> wrote:
>>>
>>>> Hi Max,
>>>>
>>>> I totally agree with your points, especially the users' priorities (stick
>>>> to the already-working version) and the need to leverage important new
>>>> features. It is indeed a difficult balance to find.
>>>>
>>>> I can talk for a part I know: for the Spark runner, the aim was to
>>>> support Spark's native Dataset API (in place of RDDs). For that we needed
>>>> to upgrade to Spark 2.x (and we will probably leverage Beam Row as well).
>>>> But such an upgrade is a good amount of work, which makes it difficult
>>>> to commit to a schedule such as "if there is a major new feature on an
>>>> execution engine that we want to leverage, then the upgrade in Beam will be
>>>> done within x months".
>>>>
>>>> Regarding your point on portability: decoupling the SDK from the runner
>>>> with a runner harness and an SDK harness might make pipeline authors' work
>>>> easier regarding pipeline maintenance. But, still, if we upgrade the runner
>>>> libs, then the users might have their runner harness not work with their
>>>> engine version.
>>>> If such SDK/runner decoupling is 100% functional, then we could imagine
>>>> having multiple runner harnesses shipping different versions of the runner
>>>> libs to solve this problem.
>>>> But we would need to support more than one version of the runner libs.
>>>> We chose not to do this in the Spark runner.
>>>>
>>>> WDYT ?
>>>>
>>>> Best
>>>> Etienne
>>>>
>>>>
>>>> On Tuesday, September 11, 2018 at 15:42 +0200, Maximilian Michels wrote:
>>>>
>>>> Hi Beamers,
>>>>
>>>>
>>>> In the light of the discussion about Beam LTS releases, I'd like to kick
>>>>
>>>> off a thread about how often we upgrade the execution engine of each
>>>>
>>>> Runner. By upgrade, I mean major/minor versions which typically break
>>>>
>>>> the binary compatibility of Beam pipelines.
>>>>
>>>>
>>>> For the Flink Runner, we try to track the latest stable version. Some
>>>>
>>>> users reported that this can be problematic, as it requires them to
>>>>
>>>> potentially upgrade their Flink cluster with a new version of Beam.
>>>>
>>>>
>>>> From a developer's perspective, it makes sense to migrate as early as
>>>>
>>>> possible to the newest version of the execution engine, e.g. to leverage
>>>>
>>>> the newest features. From a user's perspective, you don't care about the
>>>>
>>>> latest features if your use case still works with Beam.
>>>>
>>>>
>>>> We have to please both parties. So I'd suggest to upgrade the execution
>>>>
>>>> engine whenever necessary (e.g. critical new features, end of life of
>>>>
>>>> current version). On the other hand, the upcoming Beam LTS releases will
>>>>
>>>> contain a longer-supported version.
>>>>
>>>>
>>>> Maybe we don't need to discuss much about this but I wanted to hear what
>>>>
>>>> the community has to say about it. Particularly, I'd be interested in
>>>>
>>>> how the other Runner authors intend to do it.
>>>>
>>>>
>>>> As far as I understand, with the portability being stable, we could
>>>>
>>>> theoretically upgrade the SDK without upgrading the runtime components.
>>>>
>>>> That would allow us to defer the upgrade for a longer time.
>>>>
>>>>
>>>> Best,
>>>>
>>>> Max
>>>>
>>>>
>>>>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Thomas Weise <th...@apache.org>.
The main problem here is that users are forced to upgrade infrastructure to
obtain new features in Beam, even when those features actually don't
require such changes. As an example, another update to Flink 1.6.0 was
proposed (without supporting new functionality in Beam) and we already know
that it breaks compatibility (again).

I think that upgrading to a Flink X.Y.0 version isn't a good idea to start
with. But besides that, if we want to grow adoption, then we need to focus
on stability and delivering improvements to Beam without disrupting users.

In the specific case, ideally the surface of Flink would be backward
compatible, allowing us to stick to a minimum version and be able to submit
pipelines to Flink endpoints of higher versions. Some work in that
direction is underway (like versioning the REST API). FYI, lowest common
version is what most projects that depend on Hadoop 2.x follow.

Since Beam with the Flink 1.5.x client won't talk to Flink 1.6 and there
are code changes required to make it compile, we would need to come up
with a more involved strategy to support multiple Flink versions. Till
then, I would prefer we favor existing users over short-lived experiments,
which would mean sticking with 1.5.x and not supporting 1.6.0.

Thanks,
Thomas
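
As a rough illustration of the "lowest common version" approach, a runner
client could verify the cluster's version before submitting. This is only a
sketch: getClusterVersion() and every other name here stands in for whatever
a versioned (e.g. REST) endpoint would actually expose.

    // EngineVersionGate.java -- hypothetical submission-time check
    class EngineVersionGate {
      static final int MIN_MAJOR = 1;
      static final int MIN_MINOR = 5;

      // Rejects clusters older than the minimum version the runner targets.
      static void checkSubmittable(String clusterVersion) {
        String[] parts = clusterVersion.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        if (major < MIN_MAJOR || (major == MIN_MAJOR && minor < MIN_MINOR)) {
          throw new IllegalStateException(
              "Cluster runs " + clusterVersion + ", but this runner needs "
                  + "at least " + MIN_MAJOR + "." + MIN_MINOR);
        }
      }
    }

Under this scheme checkSubmittable("1.6.1") would pass (newer endpoints stay
usable), while "1.4.2" would be rejected up front instead of failing midway.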


On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <lc...@google.com> wrote:

> As others have already suggested, I also believe LTS releases are the best
> we can do as a community right now until portability allows us to decouple
> what a user writes with and how it runs (the SDK and the SDK environment)
> from the runner (job service + shared common runner libs +
> Flink/Spark/Dataflow/Apex/Samza/...).
>
> Dataflow would be highly invested in having the appropriate tooling within
> Apache Beam to support multiple SDK versions against a runner. This in turn
> would allow people to use any SDK with any runner and as Robert had
> mentioned, certain optimizations and features would be disabled depending
> on the capabilities of the runner and the capabilities of the SDK.
>
>
>
> On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> The target audience is people who want to use the latest Beam but do not
>> want to use the latest version of the runner, right?
>>
>> I think this will be somewhat (though not entirely) addressed by Beam LTS
>> releases, where those not wanting to upgrade the runner at least have a
>> well-supported version of Beam. In the long term, we have the division
>>
>>     Runner <-> BeamRunnerSpecificCode <-> CommonBeamRunnerLibs <-> SDK.
>>
>> (which applies to the job submission as well as execution).
>>
>> Inasmuch as the BeamRunnerSpecificCode uses the public APIs of the
>> runner, hopefully upgrading the runner for minor versions should be a
>> no-op, and we can target the lowest version of the runner that makes sense,
>> allowing the user to link against higher versions at his or her discretion.
>> We should provide build targets that allow this. For major versions, it may
>> make sense to have two distinct BeamRunnerSpecificCode libraries (which may
>> or may not share some common code). I hope these wrappers are not too
>> thick.
>>
>> There is a tight coupling at the BeamRunnerSpecificCode <->
>> CommonBeamRunnerLibs layer, but hopefully the bulk of the code lives on the
>> right hand side and can be updated as needed independent of the runner.
>> There may be code of the form "if the runner supports X, do this fast path,
>> otherwise, do this slow path (or reject the pipeline)."
>>
>> I hope the CommonBeamRunnerLibs <-> SDK coupling is fairly loose, to the
>> point that one could use SDKs from different versions of Beam (or even
>> developed outside of Beam) with an older/newer runner. We may need to add
>> versioning to the Fn/Runner/Job API itself to support this. Right now of
>> course we're still in a pre-1.0, rapid-development phase wrt this API.
>>
>>
>>
>>
>> On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot <ec...@apache.org>
>> wrote:
>>
>>> Hi Max,
>>>
>>> I totally agree with your points, especially the users' priorities (stick
>>> to the already-working version) and the need to leverage important new
>>> features. It is indeed a difficult balance to find.
>>>
>>> I can talk for a part I know: for the Spark runner, the aim was to
>>> support Spark's native Dataset API (in place of RDDs). For that we needed
>>> to upgrade to Spark 2.x (and we will probably leverage Beam Row as well).
>>> But such an upgrade is a good amount of work, which makes it difficult to
>>> commit to a schedule such as "if there is a major new feature on an
>>> execution engine that we want to leverage, then the upgrade in Beam will be
>>> done within x months".
>>>
>>> Regarding your point on portability: decoupling the SDK from the runner
>>> with a runner harness and an SDK harness might make pipeline authors' work
>>> easier regarding pipeline maintenance. But, still, if we upgrade the runner
>>> libs, then the users might have their runner harness not work with their
>>> engine version.
>>> If such SDK/runner decoupling is 100% functional, then we could imagine
>>> having multiple runner harnesses shipping different versions of the runner
>>> libs to solve this problem.
>>> But we would need to support more than one version of the runner libs.
>>> We chose not to do this in the Spark runner.
>>>
>>> WDYT ?
>>>
>>> Best
>>> Etienne
>>>
>>>
>>> On Tuesday, September 11, 2018 at 15:42 +0200, Maximilian Michels wrote:
>>>
>>> Hi Beamers,
>>>
>>>
>>> In the light of the discussion about Beam LTS releases, I'd like to kick
>>>
>>> off a thread about how often we upgrade the execution engine of each
>>>
>>> Runner. By upgrade, I mean major/minor versions which typically break
>>>
>>> the binary compatibility of Beam pipelines.
>>>
>>>
>>> For the Flink Runner, we try to track the latest stable version. Some
>>>
>>> users reported that this can be problematic, as it requires them to
>>>
>>> potentially upgrade their Flink cluster with a new version of Beam.
>>>
>>>
>>> From a developer's perspective, it makes sense to migrate as early as
>>>
>>> possible to the newest version of the execution engine, e.g. to leverage
>>>
>>> the newest features. From a user's perspective, you don't care about the
>>>
>>> latest features if your use case still works with Beam.
>>>
>>>
>>> We have to please both parties. So I'd suggest to upgrade the execution
>>>
>>> engine whenever necessary (e.g. critical new features, end of life of
>>>
>>> current version). On the other hand, the upcoming Beam LTS releases will
>>>
>>> contain a longer-supported version.
>>>
>>>
>>> Maybe we don't need to discuss much about this but I wanted to hear what
>>>
>>> the community has to say about it. Particularly, I'd be interested in
>>>
>>> how the other Runner authors intend to do it.
>>>
>>>
>>> As far as I understand, with the portability being stable, we could
>>>
>>> theoretically upgrade the SDK without upgrading the runtime components.
>>>
>>> That would allow us to defer the upgrade for a longer time.
>>>
>>>
>>> Best,
>>>
>>> Max
>>>
>>>
>>>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Lukasz Cwik <lc...@google.com>.
As others have already suggested, I also believe LTS releases are the best
we can do as a community right now until portability allows us to decouple
what a user writes with and how it runs (the SDK and the SDK environment)
from the runner (job service + shared common runner libs +
Flink/Spark/Dataflow/Apex/Samza/...).

Dataflow would be highly invested in having the appropriate tooling within
Apache Beam to support multiple SDK versions against a runner. This in turn
would allow people to use any SDK with any runner and as Robert had
mentioned, certain optimizations and features would be disabled depending
on the capabilities of the runner and the capabilities of the SDK.
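
To sketch what that decoupling could look like once it exists (every name
here is invented for illustration, not a real Beam API), a job service
might simply advertise the SDK versions it accepts:

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical handshake: any SDK in the advertised set may submit
    // pipelines, independent of the engine version behind the runner.
    class JobServiceHandshake {
      private static final Set<String> SUPPORTED_SDKS =
          new HashSet<>(Arrays.asList("2.6.0", "2.7.0"));

      static void accept(String sdkVersion) {
        if (!SUPPORTED_SDKS.contains(sdkVersion)) {
          throw new IllegalArgumentException(
              "SDK " + sdkVersion + " is not supported by this job service");
        }
      }
    }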



On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw <ro...@google.com> wrote:

> The target audience is people who want to use the latest Beam but do not
> want to use the latest version of the runner, right?
>
> I think this will be somewhat (though not entirely) addressed by Beam LTS
> releases, where those not wanting to upgrade the runner at least have a
> well-supported version of Beam. In the long term, we have the division
>
>     Runner <-> BeamRunnerSpecificCode <-> CommonBeamRunnerLibs <-> SDK.
>
> (which applies to the job submission as well as execution).
>
> Inasmuch as the BeamRunnerSpecificCode uses the public APIs of the runner,
> hopefully upgrading the runner for minor versions should be a no-op, and we
> can target the lowest version of the runner that makes sense, allowing the
> user to link against higher versions at his or her discretion. We should
> provide build targets that allow this. For major versions, it may make
> sense to have two distinct BeamRunnerSpecificCode libraries (which may or
> may not share some common code). I hope these wrappers are not too thick.
>
> There is a tight coupling at the BeamRunnerSpecificCode <->
> CommonBeamRunnerLibs layer, but hopefully the bulk of the code lives on the
> right hand side and can be updated as needed independent of the runner.
> There may be code of the form "if the runner supports X, do this fast path,
> otherwise, do this slow path (or reject the pipeline)."
>
> I hope the CommonBeamRunnerLibs <-> SDK coupling is fairly loose, to the
> point that one could use SDKs from different versions of Beam (or even
> developed outside of Beam) with an older/newer runner. We may need to add
> versioning to the Fn/Runner/Job API itself to support this. Right now of
> course we're still in a pre-1.0, rapid-development phase wrt this API.
>
>
>
>
> On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot <ec...@apache.org>
> wrote:
>
>> Hi Max,
>>
>> I totally agree with your points, especially the users' priorities (stick
>> to the already-working version) and the need to leverage important new
>> features. It is indeed a difficult balance to find.
>>
>> I can talk for a part I know: for the Spark runner, the aim was to
>> support Spark's native Dataset API (in place of RDDs). For that we needed
>> to upgrade to Spark 2.x (and we will probably leverage Beam Row as well).
>> But such an upgrade is a good amount of work, which makes it difficult to
>> commit to a schedule such as "if there is a major new feature on an
>> execution engine that we want to leverage, then the upgrade in Beam will be
>> done within x months".
>>
>> Regarding your point on portability: decoupling the SDK from the runner
>> with a runner harness and an SDK harness might make pipeline authors' work
>> easier regarding pipeline maintenance. But, still, if we upgrade the runner
>> libs, then the users might have their runner harness not work with their
>> engine version.
>> If such SDK/runner decoupling is 100% functional, then we could imagine
>> having multiple runner harnesses shipping different versions of the runner
>> libs to solve this problem.
>> But we would need to support more than one version of the runner libs.
>> We chose not to do this in the Spark runner.
>>
>> WDYT ?
>>
>> Best
>> Etienne
>>
>>
>> On Tuesday, September 11, 2018 at 15:42 +0200, Maximilian Michels wrote:
>>
>> Hi Beamers,
>>
>>
>> In the light of the discussion about Beam LTS releases, I'd like to kick
>>
>> off a thread about how often we upgrade the execution engine of each
>>
>> Runner. By upgrade, I mean major/minor versions which typically break
>>
>> the binary compatibility of Beam pipelines.
>>
>>
>> For the Flink Runner, we try to track the latest stable version. Some
>>
>> users reported that this can be problematic, as it requires them to
>>
>> potentially upgrade their Flink cluster with a new version of Beam.
>>
>>
>> From a developer's perspective, it makes sense to migrate as early as
>>
>> possible to the newest version of the execution engine, e.g. to leverage
>>
>> the newest features. From a user's perspective, you don't care about the
>>
>> latest features if your use case still works with Beam.
>>
>>
>> We have to please both parties. So I'd suggest to upgrade the execution
>>
>> engine whenever necessary (e.g. critical new features, end of life of
>>
>> current version). On the other hand, the upcoming Beam LTS releases will
>>
>> contain a longer-supported version.
>>
>>
>> Maybe we don't need to discuss much about this but I wanted to hear what
>>
>> the community has to say about it. Particularly, I'd be interested in
>>
>> how the other Runner authors intend to do it.
>>
>>
>> As far as I understand, with the portability being stable, we could
>>
>> theoretically upgrade the SDK without upgrading the runtime components.
>>
>> That would allow us to defer the upgrade for a longer time.
>>
>>
>> Best,
>>
>> Max
>>
>>
>>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Robert Bradshaw <ro...@google.com>.
The target audience is people who want to use the latest Beam but do not
want to use the latest version of the runner, right?

I think this will be somewhat (though not entirely) addressed by Beam LTS
releases, where those not wanting to upgrade the runner at least have a
well-supported version of Beam. In the long term, we have the division

    Runner <-> BeamRunnerSpecificCode <-> CommonBeamRunnerLibs <-> SDK.

(which applies to the job submission as well as execution).

Inasmuch as the BeamRunnerSpecificCode uses the public APIs of the runner,
hopefully upgrading the runner for minor versions should be a no-op, and we
can target the lowest version of the runner that makes sense, allowing the
user to link against higher versions at his or her discretion. We should
provide build targets that allow this. For major versions, it may make
sense to have two distinct BeamRunnerSpecificCode libraries (which may or
may not share some common code). I hope these wrappers are not too thick.

There is a tight coupling at the BeamRunnerSpecificCode <->
CommonBeamRunnerLibs layer, but hopefully the bulk of the code lives on the
right hand side and can be updated as needed independent of the runner.
There may be code of the form "if the runner supports X, do this fast path,
otherwise, do this slow path (or reject the pipeline)."

I hope the CommonBeamRunnerLibs <-> SDK coupling is fairly loose, to the
point that one could use SDKs from different versions of Beam (or even
developed outside of Beam) with an older/newer runner. We may need to add
versioning to the Fn/Runner/Job API itself to support this. Right now of
course we're still in a pre-1.0, rapid-development phase wrt this API.
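
To illustrate the "if the runner supports X" pattern above, here is a
minimal sketch; the capability strings and the surrounding class are
invented for illustration, not actual Beam code:

    import java.util.Set;

    // Capability-based translation: prefer the runner's fast path, fall
    // back to a portable expansion, or reject the pipeline outright.
    class CapabilityBasedTranslator {
      static void translateGroupByKey(Set<String> runnerCapabilities) {
        if (runnerCapabilities.contains("native-group-by-key")) {
          // fast path: map directly onto the engine's native primitive
          System.out.println("using the engine's native GroupByKey");
        } else if (runnerCapabilities.contains("state-and-timers")) {
          // slow path: expand into a generic stateful implementation
          System.out.println("expanding GroupByKey via state and timers");
        } else {
          // neither is available: reject the pipeline up front
          throw new UnsupportedOperationException(
              "runner supports neither native GroupByKey nor state/timers");
        }
      }
    }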




On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot <ec...@apache.org>
wrote:

> Hi Max,
>
> I totally agree with your points, especially the users' priorities (stick
> to the already-working version) and the need to leverage important new
> features. It is indeed a difficult balance to find.
>
> I can talk for a part I know: for the Spark runner, the aim was to
> support Spark's native Dataset API (in place of RDDs). For that we needed
> to upgrade to Spark 2.x (and we will probably leverage Beam Row as well).
> But such an upgrade is a good amount of work, which makes it difficult to
> commit to a schedule such as "if there is a major new feature on an
> execution engine that we want to leverage, then the upgrade in Beam will be
> done within x months".
>
> Regarding your point on portability: decoupling the SDK from the runner
> with a runner harness and an SDK harness might make pipeline authors' work
> easier regarding pipeline maintenance. But, still, if we upgrade the runner
> libs, then the users might have their runner harness not work with their
> engine version.
> If such SDK/runner decoupling is 100% functional, then we could imagine
> having multiple runner harnesses shipping different versions of the runner
> libs to solve this problem.
> But we would need to support more than one version of the runner libs.
> We chose not to do this in the Spark runner.
>
> WDYT ?
>
> Best
> Etienne
>
>
> On Tuesday, September 11, 2018 at 15:42 +0200, Maximilian Michels wrote:
>
> Hi Beamers,
>
>
> In the light of the discussion about Beam LTS releases, I'd like to kick
>
> off a thread about how often we upgrade the execution engine of each
>
> Runner. By upgrade, I mean major/minor versions which typically break
>
> the binary compatibility of Beam pipelines.
>
>
> For the Flink Runner, we try to track the latest stable version. Some
>
> users reported that this can be problematic, as it requires them to
>
> potentially upgrade their Flink cluster with a new version of Beam.
>
>
> From a developer's perspective, it makes sense to migrate as early as
>
> possible to the newest version of the execution engine, e.g. to leverage
>
> the newest features. From a user's perspective, you don't care about the
>
> latest features if your use case still works with Beam.
>
>
> We have to please both parties. So I'd suggest to upgrade the execution
>
> engine whenever necessary (e.g. critical new features, end of life of
>
> current version). On the other hand, the upcoming Beam LTS releases will
>
> contain a longer-supported version.
>
>
> Maybe we don't need to discuss much about this but I wanted to hear what
>
> the community has to say about it. Particularly, I'd be interested in
>
> how the other Runner authors intend to do it.
>
>
> As far as I understand, with the portability being stable, we could
>
> theoretically upgrade the SDK without upgrading the runtime components.
>
> That would allow us to defer the upgrade for a longer time.
>
>
> Best,
>
> Max
>
>
>

Re: [Discuss] Upgrade story for Beam's execution engines

Posted by Etienne Chauchot <ec...@apache.org>.
Hi Max,

I totally agree with your points, especially the users' priorities (stick to the already working version) and the need
to leverage important new features. It is indeed a difficult balance to find.

I can talk for a part I know: for the Spark runner, the aim was to support Spark's native Dataset API (in place of
RDD). For that we needed to upgrade to Spark 2.x (and we will probably leverage Beam Row as well).
But such an upgrade is a good amount of work, which makes it difficult to commit to a schedule such as "if there is a
major new feature in an execution engine that we want to leverage, then the upgrade in Beam will be done within x
months".

Regarding your point on portability: decoupling the SDK from the runner with a runner harness and an SDK harness might
make pipeline authors' work easier regarding pipeline maintenance. But still, if we upgrade the runner libs, the users'
runner harness might not work with their engine version.
If such SDK/runner decoupling is 100% functional, then we could imagine having multiple runner harnesses shipping
different versions of the runner libs to solve this problem.
But we would need to support more than one version of the runner libs. We chose not to do this for the Spark runner.
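A rough sketch of what such a version-to-harness mapping could look like
(artifact names here are invented for illustration):

    import java.util.HashMap;
    import java.util.Map;

    /** Hypothetical registry: one runner harness per supported engine version. */
    public final class RunnerHarnessRegistry {

      private static final Map<String, String> HARNESSES = new HashMap<>();
      static {
        HARNESSES.put("spark-2.3", "beam-runners-spark-2.3-harness");
        HARNESSES.put("spark-2.4", "beam-runners-spark-2.4-harness");
      }

      /** Returns the harness artifact built against the given engine version. */
      public static String harnessFor(String engineVersion) {
        String artifact = HARNESSES.get(engineVersion);
        if (artifact == null) {
          throw new IllegalArgumentException(
              "No runner harness built for engine version " + engineVersion);
        }
        return artifact;
      }

      private RunnerHarnessRegistry() {}
    }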

WDYT ?

Best
Etienne


On Tuesday, September 11, 2018 at 15:42 +0200, Maximilian Michels wrote:
> Hi Beamers,
> 
> In the light of the discussion about Beam LTS releases, I'd like to kick 
> off a thread about how often we upgrade the execution engine of each 
> Runner. By upgrade, I mean major/minor versions which typically break 
> the binary compatibility of Beam pipelines.
> 
> For the Flink Runner, we try to track the latest stable version. Some 
> users reported that this can be problematic, as it requires them to 
> potentially upgrade their Flink cluster with a new version of Beam.
> 
>  From a developer's perspective, it makes sense to migrate as early as 
> possible to the newest version of the execution engine, e.g. to leverage 
> the newest features. From a user's perspective, you don't care about the 
> latest features if your use case still works with Beam.
> 
> We have to please both parties. So I'd suggest to upgrade the execution 
> engine whenever necessary (e.g. critical new features, end of life of 
> current version). On the other hand, the upcoming Beam LTS releases will 
> contain a longer-supported version.
> 
> Maybe we don't need to discuss much about this but I wanted to hear what 
> the community has to say about it. Particularly, I'd be interested in 
> how the other Runner authors intend to do it.
> 
> As far as I understand, with the portability being stable, we could 
> theoretically upgrade the SDK without upgrading the runtime components. 
> That would allow us to defer the upgrade for a longer time.
> 
> Best,
> Max
>