Posted to user@flink.apache.org by Robert Metzger <rm...@apache.org> on 2020/10/23 11:42:19 UTC

[SURVEY] Remove Mesos support

Hi all,

I wanted to discuss if it makes sense to remove support for Mesos in Flink.
It seems that nobody is actively maintaining that component (except for
necessary refactorings because of interfaces we are changing), and there
are almost no users reporting issues or asking for features.

The Apache Mesos project itself seems very inactive: There has been only
one message on the dev@ list in the last 3 months.

In 2020, I found only 3 users mentioning that they are using Mesos on the
user@ list.

Maybe it makes sense to add a prominent log warning into the Mesos code in
the Flink 1.12 release, that we are planning to remove Mesos support. Users
will then have enough opportunity to raise concerns or discuss with us.

Best,
Robert

Re: [SURVEY] Remove Mesos support

Posted by Kostas Kloudas <kk...@gmail.com>.
+1 for adding a warning about the removal of Mesos support. I would
also propose to state explicitly in the warning the version in which we
are planning to actually remove it (e.g. 1.13, or even 1.14 if we feel
1.13 is too aggressive).

This will help as a reminder to users and devs about the upcoming
removal and it will avoid future, potentially endless, discussions.

Cheers,
Kostas

On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:
>
> +1 for adding a warning in 1.12 about planning to remove Mesos support.
>
>
> With my developer hat on, removing the Mesos support would definitely reduce the maintenance overhead for the deployment and resource-management-related components. On the other hand, the Flink on Mesos users' voices definitely matter a lot for this community. Either way, it would be good to draw users' attention to this discussion early.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:
>>
>> Hi Robert,
>>
>> +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
>> would still support it in Flink 1.12- with bug fixes for some time so that
>> users have time to move on.
>>
>> It would certainly be very interesting to hear from current Flink on Mesos
>> users, on how they see the evolution of this part of the ecosystem.
>>
>> Best,
>>
>> Konstantin

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Matthias Pohl <ma...@ververica.com>.
Thanks for everyone's feedback. I'm gonna initiate a vote in a separate
thread.

On Mon, Mar 29, 2021 at 9:18 AM Robert Metzger <rm...@apache.org> wrote:

> +1
>
>
>
> On Mon, Mar 29, 2021 at 5:44 AM Yangze Guo <ka...@gmail.com> wrote:
>
> > +1
> >
> > Best,
> > Yangze Guo
> >
> > On Mon, Mar 29, 2021 at 11:31 AM Xintong Song <to...@gmail.com>
> > wrote:
> > >
> > > +1
> > > It has been a matter of fact for a while already that we no longer port new
> > features to the Mesos deployment.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <tr...@apache.org>
> > wrote:
> > >>
> > >> +1 for officially deprecating this component for the 1.13 release.
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org>
> > wrote:
> > >>>
> > >>> Hi Matthias,
> > >>>
> > >>> Thank you for following up on this. +1 to officially deprecate Mesos
> > in the code and documentation, too. It will be confusing for users if
> this
> > diverges from the roadmap.
> > >>>
> > >>> Cheers,
> > >>>
> > >>> Konstantin
> > >>>
> > >>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <
> matthias@ververica.com>
> > wrote:
> > >>>>
> > >>>> Hi everyone,
> > >>>> considering the upcoming release of Flink 1.13, I wanted to revive the
> > >>>> discussion about the Mesos support once more. Mesos is also already listed
> > >>>> as deprecated in Flink's overall roadmap [1]. Maybe it's time to align the
> > >>>> documentation accordingly to make it more explicit?
> > >>>>
> > >>>> What do you think?
> > >>>>
> > >>>> Best,
> > >>>> Matthias
> > >>>>
> > >>>> [1] https://flink.apache.org/roadmap.html#feature-radar
> > >>>>
> > >>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <trohrmann@apache.org
> >
> > wrote:
> > >>>>
> > >>>> > Hi Oleksandr,
> > >>>> >
> > >>>> > yes you are right. The biggest problem is at the moment the lack
> of
> > test
> > >>>> > coverage and thereby confidence to make changes. We have some e2e
> > tests
> > >>>> > which you can find here [1]. These tests are, however, quite
> coarse
> > grained
> > >>>> > and are missing a lot of cases. One idea would be to add a Mesos
> > e2e test
> > >>>> > based on Flink's end-to-end test framework [2]. I think what needs
> > to be
> > >>>> > done there is to add a Mesos resource and a way to submit jobs to
> a
> > Mesos
> > >>>> > cluster to write e2e tests.
> > >>>> >
> > >>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
> > >>>> > [2]
> > >>>> >
> >
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
> > >>>> >
> > >>>> > Cheers,
> > >>>> > Till
> > >>>> >
> > >>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
> > >>>> > o.nitavskyi@criteo.com> wrote:
> > >>>> >
> > >>>> >> Hello Xintong,
> > >>>> >>
> > >>>> >> Thanks for the insights and support.
> > >>>> >>
> > >>>> >> I browsed the Mesos backlog and didn't identify anything critical
> > >>>> >> left there.
> > >>>> >>
> > >>>> >> I see that there were quite a lot of contributions to Flink Mesos
> > >>>> >> in the recent version:
> > >>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
> > >>>> >> We plan to validate the current Flink master (or release 1.12 branch)
> > >>>> >> against our Mesos setup. In case of any issues, we will try to propose
> > >>>> >> changes.
> > >>>> >> My feeling is that our test results shouldn't affect the Flink 1.12
> > >>>> >> release cycle. If any resulting commits land in 1.12.1, that
> > >>>> >> should be totally fine.
> > >>>> >>
> > >>>> >> In the future, we would be glad to help you guys with any
> > >>>> >> maintenance-related questions. One of the highest priorities
> > around this
> > >>>> >> component seems to be the development of the full e2e test.
> > >>>> >>
> > >>>> >> Kind Regards
> > >>>> >> Oleksandr Nitavskyi
> > >>>> >> ________________________________
> > >>>> >> From: Xintong Song <to...@gmail.com>
> > >>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
> > >>>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
> > >>>> >> Cc: Piyush Narang <p....@criteo.com>
> > >>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
> > >>>> >>
> > >>>> >> Hi Piyush,
> > >>>> >>
> > >>>> >> Thanks a lot for sharing the information. It is a great relief that
> > >>>> >> you are good with Flink on Mesos as is.
> > >>>> >>
> > >>>> >> As for the jira issues, I believe the most essential ones should
> > have
> > >>>> >> already been resolved. You may find some remaining open issues
> > here [1],
> > >>>> >> but not all of them are necessary if we decide to keep Flink on
> > Mesos as is.
> > >>>> >>
> > >>>> >> At the moment and in the near future, I think help is mostly needed with
> > >>>> >> testing the upcoming release 1.12 with Mesos use cases. The community is
> > >>>> >> currently actively preparing the new release, and hopefully we can come
> > >>>> >> up with a release candidate early next month. It would be greatly
> > >>>> >> appreciated if you folks, as experienced Flink on Mesos users, could
> > >>>> >> help with verifying the release candidates.
> > >>>> >>
> > >>>> >>
> > >>>> >> Thank you~
> > >>>> >>
> > >>>> >> Xintong Song
> > >>>> >>
> > >>>> >> [1]
> > >>>> >>
> >
> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
> > >>>> >>
> > >>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <
> p.narang@criteo.com
> > >>>> >> <ma...@criteo.com>> wrote:
> > >>>> >>
> > >>>> >> Hi Xintong,
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Do you have any jiras that cover any of the items on 1 or 2? I
> can
> > reach
> > >>>> >> out to folks internally and see if I can get some folks to commit
> > to
> > >>>> >> helping out.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> To cover the other qs:
> > >>>> >>
> > >>>> >>   *   Yes, we don't have a plan at the moment to get off Mesos. We use
> > >>>> >> Yarn for some of our Flink workloads when we can. Mesos is only used
> > >>>> >> when we need streaming capabilities in our WW DCs (as our Yarn is
> > >>>> >> centralized in one DC).
> > >>>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan
> > to bump
> > >>>> >> to 1.11 / 1.12 this quarter.
> > >>>> >>   *   We typically upgrade once every 6 months to a year (not
> every
> > >>>> >> release). We’d like to speed up the cadence but we’re not there
> > yet.
> > >>>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
> > >>>> >> functional while missing out on some of the newer features. We understand
> > >>>> >> the pain on the community's side, and if we see some fancy improvement
> > >>>> >> in Flink on Yarn / K8s that we want in Mesos, we can take on the work
> > >>>> >> to port it over.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Thanks,
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> -- Piyush
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> From: Xintong Song <tonysong820@gmail.com<mailto:
> > tonysong820@gmail.com>>
> > >>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
> > >>>> >> To: dev <de...@flink.apache.org>>,
> user
> > <
> > >>>> >> user@flink.apache.org<ma...@flink.apache.org>>
> > >>>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
> > >>>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
> > >>>> >> p.narang@criteo.com>>
> > >>>> >> Subject: Re: [SURVEY] Remove Mesos support
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Thanks for sharing the information with us, Piyush and Lasse.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> @Piyush
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Thanks for offering the help. IMO, there are currently several
> > problems
> > >>>> >> that make supporting Flink on Mesos challenging for us.
> > >>>> >>
> > >>>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if any)
> > >>>> >> among the active contributors in this community who are familiar
> > >>>> >> with Mesos and can help with development on this component.
> > >>>> >>   2.  Absence of tests. Mesos does not provide a testing cluster like
> > >>>> >> `MiniYARNCluster`, making it hard to test interactions between Flink and
> > >>>> >> Mesos. We have only a few very simple e2e tests running on Mesos deployed
> > >>>> >> in a Docker container, covering the most fundamental workflows. We are not
> > >>>> >> sure how well those tests work, especially against potential corner cases.
> > >>>> >>   3.  Divergence from other deployments. Because of 1 and 2, new
> > >>>> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
> > >>>> >> possible. When new efforts have to touch the Mesos-related components
> > >>>> >> (e.g., changes to the common resource manager interfaces), we have to be
> > >>>> >> very careful and make as few changes as possible, to avoid accidentally
> > >>>> >> breaking anything that we are not familiar with. As a result, the component
> > >>>> >> diverges a lot from other deployment components (K8s/Yarn), which makes it
> > >>>> >> harder to maintain.
> > >>>> >>
> > >>>> >> It would be greatly appreciated if you can help with any of the above
> > >>>> >> issues.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Additionally, I have a few questions concerning your use cases at Criteo.
> > >>>> >> IIUC, you are going to stay on Mesos for the foreseeable future, while
> > >>>> >> keeping the Flink version up-to-date? What Flink version are you currently
> > >>>> >> using? How often do you upgrade (e.g., every release)? Would you be good
> > >>>> >> with keeping the Flink on Mesos component as it is (meaning that deployment
> > >>>> >> and resource management improvements may not be ported to Mesos), while
> > >>>> >> keeping other components up-to-date (e.g., improvements to programming
> > >>>> >> APIs, operators, state backends, etc.)?
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Thank you~
> > >>>> >>
> > >>>> >> Xintong Song
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
> > >>>> >> lassenedergaardflink@gmail.com<mailto:
> > lassenedergaardflink@gmail.com>>
> > >>>> >> wrote:
> > >>>> >>
> > >>>> >> Hi
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> At Trackunit we have been using Mesos for a long time but have now
> > >>>> >> moved to k8s.
> > >>>> >>
> > >>>> >> Best regards
> > >>>> >>
> > >>>> >> Lasse Nedergaard
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> On Oct 23, 2020, at 17:01, Robert Metzger <
> > rmetzger@apache.org
> > >>>> >> <ma...@apache.org>> wrote:
> > >>>> >>
> > >>>> >>
> > >>>> >> Hey Piyush,
> > >>>> >>
> > >>>> >> thanks a lot for raising this concern. I believe we should then keep
> > >>>> >> Mesos in Flink for the foreseeable future.
> > >>>> >>
> > >>>> >> Your offer to help is much appreciated. We'll let you know once
> > there is
> > >>>> >> something.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <
> p.narang@criteo.com
> > >>>> >> <ma...@criteo.com>> wrote:
> > >>>> >>
> > >>>> >> Thanks Kostas. If there are items we can help with, I'm sure we'd be able
> > >>>> >> to find folks who would be excited to contribute / help in any way.
> > >>>> >>
> > >>>> >> -- Piyush
> > >>>> >>
> > >>>> >>
> > >>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com
> > <mailto:
> > >>>> >> kkloudas@gmail.com>> wrote:
> > >>>> >>
> > >>>> >>     Thanks Piyush for the message.
> > >>>> >>     After this, I revoke my +1. I agree with the previous opinions that
> > >>>> >>     we cannot drop code that is actively used by users, especially if
> > >>>> >>     it is something as deep in the stack as support for a cluster
> > >>>> >>     management framework.
> > >>>> >>
> > >>>> >>     Cheers,
> > >>>> >>     Kostas
> > >>>> >>
> > >>>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <
> > p.narang@criteo.com
> > >>>> >> <ma...@criteo.com>> wrote:
> > >>>> >>     >
> > >>>> >>     > Hi folks,
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > We at Criteo are active users of the Flink on Mesos resource
> > >>>> >>     > management component. We are pretty heavy users of Mesos for scheduling
> > >>>> >>     > workloads on our edge datacenters and we do want to continue to be able to
> > >>>> >>     > run some of our Flink topologies (to compute machine learning short term
> > >>>> >>     > features) on those DCs. If possible our vote would be not to drop Mesos
> > >>>> >>     > support as that will tie us to an old release / have to maintain a fork as
> > >>>> >>     > we’re not planning to migrate off Mesos anytime soon. Is the burden
> > >>>> >>     > something that can be helped with by the community? (Or are you referring
> > >>>> >>     > to having to ensure PRs handle the Mesos piece as well when they touch the
> > >>>> >>     > resource managers?)
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > Thanks,
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > -- Piyush
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
> > >>>> >> trohrmann@apache.org>>
> > >>>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
> > >>>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
> > >>>> >> tonysong820@gmail.com>>
> > >>>> >>     > Cc: dev <dev@flink.apache.org<mailto:dev@flink.apache.org
> >>,
> > user <
> > >>>> >> user@flink.apache.org<ma...@flink.apache.org>>
> > >>>> >>     > Subject: Re: [SURVEY] Remove Mesos support
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > Thanks for starting this survey, Robert! I second Konstantin and
> > >>>> >>     > Xintong in the sense that our Mesos users' opinions should matter most
> > >>>> >>     > here. If our community is no longer using the Mesos integration, then I
> > >>>> >>     > would be +1 for removing it in order to decrease the maintenance burden.
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > Cheers,
> > >>>> >>     >
> > >>>> >>     > Till
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <
> > tonysong820@gmail.com
> > >>>> >> <ma...@gmail.com>> wrote:
> > >>>> >>     >
> > >>>> >>     > +1 for adding a warning in 1.12 about planning to remove
> > Mesos
> > >>>> >> support.
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > With my developer hat on, removing the Mesos support would
> > >>>> >>     > definitely reduce the maintenance overhead for the deployment and
> > >>>> >>     > resource-management-related components. On the other hand, the Flink on
> > >>>> >>     > Mesos users' voices definitely matter a lot for this community. Either
> > >>>> >>     > way, it would be good to draw users' attention to this discussion early.
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > Thank you~
> > >>>> >>     >
> > >>>> >>     > Xintong Song
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <
> > knaufk@apache.org
> > >>>> >> <ma...@apache.org>> wrote:
> > >>>> >>     >
> > >>>> >>     > Hi Robert,
> > >>>> >>     >
> > >>>> >>     > +1 to the plan you outlined. If we were to drop support in
> > Flink
> > >>>> >> 1.13+, we
> > >>>> >>     > would still support it in Flink 1.12- with bug fixes for
> > some time
> > >>>> >> so that
> > >>>> >>     > users have time to move on.
> > >>>> >>     >
> > >>>> >>     > It would certainly be very interesting to hear from current
> > Flink
> > >>>> >> on Mesos
> > >>>> >>     > users, on how they see the evolution of this part of the
> > ecosystem.
> > >>>> >>     >
> > >>>> >>     > Best,
> > >>>> >>     >
> > >>>> >>     > Konstantin
> > >>>> >>
> > >>>> >
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>>
> > >>> Konstantin Knauf
> > >>>
> > >>> https://twitter.com/snntrable
> > >>>
> > >>> https://github.com/knaufk
> >

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Matthias Pohl <ma...@ververica.com>.
Thanks for everyone's feedback. I'm gonna initiate a vote in a separate
thread.

On Mon, Mar 29, 2021 at 9:18 AM Robert Metzger <rm...@apache.org> wrote:

> +1
>
>
>
> On Mon, Mar 29, 2021 at 5:44 AM Yangze Guo <ka...@gmail.com> wrote:
>
> > +1
> >
> > Best,
> > Yangze Guo
> >
> > On Mon, Mar 29, 2021 at 11:31 AM Xintong Song <to...@gmail.com>
> > wrote:
> > >
> > > +1
> > > It's already a matter of fact for a while that we no longer port new
> > features to the Mesos deployment.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <tr...@apache.org>
> > wrote:
> > >>
> > >> +1 for officially deprecating this component for the 1.13 release.
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org>
> > wrote:
> > >>>
> > >>> Hi Matthias,
> > >>>
> > >>> Thank you for following up on this. +1 to officially deprecate Mesos
> > in the code and documentation, too. It will be confusing for users if
> this
> > diverges from the roadmap.
> > >>>
> > >>> Cheers,
> > >>>
> > >>> Konstantin
> > >>>
> > >>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <
> matthias@ververica.com>
> > wrote:
> > >>>>
> > >>>> Hi everyone,
> > >>>> considering the upcoming release of Flink 1.13, I wanted to revive
> the
> > >>>> discussion about the Mesos support ones more. Mesos is also already
> > listed
> > >>>> as deprecated in Flink's overall roadmap [1]. Maybe, it's time to
> > align the
> > >>>> documentation accordingly to make it more explicit?
> > >>>>
> > >>>> What do you think?
> > >>>>
> > >>>> Best,
> > >>>> Matthias
> > >>>>
> > >>>> [1] https://flink.apache.org/roadmap.html#feature-radar
> > >>>>
> > >>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <trohrmann@apache.org
> >
> > wrote:
> > >>>>
> > >>>> > Hi Oleksandr,
> > >>>> >
> > >>>> > yes you are right. The biggest problem is at the moment the lack
> of
> > test
> > >>>> > coverage and thereby confidence to make changes. We have some e2e
> > tests
> > >>>> > which you can find here [1]. These tests are, however, quite
> coarse
> > grained
> > >>>> > and are missing a lot of cases. One idea would be to add a Mesos
> > e2e test
> > >>>> > based on Flink's end-to-end test framework [2]. I think what needs
> > to be
> > >>>> > done there is to add a Mesos resource and a way to submit jobs to
> a
> > Mesos
> > >>>> > cluster to write e2e tests.
> > >>>> >
> > >>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
> > >>>> > [2]
> > >>>> >
> >
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
> > >>>> >
> > >>>> > Cheers,
> > >>>> > Till
> > >>>> >
> > >>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
> > >>>> > o.nitavskyi@criteo.com> wrote:
> > >>>> >
> > >>>> >> Hello Xintong,
> > >>>> >>
> > >>>> >> Thanks for the insights and support.
> > >>>> >>
> > >>>> >> Browsing the Mesos backlog and didn't identify anything critical,
> > which
> > >>>> >> is left there.
> > >>>> >>
> > >>>> >> I see that there are were quite a lot of contributions to the
> > Flink Mesos
> > >>>> >> in the recent version:
> > >>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
> > >>>> >> We plan to validate the current Flink master (or release 1.12
> > branch) our
> > >>>> >> Mesos setup. In case of any issues, we will try to propose
> changes.
> > >>>> >> My feeling is that our test results shouldn't affect the Flink
> 1.12
> > >>>> >> release cycle. And if any potential commits will land into the
> > 1.12.1 it
> > >>>> >> should be totally fine.
> > >>>> >>
> > >>>> >> In the future, we would be glad to help you guys with any
> > >>>> >> maintenance-related questions. One of the highest priorities
> > around this
> > >>>> >> component seems to be the development of the full e2e test.
> > >>>> >>
> > >>>> >> Kind Regards
> > >>>> >> Oleksandr Nitavskyi
> > >>>> >> ________________________________
> > >>>> >> From: Xintong Song <to...@gmail.com>
> > >>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
> > >>>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
> > >>>> >> Cc: Piyush Narang <p....@criteo.com>
> > >>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
> > >>>> >>
> > >>>> >> Hi Piyush,
> > >>>> >>
> > >>>> >> Thanks a lot for sharing the information. It would be a great
> > relief that
> > >>>> >> you are good with Flink on Mesos as is.
> > >>>> >>
> > >>>> >> As for the jira issues, I believe the most essential ones should
> > have
> > >>>> >> already been resolved. You may find some remaining open issues
> > here [1],
> > >>>> >> but not all of them are necessary if we decide to keep Flink on
> > Mesos as is.
> > >>>> >>
> > >>>> >> At the moment and in the short future, I think helps are mostly
> > needed on
> > >>>> >> testing the upcoming release 1.12 with Mesos use cases. The
> > community is
> > >>>> >> currently actively preparing the new release, and hopefully we
> > could come
> > >>>> >> up with a release candidate early next month. It would be greatly
> > >>>> >> appreciated if you fork as experienced Flink on Mesos users can
> > help with
> > >>>> >> verifying the release candidates.
> > >>>> >>
> > >>>> >>
> > >>>> >> Thank you~
> > >>>> >>
> > >>>> >> Xintong Song
> > >>>> >>
> > >>>> >> [1]
> > >>>> >>
> >
> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
> > >>>> >> <
> > >>>> >>
> >
> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FFLINK-17402%3Fjql%3Dproject%2520%253D%2520FLINK%2520AND%2520component%2520%253D%2520%2522Deployment%2520%252F%2520Mesos%2522%2520AND%2520status%2520%253D%2520Open&data=04%7C01%7Co.nitavskyi%40criteo.com%7C3585e1f25bdf4e091af808d87a3f92db%7C2a35d8fd574d48e3927c8c398e225a01%7C1%7C0%7C637393760750820881%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=hytJFQE0MCPzMLiQTQTdbg3GVckX5M3r1NPRGrRV8j4%3D&reserved=0
> > >>>> >> >
> > >>>> >>
> > >>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <
> p.narang@criteo.com
> > >>>> >> <ma...@criteo.com>> wrote:
> > >>>> >>
> > >>>> >> Hi Xintong,
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Do you have any jiras that cover any of the items on 1 or 2? I
> can
> > reach
> > >>>> >> out to folks internally and see if I can get some folks to commit
> > to
> > >>>> >> helping out.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> To cover the other qs:
> > >>>> >>
> > >>>> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos.
> We
> > use
> > >>>> >> Yarn for some our Flink workloads when we can. Mesos is only used
> > when we
> > >>>> >> need streaming capabilities in our WW dcs (as our Yarn is
> > centralized in
> > >>>> >> one DC)
> > >>>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan
> > to bump
> > >>>> >> to 1.11 / 1.12 this quarter.
> > >>>> >>   *   We typically upgrade once every 6 months to a year (not
> every
> > >>>> >> release). We’d like to speed up the cadence but we’re not there
> > yet.
> > >>>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
> > >>>> >> functional while missing out on some of the newer features. We
> > understand
> > >>>> >> the pain on the communities side and we can take on the work if
> we
> > see some
> > >>>> >> fancy improvement in Flink on Yarn / K8s that we want in Mesos to
> > put in
> > >>>> >> the request to port it over.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Thanks,
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> -- Piyush
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> From: Xintong Song <tonysong820@gmail.com<mailto:
> > tonysong820@gmail.com>>
> > >>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
> > >>>> >> To: dev <de...@flink.apache.org>>,
> user
> > <
> > >>>> >> user@flink.apache.org<ma...@flink.apache.org>>
> > >>>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
> > >>>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
> > >>>> >> p.narang@criteo.com>>
> > >>>> >> Subject: Re: [SURVEY] Remove Mesos support
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Thanks for sharing the information with us, Piyush an Lasse.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> @Piyush
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Thanks for offering the help. IMO, there are currently several
> > problems
> > >>>> >> that make supporting Flink on Mesos challenging for us.
> > >>>> >>
> > >>>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if
> > not
> > >>>> >> none) among the active contributors in this community that are
> > familiar
> > >>>> >> with Mesos and can help with development on this component.
> > >>>> >>   2.  Absence of tests. Mesos does not provide a testing cluster,
> > like
> > >>>> >> `MiniYARNCluster`, making it hard to test interactions between
> > Flink and
> > >>>> >> Mesos. We have only a few very simple e2e tests running on Mesos
> > deployed
> > >>>> >> in a docker, covering the most fundamental workflows. We are not
> > sure how
> > >>>> >> well those tests work, especially against some potential corner
> > cases.
> > >>>> >>   3.  Divergence from other deployment. Because of 1 and 2, the
> new
> > >>>> >> efforts (features, maintenance, refactors) tend to exclude Mesos
> if
> > >>>> >> possible. When the new efforts have to touch the Mesos related
> > components
> > >>>> >> (e.g., changes to the common resource manager interfaces), we
> have
> > to be
> > >>>> >> very careful and make as few changes as possible, to avoid
> > accidentally
> > >>>> >> breaking anything that we are not familiar with. As a result, the
> > component
> > >>>> >> diverges a lot from other deployment components (K8s/Yarn), which
> > makes it
> > >>>> >> harder to maintain.
> > >>>> >>
> > >>>> >> It would be greatly appreciated if you can help with either of
> the
> > above
> > >>>> >> issues.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Additionally, I have a few questions concerning your use cases at
> > Criteo.
> > >>>> >> IIUC, you are going to stay on Mesos in the foreseeable future,
> > while
> > >>>> >> keeping the Flink version up-to-date? What Flink version are you
> > currently
> > >>>> >> using? How often do you upgrade (e.g., every release)? Would you
> > be good
> > >>>> >> with keeping the Flink on Mesos component as it is (means that
> > deployment
> > >>>> >> and resource management improvements may not be ported to Mesos),
> > while
> > >>>> >> keeping other components up-to-date (e.g., improvements from
> > programming
> > >>>> >> APIs, operators, state backens, etc.)?
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Thank you~
> > >>>> >>
> > >>>> >> Xintong Song
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
> > >>>> >> lassenedergaardflink@gmail.com<mailto:
> > lassenedergaardflink@gmail.com>>
> > >>>> >> wrote:
> > >>>> >>
> > >>>> >> Hi
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> At Trackunit we have been using Mesos for a long time but have now
> > moved to
> > >>>> >> k8s.
> > >>>> >>
> > >>>> >> Best regards
> > >>>> >>
> > >>>> >> Lasse Nedergaard
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> On 23 Oct 2020 at 17:01, Robert Metzger <
> > rmetzger@apache.org
> > >>>> >> <ma...@apache.org>> wrote:
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> Hey Piyush,
> > >>>> >>
> > >>>> >> thanks a lot for raising this concern. I believe we should keep
> > Mesos in
> > >>>> >> Flink then in the foreseeable future.
> > >>>> >>
> > >>>> >> Your offer to help is much appreciated. We'll let you know once
> > there is
> > >>>> >> something.
> > >>>> >>
> > >>>> >>
> > >>>> >>
> > >>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <
> p.narang@criteo.com
> > >>>> >> <ma...@criteo.com>> wrote:
> > >>>> >>
> > >>>> >> Thanks Kostas. If there's items we can help with, I'm sure we'd
> be
> > able
> > >>>> >> to find folks who would be excited to contribute / help in any
> way.
> > >>>> >>
> > >>>> >> -- Piyush
> > >>>> >>
> > >>>> >>
> > >>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com
> > <mailto:
> > >>>> >> kkloudas@gmail.com>> wrote:
> > >>>> >>
> > >>>> >>     Thanks Piyush for the message.
> > >>>> >>     After this, I revoke my +1. I agree with the previous
> opinions
> > that we
> > >>>> >>     cannot drop code that is actively used by users, especially
> if
> > it
> > >>>> >>     is something as deep in the stack as support for a cluster
> > management
> > >>>> >>     framework.
> > >>>> >>
> > >>>> >>     Cheers,
> > >>>> >>     Kostas
> > >>>> >>
> > >>>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <
> > p.narang@criteo.com
> > >>>> >> <ma...@criteo.com>> wrote:
> > >>>> >>     >
> > >>>> >>     > Hi folks,
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > We at Criteo are active users of the Flink on Mesos
> resource
> > >>>> >> management component. We are pretty heavy users of Mesos for
> > scheduling
> > >>>> >> workloads on our edge datacenters and we do want to continue to
> be
> > able to
> > >>>> >> run some of our Flink topologies (to compute machine learning
> > short term
> > >>>> >> features) on those DCs. If possible our vote would be not to drop
> > Mesos
> > >>>> >> support as that will tie us to an old release / have to maintain
> a
> > fork as
> > >>>> >> we’re not planning to migrate off Mesos anytime soon. Is the
> burden
> > >>>> >> something that can be helped with by the community? (Or are you
> > referring
> > >>>> >> to having to ensure PRs handle the Mesos piece as well when they
> > touch the
> > >>>> >> resource managers?)
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > Thanks,
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > -- Piyush
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
> > >>>> >> trohrmann@apache.org>>
> > >>>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
> > >>>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
> > >>>> >> tonysong820@gmail.com>>
> > >>>> >>     > Cc: dev <dev@flink.apache.org<mailto:dev@flink.apache.org
> >>,
> > user <
> > >>>> >> user@flink.apache.org<ma...@flink.apache.org>>
> > >>>> >>     > Subject: Re: [SURVEY] Remove Mesos support
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > Thanks for starting this survey Robert! I second Konstantin
> > and
> > >>>> >> Xintong in the sense that our Mesos users' opinions should matter
> > most
> > >>>> >> here. If our community is no longer using the Mesos integration,
> > then I
> > >>>> >> would be +1 for removing it in order to decrease the maintenance
> > burden.
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > Cheers,
> > >>>> >>     >
> > >>>> >>     > Till
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <
> > tonysong820@gmail.com
> > >>>> >> <ma...@gmail.com>> wrote:
> > >>>> >>     >
> > >>>> >>     > +1 for adding a warning in 1.12 about planning to remove
> > Mesos
> > >>>> >> support.
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > With my developer hat on, removing the Mesos support would
> > >>>> >> definitely reduce the maintenance overhead for the deployment and
> > resource
> > >>>> >> management related components. On the other hand, the Flink on
> > Mesos users'
> > >>>> >> voices definitely matter a lot for this community. Either way, it
> > would be
> > >>>> >> good to draw users' attention to this discussion early.
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > Thank you~
> > >>>> >>     >
> > >>>> >>     > Xintong Song
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     >
> > >>>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <
> > knaufk@apache.org
> > >>>> >> <ma...@apache.org>> wrote:
> > >>>> >>     >
> > >>>> >>     > Hi Robert,
> > >>>> >>     >
> > >>>> >>     > +1 to the plan you outlined. If we were to drop support in
> > Flink
> > >>>> >> 1.13+, we
> > >>>> >>     > would still support it in Flink 1.12- with bug fixes for
> > some time
> > >>>> >> so that
> > >>>> >>     > users have time to move on.
> > >>>> >>     >
> > >>>> >>     > It would certainly be very interesting to hear from current
> > Flink
> > >>>> >> on Mesos
> > >>>> >>     > users, on how they see the evolution of this part of the
> > ecosystem.
> > >>>> >>     >
> > >>>> >>     > Best,
> > >>>> >>     >
> > >>>> >>     > Konstantin
> > >>>> >>
> > >>>> >
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>>
> > >>> Konstantin Knauf
> > >>>
> > >>> https://twitter.com/snntrable
> > >>>
> > >>> https://github.com/knaufk
> >

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Robert Metzger <rm...@apache.org>.
+1



On Mon, Mar 29, 2021 at 5:44 AM Yangze Guo <ka...@gmail.com> wrote:

> +1
>
> Best,
> Yangze Guo
>
> On Mon, Mar 29, 2021 at 11:31 AM Xintong Song <to...@gmail.com>
> wrote:
> >
> > +1
> > It has been a matter of fact for a while that we no longer port new
> features to the Mesos deployment.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <tr...@apache.org>
> wrote:
> >>
> >> +1 for officially deprecating this component for the 1.13 release.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org>
> wrote:
> >>>
> >>> Hi Matthias,
> >>>
> >>> Thank you for following up on this. +1 to officially deprecate Mesos
> in the code and documentation, too. It will be confusing for users if this
> diverges from the roadmap.
> >>>
> >>> Cheers,
> >>>
> >>> Konstantin
> >>>
> >>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <ma...@ververica.com>
> wrote:
> >>>>
> >>>> Hi everyone,
> >>>> considering the upcoming release of Flink 1.13, I wanted to revive the
> >>>> discussion about the Mesos support once more. Mesos is also already
> listed
> >>>> as deprecated in Flink's overall roadmap [1]. Maybe it's time to
> align the
> >>>> documentation accordingly to make it more explicit?
> >>>>
> >>>> What do you think?
> >>>>
> >>>> Best,
> >>>> Matthias
> >>>>
> >>>> [1] https://flink.apache.org/roadmap.html#feature-radar
> >>>>
> >>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org>
> wrote:
> >>>>
> >>>> > Hi Oleksandr,
> >>>> >
> >>>> > yes, you are right. The biggest problem at the moment is the lack of
> test
> >>>> > coverage and thereby confidence to make changes. We have some e2e
> tests
> >>>> > which you can find here [1]. These tests are, however, quite coarse
> grained
> >>>> > and are missing a lot of cases. One idea would be to add a Mesos
> e2e test
> >>>> > based on Flink's end-to-end test framework [2]. I think what needs
> to be
> >>>> > done there is to add a Mesos resource and a way to submit jobs to a
> Mesos
> >>>> > cluster to write e2e tests.
> >>>> >
> >>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
> >>>> > [2]
> >>>> >
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
> >>>> >
> >>>> > Cheers,
> >>>> > Till
> >>>> >
> >>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
> >>>> > o.nitavskyi@criteo.com> wrote:
> >>>> >
> >>>> >> Hello Xintong,
> >>>> >>
> >>>> >> Thanks for the insights and support.
> >>>> >>
> >>>> >> I browsed the Mesos backlog and didn't identify anything critical
> that
> >>>> >> is left there.
> >>>> >>
> >>>> >> I see that there were quite a lot of contributions to the
> Flink Mesos
> >>>> >> in the recent version:
> >>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
> >>>> >> We plan to validate the current Flink master (or release 1.12
> branch) on our
> >>>> >> Mesos setup. In case of any issues, we will try to propose changes.
> >>>> >> My feeling is that our test results shouldn't affect the Flink 1.12
> >>>> >> release cycle. And if any potential commits land in
> 1.12.1, it
> >>>> >> should be totally fine.
> >>>> >>
> >>>> >> In the future, we would be glad to help you guys with any
> >>>> >> maintenance-related questions. One of the highest priorities
> around this
> >>>> >> component seems to be the development of the full e2e test.
> >>>> >>
> >>>> >> Kind Regards
> >>>> >> Oleksandr Nitavskyi
> >>>> >> ________________________________
> >>>> >> From: Xintong Song <to...@gmail.com>
> >>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
> >>>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
> >>>> >> Cc: Piyush Narang <p....@criteo.com>
> >>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
> >>>> >>
> >>>> >> Hi Piyush,
> >>>> >>
> >>>> >> Thanks a lot for sharing the information. It is a great
> relief that
> >>>> >> you are good with Flink on Mesos as is.
> >>>> >>
> >>>> >> As for the jira issues, I believe the most essential ones should
> have
> >>>> >> already been resolved. You may find some remaining open issues
> here [1],
> >>>> >> but not all of them are necessary if we decide to keep Flink on
> Mesos as is.
> >>>> >>
> >>>> >> At the moment and in the near future, I think help is mostly
> needed on
> >>>> >> testing the upcoming release 1.12 with Mesos use cases. The
> community is
> >>>> >> currently actively preparing the new release, and hopefully we
> could come
> >>>> >> up with a release candidate early next month. It would be greatly
> >>>> >> appreciated if you folks, as experienced Flink on Mesos users, can
> help with
> >>>> >> verifying the release candidates.
> >>>> >>
> >>>> >>
> >>>> >> Thank you~
> >>>> >>
> >>>> >> Xintong Song
> >>>> >>
> >>>> >> [1]
> >>>> >>
> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
> >>>> >>
> >>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
> >>>> >> <ma...@criteo.com>> wrote:
> >>>> >>
> >>>> >> Hi Xintong,
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Do you have any jiras that cover any of the items on 1 or 2? I can
> reach
> >>>> >> out to folks internally and see if I can get some folks to commit
> to
> >>>> >> helping out.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> To cover the other qs:
> >>>> >>
> >>>> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We
> use
> >>>> >> Yarn for some of our Flink workloads when we can. Mesos is only used
> when we
> >>>> >> need streaming capabilities in our WW dcs (as our Yarn is
> centralized in
> >>>> >> one DC)
> >>>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan
> to bump
> >>>> >> to 1.11 / 1.12 this quarter.
> >>>> >>   *   We typically upgrade once every 6 months to a year (not every
> >>>> >> release). We’d like to speed up the cadence but we’re not there
> yet.
> >>>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
> >>>> >> functional while missing out on some of the newer features. We
> understand
> >>>> >> the pain on the community's side and we can take on the work if we
> see some
> >>>> >> fancy improvement in Flink on Yarn / K8s that we want in Mesos to
> put in
> >>>> >> the request to port it over.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Thanks,
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> -- Piyush
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> From: Xintong Song <tonysong820@gmail.com<mailto:
> tonysong820@gmail.com>>
> >>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
> >>>> >> To: dev <de...@flink.apache.org>>, user
> <
> >>>> >> user@flink.apache.org<ma...@flink.apache.org>>
> >>>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
> >>>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
> >>>> >> p.narang@criteo.com>>
> >>>> >> Subject: Re: [SURVEY] Remove Mesos support
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Thanks for sharing the information with us, Piyush and Lasse.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> @Piyush
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Thanks for offering the help. IMO, there are currently several
> problems
> >>>> >> that make supporting Flink on Mesos challenging for us.
> >>>> >>
> >>>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if
> not
> >>>> >> none) among the active contributors in this community that are
> familiar
> >>>> >> with Mesos and can help with development on this component.
> >>>> >>   2.  Absence of tests. Mesos does not provide a testing cluster,
> like
> >>>> >> `MiniYARNCluster`, making it hard to test interactions between
> Flink and
> >>>> >> Mesos. We have only a few very simple e2e tests running on Mesos
> deployed
> >>>> >> in Docker, covering the most fundamental workflows. We are not
> sure how
> >>>> >> well those tests work, especially against some potential corner
> cases.
> >>>> >>   3.  Divergence from other deployment. Because of 1 and 2, the new
> >>>> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
> >>>> >> possible. When the new efforts have to touch the Mesos related
> components
> >>>> >> (e.g., changes to the common resource manager interfaces), we have
> to be
> >>>> >> very careful and make as few changes as possible, to avoid
> accidentally
> >>>> >> breaking anything that we are not familiar with. As a result, the
> component
> >>>> >> diverges a lot from other deployment components (K8s/Yarn), which
> makes it
> >>>> >> harder to maintain.
> >>>> >>
> >>>> >> It would be greatly appreciated if you can help with either of the
> above
> >>>> >> issues.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Additionally, I have a few questions concerning your use cases at
> Criteo.
> >>>> >> IIUC, you are going to stay on Mesos in the foreseeable future,
> while
> >>>> >> keeping the Flink version up-to-date? What Flink version are you
> currently
> >>>> >> using? How often do you upgrade (e.g., every release)? Would you
> be good
> >>>> >> with keeping the Flink on Mesos component as it is (means that
> deployment
> >>>> >> and resource management improvements may not be ported to Mesos),
> while
> >>>> >> keeping other components up-to-date (e.g., improvements from
> programming
> >>>> >> APIs, operators, state backends, etc.)?
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Thank you~
> >>>> >>
> >>>> >> Xintong Song
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
> >>>> >> lassenedergaardflink@gmail.com<mailto:
> lassenedergaardflink@gmail.com>>
> >>>> >> wrote:
> >>>> >>
> >>>> >> Hi
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> At Trackunit we have been using Mesos for a long time but have now
> moved to
> >>>> >> k8s.
> >>>> >>
> >>>> >> Best regards
> >>>> >>
> >>>> >> Lasse Nedergaard
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On 23 Oct 2020 at 17:01, Robert Metzger <
> rmetzger@apache.org
> >>>> >> <ma...@apache.org>> wrote:
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Hey Piyush,
> >>>> >>
> >>>> >> thanks a lot for raising this concern. I believe we should keep
> Mesos in
> >>>> >> Flink then in the foreseeable future.
> >>>> >>
> >>>> >> Your offer to help is much appreciated. We'll let you know once
> there is
> >>>> >> something.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
> >>>> >> <ma...@criteo.com>> wrote:
> >>>> >>
> >>>> >> Thanks Kostas. If there's items we can help with, I'm sure we'd be
> able
> >>>> >> to find folks who would be excited to contribute / help in any way.
> >>>> >>
> >>>> >> -- Piyush
> >>>> >>
> >>>> >>
> >>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com
> <mailto:
> >>>> >> kkloudas@gmail.com>> wrote:
> >>>> >>
> >>>> >>     Thanks Piyush for the message.
> >>>> >>     After this, I revoke my +1. I agree with the previous opinions
> that we
> >>>> >>     cannot drop code that is actively used by users, especially if
> it
> >>>> >>     is something as deep in the stack as support for a cluster
> management
> >>>> >>     framework.
> >>>> >>
> >>>> >>     Cheers,
> >>>> >>     Kostas
> >>>> >>
> >>>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <
> p.narang@criteo.com
> >>>> >> <ma...@criteo.com>> wrote:
> >>>> >>     >
> >>>> >>     > Hi folks,
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > We at Criteo are active users of the Flink on Mesos resource
> >>>> >> management component. We are pretty heavy users of Mesos for
> scheduling
> >>>> >> workloads on our edge datacenters and we do want to continue to be
> able to
> >>>> >> run some of our Flink topologies (to compute machine learning
> short term
> >>>> >> features) on those DCs. If possible our vote would be not to drop
> Mesos
> >>>> >> support as that will tie us to an old release / have to maintain a
> fork as
> >>>> >> we’re not planning to migrate off Mesos anytime soon. Is the burden
> >>>> >> something that can be helped with by the community? (Or are you
> referring
> >>>> >> to having to ensure PRs handle the Mesos piece as well when they
> touch the
> >>>> >> resource managers?)
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > Thanks,
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > -- Piyush
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
> >>>> >> trohrmann@apache.org>>
> >>>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
> >>>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
> >>>> >> tonysong820@gmail.com>>
> >>>> >>     > Cc: dev <de...@flink.apache.org>>,
> user <
> >>>> >> user@flink.apache.org<ma...@flink.apache.org>>
> >>>> >>     > Subject: Re: [SURVEY] Remove Mesos support
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > Thanks for starting this survey Robert! I second Konstantin
> and
> >>>> >> Xintong in the sense that our Mesos users' opinions should matter
> most
> >>>> >> here. If our community is no longer using the Mesos integration,
> then I
> >>>> >> would be +1 for removing it in order to decrease the maintenance
> burden.
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > Cheers,
> >>>> >>     >
> >>>> >>     > Till
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <
> tonysong820@gmail.com
> >>>> >> <ma...@gmail.com>> wrote:
> >>>> >>     >
> >>>> >>     > +1 for adding a warning in 1.12 about planning to remove
> Mesos
> >>>> >> support.
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > With my developer hat on, removing the Mesos support would
> >>>> >> definitely reduce the maintenance overhead for the deployment and
> resource
> >>>> >> management related components. On the other hand, the Flink on
> Mesos users'
> >>>> >> voices definitely matter a lot for this community. Either way, it
> would be
> >>>> >> good to draw users' attention to this discussion early.
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > Thank you~
> >>>> >>     >
> >>>> >>     > Xintong Song
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <
> knaufk@apache.org
> >>>> >> <ma...@apache.org>> wrote:
> >>>> >>     >
> >>>> >>     > Hi Robert,
> >>>> >>     >
> >>>> >>     > +1 to the plan you outlined. If we were to drop support in
> Flink
> >>>> >> 1.13+, we
> >>>> >>     > would still support it in Flink 1.12- with bug fixes for
> some time
> >>>> >> so that
> >>>> >>     > users have time to move on.
> >>>> >>     >
> >>>> >>     > It would certainly be very interesting to hear from current
> Flink
> >>>> >> on Mesos
> >>>> >>     > users, on how they see the evolution of this part of the
> ecosystem.
> >>>> >>     >
> >>>> >>     > Best,
> >>>> >>     >
> >>>> >>     > Konstantin
> >>>> >>
> >>>> >
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Konstantin Knauf
> >>>
> >>> https://twitter.com/snntrable
> >>>
> >>> https://github.com/knaufk
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Robert Metzger <rm...@apache.org>.
+1



On Mon, Mar 29, 2021 at 5:44 AM Yangze Guo <ka...@gmail.com> wrote:

> +1
>
> Best,
> Yangze Guo
>
> On Mon, Mar 29, 2021 at 11:31 AM Xintong Song <to...@gmail.com>
> wrote:
> >
> > +1
> > It has been a matter of fact for a while that we no longer port new
> features to the Mesos deployment.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <tr...@apache.org>
> wrote:
> >>
> >> +1 for officially deprecating this component for the 1.13 release.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org>
> wrote:
> >>>
> >>> Hi Matthias,
> >>>
> >>> Thank you for following up on this. +1 to officially deprecate Mesos
> in the code and documentation, too. It will be confusing for users if this
> diverges from the roadmap.
> >>>
> >>> Cheers,
> >>>
> >>> Konstantin
> >>>
> >>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <ma...@ververica.com>
> wrote:
> >>>>
> >>>> Hi everyone,
> >>>> considering the upcoming release of Flink 1.13, I wanted to revive the
> >>>> discussion about the Mesos support once more. Mesos is also already
> listed
> >>>> as deprecated in Flink's overall roadmap [1]. Maybe it's time to
> align the
> >>>> documentation accordingly to make it more explicit?
> >>>>
> >>>> What do you think?
> >>>>
> >>>> Best,
> >>>> Matthias
> >>>>
> >>>> [1] https://flink.apache.org/roadmap.html#feature-radar
> >>>>
> >>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org>
> wrote:
> >>>>
> >>>> > Hi Oleksandr,
> >>>> >
> >>>> > yes, you are right. The biggest problem at the moment is the lack of
> test
> >>>> > coverage and thereby confidence to make changes. We have some e2e
> tests
> >>>> > which you can find here [1]. These tests are, however, quite coarse
> grained
> >>>> > and are missing a lot of cases. One idea would be to add a Mesos
> e2e test
> >>>> > based on Flink's end-to-end test framework [2]. I think what needs
> to be
> >>>> > done there is to add a Mesos resource and a way to submit jobs to a
> Mesos
> >>>> > cluster to write e2e tests.
> >>>> >
> >>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
> >>>> > [2]
> >>>> >
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
> >>>> >
> >>>> > Cheers,
> >>>> > Till
> >>>> >
> >>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
> >>>> > o.nitavskyi@criteo.com> wrote:
> >>>> >
> >>>> >> Hello Xintong,
> >>>> >>
> >>>> >> Thanks for the insights and support.
> >>>> >>
> >>>> >> I browsed the Mesos backlog and didn't identify anything critical
> that
> >>>> >> is left there.
> >>>> >>
> >>>> >> I see that there were quite a lot of contributions to the
> Flink Mesos
> >>>> >> in the recent version:
> >>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
> >>>> >> We plan to validate the current Flink master (or release 1.12
> branch) on our
> >>>> >> Mesos setup. In case of any issues, we will try to propose changes.
> >>>> >> My feeling is that our test results shouldn't affect the Flink 1.12
> >>>> >> release cycle. And if any potential commits land in
> 1.12.1, it
> >>>> >> should be totally fine.
> >>>> >>
> >>>> >> In the future, we would be glad to help you guys with any
> >>>> >> maintenance-related questions. One of the highest priorities
> around this
> >>>> >> component seems to be the development of the full e2e test.
> >>>> >>
> >>>> >> Kind Regards
> >>>> >> Oleksandr Nitavskyi
> >>>> >> ________________________________
> >>>> >> From: Xintong Song <to...@gmail.com>
> >>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
> >>>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
> >>>> >> Cc: Piyush Narang <p....@criteo.com>
> >>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
> >>>> >>
> >>>> >> Hi Piyush,
> >>>> >>
> >>>> >> Thanks a lot for sharing the information. It is a great
> relief that
> >>>> >> you are good with Flink on Mesos as is.
> >>>> >>
> >>>> >> As for the jira issues, I believe the most essential ones should
> have
> >>>> >> already been resolved. You may find some remaining open issues
> here [1],
> >>>> >> but not all of them are necessary if we decide to keep Flink on
> Mesos as is.
> >>>> >>
> >>>> >> At the moment and in the near future, I think help is mostly
> needed on
> >>>> >> testing the upcoming release 1.12 with Mesos use cases. The
> community is
> >>>> >> currently actively preparing the new release, and hopefully we
> could come
> >>>> >> up with a release candidate early next month. It would be greatly
> >>>> >> appreciated if you folks, as experienced Flink on Mesos users, can
> help with
> >>>> >> verifying the release candidates.
> >>>> >>
> >>>> >>
> >>>> >> Thank you~
> >>>> >>
> >>>> >> Xintong Song
> >>>> >>
> >>>> >> [1]
> >>>> >>
> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
> >>>> >>
> >>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
> >>>> >> <ma...@criteo.com>> wrote:
> >>>> >>
> >>>> >> Hi Xintong,
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Do you have any jiras that cover any of the items on 1 or 2? I can
> reach
> >>>> >> out to folks internally and see if I can get some folks to commit
> to
> >>>> >> helping out.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> To cover the other qs:
> >>>> >>
> >>>> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We
> use
> >>>> >> Yarn for some of our Flink workloads when we can. Mesos is only used
> when we
> >>>> >> need streaming capabilities in our WW dcs (as our Yarn is
> centralized in
> >>>> >> one DC)
> >>>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan
> to bump
> >>>> >> to 1.11 / 1.12 this quarter.
> >>>> >>   *   We typically upgrade once every 6 months to a year (not every
> >>>> >> release). We’d like to speed up the cadence but we’re not there
> yet.
> >>>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
> >>>> >> functional while missing out on some of the newer features. We
> understand
> >>>> >> the pain on the community's side and we can take on the work if we
> see some
> >>>> >> fancy improvement in Flink on Yarn / K8s that we want in Mesos to
> put in
> >>>> >> the request to port it over.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Thanks,
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> -- Piyush
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> From: Xintong Song <tonysong820@gmail.com<mailto:
> tonysong820@gmail.com>>
> >>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
> >>>> >> To: dev <de...@flink.apache.org>>, user
> <
> >>>> >> user@flink.apache.org<ma...@flink.apache.org>>
> >>>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
> >>>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
> >>>> >> p.narang@criteo.com>>
> >>>> >> Subject: Re: [SURVEY] Remove Mesos support
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Thanks for sharing the information with us, Piyush and Lasse.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> @Piyush
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Thanks for offering the help. IMO, there are currently several
> problems
> >>>> >> that make supporting Flink on Mesos challenging for us.
> >>>> >>
> >>>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if
> not
> >>>> >> none) among the active contributors in this community that are
> familiar
> >>>> >> with Mesos and can help with development on this component.
> >>>> >>   2.  Absence of tests. Mesos does not provide a testing cluster,
> like
> >>>> >> `MiniYARNCluster`, making it hard to test interactions between
> Flink and
> >>>> >> Mesos. We have only a few very simple e2e tests running on Mesos
> deployed
> >>>> >> in Docker, covering the most fundamental workflows. We are not
> sure how
> >>>> >> well those tests work, especially against some potential corner
> cases.
> >>>> >>   3.  Divergence from other deployment. Because of 1 and 2, the new
> >>>> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
> >>>> >> possible. When the new efforts have to touch the Mesos related
> components
> >>>> >> (e.g., changes to the common resource manager interfaces), we have
> to be
> >>>> >> very careful and make as few changes as possible, to avoid
> accidentally
> >>>> >> breaking anything that we are not familiar with. As a result, the
> component
> >>>> >> diverges a lot from other deployment components (K8s/Yarn), which
> makes it
> >>>> >> harder to maintain.
> >>>> >>
> >>>> >> It would be greatly appreciated if you can help with either of the
> above
> >>>> >> issues.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Additionally, I have a few questions concerning your use cases at
> Criteo.
> >>>> >> IIUC, you are going to stay on Mesos in the foreseeable future,
> while
> >>>> >> keeping the Flink version up-to-date? What Flink version are you
> currently
> >>>> >> using? How often do you upgrade (e.g., every release)? Would you
> be good
> >>>> >> with keeping the Flink on Mesos component as it is (means that
> deployment
> >>>> >> and resource management improvements may not be ported to Mesos),
> while
> >>>> >> keeping other components up-to-date (e.g., improvements from
> programming
> >>>> >> APIs, operators, state backends, etc.)?
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Thank you~
> >>>> >>
> >>>> >> Xintong Song
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
> >>>> >> lassenedergaardflink@gmail.com<mailto:
> lassenedergaardflink@gmail.com>>
> >>>> >> wrote:
> >>>> >>
> >>>> >> Hi
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> At Trackunit we have been using Mesos for a long time but have now
> moved to
> >>>> >> k8s.
> >>>> >>
> >>>> >> Best regards
> >>>> >>
> >>>> >> Lasse Nedergaard
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On Oct 23, 2020, at 17:01, Robert Metzger <
> rmetzger@apache.org
> >>>> >> <ma...@apache.org>> wrote:
> >>>> >>
> >>>> >>
> >>>> >> Hey Piyush,
> >>>> >>
> >>>> >> thanks a lot for raising this concern. I believe we should keep
> Mesos in
> >>>> >> Flink then in the foreseeable future.
> >>>> >>
> >>>> >> Your offer to help is much appreciated. We'll let you know once
> there is
> >>>> >> something.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
> >>>> >> <ma...@criteo.com>> wrote:
> >>>> >>
> >>>> >> Thanks Kostas. If there's items we can help with, I'm sure we'd be
> able
> >>>> >> to find folks who would be excited to contribute / help in any way.
> >>>> >>
> >>>> >> -- Piyush
> >>>> >>
> >>>> >>
> >>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com
> <mailto:
> >>>> >> kkloudas@gmail.com>> wrote:
> >>>> >>
> >>>> >>     Thanks Piyush for the message.
> >>>> >>     After this, I revoke my +1. I agree with the previous opinions
> that we
> >>>> >>     cannot drop code that is actively used by users, especially if
> it is
> >>>> >>     something as deep in the stack as support for a cluster
> management
> >>>> >>     framework.
> >>>> >>
> >>>> >>     Cheers,
> >>>> >>     Kostas
> >>>> >>
> >>>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <
> p.narang@criteo.com
> >>>> >> <ma...@criteo.com>> wrote:
> >>>> >>     >
> >>>> >>     > Hi folks,
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > We at Criteo are active users of the Flink on Mesos resource
> >>>> >> management component. We are pretty heavy users of Mesos for
> scheduling
> >>>> >> workloads on our edge datacenters and we do want to continue to be
> able to
> >>>> >> run some of our Flink topologies (to compute machine learning
> short term
> >>>> >> features) on those DCs. If possible our vote would be not to drop
> Mesos
> >>>> >> support as that will tie us to an old release / have to maintain a
> fork as
> >>>> >> we’re not planning to migrate off Mesos anytime soon. Is the burden
> >>>> >> something that can be helped with by the community? (Or are you
> referring
> >>>> >> to having to ensure PRs handle the Mesos piece as well when they
> touch the
> >>>> >> resource managers?)
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > Thanks,
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > -- Piyush
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
> >>>> >> trohrmann@apache.org>>
> >>>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
> >>>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
> >>>> >> tonysong820@gmail.com>>
> >>>> >>     > Cc: dev <de...@flink.apache.org>>,
> user <
> >>>> >> user@flink.apache.org<ma...@flink.apache.org>>
> >>>> >>     > Subject: Re: [SURVEY] Remove Mesos support
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > Thanks for starting this survey Robert! I second Konstantin
> and
> >>>> >> Xintong in the sense that our Mesos users' opinions should matter
> most
> >>>> >> here. If our community is no longer using the Mesos integration,
> then I
> >>>> >> would be +1 for removing it in order to decrease the maintenance
> burden.
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > Cheers,
> >>>> >>     >
> >>>> >>     > Till
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <
> tonysong820@gmail.com
> >>>> >> <ma...@gmail.com>> wrote:
> >>>> >>     >
> >>>> >>     > +1 for adding a warning in 1.12 about planning to remove
> Mesos
> >>>> >> support.
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > With my developer hat on, removing the Mesos support would
> >>>> >> definitely reduce the maintaining overhead for the deployment and
> resource
> >>>> >> management related components. On the other hand, the Flink on
> Mesos users'
> >>>> >> voices definitely matter a lot for this community. Either way, it
> would be
> >>>> >> good to draw users' attention to this discussion early.
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > Thank you~
> >>>> >>     >
> >>>> >>     > Xintong Song
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     >
> >>>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <
> knaufk@apache.org
> >>>> >> <ma...@apache.org>> wrote:
> >>>> >>     >
> >>>> >>     > Hi Robert,
> >>>> >>     >
> >>>> >>     > +1 to the plan you outlined. If we were to drop support in
> Flink
> >>>> >> 1.13+, we
> >>>> >>     > would still support it in Flink 1.12- with bug fixes for
> some time
> >>>> >> so that
> >>>> >>     > users have time to move on.
> >>>> >>     >
> >>>> >>     > It would certainly be very interesting to hear from current
> Flink
> >>>> >> on Mesos
> >>>> >>     > users, on how they see the evolution of this part of the
> ecosystem.
> >>>> >>     >
> >>>> >>     > Best,
> >>>> >>     >
> >>>> >>     > Konstantin
> >>>> >>
> >>>> >
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Konstantin Knauf
> >>>
> >>> https://twitter.com/snntrable
> >>>
> >>> https://github.com/knaufk
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Yangze Guo <ka...@gmail.com>.
+1

Best,
Yangze Guo

On Mon, Mar 29, 2021 at 11:31 AM Xintong Song <to...@gmail.com> wrote:
>
> +1
> It has been a matter of fact for a while that we no longer port new features to the Mesos deployment.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <tr...@apache.org> wrote:
>>
>> +1 for officially deprecating this component for the 1.13 release.
>>
>> Cheers,
>> Till
>>
>> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org> wrote:
>>>
>>> Hi Matthias,
>>>
>>> Thank you for following up on this. +1 to officially deprecate Mesos in the code and documentation, too. It will be confusing for users if this diverges from the roadmap.
>>>
>>> Cheers,
>>>
>>> Konstantin
>>>
>>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <ma...@ververica.com> wrote:
>>>>
>>>> Hi everyone,
>>>> considering the upcoming release of Flink 1.13, I wanted to revive the
>>>> discussion about the Mesos support once more. Mesos is also already listed
>>>> as deprecated in Flink's overall roadmap [1]. Maybe, it's time to align the
>>>> documentation accordingly to make it more explicit?
>>>>
>>>> What do you think?
>>>>
>>>> Best,
>>>> Matthias
>>>>
>>>> [1] https://flink.apache.org/roadmap.html#feature-radar
>>>>
>>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org> wrote:
>>>>
>>>> > Hi Oleksandr,
>>>> >
>>>> > yes you are right. The biggest problem is at the moment the lack of test
>>>> > coverage and thereby confidence to make changes. We have some e2e tests
>>>> > which you can find here [1]. These tests are, however, quite coarse grained
>>>> > and are missing a lot of cases. One idea would be to add a Mesos e2e test
>>>> > based on Flink's end-to-end test framework [2]. I think what needs to be
>>>> > done there is to add a Mesos resource and a way to submit jobs to a Mesos
>>>> > cluster to write e2e tests.
>>>> >
>>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
>>>> > [2]
>>>> > https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>>>> >
>>>> > Cheers,
>>>> > Till
>>>> >
>>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
>>>> > o.nitavskyi@criteo.com> wrote:
>>>> >
>>>> >> Hello Xintong,
>>>> >>
>>>> >> Thanks for the insights and support.
>>>> >>
>>>> >> I browsed the Mesos backlog and didn't identify anything critical that
>>>> >> is left there.
>>>> >>
>>>> >> I see that there were quite a lot of contributions to Flink Mesos
>>>> >> in the recent version:
>>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
>>>> >> We plan to validate the current Flink master (or release 1.12 branch) on our
>>>> >> Mesos setup. In case of any issues, we will try to propose changes.
>>>> >> My feeling is that our test results shouldn't affect the Flink 1.12
>>>> >> release cycle. And if any resulting commits land in 1.12.1, that
>>>> >> should be totally fine.
>>>> >>
>>>> >> In the future, we would be glad to help you guys with any
>>>> >> maintenance-related questions. One of the highest priorities around this
>>>> >> component seems to be the development of the full e2e test.
>>>> >>
>>>> >> Kind Regards
>>>> >> Oleksandr Nitavskyi
>>>> >> ________________________________
>>>> >> From: Xintong Song <to...@gmail.com>
>>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
>>>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
>>>> >> Cc: Piyush Narang <p....@criteo.com>
>>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>>>> >>
>>>> >> Hi Piyush,
>>>> >>
>>>> >> Thanks a lot for sharing the information. It is a great relief that
>>>> >> you are good with Flink on Mesos as is.
>>>> >>
>>>> >> As for the jira issues, I believe the most essential ones should have
>>>> >> already been resolved. You may find some remaining open issues here [1],
>>>> >> but not all of them are necessary if we decide to keep Flink on Mesos as is.
>>>> >>
>>>> >> At the moment and in the near future, I think help is mostly needed with
>>>> >> testing the upcoming release 1.12 with Mesos use cases. The community is
>>>> >> currently actively preparing the new release, and hopefully we can come
>>>> >> up with a release candidate early next month. It would be greatly
>>>> >> appreciated if you folks, as experienced Flink on Mesos users, can help with
>>>> >> verifying the release candidates.
>>>> >>
>>>> >>
>>>> >> Thank you~
>>>> >>
>>>> >> Xintong Song
>>>> >>
>>>> >> [1]
>>>> >> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>>>> >>
>>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
>>>> >> <ma...@criteo.com>> wrote:
>>>> >>
>>>> >> Hi Xintong,
>>>> >>
>>>> >>
>>>> >>
>>>> >> Do you have any jiras that cover any of the items on 1 or 2? I can reach
>>>> >> out to folks internally and see if I can get some folks to commit to
>>>> >> helping out.
>>>> >>
>>>> >>
>>>> >>
>>>> >> To cover the other qs:
>>>> >>
>>>> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
>>>> >> Yarn for some of our Flink workloads when we can. Mesos is only used when we
>>>> >> need streaming capabilities in our WW dcs (as our Yarn is centralized in
>>>> >> one DC)
>>>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan to bump
>>>> >> to 1.11 / 1.12 this quarter.
>>>> >>   *   We typically upgrade once every 6 months to a year (not every
>>>> >> release). We’d like to speed up the cadence but we’re not there yet.
>>>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
>>>> >> functional while missing out on some of the newer features. We understand
>>>> >> the pain on the community's side and we can take on the work if we see some
>>>> >> fancy improvement in Flink on Yarn / K8s that we want in Mesos to put in
>>>> >> the request to port it over.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >>
>>>> >>
>>>> >> -- Piyush
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> From: Xintong Song <to...@gmail.com>>
>>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
>>>> >> To: dev <de...@flink.apache.org>>, user <
>>>> >> user@flink.apache.org<ma...@flink.apache.org>>
>>>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
>>>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
>>>> >> p.narang@criteo.com>>
>>>> >> Subject: Re: [SURVEY] Remove Mesos support
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks for sharing the information with us, Piyush and Lasse.
>>>> >>
>>>> >>
>>>> >>
>>>> >> @Piyush
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks for offering the help. IMO, there are currently several problems
>>>> >> that make supporting Flink on Mesos challenging for us.
>>>> >>
>>>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if not
>>>> >> none) among the active contributors in this community that are familiar
>>>> >> with Mesos and can help with development on this component.
>>>> >>   2.  Absence of tests. Mesos does not provide a testing cluster, like
>>>> >> `MiniYARNCluster`, making it hard to test interactions between Flink and
>>>> >> Mesos. We have only a few very simple e2e tests running on Mesos deployed
>>>> >> in a docker, covering the most fundamental workflows. We are not sure how
>>>> >> well those tests work, especially against some potential corner cases.
>>>> >>   3.  Divergence from other deployment. Because of 1 and 2, the new
>>>> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
>>>> >> possible. When the new efforts have to touch the Mesos related components
>>>> >> (e.g., changes to the common resource manager interfaces), we have to be
>>>> >> very careful and make as few changes as possible, to avoid accidentally
>>>> >> breaking anything that we are not familiar with. As a result, the component
>>>> >> diverges a lot from other deployment components (K8s/Yarn), which makes it
>>>> >> harder to maintain.
>>>> >>
>>>> >> It would be greatly appreciated if you can help with either of the above
>>>> >> issues.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Additionally, I have a few questions concerning your use cases at Criteo.
>>>> >> IIUC, you are going to stay on Mesos in the foreseeable future, while
>>>> >> keeping the Flink version up-to-date? What Flink version are you currently
>>>> >> using? How often do you upgrade (e.g., every release)? Would you be good
>>>> >> with keeping the Flink on Mesos component as it is (means that deployment
>>>> >> and resource management improvements may not be ported to Mesos), while
>>>> >> keeping other components up-to-date (e.g., improvements from programming
>>>> >> APIs, operators, state backends, etc.)?
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thank you~
>>>> >>
>>>> >> Xintong Song
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
>>>> >> lassenedergaardflink@gmail.com<ma...@gmail.com>>
>>>> >> wrote:
>>>> >>
>>>> >> Hi
>>>> >>
>>>> >>
>>>> >>
>>>> >> At Trackunit we have been using Mesos for a long time but have now moved to
>>>> >> k8s.
>>>> >>
>>>> >> Best regards
>>>> >>
>>>> >> Lasse Nedergaard
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Oct 23, 2020, at 17:01, Robert Metzger <rmetzger@apache.org
>>>> >> <ma...@apache.org>> wrote:
>>>> >>
>>>> >>
>>>> >> Hey Piyush,
>>>> >>
>>>> >> thanks a lot for raising this concern. I believe we should keep Mesos in
>>>> >> Flink then in the foreseeable future.
>>>> >>
>>>> >> Your offer to help is much appreciated. We'll let you know once there is
>>>> >> something.
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
>>>> >> <ma...@criteo.com>> wrote:
>>>> >>
>>>> >> Thanks Kostas. If there's items we can help with, I'm sure we'd be able
>>>> >> to find folks who would be excited to contribute / help in any way.
>>>> >>
>>>> >> -- Piyush
>>>> >>
>>>> >>
>>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
>>>> >> kkloudas@gmail.com>> wrote:
>>>> >>
>>>> >>     Thanks Piyush for the message.
>>>> >>     After this, I revoke my +1. I agree with the previous opinions that we
>>>> >>     cannot drop code that is actively used by users, especially if it is
>>>> >>     something as deep in the stack as support for a cluster management
>>>> >>     framework.
>>>> >>
>>>> >>     Cheers,
>>>> >>     Kostas
>>>> >>
>>>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.narang@criteo.com
>>>> >> <ma...@criteo.com>> wrote:
>>>> >>     >
>>>> >>     > Hi folks,
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > We at Criteo are active users of the Flink on Mesos resource
>>>> >> management component. We are pretty heavy users of Mesos for scheduling
>>>> >> workloads on our edge datacenters and we do want to continue to be able to
>>>> >> run some of our Flink topologies (to compute machine learning short term
>>>> >> features) on those DCs. If possible our vote would be not to drop Mesos
>>>> >> support as that will tie us to an old release / have to maintain a fork as
>>>> >> we’re not planning to migrate off Mesos anytime soon. Is the burden
>>>> >> something that can be helped with by the community? (Or are you referring
>>>> >> to having to ensure PRs handle the Mesos piece as well when they touch the
>>>> >> resource managers?)
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > Thanks,
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > -- Piyush
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
>>>> >> trohrmann@apache.org>>
>>>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
>>>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
>>>> >> tonysong820@gmail.com>>
>>>> >>     > Cc: dev <de...@flink.apache.org>>, user <
>>>> >> user@flink.apache.org<ma...@flink.apache.org>>
>>>> >>     > Subject: Re: [SURVEY] Remove Mesos support
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > Thanks for starting this survey Robert! I second Konstantin and
>>>> >> Xintong in the sense that our Mesos users' opinions should matter most
>>>> >> here. If our community is no longer using the Mesos integration, then I
>>>> >> would be +1 for removing it in order to decrease the maintenance burden.
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > Cheers,
>>>> >>     >
>>>> >>     > Till
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <tonysong820@gmail.com
>>>> >> <ma...@gmail.com>> wrote:
>>>> >>     >
>>>> >>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>>>> >> support.
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > With my developer hat on, removing the Mesos support would
>>>> >> definitely reduce the maintaining overhead for the deployment and resource
>>>> >> management related components. On the other hand, the Flink on Mesos users'
>>>> >> voices definitely matter a lot for this community. Either way, it would be
>>>> >> good to draw users' attention to this discussion early.
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > Thank you~
>>>> >>     >
>>>> >>     > Xintong Song
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <knaufk@apache.org
>>>> >> <ma...@apache.org>> wrote:
>>>> >>     >
>>>> >>     > Hi Robert,
>>>> >>     >
>>>> >>     > +1 to the plan you outlined. If we were to drop support in Flink
>>>> >> 1.13+, we
>>>> >>     > would still support it in Flink 1.12- with bug fixes for some time
>>>> >> so that
>>>> >>     > users have time to move on.
>>>> >>     >
>>>> >>     > It would certainly be very interesting to hear from current Flink
>>>> >> on Mesos
>>>> >>     > users, on how they see the evolution of this part of the ecosystem.
>>>> >>     >
>>>> >>     > Best,
>>>> >>     >
>>>> >>     > Konstantin
>>>> >>
>>>> >
>>>
>>>
>>>
>>> --
>>>
>>> Konstantin Knauf
>>>
>>> https://twitter.com/snntrable
>>>
>>> https://github.com/knaufk

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Yangze Guo <ka...@gmail.com>.
+1

Best,
Yangze Guo

On Mon, Mar 29, 2021 at 11:31 AM Xintong Song <to...@gmail.com> wrote:
>
> +1
> It has been a matter of fact for a while that we no longer port new features to the Mesos deployment.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <tr...@apache.org> wrote:
>>
>> +1 for officially deprecating this component for the 1.13 release.
>>
>> Cheers,
>> Till
>>
>> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org> wrote:
>>>
>>> Hi Matthias,
>>>
>>> Thank you for following up on this. +1 to officially deprecate Mesos in the code and documentation, too. It will be confusing for users if this diverges from the roadmap.
>>>
>>> Cheers,
>>>
>>> Konstantin
>>>
>>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <ma...@ververica.com> wrote:
>>>>
>>>> Hi everyone,
>>>> considering the upcoming release of Flink 1.13, I wanted to revive the
>>>> discussion about the Mesos support once more. Mesos is also already listed
>>>> as deprecated in Flink's overall roadmap [1]. Maybe, it's time to align the
>>>> documentation accordingly to make it more explicit?
>>>>
>>>> What do you think?
>>>>
>>>> Best,
>>>> Matthias
>>>>
>>>> [1] https://flink.apache.org/roadmap.html#feature-radar
>>>>
>>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org> wrote:
>>>>
>>>> > Hi Oleksandr,
>>>> >
>>>> > yes you are right. The biggest problem is at the moment the lack of test
>>>> > coverage and thereby confidence to make changes. We have some e2e tests
>>>> > which you can find here [1]. These tests are, however, quite coarse grained
>>>> > and are missing a lot of cases. One idea would be to add a Mesos e2e test
>>>> > based on Flink's end-to-end test framework [2]. I think what needs to be
>>>> > done there is to add a Mesos resource and a way to submit jobs to a Mesos
>>>> > cluster to write e2e tests.
>>>> >
>>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
>>>> > [2]
>>>> > https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>>>> >
>>>> > Cheers,
>>>> > Till
>>>> >
>>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
>>>> > o.nitavskyi@criteo.com> wrote:
>>>> >
>>>> >> Hello Xintong,
>>>> >>
>>>> >> Thanks for the insights and support.
>>>> >>
>>>> >> I browsed the Mesos backlog and didn't identify anything critical that
>>>> >> is left there.
>>>> >>
>>>> >> I see that there were quite a lot of contributions to Flink Mesos
>>>> >> in the recent version:
>>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
>>>> >> We plan to validate the current Flink master (or release 1.12 branch) on our
>>>> >> Mesos setup. In case of any issues, we will try to propose changes.
>>>> >> My feeling is that our test results shouldn't affect the Flink 1.12
>>>> >> release cycle. And if any resulting commits land in 1.12.1, that
>>>> >> should be totally fine.
>>>> >>
>>>> >> In the future, we would be glad to help you guys with any
>>>> >> maintenance-related questions. One of the highest priorities around this
>>>> >> component seems to be the development of the full e2e test.
>>>> >>
>>>> >> Kind Regards
>>>> >> Oleksandr Nitavskyi
>>>> >> ________________________________
>>>> >> From: Xintong Song <to...@gmail.com>
>>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
>>>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
>>>> >> Cc: Piyush Narang <p....@criteo.com>
>>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>>>> >>
>>>> >> Hi Piyush,
>>>> >>
>>>> >> Thanks a lot for sharing the information. It is a great relief that
>>>> >> you are good with Flink on Mesos as is.
>>>> >>
>>>> >> As for the jira issues, I believe the most essential ones should have
>>>> >> already been resolved. You may find some remaining open issues here [1],
>>>> >> but not all of them are necessary if we decide to keep Flink on Mesos as is.
>>>> >>
>>>> >> At the moment and in the near future, I think help is mostly needed with
>>>> >> testing the upcoming release 1.12 with Mesos use cases. The community is
>>>> >> currently actively preparing the new release, and hopefully we can come
>>>> >> up with a release candidate early next month. It would be greatly
>>>> >> appreciated if you folks, as experienced Flink on Mesos users, can help with
>>>> >> verifying the release candidates.
>>>> >>
>>>> >>
>>>> >> Thank you~
>>>> >>
>>>> >> Xintong Song
>>>> >>
>>>> >> [1]
>>>> >> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>>>> >>
>>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
>>>> >> <ma...@criteo.com>> wrote:
>>>> >>
>>>> >> Hi Xintong,
>>>> >>
>>>> >>
>>>> >>
>>>> >> Do you have any jiras that cover any of the items on 1 or 2? I can reach
>>>> >> out to folks internally and see if I can get some folks to commit to
>>>> >> helping out.
>>>> >>
>>>> >>
>>>> >>
>>>> >> To cover the other qs:
>>>> >>
>>>> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
>>>> >> Yarn for some of our Flink workloads when we can. Mesos is only used when we
>>>> >> need streaming capabilities in our WW dcs (as our Yarn is centralized in
>>>> >> one DC)
>>>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan to bump
>>>> >> to 1.11 / 1.12 this quarter.
>>>> >>   *   We typically upgrade once every 6 months to a year (not every
>>>> >> release). We’d like to speed up the cadence but we’re not there yet.
>>>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
>>>> >> functional while missing out on some of the newer features. We understand
>>>> >> the pain on the community's side and we can take on the work if we see some
>>>> >> fancy improvement in Flink on Yarn / K8s that we want in Mesos to put in
>>>> >> the request to port it over.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >>
>>>> >>
>>>> >> -- Piyush
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> From: Xintong Song <to...@gmail.com>>
>>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
>>>> >> To: dev <de...@flink.apache.org>>, user <
>>>> >> user@flink.apache.org<ma...@flink.apache.org>>
>>>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
>>>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
>>>> >> p.narang@criteo.com>>
>>>> >> Subject: Re: [SURVEY] Remove Mesos support
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks for sharing the information with us, Piyush and Lasse.
>>>> >>
>>>> >>
>>>> >>
>>>> >> @Piyush
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thanks for offering the help. IMO, there are currently several problems
>>>> >> that make supporting Flink on Mesos challenging for us.
>>>> >>
>>>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if not
>>>> >> none) among the active contributors in this community that are familiar
>>>> >> with Mesos and can help with development on this component.
>>>> >>   2.  Absence of tests. Mesos does not provide a testing cluster, like
>>>> >> `MiniYARNCluster`, making it hard to test interactions between Flink and
>>>> >> Mesos. We have only a few very simple e2e tests running on Mesos deployed
>>>> >> in a docker, covering the most fundamental workflows. We are not sure how
>>>> >> well those tests work, especially against some potential corner cases.
>>>> >>   3.  Divergence from other deployment. Because of 1 and 2, the new
>>>> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
>>>> >> possible. When the new efforts have to touch the Mesos related components
>>>> >> (e.g., changes to the common resource manager interfaces), we have to be
>>>> >> very careful and make as few changes as possible, to avoid accidentally
>>>> >> breaking anything that we are not familiar with. As a result, the component
>>>> >> diverges a lot from other deployment components (K8s/Yarn), which makes it
>>>> >> harder to maintain.
>>>> >>
>>>> >> It would be greatly appreciated if you can help with either of the above
>>>> >> issues.
>>>> >>
>>>> >>
>>>> >>
>>>> >> Additionally, I have a few questions concerning your use cases at Criteo.
>>>> >> IIUC, you are going to stay on Mesos in the foreseeable future, while
>>>> >> keeping the Flink version up-to-date? What Flink version are you currently
>>>> >> using? How often do you upgrade (e.g., every release)? Would you be good
>>>> >> with keeping the Flink on Mesos component as it is (means that deployment
>>>> >> and resource management improvements may not be ported to Mesos), while
>>>> >> keeping other components up-to-date (e.g., improvements from programming
>>>> >> APIs, operators, state backends, etc.)?
>>>> >>
>>>> >>
>>>> >>
>>>> >> Thank you~
>>>> >>
>>>> >> Xintong Song
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
>>>> >> lassenedergaardflink@gmail.com<ma...@gmail.com>>
>>>> >> wrote:
>>>> >>
>>>> >> Hi
>>>> >>
>>>> >>
>>>> >>
>>>> >> At Trackunit we have been using Mesos for a long time but have now moved to
>>>> >> k8s.
>>>> >>
>>>> >> Best regards
>>>> >>
>>>> >> Lasse Nedergaard
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Oct 23, 2020, at 17:01, Robert Metzger <rmetzger@apache.org
>>>> >> <ma...@apache.org>> wrote:
>>>> >>
>>>> >>
>>>> >> Hey Piyush,
>>>> >>
>>>> >> thanks a lot for raising this concern. I believe we should keep Mesos in
>>>> >> Flink then in the foreseeable future.
>>>> >>
>>>> >> Your offer to help is much appreciated. We'll let you know once there is
>>>> >> something.
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
>>>> >> <ma...@criteo.com>> wrote:
>>>> >>
>>>> >> Thanks Kostas. If there's items we can help with, I'm sure we'd be able
>>>> >> to find folks who would be excited to contribute / help in any way.
>>>> >>
>>>> >> -- Piyush
>>>> >>
>>>> >>
>>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
>>>> >> kkloudas@gmail.com>> wrote:
>>>> >>
>>>> >>     Thanks Piyush for the message.
>>>> >>     After this, I revoke my +1. I agree with the previous opinions that we
>>>> >>     cannot drop code that is actively used by users, especially if it is
>>>> >>     something as deep in the stack as support for a cluster management
>>>> >>     framework.
>>>> >>
>>>> >>     Cheers,
>>>> >>     Kostas
>>>> >>
>>>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.narang@criteo.com
>>>> >> <ma...@criteo.com>> wrote:
>>>> >>     >
>>>> >>     > Hi folks,
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > We at Criteo are active users of the Flink on Mesos resource
>>>> >> management component. We are pretty heavy users of Mesos for scheduling
>>>> >> workloads on our edge datacenters and we do want to continue to be able to
>>>> >> run some of our Flink topologies (to compute machine learning short term
>>>> >> features) on those DCs. If possible our vote would be not to drop Mesos
>>>> >> support as that will tie us to an old release / have to maintain a fork as
>>>> >> we’re not planning to migrate off Mesos anytime soon. Is the burden
>>>> >> something that can be helped with by the community? (Or are you referring
>>>> >> to having to ensure PRs handle the Mesos piece as well when they touch the
>>>> >> resource managers?)
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > Thanks,
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > -- Piyush
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
>>>> >> trohrmann@apache.org>>
>>>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
>>>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
>>>> >> tonysong820@gmail.com>>
>>>> >>     > Cc: dev <de...@flink.apache.org>>, user <
>>>> >> user@flink.apache.org<ma...@flink.apache.org>>
>>>> >>     > Subject: Re: [SURVEY] Remove Mesos support
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > Thanks for starting this survey Robert! I second Konstantin and
>>>> >> Xintong in the sense that our Mesos users' opinions should matter most
>>>> >> here. If our community is no longer using the Mesos integration, then I
>>>> >> would be +1 for removing it in order to decrease the maintenance burden.
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > Cheers,
>>>> >>     >
>>>> >>     > Till
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <tonysong820@gmail.com
>>>> >> <ma...@gmail.com>> wrote:
>>>> >>     >
>>>> >>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>>>> >> support.
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > With my developer hat on, removing the Mesos support would
>>>> >> definitely reduce the maintenance overhead for the deployment and resource
>>>> >> management related components. On the other hand, the Flink on Mesos users'
>>>> >> voices definitely matter a lot for this community. Either way, it would be
>>>> >> good to draw users' attention to this discussion early.
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > Thank you~
>>>> >>     >
>>>> >>     > Xintong Song
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     >
>>>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <knaufk@apache.org
>>>> >> <ma...@apache.org>> wrote:
>>>> >>     >
>>>> >>     > Hi Robert,
>>>> >>     >
>>>> >>     > +1 to the plan you outlined. If we were to drop support in Flink
>>>> >> 1.13+, we
>>>> >>     > would still support it in Flink 1.12- with bug fixes for some time
>>>> >> so that
>>>> >>     > users have time to move on.
>>>> >>     >
>>>> >>     > It would certainly be very interesting to hear from current Flink
>>>> >> on Mesos
>>>> >>     > users, on how they see the evolution of this part of the ecosystem.
>>>> >>     >
>>>> >>     > Best,
>>>> >>     >
>>>> >>     > Konstantin
>>>> >>
>>>> >
>>>
>>>
>>>
>>> --
>>>
>>> Konstantin Knauf
>>>
>>> https://twitter.com/snntrable
>>>
>>> https://github.com/knaufk

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Xintong Song <to...@gmail.com>.
+1
It has been a matter of fact for a while that we no longer port new
features to the Mesos deployment.

Thank you~

Xintong Song



On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <tr...@apache.org> wrote:

> +1 for officially deprecating this component for the 1.13 release.
>
> Cheers,
> Till
>
> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org>
> wrote:
>
>> Hi Matthias,
>>
>> Thank you for following up on this. +1 to officially deprecate Mesos in
>> the code and documentation, too. It will be confusing for users if this
>> diverges from the roadmap.
>>
>> Cheers,
>>
>> Konstantin
>>
>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <ma...@ververica.com>
>> wrote:
>>
>>> Hi everyone,
>>> considering the upcoming release of Flink 1.13, I wanted to revive the
>>> discussion about the Mesos support once more. Mesos is also already
>>> listed
>>> as deprecated in Flink's overall roadmap [1]. Maybe, it's time to align
>>> the
>>> documentation accordingly to make it more explicit?
>>>
>>> What do you think?
>>>
>>> Best,
>>> Matthias
>>>
>>> [1] https://flink.apache.org/roadmap.html#feature-radar
>>>
>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org>
>>> wrote:
>>>
>>> > Hi Oleksandr,
>>> >
>>> > yes, you are right. The biggest problem at the moment is the lack of
>>> test
>>> > coverage and thereby confidence to make changes. We have some e2e tests
>>> > which you can find here [1]. These tests are, however, quite coarse
>>> grained
>>> > and are missing a lot of cases. One idea would be to add a Mesos e2e
>>> test
>>> > based on Flink's end-to-end test framework [2]. I think what needs to
>>> be
>>> > done there is to add a Mesos resource and a way to submit jobs to a
>>> Mesos
>>> > cluster to write e2e tests.
>>> >
>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
>>> > [2]
>>> >
>>> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>>> >
>>> > Cheers,
>>> > Till
>>> >
>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
>>> > o.nitavskyi@criteo.com> wrote:
>>> >
>>> >> Hello Xintong,
>>> >>
>>> >> Thanks for the insights and support.
>>> >>
>>> >> I browsed the Mesos backlog and didn't identify anything critical
>>> >> that is left there.
>>> >>
>>> >> I see that there were quite a lot of contributions to the Flink Mesos
>>> >> component in the recent version:
>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
>>> >> We plan to validate the current Flink master (or release 1.12 branch) on
>>> >> our Mesos setup. In case of any issues, we will try to propose changes.
>>> >> My feeling is that our test results shouldn't affect the Flink 1.12
>>> >> release cycle. If any resulting commits land in 1.12.1 instead, that
>>> >> should be totally fine.
>>> >>
>>> >> In the future, we would be glad to help you guys with any
>>> >> maintenance-related questions. One of the highest priorities around
>>> this
>>> >> component seems to be the development of the full e2e test.
>>> >>
>>> >> Kind Regards
>>> >> Oleksandr Nitavskyi
>>> >> ________________________________
>>> >> From: Xintong Song <to...@gmail.com>
>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
>>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
>>> >> Cc: Piyush Narang <p....@criteo.com>
>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>>> >>
>>> >> Hi Piyush,
>>> >>
>>> >> Thanks a lot for sharing the information. It is a great relief that
>>> >> you are good with Flink on Mesos as is.
>>> >>
>>> >> As for the jira issues, I believe the most essential ones should have
>>> >> already been resolved. You may find some remaining open issues here
>>> [1],
>>> >> but not all of them are necessary if we decide to keep Flink on Mesos
>>> as is.
>>> >>
>>> >> At the moment and in the near future, I think help is mostly needed
>>> >> with testing the upcoming release 1.12 with Mesos use cases. The
>>> >> community is currently actively preparing the new release, and hopefully
>>> >> we can come up with a release candidate early next month. It would be
>>> >> greatly appreciated if you folks, as experienced Flink on Mesos users,
>>> >> can help with verifying the release candidates.
>>> >>
>>> >>
>>> >> Thank you~
>>> >>
>>> >> Xintong Song
>>> >>
>>> >> [1]
>>> >>
>>> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>>> >>
>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
>>> >> <ma...@criteo.com>> wrote:
>>> >>
>>> >> Hi Xintong,
>>> >>
>>> >>
>>> >>
>>> >> Do you have any jiras that cover any of the items on 1 or 2? I can
>>> reach
>>> >> out to folks internally and see if I can get some folks to commit to
>>> >> helping out.
>>> >>
>>> >>
>>> >>
>>> >> To cover the other qs:
>>> >>
>>> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
>>> >> Yarn for some of our Flink workloads when we can. Mesos is only used
>>> when we
>>> >> need streaming capabilities in our WW dcs (as our Yarn is centralized
>>> in
>>> >> one DC)
>>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan to
>>> bump
>>> >> to 1.11 / 1.12 this quarter.
>>> >>   *   We typically upgrade once every 6 months to a year (not every
>>> >> release). We’d like to speed up the cadence but we’re not there yet.
>>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
>>> >> functional while missing out on some of the newer features. We
>>> understand
>>> >> the pain on the community's side and we can take on the work if we
>>> see some
>>> >> fancy improvement in Flink on Yarn / K8s that we want in Mesos to put
>>> in
>>> >> the request to port it over.
>>> >>
>>> >>
>>> >>
>>> >> Thanks,
>>> >>
>>> >>
>>> >>
>>> >> -- Piyush
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> From: Xintong Song <tonysong820@gmail.com<mailto:
>>> tonysong820@gmail.com>>
>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
>>> >> To: dev <de...@flink.apache.org>>, user <
>>> >> user@flink.apache.org<ma...@flink.apache.org>>
>>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
>>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
>>> >> p.narang@criteo.com>>
>>> >> Subject: Re: [SURVEY] Remove Mesos support
>>> >>
>>> >>
>>> >>
>>> >> Thanks for sharing the information with us, Piyush and Lasse.
>>> >>
>>> >>
>>> >>
>>> >> @Piyush
>>> >>
>>> >>
>>> >>
>>> >> Thanks for offering the help. IMO, there are currently several
>>> problems
>>> >> that make supporting Flink on Mesos challenging for us.
>>> >>
>>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if not
>>> >> none) among the active contributors in this community that are
>>> familiar
>>> >> with Mesos and can help with development on this component.
>>> >>   2.  Absence of tests. Mesos does not provide a testing cluster, like
>>> >> `MiniYARNCluster`, making it hard to test interactions between Flink
>>> and
>>> >> Mesos. We have only a few very simple e2e tests running on Mesos
>>> deployed
>>> >> in a docker, covering the most fundamental workflows. We are not sure
>>> how
>>> >> well those tests work, especially against some potential corner cases.
>>> >>   3.  Divergence from other deployments. Because of 1 and 2, the new
>>> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
>>> >> possible. When the new efforts have to touch the Mesos related
>>> components
>>> >> (e.g., changes to the common resource manager interfaces), we have to
>>> be
>>> >> very careful and make as few changes as possible, to avoid
>>> accidentally
>>> >> breaking anything that we are not familiar with. As a result, the
>>> component
>>> >> diverges a lot from other deployment components (K8s/Yarn), which
>>> makes it
>>> >> harder to maintain.
>>> >>
>>> >> It would be greatly appreciated if you can help with any of the above
>>> >> issues.
>>> >>
>>> >>
>>> >>
>>> >> Additionally, I have a few questions concerning your use cases at
>>> Criteo.
>>> >> IIUC, you are going to stay on Mesos in the foreseeable future, while
>>> >> keeping the Flink version up-to-date? What Flink version are you
>>> currently
>>> >> using? How often do you upgrade (e.g., every release)? Would you be
>>> good
>>> >> with keeping the Flink on Mesos component as it is (means that
>>> deployment
>>> >> and resource management improvements may not be ported to Mesos),
>>> while
>>> >> keeping other components up-to-date (e.g., improvements from
>>> programming
>>> >> APIs, operators, state backends, etc.)?
>>> >>
>>> >>
>>> >>
>>> >> Thank you~
>>> >>
>>> >> Xintong Song
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
>>> >> lassenedergaardflink@gmail.com<mailto:lassenedergaardflink@gmail.com
>>> >>
>>> >> wrote:
>>> >>
>>> >> Hi
>>> >>
>>> >>
>>> >>
>>> >> At Trackunit we have been using Mesos for a long time but have now
>>> >> moved to k8s.
>>> >>
>>> >> Best regards
>>> >>
>>> >> Lasse Nedergaard
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Oct 23, 2020, at 17:01, Robert Metzger <rmetzger@apache.org
>>> >> <ma...@apache.org>> wrote:
>>> >>
>>> >>
>>> >> Hey Piyush,
>>> >>
>>> >> thanks a lot for raising this concern. I believe we should then keep
>>> >> Mesos in Flink for the foreseeable future.
>>> >>
>>> >> Your offer to help is much appreciated. We'll let you know once there
>>> is
>>> >> something.
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
>>> >> <ma...@criteo.com>> wrote:
>>> >>
>>> >> Thanks Kostas. If there are items we can help with, I'm sure we'd be
>>> able
>>> >> to find folks who would be excited to contribute / help in any way.
>>> >>
>>> >> -- Piyush
>>> >>
>>> >>
>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
>>> >> kkloudas@gmail.com>> wrote:
>>> >>
>>> >>     Thanks Piyush for the message.
>>> >>     After this, I revoke my +1. I agree with the previous opinions
>>> that we
>>> >>     cannot drop code that is actively used by users, especially if it is
>>> >>     something as deep in the stack as support for a cluster management
>>> >>     framework.
>>> >>
>>> >>     Cheers,
>>> >>     Kostas
>>> >>
>>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <
>>> p.narang@criteo.com
>>> >> <ma...@criteo.com>> wrote:
>>> >>     >
>>> >>     > Hi folks,
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > We at Criteo are active users of the Flink on Mesos resource
>>> >> management component. We are pretty heavy users of Mesos for
>>> scheduling
>>> >> workloads on our edge datacenters and we do want to continue to be
>>> able to
>>> >> run some of our Flink topologies (to compute machine learning short
>>> term
>>> >> features) on those DCs. If possible our vote would be not to drop
>>> Mesos
>>> >> support as that will tie us to an old release / have to maintain a
>>> fork as
>>> >> we’re not planning to migrate off Mesos anytime soon. Is the burden
>>> >> something that can be helped with by the community? (Or are you
>>> referring
>>> >> to having to ensure PRs handle the Mesos piece as well when they
>>> touch the
>>> >> resource managers?)
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > Thanks,
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > -- Piyush
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
>>> >> trohrmann@apache.org>>
>>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
>>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
>>> >> tonysong820@gmail.com>>
>>> >>     > Cc: dev <de...@flink.apache.org>>,
>>> user <
>>> >> user@flink.apache.org<ma...@flink.apache.org>>
>>> >>     > Subject: Re: [SURVEY] Remove Mesos support
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > Thanks for starting this survey Robert! I second Konstantin and
>>> >> Xintong in the sense that our Mesos users' opinions should matter most
>>> >> here. If our community is no longer using the Mesos integration, then
>>> I
>>> >> would be +1 for removing it in order to decrease the maintenance
>>> burden.
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > Cheers,
>>> >>     >
>>> >>     > Till
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <
>>> tonysong820@gmail.com
>>> >> <ma...@gmail.com>> wrote:
>>> >>     >
>>> >>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>>> >> support.
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > With my developer hat on, removing the Mesos support would
>>> >> definitely reduce the maintenance overhead for the deployment and
>>> resource
>>> >> management related components. On the other hand, the Flink on Mesos
>>> users'
>>> >> voices definitely matter a lot for this community. Either way, it
>>> would be
>>> >> good to draw users' attention to this discussion early.
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > Thank you~
>>> >>     >
>>> >>     > Xintong Song
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <
>>> knaufk@apache.org
>>> >> <ma...@apache.org>> wrote:
>>> >>     >
>>> >>     > Hi Robert,
>>> >>     >
>>> >>     > +1 to the plan you outlined. If we were to drop support in Flink
>>> >> 1.13+, we
>>> >>     > would still support it in Flink 1.12- with bug fixes for some
>>> time
>>> >> so that
>>> >>     > users have time to move on.
>>> >>     >
>>> >>     > It would certainly be very interesting to hear from current
>>> Flink
>>> >> on Mesos
>>> >>     > users, on how they see the evolution of this part of the
>>> ecosystem.
>>> >>     >
>>> >>     > Best,
>>> >>     >
>>> >>     > Konstantin
>>> >>
>>> >
>>>
>>
>>
>> --
>>
>> Konstantin Knauf
>>
>> https://twitter.com/snntrable
>>
>> https://github.com/knaufk
>>
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Xintong Song <to...@gmail.com>.
+1
It's already a matter of fact for a while that we no longer port new
features to the Mesos deployment.

Thank you~

Xintong Song



On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann <tr...@apache.org> wrote:

> +1 for officially deprecating this component for the 1.13 release.
>
> Cheers,
> Till
>
> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org>
> wrote:
>
>> Hi Matthias,
>>
>> Thank you for following up on this. +1 to officially deprecate Mesos in
>> the code and documentation, too. It will be confusing for users if this
>> diverges from the roadmap.
>>
>> Cheers,
>>
>> Konstantin
>>
>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <ma...@ververica.com>
>> wrote:
>>
>>> Hi everyone,
>>> considering the upcoming release of Flink 1.13, I wanted to revive the
>>> discussion about the Mesos support ones more. Mesos is also already
>>> listed
>>> as deprecated in Flink's overall roadmap [1]. Maybe, it's time to align
>>> the
>>> documentation accordingly to make it more explicit?
>>>
>>> What do you think?
>>>
>>> Best,
>>> Matthias
>>>
>>> [1] https://flink.apache.org/roadmap.html#feature-radar
>>>
>>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org>
>>> wrote:
>>>
>>> > Hi Oleksandr,
>>> >
>>> > yes you are right. The biggest problem is at the moment the lack of
>>> test
>>> > coverage and thereby confidence to make changes. We have some e2e tests
>>> > which you can find here [1]. These tests are, however, quite coarse
>>> grained
>>> > and are missing a lot of cases. One idea would be to add a Mesos e2e
>>> test
>>> > based on Flink's end-to-end test framework [2]. I think what needs to
>>> be
>>> > done there is to add a Mesos resource and a way to submit jobs to a
>>> Mesos
>>> > cluster to write e2e tests.
>>> >
>>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
>>> > [2]
>>> >
>>> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>>> >
>>> > Cheers,
>>> > Till
>>> >
>>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
>>> > o.nitavskyi@criteo.com> wrote:
>>> >
>>> >> Hello Xintong,
>>> >>
>>> >> Thanks for the insights and support.
>>> >>
>>> >> Browsing the Mesos backlog and didn't identify anything critical,
>>> which
>>> >> is left there.
>>> >>
>>> >> I see that there are were quite a lot of contributions to the Flink
>>> Mesos
>>> >> in the recent version:
>>> >> https://github.com/apache/flink/commits/master/flink-mesos.
>>> >> We plan to validate the current Flink master (or release 1.12 branch)
>>> our
>>> >> Mesos setup. In case of any issues, we will try to propose changes.
>>> >> My feeling is that our test results shouldn't affect the Flink 1.12
>>> >> release cycle. And if any potential commits will land into the 1.12.1
>>> it
>>> >> should be totally fine.
>>> >>
>>> >> In the future, we would be glad to help you guys with any
>>> >> maintenance-related questions. One of the highest priorities around
>>> this
>>> >> component seems to be the development of the full e2e test.
>>> >>
>>> >> Kind Regards
>>> >> Oleksandr Nitavskyi
>>> >> ________________________________
>>> >> From: Xintong Song <to...@gmail.com>
>>> >> Sent: Tuesday, October 27, 2020 7:14 AM
>>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
>>> >> Cc: Piyush Narang <p....@criteo.com>
>>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>>> >>
>>> >> Hi Piyush,
>>> >>
>>> >> Thanks a lot for sharing the information. It would be a great relief
>>> that
>>> >> you are good with Flink on Mesos as is.
>>> >>
>>> >> As for the jira issues, I believe the most essential ones should have
>>> >> already been resolved. You may find some remaining open issues here
>>> [1],
>>> >> but not all of them are necessary if we decide to keep Flink on Mesos
>>> as is.
>>> >>
>>> >> At the moment and in the short future, I think helps are mostly
>>> needed on
>>> >> testing the upcoming release 1.12 with Mesos use cases. The community
>>> is
>>> >> currently actively preparing the new release, and hopefully we could
>>> come
>>> >> up with a release candidate early next month. It would be greatly
>>> >> appreciated if you fork as experienced Flink on Mesos users can help
>>> with
>>> >> verifying the release candidates.
>>> >>
>>> >>
>>> >> Thank you~
>>> >>
>>> >> Xintong Song
>>> >>
>>> >> [1]
>>> >>
>>> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>>> >> <
>>> >>
>>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FFLINK-17402%3Fjql%3Dproject%2520%253D%2520FLINK%2520AND%2520component%2520%253D%2520%2522Deployment%2520%252F%2520Mesos%2522%2520AND%2520status%2520%253D%2520Open&data=04%7C01%7Co.nitavskyi%40criteo.com%7C3585e1f25bdf4e091af808d87a3f92db%7C2a35d8fd574d48e3927c8c398e225a01%7C1%7C0%7C637393760750820881%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=hytJFQE0MCPzMLiQTQTdbg3GVckX5M3r1NPRGrRV8j4%3D&reserved=0
>>> >> >
>>> >>
>>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
>>> >> <ma...@criteo.com>> wrote:
>>> >>
>>> >> Hi Xintong,
>>> >>
>>> >>
>>> >>
>>> >> Do you have any jiras that cover any of the items on 1 or 2? I can
>>> reach
>>> >> out to folks internally and see if I can get some folks to commit to
>>> >> helping out.
>>> >>
>>> >>
>>> >>
>>> >> To cover the other qs:
>>> >>
>>> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
>>> >> Yarn for some our Flink workloads when we can. Mesos is only used
>>> when we
>>> >> need streaming capabilities in our WW dcs (as our Yarn is centralized
>>> in
>>> >> one DC)
>>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan to
>>> bump
>>> >> to 1.11 / 1.12 this quarter.
>>> >>   *   We typically upgrade once every 6 months to a year (not every
>>> >> release). We’d like to speed up the cadence but we’re not there yet.
>>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
>>> >> functional while missing out on some of the newer features. We
>>> understand
>>> >> the pain on the communities side and we can take on the work if we
>>> see some
>>> >> fancy improvement in Flink on Yarn / K8s that we want in Mesos to put
>>> in
>>> >> the request to port it over.
>>> >>
>>> >>
>>> >>
>>> >> Thanks,
>>> >>
>>> >>
>>> >>
>>> >> -- Piyush
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> From: Xintong Song <tonysong820@gmail.com<mailto:
>>> tonysong820@gmail.com>>
>>> >> Date: Sunday, October 25, 2020 at 10:57 PM
>>> >> To: dev <de...@flink.apache.org>>, user <
>>> >> user@flink.apache.org<ma...@flink.apache.org>>
>>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
>>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
>>> >> p.narang@criteo.com>>
>>> >> Subject: Re: [SURVEY] Remove Mesos support
>>> >>
>>> >>
>>> >>
>>> >> Thanks for sharing the information with us, Piyush an Lasse.
>>> >>
>>> >>
>>> >>
>>> >> @Piyush
>>> >>
>>> >>
>>> >>
>>> >> Thanks for offering the help. IMO, there are currently several
>>> problems
>>> >> that make supporting Flink on Mesos challenging for us.
>>> >>
>>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if not
>>> >> none) among the active contributors in this community that are
>>> familiar
>>> >> with Mesos and can help with development on this component.
>>> >>   2.  Absence of tests. Mesos does not provide a testing cluster, like
>>> >> `MiniYARNCluster`, making it hard to test interactions between Flink
>>> and
>>> >> Mesos. We have only a few very simple e2e tests running on Mesos
>>> deployed
>>> >> in a docker, covering the most fundamental workflows. We are not sure
>>> how
>>> >> well those tests work, especially against some potential corner cases.
>>> >>   3.  Divergence from other deployment. Because of 1 and 2, the new
>>> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
>>> >> possible. When the new efforts have to touch the Mesos related
>>> components
>>> >> (e.g., changes to the common resource manager interfaces), we have to
>>> be
>>> >> very careful and make as few changes as possible, to avoid
>>> accidentally
>>> >> breaking anything that we are not familiar with. As a result, the
>>> component
>>> >> diverges a lot from other deployment components (K8s/Yarn), which
>>> makes it
>>> >> harder to maintain.
>>> >>
>>> >> It would be greatly appreciated if you can help with either of the
>>> above
>>> >> issues.
>>> >>
>>> >>
>>> >>
>>> >> Additionally, I have a few questions concerning your use cases at
>>> Criteo.
>>> >> IIUC, you are going to stay on Mesos in the foreseeable future, while
>>> >> keeping the Flink version up-to-date? What Flink version are you
>>> currently
>>> >> using? How often do you upgrade (e.g., every release)? Would you be
>>> good
>>> >> with keeping the Flink on Mesos component as it is (means that
>>> deployment
>>> >> and resource management improvements may not be ported to Mesos),
>>> while
>>> >> keeping other components up-to-date (e.g., improvements from
>>> programming
>>> >> APIs, operators, state backens, etc.)?
>>> >>
>>> >>
>>> >>
>>> >> Thank you~
>>> >>
>>> >> Xintong Song
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
>>> >> lassenedergaardflink@gmail.com<mailto:lassenedergaardflink@gmail.com
>>> >>
>>> >> wrote:
>>> >>
>>> >> Hi
>>> >>
>>> >>
>>> >>
>>> >> At Trackunit We have been using Mesos for long time but have now
>>> moved to
>>> >> k8s.
>>> >>
>>> >> Med venlig hilsen / Best regards
>>> >>
>>> >> Lasse Nedergaard
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Den 23. okt. 2020 kl. 17.01 skrev Robert Metzger <rmetzger@apache.org
>>> >> <ma...@apache.org>>:
>>> >>
>>> >> 
>>> >>
>>> >> Hey Piyush,
>>> >>
>>> >> thanks a lot for raising this concern. I believe we should keep Mesos
>>> in
>>> >> Flink then in the foreseeable future.
>>> >>
>>> >> Your offer to help is much appreciated. We'll let you know once there
>>> is
>>> >> something.
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
>>> >> <ma...@criteo.com>> wrote:
>>> >>
>>> >> Thanks Kostas. If there's items we can help with, I'm sure we'd be
>>> able
>>> >> to find folks who would be excited to contribute / help in any way.
>>> >>
>>> >> -- Piyush
>>> >>
>>> >>
>>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
>>> >> kkloudas@gmail.com>> wrote:
>>> >>
>>> >>     Thanks Piyush for the message.
>>> >>     After this, I revoke my +1. I agree with the previous opinions
>>> that we
>>> >>     cannot drop code that is actively used by users, especially if it
>>> >>     something that deep in the stack as support for cluster management
>>> >>     framework.
>>> >>
>>> >>     Cheers,
>>> >>     Kostas
>>> >>
>>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <
>>> p.narang@criteo.com
>>> >> <ma...@criteo.com>> wrote:
>>> >>     >
>>> >>     > Hi folks,
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > We at Criteo are active users of the Flink on Mesos resource
>>> >> management component. We are pretty heavy users of Mesos for
>>> scheduling
>>> >> workloads on our edge datacenters and we do want to continue to be
>>> able to
>>> >> run some of our Flink topologies (to compute machine learning short
>>> term
>>> >> features) on those DCs. If possible our vote would be not to drop
>>> Mesos
>>> >> support as that will tie us to an old release / have to maintain a
>>> fork as
>>> >> we’re not planning to migrate off Mesos anytime soon. Is the burden
>>> >> something that can be helped with by the community? (Or are you
>>> referring
>>> >> to having to ensure PRs handle the Mesos piece as well when they
>>> touch the
>>> >> resource managers?)
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > Thanks,
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > -- Piyush
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
>>> >> trohrmann@apache.org>>
>>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
>>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
>>> >> tonysong820@gmail.com>>
>>> >>     > Cc: dev <de...@flink.apache.org>>,
>>> user <
>>> >> user@flink.apache.org<ma...@flink.apache.org>>
>>> >>     > Subject: Re: [SURVEY] Remove Mesos support
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > Thanks for starting this survey Robert! I second Konstantin and
>>> >> Xintong in the sense that our Mesos user's opinions should matter most
>>> >> here. If our community is no longer using the Mesos integration, then
>>> I
>>> >> would be +1 for removing it in order to decrease the maintenance
>>> burden.
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > Cheers,
>>> >>     >
>>> >>     > Till
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <
>>> tonysong820@gmail.com
>>> >> <ma...@gmail.com>> wrote:
>>> >>     >
>>> >>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>>> >> support.
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > With my developer hat on, removing the Mesos support would
>>> >> definitely reduce the maintenance overhead for the deployment and
>>> resource
>>> >> management related components. On the other hand, the Flink on Mesos
>>> users'
>>> >> voices definitely matter a lot for this community. Either way, it
>>> would be
>>> >> good to draw users' attention to this discussion early.
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > Thank you~
>>> >>     >
>>> >>     > Xintong Song
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     >
>>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <
>>> knaufk@apache.org
>>> >> <ma...@apache.org>> wrote:
>>> >>     >
>>> >>     > Hi Robert,
>>> >>     >
>>> >>     > +1 to the plan you outlined. If we were to drop support in Flink
>>> >> 1.13+, we
>>> >>     > would still support it in Flink 1.12- with bug fixes for some
>>> time
>>> >> so that
>>> >>     > users have time to move on.
>>> >>     >
>>> >>     > It would certainly be very interesting to hear from current
>>> Flink
>>> >> on Mesos
>>> >>     > users, on how they see the evolution of this part of the
>>> ecosystem.
>>> >>     >
>>> >>     > Best,
>>> >>     >
>>> >>     > Konstantin
>>> >>
>>> >
>>>
>>
>>
>> --
>>
>> Konstantin Knauf
>>
>> https://twitter.com/snntrable
>>
>> https://github.com/knaufk
>>
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Till Rohrmann <tr...@apache.org>.
+1 for officially deprecating this component for the 1.13 release.

Cheers,
Till

On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf <kn...@apache.org> wrote:

> Hi Matthias,
>
> Thank you for following up on this. +1 to officially deprecate Mesos in
> the code and documentation, too. It will be confusing for users if this
> diverges from the roadmap.
>
> Cheers,
>
> Konstantin
>
> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <ma...@ververica.com>
> wrote:
>
>> Hi everyone,
>> considering the upcoming release of Flink 1.13, I wanted to revive the
>> discussion about the Mesos support once more. Mesos is also already listed
>> as deprecated in Flink's overall roadmap [1]. Maybe, it's time to align
>> the
>> documentation accordingly to make it more explicit?
>>
>> What do you think?
>>
>> Best,
>> Matthias
>>
>> [1] https://flink.apache.org/roadmap.html#feature-radar
>>
>> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org>
>> wrote:
>>
>> > Hi Oleksandr,
>> >
>> > yes, you are right. The biggest problem at the moment is the lack of test
>> > coverage and thereby of confidence to make changes. We have some e2e tests
>> > which you can find here [1]. These tests are, however, quite coarse-grained
>> > and are missing a lot of cases. One idea would be to add a Mesos e2e test
>> > based on Flink's end-to-end test framework [2]. I think what needs to be
>> > done there is to add a Mesos resource and a way to submit jobs to a Mesos
>> > cluster to write e2e tests.
>> >
>> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
>> > [2]
>> >
>> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>> >
>> > Cheers,
>> > Till
>> >
>> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
>> > o.nitavskyi@criteo.com> wrote:
>> >
>> >> Hello Xintong,
>> >>
>> >> Thanks for the insights and support.
>> >>
>> >> I browsed the Mesos backlog and didn't identify anything critical that
>> >> is left there.
>> >>
>> >> I see that there were quite a lot of contributions to Flink Mesos
>> >> in the recent version:
>> >> https://github.com/apache/flink/commits/master/flink-mesos.
>> >> We plan to validate the current Flink master (or the release 1.12 branch)
>> >> on our Mesos setup. In case of any issues, we will try to propose changes.
>> >> My feeling is that our test results shouldn't affect the Flink 1.12
>> >> release cycle. If any resulting commits land in 1.12.1 instead, that
>> >> should be totally fine.
>> >> In the future, we would be glad to help you guys with any
>> >> maintenance-related questions. One of the highest priorities around
>> this
>> >> component seems to be the development of the full e2e test.
>> >>
>> >> Kind Regards
>> >> Oleksandr Nitavskyi
>> >> ________________________________
>> >> From: Xintong Song <to...@gmail.com>
>> >> Sent: Tuesday, October 27, 2020 7:14 AM
>> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
>> >> Cc: Piyush Narang <p....@criteo.com>
>> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>> >>
>> >> Hi Piyush,
>> >>
>> >> Thanks a lot for sharing the information. It is a great relief that
>> >> you are good with Flink on Mesos as is.
>> >>
>> >> As for the jira issues, I believe the most essential ones should have
>> >> already been resolved. You may find some remaining open issues here [1],
>> >> but not all of them are necessary if we decide to keep Flink on Mesos
>> >> as is.
>> >>
>> >> At the moment and in the near future, I think help is mostly needed with
>> >> testing the upcoming release 1.12 with Mesos use cases. The community is
>> >> currently actively preparing the new release, and hopefully we can come
>> >> up with a release candidate early next month. It would be greatly
>> >> appreciated if you folks, as experienced Flink on Mesos users, can help
>> >> with verifying the release candidates.
>> >>
>> >>
>> >> Thank you~
>> >>
>> >> Xintong Song
>> >>
>> >> [1]
>> >>
>> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>> >>
>> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
>> >> <ma...@criteo.com>> wrote:
>> >>
>> >> Hi Xintong,
>> >>
>> >>
>> >>
>> >> Do you have any jiras that cover any of the items on 1 or 2? I can
>> reach
>> >> out to folks internally and see if I can get some folks to commit to
>> >> helping out.
>> >>
>> >>
>> >>
>> >> To cover the other questions:
>> >>
>> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
>> >> Yarn for some of our Flink workloads when we can. Mesos is only used when
>> >> we need streaming capabilities in our WW DCs (as our Yarn is centralized
>> >> in one DC).
>> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan to bump
>> >> to 1.11 / 1.12 this quarter.
>> >>   *   We typically upgrade once every 6 months to a year (not every
>> >> release). We’d like to speed up the cadence but we’re not there yet.
>> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
>> >> functional while missing out on some of the newer features. We understand
>> >> the pain on the community’s side, and if we see some fancy improvement in
>> >> Flink on Yarn / K8s that we want in Mesos, we can take on the work of
>> >> putting in the request to port it over.
>> >>
>> >>
>> >>
>> >> Thanks,
>> >>
>> >>
>> >>
>> >> -- Piyush
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> From: Xintong Song <tonysong820@gmail.com<mailto:tonysong820@gmail.com
>> >>
>> >> Date: Sunday, October 25, 2020 at 10:57 PM
>> >> To: dev <de...@flink.apache.org>>, user <
>> >> user@flink.apache.org<ma...@flink.apache.org>>
>> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
>> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
>> >> p.narang@criteo.com>>
>> >> Subject: Re: [SURVEY] Remove Mesos support
>> >>
>> >>
>> >>
>> >> Thanks for sharing the information with us, Piyush and Lasse.
>> >>
>> >>
>> >>
>> >> @Piyush
>> >>
>> >>
>> >>
>> >> Thanks for offering the help. IMO, there are currently several problems
>> >> that make supporting Flink on Mesos challenging for us.
>> >>
>> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if any)
>> >> among the active contributors in this community who are familiar
>> >> with Mesos and can help with development on this component.
>> >>   2.  Absence of tests. Mesos does not provide a testing cluster, like
>> >> `MiniYARNCluster`, making it hard to test interactions between Flink and
>> >> Mesos. We have only a few very simple e2e tests running on Mesos deployed
>> >> in Docker, covering the most fundamental workflows. We are not sure how
>> >> well those tests work, especially against some potential corner cases.
>> >>   3.  Divergence from other deployments. Because of 1 and 2, new
>> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
>> >> possible. When new efforts have to touch the Mesos-related components
>> >> (e.g., changes to the common resource manager interfaces), we have to be
>> >> very careful and make as few changes as possible, to avoid accidentally
>> >> breaking anything that we are not familiar with. As a result, the
>> >> component diverges a lot from the other deployment components (K8s/Yarn),
>> >> which makes it harder to maintain.
>> >>
>> >> It would be greatly appreciated if you can help with any of the above
>> >> issues.
>> >>
>> >>
>> >>
>> >> Additionally, I have a few questions concerning your use cases at Criteo.
>> >> IIUC, you are going to stay on Mesos for the foreseeable future, while
>> >> keeping the Flink version up-to-date? What Flink version are you currently
>> >> using? How often do you upgrade (e.g., every release)? Would you be good
>> >> with keeping the Flink on Mesos component as it is (meaning that deployment
>> >> and resource management improvements may not be ported to Mesos), while
>> >> keeping other components up-to-date (e.g., improvements to programming
>> >> APIs, operators, state backends, etc.)?
>> >>
>> >>
>> >>
>> >> Thank you~
>> >>
>> >> Xintong Song
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
>> >> lassenedergaardflink@gmail.com<ma...@gmail.com>>
>> >> wrote:
>> >>
>> >> Hi
>> >>
>> >>
>> >>
>> >> At Trackunit we have been using Mesos for a long time but have now moved
>> >> to k8s.
>> >>
>> >> Best regards
>> >>
>> >> Lasse Nedergaard
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 23 Oct 2020, at 17:01, Robert Metzger <rmetzger@apache.org
>> >> <ma...@apache.org>> wrote:
>> >>
>> >> Hey Piyush,
>> >>
>> >> thanks a lot for raising this concern. I believe we should then keep Mesos
>> >> in Flink for the foreseeable future.
>> >>
>> >> Your offer to help is much appreciated. We'll let you know once there
>> is
>> >> something.
>> >>
>> >>
>> >>
>> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
>> >> <ma...@criteo.com>> wrote:
>> >>
>> >> Thanks Kostas. If there are items we can help with, I'm sure we'd be able
>> >> to find folks who would be excited to contribute / help in any way.
>> >>
>> >> -- Piyush
>> >>
>> >>
>> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
>> >> kkloudas@gmail.com>> wrote:
>> >>
>> >>     Thanks Piyush for the message.
>> >>     After this, I revoke my +1. I agree with the previous opinions that we
>> >>     cannot drop code that is actively used by users, especially if it is
>> >>     something as deep in the stack as support for a cluster management
>> >>     framework.
>> >>
>> >>     Cheers,
>> >>     Kostas
>> >>
>> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.narang@criteo.com
>> >> <ma...@criteo.com>> wrote:
>> >>     >
>> >>     > Hi folks,
>> >>     >
>> >>     >
>> >>     >
>> >>     > We at Criteo are active users of the Flink on Mesos resource
>> >> management component. We are pretty heavy users of Mesos for scheduling
>> >> workloads on our edge datacenters and we do want to continue to be
>> able to
>> >> run some of our Flink topologies (to compute machine learning short
>> term
>> >> features) on those DCs. If possible our vote would be not to drop Mesos
>> >> support as that will tie us to an old release / have to maintain a
>> fork as
>> >> we’re not planning to migrate off Mesos anytime soon. Is the burden
>> >> something that can be helped with by the community? (Or are you
>> referring
>> >> to having to ensure PRs handle the Mesos piece as well when they touch
>> the
>> >> resource managers?)
>> >>     >
>> >>     >
>> >>     >
>> >>     > Thanks,
>> >>     >
>> >>     >
>> >>     >
>> >>     > -- Piyush
>> >>     >
>> >>     >
>> >>     >
>> >>     >
>> >>     >
>> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
>> >> trohrmann@apache.org>>
>> >>     > Date: Friday, October 23, 2020 at 8:19 AM
>> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
>> >> tonysong820@gmail.com>>
>> >>     > Cc: dev <de...@flink.apache.org>>,
>> user <
>> >> user@flink.apache.org<ma...@flink.apache.org>>
>> >>     > Subject: Re: [SURVEY] Remove Mesos support
>> >>     >
>> >>     >
>> >>     >
>> >>     > Thanks for starting this survey Robert! I second Konstantin and
>> >> Xintong in the sense that our Mesos users' opinions should matter most
>> >> here. If our community is no longer using the Mesos integration, then I
>> >> would be +1 for removing it in order to decrease the maintenance
>> burden.
>> >>     >
>> >>     >
>> >>     >
>> >>     > Cheers,
>> >>     >
>> >>     > Till
>> >>     >
>> >>     >
>> >>     >
>> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <
>> tonysong820@gmail.com
>> >> <ma...@gmail.com>> wrote:
>> >>     >
>> >>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>> >> support.
>> >>     >
>> >>     >
>> >>     >
>> >>     > With my developer hat on, removing the Mesos support would
>> >> definitely reduce the maintenance overhead for the deployment and
>> resource
>> >> management related components. On the other hand, the Flink on Mesos
>> users'
>> >> voices definitely matter a lot for this community. Either way, it
>> would be
>> >> good to draw users' attention to this discussion early.
>> >>     >
>> >>     >
>> >>     >
>> >>     > Thank you~
>> >>     >
>> >>     > Xintong Song
>> >>     >
>> >>     >
>> >>     >
>> >>     >
>> >>     >
>> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <
>> knaufk@apache.org
>> >> <ma...@apache.org>> wrote:
>> >>     >
>> >>     > Hi Robert,
>> >>     >
>> >>     > +1 to the plan you outlined. If we were to drop support in Flink
>> >> 1.13+, we
>> >>     > would still support it in Flink 1.12- with bug fixes for some
>> time
>> >> so that
>> >>     > users have time to move on.
>> >>     >
>> >>     > It would certainly be very interesting to hear from current Flink
>> >> on Mesos
>> >>     > users, on how they see the evolution of this part of the
>> ecosystem.
>> >>     >
>> >>     > Best,
>> >>     >
>> >>     > Konstantin
>> >>
>> >
>>
>
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Konstantin Knauf <kn...@apache.org>.
Hi Matthias,

Thank you for following up on this. +1 to officially deprecate Mesos in the
code and documentation, too. It will be confusing for users if this
diverges from the roadmap.

Cheers,

Konstantin

On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl <ma...@ververica.com>
wrote:

> Hi everyone,
> considering the upcoming release of Flink 1.13, I wanted to revive the
> discussion about the Mesos support once more. Mesos is also already listed
> as deprecated in Flink's overall roadmap [1]. Maybe, it's time to align the
> documentation accordingly to make it more explicit?
>
> What do you think?
>
> Best,
> Matthias
>
> [1] https://flink.apache.org/roadmap.html#feature-radar
>
> On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org>
> wrote:
>
> > Hi Oleksandr,
> >
> > yes, you are right. The biggest problem at the moment is the lack of test
> > coverage and thereby of confidence to make changes. We have some e2e tests
> > which you can find here [1]. These tests are, however, quite coarse-grained
> > and are missing a lot of cases. One idea would be to add a Mesos e2e test
> > based on Flink's end-to-end test framework [2]. I think what needs to be
> > done there is to add a Mesos resource and a way to submit jobs to a Mesos
> > cluster to write e2e tests.
> >
> > [1] https://github.com/apache/flink/tree/master/flink-jepsen
> > [2]
> >
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
> >
> > Cheers,
> > Till
> >
> > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
> > o.nitavskyi@criteo.com> wrote:
> >
> >> Hello Xintong,
> >>
> >> Thanks for the insights and support.
> >>
> >> I browsed the Mesos backlog and didn't identify anything critical that
> >> is left there.
> >>
> >> I see that there were quite a lot of contributions to Flink Mesos
> >> in the recent version:
> >> https://github.com/apache/flink/commits/master/flink-mesos.
> >> We plan to validate the current Flink master (or the release 1.12 branch)
> >> on our Mesos setup. In case of any issues, we will try to propose changes.
> >> My feeling is that our test results shouldn't affect the Flink 1.12
> >> release cycle. If any resulting commits land in 1.12.1 instead, that
> >> should be totally fine.
> >>
> >> In the future, we would be glad to help you guys with any
> >> maintenance-related questions. One of the highest priorities around this
> >> component seems to be the development of the full e2e test.
> >>
> >> Kind Regards
> >> Oleksandr Nitavskyi
> >> ________________________________
> >> From: Xintong Song <to...@gmail.com>
> >> Sent: Tuesday, October 27, 2020 7:14 AM
> >> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
> >> Cc: Piyush Narang <p....@criteo.com>
> >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
> >>
> >> Hi Piyush,
> >>
> >> Thanks a lot for sharing the information. It is a great relief that
> >> you are good with Flink on Mesos as is.
> >>
> >> As for the jira issues, I believe the most essential ones should have
> >> already been resolved. You may find some remaining open issues here [1],
> >> but not all of them are necessary if we decide to keep Flink on Mesos
> >> as is.
> >>
> >> At the moment and in the near future, I think help is mostly needed
> >> with testing the upcoming release 1.12 with Mesos use cases. The community
> >> is currently actively preparing the new release, and hopefully we could
> >> come up with a release candidate early next month. It would be greatly
> >> appreciated if you folks, as experienced Flink on Mesos users, can help
> >> with verifying the release candidates.
> >>
> >>
> >> Thank you~
> >>
> >> Xintong Song
> >>
> >> [1]
> >> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
> >>
> >> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
> >> <ma...@criteo.com>> wrote:
> >>
> >> Hi Xintong,
> >>
> >>
> >>
> >> Do you have any jiras that cover any of the items on 1 or 2? I can reach
> >> out to folks internally and see if I can get some folks to commit to
> >> helping out.
> >>
> >>
> >>
> >> To cover the other qs:
> >>
> >>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
> >> Yarn for some of our Flink workloads when we can. Mesos is only used when
> >> we need streaming capabilities in our WW DCs (as our Yarn is centralized
> >> in one DC).
> >>   *   We’re currently on Flink 1.9 (old planner). We have a plan to bump
> >> to 1.11 / 1.12 this quarter.
> >>   *   We typically upgrade once every 6 months to a year (not every
> >> release). We’d like to speed up the cadence but we’re not there yet.
> >>   *   We’d largely be good with keeping Flink on Mesos as-is and
> >> functional while missing out on some of the newer features. We
> >> understand the pain on the community’s side, and we can take on the work
> >> to port it over if we see some fancy improvement in Flink on Yarn / K8s
> >> that we want in Mesos.
> >>
> >>
> >>
> >> Thanks,
> >>
> >>
> >>
> >> -- Piyush
> >>
> >>
> >>
> >>
> >>
> >> From: Xintong Song <tonysong820@gmail.com<mailto:tonysong820@gmail.com
> >>
> >> Date: Sunday, October 25, 2020 at 10:57 PM
> >> To: dev <de...@flink.apache.org>>, user <
> >> user@flink.apache.org<ma...@flink.apache.org>>
> >> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
> >> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
> >> p.narang@criteo.com>>
> >> Subject: Re: [SURVEY] Remove Mesos support
> >>
> >>
> >>
> >> Thanks for sharing the information with us, Piyush and Lasse.
> >>
> >>
> >>
> >> @Piyush
> >>
> >>
> >>
> >> Thanks for offering the help. IMO, there are currently several problems
> >> that make supporting Flink on Mesos challenging for us.
> >>
> >>   1.  Lack of Mesos experts. AFAIK, there are very few people (if any)
> >> among the active contributors in this community who are familiar
> >> with Mesos and can help with development on this component.
> >>   2.  Absence of tests. Mesos does not provide a testing cluster, like
> >> `MiniYARNCluster`, making it hard to test interactions between Flink and
> >> Mesos. We have only a few very simple e2e tests running on Mesos
> >> deployed in Docker, covering the most fundamental workflows. We are not
> >> sure how well those tests work, especially against some potential corner
> >> cases.
> >>   3.  Divergence from other deployments. Because of 1 and 2, the new
> >> efforts (features, maintenance, refactors) tend to exclude Mesos if
> >> possible. When the new efforts have to touch the Mesos-related
> >> components (e.g., changes to the common resource manager interfaces), we
> >> have to be very careful and make as few changes as possible, to avoid
> >> accidentally breaking anything that we are not familiar with. As a
> >> result, the component diverges a lot from other deployment components
> >> (K8s/Yarn), which makes it harder to maintain.
> >>
> >> It would be greatly appreciated if you can help with any of the above
> >> issues.
> >>
> >>
> >>
> >> Additionally, I have a few questions concerning your use cases at
> >> Criteo. IIUC, you are going to stay on Mesos for the foreseeable future,
> >> while keeping the Flink version up-to-date? What Flink version are you
> >> currently using? How often do you upgrade (e.g., every release)? Would you
> >> be good with keeping the Flink on Mesos component as it is (meaning that
> >> deployment and resource management improvements may not be ported to
> >> Mesos), while keeping other components up-to-date (e.g., improvements from
> >> programming APIs, operators, state backends, etc.)?
> >>
> >>
> >>
> >> Thank you~
> >>
> >> Xintong Song
> >>
> >>
> >>
> >>
> >>
> >> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
> >> lassenedergaardflink@gmail.com<ma...@gmail.com>>
> >> wrote:
> >>
> >> Hi
> >>
> >>
> >>
> >> At Trackunit we have been using Mesos for a long time but have now moved
> >> to k8s.
> >>
> >> Best regards
> >>
> >> Lasse Nedergaard
> >>
> >>
> >>
> >>
> >>
> >> On Oct 23, 2020, at 17:01, Robert Metzger <rmetzger@apache.org
> >> <ma...@apache.org>> wrote:
> >>
> >> Hey Piyush,
> >>
> >> thanks a lot for raising this concern. I believe we should then keep Mesos
> >> in Flink for the foreseeable future.
> >>
> >> Your offer to help is much appreciated. We'll let you know once there is
> >> something.
> >>
> >>
> >>
> >> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
> >> <ma...@criteo.com>> wrote:
> >>
> >> Thanks Kostas. If there are items we can help with, I'm sure we'd be able
> >> to find folks who would be excited to contribute / help in any way.
> >>
> >> -- Piyush
> >>
> >>
> >> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
> >> kkloudas@gmail.com>> wrote:
> >>
> >>     Thanks Piyush for the message.
> >>     After this, I revoke my +1. I agree with the previous opinions that
> >>     we cannot drop code that is actively used by users, especially if it
> >>     is something as deep in the stack as support for a cluster management
> >>     framework.
> >>
> >>     Cheers,
> >>     Kostas
> >>
> >>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.narang@criteo.com
> >> <ma...@criteo.com>> wrote:
> >>     >
> >>     > Hi folks,
> >>     >
> >>     >
> >>     >
> >>     > We at Criteo are active users of the Flink on Mesos resource
> >> management component. We are pretty heavy users of Mesos for scheduling
> >> workloads on our edge datacenters and we do want to continue to be able
> >> to run some of our Flink topologies (to compute machine learning short
> >> term features) on those DCs. If possible, our vote would be not to drop
> >> Mesos support, as that would tie us to an old release or force us to
> >> maintain a fork, since we’re not planning to migrate off Mesos anytime
> >> soon. Is the burden something that can be helped with by the community?
> >> (Or are you referring to having to ensure PRs handle the Mesos piece as
> >> well when they touch the resource managers?)
> >>     >
> >>     >
> >>     >
> >>     > Thanks,
> >>     >
> >>     >
> >>     >
> >>     > -- Piyush
> >>     >
> >>     >
> >>     >
> >>     >
> >>     >
> >>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
> >> trohrmann@apache.org>>
> >>     > Date: Friday, October 23, 2020 at 8:19 AM
> >>     > To: Xintong Song <tonysong820@gmail.com<mailto:
> >> tonysong820@gmail.com>>
> >>     > Cc: dev <de...@flink.apache.org>>,
> user <
> >> user@flink.apache.org<ma...@flink.apache.org>>
> >>     > Subject: Re: [SURVEY] Remove Mesos support
> >>     >
> >>     >
> >>     >
> >>     > Thanks for starting this survey, Robert! I second Konstantin and
> >> Xintong in the sense that our Mesos users' opinions should matter most
> >> here. If our community is no longer using the Mesos integration, then I
> >> would be +1 for removing it in order to decrease the maintenance burden.
> >>     >
> >>     >
> >>     >
> >>     > Cheers,
> >>     >
> >>     > Till
> >>     >
> >>     >
> >>     >
> >>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <
> tonysong820@gmail.com
> >> <ma...@gmail.com>> wrote:
> >>     >
> >>     > +1 for adding a warning in 1.12 about planning to remove Mesos
> >> support.
> >>     >
> >>     >
> >>     >
> >>     > With my developer hat on, removing the Mesos support would
> >> definitely reduce the maintenance overhead for the deployment and
> >> resource management related components. On the other hand, the Flink on
> >> Mesos users' voices definitely matter a lot for this community. Either
> >> way, it would be good to draw users' attention to this discussion early.
> >>     >
> >>     >
> >>     >
> >>     > Thank you~
> >>     >
> >>     > Xintong Song
> >>     >
> >>     >
> >>     >
> >>     >
> >>     >
> >>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <
> knaufk@apache.org
> >> <ma...@apache.org>> wrote:
> >>     >
> >>     > Hi Robert,
> >>     >
> >>     > +1 to the plan you outlined. If we were to drop support in Flink
> >>     > 1.13+, we would still support it in Flink 1.12- with bug fixes for
> >>     > some time so that users have time to move on.
> >>     >
> >>     > It would certainly be very interesting to hear from current Flink
> >>     > on Mesos users, on how they see the evolution of this part of the
> >>     > ecosystem.
> >>     >
> >>     > Best,
> >>     >
> >>     > Konstantin
> >>
> >
>


-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Konstantin Knauf <kn...@apache.org>.
Hi Matthias,

Thank you for following up on this. +1 to officially deprecate Mesos in the
code and documentation, too. It will be confusing for users if this
diverges from the roadmap.

Cheers,

Konstantin



-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Matthias Pohl <ma...@ververica.com>.
Hi everyone,
considering the upcoming release of Flink 1.13, I wanted to revive the
discussion about the Mesos support once more. Mesos is also already listed
as deprecated in Flink's overall roadmap [1]. Maybe it's time to align the
documentation accordingly to make it more explicit?

What do you think?

Best,
Matthias

[1] https://flink.apache.org/roadmap.html#feature-radar

On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org> wrote:

> Hi Oleksandr,
>
> yes, you are right. The biggest problem at the moment is the lack of test
> coverage and thereby confidence to make changes. We have some e2e tests
> which you can find here [1]. These tests are, however, quite coarse-grained
> and are missing a lot of cases. One idea would be to add a Mesos e2e test
> based on Flink's end-to-end test framework [2]. I think what needs to be
> done there is to add a Mesos resource and a way to submit jobs to a Mesos
> cluster to write e2e tests.
>
> [1] https://github.com/apache/flink/tree/master/flink-jepsen
> [2]
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>
> Cheers,
> Till
>
> On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
> o.nitavskyi@criteo.com> wrote:
>
>> Hello Xintong,
>>
>> Thanks for the insights and support.
>>
>> I browsed the Mesos backlog and didn't identify anything critical left
>> there.
>>
>> I see that there were quite a lot of contributions to Flink Mesos in the
>> recent version:
>> https://github.com/apache/flink/commits/master/flink-mesos.
>> We plan to validate the current Flink master (or release 1.12 branch) on our
>> Mesos setup. In case of any issues, we will try to propose changes.
>> My feeling is that our test results shouldn't affect the Flink 1.12
>> release cycle. And if any potential commits land in 1.12.1, that
>> should be totally fine.
>>
>> In the future, we would be glad to help you guys with any
>> maintenance-related questions. One of the highest priorities around this
>> component seems to be the development of the full e2e test.
>>
>> Kind Regards
>> Oleksandr Nitavskyi
>> ________________________________
>> From: Xintong Song <to...@gmail.com>
>> Sent: Tuesday, October 27, 2020 7:14 AM
>> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
>> Cc: Piyush Narang <p....@criteo.com>
>> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>>
>> Hi Piyush,
>>
>> Thanks a lot for sharing the information. It is a great relief that
>> you are good with Flink on Mesos as is.
>>
>> As for the jira issues, I believe the most essential ones should have
>> already been resolved. You may find some remaining open issues here [1],
>> but not all of them are necessary if we decide to keep Flink on Mesos as is.
>>
>> At the moment and in the near future, I think help is mostly needed with
>> testing the upcoming release 1.12 with Mesos use cases. The community is
>> currently actively preparing the new release, and hopefully we could come
>> up with a release candidate early next month. It would be greatly
>> appreciated if you folks, as experienced Flink on Mesos users, can help with
>> verifying the release candidates.
>>
>>
>> Thank you~
>>
>> Xintong Song
>>
>> [1]
>> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>>
>> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
>> <ma...@criteo.com>> wrote:
>>
>> Hi Xintong,
>>
>>
>>
>> Do you have any jiras that cover any of the items on 1 or 2? I can reach
>> out to folks internally and see if I can get some folks to commit to
>> helping out.
>>
>>
>>
>> To cover the other qs:
>>
>>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
>> Yarn for some of our Flink workloads when we can. Mesos is only used when we
>> need streaming capabilities in our WW DCs (as our Yarn is centralized in
>> one DC).
>>   *   We’re currently on Flink 1.9 (old planner). We have a plan to bump
>> to 1.11 / 1.12 this quarter.
>>   *   We typically upgrade once every 6 months to a year (not every
>> release). We’d like to speed up the cadence but we’re not there yet.
>>   *   We’d largely be good with keeping Flink on Mesos as-is and
>> functional while missing out on some of the newer features. We understand
>> the pain on the community’s side, and we can take on the work to port it
>> over if we see some fancy improvement in Flink on Yarn / K8s that we want
>> in Mesos.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> -- Piyush
>>
>>
>>
>>
>>
>> From: Xintong Song <to...@gmail.com>>
>> Date: Sunday, October 25, 2020 at 10:57 PM
>> To: dev <de...@flink.apache.org>>, user <
>> user@flink.apache.org<ma...@flink.apache.org>>
>> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
>> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
>> p.narang@criteo.com>>
>> Subject: Re: [SURVEY] Remove Mesos support
>>
>>
>>
>> Thanks for sharing the information with us, Piyush and Lasse.
>>
>>
>>
>> @Piyush
>>
>>
>>
>> Thanks for offering the help. IMO, there are currently several problems
>> that make supporting Flink on Mesos challenging for us.
>>
>>   1.  Lack of Mesos experts. AFAIK, there are very few people (if any)
>> among the active contributors in this community who are familiar
>> with Mesos and can help with development on this component.
>>   2.  Absence of tests. Mesos does not provide a testing cluster, like
>> `MiniYARNCluster`, making it hard to test interactions between Flink and
>> Mesos. We have only a few very simple e2e tests running on Mesos deployed
>> in Docker, covering the most fundamental workflows. We are not sure how
>> well those tests work, especially against some potential corner cases.
>>   3.  Divergence from other deployments. Because of 1 and 2, the new
>> efforts (features, maintenance, refactors) tend to exclude Mesos if
>> possible. When the new efforts have to touch the Mesos-related components
>> (e.g., changes to the common resource manager interfaces), we have to be
>> very careful and make as few changes as possible, to avoid accidentally
>> breaking anything that we are not familiar with. As a result, the component
>> diverges a lot from other deployment components (K8s/Yarn), which makes it
>> harder to maintain.
>>
>> It would be greatly appreciated if you can help with any of the above
>> issues.
>>
>>
>>
>> Additionally, I have a few questions concerning your use cases at Criteo.
>> IIUC, you are going to stay on Mesos for the foreseeable future, while
>> keeping the Flink version up-to-date? What Flink version are you currently
>> using? How often do you upgrade (e.g., every release)? Would you be good
>> with keeping the Flink on Mesos component as it is (meaning that deployment
>> and resource management improvements may not be ported to Mesos), while
>> keeping other components up-to-date (e.g., improvements from programming
>> APIs, operators, state backends, etc.)?
>>
>>
>>
>> Thank you~
>>
>> Xintong Song
>>
>>
>>
>>
>>
>> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
>> lassenedergaardflink@gmail.com<ma...@gmail.com>>
>> wrote:
>>
>> Hi
>>
>>
>>
>> At Trackunit we have been using Mesos for a long time but have now moved to
>> k8s.
>>
>> Med venlig hilsen / Best regards
>>
>> Lasse Nedergaard
>>
>>
>>
>>
>>
>> On 23 Oct 2020, at 17:01, Robert Metzger <rmetzger@apache.org
>> <ma...@apache.org>> wrote:
>>
>>
>> Hey Piyush,
>>
>> thanks a lot for raising this concern. I believe we should keep Mesos in
>> Flink then in the foreseeable future.
>>
>> Your offer to help is much appreciated. We'll let you know once there is
>> something.
>>
>>
>>
>> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
>> <ma...@criteo.com>> wrote:
>>
>> Thanks Kostas. If there's items we can help with, I'm sure we'd be able
>> to find folks who would be excited to contribute / help in any way.
>>
>> -- Piyush
>>
>>
>> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
>> kkloudas@gmail.com>> wrote:
>>
>>     Thanks Piyush for the message.
>>     After this, I revoke my +1. I agree with the previous opinions that we
>>     cannot drop code that is actively used by users, especially if it is
>>     something as deep in the stack as support for a cluster management
>>     framework.
>>
>>     Cheers,
>>     Kostas
>>
>>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.narang@criteo.com
>> <ma...@criteo.com>> wrote:
>>     >
>>     > Hi folks,
>>     >
>>     >
>>     >
>>     > We at Criteo are active users of the Flink on Mesos resource
>> management component. We are pretty heavy users of Mesos for scheduling
>> workloads on our edge datacenters and we do want to continue to be able to
>> run some of our Flink topologies (to compute machine learning short term
>> features) on those DCs. If possible, our vote would be not to drop Mesos
>> support, as that would tie us to an old release or force us to maintain a fork,
>> since we’re not planning to migrate off Mesos anytime soon. Is the burden
>> something that can be helped with by the community? (Or are you referring
>> to having to ensure PRs handle the Mesos piece as well when they touch the
>> resource managers?)
>>     >
>>     >
>>     >
>>     > Thanks,
>>     >
>>     >
>>     >
>>     > -- Piyush
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
>> trohrmann@apache.org>>
>>     > Date: Friday, October 23, 2020 at 8:19 AM
>>     > To: Xintong Song <tonysong820@gmail.com<mailto:
>> tonysong820@gmail.com>>
>>     > Cc: dev <de...@flink.apache.org>>, user <
>> user@flink.apache.org<ma...@flink.apache.org>>
>>     > Subject: Re: [SURVEY] Remove Mesos support
>>     >
>>     >
>>     >
>>     > Thanks for starting this survey Robert! I second Konstantin and
>> Xintong in the sense that our Mesos users' opinions should matter most
>> here. If our community is no longer using the Mesos integration, then I
>> would be +1 for removing it in order to decrease the maintenance burden.
>>     >
>>     >
>>     >
>>     > Cheers,
>>     >
>>     > Till
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <tonysong820@gmail.com
>> <ma...@gmail.com>> wrote:
>>     >
>>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>> support.
>>     >
>>     >
>>     >
>>     > With my developer hat on, removing the Mesos support would
>> definitely reduce the maintenance overhead for the deployment and resource
>> management related components. On the other hand, the Flink on Mesos users'
>> voices definitely matter a lot for this community. Either way, it would be
>> good to draw users' attention to this discussion early.
>>     >
>>     >
>>     >
>>     > Thank you~
>>     >
>>     > Xintong Song
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <knaufk@apache.org
>> <ma...@apache.org>> wrote:
>>     >
>>     > Hi Robert,
>>     >
>>     > +1 to the plan you outlined. If we were to drop support in Flink
>> 1.13+, we
>>     > would still support it in Flink 1.12- with bug fixes for some time
>> so that
>>     > users have time to move on.
>>     >
>>     > It would certainly be very interesting to hear from current Flink
>> on Mesos
>>     > users, on how they see the evolution of this part of the ecosystem.
>>     >
>>     > Best,
>>     >
>>     > Konstantin
>>
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Matthias Pohl <ma...@ververica.com>.
Hi everyone,
considering the upcoming release of Flink 1.13, I wanted to revive the
discussion about the Mesos support once more. Mesos is also already listed
as deprecated in Flink's overall roadmap [1]. Maybe it's time to align the
documentation accordingly to make it more explicit?

What do you think?

Best,
Matthias

[1] https://flink.apache.org/roadmap.html#feature-radar

On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann <tr...@apache.org> wrote:

> Hi Oleksandr,
>
> yes, you are right. The biggest problem at the moment is the lack of test
> coverage and thereby of confidence to make changes. We have some e2e tests
> which you can find here [1]. These tests are, however, quite coarse-grained
> and are missing a lot of cases. One idea would be to add a Mesos e2e test
> based on Flink's end-to-end test framework [2]. I think what needs to be
> done there is to add a Mesos resource and a way to submit jobs to a Mesos
> cluster to write e2e tests.
>
> [1] https://github.com/apache/flink/tree/master/flink-jepsen
> [2]
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>
> Cheers,
> Till
>
> On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
> o.nitavskyi@criteo.com> wrote:
>
>> Hello Xintong,
>>
>> Thanks for the insights and support.
>>
>> I browsed the Mesos backlog and didn't identify anything critical left
>> there.
>>
>> I see that there were quite a lot of contributions to Flink Mesos
>> in the recent version:
>> https://github.com/apache/flink/commits/master/flink-mesos.
>> We plan to validate the current Flink master (or release 1.12 branch) on our
>> Mesos setup. In case of any issues, we will try to propose changes.
>> My feeling is that our test results shouldn't affect the Flink 1.12
>> release cycle. If any resulting commits land in 1.12.1, that
>> should be totally fine.
>>
>> In the future, we would be glad to help you guys with any
>> maintenance-related questions. One of the highest priorities around this
>> component seems to be the development of the full e2e test.
>>
>> Kind Regards
>> Oleksandr Nitavskyi
>> ________________________________
>> From: Xintong Song <to...@gmail.com>
>> Sent: Tuesday, October 27, 2020 7:14 AM
>> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
>> Cc: Piyush Narang <p....@criteo.com>
>> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>>
>> Hi Piyush,
>>
>> Thanks a lot for sharing the information. It is a great relief that
>> you are good with Flink on Mesos as is.
>>
>> As for the Jira issues, I believe the most essential ones should have
>> already been resolved. You may find some remaining open issues here [1],
>> but not all of them are necessary if we decide to keep Flink on Mesos as is.
>>
>> At the moment and in the near future, I think help is mostly needed with
>> testing the upcoming release 1.12 with Mesos use cases. The community is
>> currently actively preparing the new release, and hopefully we can come
>> up with a release candidate early next month. It would be greatly
>> appreciated if you folks, as experienced Flink on Mesos users, could help with
>> verifying the release candidates.
>>
>>
>> Thank you~
>>
>> Xintong Song
>>
>> [1]
>> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>>
>> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com
>> <ma...@criteo.com>> wrote:
>>
>> Hi Xintong,
>>
>>
>>
>> Do you have any jiras that cover any of the items on 1 or 2? I can reach
>> out to folks internally and see if I can get some folks to commit to
>> helping out.
>>
>>
>>
>> To cover the other questions:
>>
>>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
>> Yarn for some of our Flink workloads when we can. Mesos is only used when we
>> need streaming capabilities in our WW DCs (as our Yarn is centralized in
>> one DC).
>>   *   We’re currently on Flink 1.9 (old planner). We have a plan to bump
>> to 1.11 / 1.12 this quarter.
>>   *   We typically upgrade once every 6 months to a year (not every
>> release). We’d like to speed up the cadence but we’re not there yet.
>>   *   We’d largely be good with keeping Flink on Mesos as-is and
>> functional while missing out on some of the newer features. We understand
>> the pain on the community's side, and if we see some fancy improvement in
>> Flink on Yarn / K8s that we want in Mesos, we can take on the work of
>> putting in the request to port it over.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> -- Piyush
>>
>>
>>
>>
>>
>> From: Xintong Song <to...@gmail.com>>
>> Date: Sunday, October 25, 2020 at 10:57 PM
>> To: dev <de...@flink.apache.org>>, user <
>> user@flink.apache.org<ma...@flink.apache.org>>
>> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
>> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
>> p.narang@criteo.com>>
>> Subject: Re: [SURVEY] Remove Mesos support
>>
>>
>>
>> Thanks for sharing the information with us, Piyush and Lasse.
>>
>>
>>
>> @Piyush
>>
>>
>>
>> Thanks for offering the help. IMO, there are currently several problems
>> that make supporting Flink on Mesos challenging for us.
>>
>>   1.  Lack of Mesos experts. AFAIK, there are very few people (if not
>> none) among the active contributors in this community that are familiar
>> with Mesos and can help with development on this component.
>>   2.  Absence of tests. Mesos does not provide a testing cluster, like
>> `MiniYARNCluster`, making it hard to test interactions between Flink and
>> Mesos. We have only a few very simple e2e tests running on Mesos deployed
>> in Docker, covering the most fundamental workflows. We are not sure how
>> well those tests work, especially against some potential corner cases.
>>   3.  Divergence from other deployments. Because of 1 and 2, the new
>> efforts (features, maintenance, refactors) tend to exclude Mesos if
>> possible. When the new efforts have to touch the Mesos related components
>> (e.g., changes to the common resource manager interfaces), we have to be
>> very careful and make as few changes as possible, to avoid accidentally
>> breaking anything that we are not familiar with. As a result, the component
>> diverges a lot from other deployment components (K8s/Yarn), which makes it
>> harder to maintain.
>>
>> It would be greatly appreciated if you can help with any of the above
>> issues.
>>
>>
>>
>> Additionally, I have a few questions concerning your use cases at Criteo.
>> IIUC, you are going to stay on Mesos in the foreseeable future, while
>> keeping the Flink version up-to-date? What Flink version are you currently
>> using? How often do you upgrade (e.g., every release)? Would you be good
>> with keeping the Flink on Mesos component as it is (meaning that deployment
>> and resource management improvements may not be ported to Mesos), while
>> keeping other components up-to-date (e.g., improvements from programming
>> APIs, operators, state backends, etc.)?
>>
>>
>>
>> Thank you~
>>
>> Xintong Song
>>
>>
>>
>>
>>
>> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
>> lassenedergaardflink@gmail.com<ma...@gmail.com>>
>> wrote:
>>
>> Hi
>>
>>
>>
>> At Trackunit we have been using Mesos for a long time but have now moved to
>> k8s.
>>
>> Med venlig hilsen / Best regards
>>
>> Lasse Nedergaard
>>
>>
>>
>>
>>
>> On 23 Oct 2020, at 17:01, Robert Metzger <rmetzger@apache.org
>> <ma...@apache.org>> wrote:
>>
>>
>> Hey Piyush,
>>
>> thanks a lot for raising this concern. I believe we should keep Mesos in
>> Flink then in the foreseeable future.
>>
>> Your offer to help is much appreciated. We'll let you know once there is
>> something.
>>
>>
>>
>> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com
>> <ma...@criteo.com>> wrote:
>>
>> Thanks Kostas. If there's items we can help with, I'm sure we'd be able
>> to find folks who would be excited to contribute / help in any way.
>>
>> -- Piyush
>>
>>
>> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
>> kkloudas@gmail.com>> wrote:
>>
>>     Thanks Piyush for the message.
>>     After this, I revoke my +1. I agree with the previous opinions that we
>>     cannot drop code that is actively used by users, especially if it is
>>     something as deep in the stack as support for a cluster management
>>     framework.
>>
>>     Cheers,
>>     Kostas
>>
>>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.narang@criteo.com
>> <ma...@criteo.com>> wrote:
>>     >
>>     > Hi folks,
>>     >
>>     >
>>     >
>>     > We at Criteo are active users of the Flink on Mesos resource
>> management component. We are pretty heavy users of Mesos for scheduling
>> workloads on our edge datacenters and we do want to continue to be able to
>> run some of our Flink topologies (to compute machine learning short term
>> features) on those DCs. If possible, our vote would be not to drop Mesos
>> support, as that would tie us to an old release or force us to maintain a fork,
>> since we’re not planning to migrate off Mesos anytime soon. Is the burden
>> something that can be helped with by the community? (Or are you referring
>> to having to ensure PRs handle the Mesos piece as well when they touch the
>> resource managers?)
>>     >
>>     >
>>     >
>>     > Thanks,
>>     >
>>     >
>>     >
>>     > -- Piyush
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
>> trohrmann@apache.org>>
>>     > Date: Friday, October 23, 2020 at 8:19 AM
>>     > To: Xintong Song <tonysong820@gmail.com<mailto:
>> tonysong820@gmail.com>>
>>     > Cc: dev <de...@flink.apache.org>>, user <
>> user@flink.apache.org<ma...@flink.apache.org>>
>>     > Subject: Re: [SURVEY] Remove Mesos support
>>     >
>>     >
>>     >
>>     > Thanks for starting this survey Robert! I second Konstantin and
>> Xintong in the sense that our Mesos users' opinions should matter most
>> here. If our community is no longer using the Mesos integration, then I
>> would be +1 for removing it in order to decrease the maintenance burden.
>>     >
>>     >
>>     >
>>     > Cheers,
>>     >
>>     > Till
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <tonysong820@gmail.com
>> <ma...@gmail.com>> wrote:
>>     >
>>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>> support.
>>     >
>>     >
>>     >
>>     > With my developer hat on, removing the Mesos support would
>> definitely reduce the maintenance overhead for the deployment and resource
>> management related components. On the other hand, the Flink on Mesos users'
>> voices definitely matter a lot for this community. Either way, it would be
>> good to draw users' attention to this discussion early.
>>     >
>>     >
>>     >
>>     > Thank you~
>>     >
>>     > Xintong Song
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <knaufk@apache.org
>> <ma...@apache.org>> wrote:
>>     >
>>     > Hi Robert,
>>     >
>>     > +1 to the plan you outlined. If we were to drop support in Flink
>> 1.13+, we
>>     > would still support it in Flink 1.12- with bug fixes for some time
>> so that
>>     > users have time to move on.
>>     >
>>     > It would certainly be very interesting to hear from current Flink
>> on Mesos
>>     > users, on how they see the evolution of this part of the ecosystem.
>>     >
>>     > Best,
>>     >
>>     > Konstantin
>>
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Till Rohrmann <tr...@apache.org>.
Hi Oleksandr,

yes, you are right. The biggest problem at the moment is the lack of test
coverage and thereby of confidence to make changes. We have some e2e tests
which you can find here [1]. These tests are, however, quite coarse-grained
and are missing a lot of cases. One idea would be to add a Mesos e2e test
based on Flink's end-to-end test framework [2]. I think what needs to be
done there is to add a Mesos resource and a way to submit jobs to a Mesos
cluster to write e2e tests.

[1] https://github.com/apache/flink/tree/master/flink-jepsen
[2]
https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common

Cheers,
Till

On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <o....@criteo.com>
wrote:

> Hello Xintong,
>
> Thanks for the insights and support.
>
> I browsed the Mesos backlog and didn't identify anything critical left
> there.
>
> I see that there were quite a lot of contributions to Flink Mesos
> in the recent version:
> https://github.com/apache/flink/commits/master/flink-mesos.
> We plan to validate the current Flink master (or release 1.12 branch) on our
> Mesos setup. In case of any issues, we will try to propose changes.
> My feeling is that our test results shouldn't affect the Flink 1.12
> release cycle. If any resulting commits land in 1.12.1, that
> should be totally fine.
>
> In the future, we would be glad to help you guys with any
> maintenance-related questions. One of the highest priorities around this
> component seems to be the development of the full e2e test.
>
> Kind Regards
> Oleksandr Nitavskyi
> ________________________________
> From: Xintong Song <to...@gmail.com>
> Sent: Tuesday, October 27, 2020 7:14 AM
> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
> Cc: Piyush Narang <p....@criteo.com>
> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>
> Hi Piyush,
>
> Thanks a lot for sharing the information. It is a great relief that
> you are good with Flink on Mesos as is.
>
> As for the Jira issues, I believe the most essential ones should have
> already been resolved. You may find some remaining open issues here [1],
> but not all of them are necessary if we decide to keep Flink on Mesos as is.
>
> At the moment and in the near future, I think help is mostly needed with
> testing the upcoming release 1.12 with Mesos use cases. The community is
> currently actively preparing the new release, and hopefully we can come
> up with a release candidate early next month. It would be greatly
> appreciated if you folks, as experienced Flink on Mesos users, could help with
> verifying the release candidates.
>
>
> Thank you~
>
> Xintong Song
>
> [1]
> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>
> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com<mailto:
> p.narang@criteo.com>> wrote:
>
> Hi Xintong,
>
>
>
> Do you have any jiras that cover any of the items on 1 or 2? I can reach
> out to folks internally and see if I can get some folks to commit to
> helping out.
>
>
>
> To cover the other questions:
>
>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
> Yarn for some of our Flink workloads when we can. Mesos is only used when we
> need streaming capabilities in our WW DCs (as our Yarn is centralized in
> one DC).
>   *   We’re currently on Flink 1.9 (old planner). We have a plan to bump
> to 1.11 / 1.12 this quarter.
>   *   We typically upgrade once every 6 months to a year (not every
> release). We’d like to speed up the cadence but we’re not there yet.
>   *   We’d largely be good with keeping Flink on Mesos as-is and
> functional while missing out on some of the newer features. We understand
> the pain on the community's side, and if we see some fancy improvement in
> Flink on Yarn / K8s that we want in Mesos, we can take on the work of
> putting in the request to port it over.
>
>
>
> Thanks,
>
>
>
> -- Piyush
>
>
>
>
>
> From: Xintong Song <to...@gmail.com>>
> Date: Sunday, October 25, 2020 at 10:57 PM
> To: dev <de...@flink.apache.org>>, user <
> user@flink.apache.org<ma...@flink.apache.org>>
> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com<mailto:
> lassenedergaardflink@gmail.com>>, <p.narang@criteo.com<mailto:
> p.narang@criteo.com>>
> Subject: Re: [SURVEY] Remove Mesos support
>
>
>
> Thanks for sharing the information with us, Piyush and Lasse.
>
>
>
> @Piyush
>
>
>
> Thanks for offering the help. IMO, there are currently several problems
> that make supporting Flink on Mesos challenging for us.
>
>   1.  Lack of Mesos experts. AFAIK, there are very few people (if not
> none) among the active contributors in this community that are familiar
> with Mesos and can help with development on this component.
>   2.  Absence of tests. Mesos does not provide a testing cluster, like
> `MiniYARNCluster`, making it hard to test interactions between Flink and
> Mesos. We have only a few very simple e2e tests running on Mesos deployed
> in Docker, covering the most fundamental workflows. We are not sure how
> well those tests work, especially against some potential corner cases.
>   3.  Divergence from other deployments. Because of 1 and 2, the new
> efforts (features, maintenance, refactors) tend to exclude Mesos if
> possible. When the new efforts have to touch the Mesos related components
> (e.g., changes to the common resource manager interfaces), we have to be
> very careful and make as few changes as possible, to avoid accidentally
> breaking anything that we are not familiar with. As a result, the component
> diverges a lot from other deployment components (K8s/Yarn), which makes it
> harder to maintain.
>
> It would be greatly appreciated if you can help with any of the above
> issues.
>
>
>
> Additionally, I have a few questions concerning your use cases at Criteo.
> IIUC, you are going to stay on Mesos in the foreseeable future, while
> keeping the Flink version up-to-date? What Flink version are you currently
> using? How often do you upgrade (e.g., every release)? Would you be good
> with keeping the Flink on Mesos component as it is (meaning that deployment
> and resource management improvements may not be ported to Mesos), while
> keeping other components up-to-date (e.g., improvements from programming
> APIs, operators, state backends, etc.)?
>
>
>
> Thank you~
>
> Xintong Song
>
>
>
>
>
> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
> lassenedergaardflink@gmail.com<ma...@gmail.com>>
> wrote:
>
> Hi
>
>
>
> At Trackunit we have been using Mesos for a long time but have now moved to
> k8s.
>
> Med venlig hilsen / Best regards
>
> Lasse Nedergaard
>
>
>
>
>
> On 23 Oct 2020, at 17:01, Robert Metzger <rmetzger@apache.org
> <ma...@apache.org>> wrote:
>
>
> Hey Piyush,
>
> thanks a lot for raising this concern. I believe we should keep Mesos in
> Flink then in the foreseeable future.
>
> Your offer to help is much appreciated. We'll let you know once there is
> something.
>
>
>
> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com<mailto:
> p.narang@criteo.com>> wrote:
>
> Thanks Kostas. If there's items we can help with, I'm sure we'd be able to
> find folks who would be excited to contribute / help in any way.
>
> -- Piyush
>
>
> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com<mailto:
> kkloudas@gmail.com>> wrote:
>
>     Thanks Piyush for the message.
>     After this, I revoke my +1. I agree with the previous opinions that we
>     cannot drop code that is actively used by users, especially if it is
>     something as deep in the stack as support for a cluster management
>     framework.
>
>     Cheers,
>     Kostas
>
>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.narang@criteo.com
> <ma...@criteo.com>> wrote:
>     >
>     > Hi folks,
>     >
>     >
>     >
>     > We at Criteo are active users of the Flink on Mesos resource
> management component. We are pretty heavy users of Mesos for scheduling
> workloads on our edge datacenters and we do want to continue to be able to
> run some of our Flink topologies (to compute machine learning short term
> features) on those DCs. If possible, our vote would be not to drop Mesos
> support, as that would tie us to an old release or force us to maintain a fork,
> since we’re not planning to migrate off Mesos anytime soon. Is the burden
> something that can be helped with by the community? (Or are you referring
> to having to ensure PRs handle the Mesos piece as well when they touch the
> resource managers?)
>     >
>     >
>     >
>     > Thanks,
>     >
>     >
>     >
>     > -- Piyush
>     >
>     >
>     >
>     >
>     >
>     > From: Till Rohrmann <trohrmann@apache.org<mailto:
> trohrmann@apache.org>>
>     > Date: Friday, October 23, 2020 at 8:19 AM
>     > To: Xintong Song <tonysong820@gmail.com<mailto:tonysong820@gmail.com
> >>
>     > Cc: dev <de...@flink.apache.org>>, user <
> user@flink.apache.org<ma...@flink.apache.org>>
>     > Subject: Re: [SURVEY] Remove Mesos support
>     >
>     >
>     >
>     > Thanks for starting this survey Robert! I second Konstantin and
> Xintong in the sense that our Mesos users' opinions should matter most
> here. If our community is no longer using the Mesos integration, then I
> would be +1 for removing it in order to decrease the maintenance burden.
>     >
>     >
>     >
>     > Cheers,
>     >
>     > Till
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <tonysong820@gmail.com
> <ma...@gmail.com>> wrote:
>     >
>     > +1 for adding a warning in 1.12 about planning to remove Mesos
> support.
>     >
>     >
>     >
>     > With my developer hat on, removing the Mesos support would
> definitely reduce the maintenance overhead for the deployment and resource
> management related components. On the other hand, the Flink on Mesos users'
> voices definitely matter a lot for this community. Either way, it would be
> good to draw users' attention to this discussion early.
>     >
>     >
>     >
>     > Thank you~
>     >
>     > Xintong Song
>     >
>     >
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <knaufk@apache.org
> <ma...@apache.org>> wrote:
>     >
>     > Hi Robert,
>     >
>     > +1 to the plan you outlined. If we were to drop support in Flink
> 1.13+, we
>     > would still support it in Flink 1.12- with bug fixes for some time
> so that
>     > users have time to move on.
>     >
>     > It would certainly be very interesting to hear from current Flink on
> Mesos
>     > users, on how they see the evolution of this part of the ecosystem.
>     >
>     > Best,
>     >
>     > Konstantin
>
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Till Rohrmann <tr...@apache.org>.
Hi Oleksandr,

yes, you are right. The biggest problem at the moment is the lack of test
coverage and thereby of confidence to make changes. We have some e2e tests
which you can find here [1]. These tests are, however, quite coarse-grained
and are missing a lot of cases. One idea would be to add a Mesos e2e test
based on Flink's end-to-end test framework [2]. I think what needs to be
done there is to add a Mesos resource and a way to submit jobs to a Mesos
cluster to write e2e tests.

[1] https://github.com/apache/flink/tree/master/flink-jepsen
[2]
https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common

Cheers,
Till

On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <o....@criteo.com>
wrote:

> Hello Xintong,
>
> Thanks for the insights and support.
>
> I browsed the Mesos backlog and didn't identify anything critical left
> there.
>
> I see that there were quite a lot of contributions to Flink Mesos
> in the recent version:
> https://github.com/apache/flink/commits/master/flink-mesos.
> We plan to validate the current Flink master (or release 1.12 branch) on our
> Mesos setup. In case of any issues, we will try to propose changes.
> My feeling is that our test results shouldn't affect the Flink 1.12
> release cycle. If any resulting commits land in 1.12.1, that
> should be totally fine.
>
> In the future, we would be glad to help you guys with any
> maintenance-related questions. One of the highest priorities around this
> component seems to be the development of the full e2e test.
>
> Kind Regards
> Oleksandr Nitavskyi
> ________________________________
> From: Xintong Song <to...@gmail.com>
> Sent: Tuesday, October 27, 2020 7:14 AM
> To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
> Cc: Piyush Narang <p....@criteo.com>
> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>
> Hi Piyush,
>
> Thanks a lot for sharing the information. It is a great relief that
> you are good with Flink on Mesos as is.
>
> As for the Jira issues, I believe the most essential ones should have
> already been resolved. You may find some remaining open issues here [1],
> but not all of them are necessary if we decide to keep Flink on Mesos as is.
>
> At the moment and in the near future, I think help is mostly needed with
> testing the upcoming release 1.12 with Mesos use cases. The community is
> currently actively preparing the new release, and hopefully we can come
> up with a release candidate early next month. It would be greatly
> appreciated if you folks, as experienced Flink on Mesos users, could help with
> verifying the release candidates.
>
>
> Thank you~
>
> Xintong Song
>
> [1]
> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>
> On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p.narang@criteo.com<mailto:
> p.narang@criteo.com>> wrote:
>
> Hi Xintong,
>
>
>
> Do you have any jiras that cover any of the items on 1 or 2? I can reach
> out to folks internally and see if I can get some folks to commit to
> helping out.
>
>
>
> To cover the other qs:
>
>   *   Yes, we’ve not got a plan at the moment to get off Mesos. We use
> Yarn for some our Flink workloads when we can. Mesos is only used when we
> need streaming capabilities in our WW dcs (as our Yarn is centralized in
> one DC)
>   *   We’re currently on Flink 1.9 (old planner). We have a plan to bump
> to 1.11 / 1.12 this quarter.
>   *   We typically upgrade once every 6 months to a year (not every
> release). We’d like to speed up the cadence but we’re not there yet.
>   *   We’d largely be good with keeping Flink on Mesos as-is and
> functional while missing out on some of the newer features. We understand
> the pain on the community's side, and if we see some fancy improvement in
> Flink on Yarn / K8s that we want in Mesos, we can take on the work to port
> it over.
>
>
>
> Thanks,
>
>
>
> -- Piyush
>
>
>
>
>
> From: Xintong Song <to...@gmail.com>
> Date: Sunday, October 25, 2020 at 10:57 PM
> To: dev <de...@flink.apache.org>, user <user@flink.apache.org>
> Cc: Lasse Nedergaard <lassenedergaardflink@gmail.com>, <p.narang@criteo.com>
> Subject: Re: [SURVEY] Remove Mesos support
>
>
>
> Thanks for sharing the information with us, Piyush and Lasse.
>
>
>
> @Piyush
>
>
>
> Thanks for offering the help. IMO, there are currently several problems
> that make supporting Flink on Mesos challenging for us.
>
>   1.  Lack of Mesos experts. AFAIK, there are very few people (if not
> none) among the active contributors in this community that are familiar
> with Mesos and can help with development on this component.
>   2.  Absence of tests. Mesos does not provide a testing cluster, like
> `MiniYARNCluster`, making it hard to test interactions between Flink and
> Mesos. We have only a few very simple e2e tests running on Mesos deployed
> in a docker, covering the most fundamental workflows. We are not sure how
> well those tests work, especially against some potential corner cases.
>   3.  Divergence from other deployment. Because of 1 and 2, the new
> efforts (features, maintenance, refactors) tend to exclude Mesos if
> possible. When the new efforts have to touch the Mesos related components
> (e.g., changes to the common resource manager interfaces), we have to be
> very careful and make as few changes as possible, to avoid accidentally
> breaking anything that we are not familiar with. As a result, the component
> diverges a lot from other deployment components (K8s/Yarn), which makes it
> harder to maintain.
>
> It would be greatly appreciated if you can help with either of the above
> issues.
>
>
>
> Additionally, I have a few questions concerning your use cases at Criteo.
> IIUC, you are going to stay on Mesos in the foreseeable future, while
> keeping the Flink version up-to-date? What Flink version are you currently
> using? How often do you upgrade (e.g., every release)? Would you be good
> with keeping the Flink on Mesos component as it is (meaning that deployment
> and resource management improvements may not be ported to Mesos), while
> keeping other components up-to-date (e.g., improvements from programming
> APIs, operators, state backends, etc.)?
>
>
>
> Thank you~
>
> Xintong Song
>
>
>
>
>
> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <lassenedergaardflink@gmail.com> wrote:
>
> Hi
>
>
>
> At Trackunit we have been using Mesos for a long time but have now moved to
> k8s.
>
> Med venlig hilsen / Best regards
>
> Lasse Nedergaard
>
>
>
>
>
> On 23 Oct 2020 at 17:01, Robert Metzger <rmetzger@apache.org> wrote:
>
>
>
> Hey Piyush,
>
> thanks a lot for raising this concern. I believe we should then keep Mesos
> in Flink for the foreseeable future.
>
> Your offer to help is much appreciated. We'll let you know once there is
> something.
>
>
>
> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p.narang@criteo.com> wrote:
>
> Thanks Kostas. If there are items we can help with, I'm sure we'd be able to
> find folks who would be excited to contribute / help in any way.
>
> -- Piyush
>
>
> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kkloudas@gmail.com> wrote:
>
>     Thanks Piyush for the message.
>     After this, I revoke my +1. I agree with the previous opinions that we
>     cannot drop code that is actively used by users, especially if it is
>     something as deep in the stack as support for a cluster management
>     framework.
>
>     Cheers,
>     Kostas
>
>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p.narang@criteo.com> wrote:
>     >
>     > Hi folks,
>     >
>     >
>     >
>     > We at Criteo are active users of the Flink on Mesos resource
> management component. We are pretty heavy users of Mesos for scheduling
> workloads on our edge datacenters and we do want to continue to be able to
> run some of our Flink topologies (to compute machine learning short term
> features) on those DCs. If possible our vote would be not to drop Mesos
> support as that will tie us to an old release / have to maintain a fork as
> we’re not planning to migrate off Mesos anytime soon. Is the burden
> something that can be helped with by the community? (Or are you referring
> to having to ensure PRs handle the Mesos piece as well when they touch the
> resource managers?)
>     >
>     >
>     >
>     > Thanks,
>     >
>     >
>     >
>     > -- Piyush
>     >
>     >
>     >
>     >
>     >
>     > From: Till Rohrmann <trohrmann@apache.org>
>     > Date: Friday, October 23, 2020 at 8:19 AM
>     > To: Xintong Song <tonysong820@gmail.com>
>     > Cc: dev <de...@flink.apache.org>, user <user@flink.apache.org>
>     > Subject: Re: [SURVEY] Remove Mesos support
>     >
>     >
>     >
>     > Thanks for starting this survey Robert! I second Konstantin and
> Xintong in the sense that our Mesos users' opinions should matter most
> here. If our community is no longer using the Mesos integration, then I
> would be +1 for removing it in order to decrease the maintenance burden.
>     >
>     >
>     >
>     > Cheers,
>     >
>     > Till
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <tonysong820@gmail.com> wrote:
>     >
>     > +1 for adding a warning in 1.12 about planning to remove Mesos
> support.
>     >
>     >
>     >
>     > With my developer hat on, removing the Mesos support would
> definitely reduce the maintenance overhead for the deployment and resource
> management related components. On the other hand, the Flink on Mesos users'
> voices definitely matter a lot for this community. Either way, it would be
> good to draw users' attention to this discussion early.
>     >
>     >
>     >
>     > Thank you~
>     >
>     > Xintong Song
>     >
>     >
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <knaufk@apache.org> wrote:
>     >
>     > Hi Robert,
>     >
>     > +1 to the plan you outlined. If we were to drop support in Flink
> 1.13+, we
>     > would still support it in Flink 1.12- with bug fixes for some time
> so that
>     > users have time to move on.
>     >
>     > It would certainly be very interesting to hear from current Flink on
> Mesos
>     > users, on how they see the evolution of this part of the ecosystem.
>     >
>     > Best,
>     >
>     > Konstantin
>
>

Re: [BULK]Re: [SURVEY] Remove Mesos support

Posted by Oleksandr Nitavskyi <o....@criteo.com>.
Hello Xintong,

Thanks for the insights and support.

I browsed the Mesos backlog and didn't identify anything critical left there.

I see that there were quite a lot of contributions to Flink Mesos in the recent version: https://github.com/apache/flink/commits/master/flink-mesos.
We plan to validate the current Flink master (or the release 1.12 branch) on our Mesos setup. In case of any issues, we will try to propose changes.
My feeling is that our test results shouldn't affect the Flink 1.12 release cycle. If any resulting commits land in 1.12.1, that should be totally fine.

In the future, we would be glad to help you guys with any maintenance-related questions. One of the highest priorities around this component seems to be the development of the full e2e test.

Kind Regards
Oleksandr Nitavskyi
________________________________
From: Xintong Song <to...@gmail.com>
Sent: Tuesday, October 27, 2020 7:14 AM
To: dev <de...@flink.apache.org>; user <us...@flink.apache.org>
Cc: Piyush Narang <p....@criteo.com>
Subject: [BULK]Re: [SURVEY] Remove Mesos support

Hi Piyush,

Thanks a lot for sharing the information. It is a great relief that you are good with Flink on Mesos as is.

As for the jira issues, I believe the most essential ones should have already been resolved. You may find some remaining open issues here [1], but not all of them are necessary if we decide to keep Flink on Mesos as is.

At the moment and in the near future, I think help is mostly needed with testing the upcoming release 1.12 with Mesos use cases. The community is currently actively preparing the new release, and hopefully we could come up with a release candidate early next month. It would be greatly appreciated if you folks, as experienced Flink on Mesos users, can help with verifying the release candidates.


Thank you~

Xintong Song

[1] https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open

On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p....@criteo.com> wrote:

Hi Xintong,



Do you have any jiras that cover any of the items on 1 or 2? I can reach out to folks internally and see if I can get some folks to commit to helping out.



To cover the other qs:

  *   Yes, we’ve not got a plan at the moment to get off Mesos. We use Yarn for some of our Flink workloads when we can. Mesos is only used when we need streaming capabilities in our WW dcs (as our Yarn is centralized in one DC)
  *   We’re currently on Flink 1.9 (old planner). We have a plan to bump to 1.11 / 1.12 this quarter.
  *   We typically upgrade once every 6 months to a year (not every release). We’d like to speed up the cadence but we’re not there yet.
  *   We’d largely be good with keeping Flink on Mesos as-is and functional while missing out on some of the newer features. We understand the pain on the community's side, and if we see some fancy improvement in Flink on Yarn / K8s that we want in Mesos, we can take on the work to port it over.



Thanks,



-- Piyush





From: Xintong Song <to...@gmail.com>
Date: Sunday, October 25, 2020 at 10:57 PM
To: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
Cc: Lasse Nedergaard <la...@gmail.com>, <p....@criteo.com>
Subject: Re: [SURVEY] Remove Mesos support



Thanks for sharing the information with us, Piyush and Lasse.



@Piyush



Thanks for offering the help. IMO, there are currently several problems that make supporting Flink on Mesos challenging for us.

  1.  Lack of Mesos experts. AFAIK, there are very few people (if not none) among the active contributors in this community that are familiar with Mesos and can help with development on this component.
  2.  Absence of tests. Mesos does not provide a testing cluster, like `MiniYARNCluster`, making it hard to test interactions between Flink and Mesos. We have only a few very simple e2e tests running on Mesos deployed in a docker, covering the most fundamental workflows. We are not sure how well those tests work, especially against some potential corner cases.
  3.  Divergence from other deployment. Because of 1 and 2, the new efforts (features, maintenance, refactors) tend to exclude Mesos if possible. When the new efforts have to touch the Mesos related components (e.g., changes to the common resource manager interfaces), we have to be very careful and make as few changes as possible, to avoid accidentally breaking anything that we are not familiar with. As a result, the component diverges a lot from other deployment components (K8s/Yarn), which makes it harder to maintain.

It would be greatly appreciated if you can help with either of the above issues.



Additionally, I have a few questions concerning your use cases at Criteo. IIUC, you are going to stay on Mesos in the foreseeable future, while keeping the Flink version up-to-date? What Flink version are you currently using? How often do you upgrade (e.g., every release)? Would you be good with keeping the Flink on Mesos component as it is (meaning that deployment and resource management improvements may not be ported to Mesos), while keeping other components up-to-date (e.g., improvements from programming APIs, operators, state backends, etc.)?



Thank you~

Xintong Song





On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <la...@gmail.com> wrote:

Hi



At Trackunit we have been using Mesos for a long time but have now moved to k8s.

Med venlig hilsen / Best regards

Lasse Nedergaard





On 23 Oct 2020 at 17:01, Robert Metzger <rm...@apache.org> wrote:



Hey Piyush,

thanks a lot for raising this concern. I believe we should then keep Mesos in Flink for the foreseeable future.

Your offer to help is much appreciated. We'll let you know once there is something.



On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p....@criteo.com> wrote:

Thanks Kostas. If there are items we can help with, I'm sure we'd be able to find folks who would be excited to contribute / help in any way.

-- Piyush


On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:

    Thanks Piyush for the message.
    After this, I revoke my +1. I agree with the previous opinions that we
    cannot drop code that is actively used by users, especially if it is
    something as deep in the stack as support for a cluster management
    framework.

    Cheers,
    Kostas

    On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com> wrote:
    >
    > Hi folks,
    >
    >
    >
    > We at Criteo are active users of the Flink on Mesos resource management component. We are pretty heavy users of Mesos for scheduling workloads on our edge datacenters and we do want to continue to be able to run some of our Flink topologies (to compute machine learning short term features) on those DCs. If possible our vote would be not to drop Mesos support as that will tie us to an old release / have to maintain a fork as we’re not planning to migrate off Mesos anytime soon. Is the burden something that can be helped with by the community? (Or are you referring to having to ensure PRs handle the Mesos piece as well when they touch the resource managers?)
    >
    >
    >
    > Thanks,
    >
    >
    >
    > -- Piyush
    >
    >
    >
    >
    >
    > From: Till Rohrmann <tr...@apache.org>
    > Date: Friday, October 23, 2020 at 8:19 AM
    > To: Xintong Song <to...@gmail.com>
    > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
    > Subject: Re: [SURVEY] Remove Mesos support
    >
    >
    >
    > Thanks for starting this survey Robert! I second Konstantin and Xintong in the sense that our Mesos users' opinions should matter most here. If our community is no longer using the Mesos integration, then I would be +1 for removing it in order to decrease the maintenance burden.
    >
    >
    >
    > Cheers,
    >
    > Till
    >
    >
    >
    > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:
    >
    > +1 for adding a warning in 1.12 about planning to remove Mesos support.
    >
    >
    >
    > With my developer hat on, removing the Mesos support would definitely reduce the maintenance overhead for the deployment and resource management related components. On the other hand, the Flink on Mesos users' voices definitely matter a lot for this community. Either way, it would be good to draw users' attention to this discussion early.
    >
    >
    >
    > Thank you~
    >
    > Xintong Song
    >
    >
    >
    >
    >
    > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:
    >
    > Hi Robert,
    >
    > +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
    > would still support it in Flink 1.12- with bug fixes for some time so that
    > users have time to move on.
    >
    > It would certainly be very interesting to hear from current Flink on Mesos
    > users, on how they see the evolution of this part of the ecosystem.
    >
    > Best,
    >
    > Konstantin


Re: [SURVEY] Remove Mesos support

Posted by Xintong Song <to...@gmail.com>.
Hi Piyush,

Thanks a lot for sharing the information. It is a great relief that
you are good with Flink on Mesos as is.

As for the jira issues, I believe the most essential ones should have
already been resolved. You may find some remaining open issues here [1],
but not all of them are necessary if we decide to keep Flink on Mesos as is.

At the moment and in the near future, I think help is mostly needed with
testing the upcoming release 1.12 with Mesos use cases. The community is
currently actively preparing the new release, and hopefully we could come
up with a release candidate early next month. It would be greatly
appreciated if you folks, as experienced Flink on Mesos users, can help with
verifying the release candidates.

Thank you~

Xintong Song


[1]
https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open

On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p....@criteo.com> wrote:

> Hi Xintong,
>
>
>
> Do you have any jiras that cover any of the items on 1 or 2? I can reach
> out to folks internally and see if I can get some folks to commit to
> helping out.
>
>
>
> To cover the other qs:
>
>    - Yes, we’ve not got a plan at the moment to get off Mesos. We use
>    Yarn for some of our Flink workloads when we can. Mesos is only used when we
>    need streaming capabilities in our WW dcs (as our Yarn is centralized in
>    one DC)
>    - We’re currently on Flink 1.9 (old planner). We have a plan to bump
>    to 1.11 / 1.12 this quarter.
>    - We typically upgrade once every 6 months to a year (not every
>    release). We’d like to speed up the cadence but we’re not there yet.
>    - We’d largely be good with keeping Flink on Mesos as-is and
>    functional while missing out on some of the newer features. We understand
>    the pain on the community's side, and if we see some fancy improvement in
>    Flink on Yarn / K8s that we want in Mesos, we can take on the work to port
>    it over.
>
>
>
> Thanks,
>
>
>
> -- Piyush
>
>
>
>
>
> *From: *Xintong Song <to...@gmail.com>
> *Date: *Sunday, October 25, 2020 at 10:57 PM
> *To: *dev <de...@flink.apache.org>, user <us...@flink.apache.org>
> *Cc: *Lasse Nedergaard <la...@gmail.com>, <
> p.narang@criteo.com>
> *Subject: *Re: [SURVEY] Remove Mesos support
>
>
>
> Thanks for sharing the information with us, Piyush and Lasse.
>
>
>
> @Piyush
>
>
>
> Thanks for offering the help. IMO, there are currently several problems
> that make supporting Flink on Mesos challenging for us.
>
>    1. *Lack of Mesos experts.* AFAIK, there are very few people (if not
>    none) among the active contributors in this community that are
>    familiar with Mesos and can help with development on this component.
>    2. *Absence of tests.* Mesos does not provide a testing cluster, like
>    `MiniYARNCluster`, making it hard to test interactions between Flink and
>    Mesos. We have only a few very simple e2e tests running on Mesos deployed
>    in a docker, covering the most fundamental workflows. We are not sure how
>    well those tests work, especially against some potential corner cases.
>    3. *Divergence from other deployment.* Because of 1 and 2, the new
>    efforts (features, maintenance, refactors) tend to exclude Mesos if
>    possible. When the new efforts have to touch the Mesos related components
>    (e.g., changes to the common resource manager interfaces), we have to be
>    very careful and make as few changes as possible, to avoid accidentally
>    breaking anything that we are not familiar with. As a result, the component
>    diverges a lot from other deployment components (K8s/Yarn), which makes it
>    harder to maintain.
>
> It would be greatly appreciated if you can help with either of the above
> issues.
>
>
>
> Additionally, I have a few questions concerning your use cases at Criteo.
> IIUC, you are going to stay on Mesos in the foreseeable future, while
> keeping the Flink version up-to-date? What Flink version are you currently
> using? How often do you upgrade (e.g., every release)? Would you be good
> with keeping the Flink on Mesos component as it is (meaning that deployment
> and resource management improvements may not be ported to Mesos), while
> keeping other components up-to-date (e.g., improvements from programming
> APIs, operators, state backends, etc.)?
>
>
>
> Thank you~
>
> Xintong Song
>
>
>
>
>
> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
> lassenedergaardflink@gmail.com> wrote:
>
> Hi
>
>
>
> At Trackunit we have been using Mesos for a long time but have now moved to
> k8s.
>
> Med venlig hilsen / Best regards
>
> Lasse Nedergaard
>
>
>
>
>
> On 23 Oct 2020 at 17:01, Robert Metzger <rm...@apache.org> wrote:
>
>
>
> Hey Piyush,
>
> thanks a lot for raising this concern. I believe we should then keep Mesos
> in Flink for the foreseeable future.
>
> Your offer to help is much appreciated. We'll let you know once there is
> something.
>
>
>
> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p....@criteo.com> wrote:
>
> Thanks Kostas. If there are items we can help with, I'm sure we'd be able to
> find folks who would be excited to contribute / help in any way.
>
> -- Piyush
>
>
> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:
>
>     Thanks Piyush for the message.
>     After this, I revoke my +1. I agree with the previous opinions that we
>     cannot drop code that is actively used by users, especially if it is
>     something as deep in the stack as support for a cluster management
>     framework.
>
>     Cheers,
>     Kostas
>
>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com>
> wrote:
>     >
>     > Hi folks,
>     >
>     >
>     >
>     > We at Criteo are active users of the Flink on Mesos resource
> management component. We are pretty heavy users of Mesos for scheduling
> workloads on our edge datacenters and we do want to continue to be able to
> run some of our Flink topologies (to compute machine learning short term
> features) on those DCs. If possible our vote would be not to drop Mesos
> support as that will tie us to an old release / have to maintain a fork as
> we’re not planning to migrate off Mesos anytime soon. Is the burden
> something that can be helped with by the community? (Or are you referring
> to having to ensure PRs handle the Mesos piece as well when they touch the
> resource managers?)
>     >
>     >
>     >
>     > Thanks,
>     >
>     >
>     >
>     > -- Piyush
>     >
>     >
>     >
>     >
>     >
>     > From: Till Rohrmann <tr...@apache.org>
>     > Date: Friday, October 23, 2020 at 8:19 AM
>     > To: Xintong Song <to...@gmail.com>
>     > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
>     > Subject: Re: [SURVEY] Remove Mesos support
>     >
>     >
>     >
>     > Thanks for starting this survey Robert! I second Konstantin and
> Xintong in the sense that our Mesos users' opinions should matter most
> here. If our community is no longer using the Mesos integration, then I
> would be +1 for removing it in order to decrease the maintenance burden.
>     >
>     >
>     >
>     > Cheers,
>     >
>     > Till
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com>
> wrote:
>     >
>     > +1 for adding a warning in 1.12 about planning to remove Mesos
> support.
>     >
>     >
>     >
>     > With my developer hat on, removing the Mesos support would
> definitely reduce the maintaining overhead for the deployment and resource
> management related components. On the other hand, the Flink on Mesos users'
> voices definitely matter a lot for this community. Either way, it would be
> good to draw users' attention to this discussion early.
>     >
>     >
>     >
>     > Thank you~
>     >
>     > Xintong Song
>     >
>     >
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org>
> wrote:
>     >
>     > Hi Robert,
>     >
>     > +1 to the plan you outlined. If we were to drop support in Flink
> 1.13+, we
>     > would still support it in Flink 1.12- with bug fixes for some time
> so that
>     > users have time to move on.
>     >
>     > It would certainly be very interesting to hear from current Flink on
> Mesos
>     > users, on how they see the evolution of this part of the ecosystem.
>     >
>     > Best,
>     >
>     > Konstantin
>
>

Re: [SURVEY] Remove Mesos support

Posted by Xintong Song <to...@gmail.com>.
Hi Piyush,

Thanks a lot for sharing the information. It is a great relief that
you are good with keeping Flink on Mesos as is.

As for the jira issues, I believe the most essential ones should have
already been resolved. You may find some remaining open issues here [1],
but not all of them are necessary if we decide to keep Flink on Mesos as is.

At the moment and in the near future, I think help is mostly needed with
testing the upcoming 1.12 release against Mesos use cases. The community is
actively preparing the new release, and hopefully we can come up with a
release candidate early next month. It would be greatly appreciated if you
folks, as experienced Flink on Mesos users, could help with verifying the
release candidates.

Thank you~

Xintong Song


[1]
https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open

On Tue, Oct 27, 2020 at 2:58 AM Piyush Narang <p....@criteo.com> wrote:

> Hi Xintong,
>
>
>
> Do you have any jiras that cover any of the items on 1 or 2? I can reach
> out to folks internally and see if I can get some folks to commit to
> helping out.
>
>
>
> To cover the other qs:
>
>    - Yes, we’ve not got a plan at the moment to get off Mesos. We use
>    Yarn for some of our Flink workloads when we can. Mesos is only used when we
>    need streaming capabilities in our WW dcs (as our Yarn is centralized in
>    one DC)
>    - We’re currently on Flink 1.9 (old planner). We have a plan to bump
>    to 1.11 / 1.12 this quarter.
>    - We typically upgrade once every 6 months to a year (not every
>    release). We’d like to speed up the cadence but we’re not there yet.
>    - We’d largely be good with keeping Flink on Mesos as-is and
>    functional while missing out on some of the newer features. We understand
>    the pain on the community's side and we can take on the work if we see some
>    fancy improvement in Flink on Yarn / K8s that we want in Mesos to put in
>    the request to port it over.
>
>
>
> Thanks,
>
>
>
> -- Piyush
>
>
>
>
>
> *From: *Xintong Song <to...@gmail.com>
> *Date: *Sunday, October 25, 2020 at 10:57 PM
> *To: *dev <de...@flink.apache.org>, user <us...@flink.apache.org>
> *Cc: *Lasse Nedergaard <la...@gmail.com>, <
> p.narang@criteo.com>
> *Subject: *Re: [SURVEY] Remove Mesos support
>
>
>
> Thanks for sharing the information with us, Piyush and Lasse.
>
>
>
> @Piyush
>
>
>
> Thanks for offering the help. IMO, there are currently several problems
> that make supporting Flink on Mesos challenging for us.
>
>    1. *Lack of Mesos experts.* AFAIK, there are very few people (if not
>    none) among the active contributors in this community that are
>    familiar with Mesos and can help with development on this component.
>    2. *Absence of tests.* Mesos does not provide a testing cluster, like
>    `MiniYARNCluster`, making it hard to test interactions between Flink and
>    Mesos. We have only a few very simple e2e tests running on Mesos deployed
>    in a docker, covering the most fundamental workflows. We are not sure how
>    well those tests work, especially against some potential corner cases.
>    3. *Divergence from other deployment.* Because of 1 and 2, the new
>    efforts (features, maintenance, refactors) tend to exclude Mesos if
>    possible. When the new efforts have to touch the Mesos related components
>    (e.g., changes to the common resource manager interfaces), we have to be
>    very careful and make as few changes as possible, to avoid accidentally
>    breaking anything that we are not familiar with. As a result, the component
>    diverges a lot from other deployment components (K8s/Yarn), which makes it
>    harder to maintain.
>
> It would be greatly appreciated if you can help with either of the above
> issues.
>
>
>
> Additionally, I have a few questions concerning your use cases at Criteo.
> IIUC, you are going to stay on Mesos in the foreseeable future, while
> keeping the Flink version up-to-date? What Flink version are you currently
> using? How often do you upgrade (e.g., every release)? Would you be good
> with keeping the Flink on Mesos component as it is (means that deployment
> and resource management improvements may not be ported to Mesos), while
> keeping other components up-to-date (e.g., improvements from programming
> APIs, operators, state backends, etc.)?
>
>
>
> Thank you~
>
> Xintong Song
>
>
>
>
>
> On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
> lassenedergaardflink@gmail.com> wrote:
>
> Hi
>
>
>
> At Trackunit we have been using Mesos for a long time but have now moved to
> k8s.
>
> Best regards
>
> Lasse Nedergaard
>
>
>
>
>
> On Oct 23, 2020, at 17:01, Robert Metzger <rm...@apache.org> wrote:
>
> Hey Piyush,
>
> thanks a lot for raising this concern. I believe we should keep Mesos in
> Flink then in the foreseeable future.
>
> Your offer to help is much appreciated. We'll let you know once there is
> something.
>
>
>
> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p....@criteo.com> wrote:
>
> Thanks Kostas. If there's items we can help with, I'm sure we'd be able to
> find folks who would be excited to contribute / help in any way.
>
> -- Piyush
>
>
> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:
>
>     Thanks Piyush for the message.
>     After this, I revoke my +1. I agree with the previous opinions that we
>     cannot drop code that is actively used by users, especially if it is
>     something as deep in the stack as support for a cluster management
>     framework.
>
>     Cheers,
>     Kostas
>
>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com>
> wrote:
>     >
>     > Hi folks,
>     >
>     >
>     >
>     > We at Criteo are active users of the Flink on Mesos resource
> management component. We are pretty heavy users of Mesos for scheduling
> workloads on our edge datacenters and we do want to continue to be able to
> run some of our Flink topologies (to compute machine learning short term
> features) on those DCs. If possible our vote would be not to drop Mesos
> support as that will tie us to an old release / have to maintain a fork as
> we’re not planning to migrate off Mesos anytime soon. Is the burden
> something that can be helped with by the community? (Or are you referring
> to having to ensure PRs handle the Mesos piece as well when they touch the
> resource managers?)
>     >
>     >
>     >
>     > Thanks,
>     >
>     >
>     >
>     > -- Piyush
>     >
>     >
>     >
>     >
>     >
>     > From: Till Rohrmann <tr...@apache.org>
>     > Date: Friday, October 23, 2020 at 8:19 AM
>     > To: Xintong Song <to...@gmail.com>
>     > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
>     > Subject: Re: [SURVEY] Remove Mesos support
>     >
>     >
>     >
>     > Thanks for starting this survey Robert! I second Konstantin and
> Xintong in the sense that our Mesos users' opinions should matter most
> here. If our community is no longer using the Mesos integration, then I
> would be +1 for removing it in order to decrease the maintenance burden.
>     >
>     >
>     >
>     > Cheers,
>     >
>     > Till
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com>
> wrote:
>     >
>     > +1 for adding a warning in 1.12 about planning to remove Mesos
> support.
>     >
>     >
>     >
>     > With my developer hat on, removing the Mesos support would
> definitely reduce the maintaining overhead for the deployment and resource
> management related components. On the other hand, the Flink on Mesos users'
> voices definitely matter a lot for this community. Either way, it would be
> good to draw users' attention to this discussion early.
>     >
>     >
>     >
>     > Thank you~
>     >
>     > Xintong Song
>     >
>     >
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org>
> wrote:
>     >
>     > Hi Robert,
>     >
>     > +1 to the plan you outlined. If we were to drop support in Flink
> 1.13+, we
>     > would still support it in Flink 1.12- with bug fixes for some time
> so that
>     > users have time to move on.
>     >
>     > It would certainly be very interesting to hear from current Flink on
> Mesos
>     > users, on how they see the evolution of this part of the ecosystem.
>     >
>     > Best,
>     >
>     > Konstantin
>
>

Re: [SURVEY] Remove Mesos support

Posted by Piyush Narang <p....@criteo.com>.
Hi Xintong,

Do you have any jiras that cover any of the items on 1 or 2? I can reach out to folks internally and see if I can get some folks to commit to helping out.

To cover the other qs:

  *   Yes, we’ve not got a plan at the moment to get off Mesos. We use Yarn for some of our Flink workloads when we can. Mesos is only used when we need streaming capabilities in our WW DCs (as our Yarn is centralized in one DC)
  *   We’re currently on Flink 1.9 (old planner). We have a plan to bump to 1.11 / 1.12 this quarter.
  *   We typically upgrade once every 6 months to a year (not every release). We’d like to speed up the cadence but we’re not there yet.
  *   We’d largely be good with keeping Flink on Mesos as-is and functional while missing out on some of the newer features. We understand the pain on the community's side and we can take on the work if we see some fancy improvement in Flink on Yarn / K8s that we want in Mesos to put in the request to port it over.

Thanks,

-- Piyush


From: Xintong Song <to...@gmail.com>
Date: Sunday, October 25, 2020 at 10:57 PM
To: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
Cc: Lasse Nedergaard <la...@gmail.com>, <p....@criteo.com>
Subject: Re: [SURVEY] Remove Mesos support

Thanks for sharing the information with us, Piyush and Lasse.



@Piyush



Thanks for offering the help. IMO, there are currently several problems that make supporting Flink on Mesos challenging for us.

  1.  Lack of Mesos experts. AFAIK, there are very few people (if not none) among the active contributors in this community that are familiar with Mesos and can help with development on this component.
  2.  Absence of tests. Mesos does not provide a testing cluster, like `MiniYARNCluster`, making it hard to test interactions between Flink and Mesos. We have only a few very simple e2e tests running on Mesos deployed in a docker, covering the most fundamental workflows. We are not sure how well those tests work, especially against some potential corner cases.
  3.  Divergence from other deployment. Because of 1 and 2, the new efforts (features, maintenance, refactors) tend to exclude Mesos if possible. When the new efforts have to touch the Mesos related components (e.g., changes to the common resource manager interfaces), we have to be very careful and make as few changes as possible, to avoid accidentally breaking anything that we are not familiar with. As a result, the component diverges a lot from other deployment components (K8s/Yarn), which makes it harder to maintain.

It would be greatly appreciated if you can help with either of the above issues.



Additionally, I have a few questions concerning your use cases at Criteo. IIUC, you are going to stay on Mesos in the foreseeable future, while keeping the Flink version up-to-date? What Flink version are you currently using? How often do you upgrade (e.g., every release)? Would you be good with keeping the Flink on Mesos component as it is (meaning that deployment and resource management improvements may not be ported to Mesos), while keeping other components up-to-date (e.g., improvements from programming APIs, operators, state backends, etc.)?



Thank you~

Xintong Song


On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <la...@gmail.com> wrote:
Hi

At Trackunit we have been using Mesos for a long time but have now moved to k8s.
Best regards
Lasse Nedergaard



On Oct 23, 2020, at 17:01, Robert Metzger <rm...@apache.org> wrote:

Hey Piyush,
thanks a lot for raising this concern. I believe we should keep Mesos in Flink then in the foreseeable future.
Your offer to help is much appreciated. We'll let you know once there is something.

On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p....@criteo.com> wrote:
Thanks Kostas. If there's items we can help with, I'm sure we'd be able to find folks who would be excited to contribute / help in any way.

-- Piyush


On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:

    Thanks Piyush for the message.
    After this, I revoke my +1. I agree with the previous opinions that we
    cannot drop code that is actively used by users, especially if it is
    something as deep in the stack as support for a cluster management
    framework.

    Cheers,
    Kostas

    On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com> wrote:
    >
    > Hi folks,
    >
    >
    >
    > We at Criteo are active users of the Flink on Mesos resource management component. We are pretty heavy users of Mesos for scheduling workloads on our edge datacenters and we do want to continue to be able to run some of our Flink topologies (to compute machine learning short term features) on those DCs. If possible our vote would be not to drop Mesos support as that will tie us to an old release / have to maintain a fork as we’re not planning to migrate off Mesos anytime soon. Is the burden something that can be helped with by the community? (Or are you referring to having to ensure PRs handle the Mesos piece as well when they touch the resource managers?)
    >
    >
    >
    > Thanks,
    >
    >
    >
    > -- Piyush
    >
    >
    >
    >
    >
    > From: Till Rohrmann <tr...@apache.org>
    > Date: Friday, October 23, 2020 at 8:19 AM
    > To: Xintong Song <to...@gmail.com>
    > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
    > Subject: Re: [SURVEY] Remove Mesos support
    >
    >
    >
    > Thanks for starting this survey Robert! I second Konstantin and Xintong in the sense that our Mesos users' opinions should matter most here. If our community is no longer using the Mesos integration, then I would be +1 for removing it in order to decrease the maintenance burden.
    >
    >
    >
    > Cheers,
    >
    > Till
    >
    >
    >
    > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:
    >
    > +1 for adding a warning in 1.12 about planning to remove Mesos support.
    >
    >
    >
    > With my developer hat on, removing the Mesos support would definitely reduce the maintenance overhead for the deployment and resource management related components. On the other hand, the Flink on Mesos users' voices definitely matter a lot for this community. Either way, it would be good to draw users' attention to this discussion early.
    >
    >
    >
    > Thank you~
    >
    > Xintong Song
    >
    >
    >
    >
    >
    > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:
    >
    > Hi Robert,
    >
    > +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
    > would still support it in Flink 1.12- with bug fixes for some time so that
    > users have time to move on.
    >
    > It would certainly be very interesting to hear from current Flink on Mesos
    > users, on how they see the evolution of this part of the ecosystem.
    >
    > Best,
    >
    > Konstantin


Re: [SURVEY] Remove Mesos support

Posted by Xintong Song <to...@gmail.com>.
Thanks for sharing the information with us, Piyush and Lasse.


@Piyush


Thanks for offering the help. IMO, there are currently several problems
that make supporting Flink on Mesos challenging for us.


   1. *Lack of Mesos experts.* AFAIK, there are very few people (if any)
   among the active contributors in this community who are familiar with
   Mesos and can help with development on this component.
   2. *Absence of tests.* Mesos does not provide a testing cluster like
   `MiniYARNCluster`, making it hard to test interactions between Flink and
   Mesos. We have only a few very simple e2e tests running on Mesos deployed
   in Docker, covering the most fundamental workflows. We are not sure how
   well those tests work, especially against potential corner cases.
   3. *Divergence from other deployments.* Because of 1 and 2, new
   efforts (features, maintenance, refactors) tend to exclude Mesos if
   possible. When new efforts have to touch the Mesos-related components
   (e.g., changes to the common resource manager interfaces), we have to be
   very careful and make as few changes as possible, to avoid accidentally
   breaking anything that we are not familiar with. As a result, the component
   diverges a lot from the other deployment components (K8s/Yarn), which
   makes it harder to maintain.

It would be greatly appreciated if you can help with any of the above
issues.


Additionally, I have a few questions concerning your use cases at Criteo.
IIUC, you are going to stay on Mesos for the foreseeable future, while
keeping the Flink version up-to-date? What Flink version are you currently
using? How often do you upgrade (e.g., every release)? Would you be good
with keeping the Flink on Mesos component as it is (meaning that deployment
and resource management improvements may not be ported to Mesos), while
keeping other components up-to-date (e.g., improvements to programming
APIs, operators, state backends, etc.)?


Thank you~

Xintong Song



On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
lassenedergaardflink@gmail.com> wrote:

> Hi
>
> At Trackunit we have been using Mesos for a long time but have now moved to
> k8s.
>
> Best regards
> Lasse Nedergaard
>
>
> On Oct 23, 2020, at 17:01, Robert Metzger <rm...@apache.org> wrote:
>
> Hey Piyush,
> thanks a lot for raising this concern. I believe we should keep Mesos in
> Flink then in the foreseeable future.
> Your offer to help is much appreciated. We'll let you know once there is
> something.
>
> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p....@criteo.com> wrote:
>
>> Thanks Kostas. If there's items we can help with, I'm sure we'd be able
>> to find folks who would be excited to contribute / help in any way.
>>
>> -- Piyush
>>
>>
>> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:
>>
>>     Thanks Piyush for the message.
>>     After this, I revoke my +1. I agree with the previous opinions that we
>>     cannot drop code that is actively used by users, especially if it is
>>     something as deep in the stack as support for a cluster management
>>     framework.
>>
>>     Cheers,
>>     Kostas
>>
>>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com>
>> wrote:
>>     >
>>     > Hi folks,
>>     >
>>     >
>>     >
>>     > We at Criteo are active users of the Flink on Mesos resource
>> management component. We are pretty heavy users of Mesos for scheduling
>> workloads on our edge datacenters and we do want to continue to be able to
>> run some of our Flink topologies (to compute machine learning short term
>> features) on those DCs. If possible our vote would be not to drop Mesos
>> support as that will tie us to an old release / have to maintain a fork as
>> we’re not planning to migrate off Mesos anytime soon. Is the burden
>> something that can be helped with by the community? (Or are you referring
>> to having to ensure PRs handle the Mesos piece as well when they touch the
>> resource managers?)
>>     >
>>     >
>>     >
>>     > Thanks,
>>     >
>>     >
>>     >
>>     > -- Piyush
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > From: Till Rohrmann <tr...@apache.org>
>>     > Date: Friday, October 23, 2020 at 8:19 AM
>>     > To: Xintong Song <to...@gmail.com>
>>     > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
>>     > Subject: Re: [SURVEY] Remove Mesos support
>>     >
>>     >
>>     >
>>     > Thanks for starting this survey Robert! I second Konstantin and
>> Xintong in the sense that our Mesos users' opinions should matter most
>> here. If our community is no longer using the Mesos integration, then I
>> would be +1 for removing it in order to decrease the maintenance burden.
>>     >
>>     >
>>     >
>>     > Cheers,
>>     >
>>     > Till
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com>
>> wrote:
>>     >
>>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>> support.
>>     >
>>     >
>>     >
>>     > With my developer hat on, removing the Mesos support would
>> definitely reduce the maintaining overhead for the deployment and resource
>> management related components. On the other hand, the Flink on Mesos users'
>> voices definitely matter a lot for this community. Either way, it would be
>> good to draw users' attention to this discussion early.
>>     >
>>     >
>>     >
>>     > Thank you~
>>     >
>>     > Xintong Song
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org>
>> wrote:
>>     >
>>     > Hi Robert,
>>     >
>>     > +1 to the plan you outlined. If we were to drop support in Flink
>> 1.13+, we
>>     > would still support it in Flink 1.12- with bug fixes for some time
>> so that
>>     > users have time to move on.
>>     >
>>     > It would certainly be very interesting to hear from current Flink
>> on Mesos
>>     > users, on how they see the evolution of this part of the ecosystem.
>>     >
>>     > Best,
>>     >
>>     > Konstantin
>>
>>
>>

Re: [SURVEY] Remove Mesos support

Posted by Xintong Song <to...@gmail.com>.
Thanks for sharing the information with us, Piyush and Lasse.


@Piyush


Thanks for offering the help. IMO, there are currently several problems
that make supporting Flink on Mesos challenging for us.


   1. *Lack of Mesos experts.* AFAIK, there are very few people (if any)
   among the active contributors in this community who are familiar with
   Mesos and can help with development on this component.
   2. *Absence of tests.* Mesos does not provide a testing cluster like
   `MiniYARNCluster`, which makes it hard to test interactions between Flink
   and Mesos. We have only a few very simple e2e tests that run against Mesos
   deployed in a Docker container, covering the most fundamental workflows.
   We are not sure how well those tests work, especially for potential
   corner cases.
   3. *Divergence from other deployments.* Because of 1 and 2, new efforts
   (features, maintenance, refactorings) tend to exclude Mesos where
   possible. When new efforts have to touch the Mesos-related components
   (e.g., changes to the common resource manager interfaces), we have to be
   very careful and make as few changes as possible, to avoid accidentally
   breaking anything that we are not familiar with. As a result, the component
   diverges a lot from the other deployment components (K8s/Yarn), which
   makes it harder to maintain.

It would be greatly appreciated if you could help with any of the above
issues.


Additionally, I have a few questions concerning your use cases at Criteo.
IIUC, you are going to stay on Mesos for the foreseeable future, while
keeping the Flink version up-to-date? What Flink version are you currently
using? How often do you upgrade (e.g., every release)? Would you be fine
with keeping the Flink on Mesos component as it is (meaning that deployment
and resource management improvements may not be ported to Mesos), while
keeping other components up-to-date (e.g., improvements to programming
APIs, operators, state backends, etc.)?


Thank you~

Xintong Song



On Sat, Oct 24, 2020 at 2:48 AM Lasse Nedergaard <
lassenedergaardflink@gmail.com> wrote:

> Hi
>
> At Trackunit We have been using Mesos for long time but have now moved to
> k8s.
>
> Med venlig hilsen / Best regards
> Lasse Nedergaard
>
>
> On 23 Oct 2020, at 17:01, Robert Metzger <rm...@apache.org> wrote:
>
> 
> Hey Piyush,
> thanks a lot for raising this concern. I believe we should keep Mesos in
> Flink then in the foreseeable future.
> Your offer to help is much appreciated. We'll let you know once there is
> something.
>
> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p....@criteo.com> wrote:
>
>> Thanks Kostas. If there's items we can help with, I'm sure we'd be able
>> to find folks who would be excited to contribute / help in any way.
>>
>> -- Piyush
>>
>>
>> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:
>>
>>     Thanks Piyush for the message.
>>     After this, I revoke my +1. I agree with the previous opinions that we
>>     cannot drop code that is actively used by users, especially if it
>>     something that deep in the stack as support for cluster management
>>     framework.
>>
>>     Cheers,
>>     Kostas
>>
>>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com>
>> wrote:
>>     >
>>     > Hi folks,
>>     >
>>     >
>>     >
>>     > We at Criteo are active users of the Flink on Mesos resource
>> management component. We are pretty heavy users of Mesos for scheduling
>> workloads on our edge datacenters and we do want to continue to be able to
>> run some of our Flink topologies (to compute machine learning short term
>> features) on those DCs. If possible our vote would be not to drop Mesos
>> support as that will tie us to an old release / have to maintain a fork as
>> we’re not planning to migrate off Mesos anytime soon. Is the burden
>> something that can be helped with by the community? (Or are you referring
>> to having to ensure PRs handle the Mesos piece as well when they touch the
>> resource managers?)
>>     >
>>     >
>>     >
>>     > Thanks,
>>     >
>>     >
>>     >
>>     > -- Piyush
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > From: Till Rohrmann <tr...@apache.org>
>>     > Date: Friday, October 23, 2020 at 8:19 AM
>>     > To: Xintong Song <to...@gmail.com>
>>     > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
>>     > Subject: Re: [SURVEY] Remove Mesos support
>>     >
>>     >
>>     >
>>     > Thanks for starting this survey Robert! I second Konstantin and
>> Xintong in the sense that our Mesos user's opinions should matter most
>> here. If our community is no longer using the Mesos integration, then I
>> would be +1 for removing it in order to decrease the maintenance burden.
>>     >
>>     >
>>     >
>>     > Cheers,
>>     >
>>     > Till
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com>
>> wrote:
>>     >
>>     > +1 for adding a warning in 1.12 about planning to remove Mesos
>> support.
>>     >
>>     >
>>     >
>>     > With my developer hat on, removing the Mesos support would
>> definitely reduce the maintaining overhead for the deployment and resource
>> management related components. On the other hand, the Flink on Mesos users'
>> voices definitely matter a lot for this community. Either way, it would be
>> good to draw users attention to this discussion early.
>>     >
>>     >
>>     >
>>     > Thank you~
>>     >
>>     > Xintong Song
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org>
>> wrote:
>>     >
>>     > Hi Robert,
>>     >
>>     > +1 to the plan you outlined. If we were to drop support in Flink
>> 1.13+, we
>>     > would still support it in Flink 1.12- with bug fixes for some time
>> so that
>>     > users have time to move on.
>>     >
>>     > It would certainly be very interesting to hear from current Flink
>> on Mesos
>>     > users, on how they see the evolution of this part of the ecosystem.
>>     >
>>     > Best,
>>     >
>>     > Konstantin
>>
>>
>>

Re: [SURVEY] Remove Mesos support

Posted by Lasse Nedergaard <la...@gmail.com>.
Hi

At Trackunit we have been using Mesos for a long time but have now moved to k8s.

Med venlig hilsen / Best regards
Lasse Nedergaard


> On 23 Oct 2020, at 17:01, Robert Metzger <rm...@apache.org> wrote:
> 
> 
> Hey Piyush,
> thanks a lot for raising this concern. I believe we should keep Mesos in Flink then in the foreseeable future.
> Your offer to help is much appreciated. We'll let you know once there is something.
> 
>> On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p....@criteo.com> wrote:
>> Thanks Kostas. If there's items we can help with, I'm sure we'd be able to find folks who would be excited to contribute / help in any way. 
>> 
>> -- Piyush
>> 
>> 
>> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:
>> 
>>     Thanks Piyush for the message.
>>     After this, I revoke my +1. I agree with the previous opinions that we
>>     cannot drop code that is actively used by users, especially if it
>>     something that deep in the stack as support for cluster management
>>     framework.
>> 
>>     Cheers,
>>     Kostas
>> 
>>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com> wrote:
>>     >
>>     > Hi folks,
>>     >
>>     >
>>     >
>>     > We at Criteo are active users of the Flink on Mesos resource management component. We are pretty heavy users of Mesos for scheduling workloads on our edge datacenters and we do want to continue to be able to run some of our Flink topologies (to compute machine learning short term features) on those DCs. If possible our vote would be not to drop Mesos support as that will tie us to an old release / have to maintain a fork as we’re not planning to migrate off Mesos anytime soon. Is the burden something that can be helped with by the community? (Or are you referring to having to ensure PRs handle the Mesos piece as well when they touch the resource managers?)
>>     >
>>     >
>>     >
>>     > Thanks,
>>     >
>>     >
>>     >
>>     > -- Piyush
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > From: Till Rohrmann <tr...@apache.org>
>>     > Date: Friday, October 23, 2020 at 8:19 AM
>>     > To: Xintong Song <to...@gmail.com>
>>     > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
>>     > Subject: Re: [SURVEY] Remove Mesos support
>>     >
>>     >
>>     >
>>     > Thanks for starting this survey Robert! I second Konstantin and Xintong in the sense that our Mesos user's opinions should matter most here. If our community is no longer using the Mesos integration, then I would be +1 for removing it in order to decrease the maintenance burden.
>>     >
>>     >
>>     >
>>     > Cheers,
>>     >
>>     > Till
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:
>>     >
>>     > +1 for adding a warning in 1.12 about planning to remove Mesos support.
>>     >
>>     >
>>     >
>>     > With my developer hat on, removing the Mesos support would definitely reduce the maintaining overhead for the deployment and resource management related components. On the other hand, the Flink on Mesos users' voices definitely matter a lot for this community. Either way, it would be good to draw users attention to this discussion early.
>>     >
>>     >
>>     >
>>     > Thank you~
>>     >
>>     > Xintong Song
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:
>>     >
>>     > Hi Robert,
>>     >
>>     > +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
>>     > would still support it in Flink 1.12- with bug fixes for some time so that
>>     > users have time to move on.
>>     >
>>     > It would certainly be very interesting to hear from current Flink on Mesos
>>     > users, on how they see the evolution of this part of the ecosystem.
>>     >
>>     > Best,
>>     >
>>     > Konstantin
>> 
>> 

Re: [SURVEY] Remove Mesos support

Posted by Robert Metzger <rm...@apache.org>.
Hey Piyush,
thanks a lot for raising this concern. I believe we should then keep Mesos in
Flink for the foreseeable future.
Your offer to help is much appreciated. We'll let you know once there is
something.

On Fri, Oct 23, 2020 at 4:28 PM Piyush Narang <p....@criteo.com> wrote:

> Thanks Kostas. If there's items we can help with, I'm sure we'd be able to
> find folks who would be excited to contribute / help in any way.
>
> -- Piyush
>
>
> On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:
>
>     Thanks Piyush for the message.
>     After this, I revoke my +1. I agree with the previous opinions that we
>     cannot drop code that is actively used by users, especially if it
>     something that deep in the stack as support for cluster management
>     framework.
>
>     Cheers,
>     Kostas
>
>     On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com>
> wrote:
>     >
>     > Hi folks,
>     >
>     >
>     >
>     > We at Criteo are active users of the Flink on Mesos resource
> management component. We are pretty heavy users of Mesos for scheduling
> workloads on our edge datacenters and we do want to continue to be able to
> run some of our Flink topologies (to compute machine learning short term
> features) on those DCs. If possible our vote would be not to drop Mesos
> support as that will tie us to an old release / have to maintain a fork as
> we’re not planning to migrate off Mesos anytime soon. Is the burden
> something that can be helped with by the community? (Or are you referring
> to having to ensure PRs handle the Mesos piece as well when they touch the
> resource managers?)
>     >
>     >
>     >
>     > Thanks,
>     >
>     >
>     >
>     > -- Piyush
>     >
>     >
>     >
>     >
>     >
>     > From: Till Rohrmann <tr...@apache.org>
>     > Date: Friday, October 23, 2020 at 8:19 AM
>     > To: Xintong Song <to...@gmail.com>
>     > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
>     > Subject: Re: [SURVEY] Remove Mesos support
>     >
>     >
>     >
>     > Thanks for starting this survey Robert! I second Konstantin and
> Xintong in the sense that our Mesos user's opinions should matter most
> here. If our community is no longer using the Mesos integration, then I
> would be +1 for removing it in order to decrease the maintenance burden.
>     >
>     >
>     >
>     > Cheers,
>     >
>     > Till
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com>
> wrote:
>     >
>     > +1 for adding a warning in 1.12 about planning to remove Mesos
> support.
>     >
>     >
>     >
>     > With my developer hat on, removing the Mesos support would
> definitely reduce the maintaining overhead for the deployment and resource
> management related components. On the other hand, the Flink on Mesos users'
> voices definitely matter a lot for this community. Either way, it would be
> good to draw users attention to this discussion early.
>     >
>     >
>     >
>     > Thank you~
>     >
>     > Xintong Song
>     >
>     >
>     >
>     >
>     >
>     > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org>
> wrote:
>     >
>     > Hi Robert,
>     >
>     > +1 to the plan you outlined. If we were to drop support in Flink
> 1.13+, we
>     > would still support it in Flink 1.12- with bug fixes for some time
> so that
>     > users have time to move on.
>     >
>     > It would certainly be very interesting to hear from current Flink on
> Mesos
>     > users, on how they see the evolution of this part of the ecosystem.
>     >
>     > Best,
>     >
>     > Konstantin
>
>
>

Re: [SURVEY] Remove Mesos support

Posted by Piyush Narang <p....@criteo.com>.
Thanks Kostas. If there are items we can help with, I'm sure we'd be able to find folks who would be excited to contribute / help in any way.

-- Piyush
 

On 10/23/20, 10:25 AM, "Kostas Kloudas" <kk...@gmail.com> wrote:

    Thanks Piyush for the message.
    After this, I revoke my +1. I agree with the previous opinions that we
    cannot drop code that is actively used by users, especially if it
    something that deep in the stack as support for cluster management
    framework.
    
    Cheers,
    Kostas
    
    On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com> wrote:
    >
    > Hi folks,
    >
    >
    >
    > We at Criteo are active users of the Flink on Mesos resource management component. We are pretty heavy users of Mesos for scheduling workloads on our edge datacenters and we do want to continue to be able to run some of our Flink topologies (to compute machine learning short term features) on those DCs. If possible our vote would be not to drop Mesos support as that will tie us to an old release / have to maintain a fork as we’re not planning to migrate off Mesos anytime soon. Is the burden something that can be helped with by the community? (Or are you referring to having to ensure PRs handle the Mesos piece as well when they touch the resource managers?)
    >
    >
    >
    > Thanks,
    >
    >
    >
    > -- Piyush
    >
    >
    >
    >
    >
    > From: Till Rohrmann <tr...@apache.org>
    > Date: Friday, October 23, 2020 at 8:19 AM
    > To: Xintong Song <to...@gmail.com>
    > Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
    > Subject: Re: [SURVEY] Remove Mesos support
    >
    >
    >
    > Thanks for starting this survey Robert! I second Konstantin and Xintong in the sense that our Mesos user's opinions should matter most here. If our community is no longer using the Mesos integration, then I would be +1 for removing it in order to decrease the maintenance burden.
    >
    >
    >
    > Cheers,
    >
    > Till
    >
    >
    >
    > On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:
    >
    > +1 for adding a warning in 1.12 about planning to remove Mesos support.
    >
    >
    >
    > With my developer hat on, removing the Mesos support would definitely reduce the maintaining overhead for the deployment and resource management related components. On the other hand, the Flink on Mesos users' voices definitely matter a lot for this community. Either way, it would be good to draw users attention to this discussion early.
    >
    >
    >
    > Thank you~
    >
    > Xintong Song
    >
    >
    >
    >
    >
    > On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:
    >
    > Hi Robert,
    >
    > +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
    > would still support it in Flink 1.12- with bug fixes for some time so that
    > users have time to move on.
    >
    > It would certainly be very interesting to hear from current Flink on Mesos
    > users, on how they see the evolution of this part of the ecosystem.
    >
    > Best,
    >
    > Konstantin
    


Re: [SURVEY] Remove Mesos support

Posted by Kostas Kloudas <kk...@gmail.com>.
Thanks Piyush for the message.
After this, I revoke my +1. I agree with the previous opinions that we
cannot drop code that is actively used by users, especially if it is
something as deep in the stack as support for a cluster management
framework.

Cheers,
Kostas

On Fri, Oct 23, 2020 at 4:15 PM Piyush Narang <p....@criteo.com> wrote:
>
> Hi folks,
>
>
>
> We at Criteo are active users of the Flink on Mesos resource management component. We are pretty heavy users of Mesos for scheduling workloads on our edge datacenters and we do want to continue to be able to run some of our Flink topologies (to compute machine learning short term features) on those DCs. If possible our vote would be not to drop Mesos support as that will tie us to an old release / have to maintain a fork as we’re not planning to migrate off Mesos anytime soon. Is the burden something that can be helped with by the community? (Or are you referring to having to ensure PRs handle the Mesos piece as well when they touch the resource managers?)
>
>
>
> Thanks,
>
>
>
> -- Piyush
>
>
>
>
>
> From: Till Rohrmann <tr...@apache.org>
> Date: Friday, October 23, 2020 at 8:19 AM
> To: Xintong Song <to...@gmail.com>
> Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
> Subject: Re: [SURVEY] Remove Mesos support
>
>
>
> Thanks for starting this survey Robert! I second Konstantin and Xintong in the sense that our Mesos user's opinions should matter most here. If our community is no longer using the Mesos integration, then I would be +1 for removing it in order to decrease the maintenance burden.
>
>
>
> Cheers,
>
> Till
>
>
>
> On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:
>
> +1 for adding a warning in 1.12 about planning to remove Mesos support.
>
>
>
> With my developer hat on, removing the Mesos support would definitely reduce the maintaining overhead for the deployment and resource management related components. On the other hand, the Flink on Mesos users' voices definitely matter a lot for this community. Either way, it would be good to draw users attention to this discussion early.
>
>
>
> Thank you~
>
> Xintong Song
>
>
>
>
>
> On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:
>
> Hi Robert,
>
> +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
> would still support it in Flink 1.12- with bug fixes for some time so that
> users have time to move on.
>
> It would certainly be very interesting to hear from current Flink on Mesos
> users, on how they see the evolution of this part of the ecosystem.
>
> Best,
>
> Konstantin

Re: [SURVEY] Remove Mesos support

Posted by Piyush Narang <p....@criteo.com>.
Hi folks,

We at Criteo are active users of the Flink on Mesos resource management component. We are pretty heavy users of Mesos for scheduling workloads on our edge datacenters, and we want to continue to be able to run some of our Flink topologies (which compute short-term machine learning features) on those DCs. If possible, our vote would be not to drop Mesos support, as that would tie us to an old release or force us to maintain a fork; we’re not planning to migrate off Mesos anytime soon. Is the burden something the community can help with? (Or are you referring to having to ensure PRs handle the Mesos piece as well when they touch the resource managers?)

Thanks,

-- Piyush


From: Till Rohrmann <tr...@apache.org>
Date: Friday, October 23, 2020 at 8:19 AM
To: Xintong Song <to...@gmail.com>
Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
Subject: Re: [SURVEY] Remove Mesos support

Thanks for starting this survey Robert! I second Konstantin and Xintong in the sense that our Mesos user's opinions should matter most here. If our community is no longer using the Mesos integration, then I would be +1 for removing it in order to decrease the maintenance burden.

Cheers,
Till

On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:
+1 for adding a warning in 1.12 about planning to remove Mesos support.



With my developer hat on, removing the Mesos support would definitely reduce the maintaining overhead for the deployment and resource management related components. On the other hand, the Flink on Mesos users' voices definitely matter a lot for this community. Either way, it would be good to draw users attention to this discussion early.



Thank you~

Xintong Song


On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:
Hi Robert,

+1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
would still support it in Flink 1.12- with bug fixes for some time so that
users have time to move on.

It would certainly be very interesting to hear from current Flink on Mesos
users, on how they see the evolution of this part of the ecosystem.

Best,

Konstantin

Re: [SURVEY] Remove Mesos support

Posted by Piyush Narang <p....@criteo.com>.
Hi folks,

We at Criteo are active users of the Flink on Mesos resource management component. We are pretty heavy users of Mesos for scheduling workloads on our edge datacenters and we do want to continue to be able to run some of our Flink topologies (to compute machine learning short term features) on those DCs. If possible our vote would be not to drop Mesos support as that will tie us to an old release / have to maintain a fork as we’re not planning to migrate off Mesos anytime soon. Is the burden something that can be helped with by the community? (Or are you referring to having to ensure PRs handle the Mesos piece as well when they touch the resource managers?)

Thanks,

-- Piyush


From: Till Rohrmann <tr...@apache.org>
Date: Friday, October 23, 2020 at 8:19 AM
To: Xintong Song <to...@gmail.com>
Cc: dev <de...@flink.apache.org>, user <us...@flink.apache.org>
Subject: Re: [SURVEY] Remove Mesos support

Thanks for starting this survey, Robert! I second Konstantin and Xintong in the sense that our Mesos users' opinions should matter most here. If our community is no longer using the Mesos integration, then I would be +1 for removing it in order to decrease the maintenance burden.

Cheers,
Till

On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:
+1 for adding a warning in 1.12 about planning to remove Mesos support.



With my developer hat on, removing the Mesos support would definitely reduce the maintenance overhead for the deployment and resource management related components. On the other hand, the Flink on Mesos users' voices definitely matter a lot for this community. Either way, it would be good to draw users' attention to this discussion early.



Thank you~

Xintong Song


On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:
Hi Robert,

+1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
would still support it in Flink 1.12- with bug fixes for some time so that
users have time to move on.

It would certainly be very interesting to hear from current Flink on Mesos
users, on how they see the evolution of this part of the ecosystem.

Best,

Konstantin

Re: [SURVEY] Remove Mesos support

Posted by Till Rohrmann <tr...@apache.org>.
Thanks for starting this survey, Robert! I second Konstantin and Xintong in
the sense that our Mesos users' opinions should matter most here. If our
community is no longer using the Mesos integration, then I would be +1 for
removing it in order to decrease the maintenance burden.

Cheers,
Till

On Fri, Oct 23, 2020 at 2:03 PM Xintong Song <to...@gmail.com> wrote:

> +1 for adding a warning in 1.12 about planning to remove Mesos support.
>
>
> With my developer hat on, removing the Mesos support would
> definitely reduce the maintenance overhead for the deployment and resource
> management related components. On the other hand, the Flink on Mesos users'
> voices definitely matter a lot for this community. Either way, it would be
> good to draw users' attention to this discussion early.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org>
> wrote:
>
>> Hi Robert,
>>
>> +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
>> would still support it in Flink 1.12- with bug fixes for some time so that
>> users have time to move on.
>>
>> It would certainly be very interesting to hear from current Flink on Mesos
>> users, on how they see the evolution of this part of the ecosystem.
>>
>> Best,
>>
>> Konstantin
>>
>

Re: [SURVEY] Remove Mesos support

Posted by Xintong Song <to...@gmail.com>.
+1 for adding a warning in 1.12 about planning to remove Mesos support.


With my developer hat on, removing the Mesos support would
definitely reduce the maintenance overhead for the deployment and resource
management related components. On the other hand, the Flink on Mesos users'
voices definitely matter a lot for this community. Either way, it would be
good to draw users' attention to this discussion early.


Thank you~

Xintong Song



On Fri, Oct 23, 2020 at 7:53 PM Konstantin Knauf <kn...@apache.org> wrote:

> Hi Robert,
>
> +1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
> would still support it in Flink 1.12- with bug fixes for some time so that
> users have time to move on.
>
> It would certainly be very interesting to hear from current Flink on Mesos
> users, on how they see the evolution of this part of the ecosystem.
>
> Best,
>
> Konstantin
>

Re: [SURVEY] Remove Mesos support

Posted by Konstantin Knauf <kn...@apache.org>.
Hi Robert,

+1 to the plan you outlined. If we were to drop support in Flink 1.13+, we
would still support it in Flink 1.12- with bug fixes for some time so that
users have time to move on.

It would certainly be very interesting to hear from current Flink on Mesos
users, on how they see the evolution of this part of the ecosystem.

Best,

Konstantin
