Posted to dev@flink.apache.org by Kurt Young <yk...@gmail.com> on 2020/08/04 05:51:17 UTC

Re: [DISCUSS] Planning Flink 1.12

Regarding setting the feature freeze date to late September, I am concerned
that it might make the development time of 1.12 too short.

One reason is that we took too much time (about 1.5 months, from mid-May to
the beginning of July) for testing 1.11. That is not ideal, but further
squeezing the development time of 1.12 won't make it better.
Besides, AFAIK July and August are also a popular vacation season for
Europeans. Given that most committers of Flink come from Europe, I think we
should also take this into consideration.

It's also true that the first week of October is the national holiday of
China, so I'm wondering whether the end of October could be a candidate
feature freeze date.

Best,
Kurt


On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <rm...@apache.org> wrote:

> Hi all,
>
> Thanks a lot for the responses so far. I've put them into this Wiki page:
> https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep
> track of them. Ideally, post JIRA tickets for your feature, then the status
> will update automatically in the wiki :)
>
> Please keep posting features here, or add them to the Wiki yourself 🙏
>
> @Prasanna kumar <pr...@gmail.com>: Dynamic Auto Scaling is a
> feature request the community is well aware of. Till has posted
> "Reactive-scaling mode" as a feature he's working on for the 1.12 release.
> This work will introduce the basic building blocks and partial support for
> the feature you are requesting.
> Proper support for dynamic scaling, while maintaining Flink's high
> performance (throughput, low latency) and correctness, is a difficult task
> that needs a lot of work. It will probably take a little bit of time till
> this is fully available.
>
> Cheers,
> Robert
>
>
>
> On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <tr...@apache.org>
> wrote:
>
> > Thanks for being our release managers for the 1.12 release Dian & Robert!
> >
> > Here are some features I would like to work on for this release:
> >
> > # Features
> >
> > ## Finishing pipelined region scheduling (
> > https://issues.apache.org/jira/browse/FLINK-16430)
> > With the pipelined region scheduler we want to implement a scheduler
> > which can serve streaming and batch workloads alike while being able
> > to run jobs under constrained resources. The latter is particularly
> > important for bounded streaming jobs, which are currently not well
> > supported.
> >
> > ## Reactive-scaling mode
> > Being able to react to newly available resources and rescale a running
> > job accordingly will make operating Flink much easier, because resources
> > can then be controlled by an external tool (e.g. GCP autoscaling, the K8s
> > horizontal pod autoscaler, etc.). In this release we want to take a big
> > step in this direction. As a first step, we want to support executing
> > jobs with a parallelism lower than the specified parallelism in case
> > Flink lost a TaskManager or could not acquire enough resources.
> >
> > # Maintenance/Stability
> >
> > ## JM / TM finished task reconciliation (
> > https://issues.apache.org/jira/browse/FLINK-17075)
> > This prevents the system from going out of sync if a task state change
> > from the TM to the JM is lost.
> >
> > ## Make metrics services work with Kubernetes deployments (
> > https://issues.apache.org/jira/browse/FLINK-11127)
> > Invert the direction in which the MetricFetcher connects to the
> > MetricQueryFetchers. That way it will no longer be necessary to expose,
> > for every TaskManager on K8s, a port on which the MetricQueryFetcher
> > runs. This will make deploying Flink clusters on K8s easier.
> >
> > ## Handle long-blocking operations during job submission (savepoint
> > restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > Submitting a Flink job can involve interaction with external systems
> > (blocking operations). Depending on the job, these interactions can take
> > so long that they exceed the submission timeout, which reports a failure
> > on the client side even though the actual submission succeeded. By
> > decoupling the creation of the ExecutionGraph from the job submission, we
> > can make the job submission non-blocking, which will solve this problem.
> >
> > ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > https://issues.apache.org/jira/browse/FLINK-15679)
> > By making the internal Flink IDs compositional or logging how they belong
> > together, we can make the debugging of Flink's operations much easier.
> >
> > Cheers,
> > Till
> >
> >
> > On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <fe...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > Thanks for bringing up this discussion, Robert!
> > > Congratulations on becoming the release managers of 1.12, Dian and
> > > Robert!
> > >
> > > ----------
> > > Here are some of my thoughts on features for the native Kubernetes
> > > integration in Flink 1.12:
> > >
> > > 1. Support user-specified pod templates
> > >     Description:
> > >     The current approach of introducing a new configuration option for
> > >     each aspect of the pod specification a user might want to set is
> > >     becoming unwieldy: we have to maintain more and more Flink-side
> > >     Kubernetes configuration options, and users have to learn the gap
> > >     between the declarative model used by Kubernetes and the
> > >     configuration model used by Flink. It would be a great improvement
> > >     to allow users to specify pod templates as a central place for all
> > >     customization needs of the jobmanager and taskmanager pods.
> > >     Benefits:
> > >     Users can leverage many of the advanced K8s features that the Flink
> > >     community does not support explicitly, such as volume mounting, DNS
> > >     configuration, pod affinity/anti-affinity settings, etc.
> > >
> > > 2. Support running PyFlink on Kubernetes
> > >     Description:
> > >     Support running PyFlink on Kubernetes, including session clusters
> > >     and application clusters.
> > >     Benefits:
> > >     Running Python applications in a containerized environment.
> > >
> > > 3. Support built-in init-Container
> > >     Description:
> > >     We need a built-in init-Container to help solve dependency
> > >     management in a containerized environment, especially in
> > >     application mode.
> > >     Benefits:
> > >     Separate the base Flink image from dynamic dependencies.
> > >
> > > 4. Support accessing secured services via K8s secrets
> > >     Description:
> > >     Kubernetes Secrets
> > >     <https://kubernetes.io/docs/concepts/configuration/secret/> can be
> > >     used to provide credentials for a Flink application to access
> > >     secured services. It helps people who want to use a user-specified
> > >     K8s Secret through an environment variable.
> > >     Benefits:
> > >     Improve user experience.
> > >
> > > 5. Support configuring the replica count of the JobManager Deployment
> > >    in ZooKeeper HA setups
> > >     Description:
> > >     Make the *replicas* of the Deployment configurable in ZooKeeper HA
> > >     setups.
> > >     Benefits:
> > >     Achieve faster failover.
> > >
> > > 6. Support configuring a CPU limit
> > >     Description:
> > >     Leverage the Kubernetes feature of container CPU requests/limits.
> > >     Benefits:
> > >     Reduce cost.
> > >
> > > Regards,
> > > Canbin Zheng
> > >
> > > On Thu, Jul 23, 2020 at 12:44 PM Harold.Miao <mi...@gmail.com> wrote:
> > >
> > > > I'm excited to hear about this feature, very, very, very highly
> > > > encouraged
> > > >
> > > >
> > > > On Thu, Jul 23, 2020 at 12:10 AM Prasanna kumar <pr...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Flink Dev Team,
> > > > >
> > > > > Dynamic autoscaling based on the incoming data load would be a great
> > > > > feature.
> > > > >
> > > > > We should be able to have a rule saying, for example, that if the
> > > > > load increases by 20%, extra resources should be added.
> > > > > Or a time-based rule saying that during peak hours the pipeline
> > > > > should scale automatically by 50%.
> > > > >
> > > > > This will help a lot in cost reduction.
> > > > >
> > > > > EMR clusters provide a similar feature for Spark-based applications.
> > > > >
> > > > > Thanks,
> > > > > Prasanna.
> > > > >
> > > > > On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <rmetzger@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Now that the 1.11 release is out, it is time to plan for the next
> > > > > > major Flink release.
> > > > > >
> > > > > > Some items:
> > > > > >
> > > > > >    1. Dian Fu and I volunteer to be the release managers for
> > > > > >    Flink 1.12.
> > > > > >
> > > > > >    2. Timeline: We propose to stick to our approximate 4 month
> > > > > >    release cycle, thus the release should be done by late October.
> > > > > >    Given that there’s a holiday week in China at the beginning of
> > > > > >    October, I propose to do the feature freeze on master by late
> > > > > >    September.
> > > > > >
> > > > > >    3. Collecting features: It would be good to have a rough overview
> > > > > >    of the features that will likely be ready to be merged by late
> > > > > >    September, and that we want in the release.
> > > > > >    Based on the discussion, we will update the Roadmap on the Flink
> > > > > >    website again!
> > > > > >
> > > > > >    4. Test instabilities and blockers: I would like to avoid a
> > > > > >    situation where we have many blocking issues or build
> > > > > >    instabilities at the time of the feature freeze. To achieve that,
> > > > > >    we will try to check every build instability within a week, to
> > > > > >    decide if it is a blocker (make sure to use the “test-stability”
> > > > > >    label for those tickets!).
> > > > > >    Blocker issues will need to have somebody assigned (responsible)
> > > > > >    within a week, and we want to see progress on all blocker issues
> > > > > >    (downgrade, resolution, or a good plan for how to proceed if it
> > > > > >    is more complicated).
> > > > > >
> > > > > >    5. Quality and stability of new features: In order to have a short
> > > > > >    feature freeze phase, we encourage developers to only merge
> > > > > >    well-tested and documented features. In our experience, the
> > > > > >    feature freeze works best if new features are complete, and the
> > > > > >    community can focus fully on addressing newly found bugs and
> > > > > >    voting on the release.
> > > > > >    By having a smooth release process, the merge window for the next
> > > > > >    release will come sooner.
> > > > > >
> > > > > >
> > > > > > Let me know what you think about our items, and share which features
> > > > > > you want in Flink 1.12.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Robert & Dian
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best Regards,
> > > > Harold Miao
> > > >
> > >
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Seth Wiesman <sj...@gmail.com>.

@David Anderson <da...@alpinegizmo.com>
I think we can get that into 1.12. I have a branch that just needs to be
cleaned up.

+1 for https://issues.apache.org/jira/browse/FLINK-13095

On Mon, Aug 17, 2020 at 2:03 AM Till Rohrmann <tr...@apache.org> wrote:

> Hi Rodrigo,
>
> FLINK-10407 has not been up to date with the actual progress on the
> feature. We will update it soon.
>
> I think that the community won't actively work on FLINK-12002 in this
> release. However, we will work on FLINK-16430 [1] and once this is done
> continue with the clean up of the legacy scheduler code paths.
>
> As for the scope of the 1.12 release, we hope to finish FLINK-16430 and
> make considerable progress with FLINK-10407. However, I don't think that
> FLINK-10407 will be fully functional yet.
>
> [1] https://issues.apache.org/jira/browse/FLINK-16430
>
> Cheers,
> Till
>
> On Sat, Aug 15, 2020 at 6:43 PM rodrigobrochado <
> rodrigo.brochado@predito.com.br> wrote:
>
> > Thanks Dian!
> >
> > About the wiki page, I think that the "Reactive-scaling mode" by Till has
> > an open issue on FLINK-10407 [1].
> >
> > Still about scaling, what about the adaptive parallelism of jobs [2]?
> > This is somewhat related to Prasanna and Harold's comments above. It seems
> > to depend on finishing the old "Redesign Flink Scheduling" [3], FLIP-119
> > (already on wiki list), and FLINK-15626 (Removal of legacy scheduler) [4].
> > Would they be achievable by 1.12?
> >
> > If I could add one more, Python UDFs in Docker mode would be awesome
> > [5].
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-10407
> > [2] https://issues.apache.org/jira/browse/FLINK-12002
> > [3] https://issues.apache.org/jira/browse/FLINK-10429
> > [4] https://issues.apache.org/jira/browse/FLINK-15626
> > [5] https://issues.apache.org/jira/browse/FLINK-14025
> >
> > Thanks,
> > Rodrigo
> >
> >
> >
> > --
> > Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Till Rohrmann <tr...@apache.org>.

Hi Rodrigo,

FLINK-10407 has not been up to date with the actual progress on the
feature. We will update it soon.

I think that the community won't actively work on FLINK-12002 in this
release. However, we will work on FLINK-16430 [1] and once this is done
continue with the clean up of the legacy scheduler code paths.

As for the scope of the 1.12 release, we hope to finish FLINK-16430 and
make considerable progress with FLINK-10407. However, I don't think that
FLINK-10407 will be fully functional yet.

[1] https://issues.apache.org/jira/browse/FLINK-16430

Cheers,
Till

On Sat, Aug 15, 2020 at 6:43 PM rodrigobrochado <
rodrigo.brochado@predito.com.br> wrote:

> Thanks Dian!
>
> About the wiki page, I think that the "Reactive-scaling mode" by Till has
> an open issue on FLINK-10407 [1].
>
> Still about scaling, what about the adaptive parallelism of jobs [2]? This
> is somewhat related to Prasanna and Harold's comments above. It seems to
> depend on finishing the old "Redesign Flink Scheduling" [3], FLIP-119
> (already on wiki list), and FLINK-15626 (Removal of legacy scheduler) [4].
> Would they be achievable by 1.12?
>
> If I could add one more, Python UDFs in Docker mode would be awesome
> [5].
>
> [1] https://issues.apache.org/jira/browse/FLINK-10407
> [2] https://issues.apache.org/jira/browse/FLINK-12002
> [3] https://issues.apache.org/jira/browse/FLINK-10429
> [4] https://issues.apache.org/jira/browse/FLINK-15626
> [5] https://issues.apache.org/jira/browse/FLINK-14025
>
> Thanks,
> Rodrigo
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>

Re: [DISCUSS] Planning Flink 1.12

Posted by rodrigobrochado <ro...@predito.com.br>.

Thanks Dian!

About the wiki page, I think that the "Reactive-scaling mode" by Till has an
open issue on FLINK-10407 [1].

Still about scaling, what about the adaptive parallelism of jobs [2]? This
is somewhat related to Prasanna and Harold's comments above. It seems to
depend on finishing the old "Redesign Flink Scheduling" [3], FLIP-119
(already on wiki list), and FLINK-15626 (Removal of legacy scheduler) [4].
Would they be achievable by 1.12?

If I could add one more, Python UDFs in Docker mode would be awesome [5].

[1] https://issues.apache.org/jira/browse/FLINK-10407
[2] https://issues.apache.org/jira/browse/FLINK-12002
[3] https://issues.apache.org/jira/browse/FLINK-10429
[4] https://issues.apache.org/jira/browse/FLINK-15626
[5] https://issues.apache.org/jira/browse/FLINK-14025

Thanks,
Rodrigo



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/

Re: [DISCUSS] Planning Flink 1.12

Posted by David Anderson <da...@alpinegizmo.com>.

What about filling in at least the most critical holes in the State
Processor API? As it stands now, many of its supposed benefits (e.g.,
migrate between state backends, change max parallelism) can't be realized
for applications that use windows.

https://issues.apache.org/jira/browse/FLINK-13095 - Provide an easy way to
read / bootstrap window state

I realize that some rework will also be needed soon to move the
implementation away from DataSet, but I hope we can see some usability
improvements in 1.12.

Regards,
David


On Fri, Aug 14, 2020 at 8:37 PM Steven Wu <st...@gmail.com> wrote:

> What about the work of migrating some Flink sources to the new FLIP-27
> source interface? They are not listed in the 1.12 release wiki page.
>
> On Thu, Aug 13, 2020 at 6:51 PM Dian Fu <di...@gmail.com> wrote:
>
> > Hi Rodrigo,
> >
> > Both FLIP-130 and FLIP-133 will be on the list for 1.12. Besides, there
> > are also some other features from the PyFlink side in 1.12. More details
> > can be found on the wiki page
> > (https://cwiki.apache.org/confluence/display/FLINK/1.12+Release).
> >
> > Regards,
> > Dian
> >
> > > On Aug 14, 2020, at 9:37 AM, rodrigobrochado <ro...@predito.com.br>
> > > wrote:
> > >
> > > Hi,
> > >
> > > I hope it's not too late to ask, but would FLIP-130 [1] and FLIP-133 [2]
> > > be considered? I think that it would be nice to have some details of the
> > > PyFlink DataStream API (FLIP-130) on the roadmap, giving us (users) more
> > > insight into what we can expect from PyFlink in the near future.
> > >
> > >
> > > [1]
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-130-Support-for-Python-DataStream-API-Stateless-Part-td43035.html
> > > [2]
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-133-Rework-PyFlink-Documentation-tt43570.html
> > >
> > >
> > > Thanks,
> > > Rodrigo
> > >
> > >
> > >
> > > --
> > > Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> >
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Steven Wu <st...@gmail.com>.

What about the work of migrating some Flink sources to the new FLIP-27
source interface? They are not listed in the 1.12 release wiki page.

On Thu, Aug 13, 2020 at 6:51 PM Dian Fu <di...@gmail.com> wrote:

> Hi Rodrigo,
>
> Both FLIP-130 and FLIP-133 will be on the list for 1.12. Besides, there are
> also some other features from the PyFlink side in 1.12. More details can be
> found on the wiki page
> (https://cwiki.apache.org/confluence/display/FLINK/1.12+Release).
>
> Regards,
> Dian
>
> > On Aug 14, 2020, at 9:37 AM, rodrigobrochado <ro...@predito.com.br>
> > wrote:
> >
> > Hi,
> >
> > I hope it's not too late to ask, but would FLIP-130 [1] and FLIP-133 [2]
> > be considered? I think that it would be nice to have some details of the
> > PyFlink DataStream API (FLIP-130) on the roadmap, giving us (users) more
> > insight into what we can expect from PyFlink in the near future.
> >
> >
> > [1]
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-130-Support-for-Python-DataStream-API-Stateless-Part-td43035.html
> > [2]
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-133-Rework-PyFlink-Documentation-tt43570.html
> >
> >
> > Thanks,
> > Rodrigo
> >
> >
> >
> > --
> > Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Dian Fu <di...@gmail.com>.

Hi Rodrigo,

Both FLIP-130 and FLIP-133 will be on the list for 1.12. Besides, there are also some other features from the PyFlink side in 1.12. More details can be found on the wiki page (https://cwiki.apache.org/confluence/display/FLINK/1.12+Release).

Regards,
Dian

> On Aug 14, 2020, at 9:37 AM, rodrigobrochado <ro...@predito.com.br> wrote:
>
> Hi,
>
> I hope it's not too late to ask, but would FLIP-130 [1] and FLIP-133 [2] be
> considered? I think that it would be nice to have some details of the PyFlink
> DataStream API (FLIP-130) on the roadmap, giving us (users) more insight
> into what we can expect from PyFlink in the near future.
> 
> 
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-130-Support-for-Python-DataStream-API-Stateless-Part-td43035.html
> [2]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-133-Rework-PyFlink-Documentation-tt43570.html
> 
> 
> Thanks,
> Rodrigo
> 
> 
> 
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/

Re: [DISCUSS] Planning Flink 1.12

Posted by rodrigobrochado <ro...@predito.com.br>.

Hi,

I hope it's not too late to ask, but would FLIP-130 [1] and FLIP-133 [2] be
considered? I think that it would be nice to have some details of the PyFlink
DataStream API (FLIP-130) on the roadmap, giving us (users) more insight
into what we can expect from PyFlink in the near future.


[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-130-Support-for-Python-DataStream-API-Stateless-Part-td43035.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-133-Rework-PyFlink-Documentation-tt43570.html


Thanks,
Rodrigo



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/

Re: [DISCUSS] Planning Flink 1.12

Posted by Robert Metzger <rm...@apache.org>.

I updated the release date in the Wiki page.

On Sun, Aug 9, 2020 at 8:18 PM Yun Tang <my...@live.com> wrote:

> +1 for extending the feature freeze due date.
> ________________________________
> From: Zhijiang <wa...@aliyun.com.INVALID>
> Sent: Thursday, August 6, 2020 17:05
> To: dev <de...@flink.apache.org>
> Subject: Re: [DISCUSS] Planning Flink 1.12
>
> +1 on my side for feature freeze date by the end of Oct.
>
>
> ------------------------------------------------------------------
> From:Yuan Mei <yu...@gmail.com>
> Send Time: Thursday, August 6, 2020 14:54
> To:dev <de...@flink.apache.org>
> Subject:Re: [DISCUSS] Planning Flink 1.12
>
> +1
>
> > +1 for extending the feature freeze date to the end of October.
>
>
>
> On Thu, Aug 6, 2020 at 12:08 PM Yu Li <ca...@gmail.com> wrote:
>
> > +1 for extending feature freeze date to end of October.
> >
> > Feature development in the master branch could be unblocked through
> > creating the release branch, but every coin has its two sides (smile)
> >
> > Best Regards,
> > Yu
> >
> >
> > On Wed, 5 Aug 2020 at 20:12, Robert Metzger <rm...@apache.org> wrote:
> >
> > > Thanks all for your opinion.
> > >
> > > @Chesnay: That is a risk, but I hope the people responsible for
> > > individual FLIPs plan accordingly. Extending the time till the feature
> > > freeze should not mean that we are extending the scope of the release.
> > > Ideally, features are done before Flink Forward (FF), and they use the
> > > time till the freeze for additional testing and documentation polishing.
> > > This FF will be virtual, so there should be less disruption than with a
> > > physical conference with all the travelling.
> > > Do you have a different proposal for the timing?
> > >
> > >
> > > I'm currently considering splitting the feature freeze and the release
> > > branch creation. Similar to the Linux kernel development, we could have
> > > a "merge window" and a stabilization phase. At the end of the
> > > stabilization phase, we cut the release branch and open the next merge
> > > window (I'll start a separate thread regarding this towards the end of
> > > this release cycle, if I still like the idea then).
> > >
> > >
> > > On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler <ch...@apache.org>
> > > wrote:
> > >
> > > > I'm a bit concerned about end of October, because it means we have
> > > > Flink Forward, which usually means at least 1 week of little-to-no
> > > > activity, and then 1 week until feature-freeze.
> > > >
> > > > On 05/08/2020 11:56, jincheng sun wrote:
> > > > > +1 for end of October from me as well.
> > > > >
> > > > > Best,
> > > > > Jincheng
> > > > >
> > > > >
> > > > > On Wed, Aug 5, 2020 at 4:59 PM Kostas Kloudas <kk...@gmail.com> wrote:
> > > > >
> > > > >> +1 for end of October from me as well.
> > > > >>
> > > > >> Cheers,
> > > > >> Kostas
> > > > >>
> > > > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <trohrmann@apache.org>
> > > > >> wrote:
> > > > >>
> > > > >>> +1 for end of October from my side as well.
> > > > >>>
> > > > >>> Cheers,
> > > > >>> Till
> > > > >>>
> > > > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> The end of October sounds good from my side, unless it collides
> > > > >>>> with some holidays that affect many committers.
> > > > >>>>
> > > > >>>> Feature-wise, I believe we can definitely make good use of the
> > > > >>>> time to wrap up some critical threads (like finishing the FLIP-27
> > > > >>>> source efforts).
> > > > >>>>
> > > > >>>> So +1 to the end of October from my side.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>> Stephan
> > > > >>>>
> > > > >>>>
> > > > >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <rmetzger@apache.org>
> > > > >>>> wrote:
> > > > >>>>> Thanks a lot for commenting on the feature freeze date.
> > > > >>>>>
> > > > >>>>> You are raising a few good points on the timing.
> > > > >>>>> If we already have concerns regarding the deadline (2 months
> > > > >>>>> before it), then I agree that we should move it to the end of
> > > > >>>>> October.
> > > > >>>>>
> > > > >>>>> We then just need to be careful not to run into the Christmas
> > > > >>>>> season at the end of December.
> > > > >>>>>
> > > > >>>>> If nobody objects within a few days, I'll update the feature
> > > > >>>>> freeze date in the Wiki.
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Regarding setting the feature freeze date to late September, I
> > > > >>>>>> am concerned that it might make the development time of 1.12
> > > > >>>>>> too short.
> > > > >>>>>>
> > > > >>>>>> One reason is that we took too much time (about 1.5 months,
> > > > >>>>>> from mid-May to the beginning of July) for testing 1.11. That is
> > > > >>>>>> not ideal, but further squeezing the development time of 1.12
> > > > >>>>>> won't make it better.
> > > > >>>>>> Besides, AFAIK July and August are also a popular vacation
> > > > >>>>>> season for Europeans. Given that most committers of Flink come
> > > > >>>>>> from Europe, I think we should also take this into
> > > > >>>>>> consideration.
> > > > >>>>>>
> > > > >>>>>> It's also true that the first week of October is the national
> > > > >>>>>> holiday of China, so I'm wondering whether the end of October
> > > > >>>>>> could be a candidate feature freeze date.
> > > > >>>>>> Best,
> > > > >>>>>> Kurt
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <
> > > > >> rmetzger@apache.org>
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> Hi all,
> > > > >>>>>>>
> > > > >>>>>>> Thanks a lot for the responses so far. I've put them into
> this
> > > > >> Wiki
> > > > >>>>> page:
> > > > >>>>>>>
> https://cwiki.apache.org/confluence/display/FLINK/1.12+Release
> > > > >> to
> > > > >>>> keep
> > > > >>>>>>> track of them. Ideally, post JIRA tickets for your feature,
> > then
> > > > >>> the
> > > > >>>>>> status
> > > > >>>>>>> will update automatically in the wiki :)
> > > > >>>>>>>
> > > > >>>>>>> Please keep posting features here, or add them to the Wiki
> > > > >> yourself
> > > > >>>> 🙏
> > > > >>>>>>> @Prasanna kumar <pr...@gmail.com>: Dynamic
> Auto
> > > > >>>> Scaling
> > > > >>>>>> is a
> > > > >>>>>>> feature request the community is well-aware of. Till has
> posted
> > > > >>>>>>> "Reactive-scaling mode" as a feature he's working on for the
> > 1.12
> > > > >>>>>> release.
> > > > >>>>>>> This work will introduce the basic building blocks and
> partial
> > > > >>>> support
> > > > >>>>>> for
> > > > >>>>>>> the feature you are requesting.
> > > > >>>>>>> Proper support for dynamic scaling, while maintaining Flink's
> > > > >> high
> > > > >>>>>>> performance (throughput, low latency) and correctness is a
> > > > >>> difficult
> > > > >>>>> task
> > > > >>>>>>> that needs a lot of work. It will probably take a little bit
> of
> > > > >>> time
> > > > >>>>> till
> > > > >>>>>>> this is fully available.
> > > > >>>>>>>
> > > > >>>>>>> Cheers,
> > > > >>>>>>> Robert
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <
> > > > >>> trohrmann@apache.org>
> > > > >>>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Thanks for being our release managers for the 1.12 release
> > > > >> Dian &
> > > > >>>>>> Robert!
> > > > >>>>>>>> Here are some features I would like to work on for this
> > > > >> release:
> > > > >>>>>>>> # Features
> > > > >>>>>>>>
> > > > >>>>>>>> ## Finishing pipelined region scheduling (
> > > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-16430)
> > > > >>>>>>>> With the pipelined region scheduler we want to implement a
> > > > >>>> scheduler
> > > > >>>>>>> which
> > > > >>>>>>>> can serve streaming as well as batch workloads alike while
> > > > >> being
> > > > >>>> able
> > > > >>>>>> to
> > > > >>>>>>>> run jobs under constrained resources. The latter is
> > > > >> particularly
> > > > >>>>>>> important
> > > > >>>>>>>> for bounded streaming jobs which, currently, are not well
> > > > >>>> supported.
> > > > >>>>>>>> ## Reactive-scaling mode
> > > > >>>>>>>> Being able to react to newly available resources and
> rescaling
> > > > >> a
> > > > >>>>>> running
> > > > >>>>>>>> job accordingly will make Flink's operation much easier
> > because
> > > > >>>>>> resources
> > > > >>>>>>>> can then be controlled by an external tool (e.g. GCP
> > > > >> autoscaling,
> > > > >>>> K8s
> > > > >>>>>>>> horizontal pod scaler, etc.). In this release we want to
> make
> > a
> > > > >>> big
> > > > >>>>>> step
> > > > >>>>>>>> towards this direction. As a first step we want to support
> the
> > > > >>>>>> execution
> > > > >>>>>>> of
> > > > >>>>>>>> jobs with a parallelism which is lower than the specified
> > > > >>>> parallelism
> > > > >>>>>> in
> > > > >>>>>>>> case that Flink lost a TaskManager or could not acquire
> enough
> > > > >>>>>> resources.
> > > > >>>>>>>> # Maintenance/Stability
> > > > >>>>>>>>
> > > > >>>>>>>> ## JM / TM finished task reconciliation (
> > > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-17075)
> > > > >>>>>>>> This prevents the system from going out of sync if a task
> > state
> > > > >>>>> change
> > > > >>>>>>> from
> > > > >>>>>>>> the TM to the JM is lost.
> > > > >>>>>>>>
> > > > >>>>>>>> ## Make metrics services work with Kubernetes deployments (
> > > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-11127)
> > > > >>>>>>>> Invert the direction in which the MetricFetcher connects to
> > the
> > > > >>>>>>>> MetricQueryFetchers. That way it will no longer be necessary
> > to
> > > > >>>>> expose
> > > > >>>>>> on
> > > > >>>>>>>> K8s for every TaskManager a port on which the
> > > > >> MetricQueryFetcher
> > > > >>>>> runs.
> > > > >>>>>>> This
> > > > >>>>>>>> will then make the deployment of Flink clusters on K8s
> easier.
> > > > >>>>>>>>
> > > > >>>>>>>> ## Handle long-blocking operations during job submission (savepoint
> > > > >>>>>>>> restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > > > >>>>>>>> Submitting a Flink job can involve interaction with external systems
> > > > >>>>>>>> (blocking operations). Depending on the job, these interactions can take
> > > > >>>>>>>> so long that they exceed the submission timeout, which reports a failure
> > > > >>>>>>>> on the client side even though the actual submission succeeded. By
> > > > >>>>>>>> decoupling the creation of the ExecutionGraph from the job submission, we
> > > > >>>>>>>> can make the job submission non-blocking, which will solve this problem.
> > > > >>>>>>>>
> > > > >>>>>>>> ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-15679)
> > > > >>>>>>>> By making the internal Flink IDs compositional or logging how they belong
> > > > >>>>>>>> together, we can make the debugging of Flink's operations much easier.
> > > > >>>>>>>> Cheers,
> > > > >>>>>>>> Till
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <
> > > > >>>> felixzhengcb@gmail.com
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi All,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for bringing up this discussion, Robert!
> > > > >>>>>>>>> Congratulations on becoming the release managers of 1.12, Dian and Robert!
> > > > >>>>>>>>> ----------
> > > > >>>>>>>>> Here are some of my thoughts of the features for native
> > > > >>>> integration
> > > > >>>>>>> with
> > > > >>>>>>>>> Kubernetes in Flink 1.12:
> > > > >>>>>>>>>
> > > > >>>>>>>>> 1. Support user-specified pod templates
> > > > >>>>>>>>>      Description:
> > > > >>>>>>>>>      The current approach of introducing a new configuration option for
> > > > >>>>>>>>> each aspect of the pod specification a user might want is becoming
> > > > >>>>>>>>> unwieldy: we have to maintain more and more Flink-side Kubernetes
> > > > >>>>>>>>> configuration options, and users have to learn the gap between the
> > > > >>>>>>>>> declarative model used by Kubernetes and the configuration model used by
> > > > >>>>>>>>> Flink. It would be a great improvement to allow users to specify pod
> > > > >>>>>>>>> templates as the central place for all customization needs for the
> > > > >>>>>>>>> jobmanager and taskmanager pods.
> > > > >>>>>>>>>      Benefits:
> > > > >>>>>>>>>      Users can leverage many of the advanced K8s features that the Flink
> > > > >>>>>>>>> community does not support explicitly, such as volume mounting, DNS
> > > > >>>>>>>>> configuration, pod affinity/anti-affinity settings, etc.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 2. Support running PyFlink on Kubernetes
> > > > >>>>>>>>>      Description:
> > > > >>>>>>>>>      Support running PyFlink on Kubernetes, including session clusters and
> > > > >>>>>>>>> application clusters.
> > > > >>>>>>>>>      Benefits:
> > > > >>>>>>>>>      Run Python applications in a containerized environment.
> > > > >>>>>>>>> 3. Support built-in init-Container
> > > > >>>>>>>>>      Description:
> > > > >>>>>>>>>      We need a built-in init-Container to help solve dependency management
> > > > >>>>>>>>> in a containerized environment, especially in application mode.
> > > > >>>>>>>>>      Benefits:
> > > > >>>>>>>>>      Separate the base Flink image from dynamic dependencies.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 4. Support accessing secured services via K8s secrets
> > > > >>>>>>>>>      Description:
> > > > >>>>>>>>>      Kubernetes Secrets
> > > > >>>>>>>>> <https://kubernetes.io/docs/concepts/configuration/secret/> can be used to
> > > > >>>>>>>>> provide credentials for a Flink application to access secured services.
> > > > >>>>>>>>> This helps people who want to use a user-specified K8s Secret through an
> > > > >>>>>>>>> environment variable.
> > > > >>>>>>>>>      Benefits:
> > > > >>>>>>>>>      Improve user experience.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 5. Support configuring replicas of the JobManager Deployment in ZooKeeper
> > > > >>>>>>>>> HA setups
> > > > >>>>>>>>>      Description:
> > > > >>>>>>>>>      Make the *replicas* of the Deployment configurable in ZooKeeper HA
> > > > >>>>>>>>> setups.
> > > > >>>>>>>>>      Benefits:
> > > > >>>>>>>>>      Achieve faster failover.
> > > > >>>>>>>>>
> > > > >>>>>>>>> 6. Support configuring a limit for the CPU requirement
> > > > >>>>>>>>>      Description:
> > > > >>>>>>>>>      Leverage the Kubernetes container CPU request/limit feature.
> > > > >>>>>>>>>      Benefits:
> > > > >>>>>>>>>      Reduce cost.
> > > > >>>>>>>>>
> > > > >>>>>>>>> Regards,
> > > > >>>>>>>>> Canbin Zheng
> > > > >>>>>>>>>
> > > > >>>>>>>>> Harold.Miao <mi...@gmail.com> wrote on Thu, Jul 23, 2020 at 12:44 PM:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> I'm excited to hear about this feature, very, very, very highly encouraged
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Prasanna kumar <pr...@gmail.com> wrote on Thu, Jul 23, 2020 at 12:10 AM:
> > > > >>>>>>>>>>> Hi Flink Dev Team,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Dynamic autoscaling based on the incoming data load would be a great
> > > > >>>>>>>>>>> feature.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> We should be able to define a rule such as: if the load increases by 20%,
> > > > >>>>>>>>>>> extra resources are added.
> > > > >>>>>>>>>>> Or a time-based rule: during peak hours, the pipeline should scale
> > > > >>>>>>>>>>> automatically by 50%.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> This will help a lot in cost reduction.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> EMR clusters provide a similar feature for Spark-based applications.
> > > > >>>>>>>>>>> Thanks,
> > > > >>>>>>>>>>> Prasanna.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <
> > > > >>>>>>> rmetzger@apache.org>
> > > > >>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> Hi all,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Now that the 1.11 release is out, it is time to plan for the next major
> > > > >>>>>>>>>>>> Flink release.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Some items:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>     1. Dian Fu and I volunteer to be the release managers for Flink 1.12.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>     2. Timeline: We propose to stick to our approximate 4-month release
> > > > >>>>>>>>>>>>     cycle, thus the release should be done by late October. Given that
> > > > >>>>>>>>>>>>     there’s a holiday week in China at the beginning of October, I propose
> > > > >>>>>>>>>>>>     to do the feature freeze on master by late September.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>     3. Collecting features: It would be good to have a rough overview of the
> > > > >>>>>>>>>>>>     features that will likely be ready to be merged by late September, and
> > > > >>>>>>>>>>>>     that we want in the release.
> > > > >>>>>>>>>>>>     Based on the discussion, we will update the Roadmap on the Flink website
> > > > >>>>>>>>>>>>     again!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>     4. Test instabilities and blockers: I would like to avoid a situation
> > > > >>>>>>>>>>>>     where we have many blocking issues or build instabilities at the time of
> > > > >>>>>>>>>>>>     the feature freeze. To achieve that, we will try to check every build
> > > > >>>>>>>>>>>>     instability within a week, to decide if it is a blocker (make sure to
> > > > >>>>>>>>>>>>     use the “test-stability” label for those tickets!)
> > > > >>>>>>>>>>>>     Blocker issues will need to have somebody assigned (responsible) within
> > > > >>>>>>>>>>>>     a week, and we want to see progress on all blocker issues (downgrade,
> > > > >>>>>>>>>>>>     resolution, a good plan how to proceed if it is more complicated)
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>     5. Quality and stability of new features: In order to have a short
> > > > >>>>>>>>>>>>     feature freeze phase, we encourage developers to only merge well-tested
> > > > >>>>>>>>>>>>     and documented features. In our experience, the feature freeze works
> > > > >>>>>>>>>>>>     best if new features are complete, and the community can focus fully on
> > > > >>>>>>>>>>>>     addressing newly found bugs and voting on the release.
> > > > >>>>>>>>>>>>     By having a smooth release process, the next merge window for the next
> > > > >>>>>>>>>>>>     release will come sooner.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Let me know what you think about our items, and share which features you
> > > > >>>>>>>>>>>> want in Flink 1.12.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Robert & Dian
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> --
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Best Regards,
> > > > >>>>>>>>>> Harold Miao
> > > > >>>>>>>>>>
> > > >
> > > >
> > >
> >
>
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Yun Tang <my...@live.com>.

+1 for extending the feature freeze due date.
________________________________
From: Zhijiang <wa...@aliyun.com.INVALID>
Sent: Thursday, August 6, 2020 17:05
To: dev <de...@flink.apache.org>
Subject: Re: [DISCUSS] Planning Flink 1.12

+1 on my side for feature freeze date by the end of Oct.


------------------------------------------------------------------
From:Yuan Mei <yu...@gmail.com>
Send Time:Thursday, August 6, 2020 14:54
To:dev <de...@flink.apache.org>
Subject:Re: [DISCUSS] Planning Flink 1.12

+1

> +1 for extending the feature freeze date to the end of October.



On Thu, Aug 6, 2020 at 12:08 PM Yu Li <ca...@gmail.com> wrote:

> +1 for extending feature freeze date to end of October.
>
> Feature development in the master branch could be unblocked through
> creating the release branch, but every coin has its two sides (smile)
>
> Best Regards,
> Yu
>
>
> On Wed, 5 Aug 2020 at 20:12, Robert Metzger <rm...@apache.org> wrote:
>
> > Thanks all for your opinion.
> >
> > @Chesnay: That is a risk, but I hope the people responsible for individual
> > FLIPs plan accordingly. Extending the time till the feature freeze should
> > not mean that we are extending the scope of the release.
> > Ideally, features are done before the feature freeze, and they use the time
> > till the freeze for additional testing and documentation polishing.
> > This Flink Forward will be virtual, so there should be less disruption than
> > with a physical conference with all the travelling.
> > Do you have a different proposal for the timing?
> >
> >
> > I'm currently considering splitting the feature freeze and the release
> > branch creation. Similar to the Linux kernel development, we could have a
> > "merge window" and a stabilization phase. At the end of the stabilization
> > phase, we cut the release branch and open the next merge window (I'll start
> > a separate thread regarding this towards the end of this release cycle, if
> > I still like the idea then).
> >
> >
> > On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler <ch...@apache.org>
> > wrote:
> >
> > > I'm a bit concerned about end of October, because it means we have Flink
> > > Forward, which usually means at least 1 week of little-to-no activity,
> > > and then 1 week until feature-freeze.
> > >
> > > On 05/08/2020 11:56, jincheng sun wrote:
> > > > +1 for end of October from me as well.
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > >
> > > > Kostas Kloudas <kk...@gmail.com> wrote on Wed, Aug 5, 2020 at 4:59 PM:
> > > >
> > > >> +1 for end of October from me as well.
> > > >>
> > > >> Cheers,
> > > >> Kostas
> > > >>
> > > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >>
> > > >>> +1 for end of October from my side as well.
> > > >>>
> > > >>> Cheers,
> > > >>> Till
> > > >>>
> > > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org>
> > wrote:
> > > >>>
> > > >>>> The end of October sounds good from my side, unless it collides with some
> > > >>>> holidays that affect many committers.
> > > >>>>
> > > >>>> Feature-wise, I believe we can definitely make good use of the time to wrap
> > > >>>> up some critical threads (like finishing the FLIP-27 source efforts).
> > > >>>>
> > > >>>> So +1 to the end of October from my side.
> > > >>>>
> > > >>>> Best,
> > > >>>> Stephan
> > > >>>>
> > > >>>>
> > > >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <
> rmetzger@apache.org>
> > > >>> wrote:
> > > >>>>> Thanks a lot for commenting on the feature freeze date.
> > > >>>>>
> > > >>>>> You are raising a few good points on the timing.
> > > >>>>> If we already have concerns regarding the deadline two months in advance,
> > > >>>>> then I agree that we should move it to the end of October.
> > > >>>>>
> > > >>>>> We then just need to be careful not to run into the Christmas season at the
> > > >>>>> end of December.
> > > >>>>>
> > > >>>>> If nobody objects within a few days, I'll update the feature freeze date in
> > > >>>>> the Wiki.
> > > >>>>>
> > > >>>>>
> > > >>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com>
> > wrote:
> > > >>>>>
> > > >>>>>> Regarding setting the feature freeze date to late September, I have some
> > > >>>>>> concern that it might make the development time of 1.12 too short.
> > > >>>>>>
> > > >>>>>> One reason for this is that we took too much time (about 1.5 months, from
> > > >>>>>> mid-May to the beginning of July) for testing 1.11. That's not ideal, but
> > > >>>>>> further squeezing the development time of 1.12 won't make this better.
> > > >>>>>> Besides, AFAIK July and August are also a popular vacation season for
> > > >>>>>> Europeans. Given that most Flink committers come from Europe, I think we
> > > >>>>>> should also take this into consideration.
> > > >>>>>>
> > > >>>>>> It's also true that the first week of October is the national holiday of
> > > >>>>>> China, so I'm wondering whether the end of October could be a candidate
> > > >>>>>> feature freeze date.
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Kurt
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <
> > > >> rmetzger@apache.org>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hi all,
> > > >>>>>>>
> > > >>>>>>> Thanks a lot for the responses so far. I've put them into this Wiki page:
> > > >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep
> > > >>>>>>> track of them. Ideally, post JIRA tickets for your feature, then the status
> > > >>>>>>> will update automatically in the wiki :)
> > > >>>>>>>
> > > >>>>>>> Please keep posting features here, or add them to the Wiki yourself 🙏
> > > >>>>>>>
> > > >>>>>>> @Prasanna kumar <pr...@gmail.com>: Dynamic Auto Scaling is a
> > > >>>>>>> feature request the community is well-aware of. Till has posted
> > > >>>>>>> "Reactive-scaling mode" as a feature he's working on for the 1.12 release.
> > > >>>>>>> This work will introduce the basic building blocks and partial support for
> > > >>>>>>> the feature you are requesting.
> > > >>>>>>> Proper support for dynamic scaling, while maintaining Flink's high
> > > >>>>>>> performance (throughput, low latency) and correctness, is a difficult task
> > > >>>>>>> that needs a lot of work. It will probably take a little bit of time till
> > > >>>>>>> this is fully available.
> > > >>>>>>>
> > > >>>>>>> Cheers,
> > > >>>>>>> Robert
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <
> > > >>> trohrmann@apache.org>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Thanks for being our release managers for the 1.12 release, Dian & Robert!
> > > >>>>>>>> Here are some features I would like to work on for this release:
> > > >>>>>>>> # Features
> > > >>>>>>>>
> > > >>>>>>>> ## Finishing pipelined region scheduling (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-16430)
> > > >>>>>>>> With the pipelined region scheduler we want to implement a scheduler which
> > > >>>>>>>> can serve streaming as well as batch workloads alike while being able to
> > > >>>>>>>> run jobs under constrained resources. The latter is particularly important
> > > >>>>>>>> for bounded streaming jobs which, currently, are not well supported.
> > > >>>>>>>> ## Reactive-scaling mode
> > > >>>>>>>> Being able to react to newly available resources and rescale a running
> > > >>>>>>>> job accordingly will make Flink's operation much easier because resources
> > > >>>>>>>> can then be controlled by an external tool (e.g. GCP autoscaling, K8s
> > > >>>>>>>> Horizontal Pod Autoscaler, etc.). In this release we want to make a big
> > > >>>>>>>> step in this direction. As a first step, we want to support the execution
> > > >>>>>>>> of jobs with a parallelism lower than the specified parallelism in case
> > > >>>>>>>> Flink loses a TaskManager or cannot acquire enough resources.
> > > >>>>>>>> # Maintenance/Stability
> > > >>>>>>>>
> > > >>>>>>>> ## JM / TM finished task reconciliation (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-17075)
> > > >>>>>>>> This prevents the system from going out of sync if a task state change
> > > >>>>>>>> from the TM to the JM is lost.
> > > >>>>>>>>
> > > >>>>>>>> ## Make metrics services work with Kubernetes deployments (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-11127)
> > > >>>>>>>> Invert the direction in which the MetricFetcher connects to the
> > > >>>>>>>> MetricQueryFetchers. That way it will no longer be necessary to expose a
> > > >>>>>>>> port on K8s for every TaskManager on which the MetricQueryFetcher runs.
> > > >>>>>>>> This will then make the deployment of Flink clusters on K8s easier.
> > > >>>>>>>>
> > > >>>>>>>> ## Handle long-blocking operations during job submission (savepoint
> > > >>>>>>>> restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > > >>>>>>>> Submitting a Flink job can involve interaction with external systems
> > > >>>>>>>> (blocking operations). Depending on the job, these interactions can take
> > > >>>>>>>> so long that they exceed the submission timeout, which reports a failure
> > > >>>>>>>> on the client side even though the actual submission succeeded. By
> > > >>>>>>>> decoupling the creation of the ExecutionGraph from the job submission, we
> > > >>>>>>>> can make the job submission non-blocking, which will solve this problem.
> > > >>>>>>>>
> > > >>>>>>>> ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-15679)
> > > >>>>>>>> By making the internal Flink IDs compositional or logging how they belong
> > > >>>>>>>> together, we can make the debugging of Flink's operations much easier.
> > > >>>>>>>> Cheers,
> > > >>>>>>>> Till
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <
> > > >>>> felixzhengcb@gmail.com
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi All,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for bringing up this discussion, Robert!
> > > >>>>>>>>> Congratulations on becoming the release managers of 1.12, Dian and Robert!
> > > >>>>>>>>> ----------
> > > >>>>>>>>> Here are some of my thoughts of the features for native
> > > >>>> integration
> > > >>>>>>> with
> > > >>>>>>>>> Kubernetes in Flink 1.12:
> > > >>>>>>>>>
> > > >>>>>>>>> 1. Support user-specified pod templates
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      The current approach of introducing a new configuration option for
> > > >>>>>>>>> each aspect of the pod specification a user might want is becoming
> > > >>>>>>>>> unwieldy: we have to maintain more and more Flink-side Kubernetes
> > > >>>>>>>>> configuration options, and users have to learn the gap between the
> > > >>>>>>>>> declarative model used by Kubernetes and the configuration model used by
> > > >>>>>>>>> Flink. It would be a great improvement to allow users to specify pod
> > > >>>>>>>>> templates as the central place for all customization needs for the
> > > >>>>>>>>> jobmanager and taskmanager pods.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Users can leverage many of the advanced K8s features that the Flink
> > > >>>>>>>>> community does not support explicitly, such as volume mounting, DNS
> > > >>>>>>>>> configuration, pod affinity/anti-affinity settings, etc.
> > > >>>>>>>>>
> > > >>>>>>>>> 2. Support running PyFlink on Kubernetes
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Support running PyFlink on Kubernetes, including session clusters and
> > > >>>>>>>>> application clusters.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Run Python applications in a containerized environment.
> > > >>>>>>>>> 3. Support built-in init-Container
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      We need a built-in init-Container to help solve dependency management
> > > >>>>>>>>> in a containerized environment, especially in application mode.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Separate the base Flink image from dynamic dependencies.
> > > >>>>>>>>>
> > > >>>>>>>>> 4. Support accessing secured services via K8s secrets
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Kubernetes Secrets
> > > >>>>>>>>> <https://kubernetes.io/docs/concepts/configuration/secret/> can be used to
> > > >>>>>>>>> provide credentials for a Flink application to access secured services.
> > > >>>>>>>>> This helps people who want to use a user-specified K8s Secret through an
> > > >>>>>>>>> environment variable.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Improve user experience.
> > > >>>>>>>>>
> > > >>>>>>>>> 5. Support configuring replicas of the JobManager Deployment in ZooKeeper
> > > >>>>>>>>> HA setups
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Make the *replicas* of the Deployment configurable in ZooKeeper HA
> > > >>>>>>>>> setups.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Achieve faster failover.
> > > >>>>>>>>>
> > > >>>>>>>>> 6. Support configuring a limit for the CPU requirement
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Leverage the Kubernetes container CPU request/limit feature.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Reduce cost.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Canbin Zheng
> > > >>>>>>>>>
> > > >>>>>>>>> Harold.Miao <mi...@gmail.com> wrote on Thu, Jul 23, 2020 at 12:44 PM:
> > > >>>>>>>>>
> > > >>>>>>>>>> I'm excited to hear about this feature, very, very, very highly encouraged
> > > >>>>>>>>>>
> > > >>>>>>>>>> Prasanna kumar <pr...@gmail.com> wrote on Thu, Jul 23, 2020 at 12:10 AM:
> > > >>>>>>>>>>> Hi Flink Dev Team,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Dynamic autoscaling based on the incoming data load would be a great
> > > >>>>>>>>>>> feature.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> We should be able to define a rule such as: if the load increases by 20%,
> > > >>>>>>>>>>> extra resources are added.
> > > >>>>>>>>>>> Or a time-based rule: during peak hours, the pipeline should scale
> > > >>>>>>>>>>> automatically by 50%.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> This will help a lot in cost reduction.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> EMR clusters provide a similar feature for Spark-based applications.
> > > >>>>>>>>>>> Thanks,
> > > >>>>>>>>>>> Prasanna.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <
> > > >>>>>>> rmetzger@apache.org>
> > > >>>>>>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi all,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Now that the 1.11 release is out, it is time to plan for the next major
> > > >>>>>>>>>>>> Flink release.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Some items:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     1. Dian Fu and I volunteer to be the release managers for Flink 1.12.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     2. Timeline: We propose to stick to our approximate 4-month release
> > > >>>>>>>>>>>>     cycle, thus the release should be done by late October. Given that
> > > >>>>>>>>>>>>     there’s a holiday week in China at the beginning of October, I propose
> > > >>>>>>>>>>>>     to do the feature freeze on master by late September.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     3. Collecting features: It would be good to have a rough overview of the
> > > >>>>>>>>>>>>     features that will likely be ready to be merged by late September, and
> > > >>>>>>>>>>>>     that we want in the release.
> > > >>>>>>>>>>>>     Based on the discussion, we will update the Roadmap on the Flink website
> > > >>>>>>>>>>>>     again!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     4. Test instabilities and blockers: I would like to avoid a situation
> > > >>>>>>>>>>>>     where we have many blocking issues or build instabilities at the time of
> > > >>>>>>>>>>>>     the feature freeze. To achieve that, we will try to check every build
> > > >>>>>>>>>>>>     instability within a week, to decide if it is a blocker (make sure to
> > > >>>>>>>>>>>>     use the “test-stability” label for those tickets!)
> > > >>>>>>>>>>>>     Blocker issues will need to have somebody assigned (responsible) within
> > > >>>>>>>>>>>>     a week, and we want to see progress on all blocker issues (downgrade,
> > > >>>>>>>>>>>>     resolution, a good plan how to proceed if it is more complicated)
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     5. Quality and stability of new features: In order to have a short
> > > >>>>>>>>>>>>     feature freeze phase, we encourage developers to only merge well-tested
> > > >>>>>>>>>>>>     and documented features. In our experience, the feature freeze works
> > > >>>>>>>>>>>>     best if new features are complete, and the community can focus fully on
> > > >>>>>>>>>>>>     addressing newly found bugs and voting on the release.
> > > >>>>>>>>>>>>     By having a smooth release process, the next merge window for the next
> > > >>>>>>>>>>>>     release will come sooner.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Let me know what you think about our items, and share which features you
> > > >>>>>>>>>>>> want in Flink 1.12.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Robert & Dian
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> --
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best Regards,
> > > >>>>>>>>>> Harold Miao
> > > >>>>>>>>>>
> > >
> > >
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Zhijiang <wa...@aliyun.com.INVALID>.

+1 on my side for feature freeze date by the end of Oct.


------------------------------------------------------------------
From:Yuan Mei <yu...@gmail.com>
Send Time:Thursday, August 6, 2020 14:54
To:dev <de...@flink.apache.org>
Subject:Re: [DISCUSS] Planning Flink 1.12

+1

> +1 for extending the feature freeze date to the end of October.



On Thu, Aug 6, 2020 at 12:08 PM Yu Li <ca...@gmail.com> wrote:

> +1 for extending feature freeze date to end of October.
>
> Feature development in the master branch could be unblocked through
> creating the release branch, but every coin has its two sides (smile)
>
> Best Regards,
> Yu
>
>
> On Wed, 5 Aug 2020 at 20:12, Robert Metzger <rm...@apache.org> wrote:
>
> > Thanks all for your opinion.
> >
> > @Chesnay: That is a risk, but I hope the people responsible for
> individual
> > FLIPs plan accordingly. Extending the time till the feature freeze should
> > not mean that we are extending the scope of the release.
> > Ideally, features are done before FF, and they use the time till the
> freeze
> > for additional testing and documentation polishing.
> > This Flink Forward will be virtual, so there should be less disruption than a physical
> > conference with all the travelling.
> > Do you have a different proposal for the timing?
> >
> >
> > I'm currently considering splitting the feature freeze and the release
> > branch creation. Similar to the Linux kernel development, we could have a
> > "merge window" and a stabilization phase. At the end of the stabilization
> > phase, we cut the release branch and open the next merge window (I'll
> start
> > a separate thread regarding this towards the end of this release cycle,
> if
> > I still like the idea then)
> >
> >
> > On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler <ch...@apache.org>
> > wrote:
> >
> > > I'm a bit concerned about end of October, because it means we have
> Flink
> > > > Forward, which usually means at least 1 week of little-to-no activity,
> > > and then 1 week until feature-freeze.
> > >
> > > On 05/08/2020 11:56, jincheng sun wrote:
> > > > +1 for end of October from me as well.
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > >
> > > > > On Wed, Aug 5, 2020 at 4:59 PM Kostas Kloudas <kk...@gmail.com> wrote:
> > > >
> > > >> +1 for end of October from me as well.
> > > >>
> > > >> Cheers,
> > > >> Kostas
> > > >>
> > > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >>
> > > >>> +1 for end of October from my side as well.
> > > >>>
> > > >>> Cheers,
> > > >>> Till
> > > >>>
> > > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org>
> > wrote:
> > > >>>
> > > >>>> The end of October sounds good from my side, unless it collides
> with
> > > >> some
> > > >>>> holidays that affect many committers.
> > > >>>>
> > > >>>> Feature-wise, I believe we can definitely make good use of the
> time
> > to
> > > >>> wrap
> > > >>>> up some critical threads (like finishing the FLIP-27 source
> > efforts).
> > > >>>>
> > > >>>> So +1 to the end of October from my side.
> > > >>>>
> > > >>>> Best,
> > > >>>> Stephan
> > > >>>>
> > > >>>>
> > > >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <
> rmetzger@apache.org>
> > > >>> wrote:
> > > >>>>> Thanks a lot for commenting on the feature freeze date.
> > > >>>>>
> > > >>>>> You are raising a few good points on the timing.
> > > >>>>> If we have already (2 months before) concerns regarding the
> > deadline,
> > > >>>> then
> > > >>>>> I agree that we should move it till the end of October.
> > > >>>>>
> > > >>>>> We then just need to be careful not to run into the Christmas
> > season
> > > >> at
> > > >>>> the
> > > >>>>> end of December.
> > > >>>>>
> > > >>>>> If nobody objects within a few days, I'll update the feature
> freeze
> > > >>> date
> > > >>>> in
> > > >>>>> the Wiki.
> > > >>>>>
> > > >>>>>
> > > >>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com>
> > wrote:
> > > >>>>>
> > > >>>>>> Regarding setting the feature freeze date to late September, I
> > have
> > > >>>> some
> > > >>>>>> concern that it might make
> > > >>>>>> the development time of 1.12 too short.
> > > >>>>>>
> > > > >>>>>> One reason for this is we took too much time (about 1.5 months,
> > from
> > > >>> mid
> > > >>>>> of
> > > >>>>>> May to beginning of July)
> > > > >>>>>> for testing 1.11. It's not ideal, but further squeezing the
> > > >> development
> > > >>>> time
> > > >>>>>> of 1.12 won't make this better.
> > > >>>>>>   Besides, AFAIK July & August is also a popular vacation season
> > for
> > > > >>>>>> Europeans. Given the fact that most
> > > >>>>>>   committers of Flink come from Europe, I think we should also
> > take
> > > >>> this
> > > >>>>>> into consideration.
> > > >>>>>>
> > > >>>>>> It's also true that the first week of October is the national
> > > >> holiday
> > > >>>> of
> > > >>>>>> China, so I'm wondering whether the
> > > >>>>>> end of October could be a candidate feature freeze date.
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Kurt
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <
> > > >> rmetzger@apache.org>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hi all,
> > > >>>>>>>
> > > >>>>>>> Thanks a lot for the responses so far. I've put them into this
> > > >> Wiki
> > > >>>>> page:
> > > >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/1.12+Release
> > > >> to
> > > >>>> keep
> > > >>>>>>> track of them. Ideally, post JIRA tickets for your feature,
> then
> > > >>> the
> > > >>>>>> status
> > > >>>>>>> will update automatically in the wiki :)
> > > >>>>>>>
> > > >>>>>>> Please keep posting features here, or add them to the Wiki
> > > >> yourself
> > > >>>> 🙏
> > > >>>>>>> @Prasanna kumar <pr...@gmail.com>: Dynamic Auto
> > > >>>> Scaling
> > > >>>>>> is a
> > > >>>>>>> feature request the community is well-aware of. Till has posted
> > > >>>>>>> "Reactive-scaling mode" as a feature he's working on for the
> 1.12
> > > >>>>>> release.
> > > >>>>>>> This work will introduce the basic building blocks and partial
> > > >>>> support
> > > >>>>>> for
> > > >>>>>>> the feature you are requesting.
> > > >>>>>>> Proper support for dynamic scaling, while maintaining Flink's
> > > >> high
> > > > >>>>>>> performance (throughput, low latency) and correctness is a
> > > >>> difficult
> > > >>>>> task
> > > >>>>>>> that needs a lot of work. It will probably take a little bit of
> > > >>> time
> > > >>>>> till
> > > >>>>>>> this is fully available.
> > > >>>>>>>
> > > >>>>>>> Cheers,
> > > >>>>>>> Robert
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <
> > > >>> trohrmann@apache.org>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Thanks for being our release managers for the 1.12 release
> > > >> Dian &
> > > >>>>>> Robert!
> > > >>>>>>>> Here are some features I would like to work on for this
> > > >> release:
> > > >>>>>>>> # Features
> > > >>>>>>>>
> > > >>>>>>>> ## Finishing pipelined region scheduling (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-16430)
> > > >>>>>>>> With the pipelined region scheduler we want to implement a
> > > >>>> scheduler
> > > >>>>>>> which
> > > >>>>>>>> can serve streaming as well as batch workloads alike while
> > > >> being
> > > >>>> able
> > > >>>>>> to
> > > >>>>>>>> run jobs under constrained resources. The latter is
> > > >> particularly
> > > >>>>>>> important
> > > >>>>>>>> for bounded streaming jobs which, currently, are not well
> > > >>>> supported.
> > > >>>>>>>> ## Reactive-scaling mode
> > > >>>>>>>> Being able to react to newly available resources and rescaling
> > > >> a
> > > >>>>>> running
> > > >>>>>>>> job accordingly will make Flink's operation much easier
> because
> > > >>>>>> resources
> > > >>>>>>>> can then be controlled by an external tool (e.g. GCP
> > > >> autoscaling,
> > > >>>> K8s
> > > >>>>>>>> horizontal pod scaler, etc.). In this release we want to make
> a
> > > >>> big
> > > >>>>>> step
> > > >>>>>>>> towards this direction. As a first step we want to support the
> > > >>>>>> execution
> > > >>>>>>> of
> > > >>>>>>>> jobs with a parallelism which is lower than the specified
> > > >>>> parallelism
> > > >>>>>> in
> > > >>>>>>>> case that Flink lost a TaskManager or could not acquire enough
> > > >>>>>> resources.
> > > >>>>>>>> # Maintenance/Stability
> > > >>>>>>>>
> > > >>>>>>>> ## JM / TM finished task reconciliation (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-17075)
> > > >>>>>>>> This prevents the system from going out of sync if a task
> state
> > > >>>>> change
> > > >>>>>>> from
> > > >>>>>>>> the TM to the JM is lost.
> > > >>>>>>>>
> > > >>>>>>>> ## Make metrics services work with Kubernetes deployments (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-11127)
> > > >>>>>>>> Invert the direction in which the MetricFetcher connects to
> the
> > > >>>>>>>> MetricQueryFetchers. That way it will no longer be necessary
> to
> > > >>>>> expose
> > > >>>>>> on
> > > >>>>>>>> K8s for every TaskManager a port on which the
> > > >> MetricQueryFetcher
> > > >>>>> runs.
> > > >>>>>>> This
> > > >>>>>>>> will then make the deployment of Flink clusters on K8s easier.
> > > >>>>>>>>
> > > >>>>>>>> ## Handle long-blocking operations during job submission
> > > >>> (savepoint
> > > >>>>>>>> restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > > >>>>>>>> Submitting a Flink job can involve the interaction with
> > > >> external
> > > >>>>>> systems
> > > >>>>>>>> (blocking operations). Depending on the job the interactions
> > > >> can
> > > >>>> take
> > > >>>>>> so
> > > >>>>>>>> long that it exceeds the submission timeout which reports a
> > > >>> failure
> > > >>>>> on
> > > >>>>>>> the
> > > >>>>>>>> client side even though the actual submission succeeded. By
> > > >>>>> decoupling
> > > >>>>>>> the
> > > >>>>>>>> creation of the ExecutionGraph from the job submission, we can
> > > >>> make
> > > >>>>> the
> > > >>>>>>> job
> > > >>>>>>>> submission non-blocking which will solve this problem.
> > > >>>>>>>>
> > > >>>>>>>> ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-15679)
> > > >>>>>>>> By making the internal Flink IDs compositional or logging how
> > > >>> they
> > > >>>>>> belong
> > > >>>>>>>> together, we can make the debugging of Flink's operations much
> > > >>>>> easier.
> > > >>>>>>>> Cheers,
> > > >>>>>>>> Till
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <
> > > >>>> felixzhengcb@gmail.com
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi All,
> > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for bringing up this discussion, Robert!
> > > > >>>>>>>>> Congratulations on becoming the release managers of 1.12, Dian
> > > >>> and
> > > >>>>>>> Robert
> > > >>>>>>>> !
> > > >>>>>>>>> ----------
> > > >>>>>>>>> Here are some of my thoughts of the features for native
> > > >>>> integration
> > > >>>>>>> with
> > > >>>>>>>>> Kubernetes in Flink 1.12:
> > > >>>>>>>>>
> > > >>>>>>>>> 1. Support user-specified pod templates
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      The current approach of introducing new configuration
> > > >>> options
> > > >>>>> for
> > > >>>>>>>> each
> > > >>>>>>>>> aspect of pod specification a user might wish is becoming
> > > >>>> unwieldy,
> > > >>>>>> we
> > > >>>>>>>> have
> > > >>>>>>>>> to maintain more and more Flink side Kubernetes configuration
> > > >>>>> options
> > > >>>>>>> and
> > > >>>>>>>>> users have to learn the gap between the declarative model
> > > >> used
> > > >>> by
> > > >>>>>>>>> Kubernetes and the configuration model used by Flink. It's a
> > > >>>> great
> > > >>>>>>>>> improvement to allow users to specify pod templates as
> > > >> central
> > > >>>>> places
> > > >>>>>>> for
> > > >>>>>>>>> all customization needs for the jobmanager and taskmanager
> > > >>> pods.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Users can leverage many of the advanced K8s features
> that
> > > >>> the
> > > >>>>>> Flink
> > > >>>>>>>>> community does not support explicitly, such as volume
> > > >> mounting,
> > > >>>> DNS
> > > >>>>>>>>> configuration, pod affinity/anti-affinity setting, etc.
> > > >>>>>>>>>
> > > >>>>>>>>> 2. Support running PyFlink on Kubernetes
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Support running PyFlink on Kubernetes, including session
> > > >>>>> cluster
> > > >>>>>>> and
> > > >>>>>>>>> application cluster.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Running python application in a containerized
> > > >> environment.
> > > >>>>>>>>> 3. Support built-in init-Container
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      We need a built-in init-Container to help solve
> > > >> dependency
> > > >>>>>>> management
> > > >>>>>>>>> in a containerized environment, especially in the application
> > > >>>> mode.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Separate the base Flink image from dynamic dependencies.
> > > >>>>>>>>>
> > > >>>>>>>>> 4. Support accessing secured services via K8s secrets
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Kubernetes Secrets
> > > >>>>>>>>> <https://kubernetes.io/docs/concepts/configuration/secret/>
> > > >>> can
> > > >>>> be
> > > >>>>>>> used
> > > >>>>>>>> to
> > > >>>>>>>>> provide credentials for a Flink application to access secured
> > > >>>>>> services.
> > > >>>>>>>> It
> > > >>>>>>>>> helps people who want to use a user-specified K8s Secret
> > > >>> through
> > > >>>> an
> > > >>>>>>>>> environment variable.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Improve user experience.
> > > >>>>>>>>>
> > > >>>>>>>>> 5. Support configuring replica of JobManager Deployment in
> > > >>>>> ZooKeeper
> > > >>>>>> HA
> > > >>>>>>>>> setups
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Make the *replica* of Deployment configurable in the
> > > >>>> ZooKeeper
> > > >>>>> HA
> > > >>>>>>>>> setups.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Achieve faster failover.
> > > >>>>>>>>>
> > > >>>>>>>>> 6. Support to configure limit for CPU requirement
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      To leverage the Kubernetes feature of container
> > > >>> request/limit
> > > >>>>>> CPU.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Reduce cost.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Canbin Zheng
> > > >>>>>>>>>
> > > > >>>>>>>>> On Thu, Jul 23, 2020 at 12:44 PM Harold.Miao <mi...@gmail.com> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> I'm excited to hear about this feature,  very, very, very
> > > >>>> highly
> > > >>>>>>>>> encouraged
> > > >>>>>>>>>>
> > > > >>>>>>>>>> On Thu, Jul 23, 2020 at 12:10 AM Prasanna kumar <pr...@gmail.com> wrote:
> > > >>>>>>>>>>> Hi Flink Dev Team,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Dynamic AutoScaling Based on the incoming data load would
> > > >>> be
> > > >>>> a
> > > >>>>>>> great
> > > >>>>>>>>>>> feature.
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> We should be able to have some rule, say: if the load increases by 20%,
> > > > >>>>>>>>>>> extra resources should be added.
> > > >>>>>>>>>>> Or time based say during these peak hours the pipeline
> > > >>> should
> > > >>>>>> scale
> > > >>>>>>>>>>> automatically by 50%.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> This will help a lot in cost reduction.
> > > >>>>>>>>>>>
> > > > >>>>>>>>>>> EMR cluster provides a similar feature for Spark-based
> > > >>>>>> application.
> > > >>>>>>>>>>> Thanks,
> > > >>>>>>>>>>> Prasanna.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <
> > > >>>>>>> rmetzger@apache.org>
> > > >>>>>>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi all,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Now that the 1.11 release is out, it is time to plan
> > > >> for
> > > >>>> the
> > > >>>>>> next
> > > >>>>>>>>> major
> > > >>>>>>>>>>>> Flink release.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Some items:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     1.
> > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>     Dian Fu and I volunteer to be the release managers
> > > >>> for
> > > >>>>>> Flink
> > > >>>>>>>>> 1.12.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     1.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     Timeline: We propose to stick to our approximate 4
> > > >>> month
> > > >>>>>>> release
> > > >>>>>>>>>>> cycle,
> > > >>>>>>>>>>>>     thus the release should be done by late October.
> > > >> Given
> > > >>>>> that
> > > >>>>>>>>> there’s
> > > >>>>>>>>>> a
> > > >>>>>>>>>>>>     holiday week in China at the beginning of October, I
> > > >>>>> propose
> > > >>>>>>> to
> > > >>>>>>>> do
> > > >>>>>>>>>> the
> > > >>>>>>>>>>>>     feature freeze on master by late September.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     2.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     Collecting features: It would be good to have a
> > > >> rough
> > > >>>>>> overview
> > > >>>>>>>> of
> > > >>>>>>>>>> the
> > > >>>>>>>>>>>>     features that will likely be ready to be merged by
> > > >>> late
> > > >>>>>>>> September,
> > > >>>>>>>>>> and
> > > >>>>>>>>>>>> that
> > > >>>>>>>>>>>>     we want in the release.
> > > >>>>>>>>>>>>     Based on the discussion, we will update the Roadmap
> > > >> on
> > > >>>> the
> > > >>>>>>> Flink
> > > >>>>>>>>>>> website
> > > >>>>>>>>>>>>     again!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     1.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     Test instabilities and blockers: I would like to
> > > >>> avoid a
> > > >>>>>>>> situation
> > > >>>>>>>>>>> where
> > > >>>>>>>>>>>>     we have many blocking issues or build instabilities
> > > >> at
> > > >>>> the
> > > >>>>>>> time
> > > >>>>>>>> of
> > > >>>>>>>>>> the
> > > >>>>>>>>>>>>     feature freeze. To achieve that, we will try to
> > > >> check
> > > >>>>> every
> > > >>>>>>>> build
> > > >>>>>>>>>>>>     instability within a week, to decide if it is a
> > > >>> blocker
> > > >>>>>> (make
> > > >>>>>>>> sure
> > > >>>>>>>>>> to
> > > >>>>>>>>>>>> use
> > > >>>>>>>>>>>>     the “test-stability” label for those tickets!)
> > > >>>>>>>>>>>>     Blocker issues will need to have somebody assigned
> > > >>>>>>> (responsible)
> > > >>>>>>>>>>> within
> > > >>>>>>>>>>>>     a week, and we want to see progress on all blocker
> > > >>>> issues
> > > >>>>>>>>>> (downgrade,
> > > >>>>>>>>>>>>     resolution, a good plan how to proceed if it is more
> > > >>>>>>>> complicated)
> > > >>>>>>>>>>>>     2.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     Quality and stability of new features: In order to
> > > >>> have
> > > >>>> a
> > > >>>>>>> short
> > > >>>>>>>>>>> feature
> > > >>>>>>>>>>>>     freeze phase, we encourage developers to only merge
> > > >>>>>>> well-tested
> > > >>>>>>>>> and
> > > >>>>>>>>>>>>     documented features. In our experience, the feature
> > > >>>> freeze
> > > >>>>>>> works
> > > >>>>>>>>>> best
> > > >>>>>>>>>>> if
> > > >>>>>>>>>>>>     new features are complete, and the community can
> > > >> focus
> > > >>>>> fully
> > > >>>>>>> on
> > > >>>>>>>>>>>> addressing
> > > > >>>>>>>>>>>>     newly found bugs and voting on the release.
> > > >>>>>>>>>>>>     By having a smooth release process, the next
> > > >>>> merge-window
> > > >>>>>> for
> > > >>>>>>>> the
> > > >>>>>>>>>> next
> > > >>>>>>>>>>>>     release will come sooner.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Let me know what you think about our items, and share
> > > >>> which
> > > >>>>>>>> features
> > > >>>>>>>>>> you
> > > >>>>>>>>>>>> want in Flink 1.12.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Robert & Dian
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> --
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best Regards,
> > > >>>>>>>>>> Harold Miao
> > > >>>>>>>>>>
> > >
> > >
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Yuan Mei <yu...@gmail.com>.

+1

> +1 for extending the feature freeze date to the end of October.



On Thu, Aug 6, 2020 at 12:08 PM Yu Li <ca...@gmail.com> wrote:

> +1 for extending feature freeze date to end of October.
>
> Feature development in the master branch could be unblocked through
> creating the release branch, but every coin has its two sides (smile)
>
> Best Regards,
> Yu
>
>
> On Wed, 5 Aug 2020 at 20:12, Robert Metzger <rm...@apache.org> wrote:
>
> > Thanks all for your opinion.
> >
> > @Chesnay: That is a risk, but I hope the people responsible for
> individual
> > FLIPs plan accordingly. Extending the time till the feature freeze should
> > not mean that we are extending the scope of the release.
> > Ideally, features are done before FF, and they use the time till the
> freeze
> > for additional testing and documentation polishing.
> > This Flink Forward will be virtual, so there should be less disruption than a physical
> > conference with all the travelling.
> > Do you have a different proposal for the timing?
> >
> >
> > I'm currently considering splitting the feature freeze and the release
> > branch creation. Similar to the Linux kernel development, we could have a
> > "merge window" and a stabilization phase. At the end of the stabilization
> > phase, we cut the release branch and open the next merge window (I'll
> start
> > a separate thread regarding this towards the end of this release cycle,
> if
> > I still like the idea then)
> >
> >
> > On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler <ch...@apache.org>
> > wrote:
> >
> > > I'm a bit concerned about end of October, because it means we have
> Flink
> > > > Forward, which usually means at least 1 week of little-to-no activity,
> > > and then 1 week until feature-freeze.
> > >
> > > On 05/08/2020 11:56, jincheng sun wrote:
> > > > +1 for end of October from me as well.
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > >
> > > > > On Wed, Aug 5, 2020 at 4:59 PM Kostas Kloudas <kk...@gmail.com> wrote:
> > > >
> > > >> +1 for end of October from me as well.
> > > >>
> > > >> Cheers,
> > > >> Kostas
> > > >>
> > > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >>
> > > >>> +1 for end of October from my side as well.
> > > >>>
> > > >>> Cheers,
> > > >>> Till
> > > >>>
> > > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org>
> > wrote:
> > > >>>
> > > >>>> The end of October sounds good from my side, unless it collides
> with
> > > >> some
> > > >>>> holidays that affect many committers.
> > > >>>>
> > > >>>> Feature-wise, I believe we can definitely make good use of the
> time
> > to
> > > >>> wrap
> > > >>>> up some critical threads (like finishing the FLIP-27 source
> > efforts).
> > > >>>>
> > > >>>> So +1 to the end of October from my side.
> > > >>>>
> > > >>>> Best,
> > > >>>> Stephan
> > > >>>>
> > > >>>>
> > > >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <
> rmetzger@apache.org>
> > > >>> wrote:
> > > >>>>> Thanks a lot for commenting on the feature freeze date.
> > > >>>>>
> > > >>>>> You are raising a few good points on the timing.
> > > >>>>> If we have already (2 months before) concerns regarding the
> > deadline,
> > > >>>> then
> > > >>>>> I agree that we should move it till the end of October.
> > > >>>>>
> > > >>>>> We then just need to be careful not to run into the Christmas
> > season
> > > >> at
> > > >>>> the
> > > >>>>> end of December.
> > > >>>>>
> > > >>>>> If nobody objects within a few days, I'll update the feature
> freeze
> > > >>> date
> > > >>>> in
> > > >>>>> the Wiki.
> > > >>>>>
> > > >>>>>
> > > >>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com>
> > wrote:
> > > >>>>>
> > > >>>>>> Regarding setting the feature freeze date to late September, I
> > have
> > > >>>> some
> > > >>>>>> concern that it might make
> > > >>>>>> the development time of 1.12 too short.
> > > >>>>>>
> > > > >>>>>> One reason for this is we took too much time (about 1.5 months,
> > from
> > > >>> mid
> > > >>>>> of
> > > >>>>>> May to beginning of July)
> > > > >>>>>> for testing 1.11. It's not ideal, but further squeezing the
> > > >> development
> > > >>>> time
> > > >>>>>> of 1.12 won't make this better.
> > > >>>>>>   Besides, AFAIK July & August is also a popular vacation season
> > for
> > > > >>>>>> Europeans. Given the fact that most
> > > >>>>>>   committers of Flink come from Europe, I think we should also
> > take
> > > >>> this
> > > >>>>>> into consideration.
> > > >>>>>>
> > > >>>>>> It's also true that the first week of October is the national
> > > >> holiday
> > > >>>> of
> > > >>>>>> China, so I'm wondering whether the
> > > >>>>>> end of October could be a candidate feature freeze date.
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Kurt
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <
> > > >> rmetzger@apache.org>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hi all,
> > > >>>>>>>
> > > >>>>>>> Thanks a lot for the responses so far. I've put them into this
> > > >> Wiki
> > > >>>>> page:
> > > >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/1.12+Release
> > > >> to
> > > >>>> keep
> > > >>>>>>> track of them. Ideally, post JIRA tickets for your feature,
> then
> > > >>> the
> > > >>>>>> status
> > > >>>>>>> will update automatically in the wiki :)
> > > >>>>>>>
> > > >>>>>>> Please keep posting features here, or add them to the Wiki
> > > >> yourself
> > > >>>> 🙏
> > > >>>>>>> @Prasanna kumar <pr...@gmail.com>: Dynamic Auto
> > > >>>> Scaling
> > > >>>>>> is a
> > > >>>>>>> feature request the community is well-aware of. Till has posted
> > > >>>>>>> "Reactive-scaling mode" as a feature he's working on for the
> 1.12
> > > >>>>>> release.
> > > >>>>>>> This work will introduce the basic building blocks and partial
> > > >>>> support
> > > >>>>>> for
> > > >>>>>>> the feature you are requesting.
> > > >>>>>>> Proper support for dynamic scaling, while maintaining Flink's
> > > >> high
> > > > >>>>>>> performance (throughput, low latency) and correctness is a
> > > >>> difficult
> > > >>>>> task
> > > >>>>>>> that needs a lot of work. It will probably take a little bit of
> > > >>> time
> > > >>>>> till
> > > >>>>>>> this is fully available.
> > > >>>>>>>
> > > >>>>>>> Cheers,
> > > >>>>>>> Robert
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <
> > > >>> trohrmann@apache.org>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Thanks for being our release managers for the 1.12 release
> > > >> Dian &
> > > >>>>>> Robert!
> > > >>>>>>>> Here are some features I would like to work on for this
> > > >> release:
> > > >>>>>>>> # Features
> > > >>>>>>>>
> > > >>>>>>>> ## Finishing pipelined region scheduling (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-16430)
> > > >>>>>>>> With the pipelined region scheduler we want to implement a
> > > >>>> scheduler
> > > >>>>>>> which
> > > >>>>>>>> can serve streaming as well as batch workloads alike while
> > > >> being
> > > >>>> able
> > > >>>>>> to
> > > >>>>>>>> run jobs under constrained resources. The latter is
> > > >> particularly
> > > >>>>>>> important
> > > >>>>>>>> for bounded streaming jobs which, currently, are not well
> > > >>>> supported.
> > > >>>>>>>> ## Reactive-scaling mode
> > > >>>>>>>> Being able to react to newly available resources and rescaling
> > > >> a
> > > >>>>>> running
> > > >>>>>>>> job accordingly will make Flink's operation much easier
> because
> > > >>>>>> resources
> > > >>>>>>>> can then be controlled by an external tool (e.g. GCP
> > > >> autoscaling,
> > > >>>> K8s
> > > >>>>>>>> horizontal pod scaler, etc.). In this release we want to make
> a
> > > >>> big
> > > >>>>>> step
> > > >>>>>>>> towards this direction. As a first step we want to support the
> > > >>>>>> execution
> > > >>>>>>> of
> > > >>>>>>>> jobs with a parallelism which is lower than the specified
> > > >>>> parallelism
> > > >>>>>> in
> > > >>>>>>>> case that Flink lost a TaskManager or could not acquire enough
> > > >>>>>> resources.
> > > >>>>>>>> # Maintenance/Stability
> > > >>>>>>>>
> > > >>>>>>>> ## JM / TM finished task reconciliation (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-17075)
> > > >>>>>>>> This prevents the system from going out of sync if a task
> state
> > > >>>>> change
> > > >>>>>>> from
> > > >>>>>>>> the TM to the JM is lost.
> > > >>>>>>>>
> > > >>>>>>>> ## Make metrics services work with Kubernetes deployments (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-11127)
> > > >>>>>>>> Invert the direction in which the MetricFetcher connects to
> the
> > > >>>>>>>> MetricQueryFetchers. That way it will no longer be necessary
> to
> > > >>>>> expose
> > > >>>>>> on
> > > >>>>>>>> K8s for every TaskManager a port on which the
> > > >> MetricQueryFetcher
> > > >>>>> runs.
> > > >>>>>>> This
> > > >>>>>>>> will then make the deployment of Flink clusters on K8s easier.
> > > >>>>>>>>
> > > >>>>>>>> ## Handle long-blocking operations during job submission
> > > >>> (savepoint
> > > >>>>>>>> restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > > >>>>>>>> Submitting a Flink job can involve the interaction with
> > > >> external
> > > >>>>>> systems
> > > >>>>>>>> (blocking operations). Depending on the job the interactions
> > > >> can
> > > >>>> take
> > > >>>>>> so
> > > >>>>>>>> long that it exceeds the submission timeout which reports a
> > > >>> failure
> > > >>>>> on
> > > >>>>>>> the
> > > >>>>>>>> client side even though the actual submission succeeded. By
> > > >>>>> decoupling
> > > >>>>>>> the
> > > >>>>>>>> creation of the ExecutionGraph from the job submission, we can
> > > >>> make
> > > >>>>> the
> > > >>>>>>> job
> > > >>>>>>>> submission non-blocking which will solve this problem.
> > > >>>>>>>>
> > > >>>>>>>> ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-15679)
> > > >>>>>>>> By making the internal Flink IDs compositional or logging how
> > > >>> they
> > > >>>>>> belong
> > > >>>>>>>> together, we can make the debugging of Flink's operations much
> > > >>>>> easier.
> > > >>>>>>>> Cheers,
> > > >>>>>>>> Till
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <
> > > >>>> felixzhengcb@gmail.com
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi All,
> > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for bringing up this discussion, Robert!
> > > > >>>>>>>>> Congratulations on becoming the release managers of 1.12, Dian
> > > >>> and
> > > >>>>>>> Robert
> > > >>>>>>>> !
> > > >>>>>>>>> ----------
> > > >>>>>>>>> Here are some of my thoughts on the features for native integration with
> > > >>>>>>>>> Kubernetes in Flink 1.12:
> > > >>>>>>>>>
> > > >>>>>>>>> 1. Support user-specified pod templates
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      The current approach of introducing a new configuration option for
> > > >>>>>>>>>      each aspect of the pod specification a user might want is becoming
> > > >>>>>>>>>      unwieldy: we have to maintain more and more Flink-side Kubernetes
> > > >>>>>>>>>      configuration options, and users have to learn the gap between the
> > > >>>>>>>>>      declarative model used by Kubernetes and the configuration model
> > > >>>>>>>>>      used by Flink. It would be a great improvement to allow users to
> > > >>>>>>>>>      specify pod templates as the central place for all customization
> > > >>>>>>>>>      needs of the jobmanager and taskmanager pods.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Users can leverage many of the advanced K8s features that the Flink
> > > >>>>>>>>>      community does not support explicitly, such as volume mounting, DNS
> > > >>>>>>>>>      configuration, pod affinity/anti-affinity settings, etc.
> > > >>>>>>>>>
> > > >>>>>>>>> 2. Support running PyFlink on Kubernetes
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Support running PyFlink on Kubernetes, including session clusters
> > > >>>>>>>>>      and application clusters.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Run Python applications in a containerized environment.
> > > >>>>>>>>>
> > > >>>>>>>>> 3. Support built-in init-Container
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      We need a built-in init-Container to help solve dependency
> > > >>>>>>>>>      management in a containerized environment, especially in
> > > >>>>>>>>>      application mode.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Separate the base Flink image from dynamic dependencies.
> > > >>>>>>>>>
> > > >>>>>>>>> 4. Support accessing secured services via K8s secrets
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Kubernetes Secrets
> > > >>>>>>>>>      <https://kubernetes.io/docs/concepts/configuration/secret/> can be
> > > >>>>>>>>>      used to provide credentials for a Flink application to access
> > > >>>>>>>>>      secured services. It helps people who want to use a user-specified
> > > >>>>>>>>>      K8s Secret through an environment variable.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Improve user experience.
> > > >>>>>>>>>
> > > >>>>>>>>> 5. Support configuring the replica count of the JobManager Deployment
> > > >>>>>>>>> in ZooKeeper HA setups
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Make the *replica* count of the Deployment configurable in
> > > >>>>>>>>>      ZooKeeper HA setups.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Achieve faster failover.
> > > >>>>>>>>>
> > > >>>>>>>>> 6. Support configuring a limit for the CPU requirement
> > > >>>>>>>>>      Description:
> > > >>>>>>>>>      Leverage the Kubernetes container CPU request/limit feature.
> > > >>>>>>>>>      Benefits:
> > > >>>>>>>>>      Reduce cost.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards,
> > > >>>>>>>>> Canbin Zheng
> > > >>>>>>>>>
> > > >>>>>>>>> On Thu, Jul 23, 2020 at 12:44 PM Harold.Miao <mi...@gmail.com> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> I'm excited to hear about this feature; it is very, very, very highly
> > > >>>>>>>>>> encouraged.
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Thu, Jul 23, 2020 at 12:10 AM Prasanna kumar <pr...@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi Flink Dev Team,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Dynamic autoscaling based on the incoming data load would be a great
> > > >>>>>>>>>>> feature.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> We should be able to define a rule, say: if the load increases by 20%,
> > > >>>>>>>>>>> extra resources should be added.
> > > >>>>>>>>>>> Or a time-based rule, say: during these peak hours the pipeline should
> > > >>>>>>>>>>> scale automatically by 50%.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> This will help a lot in cost reduction.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> EMR clusters provide a similar feature for Spark-based applications.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks,
> > > >>>>>>>>>>> Prasanna.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <rmetzger@apache.org> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi all,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Now that the 1.11 release is out, it is time to plan for the next major
> > > >>>>>>>>>>>> Flink release.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Some items:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     1. Dian Fu and I volunteer to be the release managers for Flink 1.12.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     2. Timeline: We propose to stick to our approximate 4-month release
> > > >>>>>>>>>>>>        cycle, thus the release should be done by late October. Given that
> > > >>>>>>>>>>>>        there’s a holiday week in China at the beginning of October, I
> > > >>>>>>>>>>>>        propose to do the feature freeze on master by late September.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     3. Collecting features: It would be good to have a rough overview of
> > > >>>>>>>>>>>>        the features that will likely be ready to be merged by late
> > > >>>>>>>>>>>>        September, and that we want in the release.
> > > >>>>>>>>>>>>        Based on the discussion, we will update the Roadmap on the Flink
> > > >>>>>>>>>>>>        website again!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     4. Test instabilities and blockers: I would like to avoid a situation
> > > >>>>>>>>>>>>        where we have many blocking issues or build instabilities at the
> > > >>>>>>>>>>>>        time of the feature freeze. To achieve that, we will try to check
> > > >>>>>>>>>>>>        every build instability within a week, to decide if it is a blocker
> > > >>>>>>>>>>>>        (make sure to use the “test-stability” label for those tickets!).
> > > >>>>>>>>>>>>        Blocker issues will need to have somebody assigned (responsible)
> > > >>>>>>>>>>>>        within a week, and we want to see progress on all blocker issues
> > > >>>>>>>>>>>>        (downgrade, resolution, a good plan how to proceed if it is more
> > > >>>>>>>>>>>>        complicated).
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>     5. Quality and stability of new features: In order to have a short
> > > >>>>>>>>>>>>        feature freeze phase, we encourage developers to only merge
> > > >>>>>>>>>>>>        well-tested and documented features. In our experience, the feature
> > > >>>>>>>>>>>>        freeze works best if new features are complete, and the community
> > > >>>>>>>>>>>>        can focus fully on addressing newly found bugs and voting on the
> > > >>>>>>>>>>>>        release.
> > > >>>>>>>>>>>>        By having a smooth release process, the merge window for the next
> > > >>>>>>>>>>>>        release will come sooner.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Let me know what you think about our items, and share which features you
> > > >>>>>>>>>>>> want in Flink 1.12.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Robert & Dian
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> --
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best Regards,
> > > >>>>>>>>>> Harold Miao
> > > >>>>>>>>>>
> > >
> > >
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Yu Li <ca...@gmail.com>.

+1 for extending the feature freeze date to the end of October.

Feature development in the master branch could be unblocked by
creating the release branch, but every coin has two sides (smile)

Best Regards,
Yu


On Wed, 5 Aug 2020 at 20:12, Robert Metzger <rm...@apache.org> wrote:

> Thanks all for your opinions.
>
> @Chesnay: That is a risk, but I hope the people responsible for individual
> FLIPs plan accordingly. Extending the time till the feature freeze should
> not mean that we are extending the scope of the release.
> Ideally, features are done before Flink Forward, and the authors use the
> time till the freeze for additional testing and documentation polishing.
> This Flink Forward will be virtual, so there should be less disruption than
> with a physical conference with all the travelling.
> Do you have a different proposal for the timing?
>
>
> I'm currently considering splitting the feature freeze and the release
> branch creation. Similar to the Linux kernel development, we could have a
> "merge window" and a stabilization phase. At the end of the stabilization
> phase, we cut the release branch and open the next merge window (I'll start
> a separate thread regarding this towards the end of this release cycle, if
> I still like the idea then)
>
>
> On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler <ch...@apache.org>
> wrote:
>
> > I'm a bit concerned about the end of October, because it means we have Flink
> > Forward, which usually means at least 1 week of little-to-no activity,
> > and then 1 week until the feature freeze.
> >
> > On 05/08/2020 11:56, jincheng sun wrote:
> > > +1 for end of October from me as well.
> > >
> > > Best,
> > > Jincheng
> > >
> > >
> > > On Wed, Aug 5, 2020 at 4:59 PM Kostas Kloudas <kk...@gmail.com> wrote:
> > >
> > >> +1 for end of October from me as well.
> > >>
> > >> Cheers,
> > >> Kostas
> > >>
> > >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <tr...@apache.org>
> > wrote:
> > >>
> > >>> +1 for end of October from my side as well.
> > >>>
> > >>> Cheers,
> > >>> Till
> > >>>
> > >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org>
> wrote:
> > >>>
> > >>>> The end of October sounds good from my side, unless it collides with
> > >> some
> > >>>> holidays that affect many committers.
> > >>>>
> > >>>> Feature-wise, I believe we can definitely make good use of the time
> to
> > >>> wrap
> > >>>> up some critical threads (like finishing the FLIP-27 source
> efforts).
> > >>>>
> > >>>> So +1 to the end of October from my side.
> > >>>>
> > >>>> Best,
> > >>>> Stephan
> > >>>>
> > >>>>
> > >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <rm...@apache.org>
> > >>> wrote:
> > >>>>> Thanks a lot for commenting on the feature freeze date.
> > >>>>>
> > >>>>> You are raising a few good points on the timing.
> > >>>>> If we already have concerns about the deadline two months ahead of it,
> > >>>>> then I agree that we should move it to the end of October.
> > >>>>>
> > >>>>> We then just need to be careful not to run into the Christmas season at
> > >>>>> the end of December.
> > >>>>>
> > >>>>> If nobody objects within a few days, I'll update the feature freeze date
> > >>>>> in the Wiki.
> > >>>>>
> > >>>>>
> > >>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com>
> wrote:
> > >>>>>
> > >>>>>> Regarding setting the feature freeze date to late September, I have some
> > >>>>>> concern that it might make the development time of 1.12 too short.
> > >>>>>>
> > >>>>>> One reason for this is that we took too much time (about 1.5 months, from
> > >>>>>> mid-May to the beginning of July) for testing 1.11. That's not ideal, but
> > >>>>>> further squeezing the development time of 1.12 won't make this better.
> > >>>>>> Besides, AFAIK July and August are also a popular vacation season for
> > >>>>>> Europeans. Given that most committers of Flink come from Europe, I think
> > >>>>>> we should also take this into consideration.
> > >>>>>>
> > >>>>>> It's also true that the first week of October is the national holiday of
> > >>>>>> China, so I'm wondering whether the end of October could be a candidate
> > >>>>>> feature freeze date.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Kurt
> > >>>>>>
> > >>>>>>
> > >>>>>> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <
> > >> rmetzger@apache.org>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> Thanks a lot for the responses so far. I've put them into this Wiki page:
> > >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep
> > >>>>>>> track of them. Ideally, post JIRA tickets for your feature, then the
> > >>>>>>> status will update automatically in the wiki :)
> > >>>>>>>
> > >>>>>>> Please keep posting features here, or add them to the Wiki yourself 🙏
> > >>>>>>>
> > >>>>>>> @Prasanna kumar <pr...@gmail.com>: Dynamic Auto Scaling is a
> > >>>>>>> feature request the community is well aware of. Till has posted
> > >>>>>>> "Reactive-scaling mode" as a feature he's working on for the 1.12 release.
> > >>>>>>> This work will introduce the basic building blocks and partial support for
> > >>>>>>> the feature you are requesting.
> > >>>>>>> Proper support for dynamic scaling, while maintaining Flink's high
> > >>>>>>> performance (throughput, low latency) and correctness, is a difficult task
> > >>>>>>> that needs a lot of work. It will probably take a little bit of time till
> > >>>>>>> this is fully available.
> > >>>>>>>
> > >>>>>>> Cheers,
> > >>>>>>> Robert
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <
> > >>> trohrmann@apache.org>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Thanks for being our release managers for the 1.12 release
> > >> Dian &
> > >>>>>> Robert!
> > >>>>>>>> Here are some features I would like to work on for this
> > >> release:
> > >>>>>>>> # Features
> > >>>>>>>>
> > >>>>>>>> ## Finishing pipelined region scheduling (
> > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-16430)
> > >>>>>>>> With the pipelined region scheduler we want to implement a
> > >>>> scheduler
> > >>>>>>> which
> > >>>>>>>> can serve streaming as well as batch workloads alike while
> > >> being
> > >>>> able
> > >>>>>> to
> > >>>>>>>> run jobs under constrained resources. The latter is
> > >> particularly
> > >>>>>>> important
> > >>>>>>>> for bounded streaming jobs which, currently, are not well
> > >>>> supported.
> > >>>>>>>> ## Reactive-scaling mode
> > >>>>>>>> Being able to react to newly available resources and rescaling
> > >> a
> > >>>>>> running
> > >>>>>>>> job accordingly will make Flink's operation much easier because
> > >>>>>> resources
> > >>>>>>>> can then be controlled by an external tool (e.g. GCP
> > >> autoscaling,
> > >>>> K8s
> > >>>>>>>> horizontal pod scaler, etc.). In this release we want to make a
> > >>> big
> > >>>>>> step
> > >>>>>>>> towards this direction. As a first step we want to support the
> > >>>>>> execution
> > >>>>>>> of
> > >>>>>>>> jobs with a parallelism which is lower than the specified
> > >>>> parallelism
> > >>>>>> in
> > >>>>>>>> case that Flink lost a TaskManager or could not acquire enough
> > >>>>>> resources.
> > >>>>>>>> # Maintenance/Stability
> > >>>>>>>>
> > >>>>>>>> ## JM / TM finished task reconciliation (
> > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-17075)
> > >>>>>>>> This prevents the system from going out of sync if a task state
> > >>>>> change
> > >>>>>>> from
> > >>>>>>>> the TM to the JM is lost.
> > >>>>>>>>
> > >>>>>>>> ## Make metrics services work with Kubernetes deployments (
> > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-11127)
> > >>>>>>>> Invert the direction in which the MetricFetcher connects to the
> > >>>>>>>> MetricQueryFetchers. That way it will no longer be necessary to
> > >>>>> expose
> > >>>>>> on
> > >>>>>>>> K8s for every TaskManager a port on which the
> > >> MetricQueryFetcher
> > >>>>> runs.
> > >>>>>>> This
> > >>>>>>>> will then make the deployment of Flink clusters on K8s easier.
> > >>>>>>>>
> > >>>>>>>> ## Handle long-blocking operations during job submission
> > >>> (savepoint
> > >>>>>>>> restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > >>>>>>>> Submitting a Flink job can involve the interaction with
> > >> external
> > >>>>>> systems
> > >>>>>>>> (blocking operations). Depending on the job the interactions
> > >> can
> > >>>> take
> > >>>>>> so
> > >>>>>>>> long that it exceeds the submission timeout which reports a
> > >>> failure
> > >>>>> on
> > >>>>>>> the
> > >>>>>>>> client side even though the actual submission succeeded. By
> > >>>>> decoupling
> > >>>>>>> the
> > >>>>>>>> creation of the ExecutionGraph from the job submission, we can
> > >>> make
> > >>>>> the
> > >>>>>>> job
> > >>>>>>>> submission non-blocking which will solve this problem.
> > >>>>>>>>
> > >>>>>>>> ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > >>>>>>>> https://issues.apache.org/jira/browse/FLINK-15679)
> > >>>>>>>> By making the internal Flink IDs compositional or logging how
> > >>> they
> > >>>>>> belong
> > >>>>>>>> together, we can make the debugging of Flink's operations much
> > >>>>> easier.
> > >>>>>>>> Cheers,
> > >>>>>>>> Till
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <felixzhengcb@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi All,
> > >>>>>>>>>
> > >>>>>>>>> Thanks for bringing up this discussion, Robert!
> > >>>>>>>>> Congratulations on becoming the release managers of 1.12, Dian and
> > >>>>>>>>> Robert!
> > >>>>>>>>> ----------
> > >>>>>>>>> Here are some of my thoughts on the features for native integration with
> > >>>>>>>>> Kubernetes in Flink 1.12:
> > >>>>>>>>>
> > >>>>>>>>> 1. Support user-specified pod templates
> > >>>>>>>>>      Description:
> > >>>>>>>>>      The current approach of introducing a new configuration option for
> > >>>>>>>>>      each aspect of the pod specification a user might want is becoming
> > >>>>>>>>>      unwieldy: we have to maintain more and more Flink-side Kubernetes
> > >>>>>>>>>      configuration options, and users have to learn the gap between the
> > >>>>>>>>>      declarative model used by Kubernetes and the configuration model
> > >>>>>>>>>      used by Flink. It would be a great improvement to allow users to
> > >>>>>>>>>      specify pod templates as the central place for all customization
> > >>>>>>>>>      needs of the jobmanager and taskmanager pods.
> > >>>>>>>>>      Benefits:
> > >>>>>>>>>      Users can leverage many of the advanced K8s features that the Flink
> > >>>>>>>>>      community does not support explicitly, such as volume mounting, DNS
> > >>>>>>>>>      configuration, pod affinity/anti-affinity settings, etc.
> > >>>>>>>>>
> > >>>>>>>>> 2. Support running PyFlink on Kubernetes
> > >>>>>>>>>      Description:
> > >>>>>>>>>      Support running PyFlink on Kubernetes, including session clusters
> > >>>>>>>>>      and application clusters.
> > >>>>>>>>>      Benefits:
> > >>>>>>>>>      Run Python applications in a containerized environment.
> > >>>>>>>>>
> > >>>>>>>>> 3. Support built-in init-Container
> > >>>>>>>>>      Description:
> > >>>>>>>>>      We need a built-in init-Container to help solve dependency
> > >>>>>>>>>      management in a containerized environment, especially in
> > >>>>>>>>>      application mode.
> > >>>>>>>>>      Benefits:
> > >>>>>>>>>      Separate the base Flink image from dynamic dependencies.
> > >>>>>>>>>
> > >>>>>>>>> 4. Support accessing secured services via K8s secrets
> > >>>>>>>>>      Description:
> > >>>>>>>>>      Kubernetes Secrets
> > >>>>>>>>>      <https://kubernetes.io/docs/concepts/configuration/secret/> can be
> > >>>>>>>>>      used to provide credentials for a Flink application to access
> > >>>>>>>>>      secured services. It helps people who want to use a user-specified
> > >>>>>>>>>      K8s Secret through an environment variable.
> > >>>>>>>>>      Benefits:
> > >>>>>>>>>      Improve user experience.
> > >>>>>>>>>
> > >>>>>>>>> 5. Support configuring the replica count of the JobManager Deployment
> > >>>>>>>>> in ZooKeeper HA setups
> > >>>>>>>>>      Description:
> > >>>>>>>>>      Make the *replica* count of the Deployment configurable in
> > >>>>>>>>>      ZooKeeper HA setups.
> > >>>>>>>>>      Benefits:
> > >>>>>>>>>      Achieve faster failover.
> > >>>>>>>>>
> > >>>>>>>>> 6. Support configuring a limit for the CPU requirement
> > >>>>>>>>>      Description:
> > >>>>>>>>>      Leverage the Kubernetes container CPU request/limit feature.
> > >>>>>>>>>      Benefits:
> > >>>>>>>>>      Reduce cost.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Canbin Zheng
> > >>>>>>>>>
> > >>>>>>>>> On Thu, Jul 23, 2020 at 12:44 PM Harold.Miao <mi...@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> I'm excited to hear about this feature; it is very, very, very highly
> > >>>>>>>>>> encouraged.
> > >>>>>>>>>>
> > >>>>>>>>>> On Thu, Jul 23, 2020 at 12:10 AM Prasanna kumar <pr...@gmail.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Flink Dev Team,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Dynamic autoscaling based on the incoming data load would be a great
> > >>>>>>>>>>> feature.
> > >>>>>>>>>>>
> > >>>>>>>>>>> We should be able to define a rule, say: if the load increases by 20%,
> > >>>>>>>>>>> extra resources should be added.
> > >>>>>>>>>>> Or a time-based rule, say: during these peak hours the pipeline should
> > >>>>>>>>>>> scale automatically by 50%.
> > >>>>>>>>>>>
> > >>>>>>>>>>> This will help a lot in cost reduction.
> > >>>>>>>>>>>
> > >>>>>>>>>>> EMR clusters provide a similar feature for Spark-based applications.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks,
> > >>>>>>>>>>> Prasanna.
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <rmetzger@apache.org> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Now that the 1.11 release is out, it is time to plan for the next major
> > >>>>>>>>>>>> Flink release.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Some items:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     1. Dian Fu and I volunteer to be the release managers for Flink 1.12.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     2. Timeline: We propose to stick to our approximate 4-month release
> > >>>>>>>>>>>>        cycle, thus the release should be done by late October. Given that
> > >>>>>>>>>>>>        there’s a holiday week in China at the beginning of October, I
> > >>>>>>>>>>>>        propose to do the feature freeze on master by late September.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     3. Collecting features: It would be good to have a rough overview of
> > >>>>>>>>>>>>        the features that will likely be ready to be merged by late
> > >>>>>>>>>>>>        September, and that we want in the release.
> > >>>>>>>>>>>>        Based on the discussion, we will update the Roadmap on the Flink
> > >>>>>>>>>>>>        website again!
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     4. Test instabilities and blockers: I would like to avoid a situation
> > >>>>>>>>>>>>        where we have many blocking issues or build instabilities at the
> > >>>>>>>>>>>>        time of the feature freeze. To achieve that, we will try to check
> > >>>>>>>>>>>>        every build instability within a week, to decide if it is a blocker
> > >>>>>>>>>>>>        (make sure to use the “test-stability” label for those tickets!).
> > >>>>>>>>>>>>        Blocker issues will need to have somebody assigned (responsible)
> > >>>>>>>>>>>>        within a week, and we want to see progress on all blocker issues
> > >>>>>>>>>>>>        (downgrade, resolution, a good plan how to proceed if it is more
> > >>>>>>>>>>>>        complicated).
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     5. Quality and stability of new features: In order to have a short
> > >>>>>>>>>>>>        feature freeze phase, we encourage developers to only merge
> > >>>>>>>>>>>>        well-tested and documented features. In our experience, the feature
> > >>>>>>>>>>>>        freeze works best if new features are complete, and the community
> > >>>>>>>>>>>>        can focus fully on addressing newly found bugs and voting on the
> > >>>>>>>>>>>>        release.
> > >>>>>>>>>>>>        By having a smooth release process, the merge window for the next
> > >>>>>>>>>>>>        release will come sooner.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Let me know what you think about our items, and share which features you
> > >>>>>>>>>>>> want in Flink 1.12.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Robert & Dian
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> --
> > >>>>>>>>>>
> > >>>>>>>>>> Best Regards,
> > >>>>>>>>>> Harold Miao
> > >>>>>>>>>>
> >
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Robert Metzger <rm...@apache.org>.

Thanks all for your opinions.

@Chesnay: That is a risk, but I hope the people responsible for individual
FLIPs plan accordingly. Extending the time till the feature freeze should
not mean that we are extending the scope of the release.
Ideally, features are done before Flink Forward, and the authors use the
time till the freeze for additional testing and documentation polishing.
This Flink Forward will be virtual, so there should be less disruption than
with a physical conference with all the travelling.
Do you have a different proposal for the timing?


I'm currently considering splitting the feature freeze and the release
branch creation. Similar to Linux kernel development, we could have a
"merge window" and a stabilization phase. At the end of the stabilization
phase, we cut the release branch and open the next merge window. (I'll start
a separate thread regarding this towards the end of this release cycle, if
I still like the idea then.)


On Wed, Aug 5, 2020 at 12:04 PM Chesnay Schepler <ch...@apache.org> wrote:

> I'm a bit concerned about the end of October, because it means we have Flink
> Forward, which usually means at least 1 week of little-to-no activity,
> and then 1 week until the feature freeze.
>
> On 05/08/2020 11:56, jincheng sun wrote:
> > +1 for end of October from me as well.
> >
> > Best,
> > Jincheng
> >
> >
> > On Wed, Aug 5, 2020 at 4:59 PM Kostas Kloudas <kk...@gmail.com> wrote:
> >
> >> +1 for end of October from me as well.
> >>
> >> Cheers,
> >> Kostas
> >>
> >> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <tr...@apache.org>
> wrote:
> >>
> >>> +1 for end of October from my side as well.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org> wrote:
> >>>
> >>>> The end of October sounds good from my side, unless it collides with
> >> some
> >>>> holidays that affect many committers.
> >>>>
> >>>> Feature-wise, I believe we can definitely make good use of the time to
> >>> wrap
> >>>> up some critical threads (like finishing the FLIP-27 source efforts).
> >>>>
> >>>> So +1 to the end of October from my side.
> >>>>
> >>>> Best,
> >>>> Stephan
> >>>>
> >>>>
> >>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <rm...@apache.org>
> >>> wrote:
> >>>>> Thanks a lot for commenting on the feature freeze date.
> >>>>>
> >>>>> You are raising a few good points on the timing.
> >>>>> If we already have concerns about the deadline two months ahead of it,
> >>>>> then I agree that we should move it to the end of October.
> >>>>>
> >>>>> We then just need to be careful not to run into the Christmas season at
> >>>>> the end of December.
> >>>>>
> >>>>> If nobody objects within a few days, I'll update the feature freeze date
> >>>>> in the Wiki.
> >>>>>
> >>>>>
> >>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com> wrote:
> >>>>>
> >>>>>> Regarding setting the feature freeze date to late September, I have some
> >>>>>> concern that it might make the development time of 1.12 too short.
> >>>>>>
> >>>>>> One reason for this is that we took too much time (about 1.5 months, from
> >>>>>> mid-May to the beginning of July) for testing 1.11. That's not ideal, but
> >>>>>> further squeezing the development time of 1.12 won't make this better.
> >>>>>> Besides, AFAIK July and August are also a popular vacation season for
> >>>>>> Europeans. Given that most committers of Flink come from Europe, I think
> >>>>>> we should also take this into consideration.
> >>>>>>
> >>>>>> It's also true that the first week of October is the national holiday of
> >>>>>> China, so I'm wondering whether the end of October could be a candidate
> >>>>>> feature freeze date.
> >>>>>>
> >>>>>> Best,
> >>>>>> Kurt
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <
> >> rmetzger@apache.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> Thanks a lot for the responses so far. I've put them into this Wiki page:
> >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep
> >>>>>>> track of them. Ideally, post JIRA tickets for your feature, then the
> >>>>>>> status will update automatically in the wiki :)
> >>>>>>>
> >>>>>>> Please keep posting features here, or add them to the Wiki yourself 🙏
> >>>>>>>
> >>>>>>> @Prasanna kumar <pr...@gmail.com>: Dynamic Auto Scaling is a
> >>>>>>> feature request the community is well aware of. Till has posted
> >>>>>>> "Reactive-scaling mode" as a feature he's working on for the 1.12 release.
> >>>>>>> This work will introduce the basic building blocks and partial support for
> >>>>>>> the feature you are requesting.
> >>>>>>> Proper support for dynamic scaling, while maintaining Flink's high
> >>>>>>> performance (throughput, low latency) and correctness, is a difficult task
> >>>>>>> that needs a lot of work. It will probably take a little bit of time till
> >>>>>>> this is fully available.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Robert
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <trohrmann@apache.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks for being our release managers for the 1.12 release, Dian &
> >>>>>>>> Robert!
> >>>>>>>>
> >>>>>>>> Here are some features I would like to work on for this release:
> >>>>>>>>
> >>>>>>>> # Features
> >>>>>>>>
> >>>>>>>> ## Finishing pipelined region scheduling (
> >>>>>>>> https://issues.apache.org/jira/browse/FLINK-16430)
> >>>>>>>> With the pipelined region scheduler we want to implement a scheduler
> >>>>>>>> that can serve streaming and batch workloads alike while being able to
> >>>>>>>> run jobs under constrained resources. The latter is particularly
> >>>>>>>> important for bounded streaming jobs, which are currently not well
> >>>>>>>> supported.
> >>>>>>>>
> >>>>>>>> ## Reactive-scaling mode
> >>>>>>>> Being able to react to newly available resources and rescale a running
> >>>>>>>> job accordingly will make Flink's operation much easier because
> >>>>>>>> resources can then be controlled by an external tool (e.g. GCP
> >>>>>>>> autoscaling, the K8s horizontal pod autoscaler, etc.). In this release
> >>>>>>>> we want to make a big step in this direction. As a first step we want
> >>>>>>>> to support the execution of jobs with a parallelism that is lower than
> >>>>>>>> the specified parallelism in case Flink lost a TaskManager or could
> >>>>>>>> not acquire enough resources.
> >>>>>>>>
> >>>>>>>> # Maintenance/Stability
> >>>>>>>>
> >>>>>>>> ## JM / TM finished task reconciliation (
> >>>>>>>> https://issues.apache.org/jira/browse/FLINK-17075)
> >>>>>>>> This prevents the system from going out of sync if a task state change
> >>>>>>>> from the TM to the JM is lost.
> >>>>>>>>
> >>>>>>>> ## Make metrics services work with Kubernetes deployments (
> >>>>>>>> https://issues.apache.org/jira/browse/FLINK-11127)
> >>>>>>>> Invert the direction in which the MetricFetcher connects to the
> >>>>>>>> MetricQueryFetchers. That way it will no longer be necessary to
> >>>>>>>> expose, for every TaskManager on K8s, a port on which the
> >>>>>>>> MetricQueryFetcher runs. This will then make the deployment of Flink
> >>>>>>>> clusters on K8s easier.
> >>>>>>>>
> >>>>>>>> ## Handle long-blocking operations during job submission (savepoint
> >>>>>>>> restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> >>>>>>>> Submitting a Flink job can involve interaction with external systems
> >>>>>>>> (blocking operations). Depending on the job, these interactions can
> >>>>>>>> take so long that they exceed the submission timeout, which reports a
> >>>>>>>> failure on the client side even though the actual submission
> >>>>>>>> succeeded. By decoupling the creation of the ExecutionGraph from the
> >>>>>>>> job submission, we can make the job submission non-blocking, which
> >>>>>>>> will solve this problem.
> >>>>>>>>
> >>>>>>>> ## Make IDs more intuitive to ease debugging (FLIP-118) (
> >>>>>>>> https://issues.apache.org/jira/browse/FLINK-15679)
> >>>>>>>> By making the internal Flink IDs compositional or logging how they
> >>>>>>>> belong together, we can make the debugging of Flink's operations much
> >>>>>>>> easier.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Till
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <felixzhengcb@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi All,
> >>>>>>>>>
> >>>>>>>>> Thanks for bringing up this discussion, Robert!
> >>>>>>>>> Congratulations on becoming the release managers of 1.12, Dian and
> >>>>>>>>> Robert!
> >>>>>>>>>
> >>>>>>>>> ----------
> >>>>>>>>> Here are some of my thoughts on the features for native integration
> >>>>>>>>> with Kubernetes in Flink 1.12:
> >>>>>>>>>
> >>>>>>>>> 1. Support user-specified pod templates
> >>>>>>>>>      Description:
> >>>>>>>>>      The current approach of introducing new configuration options
> >>>>>>>>>      for each aspect of the pod specification a user might want is
> >>>>>>>>>      becoming unwieldy: we have to maintain more and more Flink-side
> >>>>>>>>>      Kubernetes configuration options, and users have to learn the
> >>>>>>>>>      gap between the declarative model used by Kubernetes and the
> >>>>>>>>>      configuration model used by Flink. It would be a great
> >>>>>>>>>      improvement to allow users to specify pod templates as the
> >>>>>>>>>      central place for all customization needs of the jobmanager and
> >>>>>>>>>      taskmanager pods.
> >>>>>>>>>      Benefits:
> >>>>>>>>>      Users can leverage many of the advanced K8s features that the
> >>>>>>>>>      Flink community does not support explicitly, such as volume
> >>>>>>>>>      mounting, DNS configuration, pod affinity/anti-affinity
> >>>>>>>>>      settings, etc.
> >>>>>>>>>
> >>>>>>>>> 2. Support running PyFlink on Kubernetes
> >>>>>>>>>      Description:
> >>>>>>>>>      Support running PyFlink on Kubernetes, including session
> >>>>>>>>>      clusters and application clusters.
> >>>>>>>>>      Benefits:
> >>>>>>>>>      Running Python applications in a containerized environment.
> >>>>>>>>>
> >>>>>>>>> 3. Support built-in init-Container
> >>>>>>>>>      Description:
> >>>>>>>>>      We need a built-in init-Container to help solve dependency
> >>>>>>>>>      management in a containerized environment, especially in
> >>>>>>>>>      application mode.
> >>>>>>>>>      Benefits:
> >>>>>>>>>      Separate the base Flink image from dynamic dependencies.
> >>>>>>>>>
> >>>>>>>>> 4. Support accessing secured services via K8s secrets
> >>>>>>>>>      Description:
> >>>>>>>>>      Kubernetes Secrets
> >>>>>>>>>      <https://kubernetes.io/docs/concepts/configuration/secret/> can
> >>>>>>>>>      be used to provide credentials for a Flink application to
> >>>>>>>>>      access secured services. This helps people who want to use a
> >>>>>>>>>      user-specified K8s Secret through an environment variable.
> >>>>>>>>>      Benefits:
> >>>>>>>>>      Improve user experience.
> >>>>>>>>>
> >>>>>>>>> 5. Support configuring the replica count of the JobManager Deployment
> >>>>>>>>>    in ZooKeeper HA setups
> >>>>>>>>>      Description:
> >>>>>>>>>      Make the *replica* count of the Deployment configurable in
> >>>>>>>>>      ZooKeeper HA setups.
> >>>>>>>>>      Benefits:
> >>>>>>>>>      Achieve faster failover.
> >>>>>>>>>
> >>>>>>>>> 6. Support configuring a limit for the CPU requirement
> >>>>>>>>>      Description:
> >>>>>>>>>      Leverage the Kubernetes container CPU request/limit feature.
> >>>>>>>>>      Benefits:
> >>>>>>>>>      Reduce cost.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Canbin Zheng
> >>>>>>>>>
> >>>>>>>>> Harold.Miao <mi...@gmail.com> wrote on Thu, Jul 23, 2020 at 12:44 PM:
> >>>>>>>>>
> >>>>>>>>>> I'm excited to hear about this feature; very, very, very highly
> >>>>>>>>>> encouraged.
> >>>>>>>>>>
> >>>>>>>>>> Prasanna kumar <pr...@gmail.com> wrote on Thu, Jul 23, 2020 at
> >>>>>>>>>> 12:10 AM:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Flink Dev Team,
> >>>>>>>>>>>
> >>>>>>>>>>> Dynamic autoscaling based on the incoming data load would be a great
> >>>>>>>>>>> feature.
> >>>>>>>>>>>
> >>>>>>>>>>> We should be able to define rules such as: if the load increases by
> >>>>>>>>>>> 20%, extra resources should be added.
> >>>>>>>>>>> Or time-based rules, say, during these peak hours the pipeline
> >>>>>>>>>>> should scale automatically by 50%.
> >>>>>>>>>>>
> >>>>>>>>>>> This will help a lot in cost reduction.
> >>>>>>>>>>>
> >>>>>>>>>>> EMR clusters provide a similar feature for Spark-based applications.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Prasanna.
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <rmetzger@apache.org>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Now that the 1.11 release is out, it is time to plan for the next
> >>>>>>>>>>>> major Flink release.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Some items:
> >>>>>>>>>>>>
> >>>>>>>>>>>>     1. Dian Fu and I volunteer to be the release managers for
> >>>>>>>>>>>>     Flink 1.12.
> >>>>>>>>>>>>
> >>>>>>>>>>>>     2. Timeline: We propose to stick to our approximate 4-month
> >>>>>>>>>>>>     release cycle, thus the release should be done by late October.
> >>>>>>>>>>>>     Given that there’s a holiday week in China at the beginning of
> >>>>>>>>>>>>     October, I propose to do the feature freeze on master by late
> >>>>>>>>>>>>     September.
> >>>>>>>>>>>>
> >>>>>>>>>>>>     3. Collecting features: It would be good to have a rough
> >>>>>>>>>>>>     overview of the features that will likely be ready to be merged
> >>>>>>>>>>>>     by late September, and that we want in the release.
> >>>>>>>>>>>>     Based on the discussion, we will update the Roadmap on the
> >>>>>>>>>>>>     Flink website again!
> >>>>>>>>>>>>
> >>>>>>>>>>>>     4. Test instabilities and blockers: I would like to avoid a
> >>>>>>>>>>>>     situation where we have many blocking issues or build
> >>>>>>>>>>>>     instabilities at the time of the feature freeze. To achieve
> >>>>>>>>>>>>     that, we will try to check every build instability within a
> >>>>>>>>>>>>     week, to decide if it is a blocker (make sure to use the
> >>>>>>>>>>>>     “test-stability” label for those tickets!).
> >>>>>>>>>>>>     Blocker issues will need to have somebody assigned
> >>>>>>>>>>>>     (responsible) within a week, and we want to see progress on all
> >>>>>>>>>>>>     blocker issues (downgrade, resolution, or a good plan for how
> >>>>>>>>>>>>     to proceed if it is more complicated).
> >>>>>>>>>>>>
> >>>>>>>>>>>>     5. Quality and stability of new features: In order to have a
> >>>>>>>>>>>>     short feature freeze phase, we encourage developers to only
> >>>>>>>>>>>>     merge well-tested and documented features. In our experience,
> >>>>>>>>>>>>     the feature freeze works best if new features are complete, and
> >>>>>>>>>>>>     the community can focus fully on addressing newly found bugs
> >>>>>>>>>>>>     and voting on the release.
> >>>>>>>>>>>>     By having a smooth release process, the merge window for the
> >>>>>>>>>>>>     next release will come sooner.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Let me know what you think about our items, and share which
> >>>>>>>>>>>> features you want in Flink 1.12.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Robert & Dian
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>>
> >>>>>>>>>> Best Regards,
> >>>>>>>>>> Harold Miao
> >>>>>>>>>>
>
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Chesnay Schepler <ch...@apache.org>.

I'm a bit concerned about the end of October, because it means we have
Flink Forward, which usually means at least 1 week of little-to-no
activity, and then 1 week until the feature freeze.

On 05/08/2020 11:56, jincheng sun wrote:
> +1 for end of October from me as well.
>
> Best,
> Jincheng
>
>
> Kostas Kloudas <kk...@gmail.com> wrote on Wed, Aug 5, 2020 at 4:59 PM:
>
>> +1 for end of October from me as well.
>>
>> Cheers,
>> Kostas
>>
>> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <tr...@apache.org> wrote:
>>
>>> +1 for end of October from my side as well.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org> wrote:
>>>
>>>> The end of October sounds good from my side, unless it collides with
>>>> some holidays that affect many committers.
>>>>
>>>> Feature-wise, I believe we can definitely make good use of the time to
>>>> wrap up some critical threads (like finishing the FLIP-27 source efforts).
>>>>
>>>> So +1 to the end of October from my side.
>>>>
>>>> Best,
>>>> Stephan
>>>>
>>>>
>>>> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <rm...@apache.org>
>>>> wrote:
>>>>
>>>>> Thanks a lot for commenting on the feature freeze date.
>>>>>
>>>>> You are raising a few good points on the timing.
>>>>> If we already have concerns regarding the deadline (2 months before),
>>>>> then I agree that we should move it to the end of October.
>>>>>
>>>>> We then just need to be careful not to run into the Christmas season at
>>>>> the end of December.
>>>>>
>>>>> If nobody objects within a few days, I'll update the feature freeze
>>>>> date in the Wiki.
>>>>>
>>>>>
>>>>> On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com> wrote:
>>>>>
>>>>>> Regarding setting the feature freeze date to late September, I have
>>>>>> some concern that it might make the development time of 1.12 too short.
>>>>>>
>>>>>> One reason for this is that we took too much time (about 1.5 months,
>>>>>> from mid-May to the beginning of July) for testing 1.11. It's not
>>>>>> ideal, but further squeezing the development time of 1.12 won't make
>>>>>> this better.
>>>>>> Besides, AFAIK July & August is also a popular vacation season for
>>>>>> Europeans. Given the fact that most committers of Flink come from
>>>>>> Europe, I think we should also take this into consideration.
>>>>>>
>>>>>> It's also true that the first week of October is the national holiday
>>>>>> of China, so I'm wondering whether the end of October could be a
>>>>>> candidate feature freeze date.
>>>>>>
>>>>>> Best,
>>>>>> Kurt
>>>>>>
>>>>>>

Re: [DISCUSS] Planning Flink 1.12

Posted by jincheng sun <su...@gmail.com>.

+1 for end of October from me as well.

Best,
Jincheng


Kostas Kloudas <kk...@gmail.com> wrote on Wed, Aug 5, 2020 at 4:59 PM:

> +1 for end of October from me as well.
>
> Cheers,
> Kostas
>
> On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <tr...@apache.org> wrote:
>
> > +1 for end of October from my side as well.
> >
> > Cheers,
> > Till
> >
> > On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org> wrote:
> >
> > > > The end of October sounds good from my side, unless it collides with
> > > > some holidays that affect many committers.
> > > >
> > > > Feature-wise, I believe we can definitely make good use of the time to
> > > > wrap up some critical threads (like finishing the FLIP-27 source efforts).
> > > >
> > > > So +1 to the end of October from my side.
> > > >
> > > > Best,
> > > > Stephan
> > > >
> > > >
> > > > On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <rm...@apache.org>
> > > > wrote:
> > > >
> > > > > Thanks a lot for commenting on the feature freeze date.
> > > > >
> > > > > You are raising a few good points on the timing.
> > > > > If we already have concerns regarding the deadline (2 months before),
> > > > > then I agree that we should move it to the end of October.
> > > > >
> > > > > We then just need to be careful not to run into the Christmas season
> > > > > at the end of December.
> > > > >
> > > > > If nobody objects within a few days, I'll update the feature freeze
> > > > > date in the Wiki.
> > > > >
> > > > >
> > > > > On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com> wrote:
> > > > >
> > > > > > Regarding setting the feature freeze date to late September, I have
> > > > > > some concern that it might make the development time of 1.12 too
> > > > > > short.
> > > > > >
> > > > > > One reason for this is that we took too much time (about 1.5 months,
> > > > > > from mid-May to the beginning of July) for testing 1.11. It's not
> > > > > > ideal, but further squeezing the development time of 1.12 won't make
> > > > > > this better.
> > > > > > Besides, AFAIK July & August is also a popular vacation season for
> > > > > > Europeans. Given the fact that most committers of Flink come from
> > > > > > Europe, I think we should also take this into consideration.
> > > > > >
> > > > > > It's also true that the first week of October is the national
> > > > > > holiday of China, so I'm wondering whether the end of October could
> > > > > > be a candidate feature freeze date.
> > > > > >
> > > > > > Best,
> > > > > > Kurt
> > > > >
> > > > >
> > > > > > On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <rmetzger@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Thanks a lot for the responses so far. I've put them into this Wiki
> > > > > > > page:
> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to
> > > > > > > keep track of them. Ideally, post JIRA tickets for your feature,
> > > > > > > then the status will update automatically in the wiki :)
> > > > > > >
> > > > > > > Please keep posting features here, or add them to the Wiki yourself 🙏
> > > > > > >
> > > > > > > @Prasanna kumar <pr...@gmail.com>: Dynamic Auto Scaling is a
> > > > > > > feature request the community is well aware of. Till has posted
> > > > > > > "Reactive-scaling mode" as a feature he's working on for the 1.12
> > > > > > > release.
> > > > > > > This work will introduce the basic building blocks and partial
> > > > > > > support for the feature you are requesting.
> > > > > > > Proper support for dynamic scaling, while maintaining Flink's high
> > > > > > > performance (throughput, low latency) and correctness, is a
> > > > > > > difficult task that needs a lot of work. It will probably take a
> > > > > > > little bit of time until this is fully available.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Robert
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <trohrmann@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks for being our release managers for the 1.12 release, Dian &
> > > > > > > > Robert!
> > > > > > > >
> > > > > > > > Here are some features I would like to work on for this release:
> > > > > > > >
> > > > > > > > # Features
> > > > > > > >
> > > > > > > > ## Finishing pipelined region scheduling (
> > > > > > > > https://issues.apache.org/jira/browse/FLINK-16430)
> > > > > > > > With the pipelined region scheduler we want to implement a
> > > > > > > > scheduler that can serve streaming and batch workloads alike while
> > > > > > > > being able to run jobs under constrained resources. The latter is
> > > > > > > > particularly important for bounded streaming jobs, which are
> > > > > > > > currently not well supported.
> > > > > > > >
> > > > > > > > ## Reactive-scaling mode
> > > > > > > > Being able to react to newly available resources and rescale a
> > > > > > > > running job accordingly will make Flink's operation much easier
> > > > > > > > because resources can then be controlled by an external tool (e.g.
> > > > > > > > GCP autoscaling, the K8s horizontal pod autoscaler, etc.). In this
> > > > > > > > release we want to make a big step in this direction. As a first
> > > > > > > > step we want to support the execution of jobs with a parallelism
> > > > > > > > that is lower than the specified parallelism in case Flink lost a
> > > > > > > > TaskManager or could not acquire enough resources.
> > > > > > >
> > > > > > > # Maintenance/Stability
> > > > > > >
> > > > > > > ## JM / TM finished task reconciliation (
> > > > > > > https://issues.apache.org/jira/browse/FLINK-17075)
> > > > > > > This prevents the system from going out of sync if a task state
> > > > change
> > > > > > from
> > > > > > > the TM to the JM is lost.
> > > > > > >
> > > > > > > ## Make metrics services work with Kubernetes deployments (
> > > > > > > https://issues.apache.org/jira/browse/FLINK-11127)
> > > > > > > Invert the direction in which the MetricFetcher connects to the
> > > > > > > MetricQueryFetchers. That way it will no longer be necessary to
> > > > expose
> > > > > on
> > > > > > > K8s for every TaskManager a port on which the
> MetricQueryFetcher
> > > > runs.
> > > > > > This
> > > > > > > will then make the deployment of Flink clusters on K8s easier.
> > > > > > >
> > > > > > > ## Handle long-blocking operations during job submission (savepoint
> > > > > > > restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > > > > > > Submitting a Flink job can involve interaction with external systems
> > > > > > > (blocking operations). Depending on the job, these interactions can take
> > > > > > > so long that they exceed the submission timeout, which reports a failure
> > > > > > > on the client side even though the actual submission succeeded. By
> > > > > > > decoupling the creation of the ExecutionGraph from the job submission, we
> > > > > > > can make the job submission non-blocking, which will solve this problem.
> > > > > > >
> > > > > > > ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > > > > > > https://issues.apache.org/jira/browse/FLINK-15679)
> > > > > > > By making the internal Flink IDs compositional or logging how they belong
> > > > > > > together, we can make the debugging of Flink's operations much easier.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <felixzhengcb@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > Thanks for bringing up this discussion, Robert!
> > > > > > > > Congratulations on becoming the release managers of 1.12, Dian and Robert!
> > > > > > > >
> > > > > > > > ----------
> > > > > > > > Here are some of my thoughts on features for the native integration with
> > > > > > > > Kubernetes in Flink 1.12:
> > > > > > > >
> > > > > > > > 1. Support user-specified pod templates
> > > > > > > >     Description:
> > > > > > > >     The current approach of introducing a new configuration option for
> > > > > > > > each aspect of the pod specification a user might want is becoming
> > > > > > > > unwieldy: we have to maintain more and more Flink-side Kubernetes
> > > > > > > > configuration options, and users have to learn the gap between the
> > > > > > > > declarative model used by Kubernetes and the configuration model used by
> > > > > > > > Flink. It would be a great improvement to allow users to specify pod
> > > > > > > > templates as the central place for all customization needs of the
> > > > > > > > jobmanager and taskmanager pods.
> > > > > > > >     Benefits:
> > > > > > > >     Users can leverage many of the advanced K8s features that the Flink
> > > > > > > > community does not support explicitly, such as volume mounting, DNS
> > > > > > > > configuration, pod affinity/anti-affinity setting, etc.
> > > > > > > >
> > > > > > > > 2. Support running PyFlink on Kubernetes
> > > > > > > >     Description:
> > > > > > > >     Support running PyFlink on Kubernetes, including session cluster and
> > > > > > > > application cluster.
> > > > > > > >     Benefits:
> > > > > > > >     Running Python applications in a containerized environment.
> > > > > > > >
> > > > > > > > 3. Support built-in init-Container
> > > > > > > >     Description:
> > > > > > > >     We need a built-in init-Container to help solve dependency management
> > > > > > > > in a containerized environment, especially in the application mode.
> > > > > > > >     Benefits:
> > > > > > > >     Separate the base Flink image from dynamic dependencies.
> > > > > > > >
> > > > > > > > 4. Support accessing secured services via K8s secrets
> > > > > > > >     Description:
> > > > > > > >     Kubernetes Secrets
> > > > > > > > <https://kubernetes.io/docs/concepts/configuration/secret/> can be used to
> > > > > > > > provide credentials for a Flink application to access secured services. It
> > > > > > > > helps people who want to use a user-specified K8s Secret through an
> > > > > > > > environment variable.
> > > > > > > >     Benefits:
> > > > > > > >     Improve user experience.
> > > > > > > >
> > > > > > > > 5. Support configuring the replica count of the JobManager Deployment in
> > > > > > > > ZooKeeper HA setups
> > > > > > > >     Description:
> > > > > > > >     Make the *replicas* of the Deployment configurable in ZooKeeper HA
> > > > > > > > setups.
> > > > > > > >     Benefits:
> > > > > > > >     Achieve faster failover.
> > > > > > > >
> > > > > > > > 6. Support configuring a limit for the CPU requirement
> > > > > > > >     Description:
> > > > > > > >     To leverage the Kubernetes feature of container CPU requests/limits.
> > > > > > > >     Benefits:
> > > > > > > >     Reduce cost.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Canbin Zheng
> > > > > > > >
> > > > > > > > On Thu, Jul 23, 2020 at 12:44 PM Harold.Miao <mi...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I'm excited to hear about this feature, very, very, very highly encouraged
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jul 23, 2020 at 12:10 AM Prasanna kumar <pr...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Flink Dev Team,
> > > > > > > > > >
> > > > > > > > > > Dynamic autoscaling based on the incoming data load would be a great
> > > > > > > > > > feature.
> > > > > > > > > >
> > > > > > > > > > We should be able to define rules, e.g. if the load increases by 20%,
> > > > > > > > > > extra resources should be added.
> > > > > > > > > > Or time-based rules, e.g. during peak hours the pipeline should scale
> > > > > > > > > > automatically by 50%.
> > > > > > > > > >
> > > > > > > > > > This will help a lot in cost reduction.
> > > > > > > > > >
> > > > > > > > > > EMR clusters provide a similar feature for Spark-based applications.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Prasanna.
> > > > > > > > > >
> > > > > > > > > > On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger <rmetzger@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > Now that the 1.11 release is out, it is time to plan for the next major
> > > > > > > > > > > Flink release.
> > > > > > > > > > >
> > > > > > > > > > > Some items:
> > > > > > > > > > >
> > > > > > > > > > >    1. Dian Fu and I volunteer to be the release managers for Flink 1.12.
> > > > > > > > > > >
> > > > > > > > > > >    2. Timeline: We propose to stick to our approximate 4 month release
> > > > > > > > > > >    cycle, thus the release should be done by late October. Given that
> > > > > > > > > > >    there’s a holiday week in China at the beginning of October, I propose
> > > > > > > > > > >    to do the feature freeze on master by late September.
> > > > > > > > > > >
> > > > > > > > > > >    3. Collecting features: It would be good to have a rough overview of
> > > > > > > > > > >    the features that will likely be ready to be merged by late September,
> > > > > > > > > > >    and that we want in the release.
> > > > > > > > > > >    Based on the discussion, we will update the Roadmap on the Flink
> > > > > > > > > > >    website again!
> > > > > > > > > > >
> > > > > > > > > > >    4. Test instabilities and blockers: I would like to avoid a situation
> > > > > > > > > > >    where we have many blocking issues or build instabilities at the time
> > > > > > > > > > >    of the feature freeze. To achieve that, we will try to check every
> > > > > > > > > > >    build instability within a week, to decide if it is a blocker (make
> > > > > > > > > > >    sure to use the “test-stability” label for those tickets!).
> > > > > > > > > > >    Blocker issues will need to have somebody assigned (responsible)
> > > > > > > > > > >    within a week, and we want to see progress on all blocker issues
> > > > > > > > > > >    (downgrade, resolution, or a good plan for how to proceed if it is
> > > > > > > > > > >    more complicated).
> > > > > > > > > > >
> > > > > > > > > > >    5. Quality and stability of new features: In order to have a short
> > > > > > > > > > >    feature freeze phase, we encourage developers to only merge
> > > > > > > > > > >    well-tested and documented features. In our experience, the feature
> > > > > > > > > > >    freeze works best if new features are complete, and the community can
> > > > > > > > > > >    focus fully on addressing newly found bugs and voting on the release.
> > > > > > > > > > >    By having a smooth release process, the merge window for the next
> > > > > > > > > > >    release will come sooner.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Let me know what you think about our items, and share which features you
> > > > > > > > > > > want in Flink 1.12.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > >
> > > > > > > > > > > Robert & Dian
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Best Regards,
> > > > > > > > > Harold Miao
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Kostas Kloudas <kk...@gmail.com>.

+1 for end of October from me as well.

Cheers,
Kostas

On Wed, Aug 5, 2020 at 9:59 AM Till Rohrmann <tr...@apache.org> wrote:

> +1 for end of October from my side as well.
>
> Cheers,
> Till

Re: [DISCUSS] Planning Flink 1.12

Posted by Till Rohrmann <tr...@apache.org>.

+1 for end of October from my side as well.

Cheers,
Till

On Tue, Aug 4, 2020 at 9:46 PM Stephan Ewen <se...@apache.org> wrote:

> The end of October sounds good from my side, unless it collides with some
> holidays that affect many committers.
>
> Feature-wise, I believe we can definitely make good use of the time to wrap
> up some critical threads (like finishing the FLIP-27 source efforts).
>
> So +1 to the end of October from my side.
>
> Best,
> Stephan
>
>
> On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <rm...@apache.org> wrote:
>
> > Thanks a lot for commenting on the feature freeze date.
> >
> > You are raising a few good points on the timing.
> > If we already have concerns regarding the deadline (2 months before), then
> > I agree that we should move it to the end of October.
> >
> > We then just need to be careful not to run into the Christmas season at the
> > end of December.
> >
> > If nobody objects within a few days, I'll update the feature freeze date in
> > the Wiki.
> >
> >
> > On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com> wrote:
> >
> > > Regarding setting the feature freeze date to late September, I have some
> > > concern that it might make the development time of 1.12 too short.
> > >
> > > One reason for this is that we took too much time (about 1.5 months, from
> > > mid May to the beginning of July) for testing 1.11. It's not ideal, but
> > > further squeezing the development time of 1.12 won't make this better.
> > > Besides, AFAIK July & August are also a popular vacation season for
> > > Europeans. Given the fact that most committers of Flink come from Europe, I
> > > think we should also take this into consideration.
> > >
> > > It's also true that the first week of October is the national holiday of
> > > China, so I'm wondering whether the end of October could be a candidate
> > > feature freeze date.
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <rm...@apache.org>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > Thanks a lot for the responses so far. I've put them into this Wiki page:
> > > > https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep
> > > > track of them. Ideally, post JIRA tickets for your feature, then the status
> > > > will update automatically in the wiki :)
> > > >
> > > > Please keep posting features here, or add them to the Wiki yourself 🙏
> > > >
> > > > @Prasanna kumar <pr...@gmail.com>: Dynamic Auto Scaling is a
> > > > feature request the community is well aware of. Till has posted
> > > > "Reactive-scaling mode" as a feature he's working on for the 1.12 release.
> > > > This work will introduce the basic building blocks and partial support for
> > > > the feature you are requesting.
> > > > Proper support for dynamic scaling, while maintaining Flink's high
> > > > performance (throughput, low latency) and correctness, is a difficult task
> > > > that needs a lot of work. It will probably take a little bit of time until
> > > > this is fully available.
> > > >
> > > > Cheers,
> > > > Robert
> > > >
> > > >
> > > >
> > > > On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <tr...@apache.org>
> > > > wrote:
> > > >
> > > > > Thanks for being our release managers for the 1.12 release, Dian & Robert!
> > > > >
> > > > > Here are some features I would like to work on for this release:
> > > > >
> > > > > # Features
> > > > >
> > > > > ## Finishing pipelined region scheduling (
> > > > > https://issues.apache.org/jira/browse/FLINK-16430)
> > > > > With the pipelined region scheduler we want to implement a scheduler which
> > > > > can serve streaming as well as batch workloads while being able to
> > > > > run jobs under constrained resources. The latter is particularly important
> > > > > for bounded streaming jobs which, currently, are not well supported.
> > > > >
> > > > > ## Reactive-scaling mode
> > > > > Being able to react to newly available resources and rescale a running
> > > > > job accordingly will make Flink's operation much easier because resources
> > > > > can then be controlled by an external tool (e.g. GCP autoscaling, K8s
> > > > > Horizontal Pod Autoscaler, etc.). In this release we want to make a big
> > > > > step in this direction. As a first step we want to support the execution
> > > > > of jobs with a parallelism which is lower than the specified parallelism
> > > > > in case Flink lost a TaskManager or could not acquire enough resources.
> > > > >
> > > > > # Maintenance/Stability
> > > > >
> > > > > ## JM / TM finished task reconciliation (
> > > > > https://issues.apache.org/jira/browse/FLINK-17075)
> > > > > This prevents the system from going out of sync if a task state change
> > > > > from the TM to the JM is lost.
> > > > >
> > > > > ## Make metrics services work with Kubernetes deployments (
> > > > > https://issues.apache.org/jira/browse/FLINK-11127)
> > > > > Invert the direction in which the MetricFetcher connects to the
> > > > > MetricQueryFetchers. That way it will no longer be necessary to expose,
> > > > > for every TaskManager on K8s, a port on which the MetricQueryFetcher runs.
> > > > > This will then make the deployment of Flink clusters on K8s easier.
> > > > >
> > > > > ## Handle long-blocking operations during job submission (savepoint
> > > > > restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > > > > Submitting a Flink job can involve interaction with external systems
> > > > > (blocking operations). Depending on the job, these interactions can take
> > > > > so long that they exceed the submission timeout, which reports a failure
> > > > > on the client side even though the actual submission succeeded. By
> > > > > decoupling the creation of the ExecutionGraph from the job submission, we
> > > > > can make the job submission non-blocking, which will solve this problem.
> > > > >
> > > > > ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > > > > https://issues.apache.org/jira/browse/FLINK-15679)
> > > > > By making the internal Flink IDs compositional or logging how they
> > > > > belong together, we can make the debugging of Flink's operations much
> > > > > easier.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > >
> > > > > On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <felixzhengcb@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > Thanks for bringing up this discussion, Robert!
> > > > > > Congratulations on becoming the release managers for 1.12, Dian
> > > > > > and Robert!
> > > > > >
> > > > > > ----------
> > > > > > Here are some of my thoughts on features for the native
> > > > > > Kubernetes integration in Flink 1.12:
> > > > > >
> > > > > > 1. Support user-specified pod templates
> > > > > >     Description:
> > > > > >     The current approach of introducing a new configuration
> > > > > > option for each aspect of the pod specification a user might want
> > > > > > to set is becoming unwieldy: we have to maintain more and more
> > > > > > Flink-side Kubernetes configuration options, and users have to
> > > > > > learn the gap between the declarative model used by Kubernetes
> > > > > > and the configuration model used by Flink. It would be a great
> > > > > > improvement to allow users to specify pod templates as the
> > > > > > central place for all customization needs of the jobmanager and
> > > > > > taskmanager pods.
> > > > > >     Benefits:
> > > > > >     Users can leverage many of the advanced K8s features that the
> > > > > > Flink community does not support explicitly, such as volume
> > > > > > mounting, DNS configuration, pod affinity/anti-affinity settings,
> > > > > > etc.
> > > > > >
> > > > > > 2. Support running PyFlink on Kubernetes
> > > > > >     Description:
> > > > > >     Support running PyFlink on Kubernetes, including session
> > > > > > clusters and application clusters.
> > > > > >     Benefits:
> > > > > >     Run Python applications in a containerized environment.
> > > > > >
> > > > > > 3. Support a built-in init-Container
> > > > > >     Description:
> > > > > >     We need a built-in init-Container to help solve dependency
> > > > > > management in a containerized environment, especially in
> > > > > > application mode.
> > > > > >     Benefits:
> > > > > >     Separate the base Flink image from dynamic dependencies.
> > > > > >
> > > > > > 4. Support accessing secured services via K8s Secrets
> > > > > >     Description:
> > > > > >     Kubernetes Secrets
> > > > > > <https://kubernetes.io/docs/concepts/configuration/secret/> can
> > > > > > be used to provide credentials for a Flink application to access
> > > > > > secured services. This helps people who want to use a
> > > > > > user-specified K8s Secret through an environment variable.
> > > > > >     Benefits:
> > > > > >     Improve user experience.
> > > > > >
> > > > > > 5. Support configuring the replica count of the JobManager
> > > > > > Deployment in ZooKeeper HA setups
> > > > > >     Description:
> > > > > >     Make the *replica* count of the Deployment configurable in
> > > > > > ZooKeeper HA setups.
> > > > > >     Benefits:
> > > > > >     Achieve faster failover.
> > > > > >
> > > > > > 6. Support configuring CPU limits
> > > > > >     Description:
> > > > > >     Leverage the Kubernetes container CPU request/limit feature.
> > > > > >     Benefits:
> > > > > >     Reduce cost.
> > > > > >
> > > > > > Regards,
> > > > > > Canbin Zheng
> > > > > >
> > > > > > On Thu, Jul 23, 2020 at 12:44 PM Harold.Miao <mi...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I'm excited to hear about this feature, very, very, very highly
> > > > > > > encouraged.
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jul 23, 2020 at 12:10 AM Prasanna kumar
> > > > > > > <pr...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Flink Dev Team,
> > > > > > > >
> > > > > > > > Dynamic autoscaling based on the incoming data load would be
> > > > > > > > a great feature.
> > > > > > > >
> > > > > > > > We should be able to define rules, for example: if the load
> > > > > > > > increases by 20%, extra resources should be added.
> > > > > > > > Or time-based rules, for example: during peak hours the
> > > > > > > > pipeline should scale automatically by 50%.
> > > > > > > >
> > > > > > > > This will help a lot in cost reduction.
> > > > > > > >
> > > > > > > > EMR clusters provide a similar feature for Spark-based
> > > > > > > > applications.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Prasanna.
> > > > > > > >
> > > > > > > > On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger
> > > > > > > > <rmetzger@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > Now that the 1.11 release is out, it is time to plan for
> > > > > > > > > the next major Flink release.
> > > > > > > > >
> > > > > > > > > Some items:
> > > > > > > > >
> > > > > > > > >    1. Dian Fu and I volunteer to be the release managers
> > > > > > > > >    for Flink 1.12.
> > > > > > > > >
> > > > > > > > >    2. Timeline: We propose to stick to our approximately
> > > > > > > > >    4-month release cycle, thus the release should be done
> > > > > > > > >    by late October. Given that there’s a holiday week in
> > > > > > > > >    China at the beginning of October, I propose to do the
> > > > > > > > >    feature freeze on master by late September.
> > > > > > > > >
> > > > > > > > >    3. Collecting features: It would be good to have a
> > > > > > > > >    rough overview of the features that will likely be
> > > > > > > > >    ready to be merged by late September, and that we want
> > > > > > > > >    in the release.
> > > > > > > > >    Based on the discussion, we will update the Roadmap on
> > > > > > > > >    the Flink website again!
> > > > > > > > >
> > > > > > > > >    4. Test instabilities and blockers: I would like to
> > > > > > > > >    avoid a situation where we have many blocking issues
> > > > > > > > >    or build instabilities at the time of the feature
> > > > > > > > >    freeze. To achieve that, we will try to check every
> > > > > > > > >    build instability within a week, to decide if it is a
> > > > > > > > >    blocker (make sure to use the “test-stability” label
> > > > > > > > >    for those tickets!).
> > > > > > > > >    Blocker issues will need to have somebody assigned
> > > > > > > > >    (responsible) within a week, and we want to see
> > > > > > > > >    progress on all blocker issues (downgrade, resolution,
> > > > > > > > >    or a good plan for how to proceed if it is more
> > > > > > > > >    complicated).
> > > > > > > > >
> > > > > > > > >    5. Quality and stability of new features: In order to
> > > > > > > > >    have a short feature freeze phase, we encourage
> > > > > > > > >    developers to only merge well-tested and documented
> > > > > > > > >    features. In our experience, the feature freeze works
> > > > > > > > >    best if new features are complete, and the community
> > > > > > > > >    can focus fully on addressing newly found bugs and
> > > > > > > > >    voting on the release.
> > > > > > > > >    By having a smooth release process, the merge window
> > > > > > > > >    for the next release will come sooner.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Let me know what you think about our items, and share
> > > > > > > > > which features you want in Flink 1.12.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > >
> > > > > > > > > Robert & Dian
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Best Regards,
> > > > > > > Harold Miao
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Planning Flink 1.12

Posted by Stephan Ewen <se...@apache.org>.

The end of October sounds good from my side, unless it collides with some
holidays that affect many committers.

Feature-wise, I believe we can definitely make good use of the time to wrap
up some critical threads (like finishing the FLIP-27 source efforts).

So +1 to the end of October from my side.

Best,
Stephan


On Tue, Aug 4, 2020 at 8:59 AM Robert Metzger <rm...@apache.org> wrote:

> Thanks a lot for commenting on the feature freeze date.
>
> You are raising a few good points on the timing.
> If we already have concerns about the deadline two months ahead of it, then
> I agree that we should move it to the end of October.
>
> We then just need to be careful not to run into the Christmas season at the
> end of December.
>
> If nobody objects within a few days, I'll update the feature freeze date in
> the Wiki.

Re: [DISCUSS] Planning Flink 1.12

Posted by Robert Metzger <rm...@apache.org>.

Thanks a lot for commenting on the feature freeze date.

You are raising a few good points on the timing.
If we already have concerns about the deadline two months ahead of it, then
I agree that we should move it to the end of October.

We then just need to be careful not to run into the Christmas season at the
end of December.

If nobody objects within a few days, I'll update the feature freeze date in
the Wiki.


On Tue, Aug 4, 2020 at 7:52 AM Kurt Young <yk...@gmail.com> wrote:

> Regarding setting the feature freeze date to late September, I have some
> concern that it might make the development time of 1.12 too short.
>
> One reason for this is that we took too much time (about 1.5 months, from
> mid-May to the beginning of July) for testing 1.11. That is not ideal, but
> squeezing the development time of 1.12 further won't make it better.
> Besides, AFAIK July and August are also a popular vacation season for
> Europeans. Given that most Flink committers come from Europe, I think we
> should also take this into consideration.
>
> It's also true that the first week of October is the national holiday of
> China, so I'm wondering whether the end of October could be a candidate
> feature freeze date.
>
> Best,
> Kurt
>
>
> On Tue, Jul 28, 2020 at 2:41 AM Robert Metzger <rm...@apache.org>
> wrote:
>
> > Hi all,
> >
> > Thanks a lot for the responses so far. I've put them into this Wiki page:
> > https://cwiki.apache.org/confluence/display/FLINK/1.12+Release to keep
> > track of them. Ideally, post JIRA tickets for your feature, then the
> > status will update automatically in the wiki :)
> >
> > Please keep posting features here, or add them to the Wiki yourself 🙏
> >
> > @Prasanna kumar <pr...@gmail.com>: Dynamic Auto Scaling is a
> > feature request the community is well aware of. Till has posted
> > "Reactive-scaling mode" as a feature he's working on for the 1.12
> > release. This work will introduce the basic building blocks and partial
> > support for the feature you are requesting.
> > Proper support for dynamic scaling, while maintaining Flink's high
> > performance (throughput, low latency) and correctness, is a difficult
> > task that needs a lot of work. It will probably take a little while
> > until this is fully available.
> >
> > Cheers,
> > Robert
> >
> >
> >
> > On Thu, Jul 23, 2020 at 2:27 PM Till Rohrmann <tr...@apache.org>
> > wrote:
> >
> > > Thanks for being our release managers for the 1.12 release, Dian &
> > > Robert!
> > >
> > > Here are some features I would like to work on for this release:
> > >
> > > # Features
> > >
> > > ## Finishing pipelined region scheduling (
> > > https://issues.apache.org/jira/browse/FLINK-16430)
> > > With the pipelined region scheduler we want to implement a scheduler
> > > which can serve streaming as well as batch workloads while being able
> > > to run jobs under constrained resources. The latter is particularly
> > > important for bounded streaming jobs which, currently, are not well
> > > supported.
> > >
> > > ## Reactive-scaling mode
> > > Being able to react to newly available resources and rescale a running
> > > job accordingly will make Flink much easier to operate, because
> > > resources can then be controlled by an external tool (e.g. GCP
> > > autoscaling, the K8s Horizontal Pod Autoscaler, etc.). In this release
> > > we want to take a big step in this direction. As a first step, we want
> > > to support executing jobs with a parallelism lower than the specified
> > > parallelism in case Flink lost a TaskManager or could not acquire
> > > enough resources.
> > >
> > > # Maintenance/Stability
> > >
> > > ## JM / TM finished task reconciliation (
> > > https://issues.apache.org/jira/browse/FLINK-17075)
> > > This prevents the system from going out of sync if a task state change
> > > from the TM to the JM is lost.
> > >
> > > ## Make metrics services work with Kubernetes deployments (
> > > https://issues.apache.org/jira/browse/FLINK-11127)
> > > Invert the direction in which the MetricFetcher connects to the
> > > MetricQueryFetchers. That way it will no longer be necessary to expose,
> > > for every TaskManager on K8s, a port on which the MetricQueryFetcher
> > > runs. This will make deploying Flink clusters on K8s easier.
> > >
> > > ## Handle long-blocking operations during job submission (savepoint
> > > restore) (https://issues.apache.org/jira/browse/FLINK-16866)
> > > Submitting a Flink job can involve interaction with external systems
> > > (blocking operations). Depending on the job, these interactions can
> > > take so long that they exceed the submission timeout, which reports a
> > > failure on the client side even though the actual submission succeeded.
> > > By decoupling the creation of the ExecutionGraph from the job
> > > submission, we can make the job submission non-blocking, which will
> > > solve this problem.
> > >
> > > ## Make IDs more intuitive to ease debugging (FLIP-118) (
> > > https://issues.apache.org/jira/browse/FLINK-15679)
> > > By making the internal Flink IDs compositional or logging how they
> > > belong together, we can make the debugging of Flink's operations much
> > > easier.
> > >
> > > Cheers,
> > > Till
> > >
> > >
> > > On Thu, Jul 23, 2020 at 7:48 AM Canbin Zheng <fe...@gmail.com>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > Thanks for bringing up this discussion, Robert!
> > > > Congratulations on becoming the release managers for 1.12, Dian and
> > > > Robert!
> > > >
> > > > ----------
> > > > Here are some of my thoughts on features for the native Kubernetes
> > > > integration in Flink 1.12:
> > > >
> > > > 1. Support user-specified pod templates
> > > >     Description:
> > > >     The current approach of introducing a new configuration option
> > > > for each aspect of the pod specification a user might want to set is
> > > > becoming unwieldy: we have to maintain more and more Flink-side
> > > > Kubernetes configuration options, and users have to learn the gap
> > > > between the declarative model used by Kubernetes and the
> > > > configuration model used by Flink. It would be a great improvement to
> > > > allow users to specify pod templates as the central place for all
> > > > customization needs of the jobmanager and taskmanager pods.
> > > >     Benefits:
> > > >     Users can leverage many of the advanced K8s features that the
> > > > Flink community does not support explicitly, such as volume mounting,
> > > > DNS configuration, pod affinity/anti-affinity settings, etc.
> > > >
> > > > 2. Support running PyFlink on Kubernetes
> > > >     Description:
> > > >     Support running PyFlink on Kubernetes, including session clusters
> > > > and application clusters.
> > > >     Benefits:
> > > >     Run Python applications in a containerized environment.
> > > >
> > > > 3. Support a built-in init-Container
> > > >     Description:
> > > >     We need a built-in init-Container to help solve dependency
> > > > management in a containerized environment, especially in application
> > > > mode.
> > > >     Benefits:
> > > >     Separate the base Flink image from dynamic dependencies.
> > > >
> > > > 4. Support accessing secured services via K8s Secrets
> > > >     Description:
> > > >     Kubernetes Secrets
> > > > <https://kubernetes.io/docs/concepts/configuration/secret/> can be
> > > > used to provide credentials for a Flink application to access secured
> > > > services. This helps people who want to use a user-specified K8s
> > > > Secret through an environment variable.
> > > >     Benefits:
> > > >     Improve user experience.
> > > >
> > > > 5. Support configuring the replica count of the JobManager Deployment
> > > > in ZooKeeper HA setups
> > > >     Description:
> > > >     Make the *replica* count of the Deployment configurable in
> > > > ZooKeeper HA setups.
> > > >     Benefits:
> > > >     Achieve faster failover.
> > > >
> > > > 6. Support configuring CPU limits
> > > >     Description:
> > > >     Leverage the Kubernetes container CPU request/limit feature.
> > > >     Benefits:
> > > >     Reduce cost.
> > > >
> > > > Regards,
> > > > Canbin Zheng
> > > >
> > > > On Thu, Jul 23, 2020 at 12:44 PM Harold.Miao <mi...@gmail.com>
> > > > wrote:
> > > >
> > > > > I'm excited to hear about this feature, very, very, very highly
> > > > > encouraged.
> > > > >
> > > > >
> > > > > On Thu, Jul 23, 2020 at 12:10 AM Prasanna kumar
> > > > > <pr...@gmail.com> wrote:
> > > > >
> > > > > > Hi Flink Dev Team,
> > > > > >
> > > > > > Dynamic autoscaling based on the incoming data load would be a
> > > > > > great feature.
> > > > > >
> > > > > > We should be able to define rules, for example: if the load
> > > > > > increases by 20%, extra resources should be added.
> > > > > > Or time-based rules, for example: during peak hours the pipeline
> > > > > > should scale automatically by 50%.
> > > > > >
> > > > > > This will help a lot in cost reduction.
> > > > > >
> > > > > > EMR clusters provide a similar feature for Spark-based
> > > > > > applications.
> > > > > >
> > > > > > Thanks,
> > > > > > Prasanna.
> > > > > >
> > > > > > On Wed, Jul 22, 2020 at 5:40 PM Robert Metzger
> > > > > > <rmetzger@apache.org> wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Now that the 1.11 release is out, it is time to plan for the
> > > > > > > next major Flink release.
> > > > > > >
> > > > > > > Some items:
> > > > > > >
> > > > > > >    1. Dian Fu and I volunteer to be the release managers for Flink
> > > > > > >    1.12.
> > > > > > >
> > > > > > >    2. Timeline: We propose to stick to our approximate 4-month
> > > > > > >    release cycle, thus the release should be done by late October.
> > > > > > >    Given that there’s a holiday week in China at the beginning of
> > > > > > >    October, I propose to do the feature freeze on master by late
> > > > > > >    September.
> > > > > > >
> > > > > > >    3. Collecting features: It would be good to have a rough overview
> > > > > > >    of the features that will likely be ready to be merged by late
> > > > > > >    September, and that we want in the release.
> > > > > > >    Based on the discussion, we will update the Roadmap on the Flink
> > > > > > >    website again!
> > > > > > >
> > > > > > >    4. Test instabilities and blockers: I would like to avoid a
> > > > > > >    situation where we have many blocking issues or build
> > > > > > >    instabilities at the time of the feature freeze. To achieve that,
> > > > > > >    we will try to check every build instability within a week, to
> > > > > > >    decide if it is a blocker (make sure to use the “test-stability”
> > > > > > >    label for those tickets!)
> > > > > > >    Blocker issues will need to have somebody assigned (responsible)
> > > > > > >    within a week, and we want to see progress on all blocker issues
> > > > > > >    (downgrade, resolution, a good plan how to proceed if it is more
> > > > > > >    complicated)
> > > > > > >
> > > > > > >    5. Quality and stability of new features: In order to have a short
> > > > > > >    feature freeze phase, we encourage developers to only merge
> > > > > > >    well-tested and documented features. In our experience, the
> > > > > > >    feature freeze works best if new features are complete, and the
> > > > > > >    community can focus fully on addressing newly found bugs and
> > > > > > >    voting on the release.
> > > > > > >    By having a smooth release process, the merge window for the next
> > > > > > >    release will come sooner.
> > > > > > >
> > > > > > >
> > > > > > > Let me know what you think about our items, and share which
> > > > > > > features you want in Flink 1.12.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Robert & Dian
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best Regards,
> > > > > Harold Miao
> > > > >
> > > >
> > >
> >
>