Posted to dev@flink.apache.org by Xintong Song <to...@gmail.com> on 2023/06/21 06:41:41 UTC

[DISCUSS] Release 2.0 Work Items

Hi devs,

As previously discussed in [1], we had been collecting work item proposals
for the 2.0 release until June 15th, on the wiki page [2].

   - As we have passed the due date, I'd like to kindly remind everyone *not
   to add / remove items directly on the wiki page*. If needed, please post
   in this thread or reach out to the release managers instead.
   - I've reached out to some folks for clarifications about their
   proposals. Some of them mentioned that they cannot yet tell whether we
   should do an item or not, and would need more time / discussions to make
   the decision. So I added a new symbol for items whose priorities are `TBD`.

Now it's time to collaboratively decide on a minimum set of must-have items.
I've gone through the entire list of proposed items, and found that most of
them make a lot of sense. So I think an online sync might not be necessary
for this. I'd like to go with this DISCUSS thread, where everyone can
comment on how they think the list can be improved, followed by a VOTE to
formally make the decision.

Any feedback and opinions, including but not limited to the following
aspects, will be appreciated.

   - Important items that are missing from the list
   - Concerns regarding the listed items or their priorities

Looking forward to your feedback.

Best,

Xintong


[1]
https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates

[2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release

Re: [DISCUSS] Release 2.0 Work Items

Posted by Yuan Mei <yu...@gmail.com>.
Thanks Xintong!

I am +1 on the change.

Best
Yuan

On Mon, Jul 3, 2023 at 6:20 PM Jing Ge <ji...@ververica.com.invalid> wrote:

> Hi Sergey,
>
> Thanks for the clarification! I will not hijack this thread to discuss
> Scala code strategy.
>
> Best regards,
> Jing
>
> On Mon, Jul 3, 2023 at 10:51 AM Sergey Nuyanzin <sn...@gmail.com>
> wrote:
>
> > Hi Jing,
> >
> > Maybe I was not clear enough, sorry.
> > However, the main reason for this item about Calcite rules is not
> > abandoning Scala.
> > The main reason is the changes in Calcite itself, which introduced a code
> > generator framework (Immutables) to generate config Java classes for
> > rules; the old API for that (which is used in the Scala Calcite rules) is
> > marked as deprecated.
> > Since Immutables relies on code generation during Java compilation, it
> > seems impossible to use it for rules written in Scala.
> >
> > On Mon, Jul 3, 2023 at 10:44 AM Jing Ge <ji...@ververica.com.invalid>
> > wrote:
> >
> > > Hi,
> > >
> > > Speaking of "Move Calcite rules from Scala to Java", I was wondering if
> > > this thread is the right place to talk about it. Afaik, the Flink
> > community
> > > has decided to abandon Scala. That is the reason, I guess, we want to
> > move
> > > those Calcite rules from Scala to Java. On the other side, new Scala
> code
> > > will be added while developing new features[1]. Do we have any thoughts
> > > wrt the Scala code strategy?
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > >
> > > [1] https://lists.apache.org/thread/tnygl4n3q1fx75cl2vclc78j8mrxmz6y
> > >
> > > On Mon, Jul 3, 2023 at 10:31 AM Xintong Song <to...@gmail.com>
> > > wrote:
> > >
> > > > Thanks all for the discussion.
> > > >
> > > >
> > > > > IIUC, we need to make the following changes. Please correct me if I
> > > > > get it wrong.
> > > > >
> > > > >
> > > > > 1. Disaggregated State Management - Clarify that only the public API
> > > > > related part is must-have for 2.0.
> > > > >
> > > > > 2. Java version support - Split it into 3 items: a) make Java 17 the
> > > > > default (must-have), b) drop Java 8 (must-have), and c) drop Java 11
> > > > > (nice-to-have)
> > > > >
> > > > > 3. Add MetricGroup#getLogicalScope - Should be promoted to must-have
> > > > >
> > > > > 4. ProcessFunction API - Should be downgraded to nice-to-have
> > > > >
> > > > > 5. Configuration - Add an item "revisit all config option types and
> > > > > default values", which IIUC should also be a must-have
> > > > >
> > > > >
> > > > > There seem to be no changes needed for "Move Calcite rules from Scala
> > > > > to Java" as it's already nice-to-have.
> > > > >
> > > > >
> > > > > If there are no objections, I'll update the wiki page accordingly, and
> > > > > start a VOTE in the next couple of days.
> > > >
> > > >
> > > > Best,
> > > >
> > > > Xintong
> > > >
> > > >
> > > >
> > > > On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong
> > <liangtl@amazon.co.uk.invalid
> > > >
> > > > wrote:
> > > >
> > > > > Thanks Xintong for driving the effort.
> > > > >
> > > > > > I’d add a +1 to reworking configs, as suggested by @Jark and
> > > > > > @Chesnay, especially the types. We have various configs that encode
> > > > > > Time / MemorySize values but are typed as Long instead!
> > > > >
> > > > > Regards,
> > > > > Hong
> > > > >
> > > > >
> > > > >
> > > > > > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com>
> wrote:
> > > > > >
> > > > > >
> > > > > > Thanks for driving this effort, Xintong!
> > > > > >
> > > > > > To Chesnay
> > > > > > >> I'm curious as to why the "Disaggregated State Management" item is
> > > > > > >> marked as a must-have; will it require changes that break something?
> > > > > > >> What prevents it from being added in 2.1?
> > > > > > >
> > > > > > > As to "Disaggregated State Management":
> > > > > > >
> > > > > > > We plan to provide a new type of state backend to support DFS as
> > > > > > > primary storage.
> > > > > > > To achieve this, we need at least two sets of changes (not
> > > > > > > entirely sure yet, since we are still in the design and prototype
> > > > > > > phase):
> > > > > > >
> > > > > > > 1. Statebackend Change
> > > > > > > 2. State Access Change
> > > > > > >
> > > > > > > Not all of the related interfaces are `@Internal`. Some of them,
> > > > > > > like `StateBackend`, are `@PublicEvolving`.
> > > > > > > So, you are right in the sense that "Disaggregated State
> > > > > > > Management" itself probably does not need to be a "Must Have".
> > > > > > >
> > > > > > > But I was hoping that the changes related to public APIs can be
> > > > > > > finalized and merged in Flink 2.0 (I will fix the wiki accordingly).
> > > > > > >
> > > > > > > I also agree with Jark that 2.0 is a good chance to rework the
> > > > > > > default values of configurations.
> > > > > >
> > > > > > Best
> > > > > > Yuan
> > > > > >
> > > > > >
> > > > > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> > chesnay@apache.org
> > > >
> > > > > wrote:
> > > > > >
> > > > > > >> Something else configuration-related is that there are a bunch of
> > > > > > >> options where the type isn't quite correct (e.g., a String where it
> > > > > > >> could be an enum, a string where it should be an int or something).
> > > > > > >> Could do a pass over those as well.
> > > > > >>
> > > > > >> On 29/06/2023 13:50, Jark Wu wrote:
> > > > > >>> Hi,
> > > > > >>>
> > > > > > >>> I think one more thing we need to consider doing in 2.0 is
> > > > > > >>> changing the default values of configuration options to improve
> > > > > > >>> the out-of-box user experience.
> > > > > > >>>
> > > > > > >>> Currently, in order to run a Flink job, users may need to set
> > > > > > >>> a bunch of configurations, such as minibatch, checkpoint interval,
> > > > > > >>> exactly-once, incremental-checkpoint, etc. It's very verbose and
> > > > > > >>> hard to use for beginners.
> > > > > > >>> Most of them can have a universally applicable value. Because
> > > > > > >>> changing a default value is a breaking change, I think it's worth
> > > > > > >>> considering changing them in 2.0.
> > > > > >>>
> > > > > >>> What do you think?
> > > > > >>>
> > > > > >>> Best,
> > > > > >>> Jark
> > > > > >>>
> > > > > >>>
> > > > > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> > snuyanzin@gmail.com
> > > >
> > > > > >> wrote:
> > > > > >>>
> > > > > >>>> Hi Chesnay
> > > > > >>>>
> > > > > >>>>> "Move Calcite rules from Scala to Java": I would hope that
> this
> > > > would
> > > > > >> be
> > > > > >>>>> an entirely internal change, and could thus be an incremental
> > > > process
> > > > > >>>>> independent of major releases.
> > > > > >>>>> What is the actual scale of this item; how much are we
> actually
> > > > > >>>> re-writing?
> > > > > >>>>
> > > > > > >>>> Thanks for asking.
> > > > > > >>>> Yes, you're right, that should be an internal change.
> > > > > > >>>> Yeah, I was also thinking about an incremental change (rule by rule
> > > > > > >>>> or by reasonably small groups of rules).
> > > > > > >>>> And yes, this could be an activity independent of major releases.
> > > > > > >>>>
> > > > > > >>>> The problem is actually with the children of RelOptRule.
> > > > > > >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> > > > > > >>>> deprecated API.
> > > > > > >>>> There are also children of ConverterRule (50+) which do not have
> > > > > > >>>> such issues.
> > > > > > >>>> Maybe it could be considered as the next step to have all the
> > > > > > >>>> rules in Java.
> > > > > >>>>
> > > > > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > > tonysong820@gmail.com
> > > > >
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>> Hi Alex & Gyula,
> > > > > >>>>>
> > > > > >>>>> By compatibility discussion do you mean the "[DISCUSS]
> > FLIP-321:
> > > > > >>>> Introduce
> > > > > >>>>>> an API deprecation process" thread [1]?
> > > > > >>>>>>
> > > > > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted
> > the
> > > > > wrong
> > > > > >>>> url
> > > > > >>>>> in my previous email. Sorry for the mistake.
> > > > > >>>>>
> > > > > >>>>> I am also curious to know if the rationale behind this new
> API
> > > has
> > > > > been
> > > > > >>>>>> previously discussed on the mailing list. Do we have a list
> of
> > > > > >>>>> shortcomings
> > > > > >>>>>> in the current DataStream API that it tries to resolve? How
> > does
> > > > the
> > > > > >>>>>> current ProcessFunction functionality fit into the picture?
> > Will
> > > > it
> > > > > be
> > > > > >>>>> kept
> > > > > >>>>>> as is or subsumed by new API?
> > > > > >>>>>>
> > > > > >>>>> I don't think we should create a replacement for the
> DataStream
> > > API
> > > > > >>>> unless
> > > > > >>>>>> we have a very good reason to do so and with a proper
> > discussion
> > > > > about
> > > > > >>>>> this
> > > > > >>>>>> as Alex said.
> > > > > >>>>>
> > > > > > >>>>> The ProcessFunction API, which aims to replace the DataStream API,
> > > > > > >>>>> is still a proposal, not a decision. Sorry for the confusion, I
> > > > > > >>>>> should have been more careful with my words, not giving the
> > > > > > >>>>> impression that this is something we'll do anyway.
> > > > > > >>>>>
> > > > > > >>>>> There will be a FLIP describing the motivations and designs in
> > > > > > >>>>> detail, for the community to discuss and vote on. We are still
> > > > > > >>>>> working on it. TBH, this is not trivial and we would need more
> > > > > > >>>>> time on it.
> > > > > > >>>>>
> > > > > > >>>>> Just to quickly share some background:
> > > > > >>>>>
> > > > > > >>>>>    - We see quite some problems with the current DataStream APIs
> > > > > > >>>>>       - Users are working with concrete classes rather than interfaces,
> > > > > > >>>>>       which means
> > > > > > >>>>>          - Users can access methods that are designed to be used by
> > > > > > >>>>>          internal classes, even though they are annotated with `@Internal`.
> > > > > > >>>>>          E.g., `DataStream#getTransformation`.
> > > > > > >>>>>          - Changes to the non-API implementations (e.g., `Transformation`)
> > > > > > >>>>>          would affect the API classes (e.g., `DataStream`), which makes it
> > > > > > >>>>>          hard to provide binary compatibility.
> > > > > > >>>>>       - Internal classes are used as parameters / return values of public
> > > > > > >>>>>       APIs. E.g., while `AbstractStreamOperator` is PublicEvolving,
> > > > > > >>>>>       `StreamTask`, which is returned from
> > > > > > >>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
> > > > > > >>>>>       - In many cases, users are asked to extend the API classes, rather
> > > > > > >>>>>       than implementing interfaces. E.g., `AbstractStreamOperator`.
> > > > > > >>>>>          - Any changes to the base classes, even the internal part, may
> > > > > > >>>>>          affect the behavior of the user-provided sub-classes
> > > > > > >>>>>          - Users can override the behavior of the base classes
> > > > > > >>>>>       - The API module `flink-streaming-java` contains non-API classes, and
> > > > > > >>>>>       depends on internal modules such as `flink-runtime`, which means
> > > > > > >>>>>          - Changes to the internal modules may affect the API modules,
> > > > > > >>>>>          which requires users to re-build their applications upon upgrading
> > > > > > >>>>>          - The artifact users need for building their applications is
> > > > > > >>>>>          larger than necessary.
> > > > > > >>>>>       - We probably should not expose operators (e.g.,
> > > > > > >>>>>       `AbstractStreamOperator`) to users. Functions should be enough for
> > > > > > >>>>>       users to define their data processing logic. Exposing operator-level
> > > > > > >>>>>       concepts (e.g., mailbox thread model, checkpoint barrier alignment,
> > > > > > >>>>>       etc.) is unnecessary and, given compatibility considerations, limits
> > > > > > >>>>>       how much such exposed mechanisms can be improved.
> > > > > > >>>>>       - The current DataStream API seems to be a mixture of many things,
> > > > > > >>>>>       making it hard to understand, especially for newcomers. It might be
> > > > > > >>>>>       better to re-organize it into several parts (the taxonomy below is
> > > > > > >>>>>       just an example; we are still working on this):
> > > > > > >>>>>          - The most fundamental stateful stream processing: streams,
> > > > > > >>>>>          partitions / key, process functions, state, timeline-service
> > > > > > >>>>>          - An extension for common batch-streaming unified functions: map,
> > > > > > >>>>>          flatmap, filter, agg, reduce, join, etc.
> > > > > > >>>>>          - An extension for windowing support: window, triggering
> > > > > > >>>>>          - An extension for event-time support: event time, watermark
> > > > > > >>>>>          - The extensions are like short-cuts / sugars, without which
> > > > > > >>>>>          users can probably still achieve the same behavior by working with
> > > > > > >>>>>          the fundamental APIs, but it would be a lot easier with the
> > > > > > >>>>>          extensions
> > > > > > >>>>>    - The original plan was to do in-place refactors / changes on the
> > > > > > >>>>>    DataStream API. Some related items are listed in this doc [2] attached
> > > > > > >>>>>    to the kick-off email [3]. Not all of the above issues are listed,
> > > > > > >>>>>    because we had not looked into this as deeply by that time.
> > > > > > >>>>>    - We proposed this as a new API rather than in-place refactors in the
> > > > > > >>>>>    2.0 work item list, because we realized the changes might be too big
> > > > > > >>>>>    for an in-place change. First having a new API then gradually retiring
> > > > > > >>>>>    the old one would help users to smoothly migrate between them.
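
The "extend a base class" versus "implement an interface" concern in the list above can be sketched in plain Java. This is only an illustration under assumptions: `BaseOperator`, `Collector`, `ProcessFn` and the two `run...` helpers are hypothetical names invented for this sketch, not Flink classes or the proposed API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ApiSurfaceSketch {

    /** Style 1 (hypothetical): users extend a base class and can reach its internals. */
    public abstract static class BaseOperator {
        protected final List<String> output = new ArrayList<>();

        // Intended for framework use only, yet any subclass may override it.
        protected void emit(String record) {
            output.add(record);
        }

        public abstract void processElement(String record);
    }

    /** Style 2 (hypothetical): users implement a narrow interface only. */
    public interface Collector {
        void collect(String record);
    }

    public interface ProcessFn {
        void process(String record, Collector out);
    }

    /** Framework side of style 2: user code never sees how output is stored. */
    public static List<String> runInterfaceStyle(List<String> input, ProcessFn fn) {
        List<String> output = new ArrayList<>();
        for (String record : input) {
            fn.process(record, output::add);
        }
        return output;
    }

    /** Style 1 with a user subclass that overrides the internal hook and drops all output. */
    public static List<String> runSubclassStyle(List<String> input) {
        BaseOperator op = new BaseOperator() {
            @Override
            protected void emit(String record) {
                // User override: records are silently discarded.
            }

            @Override
            public void processElement(String record) {
                emit(record.toUpperCase());
            }
        };
        for (String record : input) {
            op.processElement(record);
        }
        return op.output;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b");
        System.out.println(runInterfaceStyle(input, (r, out) -> out.collect(r.toUpperCase())));
        System.out.println(runSubclassStyle(input));
    }
}
```

In the subclass style, a later change to how the base class uses `emit` can silently change what the user's class does; in the interface style the framework is free to change everything behind `Collector`.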
> > > > > >>>>>
> > > > > > >>>>> A thorough discussion is definitely needed once the FLIP is out. And
> > > > > > >>>>> of course it's possible that the FLIP might be rejected. Given that
> > > > > > >>>>> we are planning for release 2.0, I just feel it would be better to
> > > > > > >>>>> bring this up early, even though the concrete plan is not yet ready.
> > > > > >>>>>
> > > > > >>>>> Best,
> > > > > >>>>>
> > > > > >>>>> Xintong
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> [1]
> > > > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > >>>>> [2]
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > > > >>>>> [3]
> > > > https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > > > >>>>>
> > > > > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
> gyfora@apache.org>
> > > > > wrote:
> > > > > >>>>>
> > > > > >>>>>> Hey!
> > > > > >>>>>>
> > > > > >>>>>> I share the same concerns mentioned above regarding the
> > > > > >>>> "ProcessFunction
> > > > > >>>>>> API".
> > > > > >>>>>>
> > > > > >>>>>> I don't think we should create a replacement for the
> > DataStream
> > > > API
> > > > > >>>>> unless
> > > > > >>>>>> we have a very good reason to do so and with a proper
> > discussion
> > > > > about
> > > > > >>>>> this
> > > > > >>>>>> as Alex said.
> > > > > >>>>>>
> > > > > >>>>>> Cheers,
> > > > > >>>>>> Gyula
> > > > > >>>>>>
> > > > > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > > > > >>>>>> alexander.fedulov@gmail.com> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Hi Xintong,
> > > > > >>>>>>>
> > > > > >>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > > FLIP-321:
> > > > > >>>>>> Introduce
> > > > > >>>>>>> an API deprecation process" thread [1]?
> > > > > >>>>>>>
> > > > > >>>>>>> I am also curious to know if the rationale behind this new
> > API
> > > > has
> > > > > >>>> been
> > > > > >>>>>>> previously discussed on the mailing list. Do we have a list
> > of
> > > > > >>>>>> shortcomings
> > > > > >>>>>>> in the current DataStream API that it tries to resolve? How
> > > does
> > > > > the
> > > > > >>>>>>> current ProcessFunction functionality fit into the picture?
> > > Will
> > > > it
> > > > > >>>> be
> > > > > >>>>>> kept
> > > > > >>>>>>> as is or subsumed by new API?
> > > > > >>>>>>>
> > > > > >>>>>>> [1]
> > > > > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > >>>>>>>
> > > > > >>>>>>> Best,
> > > > > >>>>>>> Alex
> > > > > >>>>>>>
> > > > > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > > > tonysong820@gmail.com>
> > > > > >>>>>> wrote:
> > > > > >>>>>>>>> The ProcessFunction API item is giving me the most
> > headaches
> > > > > >>>>> because
> > > > > >>>>>>> it's
> > > > > >>>>>>>>> very unclear what it actually entails; like is it an
> > entirely
> > > > > >>>>>> separate
> > > > > >>>>>>>> API
> > > > > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > > > DataStream.
> > > > > >>>>> How
> > > > > >>>>>>>> much
> > > > > >>>>>>>>> will it share the internals with DataStream etc.; how
> does
> > it
> > > > > >>>>> relate
> > > > > >>>>>> to
> > > > > >>>>>>>> the
> > > > > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > > > > >>>> underneath).
> > > > > >>>>>>>> I totally understand your confusion. We started planning
> > this
> > > > > after
> > > > > >>>>>>> kicking
> > > > > >>>>>>>> off the release 2.0, so there's still a lot to be explored
> > and
> > > > the
> > > > > >>>>> plan
> > > > > >>>>>>>> keeps changing.
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>>    - In the beginning, we planned to do an in-place
> refactor
> > > of
> > > > > >>>>>>> DataStream
> > > > > >>>>>>>>    API, until the API migration period is proposed.
> > > > > >>>>>>>>    - Then we want to make it an entirely separate API to
> > > > > >>>> DataStream,
> > > > > >>>>>> and
> > > > > >>>>>>>>    listed as a must-have for release 2.0 so that we can
> > remove
> > > > > >>>>>> DataStream
> > > > > >>>>>>>> once
> > > > > >>>>>>>>    it's ready.
> > > > > >>>>>>>>    - However, depending on the outcome of the API
> > > compatibility
> > > > > >>>>>>> discussion
> > > > > >>>>>>>>    [1], we may not be able to remove DataStream in 2.0
> > anyway,
> > > > > >>>> which
> > > > > >>>>>>> means
> > > > > >>>>>>>> we
> > > > > >>>>>>>>    might need to re-evaluate the necessity of this item
> for
> > > 2.0.
> > > > > >>>>>>>>
> > > > > >>>>>>>> I'd say we wait a bit longer for the compatibility
> > discussion
> > > > [1]
> > > > > >>>> and
> > > > > >>>>>>>> decide the priority for this item afterwards.
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> Best,
> > > > > >>>>>>>>
> > > > > >>>>>>>> Xintong
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> [1]
> https://lists.apache.org/list.html?dev@flink.apache.org
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > > > > >>>> chesnay@apache.org
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> by-and-large I'm quite happy with the list of items.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> I'm curious as to why the "Disaggregated State
> Management"
> > > item
> > > > > >>>> is
> > > > > >>>>>>> marked
> > > > > >>>>>>>>> as a must-have; will it require changes that break
> > something?
> > > > > >>>> What
> > > > > >>>>>>>> prevents
> > > > > >>>>>>>>> it from being added in 2.1?
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17
> the
> > > > > >>>>> default,
> > > > > >>>>>>> drop
> > > > > >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop
> Java
> > > 8"
> > > > > >>>> and
> > > > > >>>>> a
> > > > > >>>>>>>>> nice-to-have "Drop Java 11"?
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope
> that
> > > this
> > > > > >>>>> would
> > > > > >>>>>>> be
> > > > > >>>>>>>>> an entirely internal change, and could thus be an
> > incremental
> > > > > >>>>> process
> > > > > >>>>>>>>> independent of major releases.
> > > > > >>>>>>>>> What is the actual scale of this item; how much are we
> > > actually
> > > > > >>>>>>>> re-writing?
> > > > > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > > > > >>>> must-have; i
> > > > > >>>>>>> think
> > > > > >>>>>>>>> I marked it down as nice-to-have only because it depends
> on
> > > > > >>>> another
> > > > > >>>>>>> item.
> > > > > >>>>>>>>> The ProcessFunction API item is giving me the most
> > headaches
> > > > > >>>>> because
> > > > > >>>>>>> it's
> > > > > >>>>>>>>> very unclear what it actually entails; like is it an
> > entirely
> > > > > >>>>>> separate
> > > > > >>>>>>>> API
> > > > > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > > > DataStream.
> > > > > >>>>> How
> > > > > >>>>>>>> much
> > > > > >>>>>>>>> will it share the internals with DataStream etc.; how
> does
> > it
> > > > > >>>>> relate
> > > > > >>>>>> to
> > > > > >>>>>>>> the
> > > > > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > > > > >>>> underneath).
> > > > > >>>>>>>>> There are a few items I added as ideas which don't have a
> > > > > >>>> priority
> > > > > >>>>>> yet;
> > > > > >>>>>>>>> would love to get some feedback on those.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Hi devs,
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> As previously discussed in [1], we had been collecting
> work
> > > > item
> > > > > >>>>>>>> proposals
> > > > > >>>>>>>>> for the 2.0 release until June 15th, on the wiki page
> [2].
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>    - As we have passed the due date, I'd like to kindly
> > > remind
> > > > > >>>>>> everyone
> > > > > >>>>>>>> *not
> > > > > >>>>>>>>>    to add / remove items directly on the wiki page*. If
> > > needed,
> > > > > >>>>>> please
> > > > > >>>>>>>> post
> > > > > >>>>>>>>>    in this thread or reach out to the release managers
> > > instead.
> > > > > >>>>>>>>>    - I've reached out to some folks for clarifications
> > about
> > > > > >>>> their
> > > > > >>>>>>>>>    proposals. Some of them mentioned that they can not
> yet
> > > tell
> > > > > >>>>>> whether
> > > > > >>>>>>>> we
> > > > > >>>>>>>>>    should do an item or not, and would need more time /
> > > > > >>>> discussions
> > > > > >>>>>> to
> > > > > >>>>>>>> make
> > > > > >>>>>>>>>    the decision. So I added a new symbol for items whose
> > > > > >>>> priorities
> > > > > >>>>>> are
> > > > > >>>>>>>> `TBD`.
> > > > > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > > > > >>>> must-have
> > > > > >>>>>>> items.
> > > > > >>>>>>>>> I've gone through the entire list of proposed items, and
> > > found
> > > > > >>>> most
> > > > > >>>>>> of
> > > > > >>>>>>>> them
> > > > > >>>>>>>>> make quite much sense. So I think an online sync might
> not
> > be
> > > > > >>>>>> necessary
> > > > > >>>>>>>> for
> > > > > >>>>>>>>> this. I'd like to go with this DISCUSS thread, where
> > everyone
> > > > can
> > > > > >>>>>>> comment
> > > > > >>>>>>>>> on how they think the list can be improved, followed by a
> > > VOTE
> > > > to
> > > > > >>>>>>>> formally
> > > > > >>>>>>>>> make the decision.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Any feedback and opinions, including but not limited to
> the
> > > > > >>>>> following
> > > > > >>>>>>>>> aspects, will be appreciated.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>    - Important items that are missing from the list
> > > > > >>>>>>>>>    - Concerns regarding the listed items or their
> > priorities
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Looking forward to your feedback.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Best,
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Xintong
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> [1]
> > > > > >>>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > > >>>>>>>>> [2]
> > > > > >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>
> > > > > >>>> --
> > > > > >>>> Best regards,
> > > > > >>>> Sergey
> > > > > >>>>
> > > > > >>
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Best regards,
> > Sergey
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Jing Ge <ji...@ververica.com.INVALID>.
Hi Sergey,

Thanks for the clarification! I will not hijack this thread to discuss
Scala code strategy.

Best regards,
Jing

On Mon, Jul 3, 2023 at 10:51 AM Sergey Nuyanzin <sn...@gmail.com> wrote:

> Hi Jing,
>
> Maybe I was not clear enough, sorry.
> However, the main reason for this item about Calcite rules is not
> abandoning Scala.
> The main reason is the changes in Calcite itself, which introduced a code
> generator framework (Immutables) to generate config Java classes for
> rules; the old API for that (which is used in the Scala Calcite rules) is
> marked as deprecated.
> Since Immutables relies on code generation during Java compilation, it
> seems impossible to use it for rules written in Scala.
>
> On Mon, Jul 3, 2023 at 10:44 AM Jing Ge <ji...@ververica.com.invalid>
> wrote:
>
> > Hi,
> >
> > Speaking of "Move Calcite rules from Scala to Java", I was wondering if
> > this thread is the right place to talk about it. Afaik, the Flink
> community
> > has decided to abandon Scala. That is the reason, I guess, we want to
> move
> > those Calcite rules from Scala to Java. On the other side, new Scala code
> > will be added while developing new features[1]. Do we have any thoughts
> > wrt the Scala code strategy?
> >
> > Best regards,
> > Jing
> >
> >
> >
> > [1] https://lists.apache.org/thread/tnygl4n3q1fx75cl2vclc78j8mrxmz6y
> >
> > On Mon, Jul 3, 2023 at 10:31 AM Xintong Song <to...@gmail.com>
> > wrote:
> >
> > > Thanks all for the discussion.
> > >
> > >
> > > IIUC, we need to make the following changes. Please correct me if I
> > > get it wrong.
> > >
> > >
> > > 1. Disaggregated State Management - Clarify that only the public API
> > > related part is must-have for 2.0.
> > >
> > > 2. Java version support - Split it into 3 items: a) make Java 17 the
> > > default (must-have), b) drop Java 8 (must-have), and c) drop Java 11
> > > (nice-to-have)
> > >
> > > 3. Add MetricGroup#getLogicalScope - Should be promoted to must-have
> > >
> > > 4. ProcessFunction API - Should be downgraded to nice-to-have
> > >
> > > 5. Configuration - Add an item "revisit all config option types and
> > > default values", which IIUC should also be a must-have
> > >
> > >
> > > There seem to be no changes needed for "Move Calcite rules from Scala
> > > to Java" as it's already nice-to-have.
> > >
> > >
> > > If there are no objections, I'll update the wiki page accordingly, and
> > > start a VOTE in the next couple of days.
> > >
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong
> <liangtl@amazon.co.uk.invalid
> > >
> > > wrote:
> > >
> > > > Thanks Xintong for driving the effort.
> > > >
> > > > I’d add a +1 to reworking configs, as suggested by @Jark and
> > > > @Chesnay, especially the types. We have various configs that encode
> > > > Time / MemorySize values but are typed as Long instead!
> > > >
> > > > Regards,
> > > > Hong
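
To make the config-type point concrete, here is a minimal plain-Java sketch of why a typed option is less error-prone than a Long or a free-form String. Everything in it is an assumption made for illustration: the `Option` class, the `demo.*` keys, the `RestartMode` enum and the `parseDuration` helper are hypothetical and are not Flink's configuration API.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

public class TypedOptionSketch {

    /** Hypothetical enum replacing a free-form String value. */
    public enum RestartMode { NONE, FIXED_DELAY }

    /** Hypothetical typed option: a key, a default, and a parser for the raw text. */
    public static final class Option<T> {
        final String key;
        final T defaultValue;
        final Function<String, T> parser;

        Option(String key, T defaultValue, Function<String, T> parser) {
            this.key = key;
            this.defaultValue = defaultValue;
            this.parser = parser;
        }

        /** Returns the parsed value, or the default when the key is absent. */
        public T read(Map<String, String> raw) {
            String text = raw.get(key);
            return text == null ? defaultValue : parser.apply(text.trim());
        }
    }

    /** Parses "<amount> <unit>" with unit ms, s or min; anything else is rejected. */
    public static Duration parseDuration(String text) {
        String[] parts = text.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected '<amount> <unit>': " + text);
        }
        long amount = Long.parseLong(parts[0]);
        switch (parts[1].toLowerCase(Locale.ROOT)) {
            case "ms": return Duration.ofMillis(amount);
            case "s": return Duration.ofSeconds(amount);
            case "min": return Duration.ofMinutes(amount);
            default: throw new IllegalArgumentException("Unknown unit: " + parts[1]);
        }
    }

    // Long-typed: whether 30000 means milliseconds or seconds lives only in the docs.
    public static final Option<Long> TIMEOUT_AS_LONG =
            new Option<>("demo.timeout", 30_000L, Long::parseLong);

    // Duration-typed: the unit is part of the value and is checked when parsing.
    public static final Option<Duration> TIMEOUT =
            new Option<>("demo.timeout.typed", Duration.ofSeconds(30),
                    TypedOptionSketch::parseDuration);

    // Enum-typed: an unknown value fails at read time instead of being carried around.
    public static final Option<RestartMode> RESTART =
            new Option<>("demo.restart", RestartMode.NONE,
                    s -> RestartMode.valueOf(s.toUpperCase(Locale.ROOT).replace('-', '_')));

    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("demo.timeout", "30");
        raw.put("demo.timeout.typed", "30 s");
        raw.put("demo.restart", "fixed-delay");
        System.out.println(TIMEOUT_AS_LONG.read(raw));
        System.out.println(TIMEOUT.read(raw));
        System.out.println(RESTART.read(raw));
    }
}
```

With the Long-typed option, a user who writes `30` meaning seconds gets a value that the code may well treat as 30 milliseconds; the Duration-typed and enum-typed options reject ambiguous or unknown input when the value is read.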
> > > >
> > > >
> > > >
> > > > > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Thanks for driving this effort, Xintong!
> > > > >
> > > > > To Chesnay
> > > > > >> I'm curious as to why the "Disaggregated State Management" item is
> > > > > >> marked as a must-have; will it require changes that break something?
> > > > > >> What prevents it from being added in 2.1?
> > > > > >
> > > > > > As to "Disaggregated State Management":
> > > > > >
> > > > > > We plan to provide a new type of state backend to support DFS as
> > > > > > primary storage.
> > > > > > To achieve this, we need at least two sets of changes (not
> > > > > > entirely sure yet, since we are still in the design and prototype
> > > > > > phase):
> > > > > >
> > > > > > 1. Statebackend Change
> > > > > > 2. State Access Change
> > > > > >
> > > > > > Not all of the related interfaces are `@Internal`. Some of them,
> > > > > > like `StateBackend`, are `@PublicEvolving`.
> > > > > > So, you are right in the sense that "Disaggregated State
> > > > > > Management" itself probably does not need to be a "Must Have".
> > > > > >
> > > > > > But I was hoping that the changes related to public APIs can be
> > > > > > finalized and merged in Flink 2.0 (I will fix the wiki accordingly).
> > > > > >
> > > > > > I also agree with Jark that 2.0 is a good chance to rework the
> > > > > > default values of configurations.
> > > > >
> > > > > Best
> > > > > Yuan
> > > > >
> > > > >
> > > > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> chesnay@apache.org
> > >
> > > > wrote:
> > > > >
> > > > >> Something else configuration-related is that there are a bunch of
> > > > >> options where the type isn't quite correct (e.g., a String where
> it
> > > > >> could be an enum, a string where it should be an int or
> something).
> > > > >> Could do a pass over those as well.
> > > > >>
> > > > >> On 29/06/2023 13:50, Jark Wu wrote:
> > > > >>> Hi,
> > > > >>>
> > > > >>> I think one more thing we need to consider to do in 2.0 is
> changing
> > > the
> > > > >>> default value of configuration to improve out-of-box user
> > experience.
> > > > >>>
> > > > >>> Currently, in order to run a Flink job, users may need to set
> > > > >>> a bunch of configurations, such as minibatch, checkpoint
> interval,
> > > > >>> exactly-once,
> > > > >>> incremental-checkpoint, etc. It's very verbose and hard to use
> for
> > > > >>> beginners.
> > > > >>> Most of them can have a universally applicable value.  Because
> > > changing
> > > > >> the
> > > > >>> default value is a breaking change. I think It's worth
> considering
> > > > >> changing
> > > > >>> them in 2.0.
> > > > >>>
> > > > >>> What do you think?
> > > > >>>
> > > > >>> Best,
> > > > >>> Jark
> > > > >>>
> > > > >>>
> > > > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> snuyanzin@gmail.com
> > >
> > > > >> wrote:
> > > > >>>
> > > > >>>> Hi Chesnay
> > > > >>>>
> > > > >>>>> "Move Calcite rules from Scala to Java": I would hope that this
> > > would
> > > > >> be
> > > > >>>>> an entirely internal change, and could thus be an incremental
> > > process
> > > > >>>>> independent of major releases.
> > > > >>>>> What is the actual scale of this item; how much are we actually
> > > > >>>> re-writing?
> > > > >>>>
> > > > >>>> Thanks for asking
> > > > >>>> yes, you're right, that should be internal change.
> > > > >>>> Yeah I was also thinking about incremental change (rule by rule
> or
> > > > >>>> reasonable small group of rules).
> > > > >>>> And yes, this could be an independent (on major release)
> activity
> > > > >>>>
> > > > >>>> The problem is actually for children of RelOptRule.
> > > > >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> > > > deprecated
> > > > >>>> api.
> > > > >>>> There are also children of ConverterRule (50+) which do not have
> > > such
> > > > >>>> issues.
> > > > >>>> Maybe it could be considered as the next step to have all the
> > rules
> > > in
> > > > >>>> Java.
> > > > >>>>
> > > > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > tonysong820@gmail.com
> > > >
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Hi Alex & Gyula,
> > > > >>>>>
> > > > >>>>> By compatibility discussion do you mean the "[DISCUSS]
> FLIP-321:
> > > > >>>> Introduce
> > > > >>>>>> an API deprecation process" thread [1]?
> > > > >>>>>>
> > > > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted
> the
> > > > wrong
> > > > >>>> url
> > > > >>>>> in my previous email. Sorry for the mistake.
> > > > >>>>>
> > > > >>>>> I am also curious to know if the rationale behind this new API
> > has
> > > > been
> > > > >>>>>> previously discussed on the mailing list. Do we have a list of
> > > > >>>>> shortcomings
> > > > >>>>>> in the current DataStream API that it tries to resolve? How
> does
> > > the
> > > > >>>>>> current ProcessFunction functionality fit into the picture?
> Will
> > > it
> > > > be
> > > > >>>>> kept
> > > > >>>>>> as is or subsumed by new API?
> > > > >>>>>>
> > > > >>>>> I don't think we should create a replacement for the DataStream
> > API
> > > > >>>> unless
> > > > >>>>>> we have a very good reason to do so and with a proper
> discussion
> > > > about
> > > > >>>>> this
> > > > >>>>>> as Alex said.
> > > > >>>>>
> > > > >>>>> The ProcessFunction API which is targeting to replace
> DataStream
> > > API
> > > > is
> > > > >>>>> still a proposal, not a decision. Sorry for the confusion, I
> > should
> > > > >> have
> > > > >>>>> been more careful with my words, not giving the impression that
> > > this
> > > > is
> > > > >>>>> something we'll do anyway.
> > > > >>>>>
> > > > >>>>> There will be a FLIP describing the motivations and designs in
> > > > detail,
> > > > >>>> for
> > > > >>>>> the community to discuss and vote on. We are still working on
> it.
> > > > TBH,
> > > > >>>> this
> > > > >>>>> is not trivial and we would need more time on it.
> > > > >>>>>
> > > > >>>>> Just to quickly share some backgrounds:
> > > > >>>>>
> > > > >>>>>    - We see quite some problems with the current DataStream
> APIs
> > > > >>>>>       - Users are working with concrete classes rather than
> > > > >> interfaces,
> > > > >>>>>       which means
> > > > >>>>>       - Users can access methods that are designed to be used
> by
> > > > >> internal
> > > > >>>>>          classes, even though they are annotated with
> > `@Internal`.
> > > > >> E.g.,
> > > > >>>>>          `DataStream#getTransformation`.
> > > > >>>>>          - Changes to the non-API implementations (e.g.,
> > > > >>>> `Transformation`)
> > > > >>>>>          would affect the API classes (e.g., `DataStream`),
> which
> > > > >>>>> makes it hard to
> > > > >>>>>          provide binary compatibility.
> > > > >>>>>       - Internal classes are used as parameter / return-value
> of
> > > > >> public
> > > > >>>>>       APIs. E.g., while `AbstractStreamOperator` is
> > PublicEvolving,
> > > > >>>>> `StreamTask`
> > > > >>>>>       which returns from
> > `AbstractStreamOperator#getContainingTask`
> > > > is
> > > > >>>>> Internal.
> > > > >>>>>       - In many cases, users are asked to extend the API
> classes,
> > > > >> rather
> > > > >>>>>       than implementing interfaces. E.g.,
> > `AbstractStreamOperator`.
> > > > >>>>>          - Any changes to the base classes, even the internal
> > part,
> > > > >> may
> > > > >>>>>          affect the behavior of the user-provided sub-classes
> > > > >>>>>          - Users can override the behavior of the base classes
> > > > >>>>>       - The API module `flink-streaming-java` contains non-API
> > > > >> classes,
> > > > >>>> and
> > > > >>>>>       depends on internal modules such as `flink-runtime`,
> which
> > > > means
> > > > >>>>>       - Changes to the internal modules may affect the API
> > modules,
> > > > >> which
> > > > >>>>>          requires users to re-build their applications upon
> > > upgrading
> > > > >>>>>          - The artifact user needs for building their
> application
> > > > >> larger
> > > > >>>>>          than necessary.
> > > > >>>>>       - We probably should not expose operators (e.g.,
> > > > >>>>>       `AbstractStreamOperator`) to users. Functions should be
> > > enough
> > > > >>>>> for users to
> > > > >>>>>       define their data processing logics. Exposing
> > operator-level
> > > > >>>> concepts
> > > > >>>>>       (e.g., mailbox thread model, checkpoint barrier
> alignment,
> > > > >> etc.) is
> > > > >>>>>       unnecessary and limits the improvement regarding such
> > exposed
> > > > >>>>> mechanisms
> > > > >>>>>       with compatibility considerations.
> > > > >>>>>       - The current DataStream API seems to be a mixture of
> many
> > > > >> things,
> > > > >>>>>       making it hard to understand especially for newcomers. It
> > > might
> > > > >> be
> > > > >>>>> better
> > > > >>>>>       to re-organize it into several parts: (the taxonomy below
> > are
> > > > >> just
> > > > >>>> an
> > > > >>>>>       example of the, we are still working on this)
> > > > >>>>>          - The most fundamental stateful stream processing:
> > > streams,
> > > > >>>>>          partitions / key, process functions, state,
> > > timeline-service
> > > > >>>>>          - An extension for common batch-streaming unified
> > > functions:
> > > > >>>> map,
> > > > >>>>>          flatmap, filter, agg, reduce, join, etc.
> > > > >>>>>          - An extension for windowing supports:  window,
> > triggering
> > > > >>>>>          - An extension for event-time supports: event time,
> > > > watermark
> > > > >>>>>          - The extensions are like short-cuts / sugars, without
> > > which
> > > > >>>> users
> > > > >>>>>          can probably still achieve the same behavior by
> working
> > > with
> > > > >> the
> > > > >>>>>          fundamental APIs, but would be a lot easier with the
> > > > >> extensions
> > > > >>>>>       - The original plan was to do in-place refactors /
> changes
> > on
> > > > >>>>>    DataStream API. Some related items are listed in this doc
> [2]
> > > > >> attached
> > > > >>>>> to
> > > > >>>>>    the kicking off email [3]. Not all of the above issues are
> > > listed,
> > > > >>>>> because
> > > > >>>>>    we haven't looked into this as deeply as now  by that time.
> > > > >>>>>    - We proposed this as a new API rather than in-place
> refactors
> > > in
> > > > >> the
> > > > >>>>>    2.0 work item list, because we realized the changes might be
> > too
> > > > >> big
> > > > >>>>> for an
> > > > >>>>>    in-place change. First having a new API then gradually
> > retiring
> > > > the
> > > > >>>> old
> > > > >>>>> one
> > > > >>>>>    would help users to smoothly migrate between them.
> > > > >>>>>
> > > > >>>>> A thorough discussion is definitely needed once the FLIP is
> out.
> > > And
> > > > of
> > > > >>>>> course it's possible that the FLIP might be rejected. Given
> that
> > we
> > > > are
> > > > >>>>> planning for release 2.0, I just feel it would be better to
> bring
> > > > this
> > > > >> up
> > > > >>>>> early even the concrete plan is not yet ready,
> > > > >>>>>
> > > > >>>>> Best,
> > > > >>>>>
> > > > >>>>> Xintong
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> [1]
> > > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > >>>>> [2]
> > > > >>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>
> > > >
> > >
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > > >>>>> [3]
> > > https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > > >>>>>
> > > > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org>
> > > > wrote:
> > > > >>>>>
> > > > >>>>>> Hey!
> > > > >>>>>>
> > > > >>>>>> I share the same concerns mentioned above regarding the
> > > > >>>> "ProcessFunction
> > > > >>>>>> API".
> > > > >>>>>>
> > > > >>>>>> I don't think we should create a replacement for the
> DataStream
> > > API
> > > > >>>>> unless
> > > > >>>>>> we have a very good reason to do so and with a proper
> discussion
> > > > about
> > > > >>>>> this
> > > > >>>>>> as Alex said.
> > > > >>>>>>
> > > > >>>>>> Cheers,
> > > > >>>>>> Gyula
> > > > >>>>>>
> > > > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > > > >>>>>> alexander.fedulov@gmail.com> wrote:
> > > > >>>>>>
> > > > >>>>>>> Hi Xintong,
> > > > >>>>>>>
> > > > >>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > FLIP-321:
> > > > >>>>>> Introduce
> > > > >>>>>>> an API deprecation process" thread [1]?
> > > > >>>>>>>
> > > > >>>>>>> I am also curious to know if the rationale behind this new
> API
> > > has
> > > > >>>> been
> > > > >>>>>>> previously discussed on the mailing list. Do we have a list
> of
> > > > >>>>>> shortcomings
> > > > >>>>>>> in the current DataStream API that it tries to resolve? How
> > does
> > > > the
> > > > >>>>>>> current ProcessFunction functionality fit into the picture?
> > Will
> > > it
> > > > >>>> be
> > > > >>>>>> kept
> > > > >>>>>>> as is or subsumed by new API?
> > > > >>>>>>>
> > > > >>>>>>> [1]
> > > > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > >>>>>>>
> > > > >>>>>>> Best,
> > > > >>>>>>> Alex
> > > > >>>>>>>
> > > > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > > tonysong820@gmail.com>
> > > > >>>>>> wrote:
> > > > >>>>>>>>> The ProcessFunction API item is giving me the most
> headaches
> > > > >>>>> because
> > > > >>>>>>> it's
> > > > >>>>>>>>> very unclear what it actually entails; like is it an
> entirely
> > > > >>>>>> separate
> > > > >>>>>>>> API
> > > > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > > DataStream.
> > > > >>>>> How
> > > > >>>>>>>> much
> > > > >>>>>>>>> will it share the internals with DataStream etc.; how does
> it
> > > > >>>>> relate
> > > > >>>>>> to
> > > > >>>>>>>> the
> > > > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > > > >>>> underneath).
> > > > >>>>>>>> I totally understand your confusion. We started planning
> this
> > > > after
> > > > >>>>>>> kicking
> > > > >>>>>>>> off the release 2.0, so there's still a lot to be explored
> and
> > > the
> > > > >>>>> plan
> > > > >>>>>>>> keeps changing.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>    - In the beginning, we planned to do an in-place refactor
> > of
> > > > >>>>>>> DataStream
> > > > >>>>>>>>    API, until the API migration period is proposed.
> > > > >>>>>>>>    - Then we want to make it an entirely separate API to
> > > > >>>> DataStream,
> > > > >>>>>> and
> > > > >>>>>>>>    listed as a must-have for release 2.0 so that we can
> remove
> > > > >>>>>> DataStream
> > > > >>>>>>>> once
> > > > >>>>>>>>    it's ready.
> > > > >>>>>>>>    - However, depending on the outcome of the API
> > compatibility
> > > > >>>>>>> discussion
> > > > >>>>>>>>    [1], we may not be able to remove DataStream in 2.0
> anyway,
> > > > >>>> which
> > > > >>>>>>> means
> > > > >>>>>>>> we
> > > > >>>>>>>>    might need to re-evaluate the necessity of this item for
> > 2.0.
> > > > >>>>>>>>
> > > > >>>>>>>> I'd say we wait a bit longer for the compatibility
> discussion
> > > [1]
> > > > >>>> and
> > > > >>>>>>>> decide the priority for this item afterwards.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> Best,
> > > > >>>>>>>>
> > > > >>>>>>>> Xintong
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > > > >>>> chesnay@apache.org
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> by-and-large I'm quite happy with the list of items.
> > > > >>>>>>>>>
> > > > >>>>>>>>> I'm curious as to why the "Disaggregated State Management"
> > item
> > > > >>>> is
> > > > >>>>>>> marked
> > > > >>>>>>>>> as a must-have; will it require changes that break
> something?
> > > > >>>> What
> > > > >>>>>>>> prevents
> > > > >>>>>>>>> it from being added in 2.1?
> > > > >>>>>>>>>
> > > > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> > > > >>>>> default,
> > > > >>>>>>> drop
> > > > >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java
> > 8"
> > > > >>>> and
> > > > >>>>> a
> > > > >>>>>>>>> nice-to-have "Drop Java 11"?
> > > > >>>>>>>>>
> > > > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> > this
> > > > >>>>> would
> > > > >>>>>>> be
> > > > >>>>>>>>> an entirely internal change, and could thus be an
> incremental
> > > > >>>>> process
> > > > >>>>>>>>> independent of major releases.
> > > > >>>>>>>>> What is the actual scale of this item; how much are we
> > actually
> > > > >>>>>>>> re-writing?
> > > > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > > > >>>> must-have; i
> > > > >>>>>>> think
> > > > >>>>>>>>> I marked it down as nice-to-have only because it depends on
> > > > >>>> another
> > > > >>>>>>> item.
> > > > >>>>>>>>> The ProcessFunction API item is giving me the most
> headaches
> > > > >>>>> because
> > > > >>>>>>> it's
> > > > >>>>>>>>> very unclear what it actually entails; like is it an
> entirely
> > > > >>>>>> separate
> > > > >>>>>>>> API
> > > > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > > DataStream.
> > > > >>>>> How
> > > > >>>>>>>> much
> > > > >>>>>>>>> will it share the internals with DataStream etc.; how does
> it
> > > > >>>>> relate
> > > > >>>>>> to
> > > > >>>>>>>> the
> > > > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > > > >>>> underneath).
> > > > >>>>>>>>> There are a few items I added as ideas which don't have a
> > > > >>>> priority
> > > > >>>>>> yet;
> > > > >>>>>>>>> would love to get some feedback on those.
> > > > >>>>>>>>>
> > > > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>> Hi devs,
> > > > >>>>>>>>>
> > > > >>>>>>>>> As previously discussed in [1], we had been collecting work
> > > item
> > > > >>>>>>>> proposals
> > > > >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> > > > >>>>>>>>>
> > > > >>>>>>>>>    - As we have passed the due date, I'd like to kindly
> > remind
> > > > >>>>>> everyone
> > > > >>>>>>>> *not
> > > > >>>>>>>>>    to add / remove items directly on the wiki page*. If
> > needed,
> > > > >>>>>> please
> > > > >>>>>>>> post
> > > > >>>>>>>>>    in this thread or reach out to the release managers
> > instead.
> > > > >>>>>>>>>    - I've reached out to some folks for clarifications
> about
> > > > >>>> their
> > > > >>>>>>>>>    proposals. Some of them mentioned that they can not yet
> > tell
> > > > >>>>>> whether
> > > > >>>>>>>> we
> > > > >>>>>>>>>    should do an item or not, and would need more time /
> > > > >>>> discussions
> > > > >>>>>> to
> > > > >>>>>>>> make
> > > > >>>>>>>>>    the decision. So I added a new symbol for items whose
> > > > >>>> priorities
> > > > >>>>>> are
> > > > >>>>>>>> `TBD`.
> > > > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > > > >>>> must-have
> > > > >>>>>>> items.
> > > > >>>>>>>>> I've gone through the entire list of proposed items, and
> > found
> > > > >>>> most
> > > > >>>>>> of
> > > > >>>>>>>> them
> > > > >>>>>>>>> make quite much sense. So I think an online sync might not
> be
> > > > >>>>>> necessary
> > > > >>>>>>>> for
> > > > >>>>>>>>> this. I'd like to go with this DISCUSS thread, where
> everyone
> > > can
> > > > >>>>>>> comment
> > > > >>>>>>>>> on how they think the list can be improved, followed by a
> > VOTE
> > > to
> > > > >>>>>>>> formally
> > > > >>>>>>>>> make the decision.
> > > > >>>>>>>>>
> > > > >>>>>>>>> Any feedback and opinions, including but not limited to the
> > > > >>>>> following
> > > > >>>>>>>>> aspects, will be appreciated.
> > > > >>>>>>>>>
> > > > >>>>>>>>>    - Important items that are missing from the list
> > > > >>>>>>>>>    - Concerns regarding the listed items or their
> priorities
> > > > >>>>>>>>>
> > > > >>>>>>>>> Looking forward to your feedback.
> > > > >>>>>>>>>
> > > > >>>>>>>>> Best,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Xintong
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> [1]
> > > > >>>>
> > > > >>
> > > >
> > >
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > >>>>>>>>> [2]
> > > > >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>
> > > > >>>> --
> > > > >>>> Best regards,
> > > > >>>> Sergey
> > > > >>>>
> > > > >>
> > > > >>
> > > >
> > > >
> > >
> >
>
>
> --
> Best regards,
> Sergey
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Sergey Nuyanzin <sn...@gmail.com>.
Hi Jing,

Maybe I was not clear enough, sorry.
However, the main reason for this item about Calcite rules is not abandoning
Scala.
The main reason is a change in Calcite itself: it introduced a code
generation framework (Immutables) to generate config Java classes for rules,
and the old API for that (which is used in the Scala Calcite rules) is marked
as deprecated.
Since Immutables relies on code generation during Java compilation, it seems
impossible to use for rules written in Scala.
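
For readers unfamiliar with the pattern, below is a minimal sketch of what a
generated, immutable rule config looks like. It is an illustration only:
`ExampleRuleConfig`, `ImmutableExampleRuleConfig`, and `description` are
hypothetical names, and the `Immutable...` class is written by hand here as a
stand-in for what an annotation processor would generate during Java
compilation. It does not use the actual Calcite or Immutables APIs.

```java
// Sketch only: every name below is hypothetical, not Calcite's real API.
public class RuleConfigSketch {

    /** What a rule author would declare (annotated for the generator in a real setup). */
    interface ExampleRuleConfig {
        String description();

        /** Returns a copy with a different description; the receiver is unchanged. */
        ExampleRuleConfig withDescription(String description);
    }

    /**
     * Hand-written stand-in for the class that would be generated while the
     * Java sources are compiled. Assumption: a config declared in a Scala
     * source file does not go through that Java compilation step, so no such
     * class would be produced for it.
     */
    static final class ImmutableExampleRuleConfig implements ExampleRuleConfig {
        private final String description;

        private ImmutableExampleRuleConfig(String description) {
            this.description = description;
        }

        static ExampleRuleConfig of(String description) {
            return new ImmutableExampleRuleConfig(description);
        }

        @Override
        public String description() {
            return description;
        }

        @Override
        public ExampleRuleConfig withDescription(String description) {
            // Immutable update: build a new instance instead of mutating this one.
            return new ImmutableExampleRuleConfig(description);
        }
    }

    public static void main(String[] args) {
        ExampleRuleConfig base = ImmutableExampleRuleConfig.of("ExampleRule");
        ExampleRuleConfig renamed = base.withDescription("ExampleRule:renamed");
        System.out.println(base.description());
        System.out.println(renamed.description());
    }
}
```

The point of the sketch is the dependency on generated code: a rule written in
Java can refer to the generated `Immutable...` class, whereas a rule written in
Scala would have nothing to refer to unless the config is declared in Java.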

On Mon, Jul 3, 2023 at 10:44 AM Jing Ge <ji...@ververica.com.invalid> wrote:

> Hi,
>
> Speaking of "Move Calcite rules from Scala to Java", I was wondering if
> this thread is the right place to talk about it. Afaik, the Flink community
> has decided to abandon Scala. That is the reason, I guess, we want to move
> those Calcite rules from Scala to Java. On the other side, new Scala code
> will be added while developing new features[1]. Do we have any thoughts
> wrt the Scala code strategy?
>
> Best regards,
> Jing
>
>
>
> [1] https://lists.apache.org/thread/tnygl4n3q1fx75cl2vclc78j8mrxmz6y
>
> On Mon, Jul 3, 2023 at 10:31 AM Xintong Song <to...@gmail.com>
> wrote:
>
> > Thanks all for the discussion.
> >
> >
> > IIUC, we need to make the following changes. Please correct me if I get
> > it wrong.
> >
> >
> > 1. Disaggregated State Management - Clarify that only the public API
> > related part is must-have for 2.0.
> >
> > 2. Java version support - Split it into 3 items: a) make Java 17 the
> > default (must-have), b) drop Java 8 (must-have), and c) drop Java 11
> > (nice-to-have)
> >
> > 3. Add MetricGroup#getLogicalScope - Should be promoted to must-have
> >
> > 4. ProcessFunction API - Should be downgraded to nice-to-have
> >
> > 5. Configuration - Add an item "revisit all config option types and
> > default values", which IIUC should also be a must-have
> >
> >
> > There seem to be no changes needed for "Move Calcite rules from Scala to
> > Java" as it's already nice-to-have.
> >
> >
> > If there are no objections, I'll update the wiki page accordingly, and
> > start a VOTE in the next couple of days.
> >
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <liangtl@amazon.co.uk.invalid>
> > wrote:
> >
> > > Thanks Xintong for driving the effort.
> > >
> > > I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> > > especially the types. We have various configs that encode Time /
> > > MemorySize that are Long instead!
> > >
> > > Regards,
> > > Hong
> > >
> > >
> > >
> > > > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
> > > >
> > > > Thanks for driving this effort, Xintong!
> > > >
> > > > To Chesnay
> > > >> I'm curious as to why the "Disaggregated State Management" item is
> > > >> marked as a must-have; will it require changes that break something?
> > > >> What prevents it from being added in 2.1?
> > > >
> > > > As to "Disaggregated State Management".
> > > >
> > > > We plan to provide a new type of state backend to support DFS as
> > primary
> > > > storage.
> > > > To achieve this, we at least need to include two parts of amends (not
> > > > entirely sure yet, since we are still in the designing and prototype
> > > phase)
> > > >
> > > > 1. Statebackend Change
> > > > 2. State Access Change
> > > >
> > > > Not all of the interfaces related are `@Internal`. Some of the
> > interfaces
> > > > like `StateBackend` is `@PublicEvolving`
> > > > So, you are right in the sense that "Disaggregated State Management"
> > > itself
> > > > probably does not need to be a "Must Have"
> > > >
> > > > But I was hoping changes that related to public APIs can be finalized
> > and
> > > > merged in Flink 2.0 (I will fix the wiki accordingly).
> > > >
> > > > I also agree with Jark that 2.0 is a good chance to rework the
> default
> > > > value of configurations.
> > > >
> > > > Best
> > > > Yuan
> > > >
> > > >
> > > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <chesnay@apache.org
> >
> > > wrote:
> > > >
> > > >> Something else configuration-related is that there are a bunch of
> > > >> options where the type isn't quite correct (e.g., a String where it
> > > >> could be an enum, a string where it should be an int or something).
> > > >> Could do a pass over those as well.
> > > >>
> > > >> On 29/06/2023 13:50, Jark Wu wrote:
> > > >>> Hi,
> > > >>>
> > > >>> I think one more thing we need to consider to do in 2.0 is changing
> > the
> > > >>> default value of configuration to improve out-of-box user
> experience.
> > > >>>
> > > >>> Currently, in order to run a Flink job, users may need to set
> > > >>> a bunch of configurations, such as minibatch, checkpoint interval,
> > > >>> exactly-once,
> > > >>> incremental-checkpoint, etc. It's very verbose and hard to use for
> > > >>> beginners.
> > > >>> Most of them can have a universally applicable value.  Because
> > changing
> > > >> the
> > > >>> default value is a breaking change. I think It's worth considering
> > > >> changing
> > > >>> them in 2.0.
> > > >>>
> > > >>> What do you think?
> > > >>>
> > > >>> Best,
> > > >>> Jark
> > > >>>
> > > >>>
> > > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <snuyanzin@gmail.com
> >
> > > >> wrote:
> > > >>>
> > > >>>> Hi Chesnay
> > > >>>>
> > > >>>>> "Move Calcite rules from Scala to Java": I would hope that this
> > would
> > > >> be
> > > >>>>> an entirely internal change, and could thus be an incremental
> > process
> > > >>>>> independent of major releases.
> > > >>>>> What is the actual scale of this item; how much are we actually
> > > >>>> re-writing?
> > > >>>>
> > > >>>> Thanks for asking
> > > >>>> yes, you're right, that should be internal change.
> > > >>>> Yeah I was also thinking about incremental change (rule by rule or
> > > >>>> reasonable small group of rules).
> > > >>>> And yes, this could be an independent (on major release) activity
> > > >>>>
> > > >>>> The problem is actually for children of RelOptRule.
> > > >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> > > deprecated
> > > >>>> api.
> > > >>>> There are also children of ConverterRule (50+) which do not have
> > such
> > > >>>> issues.
> > > >>>> Maybe it could be considered as the next step to have all the
> rules
> > in
> > > >>>> Java.
> > > >>>>
> > > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> tonysong820@gmail.com
> > >
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi Alex & Gyula,
> > > >>>>>
> > > >>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > > >>>> Introduce
> > > >>>>>> an API deprecation process" thread [1]?
> > > >>>>>>
> > > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the
> > > wrong
> > > >>>> url
> > > >>>>> in my previous email. Sorry for the mistake.
> > > >>>>>
> > > >>>>> I am also curious to know if the rationale behind this new API
> has
> > > been
> > > >>>>>> previously discussed on the mailing list. Do we have a list of
> > > >>>>> shortcomings
> > > >>>>>> in the current DataStream API that it tries to resolve? How does
> > the
> > > >>>>>> current ProcessFunction functionality fit into the picture? Will
> > it
> > > be
> > > >>>>> kept
> > > >>>>>> as is or subsumed by new API?
> > > >>>>>>
> > > >>>>> I don't think we should create a replacement for the DataStream
> API
> > > >>>> unless
> > > >>>>>> we have a very good reason to do so and with a proper discussion
> > > about
> > > >>>>> this
> > > >>>>>> as Alex said.
> > > >>>>>
> > > >>>>> The ProcessFunction API which is targeting to replace DataStream
> > API
> > > is
> > > >>>>> still a proposal, not a decision. Sorry for the confusion, I
> should
> > > >> have
> > > >>>>> been more careful with my words, not giving the impression that
> > this
> > > is
> > > >>>>> something we'll do anyway.
> > > >>>>>
> > > >>>>> There will be a FLIP describing the motivations and designs in
> > > detail,
> > > >>>> for
> > > >>>>> the community to discuss and vote on. We are still working on it.
> > > TBH,
> > > >>>> this
> > > >>>>> is not trivial and we would need more time on it.
> > > >>>>>
> > > >>>>> Just to quickly share some backgrounds:
> > > >>>>>
> > > >>>>>    - We see quite some problems with the current DataStream APIs
> > > >>>>>       - Users are working with concrete classes rather than
> > > >> interfaces,
> > > >>>>>       which means
> > > >>>>>       - Users can access methods that are designed to be used by
> > > >> internal
> > > >>>>>          classes, even though they are annotated with
> `@Internal`.
> > > >> E.g.,
> > > >>>>>          `DataStream#getTransformation`.
> > > >>>>>          - Changes to the non-API implementations (e.g.,
> > > >>>> `Transformation`)
> > > >>>>>          would affect the API classes (e.g., `DataStream`), which
> > > >>>>> makes it hard to
> > > >>>>>          provide binary compatibility.
> > > >>>>>       - Internal classes are used as parameter / return-value of
> > > >> public
> > > >>>>>       APIs. E.g., while `AbstractStreamOperator` is
> PublicEvolving,
> > > >>>>> `StreamTask`
> > > >>>>>       which returns from
> `AbstractStreamOperator#getContainingTask`
> > > is
> > > >>>>> Internal.
> > > >>>>>       - In many cases, users are asked to extend the API classes,
> > > >> rather
> > > >>>>>       than implementing interfaces. E.g.,
> `AbstractStreamOperator`.
> > > >>>>>          - Any changes to the base classes, even the internal
> part,
> > > >> may
> > > >>>>>          affect the behavior of the user-provided sub-classes
> > > >>>>>          - Users can override the behavior of the base classes
> > > >>>>>       - The API module `flink-streaming-java` contains non-API
> > > >> classes,
> > > >>>> and
> > > >>>>>       depends on internal modules such as `flink-runtime`, which
> > > means
> > > >>>>>       - Changes to the internal modules may affect the API
> modules,
> > > >> which
> > > >>>>>          requires users to re-build their applications upon
> > upgrading
> > > >>>>>          - The artifact user needs for building their application
> > > >> larger
> > > >>>>>          than necessary.
> > > >>>>>       - We probably should not expose operators (e.g.,
> > > >>>>>       `AbstractStreamOperator`) to users. Functions should be
> > enough
> > > >>>>> for users to
> > > >>>>>       define their data processing logics. Exposing
> operator-level
> > > >>>> concepts
> > > >>>>>       (e.g., mailbox thread model, checkpoint barrier alignment,
> > > >> etc.) is
> > > >>>>>       unnecessary and limits the improvement regarding such
> exposed
> > > >>>>> mechanisms
> > > >>>>>       with compatibility considerations.
> > > >>>>>       - The current DataStream API seems to be a mixture of many
> > > >> things,
> > > >>>>>       making it hard to understand especially for newcomers. It
> > might
> > > >> be
> > > >>>>> better
> > > >>>>>       to re-organize it into several parts: (the taxonomy below
> are
> > > >> just
> > > >>>> an
> > > >>>>>       example of the, we are still working on this)
> > > >>>>>          - The most fundamental stateful stream processing:
> > streams,
> > > >>>>>          partitions / key, process functions, state,
> > timeline-service
> > > >>>>>          - An extension for common batch-streaming unified
> > functions:
> > > >>>> map,
> > > >>>>>          flatmap, filter, agg, reduce, join, etc.
> > > >>>>>          - An extension for windowing supports:  window,
> triggering
> > > >>>>>          - An extension for event-time supports: event time,
> > > watermark
> > > >>>>>          - The extensions are like short-cuts / sugars, without
> > which
> > > >>>> users
> > > >>>>>          can probably still achieve the same behavior by working
> > with
> > > >> the
> > > >>>>>          fundamental APIs, but would be a lot easier with the
> > > >> extensions
> > > >>>>>       - The original plan was to do in-place refactors / changes
> on
> > > >>>>>    DataStream API. Some related items are listed in this doc [2]
> > > >> attached
> > > >>>>> to
> > > >>>>>    the kicking off email [3]. Not all of the above issues are
> > listed,
> > > >>>>> because
> > > >>>>>    we haven't looked into this as deeply as now  by that time.
> > > >>>>>    - We proposed this as a new API rather than in-place refactors
> > in
> > > >> the
> > > >>>>>    2.0 work item list, because we realized the changes might be
> too
> > > >> big
> > > >>>>> for an
> > > >>>>>    in-place change. First having a new API then gradually
> retiring
> > > the
> > > >>>> old
> > > >>>>> one
> > > >>>>>    would help users to smoothly migrate between them.
> > > >>>>>
> > > >>>>> A thorough discussion is definitely needed once the FLIP is out.
> > And
> > > of
> > > >>>>> course it's possible that the FLIP might be rejected. Given that
> we
> > > are
> > > >>>>> planning for release 2.0, I just feel it would be better to bring
> > > this
> > > >> up
> > > >>>>> early even the concrete plan is not yet ready,
> > > >>>>>
> > > >>>>> Best,
> > > >>>>>
> > > >>>>> Xintong
> > > >>>>>
> > > >>>>>
> > > >>>>> [1]
> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > >>>>> [2]
> > > >>>>>
> > > >>>>>
> > > >>>>
> > > >>
> > >
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > >>>>> [3]
> > https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > >>>>>
> > > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org>
> > > wrote:
> > > >>>>>
> > > >>>>>> Hey!
> > > >>>>>>
> > > >>>>>> I share the same concerns mentioned above regarding the
> > > >>>> "ProcessFunction
> > > >>>>>> API".
> > > >>>>>>
> > > >>>>>> I don't think we should create a replacement for the DataStream
> > API
> > > >>>>> unless
> > > >>>>>> we have a very good reason to do so and with a proper discussion
> > > about
> > > >>>>> this
> > > >>>>>> as Alex said.
> > > >>>>>>
> > > >>>>>> Cheers,
> > > >>>>>> Gyula
> > > >>>>>>
> > > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > > >>>>>> alexander.fedulov@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>>> Hi Xintong,
> > > >>>>>>>
> > > >>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> FLIP-321:
> > > >>>>>> Introduce
> > > >>>>>>> an API deprecation process" thread [1]?
> > > >>>>>>>
> > > >>>>>>> I am also curious to know if the rationale behind this new API
> > has
> > > >>>> been
> > > >>>>>>> previously discussed on the mailing list. Do we have a list of
> > > >>>>>> shortcomings
> > > >>>>>>> in the current DataStream API that it tries to resolve? How
> does
> > > the
> > > >>>>>>> current ProcessFunction functionality fit into the picture?
> Will
> > it
> > > >>>> be
> > > >>>>>> kept
> > > >>>>>>> as is or subsumed by new API?
> > > >>>>>>>
> > > >>>>>>> [1]
> > > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > >>>>>>>
> > > >>>>>>> Best,
> > > >>>>>>> Alex
> > > >>>>>>>
> > > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > tonysong820@gmail.com>
> > > >>>>>> wrote:
> > > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> > > >>>>> because
> > > >>>>>>> it's
> > > >>>>>>>>> very unclear what it actually entails; like is it an entirely
> > > >>>>>> separate
> > > >>>>>>>> API
> > > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > DataStream.
> > > >>>>> How
> > > >>>>>>>> much
> > > >>>>>>>>> will it share the internals with DataStream etc.; how does it
> > > >>>>> relate
> > > >>>>>> to
> > > >>>>>>>> the
> > > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > > >>>> underneath).
> > > >>>>>>>> I totally understand your confusion. We started planning this
> > > after
> > > >>>>>>> kicking
> > > >>>>>>>> off the release 2.0, so there's still a lot to be explored and
> > the
> > > >>>>> plan
> > > >>>>>>>> keeps changing.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>    - In the beginning, we planned to do an in-place refactor
> of
> > > >>>>>>> DataStream
> > > >>>>>>>>    API, until the API migration period is proposed.
> > > >>>>>>>>    - Then we want to make it an entirely separate API to
> > > >>>> DataStream,
> > > >>>>>> and
> > > >>>>>>>>    listed as a must-have for release 2.0 so that we can remove
> > > >>>>>> DataStream
> > > >>>>>>>> once
> > > >>>>>>>>    it's ready.
> > > >>>>>>>>    - However, depending on the outcome of the API
> compatibility
> > > >>>>>>> discussion
> > > >>>>>>>>    [1], we may not be able to remove DataStream in 2.0 anyway,
> > > >>>> which
> > > >>>>>>> means
> > > >>>>>>>> we
> > > >>>>>>>>    might need to re-evaluate the necessity of this item for
> 2.0.
> > > >>>>>>>>
> > > >>>>>>>> I'd say we wait a bit longer for the compatibility discussion
> > [1]
> > > >>>> and
> > > >>>>>>>> decide the priority for this item afterwards.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>>
> > > >>>>>>>> Xintong
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > > >>>> chesnay@apache.org
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> by-and-large I'm quite happy with the list of items.
> > > >>>>>>>>>
> > > >>>>>>>>> I'm curious as to why the "Disaggregated State Management"
> item
> > > >>>> is
> > > >>>>>>> marked
> > > >>>>>>>>> as a must-have; will it require changes that break something?
> > > >>>> What
> > > >>>>>>>> prevents
> > > >>>>>>>>> it from being added in 2.1?
> > > >>>>>>>>>
> > > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> > > >>>>> default,
> > > >>>>>>> drop
> > > >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java
> 8"
> > > >>>> and
> > > >>>>> a
> > > >>>>>>>>> nice-to-have "Drop Java 11"?
> > > >>>>>>>>>
> > > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> this
> > > >>>>> would
> > > >>>>>>> be
> > > >>>>>>>>> an entirely internal change, and could thus be an incremental
> > > >>>>> process
> > > >>>>>>>>> independent of major releases.
> > > >>>>>>>>> What is the actual scale of this item; how much are we
> actually
> > > >>>>>>>> re-writing?
> > > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > > >>>> must-have; i
> > > >>>>>>> think
> > > >>>>>>>>> I marked it down as nice-to-have only because it depends on
> > > >>>> another
> > > >>>>>>> item.
> > > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> > > >>>>> because
> > > >>>>>>> it's
> > > >>>>>>>>> very unclear what it actually entails; like is it an entirely
> > > >>>>>> separate
> > > >>>>>>>> API
> > > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > DataStream.
> > > >>>>> How
> > > >>>>>>>> much
> > > >>>>>>>>> will it share the internals with DataStream etc.; how does it
> > > >>>>> relate
> > > >>>>>> to
> > > >>>>>>>> the
> > > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > > >>>> underneath).
> > > >>>>>>>>> There are a few items I added as ideas which don't have a
> > > >>>> priority
> > > >>>>>> yet;
> > > >>>>>>>>> would love to get some feedback on those.
> > > >>>>>>>>>
> > > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Hi devs,
> > > >>>>>>>>>
> > > >>>>>>>>> As previously discussed in [1], we had been collecting work
> > item
> > > >>>>>>>> proposals
> > > >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> > > >>>>>>>>>
> > > >>>>>>>>>    - As we have passed the due date, I'd like to kindly
> remind
> > > >>>>>> everyone
> > > >>>>>>>> *not
> > > >>>>>>>>>    to add / remove items directly on the wiki page*. If
> needed,
> > > >>>>>> please
> > > >>>>>>>> post
> > > >>>>>>>>>    in this thread or reach out to the release managers
> instead.
> > > >>>>>>>>>    - I've reached out to some folks for clarifications about
> > > >>>> their
> > > >>>>>>>>>    proposals. Some of them mentioned that they can not yet
> tell
> > > >>>>>> whether
> > > >>>>>>>> we
> > > >>>>>>>>>    should do an item or not, and would need more time /
> > > >>>> discussions
> > > >>>>>> to
> > > >>>>>>>> make
> > > >>>>>>>>>    the decision. So I added a new symbol for items whose
> > > >>>> priorities
> > > >>>>>> are
> > > >>>>>>>> `TBD`.
> > > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > > >>>> must-have
> > > >>>>>>> items.
> > > >>>>>>>>> I've gone through the entire list of proposed items, and
> found
> > > >>>> most
> > > >>>>>> of
> > > >>>>>>>> them
> > > >>>>>>>>> make quite much sense. So I think an online sync might not be
> > > >>>>>> necessary
> > > >>>>>>>> for
> > > >>>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone
> > can
> > > >>>>>>> comment
> > > >>>>>>>>> on how they think the list can be improved, followed by a
> VOTE
> > to
> > > >>>>>>>> formally
> > > >>>>>>>>> make the decision.
> > > >>>>>>>>>
> > > >>>>>>>>> Any feedback and opinions, including but not limited to the
> > > >>>>> following
> > > >>>>>>>>> aspects, will be appreciated.
> > > >>>>>>>>>
> > > >>>>>>>>>    - Important items that are missing from the list
> > > >>>>>>>>>    - Concerns regarding the listed items or their priorities
> > > >>>>>>>>>
> > > >>>>>>>>> Looking forward to your feedback.
> > > >>>>>>>>>
> > > >>>>>>>>> Best,
> > > >>>>>>>>>
> > > >>>>>>>>> Xintong
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> [1]
> > > >>>>
> > > >>
> > >
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > >>>>>>>>> [2]
> > > >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>
> > > >>>> --
> > > >>>> Best regards,
> > > >>>> Sergey
> > > >>>>
> > > >>
> > > >>
> > >
> > >
> >
>


-- 
Best regards,
Sergey

Re: [DISCUSS] Release 2.0 Work Items

Posted by Jing Ge <ji...@ververica.com.INVALID>.
Hi,

Speaking of "Move Calcite rules from Scala to Java", I was wondering if
this thread is the right place to talk about it. AFAIK, the Flink community
has decided to abandon Scala, which I guess is the reason we want to move
those Calcite rules from Scala to Java. On the other hand, new Scala code
is still being added while developing new features [1]. Do we have any
thoughts on the Scala code strategy?

Best regards,
Jing



[1] https://lists.apache.org/thread/tnygl4n3q1fx75cl2vclc78j8mrxmz6y

On Mon, Jul 3, 2023 at 10:31 AM Xintong Song <to...@gmail.com> wrote:

> Thanks all for the discussion.
>
>
> IIUC, we need to make the following changes. Please correct me if I get it
> wrong.
>
>
> 1. Disaggregated State Management - Clarify that only the public API
> related part is must-have for 2.0.
>
> 2. Java version support - Split it into 3 items: a) make Java 17 the
> default (must-have), b) drop Java 8 (must-have), and c) drop Java 11
> (nice-to-have)
>
> 3. Add MetricGroup#getLogicalScope - Should be promoted to must-have
>
> 4. ProcessFunction API - Should be downgraded to nice-to-have
>
> 5. Configuration - Add an item "revisit all config option types and default
> values", which IIUC should also be a must-have
>
>
> There seem to be no changes needed for "Move Calcite rules from Scala to
> Java" as it's already nice-to-have.
>
>
> If there are no objections, I'll update the wiki page accordingly, and start
> a VOTE in the next couple of days.
>
>
> Best,
>
> Xintong
>
>
>
> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <li...@amazon.co.uk.invalid>
> wrote:
>
> > Thanks Xintong for driving the effort.
> >
> > I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> > especially the types. We have various configs that encode a Time /
> > MemorySize but are typed as Long instead!
> >
> > Regards,
> > Hong
> >
> >
> >
> > > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
> > >
> > > Thanks for driving this effort, Xintong!
> > >
> > > To Chesnay
> > >> I'm curious as to why the "Disaggregated State Management" item is
> > >> marked as a must-have; will it require changes that break something?
> > >> What prevents it from being added in 2.1?
> > >
> > > As to "Disaggregated State Management".
> > >
> > > We plan to provide a new type of state backend that supports DFS as
> > > primary storage.
> > > To achieve this, we need at least two sets of changes (not entirely
> > > sure yet, since we are still in the design and prototyping phase):
> > >
> > > 1. Statebackend Change
> > > 2. State Access Change
> > >
> > > Not all of the related interfaces are `@Internal`. Some of them, like
> > > `StateBackend`, are `@PublicEvolving`.
> > > So, you are right in the sense that "Disaggregated State Management"
> > > itself probably does not need to be a "Must Have".
> > >
> > > But I was hoping that the changes related to public APIs can be
> > > finalized and merged in Flink 2.0 (I will fix the wiki accordingly).
> > >
> > > I also agree with Jark that 2.0 is a good chance to rework the default
> > > value of configurations.
> > >
> > > Best
> > > Yuan
> > >
> > >
> > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <ch...@apache.org>
> > wrote:
> > >
> > >> Something else configuration-related is that there are a bunch of
> > >> options where the type isn't quite correct (e.g., a String where it
> > >> could be an enum, a string where it should be an int or something).
> > >> Could do a pass over those as well.
> > >>
> > >> On 29/06/2023 13:50, Jark Wu wrote:
> > >>> Hi,
> > >>>
> > >>> I think one more thing we should consider doing in 2.0 is changing the
> > >>> default values of configuration options to improve the out-of-box user
> > >>> experience.
> > >>>
> > >>> Currently, in order to run a Flink job, users may need to set a bunch
> > >>> of configurations, such as minibatch, checkpoint interval,
> > >>> exactly-once, incremental-checkpoint, etc. It's very verbose and hard
> > >>> to use for beginners.
> > >>> Most of them can have a universally applicable value. Because changing
> > >>> the default value is a breaking change, I think it's worth considering
> > >>> changing them in 2.0.
> > >>>
> > >>> What do you think?
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>>
> > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com>
> > >> wrote:
> > >>>
> > >>>> Hi Chesnay
> > >>>>
> > >>>>> "Move Calcite rules from Scala to Java": I would hope that this
> would
> > >> be
> > >>>>> an entirely internal change, and could thus be an incremental
> process
> > >>>>> independent of major releases.
> > >>>>> What is the actual scale of this item; how much are we actually
> > >>>> re-writing?
> > >>>>
> > >>>> Thanks for asking.
> > >>>> Yes, you're right, that should be an internal change.
> > >>>> Yeah, I was also thinking about an incremental change (rule by rule,
> > >>>> or in reasonably small groups of rules).
> > >>>> And yes, this could be an activity independent of major releases.
> > >>>>
> > >>>> The problem is actually for children of RelOptRule.
> > >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> > >>>> deprecated API.
> > >>>> There are also children of ConverterRule (50+) which do not have such
> > >>>> issues.
> > >>>> Maybe having all the rules in Java could be considered as the next
> > >>>> step.
> > >>>>
> > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <tonysong820@gmail.com
> >
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Alex & Gyula,
> > >>>>>
> > >>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > >>>> Introduce
> > >>>>>> an API deprecation process" thread [1]?
> > >>>>>>
> > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the
> > wrong
> > >>>> url
> > >>>>> in my previous email. Sorry for the mistake.
> > >>>>>
> > >>>>> I am also curious to know if the rationale behind this new API has
> > been
> > >>>>>> previously discussed on the mailing list. Do we have a list of
> > >>>>> shortcomings
> > >>>>>> in the current DataStream API that it tries to resolve? How does
> the
> > >>>>>> current ProcessFunction functionality fit into the picture? Will
> it
> > be
> > >>>>> kept
> > >>>>>> as is or subsumed by new API?
> > >>>>>>
> > >>>>> I don't think we should create a replacement for the DataStream API
> > >>>> unless
> > >>>>>> we have a very good reason to do so and with a proper discussion
> > about
> > >>>>> this
> > >>>>>> as Alex said.
> > >>>>>
> > >>>>> The ProcessFunction API, which is intended to replace the DataStream
> > >>>>> API, is still a proposal, not a decision. Sorry for the confusion; I
> > >>>>> should have been more careful with my words and not given the
> > >>>>> impression that this is something we'll do anyway.
> > >>>>>
> > >>>>> There will be a FLIP describing the motivations and designs in detail,
> > >>>>> for the community to discuss and vote on. We are still working on it.
> > >>>>> TBH, this is not trivial and we would need more time on it.
> > >>>>>
> > >>>>> Just to quickly share some backgrounds:
> > >>>>>
> > >>>>>    - We see quite some problems with the current DataStream APIs
> > >>>>>       - Users are working with concrete classes rather than
> > >>>>>       interfaces, which means
> > >>>>>          - Users can access methods that are designed to be used by
> > >>>>>          internal classes, even though they are annotated with
> > >>>>>          `@Internal`. E.g., `DataStream#getTransformation`.
> > >>>>>          - Changes to the non-API implementations (e.g.,
> > >>>>>          `Transformation`) would affect the API classes (e.g.,
> > >>>>>          `DataStream`), which makes it hard to provide binary
> > >>>>>          compatibility.
> > >>>>>       - Internal classes are used as parameters / return values of
> > >>>>>       public APIs. E.g., while `AbstractStreamOperator` is
> > >>>>>       PublicEvolving, `StreamTask`, which is returned from
> > >>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
> > >>>>>       - In many cases, users are asked to extend the API classes,
> > >>>>>       rather than implementing interfaces. E.g.,
> > >>>>>       `AbstractStreamOperator`.
> > >>>>>          - Any changes to the base classes, even the internal part,
> > >>>>>          may affect the behavior of the user-provided sub-classes
> > >>>>>          - Users can override the behavior of the base classes
> > >>>>>       - The API module `flink-streaming-java` contains non-API
> > >>>>>       classes, and depends on internal modules such as
> > >>>>>       `flink-runtime`, which means
> > >>>>>          - Changes to the internal modules may affect the API
> > >>>>>          modules, which requires users to re-build their applications
> > >>>>>          upon upgrading
> > >>>>>          - The artifact users need for building their application is
> > >>>>>          larger than necessary.
> > >>>>>       - We probably should not expose operators (e.g.,
> > >>>>>       `AbstractStreamOperator`) to users. Functions should be enough
> > >>>>>       for users to define their data processing logic. Exposing
> > >>>>>       operator-level concepts (e.g., mailbox thread model, checkpoint
> > >>>>>       barrier alignment, etc.) is unnecessary, and compatibility
> > >>>>>       considerations then limit how much such exposed mechanisms can
> > >>>>>       be improved.
> > >>>>>       - The current DataStream API seems to be a mixture of many
> > >>>>>       things, making it hard to understand, especially for newcomers.
> > >>>>>       It might be better to re-organize it into several parts: (the
> > >>>>>       taxonomy below is just an example, we are still working on
> > >>>>>       this)
> > >>>>>          - The most fundamental stateful stream processing: streams,
> > >>>>>          partitions / key, process functions, state, timeline-service
> > >>>>>          - An extension for common batch-streaming unified functions:
> > >>>>>          map, flatmap, filter, agg, reduce, join, etc.
> > >>>>>          - An extension for windowing support: window, triggering
> > >>>>>          - An extension for event-time support: event time, watermark
> > >>>>>          - The extensions are like short-cuts / sugars, without which
> > >>>>>          users can probably still achieve the same behavior by
> > >>>>>          working with the fundamental APIs, but it would be a lot
> > >>>>>          easier with the extensions
> > >>>>>    - The original plan was to do in-place refactors / changes on the
> > >>>>>    DataStream API. Some related items are listed in this doc [2]
> > >>>>>    attached to the kick-off email [3]. Not all of the above issues
> > >>>>>    are listed, because at that time we hadn't looked into this as
> > >>>>>    deeply as we have now.
> > >>>>>    - We proposed this as a new API rather than in-place refactors in
> > >>>>>    the 2.0 work item list, because we realized the changes might be
> > >>>>>    too big for an in-place change. First having a new API and then
> > >>>>>    gradually retiring the old one would help users to smoothly
> > >>>>>    migrate between them.
> > >>>>>
> > >>>>> A thorough discussion is definitely needed once the FLIP is out. And of
> > >>>>> course it's possible that the FLIP might be rejected. Given that we are
> > >>>>> planning for release 2.0, I just feel it would be better to bring this
> > >>>>> up early, even though the concrete plan is not yet ready.
> > >>>>>
> > >>>>> Best,
> > >>>>>
> > >>>>> Xintong
> > >>>>>
> > >>>>>
> > >>>>> [1]
> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>> [2]
> > >>>>>
> > >>>>>
> > >>>>
> > >>
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > >>>>> [3]
> https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > >>>>>
> > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org>
> > wrote:
> > >>>>>
> > >>>>>> Hey!
> > >>>>>>
> > >>>>>> I share the same concerns mentioned above regarding the
> > >>>> "ProcessFunction
> > >>>>>> API".
> > >>>>>>
> > >>>>>> I don't think we should create a replacement for the DataStream
> API
> > >>>>> unless
> > >>>>>> we have a very good reason to do so and with a proper discussion
> > about
> > >>>>> this
> > >>>>>> as Alex said.
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Gyula
> > >>>>>>
> > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > >>>>>> alexander.fedulov@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Hi Xintong,
> > >>>>>>>
> > >>>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > >>>>>> Introduce
> > >>>>>>> an API deprecation process" thread [1]?
> > >>>>>>>
> > >>>>>>> I am also curious to know if the rationale behind this new API
> has
> > >>>> been
> > >>>>>>> previously discussed on the mailing list. Do we have a list of
> > >>>>>> shortcomings
> > >>>>>>> in the current DataStream API that it tries to resolve? How does
> > the
> > >>>>>>> current ProcessFunction functionality fit into the picture? Will
> it
> > >>>> be
> > >>>>>> kept
> > >>>>>>> as is or subsumed by new API?
> > >>>>>>>
> > >>>>>>> [1]
> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Alex
> > >>>>>>>
> > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> tonysong820@gmail.com>
> > >>>>>> wrote:
> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> > >>>>> because
> > >>>>>>> it's
> > >>>>>>>>> very unclear what it actually entails; like is it an entirely
> > >>>>>> separate
> > >>>>>>>> API
> > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> DataStream.
> > >>>>> How
> > >>>>>>>> much
> > >>>>>>>>> will it share the internals with DataStream etc.; how does it
> > >>>>> relate
> > >>>>>> to
> > >>>>>>>> the
> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > >>>> underneath).
> > >>>>>>>> I totally understand your confusion. We started planning this
> > after
> > >>>>>>> kicking
> > >>>>>>>> off the release 2.0, so there's still a lot to be explored and
> the
> > >>>>> plan
> > >>>>>>>> keeps changing.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>    - In the beginning, we planned to do an in-place refactor of
> > >>>>>>> DataStream
> > >>>>>>>>    API, until the API migration period is proposed.
> > >>>>>>>>    - Then we want to make it an entirely separate API to
> > >>>> DataStream,
> > >>>>>> and
> > >>>>>>>>    listed as a must-have for release 2.0 so that we can remove
> > >>>>>> DataStream
> > >>>>>>>> once
> > >>>>>>>>    it's ready.
> > >>>>>>>>    - However, depending on the outcome of the API compatibility
> > >>>>>>> discussion
> > >>>>>>>>    [1], we may not be able to remove DataStream in 2.0 anyway,
> > >>>> which
> > >>>>>>> means
> > >>>>>>>> we
> > >>>>>>>>    might need to re-evaluate the necessity of this item for 2.0.
> > >>>>>>>>
> > >>>>>>>> I'd say we wait a bit longer for the compatibility discussion
> [1]
> > >>>> and
> > >>>>>>>> decide the priority for this item afterwards.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>>
> > >>>>>>>> Xintong
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > >>>> chesnay@apache.org
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> by-and-large I'm quite happy with the list of items.
> > >>>>>>>>>
> > >>>>>>>>> I'm curious as to why the "Disaggregated State Management" item
> > >>>> is
> > >>>>>>> marked
> > >>>>>>>>> as a must-have; will it require changes that break something?
> > >>>> What
> > >>>>>>>> prevents
> > >>>>>>>>> it from being added in 2.1?
> > >>>>>>>>>
> > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> > >>>>> default,
> > >>>>>>> drop
> > >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8"
> > >>>> and
> > >>>>> a
> > >>>>>>>>> nice-to-have "Drop Java 11"?
> > >>>>>>>>>
> > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
> > >>>>> would
> > >>>>>>> be
> > >>>>>>>>> an entirely internal change, and could thus be an incremental
> > >>>>> process
> > >>>>>>>>> independent of major releases.
> > >>>>>>>>> What is the actual scale of this item; how much are we actually
> > >>>>>>>> re-writing?
> > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > >>>> must-have; i
> > >>>>>>> think
> > >>>>>>>>> I marked it down as nice-to-have only because it depends on
> > >>>> another
> > >>>>>>> item.
> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> > >>>>> because
> > >>>>>>> it's
> > >>>>>>>>> very unclear what it actually entails; like is it an entirely
> > >>>>>> separate
> > >>>>>>>> API
> > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> DataStream.
> > >>>>> How
> > >>>>>>>> much
> > >>>>>>>>> will it share the internals with DataStream etc.; how does it
> > >>>>> relate
> > >>>>>> to
> > >>>>>>>> the
> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > >>>> underneath).
> > >>>>>>>>> There are a few items I added as ideas which don't have a
> > >>>> priority
> > >>>>>> yet;
> > >>>>>>>>> would love to get some feedback on those.
> > >>>>>>>>>
> > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > >>>>>>>>>
> > >>>>>>>>> Hi devs,
> > >>>>>>>>>
> > >>>>>>>>> As previously discussed in [1], we had been collecting work
> item
> > >>>>>>>> proposals
> > >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> > >>>>>>>>>
> > >>>>>>>>>    - As we have passed the due date, I'd like to kindly remind
> > >>>>>> everyone
> > >>>>>>>> *not
> > >>>>>>>>>    to add / remove items directly on the wiki page*. If needed,
> > >>>>>> please
> > >>>>>>>> post
> > >>>>>>>>>    in this thread or reach out to the release managers instead.
> > >>>>>>>>>    - I've reached out to some folks for clarifications about
> > >>>> their
> > >>>>>>>>>    proposals. Some of them mentioned that they can not yet tell
> > >>>>>> whether
> > >>>>>>>> we
> > >>>>>>>>>    should do an item or not, and would need more time /
> > >>>> discussions
> > >>>>>> to
> > >>>>>>>> make
> > >>>>>>>>>    the decision. So I added a new symbol for items whose
> > >>>> priorities
> > >>>>>> are
> > >>>>>>>> `TBD`.
> > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > >>>> must-have
> > >>>>>>> items.
> > >>>>>>>>> I've gone through the entire list of proposed items, and found
> > >>>> most
> > >>>>>> of
> > >>>>>>>> them
> > >>>>>>>>> make quite much sense. So I think an online sync might not be
> > >>>>>> necessary
> > >>>>>>>> for
> > >>>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone
> can
> > >>>>>>> comment
> > >>>>>>>>> on how they think the list can be improved, followed by a VOTE
> to
> > >>>>>>>> formally
> > >>>>>>>>> make the decision.
> > >>>>>>>>>
> > >>>>>>>>> Any feedback and opinions, including but not limited to the
> > >>>>> following
> > >>>>>>>>> aspects, will be appreciated.
> > >>>>>>>>>
> > >>>>>>>>>    - Important items that are missing from the list
> > >>>>>>>>>    - Concerns regarding the listed items or their priorities
> > >>>>>>>>>
> > >>>>>>>>> Looking forward to your feedback.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>>
> > >>>>>>>>> Xintong
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> [1]
> > >>>>
> > >>
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > >>>>>>>>> [2]
> > >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>> Sergey
> > >>>>
> > >>
> > >>
> >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
Thanks all for the discussion.


IIUC, we need to make the following changes. Please correct me if I get it
wrong.


1. Disaggregated State Management - Clarify that only the public API
related part is must-have for 2.0.

2. Java version support - Split it into 3 items: a) make Java 17 the
default (must-have), b) drop Java 8 (must-have), and c) drop Java 11
(nice-to-have)

3. Add MetricGroup#getLogicalScope - Should be promoted to must-have

4. ProcessFunction API - Should be downgraded to nice-to-have

5. Configuration - Add an item "revisit all config option types and default
values", which IIUC should also be a must-have
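
To make item 5 a bit more concrete, below is a minimal sketch of what
"revisiting option types" could mean for a time-valued option. Everything in
it is hypothetical and for illustration only: the class `TypedOptionSketch`,
the key `example.timeout`, and the tiny parser are assumptions, not Flink
APIs or actual Flink options. It contrasts an option stored as a bare long
(where the unit lives only in the documentation) with one parsed into a
`java.time.Duration`.

```java
import java.time.Duration;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration only; none of these names are Flink APIs. */
public class TypedOptionSketch {

    /** Legacy style: a bare long. Whether it means ms or s is only documented. */
    public static long legacyTimeoutMillis(Map<String, String> conf) {
        return Long.parseLong(conf.getOrDefault("example.timeout", "10000"));
    }

    /** Typed style: the configured value carries its own unit, e.g. "10s". */
    public static Duration typedTimeout(Map<String, String> conf) {
        return parseDuration(conf.getOrDefault("example.timeout", "10s"));
    }

    /** Minimal parser for "<number><unit>", where unit is ms, s, or min. */
    public static Duration parseDuration(String text) {
        String t = text.trim().toLowerCase(Locale.ROOT);
        // "ms" must be checked before "s", since "ms" also ends with "s".
        if (t.endsWith("ms")) {
            return Duration.ofMillis(Long.parseLong(t.substring(0, t.length() - 2).trim()));
        }
        if (t.endsWith("min")) {
            return Duration.ofMinutes(Long.parseLong(t.substring(0, t.length() - 3).trim()));
        }
        if (t.endsWith("s")) {
            return Duration.ofSeconds(Long.parseLong(t.substring(0, t.length() - 1).trim()));
        }
        // A bare number is rejected on purpose: the unit would be ambiguous.
        throw new IllegalArgumentException("Missing or unknown time unit: " + text);
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of("example.timeout", "30s");
        System.out.println(typedTimeout(conf).toMillis()); // prints 30000
    }
}
```

The point of the sketch is only that a typed value makes the unit explicit
and lets ambiguous input fail early; how existing keys would be migrated is
not covered here.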


There seem to be no changes needed for "Move Calcite rules from Scala to
Java" as it's already nice-to-have.


If there are no objections, I'll update the wiki page accordingly, and start
a VOTE in the next couple of days.


Best,

Xintong



On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <li...@amazon.co.uk.invalid>
wrote:

> Thanks Xintong for driving the effort.
>
> I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> especially the types. We have various configs that encode a Time /
> MemorySize but are typed as Long instead!
>
> Regards,
> Hong
>
>
>
> > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
> >
> > Thanks for driving this effort, Xintong!
> >
> > To Chesnay
> >> I'm curious as to why the "Disaggregated State Management" item is
> >> marked as a must-have; will it require changes that break something?
> >> What prevents it from being added in 2.1?
> >
> > As to "Disaggregated State Management".
> >
> > We plan to provide a new type of state backend that supports DFS as primary
> > storage.
> > To achieve this, we need at least two sets of changes (not entirely sure
> > yet, since we are still in the design and prototyping phase):
> >
> > 1. Statebackend Change
> > 2. State Access Change
> >
> > Not all of the related interfaces are `@Internal`. Some of them, like
> > `StateBackend`, are `@PublicEvolving`.
> > So, you are right in the sense that "Disaggregated State Management" itself
> > probably does not need to be a "Must Have".
> >
> > But I was hoping that the changes related to public APIs can be finalized
> > and merged in Flink 2.0 (I will fix the wiki accordingly).
> >
> > I also agree with Jark that 2.0 is a good chance to rework the default
> > value of configurations.
> >
> > Best
> > Yuan
> >
> >
> > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <ch...@apache.org>
> wrote:
> >
> >> Something else configuration-related is that there are a bunch of
> >> options where the type isn't quite correct (e.g., a String where it
> >> could be an enum, a string where it should be an int or something).
> >> Could do a pass over those as well.
> >>
> >> On 29/06/2023 13:50, Jark Wu wrote:
> >>> Hi,
> >>>
> >>> I think one more thing we need to consider doing in 2.0 is changing the
> >>> default value of configuration to improve out-of-box user experience.
> >>>
> >>> Currently, in order to run a Flink job, users may need to set
> >>> a bunch of configurations, such as minibatch, checkpoint interval,
> >>> exactly-once,
> >>> incremental-checkpoint, etc. It's very verbose and hard to use for
> >>> beginners.
> >>> Most of them can have a universally applicable value. Because changing
> >>> the default value is a breaking change, I think it's worth considering
> >>> changing them in 2.0.
> >>>
> >>> What do you think?
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>>
> >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com>
> >> wrote:
> >>>
> >>>> Hi Chesnay
> >>>>
> >>>>> "Move Calcite rules from Scala to Java": I would hope that this would
> >> be
> >>>>> an entirely internal change, and could thus be an incremental process
> >>>>> independent of major releases.
> >>>>> What is the actual scale of this item; how much are we actually
> >>>> re-writing?
> >>>>
> >>>> Thanks for asking
> >>>> yes, you're right, that should be internal change.
> >>>> Yeah I was also thinking about incremental change (rule by rule or
> >>>> reasonable small group of rules).
> >>>> And yes, this could be an independent (on major release) activity
> >>>>
> >>>> The problem is actually for children of RelOptRule.
> >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> deprecated
> >>>> api.
> >>>> There are also children of ConverterRule (50+) which do not have such
> >>>> issues.
> >>>> Maybe it could be considered as the next step to have all the rules in
> >>>> Java.
> >>>>
> >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <to...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Alex & Gyula,
> >>>>>
> >>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> >>>> Introduce
> >>>>>> an API deprecation process" thread [1]?
> >>>>>>
> >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the
> wrong
> >>>> url
> >>>>> in my previous email. Sorry for the mistake.
> >>>>>
> >>>>> I am also curious to know if the rationale behind this new API has
> been
> >>>>>> previously discussed on the mailing list. Do we have a list of
> >>>>> shortcomings
> >>>>>> in the current DataStream API that it tries to resolve? How does the
> >>>>>> current ProcessFunction functionality fit into the picture? Will it
> be
> >>>>> kept
> >>>>>> as is or subsumed by new API?
> >>>>>>
> >>>>> I don't think we should create a replacement for the DataStream API
> >>>> unless
> >>>>>> we have a very good reason to do so and with a proper discussion
> about
> >>>>> this
> >>>>>> as Alex said.
> >>>>>
> >>>>> The ProcessFunction API, which targets replacing the DataStream API,
> >>>>> is still a proposal, not a decision. Sorry for the confusion, I
> >>>>> should have been more careful with my words, not giving the
> >>>>> impression that this is something we'll do anyway.
> >>>>>
> >>>>> There will be a FLIP describing the motivations and designs in
> >>>>> detail, for the community to discuss and vote on. We are still
> >>>>> working on it. TBH, this is not trivial and we would need more time
> >>>>> on it.
> >>>>>
> >>>>> Just to quickly share some backgrounds:
> >>>>>
> >>>>>    - We see quite some problems with the current DataStream APIs
> >>>>>       - Users are working with concrete classes rather than
> >>>>>       interfaces, which means
> >>>>>          - Users can access methods that are designed to be used by
> >>>>>          internal classes, even though they are annotated with
> >>>>>          `@Internal`. E.g., `DataStream#getTransformation`.
> >>>>>          - Changes to the non-API implementations (e.g.,
> >>>>>          `Transformation`) would affect the API classes (e.g.,
> >>>>>          `DataStream`), which makes it hard to provide binary
> >>>>>          compatibility.
> >>>>>       - Internal classes are used as parameter / return types of
> >>>>>       public APIs. E.g., while `AbstractStreamOperator` is
> >>>>>       PublicEvolving, `StreamTask`, which is returned from
> >>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
> >>>>>       - In many cases, users are asked to extend the API classes,
> >>>>>       rather than implementing interfaces. E.g.,
> >>>>>       `AbstractStreamOperator`.
> >>>>>          - Any changes to the base classes, even the internal part,
> >>>>>          may affect the behavior of the user-provided sub-classes
> >>>>>          - Users can override the behavior of the base classes
> >>>>>       - The API module `flink-streaming-java` contains non-API
> >>>>>       classes, and depends on internal modules such as
> >>>>>       `flink-runtime`, which means
> >>>>>          - Changes to the internal modules may affect the API modules,
> >>>>>          which requires users to re-build their applications upon
> >>>>>          upgrading
> >>>>>          - The artifact users need for building their applications
> >>>>>          is larger than necessary.
> >>>>>       - We probably should not expose operators (e.g.,
> >>>>>       `AbstractStreamOperator`) to users. Functions should be enough
> >>>>>       for users to define their data processing logic. Exposing
> >>>>>       operator-level concepts (e.g., mailbox thread model, checkpoint
> >>>>>       barrier alignment, etc.) is unnecessary and, because of
> >>>>>       compatibility considerations, limits how much such exposed
> >>>>>       mechanisms can be improved.
> >>>>>       - The current DataStream API seems to be a mixture of many
> >>>>>       things, making it hard to understand, especially for newcomers.
> >>>>>       It might be better to re-organize it into several parts (the
> >>>>>       taxonomy below is just an example; we are still working on this):
> >>>>>          - The most fundamental stateful stream processing: streams,
> >>>>>          partitions / key, process functions, state, timeline-service
> >>>>>          - An extension for common batch-streaming unified functions:
> >>>>>          map, flatmap, filter, agg, reduce, join, etc.
> >>>>>          - An extension for windowing supports: window, triggering
> >>>>>          - An extension for event-time supports: event time, watermark
> >>>>>          - The extensions are like short-cuts / sugars, without which
> >>>>>          users can probably still achieve the same behavior by working
> >>>>>          with the fundamental APIs, but it would be a lot easier with
> >>>>>          the extensions
> >>>>>    - The original plan was to do in-place refactors / changes on the
> >>>>>    DataStream API. Some related items are listed in this doc [2]
> >>>>>    attached to the kick-off email [3]. Not all of the above issues are
> >>>>>    listed, because at that time we had not looked into this as deeply
> >>>>>    as we have now.
> >>>>>    - We proposed this as a new API rather than in-place refactors in
> >>>>>    the 2.0 work item list, because we realized the changes might be
> >>>>>    too big for an in-place change. First having a new API and then
> >>>>>    gradually retiring the old one would help users to smoothly migrate
> >>>>>    between them.
> >>>>>
> >>>>> A thorough discussion is definitely needed once the FLIP is out. And
> >>>>> of course it's possible that the FLIP might be rejected. Given that
> >>>>> we are planning for release 2.0, I just feel it would be better to
> >>>>> bring this up early, even though the concrete plan is not yet ready.
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Xintong
> >>>>>
> >>>>>
> >>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>> [2]
> >>>>>
> >>>>>
> >>>>
> >>
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> >>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> >>>>>
> >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org>
> wrote:
> >>>>>
> >>>>>> Hey!
> >>>>>>
> >>>>>> I share the same concerns mentioned above regarding the
> >>>> "ProcessFunction
> >>>>>> API".
> >>>>>>
> >>>>>> I don't think we should create a replacement for the DataStream API
> >>>>> unless
> >>>>>> we have a very good reason to do so and with a proper discussion
> about
> >>>>> this
> >>>>>> as Alex said.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Gyula
> >>>>>>
> >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> >>>>>> alexander.fedulov@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Xintong,
> >>>>>>>
> >>>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> >>>>>> Introduce
> >>>>>>> an API deprecation process" thread [1]?
> >>>>>>>
> >>>>>>> I am also curious to know if the rationale behind this new API has
> >>>> been
> >>>>>>> previously discussed on the mailing list. Do we have a list of
> >>>>>> shortcomings
> >>>>>>> in the current DataStream API that it tries to resolve? How does
> the
> >>>>>>> current ProcessFunction functionality fit into the picture? Will it
> >>>> be
> >>>>>> kept
> >>>>>>> as is or subsumed by new API?
> >>>>>>>
> >>>>>>> [1]
> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Alex
> >>>>>>>
> >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> >>>>> because
> >>>>>>> it's
> >>>>>>>>> very unclear what it actually entails; like is it an entirely
> >>>>>> separate
> >>>>>>>> API
> >>>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
> >>>>> How
> >>>>>>>> much
> >>>>>>>>> will it share the internals with DataStream etc.; how does it
> >>>>> relate
> >>>>>> to
> >>>>>>>> the
> >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >>>> underneath).
> >>>>>>>> I totally understand your confusion. We started planning this
> after
> >>>>>>> kicking
> >>>>>>>> off the release 2.0, so there's still a lot to be explored and the
> >>>>> plan
> >>>>>>>> keeps changing.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>    - In the beginning, we planned to do an in-place refactor of
> >>>>>>> DataStream
> >>>>>>>>    API, until the API migration period is proposed.
> >>>>>>>>    - Then we want to make it an entirely separate API to
> >>>> DataStream,
> >>>>>> and
> >>>>>>>>    listed as a must-have for release 2.0 so that we can remove
> >>>>>> DataStream
> >>>>>>>> once
> >>>>>>>>    it's ready.
> >>>>>>>>    - However, depending on the outcome of the API compatibility
> >>>>>>> discussion
> >>>>>>>>    [1], we may not be able to remove DataStream in 2.0 anyway,
> >>>> which
> >>>>>>> means
> >>>>>>>> we
> >>>>>>>>    might need to re-evaluate the necessity of this item for 2.0.
> >>>>>>>>
> >>>>>>>> I'd say we wait a bit longer for the compatibility discussion [1]
> >>>> and
> >>>>>>>> decide the priority for this item afterwards.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>>
> >>>>>>>> Xintong
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> >>>> chesnay@apache.org
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> by-and-large I'm quite happy with the list of items.
> >>>>>>>>>
> >>>>>>>>> I'm curious as to why the "Disaggregated State Management" item
> >>>> is
> >>>>>>> marked
> >>>>>>>>> as a must-have; will it require changes that break something?
> >>>> What
> >>>>>>>> prevents
> >>>>>>>>> it from being added in 2.1?
> >>>>>>>>>
> >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> >>>>> default,
> >>>>>>> drop
> >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8"
> >>>> and
> >>>>> a
> >>>>>>>>> nice-to-have "Drop Java 11"?
> >>>>>>>>>
> >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
> >>>>> would
> >>>>>>> be
> >>>>>>>>> an entirely internal change, and could thus be an incremental
> >>>>> process
> >>>>>>>>> independent of major releases.
> >>>>>>>>> What is the actual scale of this item; how much are we actually
> >>>>>>>> re-writing?
> >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> >>>> must-have; i
> >>>>>>> think
> >>>>>>>>> I marked it down as nice-to-have only because it depends on
> >>>> another
> >>>>>>> item.
> >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> >>>>> because
> >>>>>>> it's
> >>>>>>>>> very unclear what it actually entails; like is it an entirely
> >>>>>> separate
> >>>>>>>> API
> >>>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
> >>>>> How
> >>>>>>>> much
> >>>>>>>>> will it share the internals with DataStream etc.; how does it
> >>>>> relate
> >>>>>> to
> >>>>>>>> the
> >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >>>> underneath).
> >>>>>>>>> There are a few items I added as ideas which don't have a
> >>>> priority
> >>>>>> yet;
> >>>>>>>>> would love to get some feedback on those.
> >>>>>>>>>
> >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> >>>>>>>>>
> >>>>>>>>> Hi devs,
> >>>>>>>>>
> >>>>>>>>> As previously discussed in [1], we had been collecting work item
> >>>>>>>> proposals
> >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> >>>>>>>>>
> >>>>>>>>>    - As we have passed the due date, I'd like to kindly remind
> >>>>>> everyone
> >>>>>>>> *not
> >>>>>>>>>    to add / remove items directly on the wiki page*. If needed,
> >>>>>> please
> >>>>>>>> post
> >>>>>>>>>    in this thread or reach out to the release managers instead.
> >>>>>>>>>    - I've reached out to some folks for clarifications about
> >>>> their
> >>>>>>>>>    proposals. Some of them mentioned that they can not yet tell
> >>>>>> whether
> >>>>>>>> we
> >>>>>>>>>    should do an item or not, and would need more time /
> >>>> discussions
> >>>>>> to
> >>>>>>>> make
> >>>>>>>>>    the decision. So I added a new symbol for items whose
> >>>> priorities
> >>>>>> are
> >>>>>>>> `TBD`.
> >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> >>>> must-have
> >>>>>>> items.
> >>>>>>>>> I've gone through the entire list of proposed items, and found
> >>>> most
> >>>>>> of
> >>>>>>>> them
> >>>>>>>>> make quite much sense. So I think an online sync might not be
> >>>>>> necessary
> >>>>>>>> for
> >>>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone can
> >>>>>>> comment
> >>>>>>>>> on how they think the list can be improved, followed by a VOTE to
> >>>>>>>> formally
> >>>>>>>>> make the decision.
> >>>>>>>>>
> >>>>>>>>> Any feedback and opinions, including but not limited to the
> >>>>> following
> >>>>>>>>> aspects, will be appreciated.
> >>>>>>>>>
> >>>>>>>>>    - Important items that are missing from the list
> >>>>>>>>>    - Concerns regarding the listed items or their priorities
> >>>>>>>>>
> >>>>>>>>> Looking forward to your feedback.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>>
> >>>>>>>>> Xintong
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> [1]
> >>>>
> >>
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> >>>>>>>>> [2]
> >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> >>>>>>>>>
> >>>>>>>>>
> >>>>
> >>>> --
> >>>> Best regards,
> >>>> Sergey
> >>>>
> >>
> >>
>
>

Re: Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
Thanks for the inputs, Matthias,

- FLINK-4503: Yes, this should be subsumed by "Deprecated
methods/fields/classes in DataStream", which doesn't really need any action
in 1.18. Sorry for overlooking it.

- FLINK-5875: Based on the JIRA descriptions, it seems this only makes
sense if we want to accept arrays as keys in DataStream. I'm not entirely
sure about the necessity and feasibility of the latter. To be specific, I'm
not aware of anyone complaining about not being able to use arrays as keys,
and I don't know whether there will be other negative effects by allowing
arrays as keys. That's why I think more investigations are needed for this,
thus putting it as TBD.

- FLINK-15470: I think you're right. The YARN properties file is not a
public API, but it is mentioned in documentation [1], and the removal of it
would lead to behavior changes (having to always specify the deployment
target). It would be nice to update the documentation and log warning
messages in advance.
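
For context on FLINK-5875, a minimal sketch in plain Java (no Flink classes)
of why arrays are awkward as keys: `equals()` and `hashCode()` on an array
are identity-based, so two arrays with equal contents are not treated as the
same key, while a content-based hash such as `Arrays.hashCode` agrees for
them. `ArrayKeySketch` and its methods are hypothetical names; how Flink's
`TypeComparator.hash()` handles arrays is not shown here.

```java
import java.util.Arrays;

final class ArrayKeySketch {
    /** Content-based hash: equal contents always give equal hashes. */
    static int contentHash(int[] key) {
        return Arrays.hashCode(key);
    }

    /** Content-based equality, the counterpart needed for a consistent key. */
    static boolean sameKey(int[] a, int[] b) {
        return Arrays.equals(a, b);
    }
}
```

Any partitioning that relies on `key.hashCode()` could therefore route
records with equal array contents to different partitions, which is
presumably what the restrictive workaround mentioned in this thread guards
against.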

Best,

Xintong


[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/yarn/#session-mode



On Thu, Jul 20, 2023 at 8:05 PM Matthias Pohl
<ma...@aiven.io.invalid> wrote:

> Sorry for the late reply in that matter. I was off the last few days. I
> should have made this clear in the ML. Anyway, I went over the issues as
> well. Xintong's summary matches more or less my findings aside from the
> following items:
>
> - FLINK-4503 (remove deprecated methods from CoGroupedStreams and
> JoinedStreams) was not mentioned in the above summary (AFAICS) but is
> most-likely subsumed by the deprecated DataStream API cleanup
> - FLINK-5875 (using TypeComparator.hash() instead of Object.hashCode())
> felt to me like a nice-to-have item because it fixes a bug that was treated
> with a restrictive workaround. But I see your point that it should have
> been raised in the ML if it would have been a bigger issue.
> - FLINK-15470 (remove YARN properties file): Shouldn't we add a log warning
> and update the documentation as part of 1.18 to make this issue happen? In
> this sense, I'd say that we should list FLINK-15470 under 1.18 changes
> necessary
>
> Best,
> Matthias
>
>
> On Wed, Jul 19, 2023 at 10:27 AM Martijn Visser <ma...@apache.org>
> wrote:
>
> > First off, good discussion on these topics.
> >
> > +1 on Xintong's latest proposal in this thread
> >
> > On Wed, Jul 19, 2023 at 5:16 AM Xintong Song <to...@gmail.com>
> > wrote:
> >
> >> I went through the remaining Jira tickets with 2.0.0 fix-version and are
> >> not included in FLINK-3975.
> >>
> >> I skipped the 3 umbrella tickets below and their subtasks, which are
> newly
> >> created for the 2.0 work items.
> >>
> >>    - FLINK-32377 Breaking REST API changes
> >>    - FLINK-32378 Breaking Metrics system changes
> >>    - FLINK-32383 2.0 Breaking configuration changes
> >>
> >> I'd suggest going ahead with the following tickets.
> >>
> >>    - Need action in 1.18
> >>       - FLINK-29739: Already listed in the release 2.0 wiki. Needs mark
> >> all
> >>       Scala APIs as deprecated.
> >>    - Need no action in 1.18
> >>       - FLINK-23620: Already listed in the release 2.0
> >>       - FLINK-15470/30246/32437: Behavior changes, no API to be
> deprecated
> >>
> >> I'd suggest not doing the following tickets.
> >>
> >>    - FLINK-11409: Subsumed by "Convert user-facing concrete classes into
> >>    interfaces" in the release 2.0 wiki
> >>
> >> I'd suggest leaving the following tickets as TBD, and would be slightly
> in
> >> favor of not doing them unless someone volunteers to look more into
> them.
> >>
> >>    - FLINK-10113 Drop support for pre 1.6 shared buffer state
> >>    - FLINK-10374 [Map State] Let user value serializer handle null
> values
> >>    - FLINK-13928 Make windows api more extendable
> >>    - FLINK-17539 Migrate the configuration options which do not follow
> the
> >>    xyz.max/min pattern
> >>
> >>
> >> Best,
> >>
> >> Xintong
> >>
> >>
> >>
> >> On Tue, Jul 18, 2023 at 5:20 PM Wencong Liu <li...@163.com>
> wrote:
> >>
> >> > Hi Chesnay,
> >> > Thanks for the reply. I think it is reasonable to remove the
> >> configuration
> >> > argument
> >> > in AbstractUdfStreamOperator#open if it is consistently empty. I'll
> >> > propose a discuss
> >> > about the specific actions in FLINK-6912 at a later time.
> >> >
> >> >
> >> > Best,
> >> > Wencong Liu
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > At 2023-07-18 16:38:59, "Chesnay Schepler" <ch...@apache.org>
> wrote:
> >> > >On 18/07/2023 10:33, Wencong Liu wrote:
> >> > >> For FLINK-6912:
> >> > >>
> >> > >>      There are three implementations of RichFunction that actually
> >> use
> >> > >> the Configuration parameter in RichFunction#open:
> >> > >>      1. ContinuousFileMonitoringFunction#open: It uses the
> >> configuration
> >> > >> to configure the FileInputFormat. [1]
> >> > >>      2. OutputFormatSinkFunction#open: It uses the configuration
> >> > >> to configure the OutputFormat. [2]
> >> > >>      3. InputFormatSourceFunction#open: It uses the configuration
> >> > >>   to configure the InputFormat. [3]
> >> > >
> >> > >And none of them should have any effect since the configuration is
> >> empty.
> >> > >
> >> > >See
> >> >
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.
> >> >
> >>
> >
>

Re: Re: [DISCUSS] Release 2.0 Work Items

Posted by Matthias Pohl <ma...@aiven.io.INVALID>.
Sorry for the late reply on that matter. I was off the last few days. I
should have made this clear in the ML. Anyway, I went over the issues as
well. Xintong's summary matches more or less my findings aside from the
following items:

- FLINK-4503 (remove deprecated methods from CoGroupedStreams and
JoinedStreams) was not mentioned in the above summary (AFAICS) but is
most-likely subsumed by the deprecated DataStream API cleanup
- FLINK-5875 (using TypeComparator.hash() instead of Object.hashCode())
felt to me like a nice-to-have item because it fixes a bug that was treated
with a restrictive workaround. But I see your point that it should have
been raised in the ML if it had been a bigger issue.
- FLINK-15470 (remove YARN properties file): Shouldn't we add a log warning
and update the documentation as part of 1.18 to prepare for this change? In
this sense, I'd say that we should list FLINK-15470 under the changes
necessary in 1.18.

Best,
Matthias


On Wed, Jul 19, 2023 at 10:27 AM Martijn Visser <ma...@apache.org>
wrote:

> First off, good discussion on these topics.
>
> +1 on Xintong's latest proposal in this thread
>
> On Wed, Jul 19, 2023 at 5:16 AM Xintong Song <to...@gmail.com>
> wrote:
>
>> I went through the remaining Jira tickets with 2.0.0 fix-version and are
>> not included in FLINK-3975.
>>
>> I skipped the 3 umbrella tickets below and their subtasks, which are newly
>> created for the 2.0 work items.
>>
>>    - FLINK-32377 Breaking REST API changes
>>    - FLINK-32378 Breaking Metrics system changes
>>    - FLINK-32383 2.0 Breaking configuration changes
>>
>> I'd suggest going ahead with the following tickets.
>>
>>    - Need action in 1.18
>>       - FLINK-29739: Already listed in the release 2.0 wiki. Needs mark
>> all
>>       Scala APIs as deprecated.
>>    - Need no action in 1.18
>>       - FLINK-23620: Already listed in the release 2.0
>>       - FLINK-15470/30246/32437: Behavior changes, no API to be deprecated
>>
>> I'd suggest not doing the following tickets.
>>
>>    - FLINK-11409: Subsumed by "Convert user-facing concrete classes into
>>    interfaces" in the release 2.0 wiki
>>
>> I'd suggest leaving the following tickets as TBD, and would be slightly in
>> favor of not doing them unless someone volunteers to look more into them.
>>
>>    - FLINK-10113 Drop support for pre 1.6 shared buffer state
>>    - FLINK-10374 [Map State] Let user value serializer handle null values
>>    - FLINK-13928 Make windows api more extendable
>>    - FLINK-17539 Migrate the configuration options which do not follow the
>>    xyz.max/min pattern
>>
>>
>> Best,
>>
>> Xintong
>>
>>
>>
>> On Tue, Jul 18, 2023 at 5:20 PM Wencong Liu <li...@163.com> wrote:
>>
>> > Hi Chesnay,
>> > Thanks for the reply. I think it is reasonable to remove the
>> configuration
>> > argument
>> > in AbstractUdfStreamOperator#open if it is consistently empty. I'll
>> > propose a discuss
>> > about the specific actions in FLINK-6912 at a later time.
>> >
>> >
>> > Best,
>> > Wencong Liu
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > At 2023-07-18 16:38:59, "Chesnay Schepler" <ch...@apache.org> wrote:
>> > >On 18/07/2023 10:33, Wencong Liu wrote:
>> > >> For FLINK-6912:
>> > >>
>> > >>      There are three implementations of RichFunction that actually
>> use
>> > >> the Configuration parameter in RichFunction#open:
>> > >>      1. ContinuousFileMonitoringFunction#open: It uses the
>> configuration
>> > >> to configure the FileInputFormat. [1]
>> > >>      2. OutputFormatSinkFunction#open: It uses the configuration
>> > >> to configure the OutputFormat. [2]
>> > >>      3. InputFormatSourceFunction#open: It uses the configuration
>> > >>   to configure the InputFormat. [3]
>> > >
>> > >And none of them should have any effect since the configuration is
>> empty.
>> > >
>> > >See
>> > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.
>> >
>>
>

Re: Re: [DISCUSS] Release 2.0 Work Items

Posted by Martijn Visser <ma...@apache.org>.
First off, good discussion on these topics.

+1 on Xintong's latest proposal in this thread

On Wed, Jul 19, 2023 at 5:16 AM Xintong Song <to...@gmail.com> wrote:

> I went through the remaining Jira tickets with 2.0.0 fix-version and are
> not included in FLINK-3975.
>
> I skipped the 3 umbrella tickets below and their subtasks, which are newly
> created for the 2.0 work items.
>
>    - FLINK-32377 Breaking REST API changes
>    - FLINK-32378 Breaking Metrics system changes
>    - FLINK-32383 2.0 Breaking configuration changes
>
> I'd suggest going ahead with the following tickets.
>
>    - Need action in 1.18
>       - FLINK-29739: Already listed in the release 2.0 wiki. Needs mark all
>       Scala APIs as deprecated.
>    - Need no action in 1.18
>       - FLINK-23620: Already listed in the release 2.0
>       - FLINK-15470/30246/32437: Behavior changes, no API to be deprecated
>
> I'd suggest not doing the following tickets.
>
>    - FLINK-11409: Subsumed by "Convert user-facing concrete classes into
>    interfaces" in the release 2.0 wiki
>
> I'd suggest leaving the following tickets as TBD, and would be slightly in
> favor of not doing them unless someone volunteers to look more into them.
>
>    - FLINK-10113 Drop support for pre 1.6 shared buffer state
>    - FLINK-10374 [Map State] Let user value serializer handle null values
>    - FLINK-13928 Make windows api more extendable
>    - FLINK-17539 Migrate the configuration options which do not follow the
>    xyz.max/min pattern
>
>
> Best,
>
> Xintong
>
>
>
> On Tue, Jul 18, 2023 at 5:20 PM Wencong Liu <li...@163.com> wrote:
>
> > Hi Chesnay,
> > Thanks for the reply. I think it is reasonable to remove the
> configuration
> > argument
> > in AbstractUdfStreamOperator#open if it is consistently empty. I'll
> > propose a discuss
> > about the specific actions in FLINK-6912 at a later time.
> >
> >
> > Best,
> > Wencong Liu
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > At 2023-07-18 16:38:59, "Chesnay Schepler" <ch...@apache.org> wrote:
> > >On 18/07/2023 10:33, Wencong Liu wrote:
> > >> For FLINK-6912:
> > >>
> > >>      There are three implementations of RichFunction that actually use
> > >> the Configuration parameter in RichFunction#open:
> > >>      1. ContinuousFileMonitoringFunction#open: It uses the
> configuration
> > >> to configure the FileInputFormat. [1]
> > >>      2. OutputFormatSinkFunction#open: It uses the configuration
> > >> to configure the OutputFormat. [2]
> > >>      3. InputFormatSourceFunction#open: It uses the configuration
> > >>   to configure the InputFormat. [3]
> > >
> > >And none of them should have any effect since the configuration is
> empty.
> > >
> > >See
> > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.
> >
>

Re: Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
I went through the remaining Jira tickets that have the 2.0.0 fix-version
and are not included in FLINK-3975.

I skipped the 3 umbrella tickets below and their subtasks, which are newly
created for the 2.0 work items.

   - FLINK-32377 Breaking REST API changes
   - FLINK-32378 Breaking Metrics system changes
   - FLINK-32383 2.0 Breaking configuration changes

I'd suggest going ahead with the following tickets.

   - Need action in 1.18
      - FLINK-29739: Already listed in the release 2.0 wiki. Need to mark all
      Scala APIs as deprecated.
   - Need no action in 1.18
      - FLINK-23620: Already listed in the release 2.0
      - FLINK-15470/30246/32437: Behavior changes, no API to be deprecated

I'd suggest not doing the following tickets.

   - FLINK-11409: Subsumed by "Convert user-facing concrete classes into
   interfaces" in the release 2.0 wiki

I'd suggest leaving the following tickets as TBD, and would be slightly in
favor of not doing them unless someone volunteers to look more into them.

   - FLINK-10113 Drop support for pre 1.6 shared buffer state
   - FLINK-10374 [Map State] Let user value serializer handle null values
   - FLINK-13928 Make windows api more extendable
   - FLINK-17539 Migrate the configuration options which do not follow the
   xyz.max/min pattern


Best,

Xintong



On Tue, Jul 18, 2023 at 5:20 PM Wencong Liu <li...@163.com> wrote:

> Hi Chesnay,
> Thanks for the reply. I think it is reasonable to remove the configuration
> argument
> in AbstractUdfStreamOperator#open if it is consistently empty. I'll
> propose a discuss
> about the specific actions in FLINK-6912 at a later time.
>
>
> Best,
> Wencong Liu
>
>
>
>
>
>
>
>
>
>
>
> At 2023-07-18 16:38:59, "Chesnay Schepler" <ch...@apache.org> wrote:
> >On 18/07/2023 10:33, Wencong Liu wrote:
> >> For FLINK-6912:
> >>
> >>      There are three implementations of RichFunction that actually use
> >> the Configuration parameter in RichFunction#open:
> >>      1. ContinuousFileMonitoringFunction#open: It uses the configuration
> >> to configure the FileInputFormat. [1]
> >>      2. OutputFormatSinkFunction#open: It uses the configuration
> >> to configure the OutputFormat. [2]
> >>      3. InputFormatSourceFunction#open: It uses the configuration
> >>   to configure the InputFormat. [3]
> >
> >And none of them should have any effect since the configuration is empty.
> >
> >See
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.
>

Re:Re: [DISCUSS] Release 2.0 Work Items

Posted by Wencong Liu <li...@163.com>.
Hi Chesnay,
Thanks for the reply. I think it is reasonable to remove the configuration argument
in AbstractUdfStreamOperator#open if it is consistently empty. I'll start a discussion
about the specific actions in FLINK-6912 at a later time.


Best,
Wencong Liu

At 2023-07-18 16:38:59, "Chesnay Schepler" <ch...@apache.org> wrote:
>On 18/07/2023 10:33, Wencong Liu wrote:
>> For FLINK-6912:
>>
>>      There are three implementations of RichFunction that actually use
>> the Configuration parameter in RichFunction#open:
>>      1. ContinuousFileMonitoringFunction#open: It uses the configuration
>> to configure the FileInputFormat. [1]
>>      2. OutputFormatSinkFunction#open: It uses the configuration
>> to configure the OutputFormat. [2]
>>      3. InputFormatSourceFunction#open: It uses the configuration
>>   to configure the InputFormat. [3]
>
>And none of them should have any effect since the configuration is empty.
>
>See org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.

Re: [DISCUSS] Release 2.0 Work Items

Posted by Chesnay Schepler <ch...@apache.org>.
On 18/07/2023 10:33, Wencong Liu wrote:
> For FLINK-6912:
>
>      There are three implementations of RichFunction that actually use
> the Configuration parameter in RichFunction#open:
>      1. ContinuousFileMonitoringFunction#open: It uses the configuration
> to configure the FileInputFormat. [1]
>      2. OutputFormatSinkFunction#open: It uses the configuration
> to configure the OutputFormat. [2]
>      3. InputFormatSourceFunction#open: It uses the configuration
>   to configure the InputFormat. [3]

And none of them should have any effect since the configuration is empty.

See org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.
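
To illustrate the point above, here is a minimal sketch, assuming the operator always hands a freshly created, empty configuration to the function. SimpleConfig, FormatFunction and UdfOperator are hypothetical stand-ins written for this example, not Flink classes, and the key "format.charset" is made up. Under that assumption, whatever the function reads in open() can only ever be the fallback default.

```java
import java.util.HashMap;
import java.util.Map;

/** Entry point of the sketch; all types below are hypothetical stand-ins, not Flink classes. */
final class EmptyConfigDemo {
    static String openedCharset() {
        // A job-level config is populated here, but it is never passed to the function.
        SimpleConfig jobConfig = new SimpleConfig();
        jobConfig.set("format.charset", "ISO-8859-1");

        FormatFunction function = new FormatFunction();
        new UdfOperator(function).open();
        return function.charset;
    }

    public static void main(String[] args) {
        System.out.println(openedCharset()); // prints "UTF-8", the fallback default
    }
}

/** Hypothetical key/value configuration. */
final class SimpleConfig {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) {
        values.put(key, value);
    }

    String get(String key, String defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }
}

/** Hypothetical user function that reads a setting from the argument of open(). */
final class FormatFunction {
    String charset;

    void open(SimpleConfig parameters) {
        charset = parameters.get("format.charset", "UTF-8");
    }
}

/** Hypothetical operator that, by assumption, always passes an empty configuration. */
final class UdfOperator {
    private final FormatFunction function;

    UdfOperator(FormatFunction function) {
        this.function = function;
    }

    void open() {
        function.open(new SimpleConfig());
    }
}
```

If this assumption matches the real code path, reading from the argument is equivalent to hard-coding the defaults, which is why the argument would carry no information.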

Re:Re: [DISCUSS] Release 2.0 Work Items

Posted by Wencong Liu <li...@163.com>.
Thanks Xintong Song and Matthias for the insightful discussion!


I have double-checked the jira tickets that belong to the 
"Need action in 1.18" section and have some inputs to share.

For FLINK-4675:

    The argument StreamExecutionEnvironment in WindowAssigner.getDefaultTrigger()
is not used in any implementation of WindowAssigner and is no longer needed.

For FLINK-6912:

    There are three implementations of RichFunction that actually use 
the Configuration parameter in RichFunction#open:
    1. ContinuousFileMonitoringFunction#open: It uses the configuration 
to configure the FileInputFormat. [1]
    2. OutputFormatSinkFunction#open: It uses the configuration 
to configure the OutputFormat. [2]
    3. InputFormatSourceFunction#open: It uses the configuration
 to configure the InputFormat. [3]
    I think RichFunction#open should still take a Configuration 
instance as an argument.

For FLINK-5336:

    There are three classes that de/serialize the Path through the IOReadableWritable
interface:
    1. FileSourceSplitSerializer: It de/serializes the Path during the process 
of de/serializing FileSourceSplit. [4]
    2. TestManagedSinkCommittableSerializer: It de/serializes the Path during 
the process of de/serializing TestManagedCommittable. [5]
    3. TestManagedFileSourceSplitSerializer: It de/serializes the Path during 
the process of de/serializing TestManagedIterableSourceSplit. [6]
    I think the Path should still implement the IOReadableWritable interface.


I plan to start a discussion about removing the argument in FLINK-4675 and
to comment the conclusions on FLINK-6912 and FLINK-5336. WDYT?
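
For the FLINK-4675 discussion, here is a rough sketch of one possible shape for "introduce a new no-argument API and mark the original one as @Deprecated". Env, Trigger, Assigner and LegacyAssigner are hypothetical stand-ins, not the real Flink types, and the actual design would be decided in the FLIP. The idea shown is that the old method stays abstract so existing subclasses keep compiling, while new callers use the no-argument variant.

```java
/** Entry point of the sketch; every type in this block is a hypothetical stand-in. */
final class DeprecationSketch {
    public static void main(String[] args) {
        Assigner legacy = new LegacyAssigner();
        // New callers use the no-argument method; the legacy override still serves it.
        System.out.println(legacy.getDefaultTrigger().name());
    }
}

/** Stand-in for the unused environment argument. */
final class Env {}

/** Stand-in for a trigger; only carries a name for demonstration. */
interface Trigger {
    String name();
}

abstract class Assigner {
    /** Old signature, kept during the migration period. The argument is assumed to be unused. */
    @Deprecated
    abstract Trigger getDefaultTrigger(Env env);

    /** New no-argument variant; by default it delegates to the old method. */
    Trigger getDefaultTrigger() {
        return getDefaultTrigger(null);
    }
}

/** An existing subclass that only knows the old signature and needs no code change. */
final class LegacyAssigner extends Assigner {
    @Override
    @Deprecated
    Trigger getDefaultTrigger(Env env) {
        return () -> "event-time";
    }
}
```

Passing null in the delegation is only safe under the assumption that no implementation reads the argument, which is exactly what needs to be double-checked before the deprecation.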

[1] https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/ContinuousFileMonitoringFunction.java#L199
[2] https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/OutputFormatSinkFunction.java#L63
[3] https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/InputFormatSourceFunction.java#L64C2-L64C2
[4] https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSourceSplitSerializer.java#L67

[5] https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-table/flink-table-common/src/test/java/org/apache/flink/table/connector/sink/TestManagedSinkCommittableSerializer.java#L113
[6] https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-table/flink-table-common/src/test/java/org/apache/flink/table/connector/source/TestManagedFileSourceSplitSerializer.java#L56

At 2023-07-17 12:23:51, "Xintong Song" <to...@gmail.com> wrote:
>Hi Matthias,
>
>How's it going with the summary of existing 2.0.0 jira tickets?
>
>I have gone through everything listed under FLINK-3957[1], and will
>continue with other Jira tickets whose fix-version is 2.0.0.
>
>Here are my two cents on the FLINK-3957 subtasks. Hope this helps with your
>summary.
>
>I'd suggest going ahead with the following tickets.
>
>   - Need action in 1.18
>      - FLINK-4675: Double-check whether the argument is indeed not used.
>      Introduce a new non-argument API, and mark the original one as
>      `@Deprecated`. FLIP needed.
>      - FLINK-6912: Double-check whether the argument is indeed not used.
>      Introduce a new non-argument API, and mark the original one as
>      `@Deprecated`. FLIP needed.
>      - FLINK-5336: Double-check whether `IOReadableWritable` is indeed not
>      needed for `Path`. Mark methods from `IOReadableWritable` as
>      `@Deprecated` in `Path`. FLIP needed.
>   - Need no action in 1.18
>      - FLINK-4602/14068: Already listed in the release 2.0 wiki [2]
>      - FLINK-3986/3991/3992/4367/5130/7691: Subsumed by "Deprecated
>      methods/fields/classes in DataStream" in the release 2.0 wiki [2]
>      - FLINK-6375: Change the hashCode behavior of `LongValue` (and other
>      numeric types).
>
>I'd suggest not doing the following tickets.
>
>   - FLINK-4147/4330/9529/14658: These changes are non-trivial for both
>   developers and users. Also, we are taking them into consideration designing
>   the new ProcessFunction API. I'd be in favor of letting users migrate to
>   the ProcessFunction API directly once it's ready, rather than forcing users
>   to adapt to the breaking changes twice.
>   - FLINK-3610: Only affects Scala API, which will soon be removed.
>
>I don't have strong opinions on whether to work on the following tickets or
>not. Some of them are not very clear to me based on the description and
>conversation on the ticket, others may require further investigation and
>evaluation to decide. Unless someone volunteers to look into them, I'd be
>slightly in favor of not doing them, as I'm not aware of them causing any
>serious problems.
>
>   - FLINK-3959 Remove implicit Sinks
>   - FLINK-4757 Unify "GlobalJobParameters" and "Configuration"
>   - FLINK-4758 Remove IOReadableWritable from classes where not needed
>   - FLINK-4971 Unify Stream Sinks and OutputFormats
>   - FLINK-5126 Remove Checked Exceptions from State Interfaces
>   - FLINK-5337 Introduce backwards compatible state to task assignment
>   - FLINK-5346 Remove all ad-hoc config loading via GlobalConfiguration
>   - FLINK-5875 Use TypeComparator.hash() instead of Object.hashCode() for
>   keying in DataStream API
>   - FLINK-9798 Drop canEqual() from TypeInformation, TypeSerializer, etc.
>   - FLINK-13926 `ProcessingTimeSessionWindows` and
>   `EventTimeSessionWindows` should be generic
>
>
>WDYT?
>
>Best,
>
>Xintong
>
>
>[1] https://issues.apache.org/jira/browse/FLINK-3957
>
>[2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>
>
>
>On Thu, Jul 13, 2023 at 10:31 AM li zhiqiang <li...@gmail.com>
>wrote:
>
>> @Xintong
>> I already have an idea of some of the API modifications, but because many
>> changes are involved, I am afraid my analysis is not comprehensive.
>> I'm willing to do the work, but I haven't found a committer yet.
>>
>> Best,
>> Zhiqiang
>>
>> From: Xintong Song <to...@gmail.com>
>> Date: Thursday, July 13, 2023 10:03
>> To: dev@flink.apache.org <de...@flink.apache.org>
>> Subject: Re: [DISCUSS] Release 2.0 Work Items
>> Thanks for the inputs, Zhiqiang and Jiabao.
>>
>> @Zhiqiang,
>> The proposal sounds interesting. Do you already have an idea what API
>> changes are needed in order to make the connectors pluggable? I think
>> whether this should go into Flink 2.0 would significantly depend on what
>> API changes are needed. Moreover, would you like to work on this effort or
>> simply raise a need? And if you'd like to work on this, do you already find
>> a committer who can help on this?
>>
>> @Jiabao,
>> Thanks for the suggestions. I agree that it would be nice to improve the
>> experiences in deploying Flink instances and submitting tasks. It would be
>> helpful if you can point out the specific behaviors that make integrating
>> Flink in your production difficult. Also, I'd like to understand how this
>> topic is related to the Release 2.0 topic. Or asked differently, is this
>> something that requires breaking changes that can only happen in major
>> version bumps, or is it just improvement that can go into any minor
>> version?
>>
>>
>> Best,
>>
>> Xintong
>>
>>
>>
>> On Thu, Jul 13, 2023 at 12:49 AM Jiabao Sun <jiabao.sun@xtransfer.cn
>> .invalid>
>> wrote:
>>
>> > Thanks Xintong for driving the effort.
>> >
>> >
>> > I’d add a +1 to improving out-of-box user experience, as suggested by
>> > @Jark and @Chesnay.
>> > For beginners, understanding complex configurations is hard work.
>> >
>> > In addition, deploying a Flink runtime environment is also a complex
>> > matter.
>> > At present, task submission still differs considerably across computing
>> > resources. Users who build their own data development platform need to
>> > understand these differences deeply when handling task submission and
>> > running-status checks.
>> >
>> > I'm glad to see features like flink-sql-gateway being implemented by the
>> > community because it makes it easy for users to submit Flink SQL tasks.
>> > Furthermore, can we provide more unified, out-of-the-box capabilities that
>> > allow users to quickly bring up a production-ready Flink environment and
>> > easily integrate Flink into their own data development platform?
>> >
>> >
>> > Best,
>> > Jiabao
>> >
>> >
>> > > On Jul 12, 2023, at 8:16 PM, zhiqiang li <li...@gmail.com> wrote:
>> > >
>> > > I have seen in [1] connectors and formats, and user code will be
>> > pluggable.
>> > > If the connectors are pluggable, the benefits are obvious, as the
>> > conflicts
>> > > between different jar package versions can be avoided.
>> > > If you don't use classloader isolation, shade is needed to resolve
>> > > conflicts. A lot of development time is wasted.
>> > > I know that this change may involve a lot of API changes, so I would
>> like
>> > > to discuss in this email whether we can make changes in Flink 2.0.
>> > > Plugins facilitate a strict separation of code through restricted
>> > > classloaders.
>> > >
>> > > Plugins cannot access classes from other plugins or from Flink that
>> have
>> > >> not been specifically whitelisted.
>> > >> This strict isolation allows plugins to contain conflicting versions
>> of
>> > >> the same library without the need to relocate classes or to converge
>> to
>> > >> common versions.
>> > >> Currently, file systems and metric reporters are pluggable *but in the
>> > >> future, connectors, formats, and even user code should also be
>> > pluggable.*
>> > >>
>> > >
>> > > [1]
>> > >
>> >
>> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/
>> > >
>> > > On Tue, Jul 11, 2023 at 6:50 PM Xintong Song <to...@gmail.com> wrote:
>> > >
>> > >>>
>> > >>> What we might want to come up with is a summary with each 2.0.0 issue
>> > on
>> > >>> why it should be included or not. That summary is something the
>> > community
>> > >>> could vote on. WDYT? I'm happy to help here.
>> > >>>
>> > >>
>> > >> That sounds great. Thanks for offering the help. I'll also try to go
>> > >> through the issues, but TBH I'm quite overwhelmed and cannot promise
>> to
>> > get
>> > >> this done very soon. Your help is very much needed.
>> > >>
>> > >>
>> > >> Best,
>> > >>
>> > >> Xintong
>> > >>
>> > >>
>> > >>
>> > >> On Tue, Jul 11, 2023 at 6:08 PM Matthias Pohl
>> > >> <ma...@aiven.io.invalid> wrote:
>> > >>
>> > >>> @Xintong I guess it makes sense. I agree with your conclusions on the
>> > >> four
>> > >>> mentioned Jira issues.
>> > >>>
>> > >>> I just checked any issues that have fixVersion = 2.0.0 [1]. There
>> are a
>> > >> few
>> > >>> more items that are not affiliated with FLINK-3957 [2]. I guess we
>> > should
>> > >>> find answers for these issues: Either closing them with a reason to
>> > have
>> > >> a
>> > >>> consistent state in Jira or adding them to the feature list as part
>> of
>> > a
>> > >>> separate voting thread (to leave the current vote untouched).
>> > >>>
>> > >>> What we might want to come up with is a summary with each 2.0.0 issue
>> > on
>> > >>> why it should be included or not. That summary is something the
>> > community
>> > >>> could vote on. WDYT? I'm happy to help here.
>> > >>>
>> > >>> Matthias
>> > >>>
>> > >>> [1]
>> > >>>
>> > >>>
>> > >>
>> >
>> https://issues.apache.org/jira/browse/FLINK-32437?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%202.0.0%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)%20%20
>> > >>> [2] https://issues.apache.org/jira/browse/FLINK-3957
>> > >>>
>> > >>>
>> > >>> On Tue, Jul 11, 2023 at 5:01 AM Xintong Song <to...@gmail.com>
>> > >>> wrote:
>> > >>>
>> > >>>> @Zhu,
>> > >>>> As you are downgrading "Clarify the scopes of configuration options"
>> > to
>> > >>>> nice-to-have priority, could you also bring that up in the vote
>> > >>> thread[1]?
>> > >>>> I'm asking because there are people who already voted on the
>> original
>> > >>> list.
>> > >>>> I think restarting the vote is probably an overkill and unnecessary,
>> > >> but
>> > >>> we
>> > >>>> should at least bring this change to their attention.
>> > >>>>
>> > >>>> @Matthias,
>> > >>>> Thanks a lot for bringing this up. I wasn't aware of this early
>> > >>> umbrella. I
>> > >>>> haven't gone through everything in FLINK-3957 yet. I'll do it asap.
>> > >>>>
>> > >>>> Just quickly went through the 4 issues you mentioned.
>> > >>>> - FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as
>> > >> long
>> > >>> as
>> > >>>> the new APIs that we want users to migrate to are ready. For these 2
>> > >>>> tickets, I think introduction of the updated APIs should be
>> > >>> straightforward
>> > >>>> and feasible for 1.18.
>> > >>>> - FLINK-13926: I'm not sure about this one. The two mentioned
>> classes
>> > >>>> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not
>> > >> even
>> > >>>> marked as Public or PublicEvolving APIs. Moreover, I don't see a
>> good
>> > >> way
>> > >>>> to smoothly replace the classes with a generic version.
>> > >>>> - FLINK-5126: This is a bit unclear to me. From the description and
>> > >>>> conversation on the ticket, I don't fully understand which concrete
>> > >> APIs
>> > >>>> the ticket is referring to. Or maybe it refers to all / most of the
>> > >> APIs
>> > >>>> that throw Exception / IOException in general. Moreover, I don't
>> > think
>> > >>>> removing Exception / IOException from the API signature is a
>> breaking
>> > >>>> change. It requires no code changes on the caller side.
>> > >>>>
>> > >>>> WDYT?
>> > >>>>
>> > >>>> Best,
>> > >>>>
>> > >>>> Xintong
>> > >>>>
>> > >>>>
>> > >>>> [1]
>> https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
>> > >>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
>> > >>>>
>> > >>>> On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
>> > >>>> <ma...@aiven.io.invalid> wrote:
>> > >>>>
>> > >>>>> I brought it up in the deprecating APIs in 1.18 thread [1] already
>> > >> but
>> > >>> it
>> > >>>>> feels misplaced there. I just wanted to ask whether someone did a
>> > >> pass
>> > >>>> over
>> > >>>>> FLINK-3957 [2]. I came across it when going through the release 2.0
>> > >>>> feature
>> > >>>>> list [3] as part of the vote. I have the feeling that there are
>> some
>> > >>>> valid
>> > >>>>> action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6]) which
>> > >> do
>> > >>>> not
>> > >>>>> seem to be listed in the 2.0 feature list [3], yet (or are included
>> > >> in
>> > >>>> some
>> > >>>>> of the bigger items). Majority of the subtasks are probably covered
>> > >> by
>> > >>>> the
>> > >>>>> DataSet removal, the Scala API removal and the ProcessFunction
>> > >>>> refactoring.
>> > >>>>> Other subtasks (FLINK-14068 [7]) made it into the feature list.
>> > >>>>>
>> > >>>>> I haven't worked with the SDK code that much so that I can judge
>> > >>> whether
>> > >>>>> the subtasks are still reasonable or actually obsolete. That is
>> why I
>> > >>>>> wanted to mention the Jira issue here once more.
>> > >>>>>
>> > >>>>> I don't consider it a blocker for the ongoing vote but was
>> wondering
>> > >>>>> whether it makes sense for someone who might have more experience
>> in
>> > >>> that
>> > >>>>> field to add some of the subtasks to the feature list.
>> > >>>>>
>> > >>>>> Or shall we just consider it as "not interesting enough" because
>> > >> nobody
>> > >>>>> added it in the first place to the 2.0 feature list [3]?
>> > >>>>>
>> > >>>>> Matthias
>> > >>>>>
>> > >>>>> [1]
>> https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
>> > >>>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
>> > >>>>> [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>> > >>>>> [4] https://issues.apache.org/jira/browse/FLINK-4675
>> > >>>>> [5] https://issues.apache.org/jira/browse/FLINK-5126
>> > >>>>> [6] https://issues.apache.org/jira/browse/FLINK-13926
>> > >>>>> [7] https://issues.apache.org/jira/browse/FLINK-14068
>> > >>>>>
>> > >>>>> On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
>> > >>>>>
>> > >>>>>> Agreed that we should deprecate affected APIs as soon as possible.
>> > >>>>>> But there is not much time before the feature freeze of 1.18,
>> > >> hence
>> > >>>>>> I'm a bit concerned that some of the deprecations might not be done
>> > >>>>>> in 1.18.
>> > >>>>>>
>> > >>>>>> We are currently looking into the improvements of the
>> configuration
>> > >>>>> layer.
>> > >>>>>> Most of the proposed changes would require a public discussion, or
>> > >>> even
>> > >>>>>> a FLIP, which I think can hardly close before the feature freeze
>> of
>> > >>>> 1.18.
>> > >>>>>> And some of the APIs can be deprecated only after the
>> corresponding
>> > >>> new
>> > >>>>>> APIs are developed. Therefore we previously targeted them for
>> 1.19.
>> > >>>>>>
>> > >>>>>> We may review later to see what deprecation work can be done in
>> > >> 1.18
>> > >>>> and
>> > >>>>>> make it if possible. I think we can do the work even after the
>> > >>> feature
>> > >>>>>> freeze
>> > >>>>>> date, if it is a purely deprecation work (simply adding
>> > >> annotations).
>> > >>>>> WDYT?
>> > >>>>>>
>> > >>>>>> I'm also changing the priority of "Clarify the scopes of configuration
>> > >>>>>> options" to nice-to-have. I think most of the work is not a breaking
>> > >>>>>> change and can be done in 1.x or 2.1+. For the breaking changes that
>> > >>>>>> might be needed, we will consider them as part of the configuration
>> > >>>>>> layer rework.
>> > >>>>>>
>> > >>>>>> Thanks,
>> > >>>>>> Zhu
>> > >>>>>>
>> > >>>>>> On Mon, Jul 10, 2023 at 7:58 PM Xintong Song <to...@gmail.com> wrote:
>> > >>>>>>>
>> > >>>>>>>>
>> > >>>>>>>> At what point are the FLIP discussions coming into play?
>> > >>>>>>>
>> > >>>>>>> I keep wondering if these shouldn't have started already.
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>> I think this depends on the responsible contributor and reviewer
>> > >> of
>> > >>>>>>> individual items. From my perspective, the FLIP discussions can
>> > >>> start
>> > >>>>> any
>> > >>>>>>> time as long as the contributors are ready, the earlier the
>> > >> better.
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>> What we need to ensure is that all breaking API changes are
>> > >>>>>>>> discussed/decided before 1.18 is released so we can deprecate
>> > >>>>> affected
>> > >>>>>> APIs.
>> > >>>>>>>>
>> > >>>>>>>
>> > >>>>>>> The introduction of the migration period has brought the
>> > >>> requirement
>> > >>>> to
>> > >>>>>>> plan the removal of public APIs 2 minor releases ahead of the
>> > >> major
>> > >>>>>>> release, which is TBH a bit unexpected. I agree it would be nice
>> > >> if
>> > >>>> we
>> > >>>>>> can
>> > >>>>>>> get the FLIPs ready by releasing 1.18. But I also don't think we
>> > >>>> should
>> > >>>>>>> rush on it. If the deprecation of a Public API does not make
>> > >> 1.18,
>> > >>> we
>> > >>>>> may
>> > >>>>>>> carry it until 3.0. Or if there are many Public APIs whose
>> > >>>> deprecation
>> > >>>>>> does
>> > >>>>>>> not make 1.18, we may deprecate them in 1.19 and postpone the
>> > >> major
>> > >>>>>> version
>> > >>>>>>> bump to after a 1.20 release. Moreover, as mentioned in
>> > >>> FLIP-321[1],
>> > >>>>>>> exceptions are discussable given that the migration period is
>> > >> newly
>> > >>>>>>> proposed and we did not give developers the chance to plan things
>> > >>>>> ahead.
>> > >>>>>> To
>> > >>>>>>> sum up, I'd say we try identify APIs that need to be deprecated
>> > >> in
>> > >>>> 1.18
>> > >>>>>>> with best efforts, and evaluate the remaining options (carrying
>> > >> the
>> > >>>> API
>> > >>>>>> for
>> > >>>>>>> the entire 2.x cycle, postpone 2.0, or making an exception)
>> > >>>>> case-by-case.
>> > >>>>>>> WDYT?
>> > >>>>>>>
>> > >>>>>>> Best,
>> > >>>>>>>
>> > >>>>>>> Xintong
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>> [1]
>> > >>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>> > >>>>>>>
>> > >>>>>>> On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <
>> > >>> chesnay@apache.org
>> > >>>>>
>> > >>>>>> wrote:
>> > >>>>>>>
>> > >>>>>>>> At what point are the FLIP discussions coming into play?
>> > >>>>>>>>
>> > >>>>>>>> I keep wondering if these shouldn't have started already.
>> > >>>>>>>> It just seems that a lot of decisions are implicitly reliant on
>> > >>> the
>> > >>>>>>>> items even being accepted.
>> > >>>>>>>> Estimates can only be provided if we actually know the scope of
>> > >>> the
>> > >>>>>>>> change, but that's not always clear from the description in the
>> > >>>> doc.
>> > >>>>>>>>
>> > >>>>>>>> What we need to ensure is that all breaking API changes are
>> > >>>>>>>> discussed/decided before 1.18 is released so we can deprecate
>> > >>>>> affected
>> > >>>>>>>> APIs.
>> > >>>>>>>>
>> > >>>>>>>> On 10/07/2023 11:32, Xintong Song wrote:
>> > >>>>>>>>> Hi Matthias,
>> > >>>>>>>>>
>> > >>>>>>>>> The questions you asked are indeed very important. Here're
>> > >> some
>> > >>>>> quick
>> > >>>>>>>>> responses, based on the plans I had in mind, which I have not
>> > >>>>> aligned
>> > >>>>>>>> with
>> > >>>>>>>>> other release managers yet.
>> > >>>>>>>>>
>> > >>>>>>>>> In the previous discussions between the RMs, we were not able
>> > >>> to
>> > >>>>> make
>> > >>>>>>>>> proposals on things like how to make a time plan, how to
>> > >> manage
>> > >>>> the
>> > >>>>>>>> release
>> > >>>>>>>>> branch, etc., due to the lack of inputs on e.g., the work
>> > >> items
>> > >>>>> need
>> > >>>>>> to
>> > >>>>>>>> be
>> > >>>>>>>>> included (which transitively depends on the API compatibility
>> > >>> to
>> > >>>>>> provide
>> > >>>>>>>>> between major versions) and the workloads / time needed for
>> > >>> them.
>> > >>>>>> With
>> > >>>>>>>> the
>> > >>>>>>>>> recent discussions, we have collected at least the majority
>> > >> of
>> > >>>> the
>> > >>>>>> inputs
>> > >>>>>>>>> needed.
>> > >>>>>>>>>
>> > >>>>>>>>> Here are things that I think we as the release managers would
>> > >>> do
>> > >>>>> next
>> > >>>>>>>>> (again, not aligned with other release managers yet)
>> > >>>>>>>>> - Creating a time plan, by reaching out to people to
>> > >> understand
>> > >>>> the
>> > >>>>>>>>> estimated workloads, prerequisites and ETA of each work item.
>> > >>>>>>>>> - Make a proposal on how to manage the release branch, i.e.,
>> > >>> when
>> > >>>>> to
>> > >>>>>> cut
>> > >>>>>>>>> the branch and whether to ship the milestone releases, etc.
>> > >>>>>>>>> - Set-up regular release syncs (bi-weekly / monthly) to
>> > >> update
>> > >>>> the
>> > >>>>>> status
>> > >>>>>>>>> and draw attention to where help is needed.
>> > >>>>>>>>>
>> > >>>>>>>>> So back to your questions.
>> > >>>>>>>>>
>> > >>>>>>>>> There are still to-be-discussed items in the list of
>> > >> features.
>> > >>>>>> What's the
>> > >>>>>>>>>> plan with those?
>> > >>>>>>>>> When collecting ETA, for items that the completion time
>> > >> cannot
>> > >>>> yet
>> > >>>>> be
>> > >>>>>>>>> estimated, we would like to have at least a time by which the
>> > >>>>>> estimation
>> > >>>>>>>>> can be made. I think the same applies to the to-be-discussed
>> > >>>> items.
>> > >>>>>> And
>> > >>>>>>>> if
>> > >>>>>>>>> the items should be included as must-haves, we would need
>> > >>> another
>> > >>>>>> vote to
>> > >>>>>>>>> adjust the must-have item list.
>> > >>>>>>>>>
>> > >>>>>>>>> Some of them don't have anyone assigned.
>> > >>>>>>>>> My concern is that they will be overlooked because nobody
>> > >> feels
>> > >>>> to
>> > >>>>>> be in
>> > >>>>>>>>>> charge.
>> > >>>>>>>>> This is a tricky one. For must-have items without assignees,
>> > >> we
>> > >>>> as
>> > >>>>>> the
>> > >>>>>>>>> release managers should be responsible for raising them up in
>> > >>> the
>> > >>>>>> release
>> > >>>>>>>>> syncs, and try to find assignees for them. Hopefully, there
>> > >>> will
>> > >>>> be
>> > >>>>>>>> someone
>> > >>>>>>>>> who stands out. But it is possible that for a must-have item
>> > >>>> nobody
>> > >>>>>> wants
>> > >>>>>>>>> to work on it. If that happens, which I don't think it will,
>> > >> it
>> > >>>>>> probably
>> > >>>>>>>>> means the item is not that critical and we may have to
>> > >> exclude
>> > >>> it
>> > >>>>>> from
>> > >>>>>>>> the
>> > >>>>>>>>> release. Either way, they should not be overlooked, because
>> > >>> IMHO
>> > >>>>>> release
>> > >>>>>>>>> managers should be responsible for trying to get someone to
>> > >>> work
>> > >>>> on
>> > >>>>>> the
>> > >>>>>>>>> un-assigned items.
>> > >>>>>>>>>
>> > >>>>>>>>> We'll have more discussions soon and keep the community
>> > >>> updated.
>> > >>>>>>>>>
>> > >>>>>>>>> Best,
>> > >>>>>>>>>
>> > >>>>>>>>> Xintong
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>> On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
>> > >>>>>>>>> <ma...@aiven.io.invalid> wrote:
>> > >>>>>>>>>
>> > >>>>>>>>>> Now that the vote is started on the must-have items: There
>> > >> are
>> > >>>>> still
>> > >>>>>>>>>> to-be-discussed items in the list of features. What's the
>> > >> plan
>> > >>>>> with
>> > >>>>>>>> those?
>> > >>>>>>>>>> Some of them don't have anyone assigned. Were these items
>> > >>>>> discussed
>> > >>>>>>>> among
>> > >>>>>>>>>> the release managers? So far, it looks like they are handled
>> > >>> as
>> > >>>>>>>>>> nice-to-have if someone volunteers to pick them up?
>> > >>>>>>>>>>
>> > >>>>>>>>>> My concern is that they will be overlooked because nobody
>> > >>> feels
>> > >>>> to
>> > >>>>>> be in
>> > >>>>>>>>>> charge.
>> > >>>>>>>>>>
>> > >>>>>>>>>> Best,
>> > >>>>>>>>>> Matthias
>> > >>>>>>>>>>
>> > >>>>>>>>>> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <
>> > >>>>> tonysong820@gmail.com
>> > >>>>>>>
>> > >>>>>>>>>> wrote:
>> > >>>>>>>>>>
>> > >>>>>>>>>>> Thanks all for the discussion.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> The wiki has been updated as discussed. I'm starting a vote
>> > >>>> now.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> Best,
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> Xintong
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <
>> > >>>>> tonysong820@gmail.com
>> > >>>>>>>
>> > >>>>>>>>>> wrote:
>> > >>>>>>>>>>>> Hi ConradJam,
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> I think Chesnay has already put his name as the
>> > >> Contributor
>> > >>>> for
>> > >>>>>> the
>> > >>>>>>>> two
>> > >>>>>>>>>>>> tasks you listed. Maybe you can reach out to him to see if
>> > >>> you
>> > >>>>> can
>> > >>>>>>>>>>>> collaborate on this.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> In general, I don't think contributing to a release 2.0
>> > >>> issue
>> > >>>> is
>> > >>>>>> much
>> > >>>>>>>>>>>> different from contributing to a regular issue. We haven't
>> > >>> yet
>> > >>>>>> created
>> > >>>>>>>>>>> JIRA
>> > >>>>>>>>>>>> tickets for all the listed tasks because many of them need further
>> > >>>>>>>>>>>> discussions and / or FLIPs to decide whether and how they should be
>> > >>>>>>>>>>>> performed.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> Best,
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> Xintong
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <
>> > >>>> jam.gzczy@gmail.com>
>> > >>>>>>>> wrote:
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>>> Hi Community:
>> > >>>>>>>>>>>>>   I see some tasks in the 2.0 list that haven't been
>> > >>>> assigned
>> > >>>>>> yet. I
>> > >>>>>>>>>>> want
>> > >>>>>>>>>>>>> to take the initiative to take on some tasks that I can
>> > >>>>>> complete. How
>> > >>>>>>>>>>> do I
>> > >>>>>>>>>>>>> apply to the community for this part of the task? I am
>> > >>>>>> interested in
>> > >>>>>>>>>> the
>> > >>>>>>>>>>>>> following parts of FLINK-32377
>> > >>>>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-32377>. Do I need to
>> > >>>>>>>>>>>>> create the issues myself and assign them to myself?
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> - the current timestamp, which is problematic w.r.t.
>> > >>> caching
>> > >>>>> and
>> > >>>>>>>>>>> testing,
>> > >>>>>>>>>>>>> while providing no value.
>> > >>>>>>>>>>>>> - Remove JarRequestBody#programArgs in favor of
>> > >>>>> #programArgsList.
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-32377
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>>> Thanks Xintong for driving the effort.
>> > >>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>> I’d add a +1 to reworking configs, as suggested by @Jark
>> > >>> and
>> > >>>>>>>>>> @Chesnay,
>> > >>>>>>>>>>>>>> especially the types. We have various configs that
>> > >> encode
>> > >>>>> Time /
>> > >>>>>>>>>>>>> MemorySize
>> > >>>>>>>>>>>>>> that are Long instead!
>> > >>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>> Regards,
>> > >>>>>>>>>>>>>> Hong
>> > >>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <
>> > >>> yuanmei.work@gmail.com
>> > >>>>>
>> > >>>>>>>>>> wrote:
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> Thanks for driving this effort, Xintong!
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> To Chesnay
>> > >>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
>> > >>> Management"
>> > >>>>>> item
>> > >>>>>>>>>> is
>> > >>>>>>>>>>>>>>>> marked as a must-have; will it require changes that
>> > >>> break
>> > >>>>>>>>>>> something?
>> > >>>>>>>>>>>>>>>> What prevents it from being added in 2.1?
>> > >>>>>>>>>>>>>>> As to "Disaggregated State Management".
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> We plan to provide a new type of state backend to
>> > >> support
>> > >>>> DFS
>> > >>>>>> as
>> > >>>>>>>>>>>>> primary
>> > >>>>>>>>>>>>>>> storage.
>> > >>>>>>>>>>>>>>> To achieve this, we at least need to include two parts
>> > >> of
>> > >>>>>> amends
>> > >>>>>>>>>>> (not
>> > >>>>>>>>>>>>>>> entirely sure yet, since we are still in the designing
>> > >>> and
>> > >>>>>>>>>> prototype
>> > >>>>>>>>>>>>>> phase)
>> > >>>>>>>>>>>>>>> 1. Statebackend Change
>> > >>>>>>>>>>>>>>> 2. State Access Change
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> Not all of the interfaces related are `@Internal`. Some
>> > >>> of
>> > >>>>> the
>> > >>>>>>>>>>>>> interfaces
>> > >>>>>>>>>>>>>>> like `StateBackend` is `@PublicEvolving`
>> > >>>>>>>>>>>>>>> So, you are right in the sense that "Disaggregated
>> > >> State
>> > >>>>>>>>>> Management"
>> > >>>>>>>>>>>>>> itself
>> > >>>>>>>>>>>>>>> probably does not need to be a "Must Have"
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> But I was hoping changes that related to public APIs
>> > >> can
>> > >>> be
>> > >>>>>>>>>>> finalized
>> > >>>>>>>>>>>>> and
>> > >>>>>>>>>>>>>>> merged in Flink 2.0 (I will fix the wiki accordingly).
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> I also agree with Jark that 2.0 is a good chance to
>> > >>> rework
>> > >>>>> the
>> > >>>>>>>>>>> default
>> > >>>>>>>>>>>>>>> value of configurations.
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> Best
>> > >>>>>>>>>>>>>>> Yuan
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
>> > >>>>>>>>>>> chesnay@apache.org>
>> > >>>>>>>>>>>>>> wrote:
>> > >>>>>>>>>>>>>>>> Something else configuration-related is that there
>> > >> are a
>> > >>>>>> bunch of
>> > >>>>>>>>>>>>>>>> options where the type isn't quite correct (e.g., a
>> > >>> String
>> > >>>>>> where
>> > >>>>>>>>>> it
>> > >>>>>>>>>>>>>>>> could be an enum, a string where it should be an int
>> > >> or
>> > >>>>>>>>>> something).
>> > >>>>>>>>>>>>>>>> Could do a pass over those as well.
>> > >>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
>> > >>>>>>>>>>>>>>>>> Hi,
>> > >>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>> I think one more thing we need to consider to do in
>> > >> 2.0
>> > >>>> is
>> > >>>>>>>>>>> changing
>> > >>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>> default value of configuration to improve out-of-box
>> > >>> user
>> > >>>>>>>>>>>>> experience.
>> > >>>>>>>>>>>>>>>>> Currently, in order to run a Flink job, users may
>> > >> need
>> > >>> to
>> > >>>>> set
>> > >>>>>>>>>>>>>>>>> a bunch of configurations, such as minibatch,
>> > >>> checkpoint
>> > >>>>>>>>>> interval,
>> > >>>>>>>>>>>>>>>>> exactly-once,
>> > >>>>>>>>>>>>>>>>> incremental-checkpoint, etc. It's very verbose and
>> > >> hard
>> > >>>> to
>> > >>>>>> use
>> > >>>>>>>>>> for
>> > >>>>>>>>>>>>>>>>> beginners.
>> > >>>>>>>>>>>>>>>>> Most of them can have a universally applicable value.
>> > >>>>>> Because
>> > >>>>>>>>>>>>> changing
>> > >>>>>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>> default value is a breaking change. I think It's
>> > >> worth
>> > >>>>>>>>>> considering
>> > >>>>>>>>>>>>>>>> changing
>> > >>>>>>>>>>>>>>>>> them in 2.0.
>> > >>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>> What do you think?
>> > >>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>> Best,
>> > >>>>>>>>>>>>>>>>> Jark
>> > >>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
>> > >>>>>>>>>>> snuyanzin@gmail.com>
>> > >>>>>>>>>>>>>>>> wrote:
>> > >>>>>>>>>>>>>>>>>> Hi Chesnay
>> > >>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would
>> > >> hope
>> > >>>>> that
>> > >>>>>>>>>> this
>> > >>>>>>>>>>>>> would
>> > >>>>>>>>>>>>>>>> be
>> > >>>>>>>>>>>>>>>>>>> an entirely internal change, and could thus be an
>> > >>>>>> incremental
>> > >>>>>>>>>>>>> process
>> > >>>>>>>>>>>>>>>>>>> independent of major releases.
>> > >>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much are
>> > >>> we
>> > >>>>>>>>>> actually
>> > >>>>>>>>>>>>>>>>>> re-writing?
>> > >>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>> Thanks for asking
>> > >>>>>>>>>>>>>>>>>> yes, you're right, that should be internal change.
>> > >>>>>>>>>>>>>>>>>> Yeah I was also thinking about incremental change
>> > >>> (rule
>> > >>>> by
>> > >>>>>> rule
>> > >>>>>>>>>>> or
>> > >>>>>>>>>>>>>>>>>> reasonable small group of rules).
>> > >>>>>>>>>>>>>>>>>> And yes, this could be an independent (on major
>> > >>> release)
>> > >>>>>>>>>> activity
>> > >>>>>>>>>>>>>>>>>> The problem is actually for children of RelOptRule.
>> > >>>>>>>>>>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
>> > >>>>>> mentioned
>> > >>>>>>>>>>>>>> deprecated
>> > >>>>>>>>>>>>>>>>>> api.
>> > >>>>>>>>>>>>>>>>>> There are also children of ConverterRule (50+) which
>> > >>> do
>> > >>>>> not
>> > >>>>>>>>>> have
>> > >>>>>>>>>>>>> such
>> > >>>>>>>>>>>>>>>>>> issues.
>> > >>>>>>>>>>>>>>>>>> Maybe it could be considered as the next step to
>> > >> have
>> > >>>> all
>> > >>>>>> the
>> > >>>>>>>>>>>>> rules in
>> > >>>>>>>>>>>>>>>>>> Java.
>> > >>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
>> > >>>>>>>>>>>>> tonysong820@gmail.com>
>> > >>>>>>>>>>>>>>>>>> wrote:
>> > >>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> Hi Alex & Gyula,
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the
>> > >>> "[DISCUSS]
>> > >>>>>>>>>> FLIP-321:
>> > >>>>>>>>>>>>>>>>>> Introduce
>> > >>>>>>>>>>>>>>>>>>>> an API deprecation process" thread [1]?
>> > >>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just
>> > >> noticed
>> > >>> I
>> > >>>>>> pasted
>> > >>>>>>>>>>> the
>> > >>>>>>>>>>>>>> wrong
>> > >>>>>>>>>>>>>>>>>> url
>> > >>>>>>>>>>>>>>>>>>> in my previous email. Sorry for the mistake.
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind
>> > >>> this
>> > >>>>> new
>> > >>>>>> API
>> > >>>>>>>>>>> has
>> > >>>>>>>>>>>>>> been
>> > >>>>>>>>>>>>>>>>>>>> previously discussed on the mailing list. Do we
>> > >>> have a
>> > >>>>>> list
>> > >>>>>>>>>> of
>> > >>>>>>>>>>>>>>>>>>> shortcomings
>> > >>>>>>>>>>>>>>>>>>>> in the current DataStream API that it tries to
>> > >>>> resolve?
>> > >>>>>> How
>> > >>>>>>>>>>> does
>> > >>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into the
>> > >>>>>> picture?
>> > >>>>>>>>>>> Will
>> > >>>>>>>>>>>>> it
>> > >>>>>>>>>>>>>> be
>> > >>>>>>>>>>>>>>>>>>> kept
>> > >>>>>>>>>>>>>>>>>>>> as is or subsumed by new API?
>> > >>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for
>> > >> the
>> > >>>>>>>>>> DataStream
>> > >>>>>>>>>>>>> API
>> > >>>>>>>>>>>>>>>>>> unless
>> > >>>>>>>>>>>>>>>>>>>> we have a very good reason to do so and with a
>> > >>> proper
>> > >>>>>>>>>>> discussion
>> > >>>>>>>>>>>>>> about
>> > >>>>>>>>>>>>>>>>>>> this
>> > >>>>>>>>>>>>>>>>>>>> as Alex said.
>> > >>>>>>>>>>>>>>>>>>> The ProcessFunction API which is targeting to
>> > >> replace
>> > >>>>>>>>>> DataStream
>> > >>>>>>>>>>>>> API
>> > >>>>>>>>>>>>>> is
>> > >>>>>>>>>>>>>>>>>>> still a proposal, not a decision. Sorry for the
>> > >>>>> confusion,
>> > >>>>>> I
>> > >>>>>>>>>>>>> should
>> > >>>>>>>>>>>>>>>> have
>> > >>>>>>>>>>>>>>>>>>> been more careful with my words, not giving the
>> > >>>>> impression
>> > >>>>>>>>>> that
>> > >>>>>>>>>>>>> this
>> > >>>>>>>>>>>>>> is
>> > >>>>>>>>>>>>>>>>>>> something we'll do anyway.
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> There will be a FLIP describing the motivations and
>> > >>>>>> designs in
>> > >>>>>>>>>>>>>> detail,
>> > >>>>>>>>>>>>>>>>>> for
>> > >>>>>>>>>>>>>>>>>>> the community to discuss and vote on. We are still
>> > >>>>> working
>> > >>>>>> on
>> > >>>>>>>>>>> it.
>> > >>>>>>>>>>>>>> TBH,
>> > >>>>>>>>>>>>>>>>>> this
>> > >>>>>>>>>>>>>>>>>>> is not trivial and we would need more time on it.
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> Just to quickly share some backgrounds:
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>    - We see quite some problems with the current
>> > >>>>>> DataStream
>> > >>>>>>>>>> APIs
>> > >>>>>>>>>>>>>>>>>>>       - Users are working with concrete classes
>> > >>> rather
>> > >>>>>> than
>> > >>>>>>>>>>>>>>>> interfaces,
>> > >>>>>>>>>>>>>>>>>>>       which means
>> > >>>>>>>>>>>>>>>>>>>       - Users can access methods that are designed
>> > >>> to
>> > >>>> be
>> > >>>>>> used
>> > >>>>>>>>>> by
>> > >>>>>>>>>>>>>>>> internal
>> > >>>>>>>>>>>>>>>>>>>          classes, even though they are annotated
>> > >>> with
>> > >>>>>>>>>>> `@Internal`.
>> > >>>>>>>>>>>>>>>> E.g.,
>> > >>>>>>>>>>>>>>>>>>>          `DataStream#getTransformation`.
>> > >>>>>>>>>>>>>>>>>>>          - Changes to the non-API implementations
>> > >>>> (e.g.,
>> > >>>>>>>>>>>>>>>>>> `Transformation`)
>> > >>>>>>>>>>>>>>>>>>>          would affect the API classes (e.g.,
>> > >>>>>> `DataStream`),
>> > >>>>>>>>>>> which
>> > >>>>>>>>>>>>>>>>>>> makes it hard to
>> > >>>>>>>>>>>>>>>>>>>          provide binary compatibility.
>> > >>>>>>>>>>>>>>>>>>>       - Internal classes are used as parameter /
>> > >>>>>> return-value
>> > >>>>>>>>>> of
>> > >>>>>>>>>>>>>>>> public
>> > >>>>>>>>>>>>>>>>>>>       APIs. E.g., while `AbstractStreamOperator`
>> > >> is
>> > >>>>>>>>>>>>> PublicEvolving,
>> > >>>>>>>>>>>>>>>>>>> `StreamTask`
>> > >>>>>>>>>>>>>>>>>>>       which returns from
>> > >>>>>>>>>>>>> `AbstractStreamOperator#getContainingTask`
>> > >>>>>>>>>>>>>> is
>> > >>>>>>>>>>>>>>>>>>> Internal.
>> > >>>>>>>>>>>>>>>>>>>       - In many cases, users are asked to extend
>> > >> the
>> > >>>> API
>> > >>>>>>>>>>> classes,
>> > >>>>>>>>>>>>>>>> rather
>> > >>>>>>>>>>>>>>>>>>>       than implementing interfaces. E.g.,
>> > >>>>>>>>>>>>> `AbstractStreamOperator`.
>> > >>>>>>>>>>>>>>>>>>>          - Any changes to the base classes, even
>> > >> the
>> > >>>>>> internal
>> > >>>>>>>>>>>>> part,
>> > >>>>>>>>>>>>>>>> may
>> > >>>>>>>>>>>>>>>>>>>          affect the behavior of the user-provided
>> > >>>>>> sub-classes
>> > >>>>>>>>>>>>>>>>>>>          - Users can override the behavior of the
>> > >>> base
>> > >>>>>> classes
>> > >>>>>>>>>>>>>>>>>>>       - The API module `flink-streaming-java`
>> > >>> contains
>> > >>>>>> non-API
>> > >>>>>>>>>>>>>>>> classes,
>> > >>>>>>>>>>>>>>>>>> and
>> > >>>>>>>>>>>>>>>>>>>       depends on internal modules such as
>> > >>>>> `flink-runtime`,
>> > >>>>>>>>>> which
>> > >>>>>>>>>>>>>> means
>> > >>>>>>>>>>>>>>>>>>>       - Changes to the internal modules may affect
>> > >>> the
>> > >>>>> API
>> > >>>>>>>>>>>>> modules,
>> > >>>>>>>>>>>>>>>> which
>> > >>>>>>>>>>>>>>>>>>>          requires users to re-build their
>> > >>> applications
>> > >>>>>> upon
>> > >>>>>>>>>>>>> upgrading
>> > >>>>>>>>>>>>>>>>>>>          - The artifact user needs for building
>> > >>> their
>> > >>>>>>>>>>> application
>> > >>>>>>>>>>>>>>>> larger
>> > >>>>>>>>>>>>>>>>>>>          than necessary.
>> > >>>>>>>>>>>>>>>>>>>       - We probably should not expose operators
>> > >>> (e.g.,
>> > >>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator`) to users.
>> > >> Functions
>> > >>>>>> should be
>> > >>>>>>>>>>>>> enough
>> > >>>>>>>>>>>>>>>>>>> for users to
>> > >>>>>>>>>>>>>>>>>>>       define their data processing logics.
>> > >> Exposing
>> > >>>>>>>>>>> operator-level
>> > >>>>>>>>>>>>>>>>>> concepts
>> > >>>>>>>>>>>>>>>>>>>       (e.g., mailbox thread model, checkpoint
>> > >>> barrier
>> > >>>>>>>>>> alignment,
>> > >>>>>>>>>>>>>>>> etc.) is
>> > >>>>>>>>>>>>>>>>>>>       unnecessary and limits the improvement
>> > >>> regarding
>> > >>>>>> such
>> > >>>>>>>>>>>>> exposed
>> > >>>>>>>>>>>>>>>>>>> mechanisms
>> > >>>>>>>>>>>>>>>>>>>       with compatibility considerations.
>> > >>>>>>>>>>>>>>>>>>>       - The current DataStream API seems to be a
>> > >>>> mixture
>> > >>>>>> of
>> > >>>>>>>>>> many
>> > >>>>>>>>>>>>>>>> things,
>> > >>>>>>>>>>>>>>>>>>>       making it hard to understand especially for
>> > >>>>>> newcomers.
>> > >>>>>>>>>> It
>> > >>>>>>>>>>>>> might
>> > >>>>>>>>>>>>>>>> be
>> > >>>>>>>>>>>>>>>>>>> better
>> > >>>>>>>>>>>>>>>>>>>       to re-organize it into several parts: (the
>> > >>>>> taxonomy
>> > >>>>>>>>>> below
>> > >>>>>>>>>>>>> are
>> > >>>>>>>>>>>>>>>> just
>> > >>>>>>>>>>>>>>>>>> an
>> > >>>>>>>>>>>>>>>>>>>       example of the, we are still working on
>> > >> this)
>> > >>>>>>>>>>>>>>>>>>>          - The most fundamental stateful stream
>> > >>>>>> processing:
>> > >>>>>>>>>>>>> streams,
>> > >>>>>>>>>>>>>>>>>>>          partitions / key, process functions,
>> > >> state,
>> > >>>>>>>>>>>>> timeline-service
>> > >>>>>>>>>>>>>>>>>>>          - An extension for common batch-streaming
>> > >>>>> unified
>> > >>>>>>>>>>>>> functions:
>> > >>>>>>>>>>>>>>>>>> map,
>> > >>>>>>>>>>>>>>>>>>>          flatmap, filter, agg, reduce, join, etc.
>> > >>>>>>>>>>>>>>>>>>>          - An extension for windowing supports:
>> > >>>> window,
>> > >>>>>>>>>>>>> triggering
>> > >>>>>>>>>>>>>>>>>>>          - An extension for event-time supports:
>> > >>> event
>> > >>>>>> time,
>> > >>>>>>>>>>>>>> watermark
>> > >>>>>>>>>>>>>>>>>>>          - The extensions are like short-cuts /
>> > >>>> sugars,
>> > >>>>>>>>>> without
>> > >>>>>>>>>>>>> which
>> > >>>>>>>>>>>>>>>>>> users
>> > >>>>>>>>>>>>>>>>>>>          can probably still achieve the same
>> > >>> behavior
>> > >>>> by
>> > >>>>>>>>>> working
>> > >>>>>>>>>>>>> with
>> > >>>>>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>>>          fundamental APIs, but would be a lot
>> > >> easier
>> > >>>>> with
>> > >>>>>> the
>> > >>>>>>>>>>>>>>>> extensions
>> > >>>>>>>>>>>>>>>>>>>       - The original plan was to do in-place
>> > >>>> refactors /
>> > >>>>>>>>>> changes
>> > >>>>>>>>>>>>> on
>> > >>>>>>>>>>>>>>>>>>>    DataStream API. Some related items are listed
>> > >> in
>> > >>>> this
>> > >>>>>> doc
>> > >>>>>>>>>> [2]
>> > >>>>>>>>>>>>>>>> attached
>> > >>>>>>>>>>>>>>>>>>> to
>> > >>>>>>>>>>>>>>>>>>>    the kicking off email [3]. Not all of the above
>> > >>>>> issues
>> > >>>>>> are
>> > >>>>>>>>>>>>> listed,
>> > >>>>>>>>>>>>>>>>>>> because
>> > >>>>>>>>>>>>>>>>>>>    we haven't looked into this as deeply as now
>> > >> by
>> > >>>> that
>> > >>>>>> time.
>> > >>>>>>>>>>>>>>>>>>>    - We proposed this as a new API rather than
>> > >>>> in-place
>> > >>>>>>>>>>> refactors
>> > >>>>>>>>>>>>> in
>> > >>>>>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>>>    2.0 work item list, because we realized the
>> > >>> changes
>> > >>>>>> might
>> > >>>>>>>>>> be
>> > >>>>>>>>>>>>> too
>> > >>>>>>>>>>>>>>>> big
>> > >>>>>>>>>>>>>>>>>>> for an
>> > >>>>>>>>>>>>>>>>>>>    in-place change. First having a new API then
>> > >>>>> gradually
>> > >>>>>>>>>>> retiring
>> > >>>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>> old
>> > >>>>>>>>>>>>>>>>>>> one
>> > >>>>>>>>>>>>>>>>>>>    would help users to smoothly migrate between
>> > >>> them.
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> A thorough discussion is definitely needed once the
>> > >>>> FLIP
>> > >>>>> is
>> > >>>>>>>>>> out.
>> > >>>>>>>>>>>>> And
>> > >>>>>>>>>>>>>> of
>> > >>>>>>>>>>>>>>>>>>> course it's possible that the FLIP might be
>> > >> rejected.
>> > >>>>> Given
>> > >>>>>>>>>> that
>> > >>>>>>>>>>>>> we
>> > >>>>>>>>>>>>>> are
>> > >>>>>>>>>>>>>>>>>>> planning for release 2.0, I just feel it would be
>> > >>>> better
>> > >>>>> to
>> > >>>>>>>>>>> bring
>> > >>>>>>>>>>>>>> this
>> > >>>>>>>>>>>>>>>> up
>> > >>>>>>>>>>>>>>>>>>> early even the concrete plan is not yet ready,
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> Best,
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> Xintong
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>> > >>>>>>>>>>>>>>>>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
>> > >>>>>>>>>>>>>>>>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
>> > >>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
>> > >>>>>> gyfora@apache.org
>> > >>>>>>>>>>>>>> wrote:
>> > >>>>>>>>>>>>>>>>>>>> Hey!
>> > >>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>> I share the same concerns mentioned above
>> > >> regarding
>> > >>>> the
>> > >>>>>>>>>>>>>>>>>> "ProcessFunction
>> > >>>>>>>>>>>>>>>>>>>> API".
>> > >>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for
>> > >> the
>> > >>>>>>>>>> DataStream
>> > >>>>>>>>>>>>> API
>> > >>>>>>>>>>>>>>>>>>> unless
>> > >>>>>>>>>>>>>>>>>>>> we have a very good reason to do so and with a
>> > >>> proper
>> > >>>>>>>>>>> discussion
>> > >>>>>>>>>>>>>> about
>> > >>>>>>>>>>>>>>>>>>> this
>> > >>>>>>>>>>>>>>>>>>>> as Alex said.
>> > >>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>> Cheers,
>> > >>>>>>>>>>>>>>>>>>>> Gyula
>> > >>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander
>> > >> Fedulov <
>> > >>>>>>>>>>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
>> > >>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>> Hi Xintong,
>> > >>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the
>> > >>>> "[DISCUSS]
>> > >>>>>>>>>>> FLIP-321:
>> > >>>>>>>>>>>>>>>>>>>> Introduce
>> > >>>>>>>>>>>>>>>>>>>>> an API deprecation process" thread [1]?
>> > >>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind
>> > >>>> this
>> > >>>>>> new
>> > >>>>>>>>>> API
>> > >>>>>>>>>>>>> has
>> > >>>>>>>>>>>>>>>>>> been
>> > >>>>>>>>>>>>>>>>>>>>> previously discussed on the mailing list. Do we
>> > >>> have
>> > >>>> a
>> > >>>>>> list
>> > >>>>>>>>>> of
>> > >>>>>>>>>>>>>>>>>>>> shortcomings
>> > >>>>>>>>>>>>>>>>>>>>> in the current DataStream API that it tries to
>> > >>>> resolve?
>> > >>>>>> How
>> > >>>>>>>>>>> does
>> > >>>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into
>> > >> the
>> > >>>>>> picture?
>> > >>>>>>>>>>>>> Will it
>> > >>>>>>>>>>>>>>>>>> be
>> > >>>>>>>>>>>>>>>>>>>> kept
>> > >>>>>>>>>>>>>>>>>>>>> as is or subsumed by new API?
>> > >>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>> [1]
>> > >>>>>>>>>>>>>>
>> > >>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>> > >>>>>>>>>>>>>>>>>>>>> Best,
>> > >>>>>>>>>>>>>>>>>>>>> Alex
>> > >>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
>> > >>>>>>>>>>>>> tonysong820@gmail.com>
>> > >>>>>>>>>>>>>>>>>>>> wrote:
>> > >>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
>> > >>> most
>> > >>>>>>>>>> headaches
>> > >>>>>>>>>>>>>>>>>>> because
>> > >>>>>>>>>>>>>>>>>>>>> it's
>> > >>>>>>>>>>>>>>>>>>>>>>> very unclear what it actually entails; like is
>> > >> it
>> > >>>> an
>> > >>>>>>>>>>> entirely
>> > >>>>>>>>>>>>>>>>>>>> separate
>> > >>>>>>>>>>>>>>>>>>>>>> API
>> > >>>>>>>>>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
>> > >>> extension
>> > >>>> of
>> > >>>>>>>>>>>>> DataStream.
>> > >>>>>>>>>>>>>>>>>>> How
>> > >>>>>>>>>>>>>>>>>>>>>> much
>> > >>>>>>>>>>>>>>>>>>>>>>> will it share the internals with DataStream
>> > >> etc.;
>> > >>>> how
>> > >>>>>> does
>> > >>>>>>>>>>> it
>> > >>>>>>>>>>>>>>>>>>> relate
>> > >>>>>>>>>>>>>>>>>>>> to
>> > >>>>>>>>>>>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
>> > >> API
>> > >>>>> uses
>> > >>>>>>>>>>>>>>>>>> underneath).
>> > >>>>>>>>>>>>>>>>>>>>>> I totally understand your confusion. We started
>> > >>>>> planning
>> > >>>>>>>>>> this
>> > >>>>>>>>>>>>>> after
>> > >>>>>>>>>>>>>>>>>>>>> kicking
>> > >>>>>>>>>>>>>>>>>>>>>> off the release 2.0, so there's still a lot to
>> > >> be
>> > >>>>>> explored
>> > >>>>>>>>>>> and
>> > >>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>>> plan
>> > >>>>>>>>>>>>>>>>>>>>>> keeps changing.
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>    - In the beginning, we planned to do an
>> > >>> in-place
>> > >>>>>>>>>> refactor
>> > >>>>>>>>>>> of
>> > >>>>>>>>>>>>>>>>>>>>> DataStream
>> > >>>>>>>>>>>>>>>>>>>>>>    API, until the API migration period is
>> > >>> proposed.
>> > >>>>>>>>>>>>>>>>>>>>>>    - Then we want to make it an entirely
>> > >> separate
>> > >>>> API
>> > >>>>>> to
>> > >>>>>>>>>>>>>>>>>> DataStream,
>> > >>>>>>>>>>>>>>>>>>>> and
>> > >>>>>>>>>>>>>>>>>>>>>>    listed as a must-have for release 2.0 so
>> > >> that
>> > >>> we
>> > >>>>> can
>> > >>>>>>>>>>> remove
>> > >>>>>>>>>>>>>>>>>>>> DataStream
>> > >>>>>>>>>>>>>>>>>>>>>> once
>> > >>>>>>>>>>>>>>>>>>>>>>    it's ready.
>> > >>>>>>>>>>>>>>>>>>>>>>    - However, depending on the outcome of the
>> > >> API
>> > >>>>>>>>>>> compatibility
>> > >>>>>>>>>>>>>>>>>>>>> discussion
>> > >>>>>>>>>>>>>>>>>>>>>>    [1], we may not be able to remove DataStream
>> > >>> in
>> > >>>>> 2.0
>> > >>>>>>>>>>> anyway,
>> > >>>>>>>>>>>>>>>>>> which
>> > >>>>>>>>>>>>>>>>>>>>> means
>> > >>>>>>>>>>>>>>>>>>>>>> we
>> > >>>>>>>>>>>>>>>>>>>>>>    might need to re-evaluate the necessity of
>> > >>> this
>> > >>>>>> item for
>> > >>>>>>>>>>>>> 2.0.
>> > >>>>>>>>>>>>>>>>>>>>>> I'd say we wait a bit longer for the
>> > >> compatibility
>> > >>>>>>>>>> discussion
>> > >>>>>>>>>>>>> [1]
>> > >>>>>>>>>>>>>>>>>> and
>> > >>>>>>>>>>>>>>>>>>>>>> decide the priority for this item afterwards.
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>> Best,
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>> Xintong
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>> [1]
>> > >>>>>>>>>> https://lists.apache.org/list.html?dev@flink.apache.org
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay
>> > >> Schepler <
>> > >>>>>>>>>>>>>>>>>> chesnay@apache.org
>> > >>>>>>>>>>>>>>>>>>>>>> wrote:
>> > >>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of
>> > >>>> items.
>> > >>>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
>> > >>>>>> Management"
>> > >>>>>>>>>>>>> item
>> > >>>>>>>>>>>>>>>>>> is
>> > >>>>>>>>>>>>>>>>>>>>> marked
>> > >>>>>>>>>>>>>>>>>>>>>>> as a must-have; will it require changes that
>> > >>> break
>> > >>>>>>>>>>> something?
>> > >>>>>>>>>>>>>>>>>> What
>> > >>>>>>>>>>>>>>>>>>>>>> prevents
>> > >>>>>>>>>>>>>>>>>>>>>>> it from being added in 2.1?
>> > >>>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make
>> > >>>> Java
>> > >>>>> 17
>> > >>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>>> default,
>> > >>>>>>>>>>>>>>>>>>>>> drop
>> > >>>>>>>>>>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a
>> > >> must-have
>> > >>>>> "Drop
>> > >>>>>>>>>> Java
>> > >>>>>>>>>>> 8"
>> > >>>>>>>>>>>>>>>>>> and
>> > >>>>>>>>>>>>>>>>>>> a
>> > >>>>>>>>>>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
>> > >>>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I
>> > >> would
>> > >>>> hope
>> > >>>>>> that
>> > >>>>>>>>>>>>> this
>> > >>>>>>>>>>>>>>>>>>> would
>> > >>>>>>>>>>>>>>>>>>>>> be
>> > >>>>>>>>>>>>>>>>>>>>>>> an entirely internal change, and could thus be
>> > >> an
>> > >>>>>>>>>>> incremental
>> > >>>>>>>>>>>>>>>>>>> process
>> > >>>>>>>>>>>>>>>>>>>>>>> independent of major releases.
>> > >>>>>>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much
>> > >>> are
>> > >>>>> we
>> > >>>>>>>>>>>>> actually
>> > >>>>>>>>>>>>>>>>>>>>>> re-writing?
>> > >>>>>>>>>>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise
>> > >> this
>> > >>>> to
>> > >>>>> a
>> > >>>>>>>>>>>>>>>>>> must-have; i
>> > >>>>>>>>>>>>>>>>>>>>> think
>> > >>>>>>>>>>>>>>>>>>>>>>> I marked it down as nice-to-have only because
>> > >> it
>> > >>>>>> depends
>> > >>>>>>>>>> on
>> > >>>>>>>>>>>>>>>>>> another
>> > >>>>>>>>>>>>>>>>>>>>> item.
>> > >>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
>> > >>> most
>> > >>>>>>>>>> headaches
>> > >>>>>>>>>>>>>>>>>>> because
>> > >>>>>>>>>>>>>>>>>>>>> it's
>> > >>>>>>>>>>>>>>>>>>>>>>> very unclear what it actually entails; like is
>> > >> it
>> > >>>> an
>> > >>>>>>>>>>> entirely
>> > >>>>>>>>>>>>>>>>>>>> separate
>> > >>>>>>>>>>>>>>>>>>>>>> API
>> > >>>>>>>>>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
>> > >>> extension
>> > >>>> of
>> > >>>>>>>>>>>>> DataStream.
>> > >>>>>>>>>>>>>>>>>>> How
>> > >>>>>>>>>>>>>>>>>>>>>> much
>> > >>>>>>>>>>>>>>>>>>>>>>> will it share the internals with DataStream
>> > >> etc.;
>> > >>>> how
>> > >>>>>> does
>> > >>>>>>>>>>> it
>> > >>>>>>>>>>>>>>>>>>> relate
>> > >>>>>>>>>>>>>>>>>>>> to
>> > >>>>>>>>>>>>>>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
>> > >> API
>> > >>>>> uses
>> > >>>>>>>>>>>>>>>>>> underneath).
>> > >>>>>>>>>>>>>>>>>>>>>>> There are a few items I added as ideas which
>> > >>> don't
>> > >>>>>> have a
>> > >>>>>>>>>>>>>>>>>> priority
>> > >>>>>>>>>>>>>>>>>>>> yet;
>> > >>>>>>>>>>>>>>>>>>>>>>> would love to get some feedback on those.
>> > >>>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
>> > >>>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>> Hi devs,
>> > >>>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>> As previously discussed in [1], we had been
>> > >>>>> collecting
>> > >>>>>>>>>> work
>> > >>>>>>>>>>>>> item
>> > >>>>>>>>>>>>>>>>>>>>>> proposals
>> > >>>>>>>>>>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the
>> > >> wiki
>> > >>>> page
>> > >>>>>> [2].
>> > >>>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>>    - As we have passed the due date, I'd like
>> > >> to
>> > >>>>>> kindly
>> > >>>>>>>>>>> remind
>> > >>>>>>>>>>>>>>>>>>>> everyone
>> > >>>>>>>>>>>>>>>>>>>>>> *not
>> > >>>>>>>>>>>>>>>>>>>>>>>    to add / remove items directly on the wiki
>> > >>>> page*.
>> > >>>>>> If
>> > >>>>>>>>>>>>> needed,
>> > >>>>>>>>>>>>>>>>>>>> please
>> > >>>>>>>>>>>>>>>>>>>>>> post
>> > >>>>>>>>>>>>>>>>>>>>>>>    in this thread or reach out to the release
>> > >>>>> managers
>> > >>>>>>>>>>>>> instead.
>> > >>>>>>>>>>>>>>>>>>>>>>>    - I've reached out to some folks for
>> > >>>>> clarifications
>> > >>>>>>>>>> about
>> > >>>>>>>>>>>>>>>>>> their
>> > >>>>>>>>>>>>>>>>>>>>>>>    proposals. Some of them mentioned that they
>> > >>> can
>> > >>>>>> not yet
>> > >>>>>>>>>>>>> tell
>> > >>>>>>>>>>>>>>>>>>>> whether
>> > >>>>>>>>>>>>>>>>>>>>>> we
>> > >>>>>>>>>>>>>>>>>>>>>>>    should do an item or not, and would need
>> > >> more
>> > >>>>> time
>> > >>>>>> /
>> > >>>>>>>>>>>>>>>>>> discussions
>> > >>>>>>>>>>>>>>>>>>>> to
>> > >>>>>>>>>>>>>>>>>>>>>> make
>> > >>>>>>>>>>>>>>>>>>>>>>>    the decision. So I added a new symbol for
>> > >>> items
>> > >>>>>> whose
>> > >>>>>>>>>>>>>>>>>> priorities
>> > >>>>>>>>>>>>>>>>>>>> are
>> > >>>>>>>>>>>>>>>>>>>>>> `TBD`.
>> > >>>>>>>>>>>>>>>>>>>>>>> Now it's time to collaboratively decide a
>> > >> minimum
>> > >>>> set
>> > >>>>>> of
>> > >>>>>>>>>>>>>>>>>> must-have
>> > >>>>>>>>>>>>>>>>>>>>> items.
>> > >>>>>>>>>>>>>>>>>>>>>>> I've gone through the entire list of proposed
>> > >>>> items,
>> > >>>>>> and
>> > >>>>>>>>>>> found
>> > >>>>>>>>>>>>>>>>>> most
>> > >>>>>>>>>>>>>>>>>>>> of
>> > >>>>>>>>>>>>>>>>>>>>>> them
>> > >>>>>>>>>>>>>>>>>>>>>>> make quite much sense. So I think an online
>> > >> sync
>> > >>>>> might
>> > >>>>>> not
>> > >>>>>>>>>>> be
>> > >>>>>>>>>>>>>>>>>>>> necessary
>> > >>>>>>>>>>>>>>>>>>>>>> for
>> > >>>>>>>>>>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread,
>> > >>>> where
>> > >>>>>>>>>>> everyone
>> > >>>>>>>>>>>>> can
>> > >>>>>>>>>>>>>>>>>>>>> comment
>> > >>>>>>>>>>>>>>>>>>>>>>> on how they think the list can be improved,
>> > >>>> followed
>> > >>>>>> by a
>> > >>>>>>>>>>>>> VOTE to
>> > >>>>>>>>>>>>>>>>>>>>>> formally
>> > >>>>>>>>>>>>>>>>>>>>>>> make the decision.
>> > >>>>>>>>>>>>>>>>>>>>>>>
>> > >>>>>>>>>>>>>>>>>>>>>>> Any feedback and opinions, including but not
>> > >>>> limited
>> > >>>>> to
>> > >>>>>>>>>> the
>> > >>>>>>>>>>>>>>>>>
>>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
Hi Matthias,

How's it going with the summary of existing 2.0.0 jira tickets?

I have gone through everything listed under FLINK-3957[1], and will
continue with other Jira tickets whose fix-version is 2.0.0.

Here are my 2 cents on the FLINK-3957 subtasks. I hope this helps with your
summary.

I'd suggest going ahead with the following tickets.

   - Need action in 1.18
      - FLINK-4675: Double-check whether the argument is indeed not used.
      Introduce a new non-argument API, and mark the original one as
      `@Deprecated`. FLIP needed.
      - FLINK-6912: Double-check whether the argument is indeed not used.
      Introduce a new non-argument API, and mark the original one as
      `@Deprecated`. FLIP needed.
      - FLINK-5336: Double-check whether `IOReadableWritable` is indeed not
      needed for `Path`. Mark methods from `IOReadableWritable` as
      `@Deprecated` in `Path`. FLIP needed.
   - Need no action in 1.18
      - FLINK-4602/14068: Already listed in the release 2.0 wiki [2]
      - FLINK-3986/3991/3992/4367/5130/7691: Subsumed by "Deprecated
      methods/fields/classes in DataStream" in the release 2.0 wiki [2]
      - FLINK-6375: Change the hashCode behavior of `LongValue` (and other
      numeric types).
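
To make the "introduce a new API, deprecate the original" suggestion above
more concrete, here is a minimal sketch of the pattern in plain Java. All
names in it (`DeprecationSketch`, `Assigner`, `UnusedEnv`, `defaultTrigger`)
are hypothetical and chosen for illustration only; they are not the actual
Flink classes or signatures touched by the tickets, and the real change would
still need to go through a FLIP.

```java
// Hypothetical illustration only: none of these names are Flink APIs.
public class DeprecationSketch {

    /** Stand-in for an argument that the implementation never reads. */
    static final class UnusedEnv {}

    interface Assigner {
        /**
         * Original method, kept so that existing callers still compile.
         *
         * @deprecated the argument is ignored; use {@link #defaultTrigger()}.
         */
        @Deprecated
        default String defaultTrigger(UnusedEnv env) {
            // Delegate to the new method so both entry points agree.
            return defaultTrigger();
        }

        /** New argument-free method that implementations provide. */
        String defaultTrigger();
    }

    static final class EventTimeAssigner implements Assigner {
        @Override
        public String defaultTrigger() {
            return "event-time-trigger";
        }
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        Assigner assigner = new EventTimeAssigner();
        // New call site.
        System.out.println(assigner.defaultTrigger());
        // Old call site keeps working during the deprecation period.
        System.out.println(assigner.defaultTrigger(new UnusedEnv()));
    }
}
```

One caveat with this particular shape: an existing implementation that only
overrides the old one-argument method would no longer compile, so a real
migration might instead give the new method a default body as well. Which
direction the delegation should go is exactly the kind of question the FLIP
would need to settle.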

I'd suggest not doing the following tickets.

   - FLINK-4147/4330/9529/14658: These changes are non-trivial for both
   developers and users. Also, we are taking them into consideration when
   designing the new ProcessFunction API. I'd be in favor of letting users
   migrate to the ProcessFunction API directly once it's ready, rather than
   forcing users to adapt to breaking changes twice.
   - FLINK-3610: Only affects Scala API, which will soon be removed.

I don't have strong opinions on whether to work on the following tickets or
not. Some of them are not very clear to me based on the description and
conversation on the ticket, others may require further investigation and
evaluation to decide. Unless someone volunteers to look into them, I'd be
slightly in favor of not doing them, as I'm not aware of them causing any
serious problems.

   - FLINK-3959 Remove implicit Sinks
   - FLINK-4757 Unify "GlobalJobParameters" and "Configuration"
   - FLINK-4758 Remove IOReadableWritable from classes where not needed
   - FLINK-4971 Unify Stream Sinks and OutputFormats
   - FLINK-5126 Remove Checked Exceptions from State Interfaces
   - FLINK-5337 Introduce backwards compatible state to task assignment
   - FLINK-5346 Remove all ad-hoc config loading via GlobalConfiguration
   - FLINK-5875 Use TypeComparator.hash() instead of Object.hashCode() for
   keying in DataStream API
   - FLINK-9798 Drop canEqual() from TypeInformation, TypeSerializer, etc.
   - FLINK-13926 `ProcessingTimeSessionWindows` and
   `EventTimeSessionWindows` should be generic


WDYT?

Best,

Xintong


[1] https://issues.apache.org/jira/browse/FLINK-3957

[2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release



On Thu, Jul 13, 2023 at 10:31 AM li zhiqiang <li...@gmail.com>
wrote:

> @Xintong
> I already know the modification of some api, but because there are many
> changes involved,
> I am afraid that the consideration is not comprehensive.
> I'm willing to do the work, but I haven't found a committer yet.
>
> Best,
> Zhiqiang
>
> From: Xintong Song <to...@gmail.com>
> Date: Thursday, July 13, 2023, 10:03
> To: dev@flink.apache.org <de...@flink.apache.org>
> Subject: Re: [DISCUSS] Release 2.0 Work Items
> Thanks for the inputs, Zhiqiang and Jiabao.
>
> @Zhiqiang,
> The proposal sounds interesting. Do you already have an idea what API
> changes are needed in order to make the connectors pluggable? I think
> whether this should go into Flink 2.0 would significantly depend on what
> API changes are needed. Moreover, would you like to work on this effort or
> simply raise a need? And if you'd like to work on this, have you already
> found a committer who can help with this?
>
> @Jiabao,
> Thanks for the suggestions. I agree that it would be nice to improve the
> experiences in deploying Flink instances and submitting tasks. It would be
> helpful if you can point out the specific behaviors that make integrating
> Flink in your production difficult. Also, I'd like to understand how this
> topic is related to the Release 2.0 topic. Or asked differently, is this
> something that requires breaking changes that can only happen in major
> version bumps, or is it just an improvement that can go into any minor
> version?
>
>
> Best,
>
> Xintong
>
>
>
> On Thu, Jul 13, 2023 at 12:49 AM Jiabao Sun <jiabao.sun@xtransfer.cn
> .invalid>
> wrote:
>
> > Thanks Xintong for driving the effort.
> >
> >
> > I’d add a +1 to improving out-of-box user experience, as suggested by
> > @Jark and @Chesnay.
> > For beginners, understanding complex configurations is hard work.
> >
> > In addition, deploying a Flink runtime environment is also a complex
> > matter.
> > At present, there are still big differences in how tasks are submitted to
> > different computing resources. If users want to build their own data
> > development platform, they need to deeply understand these differences
> > when handling task submission and running-status checks.
> >
> > I'm glad to see features like flink-sql-gateway being implemented by the
> > community because it makes it easy for users to submit Flink SQL tasks.
> > Furthermore, can we provide more unified, out-of-the-box capabilities
> > that allow users to quickly pull up a production-ready Flink environment
> > and easily integrate Flink into their own data development platform?
> >
> >
> > Best,
> > Jiabao
> >
> >
> > > On Jul 12, 2023, at 8:16 PM, zhiqiang li <li...@gmail.com> wrote:
> > >
> > > I have seen in [1] that connectors, formats, and user code will be
> > > pluggable.
> > > If the connectors are pluggable, the benefits are obvious, as the
> > > conflicts between different jar package versions can be avoided.
> > > If you don't use classloader isolation, shading is needed to resolve
> > > conflicts, and a lot of development time is wasted.
> > > I know that this change may involve a lot of API changes, so I would
> > > like to discuss in this email whether we can make changes in Flink 2.0.
> > >
> > >> Plugins facilitate a strict separation of code through restricted
> > >> classloaders.
> > >> Plugins cannot access classes from other plugins or from Flink that
> > >> have not been specifically whitelisted.
> > >> This strict isolation allows plugins to contain conflicting versions
> > >> of the same library without the need to relocate classes or to converge
> > >> to common versions.
> > >> Currently, file systems and metric reporters are pluggable *but in the
> > >> future, connectors, formats, and even user code should also be
> > >> pluggable.*
> > >>
> > >
> > > [1]
> > > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/
> > >
> > > On Tue, Jul 11, 2023 at 18:50, Xintong Song <to...@gmail.com> wrote:
> > >
> > >>>
> > >>> What we might want to come up with is a summary with each 2.0.0 issue
> > >>> on why it should be included or not. That summary is something the
> > >>> community could vote on. WDYT? I'm happy to help here.
> > >>>
> > >>
> > >> That sounds great. Thanks for offering the help. I'll also try to go
> > >> through the issues, but TBH I'm quite overwhelmed and cannot promise to
> > >> get this done very soon. Your help is very much needed.
> > >>
> > >>
> > >> Best,
> > >>
> > >> Xintong
> > >>
> > >>
> > >>
> > >> On Tue, Jul 11, 2023 at 6:08 PM Matthias Pohl
> > >> <ma...@aiven.io.invalid> wrote:
> > >>
> > >>> @Xintong I guess it makes sense. I agree with your conclusions on the
> > >> four
> > >>> mentioned Jira issues.
> > >>>
> > >>> I just checked any issues that have fixVersion = 2.0.0 [1]. There
> are a
> > >> few
> > >>> more items that are not affiliated with FLINK-3957 [2]. I guess we
> > should
> > >>> find answers for these issues: Either closing them with a reason to
> > have
> > >> a
> > >>> consistent state in Jira or adding them to the feature list as part
> of
> > a
> > >>> separate voting thread (to leave the current vote untouched).
> > >>>
> > >>> What we might want to come up with is a summary with each 2.0.0 issue
> > on
> > >>> why it should be included or not. That summary is something the
> > community
> > >>> could vote on. WDYT? I'm happy to help here.
> > >>>
> > >>> Matthias
> > >>>
> > >>> [1]
> > >>>
> > >>>
> > >>
> >
> https://issues.apache.org/jira/browse/FLINK-32437?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%202.0.0%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)%20%20
> > >>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> > >>>
> > >>>
> > >>> On Tue, Jul 11, 2023 at 5:01 AM Xintong Song <to...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> @Zhu,
> > >>>> As you are downgrading "Clarify the scopes of configuration options"
> > to
> > >>>> nice-to-have priority, could you also bring that up in the vote
> > >>> thread[1]?
> > >>>> I'm asking because there are people who already voted on the
> original
> > >>> list.
> > >>>> I think restarting the vote is probably an overkill and unnecessary,
> > >> but
> > >>> we
> > >>>> should at least bring this change to their attention.
> > >>>>
> > >>>> @Matthias,
> > >>>> Thanks a lot for bringing this up. I wasn't aware of this early
> > >>> umbrella. I
> > >>>> haven't gone through everything in FLINK-3957 yet. I'll do it asap.
> > >>>>
> > >>>> Just quickly went through the 4 issues you mentioned.
> > >>>> - FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as
> > >> long
> > >>> as
> > >>>> the new APIs that we want users to migrate to are ready. For these 2
> > >>>> tickets, I think introduction of the updated APIs should be
> > >>> straightforward
> > >>>> and feasible for 1.18.
> > >>>> - FLINK-13926: I'm not sure about this one. The two mentioned
> classes
> > >>>> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not
> > >> even
> > >>>> marked as Public or PublicEvolving APIs. Moreover, I don't see a
> good
> > >> way
> > >>>> to smoothly replace the classes with a generic version.
> > >>>> - FLINK-5126: This is a bit unclear to me. From the description and
> > >>>> conversation on the ticket, I don't fully understand which concrete
> > >>>> APIs the ticket is referring to. Or maybe it refers to all / most of
> > >>>> the APIs that throw Exception / IOException in general. Moreover, I
> > >>>> don't think removing Exception / IOException from the API signature is
> > >>>> a breaking change. It requires no code changes on the caller side.
> > >>>>
> > >>>> WDYT?
> > >>>>
> > >>>> Best,
> > >>>>
> > >>>> Xintong
> > >>>>
> > >>>>
> > >>>> [1]
> https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
> > >>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> > >>>>
> > >>>> On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
> > >>>> <ma...@aiven.io.invalid> wrote:
> > >>>>
> > >>>>> I brought it up in the deprecating APIs in 1.18 thread [1] already
> > >> but
> > >>> it
> > >>>>> feels misplaced there. I just wanted to ask whether someone did a
> > >> pass
> > >>>> over
> > >>>>> FLINK-3957 [2]. I came across it when going through the release 2.0
> > >>>> feature
> > >>>>> list [3] as part of the vote. I have the feeling that there are
> some
> > >>>> valid
> > >>>>> action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6]) which
> > >> do
> > >>>> not
> > >>>>> seem to be listed in the 2.0 feature list [3], yet (or are included
> > >> in
> > >>>> some
> > >>>>> of the bigger items). Majority of the subtasks are probably covered
> > >> by
> > >>>> the
> > >>>>> DataSet removal, the Scala API removal and the ProcessFunction
> > >>>> refactoring.
> > >>>>> Other subtasks (FLINK-14068 [7]) made it into the feature list.
> > >>>>>
> > >>>>> I haven't worked with the SDK code that much so that I can judge
> > >>> whether
> > >>>>> the subtasks are still reasonable or actually obsolete. That is
> why I
> > >>>>> wanted to mention the Jira issue here once more.
> > >>>>>
> > >>>>> I don't consider it a blocker for the ongoing vote but was
> wondering
> > >>>>> whether it makes sense for someone who might have more experience
> in
> > >>> that
> > >>>>> field to add some of the subtasks to the feature list.
> > >>>>>
> > >>>>> Or shall we just consider it as "not interesting enough" because
> > >> nobody
> > >>>>> added it in the first place to the 2.0 feature list [3]?
> > >>>>>
> > >>>>> Matthias
> > >>>>>
> > >>>>> [1]
> https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
> > >>>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> > >>>>> [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >>>>> [4] https://issues.apache.org/jira/browse/FLINK-4675
> > >>>>> [5] https://issues.apache.org/jira/browse/FLINK-5126
> > >>>>> [6] https://issues.apache.org/jira/browse/FLINK-13926
> > >>>>> [7] https://issues.apache.org/jira/browse/FLINK-14068
> > >>>>>
> > >>>>> On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
> > >>>>>
> > >>>>>> Agreed that we should deprecate affected APIs as soon as possible.
> > >>>>>> But there is not much time before the feature freeze of 1.18,
> > >> hence
> > >>>>>> I'm a bit concerned that some of the deprecations might not be done
> > >>>>>> in 1.18.
> > >>>>>>
> > >>>>>> We are currently looking into the improvements of the
> configuration
> > >>>>> layer.
> > >>>>>> Most of the proposed changes would require a public discussion, or
> > >>> even
> > >>>>>> a FLIP, which I think can hardly close before the feature freeze
> of
> > >>>> 1.18.
> > >>>>>> And some of the APIs can be deprecated only after the
> corresponding
> > >>> new
> > >>>>>> APIs are developed. Therefore we previously targeted them for
> 1.19.
> > >>>>>>
> > >>>>>> We may review later to see what deprecation work can be done in 1.18
> > >>>>>> and make it if possible. I think we can do the work even after the
> > >>>>>> feature freeze date, if it is purely deprecation work (simply adding
> > >>>>>> annotations). WDYT?
> > >>>>>>
> > >>>>>> I'm also changing the priority of "Clarify the scopes of
> > >>>>>> configuration options" to nice-to-have. I think most of the work does
> > >>>>>> not involve breaking changes and can be done in 1.x or 2.1+. For the
> > >>>>>> breaking changes which might be needed, we will consider them as part
> > >>>>>> of the configuration layer rework.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Zhu
> > >>>>>>
> > >>>>>> On Mon, Jul 10, 2023 at 19:58, Xintong Song <to...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> At what point are the FLIP discussions coming into play?
> > >>>>>>>
> > >>>>>>> I keep wondering if these shouldn't have started already.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> I think this depends on the responsible contributor and reviewer
> > >> of
> > >>>>>>> individual items. From my perspective, the FLIP discussions can
> > >>> start
> > >>>>> any
> > >>>>>>> time as long as the contributors are ready, the earlier the
> > >> better.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> What we need to ensure is that all breaking API changes are
> > >>>>>>>> discussed/decided before 1.18 is released so we can deprecate
> > >>>>> affected
> > >>>>>> APIs.
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> The introduction of the migration period has brought the
> > >>> requirement
> > >>>> to
> > >>>>>>> plan the removal of public APIs 2 minor releases ahead of the
> > >> major
> > >>>>>>> release, which is TBH a bit unexpected. I agree it would be nice
> > >> if
> > >>>> we
> > >>>>>> can
> > >>>>>>> get the FLIPs ready by releasing 1.18. But I also don't think we
> > >>>> should
> > >>>>>>> rush on it. If the deprecation of a Public API does not make
> > >> 1.18,
> > >>> we
> > >>>>> may
> > >>>>>>> carry it until 3.0. Or if there are many Public APIs whose
> > >>>> deprecation
> > >>>>>> does
> > >>>>>>> not make 1.18, we may deprecate them in 1.19 and postpone the
> > >> major
> > >>>>>> version
> > >>>>>>> bump to after a 1.20 release. Moreover, as mentioned in
> > >>> FLIP-321[1],
> > >>>>>>> exceptions are discussable given that the migration period is
> > >> newly
> > >>>>>>> proposed and we did not give developers the chance to plan things
> > >>>>>>> ahead. To sum up, I'd say we try to identify APIs that need to be
> > >>>>>>> deprecated in 1.18 on a best-effort basis, and evaluate the remaining
> > >>>>>>> options (carrying the API for the entire 2.x cycle, postponing 2.0,
> > >>>>>>> or making an exception) case-by-case.
> > >>>>>>> WDYT?
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>>
> > >>>>>>> Xintong
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> [1]
> > >>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>>>>
> > >>>>>>> On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <
> > >>> chesnay@apache.org
> > >>>>>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> At what point are the FLIP discussions coming into play?
> > >>>>>>>>
> > >>>>>>>> I keep wondering if these shouldn't have started already.
> > >>>>>>>> It just seems that a lot of decisions are implicitly reliant on
> > >>> the
> > >>>>>>>> items even being accepted.
> > >>>>>>>> Estimates can only be provided if we actually know the scope of
> > >>> the
> > >>>>>>>> change, but that's not always clear from the description in the
> > >>>> doc.
> > >>>>>>>>
> > >>>>>>>> What we need to ensure is that all breaking API changes are
> > >>>>>>>> discussed/decided before 1.18 is released so we can deprecate
> > >>>>> affected
> > >>>>>>>> APIs.
> > >>>>>>>>
> > >>>>>>>> On 10/07/2023 11:32, Xintong Song wrote:
> > >>>>>>>>> Hi Matthias,
> > >>>>>>>>>
> > >>>>>>>>> The questions you asked are indeed very important. Here're
> > >> some
> > >>>>> quick
> > >>>>>>>>> responses, based on the plans I had in mind, which I have not
> > >>>>> aligned
> > >>>>>>>> with
> > >>>>>>>>> other release managers yet.
> > >>>>>>>>>
> > >>>>>>>>> In the previous discussions between the RMs, we were not able
> > >>> to
> > >>>>> make
> > >>>>>>>>> proposals on things like how to make a time plan, how to
> > >> manage
> > >>>> the
> > >>>>>>>> release
> > >>>>>>>>> branch, etc., due to the lack of inputs on e.g., the work
> > >> items
> > >>>>> need
> > >>>>>> to
> > >>>>>>>> be
> > >>>>>>>>> included (which transitively depends on the API compatibility
> > >>> to
> > >>>>>> provide
> > >>>>>>>>> between major versions) and the workloads / time needed for
> > >>> them.
> > >>>>>> With
> > >>>>>>>> the
> > >>>>>>>>> recent discussions, we have collected at least the majority
> > >> of
> > >>>> the
> > >>>>>> inputs
> > >>>>>>>>> needed.
> > >>>>>>>>>
> > >>>>>>>>> Here are things that I think we as the release managers would
> > >>> do
> > >>>>> next
> > >>>>>>>>> (again, not aligned with other release managers yet)
> > >>>>>>>>> - Creating a time plan, by reaching out to people to
> > >> understand
> > >>>> the
> > >>>>>>>>> estimated workloads, prerequisites and ETA of each work item.
> > >>>>>>>>> - Make a proposal on how to manage the release branch, i.e.,
> > >>> when
> > >>>>> to
> > >>>>>> cut
> > >>>>>>>>> the branch and whether to ship the milestone releases, etc.
> > >>>>>>>>> - Set-up regular release syncs (bi-weekly / monthly) to
> > >> update
> > >>>> the
> > >>>>>> status
> > >>>>>>>>> and draw attention to where help is needed.
> > >>>>>>>>>
> > >>>>>>>>> So back to your questions.
> > >>>>>>>>>
> > >>>>>>>>> There are still to-be-discussed items in the list of
> > >> features.
> > >>>>>> What's the
> > >>>>>>>>>> plan with those?
> > >>>>>>>>> When collecting ETA, for items that the completion time
> > >> cannot
> > >>>> yet
> > >>>>> be
> > >>>>>>>>> estimated, we would like to have at least a time by which the
> > >>>>>> estimation
> > >>>>>>>>> can be made. I think the same applies to the to-be-discussed
> > >>>> items.
> > >>>>>> And
> > >>>>>>>> if
> > >>>>>>>>> the items should be included as must-haves, we would need
> > >>> another
> > >>>>>> vote to
> > >>>>>>>>> adjust the must-have item list.
> > >>>>>>>>>
> > >>>>>>>>> Some of them don't have anyone assigned.
> > >>>>>>>>> My concern is that they will be overlooked because nobody
> > >> feels
> > >>>> to
> > >>>>>> be in
> > >>>>>>>>>> charge.
> > >>>>>>>>> This is a tricky one. For must-have items without assignees,
> > >> we
> > >>>> as
> > >>>>>> the
> > >>>>>>>>> release managers should be responsible for raising them up in
> > >>> the
> > >>>>>> release
> > >>>>>>>>> syncs, and try to find assignees for them. Hopefully, there
> > >>> will
> > >>>> be
> > >>>>>>>> someone
> > >>>>>>>>> who stands out. But it is possible that for a must-have item
> > >>>> nobody
> > >>>>>> wants
> > >>>>>>>>> to work on it. If that happens, which I don't think it will,
> > >> it
> > >>>>>> probably
> > >>>>>>>>> means the item is not that critical and we may have to
> > >> exclude
> > >>> it
> > >>>>>> from
> > >>>>>>>> the
> > >>>>>>>>> release. Either way, they should not be overlooked, because
> > >>> IMHO
> > >>>>>> release
> > >>>>>>>>> managers should be responsible for trying to get someone to
> > >>> work
> > >>>> on
> > >>>>>> the
> > >>>>>>>>> un-assigned items.
> > >>>>>>>>>
> > >>>>>>>>> We'll have more discussions soon and keep the community
> > >>> updated.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>>
> > >>>>>>>>> Xintong
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> > >>>>>>>>> <ma...@aiven.io.invalid> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Now that the vote is started on the must-have items: There
> > >> are
> > >>>>> still
> > >>>>>>>>>> to-be-discussed items in the list of features. What's the
> > >> plan
> > >>>>> with
> > >>>>>>>> those?
> > >>>>>>>>>> Some of them don't have anyone assigned. Were these items
> > >>>>> discussed
> > >>>>>>>> among
> > >>>>>>>>>> the release managers? So far, it looks like they are handled
> > >>> as
> > >>>>>>>>>> nice-to-have if someone volunteers to pick them up?
> > >>>>>>>>>>
> > >>>>>>>>>> My concern is that they will be overlooked because nobody
> > >>> feels
> > >>>> to
> > >>>>>> be in
> > >>>>>>>>>> charge.
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Matthias
> > >>>>>>>>>>
> > >>>>>>>>>> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <
> > >>>>> tonysong820@gmail.com
> > >>>>>>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Thanks all for the discussion.
> > >>>>>>>>>>>
> > >>>>>>>>>>> The wiki has been updated as discussed. I'm starting a vote
> > >>>> now.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Xintong
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <
> > >>>>> tonysong820@gmail.com
> > >>>>>>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>> Hi ConradJam,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think Chesnay has already put his name as the
> > >> Contributor
> > >>>> for
> > >>>>>> the
> > >>>>>>>> two
> > >>>>>>>>>>>> tasks you listed. Maybe you can reach out to him to see if
> > >>> you
> > >>>>> can
> > >>>>>>>>>>>> collaborate on this.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> In general, I don't think contributing to a release 2.0 issue is
> > >>>>>>>>>>>> much different from contributing to a regular issue. We haven't
> > >>>>>>>>>>>> yet created JIRA tickets for all the listed tasks because many
> > >>>>>>>>>>>> of them need further discussions and / or FLIPs to decide
> > >>>>>>>>>>>> whether and how they should be performed.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Xintong
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <
> > >>>> jam.gzczy@gmail.com>
> > >>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi Community:
> > >>>>>>>>>>>>> I see some tasks in the 2.0 list that haven't been assigned
> > >>>>>>>>>>>>> yet. I want to take the initiative to take on some tasks that I
> > >>>>>>>>>>>>> can complete. How do I apply to the community for these tasks?
> > >>>>>>>>>>>>> I am interested in the following parts of FLINK-32377
> > >>>>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-32377>. Do I need
> > >>>>>>>>>>>>> to create the issues myself and assign them to myself?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> - the current timestamp, which is problematic w.r.t.
> > >>> caching
> > >>>>> and
> > >>>>>>>>>>> testing,
> > >>>>>>>>>>>>> while providing no value.
> > >>>>>>>>>>>>> - Remove JarRequestBody#programArgs in favor of
> > >>>>> #programArgsList.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-32377
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Fri, Jun 30, 2023 at 00:53, Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks Xintong for driving the effort.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I’d add a +1 to reworking configs, as suggested by @Jark
> > >>> and
> > >>>>>>>>>> @Chesnay,
> > >>>>>>>>>>>>>> especially the types. We have various configs that
> > >> encode
> > >>>>> Time /
> > >>>>>>>>>>>>> MemorySize
> > >>>>>>>>>>>>>> that are Long instead!
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>> Hong
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <
> > >>> yuanmei.work@gmail.com
> > >>>>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thanks for driving this effort, Xintong!
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> To Chesnay
> > >>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> > >>> Management"
> > >>>>>> item
> > >>>>>>>>>> is
> > >>>>>>>>>>>>>>>> marked as a must-have; will it require changes that
> > >>> break
> > >>>>>>>>>>> something?
> > >>>>>>>>>>>>>>>> What prevents it from being added in 2.1?
> > >>>>>>>>>>>>>>> As to "Disaggregated State Management".
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> We plan to provide a new type of state backend to
> > >> support
> > >>>> DFS
> > >>>>>> as
> > >>>>>>>>>>>>> primary
> > >>>>>>>>>>>>>>> storage.
> > >>>>>>>>>>>>>>> To achieve this, we need at least two sets of changes (not
> > >>>>>>>>>>>>>>> entirely sure yet, since we are still in the design and
> > >>>>>>>>>>>>>>> prototype phase):
> > >>>>>>>>>>>>>>> 1. Statebackend Change
> > >>>>>>>>>>>>>>> 2. State Access Change
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Not all of the related interfaces are `@Internal`. Some of the
> > >>>>>>>>>>>>>>> interfaces, like `StateBackend`, are `@PublicEvolving`.
> > >>>>>>>>>>>>>>> So, you are right in the sense that "Disaggregated State
> > >>>>>>>>>>>>>>> Management" itself probably does not need to be a "Must Have".
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> But I was hoping changes related to public APIs can be
> > >>>>>>>>>>>>>>> finalized and merged in Flink 2.0 (I will fix the wiki
> > >>>>>>>>>>>>>>> accordingly).
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I also agree with Jark that 2.0 is a good chance to
> > >>> rework
> > >>>>> the
> > >>>>>>>>>>> default
> > >>>>>>>>>>>>>>> value of configurations.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best
> > >>>>>>>>>>>>>>> Yuan
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> > >>>>>>>>>>> chesnay@apache.org>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>> Something else configuration-related is that there
> > >> are a
> > >>>>>> bunch of
> > >>>>>>>>>>>>>>>> options where the type isn't quite correct (e.g., a
> > >>> String
> > >>>>>> where
> > >>>>>>>>>> it
> > >>>>>>>>>>>>>>>> could be an enum, a string where it should be an int
> > >> or
> > >>>>>>>>>> something).
> > >>>>>>>>>>>>>>>> Could do a pass over those as well.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> > >>>>>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> I think one more thing we need to consider to do in
> > >> 2.0
> > >>>> is
> > >>>>>>>>>>> changing
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>> default value of configuration to improve out-of-box
> > >>> user
> > >>>>>>>>>>>>> experience.
> > >>>>>>>>>>>>>>>>> Currently, in order to run a Flink job, users may
> > >> need
> > >>> to
> > >>>>> set
> > >>>>>>>>>>>>>>>>> a bunch of configurations, such as minibatch,
> > >>> checkpoint
> > >>>>>>>>>> interval,
> > >>>>>>>>>>>>>>>>> exactly-once,
> > >>>>>>>>>>>>>>>>> incremental-checkpoint, etc. It's very verbose and
> > >> hard
> > >>>> to
> > >>>>>> use
> > >>>>>>>>>> for
> > >>>>>>>>>>>>>>>>> beginners.
> > >>>>>>>>>>>>>>>>> Most of them can have a universally applicable value. Because
> > >>>>>>>>>>>>>>>>> changing the default value is a breaking change, I think it's
> > >>>>>>>>>>>>>>>>> worth considering changing them in 2.0.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> What do you think?
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>> Jark
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> > >>>>>>>>>>> snuyanzin@gmail.com>
> > >>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>> Hi Chesnay
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would
> > >> hope
> > >>>>> that
> > >>>>>>>>>> this
> > >>>>>>>>>>>>> would
> > >>>>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>> an entirely internal change, and could thus be an
> > >>>>>> incremental
> > >>>>>>>>>>>>> process
> > >>>>>>>>>>>>>>>>>>> independent of major releases.
> > >>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much are
> > >>> we
> > >>>>>>>>>> actually
> > >>>>>>>>>>>>>>>>>> re-writing?
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Thanks for asking
> > >>>>>>>>>>>>>>>>>> yes, you're right, that should be internal change.
> > >>>>>>>>>>>>>>>>>> Yeah I was also thinking about incremental change
> > >>> (rule
> > >>>> by
> > >>>>>> rule
> > >>>>>>>>>>> or
> > >>>>>>>>>>>>>>>>>> reasonable small group of rules).
> > >>>>>>>>>>>>>>>>>> And yes, this could be an independent (on major
> > >>> release)
> > >>>>>>>>>> activity
> > >>>>>>>>>>>>>>>>>> The problem is actually for children of RelOptRule.
> > >>>>>>>>>>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
> > >>>>>> mentioned
> > >>>>>>>>>>>>>> deprecated
> > >>>>>>>>>>>>>>>>>> api.
> > >>>>>>>>>>>>>>>>>> There are also children of ConverterRule (50+) which
> > >>> do
> > >>>>> not
> > >>>>>>>>>> have
> > >>>>>>>>>>>>> such
> > >>>>>>>>>>>>>>>>>> issues.
> > >>>>>>>>>>>>>>>>>> Maybe it could be considered as the next step to
> > >> have
> > >>>> all
> > >>>>>> the
> > >>>>>>>>>>>>> rules in
> > >>>>>>>>>>>>>>>>>> Java.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > >>>>>>>>>>>>> tonysong820@gmail.com>
> > >>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Hi Alex & Gyula,
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the
> > >>> "[DISCUSS]
> > >>>>>>>>>> FLIP-321:
> > >>>>>>>>>>>>>>>>>> Introduce
> > >>>>>>>>>>>>>>>>>>>> an API deprecation process" thread [1]?
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just
> > >> noticed
> > >>> I
> > >>>>>> pasted
> > >>>>>>>>>>> the
> > >>>>>>>>>>>>>> wrong
> > >>>>>>>>>>>>>>>>>> url
> > >>>>>>>>>>>>>>>>>>> in my previous email. Sorry for the mistake.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind
> > >>> this
> > >>>>> new
> > >>>>>> API
> > >>>>>>>>>>> has
> > >>>>>>>>>>>>>> been
> > >>>>>>>>>>>>>>>>>>>> previously discussed on the mailing list. Do we
> > >>> have a
> > >>>>>> list
> > >>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>>> shortcomings
> > >>>>>>>>>>>>>>>>>>>> in the current DataStream API that it tries to
> > >>>> resolve?
> > >>>>>> How
> > >>>>>>>>>>> does
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into the
> > >>>>>> picture?
> > >>>>>>>>>>> Will
> > >>>>>>>>>>>>> it
> > >>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>> kept
> > >>>>>>>>>>>>>>>>>>>> as is or subsumed by new API?
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for
> > >> the
> > >>>>>>>>>> DataStream
> > >>>>>>>>>>>>> API
> > >>>>>>>>>>>>>>>>>> unless
> > >>>>>>>>>>>>>>>>>>>> we have a very good reason to do so and with a
> > >>> proper
> > >>>>>>>>>>> discussion
> > >>>>>>>>>>>>>> about
> > >>>>>>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>>>> as Alex said.
> > >>>>>>>>>>>>>>>>>>> The ProcessFunction API which is targeting to
> > >> replace
> > >>>>>>>>>> DataStream
> > >>>>>>>>>>>>> API
> > >>>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>>>>>> still a proposal, not a decision. Sorry for the
> > >>>>> confusion,
> > >>>>>> I
> > >>>>>>>>>>>>> should
> > >>>>>>>>>>>>>>>> have
> > >>>>>>>>>>>>>>>>>>> been more careful with my words, not giving the
> > >>>>> impression
> > >>>>>>>>>> that
> > >>>>>>>>>>>>> this
> > >>>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>>>>>> something we'll do anyway.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> There will be a FLIP describing the motivations and
> > >>>>>> designs in
> > >>>>>>>>>>>>>> detail,
> > >>>>>>>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>>>>>> the community to discuss and vote on. We are still
> > >>>>> working
> > >>>>>> on
> > >>>>>>>>>>> it.
> > >>>>>>>>>>>>>> TBH,
> > >>>>>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>>> is not trivial and we would need more time on it.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Just to quickly share some background:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>    - We see quite some problems with the current DataStream APIs
> > >>>>>>>>>>>>>>>>>>>       - Users are working with concrete classes rather than
> > >>>>>>>>>>>>>>>>>>>       interfaces, which means
> > >>>>>>>>>>>>>>>>>>>          - Users can access methods that are designed to be used by
> > >>>>>>>>>>>>>>>>>>>          internal classes, even though they are annotated with
> > >>>>>>>>>>>>>>>>>>>          `@Internal`. E.g., `DataStream#getTransformation`.
> > >>>>>>>>>>>>>>>>>>>          - Changes to the non-API implementations (e.g.,
> > >>>>>>>>>>>>>>>>>>>          `Transformation`) would affect the API classes (e.g.,
> > >>>>>>>>>>>>>>>>>>>          `DataStream`), which makes it hard to provide binary
> > >>>>>>>>>>>>>>>>>>>          compatibility.
> > >>>>>>>>>>>>>>>>>>>       - Internal classes are used as parameters / return values of
> > >>>>>>>>>>>>>>>>>>>       public APIs. E.g., while `AbstractStreamOperator` is
> > >>>>>>>>>>>>>>>>>>>       PublicEvolving, `StreamTask`, which is returned from
> > >>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
> > >>>>>>>>>>>>>>>>>>>       - In many cases, users are asked to extend the API classes,
> > >>>>>>>>>>>>>>>>>>>       rather than implementing interfaces. E.g.,
> > >>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator`.
> > >>>>>>>>>>>>>>>>>>>          - Any changes to the base classes, even the internal part,
> > >>>>>>>>>>>>>>>>>>>          may affect the behavior of the user-provided sub-classes
> > >>>>>>>>>>>>>>>>>>>          - Users can override the behavior of the base classes
> > >>>>>>>>>>>>>>>>>>>       - The API module `flink-streaming-java` contains non-API
> > >>>>>>>>>>>>>>>>>>>       classes, and depends on internal modules such as
> > >>>>>>>>>>>>>>>>>>>       `flink-runtime`, which means
> > >>>>>>>>>>>>>>>>>>>          - Changes to the internal modules may affect the API
> > >>>>>>>>>>>>>>>>>>>          modules, which requires users to re-build their
> > >>>>>>>>>>>>>>>>>>>          applications upon upgrading
> > >>>>>>>>>>>>>>>>>>>          - The artifact users need for building their applications
> > >>>>>>>>>>>>>>>>>>>          is larger than necessary.
> > >>>>>>>>>>>>>>>>>>>       - We probably should not expose operators (e.g.,
> > >>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator`) to users. Functions should be enough
> > >>>>>>>>>>>>>>>>>>>       for users to define their data processing logic. Exposing
> > >>>>>>>>>>>>>>>>>>>       operator-level concepts (e.g., mailbox thread model,
> > >>>>>>>>>>>>>>>>>>>       checkpoint barrier alignment, etc.) is unnecessary and, given
> > >>>>>>>>>>>>>>>>>>>       compatibility considerations, limits how much such exposed
> > >>>>>>>>>>>>>>>>>>>       mechanisms can be improved.
> > >>>>>>>>>>>>>>>>>>>       - The current DataStream API seems to be a mixture of many
> > >>>>>>>>>>>>>>>>>>>       things, making it hard to understand especially for
> > >>>>>>>>>>>>>>>>>>>       newcomers. It might be better to re-organize it into several
> > >>>>>>>>>>>>>>>>>>>       parts: (the taxonomy below is just an example; we are still
> > >>>>>>>>>>>>>>>>>>>       working on this)
> > >>>>>>>>>>>>>>>>>>>          - The most fundamental stateful stream processing:
> > >>>>>>>>>>>>>>>>>>>          streams, partitions / key, process functions, state,
> > >>>>>>>>>>>>>>>>>>>          timeline-service
> > >>>>>>>>>>>>>>>>>>>          - An extension for common batch-streaming unified
> > >>>>>>>>>>>>>>>>>>>          functions: map, flatmap, filter, agg, reduce, join, etc.
> > >>>>>>>>>>>>>>>>>>>          - An extension for windowing support: window, triggering
> > >>>>>>>>>>>>>>>>>>>          - An extension for event-time support: event time,
> > >>>>>>>>>>>>>>>>>>>          watermark
> > >>>>>>>>>>>>>>>>>>>          - The extensions are like short-cuts / sugars, without
> > >>>>>>>>>>>>>>>>>>>          which users can probably still achieve the same behavior
> > >>>>>>>>>>>>>>>>>>>          by working with the fundamental APIs, but it would be a
> > >>>>>>>>>>>>>>>>>>>          lot easier with the extensions
> > >>>>>>>>>>>>>>>>>>>    - The original plan was to do in-place refactors / changes on
> > >>>>>>>>>>>>>>>>>>>    the DataStream API. Some related items are listed in this doc [2]
> > >>>>>>>>>>>>>>>>>>>    attached to the kick-off email [3]. Not all of the above issues
> > >>>>>>>>>>>>>>>>>>>    are listed, because at that time we had not looked into this as
> > >>>>>>>>>>>>>>>>>>>    deeply as we have now.
> > >>>>>>>>>>>>>>>>>>>    - We proposed this as a new API rather than in-place refactors in
> > >>>>>>>>>>>>>>>>>>>    the 2.0 work item list, because we realized the changes might be
> > >>>>>>>>>>>>>>>>>>>    too big for an in-place change. First having a new API then
> > >>>>>>>>>>>>>>>>>>>    gradually retiring the old one would help users to smoothly
> > >>>>>>>>>>>>>>>>>>>    migrate between them.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is out. And
> > >>>>>>>>>>>>>>>>>>> of course it's possible that the FLIP might be rejected. Given that we
> > >>>>>>>>>>>>>>>>>>> are planning for release 2.0, I just feel it would be better to bring
> > >>>>>>>>>>>>>>>>>>> this up early even though the concrete plan is not yet ready.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Xintong
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>>>>>>>>>>>>>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > >>>>>>>>>>>>>>>>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gyfora@apache.org> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Hey!
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I share the same concerns mentioned above regarding the
> > >>>>>>>>>>>>>>>>>>>> "ProcessFunction API".
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for the DataStream API
> > >>>>>>>>>>>>>>>>>>>> unless we have a very good reason to do so and with a proper
> > >>>>>>>>>>>>>>>>>>>> discussion about this as Alex said.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>>>>>>> Gyula
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <alexander.fedulov@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Hi Xintong,
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > >>>>>>>>>>>>>>>>>>>>> Introduce an API deprecation process" thread [1]?
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind this new API has
> > >>>>>>>>>>>>>>>>>>>>> been previously discussed on the mailing list. Do we have a list of
> > >>>>>>>>>>>>>>>>>>>>> shortcomings in the current DataStream API that it tries to resolve?
> > >>>>>>>>>>>>>>>>>>>>> How does the current ProcessFunction functionality fit into the
> > >>>>>>>>>>>>>>>>>>>>> picture? Will it be kept as is or subsumed by the new API?
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>> Alex
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <tonysong820@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most headaches because
> > >>>>>>>>>>>>>>>>>>>>>>> it's very unclear what it actually entails; like is it an entirely
> > >>>>>>>>>>>>>>>>>>>>>>> separate API to DataStream (sounds like it is!) or an extension of
> > >>>>>>>>>>>>>>>>>>>>>>> DataStream. How much will it share the internals with DataStream
> > >>>>>>>>>>>>>>>>>>>>>>> etc.; how does it relate to the Table API (w.r.t. switching APIs /
> > >>>>>>>>>>>>>>>>>>>>>>> what Table API uses underneath).
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> I totally understand your confusion. We started planning this after
> > >>>>>>>>>>>>>>>>>>>>>> kicking off the release 2.0, so there's still a lot to be explored
> > >>>>>>>>>>>>>>>>>>>>>> and the plan keeps changing.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>    - In the beginning, we planned to do an in-place refactor of the
> > >>>>>>>>>>>>>>>>>>>>>>    DataStream API, until the API migration period was proposed.
> > >>>>>>>>>>>>>>>>>>>>>>    - Then we wanted to make it an entirely separate API to
> > >>>>>>>>>>>>>>>>>>>>>>    DataStream, and listed it as a must-have for release 2.0 so that
> > >>>>>>>>>>>>>>>>>>>>>>    we can remove DataStream once it's ready.
> > >>>>>>>>>>>>>>>>>>>>>>    - However, depending on the outcome of the API compatibility
> > >>>>>>>>>>>>>>>>>>>>>>    discussion [1], we may not be able to remove DataStream in 2.0
> > >>>>>>>>>>>>>>>>>>>>>>    anyway, which means we might need to re-evaluate the necessity of
> > >>>>>>>>>>>>>>>>>>>>>>    this item for 2.0.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility discussion [1]
> > >>>>>>>>>>>>>>>>>>>>>> and decide the priority for this item afterwards.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Xintong
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <chesnay@apache.org>
> > >>>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> By and large I'm quite happy with the list of items.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State Management" item is
> > >>>>>>>>>>>>>>>>>>>>>>> marked as a must-have; will it require changes that break something?
> > >>>>>>>>>>>>>>>>>>>>>>> What prevents it from being added in 2.1?
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the default,
> > >>>>>>>>>>>>>>>>>>>>>>> drop Java 8/11". Maybe even split it into a must-have "Drop Java 8"
> > >>>>>>>>>>>>>>>>>>>>>>> and a nice-to-have "Drop Java 11"?
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
> > >>>>>>>>>>>>>>>>>>>>>>> would be an entirely internal change, and could thus be an
> > >>>>>>>>>>>>>>>>>>>>>>> incremental process independent of major releases.
> > >>>>>>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much are we actually
> > >>>>>>>>>>>>>>>>>>>>>>> re-writing?
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a must-have; I
> > >>>>>>>>>>>>>>>>>>>>>>> think I marked it down as nice-to-have only because it depends on
> > >>>>>>>>>>>>>>>>>>>>>>> another item.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most headaches because
> > >>>>>>>>>>>>>>>>>>>>>>> it's very unclear what it actually entails; like is it an entirely
> > >>>>>>>>>>>>>>>>>>>>>>> separate API to DataStream (sounds like it is!) or an extension of
> > >>>>>>>>>>>>>>>>>>>>>>> DataStream. How much will it share the internals with DataStream
> > >>>>>>>>>>>>>>>>>>>>>>> etc.; how does it relate to the Table API (w.r.t. switching APIs /
> > >>>>>>>>>>>>>>>>>>>>>>> what Table API uses underneath).
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> There are a few items I added as ideas which don't have a priority
> > >>>>>>>>>>>>>>>>>>>>>>> yet; would love to get some feedback on those.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Hi devs,
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> As previously discussed in [1], we had been collecting work item
> > >>>>>>>>>>>>>>>>>>>>>>> proposals for the 2.0 release until June 15th, on the wiki page [2].
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>    - As we have passed the due date, I'd like to kindly remind
> > >>>>>>>>>>>>>>>>>>>>>>>    everyone *not to add / remove items directly on the wiki page*.
> > >>>>>>>>>>>>>>>>>>>>>>>    If needed, please post in this thread or reach out to the
> > >>>>>>>>>>>>>>>>>>>>>>>    release managers instead.
> > >>>>>>>>>>>>>>>>>>>>>>>    - I've reached out to some folks for clarifications about their
> > >>>>>>>>>>>>>>>>>>>>>>>    proposals. Some of them mentioned that they can not yet tell
> > >>>>>>>>>>>>>>>>>>>>>>>    whether we should do an item or not, and would need more time /
> > >>>>>>>>>>>>>>>>>>>>>>>    discussions to make the decision. So I added a new symbol for
> > >>>>>>>>>>>>>>>>>>>>>>>    items whose priorities are `TBD`.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum set of must-have
> > >>>>>>>>>>>>>>>>>>>>>>> items. I've gone through the entire list of proposed items, and
> > >>>>>>>>>>>>>>>>>>>>>>> found most of them make quite much sense. So I think an online sync
> > >>>>>>>>>>>>>>>>>>>>>>> might not be necessary for this. I'd like to go with this DISCUSS
> > >>>>>>>>>>>>>>>>>>>>>>> thread, where everyone can comment on how they think the list can
> > >>>>>>>>>>>>>>>>>>>>>>> be improved, followed by a VOTE to formally make the decision.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Any feedback and opinions, including but not limited to the
> > >>>>>>>>>>>>>>>>>
>
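
The first DataStream concern listed in the quoted message above (users work with concrete classes, so they can reach methods meant only for internal use, such as `DataStream#getTransformation`) can be illustrated with a small sketch. All names here (`RecordStream`, `RecordStreamImpl`, `RecordStreams`, `describePlan`) are hypothetical and are not Flink classes; this only shows one way an interface-based API can keep internals out of reach, and it is not the design of the FLIP mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical demo; none of these types exist in Flink.
class InterfaceApiDemo {
    public static void main(String[] args) {
        RecordStream<Integer> doubled = RecordStreams.of(1, 2, 3).map(x -> x * 2);
        System.out.println(doubled.collect()); // prints [2, 4, 6]
        // doubled.describePlan(); // would not compile: not declared on RecordStream
    }
}

/** Public API surface: only the operations users are meant to call. */
interface RecordStream<T> {
    <R> RecordStream<R> map(Function<T, R> fn);

    List<T> collect();
}

/** Implementation class; in a real project it would live outside the API module. */
final class RecordStreamImpl<T> implements RecordStream<T> {
    private final List<T> elements;

    RecordStreamImpl(List<T> elements) {
        this.elements = elements;
    }

    @Override
    public <R> RecordStream<R> map(Function<T, R> fn) {
        List<R> out = new ArrayList<>();
        for (T element : elements) {
            out.add(fn.apply(element));
        }
        return new RecordStreamImpl<>(out);
    }

    @Override
    public List<T> collect() {
        return new ArrayList<>(elements);
    }

    /** Internal hook; it can change freely because the interface does not expose it. */
    String describePlan() {
        return "plan with " + elements.size() + " elements";
    }
}

/** Entry point that hands out the interface type only. */
final class RecordStreams {
    private RecordStreams() {}

    @SafeVarargs
    static <T> RecordStream<T> of(T... values) {
        return new RecordStreamImpl<>(List.of(values));
    }
}
```

Because user code only ever holds a `RecordStream`, the implementation class can be changed or replaced without touching anything users compile against, which is the compatibility property the list above is asking for.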

Re: [DISCUSS] Release 2.0 Work Items

Posted by li zhiqiang <li...@gmail.com>.
@Xintong
I already have an idea of some of the API changes needed, but because many changes are involved,
I am afraid my analysis is not yet comprehensive.
I'm willing to do the work, but I haven't found a committer yet.

Best,
Zhiqiang
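
For context on the pluggable-connector proposal in this message: the Flink documentation passage cited in the thread describes plugins as isolated through restricted classloaders that can only see classes which have been specifically whitelisted. A minimal sketch of that idea follows. The class `RestrictedPluginClassLoader` and its constructor parameters are hypothetical names chosen for illustration; this is not Flink's actual plugin loader, only one way such isolation could look.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

// Hypothetical demo; not Flink's plugin mechanism.
class PluginIsolationDemo {
    public static void main(String[] args) throws Exception {
        ClassLoader host = PluginIsolationDemo.class.getClassLoader();
        String hostClass = PluginIsolationDemo.class.getName();

        // No whitelist: the plugin side cannot see classes of the host application.
        try (RestrictedPluginClassLoader isolated =
                new RestrictedPluginClassLoader(new URL[0], host, List.of())) {
            try {
                isolated.loadClass(hostClass);
                System.out.println("host class visible");
            } catch (ClassNotFoundException e) {
                System.out.println("host class hidden from plugin");
            }
        }

        // Whitelisting the name makes the host's copy of the class visible.
        try (RestrictedPluginClassLoader open =
                new RestrictedPluginClassLoader(new URL[0], host, List.of(hostClass))) {
            System.out.println(open.loadClass(hostClass) == PluginIsolationDemo.class);
        }
    }
}

/** Loads whitelisted names from the host loader and everything else from the plugin jars. */
final class RestrictedPluginClassLoader extends URLClassLoader {
    private final ClassLoader hostLoader;
    private final List<String> allowedPrefixes;

    RestrictedPluginClassLoader(URL[] pluginJars, ClassLoader hostLoader, List<String> allowedPrefixes) {
        // A null parent means only bootstrap (JDK core) classes are inherited.
        super(pluginJars, null);
        this.hostLoader = hostLoader;
        this.allowedPrefixes = List.copyOf(allowedPrefixes);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (isAllowed(name)) {
                    // Whitelisted: share the host application's class.
                    loaded = hostLoader.loadClass(name);
                } else {
                    // Everything else: bootstrap classes first, then the plugin's own jars.
                    loaded = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    private boolean isAllowed(String className) {
        for (String prefix : allowedPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With one such loader per connector, two connectors could bundle conflicting versions of the same library without shading, since neither loader sees the other's jars; only the whitelisted API packages would be shared with the host.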

From: Xintong Song <to...@gmail.com>
Date: Thursday, July 13, 2023 10:03
To: dev@flink.apache.org <de...@flink.apache.org>
Subject: Re: [DISCUSS] Release 2.0 Work Items
Thanks for the inputs, Zhiqiang and Jiabao.

@Zhiqiang,
The proposal sounds interesting. Do you already have an idea what API
changes are needed in order to make the connectors pluggable? I think
whether this should go into Flink 2.0 would significantly depend on what
API changes are needed. Moreover, would you like to work on this effort or
simply raise a need? And if you'd like to work on this, have you already found
a committer who can help with this?

@Jiabao,
Thanks for the suggestions. I agree that it would be nice to improve the
experience of deploying Flink instances and submitting tasks. It would be
helpful if you can point out the specific behaviors that make integrating
Flink in your production difficult. Also, I'd like to understand how this
topic is related to the Release 2.0 topic. Or asked differently, is this
something that requires breaking changes that can only happen in major
version bumps, or is it just an improvement that can go into any minor version?


Best,

Xintong



On Thu, Jul 13, 2023 at 12:49 AM Jiabao Sun <ji...@xtransfer.cn.invalid>
wrote:

> Thanks Xintong for driving the effort.
>
>
> I’d add a +1 to improving the out-of-the-box user experience, as suggested
> by @Jark and @Chesnay.
> For beginners, understanding complex configurations is hard work.
>
> In addition, deploying a Flink runtime environment is also a complex
> matter.
> At present, task submission still differs significantly across computing
> resources. Users who build their own data development platform need to
> understand these differences deeply when handling task submission and
> job status checks.
>
> I'm glad to see features like flink-sql-gateway being implemented by the
> community because it makes it easy for users to submit Flink SQL tasks.
> Furthermore, can we provide more unified, out-of-the-box capabilities that
> allow users to quickly bring up a production-ready Flink environment and
> easily integrate Flink into their own data development platform?
>
>
> Best,
> Jiabao
>
>
> > On Jul 12, 2023, at 8:16 PM, zhiqiang li <li...@gmail.com> wrote:
> >
> > I have seen in [1] that connectors, formats, and user code will become
> > pluggable.
> > If the connectors are pluggable, the benefits are obvious, as the
> > conflicts between different jar package versions can be avoided.
> > If you don't use classloader isolation, shading is needed to resolve
> > conflicts, and a lot of development time is wasted.
> > I know that this change may involve a lot of API changes, so I would like
> > to discuss in this email whether we can make changes in Flink 2.0.
> >
> >> Plugins facilitate a strict separation of code through restricted
> >> classloaders.
> >> Plugins cannot access classes from other plugins or from Flink that have
> >> not been specifically whitelisted.
> >> This strict isolation allows plugins to contain conflicting versions of
> >> the same library without the need to relocate classes or to converge to
> >> common versions.
> >> Currently, file systems and metric reporters are pluggable *but in the
> >> future, connectors, formats, and even user code should also be
> >> pluggable.*
> >>
> >
> > [1]
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/
> >
> > Xintong Song <to...@gmail.com> wrote on Tue, Jul 11, 2023 at 18:50:
> >
> >>>
> >>> What we might want to come up with is a summary with each 2.0.0 issue
> >>> on why it should be included or not. That summary is something the
> >>> community could vote on. WDYT? I'm happy to help here.
> >>>
> >>
> >> That sounds great. Thanks for offering the help. I'll also try to go
> >> through the issues, but TBH I'm quite overwhelmed and cannot promise to
> >> get this done very soon. Your help is very much needed.
> >>
> >>
> >> Best,
> >>
> >> Xintong
> >>
> >>
> >>
> >> On Tue, Jul 11, 2023 at 6:08 PM Matthias Pohl
> >> <ma...@aiven.io.invalid> wrote:
> >>
> >>> @Xintong I guess it makes sense. I agree with your conclusions on the
> >>> four mentioned Jira issues.
> >>>
> >>> I just checked any issues that have fixVersion = 2.0.0 [1]. There are a
> >>> few more items that are not affiliated with FLINK-3957 [2]. I guess we
> >>> should find answers for these issues: Either closing them with a reason
> >>> to have a consistent state in Jira or adding them to the feature list as
> >>> part of a separate voting thread (to leave the current vote untouched).
> >>>
> >>> What we might want to come up with is a summary with each 2.0.0 issue
> >>> on why it should be included or not. That summary is something the
> >>> community could vote on. WDYT? I'm happy to help here.
> >>>
> >>> Matthias
> >>>
> >>> [1]
> >>> https://issues.apache.org/jira/browse/FLINK-32437?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%202.0.0%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)%20%20
> >>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> >>>
> >>>
> >>> On Tue, Jul 11, 2023 at 5:01 AM Xintong Song <to...@gmail.com>
> >>> wrote:
> >>>
> >>>> @Zhu,
> >>>> As you are downgrading "Clarify the scopes of configuration options"
> >>>> to nice-to-have priority, could you also bring that up in the vote
> >>>> thread [1]? I'm asking because there are people who already voted on
> >>>> the original list. I think restarting the vote is probably overkill
> >>>> and unnecessary, but we should at least bring this change to their
> >>>> attention.
> >>>>
> >>>> @Matthias,
> >>>> Thanks a lot for bringing this up. I wasn't aware of this early
> >>>> umbrella. I haven't gone through everything in FLINK-3957 yet. I'll do
> >>>> it asap.
> >>>>
> >>>> Just quickly went through the 4 issues you mentioned.
> >>>> - FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as
> >>>> long as the new APIs that we want users to migrate to are ready. For
> >>>> these 2 tickets, I think introduction of the updated APIs should be
> >>>> straightforward and feasible for 1.18.
> >>>> - FLINK-13926: I'm not sure about this one. The two mentioned classes
> >>>> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not
> >>>> even marked as Public or PublicEvolving APIs. Moreover, I don't see a
> >>>> good way to smoothly replace the classes with a generic version.
> >>>> - FLINK-5126: This is a bit unclear to me. From the description and
> >>>> conversation on the ticket, I don't fully understand which concrete
> >>>> APIs the ticket is referring to. Or maybe it refers to all / most of
> >>>> the APIs that throw Exception / IOException in general. Moreover, I
> >>>> don't think removing Exception / IOException from the API signature is
> >>>> a breaking change. It requires no code changes on the caller side.
> >>>>
> >>>> WDYT?
> >>>>
> >>>> Best,
> >>>>
> >>>> Xintong
> >>>>
> >>>>
> >>>> [1] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
> >>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> >>>>
> >>>> On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
> >>>> <ma...@aiven.io.invalid> wrote:
> >>>>
> >>>>> I brought it up in the deprecating APIs in 1.18 thread [1] already but
> >>>>> it feels misplaced there. I just wanted to ask whether someone did a
> >>>>> pass over FLINK-3957 [2]. I came across it when going through the
> >>>>> release 2.0 feature list [3] as part of the vote. I have the feeling
> >>>>> that there are some valid action items (e.g. FLINK-4675, FLINK-5126,
> >>>>> FLINK-13926 [4-6]) which do not seem to be listed in the 2.0 feature
> >>>>> list [3] yet (or are included in some of the bigger items). The
> >>>>> majority of the subtasks are probably covered by the DataSet removal,
> >>>>> the Scala API removal and the ProcessFunction refactoring.
> >>>>> Other subtasks (FLINK-14068 [7]) made it into the feature list.
> >>>>>
> >>>>> I haven't worked with the SDK code enough to judge whether the
> >>>>> subtasks are still reasonable or actually obsolete. That is why I
> >>>>> wanted to mention the Jira issue here once more.
> >>>>>
> >>>>> I don't consider it a blocker for the ongoing vote but was wondering
> >>>>> whether it makes sense for someone who might have more experience in
> >>>>> that field to add some of the subtasks to the feature list.
> >>>>>
> >>>>> Or shall we just consider it as "not interesting enough" because
> >>>>> nobody added it in the first place to the 2.0 feature list [3]?
> >>>>>
> >>>>> Matthias
> >>>>>
> >>>>> [1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
> >>>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> >>>>> [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> >>>>> [4] https://issues.apache.org/jira/browse/FLINK-4675
> >>>>> [5] https://issues.apache.org/jira/browse/FLINK-5126
> >>>>> [6] https://issues.apache.org/jira/browse/FLINK-13926
> >>>>> [7] https://issues.apache.org/jira/browse/FLINK-14068
> >>>>>
> >>>>> On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
> >>>>>
> >>>>>> Agreed that we should deprecate affected APIs as soon as possible.
> >>>>>> But there is not much time before the feature freeze of 1.18, hence
> >>>>>> I'm a bit concerned that some of the deprecations might not be done
> >>>>>> in 1.18.
> >>>>>>
> >>>>>> We are currently looking into the improvements of the configuration
> >>>>>> layer.
> >>>>>> Most of the proposed changes would require a public discussion, or
> >>>>>> even a FLIP, which I think can hardly close before the feature freeze
> >>>>>> of 1.18.
> >>>>>> And some of the APIs can be deprecated only after the corresponding
> >>>>>> new APIs are developed. Therefore we previously targeted them for
> >>>>>> 1.19.
> >>>>>>
> >>>>>> We may review later to see what deprecation work can be done in 1.18
> >>>>>> and make it if possible. I think we can do the work even after the
> >>>>>> feature freeze date, if it is purely deprecation work (simply adding
> >>>>>> annotations). WDYT?
> >>>>>>
> >>>>>> I'm also changing the priority of "Clarify the scopes of
> >>>>>> configuration options" to nice-to-have. I think most of the work is
> >>>>>> not breaking changes and can be done in 1.x or 2.1+. For the breaking
> >>>>>> changes which might be needed, we will consider them as part of the
> >>>>>> configuration layer rework.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Zhu
> >>>>>>
> >>>>>> Xintong Song <to...@gmail.com> wrote on Mon, Jul 10, 2023 at 19:58:
> >>>>>>>
> >>>>>>>>
> >>>>>>>> At what point are the FLIP discussions coming into play?
> >>>>>>>>
> >>>>>>>> I keep wondering if these shouldn't have started already.
> >>>>>>>
> >>>>>>>
> >>>>>>> I think this depends on the responsible contributor and reviewer of
> >>>>>>> individual items. From my perspective, the FLIP discussions can
> >>>>>>> start any time as long as the contributors are ready, the earlier
> >>>>>>> the better.
> >>>>>>>
> >>>>>>>
> >>>>>>>> What we need to ensure is that all breaking API changes are
> >>>>>>>> discussed/decided before 1.18 is released so we can deprecate
> >>>>>>>> affected APIs.
> >>>>>>>>
> >>>>>>>
> >>>>>>> The introduction of the migration period has brought the requirement
> >>>>>>> to plan the removal of public APIs 2 minor releases ahead of the
> >>>>>>> major release, which is TBH a bit unexpected. I agree it would be
> >>>>>>> nice if we can get the FLIPs ready by releasing 1.18. But I also
> >>>>>>> don't think we should rush on it. If the deprecation of a Public API
> >>>>>>> does not make 1.18, we may carry it until 3.0. Or if there are many
> >>>>>>> Public APIs whose deprecation does not make 1.18, we may deprecate
> >>>>>>> them in 1.19 and postpone the major version bump to after a 1.20
> >>>>>>> release. Moreover, as mentioned in FLIP-321[1], exceptions are
> >>>>>>> discussable given that the migration period is newly proposed and we
> >>>>>>> did not give developers the chance to plan things ahead. To sum up,
> >>>>>>> I'd say we try to identify APIs that need to be deprecated in 1.18
> >>>>>>> with best efforts, and evaluate the remaining options (carrying the
> >>>>>>> API for the entire 2.x cycle, postponing 2.0, or making an
> >>>>>>> exception) case-by-case.
> >>>>>>> WDYT?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>>
> >>>>>>> Xintong
> >>>>>>>
> >>>>>>>
> >>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>
> >>>>>>> On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <chesnay@apache.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> At what point are the FLIP discussions coming into play?
> >>>>>>>>
> >>>>>>>> I keep wondering if these shouldn't have started already.
> >>>>>>>> It just seems that a lot of decisions are implicitly reliant on
> >>> the
> >>>>>>>> items even being accepted.
> >>>>>>>> Estimates can only be provided if we actually know the scope of
> >>> the
> >>>>>>>> change, but that's not always clear from the description in the
> >>>> doc.
> >>>>>>>>
> >>>>>>>> What we need to ensure is that all breaking API changes are
> >>>>>>>> discussed/decided before 1.18 is released so we can deprecate
> >>>>> affected
> >>>>>>>> APIs.
> >>>>>>>>
> >>>>>>>> On 10/07/2023 11:32, Xintong Song wrote:
> >>>>>>>>> Hi Matthias,
> >>>>>>>>>
> >>>>>>>>> The questions you asked are indeed very important. Here're
> >> some
> >>>>> quick
> >>>>>>>>> responses, based on the plans I had in mind, which I have not
> >>>>> aligned
> >>>>>>>> with
> >>>>>>>>> other release managers yet.
> >>>>>>>>>
> >>>>>>>>> In the previous discussions between the RMs, we were not able
> >>> to
> >>>>> make
> >>>>>>>>> proposals on things like how to make a time plan, how to
> >> manage
> >>>> the
> >>>>>>>> release
> >>>>>>>>> branch, etc., due to the lack of inputs on e.g., the work
> >> items
> >>>>> need
> >>>>>> to
> >>>>>>>> be
> >>>>>>>>> included (which transitively depends on the API compatibility
> >>> to
> >>>>>> provide
> >>>>>>>>> between major versions) and the workloads / time needed for
> >>> them.
> >>>>>> With
> >>>>>>>> the
> >>>>>>>>> recent discussions, we have collected at least the majority
> >> of
> >>>> the
> >>>>>> inputs
> >>>>>>>>> needed.
> >>>>>>>>>
> >>>>>>>>> Here are things that I think we as the release managers would
> >>> do
> >>>>> next
> >>>>>>>>> (again, not aligned with other release managers yet)
> >>>>>>>>> - Create a time plan by reaching out to people to
> >> understand
> >>>> the
> >>>>>>>>> estimated workloads, prerequisites and ETA of each work item.
> >>>>>>>>> - Make a proposal on how to manage the release branch, i.e.,
> >>> when
> >>>>> to
> >>>>>> cut
> >>>>>>>>> the branch and whether to ship the milestone releases, etc.
> >>>>>>>>> - Set up regular release syncs (bi-weekly / monthly) to
> >> update
> >>>> the
> >>>>>> status
> >>>>>>>>> and draw attention to where help is needed.
> >>>>>>>>>
> >>>>>>>>> So back to your questions.
> >>>>>>>>>
> >>>>>>>>> There are still to-be-discussed items in the list of
> >> features.
> >>>>>> What's the
> >>>>>>>>>> plan with those?
> >>>>>>>>> When collecting ETA, for items that the completion time
> >> cannot
> >>>> yet
> >>>>> be
> >>>>>>>>> estimated, we would like to have at least a time by which the
> >>>>>> estimation
> >>>>>>>>> can be made. I think the same applies to the to-be-discussed
> >>>> items.
> >>>>>> And
> >>>>>>>> if
> >>>>>>>>> the items should be included as must-haves, we would need
> >>> another
> >>>>>> vote to
> >>>>>>>>> adjust the must-have item list.
> >>>>>>>>>
> >>>>>>>>> Some of them don't have anyone assigned.
> >>>>>>>>> My concern is that they will be overlooked because nobody
> >> feels
> >>>> to
> >>>>>> be in
> >>>>>>>>>> charge.
> >>>>>>>>> This is a tricky one. For must-have items without assignees,
> >> we
> >>>> as
> >>>>>> the
> >>>>>>>>> release managers should be responsible for raising them up in
> >>> the
> >>>>>> release
> >>>>>>>>> syncs, and try to find assignees for them. Hopefully, there
> >>> will
> >>>> be
> >>>>>>>> someone
> >>>>>>>>> who steps up. But it is possible that for a must-have item
> >>>> nobody
> >>>>>> wants
> >>>>>>>>> to work on it. If that happens, which I don't think it will,
> >> it
> >>>>>> probably
> >>>>>>>>> means the item is not that critical and we may have to
> >> exclude
> >>> it
> >>>>>> from
> >>>>>>>> the
> >>>>>>>>> release. Either way, they should not be overlooked, because
> >>> IMHO
> >>>>>> release
> >>>>>>>>> managers should be responsible for trying to get someone to
> >>> work
> >>>> on
> >>>>>> the
> >>>>>>>>> un-assigned items.
> >>>>>>>>>
> >>>>>>>>> We'll have more discussions soon and keep the community
> >>> updated.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>>
> >>>>>>>>> Xintong
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> >>>>>>>>> <ma...@aiven.io.invalid> wrote:
> >>>>>>>>>
> >>>>>>>>>> Now that the vote is started on the must-have items: There
> >> are
> >>>>> still
> >>>>>>>>>> to-be-discussed items in the list of features. What's the
> >> plan
> >>>>> with
> >>>>>>>> those?
> >>>>>>>>>> Some of them don't have anyone assigned. Were these items
> >>>>> discussed
> >>>>>>>> among
> >>>>>>>>>> the release managers? So far, it looks like they are handled
> >>> as
> >>>>>>>>>> nice-to-have if someone volunteers to pick them up?
> >>>>>>>>>>
> >>>>>>>>>> My concern is that they will be overlooked because nobody
> >>> feels
> >>>> to
> >>>>>> be in
> >>>>>>>>>> charge.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Matthias
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <
> >>>>> tonysong820@gmail.com
> >>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks all for the discussion.
> >>>>>>>>>>>
> >>>>>>>>>>> The wiki has been updated as discussed. I'm starting a vote
> >>>> now.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>>
> >>>>>>>>>>> Xintong
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <
> >>>>> tonysong820@gmail.com
> >>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>> Hi ConradJam,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think Chesnay has already put his name as the
> >> Contributor
> >>>> for
> >>>>>> the
> >>>>>>>> two
> >>>>>>>>>>>> tasks you listed. Maybe you can reach out to him to see if
> >>> you
> >>>>> can
> >>>>>>>>>>>> collaborate on this.
> >>>>>>>>>>>>
> >>>>>>>>>>>> In general, I don't think contributing to a release 2.0
> >>> issue
> >>>> is
> >>>>>> much
> >>>>>>>>>>>> different from contributing to a regular issue. We haven't
> >>> yet
> >>>>>> created
> >>>>>>>>>>> JIRA
> >>>>>>>>>>>> tickets for all the listed tasks because many of them
> >> need
> >>>>>> further
> >>>>>>>>>>>> discussions and / or FLIPs to decide whether and how they
> >>>> should
> >>>>>> be
> >>>>>>>>>>>> performed.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Xintong
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <
> >>>> jam.gzczy@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Community:
> >>>>>>>>>>>>>   I see some tasks in the 2.0 list that haven't been
> >>>> assigned
> >>>>>> yet. I
> >>>>>>>>>>> want
> >>>>>>>>>>>>> to take the initiative to take on some tasks that I can
> >>>>>> complete. How
> >>>>>>>>>>> do I
> >>>>>>>>>>>>> apply to the community for this part of the task? I am
> >>>>>> interested in
> >>>>>>>>>> the
> >>>>>>>>>>>>> following parts of FLINK-32377
> >>>>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-32377>, do
> >> I
> >>>> need
> >>>>>> to
> >>>>>>>>>>> create
> >>>>>>>>>>>>> an issue myself and assign it to myself?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> - the current timestamp, which is problematic w.r.t.
> >>> caching
> >>>>> and
> >>>>>>>>>>> testing,
> >>>>>>>>>>>>> while providing no value.
> >>>>>>>>>>>>> - Remove JarRequestBody#programArgs in favor of
> >>>>> #programArgsList.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [1] FLINK-32377 <https://issues.apache.org/jira/browse/FLINK-32377>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Fri, Jun 30, 2023 at 00:53, Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks Xintong for driving the effort.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I’d add a +1 to reworking configs, as suggested by @Jark
> >>> and
> >>>>>>>>>> @Chesnay,
> >>>>>>>>>>>>>> especially the types. We have various configs that
> >> encode
> >>>>> Time /
> >>>>>>>>>>>>> MemorySize
> >>>>>>>>>>>>>> that are Long instead!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Hong
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <
> >>> yuanmei.work@gmail.com
> >>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks for driving this effort, Xintong!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> To Chesnay
> >>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> >>> Management"
> >>>>>> item
> >>>>>>>>>> is
> >>>>>>>>>>>>>>>> marked as a must-have; will it require changes that
> >>> break
> >>>>>>>>>>> something?
> >>>>>>>>>>>>>>>> What prevents it from being added in 2.1?
> >>>>>>>>>>>>>>> As to "Disaggregated State Management".
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> We plan to provide a new type of state backend to
> >> support
> >>>> DFS
> >>>>>> as
> >>>>>>>>>>>>> primary
> >>>>>>>>>>>>>>> storage.
> >>>>>>>>>>>>>>> To achieve this, we at least need to include two parts
> >> of
> >>>>>> amends
> >>>>>>>>>>> (not
> >>>>>>>>>>>>>>> entirely sure yet, since we are still in the designing
> >>> and
> >>>>>>>>>> prototype
> >>>>>>>>>>>>>> phase)
> >>>>>>>>>>>>>>> 1. Statebackend Change
> >>>>>>>>>>>>>>> 2. State Access Change
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Not all of the interfaces related are `@Internal`. Some
> >>> of
> >>>>> the
> >>>>>>>>>>>>> interfaces
> >>>>>>>>>>>>>>> like `StateBackend` are `@PublicEvolving`.
> >>>>>>>>>>>>>>> So, you are right in the sense that "Disaggregated
> >> State
> >>>>>>>>>> Management"
> >>>>>>>>>>>>>> itself
> >>>>>>>>>>>>>>> probably does not need to be a "Must Have"
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> But I was hoping changes related to public APIs
> >> can
> >>> be
> >>>>>>>>>>> finalized
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>>> merged in Flink 2.0 (I will fix the wiki accordingly).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I also agree with Jark that 2.0 is a good chance to
> >>> rework
> >>>>> the
> >>>>>>>>>>> default
> >>>>>>>>>>>>>>> value of configurations.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best
> >>>>>>>>>>>>>>> Yuan
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> >>>>>>>>>>> chesnay@apache.org>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>> Something else configuration-related is that there
> >> are a
> >>>>>> bunch of
> >>>>>>>>>>>>>>>> options where the type isn't quite correct (e.g., a
> >>> String
> >>>>>> where
> >>>>>>>>>> it
> >>>>>>>>>>>>>>>> could be an enum, a string where it should be an int
> >> or
> >>>>>>>>>> something).
> >>>>>>>>>>>>>>>> Could do a pass over those as well.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> >>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I think one more thing we need to consider doing in
> >> 2.0
> >>>> is
> >>>>>>>>>>> changing
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> default value of configuration to improve out-of-box
> >>> user
> >>>>>>>>>>>>> experience.
> >>>>>>>>>>>>>>>>> Currently, in order to run a Flink job, users may
> >> need
> >>> to
> >>>>> set
> >>>>>>>>>>>>>>>>> a bunch of configurations, such as minibatch,
> >>> checkpoint
> >>>>>>>>>> interval,
> >>>>>>>>>>>>>>>>> exactly-once,
> >>>>>>>>>>>>>>>>> incremental-checkpoint, etc. It's very verbose and
> >> hard
> >>>> to
> >>>>>> use
> >>>>>>>>>> for
> >>>>>>>>>>>>>>>>> beginners.
> >>>>>>>>>>>>>>>>> Most of them can have a universally applicable value.
> >>>>>> Because
> >>>>>>>>>>>>> changing
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> default value is a breaking change, I think it's
> >> worth
> >>>>>>>>>> considering
> >>>>>>>>>>>>>>>> changing
> >>>>>>>>>>>>>>>>> them in 2.0.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> What do you think?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> >>>>>>>>>>> snuyanzin@gmail.com>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>> Hi Chesnay
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would
> >> hope
> >>>>> that
> >>>>>>>>>> this
> >>>>>>>>>>>>> would
> >>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>> an entirely internal change, and could thus be an
> >>>>>> incremental
> >>>>>>>>>>>>> process
> >>>>>>>>>>>>>>>>>>> independent of major releases.
> >>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much are
> >>> we
> >>>>>>>>>> actually
> >>>>>>>>>>>>>>>>>> re-writing?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks for asking
> >>>>>>>>>>>>>>>>>> Yes, you're right, that should be an internal change.
> >>>>>>>>>>>>>>>>>> Yeah I was also thinking about incremental change
> >>> (rule
> >>>> by
> >>>>>> rule
> >>>>>>>>>>> or
> >>>>>>>>>>>>>>>>>> reasonably small group of rules).
> >>>>>>>>>>>>>>>>>> And yes, this could be an independent (on major
> >>> release)
> >>>>>>>>>> activity
> >>>>>>>>>>>>>>>>>> The problem is actually for children of RelOptRule.
> >>>>>>>>>>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
> >>>>>> mentioned
> >>>>>>>>>>>>>> deprecated
> >>>>>>>>>>>>>>>>>> api.
> >>>>>>>>>>>>>>>>>> There are also children of ConverterRule (50+) which
> >>> do
> >>>>> not
> >>>>>>>>>> have
> >>>>>>>>>>>>> such
> >>>>>>>>>>>>>>>>>> issues.
> >>>>>>>>>>>>>>>>>> Maybe it could be considered as the next step to
> >> have
> >>>> all
> >>>>>> the
> >>>>>>>>>>>>> rules in
> >>>>>>>>>>>>>>>>>> Java.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> >>>>>>>>>>>>> tonysong820@gmail.com>
> >>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Hi Alex & Gyula,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the
> >>> "[DISCUSS]
> >>>>>>>>>> FLIP-321:
> >>>>>>>>>>>>>>>>>> Introduce
> >>>>>>>>>>>>>>>>>>>> an API deprecation process" thread [1]?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just
> >> noticed
> >>> I
> >>>>>> pasted
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> wrong
> >>>>>>>>>>>>>>>>>> url
> >>>>>>>>>>>>>>>>>>> in my previous email. Sorry for the mistake.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind
> >>> this
> >>>>> new
> >>>>>> API
> >>>>>>>>>>> has
> >>>>>>>>>>>>>> been
> >>>>>>>>>>>>>>>>>>>> previously discussed on the mailing list. Do we
> >>> have a
> >>>>>> list
> >>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>> shortcomings
> >>>>>>>>>>>>>>>>>>>> in the current DataStream API that it tries to
> >>>> resolve?
> >>>>>> How
> >>>>>>>>>>> does
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into the
> >>>>>> picture?
> >>>>>>>>>>> Will
> >>>>>>>>>>>>> it
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>> kept
> >>>>>>>>>>>>>>>>>>>> as is or subsumed by new API?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for
> >> the
> >>>>>>>>>> DataStream
> >>>>>>>>>>>>> API
> >>>>>>>>>>>>>>>>>> unless
> >>>>>>>>>>>>>>>>>>>> we have a very good reason to do so and with a
> >>> proper
> >>>>>>>>>>> discussion
> >>>>>>>>>>>>>> about
> >>>>>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>>> as Alex said.
> >>>>>>>>>>>>>>>>>>> The ProcessFunction API which is targeting to
> >> replace
> >>>>>>>>>> DataStream
> >>>>>>>>>>>>> API
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>> still a proposal, not a decision. Sorry for the
> >>>>> confusion,
> >>>>>> I
> >>>>>>>>>>>>> should
> >>>>>>>>>>>>>>>> have
> >>>>>>>>>>>>>>>>>>> been more careful with my words, not giving the
> >>>>> impression
> >>>>>>>>>> that
> >>>>>>>>>>>>> this
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>> something we'll do anyway.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> There will be a FLIP describing the motivations and
> >>>>>> designs in
> >>>>>>>>>>>>>> detail,
> >>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>> the community to discuss and vote on. We are still
> >>>>> working
> >>>>>> on
> >>>>>>>>>>> it.
> >>>>>>>>>>>>>> TBH,
> >>>>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>> is not trivial and we would need more time on it.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Just to quickly share some background:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>    - We see quite some problems with the current
> >>>>>> DataStream
> >>>>>>>>>> APIs
> >>>>>>>>>>>>>>>>>>>       - Users are working with concrete classes
> >>> rather
> >>>>>> than
> >>>>>>>>>>>>>>>> interfaces,
> >>>>>>>>>>>>>>>>>>>       which means
> >>>>>>>>>>>>>>>>>>>       - Users can access methods that are designed
> >>> to
> >>>> be
> >>>>>> used
> >>>>>>>>>> by
> >>>>>>>>>>>>>>>> internal
> >>>>>>>>>>>>>>>>>>>          classes, even though they are annotated
> >>> with
> >>>>>>>>>>> `@Internal`.
> >>>>>>>>>>>>>>>> E.g.,
> >>>>>>>>>>>>>>>>>>>          `DataStream#getTransformation`.
> >>>>>>>>>>>>>>>>>>>          - Changes to the non-API implementations
> >>>> (e.g.,
> >>>>>>>>>>>>>>>>>> `Transformation`)
> >>>>>>>>>>>>>>>>>>>          would affect the API classes (e.g.,
> >>>>>> `DataStream`),
> >>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>> makes it hard to
> >>>>>>>>>>>>>>>>>>>          provide binary compatibility.
> >>>>>>>>>>>>>>>>>>>       - Internal classes are used as parameter /
> >>>>>> return-value
> >>>>>>>>>> of
> >>>>>>>>>>>>>>>> public
> >>>>>>>>>>>>>>>>>>>       APIs. E.g., while `AbstractStreamOperator`
> >> is
> >>>>>>>>>>>>> PublicEvolving,
> >>>>>>>>>>>>>>>>>>> `StreamTask`
> >>>>>>>>>>>>>>>>>>>       which returns from
> >>>>>>>>>>>>> `AbstractStreamOperator#getContainingTask`
> >>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>> Internal.
> >>>>>>>>>>>>>>>>>>>       - In many cases, users are asked to extend
> >> the
> >>>> API
> >>>>>>>>>>> classes,
> >>>>>>>>>>>>>>>> rather
> >>>>>>>>>>>>>>>>>>>       than implementing interfaces. E.g.,
> >>>>>>>>>>>>> `AbstractStreamOperator`.
> >>>>>>>>>>>>>>>>>>>          - Any changes to the base classes, even
> >> the
> >>>>>> internal
> >>>>>>>>>>>>> part,
> >>>>>>>>>>>>>>>> may
> >>>>>>>>>>>>>>>>>>>          affect the behavior of the user-provided
> >>>>>> sub-classes
> >>>>>>>>>>>>>>>>>>>          - Users can override the behavior of the
> >>> base
> >>>>>> classes
> >>>>>>>>>>>>>>>>>>>       - The API module `flink-streaming-java`
> >>> contains
> >>>>>> non-API
> >>>>>>>>>>>>>>>> classes,
> >>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>       depends on internal modules such as
> >>>>> `flink-runtime`,
> >>>>>>>>>> which
> >>>>>>>>>>>>>> means
> >>>>>>>>>>>>>>>>>>>       - Changes to the internal modules may affect
> >>> the
> >>>>> API
> >>>>>>>>>>>>> modules,
> >>>>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>>          requires users to re-build their
> >>> applications
> >>>>>> upon
> >>>>>>>>>>>>> upgrading
> >>>>>>>>>>>>>>>>>>>          - The artifact users need for building
> >>> their
> >>>>>
> >>>>>>>>>>> application
> >>>>>>>>>>>>>>>> is larger
> >>>>>>>>>>>>>>>>>>>       - We probably should not expose operators
> >>> (e.g.,
> >>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator`) to users.
> >> Functions
> >>>>>> should be
> >>>>>>>>>>>>> enough
> >>>>>>>>>>>>>>>>>>> for users to
> >>>>>>>>>>>>>>>>>>>       define their data processing logic.
> >> Exposing
> >>>>>>>>>>> operator-level
> >>>>>>>>>>>>>>>>>> concepts
> >>>>>>>>>>>>>>>>>>>       (e.g., mailbox thread model, checkpoint
> >>> barrier
> >>>>>>>>>> alignment,
> >>>>>>>>>>>>>>>> etc.) is
> >>>>>>>>>>>>>>>>>>>       unnecessary and limits the improvement
> >>> regarding
> >>>>>> such
> >>>>>>>>>>>>> exposed
> >>>>>>>>>>>>>>>>>>> mechanisms
> >>>>>>>>>>>>>>>>>>>       with compatibility considerations.
> >>>>>>>>>>>>>>>>>>>       - The current DataStream API seems to be a
> >>>> mixture
> >>>>>> of
> >>>>>>>>>> many
> >>>>>>>>>>>>>>>> things,
> >>>>>>>>>>>>>>>>>>>       making it hard to understand especially for
> >>>>>> newcomers.
> >>>>>>>>>> It
> >>>>>>>>>>>>> might
> >>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>> better
> >>>>>>>>>>>>>>>>>>>       to re-organize it into several parts: (the
> >>>>> taxonomy
> >>>>>>>>>> below
> >>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>> just
> >>>>>>>>>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>>       example; we are still working on
> >> this)
> >>>>>>>>>>>>>>>>>>>          - The most fundamental stateful stream
> >>>>>> processing:
> >>>>>>>>>>>>> streams,
> >>>>>>>>>>>>>>>>>>>          partitions / key, process functions,
> >> state,
> >>>>>>>>>>>>> timeline-service
> >>>>>>>>>>>>>>>>>>>          - An extension for common batch-streaming
> >>>>> unified
> >>>>>>>>>>>>> functions:
> >>>>>>>>>>>>>>>>>> map,
> >>>>>>>>>>>>>>>>>>>          flatmap, filter, agg, reduce, join, etc.
> >>>>>>>>>>>>>>>>>>>          - An extension for windowing supports:
> >>>> window,
> >>>>>>>>>>>>> triggering
> >>>>>>>>>>>>>>>>>>>          - An extension for event-time supports:
> >>> event
> >>>>>> time,
> >>>>>>>>>>>>>> watermark
> >>>>>>>>>>>>>>>>>>>          - The extensions are like short-cuts /
> >>>> sugars,
> >>>>>>>>>> without
> >>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>> users
> >>>>>>>>>>>>>>>>>>>          can probably still achieve the same
> >>> behavior
> >>>> by
> >>>>>>>>>> working
> >>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>          fundamental APIs, but would be a lot
> >> easier
> >>>>> with
> >>>>>> the
> >>>>>>>>>>>>>>>> extensions
> >>>>>>>>>>>>>>>>>>>       - The original plan was to do in-place
> >>>> refactors /
> >>>>>>>>>> changes
> >>>>>>>>>>>>> on
> >>>>>>>>>>>>>>>>>>>    DataStream API. Some related items are listed
> >> in
> >>>> this
> >>>>>> doc
> >>>>>>>>>> [2]
> >>>>>>>>>>>>>>>> attached
> >>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>    the kicking off email [3]. Not all of the above
> >>>>> issues
> >>>>>> are
> >>>>>>>>>>>>> listed,
> >>>>>>>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>>>>    we haven't looked into this as deeply as now
> >> by
> >>>> that
> >>>>>> time.
> >>>>>>>>>>>>>>>>>>>    - We proposed this as a new API rather than
> >>>> in-place
> >>>>>>>>>>> refactors
> >>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>    2.0 work item list, because we realized the
> >>> changes
> >>>>>> might
> >>>>>>>>>> be
> >>>>>>>>>>>>> too
> >>>>>>>>>>>>>>>> big
> >>>>>>>>>>>>>>>>>>> for an
> >>>>>>>>>>>>>>>>>>>    in-place change. First having a new API then
> >>>>> gradually
> >>>>>>>>>>> retiring
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> old
> >>>>>>>>>>>>>>>>>>> one
> >>>>>>>>>>>>>>>>>>>    would help users to smoothly migrate between
> >>> them.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> A thorough discussion is definitely needed once the
> >>>> FLIP
> >>>>> is
> >>>>>>>>>> out.
> >>>>>>>>>>>>> And
> >>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>> course it's possible that the FLIP might be
> >> rejected.
> >>>>> Given
> >>>>>>>>>> that
> >>>>>>>>>>>>> we
> >>>>>>>>>>>>>> are
> >>>>>>>>>>>>>>>>>>> planning for release 2.0, I just feel it would be
> >>>> better
> >>>>> to
> >>>>>>>>>>> bring
> >>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>> up
> >>>>>>>>>>>>>>>>>>> early, even if the concrete plan is not yet ready.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Xintong
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>>>>>>>>>>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> >>>>>>>>>>>>>>>>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
> >>>>>> gyfora@apache.org
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>> Hey!
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I share the same concerns mentioned above
> >> regarding
> >>>> the
> >>>>>>>>>>>>>>>>>> "ProcessFunction
> >>>>>>>>>>>>>>>>>>>> API".
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for
> >> the
> >>>>>>>>>> DataStream
> >>>>>>>>>>>>> API
> >>>>>>>>>>>>>>>>>>> unless
> >>>>>>>>>>>>>>>>>>>> we have a very good reason to do so and with a
> >>> proper
> >>>>>>>>>>> discussion
> >>>>>>>>>>>>>> about
> >>>>>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>>> as Alex said.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>> Gyula
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander
> >> Fedulov <
> >>>>>>>>>>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Hi Xintong,
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the
> >>>> "[DISCUSS]
> >>>>>>>>>>> FLIP-321:
> >>>>>>>>>>>>>>>>>>>> Introduce
> >>>>>>>>>>>>>>>>>>>>> an API deprecation process" thread [1]?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind
> >>>> this
> >>>>>> new
> >>>>>>>>>> API
> >>>>>>>>>>>>> has
> >>>>>>>>>>>>>>>>>> been
> >>>>>>>>>>>>>>>>>>>>> previously discussed on the mailing list. Do we
> >>> have
> >>>> a
> >>>>>> list
> >>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>>> shortcomings
> >>>>>>>>>>>>>>>>>>>>> in the current DataStream API that it tries to
> >>>> resolve?
> >>>>>> How
> >>>>>>>>>>> does
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into
> >> the
> >>>>>> picture?
> >>>>>>>>>>>>> Will it
> >>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>> kept
> >>>>>>>>>>>>>>>>>>>>> as is or subsumed by new API?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>
> >>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>> Alex
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> >>>>>>>>>>>>> tonysong820@gmail.com>
> >>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
> >>> most
> >>>>>>>>>> headaches
> >>>>>>>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>>>>>> it's
> >>>>>>>>>>>>>>>>>>>>>>> very unclear what it actually entails; like is
> >> it
> >>>> an
> >>>>>>>>>>> entirely
> >>>>>>>>>>>>>>>>>>>> separate
> >>>>>>>>>>>>>>>>>>>>>> API
> >>>>>>>>>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
> >>> extension
> >>>> of
> >>>>>>>>>>>>> DataStream.
> >>>>>>>>>>>>>>>>>>> How
> >>>>>>>>>>>>>>>>>>>>>> much
> >>>>>>>>>>>>>>>>>>>>>>> will it share the internals with DataStream
> >> etc.;
> >>>> how
> >>>>>> does
> >>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>> relate
> >>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
> >> API
> >>>>> uses
> >>>>>>>>>>>>>>>>>> underneath).
> >>>>>>>>>>>>>>>>>>>>>> I totally understand your confusion. We started
> >>>>> planning
> >>>>>>>>>> this
> >>>>>>>>>>>>>> after
> >>>>>>>>>>>>>>>>>>>>> kicking
> >>>>>>>>>>>>>>>>>>>>>> off the release 2.0, so there's still a lot to
> >> be
> >>>>>> explored
> >>>>>>>>>>> and
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>> plan
> >>>>>>>>>>>>>>>>>>>>>> keeps changing.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>    - In the beginning, we planned to do an
> >>> in-place
> >>>>>>>>>> refactor
> >>>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>>>> DataStream
> >>>>>>>>>>>>>>>>>>>>>>    API, until the API migration period was
> >>> proposed.
> >>>>>>>>>>>>>>>>>>>>>>    - Then we wanted to make it an entirely
> >> separate
> >>>> API
> >>>>>> to
> >>>>>>>>>>>>>>>>>> DataStream,
> >>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>    listed it as a must-have for release 2.0 so
> >> that
> >>> we
> >>>>> can
> >>>>>>>>>>> remove
> >>>>>>>>>>>>>>>>>>>> DataStream
> >>>>>>>>>>>>>>>>>>>>>> once
> >>>>>>>>>>>>>>>>>>>>>>    it's ready.
> >>>>>>>>>>>>>>>>>>>>>>    - However, depending on the outcome of the
> >> API
> >>>>>>>>>>> compatibility
> >>>>>>>>>>>>>>>>>>>>> discussion
> >>>>>>>>>>>>>>>>>>>>>>    [1], we may not be able to remove DataStream
> >>> in
> >>>>> 2.0
> >>>>>>>>>>> anyway,
> >>>>>>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>>>> means
> >>>>>>>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>    might need to re-evaluate the necessity of
> >>> this
> >>>>>> item for
> >>>>>>>>>>>>> 2.0.
> >>>>>>>>>>>>>>>>>>>>>> I'd say we wait a bit longer for the
> >> compatibility
> >>>>>>>>>> discussion
> >>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>> decide the priority for this item afterwards.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Xintong
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>> https://lists.apache.org/list.html?dev@flink.apache.org
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay
> >> Schepler <
> >>>>>>>>>>>>>>>>>> chesnay@apache.org
> >>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of
> >>>> items.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> >>>>>> Management"
> >>>>>>>>>>>>> item
> >>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>> marked
> >>>>>>>>>>>>>>>>>>>>>>> as a must-have; will it require changes that
> >>> break
> >>>>>>>>>>> something?
> >>>>>>>>>>>>>>>>>> What
> >>>>>>>>>>>>>>>>>>>>>> prevents
> >>>>>>>>>>>>>>>>>>>>>>> it from being added in 2.1?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make
> >>>> Java
> >>>>> 17
> >>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>> default,
> >>>>>>>>>>>>>>>>>>>>> drop
> >>>>>>>>>>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a
> >> must-have
> >>>>> "Drop
> >>>>>>>>>> Java
> >>>>>>>>>>> 8"
> >>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I
> >> would
> >>>> hope
> >>>>>> that
> >>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>> would
> >>>>>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>>>>> an entirely internal change, and could thus be
> >> an
> >>>>>>>>>>> incremental
> >>>>>>>>>>>>>>>>>>> process
> >>>>>>>>>>>>>>>>>>>>>>> independent of major releases.
> >>>>>>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much
> >>> are
> >>>>> we
> >>>>>>>>>>>>> actually
> >>>>>>>>>>>>>>>>>>>>>> re-writing?
> >>>>>>>>>>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise
> >> this
> >>>> to
> >>>>> a
> >>>>>>>>>>>>>>>>>> must-have; I
> >>>>>>>>>>>>>>>>>>>>> think
> >>>>>>>>>>>>>>>>>>>>>>> I marked it down as nice-to-have only because
> >> it
> >>>>>> depends
> >>>>>>>>>> on
> >>>>>>>>>>>>>>>>>> another
> >>>>>>>>>>>>>>>>>>>>> item.
> >>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
> >>> most
> >>>>>>>>>> headaches
> >>>>>>>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>>>>>> it's
> >>>>>>>>>>>>>>>>>>>>>>> very unclear what it actually entails; like is
> >> it
> >>>> an
> >>>>>>>>>>> entirely
> >>>>>>>>>>>>>>>>>>>> separate
> >>>>>>>>>>>>>>>>>>>>>> API
> >>>>>>>>>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
> >>> extension
> >>>> of
> >>>>>>>>>>>>> DataStream.
> >>>>>>>>>>>>>>>>>>> How
> >>>>>>>>>>>>>>>>>>>>>> much
> >>>>>>>>>>>>>>>>>>>>>>> will it share the internals with DataStream
> >> etc.;
> >>>> how
> >>>>>> does
> >>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>> relate
> >>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
> >> API
> >>>>> uses
> >>>>>>>>>>>>>>>>>> underneath).
> >>>>>>>>>>>>>>>>>>>>>>> There are a few items I added as ideas which
> >>> don't
> >>>>>> have a
> >>>>>>>>>>>>>>>>>> priority
> >>>>>>>>>>>>>>>>>>>> yet;
> >>>>>>>>>>>>>>>>>>>>>>> would love to get some feedback on those.
> >>>>>>>>>>>>>>>>>>>>>>>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
Thanks for the inputs, Zhiqiang and Jiabao.

@Zhiqiang,
The proposal sounds interesting. Do you already have an idea of what API
changes are needed to make the connectors pluggable? I think whether this
should go into Flink 2.0 depends significantly on what API changes are
needed. Moreover, would you like to work on this effort yourself, or are
you simply raising a need? And if you'd like to work on it, have you
already found a committer who can help with this?
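
To make the classloader-isolation point concrete, here is a minimal Java
sketch of a restricted plugin classloader: classes under whitelisted
package prefixes are delegated to the parent loader, everything else has to
come from the plugin's own jars. The names `RestrictedPluginLoader`,
`PluginIsolationSketch` and the whitelist are hypothetical and only for
illustration; this is not Flink's actual plugin loader, and it says nothing
about which connector APIs would have to change.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

/** Hypothetical demo driver; not part of Flink. */
class PluginIsolationSketch {

    /** Returns true if the loader can resolve the class name, false otherwise. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader framework = PluginIsolationSketch.class.getClassLoader();
        // No plugin jars in this demo; only JDK classes are whitelisted.
        RestrictedPluginLoader plugin =
                new RestrictedPluginLoader(new URL[0], framework, List.of("java."));

        // Whitelisted JDK class: visible through the parent loader.
        System.out.println(canLoad(plugin, "java.util.ArrayList"));
        // Framework-side class that is not whitelisted: hidden from the plugin.
        System.out.println(canLoad(plugin, PluginIsolationSketch.class.getName()));
    }
}

/** Hypothetical loader that sees its own jars plus a whitelist of parent packages. */
class RestrictedPluginLoader extends URLClassLoader {

    // Package prefixes the plugin may load from the parent (framework) loader.
    private final List<String> allowedParentPrefixes;

    RestrictedPluginLoader(URL[] pluginJars, ClassLoader parent, List<String> allowedParentPrefixes) {
        super(pluginJars, parent);
        this.allowedParentPrefixes = List.copyOf(allowedParentPrefixes);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (isAllowedFromParent(name)) {
                    // Whitelisted: regular parent-first delegation.
                    return super.loadClass(name, resolve);
                }
                // Everything else must come from the plugin's own jars;
                // findClass throws ClassNotFoundException if it is not there.
                loaded = findClass(name);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    private boolean isAllowedFromParent(String className) {
        for (String prefix : allowedParentPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With such a loader, two plugins could each ship their own version of a
library without shading, because neither sees the other's classes nor the
non-whitelisted classes of the framework.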

@Jiabao,
Thanks for the suggestions. I agree that it would be nice to improve the
experience of deploying Flink instances and submitting tasks. It would be
helpful if you could point out the specific behaviors that make integrating
Flink into your production environment difficult. Also, I'd like to
understand how this topic relates to Release 2.0. Put differently, is this
something that requires breaking changes, which can only happen in major
version bumps, or is it just an improvement that can go into any minor
version?


Best,

Xintong



On Thu, Jul 13, 2023 at 12:49 AM Jiabao Sun <ji...@xtransfer.cn.invalid>
wrote:

> Thanks Xintong for driving the effort.
>
>
> I’d add a +1 to improving the out-of-box user experience, as suggested by
> @Jark and @Chesnay.
> For beginners, understanding complex configurations is hard work.
>
> In addition, deploying a Flink runtime environment is itself a complex
> matter.
> At present, task submission still differs considerably across different
> computing resources. Users who build their own data development platform
> need to understand these differences in depth when handling task
> submission and job status checks.
>
> I'm glad to see features like flink-sql-gateway being implemented by the
> community because it makes it easy for users to submit Flink SQL tasks.
> Furthermore, can we provide more unified, out-of-the-box capabilities that
> allow users to quickly bring up a production-ready Flink environment and
> easily integrate Flink into their own data development platform?
>
>
> Best,
> Jiabao
>
>
> > On Jul 12, 2023, at 8:16 PM, zhiqiang li <li...@gmail.com> wrote:
> >
> > I have seen in [1] that connectors, formats, and even user code are
> > expected to become pluggable.
> > If the connectors are pluggable, the benefits are obvious, as conflicts
> > between different versions of the same jar can be avoided.
> > Without classloader isolation, shading is needed to resolve conflicts,
> > and a lot of development time is wasted.
> > I know that this change may involve a lot of API changes, so I would like
> > to discuss in this email whether we can make the change in Flink 2.0.
> > The documentation [1] says:
> >
> >> Plugins facilitate a strict separation of code through restricted
> >> classloaders.
> >> Plugins cannot access classes from other plugins or from Flink that have
> >> not been specifically whitelisted.
> >> This strict isolation allows plugins to contain conflicting versions of
> >> the same library without the need to relocate classes or to converge to
> >> common versions.
> >> Currently, file systems and metric reporters are pluggable *but in the
> >> future, connectors, formats, and even user code should also be
> >> pluggable.*
> >
> > [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/
> >
> > On Tue, Jul 11, 2023 at 6:50 PM Xintong Song <to...@gmail.com> wrote:
> >
> >>>
> >>> What we might want to come up with is a summary with each 2.0.0 issue
> >>> on why it should be included or not. That summary is something the
> >>> community could vote on. WDYT? I'm happy to help here.
> >>>
> >>
> >> That sounds great. Thanks for offering the help. I'll also try to go
> >> through the issues, but TBH I'm quite overwhelmed and cannot promise to
> >> get this done very soon. Your help is very much needed.
> >>
> >>
> >> Best,
> >>
> >> Xintong
> >>
> >>
> >>
> >> On Tue, Jul 11, 2023 at 6:08 PM Matthias Pohl
> >> <ma...@aiven.io.invalid> wrote:
> >>
> >>> @Xintong I guess it makes sense. I agree with your conclusions on the
> >>> four mentioned Jira issues.
> >>>
> >>> I just checked the issues that have fixVersion = 2.0.0 [1]. There are a
> >>> few more items that are not affiliated with FLINK-3957 [2]. I guess we
> >>> should find answers for these issues: either closing them with a reason,
> >>> to have a consistent state in Jira, or adding them to the feature list
> >>> as part of a separate voting thread (to leave the current vote
> >>> untouched).
> >>>
> >>> What we might want to come up with is a summary with each 2.0.0 issue
> >>> on why it should be included or not. That summary is something the
> >>> community could vote on. WDYT? I'm happy to help here.
> >>>
> >>> Matthias
> >>>
> >>> [1]
> >>> https://issues.apache.org/jira/browse/FLINK-32437?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%202.0.0%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)%20%20
> >>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> >>>
> >>>
> >>> On Tue, Jul 11, 2023 at 5:01 AM Xintong Song <to...@gmail.com>
> >>> wrote:
> >>>
> >>>> @Zhu,
> >>>> As you are downgrading "Clarify the scopes of configuration options"
> >>>> to nice-to-have priority, could you also bring that up in the vote
> >>>> thread [1]? I'm asking because there are people who have already voted
> >>>> on the original list. I think restarting the vote is probably overkill
> >>>> and unnecessary, but we should at least bring this change to their
> >>>> attention.
> >>>>
> >>>> @Matthias,
> >>>> Thanks a lot for bringing this up. I wasn't aware of this early
> >>>> umbrella. I haven't gone through everything in FLINK-3957 yet. I'll do
> >>>> it asap.
> >>>>
> >>>> Just quickly went through the 4 issues you mentioned.
> >>>> - FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as
> >>>> long as the new APIs that we want users to migrate to are ready. For
> >>>> these 2 tickets, I think introduction of the updated APIs should be
> >>>> straightforward and feasible for 1.18.
> >>>> - FLINK-13926: I'm not sure about this one. The two mentioned classes
> >>>> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not
> >>>> even marked as Public or PublicEvolving APIs. Moreover, I don't see a
> >>>> good way to smoothly replace the classes with a generic version.
> >>>> - FLINK-5126: This is a bit unclear to me. From the description and
> >>>> conversation on the ticket, I don't fully understand which concrete
> >>>> APIs the ticket is referring to. Or maybe it refers to all / most of
> >>>> the APIs that throw Exception / IOException in general. Moreover, I
> >>>> don't think removing Exception / IOException from the API signature is
> >>>> a breaking change. It requires no code changes on the caller side.
> >>>>
> >>>> WDYT?
> >>>>
> >>>> Best,
> >>>>
> >>>> Xintong
> >>>>
> >>>>
> >>>> [1] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
> >>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> >>>>
> >>>> On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
> >>>> <ma...@aiven.io.invalid> wrote:
> >>>>
> >>>>> I brought it up in the deprecating APIs in 1.18 thread [1] already,
> >>>>> but it feels misplaced there. I just wanted to ask whether someone did
> >>>>> a pass over FLINK-3957 [2]. I came across it when going through the
> >>>>> release 2.0 feature list [3] as part of the vote. I have the feeling
> >>>>> that there are some valid action items (e.g. FLINK-4675, FLINK-5126,
> >>>>> FLINK-13926 [4-6]) which do not seem to be listed in the 2.0 feature
> >>>>> list [3] yet (or are included in some of the bigger items). The
> >>>>> majority of the subtasks are probably covered by the DataSet removal,
> >>>>> the Scala API removal and the ProcessFunction refactoring. Other
> >>>>> subtasks (FLINK-14068 [7]) made it into the feature list.
> >>>>>
> >>>>> I haven't worked with the SDK code enough to judge whether the
> >>>>> subtasks are still reasonable or actually obsolete. That is why I
> >>>>> wanted to mention the Jira issue here once more.
> >>>>>
> >>>>> I don't consider it a blocker for the ongoing vote but was wondering
> >>>>> whether it makes sense for someone who might have more experience in
> >>>>> that field to add some of the subtasks to the feature list.
> >>>>>
> >>>>> Or shall we just consider it as "not interesting enough" because
> >>>>> nobody added it to the 2.0 feature list [3] in the first place?
> >>>>>
> >>>>> Matthias
> >>>>>
> >>>>> [1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
> >>>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
> >>>>> [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> >>>>> [4] https://issues.apache.org/jira/browse/FLINK-4675
> >>>>> [5] https://issues.apache.org/jira/browse/FLINK-5126
> >>>>> [6] https://issues.apache.org/jira/browse/FLINK-13926
> >>>>> [7] https://issues.apache.org/jira/browse/FLINK-14068
> >>>>>
> >>>>> On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
> >>>>>
> >>>>>> Agreed that we should deprecate affected APIs as soon as possible.
> >>>>>> But there is not much time before the feature freeze of 1.18, hence
> >>>>>> I'm a bit concerned that some of the deprecations might not be done
> >>>>>> in 1.18.
> >>>>>>
> >>>>>> We are currently looking into the improvements of the configuration
> >>>>>> layer. Most of the proposed changes would require a public
> >>>>>> discussion, or even a FLIP, which I think can hardly close before the
> >>>>>> feature freeze of 1.18. And some of the APIs can be deprecated only
> >>>>>> after the corresponding new APIs are developed. Therefore we
> >>>>>> previously targeted them for 1.19.
> >>>>>>
> >>>>>> We may review later to see what deprecation work can be done in 1.18
> >>>>>> and make it if possible. I think we can do the work even after the
> >>>>>> feature freeze date, if it is purely deprecation work (simply adding
> >>>>>> annotations). WDYT?
> >>>>>>
> >>>>>> I'm also changing the priority of "Clarify the scopes of
> >>>>>> configuration options" to nice-to-have. I think most of the work does
> >>>>>> not involve breaking changes and can be done in 1.x or 2.1+. For the
> >>>>>> breaking changes which might be needed, we will consider them as part
> >>>>>> of the configuration layer rework.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Zhu
> >>>>>>
> >>>>>> On Mon, Jul 10, 2023 at 7:58 PM Xintong Song <to...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>> At what point are the FLIP discussions coming into play?
> >>>>>>>>
> >>>>>>>> I keep wondering if these shouldn't have started already.
> >>>>>>>
> >>>>>>> I think this depends on the responsible contributor and reviewer of
> >>>>>>> individual items. From my perspective, the FLIP discussions can
> >>>>>>> start any time as long as the contributors are ready, the earlier
> >>>>>>> the better.
> >>>>>>>
> >>>>>>>> What we need to ensure is that all breaking API changes are
> >>>>>>>> discussed/decided before 1.18 is released so we can deprecate
> >>>>>>>> affected APIs.
> >>>>>>>
> >>>>>>> The introduction of the migration period has brought the requirement
> >>>>>>> to plan the removal of public APIs 2 minor releases ahead of the
> >>>>>>> major release, which is TBH a bit unexpected. I agree it would be
> >>>>>>> nice if we can get the FLIPs ready by the 1.18 release. But I also
> >>>>>>> don't think we should rush on it. If the deprecation of a Public API
> >>>>>>> does not make 1.18, we may carry it until 3.0. Or if there are many
> >>>>>>> Public APIs whose deprecation does not make 1.18, we may deprecate
> >>>>>>> them in 1.19 and postpone the major version bump to after a 1.20
> >>>>>>> release. Moreover, as mentioned in FLIP-321 [1], exceptions are
> >>>>>>> discussable given that the migration period is newly proposed and we
> >>>>>>> did not give developers the chance to plan things ahead. To sum up,
> >>>>>>> I'd say we try to identify APIs that need to be deprecated in 1.18
> >>>>>>> on a best-effort basis, and evaluate the remaining options (carrying
> >>>>>>> the API for the entire 2.x cycle, postponing 2.0, or making an
> >>>>>>> exception) case by case. WDYT?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>>
> >>>>>>> Xintong
> >>>>>>>
> >>>>>>>
> >>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>
> >>>>>>> On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <chesnay@apache.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> At what point are the FLIP discussions coming into play?
> >>>>>>>>
> >>>>>>>> I keep wondering if these shouldn't have started already.
> >>>>>>>> It just seems that a lot of decisions are implicitly reliant on the
> >>>>>>>> items even being accepted.
> >>>>>>>> Estimates can only be provided if we actually know the scope of the
> >>>>>>>> change, but that's not always clear from the description in the doc.
> >>>>>>>>
> >>>>>>>> What we need to ensure is that all breaking API changes are
> >>>>>>>> discussed/decided before 1.18 is released so we can deprecate
> >>>>>>>> affected APIs.
> >>>>>>>>
> >>>>>>>> On 10/07/2023 11:32, Xintong Song wrote:
> >>>>>>>>> Hi Matthias,
> >>>>>>>>>
> >>>>>>>>> The questions you asked are indeed very important. Here are some
> >>>>>>>>> quick responses, based on the plans I had in mind, which I have
> >>>>>>>>> not aligned with other release managers yet.
> >>>>>>>>>
> >>>>>>>>> In the previous discussions between the RMs, we were not able to
> >>>>>>>>> make proposals on things like how to make a time plan, how to
> >>>>>>>>> manage the release branch, etc., due to the lack of inputs on,
> >>>>>>>>> e.g., the work items that need to be included (which transitively
> >>>>>>>>> depends on the API compatibility to provide between major
> >>>>>>>>> versions) and the workloads / time needed for them. With the
> >>>>>>>>> recent discussions, we have collected at least the majority of the
> >>>>>>>>> inputs needed.
> >>>>>>>>>
> >>>>>>>>> Here are things that I think we as the release managers would do
> >>>>>>>>> next (again, not aligned with other release managers yet):
> >>>>>>>>> - Create a time plan, by reaching out to people to understand the
> >>>>>>>>> estimated workloads, prerequisites and ETA of each work item.
> >>>>>>>>> - Make a proposal on how to manage the release branch, i.e., when
> >>>>>>>>> to cut the branch and whether to ship the milestone releases, etc.
> >>>>>>>>> - Set up regular release syncs (bi-weekly / monthly) to update the
> >>>>>>>>> status and draw attention to where help is needed.
> >>>>>>>>>
> >>>>>>>>> So back to your questions.
> >>>>>>>>>
> >>>>>>>>>> There are still to-be-discussed items in the list of features.
> >>>>>>>>>> What's the plan with those?
> >>>>>>>>>
> >>>>>>>>> When collecting ETA, for items whose completion time cannot yet be
> >>>>>>>>> estimated, we would like to have at least a time by which the
> >>>>>>>>> estimation can be made. I think the same applies to the
> >>>>>>>>> to-be-discussed items. And if the items should be included as
> >>>>>>>>> must-haves, we would need another vote to adjust the must-have
> >>>>>>>>> item list.
> >>>>>>>>>
> >>>>>>>>>> Some of them don't have anyone assigned.
> >>>>>>>>>> My concern is that they will be overlooked because nobody feels
> >>>>>>>>>> to be in charge.
> >>>>>>>>>
> >>>>>>>>> This is a tricky one. For must-have items without assignees, we as
> >>>>>>>>> the release managers should be responsible for raising them in the
> >>>>>>>>> release syncs, and try to find assignees for them. Hopefully,
> >>>>>>>>> there will be someone who steps up. But it is possible that for a
> >>>>>>>>> must-have item nobody wants to work on it. If that happens, which
> >>>>>>>>> I don't think it will, it probably means the item is not that
> >>>>>>>>> critical and we may have to exclude it from the release. Either
> >>>>>>>>> way, they should not be overlooked, because IMHO release managers
> >>>>>>>>> should be responsible for trying to get someone to work on the
> >>>>>>>>> un-assigned items.
> >>>>>>>>>
> >>>>>>>>> We'll have more discussions soon and keep the community
> >>> updated.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>>
> >>>>>>>>> Xintong
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> >>>>>>>>> <ma...@aiven.io.invalid> wrote:
> >>>>>>>>>
> >>>>>>>>>> Now that the vote is started on the must-have items: There are
> >>>>>>>>>> still to-be-discussed items in the list of features. What's the
> >>>>>>>>>> plan with those? Some of them don't have anyone assigned. Were
> >>>>>>>>>> these items discussed among the release managers? So far, it
> >>>>>>>>>> looks like they are handled as nice-to-have if someone volunteers
> >>>>>>>>>> to pick them up?
> >>>>>>>>>>
> >>>>>>>>>> My concern is that they will be overlooked because nobody feels
> >>>>>>>>>> to be in charge.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Matthias
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <tonysong820@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks all for the discussion.
> >>>>>>>>>>>
> >>>>>>>>>>> The wiki has been updated as discussed. I'm starting a vote now.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>>
> >>>>>>>>>>> Xintong
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <tonysong820@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi ConradJam,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think Chesnay has already put his name as the Contributor for
> >>>>>>>>>>>> the two tasks you listed. Maybe you can reach out to him to see
> >>>>>>>>>>>> if you can collaborate on this.
> >>>>>>>>>>>>
> >>>>>>>>>>>> In general, I don't think contributing to a release 2.0 issue is
> >>>>>>>>>>>> much different from contributing to a regular issue. We haven't
> >>>>>>>>>>>> yet created JIRA tickets for all the listed tasks because many
> >>>>>>>>>>>> of them need further discussions and / or FLIPs to decide
> >>>>>>>>>>>> whether and how they should be performed.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Xintong
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <jam.gzczy@gmail.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Community:
> >>>>>>>>>>>>>   I see some tasks in the 2.0 list that haven't been assigned
> >>>>>>>>>>>>> yet. I would like to take on some of the tasks that I can
> >>>>>>>>>>>>> complete. How do I apply to the community for them? I am
> >>>>>>>>>>>>> interested in the following parts of FLINK-32377 [1]. Do I need
> >>>>>>>>>>>>> to create the issues myself and assign them to myself?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> - the current timestamp, which is problematic w.r.t. caching
> >>>>>>>>>>>>> and testing, while providing no value.
> >>>>>>>>>>>>> - Remove JarRequestBody#programArgs in favor of
> >>>>>>>>>>>>> #programArgsList.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-32377
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong
> >>>>>>>>>>>>> <li...@amazon.co.uk.invalid> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks Xintong for driving the effort.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I’d add a +1 to reworking configs, as suggested by @Jark and
> >>>>>>>>>>>>>> @Chesnay, especially the types. We have various configs that
> >>>>>>>>>>>>>> encode Time / MemorySize but are typed as Long instead!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Hong
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yuanmei.work@gmail.com>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks for driving this effort, Xintong!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> To Chesnay
> >>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State Management"
> >>>>>>>>>>>>>>>> item is marked as a must-have; will it require changes that
> >>>>>>>>>>>>>>>> break something?
> >>>>>>>>>>>>>>>> What prevents it from being added in 2.1?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> As to "Disaggregated State Management":
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> We plan to provide a new type of state backend to support DFS
> >>>>>>>>>>>>>>> as primary storage.
> >>>>>>>>>>>>>>> To achieve this, we need at least two kinds of changes (not
> >>>>>>>>>>>>>>> entirely sure yet, since we are still in the design and
> >>>>>>>>>>>>>>> prototyping phase):
> >>>>>>>>>>>>>>> 1. Statebackend Change
> >>>>>>>>>>>>>>> 2. State Access Change
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Not all of the related interfaces are `@Internal`. Some of
> >>>>>>>>>>>>>>> the interfaces, like `StateBackend`, are `@PublicEvolving`.
> >>>>>>>>>>>>>>> So, you are right in the sense that "Disaggregated State
> >>>>>>>>>>>>>>> Management" itself probably does not need to be a "Must
> >>>>>>>>>>>>>>> Have".
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> But I was hoping changes related to public APIs can be
> >>>>>>>>>>>>>>> finalized and merged in Flink 2.0 (I will fix the wiki
> >>>>>>>>>>>>>>> accordingly).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I also agree with Jark that 2.0 is a good chance to rework
> >>>>>>>>>>>>>>> the default value of configurations.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best
> >>>>>>>>>>>>>>> Yuan
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler
> >>>>>>>>>>>>>>> <chesnay@apache.org> wrote:
> >>>>>>>>>>>>>>>> Something else configuration-related is that there are a
> >>>>>>>>>>>>>>>> bunch of options where the type isn't quite correct (e.g.,
> >>>>>>>>>>>>>>>> a String where it could be an enum, a string where it
> >>>>>>>>>>>>>>>> should be an int or something).
> >>>>>>>>>>>>>>>> Could do a pass over those as well.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> >>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I think one more thing we need to consider doing in 2.0 is
> >>>>>>>>>>>>>>>>> changing the default values of configuration options to
> >>>>>>>>>>>>>>>>> improve the out-of-box user experience.
> >>>>>>>>>>>>>>>>> Currently, in order to run a Flink job, users may need to
> >>>>>>>>>>>>>>>>> set a bunch of configurations, such as minibatch,
> >>>>>>>>>>>>>>>>> checkpoint interval, exactly-once, incremental-checkpoint,
> >>>>>>>>>>>>>>>>> etc. It's very verbose and hard to use for beginners.
> >>>>>>>>>>>>>>>>> Most of them can have a universally applicable value.
> >>>>>>>>>>>>>>>>> Because changing a default value is a breaking change, I
> >>>>>>>>>>>>>>>>> think it's worth considering changing them in 2.0.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> What do you think?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin
> >>>>>>>>>>>>>>>>> <snuyanzin@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>> Hi Chesnay
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope
> >>>>>>>>>>>>>>>>>>> that this would be an entirely internal change, and
> >>>>>>>>>>>>>>>>>>> could thus be an incremental process independent of
> >>>>>>>>>>>>>>>>>>> major releases.
> >>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much are we
> >>>>>>>>>>>>>>>>>>> actually re-writing?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks for asking.
> >>>>>>>>>>>>>>>>>> Yes, you're right, that should be an internal change.
> >>>>>>>>>>>>>>>>>> Yeah, I was also thinking about an incremental change
> >>>>>>>>>>>>>>>>>> (rule by rule or in reasonably small groups of rules).
> >>>>>>>>>>>>>>>>>> And yes, this could be an activity independent of major
> >>>>>>>>>>>>>>>>>> releases.
> >>>>>>>>>>>>>>>>>> The problem is actually for children of RelOptRule.
> >>>>>>>>>>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
> >>>>>>>>>>>>>>>>>> mentioned deprecated API.
> >>>>>>>>>>>>>>>>>> There are also children of ConverterRule (50+) which do
> >>>>>>>>>>>>>>>>>> not have such issues.
> >>>>>>>>>>>>>>>>>> Maybe it could be considered as the next step to have all
> >>>>>>>>>>>>>>>>>> the rules in Java.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song
> >>>>>>>>>>>>>>>>>> <tonysong820@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Hi Alex & Gyula,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> >>>>>>>>>>>>>>>>>>>> FLIP-321: Introduce an API deprecation process" thread
> >>>>>>>>>>>>>>>>>>>> [1]?
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I
> >>>>>>>>>>>>>>>>>>> pasted the wrong url in my previous email. Sorry for the
> >>>>>>>>>>>>>>>>>>> mistake.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind this
> >>>>>>>>>>>>>>>>>>>> new API has been previously discussed on the mailing
> >>>>>>>>>>>>>>>>>>>> list. Do we have a list of shortcomings in the current
> >>>>>>>>>>>>>>>>>>>> DataStream API that it tries to resolve? How does the
> >>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into the
> >>>>>>>>>>>>>>>>>>>> picture? Will it be kept as is or subsumed by new API?
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for the
> >>>>>>>>>>>>>>>>>>>> DataStream API unless we have a very good reason to do
> >>>>>>>>>>>>>>>>>>>> so and with a proper discussion about this as Alex said.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> The ProcessFunction API, which targets replacing the
> >>>>>>>>>>>>>>>>>>> DataStream API, is still a proposal, not a decision.
> >>>>>>>>>>>>>>>>>>> Sorry for the confusion, I should have been more careful
> >>>>>>>>>>>>>>>>>>> with my words, not giving the impression that this is
> >>>>>>>>>>>>>>>>>>> something we'll do anyway.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> There will be a FLIP describing the motivations and
> >>>>>>>>>>>>>>>>>>> designs in detail, for the community to discuss and vote
> >>>>>>>>>>>>>>>>>>> on. We are still working on it. TBH, this is not trivial
> >>>>>>>>>>>>>>>>>>> and we would need more time on it.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Just to quickly share some background:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>    - We see quite some problems with the current DataStream APIs
> >>>>>>>>>>>>>>>>>>>       - Users are working with concrete classes rather than
> >>>>>>>>>>>>>>>>>>>       interfaces, which means
> >>>>>>>>>>>>>>>>>>>          - Users can access methods that are designed to be used by
> >>>>>>>>>>>>>>>>>>>          internal classes, even though they are annotated with
> >>>>>>>>>>>>>>>>>>>          `@Internal`. E.g., `DataStream#getTransformation`.
> >>>>>>>>>>>>>>>>>>>          - Changes to the non-API implementations (e.g.,
> >>>>>>>>>>>>>>>>>>>          `Transformation`) would affect the API classes (e.g.,
> >>>>>>>>>>>>>>>>>>>          `DataStream`), which makes it hard to provide binary
> >>>>>>>>>>>>>>>>>>>          compatibility.
> >>>>>>>>>>>>>>>>>>>       - Internal classes are used as parameter / return-value of
> >>>>>>>>>>>>>>>>>>>       public APIs. E.g., while `AbstractStreamOperator` is
> >>>>>>>>>>>>>>>>>>>       PublicEvolving, `StreamTask`, which is returned from
> >>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
> >>>>>>>>>>>>>>>>>>>       - In many cases, users are asked to extend the API classes,
> >>>>>>>>>>>>>>>>>>>       rather than implementing interfaces. E.g.,
> >>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator`.
> >>>>>>>>>>>>>>>>>>>          - Any changes to the base classes, even the internal part,
> >>>>>>>>>>>>>>>>>>>          may affect the behavior of the user-provided sub-classes
> >>>>>>>>>>>>>>>>>>>          - Users can override the behavior of the base classes
> >>>>>>>>>>>>>>>>>>>       - The API module `flink-streaming-java` contains non-API
> >>>>>>>>>>>>>>>>>>>       classes, and depends on internal modules such as
> >>>>>>>>>>>>>>>>>>>       `flink-runtime`, which means
> >>>>>>>>>>>>>>>>>>>          - Changes to the internal modules may affect the API
> >>>>>>>>>>>>>>>>>>>          modules, which requires users to re-build their
> >>>>>>>>>>>>>>>>>>>          applications upon upgrading
> >>>>>>>>>>>>>>>>>>>          - The artifact users need for building their applications
> >>>>>>>>>>>>>>>>>>>          is larger than necessary.
> >>>>>>>>>>>>>>>>>>>       - We probably should not expose operators (e.g.,
> >>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator`) to users. Functions should be enough
> >>>>>>>>>>>>>>>>>>>       for users to define their data processing logic. Exposing
> >>>>>>>>>>>>>>>>>>>       operator-level concepts (e.g., mailbox thread model,
> >>>>>>>>>>>>>>>>>>>       checkpoint barrier alignment, etc.) is unnecessary and limits
> >>>>>>>>>>>>>>>>>>>       the improvement regarding such exposed mechanisms with
> >>>>>>>>>>>>>>>>>>>       compatibility considerations.
> >>>>>>>>>>>>>>>>>>>       - The current DataStream API seems to be a mixture of many
> >>>>>>>>>>>>>>>>>>>       things, making it hard to understand especially for
> >>>>>>>>>>>>>>>>>>>       newcomers. It might be better to re-organize it into several
> >>>>>>>>>>>>>>>>>>>       parts: (the taxonomy below is just an example; we are still
> >>>>>>>>>>>>>>>>>>>       working on this)
> >>>>>>>>>>>>>>>>>>>          - The most fundamental stateful stream processing: streams,
> >>>>>>>>>>>>>>>>>>>          partitions / key, process functions, state,
> >>>>>>>>>>>>>>>>>>>          timeline-service
> >>>>>>>>>>>>>>>>>>>          - An extension for common batch-streaming unified
> >>>>>>>>>>>>>>>>>>>          functions: map, flatmap, filter, agg, reduce, join, etc.
> >>>>>>>>>>>>>>>>>>>          - An extension for windowing supports: window, triggering
> >>>>>>>>>>>>>>>>>>>          - An extension for event-time supports: event time,
> >>>>>>>>>>>>>>>>>>>          watermark
> >>>>>>>>>>>>>>>>>>>          - The extensions are like short-cuts / sugars, without
> >>>>>>>>>>>>>>>>>>>          which users can probably still achieve the same behavior by
> >>>>>>>>>>>>>>>>>>>          working with the fundamental APIs, but it would be a lot
> >>>>>>>>>>>>>>>>>>>          easier with the extensions
> >>>>>>>>>>>>>>>>>>>    - The original plan was to do in-place refactors / changes on
> >>>>>>>>>>>>>>>>>>>    DataStream API. Some related items are listed in this doc [2]
> >>>>>>>>>>>>>>>>>>>    attached to the kicking off email [3]. Not all of the above
> >>>>>>>>>>>>>>>>>>>    issues are listed, because we had not yet looked into this as
> >>>>>>>>>>>>>>>>>>>    deeply at that time.
> >>>>>>>>>>>>>>>>>>>    - We proposed this as a new API rather than in-place refactors in
> >>>>>>>>>>>>>>>>>>>    the 2.0 work item list, because we realized the changes might be
> >>>>>>>>>>>>>>>>>>>    too big for an in-place change. First having a new API then
> >>>>>>>>>>>>>>>>>>>    gradually retiring the old one would help users to smoothly
> >>>>>>>>>>>>>>>>>>>    migrate between them.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is out. And
> >>>>>>>>>>>>>>>>>>> of course it's possible that the FLIP might be rejected. Given that we
> >>>>>>>>>>>>>>>>>>> are planning for release 2.0, I just feel it would be better to bring
> >>>>>>>>>>>>>>>>>>> this up early even if the concrete plan is not yet ready.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Xintong
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>>>>>>>>>>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> >>>>>>>>>>>>>>>>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gyfora@apache.org> wrote:
> >>>>>>>>>>>>>>>>>>>> Hey!
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I share the same concerns mentioned above regarding the
> >>>>>>>>>>>>>>>>>>>> "ProcessFunction API".
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for the DataStream API
> >>>>>>>>>>>>>>>>>>>> unless we have a very good reason to do so and with a proper
> >>>>>>>>>>>>>>>>>>>> discussion about this as Alex said.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>> Gyula
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander
> >> Fedulov <
> >>>>>>>>>>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Hi Xintong,
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the
> >>>> "[DISCUSS]
> >>>>>>>>>>> FLIP-321:
> >>>>>>>>>>>>>>>>>>>> Introduce
> >>>>>>>>>>>>>>>>>>>>> an API deprecation process" thread [1]?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind
> >>>> this
> >>>>>> new
> >>>>>>>>>> API
> >>>>>>>>>>>>> has
> >>>>>>>>>>>>>>>>>> been
> >>>>>>>>>>>>>>>>>>>>> previously discussed on the mailing list. Do we
> >>> have
> >>>> a
> >>>>>> list
> >>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>>> shortcomings
> >>>>>>>>>>>>>>>>>>>>> in the current DataStream API that it tries to
> >>>> resolve?
> >>>>>> How
> >>>>>>>>>>> does
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into
> >> the
> >>>>>> picture?
> >>>>>>>>>>>>> Will it
> >>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>> kept
> >>>>>>>>>>>>>>>>>>>>> as is or subsumed by new API?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>
> >>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>> Alex
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> >>>>>>>>>>>>> tonysong820@gmail.com>
> >>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
> >>> most
> >>>>>>>>>> headaches
> >>>>>>>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>>>>>> it's
> >>>>>>>>>>>>>>>>>>>>>>> very unclear what it actually entails; like is
> >> it
> >>>> an
> >>>>>>>>>>> entirely
> >>>>>>>>>>>>>>>>>>>> separate
> >>>>>>>>>>>>>>>>>>>>>> API
> >>>>>>>>>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
> >>> extension
> >>>> of
> >>>>>>>>>>>>> DataStream.
> >>>>>>>>>>>>>>>>>>> How
> >>>>>>>>>>>>>>>>>>>>>> much
> >>>>>>>>>>>>>>>>>>>>>>> will it share the internals with DataStream
> >> etc.;
> >>>> how
> >>>>>> does
> >>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>> relate
> >>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
> >> API
> >>>>> uses
> >>>>>>>>>>>>>>>>>> underneath).
> >>>>>>>>>>>>>>>>>>>>>> I totally understand your confusion. We started
> >>>>> planning
> >>>>>>>>>> this
> >>>>>>>>>>>>>> after
> >>>>>>>>>>>>>>>>>>>>> kicking
> >>>>>>>>>>>>>>>>>>>>>> off the release 2.0, so there's still a lot to
> >> be
> >>>>>> explored
> >>>>>>>>>>> and
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>> plan
> >>>>>>>>>>>>>>>>>>>>>> keeps changing.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>    - In the beginning, we planned to do an
> >>> in-place
> >>>>>>>>>> refactor
> >>>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>>>> DataStream
> >>>>>>>>>>>>>>>>>>>>>>    API, until the API migration period is
> >>> proposed.
> >>>>>>>>>>>>>>>>>>>>>>    - Then we want to make it an entirely
> >> separate
> >>>> API
> >>>>>> to
> >>>>>>>>>>>>>>>>>> DataStream,
> >>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>    listed as a must-have for release 2.0 so
> >> that
> >>> we
> >>>>> can
> >>>>>>>>>>> remove
> >>>>>>>>>>>>>>>>>>>> DataStream
> >>>>>>>>>>>>>>>>>>>>>> once
> >>>>>>>>>>>>>>>>>>>>>>    it's ready.
> >>>>>>>>>>>>>>>>>>>>>>    - However, depending on the outcome of the
> >> API
> >>>>>>>>>>> compatibility
> >>>>>>>>>>>>>>>>>>>>> discussion
> >>>>>>>>>>>>>>>>>>>>>>    [1], we may not be able to remove DataStream
> >>> in
> >>>>> 2.0
> >>>>>>>>>>> anyway,
> >>>>>>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>>>> means
> >>>>>>>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>    might need to re-evaluate the necessity of
> >>> this
> >>>>>> item for
> >>>>>>>>>>>>> 2.0.
> >>>>>>>>>>>>>>>>>>>>>> I'd say we wait a bit longer for the
> >> compatibility
> >>>>>>>>>> discussion
> >>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>> decide the priority for this item afterwards.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Xintong
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>> https://lists.apache.org/list.html?dev@flink.apache.org
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay
> >> Schepler <
> >>>>>>>>>>>>>>>>>> chesnay@apache.org
> >>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of
> >>>> items.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> >>>>>> Management"
> >>>>>>>>>>>>> item
> >>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>> marked
> >>>>>>>>>>>>>>>>>>>>>>> as a must-have; will it require changes that
> >>> break
> >>>>>>>>>>> something?
> >>>>>>>>>>>>>>>>>> What
> >>>>>>>>>>>>>>>>>>>>>> prevents
> >>>>>>>>>>>>>>>>>>>>>>> it from being added in 2.1?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make
> >>>> Java
> >>>>> 17
> >>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>> default,
> >>>>>>>>>>>>>>>>>>>>> drop
> >>>>>>>>>>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a
> >> must-have
> >>>>> "Drop
> >>>>>>>>>> Java
> >>>>>>>>>>> 8"
> >>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I
> >> would
> >>>> hope
> >>>>>> that
> >>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>> would
> >>>>>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>>>>> an entirely internal change, and could thus be
> >> an
> >>>>>>>>>>> incremental
> >>>>>>>>>>>>>>>>>>> process
> >>>>>>>>>>>>>>>>>>>>>>> independent of major releases.
> >>>>>>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much
> >>> are
> >>>>> we
> >>>>>>>>>>>>> actually
> >>>>>>>>>>>>>>>>>>>>>> re-writing?
> >>>>>>>>>>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise
> >> this
> >>>> to
> >>>>> a
> >>>>>>>>>>>>>>>>>> must-have; i
> >>>>>>>>>>>>>>>>>>>>> think
> >>>>>>>>>>>>>>>>>>>>>>> I marked it down as nice-to-have only because
> >> it
> >>>>>> depends
> >>>>>>>>>> on
> >>>>>>>>>>>>>>>>>> another
> >>>>>>>>>>>>>>>>>>>>> item.
> >>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
> >>> most
> >>>>>>>>>> headaches
> >>>>>>>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>>>>>> it's
> >>>>>>>>>>>>>>>>>>>>>>> very unclear what it actually entails; like is
> >> it
> >>>> an
> >>>>>>>>>>> entirely
> >>>>>>>>>>>>>>>>>>>> separate
> >>>>>>>>>>>>>>>>>>>>>> API
> >>>>>>>>>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
> >>> extension
> >>>> of
> >>>>>>>>>>>>> DataStream.
> >>>>>>>>>>>>>>>>>>> How
> >>>>>>>>>>>>>>>>>>>>>> much
> >>>>>>>>>>>>>>>>>>>>>>> will it share the internals with DataStream
> >> etc.;
> >>>> how
> >>>>>> does
> >>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>> relate
> >>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
> >> API
> >>>>> uses
> >>>>>>>>>>>>>>>>>> underneath).
> >>>>>>>>>>>>>>>>>>>>>>> There are a few items I added as ideas which
> >>> don't
> >>>>>> have a
> >>>>>>>>>>>>>>>>>> priority
> >>>>>>>>>>>>>>>>>>>> yet;
> >>>>>>>>>>>>>>>>>>>>>>> would love to get some feedback on those.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Hi devs,
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> As previously discussed in [1], we had been
> >>>>> collecting
> >>>>>>>>>> work
> >>>>>>>>>>>>> item
> >>>>>>>>>>>>>>>>>>>>>> proposals
> >>>>>>>>>>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the
> >> wiki
> >>>> page
> >>>>>> [2].
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>    - As we have passed the due date, I'd like
> >> to
> >>>>>> kindly
> >>>>>>>>>>> remind
> >>>>>>>>>>>>>>>>>>>> everyone
> >>>>>>>>>>>>>>>>>>>>>> *not
> >>>>>>>>>>>>>>>>>>>>>>>    to add / remove items directly on the wiki
> >>>> page*.
> >>>>>> If
> >>>>>>>>>>>>> needed,
> >>>>>>>>>>>>>>>>>>>> please
> >>>>>>>>>>>>>>>>>>>>>> post
> >>>>>>>>>>>>>>>>>>>>>>>    in this thread or reach out to the release
> >>>>> managers
> >>>>>>>>>>>>> instead.
> >>>>>>>>>>>>>>>>>>>>>>>    - I've reached out to some folks for
> >>>>> clarifications
> >>>>>>>>>> about
> >>>>>>>>>>>>>>>>>> their
> >>>>>>>>>>>>>>>>>>>>>>>    proposals. Some of them mentioned that they
> >>> can
> >>>>>> not yet
> >>>>>>>>>>>>> tell
> >>>>>>>>>>>>>>>>>>>> whether
> >>>>>>>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>>    should do an item or not, and would need
> >> more
> >>>>> time
> >>>>>> /
> >>>>>>>>>>>>>>>>>> discussions
> >>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>> make
> >>>>>>>>>>>>>>>>>>>>>>>    the decision. So I added a new symbol for
> >>> items
> >>>>>> whose
> >>>>>>>>>>>>>>>>>> priorities
> >>>>>>>>>>>>>>>>>>>> are
> >>>>>>>>>>>>>>>>>>>>>> `TBD`.
> >>>>>>>>>>>>>>>>>>>>>>> Now it's time to collaboratively decide a
> >> minimum
> >>>> set
> >>>>>> of
> >>>>>>>>>>>>>>>>>> must-have
> >>>>>>>>>>>>>>>>>>>>> items.
> >>>>>>>>>>>>>>>>>>>>>>> I've gone through the entire list of proposed
> >>>> items,
> >>>>>> and
> >>>>>>>>>>> found
> >>>>>>>>>>>>>>>>>> most
> >>>>>>>>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>>>>>>> them
> >>>>>>>>>>>>>>>>>>>>>>> make quite much sense. So I think an online
> >> sync
> >>>>> might
> >>>>>> not
> >>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>> necessary
> >>>>>>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread,
> >>>> where
> >>>>>>>>>>> everyone
> >>>>>>>>>>>>> can
> >>>>>>>>>>>>>>>>>>>>> comment
> >>>>>>>>>>>>>>>>>>>>>>> on how they think the list can be improved,
> >>>> followed
> >>>>>> by a
> >>>>>>>>>>>>> VOTE to
> >>>>>>>>>>>>>>>>>>>>>> formally
> >>>>>>>>>>>>>>>>>>>>>>> make the decision.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Any feedback and opinions, including but not
> >>>> limited
> >>>>> to
> >>>>>>>>>> the
> >>>>>>>>>>>>>>>>>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Jiabao Sun <ji...@xtransfer.cn.INVALID>.
Thanks Xintong for driving the effort.


I’d add a +1 to improving the out-of-the-box user experience, as suggested by @Jark and @Chesnay.
For beginners, understanding complex configurations is hard work.

In addition, deploying a Flink runtime environment is also a complex matter.
At present, task submission still differs considerably across computing resources. Users who integrate Flink into their own data development platform need to understand these differences deeply when handling task submission and running-status checks.

I'm glad to see features like flink-sql-gateway being implemented by the community because it makes it easy for users to submit Flink SQL tasks. Furthermore, can we provide more unified, out-of-the-box capabilities that allow users to quickly bring up a production-ready Flink environment and easily integrate Flink into their own data development platform?
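
To make the sql-gateway point concrete, here is a minimal Java sketch of how a platform might build the two REST requests for opening a session and submitting a statement. The endpoint paths (`/v1/sessions` and `/v1/sessions/{sessionHandle}/statements`), the `statement` JSON field, and the address `http://localhost:8083` are assumptions based on my reading of the SQL Gateway REST documentation and should be checked against the Flink version in use. The class and method names are hypothetical, and the sketch only builds the requests without sending them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class SqlGatewayRequestSketch {

    /** Builds the request that opens a session (assumed endpoint: POST /v1/sessions). */
    static HttpRequest openSession(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/sessions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{}"))
                .build();
    }

    /** Builds the JSON body carrying one SQL statement (assumed field name: "statement"). */
    static String statementBody(String sql) {
        // Minimal escaping for this sketch: backslashes and double quotes only.
        String escaped = sql.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"statement\": \"" + escaped + "\"}";
    }

    /** Builds the request that submits a statement within an existing session. */
    static HttpRequest submitStatement(String baseUrl, String sessionHandle, String sql) {
        URI uri = URI.create(baseUrl + "/v1/sessions/" + sessionHandle + "/statements");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(statementBody(sql)))
                .build();
    }

    public static void main(String[] args) {
        String gateway = "http://localhost:8083"; // assumed gateway address
        HttpRequest open = openSession(gateway);
        HttpRequest submit = submitStatement(gateway, "my-session-handle", "SELECT 1");
        System.out.println(open.method() + " " + open.uri());
        System.out.println(submit.method() + " " + submit.uri());
        System.out.println(statementBody("SELECT 1"));
    }
}
```

Sending the requests would be a matter of passing them to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and reading the session handle from the first response.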


Best,
Jiabao


> On Jul 12, 2023, at 8:16 PM, zhiqiang li <li...@gmail.com> wrote:
> 
> I have seen in [1] that connectors, formats, and even user code will be
> pluggable. If the connectors are pluggable, the benefits are obvious, as
> conflicts between different jar versions can be avoided. Without
> classloader isolation, shading is needed to resolve conflicts, and a lot
> of development time is wasted.
> I know that this change may involve a lot of API changes, so I would like
> to discuss in this email whether we can make the changes in Flink 2.0.
> 
>> Plugins facilitate a strict separation of code through restricted
>> classloaders. Plugins cannot access classes from other plugins or from
>> Flink that have not been specifically whitelisted. This strict isolation
>> allows plugins to contain conflicting versions of the same library without
>> the need to relocate classes or to converge to common versions.
>> Currently, file systems and metric reporters are pluggable *but in the
>> future, connectors, formats, and even user code should also be pluggable.*
>> 
> 
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/
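
The "restricted classloader" idea quoted above can be sketched in a few lines of Java. This is an illustration under stated assumptions, not Flink's actual plugin loader: `PluginClassLoader`, `newLoader`, and `canLoad` are hypothetical names, and the sketch restricts class loading only (resource lookups are not covered). The loader has no application parent, so it sees JDK bootstrap classes and its own jars, and it delegates to the host classloader only for whitelisted name prefixes.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class RestrictedLoaderSketch {

    /** Loads from its own jars first and from the host only for whitelisted prefixes. */
    static final class PluginClassLoader extends URLClassLoader {
        private final ClassLoader owner;
        private final String[] allowedPrefixes;

        PluginClassLoader(URL[] pluginJars, ClassLoader owner, String... allowedPrefixes) {
            // A null parent means only bootstrap (JDK core) classes are inherited.
            super(pluginJars, null);
            this.owner = owner;
            this.allowedPrefixes = allowedPrefixes;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            for (String prefix : allowedPrefixes) {
                if (name.startsWith(prefix)) {
                    // Whitelisted: share the host's class with the plugin.
                    return owner.loadClass(name);
                }
            }
            // Everything else: bootstrap classes or the plugin's own jars.
            return super.loadClass(name, resolve);
        }
    }

    /** Creates a loader with no plugin jars, which is enough to show the isolation. */
    static PluginClassLoader newLoader(String... allowedPrefixes) {
        ClassLoader host = RestrictedLoaderSketch.class.getClassLoader();
        return new PluginClassLoader(new URL[0], host, allowedPrefixes);
    }

    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String hostClass = RestrictedLoaderSketch.class.getName();
        try (PluginClassLoader isolated = newLoader("org.example.api.");
             PluginClassLoader allowed = newLoader(hostClass)) {
            System.out.println(canLoad(isolated, "java.lang.String")); // JDK class
            System.out.println(canLoad(isolated, hostClass));          // host class, not whitelisted
            System.out.println(canLoad(allowed, hostClass));           // host class, whitelisted
        }
    }
}
```

Because each plugin gets its own loader, two plugins can bundle different versions of the same library without shading, which is the benefit described in the message above.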
> 
> On Tue, Jul 11, 2023 at 6:50 PM Xintong Song <to...@gmail.com> wrote:
> 
>>> 
>>> What we might want to come up with is a summary with each 2.0.0 issue on
>>> why it should be included or not. That summary is something the community
>>> could vote on. WDYT? I'm happy to help here.
>>> 
>> 
>> That sounds great. Thanks for offering the help. I'll also try to go
>> through the issues, but TBH I'm quite overwhelmed and cannot promise to get
>> this done very soon. Your help is very much needed.
>> 
>> 
>> Best,
>> 
>> Xintong
>> 
>> 
>> 
>> On Tue, Jul 11, 2023 at 6:08 PM Matthias Pohl
>> <ma...@aiven.io.invalid> wrote:
>> 
>>> @Xintong I guess it makes sense. I agree with your conclusions on the four
>>> mentioned Jira issues.
>>> 
>>> I just checked any issues that have fixVersion = 2.0.0 [1]. There are a few
>>> more items that are not affiliated with FLINK-3957 [2]. I guess we should
>>> find answers for these issues: Either closing them with a reason to have a
>>> consistent state in Jira or adding them to the feature list as part of a
>>> separate voting thread (to leave the current vote untouched).
>>> 
>>> What we might want to come up with is a summary with each 2.0.0 issue on
>>> why it should be included or not. That summary is something the community
>>> could vote on. WDYT? I'm happy to help here.
>>> 
>>> Matthias
>>> 
>>> [1] https://issues.apache.org/jira/browse/FLINK-32437?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%202.0.0%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)%20%20
>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
>>> 
>>> 
>>> On Tue, Jul 11, 2023 at 5:01 AM Xintong Song <to...@gmail.com>
>>> wrote:
>>> 
>>>> @Zhu,
>>>> As you are downgrading "Clarify the scopes of configuration options" to
>>>> nice-to-have priority, could you also bring that up in the vote thread [1]?
>>>> I'm asking because there are people who already voted on the original
>>>> list. I think restarting the vote is probably overkill and unnecessary,
>>>> but we should at least bring this change to their attention.
>>>> 
>>>> @Matthias,
>>>> Thanks a lot for bringing this up. I wasn't aware of this early umbrella.
>>>> I haven't gone through everything in FLINK-3957 yet. I'll do it asap.
>>>> 
>>>> Just quickly went through the 4 issues you mentioned.
>>>> - FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as long
>>>> as the new APIs that we want users to migrate to are ready. For these 2
>>>> tickets, I think introduction of the updated APIs should be
>>>> straightforward and feasible for 1.18.
>>>> - FLINK-13926: I'm not sure about this one. The two mentioned classes
>>>> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not even
>>>> marked as Public or PublicEvolving APIs. Moreover, I don't see a good way
>>>> to smoothly replace the classes with a generic version.
>>>> - FLINK-5126: This is a bit unclear to me. From the description and
>>>> conversation on the ticket, I don't fully understand which concrete APIs
>>>> the ticket is referring to. Or maybe it refers to all / most of the APIs
>>>> that throw Exception / IOException in general. Moreover, I don't think
>>>> removing Exception / IOException from the API signature is a breaking
>>>> change. It requires no code changes on the caller side.
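
One nuance on the FLINK-5126 point may be worth checking in Java. Removing a `throws` clause is binary compatible, but it is not always source compatible: a caller that catches a specific checked exception such as `IOException` stops compiling once nothing in the `try` block can throw it, while `catch (Exception e)` keeps compiling. The sketch below uses hypothetical interfaces (`ReaderV1`, `ReaderV2`), not the APIs from the ticket, and leaves the non-compiling variants as comments so the file stays runnable.

```java
import java.io.IOException;

public class ThrowsClauseSketch {

    /** Hypothetical "before" API: the method declares a checked exception. */
    interface ReaderV1 {
        String read() throws IOException;
    }

    /** Hypothetical "after" API: the throws clause has been removed. */
    interface ReaderV2 {
        String read();
    }

    /** A caller written against V1 that handles the checked exception. */
    static String callV1(ReaderV1 reader) {
        try {
            return reader.read();
        } catch (IOException e) {
            return "failed";
        }
    }

    /** The same caller after migrating to V2. */
    static String callV2(ReaderV2 reader) {
        // Keeping `catch (IOException e)` here would be a compile error, because
        // nothing in the try block can throw IOException any more.
        // Catching Exception is still allowed, since it also covers unchecked exceptions.
        try {
            return reader.read();
        } catch (Exception e) {
            return "failed";
        }
    }

    // Implementers are affected as well. The following would not compile against V2,
    // because an override may not add a checked exception:
    //
    //   class FileReaderV2 implements ReaderV2 {
    //       public String read() throws IOException { ... }
    //   }

    public static void main(String[] args) {
        System.out.println(callV1(() -> { throw new IOException("disk"); })); // failed
        System.out.println(callV2(() -> "ok"));                                // ok
    }
}
```

So whether the change counts as breaking depends on whether the affected methods throw `Exception` or a narrower checked type, and on whether users implement the interfaces in question.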
>>>> 
>>>> WDYT?
>>>> 
>>>> Best,
>>>> 
>>>> Xintong
>>>> 
>>>> 
>>>> [1] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
>>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
>>>> 
>>>> On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
>>>> <ma...@aiven.io.invalid> wrote:
>>>> 
>>>>> I brought it up in the deprecating APIs in 1.18 thread [1] already but it
>>>>> feels misplaced there. I just wanted to ask whether someone did a pass
>>>>> over FLINK-3957 [2]. I came across it when going through the release 2.0
>>>>> feature list [3] as part of the vote. I have the feeling that there are
>>>>> some valid action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6])
>>>>> which do not seem to be listed in the 2.0 feature list [3] yet (or are
>>>>> included in some of the bigger items). The majority of the subtasks are
>>>>> probably covered by the DataSet removal, the Scala API removal and the
>>>>> ProcessFunction refactoring. Other subtasks (FLINK-14068 [7]) made it
>>>>> into the feature list.
>>>>> 
>>>>> I haven't worked with the SDK code enough to judge whether the subtasks
>>>>> are still reasonable or actually obsolete. That is why I wanted to
>>>>> mention the Jira issue here once more.
>>>>> 
>>>>> I don't consider it a blocker for the ongoing vote but was wondering
>>>>> whether it makes sense for someone who might have more experience in that
>>>>> field to add some of the subtasks to the feature list.
>>>>> 
>>>>> Or shall we just consider it as "not interesting enough" because nobody
>>>>> added it in the first place to the 2.0 feature list [3]?
>>>>> 
>>>>> Matthias
>>>>> 
>>>>> [1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
>>>>> [2] https://issues.apache.org/jira/browse/FLINK-3957
>>>>> [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>>>>> [4] https://issues.apache.org/jira/browse/FLINK-4675
>>>>> [5] https://issues.apache.org/jira/browse/FLINK-5126
>>>>> [6] https://issues.apache.org/jira/browse/FLINK-13926
>>>>> [7] https://issues.apache.org/jira/browse/FLINK-14068
>>>>> 
>>>>> On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
>>>>> 
>>>>>> Agreed that we should deprecate affected APIs as soon as possible.
>>>>>> But there is not much time before the feature freeze of 1.18, hence
>>>>>> I'm a bit concerned that some of the deprecations might not be done in
>>>>>> 1.18.
>>>>>> 
>>>>>> We are currently looking into the improvements of the configuration
>>>>>> layer. Most of the proposed changes would require a public discussion,
>>>>>> or even a FLIP, which I think can hardly close before the feature freeze
>>>>>> of 1.18. And some of the APIs can be deprecated only after the
>>>>>> corresponding new APIs are developed. Therefore we previously targeted
>>>>>> them for 1.19.
>>>>>> 
>>>>>> We may review later to see what deprecation work can be done in 1.18 and
>>>>>> make it if possible. I think we can do the work even after the feature
>>>>>> freeze date, if it is purely deprecation work (simply adding
>>>>>> annotations). WDYT?
>>>>>> 
>>>>>> I'm also changing the priority of "Clarify the scopes of configuration
>>>>>> options" to nice-to-have. I think most of the work does not involve
>>>>>> breaking changes and can be done in 1.x or 2.1+. For the breaking
>>>>>> changes which might be needed, we will consider them as part of the
>>>>>> configuration layer rework.
>>>>>> 
>>>>>> Thanks,
>>>>>> Zhu
>>>>>> 
>>>>>> On Mon, Jul 10, 2023 at 7:58 PM Xintong Song <to...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> At what point are the FLIP discussions coming into play?
>>>>>>>> 
>>>>>>>> I keep wondering if these shouldn't have started already.
>>>>>>> 
>>>>>>> 
>>>>>>> I think this depends on the responsible contributor and reviewer of
>>>>>>> individual items. From my perspective, the FLIP discussions can start
>>>>>>> any time as long as the contributors are ready, the earlier the better.
>>>>>>> 
>>>>>>> 
>>>>>>>> What we need to ensure is that all breaking API changes are
>>>>>>>> discussed/decided before 1.18 is released so we can deprecate affected
>>>>>>>> APIs.
>>>>>>> 
>>>>>>> 
>>>>>>> The introduction of the migration period has brought the requirement to
>>>>>>> plan the removal of public APIs 2 minor releases ahead of the major
>>>>>>> release, which is TBH a bit unexpected. I agree it would be nice if we
>>>>>>> can get the FLIPs ready by the time 1.18 is released. But I also don't
>>>>>>> think we should rush on it. If the deprecation of a Public API does not
>>>>>>> make 1.18, we may carry it until 3.0. Or if there are many Public APIs
>>>>>>> whose deprecation does not make 1.18, we may deprecate them in 1.19 and
>>>>>>> postpone the major version bump to after a 1.20 release. Moreover, as
>>>>>>> mentioned in FLIP-321 [1], exceptions are discussable given that the
>>>>>>> migration period is newly proposed and we did not give developers the
>>>>>>> chance to plan things ahead. To sum up, I'd say we try to identify APIs
>>>>>>> that need to be deprecated in 1.18 with best efforts, and evaluate the
>>>>>>> remaining options (carrying the API for the entire 2.x cycle,
>>>>>>> postponing 2.0, or making an exception) case-by-case.
>>>>>>> WDYT?
>>>>>>> 
>>>>>>> Best,
>>>>>>> 
>>>>>>> Xintong
>>>>>>> 
>>>>>>> 
>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>>>>>> 
>>>>>>> On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <chesnay@apache.org> wrote:
>>>>>>> 
>>>>>>>> At what point are the FLIP discussions coming into play?
>>>>>>>> 
>>>>>>>> I keep wondering if these shouldn't have started already.
>>>>>>>> It just seems that a lot of decisions are implicitly reliant on the
>>>>>>>> items even being accepted.
>>>>>>>> Estimates can only be provided if we actually know the scope of the
>>>>>>>> change, but that's not always clear from the description in the doc.
>>>>>>>> 
>>>>>>>> What we need to ensure is that all breaking API changes are
>>>>>>>> discussed/decided before 1.18 is released so we can deprecate affected
>>>>>>>> APIs.
>>>>>>>> 
>>>>>>>> On 10/07/2023 11:32, Xintong Song wrote:
>>>>>>>>> Hi Matthias,
>>>>>>>>> 
>>>>>>>>> The questions you asked are indeed very important. Here're
>> some
>>>>> quick
>>>>>>>>> responses, based on the plans I had in mind, which I have not
>>>>> aligned
>>>>>>>> with
>>>>>>>>> other release managers yet.
>>>>>>>>> 
>>>>>>>>> In the previous discussions between the RMs, we were not able
>>> to
>>>>> make
>>>>>>>>> proposals on things like how to make a time plan, how to
>> manage
>>>> the
>>>>>>>> release
>>>>>>>>> branch, etc., due to the lack of inputs on e.g., the work
>> items
>>>>> need
>>>>>> to
>>>>>>>> be
>>>>>>>>> included (which transitively depends on the API compatibility
>>> to
>>>>>> provide
>>>>>>>>> between major versions) and the workloads / time needed for
>>> them.
>>>>>> With
>>>>>>>> the
>>>>>>>>> recent discussions, we have collected at least the majority
>> of
>>>> the
>>>>>> inputs
>>>>>>>>> needed.
>>>>>>>>> 
>>>>>>>>> Here are things that I think we as the release managers would
>>> do
>>>>> next
>>>>>>>>> (again, not aligned with other release managers yet)
>>>>>>>>> - Creating a time plan, by reaching out to people to
>> understand
>>>> the
>>>>>>>>> estimated workloads, prerequisites and ETA of each work item.
>>>>>>>>> - Make a proposal on how to manage the release branch, i.e.,
>>> when
>>>>> to
>>>>>> cut
>>>>>>>>> the branch and whether to ship the milestone releases, etc.
>>>>>>>>> - Set-up regular release syncs (bi-weekly / monthly) to
>> update
>>>> the
>>>>>> status
>>>>>>>>> and draw attention to where help is needed.
>>>>>>>>> 
>>>>>>>>> So back to your questions.
>>>>>>>>> 
>>>>>>>>> There are still to-be-discussed items in the list of
>> features.
>>>>>> What's the
>>>>>>>>>> plan with those?
>>>>>>>>> When collecting ETA, for items that the completion time
>> cannot
>>>> yet
>>>>> be
>>>>>>>>> estimated, we would like to have at least a time by which the
>>>>>> estimation
>>>>>>>>> can be made. I think the same applies to the to-be-discussed
>>>> items.
>>>>>> And
>>>>>>>> if
>>>>>>>>> the items should be included as must-haves, we would need
>>> another
>>>>>> vote to
>>>>>>>>> adjust the must-have item list.
>>>>>>>>> 
>>>>>>>>> Some of them don't have anyone assigned.
>>>>>>>>> My concern is that they will be overlooked because nobody
>> feels
>>>> to
>>>>>> be in
>>>>>>>>>> charge.
>>>>>>>>> This is a tricky one. For must-have items without assignees,
>> we
>>>> as
>>>>>> the
>>>>>>>>> release managers should be responsible for raising them up in
>>> the
>>>>>> release
>>>>>>>>> syncs, and try to find assignees for them. Hopefully, there
>>> will
>>>> be
>>>>>>>> someone
>>>>>>>>> who stands out. But it is possible that for a must-have item
>>>> nobody
>>>>>> wants
>>>>>>>>> to work on it. If that happens, which I don't think it will,
>> it
>>>>>> probably
>>>>>>>>> means the item is not that critical and we may have to
>> exclude
>>> it
>>>>>> from
>>>>>>>> the
>>>>>>>>> release. Either way, they should not be overlooked, because
>>> IMHO
>>>>>> release
>>>>>>>>> managers should be responsible for trying to get someone to
>>> work
>>>> on
>>>>>> the
>>>>>>>>> un-assigned items.
>>>>>>>>> 
>>>>>>>>> We'll have more discussions soon and keep the community
>>> updated.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> 
>>>>>>>>> Xintong
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
>>>>>>>>> <ma...@aiven.io.invalid> wrote:
>>>>>>>>> 
>>>>>>>>>> Now that the vote is started on the must-have items: There
>> are
>>>>> still
>>>>>>>>>> to-be-discussed items in the list of features. What's the
>> plan
>>>>> with
>>>>>>>> those?
>>>>>>>>>> Some of them don't have anyone assigned. Were these items
>>>>> discussed
>>>>>>>> among
>>>>>>>>>> the release managers? So far, it looks like they are handled
>>> as
>>>>>>>>>> nice-to-have if someone volunteers to pick them up?
>>>>>>>>>> 
>>>>>>>>>> My concern is that they will be overlooked because nobody
>>> feels
>>>> to
>>>>>> be in
>>>>>>>>>> charge.
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Matthias
>>>>>>>>>> 
>>>>>>>>>> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <
>>>>> tonysong820@gmail.com
>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Thanks all for the discussion.
>>>>>>>>>>> 
>>>>>>>>>>> The wiki has been updated as discussed. I'm starting a vote
>>>> now.
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> 
>>>>>>>>>>> Xintong
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <
>>>>> tonysong820@gmail.com
>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>>>> Hi ConradJam,
>>>>>>>>>>>> 
>>>>>>>>>>>> I think Chesnay has already put his name as the
>> Contributor
>>>> for
>>>>>> the
>>>>>>>> two
>>>>>>>>>>>> tasks you listed. Maybe you can reach out to him to see if
>>> you
>>>>> can
>>>>>>>>>>>> collaborate on this.
>>>>>>>>>>>> 
>>>>>>>>>>>> In general, I don't think contributing to a release 2.0
>>> issue
>>>> is
>>>>>> much
>>>>>>>>>>>> different from contributing to a regular issue. We haven't
>>> yet
>>>>>> created
>>>>>>>>>>> JIRA
>>>>>>>>>>>> tickets for all the listed tasks because many of them
>> need
>>>>>> further
>>>>>>>>>>>> discussions and / or FLIPs to decide whether and how they
>>>> should
>>>>>> be
>>>>>>>>>>>> performed.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> 
>>>>>>>>>>>> Xintong
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <
>>>> jam.gzczy@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Community:
>>>>>>>>>>>>>   I see some tasks in the 2.0 list that haven't been
>>>> assigned
>>>>>> yet. I
>>>>>>>>>>> want
>>>>>>>>>>>>> to take the initiative to take on some tasks that I can
>>>>>> complete. How
>>>>>>>>>>> do I
>>>>>>>>>>>>> apply to the community for this part of the task? I am
>>>>>> interested in
>>>>>>>>>> the
>>>>>>>>>>>>> following parts of FLINK-32377
>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/FLINK-32377>, do
>> I
>>>> need
>>>>>> to
>>>>>>>>>>> create
>>>>>>>>>>>>> an issue myself and assign it to myself?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> - the current timestamp, which is problematic w.r.t.
>>> caching
>>>>> and
>>>>>>>>>>> testing,
>>>>>>>>>>>>> while providing no value.
>>>>>>>>>>>>> - Remove JarRequestBody#programArgs in favor of
>>>>> #programArgsList.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [1] FLINK-32377 <
>>>>>> https://issues.apache.org/jira/browse/FLINK-32377>
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-32377
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <li...@amazon.co.uk.invalid>
>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks Xintong for driving the effort.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I’d add a +1 to reworking configs, as suggested by @Jark
>>> and
>>>>>>>>>> @Chesnay,
>>>>>>>>>>>>>> especially the types. We have various configs that
>> encode
>>>>> Time /
>>>>>>>>>>>>> MemorySize
>>>>>>>>>>>>>> that are Long instead!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Hong
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <
>>> yuanmei.work@gmail.com
>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> CAUTION: This email originated from outside of the
>>>>>> organization.
>>>>>>>>>> Do
>>>>>>>>>>>>> not
>>>>>>>>>>>>>> click links or open attachments unless you can confirm
>> the
>>>>>> sender
>>>>>>>>>> and
>>>>>>>>>>>>> know
>>>>>>>>>>>>>> the content is safe.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks for driving this effort, Xintong!
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> To Chesnay
>>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
>>> Management"
>>>>>> item
>>>>>>>>>> is
>>>>>>>>>>>>>>>> marked as a must-have; will it require changes that
>>> break
>>>>>>>>>>> something?
>>>>>>>>>>>>>>>> What prevents it from being added in 2.1?
>>>>>>>>>>>>>>> As to "Disaggregated State Management".
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> We plan to provide a new type of state backend to
>> support
>>>> DFS
>>>>>> as
>>>>>>>>>>>>> primary
>>>>>>>>>>>>>>> storage.
>>>>>>>>>>>>>>> To achieve this, we at least need to include two parts
>> of
>>>>>> amends
>>>>>>>>>>> (not
>>>>>>>>>>>>>>> entirely sure yet, since we are still in the designing
>>> and
>>>>>>>>>> prototype
>>>>>>>>>>>>>> phase)
>>>>>>>>>>>>>>> 1. Statebackend Change
>>>>>>>>>>>>>>> 2. State Access Change
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Not all of the interfaces related are `@Internal`. Some
>>> of
>>>>> the
>>>>>>>>>>>>> interfaces
>>>>>>>>>>>>>>> like `StateBackend` are `@PublicEvolving`.
>>>>>>>>>>>>>>> So, you are right in the sense that "Disaggregated
>> State
>>>>>>>>>> Management"
>>>>>>>>>>>>>> itself
>>>>>>>>>>>>>>> probably does not need to be a "Must Have"
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> But I was hoping changes that related to public APIs
>> can
>>> be
>>>>>>>>>>> finalized
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> merged in Flink 2.0 (I will fix the wiki accordingly).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I also agree with Jark that 2.0 is a good chance to
>>> rework
>>>>> the
>>>>>>>>>>> default
>>>>>>>>>>>>>>> value of configurations.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>>> Yuan
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
>>>>>>>>>>> chesnay@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> Something else configuration-related is that there
>> are a
>>>>>> bunch of
>>>>>>>>>>>>>>>> options where the type isn't quite correct (e.g., a
>>> String
>>>>>> where
>>>>>>>>>> it
>>>>>>>>>>>>>>>> could be an enum, a string where it should be an int
>> or
>>>>>>>>>> something).
>>>>>>>>>>>>>>>> Could do a pass over those as well.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I think one more thing we need to consider to do in
>> 2.0
>>>> is
>>>>>>>>>>> changing
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> default value of configuration to improve out-of-box
>>> user
>>>>>>>>>>>>> experience.
>>>>>>>>>>>>>>>>> Currently, in order to run a Flink job, users may
>> need
>>> to
>>>>> set
>>>>>>>>>>>>>>>>> a bunch of configurations, such as minibatch,
>>> checkpoint
>>>>>>>>>> interval,
>>>>>>>>>>>>>>>>> exactly-once,
>>>>>>>>>>>>>>>>> incremental-checkpoint, etc. It's very verbose and
>> hard
>>>> to
>>>>>> use
>>>>>>>>>> for
>>>>>>>>>>>>>>>>> beginners.
>>>>>>>>>>>>>>>>> Most of them can have a universally applicable value.
>>>>>> Because
>>>>>>>>>>>>> changing
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> default value is a breaking change. I think It's
>> worth
>>>>>>>>>> considering
>>>>>>>>>>>>>>>> changing
>>>>>>>>>>>>>>>>> them in 2.0.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
>>>>>>>>>>> snuyanzin@gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> Hi Chesnay
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would
>> hope
>>>>> that
>>>>>>>>>> this
>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>> an entirely internal change, and could thus be an
>>>>>> incremental
>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>>>> independent of major releases.
>>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much are
>>> we
>>>>>>>>>> actually
>>>>>>>>>>>>>>>>>> re-writing?
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks for asking
>>>>>>>>>>>>>>>>>> Yes, you're right, that should be an internal change.
>>>>>>>>>>>>>>>>>> Yeah I was also thinking about incremental change
>>> (rule
>>>> by
>>>>>> rule
>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>> reasonable small group of rules).
>>>>>>>>>>>>>>>>>> And yes, this could be an independent (on major
>>> release)
>>>>>>>>>> activity
>>>>>>>>>>>>>>>>>> The problem is actually for children of RelOptRule.
>>>>>>>>>>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
>>>>>> mentioned
>>>>>>>>>>>>>> deprecated
>>>>>>>>>>>>>>>>>> api.
>>>>>>>>>>>>>>>>>> There are also children of ConverterRule (50+) which
>>> do
>>>>> not
>>>>>>>>>> have
>>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>> issues.
>>>>>>>>>>>>>>>>>> Maybe it could be considered as the next step to
>> have
>>>> all
>>>>>> the
>>>>>>>>>>>>> rules in
>>>>>>>>>>>>>>>>>> Java.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
>>>>>>>>>>>>> tonysong820@gmail.com>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hi Alex & Gyula,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the
>>> "[DISCUSS]
>>>>>>>>>> FLIP-321:
>>>>>>>>>>>>>>>>>> Introduce
>>>>>>>>>>>>>>>>>>>> an API deprecation process" thread [1]?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just
>> noticed
>>> I
>>>>>> pasted
>>>>>>>>>>> the
>>>>>>>>>>>>>> wrong
>>>>>>>>>>>>>>>>>> url
>>>>>>>>>>>>>>>>>>> in my previous email. Sorry for the mistake.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind
>>> this
>>>>> new
>>>>>> API
>>>>>>>>>>> has
>>>>>>>>>>>>>> been
>>>>>>>>>>>>>>>>>>>> previously discussed on the mailing list. Do we
>>> have a
>>>>>> list
>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>> shortcomings
>>>>>>>>>>>>>>>>>>>> in the current DataStream API that it tries to
>>>> resolve?
>>>>>> How
>>>>>>>>>>> does
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into the
>>>>>> picture?
>>>>>>>>>>> Will
>>>>>>>>>>>>> it
>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>> kept
>>>>>>>>>>>>>>>>>>>> as is or subsumed by new API?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for
>> the
>>>>>>>>>> DataStream
>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>>>> unless
>>>>>>>>>>>>>>>>>>>> we have a very good reason to do so and with a
>>> proper
>>>>>>>>>>> discussion
>>>>>>>>>>>>>> about
>>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>> as Alex said.
>>>>>>>>>>>>>>>>>>> The ProcessFunction API which is targeting to
>> replace
>>>>>>>>>> DataStream
>>>>>>>>>>>>> API
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>> still a proposal, not a decision. Sorry for the
>>>>> confusion,
>>>>>> I
>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>> been more careful with my words, not giving the
>>>>> impression
>>>>>>>>>> that
>>>>>>>>>>>>> this
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>> something we'll do anyway.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> There will be a FLIP describing the motivations and
>>>>>> designs in
>>>>>>>>>>>>>> detail,
>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>> the community to discuss and vote on. We are still
>>>>> working
>>>>>> on
>>>>>>>>>>> it.
>>>>>>>>>>>>>> TBH,
>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>> is not trivial and we would need more time on it.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Just to quickly share some backgrounds:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>    - We see quite some problems with the current
>>>>>> DataStream
>>>>>>>>>> APIs
>>>>>>>>>>>>>>>>>>>       - Users are working with concrete classes
>>> rather
>>>>>> than
>>>>>>>>>>>>>>>> interfaces,
>>>>>>>>>>>>>>>>>>>       which means
>>>>>>>>>>>>>>>>>>>       - Users can access methods that are designed
>>> to
>>>> be
>>>>>> used
>>>>>>>>>> by
>>>>>>>>>>>>>>>> internal
>>>>>>>>>>>>>>>>>>>          classes, even though they are annotated
>>> with
>>>>>>>>>>> `@Internal`.
>>>>>>>>>>>>>>>> E.g.,
>>>>>>>>>>>>>>>>>>>          `DataStream#getTransformation`.
>>>>>>>>>>>>>>>>>>>          - Changes to the non-API implementations
>>>> (e.g.,
>>>>>>>>>>>>>>>>>> `Transformation`)
>>>>>>>>>>>>>>>>>>>          would affect the API classes (e.g.,
>>>>>> `DataStream`),
>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>> makes it hard to
>>>>>>>>>>>>>>>>>>>          provide binary compatibility.
>>>>>>>>>>>>>>>>>>>       - Internal classes are used as parameter /
>>>>>> return-value
>>>>>>>>>> of
>>>>>>>>>>>>>>>> public
>>>>>>>>>>>>>>>>>>>       APIs. E.g., while `AbstractStreamOperator`
>> is
>>>>>>>>>>>>> PublicEvolving,
>>>>>>>>>>>>>>>>>>> `StreamTask`
>>>>>>>>>>>>>>>>>>>       which returns from
>>>>>>>>>>>>> `AbstractStreamOperator#getContainingTask`
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>> Internal.
>>>>>>>>>>>>>>>>>>>       - In many cases, users are asked to extend
>> the
>>>> API
>>>>>>>>>>> classes,
>>>>>>>>>>>>>>>> rather
>>>>>>>>>>>>>>>>>>>       than implementing interfaces. E.g.,
>>>>>>>>>>>>> `AbstractStreamOperator`.
>>>>>>>>>>>>>>>>>>>          - Any changes to the base classes, even
>> the
>>>>>> internal
>>>>>>>>>>>>> part,
>>>>>>>>>>>>>>>> may
>>>>>>>>>>>>>>>>>>>          affect the behavior of the user-provided
>>>>>> sub-classes
>>>>>>>>>>>>>>>>>>>          - Users can override the behavior of the
>>> base
>>>>>> classes
>>>>>>>>>>>>>>>>>>>       - The API module `flink-streaming-java`
>>> contains
>>>>>> non-API
>>>>>>>>>>>>>>>> classes,
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>       depends on internal modules such as
>>>>> `flink-runtime`,
>>>>>>>>>> which
>>>>>>>>>>>>>> means
>>>>>>>>>>>>>>>>>>>       - Changes to the internal modules may affect
>>> the
>>>>> API
>>>>>>>>>>>>> modules,
>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>          requires users to re-build their
>>> applications
>>>>>> upon
>>>>>>>>>>>>> upgrading
>>>>>>>>>>>>>>>>>>>          - The artifact users need for building
>>> their
>>>>>>>>>>> application
>>>>>>>>>>>>>>>> is larger
>>>>>>>>>>>>>>>>>>>          than necessary.
>>>>>>>>>>>>>>>>>>>       - We probably should not expose operators
>>> (e.g.,
>>>>>>>>>>>>>>>>>>>       `AbstractStreamOperator`) to users.
>> Functions
>>>>>> should be
>>>>>>>>>>>>> enough
>>>>>>>>>>>>>>>>>>> for users to
>>>>>>>>>>>>>>>>>>>       define their data processing logics.
>> Exposing
>>>>>>>>>>> operator-level
>>>>>>>>>>>>>>>>>> concepts
>>>>>>>>>>>>>>>>>>>       (e.g., mailbox thread model, checkpoint
>>> barrier
>>>>>>>>>> alignment,
>>>>>>>>>>>>>>>> etc.) is
>>>>>>>>>>>>>>>>>>>       unnecessary and limits the improvement
>>> regarding
>>>>>> such
>>>>>>>>>>>>> exposed
>>>>>>>>>>>>>>>>>>> mechanisms
>>>>>>>>>>>>>>>>>>>       with compatibility considerations.
>>>>>>>>>>>>>>>>>>>       - The current DataStream API seems to be a
>>>> mixture
>>>>>> of
>>>>>>>>>> many
>>>>>>>>>>>>>>>> things,
>>>>>>>>>>>>>>>>>>>       making it hard to understand especially for
>>>>>> newcomers.
>>>>>>>>>> It
>>>>>>>>>>>>> might
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>> better
>>>>>>>>>>>>>>>>>>>       to re-organize it into several parts: (the
>>>>> taxonomy
>>>>>>>>>> below
>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>> just
>>>>>>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>>       example; we are still working on
>> this)
>>>>>>>>>>>>>>>>>>>          - The most fundamental stateful stream
>>>>>> processing:
>>>>>>>>>>>>> streams,
>>>>>>>>>>>>>>>>>>>          partitions / key, process functions,
>> state,
>>>>>>>>>>>>> timeline-service
>>>>>>>>>>>>>>>>>>>          - An extension for common batch-streaming
>>>>> unified
>>>>>>>>>>>>> functions:
>>>>>>>>>>>>>>>>>> map,
>>>>>>>>>>>>>>>>>>>          flatmap, filter, agg, reduce, join, etc.
>>>>>>>>>>>>>>>>>>>          - An extension for windowing supports:
>>>> window,
>>>>>>>>>>>>> triggering
>>>>>>>>>>>>>>>>>>>          - An extension for event-time supports:
>>> event
>>>>>> time,
>>>>>>>>>>>>>> watermark
>>>>>>>>>>>>>>>>>>>          - The extensions are like short-cuts /
>>>> sugars,
>>>>>>>>>> without
>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>> users
>>>>>>>>>>>>>>>>>>>          can probably still achieve the same
>>> behavior
>>>> by
>>>>>>>>>> working
>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>          fundamental APIs, but would be a lot
>> easier
>>>>> with
>>>>>> the
>>>>>>>>>>>>>>>> extensions
>>>>>>>>>>>>>>>>>>>       - The original plan was to do in-place
>>>> refactors /
>>>>>>>>>> changes
>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>    DataStream API. Some related items are listed
>> in
>>>> this
>>>>>> doc
>>>>>>>>>> [2]
>>>>>>>>>>>>>>>> attached
>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>    the kicking off email [3]. Not all of the above
>>>>> issues
>>>>>> are
>>>>>>>>>>>>> listed,
>>>>>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>    we had not looked into this as deeply
>> by
>>>> that
>>>>>> time.
>>>>>>>>>>>>>>>>>>>    - We proposed this as a new API rather than
>>>> in-place
>>>>>>>>>>> refactors
>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>    2.0 work item list, because we realized the
>>> changes
>>>>>> might
>>>>>>>>>> be
>>>>>>>>>>>>> too
>>>>>>>>>>>>>>>> big
>>>>>>>>>>>>>>>>>>> for an
>>>>>>>>>>>>>>>>>>>    in-place change. First having a new API then
>>>>> gradually
>>>>>>>>>>> retiring
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>    would help users to smoothly migrate between
>>> them.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> A thorough discussion is definitely needed once the
>>>> FLIP
>>>>> is
>>>>>>>>>> out.
>>>>>>>>>>>>> And
>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>> course it's possible that the FLIP might be
>> rejected.
>>>>> Given
>>>>>>>>>> that
>>>>>>>>>>>>> we
>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>> planning for release 2.0, I just feel it would be
>>>> better
>>>>> to
>>>>>>>>>>> bring
>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>> up
>>>>>>>>>>>>>>>>>>> early, even though the concrete plan is not yet ready.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Xintong
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> 
>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>>>>>>>>>>>>>>>>>> [2]
>> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
>>>>>>>>>>>>>>>>>>> [3]
>>>>>>>>>>>>> 
>>>>> https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
>>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
>>>>>> gyfora@apache.org
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> Hey!
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I share the same concerns mentioned above
>> regarding
>>>> the
>>>>>>>>>>>>>>>>>> "ProcessFunction
>>>>>>>>>>>>>>>>>>>> API".
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I don't think we should create a replacement for
>> the
>>>>>>>>>> DataStream
>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>>>>> unless
>>>>>>>>>>>>>>>>>>>> we have a very good reason to do so and with a
>>> proper
>>>>>>>>>>> discussion
>>>>>>>>>>>>>> about
>>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>> as Alex said.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>> Gyula
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander
>> Fedulov <
>>>>>>>>>>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Hi Xintong,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> By compatibility discussion do you mean the
>>>> "[DISCUSS]
>>>>>>>>>>> FLIP-321:
>>>>>>>>>>>>>>>>>>>> Introduce
>>>>>>>>>>>>>>>>>>>>> an API deprecation process" thread [1]?
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I am also curious to know if the rationale behind
>>>> this
>>>>>> new
>>>>>>>>>> API
>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>>> been
>>>>>>>>>>>>>>>>>>>>> previously discussed on the mailing list. Do we
>>> have
>>>> a
>>>>>> list
>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>> shortcomings
>>>>>>>>>>>>>>>>>>>>> in the current DataStream API that it tries to
>>>> resolve?
>>>>>> How
>>>>>>>>>>> does
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> current ProcessFunction functionality fit into
>> the
>>>>>> picture?
>>>>>>>>>>>>> Will it
>>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>> kept
>>>>>>>>>>>>>>>>>>>>> as is or subsumed by new API?
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> 
>>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
>>>>>>>>>>>>> tonysong820@gmail.com>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
>>> most
>>>>>>>>>> headaches
>>>>>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>> it's
>>>>>>>>>>>>>>>>>>>>>>> very unclear what it actually entails; like is
>> it
>>>> an
>>>>>>>>>>> entirely
>>>>>>>>>>>>>>>>>>>> separate
>>>>>>>>>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
>>> extension
>>>> of
>>>>>>>>>>>>> DataStream.
>>>>>>>>>>>>>>>>>>> How
>>>>>>>>>>>>>>>>>>>>>> much
>>>>>>>>>>>>>>>>>>>>>>> will it share the internals with DataStream
>> etc.;
>>>> how
>>>>>> does
>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>> relate
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
>> API
>>>>> uses
>>>>>>>>>>>>>>>>>> underneath).
>>>>>>>>>>>>>>>>>>>>>> I totally understand your confusion. We started
>>>>> planning
>>>>>>>>>> this
>>>>>>>>>>>>>> after
>>>>>>>>>>>>>>>>>>>>> kicking
>>>>>>>>>>>>>>>>>>>>>> off the release 2.0, so there's still a lot to
>> be
>>>>>> explored
>>>>>>>>>>> and
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> plan
>>>>>>>>>>>>>>>>>>>>>> keeps changing.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>    - In the beginning, we planned to do an
>>> in-place
>>>>>>>>>> refactor
>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>> DataStream
>>>>>>>>>>>>>>>>>>>>>>    API, until the API migration period is
>>> proposed.
>>>>>>>>>>>>>>>>>>>>>>    - Then we want to make it an entirely
>> separate
>>>> API
>>>>>> to
>>>>>>>>>>>>>>>>>> DataStream,
>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>    listed as a must-have for release 2.0 so
>> that
>>> we
>>>>> can
>>>>>>>>>>> remove
>>>>>>>>>>>>>>>>>>>> DataStream
>>>>>>>>>>>>>>>>>>>>>> once
>>>>>>>>>>>>>>>>>>>>>>    it's ready.
>>>>>>>>>>>>>>>>>>>>>>    - However, depending on the outcome of the
>> API
>>>>>>>>>>> compatibility
>>>>>>>>>>>>>>>>>>>>> discussion
>>>>>>>>>>>>>>>>>>>>>>    [1], we may not be able to remove DataStream
>>> in
>>>>> 2.0
>>>>>>>>>>> anyway,
>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>> means
>>>>>>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>>>    might need to re-evaluate the necessity of
>>> this
>>>>>> item for
>>>>>>>>>>>>> 2.0.
>>>>>>>>>>>>>>>>>>>>>> I'd say we wait a bit longer for the
>> compatibility
>>>>>>>>>> discussion
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>> decide the priority for this item afterwards.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Xintong
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>> https://lists.apache.org/list.html?dev@flink.apache.org
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay
>> Schepler <
>>>>>>>>>>>>>>>>>> chesnay@apache.org
>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of
>>>> items.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
>>>>>> Management"
>>>>>>>>>>>>> item
>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>> marked
>>>>>>>>>>>>>>>>>>>>>>> as a must-have; will it require changes that
>>> break
>>>>>>>>>>> something?
>>>>>>>>>>>>>>>>>> What
>>>>>>>>>>>>>>>>>>>>>> prevents
>>>>>>>>>>>>>>>>>>>>>>> it from being added in 2.1?
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make
>>>> Java
>>>>> 17
>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> default,
>>>>>>>>>>>>>>>>>>>>> drop
>>>>>>>>>>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a
>> must-have
>>>>> "Drop
>>>>>>>>>> Java
>>>>>>>>>>> 8"
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I
>> would
>>>> hope
>>>>>> that
>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>> an entirely internal change, and could thus be
>> an
>>>>>>>>>>> incremental
>>>>>>>>>>>>>>>>>>> process
>>>>>>>>>>>>>>>>>>>>>>> independent of major releases.
>>>>>>>>>>>>>>>>>>>>>>> What is the actual scale of this item; how much
>>> are
>>>>> we
>>>>>>>>>>>>> actually
>>>>>>>>>>>>>>>>>>>>>> re-writing?
>>>>>>>>>>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise
>> this
>>>> to
>>>>> a
>>>>>>>>>>>>>>>>>> must-have; i
>>>>>>>>>>>>>>>>>>>>> think
>>>>>>>>>>>>>>>>>>>>>>> I marked it down as nice-to-have only because
>> it
>>>>>> depends
>>>>>>>>>> on
>>>>>>>>>>>>>>>>>> another
>>>>>>>>>>>>>>>>>>>>> item.
>>>>>>>>>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
>>> most
>>>>>>>>>> headaches
>>>>>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>> it's
>>>>>>>>>>>>>>>>>>>>>>> very unclear what it actually entails; like is
>> it
>>>> an
>>>>>>>>>>> entirely
>>>>>>>>>>>>>>>>>>>> separate
>>>>>>>>>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
>>> extension
>>>> of
>>>>>>>>>>>>> DataStream.
>>>>>>>>>>>>>>>>>>> How
>>>>>>>>>>>>>>>>>>>>>> much
>>>>>>>>>>>>>>>>>>>>>>> will it share the internals with DataStream
>> etc.;
>>>> how
>>>>>> does
>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>> relate
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
>> API
>>>>> uses
>>>>>>>>>>>>>>>>>> underneath).
>>>>>>>>>>>>>>>>>>>>>>> There are a few items I added as ideas which
>>> don't
>>>>>> have a
>>>>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>>>>>> yet;
>>>>>>>>>>>>>>>>>>>>>>> would love to get some feedback on those.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Hi devs,
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> As previously discussed in [1], we had been
>>>>> collecting
>>>>>>>>>> work
>>>>>>>>>>>>> item
>>>>>>>>>>>>>>>>>>>>>> proposals
>>>>>>>>>>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the
>> wiki
>>>> page
>>>>>> [2].
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>    - As we have passed the due date, I'd like
>> to
>>>>>> kindly
>>>>>>>>>>> remind
>>>>>>>>>>>>>>>>>>>> everyone
>>>>>>>>>>>>>>>>>>>>>> *not
>>>>>>>>>>>>>>>>>>>>>>>    to add / remove items directly on the wiki
>>>> page*.
>>>>>> If
>>>>>>>>>>>>> needed,
>>>>>>>>>>>>>>>>>>>> please
>>>>>>>>>>>>>>>>>>>>>> post
>>>>>>>>>>>>>>>>>>>>>>>    in this thread or reach out to the release
>>>>> managers
>>>>>>>>>>>>> instead.
>>>>>>>>>>>>>>>>>>>>>>>    - I've reached out to some folks for
>>>>> clarifications
>>>>>>>>>> about
>>>>>>>>>>>>>>>>>> their
>>>>>>>>>>>>>>>>>>>>>>>    proposals. Some of them mentioned that they
>>> can
>>>>>> not yet
>>>>>>>>>>>>> tell
>>>>>>>>>>>>>>>>>>>> whether
>>>>>>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>>>>    should do an item or not, and would need
>> more
>>>>> time
>>>>>> /
>>>>>>>>>>>>>>>>>> discussions
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> make
>>>>>>>>>>>>>>>>>>>>>>>    the decision. So I added a new symbol for
>>> items
>>>>>> whose
>>>>>>>>>>>>>>>>>> priorities
>>>>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>>>> `TBD`.
>>>>>>>>>>>>>>>>>>>>>>> Now it's time to collaboratively decide a
>> minimum
>>>> set
>>>>>> of
>>>>>>>>>>>>>>>>>> must-have
>>>>>>>>>>>>>>>>>>>>> items.
>>>>>>>>>>>>>>>>>>>>>>> I've gone through the entire list of proposed
>>>> items,
>>>>>> and
>>>>>>>>>>> found
>>>>>>>>>>>>>>>>>> most
>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>> them
>>>>>>>>>>>>>>>>>>>>>>> make quite much sense. So I think an online
>> sync
>>>>> might
>>>>>> not
>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>> necessary
>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread,
>>>> where
>>>>>>>>>>> everyone
>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>> comment
>>>>>>>>>>>>>>>>>>>>>>> on how they think the list can be improved,
>>>> followed
>>>>>> by a
>>>>>>>>>>>>> VOTE to
>>>>>>>>>>>>>>>>>>>>>> formally
>>>>>>>>>>>>>>>>>>>>>>> make the decision.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Any feedback and opinions, including but not
>>>> limited
>>>>> to
>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> following
>>>>>>>>>>>>>>>>>>>>>>> aspects, will be appreciated.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>    - Important items that are missing from the
>>>> list
>>>>>>>>>>>>>>>>>>>>>>>    - Concerns regarding the listed items or
>>> their
>>>>>>>>>> priorities
>>>>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Xintong
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> [1]
>> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
>>>>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>> 
>>>>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>> Sergey
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Best
>>>>>>>>>>>>> 
>>>>>>>>>>>>> ConradJam
>>>>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 



Re: [DISCUSS] Release 2.0 Work Items

Posted by zhiqiang li <li...@gmail.com>.
I saw in [1] that connectors, formats, and even user code are planned to
become pluggable. If connectors are pluggable, the benefit is obvious:
conflicts between different versions of the same jar can be avoided.
Without classloader isolation, dependencies have to be shaded to resolve
such conflicts, which wastes a lot of development time.
I know that this change may involve a lot of API changes, so I would like
to discuss in this email whether we can make it in Flink 2.0.

> Plugins facilitate a strict separation of code through restricted
> classloaders.
>
> Plugins cannot access classes from other plugins or from Flink that have
> not been specifically whitelisted.
> This strict isolation allows plugins to contain conflicting versions of
> the same library without the need to relocate classes or to converge to
> common versions.
> Currently, file systems and metric reporters are pluggable *but in the
> future, connectors, formats, and even user code should also be pluggable.*
>
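
To make the "restricted classloaders" idea concrete, here is a minimal Java
sketch. It is not Flink's actual plugin loader: the names
PluginIsolationSketch, RestrictedPluginClassLoader and canLoad are
hypothetical, and the allow-list is simplified to a list of class-name
prefixes. It only shows the general technique of hiding host classes from a
plugin unless they are explicitly shared.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

public class PluginIsolationSketch {

    /** Hypothetical, simplified plugin classloader (an assumption, not Flink's implementation). */
    public static final class RestrictedPluginClassLoader extends URLClassLoader {
        private final ClassLoader hostLoader;
        private final List<String> allowedPrefixes;

        public RestrictedPluginClassLoader(
                URL[] pluginJars, ClassLoader hostLoader, List<String> allowedPrefixes) {
            // A null parent means only bootstrap classes are visible by default.
            super(pluginJars, null);
            this.hostLoader = hostLoader;
            this.allowedPrefixes = List.copyOf(allowedPrefixes);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    loaded = isAllowed(name)
                            ? hostLoader.loadClass(name)    // explicitly shared with the host
                            : super.loadClass(name, false); // bootstrap classes, then the plugin's own jars
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }

        private boolean isAllowed(String className) {
            return allowedPrefixes.stream().anyMatch(className::startsWith);
        }
    }

    /** Returns true if the given loader can see the class, false if it is hidden. */
    public static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader host = PluginIsolationSketch.class.getClassLoader();
        String hostClass = PluginIsolationSketch.class.getName();
        URL[] noJars = new URL[0]; // a real plugin would list its jar files here

        try (RestrictedPluginClassLoader isolated =
                        new RestrictedPluginClassLoader(noJars, host, List.of());
                RestrictedPluginClassLoader sharing =
                        new RestrictedPluginClassLoader(noJars, host, List.of(hostClass))) {
            System.out.println("bootstrap class visible: " + canLoad(isolated, "java.lang.String"));
            System.out.println("host class visible without allow-list: " + canLoad(isolated, hostClass));
            System.out.println("host class visible with allow-list: " + canLoad(sharing, hostClass));
        }
    }
}
```

With an empty allow-list the plugin loader should see bootstrap classes but
not the host application's classes; adding the host class name to the
allow-list makes it visible again. Because each plugin would get its own
loader and its own jars, two plugins could bundle different versions of the
same library without shading.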

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/

On Tue, Jul 11, 2023 at 6:50 PM Xintong Song <to...@gmail.com> wrote:

> >
> > What we might want to come up with is a summary with each 2.0.0 issue on
> > why it should be included or not. That summary is something the community
> > could vote on. WDYT? I'm happy to help here.
> >
>
> That sounds great. Thanks for offering the help. I'll also try to go
> through the issues, but TBH I'm quite overwhelmed and cannot promise to get
> this done very soon. Your help is very much needed.
>
>
> Best,
>
> Xintong
>
>
>
> On Tue, Jul 11, 2023 at 6:08 PM Matthias Pohl
> <ma...@aiven.io.invalid> wrote:
>
> > @Xintong I guess it makes sense. I agree with your conclusions on the
> four
> > mentioned Jira issues.
> >
> > I just checked any issues that have fixVersion = 2.0.0 [1]. There are a
> few
> > more items that are not affiliated with FLINK-3957 [2]. I guess we should
> > find answers for these issues: Either closing them with a reason to have
> a
> > consistent state in Jira or adding them to the feature list as part of a
> > separate voting thread (to leave the current vote untouched).
> >
> > What we might want to come up with is a summary with each 2.0.0 issue on
> > why it should be included or not. That summary is something the community
> > could vote on. WDYT? I'm happy to help here.
> >
> > Matthias
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/browse/FLINK-32437?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%202.0.0%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)%20%20
> > [2] https://issues.apache.org/jira/browse/FLINK-3957
> >
> >
> > On Tue, Jul 11, 2023 at 5:01 AM Xintong Song <to...@gmail.com>
> > wrote:
> >
> > > @Zhu,
> > > As you are downgrading "Clarify the scopes of configuration options" to
> > > nice-to-have priority, could you also bring that up in the vote
> > thread[1]?
> > > I'm asking because there are people who already voted on the original
> > list.
> > > I think restarting the vote is probably an overkill and unnecessary,
> but
> > we
> > > should at least bring this change to their attention.
> > >
> > > @Matthias,
> > > Thanks a lot for bringing this up. I wasn't aware of this early
> > umbrella. I
> > > haven't gone through everything in FLINK-3957 yet. I'll do it asap.
> > >
> > > Just quickly went through the 4 issues you mentioned.
> > > - FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as
> long
> > as
> > > the new APIs that we want users to migrate to are ready. For these 2
> > > tickets, I think introduction of the updated APIs should be
> > straightforward
> > > and feasible for 1.18.
> > > - FLINK-13926: I'm not sure about this one. The two mentioned classes
> > > `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not
> even
> > > marked as Public or PublicEvolving APIs. Moreover, I don't see a good
> way
> > > to smoothly replace the classes with a generic version.
> > > - FLINK-5126: This is a bit unclear to me. From the description and
> > > conversation on the ticket, I don't fully understand which concrete
> APIs
> > > the ticket is referring to. Or maybe it refers to all / most of the
> APIs
> > > > that throw Exception / IOException in general. Moreover, I don't think
> > > removing Exception / IOException from the API signature is a breaking
> > > change. It requires no code changes on the caller side.
> > >
> > > WDYT?
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > > [1] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
> > > [2] https://issues.apache.org/jira/browse/FLINK-3957
> > >
> > > On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
> > > <ma...@aiven.io.invalid> wrote:
> > >
> > > > I brought it up in the deprecating APIs in 1.18 thread [1] already
> but
> > it
> > > > feels misplaced there. I just wanted to ask whether someone did a
> pass
> > > over
> > > > FLINK-3957 [2]. I came across it when going through the release 2.0
> > > feature
> > > > list [3] as part of the vote. I have the feeling that there are some
> > > valid
> > > > action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6]) which
> do
> > > not
> > > > seem to be listed in the 2.0 feature list [3], yet (or are included
> in
> > > some
> > > > of the bigger items). Majority of the subtasks are probably covered
> by
> > > the
> > > > DataSet removal, the Scala API removal and the ProcessFunction
> > > refactoring.
> > > > Other subtasks (FLINK-14068 [7]) made it into the feature list.
> > > >
> > > > > I haven't worked with the SDK code enough to be able to judge whether
> > > > the subtasks are still reasonable or actually obsolete. That is why I
> > > > wanted to mention the Jira issue here once more.
> > > >
> > > > I don't consider it a blocker for the ongoing vote but was wondering
> > > > whether it makes sense for someone who might have more experience in
> > that
> > > > field to add some of the subtasks to the feature list.
> > > >
> > > > Or shall we just consider it as "not interesting enough" because
> nobody
> > > > added it in the first place to the 2.0 feature list [3]?
> > > >
> > > > Matthias
> > > >
> > > > [1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
> > > > [2] https://issues.apache.org/jira/browse/FLINK-3957
> > > > [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > [4] https://issues.apache.org/jira/browse/FLINK-4675
> > > > [5] https://issues.apache.org/jira/browse/FLINK-5126
> > > > [6] https://issues.apache.org/jira/browse/FLINK-13926
> > > > [7] https://issues.apache.org/jira/browse/FLINK-14068
> > > >
> > > > On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
> > > >
> > > > > Agreed that we should deprecate affected APIs as soon as possible.
> > > > > But there is not much time before the feature freeze of 1.18,
> hence
> > > > > > I'm a bit concerned that some of the deprecations might not be done in 1.18.
> > > > >
> > > > > We are currently looking into the improvements of the configuration
> > > > layer.
> > > > > Most of the proposed changes would require a public discussion, or
> > even
> > > > > a FLIP, which I think can hardly close before the feature freeze of
> > > 1.18.
> > > > > And some of the APIs can be deprecated only after the corresponding
> > new
> > > > > APIs are developed. Therefore we previously targeted them for 1.19.
> > > > >
> > > > > We may review later to see what deprecation work can be done in
> 1.18
> > > and
> > > > > make it if possible. I think we can do the work even after the
> > feature
> > > > > freeze
> > > > > date, if it is a purely deprecation work (simply adding
> annotations).
> > > > WDYT?
> > > > >
> > > > > I'm also changing the priority of "Clarify the scopes of
> > configuration
> > > > > options"
> > > > > > to nice-to-have. I think most of the work does not involve breaking changes and can
> > > > > be done in 1.x or 2.1+. For the breaking changes which might be
> > needed,
> > > > we
> > > > > will consider it as part of the configuration layer rework.
> > > > >
> > > > > Thanks,
> > > > > Zhu
> > > > >
> > > > > > On Mon, Jul 10, 2023 at 7:58 PM Xintong Song <to...@gmail.com> wrote:
> > > > > >
> > > > > > >
> > > > > > > At what point are the FLIP discussions coming into play?
> > > > > >
> > > > > > I keep wondering if these shouldn't have started already.
> > > > > >
> > > > > >
> > > > > > I think this depends on the responsible contributor and reviewer
> of
> > > > > > individual items. From my perspective, the FLIP discussions can
> > start
> > > > any
> > > > > > time as long as the contributors are ready, the earlier the
> better.
> > > > > >
> > > > > >
> > > > > > What we need to ensure is that all breaking API changes are
> > > > > > > discussed/decided before 1.18 is released so we can deprecate
> > > > affected
> > > > > APIs.
> > > > > > >
> > > > > >
> > > > > > The introduction of the migration period has brought the
> > requirement
> > > to
> > > > > > plan the removal of public APIs 2 minor releases ahead of the
> major
> > > > > > release, which is TBH a bit unexpected. I agree it would be nice
> if
> > > we
> > > > > can
> > > > > > get the FLIPs ready by releasing 1.18. But I also don't think we
> > > should
> > > > > > rush on it. If the deprecation of a Public API does not make
> 1.18,
> > we
> > > > may
> > > > > > carry it until 3.0. Or if there are many Public APIs whose
> > > deprecation
> > > > > does
> > > > > > not make 1.18, we may deprecate them in 1.19 and postpone the
> major
> > > > > version
> > > > > > bump to after a 1.20 release. Moreover, as mentioned in
> > FLIP-321[1],
> > > > > > exceptions are discussable given that the migration period is
> newly
> > > > > > proposed and we did not give developers the chance to plan things
> > > > ahead.
> > > > > To
> > > > > > > sum up, I'd say we try to identify APIs that need to be deprecated
> in
> > > 1.18
> > > > > > with best efforts, and evaluate the remaining options (carrying
> the
> > > API
> > > > > for
> > > > > > the entire 2.x cycle, postpone 2.0, or making an exception)
> > > > case-by-case.
> > > > > > WDYT?
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Xintong
> > > > > >
> > > > > >
> > > > > > [1]
> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > >
> > > > > > On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <
> > chesnay@apache.org
> > > >
> > > > > wrote:
> > > > > >
> > > > > > > At what point are the FLIP discussions coming into play?
> > > > > > >
> > > > > > > I keep wondering if these shouldn't have started already.
> > > > > > > It just seems that a lot of decisions are implicitly reliant on
> > the
> > > > > > > items even being accepted.
> > > > > > > Estimates can only be provided if we actually know the scope of
> > the
> > > > > > > change, but that's not always clear from the description in the
> > > doc.
> > > > > > >
> > > > > > > What we need to ensure is that all breaking API changes are
> > > > > > > discussed/decided before 1.18 is released so we can deprecate
> > > > affected
> > > > > > > APIs.
> > > > > > >
> > > > > > > On 10/07/2023 11:32, Xintong Song wrote:
> > > > > > > > Hi Matthias,
> > > > > > > >
> > > > > > > > The questions you asked are indeed very important. Here're
> some
> > > > quick
> > > > > > > > responses, based on the plans I had in mind, which I have not
> > > > aligned
> > > > > > > with
> > > > > > > > other release managers yet.
> > > > > > > >
> > > > > > > > In the previous discussions between the RMs, we were not able
> > to
> > > > make
> > > > > > > > proposals on things like how to make a time plan, how to
> manage
> > > the
> > > > > > > release
> > > > > > > > branch, etc., due to the lack of inputs on e.g., the work
> items
> > > > need
> > > > > to
> > > > > > > be
> > > > > > > > included (which transitively depends on the API compatibility
> > to
> > > > > provide
> > > > > > > > between major versions) and the workloads / time needed for
> > them.
> > > > > With
> > > > > > > the
> > > > > > > > recent discussions, we have collected at least the majority
> of
> > > the
> > > > > inputs
> > > > > > > > needed.
> > > > > > > >
> > > > > > > > Here are things that I think we as the release managers would
> > do
> > > > next
> > > > > > > > (again, not aligned with other release managers yet)
> > > > > > > > - Creating a time plan, by reaching out to people to
> understand
> > > the
> > > > > > > > estimated workloads, prerequisites and ETA of each work item.
> > > > > > > > - Make a proposal on how to manage the release branch, i.e.,
> > when
> > > > to
> > > > > cut
> > > > > > > > the branch and whether to ship the milestone releases, etc.
> > > > > > > > - Set-up regular release syncs (bi-weekly / monthly) to
> update
> > > the
> > > > > status
> > > > > > > > and draw attention to where help is needed.
> > > > > > > >
> > > > > > > > So back to your questions.
> > > > > > > >
> > > > > > > > There are still to-be-discussed items in the list of
> features.
> > > > > What's the
> > > > > > > >> plan with those?
> > > > > > > > When collecting ETA, for items that the completion time
> cannot
> > > yet
> > > > be
> > > > > > > > estimated, we would like to have at least a time by which the
> > > > > estimation
> > > > > > > > can be made. I think the same applies to the to-be-discussed
> > > items.
> > > > > And
> > > > > > > if
> > > > > > > > the items should be included as must-haves, we would need
> > another
> > > > > vote to
> > > > > > > > adjust the must-have item list.
> > > > > > > >
> > > > > > > > Some of them don't have anyone assigned.
> > > > > > > > My concern is that they will be overlooked because nobody
> feels
> > > to
> > > > > be in
> > > > > > > >> charge.
> > > > > > > > This is a tricky one. For must-have items without assignees,
> we
> > > as
> > > > > the
> > > > > > > > release managers should be responsible for raising them up in
> > the
> > > > > release
> > > > > > > > syncs, and try to find assignees for them. Hopefully, there
> > will
> > > be
> > > > > > > someone
> > > > > > > > who stands out. But it is possible that for a must-have item
> > > nobody
> > > > > wants
> > > > > > > > to work on it. If that happens, which I don't think it will,
> it
> > > > > probably
> > > > > > > > means the item is not that critical and we may have to
> exclude
> > it
> > > > > from
> > > > > > > the
> > > > > > > > release. Either way, they should not be overlooked, because
> > IMHO
> > > > > release
> > > > > > > > managers should be responsible for trying to get someone to
> > work
> > > on
> > > > > the
> > > > > > > > un-assigned items.
> > > > > > > >
> > > > > > > > We'll have more discussions soon and keep the community
> > updated.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > >
> > > > > > > > Xintong
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> > > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > > >
> > > > > > > >> Now that the vote is started on the must-have items: There
> are
> > > > still
> > > > > > > >> to-be-discussed items in the list of features. What's the
> plan
> > > > with
> > > > > > > those?
> > > > > > > >> Some of them don't have anyone assigned. Were these items
> > > > discussed
> > > > > > > among
> > > > > > > >> the release managers? So far, it looks like they are handled
> > as
> > > > > > > >> nice-to-have if someone volunteers to pick them up?
> > > > > > > >>
> > > > > > > >> My concern is that they will be overlooked because nobody
> > feels
> > > to
> > > > > be in
> > > > > > > >> charge.
> > > > > > > >>
> > > > > > > >> Best,
> > > > > > > >> Matthias
> > > > > > > >>
> > > > > > > >> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <
> > > > tonysong820@gmail.com
> > > > > >
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >>> Thanks all for the discussion.
> > > > > > > >>>
> > > > > > > >>> The wiki has been updated as discussed. I'm starting a vote
> > > now.
> > > > > > > >>>
> > > > > > > >>> Best,
> > > > > > > >>>
> > > > > > > >>> Xintong
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <
> > > > tonysong820@gmail.com
> > > > > >
> > > > > > > >> wrote:
> > > > > > > >>>> Hi ConradJam,
> > > > > > > >>>>
> > > > > > > >>>> I think Chesnay has already put his name as the
> Contributor
> > > for
> > > > > the
> > > > > > > two
> > > > > > > >>>> tasks you listed. Maybe you can reach out to him to see if
> > you
> > > > can
> > > > > > > >>>> collaborate on this.
> > > > > > > >>>>
> > > > > > > >>>> In general, I don't think contributing to a release 2.0
> > issue
> > > is
> > > > > much
> > > > > > > >>>> different from contributing to a regular issue. We haven't
> > yet
> > > > > created
> > > > > > > >>> JIRA
> > > > > > > > >>>> tickets for all the listed tasks because many of them need further
> > > > > > > >>>> discussions and / or FLIPs to decide whether and how they
> > > should
> > > > > be
> > > > > > > >>>> performed.
> > > > > > > >>>>
> > > > > > > >>>> Best,
> > > > > > > >>>>
> > > > > > > >>>> Xintong
> > > > > > > >>>>
> > > > > > > >>>>
> > > > > > > >>>>
> > > > > > > >>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <
> > > jam.gzczy@gmail.com>
> > > > > > > wrote:
> > > > > > > >>>>
> > > > > > > >>>>> Hi Community:
> > > > > > > >>>>>    I see some tasks in the 2.0 list that haven't been
> > > assigned
> > > > > yet. I
> > > > > > > >>> want
> > > > > > > >>>>> to take the initiative to take on some tasks that I can
> > > > > complete. How
> > > > > > > >>> do I
> > > > > > > >>>>> apply to the community for this part of the task? I am
> > > > > interested in
> > > > > > > >> the
> > > > > > > >>>>> following parts of FLINK-32377
> > > > > > > >>>>> <https://issues.apache.org/jira/browse/FLINK-32377>, do
> I
> > > need
> > > > > to
> > > > > > > >>> create
> > > > > > > > >>>>> an issue myself and assign it to myself?
> > > > > > > >>>>>
> > > > > > > >>>>> - the current timestamp, which is problematic w.r.t.
> > caching
> > > > and
> > > > > > > >>> testing,
> > > > > > > >>>>> while providing no value.
> > > > > > > >>>>> - Remove JarRequestBody#programArgs in favor of
> > > > #programArgsList.
> > > > > > > >>>>>
> > > > > > > >>>>> [1] FLINK-32377 <
> > > > > https://issues.apache.org/jira/browse/FLINK-32377>
> > > > > > > >>>>> https://issues.apache.org/jira/browse/FLINK-32377
> > > > > > > >>>>>
> > > > > > > > >>>>> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
> > > > > > > >>>>>
> > > > > > > >>>>>> Thanks Xintong for driving the effort.
> > > > > > > >>>>>>
> > > > > > > >>>>>> I’d add a +1 to reworking configs, as suggested by @Jark
> > and
> > > > > > > >> @Chesnay,
> > > > > > > >>>>>> especially the types. We have various configs that
> encode
> > > > Time /
> > > > > > > >>>>> MemorySize
> > > > > > > >>>>>> that are Long instead!
> > > > > > > >>>>>>
> > > > > > > >>>>>> Regards,
> > > > > > > >>>>>> Hong
> > > > > > > >>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <
> > yuanmei.work@gmail.com
> > > >
> > > > > > > >> wrote:
> > > > > > > >>>>>>> CAUTION: This email originated from outside of the
> > > > > organization.
> > > > > > > >> Do
> > > > > > > >>>>> not
> > > > > > > >>>>>> click links or open attachments unless you can confirm
> the
> > > > > sender
> > > > > > > >> and
> > > > > > > >>>>> know
> > > > > > > >>>>>> the content is safe.
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Thanks for driving this effort, Xintong!
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> To Chesnay
> > > > > > > >>>>>>>> I'm curious as to why the "Disaggregated State
> > Management"
> > > > > item
> > > > > > > >> is
> > > > > > > >>>>>>>> marked as a must-have; will it require changes that
> > break
> > > > > > > >>> something?
> > > > > > > >>>>>>>> What prevents it from being added in 2.1?
> > > > > > > >>>>>>> As to "Disaggregated State Management".
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> We plan to provide a new type of state backend to
> support
> > > DFS
> > > > > as
> > > > > > > >>>>> primary
> > > > > > > >>>>>>> storage.
> > > > > > > >>>>>>> To achieve this, we at least need to include two parts
> of
> > > > > amends
> > > > > > > >>> (not
> > > > > > > >>>>>>> entirely sure yet, since we are still in the designing
> > and
> > > > > > > >> prototype
> > > > > > > >>>>>> phase)
> > > > > > > >>>>>>> 1. Statebackend Change
> > > > > > > >>>>>>> 2. State Access Change
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Not all of the interfaces related are `@Internal`. Some
> > of
> > > > the
> > > > > > > >>>>> interfaces
> > > > > > > > >>>>>>> like `StateBackend` are `@PublicEvolving`.
> > > > > > > >>>>>>> So, you are right in the sense that "Disaggregated
> State
> > > > > > > >> Management"
> > > > > > > >>>>>> itself
> > > > > > > >>>>>>> probably does not need to be a "Must Have"
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> But I was hoping changes that related to public APIs
> can
> > be
> > > > > > > >>> finalized
> > > > > > > >>>>> and
> > > > > > > >>>>>>> merged in Flink 2.0 (I will fix the wiki accordingly).
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> I also agree with Jark that 2.0 is a good chance to
> > rework
> > > > the
> > > > > > > >>> default
> > > > > > > >>>>>>> value of configurations.
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Best
> > > > > > > >>>>>>> Yuan
> > > > > > > >>>>>>>
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> > > > > > > >>> chesnay@apache.org>
> > > > > > > >>>>>> wrote:
> > > > > > > >>>>>>>> Something else configuration-related is that there
> are a
> > > > > bunch of
> > > > > > > >>>>>>>> options where the type isn't quite correct (e.g., a
> > String
> > > > > where
> > > > > > > >> it
> > > > > > > >>>>>>>> could be an enum, a string where it should be an int
> or
> > > > > > > >> something).
> > > > > > > >>>>>>>> Could do a pass over those as well.
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> > > > > > > >>>>>>>>> Hi,
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> I think one more thing we need to consider to do in
> 2.0
> > > is
> > > > > > > >>> changing
> > > > > > > >>>>> the
> > > > > > > >>>>>>>>> default value of configuration to improve out-of-box
> > user
> > > > > > > >>>>> experience.
> > > > > > > >>>>>>>>> Currently, in order to run a Flink job, users may
> need
> > to
> > > > set
> > > > > > > >>>>>>>>> a bunch of configurations, such as minibatch,
> > checkpoint
> > > > > > > >> interval,
> > > > > > > >>>>>>>>> exactly-once,
> > > > > > > >>>>>>>>> incremental-checkpoint, etc. It's very verbose and
> hard
> > > to
> > > > > use
> > > > > > > >> for
> > > > > > > >>>>>>>>> beginners.
> > > > > > > >>>>>>>>> Most of them can have a universally applicable value.
> > > > > Because
> > > > > > > >>>>> changing
> > > > > > > >>>>>>>> the
> > > > > > > >>>>>>>>> default value is a breaking change. I think It's
> worth
> > > > > > > >> considering
> > > > > > > >>>>>>>> changing
> > > > > > > >>>>>>>>> them in 2.0.
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> What do you think?
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> Best,
> > > > > > > >>>>>>>>> Jark
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> > > > > > > >>> snuyanzin@gmail.com>
> > > > > > > >>>>>>>> wrote:
> > > > > > > >>>>>>>>>> Hi Chesnay
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>>> "Move Calcite rules from Scala to Java": I would
> hope
> > > > that
> > > > > > > >> this
> > > > > > > >>>>> would
> > > > > > > >>>>>>>> be
> > > > > > > >>>>>>>>>>> an entirely internal change, and could thus be an
> > > > > incremental
> > > > > > > >>>>> process
> > > > > > > >>>>>>>>>>> independent of major releases.
> > > > > > > >>>>>>>>>>> What is the actual scale of this item; how much are
> > we
> > > > > > > >> actually
> > > > > > > >>>>>>>>>> re-writing?
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> Thanks for asking
> > > > > > > >>>>>>>>>> yes, you're right, that should be internal change.
> > > > > > > >>>>>>>>>> Yeah I was also thinking about incremental change
> > (rule
> > > by
> > > > > rule
> > > > > > > >>> or
> > > > > > > >>>>>>>>>> reasonable small group of rules).
> > > > > > > >>>>>>>>>> And yes, this could be an independent (on major
> > release)
> > > > > > > >> activity
> > > > > > > >>>>>>>>>> The problem is actually for children of RelOptRule.
> > > > > > > >>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
> > > > > mentioned
> > > > > > > >>>>>> deprecated
> > > > > > > >>>>>>>>>> api.
> > > > > > > >>>>>>>>>> There are also children of ConverterRule (50+) which
> > do
> > > > not
> > > > > > > >> have
> > > > > > > >>>>> such
> > > > > > > >>>>>>>>>> issues.
> > > > > > > >>>>>>>>>> Maybe it could be considered as the next step to
> have
> > > all
> > > > > the
> > > > > > > >>>>> rules in
> > > > > > > >>>>>>>>>> Java.
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > > > > > > >>>>> tonysong820@gmail.com>
> > > > > > > >>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>>> Hi Alex & Gyula,
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> By compatibility discussion do you mean the
> > "[DISCUSS]
> > > > > > > >> FLIP-321:
> > > > > > > >>>>>>>>>> Introduce
> > > > > > > >>>>>>>>>>>> an API deprecation process" thread [1]?
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just
> noticed
> > I
> > > > > pasted
> > > > > > > >>> the
> > > > > > > >>>>>> wrong
> > > > > > > >>>>>>>>>> url
> > > > > > > >>>>>>>>>>> in my previous email. Sorry for the mistake.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> I am also curious to know if the rationale behind
> > this
> > > > new
> > > > > API
> > > > > > > >>> has
> > > > > > > >>>>>> been
> > > > > > > >>>>>>>>>>>> previously discussed on the mailing list. Do we
> > have a
> > > > > list
> > > > > > > >> of
> > > > > > > >>>>>>>>>>> shortcomings
> > > > > > > >>>>>>>>>>>> in the current DataStream API that it tries to
> > > resolve?
> > > > > How
> > > > > > > >>> does
> > > > > > > >>>>> the
> > > > > > > >>>>>>>>>>>> current ProcessFunction functionality fit into the
> > > > > picture?
> > > > > > > >>> Will
> > > > > > > >>>>> it
> > > > > > > >>>>>> be
> > > > > > > >>>>>>>>>>> kept
> > > > > > > >>>>>>>>>>>> as is or subsumed by new API?
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>> I don't think we should create a replacement for
> the
> > > > > > > >> DataStream
> > > > > > > >>>>> API
> > > > > > > >>>>>>>>>> unless
> > > > > > > >>>>>>>>>>>> we have a very good reason to do so and with a
> > proper
> > > > > > > >>> discussion
> > > > > > > >>>>>> about
> > > > > > > >>>>>>>>>>> this
> > > > > > > >>>>>>>>>>>> as Alex said.
> > > > > > > >>>>>>>>>>> The ProcessFunction API which is targeting to
> replace
> > > > > > > >> DataStream
> > > > > > > >>>>> API
> > > > > > > >>>>>> is
> > > > > > > >>>>>>>>>>> still a proposal, not a decision. Sorry for the
> > > > confusion,
> > > > > I
> > > > > > > >>>>> should
> > > > > > > >>>>>>>> have
> > > > > > > >>>>>>>>>>> been more careful with my words, not giving the
> > > > impression
> > > > > > > >> that
> > > > > > > >>>>> this
> > > > > > > >>>>>> is
> > > > > > > >>>>>>>>>>> something we'll do anyway.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> There will be a FLIP describing the motivations and
> > > > > designs in
> > > > > > > >>>>>> detail,
> > > > > > > >>>>>>>>>> for
> > > > > > > >>>>>>>>>>> the community to discuss and vote on. We are still
> > > > working
> > > > > on
> > > > > > > >>> it.
> > > > > > > >>>>>> TBH,
> > > > > > > >>>>>>>>>> this
> > > > > > > >>>>>>>>>>> is not trivial and we would need more time on it.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> Just to quickly share some backgrounds:
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>     - We see quite some problems with the current
> > > > > DataStream
> > > > > > > >> APIs
> > > > > > > >>>>>>>>>>>        - Users are working with concrete classes
> > rather
> > > > > than
> > > > > > > >>>>>>>> interfaces,
> > > > > > > >>>>>>>>>>>        which means
> > > > > > > >>>>>>>>>>>        - Users can access methods that are designed
> > to
> > > be
> > > > > used
> > > > > > > >> by
> > > > > > > >>>>>>>> internal
> > > > > > > >>>>>>>>>>>           classes, even though they are annotated
> > with
> > > > > > > >>> `@Internal`.
> > > > > > > >>>>>>>> E.g.,
> > > > > > > >>>>>>>>>>>           `DataStream#getTransformation`.
> > > > > > > >>>>>>>>>>>           - Changes to the non-API implementations
> > > (e.g.,
> > > > > > > >>>>>>>>>> `Transformation`)
> > > > > > > >>>>>>>>>>>           would affect the API classes (e.g.,
> > > > > `DataStream`),
> > > > > > > >>> which
> > > > > > > >>>>>>>>>>> makes it hard to
> > > > > > > >>>>>>>>>>>           provide binary compatibility.
> > > > > > > >>>>>>>>>>>        - Internal classes are used as parameter /
> > > > > return-value
> > > > > > > >> of
> > > > > > > >>>>>>>> public
> > > > > > > >>>>>>>>>>>        APIs. E.g., while `AbstractStreamOperator`
> is
> > > > > > > >>>>> PublicEvolving,
> > > > > > > >>>>>>>>>>> `StreamTask`
> > > > > > > >>>>>>>>>>>        which returns from
> > > > > > > >>>>> `AbstractStreamOperator#getContainingTask`
> > > > > > > >>>>>> is
> > > > > > > >>>>>>>>>>> Internal.
> > > > > > > >>>>>>>>>>>        - In many cases, users are asked to extend
> the
> > > API
> > > > > > > >>> classes,
> > > > > > > >>>>>>>> rather
> > > > > > > >>>>>>>>>>>        than implementing interfaces. E.g.,
> > > > > > > >>>>> `AbstractStreamOperator`.
> > > > > > > >>>>>>>>>>>           - Any changes to the base classes, even
> the
> > > > > internal
> > > > > > > >>>>> part,
> > > > > > > >>>>>>>> may
> > > > > > > >>>>>>>>>>>           affect the behavior of the user-provided
> > > > > sub-classes
> > > > > > > >>>>>>>>>>>           - Users can override the behavior of the
> > base
> > > > > classes
> > > > > > > >>>>>>>>>>>        - The API module `flink-streaming-java`
> > contains
> > > > > non-API
> > > > > > > >>>>>>>> classes,
> > > > > > > >>>>>>>>>> and
> > > > > > > >>>>>>>>>>>        depends on internal modules such as
> > > > `flink-runtime`,
> > > > > > > >> which
> > > > > > > >>>>>> means
> > > > > > > >>>>>>>>>>>        - Changes to the internal modules may affect
> > the
> > > > API
> > > > > > > >>>>> modules,
> > > > > > > >>>>>>>> which
> > > > > > > >>>>>>>>>>>           requires users to re-build their
> > applications
> > > > > upon
> > > > > > > >>>>> upgrading
> > > > > > > > >>>>>>>>>>>           - The artifact users need for building their
> > > > > > > > >>>>>>>>>>>           applications is larger than necessary.
> > > > > > > >>>>>>>>>>>        - We probably should not expose operators (e.g.,
> > > > > > > >>>>>>>>>>>        `AbstractStreamOperator`) to users. Functions should be
> > > > > > > >>>>>>>>>>>        enough for users to define their data processing logic.
> > > > > > > >>>>>>>>>>>        Exposing operator-level concepts (e.g., mailbox thread
> > > > > > > >>>>>>>>>>>        model, checkpoint barrier alignment, etc.) is unnecessary
> > > > > > > >>>>>>>>>>>        and, because of compatibility considerations, limits how
> > > > > > > >>>>>>>>>>>        far such exposed mechanisms can be improved.
> > > > > > > >>>>>>>>>>>        - The current DataStream API seems to be a mixture of many
> > > > > > > >>>>>>>>>>>        things, making it hard to understand, especially for
> > > > > > > >>>>>>>>>>>        newcomers. It might be better to re-organize it into
> > > > > > > >>>>>>>>>>>        several parts (the taxonomy below is just an example; we
> > > > > > > >>>>>>>>>>>        are still working on this):
> > > > > > > >>>>>>>>>>>           - The most fundamental stateful stream
> > > > > processing:
> > > > > > > >>>>> streams,
> > > > > > > >>>>>>>>>>>           partitions / key, process functions,
> state,
> > > > > > > >>>>> timeline-service
> > > > > > > >>>>>>>>>>>           - An extension for common batch-streaming
> > > > unified
> > > > > > > >>>>> functions:
> > > > > > > >>>>>>>>>> map,
> > > > > > > >>>>>>>>>>>           flatmap, filter, agg, reduce, join, etc.
> > > > > > > >>>>>>>>>>>           - An extension for windowing supports:
> > > window,
> > > > > > > >>>>> triggering
> > > > > > > >>>>>>>>>>>           - An extension for event-time supports:
> > event
> > > > > time,
> > > > > > > >>>>>> watermark
> > > > > > > >>>>>>>>>>>           - The extensions are like short-cuts /
> > > sugars,
> > > > > > > >> without
> > > > > > > >>>>> which
> > > > > > > >>>>>>>>>> users
> > > > > > > >>>>>>>>>>>           can probably still achieve the same
> > behavior
> > > by
> > > > > > > >> working
> > > > > > > >>>>> with
> > > > > > > >>>>>>>> the
> > > > > > > >>>>>>>>>>>           fundamental APIs, but would be a lot
> easier
> > > > with
> > > > > the
> > > > > > > >>>>>>>> extensions
> > > > > > > >>>>>>>>>>>     - The original plan was to do in-place refactors / changes
> > > > > > > >>>>>>>>>>>     on the DataStream API. Some related items are listed in this
> > > > > > > >>>>>>>>>>>     doc [2] attached to the kicking off email [3]. Not all of the
> > > > > > > >>>>>>>>>>>     above issues are listed, because by that time we hadn't
> > > > > > > >>>>>>>>>>>     looked into this as deeply as we have now.
> > > > > > > >>>>>>>>>>>     - We proposed this as a new API rather than
> > > in-place
> > > > > > > >>> refactors
> > > > > > > >>>>> in
> > > > > > > >>>>>>>> the
> > > > > > > >>>>>>>>>>>     2.0 work item list, because we realized the
> > changes
> > > > > might
> > > > > > > >> be
> > > > > > > >>>>> too
> > > > > > > >>>>>>>> big
> > > > > > > >>>>>>>>>>> for an
> > > > > > > >>>>>>>>>>>     in-place change. First having a new API then
> > > > gradually
> > > > > > > >>> retiring
> > > > > > > >>>>>> the
> > > > > > > >>>>>>>>>> old
> > > > > > > >>>>>>>>>>> one
> > > > > > > >>>>>>>>>>>     would help users to smoothly migrate between
> > them.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> A thorough discussion is definitely needed once the
> > > FLIP
> > > > is
> > > > > > > >> out.
> > > > > > > >>>>> And
> > > > > > > >>>>>> of
> > > > > > > >>>>>>>>>>> course it's possible that the FLIP might be
> rejected.
> > > > Given
> > > > > > > >> that
> > > > > > > >>>>> we
> > > > > > > >>>>>> are
> > > > > > > >>>>>>>>>>> planning for release 2.0, I just feel it would be better to
> > > > > > > >>>>>>>>>>> bring this up early even if the concrete plan is not yet ready.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> Best,
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> Xintong
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > > > >>>>>>>>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > > > > > >>>>>>>>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > > > > > >>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
> > > > > gyfora@apache.org
> > > > > > > >>>>>> wrote:
> > > > > > > >>>>>>>>>>>> Hey!
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> I share the same concerns mentioned above
> regarding
> > > the
> > > > > > > >>>>>>>>>> "ProcessFunction
> > > > > > > >>>>>>>>>>>> API".
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> I don't think we should create a replacement for
> the
> > > > > > > >> DataStream
> > > > > > > >>>>> API
> > > > > > > >>>>>>>>>>> unless
> > > > > > > >>>>>>>>>>>> we have a very good reason to do so and with a
> > proper
> > > > > > > >>> discussion
> > > > > > > >>>>>> about
> > > > > > > >>>>>>>>>>> this
> > > > > > > >>>>>>>>>>>> as Alex said.
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> Cheers,
> > > > > > > >>>>>>>>>>>> Gyula
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander
> Fedulov <
> > > > > > > >>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> Hi Xintong,
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> By compatibility discussion do you mean the
> > > "[DISCUSS]
> > > > > > > >>> FLIP-321:
> > > > > > > >>>>>>>>>>>> Introduce
> > > > > > > >>>>>>>>>>>>> an API deprecation process" thread [1]?
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> I am also curious to know if the rationale behind
> > > this
> > > > > new
> > > > > > > >> API
> > > > > > > >>>>> has
> > > > > > > >>>>>>>>>> been
> > > > > > > >>>>>>>>>>>>> previously discussed on the mailing list. Do we
> > have
> > > a
> > > > > list
> > > > > > > >> of
> > > > > > > >>>>>>>>>>>> shortcomings
> > > > > > > >>>>>>>>>>>>> in the current DataStream API that it tries to
> > > resolve?
> > > > > How
> > > > > > > >>> does
> > > > > > > >>>>>> the
> > > > > > > >>>>>>>>>>>>> current ProcessFunction functionality fit into
> the
> > > > > picture?
> > > > > > > >>>>> Will it
> > > > > > > >>>>>>>>>> be
> > > > > > > >>>>>>>>>>>> kept
> > > > > > > >>>>>>>>>>>>> as is or subsumed by new API?
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > > > >>>>>>>>>>>>> Best,
> > > > > > > >>>>>>>>>>>>> Alex
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > > > > > > >>>>> tonysong820@gmail.com>
> > > > > > > >>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
> > most
> > > > > > > >> headaches
> > > > > > > >>>>>>>>>>> because
> > > > > > > >>>>>>>>>>>>> it's
> > > > > > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is
> it
> > > an
> > > > > > > >>> entirely
> > > > > > > >>>>>>>>>>>> separate
> > > > > > > >>>>>>>>>>>>>> API
> > > > > > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
> > extension
> > > of
> > > > > > > >>>>> DataStream.
> > > > > > > >>>>>>>>>>> How
> > > > > > > >>>>>>>>>>>>>> much
> > > > > > > >>>>>>>>>>>>>>> will it share the internals with DataStream
> etc.;
> > > how
> > > > > does
> > > > > > > >>> it
> > > > > > > >>>>>>>>>>> relate
> > > > > > > >>>>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
> API
> > > > uses
> > > > > > > >>>>>>>>>> underneath).
> > > > > > > >>>>>>>>>>>>>> I totally understand your confusion. We started
> > > > planning
> > > > > > > >> this
> > > > > > > >>>>>> after
> > > > > > > >>>>>>>>>>>>> kicking
> > > > > > > >>>>>>>>>>>>>> off the release 2.0, so there's still a lot to
> be
> > > > > explored
> > > > > > > >>> and
> > > > > > > >>>>> the
> > > > > > > >>>>>>>>>>> plan
> > > > > > > >>>>>>>>>>>>>> keeps changing.
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>     - In the beginning, we planned to do an in-place refactor
> > > > > > > >>>>>>>>>>>>>>     of the DataStream API, until the API migration period was
> > > > > > > >>>>>>>>>>>>>>     proposed.
> > > > > > > >>>>>>>>>>>>>>     - Then we wanted to make it an entirely separate API from
> > > > > > > >>>>>>>>>>>>>>     DataStream, and listed it as a must-have for release 2.0
> > > > > > > >>>>>>>>>>>>>>     so that we can remove DataStream once it's ready.
> > > > > > > >>>>>>>>>>>>>>     - However, depending on the outcome of the API
> > > > > > > >>>>>>>>>>>>>>     compatibility discussion [1], we may not be able to remove
> > > > > > > >>>>>>>>>>>>>>     DataStream in 2.0 anyway, which means we might need to
> > > > > > > >>>>>>>>>>>>>>     re-evaluate the necessity of this item for 2.0.
> > > > > > > >>>>>>>>>>>>>> I'd say we wait a bit longer for the
> compatibility
> > > > > > > >> discussion
> > > > > > > >>>>> [1]
> > > > > > > >>>>>>>>>> and
> > > > > > > >>>>>>>>>>>>>> decide the priority for this item afterwards.
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> Best,
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> Xintong
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay
> Schepler <
> > > > > > > >>>>>>>>>> chesnay@apache.org
> > > > > > > >>>>>>>>>>>>>> wrote:
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of
> > > items.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> > > > > Management"
> > > > > > > >>>>> item
> > > > > > > >>>>>>>>>> is
> > > > > > > >>>>>>>>>>>>> marked
> > > > > > > >>>>>>>>>>>>>>> as a must-have; will it require changes that
> > break
> > > > > > > >>> something?
> > > > > > > >>>>>>>>>> What
> > > > > > > >>>>>>>>>>>>>> prevents
> > > > > > > >>>>>>>>>>>>>>> it from being added in 2.1?
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make
> > > Java
> > > > 17
> > > > > > > >> the
> > > > > > > >>>>>>>>>>> default,
> > > > > > > >>>>>>>>>>>>> drop
> > > > > > > >>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a
> must-have
> > > > "Drop
> > > > > > > >> Java
> > > > > > > >>> 8"
> > > > > > > >>>>>>>>>> and
> > > > > > > >>>>>>>>>>> a
> > > > > > > >>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I
> would
> > > hope
> > > > > that
> > > > > > > >>>>> this
> > > > > > > >>>>>>>>>>> would
> > > > > > > >>>>>>>>>>>>> be
> > > > > > > >>>>>>>>>>>>>>> an entirely internal change, and could thus be
> an
> > > > > > > >>> incremental
> > > > > > > >>>>>>>>>>> process
> > > > > > > >>>>>>>>>>>>>>> independent of major releases.
> > > > > > > >>>>>>>>>>>>>>> What is the actual scale of this item; how much
> > are
> > > > we
> > > > > > > >>>>> actually
> > > > > > > >>>>>>>>>>>>>> re-writing?
> > > > > > > >>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise
> this
> > > to
> > > > a
> > > > > > > >>>>>>>>>> must-have; i
> > > > > > > >>>>>>>>>>>>> think
> > > > > > > >>>>>>>>>>>>>>> I marked it down as nice-to-have only because
> it
> > > > > depends
> > > > > > > >> on
> > > > > > > >>>>>>>>>> another
> > > > > > > >>>>>>>>>>>>> item.
> > > > > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the
> > most
> > > > > > > >> headaches
> > > > > > > >>>>>>>>>>> because
> > > > > > > >>>>>>>>>>>>> it's
> > > > > > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is
> it
> > > an
> > > > > > > >>> entirely
> > > > > > > >>>>>>>>>>>> separate
> > > > > > > >>>>>>>>>>>>>> API
> > > > > > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an
> > extension
> > > of
> > > > > > > >>>>> DataStream.
> > > > > > > >>>>>>>>>>> How
> > > > > > > >>>>>>>>>>>>>> much
> > > > > > > >>>>>>>>>>>>>>> will it share the internals with DataStream
> etc.;
> > > how
> > > > > does
> > > > > > > >>> it
> > > > > > > >>>>>>>>>>> relate
> > > > > > > >>>>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>>> the
> > > > > > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table
> API
> > > > uses
> > > > > > > >>>>>>>>>> underneath).
> > > > > > > >>>>>>>>>>>>>>> There are a few items I added as ideas which
> > don't
> > > > > have a
> > > > > > > >>>>>>>>>> priority
> > > > > > > >>>>>>>>>>>> yet;
> > > > > > > >>>>>>>>>>>>>>> would love to get some feedback on those.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Hi devs,
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> As previously discussed in [1], we had been
> > > > collecting
> > > > > > > >> work
> > > > > > > >>>>> item
> > > > > > > >>>>>>>>>>>>>> proposals
> > > > > > > >>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the
> wiki
> > > page
> > > > > [2].
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>     - As we have passed the due date, I'd like
> to
> > > > > kindly
> > > > > > > >>> remind
> > > > > > > >>>>>>>>>>>> everyone
> > > > > > > >>>>>>>>>>>>>> *not
> > > > > > > >>>>>>>>>>>>>>>     to add / remove items directly on the wiki
> > > page*.
> > > > > If
> > > > > > > >>>>> needed,
> > > > > > > >>>>>>>>>>>> please
> > > > > > > >>>>>>>>>>>>>> post
> > > > > > > >>>>>>>>>>>>>>>     in this thread or reach out to the release
> > > > managers
> > > > > > > >>>>> instead.
> > > > > > > >>>>>>>>>>>>>>>     - I've reached out to some folks for
> > > > clarifications
> > > > > > > >> about
> > > > > > > >>>>>>>>>> their
> > > > > > > >>>>>>>>>>>>>>>     proposals. Some of them mentioned that they
> > can
> > > > > not yet
> > > > > > > >>>>> tell
> > > > > > > >>>>>>>>>>>> whether
> > > > > > > >>>>>>>>>>>>>> we
> > > > > > > >>>>>>>>>>>>>>>     should do an item or not, and would need
> more
> > > > time
> > > > > /
> > > > > > > >>>>>>>>>> discussions
> > > > > > > >>>>>>>>>>>> to
> > > > > > > >>>>>>>>>>>>>> make
> > > > > > > >>>>>>>>>>>>>>>     the decision. So I added a new symbol for
> > items
> > > > > whose
> > > > > > > >>>>>>>>>> priorities
> > > > > > > >>>>>>>>>>>> are
> > > > > > > >>>>>>>>>>>>>> `TBD`.
> > > > > > > >>>>>>>>>>>>>>> Now it's time to collaboratively decide a
> minimum
> > > set
> > > > > of
> > > > > > > >>>>>>>>>> must-have
> > > > > > > >>>>>>>>>>>>> items.
> > > > > > > >>>>>>>>>>>>>>> I've gone through the entire list of proposed
> > > items,
> > > > > and
> > > > > > > >>> found
> > > > > > > >>>>>>>>>> most
> > > > > > > >>>>>>>>>>>> of
> > > > > > > >>>>>>>>>>>>>> them
> > > > > > > >>>>>>>>>>>>>>> make quite much sense. So I think an online
> sync
> > > > might
> > > > > not
> > > > > > > >>> be
> > > > > > > >>>>>>>>>>>> necessary
> > > > > > > >>>>>>>>>>>>>> for
> > > > > > > >>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread,
> > > where
> > > > > > > >>> everyone
> > > > > > > >>>>> can
> > > > > > > >>>>>>>>>>>>> comment
> > > > > > > >>>>>>>>>>>>>>> on how they think the list can be improved,
> > > followed
> > > > > by a
> > > > > > > >>>>> VOTE to
> > > > > > > >>>>>>>>>>>>>> formally
> > > > > > > >>>>>>>>>>>>>>> make the decision.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Any feedback and opinions, including but not
> > > limited
> > > > to
> > > > > > > >> the
> > > > > > > >>>>>>>>>>> following
> > > > > > > >>>>>>>>>>>>>>> aspects, will be appreciated.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>     - Important items that are missing from the
> > > list
> > > > > > > >>>>>>>>>>>>>>>     - Concerns regarding the listed items or
> > their
> > > > > > > >> priorities
> > > > > > > >>>>>>>>>>>>>>> Looking forward to your feedback.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Best,
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Xintong
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> [1] https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > > > > >>>>>>>>>>>>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>> --
> > > > > > > >>>>>>>>>> Best regards,
> > > > > > > >>>>>>>>>> Sergey
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>
> > > > > > > >>>>>>
> > > > > > > >>>>> --
> > > > > > > >>>>> Best
> > > > > > > >>>>>
> > > > > > > >>>>> ConradJam
> > > > > > > >>>>>
> > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
>
> What we might want to come up with is a summary with each 2.0.0 issue on
> why it should be included or not. That summary is something the community
> could vote on. WDYT? I'm happy to help here.
>

That sounds great. Thanks for offering the help. I'll also try to go
through the issues, but TBH I'm quite overwhelmed and cannot promise to get
this done very soon. Your help is very much needed.


Best,

Xintong



On Tue, Jul 11, 2023 at 6:08 PM Matthias Pohl
<ma...@aiven.io.invalid> wrote:

> @Xintong I guess it makes sense. I agree with your conclusions on the four
> mentioned Jira issues.
>
> I just checked all issues that have fixVersion = 2.0.0 [1]. There are a few
> more items that are not affiliated with FLINK-3957 [2]. I guess we should
> find answers for these issues: Either closing them with a reason to have a
> consistent state in Jira or adding them to the feature list as part of a
> separate voting thread (to leave the current vote untouched).
>
> What we might want to come up with is a summary with each 2.0.0 issue on
> why it should be included or not. That summary is something the community
> could vote on. WDYT? I'm happy to help here.
>
> Matthias
>
> [1] https://issues.apache.org/jira/browse/FLINK-32437?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%202.0.0%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)%20%20
> [2] https://issues.apache.org/jira/browse/FLINK-3957
>
>
> On Tue, Jul 11, 2023 at 5:01 AM Xintong Song <to...@gmail.com>
> wrote:
>
> > @Zhu,
> > As you are downgrading "Clarify the scopes of configuration options" to
> > nice-to-have priority, could you also bring that up in the vote thread [1]?
> > I'm asking because there are people who already voted on the original list.
> > I think restarting the vote is probably overkill and unnecessary, but we
> > should at least bring this change to their attention.
> >
> > @Matthias,
> > Thanks a lot for bringing this up. I wasn't aware of this early umbrella. I
> > haven't gone through everything in FLINK-3957 yet. I'll do it asap.
> >
> > I just quickly went through the 4 issues you mentioned.
> > - FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as long as
> > the new APIs that we want users to migrate to are ready. For these 2
> > tickets, I think the introduction of the updated APIs should be
> > straightforward and feasible for 1.18.
> > - FLINK-13926: I'm not sure about this one. The two mentioned classes
> > `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not even
> > marked as Public or PublicEvolving APIs. Moreover, I don't see a good way
> > to smoothly replace the classes with a generic version.
> > - FLINK-5126: This is a bit unclear to me. From the description and
> > conversation on the ticket, I don't fully understand which concrete APIs
> > the ticket is referring to. Or maybe it refers to all / most of the APIs
> > that throw Exception / IOException in general. Moreover, I don't think
> > removing Exception / IOException from the API signature is a breaking
> > change. It requires no code changes on the caller side.
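
A rough sketch of the FLINK-5126 point above, in plain Java rather than Flink code. `ReaderV1`, `ReaderV2` and `ThrowsClauseDemo` are hypothetical names invented for illustration; they are not Flink APIs. Under that assumption, callers that propagate the exception or catch the broad `Exception` type keep compiling after the `throws` clause is dropped, while two cases (left as comments because javac rejects them) may still need source changes.

```java
import java.io.IOException;

// Hypothetical API before the change: the method declares a checked exception.
class ReaderV1 {
    String read() throws IOException { return "v1"; }
}

// Hypothetical API after the change: `throws IOException` has been removed.
class ReaderV2 {
    String read() { return "v2"; }
}

class ThrowsClauseDemo {
    // Caller pattern A: propagate the exception. Compiles against both versions,
    // because a method may declare a checked exception its body never throws.
    static String propagate() throws IOException {
        return new ReaderV2().read();
    }

    // Caller pattern B: catch the broad Exception type. Also compiles against
    // both versions, because Exception covers unchecked exceptions too.
    static String catchBroad() {
        try {
            return new ReaderV2().read();
        } catch (Exception e) {
            return "failed";
        }
    }

    // Caller pattern C, shown as a comment because it does NOT compile against V2:
    //   try { return new ReaderV2().read(); }
    //   catch (IOException e) { return "failed"; }
    // javac reports that IOException is never thrown in the try body.
    //
    // Subclasses are affected as well. This override is rejected against V2:
    //   class Custom extends ReaderV2 {
    //       @Override String read() throws IOException { return "custom"; }
    //   }

    // Runs both compiling patterns and joins their results.
    static String runAll() {
        try {
            return propagate() + "," + catchBroad();
        } catch (IOException e) {
            return "failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll());
    }
}
```

Already-compiled callers are not affected either way, since the `throws` clause is not part of the method descriptor; the open question is only source compatibility for narrow `catch` blocks and for user subclasses.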
> >
> > WDYT?
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
> > [2] https://issues.apache.org/jira/browse/FLINK-3957
> >
> > On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
> > <ma...@aiven.io.invalid> wrote:
> >
> > > I brought it up in the deprecating APIs in 1.18 thread [1] already but it
> > > feels misplaced there. I just wanted to ask whether someone did a pass over
> > > FLINK-3957 [2]. I came across it when going through the release 2.0 feature
> > > list [3] as part of the vote. I have the feeling that there are some valid
> > > action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6]) which do not
> > > seem to be listed in the 2.0 feature list [3] yet (or are included in some
> > > of the bigger items). The majority of the subtasks are probably covered by
> > > the DataSet removal, the Scala API removal and the ProcessFunction
> > > refactoring. Other subtasks (FLINK-14068 [7]) made it into the feature list.
> > >
> > > I haven't worked with the SDK code enough to judge whether the subtasks
> > > are still reasonable or actually obsolete. That is why I wanted to
> > > mention the Jira issue here once more.
> > >
> > > I don't consider it a blocker for the ongoing vote but was wondering
> > > whether it makes sense for someone who might have more experience in that
> > > field to add some of the subtasks to the feature list.
> > >
> > > Or shall we just consider it as "not interesting enough" because nobody
> > > added it in the first place to the 2.0 feature list [3]?
> > >
> > > Matthias
> > >
> > > [1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
> > > [2] https://issues.apache.org/jira/browse/FLINK-3957
> > > [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > [4] https://issues.apache.org/jira/browse/FLINK-4675
> > > [5] https://issues.apache.org/jira/browse/FLINK-5126
> > > [6] https://issues.apache.org/jira/browse/FLINK-13926
> > > [7] https://issues.apache.org/jira/browse/FLINK-14068
> > >
> > > On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
> > >
> > > > Agreed that we should deprecate affected APIs as soon as possible.
> > > > But there is not much time before the feature freeze of 1.18, hence
> > > > I'm a bit concerned that some of the deprecations might not be done in
> > > > 1.18.
> > > >
> > > > We are currently looking into the improvements of the configuration layer.
> > > > Most of the proposed changes would require a public discussion, or even
> > > > a FLIP, which I think can hardly close before the feature freeze of 1.18.
> > > > And some of the APIs can be deprecated only after the corresponding new
> > > > APIs are developed. Therefore we previously targeted them for 1.19.
> > > >
> > > > We may review later to see what deprecation work can be done in 1.18 and
> > > > make it if possible. I think we can do the work even after the feature
> > > > freeze date, if it is purely deprecation work (simply adding
> > > > annotations). WDYT?
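
To make "purely deprecation work" concrete, here is a minimal sketch in plain Java. `HypotheticalConfig` and its methods are made up for illustration and are not Flink classes; the assumption is only that the old method gets `@Deprecated` plus a Javadoc pointer to its replacement, while its behavior stays unchanged.

```java
import java.lang.reflect.Method;

// Hypothetical class, not a Flink API: the old accessor is kept but marked deprecated.
class HypotheticalConfig {
    /** @deprecated Use {@link #getTimeoutMillis()} instead. */
    @Deprecated
    long getTimeout() {
        // Behavior is unchanged: the old method simply delegates to the new one.
        return getTimeoutMillis();
    }

    long getTimeoutMillis() {
        return 30_000L;
    }
}

class DeprecationDemo {
    // Reports whether the named no-argument method carries @Deprecated.
    static boolean isDeprecated(String methodName) {
        try {
            Method m = HypotheticalConfig.class.getDeclaredMethod(methodName);
            return m.isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("getTimeout deprecated: " + isDeprecated("getTimeout"));
        System.out.println("getTimeoutMillis deprecated: " + isDeprecated("getTimeoutMillis"));
    }
}
```

The sketch also shows why such a change depends on the replacement API existing first: the annotation is only useful if the Javadoc can point users to something to migrate to.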
> > > >
> > > > I'm also changing the priority of "Clarify the scopes of configuration
> > > > options" to nice-to-have. I think most of the work is not breaking
> > > > changes and can be done in 1.x or 2.1+. For the breaking changes which
> > > > might be needed, we will consider them as part of the configuration
> > > > layer rework.
> > > >
> > > > Thanks,
> > > > Zhu
> > > >
> > > > Xintong Song <to...@gmail.com> wrote on Mon, Jul 10, 2023 at 19:58:
> > > > >
> > > > > >
> > > > > > At what point are the FLIP discussions coming into play?
> > > > > >
> > > > > > I keep wondering if these shouldn't have started already.
> > > > >
> > > > >
> > > > > I think this depends on the responsible contributor and reviewer of
> > > > > individual items. From my perspective, the FLIP discussions can
> start
> > > any
> > > > > time as long as the contributors are ready, the earlier the better.
> > > > >
> > > > >
> > > > > > What we need to ensure is that all breaking API changes are
> > > > > > discussed/decided before 1.18 is released so we can deprecate
> > > > > > affected APIs.
> > > > > >
> > > > >
> > > > > The introduction of the migration period has brought the requirement to
> > > > > plan the removal of public APIs 2 minor releases ahead of the major
> > > > > release, which is TBH a bit unexpected. I agree it would be nice if we
> > > > > can get the FLIPs ready by the time 1.18 is released. But I also don't
> > > > > think we should rush it. If the deprecation of a Public API does not
> > > > > make 1.18, we may carry it until 3.0. Or if there are many Public APIs
> > > > > whose deprecation does not make 1.18, we may deprecate them in 1.19 and
> > > > > postpone the major version bump to after a 1.20 release. Moreover, as
> > > > > mentioned in FLIP-321 [1], exceptions are discussable given that the
> > > > > migration period is newly proposed and we did not give developers the
> > > > > chance to plan things ahead. To sum up, I'd say we try to identify APIs
> > > > > that need to be deprecated in 1.18 on a best-effort basis, and evaluate
> > > > > the remaining options (carrying the API for the entire 2.x cycle,
> > > > > postponing 2.0, or making an exception) case-by-case. WDYT?
> > > > >
> > > > > Best,
> > > > >
> > > > > Xintong
> > > > >
> > > > >
> > > > > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > >
> > > > > On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <
> chesnay@apache.org
> > >
> > > > wrote:
> > > > >
> > > > > > At what point are the FLIP discussions coming into play?
> > > > > >
> > > > > > I keep wondering if these shouldn't have started already.
> > > > > > It just seems that a lot of decisions are implicitly reliant on
> the
> > > > > > items even being accepted.
> > > > > > Estimates can only be provided if we actually know the scope of
> the
> > > > > > change, but that's not always clear from the description in the
> > doc.
> > > > > >
> > > > > > What we need to ensure is that all breaking API changes are
> > > > > > discussed/decided before 1.18 is released so we can deprecate
> > > affected
> > > > > > APIs.
> > > > > >
> > > > > > On 10/07/2023 11:32, Xintong Song wrote:
> > > > > > > Hi Matthias,
> > > > > > >
> > > > > > > The questions you asked are indeed very important. Here are some
> > > > > > > quick responses, based on the plans I had in mind, which I have not
> > > > > > > aligned with other release managers yet.
> > > > > > >
> > > > > > > In the previous discussions between the RMs, we were not able to make
> > > > > > > proposals on things like how to make a time plan, how to manage the
> > > > > > > release branch, etc., due to the lack of inputs on e.g., the work
> > > > > > > items that need to be included (which transitively depends on the API
> > > > > > > compatibility to provide between major versions) and the workloads /
> > > > > > > time needed for them. With the recent discussions, we have collected
> > > > > > > at least the majority of the inputs needed.
> > > > > > >
> > > > > > > Here are things that I think we as the release managers would
> do
> > > next
> > > > > > > (again, not aligned with other release managers yet)
> > > > > > > - Creating a time plan, by reaching out to people to understand
> > the
> > > > > > > estimated workloads, prerequisites and ETA of each work item.
> > > > > > > - Make a proposal on how to manage the release branch, i.e.,
> when
> > > to
> > > > cut
> > > > > > > the branch and whether to ship the milestone releases, etc.
> > > > > > > - Set-up regular release syncs (bi-weekly / monthly) to update
> > the
> > > > status
> > > > > > > and draw attention to where help is needed.
> > > > > > >
> > > > > > > So back to your questions.
> > > > > > >
> > > > > > > There are still to-be-discussed items in the list of features.
> > > > What's the
> > > > > > >> plan with those?
> > > > > > > When collecting ETA, for items that the completion time cannot
> > yet
> > > be
> > > > > > > estimated, we would like to have at least a time by which the
> > > > estimation
> > > > > > > can be made. I think the same applies to the to-be-discussed
> > items.
> > > > And
> > > > > > if
> > > > > > > the items should be included as must-haves, we would need
> another
> > > > vote to
> > > > > > > adjust the must-have item list.
> > > > > > >
> > > > > > > Some of them don't have anyone assigned.
> > > > > > > My concern is that they will be overlooked because nobody feels
> > to
> > > > be in
> > > > > > >> charge.
> > > > > > > This is a tricky one. For must-have items without assignees, we as the
> > > > > > > release managers should be responsible for raising them in the release
> > > > > > > syncs, and try to find assignees for them. Hopefully, there will be
> > > > > > > someone who steps up. But it is possible that for a must-have item
> > > > > > > nobody wants to work on it. If that happens, which I don't think it
> > > > > > > will, it probably means the item is not that critical and we may have
> > > > > > > to exclude it from the release. Either way, they should not be
> > > > > > > overlooked, because IMHO release managers should be responsible for
> > > > > > > trying to get someone to work on the unassigned items.
> > > > > > >
> > > > > > > We'll have more discussions soon and keep the community
> updated.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Xintong
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> > > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > > >
> > > > > > >> Now that the vote has started on the must-have items: There are still
> > > > > > >> to-be-discussed items in the list of features. What's the plan with
> > > > > > >> those? Some of them don't have anyone assigned. Were these items
> > > > > > >> discussed among the release managers? So far, it looks like they are
> > > > > > >> handled as nice-to-have if someone volunteers to pick them up?
> > > > > > >>
> > > > > > >> My concern is that they will be overlooked because nobody feels in
> > > > > > >> charge.
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Matthias
> > > > > > >>
> > > > > > > >> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <tonysong820@gmail.com>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >>> Thanks all for the discussion.
> > > > > > > >>>
> > > > > > > >>> The wiki has been updated as discussed. I'm starting a vote now.
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>>
> > > > > > >>> Xintong
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > > >>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <tonysong820@gmail.com>
> > > > > > > >>> wrote:
> > > > > > > >>>> Hi ConradJam,
> > > > > > > >>>>
> > > > > > > >>>> I think Chesnay has already put his name as the Contributor for
> > > > > > > >>>> the two tasks you listed. Maybe you can reach out to him to see
> > > > > > > >>>> if you can collaborate on this.
> > > > > > > >>>>
> > > > > > > >>>> In general, I don't think contributing to a release 2.0 issue is
> > > > > > > >>>> much different from contributing to a regular issue. We haven't
> > > > > > > >>>> yet created JIRA tickets for all the listed tasks because many
> > > > > > > >>>> of them need further discussions and / or FLIPs to decide
> > > > > > > >>>> whether and how they should be performed.
> > > > > > >>>>
> > > > > > >>>> Best,
> > > > > > >>>>
> > > > > > >>>> Xintong
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>>
> > > > > > > >>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <jam.gzczy@gmail.com>
> > > > > > > >>>> wrote:
> > > > > > > >>>>
> > > > > > > >>>>> Hi Community:
> > > > > > > >>>>>    I see some tasks in the 2.0 list that haven't been assigned
> > > > > > > >>>>> yet. I want to take the initiative to take on some tasks that I
> > > > > > > >>>>> can complete. How do I apply to the community for this part of
> > > > > > > >>>>> the task? I am interested in the following parts of FLINK-32377
> > > > > > > >>>>> [1]. Do I need to create the issues myself and assign them to
> > > > > > > >>>>> myself?
> > > > > > > >>>>>
> > > > > > > >>>>> - the current timestamp, which is problematic w.r.t. caching and
> > > > > > > >>>>> testing, while providing no value.
> > > > > > > >>>>> - Remove JarRequestBody#programArgs in favor of #programArgsList.
> > > > > > > >>>>>
> > > > > > > >>>>> [1] https://issues.apache.org/jira/browse/FLINK-32377
> > > > > > >>>>>
> > > > > > > >>>>> On Fri, Jun 30, 2023 at 00:53, Teoh, Hong <li...@amazon.co.uk.invalid>
> > > > > > > >>>>> wrote:
> > > > > > > >>>>>
> > > > > > >>>>>> Thanks Xintong for driving the effort.
> > > > > > >>>>>>
> > > > > > > >>>>>> I’d add a +1 to reworking configs, as suggested by @Jark and
> > > > > > > >>>>>> @Chesnay, especially the types. We have various configs that
> > > > > > > >>>>>> encode Time / MemorySize that are Long instead!
> > > > > > >>>>>>
> > > > > > >>>>>> Regards,
> > > > > > >>>>>> Hong
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > > >>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yuanmei.work@gmail.com>
> > > > > > > >>>>>>> wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> Thanks for driving this effort, Xintong!
> > > > > > >>>>>>>
> > > > > > >>>>>>> To Chesnay
> > > > > > > >>>>>>>> I'm curious as to why the "Disaggregated State Management"
> > > > > > > >>>>>>>> item is marked as a must-have; will it require changes that
> > > > > > > >>>>>>>> break something? What prevents it from being added in 2.1?
> > > > > > > >>>>>>> As to "Disaggregated State Management".
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> We plan to provide a new type of state backend to support DFS as
> > > > > > > >>>>>>> primary storage.
> > > > > > > >>>>>>> To achieve this, we need at least two sets of changes (not
> > > > > > > >>>>>>> entirely sure yet, since we are still in the design and
> > > > > > > >>>>>>> prototype phase):
> > > > > > > >>>>>>> 1. Statebackend Change
> > > > > > > >>>>>>> 2. State Access Change
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> Not all of the related interfaces are `@Internal`. Some of the
> > > > > > > >>>>>>> interfaces, like `StateBackend`, are `@PublicEvolving`.
> > > > > > > >>>>>>> So, you are right in the sense that "Disaggregated State
> > > > > > > >>>>>>> Management" itself probably does not need to be a "Must Have".
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> But I was hoping that changes related to public APIs can be
> > > > > > > >>>>>>> finalized and merged in Flink 2.0 (I will fix the wiki
> > > > > > > >>>>>>> accordingly).
> > > > > > > >>>>>>>
> > > > > > > >>>>>>> I also agree with Jark that 2.0 is a good chance to rework the
> > > > > > > >>>>>>> default values of configurations.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Best
> > > > > > >>>>>>> Yuan
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > > >>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler
> > > > > > > >>>>>>> <chesnay@apache.org> wrote:
> > > > > > > >>>>>>>> Something else configuration-related is that there are a bunch
> > > > > > > >>>>>>>> of options where the type isn't quite correct (e.g., a String
> > > > > > > >>>>>>>> where it could be an enum, a string where it should be an int
> > > > > > > >>>>>>>> or something).
> > > > > > > >>>>>>>> Could do a pass over those as well.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> > > > > > >>>>>>>>> Hi,
> > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> I think one more thing we need to consider doing in 2.0 is
> > > > > > > >>>>>>>>> changing the default values of configuration options to
> > > > > > > >>>>>>>>> improve the out-of-box user experience.
> > > > > > > >>>>>>>>> Currently, in order to run a Flink job, users may need to set
> > > > > > > >>>>>>>>> a bunch of configurations, such as minibatch, checkpoint
> > > > > > > >>>>>>>>> interval, exactly-once, incremental-checkpoint, etc. It's
> > > > > > > >>>>>>>>> very verbose and hard to use for beginners.
> > > > > > > >>>>>>>>> Most of them can have a universally applicable value. Because
> > > > > > > >>>>>>>>> changing a default value is a breaking change, I think it's
> > > > > > > >>>>>>>>> worth considering changing them in 2.0.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> What do you think?
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Best,
> > > > > > >>>>>>>>> Jark
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > > >>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin
> > > > > > > >>>>>>>>> <snuyanzin@gmail.com> wrote:
> > > > > > > >>>>>>>>>> Hi Chesnay
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> > > > > > > >>>>>>>>>>> this would be an entirely internal change, and could thus
> > > > > > > >>>>>>>>>>> be an incremental process independent of major releases.
> > > > > > > >>>>>>>>>>> What is the actual scale of this item; how much are we
> > > > > > > >>>>>>>>>>> actually re-writing?
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> Thanks for asking.
> > > > > > > >>>>>>>>>> Yes, you're right, that should be an internal change.
> > > > > > > >>>>>>>>>> I was also thinking about an incremental change (rule by
> > > > > > > >>>>>>>>>> rule, or in reasonably small groups of rules).
> > > > > > > >>>>>>>>>> And yes, this could be an activity independent of major
> > > > > > > >>>>>>>>>> releases.
> > > > > > > >>>>>>>>>> The problem is actually for children of RelOptRule.
> > > > > > > >>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
> > > > > > > >>>>>>>>>> mentioned deprecated api.
> > > > > > > >>>>>>>>>> There are also children of ConverterRule (50+) which do not
> > > > > > > >>>>>>>>>> have such issues.
> > > > > > > >>>>>>>>>> Maybe it could be considered as the next step to have all
> > > > > > > >>>>>>>>>> the rules in Java.
> > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song
> > > > > > > >>>>>>>>>> <tonysong820@gmail.com> wrote:
> > > > > > > >>>>>>>>>>
> > > > > > > >>>>>>>>>>> Hi Alex & Gyula,
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > > > > > > >>>>>>>>>>>> FLIP-321: Introduce an API deprecation process" thread [1]?
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I
> > > > > > > >>>>>>>>>>> pasted the wrong url in my previous email. Sorry for the
> > > > > > > >>>>>>>>>>> mistake.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> I am also curious to know if the rationale behind this new
> > > > > > > >>>>>>>>>>>> API has been previously discussed on the mailing list. Do
> > > > > > > >>>>>>>>>>>> we have a list of shortcomings in the current DataStream
> > > > > > > >>>>>>>>>>>> API that it tries to resolve? How does the current
> > > > > > > >>>>>>>>>>>> ProcessFunction functionality fit into the picture? Will
> > > > > > > >>>>>>>>>>>> it be kept as is or subsumed by new API?
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> I don't think we should create a replacement for the
> > > > > > > >>>>>>>>>>>> DataStream API unless we have a very good reason to do so
> > > > > > > >>>>>>>>>>>> and with a proper discussion about this as Alex said.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> The ProcessFunction API which is targeting to replace
> > > > > > > >>>>>>>>>>> DataStream API is still a proposal, not a decision. Sorry
> > > > > > > >>>>>>>>>>> for the confusion, I should have been more careful with my
> > > > > > > >>>>>>>>>>> words, not giving the impression that this is something
> > > > > > > >>>>>>>>>>> we'll do anyway.
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> There will be a FLIP describing the motivations and designs
> > > > > > > >>>>>>>>>>> in detail, for the community to discuss and vote on. We are
> > > > > > > >>>>>>>>>>> still working on it. TBH, this is not trivial and we would
> > > > > > > >>>>>>>>>>> need more time on it.
> > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> Just to quickly share some background:
> > > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>>     - We see quite some problems with the current DataStream APIs
> > > > > > > >>>>>>>>>>>        - Users are working with concrete classes rather than
> > > > > > > >>>>>>>>>>>          interfaces, which means
> > > > > > > >>>>>>>>>>>           - Users can access methods that are designed to be used
> > > > > > > >>>>>>>>>>>             by internal classes, even though they are annotated
> > > > > > > >>>>>>>>>>>             with `@Internal`. E.g., `DataStream#getTransformation`.
> > > > > > > >>>>>>>>>>>           - Changes to the non-API implementations (e.g.,
> > > > > > > >>>>>>>>>>>             `Transformation`) would affect the API classes (e.g.,
> > > > > > > >>>>>>>>>>>             `DataStream`), which makes it hard to provide binary
> > > > > > > >>>>>>>>>>>             compatibility.
> > > > > > > >>>>>>>>>>>        - Internal classes are used as parameter / return-value of
> > > > > > > >>>>>>>>>>>          public APIs. E.g., while `AbstractStreamOperator` is
> > > > > > > >>>>>>>>>>>          PublicEvolving, `StreamTask`, which is returned from
> > > > > > > >>>>>>>>>>>          `AbstractStreamOperator#getContainingTask`, is Internal.
> > > > > > > >>>>>>>>>>>        - In many cases, users are asked to extend the API classes,
> > > > > > > >>>>>>>>>>>          rather than implementing interfaces. E.g.,
> > > > > > > >>>>>>>>>>>          `AbstractStreamOperator`.
> > > > > > > >>>>>>>>>>>           - Any changes to the base classes, even the internal
> > > > > > > >>>>>>>>>>>             part, may affect the behavior of the user-provided
> > > > > > > >>>>>>>>>>>             sub-classes
> > > > > > > >>>>>>>>>>>           - Users can override the behavior of the base classes
> > > > > > > >>>>>>>>>>>        - The API module `flink-streaming-java` contains non-API
> > > > > > > >>>>>>>>>>>          classes, and depends on internal modules such as
> > > > > > > >>>>>>>>>>>          `flink-runtime`, which means
> > > > > > > >>>>>>>>>>>           - Changes to the internal modules may affect the API
> > > > > > > >>>>>>>>>>>             modules, which requires users to re-build their
> > > > > > > >>>>>>>>>>>             applications upon upgrading
> > > > > > > >>>>>>>>>>>           - The artifact users need for building their
> > > > > > > >>>>>>>>>>>             applications is larger than necessary.
> > > > > > > >>>>>>>>>>>        - We probably should not expose operators (e.g.,
> > > > > > > >>>>>>>>>>>          `AbstractStreamOperator`) to users. Functions should be
> > > > > > > >>>>>>>>>>>          enough for users to define their data processing logic.
> > > > > > > >>>>>>>>>>>          Exposing operator-level concepts (e.g., mailbox thread
> > > > > > > >>>>>>>>>>>          model, checkpoint barrier alignment, etc.) is unnecessary
> > > > > > > >>>>>>>>>>>          and limits the improvement regarding such exposed
> > > > > > > >>>>>>>>>>>          mechanisms with compatibility considerations.
> > > > > > > >>>>>>>>>>>        - The current DataStream API seems to be a mixture of many
> > > > > > > >>>>>>>>>>>          things, making it hard to understand especially for
> > > > > > > >>>>>>>>>>>          newcomers. It might be better to re-organize it into
> > > > > > > >>>>>>>>>>>          several parts (the taxonomy below is just an example; we
> > > > > > > >>>>>>>>>>>          are still working on this):
> > > > > > > >>>>>>>>>>>           - The most fundamental stateful stream processing:
> > > > > > > >>>>>>>>>>>             streams, partitions / key, process functions, state,
> > > > > > > >>>>>>>>>>>             timeline-service
> > > > > > > >>>>>>>>>>>           - An extension for common batch-streaming unified
> > > > > > > >>>>>>>>>>>             functions: map, flatmap, filter, agg, reduce, join,
> > > > > > > >>>>>>>>>>>             etc.
> > > > > > > >>>>>>>>>>>           - An extension for windowing supports: window,
> > > > > > > >>>>>>>>>>>             triggering
> > > > > > > >>>>>>>>>>>           - An extension for event-time supports: event time,
> > > > > > > >>>>>>>>>>>             watermark
> > > > > > > >>>>>>>>>>>           - The extensions are like short-cuts / sugars, without
> > > > > > > >>>>>>>>>>>             which users can probably still achieve the same
> > > > > > > >>>>>>>>>>>             behavior by working with the fundamental APIs, but
> > > > > > > >>>>>>>>>>>             would be a lot easier with the extensions
> > > > > > > >>>>>>>>>>>     - The original plan was to do in-place refactors / changes on
> > > > > > > >>>>>>>>>>>       DataStream API. Some related items are listed in this doc
> > > > > > > >>>>>>>>>>>       [2] attached to the kicking off email [3]. Not all of the
> > > > > > > >>>>>>>>>>>       above issues are listed, because we hadn't looked into this
> > > > > > > >>>>>>>>>>>       as deeply by that time as we have now.
> > > > > > > >>>>>>>>>>>     - We proposed this as a new API rather than in-place
> > > > > > > >>>>>>>>>>>       refactors in the 2.0 work item list, because we realized
> > > > > > > >>>>>>>>>>>       the changes might be too big for an in-place change. First
> > > > > > > >>>>>>>>>>>       having a new API then gradually retiring the old one would
> > > > > > > >>>>>>>>>>>       help users to smoothly migrate between them.
> > > > > > >>>>>>>>>>>
> > > > > > > >>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is
> > > > > > > >>>>>>>>>>> out. And of course it's possible that the FLIP might be
> > > > > > > >>>>>>>>>>> rejected. Given that we are planning for release 2.0, I just
> > > > > > > >>>>>>>>>>> feel it would be better to bring this up early even if the
> > > > > > > >>>>>>>>>>> concrete plan is not yet ready.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Best,
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Xintong
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> [1]
> > > > > > >>>>>
> > > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > > >>>>>>>>>>> [2]
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>
> > > > > > >>
> > > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > > > > >>>>>>>>>>> [3]
> > > > > > >>>>>
> > > https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > > > > >>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
> > > > gyfora@apache.org
> > > > > > >>>>>> wrote:
> > > > > > >>>>>>>>>>>> Hey!
> > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> I share the same concerns mentioned above regarding the
> > > > > > > >>>>>>>>>>>> "ProcessFunction API".
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> I don't think we should create a replacement for the
> > > > > > > >>>>>>>>>>>> DataStream API unless we have a very good reason to do so
> > > > > > > >>>>>>>>>>>> and with a proper discussion about this as Alex said.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Cheers,
> > > > > > >>>>>>>>>>>> Gyula
> > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov
> > > > > > > >>>>>>>>>>>> <alexander.fedulov@gmail.com> wrote:
> > > > > > > >>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> Hi Xintong,
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > > > > > > >>>>>>>>>>>>> FLIP-321: Introduce an API deprecation process" thread [1]?
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> I am also curious to know if the rationale behind this new
> > > > > > > >>>>>>>>>>>>> API has been previously discussed on the mailing list. Do
> > > > > > > >>>>>>>>>>>>> we have a list of shortcomings in the current DataStream
> > > > > > > >>>>>>>>>>>>> API that it tries to resolve? How does the current
> > > > > > > >>>>>>>>>>>>> ProcessFunction functionality fit into the picture? Will
> > > > > > > >>>>>>>>>>>>> it be kept as is or subsumed by new API?
> > > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> Best,
> > > > > > >>>>>>>>>>>>> Alex
> > > > > > >>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song
> > > > > > > >>>>>>>>>>>>> <tonysong820@gmail.com> wrote:
> > > > > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > > > > > > >>>>>>>>>>>>>>> headaches because it's very unclear what it actually
> > > > > > > >>>>>>>>>>>>>>> entails; like is it an entirely separate API to
> > > > > > > >>>>>>>>>>>>>>> DataStream (sounds like it is!) or an extension of
> > > > > > > >>>>>>>>>>>>>>> DataStream. How much will it share the internals with
> > > > > > > >>>>>>>>>>>>>>> DataStream etc.; how does it relate to the Table API
> > > > > > > >>>>>>>>>>>>>>> (w.r.t. switching APIs / what Table API uses underneath).
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> I totally understand your confusion. We started planning
> > > > > > > >>>>>>>>>>>>>> this after kicking off the release 2.0, so there's still
> > > > > > > >>>>>>>>>>>>>> a lot to be explored and the plan keeps changing.
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>     - In the beginning, we planned to do an in-place
> > > > > > > >>>>>>>>>>>>>>       refactor of DataStream API, until the API migration
> > > > > > > >>>>>>>>>>>>>>       period was proposed.
> > > > > > > >>>>>>>>>>>>>>     - Then we wanted to make it an entirely separate API
> > > > > > > >>>>>>>>>>>>>>       to DataStream, and listed it as a must-have for
> > > > > > > >>>>>>>>>>>>>>       release 2.0 so that we can remove DataStream once
> > > > > > > >>>>>>>>>>>>>>       it's ready.
> > > > > > > >>>>>>>>>>>>>>     - However, depending on the outcome of the API
> > > > > > > >>>>>>>>>>>>>>       compatibility discussion [1], we may not be able to
> > > > > > > >>>>>>>>>>>>>>       remove DataStream in 2.0 anyway, which means we
> > > > > > > >>>>>>>>>>>>>>       might need to re-evaluate the necessity of this item
> > > > > > > >>>>>>>>>>>>>>       for 2.0.
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility
> > > > > > > >>>>>>>>>>>>>> discussion [1] and decide the priority for this item
> > > > > > > >>>>>>>>>>>>>> afterwards.
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> Best,
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> Xintong
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> [1]
> > > > > > >> https://lists.apache.org/list.html?dev@flink.apache.org
> > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler
> > > > > > > >>>>>>>>>>>>>> <chesnay@apache.org> wrote:
> > > > > > > >>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of items.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> > > > > > > >>>>>>>>>>>>>>> Management" item is marked as a must-have; will it
> > > > > > > >>>>>>>>>>>>>>> require changes that break something? What prevents it
> > > > > > > >>>>>>>>>>>>>>> from being added in 2.1?
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make Java 17
> > > > > > > >>>>>>>>>>>>>>> the default, drop Java 8/11". Maybe even split it into
> > > > > > > >>>>>>>>>>>>>>> a must-have "Drop Java 8" and a nice-to-have "Drop Java
> > > > > > > >>>>>>>>>>>>>>> 11"?
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope
> > > > > > > >>>>>>>>>>>>>>> that this would be an entirely internal change, and
> > > > > > > >>>>>>>>>>>>>>> could thus be an incremental process independent of
> > > > > > > >>>>>>>>>>>>>>> major releases.
> > > > > > > >>>>>>>>>>>>>>> What is the actual scale of this item; how much are we
> > > > > > > >>>>>>>>>>>>>>> actually re-writing?
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > > > > > > >>>>>>>>>>>>>>> must-have; I think I marked it down as nice-to-have
> > > > > > > >>>>>>>>>>>>>>> only because it depends on another item.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > > > > > > >>>>>>>>>>>>>>> headaches because it's very unclear what it actually
> > > > > > > >>>>>>>>>>>>>>> entails; like is it an entirely separate API to
> > > > > > > >>>>>>>>>>>>>>> DataStream (sounds like it is!) or an extension of
> > > > > > > >>>>>>>>>>>>>>> DataStream. How much will it share the internals with
> > > > > > > >>>>>>>>>>>>>>> DataStream etc.; how does it relate to the Table API
> > > > > > > >>>>>>>>>>>>>>> (w.r.t. switching APIs / what Table API uses underneath).
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> There are a few items I added as ideas which don't have
> > > > > > > >>>>>>>>>>>>>>> a priority yet; would love to get some feedback on
> > > > > > > >>>>>>>>>>>>>>> those.
> > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Hi devs,
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> As previously discussed in [1], we had been collecting
> > > > > > > >>>>>>>>>>>>>>> work item proposals for the 2.0 release until June 15th,
> > > > > > > >>>>>>>>>>>>>>> on the wiki page [2].
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>     - As we have passed the due date, I'd like to kindly
> > > > > > > >>>>>>>>>>>>>>>     remind everyone *not to add / remove items directly
> > > > > > > >>>>>>>>>>>>>>>     on the wiki page*. If needed, please post in this
> > > > > > > >>>>>>>>>>>>>>>     thread or reach out to the release managers instead.
> > > > > > > >>>>>>>>>>>>>>>     - I've reached out to some folks for clarifications
> > > > > > > >>>>>>>>>>>>>>>     about their proposals. Some of them mentioned that
> > > > > > > >>>>>>>>>>>>>>>     they can not yet tell whether we should do an item or
> > > > > > > >>>>>>>>>>>>>>>     not, and would need more time / discussions to make
> > > > > > > >>>>>>>>>>>>>>>     the decision. So I added a new symbol for items whose
> > > > > > > >>>>>>>>>>>>>>>     priorities are `TBD`.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > > > > > > >>>>>>>>>>>>>>> must-have items. I've gone through the entire list of
> > > > > > > >>>>>>>>>>>>>>> proposed items, and found most of them make quite much
> > > > > > > >>>>>>>>>>>>>>> sense. So I think an online sync might not be necessary
> > > > > > > >>>>>>>>>>>>>>> for this. I'd like to go with this DISCUSS thread, where
> > > > > > > >>>>>>>>>>>>>>> everyone can comment on how they think the list can be
> > > > > > > >>>>>>>>>>>>>>> improved, followed by a VOTE to formally make the
> > > > > > > >>>>>>>>>>>>>>> decision.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Any feedback and opinions, including but not limited to
> > > > > > > >>>>>>>>>>>>>>> the following aspects, will be appreciated.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>     - Important items that are missing from the list
> > > > > > > >>>>>>>>>>>>>>>     - Concerns regarding the listed items or their
> > > > > > > >>>>>>>>>>>>>>>     priorities
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Looking forward to your feedback.
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Best,
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> Xintong
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>>
> > > > > > > >>>>>>>>>>>>>>> [1] https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > > > > >>>>>>>>>>>>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>> --
> > > > > > >>>>>>>>>> Best regards,
> > > > > > >>>>>>>>>> Sergey
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>> --
> > > > > > >>>>> Best
> > > > > > >>>>>
> > > > > > >>>>> ConradJam
> > > > > > >>>>>
> > > > > >
> > > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Matthias Pohl <ma...@aiven.io.INVALID>.
@Xintong I guess it makes sense. I agree with your conclusions on the four
mentioned Jira issues.

I just checked all open issues that have fixVersion = 2.0.0 [1]. There are
a few more items that are not affiliated with FLINK-3957 [2]. I guess we
should find answers for these issues: either close them with a reason, so
that the state in Jira is consistent, or add them to the feature list as
part of a separate voting thread (to leave the current vote untouched).

What we might want to come up with is a summary that states, for each 2.0.0
issue, why it should be included or not. That summary is something the
community could vote on. WDYT? I'm happy to help here.

Matthias

[1]
https://issues.apache.org/jira/browse/FLINK-32437?jql=project%20%3D%20FLINK%20AND%20fixVersion%20%3D%202.0.0%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)%20%20
[2] https://issues.apache.org/jira/browse/FLINK-3957
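
On the FLINK-5126 point quoted below (removing Exception / IOException from
API signatures), a minimal Java sketch of what such a change means for callers
may help the discussion. All class and method names are hypothetical and are
not Flink APIs. As far as the Java language rules go, dropping a checked
exception from a `throws` clause keeps already compiled callers linking, and
callers that propagate the exception or catch the broad `Exception` type still
compile. A caller that catches `IOException` specifically, or a subclass that
overrides the method with `throws IOException`, would need a source change, so
it may be worth checking which of these patterns appear in user code.

```java
import java.io.IOException;

// Hypothetical names throughout; none of these are Flink classes.
final class SignatureChangeSketch {

    /** "Before": the API method declares a checked exception. */
    static class ReaderV1 {
        String read() throws IOException {
            return "v1";
        }
    }

    /** "After": the same method with the throws clause removed. */
    static class ReaderV2 {
        String read() {
            return "v2";
        }
    }

    /** Caller style A (broad catch) against the old signature. */
    static String callV1Broad() {
        try {
            return new ReaderV1().read();
        } catch (Exception e) {
            return "failed";
        }
    }

    /** Caller style A against the new signature: compiles unchanged. */
    static String callV2Broad() {
        try {
            return new ReaderV2().read();
        } catch (Exception e) {
            return "failed";
        }
    }

    /** Caller style B (narrow catch): compiles against the old signature only. */
    static String callV1Narrow() {
        try {
            return new ReaderV1().read();
        } catch (IOException e) {
            return "failed";
        }
    }

    // The same narrow catch around new ReaderV2().read() is rejected by javac,
    // because IOException can no longer be thrown inside the try block.
    // Likewise, a subclass of ReaderV2 that overrides read() and still declares
    // "throws IOException" no longer compiles.

    public static void main(String[] args) {
        System.out.println(callV1Broad());
        System.out.println(callV2Broad());
        System.out.println(callV1Narrow());
    }
}
```

Under these assumptions the change is binary compatible but not fully source
compatible, which might matter for how the ticket is classified.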


On Tue, Jul 11, 2023 at 5:01 AM Xintong Song <to...@gmail.com> wrote:

> @Zhu,
> As you are downgrading "Clarify the scopes of configuration options" to
> nice-to-have priority, could you also bring that up in the vote thread[1]?
> I'm asking because there are people who already voted on the original list.
> I think restarting the vote is probably overkill and unnecessary, but we
> should at least bring this change to their attention.
>
> @Matthias,
> Thanks a lot for bringing this up. I wasn't aware of this early umbrella. I
> haven't gone through everything in FLINK-3957 yet. I'll do it asap.
>
> Just quickly went through the 4 issues you mentioned.
> - FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as long as
> the new APIs that we want users to migrate to are ready. For these 2
> tickets, I think introduction of the updated APIs should be straightforward
> and feasible for 1.18.
> - FLINK-13926: I'm not sure about this one. The two mentioned classes
> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not even
> marked as Public or PublicEvolving APIs. Moreover, I don't see a good way
> to smoothly replace the classes with a generic version.
> - FLINK-5126: This is a bit unclear to me. From the description and
> conversation on the ticket, I don't fully understand which concrete APIs
> the ticket is referring to. Or maybe it refers to all / most of the APIs
> that throw Exception / IOException in general. Moreover, I don't think
> removing Exception / IOException from the API signature is a breaking
> change. It requires no code changes on the caller side.
>
> WDYT?
>
> Best,
>
> Xintong
>
>
> [1] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
> [2] https://issues.apache.org/jira/browse/FLINK-3957
>
> On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
> <ma...@aiven.io.invalid> wrote:
>
> > I brought it up in the deprecating APIs in 1.18 thread [1] already but it
> > feels misplaced there. I just wanted to ask whether someone did a pass
> > over FLINK-3957 [2]. I came across it when going through the release 2.0
> > feature list [3] as part of the vote. I have the feeling that there are
> > some valid action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6])
> > which do not seem to be listed in the 2.0 feature list [3], yet (or are
> > included in some of the bigger items). The majority of the subtasks are
> > probably covered by the DataSet removal, the Scala API removal and the
> > ProcessFunction refactoring. Other subtasks (FLINK-14068 [7]) made it into
> > the feature list.
> >
> > I haven't worked with the SDK code that much so that I can judge whether
> > the subtasks are still reasonable or actually obsolete. That is why I
> > wanted to mention the Jira issue here once more.
> >
> > I don't consider it a blocker for the ongoing vote but was wondering
> > whether it makes sense for someone who might have more experience in that
> > field to add some of the subtasks to the feature list.
> >
> > Or shall we just consider it as "not interesting enough" because nobody
> > added it in the first place to the 2.0 feature list [3]?
> >
> > Matthias
> >
> > [1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
> > [2] https://issues.apache.org/jira/browse/FLINK-3957
> > [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > [4] https://issues.apache.org/jira/browse/FLINK-4675
> > [5] https://issues.apache.org/jira/browse/FLINK-5126
> > [6] https://issues.apache.org/jira/browse/FLINK-13926
> > [7] https://issues.apache.org/jira/browse/FLINK-14068
> >
> > On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
> >
> > > Agreed that we should deprecate affected APIs as soon as possible.
> > > But there is not much time before the feature freeze of 1.18, hence
> > > I'm a bit concerned that some of the deprecations might not be done
> > > in 1.18.
> > >
> > > We are currently looking into the improvements of the configuration
> > layer.
> > > Most of the proposed changes would require a public discussion, or even
> > > a FLIP, which I think can hardly close before the feature freeze of
> 1.18.
> > > And some of the APIs can be deprecated only after the corresponding new
> > > APIs are developed. Therefore we previously targeted them for 1.19.
> > >
> > > We may review later to see what deprecation work can be done in 1.18
> and
> > > make it if possible. I think we can do the work even after the feature
> > > freeze
> > > date, if it is a purely deprecation work (simply adding annotations).
> > WDYT?
> > >
> > > I'm also changing the priority of "Clarify the scopes of configuration
> > > options" to nice-to-have. I think most of the work is not a breaking
> > > change and can be done in 1.x or 2.1+. For the breaking changes which
> > > might be needed, we will consider them as part of the configuration
> > > layer rework.
> > >
> > > Thanks,
> > > Zhu
> > >
> > > On Mon, Jul 10, 2023 at 19:58, Xintong Song <to...@gmail.com> wrote:
> > > >
> > > > >
> > > > > At what point are the FLIP discussions coming into play?
> > > >
> > > > I keep wondering if these shouldn't have started already.
> > > >
> > > >
> > > > I think this depends on the responsible contributor and reviewer of
> > > > individual items. From my perspective, the FLIP discussions can start
> > any
> > > > time as long as the contributors are ready, the earlier the better.
> > > >
> > > >
> > > > What we need to ensure is that all breaking API changes are
> > > > > discussed/decided before 1.18 is released so we can deprecate
> > affected
> > > APIs.
> > > > >
> > > >
> > > > The introduction of the migration period has brought the requirement
> to
> > > > plan the removal of public APIs 2 minor releases ahead of the major
> > > > release, which is TBH a bit unexpected. I agree it would be nice if
> we
> > > can
> > > > get the FLIPs ready by releasing 1.18. But I also don't think we
> should
> > > > rush on it. If the deprecation of a Public API does not make 1.18, we
> > may
> > > > carry it until 3.0. Or if there are many Public APIs whose
> deprecation
> > > does
> > > > not make 1.18, we may deprecate them in 1.19 and postpone the major
> > > version
> > > > bump to after a 1.20 release. Moreover, as mentioned in FLIP-321[1],
> > > > exceptions are discussable given that the migration period is newly
> > > > proposed and we did not give developers the chance to plan things
> > > > ahead. To sum up, I'd say we try to identify APIs that need to be
> > > > deprecated in 1.18 on a best-effort basis, and evaluate the remaining
> > > > options (carrying the API for the entire 2.x cycle, postponing 2.0, or
> > > > making an exception) case-by-case.
> > > > WDYT?
> > > >
> > > > Best,
> > > >
> > > > Xintong
> > > >
> > > >
> > > > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > >
> > > > On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <chesnay@apache.org
> >
> > > wrote:
> > > >
> > > > > At what point are the FLIP discussions coming into play?
> > > > >
> > > > > I keep wondering if these shouldn't have started already.
> > > > > It just seems that a lot of decisions are implicitly reliant on the
> > > > > items even being accepted.
> > > > > Estimates can only be provided if we actually know the scope of the
> > > > > change, but that's not always clear from the description in the
> doc.
> > > > >
> > > > > What we need to ensure is that all breaking API changes are
> > > > > discussed/decided before 1.18 is released so we can deprecate
> > affected
> > > > > APIs.
> > > > >
> > > > > On 10/07/2023 11:32, Xintong Song wrote:
> > > > > > Hi Matthias,
> > > > > >
> > > > > > The questions you asked are indeed very important. Here're some
> > quick
> > > > > > responses, based on the plans I had in mind, which I have not
> > aligned
> > > > > with
> > > > > > other release managers yet.
> > > > > >
> > > > > > In the previous discussions between the RMs, we were not able to
> > > > > > make proposals on things like how to make a time plan, how to
> > > > > > manage the release branch, etc., due to the lack of inputs on,
> > > > > > e.g., the work items that need to be included (which transitively
> > > > > > depends on the API compatibility to provide between major
> > > > > > versions) and the workloads / time needed for them. With the
> > > > > > recent discussions, we have collected at least the majority of
> > > > > > the inputs needed.
> > > > > >
> > > > > > Here are things that I think we as the release managers would do
> > next
> > > > > > (again, not aligned with other release managers yet)
> > > > > > - Creating a time plan, by reaching out to people to understand
> the
> > > > > > estimated workloads, prerequisites and ETA of each work item.
> > > > > > - Make a proposal on how to manage the release branch, i.e., when
> > to
> > > cut
> > > > > > the branch and whether to ship the milestone releases, etc.
> > > > > > - Set-up regular release syncs (bi-weekly / monthly) to update
> the
> > > status
> > > > > > and draw attention to where help is needed.
> > > > > >
> > > > > > So back to your questions.
> > > > > >
> > > > > > There are still to-be-discussed items in the list of features.
> > > What's the
> > > > > >> plan with those?
> > > > > > When collecting ETA, for items that the completion time cannot
> yet
> > be
> > > > > > estimated, we would like to have at least a time by which the
> > > estimation
> > > > > > can be made. I think the same applies to the to-be-discussed
> items.
> > > And
> > > > > if
> > > > > > the items should be included as must-haves, we would need another
> > > vote to
> > > > > > adjust the must-have item list.
> > > > > >
> > > > > > Some of them don't have anyone assigned.
> > > > > > My concern is that they will be overlooked because nobody feels
> to
> > > be in
> > > > > >> charge.
> > > > > > This is a tricky one. For must-have items without assignees, we
> as
> > > the
> > > > > > release managers should be responsible for raising them up in the
> > > release
> > > > > > syncs, and try to find assignees for them. Hopefully, there will
> be
> > > > > someone
> > > > > > who stands out. But it is possible that for a must-have item
> nobody
> > > wants
> > > > > > to work on it. If that happens, which I don't think it will, it
> > > probably
> > > > > > means the item is not that critical and we may have to exclude it
> > > from
> > > > > the
> > > > > > release. Either way, they should not be overlooked, because IMHO
> > > release
> > > > > > managers should be responsible for trying to get someone to work
> on
> > > the
> > > > > > un-assigned items.
> > > > > >
> > > > > > We'll have more discussions soon and keep the community updated.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Xintong
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> > > > > > <ma...@aiven.io.invalid> wrote:
> > > > > >
> > > > > >> Now that the vote is started on the must-have items: There are
> > still
> > > > > >> to-be-discussed items in the list of features. What's the plan
> > with
> > > > > those?
> > > > > >> Some of them don't have anyone assigned. Were these items
> > discussed
> > > > > among
> > > > > >> the release managers? So far, it looks like they are handled as
> > > > > >> nice-to-have if someone volunteers to pick them up?
> > > > > >>
> > > > > >> My concern is that they will be overlooked because nobody feels
> to
> > > be in
> > > > > >> charge.
> > > > > >>
> > > > > >> Best,
> > > > > >> Matthias
> > > > > >>
> > > > > >> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <
> > tonysong820@gmail.com
> > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Thanks all for the discussion.
> > > > > >>>
> > > > > >>> The wiki has been updated as discussed. I'm starting a vote
> now.
> > > > > >>>
> > > > > >>> Best,
> > > > > >>>
> > > > > >>> Xintong
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <
> > tonysong820@gmail.com
> > > >
> > > > > >> wrote:
> > > > > >>>> Hi ConradJam,
> > > > > >>>>
> > > > > >>>> I think Chesnay has already put his name as the Contributor
> for
> > > the
> > > > > two
> > > > > >>>> tasks you listed. Maybe you can reach out to him to see if you
> > can
> > > > > >>>> collaborate on this.
> > > > > >>>>
> > > > > > >>>> In general, I don't think contributing to a release 2.0 issue
> > > > > > >>>> is much different from contributing to a regular issue. We
> > > > > > >>>> haven't yet created JIRA tickets for all the listed tasks
> > > > > > >>>> because many of them need further discussions and / or FLIPs
> > > > > > >>>> to decide whether and how they should be performed.
> > > > > >>>>
> > > > > >>>> Best,
> > > > > >>>>
> > > > > >>>> Xintong
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <
> jam.gzczy@gmail.com>
> > > > > wrote:
> > > > > >>>>
> > > > > >>>>> Hi Community:
> > > > > > >>>>>    I see some tasks in the 2.0 list that haven't been assigned
> > > > > > >>>>> yet. I want to take the initiative to take on some tasks that
> > > > > > >>>>> I can complete. How do I apply to the community for this part
> > > > > > >>>>> of the work? I am interested in the following parts of
> > > > > > >>>>> FLINK-32377 [1]. Do I need to create the issues myself and
> > > > > > >>>>> assign them to myself?
> > > > > > >>>>>
> > > > > > >>>>> - the current timestamp, which is problematic w.r.t. caching
> > > > > > >>>>> and testing, while providing no value.
> > > > > > >>>>> - Remove JarRequestBody#programArgs in favor of
> > > > > > >>>>> #programArgsList.
> > > > > > >>>>>
> > > > > > >>>>> [1] https://issues.apache.org/jira/browse/FLINK-32377
> > > > > >>>>>
> > > > > > >>>>> On Fri, Jun 30, 2023 at 00:53, Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> Teoh, Hong <li...@amazon.co.uk.invalid> 于2023年6月30日周五
> > 00:53写道:
> > > > > >>>>>
> > > > > >>>>>> Thanks Xintong for driving the effort.
> > > > > >>>>>>
> > > > > > >>>>>> I’d add a +1 to reworking configs, as suggested by @Jark and
> > > > > > >>>>>> @Chesnay, especially the types. We have various configs that
> > > > > > >>>>>> encode Time / MemorySize but are typed as Long instead!
> > > > > >>>>>>
> > > > > >>>>>> Regards,
> > > > > >>>>>> Hong
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yuanmei.work@gmail.com
> >
> > > > > >> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>> Thanks for driving this effort, Xintong!
> > > > > >>>>>>>
> > > > > >>>>>>> To Chesnay
> > > > > >>>>>>>> I'm curious as to why the "Disaggregated State Management"
> > > item
> > > > > >> is
> > > > > >>>>>>>> marked as a must-have; will it require changes that break
> > > > > >>> something?
> > > > > >>>>>>>> What prevents it from being added in 2.1?
> > > > > >>>>>>> As to "Disaggregated State Management".
> > > > > >>>>>>>
> > > > > > >>>>>>> We plan to provide a new type of state backend to support
> > > > > > >>>>>>> DFS as primary storage.
> > > > > > >>>>>>> To achieve this, we need at least two sets of changes (not
> > > > > > >>>>>>> entirely sure yet, since we are still in the design and
> > > > > > >>>>>>> prototyping phase):
> > > > > > >>>>>>> 1. State backend change
> > > > > > >>>>>>> 2. State access change
> > > > > > >>>>>>>
> > > > > > >>>>>>> Not all of the related interfaces are `@Internal`. Some of
> > > > > > >>>>>>> them, like `StateBackend`, are `@PublicEvolving`.
> > > > > > >>>>>>> So, you are right in the sense that "Disaggregated State
> > > > > > >>>>>>> Management" itself probably does not need to be a "Must
> > > > > > >>>>>>> Have".
> > > > > > >>>>>>>
> > > > > > >>>>>>> But I was hoping that changes related to public APIs can be
> > > > > > >>>>>>> finalized and merged in Flink 2.0 (I will fix the wiki
> > > > > > >>>>>>> accordingly).
> > > > > >>>>>>>
> > > > > >>>>>>> I also agree with Jark that 2.0 is a good chance to rework
> > the
> > > > > >>> default
> > > > > >>>>>>> value of configurations.
> > > > > >>>>>>>
> > > > > >>>>>>> Best
> > > > > >>>>>>> Yuan
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> > > > > >>> chesnay@apache.org>
> > > > > >>>>>> wrote:
> > > > > > >>>>>>>> Something else configuration-related is that there are a
> > > > > > >>>>>>>> bunch of options where the type isn't quite correct (e.g., a
> > > > > > >>>>>>>> String where it could be an enum, a string where it should
> > > > > > >>>>>>>> be an int or something).
> > > > > > >>>>>>>> Could do a pass over those as well.
> > > > > >>>>>>>>
> > > > > >>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> > > > > >>>>>>>>> Hi,
> > > > > >>>>>>>>>
> > > > > > >>>>>>>>> I think one more thing we need to consider doing in 2.0 is
> > > > > > >>>>>>>>> changing the default values of configuration options to
> > > > > > >>>>>>>>> improve the out-of-box user experience.
> > > > > > >>>>>>>>> Currently, in order to run a Flink job, users may need to
> > > > > > >>>>>>>>> set a bunch of configurations, such as minibatch,
> > > > > > >>>>>>>>> checkpoint interval, exactly-once, incremental-checkpoint,
> > > > > > >>>>>>>>> etc. It's very verbose and hard to use for beginners.
> > > > > > >>>>>>>>> Most of them can have a universally applicable value.
> > > > > > >>>>>>>>> Because changing the default value is a breaking change, I
> > > > > > >>>>>>>>> think it's worth considering changing them in 2.0.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> What do you think?
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Best,
> > > > > >>>>>>>>> Jark
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> > > > > >>> snuyanzin@gmail.com>
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>>> Hi Chesnay
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope
> > that
> > > > > >> this
> > > > > >>>>> would
> > > > > >>>>>>>> be
> > > > > >>>>>>>>>>> an entirely internal change, and could thus be an
> > > incremental
> > > > > >>>>> process
> > > > > >>>>>>>>>>> independent of major releases.
> > > > > >>>>>>>>>>> What is the actual scale of this item; how much are we
> > > > > >> actually
> > > > > >>>>>>>>>> re-writing?
> > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Thanks for asking.
> > > > > > >>>>>>>>>> Yes, you're right, that should be an internal change.
> > > > > > >>>>>>>>>> I was also thinking about an incremental change (rule by
> > > > > > >>>>>>>>>> rule, or in reasonably small groups of rules).
> > > > > > >>>>>>>>>> And yes, this could be an activity independent of major
> > > > > > >>>>>>>>>> releases.
> > > > > > >>>>>>>>>> The problem is actually with the children of RelOptRule.
> > > > > > >>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
> > > > > > >>>>>>>>>> mentioned deprecated API.
> > > > > > >>>>>>>>>> There are also children of ConverterRule (50+) which do
> > > > > > >>>>>>>>>> not have such issues.
> > > > > > >>>>>>>>>> Maybe those could be considered as the next step towards
> > > > > > >>>>>>>>>> having all the rules in Java.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > > > > >>>>> tonysong820@gmail.com>
> > > > > >>>>>>>>>> wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>> Hi Alex & Gyula,
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > > > > >> FLIP-321:
> > > > > >>>>>>>>>> Introduce
> > > > > >>>>>>>>>>>> an API deprecation process" thread [1]?
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I
> > > pasted
> > > > > >>> the
> > > > > >>>>>> wrong
> > > > > >>>>>>>>>> url
> > > > > >>>>>>>>>>> in my previous email. Sorry for the mistake.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> I am also curious to know if the rationale behind this
> > new
> > > API
> > > > > >>> has
> > > > > >>>>>> been
> > > > > >>>>>>>>>>>> previously discussed on the mailing list. Do we have a
> > > list
> > > > > >> of
> > > > > >>>>>>>>>>> shortcomings
> > > > > >>>>>>>>>>>> in the current DataStream API that it tries to
> resolve?
> > > How
> > > > > >>> does
> > > > > >>>>> the
> > > > > >>>>>>>>>>>> current ProcessFunction functionality fit into the
> > > picture?
> > > > > >>> Will
> > > > > >>>>> it
> > > > > >>>>>> be
> > > > > >>>>>>>>>>> kept
> > > > > >>>>>>>>>>>> as is or subsumed by new API?
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> I don't think we should create a replacement for the
> > > > > >> DataStream
> > > > > >>>>> API
> > > > > >>>>>>>>>> unless
> > > > > >>>>>>>>>>>> we have a very good reason to do so and with a proper
> > > > > >>> discussion
> > > > > >>>>>> about
> > > > > >>>>>>>>>>> this
> > > > > >>>>>>>>>>>> as Alex said.
> > > > > > >>>>>>>>>>> The ProcessFunction API, which aims to replace the
> > > > > > >>>>>>>>>>> DataStream API, is still a proposal, not a decision. Sorry
> > > > > > >>>>>>>>>>> for the confusion. I should have been more careful with my
> > > > > > >>>>>>>>>>> words, to avoid giving the impression that this is
> > > > > > >>>>>>>>>>> something we'll do anyway.
> > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> There will be a FLIP describing the motivations and designs
> > > > > > >>>>>>>>>>> in detail, for the community to discuss and vote on. We are
> > > > > > >>>>>>>>>>> still working on it. TBH, this is not trivial and we will
> > > > > > >>>>>>>>>>> need more time on it.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Just to quickly share some background:
> > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>     - We see quite some problems with the current DataStream APIs
> > > > > > >>>>>>>>>>>        - Users are working with concrete classes rather than
> > > > > > >>>>>>>>>>>        interfaces, which means
> > > > > > >>>>>>>>>>>           - Users can access methods that are designed to be
> > > > > > >>>>>>>>>>>           used by internal classes, even though they are
> > > > > > >>>>>>>>>>>           annotated with `@Internal`. E.g.,
> > > > > > >>>>>>>>>>>           `DataStream#getTransformation`.
> > > > > > >>>>>>>>>>>           - Changes to the non-API implementations (e.g.,
> > > > > > >>>>>>>>>>>           `Transformation`) would affect the API classes
> > > > > > >>>>>>>>>>>           (e.g., `DataStream`), which makes it hard to
> > > > > > >>>>>>>>>>>           provide binary compatibility.
> > > > > > >>>>>>>>>>>        - Internal classes are used as parameter / return-value
> > > > > > >>>>>>>>>>>        of public APIs. E.g., while `AbstractStreamOperator`
> > > > > > >>>>>>>>>>>        is PublicEvolving, `StreamTask`, which is returned
> > > > > > >>>>>>>>>>>        from `AbstractStreamOperator#getContainingTask`, is
> > > > > > >>>>>>>>>>>        Internal.
> > > > > > >>>>>>>>>>>        - In many cases, users are asked to extend the API
> > > > > > >>>>>>>>>>>        classes, rather than implementing interfaces. E.g.,
> > > > > > >>>>>>>>>>>        `AbstractStreamOperator`.
> > > > > > >>>>>>>>>>>           - Any changes to the base classes, even the
> > > > > > >>>>>>>>>>>           internal part, may affect the behavior of the
> > > > > > >>>>>>>>>>>           user-provided sub-classes
> > > > > > >>>>>>>>>>>           - Users can override the behavior of the base
> > > > > > >>>>>>>>>>>           classes
> > > > > > >>>>>>>>>>>        - The API module `flink-streaming-java` contains
> > > > > > >>>>>>>>>>>        non-API classes, and depends on internal modules such
> > > > > > >>>>>>>>>>>        as `flink-runtime`, which means
> > > > > > >>>>>>>>>>>           - Changes to the internal modules may affect the
> > > > > > >>>>>>>>>>>           API modules, which requires users to re-build
> > > > > > >>>>>>>>>>>           their applications upon upgrading
> > > > > > >>>>>>>>>>>           - The artifact users need for building their
> > > > > > >>>>>>>>>>>           applications is larger than necessary.
> > > > > > >>>>>>>>>>>        - We probably should not expose operators (e.g.,
> > > > > > >>>>>>>>>>>        `AbstractStreamOperator`) to users. Functions should
> > > > > > >>>>>>>>>>>        be enough for users to define their data processing
> > > > > > >>>>>>>>>>>        logic. Exposing operator-level concepts (e.g.,
> > > > > > >>>>>>>>>>>        mailbox thread model, checkpoint barrier alignment,
> > > > > > >>>>>>>>>>>        etc.) is unnecessary, and compatibility
> > > > > > >>>>>>>>>>>        considerations then limit how far such exposed
> > > > > > >>>>>>>>>>>        mechanisms can be improved.
> > > > > > >>>>>>>>>>>        - The current DataStream API seems to be a mixture of
> > > > > > >>>>>>>>>>>        many things, making it hard to understand, especially
> > > > > > >>>>>>>>>>>        for newcomers. It might be better to re-organize it
> > > > > > >>>>>>>>>>>        into several parts (the taxonomy below is just an
> > > > > > >>>>>>>>>>>        example; we are still working on this):
> > > > > > >>>>>>>>>>>           - The most fundamental stateful stream processing:
> > > > > > >>>>>>>>>>>           streams, partitions / key, process functions,
> > > > > > >>>>>>>>>>>           state, timeline-service
> > > > > > >>>>>>>>>>>           - An extension for common batch-streaming unified
> > > > > > >>>>>>>>>>>           functions: map, flatmap, filter, agg, reduce,
> > > > > > >>>>>>>>>>>           join, etc.
> > > > > > >>>>>>>>>>>           - An extension for windowing supports: window,
> > > > > > >>>>>>>>>>>           triggering
> > > > > > >>>>>>>>>>>           - An extension for event-time supports: event
> > > > > > >>>>>>>>>>>           time, watermark
> > > > > > >>>>>>>>>>>           - The extensions are like short-cuts / sugars,
> > > > > > >>>>>>>>>>>           without which users can probably still achieve the
> > > > > > >>>>>>>>>>>           same behavior by working with the fundamental
> > > > > > >>>>>>>>>>>           APIs, but it would be a lot easier with the
> > > > > > >>>>>>>>>>>           extensions
> > > > > > >>>>>>>>>>>     - The original plan was to do in-place refactors /
> > > > > > >>>>>>>>>>>     changes on DataStream API. Some related items are listed
> > > > > > >>>>>>>>>>>     in this doc [2] attached to the kicking off email [3].
> > > > > > >>>>>>>>>>>     Not all of the above issues are listed, because we had
> > > > > > >>>>>>>>>>>     not looked into this as deeply by that time as we have
> > > > > > >>>>>>>>>>>     now.
> > > > > > >>>>>>>>>>>     - We proposed this as a new API rather than in-place
> > > > > > >>>>>>>>>>>     refactors in the 2.0 work item list, because we realized
> > > > > > >>>>>>>>>>>     the changes might be too big for an in-place change.
> > > > > > >>>>>>>>>>>     First having a new API, then gradually retiring the old
> > > > > > >>>>>>>>>>>     one, would help users to smoothly migrate between them.
> > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is
> > > > > > >>>>>>>>>>> out. And of course it's possible that the FLIP might be
> > > > > > >>>>>>>>>>> rejected. Given that we are planning for release 2.0, I
> > > > > > >>>>>>>>>>> just feel it would be better to bring this up early, even
> > > > > > >>>>>>>>>>> if the concrete plan is not yet ready.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Best,
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Xintong
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> [1]
> > > > > >>>>>
> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > >>>>>>>>>>> [2]
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>
> > > > >
> > >
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > > > >>>>>>>>>>> [3]
> > > > > >>>>>
> > https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > > > >>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
> > > gyfora@apache.org
> > > > > >>>>>> wrote:
> > > > > >>>>>>>>>>>> Hey!
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I share the same concerns mentioned above regarding
> the
> > > > > >>>>>>>>>> "ProcessFunction
> > > > > >>>>>>>>>>>> API".
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I don't think we should create a replacement for the
> > > > > >> DataStream
> > > > > >>>>> API
> > > > > >>>>>>>>>>> unless
> > > > > >>>>>>>>>>>> we have a very good reason to do so and with a proper
> > > > > >>> discussion
> > > > > >>>>>> about
> > > > > >>>>>>>>>>> this
> > > > > >>>>>>>>>>>> as Alex said.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Cheers,
> > > > > >>>>>>>>>>>> Gyula
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > > > > >>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Hi Xintong,
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> By compatibility discussion do you mean the
> "[DISCUSS]
> > > > > >>> FLIP-321:
> > > > > >>>>>>>>>>>> Introduce
> > > > > >>>>>>>>>>>>> an API deprecation process" thread [1]?
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> I am also curious to know if the rationale behind
> this
> > > new
> > > > > >> API
> > > > > >>>>> has
> > > > > >>>>>>>>>> been
> > > > > >>>>>>>>>>>>> previously discussed on the mailing list. Do we have
> a
> > > list
> > > > > >> of
> > > > > >>>>>>>>>>>> shortcomings
> > > > > >>>>>>>>>>>>> in the current DataStream API that it tries to
> resolve?
> > > How
> > > > > >>> does
> > > > > >>>>>> the
> > > > > >>>>>>>>>>>>> current ProcessFunction functionality fit into the
> > > picture?
> > > > > >>>>> Will it
> > > > > >>>>>>>>>> be
> > > > > >>>>>>>>>>>> kept
> > > > > >>>>>>>>>>>>> as is or subsumed by new API?
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> [1]
> > > > > >>>>>>
> > > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > > >>>>>>>>>>>>> Best,
> > > > > >>>>>>>>>>>>> Alex
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > > > > >>>>> tonysong820@gmail.com>
> > > > > >>>>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > > > > >> headaches
> > > > > >>>>>>>>>>> because
> > > > > >>>>>>>>>>>>> it's
> > > > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it
> an
> > > > > >>> entirely
> > > > > >>>>>>>>>>>> separate
> > > > > >>>>>>>>>>>>>> API
> > > > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension
> of
> > > > > >>>>> DataStream.
> > > > > >>>>>>>>>>> How
> > > > > >>>>>>>>>>>>>> much
> > > > > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.;
> how
> > > does
> > > > > >>> it
> > > > > >>>>>>>>>>> relate
> > > > > >>>>>>>>>>>> to
> > > > > >>>>>>>>>>>>>> the
> > > > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API
> > uses
> > > > > >>>>>>>>>> underneath).
> > > > > >>>>>>>>>>>>>> I totally understand your confusion. We started
> > planning
> > > > > >> this
> > > > > >>>>>> after
> > > > > >>>>>>>>>>>>> kicking
> > > > > >>>>>>>>>>>>>> off the release 2.0, so there's still a lot to be
> > > explored
> > > > > >>> and
> > > > > >>>>> the
> > > > > >>>>>>>>>>> plan
> > > > > >>>>>>>>>>>>>> keeps changing.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>     - In the beginning, we planned to do an in-place
> > > > > >> refactor
> > > > > >>> of
> > > > > >>>>>>>>>>>>> DataStream
> > > > > >>>>>>>>>>>>>>     API, until the API migration period is proposed.
> > > > > >>>>>>>>>>>>>>     - Then we want to make it an entirely separate
> API
> > > to
> > > > > >>>>>>>>>> DataStream,
> > > > > >>>>>>>>>>>> and
> > > > > >>>>>>>>>>>>>>     listed as a must-have for release 2.0 so that we
> > can
> > > > > >>> remove
> > > > > >>>>>>>>>>>> DataStream
> > > > > >>>>>>>>>>>>>> once
> > > > > >>>>>>>>>>>>>>     it's ready.
> > > > > >>>>>>>>>>>>>>     - However, depending on the outcome of the API
> > > > > >>> compatibility
> > > > > >>>>>>>>>>>>> discussion
> > > > > >>>>>>>>>>>>>>     [1], we may not be able to remove DataStream in
> > 2.0
> > > > > >>> anyway,
> > > > > >>>>>>>>>> which
> > > > > >>>>>>>>>>>>> means
> > > > > >>>>>>>>>>>>>> we
> > > > > >>>>>>>>>>>>>>     might need to re-evaluate the necessity of this
> > > item for
> > > > > >>>>> 2.0.
> > > > > >>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility
> > > > > >> discussion
> > > > > >>>>> [1]
> > > > > >>>>>>>>>> and
> > > > > >>>>>>>>>>>>>> decide the priority for this item afterwards.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Best,
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Xintong
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> [1]
> > > > > >> https://lists.apache.org/list.html?dev@flink.apache.org
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > > > > >>>>>>>>>> chesnay@apache.org
> > > > > >>>>>>>>>>>>>> wrote:
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of
> items.
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> > > Management"
> > > > > >>>>> item
> > > > > >>>>>>>>>> is
> > > > > >>>>>>>>>>>>> marked
> > > > > >>>>>>>>>>>>>>> as a must-have; will it require changes that break
> > > > > >>> something?
> > > > > >>>>>>>>>> What
> > > > > >>>>>>>>>>>>>> prevents
> > > > > >>>>>>>>>>>>>>> it from being added in 2.1?
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make
> Java
> > 17
> > > > > >> the
> > > > > >>>>>>>>>>> default,
> > > > > >>>>>>>>>>>>> drop
> > > > > >>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a must-have
> > "Drop
> > > > > >> Java
> > > > > >>> 8"
> > > > > >>>>>>>>>> and
> > > > > >>>>>>>>>>> a
> > > > > >>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would
> hope
> > > that
> > > > > >>>>> this
> > > > > >>>>>>>>>>> would
> > > > > >>>>>>>>>>>>> be
> > > > > >>>>>>>>>>>>>>> an entirely internal change, and could thus be an
> > > > > >>> incremental
> > > > > >>>>>>>>>>> process
> > > > > >>>>>>>>>>>>>>> independent of major releases.
> > > > > >>>>>>>>>>>>>>> What is the actual scale of this item; how much are
> > we
> > > > > >>>>> actually
> > > > > >>>>>>>>>>>>>> re-writing?
> > > > > >>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this
> to
> > a
> > > > > >>>>>>>>>> must-have; I
> > > > > >>>>>>>>>>>>> think
> > > > > >>>>>>>>>>>>>>> I marked it down as nice-to-have only because it
> > > depends
> > > > > >> on
> > > > > >>>>>>>>>> another
> > > > > >>>>>>>>>>>>> item.
> > > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > > > > >> headaches
> > > > > >>>>>>>>>>> because
> > > > > >>>>>>>>>>>>> it's
> > > > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it
> an
> > > > > >>> entirely
> > > > > >>>>>>>>>>>> separate
> > > > > >>>>>>>>>>>>>> API
> > > > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension
> of
> > > > > >>>>> DataStream.
> > > > > >>>>>>>>>>> How
> > > > > >>>>>>>>>>>>>> much
> > > > > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.;
> how
> > > does
> > > > > >>> it
> > > > > >>>>>>>>>>> relate
> > > > > >>>>>>>>>>>> to
> > > > > >>>>>>>>>>>>>> the
> > > > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API
> > uses
> > > > > >>>>>>>>>> underneath).
> > > > > >>>>>>>>>>>>>>> There are a few items I added as ideas which don't
> > > have a
> > > > > >>>>>>>>>> priority
> > > > > >>>>>>>>>>>> yet;
> > > > > >>>>>>>>>>>>>>> would love to get some feedback on those.
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> Hi devs,
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> As previously discussed in [1], we had been
> > collecting
> > > > > >> work
> > > > > >>>>> item
> > > > > >>>>>>>>>>>>>> proposals
> > > > > >>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the wiki
> page
> > > [2].
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>>     - As we have passed the due date, I'd like to
> > > kindly
> > > > > >>> remind
> > > > > >>>>>>>>>>>> everyone
> > > > > >>>>>>>>>>>>>> *not
> > > > > >>>>>>>>>>>>>>>     to add / remove items directly on the wiki
> page*.
> > > If
> > > > > >>>>> needed,
> > > > > >>>>>>>>>>>> please
> > > > > >>>>>>>>>>>>>> post
> > > > > >>>>>>>>>>>>>>>     in this thread or reach out to the release
> > managers
> > > > > >>>>> instead.
> > > > > >>>>>>>>>>>>>>>     - I've reached out to some folks for
> > clarifications
> > > > > >> about
> > > > > >>>>>>>>>> their
> > > > > >>>>>>>>>>>>>>>     proposals. Some of them mentioned that they can
> > > not yet
> > > > > >>>>> tell
> > > > > >>>>>>>>>>>> whether
> > > > > >>>>>>>>>>>>>> we
> > > > > >>>>>>>>>>>>>>>     should do an item or not, and would need more
> > time
> > > /
> > > > > >>>>>>>>>> discussions
> > > > > >>>>>>>>>>>> to
> > > > > >>>>>>>>>>>>>> make
> > > > > >>>>>>>>>>>>>>>     the decision. So I added a new symbol for items
> > > whose
> > > > > >>>>>>>>>> priorities
> > > > > >>>>>>>>>>>> are
> > > > > >>>>>>>>>>>>>> `TBD`.
> > > > > >>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum
> set
> > > of
> > > > > >>>>>>>>>> must-have
> > > > > >>>>>>>>>>>>> items.
> > > > > >>>>>>>>>>>>>>> I've gone through the entire list of proposed
> items,
> > > and
> > > > > >>> found
> > > > > >>>>>>>>>> most
> > > > > >>>>>>>>>>>> of
> > > > > >>>>>>>>>>>>>> them
> > > > > >>>>>>>>>>>>>>> make a lot of sense. So I think an online sync
> > might
> > > not
> > > > > >>> be
> > > > > >>>>>>>>>>>> necessary
> > > > > >>>>>>>>>>>>>> for
> > > > > >>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread,
> where
> > > > > >>> everyone
> > > > > >>>>> can
> > > > > >>>>>>>>>>>>> comment
> > > > > >>>>>>>>>>>>>>> on how they think the list can be improved,
> followed
> > > by a
> > > > > >>>>> VOTE to
> > > > > >>>>>>>>>>>>>> formally
> > > > > >>>>>>>>>>>>>>> make the decision.
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> Any feedback and opinions, including but not
> limited
> > to
> > > > > >> the
> > > > > >>>>>>>>>>> following
> > > > > >>>>>>>>>>>>>>> aspects, will be appreciated.
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>>     - Important items that are missing from the
> list
> > > > > >>>>>>>>>>>>>>>     - Concerns regarding the listed items or their
> > > > > >> priorities
> > > > > >>>>>>>>>>>>>>> Looking forward to your feedback.
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> Best,
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> Xintong
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>>> [1]
> > > > > >>
> > > > >
> > >
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > > >>>>>>>>>>>>>>> [2]
> > > > > >>>>>>>>>>
> > > https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > > >>>>>>>>>>>>>>>
> > > > > >>>>>>>>>> --
> > > > > >>>>>>>>>> Best regards,
> > > > > >>>>>>>>>> Sergey
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>
> > > > > >>>>> --
> > > > > >>>>> Best
> > > > > >>>>>
> > > > > >>>>> ConradJam
> > > > > >>>>>
> > > > >
> > > > >
> > >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
@Zhu,
As you are downgrading "Clarify the scopes of configuration options" to
nice-to-have priority, could you also bring that up in the vote thread[1]?
I'm asking because there are people who already voted on the original list.
I think restarting the vote is probably overkill and unnecessary, but we
should at least bring this change to their attention.

@Matthias,
Thanks a lot for bringing this up. I wasn't aware of this early umbrella. I
haven't gone through everything in FLINK-3957 [2] yet. I'll do it asap.

Just quickly went through the 4 issues you mentioned.
- FLINK-4675 & FLINK-14068: I'd be +1 to deprecate them in 1.18, as long as
the new APIs that we want users to migrate to are ready. For these 2
tickets, I think the introduction of the updated APIs should be straightforward
and feasible for 1.18.
- FLINK-13926: I'm not sure about this one. The two mentioned classes
`ProcessingTimeSessionWindows` and `EventTimeSessionWindows` are not even
marked as Public or PublicEvolving APIs. Moreover, I don't see a good way
to smoothly replace the classes with a generic version.
- FLINK-5126: This is a bit unclear to me. From the description and
conversation on the ticket, I don't fully understand which concrete APIs
the ticket is referring to. Or maybe it refers to all / most of the APIs
that throw Exception / IOException in general. Moreover, I don't think
removing Exception / IOException from the API signature is a breaking
change. It requires no code changes on the caller side.

WDYT?

Best,

Xintong


[1] https://lists.apache.org/thread/r0y9syc6k5nmcxvnd0hj33htdpdj9k6m
[2] https://issues.apache.org/jira/browse/FLINK-3957

On Mon, Jul 10, 2023 at 10:53 PM Matthias Pohl
<ma...@aiven.io.invalid> wrote:

> I brought it up in the deprecating APIs in 1.18 thread [1] already but it
> feels misplaced there. I just wanted to ask whether someone did a pass over
> FLINK-3957 [2]. I came across it when going through the release 2.0 feature
> list [3] as part of the vote. I have the feeling that there are some valid
> action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6]) which do not
> seem to be listed in the 2.0 feature list [3], yet (or are included in some
> of the bigger items). The majority of the subtasks are probably covered by the
> DataSet removal, the Scala API removal and the ProcessFunction refactoring.
> Other subtasks (FLINK-14068 [7]) made it into the feature list.
>
> I haven't worked with the SDK code enough to judge whether
> the subtasks are still reasonable or actually obsolete. That is why I
> wanted to mention the Jira issue here once more.
>
> I don't consider it a blocker for the ongoing vote but was wondering
> whether it makes sense for someone who might have more experience in that
> field to add some of the subtasks to the feature list.
>
> Or shall we just consider it as "not interesting enough" because nobody
> added it in the first place to the 2.0 feature list [3]?
>
> Matthias
>
> [1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
> [2] https://issues.apache.org/jira/browse/FLINK-3957
> [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> [4] https://issues.apache.org/jira/browse/FLINK-4675
> [5] https://issues.apache.org/jira/browse/FLINK-5126
> [6] https://issues.apache.org/jira/browse/FLINK-13926
> [7] https://issues.apache.org/jira/browse/FLINK-14068
>
> On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:
>
> > Agreed that we should deprecate affected APIs as soon as possible.
> > But there is not much time before the feature freeze of 1.18,  hence
> > I'm a bit concerned that some of the deprecations might not be done in 1.18.
> >
> > We are currently looking into the improvements of the configuration
> layer.
> > Most of the proposed changes would require a public discussion, or even
> > a FLIP, which I think can hardly close before the feature freeze of 1.18.
> > And some of the APIs can be deprecated only after the corresponding new
> > APIs are developed. Therefore we previously targeted them for 1.19.
> >
> > We may review later to see what deprecation work can be done in 1.18 and
> > make it if possible. I think we can do the work even after the feature
> > freeze
> > date, if it is purely deprecation work (simply adding annotations).
> WDYT?
> >
> > I'm also changing the priority of "Clarify the scopes of configuration
> > options"
> > to nice-to-have. I think most of the work involves no breaking changes and
> can
> > be done in 1.x or 2.1+. For the breaking changes which might be needed,
> we
> > will consider it as part of the configuration layer rework.
> >
> > Thanks,
> > Zhu
> >
> > On Mon, Jul 10, 2023 at 7:58 PM Xintong Song <to...@gmail.com> wrote:
> > >
> > > >
> > > > At what point are the FLIP discussions coming into play?
> > >
> > > I keep wondering if these shouldn't have started already.
> > >
> > >
> > > I think this depends on the responsible contributor and reviewer of
> > > individual items. From my perspective, the FLIP discussions can start
> any
> > > time as long as the contributors are ready, the earlier the better.
> > >
> > >
> > > What we need to ensure is that all breaking API changes are
> > > > discussed/decided before 1.18 is released so we can deprecate
> affected
> > APIs.
> > > >
> > >
> > > The introduction of the migration period has brought the requirement to
> > > plan the removal of public APIs 2 minor releases ahead of the major
> > > release, which is TBH a bit unexpected. I agree it would be nice if we
> > can
> > > get the FLIPs ready by releasing 1.18. But I also don't think we should
> > > rush on it. If the deprecation of a Public API does not make 1.18, we
> may
> > > carry it until 3.0. Or if there are many Public APIs whose deprecation
> > does
> > > not make 1.18, we may deprecate them in 1.19 and postpone the major
> > version
> > > bump to after a 1.20 release. Moreover, as mentioned in FLIP-321[1],
> > > exceptions are discussable given that the migration period is newly
> > > proposed and we did not give developers the chance to plan things
> ahead.
> > To
> > > sum up, I'd say we try to identify APIs that need to be deprecated in 1.18
> > > with best efforts, and evaluate the remaining options (carrying the API
> > for
> > > the entire 2.x cycle, postponing 2.0, or making an exception)
> case-by-case.
> > > WDYT?
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >
> > > On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <ch...@apache.org>
> > wrote:
> > >
> > > > At what point are the FLIP discussions coming into play?
> > > >
> > > > I keep wondering if these shouldn't have started already.
> > > > It just seems that a lot of decisions are implicitly reliant on the
> > > > items even being accepted.
> > > > Estimates can only be provided if we actually know the scope of the
> > > > change, but that's not always clear from the description in the doc.
> > > >
> > > > What we need to ensure is that all breaking API changes are
> > > > discussed/decided before 1.18 is released so we can deprecate
> affected
> > > > APIs.
> > > >
> > > > On 10/07/2023 11:32, Xintong Song wrote:
> > > > > Hi Matthias,
> > > > >
> > > > > > The questions you asked are indeed very important. Here are some
> quick
> > > > > responses, based on the plans I had in mind, which I have not
> aligned
> > > > with
> > > > > other release managers yet.
> > > > >
> > > > > In the previous discussions between the RMs, we were not able to
> make
> > > > > proposals on things like how to make a time plan, how to manage the
> > > > release
> > > > > > branch, etc., due to the lack of inputs on e.g., the work items that
> need
> > to
> > > > be
> > > > > included (which transitively depends on the API compatibility to
> > provide
> > > > > between major versions) and the workloads / time needed for them.
> > With
> > > > the
> > > > > recent discussions, we have collected at least the majority of the
> > inputs
> > > > > needed.
> > > > >
> > > > > Here are things that I think we as the release managers would do
> next
> > > > > (again, not aligned with other release managers yet)
> > > > > > - Create a time plan by reaching out to people to understand the
> > > > > estimated workloads, prerequisites and ETA of each work item.
> > > > > - Make a proposal on how to manage the release branch, i.e., when
> to
> > cut
> > > > > the branch and whether to ship the milestone releases, etc.
> > > > > > - Set up regular release syncs (bi-weekly / monthly) to update the
> > status
> > > > > and draw attention to where help is needed.
> > > > >
> > > > > So back to your questions.
> > > > >
> > > > > There are still to-be-discussed items in the list of features.
> > What's the
> > > > >> plan with those?
> > > > > When collecting ETA, for items that the completion time cannot yet
> be
> > > > > estimated, we would like to have at least a time by which the
> > estimation
> > > > > can be made. I think the same applies to the to-be-discussed items.
> > And
> > > > if
> > > > > the items should be included as must-haves, we would need another
> > vote to
> > > > > adjust the must-have item list.
> > > > >
> > > > > Some of them don't have anyone assigned.
> > > > > > My concern is that they will be overlooked because nobody feels in
> > > > >> charge.
> > > > > This is a tricky one. For must-have items without assignees, we as
> > the
> > > > > release managers should be responsible for raising them up in the
> > release
> > > > > syncs, and try to find assignees for them. Hopefully, there will be
> > > > someone
> > > > > who stands out. But it is possible that for a must-have item nobody
> > wants
> > > > > to work on it. If that happens, which I don't think it will, it
> > probably
> > > > > means the item is not that critical and we may have to exclude it
> > from
> > > > the
> > > > > release. Either way, they should not be overlooked, because IMHO
> > release
> > > > > managers should be responsible for trying to get someone to work on
> > the
> > > > > un-assigned items.
> > > > >
> > > > > We'll have more discussions soon and keep the community updated.
> > > > >
> > > > > Best,
> > > > >
> > > > > Xintong
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> > > > > <ma...@aiven.io.invalid> wrote:
> > > > >
> > > > >> Now that the vote is started on the must-have items: There are
> still
> > > > >> to-be-discussed items in the list of features. What's the plan
> with
> > > > those?
> > > > >> Some of them don't have anyone assigned. Were these items
> discussed
> > > > among
> > > > >> the release managers? So far, it looks like they are handled as
> > > > >> nice-to-have if someone volunteers to pick them up?
> > > > >>
> > > > > >> My concern is that they will be overlooked because nobody feels in
> > > > >> charge.
> > > > >>
> > > > >> Best,
> > > > >> Matthias
> > > > >>
> > > > >> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <
> tonysong820@gmail.com
> > >
> > > > >> wrote:
> > > > >>
> > > > >>> Thanks all for the discussion.
> > > > >>>
> > > > >>> The wiki has been updated as discussed. I'm starting a vote now.
> > > > >>>
> > > > >>> Best,
> > > > >>>
> > > > >>> Xintong
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <
> tonysong820@gmail.com
> > >
> > > > >> wrote:
> > > > >>>> Hi ConradJam,
> > > > >>>>
> > > > >>>> I think Chesnay has already put his name as the Contributor for
> > the
> > > > two
> > > > >>>> tasks you listed. Maybe you can reach out to him to see if you
> can
> > > > >>>> collaborate on this.
> > > > >>>>
> > > > >>>> In general, I don't think contributing to a release 2.0 issue is
> > much
> > > > >>>> different from contributing to a regular issue. We haven't yet
> > created
> > > > >>> JIRA
> > > > > >>>> tickets for all the listed tasks because many of them need
> > further
> > > > >>>> discussions and / or FLIPs to decide whether and how they should
> > be
> > > > >>>> performed.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>>
> > > > >>>> Xintong
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com>
> > > > wrote:
> > > > >>>>
> > > > >>>>> Hi Community:
> > > > >>>>>    I see some tasks in the 2.0 list that haven't been assigned
> > yet. I
> > > > >>> want
> > > > >>>>> to take the initiative to take on some tasks that I can
> > complete. How
> > > > >>> do I
> > > > >>>>> apply to the community for this part of the task? I am
> > interested in
> > > > >> the
> > > > >>>>> following parts of FLINK-32377
> > > > >>>>> <https://issues.apache.org/jira/browse/FLINK-32377>, do I need
> > to
> > > > >>> create
> > > > > >>>>> issues myself and assign them to myself?
> > > > >>>>>
> > > > >>>>> - the current timestamp, which is problematic w.r.t. caching
> and
> > > > >>> testing,
> > > > >>>>> while providing no value.
> > > > >>>>> - Remove JarRequestBody#programArgs in favor of
> #programArgsList.
> > > > >>>>>
> > > > >>>>> [1] FLINK-32377 <
> > https://issues.apache.org/jira/browse/FLINK-32377>
> > > > >>>>> https://issues.apache.org/jira/browse/FLINK-32377
> > > > >>>>>
> > > > > >>>>> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <li...@amazon.co.uk.invalid>
> > > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> Thanks Xintong for driving the effort.
> > > > >>>>>>
> > > > >>>>>> I’d add a +1 to reworking configs, as suggested by @Jark and
> > > > >> @Chesnay,
> > > > >>>>>> especially the types. We have various configs that encode
> Time /
> > > > >>>>> MemorySize
> > > > >>>>>> that are Long instead!
> > > > >>>>>>
> > > > >>>>>> Regards,
> > > > >>>>>> Hong
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com>
> > > > >> wrote:
> > > > >>>>>>> CAUTION: This email originated from outside of the
> > organization.
> > > > >> Do
> > > > >>>>> not
> > > > >>>>>> click links or open attachments unless you can confirm the
> > sender
> > > > >> and
> > > > >>>>> know
> > > > >>>>>> the content is safe.
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Thanks for driving this effort, Xintong!
> > > > >>>>>>>
> > > > >>>>>>> To Chesnay
> > > > >>>>>>>> I'm curious as to why the "Disaggregated State Management"
> > item
> > > > >> is
> > > > >>>>>>>> marked as a must-have; will it require changes that break
> > > > >>> something?
> > > > >>>>>>>> What prevents it from being added in 2.1?
> > > > >>>>>>> As to "Disaggregated State Management".
> > > > >>>>>>>
> > > > >>>>>>> We plan to provide a new type of state backend to support DFS
> > as
> > > > >>>>> primary
> > > > >>>>>>> storage.
> > > > > >>>>>>> To achieve this, we at least need to include two sets of changes
> > > > >>> (not
> > > > >>>>>>> entirely sure yet, since we are still in the designing and
> > > > >> prototype
> > > > >>>>>> phase)
> > > > >>>>>>> 1. Statebackend Change
> > > > >>>>>>> 2. State Access Change
> > > > >>>>>>>
> > > > >>>>>>> Not all of the interfaces related are `@Internal`. Some of
> the
> > > > >>>>> interfaces
> > > > > >>>>>>> like `StateBackend` are `@PublicEvolving`
> > > > >>>>>>> So, you are right in the sense that "Disaggregated State
> > > > >> Management"
> > > > >>>>>> itself
> > > > >>>>>>> probably does not need to be a "Must Have"
> > > > >>>>>>>
> > > > >>>>>>> But I was hoping changes that related to public APIs can be
> > > > >>> finalized
> > > > >>>>> and
> > > > >>>>>>> merged in Flink 2.0 (I will fix the wiki accordingly).
> > > > >>>>>>>
> > > > >>>>>>> I also agree with Jark that 2.0 is a good chance to rework
> the
> > > > >>> default
> > > > >>>>>>> value of configurations.
> > > > >>>>>>>
> > > > >>>>>>> Best
> > > > >>>>>>> Yuan
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> > > > >>> chesnay@apache.org>
> > > > >>>>>> wrote:
> > > > >>>>>>>> Something else configuration-related is that there are a
> > bunch of
> > > > >>>>>>>> options where the type isn't quite correct (e.g., a String
> > where
> > > > >> it
> > > > >>>>>>>> could be an enum, a string where it should be an int or
> > > > >> something).
> > > > >>>>>>>> Could do a pass over those as well.
> > > > >>>>>>>>
> > > > >>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> > > > >>>>>>>>> Hi,
> > > > >>>>>>>>>
> > > > > >>>>>>>>> I think one more thing we need to consider doing in 2.0 is
> > > > >>> changing
> > > > >>>>> the
> > > > >>>>>>>>> default value of configuration to improve out-of-box user
> > > > >>>>> experience.
> > > > >>>>>>>>> Currently, in order to run a Flink job, users may need to
> set
> > > > >>>>>>>>> a bunch of configurations, such as minibatch, checkpoint
> > > > >> interval,
> > > > >>>>>>>>> exactly-once,
> > > > >>>>>>>>> incremental-checkpoint, etc. It's very verbose and hard to
> > use
> > > > >> for
> > > > >>>>>>>>> beginners.
> > > > >>>>>>>>> Most of them can have a universally applicable value.
> > Because
> > > > >>>>> changing
> > > > >>>>>>>> the
> > > > > >>>>>>>>> default value is a breaking change, I think it's worth
> > > > >> considering
> > > > >>>>>>>> changing
> > > > >>>>>>>>> them in 2.0.
> > > > >>>>>>>>>
> > > > >>>>>>>>> What do you think?
> > > > >>>>>>>>>
> > > > >>>>>>>>> Best,
> > > > >>>>>>>>> Jark
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> > > > >>> snuyanzin@gmail.com>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>>> Hi Chesnay
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope
> that
> > > > >> this
> > > > >>>>> would
> > > > >>>>>>>> be
> > > > >>>>>>>>>>> an entirely internal change, and could thus be an
> > incremental
> > > > >>>>> process
> > > > >>>>>>>>>>> independent of major releases.
> > > > >>>>>>>>>>> What is the actual scale of this item; how much are we
> > > > >> actually
> > > > >>>>>>>>>> re-writing?
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thanks for asking
> > > > > >>>>>>>>>> Yes, you're right, that should be an internal change.
> > > > >>>>>>>>>> Yeah I was also thinking about incremental change (rule by
> > rule
> > > > >>> or
> > > > > >>>>>>>>>> reasonably small group of rules).
> > > > > >>>>>>>>>> And yes, this could be an independent (of major releases)
> > > > >> activity
> > > > >>>>>>>>>> The problem is actually for children of RelOptRule.
> > > > >>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
> > mentioned
> > > > >>>>>> deprecated
> > > > >>>>>>>>>> api.
> > > > >>>>>>>>>> There are also children of ConverterRule (50+) which do
> not
> > > > >> have
> > > > >>>>> such
> > > > >>>>>>>>>> issues.
> > > > >>>>>>>>>> Maybe it could be considered as the next step to have all
> > the
> > > > >>>>> rules in
> > > > >>>>>>>>>> Java.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > > > >>>>> tonysong820@gmail.com>
> > > > >>>>>>>>>> wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> Hi Alex & Gyula,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > > > >> FLIP-321:
> > > > >>>>>>>>>> Introduce
> > > > >>>>>>>>>>>> an API deprecation process" thread [1]?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I
> > pasted
> > > > >>> the
> > > > >>>>>> wrong
> > > > >>>>>>>>>> url
> > > > >>>>>>>>>>> in my previous email. Sorry for the mistake.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> I am also curious to know if the rationale behind this
> new
> > API
> > > > >>> has
> > > > >>>>>> been
> > > > >>>>>>>>>>>> previously discussed on the mailing list. Do we have a
> > list
> > > > >> of
> > > > >>>>>>>>>>> shortcomings
> > > > >>>>>>>>>>>> in the current DataStream API that it tries to resolve?
> > How
> > > > >>> does
> > > > >>>>> the
> > > > >>>>>>>>>>>> current ProcessFunction functionality fit into the
> > picture?
> > > > >>> Will
> > > > >>>>> it
> > > > >>>>>> be
> > > > >>>>>>>>>>> kept
> > > > >>>>>>>>>>>> as is or subsumed by new API?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> I don't think we should create a replacement for the
> > > > >> DataStream
> > > > >>>>> API
> > > > >>>>>>>>>> unless
> > > > >>>>>>>>>>>> we have a very good reason to do so and with a proper
> > > > >>> discussion
> > > > >>>>>> about
> > > > >>>>>>>>>>> this
> > > > >>>>>>>>>>>> as Alex said.
> > > > > >>>>>>>>>>> The ProcessFunction API, which aims to replace the DataStream API, is
> > > > >>>>>>>>>>> still a proposal, not a decision. Sorry for the
> confusion,
> > I
> > > > >>>>> should
> > > > >>>>>>>> have
> > > > >>>>>>>>>>> been more careful with my words, not giving the
> impression
> > > > >> that
> > > > >>>>> this
> > > > >>>>>> is
> > > > >>>>>>>>>>> something we'll do anyway.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> There will be a FLIP describing the motivations and
> > designs in
> > > > >>>>>> detail,
> > > > >>>>>>>>>> for
> > > > >>>>>>>>>>> the community to discuss and vote on. We are still
> working
> > on
> > > > >>> it.
> > > > >>>>>> TBH,
> > > > >>>>>>>>>> this
> > > > >>>>>>>>>>> is not trivial and we would need more time on it.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Just to quickly share some backgrounds:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>     - We see quite some problems with the current
> > DataStream
> > > > >> APIs
> > > > >>>>>>>>>>>        - Users are working with concrete classes rather
> > than
> > > > >>>>>>>> interfaces,
> > > > >>>>>>>>>>>        which means
> > > > >>>>>>>>>>>        - Users can access methods that are designed to be
> > used
> > > > >> by
> > > > >>>>>>>> internal
> > > > >>>>>>>>>>>           classes, even though they are annotated with
> > > > >>> `@Internal`.
> > > > >>>>>>>> E.g.,
> > > > >>>>>>>>>>>           `DataStream#getTransformation`.
> > > > >>>>>>>>>>>           - Changes to the non-API implementations (e.g.,
> > > > >>>>>>>>>> `Transformation`)
> > > > >>>>>>>>>>>           would affect the API classes (e.g.,
> > `DataStream`),
> > > > >>> which
> > > > >>>>>>>>>>> makes it hard to
> > > > >>>>>>>>>>>           provide binary compatibility.
> > > > >>>>>>>>>>>        - Internal classes are used as parameter /
> > return-value
> > > > >> of
> > > > >>>>>>>> public
> > > > >>>>>>>>>>>        APIs. E.g., while `AbstractStreamOperator` is
> > > > >>>>> PublicEvolving,
> > > > > >>>>>>>>>>>        `StreamTask`, which is returned from
> > > > > >>>>>>>>>>>        `AbstractStreamOperator#getContainingTask`, is Internal.
> > > > >>>>>>>>>>>        - In many cases, users are asked to extend the API
> > > > >>> classes,
> > > > >>>>>>>> rather
> > > > >>>>>>>>>>>        than implementing interfaces. E.g.,
> > > > >>>>> `AbstractStreamOperator`.
> > > > >>>>>>>>>>>           - Any changes to the base classes, even the
> > internal
> > > > >>>>> part,
> > > > >>>>>>>> may
> > > > >>>>>>>>>>>           affect the behavior of the user-provided
> > sub-classes
> > > > >>>>>>>>>>>           - Users can override the behavior of the base
> > classes
> > > > >>>>>>>>>>>        - The API module `flink-streaming-java` contains
> > non-API
> > > > >>>>>>>> classes,
> > > > >>>>>>>>>> and
> > > > >>>>>>>>>>>        depends on internal modules such as
> `flink-runtime`,
> > > > >> which
> > > > >>>>>> means
> > > > >>>>>>>>>>>        - Changes to the internal modules may affect the
> API
> > > > >>>>> modules,
> > > > >>>>>>>> which
> > > > >>>>>>>>>>>           requires users to re-build their applications
> > upon
> > > > >>>>> upgrading
> > > > > >>>>>>>>>>>           - The artifact users need for building their application is
> > > > > >>>>>>>>>>>           larger than necessary.
> > > > >>>>>>>>>>>        - We probably should not expose operators (e.g.,
> > > > >>>>>>>>>>>        `AbstractStreamOperator`) to users. Functions
> > should be
> > > > >>>>> enough
> > > > >>>>>>>>>>> for users to
> > > > > >>>>>>>>>>>        define their data processing logic. Exposing
> > > > >>> operator-level
> > > > >>>>>>>>>> concepts
> > > > >>>>>>>>>>>        (e.g., mailbox thread model, checkpoint barrier
> > > > >> alignment,
> > > > >>>>>>>> etc.) is
> > > > >>>>>>>>>>>        unnecessary and limits the improvement regarding
> > such
> > > > >>>>> exposed
> > > > >>>>>>>>>>> mechanisms
> > > > >>>>>>>>>>>        with compatibility considerations.
> > > > >>>>>>>>>>>        - The current DataStream API seems to be a mixture
> > of
> > > > >> many
> > > > >>>>>>>> things,
> > > > >>>>>>>>>>>        making it hard to understand especially for
> > newcomers.
> > > > >> It
> > > > >>>>> might
> > > > >>>>>>>> be
> > > > >>>>>>>>>>> better
> > > > > >>>>>>>>>>>        to re-organize it into several parts (the taxonomy below is just
> > > > > >>>>>>>>>>>        an example; we are still working on this):
> > > > >>>>>>>>>>>           - The most fundamental stateful stream
> > processing:
> > > > >>>>> streams,
> > > > >>>>>>>>>>>           partitions / key, process functions, state,
> > > > >>>>> timeline-service
> > > > >>>>>>>>>>>           - An extension for common batch-streaming
> unified
> > > > >>>>> functions:
> > > > >>>>>>>>>> map,
> > > > >>>>>>>>>>>           flatmap, filter, agg, reduce, join, etc.
> > > > >>>>>>>>>>>           - An extension for windowing supports:  window,
> > > > >>>>> triggering
> > > > >>>>>>>>>>>           - An extension for event-time supports: event
> > time,
> > > > >>>>>> watermark
> > > > >>>>>>>>>>>           - The extensions are like short-cuts / sugars,
> > > > >> without
> > > > >>>>> which
> > > > >>>>>>>>>> users
> > > > >>>>>>>>>>>           can probably still achieve the same behavior by
> > > > >> working
> > > > >>>>> with
> > > > >>>>>>>> the
> > > > >>>>>>>>>>>           fundamental APIs, but would be a lot easier
> with
> > the
> > > > >>>>>>>> extensions
> > > > >>>>>>>>>>>        - The original plan was to do in-place refactors /
> > > > >> changes
> > > > >>>>> on
> > > > >>>>>>>>>>>     DataStream API. Some related items are listed in this
> > doc
> > > > >> [2]
> > > > >>>>>>>> attached
> > > > >>>>>>>>>>> to
> > > > >>>>>>>>>>>     the kicking off email [3]. Not all of the above
> issues
> > are
> > > > >>>>> listed,
> > > > >>>>>>>>>>> because
> > > > > >>>>>>>>>>>     we had not yet looked into this as deeply as we have now.
> > > > >>>>>>>>>>>     - We proposed this as a new API rather than in-place
> > > > >>> refactors
> > > > >>>>> in
> > > > >>>>>>>> the
> > > > >>>>>>>>>>>     2.0 work item list, because we realized the changes
> > might
> > > > >> be
> > > > >>>>> too
> > > > >>>>>>>> big
> > > > >>>>>>>>>>> for an
> > > > >>>>>>>>>>>     in-place change. First having a new API then
> gradually
> > > > >>> retiring
> > > > >>>>>> the
> > > > >>>>>>>>>> old
> > > > >>>>>>>>>>> one
> > > > >>>>>>>>>>>     would help users to smoothly migrate between them.
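
To make the first group of problems in the list above concrete (users working with concrete classes and reaching internal methods), here is a minimal Java sketch of the alternative: user code sees only an interface, while the implementation and its internal accessors stay behind it. All names (`SketchStream`, `SketchStreamImpl`, `SketchStreams`, `internalElementCount`) are hypothetical and are neither Flink classes nor part of the proposed API; this only illustrates the idea and is not the FLIP's design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** API surface (hypothetical): the only type user code compiles against. */
interface SketchStream<T> {
    <R> SketchStream<R> map(Function<T, R> fn);

    List<T> collect();
}

/** Internal implementation (hypothetical); it would live in a non-API module. */
final class SketchStreamImpl<T> implements SketchStream<T> {
    private final List<T> elements;

    SketchStreamImpl(List<T> elements) {
        this.elements = new ArrayList<>(elements);
    }

    @Override
    public <R> SketchStream<R> map(Function<T, R> fn) {
        List<R> out = new ArrayList<>();
        for (T element : elements) {
            out.add(fn.apply(element));
        }
        return new SketchStreamImpl<>(out);
    }

    @Override
    public List<T> collect() {
        return new ArrayList<>(elements);
    }

    /** Internal-only accessor: not declared on SketchStream, so API users cannot call it. */
    int internalElementCount() {
        return elements.size();
    }
}

/** Entry point (hypothetical) that hands out the interface type only. */
final class SketchStreams {
    private SketchStreams() {}

    static <T> SketchStream<T> of(List<T> elements) {
        return new SketchStreamImpl<>(elements);
    }
}

final class ApiSeparationSketch {
    public static void main(String[] args) {
        SketchStream<Integer> doubled = SketchStreams.of(Arrays.asList(1, 2, 3)).map(x -> x * 2);
        System.out.println(doubled.collect());
        // doubled.internalElementCount() would not compile: the static type is the interface.
    }
}
```

With this shape, changes to `SketchStreamImpl` (including its internal accessor) cannot break code that was compiled against `SketchStream` alone, which is the binary-compatibility concern raised in the list.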
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> A thorough discussion is definitely needed once the FLIP
> is
> > > > >> out.
> > > > >>>>> And
> > > > >>>>>> of
> > > > >>>>>>>>>>> course it's possible that the FLIP might be rejected.
> Given
> > > > >> that
> > > > >>>>> we
> > > > >>>>>> are
> > > > >>>>>>>>>>> planning for release 2.0, I just feel it would be better
> to
> > > > >>> bring
> > > > >>>>>> this
> > > > >>>>>>>> up
> > > > > >>>>>>>>>>> early, even if the concrete plan is not yet ready.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Best,
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Xintong
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> [1]
> > > > >>>>>
> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > >>>>>>>>>>> [2]
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>
> > > >
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > > >>>>>>>>>>> [3]
> > > > >>>>>
> https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > > >>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
> > gyfora@apache.org
> > > > >>>>>> wrote:
> > > > >>>>>>>>>>>> Hey!
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I share the same concerns mentioned above regarding the
> > > > >>>>>>>>>> "ProcessFunction
> > > > >>>>>>>>>>>> API".
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I don't think we should create a replacement for the
> > > > >> DataStream
> > > > >>>>> API
> > > > >>>>>>>>>>> unless
> > > > >>>>>>>>>>>> we have a very good reason to do so and with a proper
> > > > >>> discussion
> > > > >>>>>> about
> > > > >>>>>>>>>>> this
> > > > >>>>>>>>>>>> as Alex said.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Cheers,
> > > > >>>>>>>>>>>> Gyula
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > > > >>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Hi Xintong,
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > > > >>> FLIP-321:
> > > > >>>>>>>>>>>> Introduce
> > > > >>>>>>>>>>>>> an API deprecation process" thread [1]?
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> I am also curious to know if the rationale behind this
> > new
> > > > >> API
> > > > >>>>> has
> > > > >>>>>>>>>> been
> > > > >>>>>>>>>>>>> previously discussed on the mailing list. Do we have a
> > list
> > > > >> of
> > > > >>>>>>>>>>>> shortcomings
> > > > >>>>>>>>>>>>> in the current DataStream API that it tries to resolve?
> > How
> > > > >>> does
> > > > >>>>>> the
> > > > >>>>>>>>>>>>> current ProcessFunction functionality fit into the
> > picture?
> > > > >>>>> Will it
> > > > >>>>>>>>>> be
> > > > >>>>>>>>>>>> kept
> > > > >>>>>>>>>>>>> as is or subsumed by new API?
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> [1]
> > > > >>>>>>
> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > > >>>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>> Alex
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > > > >>>>> tonysong820@gmail.com>
> > > > >>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > > > >> headaches
> > > > >>>>>>>>>>> because
> > > > >>>>>>>>>>>>> it's
> > > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
> > > > >>> entirely
> > > > >>>>>>>>>>>> separate
> > > > >>>>>>>>>>>>>> API
> > > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > > > >>>>> DataStream.
> > > > >>>>>>>>>>> How
> > > > >>>>>>>>>>>>>> much
> > > > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how
> > does
> > > > >>> it
> > > > >>>>>>>>>>> relate
> > > > >>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API
> uses
> > > > >>>>>>>>>> underneath).
> > > > >>>>>>>>>>>>>> I totally understand your confusion. We started
> planning
> > > > >> this
> > > > >>>>>> after
> > > > >>>>>>>>>>>>> kicking
> > > > >>>>>>>>>>>>>> off the release 2.0, so there's still a lot to be
> > explored
> > > > >>> and
> > > > >>>>> the
> > > > >>>>>>>>>>> plan
> > > > >>>>>>>>>>>>>> keeps changing.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>     - In the beginning, we planned to do an in-place
> > > > >> refactor
> > > > >>> of
> > > > >>>>>>>>>>>>> DataStream
> > > > >>>>>>>>>>>>>>     API, until the API migration period is proposed.
> > > > >>>>>>>>>>>>>>     - Then we want to make it an entirely separate API
> > to
> > > > >>>>>>>>>> DataStream,
> > > > >>>>>>>>>>>> and
> > > > >>>>>>>>>>>>>>     listed as a must-have for release 2.0 so that we
> can
> > > > >>> remove
> > > > >>>>>>>>>>>> DataStream
> > > > >>>>>>>>>>>>>> once
> > > > >>>>>>>>>>>>>>     it's ready.
> > > > >>>>>>>>>>>>>>     - However, depending on the outcome of the API
> > > > >>> compatibility
> > > > >>>>>>>>>>>>> discussion
> > > > >>>>>>>>>>>>>>     [1], we may not be able to remove DataStream in
> 2.0
> > > > >>> anyway,
> > > > >>>>>>>>>> which
> > > > >>>>>>>>>>>>> means
> > > > >>>>>>>>>>>>>> we
> > > > >>>>>>>>>>>>>>     might need to re-evaluate the necessity of this
> > item for
> > > > >>>>> 2.0.
> > > > >>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility
> > > > >> discussion
> > > > >>>>> [1]
> > > > >>>>>>>>>> and
> > > > >>>>>>>>>>>>>> decide the priority for this item afterwards.
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Xintong
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> [1]
> > > > >> https://lists.apache.org/list.html?dev@flink.apache.org
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > > > >>>>>>>>>> chesnay@apache.org
> > > > >>>>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of items.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> > Management"
> > > > >>>>> item
> > > > >>>>>>>>>> is
> > > > >>>>>>>>>>>>> marked
> > > > >>>>>>>>>>>>>>> as a must-have; will it require changes that break
> > > > >>> something?
> > > > >>>>>>>>>> What
> > > > >>>>>>>>>>>>>> prevents
> > > > >>>>>>>>>>>>>>> it from being added in 2.1?
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make Java
> 17
> > > > >> the
> > > > >>>>>>>>>>> default,
> > > > >>>>>>>>>>>>> drop
> > > > >>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a must-have
> "Drop
> > > > >> Java
> > > > >>> 8"
> > > > >>>>>>>>>> and
> > > > >>>>>>>>>>> a
> > > > >>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope
> > that
> > > > >>>>> this
> > > > >>>>>>>>>>> would
> > > > >>>>>>>>>>>>> be
> > > > >>>>>>>>>>>>>>> an entirely internal change, and could thus be an
> > > > >>> incremental
> > > > >>>>>>>>>>> process
> > > > >>>>>>>>>>>>>>> independent of major releases.
> > > > >>>>>>>>>>>>>>> What is the actual scale of this item; how much are
> we
> > > > >>>>> actually
> > > > >>>>>>>>>>>>>> re-writing?
> > > > >>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to
> a
> > > > >>>>>>>>>> must-have; i
> > > > >>>>>>>>>>>>> think
> > > > >>>>>>>>>>>>>>> I marked it down as nice-to-have only because it
> > depends
> > > > >> on
> > > > >>>>>>>>>> another
> > > > >>>>>>>>>>>>> item.
> > > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > > > >> headaches
> > > > >>>>>>>>>>> because
> > > > >>>>>>>>>>>>> it's
> > > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
> > > > >>> entirely
> > > > >>>>>>>>>>>> separate
> > > > >>>>>>>>>>>>>> API
> > > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > > > >>>>> DataStream.
> > > > >>>>>>>>>>> How
> > > > >>>>>>>>>>>>>> much
> > > > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how
> > does
> > > > >>> it
> > > > >>>>>>>>>>> relate
> > > > >>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>> the
> > > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API
> uses
> > > > >>>>>>>>>> underneath).
> > > > >>>>>>>>>>>>>>> There are a few items I added as ideas which don't
> > have a
> > > > >>>>>>>>>> priority
> > > > >>>>>>>>>>>> yet;
> > > > >>>>>>>>>>>>>>> would love to get some feedback on those.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Hi devs,
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> As previously discussed in [1], we had been
> collecting
> > > > >> work
> > > > >>>>> item
> > > > >>>>>>>>>>>>>> proposals
> > > > >>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the wiki page
> > [2].
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>     - As we have passed the due date, I'd like to
> > kindly
> > > > >>> remind
> > > > >>>>>>>>>>>> everyone
> > > > >>>>>>>>>>>>>> *not
> > > > >>>>>>>>>>>>>>>     to add / remove items directly on the wiki page*.
> > If
> > > > >>>>> needed,
> > > > >>>>>>>>>>>> please
> > > > >>>>>>>>>>>>>> post
> > > > >>>>>>>>>>>>>>>     in this thread or reach out to the release
> managers
> > > > >>>>> instead.
> > > > >>>>>>>>>>>>>>>     - I've reached out to some folks for
> clarifications
> > > > >> about
> > > > >>>>>>>>>> their
> > > > >>>>>>>>>>>>>>>     proposals. Some of them mentioned that they can
> > not yet
> > > > >>>>> tell
> > > > >>>>>>>>>>>> whether
> > > > >>>>>>>>>>>>>> we
> > > > >>>>>>>>>>>>>>>     should do an item or not, and would need more
> time
> > /
> > > > >>>>>>>>>> discussions
> > > > >>>>>>>>>>>> to
> > > > >>>>>>>>>>>>>> make
> > > > >>>>>>>>>>>>>>>     the decision. So I added a new symbol for items
> > whose
> > > > >>>>>>>>>> priorities
> > > > >>>>>>>>>>>> are
> > > > >>>>>>>>>>>>>> `TBD`.
> > > > >>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum set
> > of
> > > > >>>>>>>>>> must-have
> > > > >>>>>>>>>>>>> items.
> > > > >>>>>>>>>>>>>>> I've gone through the entire list of proposed items,
> > and
> > > > >>> found
> > > > >>>>>>>>>> most
> > > > >>>>>>>>>>>> of
> > > > >>>>>>>>>>>>>> them
> > > > >>>>>>>>>>>>>>> make quite much sense. So I think an online sync
> might
> > not
> > > > >>> be
> > > > >>>>>>>>>>>> necessary
> > > > >>>>>>>>>>>>>> for
> > > > >>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread, where
> > > > >>> everyone
> > > > >>>>> can
> > > > >>>>>>>>>>>>> comment
> > > > >>>>>>>>>>>>>>> on how they think the list can be improved, followed
> > by a
> > > > >>>>> VOTE to
> > > > >>>>>>>>>>>>>> formally
> > > > >>>>>>>>>>>>>>> make the decision.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Any feedback and opinions, including but not limited
> to
> > > > >> the
> > > > >>>>>>>>>>> following
> > > > >>>>>>>>>>>>>>> aspects, will be appreciated.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>     - Important items that are missing from the list
> > > > >>>>>>>>>>>>>>>     - Concerns regarding the listed items or their
> > > > >> priorities
> > > > >>>>>>>>>>>>>>> Looking forward to your feedback.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Xintong
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> [1]
> > > > >>
> > > >
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > >>>>>>>>>>>>>>> [2]
> > > > >>>>>>>>>>
> > https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>> --
> > > > >>>>>>>>>> Best regards,
> > > > >>>>>>>>>> Sergey
> > > > >>>>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>
> > > > >>>>> --
> > > > >>>>> Best
> > > > >>>>>
> > > > >>>>> ConradJam
> > > > >>>>>
> > > >
> > > >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Matthias Pohl <ma...@aiven.io.INVALID>.
I already brought this up in the "deprecating APIs in 1.18" thread [1], but
it feels misplaced there. I just wanted to ask whether someone did a pass over
FLINK-3957 [2]. I came across it when going through the release 2.0 feature
list [3] as part of the vote. I have the feeling that there are some valid
action items (e.g. FLINK-4675, FLINK-5126, FLINK-13926 [4-6]) which do not
seem to be listed in the 2.0 feature list [3] yet (or are included in some
of the bigger items). The majority of the subtasks are probably covered by
the DataSet removal, the Scala API removal and the ProcessFunction
refactoring. Other subtasks (FLINK-14068 [7]) made it into the feature list.

I haven't worked with the SDK code enough to judge whether the subtasks are
still reasonable or actually obsolete. That is why I wanted to mention the
Jira issue here once more.

I don't consider it a blocker for the ongoing vote but was wondering
whether it makes sense for someone who might have more experience in that
field to add some of the subtasks to the feature list.

Or shall we just consider it as "not interesting enough" because nobody
added it in the first place to the 2.0 feature list [3]?

Matthias

[1] https://lists.apache.org/thread/3dw4f8frlg8hzlv324ql7n2755bzs9hy
[2] https://issues.apache.org/jira/browse/FLINK-3957
[3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
[4] https://issues.apache.org/jira/browse/FLINK-4675
[5] https://issues.apache.org/jira/browse/FLINK-5126
[6] https://issues.apache.org/jira/browse/FLINK-13926
[7] https://issues.apache.org/jira/browse/FLINK-14068

On Mon, Jul 10, 2023 at 3:17 PM Zhu Zhu <re...@gmail.com> wrote:

> Agreed that we should deprecate affected APIs as soon as possible.
> But there is not much time before the feature freeze of 1.18, hence
> I'm a bit concerned that some of the deprecations might not be done in 1.18.
>
> We are currently looking into the improvements of the configuration layer.
> Most of the proposed changes would require a public discussion, or even
> a FLIP, which I think can hardly close before the feature freeze of 1.18.
> And some of the APIs can be deprecated only after the corresponding new
> APIs are developed. Therefore we previously targeted them for 1.19.
>
> We may review later to see what deprecation work can be done in 1.18 and
> make it if possible. I think we can do the work even after the feature
> freeze date, if it is purely deprecation work (simply adding annotations).
> WDYT?
>
> I'm also changing the priority of "Clarify the scopes of configuration
> options" to nice-to-have. I think most of the work involves no breaking
> changes and can be done in 1.x or 2.1+. For the breaking changes which might
> be needed, we will consider them as part of the configuration layer rework.
>
> Thanks,
> Zhu
>
> Xintong Song <to...@gmail.com> wrote on Mon, Jul 10, 2023 at 19:58:
> >
> > >
> > > At what point are the FLIP discussions coming into play?
> >
> > I keep wondering if these shouldn't have started already.
> >
> >
> > I think this depends on the responsible contributor and reviewer of
> > individual items. From my perspective, the FLIP discussions can start any
> > time as long as the contributors are ready, the earlier the better.
> >
> >
> > What we need to ensure is that all breaking API changes are
> > > discussed/decided before 1.18 is released so we can deprecate affected
> APIs.
> > >
> >
> > The introduction of the migration period has brought the requirement to
> > plan the removal of public APIs 2 minor releases ahead of the major
> > release, which is TBH a bit unexpected. I agree it would be nice if we
> can
> > get the FLIPs ready by releasing 1.18. But I also don't think we should
> > rush on it. If the deprecation of a Public API does not make 1.18, we may
> > carry it until 3.0. Or if there are many Public APIs whose deprecation
> does
> > not make 1.18, we may deprecate them in 1.19 and postpone the major
> version
> > bump to after a 1.20 release. Moreover, as mentioned in FLIP-321[1],
> > exceptions are discussable given that the migration period is newly
> > proposed and we did not give developers the chance to plan things ahead.
> To
> > sum up, I'd say we try to identify APIs that need to be deprecated in 1.18
> > with best efforts, and evaluate the remaining options (carrying the API
> for
> > the entire 2.x cycle, postpone 2.0, or making an exception) case-by-case.
> > WDYT?
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >
> > On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <ch...@apache.org>
> wrote:
> >
> > > At what point are the FLIP discussions coming into play?
> > >
> > > I keep wondering if these shouldn't have started already.
> > > It just seems that a lot of decisions are implicitly reliant on the
> > > items even being accepted.
> > > Estimates can only be provided if we actually know the scope of the
> > > change, but that's not always clear from the description in the doc.
> > >
> > > What we need to ensure is that all breaking API changes are
> > > discussed/decided before 1.18 is released so we can deprecate affected
> > > APIs.
> > >
> > > On 10/07/2023 11:32, Xintong Song wrote:
> > > > Hi Matthias,
> > > >
> > > > The questions you asked are indeed very important. Here're some quick
> > > > responses, based on the plans I had in mind, which I have not aligned
> > > with
> > > > other release managers yet.
> > > >
> > > > In the previous discussions between the RMs, we were not able to make
> > > > proposals on things like how to make a time plan, how to manage the
> > > release
> > > > branch, etc., due to the lack of inputs on, e.g., the work items that
> > > > need to be
> > > > included (which transitively depends on the API compatibility to
> provide
> > > > between major versions) and the workloads / time needed for them.
> With
> > > the
> > > > recent discussions, we have collected at least the majority of the
> inputs
> > > > needed.
> > > >
> > > > Here are things that I think we as the release managers would do next
> > > > (again, not aligned with other release managers yet)
> > > > - Creating a time plan, by reaching out to people to understand the
> > > > estimated workloads, prerequisites and ETA of each work item.
> > > > - Make a proposal on how to manage the release branch, i.e., when to
> cut
> > > > the branch and whether to ship the milestone releases, etc.
> > > > - Set-up regular release syncs (bi-weekly / monthly) to update the
> status
> > > > and draw attention to where help is needed.
> > > >
> > > > So back to your questions.
> > > >
> > > > There are still to-be-discussed items in the list of features.
> What's the
> > > >> plan with those?
> > > > When collecting ETA, for items that the completion time cannot yet be
> > > > estimated, we would like to have at least a time by which the
> estimation
> > > > can be made. I think the same applies to the to-be-discussed items.
> And
> > > if
> > > > the items should be included as must-haves, we would need another
> vote to
> > > > adjust the must-have item list.
> > > >
> > > > Some of them don't have anyone assigned.
> > > > My concern is that they will be overlooked because nobody feels to
> be in
> > > >> charge.
> > > > This is a tricky one. For must-have items without assignees, we as
> the
> > > > release managers should be responsible for raising them up in the
> release
> > > > syncs, and try to find assignees for them. Hopefully, there will be
> > > someone
> > > > who stands out. But it is possible that for a must-have item nobody
> wants
> > > > to work on it. If that happens, which I don't think it will, it
> probably
> > > > means the item is not that critical and we may have to exclude it
> from
> > > the
> > > > release. Either way, they should not be overlooked, because IMHO
> release
> > > > managers should be responsible for trying to get someone to work on
> the
> > > > un-assigned items.
> > > >
> > > > We'll have more discussions soon and keep the community updated.
> > > >
> > > > Best,
> > > >
> > > > Xintong
> > > >
> > > >
> > > >
> > > > On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> > > > <ma...@aiven.io.invalid> wrote:
> > > >
> > > >> Now that the vote is started on the must-have items: There are still
> > > >> to-be-discussed items in the list of features. What's the plan with
> > > those?
> > > >> Some of them don't have anyone assigned. Were these items discussed
> > > among
> > > >> the release managers? So far, it looks like they are handled as
> > > >> nice-to-have if someone volunteers to pick them up?
> > > >>
> > > > >> My concern is that they will be overlooked because nobody feels
> > > > >> in charge.
> > > >>
> > > >> Best,
> > > >> Matthias
> > > >>
> > > >> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <tonysong820@gmail.com
> >
> > > >> wrote:
> > > >>
> > > >>> Thanks all for the discussion.
> > > >>>
> > > >>> The wiki has been updated as discussed. I'm starting a vote now.
> > > >>>
> > > >>> Best,
> > > >>>
> > > >>> Xintong
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <tonysong820@gmail.com
> >
> > > >> wrote:
> > > >>>> Hi ConradJam,
> > > >>>>
> > > >>>> I think Chesnay has already put his name as the Contributor for
> the
> > > two
> > > >>>> tasks you listed. Maybe you can reach out to him to see if you can
> > > >>>> collaborate on this.
> > > >>>>
> > > >>>> In general, I don't think contributing to a release 2.0 issue is
> much
> > > >>>> different from contributing to a regular issue. We haven't yet
> created
> > > >>> JIRA
> > > > >>>> tickets for all the listed tasks because many of them need further
> > > >>>> discussions and / or FLIPs to decide whether and how they should
> be
> > > >>>> performed.
> > > >>>>
> > > >>>> Best,
> > > >>>>
> > > >>>> Xintong
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com>
> > > wrote:
> > > >>>>
> > > >>>>> Hi Community:
> > > >>>>>    I see some tasks in the 2.0 list that haven't been assigned
> yet. I
> > > >>> want
> > > >>>>> to take the initiative to take on some tasks that I can
> complete. How
> > > >>> do I
> > > >>>>> apply to the community for this part of the task? I am
> interested in
> > > >> the
> > > >>>>> following parts of FLINK-32377
> > > >>>>> <https://issues.apache.org/jira/browse/FLINK-32377>, do I need
> to
> > > >>> create
> > > > >>>>> issue myself and point it to myself?
> > > >>>>>
> > > >>>>> - the current timestamp, which is problematic w.r.t. caching and
> > > >>> testing,
> > > >>>>> while providing no value.
> > > >>>>> - Remove JarRequestBody#programArgs in favor of #programArgsList.
> > > >>>>>
> > > >>>>> [1] FLINK-32377 <
> https://issues.apache.org/jira/browse/FLINK-32377>
> > > >>>>> https://issues.apache.org/jira/browse/FLINK-32377
> > > >>>>>
> > > > >>>>> Teoh, Hong <li...@amazon.co.uk.invalid> wrote on Fri, Jun 30, 2023 at 00:53:
> > > >>>>>
> > > >>>>>> Thanks Xintong for driving the effort.
> > > >>>>>>
> > > >>>>>> I’d add a +1 to reworking configs, as suggested by @Jark and
> > > >> @Chesnay,
> > > >>>>>> especially the types. We have various configs that encode Time /
> > > >>>>> MemorySize
> > > >>>>>> that are Long instead!
> > > >>>>>>
> > > >>>>>> Regards,
> > > >>>>>> Hong
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com>
> > > >> wrote:
> > > >>>>>>> CAUTION: This email originated from outside of the
> organization.
> > > >> Do
> > > >>>>> not
> > > >>>>>> click links or open attachments unless you can confirm the
> sender
> > > >> and
> > > >>>>> know
> > > >>>>>> the content is safe.
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Thanks for driving this effort, Xintong!
> > > >>>>>>>
> > > >>>>>>> To Chesnay
> > > >>>>>>>> I'm curious as to why the "Disaggregated State Management"
> item
> > > >> is
> > > >>>>>>>> marked as a must-have; will it require changes that break
> > > >>> something?
> > > >>>>>>>> What prevents it from being added in 2.1?
> > > >>>>>>> As to "Disaggregated State Management".
> > > >>>>>>>
> > > >>>>>>> We plan to provide a new type of state backend to support DFS
> as
> > > >>>>> primary
> > > >>>>>>> storage.
> > > > >>>>>>> To achieve this, we need at least two sets of changes (not entirely
> > > > >>>>>>> sure yet, since we are still in the design and prototype phase):
> > > >>>>>>> 1. Statebackend Change
> > > >>>>>>> 2. State Access Change
> > > >>>>>>>
> > > > >>>>>>> Not all of the related interfaces are `@Internal`. Some of them,
> > > > >>>>>>> like `StateBackend`, are `@PublicEvolving`.
> > > >>>>>>> So, you are right in the sense that "Disaggregated State
> > > >> Management"
> > > >>>>>> itself
> > > >>>>>>> probably does not need to be a "Must Have"
> > > >>>>>>>
> > > >>>>>>> But I was hoping changes that related to public APIs can be
> > > >>> finalized
> > > >>>>> and
> > > >>>>>>> merged in Flink 2.0 (I will fix the wiki accordingly).
> > > >>>>>>>
> > > >>>>>>> I also agree with Jark that 2.0 is a good chance to rework the
> > > >>> default
> > > >>>>>>> value of configurations.
> > > >>>>>>>
> > > >>>>>>> Best
> > > >>>>>>> Yuan
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> > > >>> chesnay@apache.org>
> > > >>>>>> wrote:
> > > >>>>>>>> Something else configuration-related is that there are a
> bunch of
> > > >>>>>>>> options where the type isn't quite correct (e.g., a String
> where
> > > >> it
> > > >>>>>>>> could be an enum, a string where it should be an int or
> > > >> something).
> > > >>>>>>>> Could do a pass over those as well.
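
As a rough illustration of the typing point above (a String where an enum would fit, a plain number where a unit-carrying type would fit), here is a small self-contained Java sketch. The class `TypedOptionsSketch`, the enum `Mode`, the keys and the accepted unit suffixes are all made up for this example; none of them are Flink classes, Flink option keys or Flink's parsing rules.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch; not a Flink class and not Flink's configuration API. */
final class TypedOptionsSketch {

    /** An enum instead of a free-form String rules out unsupported values. */
    enum Mode { EXACTLY_ONCE, AT_LEAST_ONCE }

    private final Map<String, String> raw = new HashMap<>();

    TypedOptionsSketch set(String key, String value) {
        raw.put(key, value);
        return this;
    }

    /** Loosely typed read: the caller has to know that the number means milliseconds. */
    long getLong(String key, long defaultValue) {
        String value = raw.get(key);
        return value == null ? defaultValue : Long.parseLong(value.trim());
    }

    /** Typed read: the unit is part of the value, e.g. "500 ms", "10 s", "5 min". */
    Duration getDuration(String key, Duration defaultValue) {
        String value = raw.get(key);
        if (value == null) {
            return defaultValue;
        }
        String[] parts = value.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected '<number> <unit>' but got: " + value);
        }
        long amount = Long.parseLong(parts[0]);
        switch (parts[1].toLowerCase(Locale.ROOT)) {
            case "ms":
                return Duration.ofMillis(amount);
            case "s":
                return Duration.ofSeconds(amount);
            case "min":
                return Duration.ofMinutes(amount);
            default:
                throw new IllegalArgumentException("Unknown unit: " + parts[1]);
        }
    }

    /** Typed read: unknown values fail with IllegalArgumentException from Enum.valueOf. */
    Mode getMode(String key, Mode defaultValue) {
        String value = raw.get(key);
        if (value == null) {
            return defaultValue;
        }
        return Mode.valueOf(value.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
    }

    public static void main(String[] args) {
        TypedOptionsSketch conf = new TypedOptionsSketch()
                .set("sketch.interval", "10 s")
                .set("sketch.mode", "exactly-once");
        System.out.println(conf.getDuration("sketch.interval", Duration.ZERO));
        System.out.println(conf.getMode("sketch.mode", Mode.AT_LEAST_ONCE));
    }
}
```

The difference shows up at the call site: `getLong` returns a bare number whose unit lives only in documentation, while `getDuration` and `getMode` reject malformed input early and return values that cannot be misread.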
> > > >>>>>>>>
> > > >>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> > > >>>>>>>>> Hi,
> > > >>>>>>>>>
> > > >>>>>>>>> I think one more thing we need to consider to do in 2.0 is
> > > >>> changing
> > > >>>>> the
> > > >>>>>>>>> default value of configuration to improve out-of-box user
> > > >>>>> experience.
> > > >>>>>>>>> Currently, in order to run a Flink job, users may need to set
> > > >>>>>>>>> a bunch of configurations, such as minibatch, checkpoint
> > > >> interval,
> > > >>>>>>>>> exactly-once,
> > > >>>>>>>>> incremental-checkpoint, etc. It's very verbose and hard to
> use
> > > >> for
> > > >>>>>>>>> beginners.
> > > >>>>>>>>> Most of them can have a universally applicable value.
> Because
> > > >>>>> changing
> > > >>>>>>>> the
> > > >>>>>>>>> default value is a breaking change, I think it's worth
> > > >> considering
> > > >>>>>>>> changing
> > > >>>>>>>>> them in 2.0.
> > > >>>>>>>>>
> > > >>>>>>>>> What do you think?
> > > >>>>>>>>>
> > > >>>>>>>>> Best,
> > > >>>>>>>>> Jark
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> > > >>> snuyanzin@gmail.com>
> > > >>>>>>>> wrote:
> > > >>>>>>>>>> Hi Chesnay
> > > >>>>>>>>>>
> > > >>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> > > >> this
> > > >>>>> would
> > > >>>>>>>> be
> > > >>>>>>>>>>> an entirely internal change, and could thus be an
> incremental
> > > >>>>> process
> > > >>>>>>>>>>> independent of major releases.
> > > >>>>>>>>>>> What is the actual scale of this item; how much are we
> > > >> actually
> > > >>>>>>>>>> re-writing?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks for asking
> > > >>>>>>>>>> yes, you're right, that should be internal change.
> > > >>>>>>>>>> Yeah I was also thinking about incremental change (rule by
> rule
> > > >>> or
> > > >>>>>>>>>> reasonable small group of rules).
> > > >>>>>>>>>> And yes, this could be an independent (on major release)
> > > >> activity
> > > >>>>>>>>>> The problem is actually for children of RelOptRule.
> > > >>>>>>>>>> Currently I see 60+ such rules (in Scala) using the
> mentioned
> > > >>>>>> deprecated
> > > >>>>>>>>>> api.
> > > >>>>>>>>>> There are also children of ConverterRule (50+) which do not
> > > >> have
> > > >>>>> such
> > > >>>>>>>>>> issues.
> > > >>>>>>>>>> Maybe it could be considered as the next step to have all
> the
> > > >>>>> rules in
> > > >>>>>>>>>> Java.
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > > >>>>> tonysong820@gmail.com>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi Alex & Gyula,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > > >> FLIP-321:
> > > >>>>>>>>>> Introduce
> > > >>>>>>>>>>>> an API deprecation process" thread [1]?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I
> pasted
> > > >>> the
> > > >>>>>> wrong
> > > >>>>>>>>>> url
> > > >>>>>>>>>>> in my previous email. Sorry for the mistake.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I am also curious to know if the rationale behind this new
> API
> > > >>> has
> > > >>>>>> been
> > > >>>>>>>>>>>> previously discussed on the mailing list. Do we have a
> list
> > > >> of
> > > >>>>>>>>>>> shortcomings
> > > >>>>>>>>>>>> in the current DataStream API that it tries to resolve?
> How
> > > >>> does
> > > >>>>> the
> > > >>>>>>>>>>>> current ProcessFunction functionality fit into the
> picture?
> > > >>> Will
> > > >>>>> it
> > > >>>>>> be
> > > >>>>>>>>>>> kept
> > > >>>>>>>>>>>> as is or subsumed by new API?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> I don't think we should create a replacement for the
> > > >> DataStream
> > > >>>>> API
> > > >>>>>>>>>> unless
> > > >>>>>>>>>>>> we have a very good reason to do so and with a proper
> > > >>> discussion
> > > >>>>>> about
> > > >>>>>>>>>>> this
> > > >>>>>>>>>>>> as Alex said.
> > > >>>>>>>>>>> The ProcessFunction API which is targeting to replace
> > > >> DataStream
> > > >>>>> API
> > > >>>>>> is
> > > >>>>>>>>>>> still a proposal, not a decision. Sorry for the confusion,
> I
> > > >>>>> should
> > > >>>>>>>> have
> > > >>>>>>>>>>> been more careful with my words, not giving the impression
> > > >> that
> > > >>>>> this
> > > >>>>>> is
> > > >>>>>>>>>>> something we'll do anyway.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> There will be a FLIP describing the motivations and
> designs in
> > > >>>>>> detail,
> > > >>>>>>>>>> for
> > > >>>>>>>>>>> the community to discuss and vote on. We are still working
> on
> > > >>> it.
> > > >>>>>> TBH,
> > > >>>>>>>>>> this
> > > >>>>>>>>>>> is not trivial and we would need more time on it.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Just to quickly share some backgrounds:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>     - We see quite some problems with the current
> DataStream
> > > >> APIs
> > > >>>>>>>>>>>        - Users are working with concrete classes rather
> than
> > > >>>>>>>> interfaces,
> > > >>>>>>>>>>>        which means
> > > >>>>>>>>>>>        - Users can access methods that are designed to be
> used
> > > >> by
> > > >>>>>>>> internal
> > > >>>>>>>>>>>           classes, even though they are annotated with
> > > >>> `@Internal`.
> > > >>>>>>>> E.g.,
> > > >>>>>>>>>>>           `DataStream#getTransformation`.
> > > >>>>>>>>>>>           - Changes to the non-API implementations (e.g.,
> > > >>>>>>>>>> `Transformation`)
> > > >>>>>>>>>>>           would affect the API classes (e.g.,
> `DataStream`),
> > > >>> which
> > > >>>>>>>>>>> makes it hard to
> > > >>>>>>>>>>>           provide binary compatibility.
> > > >>>>>>>>>>>        - Internal classes are used as parameter /
> return-value
> > > >> of
> > > >>>>>>>> public
> > > >>>>>>>>>>>        APIs. E.g., while `AbstractStreamOperator` is
> > > >>>>> PublicEvolving,
> > > >>>>>>>>>>> `StreamTask`
> > > >>>>>>>>>>>        which returns from
> > > >>>>> `AbstractStreamOperator#getContainingTask`
> > > >>>>>> is
> > > >>>>>>>>>>> Internal.
> > > >>>>>>>>>>>        - In many cases, users are asked to extend the API
> > > >>> classes,
> > > >>>>>>>> rather
> > > >>>>>>>>>>>        than implementing interfaces. E.g.,
> > > >>>>> `AbstractStreamOperator`.
> > > >>>>>>>>>>>           - Any changes to the base classes, even the
> internal
> > > >>>>> part,
> > > >>>>>>>> may
> > > >>>>>>>>>>>           affect the behavior of the user-provided
> sub-classes
> > > >>>>>>>>>>>           - Users can override the behavior of the base
> classes
> > > >>>>>>>>>>>        - The API module `flink-streaming-java` contains
> non-API
> > > >>>>>>>> classes,
> > > >>>>>>>>>> and
> > > >>>>>>>>>>>        depends on internal modules such as `flink-runtime`,
> > > >> which
> > > >>>>>> means
> > > >>>>>>>>>>>        - Changes to the internal modules may affect the API
> > > >>>>> modules,
> > > >>>>>>>> which
> > > >>>>>>>>>>>           requires users to re-build their applications
> upon
> > > >>>>> upgrading
> > > >>>>>>>>>>>           - The artifact user needs for building their
> > > >>> application
> > > >>>>>>>> larger
> > > >>>>>>>>>>>           than necessary.
> > > >>>>>>>>>>>        - We probably should not expose operators (e.g.,
> > > >>>>>>>>>>>        `AbstractStreamOperator`) to users. Functions
> should be
> > > >>>>> enough
> > > >>>>>>>>>>> for users to
> > > >>>>>>>>>>>        define their data processing logics. Exposing
> > > >>> operator-level
> > > >>>>>>>>>> concepts
> > > >>>>>>>>>>>        (e.g., mailbox thread model, checkpoint barrier
> > > >> alignment,
> > > >>>>>>>> etc.) is
> > > >>>>>>>>>>>        unnecessary and limits the improvement regarding
> such
> > > >>>>> exposed
> > > >>>>>>>>>>> mechanisms
> > > >>>>>>>>>>>        with compatibility considerations.
> > > >>>>>>>>>>>        - The current DataStream API seems to be a mixture
> of
> > > >> many
> > > >>>>>>>> things,
> > > >>>>>>>>>>>        making it hard to understand especially for
> newcomers.
> > > >> It
> > > >>>>> might
> > > >>>>>>>> be
> > > >>>>>>>>>>> better
> > > >>>>>>>>>>>        to re-organize it into several parts: (the taxonomy
> > > >> below
> > > >>>>> are
> > > >>>>>>>> just
> > > >>>>>>>>>> an
> > > >>>>>>>>>>>        example; we are still working on this)
> > > >>>>>>>>>>>           - The most fundamental stateful stream
> processing:
> > > >>>>> streams,
> > > >>>>>>>>>>>           partitions / key, process functions, state,
> > > >>>>> timeline-service
> > > >>>>>>>>>>>           - An extension for common batch-streaming unified
> > > >>>>> functions:
> > > >>>>>>>>>> map,
> > > >>>>>>>>>>>           flatmap, filter, agg, reduce, join, etc.
> > > >>>>>>>>>>>           - An extension for windowing supports:  window,
> > > >>>>> triggering
> > > >>>>>>>>>>>           - An extension for event-time supports: event
> time,
> > > >>>>>> watermark
> > > >>>>>>>>>>>           - The extensions are like short-cuts / sugars,
> > > >> without
> > > >>>>> which
> > > >>>>>>>>>> users
> > > >>>>>>>>>>>           can probably still achieve the same behavior by
> > > >> working
> > > >>>>> with
> > > >>>>>>>> the
> > > >>>>>>>>>>>           fundamental APIs, but would be a lot easier with
> the
> > > >>>>>>>> extensions
> > > >>>>>>>>>>>        - The original plan was to do in-place refactors /
> > > >> changes
> > > >>>>> on
> > > >>>>>>>>>>>     DataStream API. Some related items are listed in this
> doc
> > > >> [2]
> > > >>>>>>>> attached
> > > >>>>>>>>>>> to
> > > >>>>>>>>>>>     the kicking off email [3]. Not all of the above issues
> are
> > > >>>>> listed,
> > > >>>>>>>>>>> because
> > > >>>>>>>>>>>     we haven't looked into this as deeply as now  by that
> time.
> > > >>>>>>>>>>>     - We proposed this as a new API rather than in-place
> > > >>> refactors
> > > >>>>> in
> > > >>>>>>>> the
> > > >>>>>>>>>>>     2.0 work item list, because we realized the changes
> might
> > > >> be
> > > >>>>> too
> > > >>>>>>>> big
> > > >>>>>>>>>>> for an
> > > >>>>>>>>>>>     in-place change. First having a new API then gradually
> > > >>> retiring
> > > >>>>>> the
> > > >>>>>>>>>> old
> > > >>>>>>>>>>> one
> > > >>>>>>>>>>>     would help users to smoothly migrate between them.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is
> > > >> out.
> > > >>>>> And
> > > >>>>>> of
> > > >>>>>>>>>>> course it's possible that the FLIP might be rejected. Given
> > > >> that
> > > >>>>> we
> > > >>>>>> are
> > > >>>>>>>>>>> planning for release 2.0, I just feel it would be better to
> > > >>> bring
> > > >>>>>> this
> > > >>>>>>>> up
> > > >>>>>>>>>>> early even if the concrete plan is not yet ready.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Best,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Xintong
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> [1]
> > > >>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > >>>>>>>>>>> [2]
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>
> > >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > >>>>>>>>>>> [3]
> > > >>>>> https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > > >>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <
> gyfora@apache.org
> > > >>>>>> wrote:
> > > >>>>>>>>>>>> Hey!
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I share the same concerns mentioned above regarding the
> > > >>>>>>>>>> "ProcessFunction
> > > >>>>>>>>>>>> API".
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I don't think we should create a replacement for the
> > > >> DataStream
> > > >>>>> API
> > > >>>>>>>>>>> unless
> > > >>>>>>>>>>>> we have a very good reason to do so and with a proper
> > > >>> discussion
> > > >>>>>> about
> > > >>>>>>>>>>> this
> > > >>>>>>>>>>>> as Alex said.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Cheers,
> > > >>>>>>>>>>>> Gyula
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > > >>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hi Xintong,
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > > >>> FLIP-321:
> > > >>>>>>>>>>>> Introduce
> > > >>>>>>>>>>>>> an API deprecation process" thread [1]?
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> I am also curious to know if the rationale behind this
> new
> > > >> API
> > > >>>>> has
> > > >>>>>>>>>> been
> > > >>>>>>>>>>>>> previously discussed on the mailing list. Do we have a
> list
> > > >> of
> > > >>>>>>>>>>>> shortcomings
> > > >>>>>>>>>>>>> in the current DataStream API that it tries to resolve?
> How
> > > >>> does
> > > >>>>>> the
> > > >>>>>>>>>>>>> current ProcessFunction functionality fit into the
> picture?
> > > >>>>> Will it
> > > >>>>>>>>>> be
> > > >>>>>>>>>>>> kept
> > > >>>>>>>>>>>>> as is or subsumed by new API?
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> [1]
> > > >>>>>>
> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > >>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>> Alex
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > > >>>>> tonysong820@gmail.com>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > > >> headaches
> > > >>>>>>>>>>> because
> > > >>>>>>>>>>>>> it's
> > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
> > > >>> entirely
> > > >>>>>>>>>>>> separate
> > > >>>>>>>>>>>>>> API
> > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > > >>>>> DataStream.
> > > >>>>>>>>>>> How
> > > >>>>>>>>>>>>>> much
> > > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how
> does
> > > >>> it
> > > >>>>>>>>>>> relate
> > > >>>>>>>>>>>> to
> > > >>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > > >>>>>>>>>> underneath).
> > > >>>>>>>>>>>>>> I totally understand your confusion. We started planning
> > > >> this
> > > >>>>>> after
> > > >>>>>>>>>>>>> kicking
> > > >>>>>>>>>>>>>> off the release 2.0, so there's still a lot to be
> explored
> > > >>> and
> > > >>>>> the
> > > >>>>>>>>>>> plan
> > > >>>>>>>>>>>>>> keeps changing.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>     - In the beginning, we planned to do an in-place
> > > >> refactor
> > > >>> of
> > > >>>>>>>>>>>>> DataStream
> > > >>>>>>>>>>>>>>     API, until the API migration period is proposed.
> > > >>>>>>>>>>>>>>     - Then we want to make it an entirely separate API
> to
> > > >>>>>>>>>> DataStream,
> > > >>>>>>>>>>>> and
> > > >>>>>>>>>>>>>>     listed as a must-have for release 2.0 so that we can
> > > >>> remove
> > > >>>>>>>>>>>> DataStream
> > > >>>>>>>>>>>>>> once
> > > >>>>>>>>>>>>>>     it's ready.
> > > >>>>>>>>>>>>>>     - However, depending on the outcome of the API
> > > >>> compatibility
> > > >>>>>>>>>>>>> discussion
> > > >>>>>>>>>>>>>>     [1], we may not be able to remove DataStream in 2.0
> > > >>> anyway,
> > > >>>>>>>>>> which
> > > >>>>>>>>>>>>> means
> > > >>>>>>>>>>>>>> we
> > > >>>>>>>>>>>>>>     might need to re-evaluate the necessity of this
> item for
> > > >>>>> 2.0.
> > > >>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility
> > > >> discussion
> > > >>>>> [1]
> > > >>>>>>>>>> and
> > > >>>>>>>>>>>>>> decide the priority for this item afterwards.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Xintong
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> [1]
> > > >> https://lists.apache.org/list.html?dev@flink.apache.org
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > > >>>>>>>>>> chesnay@apache.org
> > > >>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of items.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State
> Management"
> > > >>>>> item
> > > >>>>>>>>>> is
> > > >>>>>>>>>>>>> marked
> > > >>>>>>>>>>>>>>> as a must-have; will it require changes that break
> > > >>> something?
> > > >>>>>>>>>> What
> > > >>>>>>>>>>>>>> prevents
> > > >>>>>>>>>>>>>>> it from being added in 2.1?
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make Java 17
> > > >> the
> > > >>>>>>>>>>> default,
> > > >>>>>>>>>>>>> drop
> > > >>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop
> > > >> Java
> > > >>> 8"
> > > >>>>>>>>>> and
> > > >>>>>>>>>>> a
> > > >>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope
> that
> > > >>>>> this
> > > >>>>>>>>>>> would
> > > >>>>>>>>>>>>> be
> > > >>>>>>>>>>>>>>> an entirely internal change, and could thus be an
> > > >>> incremental
> > > >>>>>>>>>>> process
> > > >>>>>>>>>>>>>>> independent of major releases.
> > > >>>>>>>>>>>>>>> What is the actual scale of this item; how much are we
> > > >>>>> actually
> > > >>>>>>>>>>>>>> re-writing?
> > > >>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > > >>>>>>>>>> must-have; i
> > > >>>>>>>>>>>>> think
> > > >>>>>>>>>>>>>>> I marked it down as nice-to-have only because it
> depends
> > > >> on
> > > >>>>>>>>>> another
> > > >>>>>>>>>>>>> item.
> > > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > > >> headaches
> > > >>>>>>>>>>> because
> > > >>>>>>>>>>>>> it's
> > > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
> > > >>> entirely
> > > >>>>>>>>>>>> separate
> > > >>>>>>>>>>>>>> API
> > > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > > >>>>> DataStream.
> > > >>>>>>>>>>> How
> > > >>>>>>>>>>>>>> much
> > > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how
> does
> > > >>> it
> > > >>>>>>>>>>> relate
> > > >>>>>>>>>>>> to
> > > >>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > > >>>>>>>>>> underneath).
> > > >>>>>>>>>>>>>>> There are a few items I added as ideas which don't
> have a
> > > >>>>>>>>>> priority
> > > >>>>>>>>>>>> yet;
> > > >>>>>>>>>>>>>>> would love to get some feedback on those.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Hi devs,
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> As previously discussed in [1], we had been collecting
> > > >> work
> > > >>>>> item
> > > >>>>>>>>>>>>>> proposals
> > > >>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the wiki page
> [2].
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>     - As we have passed the due date, I'd like to
> kindly
> > > >>> remind
> > > >>>>>>>>>>>> everyone
> > > >>>>>>>>>>>>>> *not
> > > >>>>>>>>>>>>>>>     to add / remove items directly on the wiki page*.
> If
> > > >>>>> needed,
> > > >>>>>>>>>>>> please
> > > >>>>>>>>>>>>>> post
> > > >>>>>>>>>>>>>>>     in this thread or reach out to the release managers
> > > >>>>> instead.
> > > >>>>>>>>>>>>>>>     - I've reached out to some folks for clarifications
> > > >> about
> > > >>>>>>>>>> their
> > > >>>>>>>>>>>>>>>     proposals. Some of them mentioned that they can
> not yet
> > > >>>>> tell
> > > >>>>>>>>>>>> whether
> > > >>>>>>>>>>>>>> we
> > > >>>>>>>>>>>>>>>     should do an item or not, and would need more time
> /
> > > >>>>>>>>>> discussions
> > > >>>>>>>>>>>> to
> > > >>>>>>>>>>>>>> make
> > > >>>>>>>>>>>>>>>     the decision. So I added a new symbol for items
> whose
> > > >>>>>>>>>> priorities
> > > >>>>>>>>>>>> are
> > > >>>>>>>>>>>>>> `TBD`.
> > > >>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum set
> of
> > > >>>>>>>>>> must-have
> > > >>>>>>>>>>>>> items.
> > > >>>>>>>>>>>>>>> I've gone through the entire list of proposed items,
> and
> > > >>> found
> > > >>>>>>>>>> most
> > > >>>>>>>>>>>> of
> > > >>>>>>>>>>>>>> them
> > > >>>>>>>>>>>>>>> make quite much sense. So I think an online sync might
> not
> > > >>> be
> > > >>>>>>>>>>>> necessary
> > > >>>>>>>>>>>>>> for
> > > >>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread, where
> > > >>> everyone
> > > >>>>> can
> > > >>>>>>>>>>>>> comment
> > > >>>>>>>>>>>>>>> on how they think the list can be improved, followed
> by a
> > > >>>>> VOTE to
> > > >>>>>>>>>>>>>> formally
> > > >>>>>>>>>>>>>>> make the decision.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Any feedback and opinions, including but not limited to
> > > >> the
> > > >>>>>>>>>>> following
> > > >>>>>>>>>>>>>>> aspects, will be appreciated.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>     - Important items that are missing from the list
> > > >>>>>>>>>>>>>>>     - Concerns regarding the listed items or their
> > > >> priorities
> > > >>>>>>>>>>>>>>> Looking forward to your feedback.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Xintong
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> [1]
> > > >>
> > >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > >>>>>>>>>>>>>>> [2]
> > > >>>>>>>>>>
> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>> --
> > > >>>>>>>>>> Best regards,
> > > >>>>>>>>>> Sergey
> > > >>>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>
> > > >>>>> --
> > > >>>>> Best
> > > >>>>>
> > > >>>>> ConradJam
> > > >>>>>
> > >
> > >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Zhu Zhu <re...@gmail.com>.
Agreed that we should deprecate affected APIs as soon as possible.
But there is not much time before the feature freeze of 1.18, hence
I'm a bit concerned that some of the deprecations might not be done in 1.18.

We are currently looking into the improvements of the configuration layer.
Most of the proposed changes would require a public discussion, or even
a FLIP, which I think can hardly be concluded before the feature freeze of 1.18.
And some of the APIs can be deprecated only after the corresponding new
APIs are developed. Therefore we previously targeted them for 1.19.

We may review later to see what deprecation work can be done in 1.18 and
do it if possible. I think we can do the work even after the feature freeze
date, if it is purely deprecation work (simply adding annotations). WDYT?
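
To illustrate what "purely deprecation work" could look like, here is a
minimal Java sketch. All names in it (DeprecationSketch, ExampleOptions,
getTimeout, getTimeoutMillis) are made up for illustration and are not Flink
APIs; it only relies on the standard JDK `@Deprecated` annotation and
reflection. The old accessor keeps its behavior and is merely flagged.

```java
import java.lang.reflect.Method;

/** Hypothetical illustration only; none of these names exist in Flink. */
public class DeprecationSketch {

    /** Stand-in for a public API class whose accessor is being replaced. */
    public static class ExampleOptions {
        private final int timeoutMillis = 1000;

        /**
         * Old accessor: behavior is unchanged, it is only annotated.
         *
         * @deprecated Use {@link #getTimeoutMillis()} instead.
         */
        @Deprecated
        public int getTimeout() {
            return getTimeoutMillis();
        }

        /** Replacement accessor that already exists. */
        public int getTimeoutMillis() {
            return timeoutMillis;
        }
    }

    /** Returns true if the public no-arg method with this name carries @Deprecated. */
    public static boolean isDeprecated(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName);
            return method.isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Report which of the two accessors is flagged as deprecated.
        for (String name : new String[] {"getTimeout", "getTimeoutMillis"}) {
            System.out.println(
                    name + " deprecated: " + isDeprecated(ExampleOptions.class, name));
        }
    }
}
```

Existing callers of the old accessor would keep compiling and only see a
deprecation warning, which is why I think this kind of change carries little
risk even late in the release cycle.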

I'm also changing the priority of "Clarify the scopes of configuration options"
to nice-to-have. I think most of the work is not a breaking change and can
be done in 1.x or 2.1+. For the breaking changes which might be needed, we
will consider them as part of the configuration layer rework.

Thanks,
Zhu

Xintong Song <to...@gmail.com> wrote on Mon, Jul 10, 2023 at 19:58:
>
> > At what point are the FLIP discussions coming into play?
> >
> > I keep wondering if these shouldn't have started already.
>
>
> I think this depends on the responsible contributor and reviewer of
> individual items. From my perspective, the FLIP discussions can start any
> time as long as the contributors are ready, the earlier the better.
>
>
> > What we need to ensure is that all breaking API changes are
> > discussed/decided before 1.18 is released so we can deprecate affected APIs.
> >
>
> The introduction of the migration period has brought the requirement to
> plan the removal of public APIs 2 minor releases ahead of the major
> release, which is TBH a bit unexpected. I agree it would be nice if we can
> get the FLIPs ready by releasing 1.18. But I also don't think we should
> rush on it. If the deprecation of a Public API does not make 1.18, we may
> carry it until 3.0. Or if there are many Public APIs whose deprecation does
> not make 1.18, we may deprecate them in 1.19 and postpone the major version
> bump to after a 1.20 release. Moreover, as mentioned in FLIP-321[1],
> exceptions are discussable given that the migration period is newly
> proposed and we did not give developers the chance to plan things ahead. To
> sum up, I'd say we try to identify APIs that need to be deprecated in 1.18
> on a best-effort basis, and evaluate the remaining options (carrying the API for
> the entire 2.x cycle, postponing 2.0, or making an exception) case-by-case.
> WDYT?
>
> Best,
>
> Xintong
>
>
> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>
> On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <ch...@apache.org> wrote:
>
> > At what point are the FLIP discussions coming into play?
> >
> > I keep wondering if these shouldn't have started already.
> > It just seems that a lot of decisions are implicitly reliant on the
> > items even being accepted.
> > Estimates can only be provided if we actually know the scope of the
> > change, but that's not always clear from the description in the doc.
> >
> > What we need to ensure is that all breaking API changes are
> > discussed/decided before 1.18 is released so we can deprecate affected
> > APIs.
> >
> > On 10/07/2023 11:32, Xintong Song wrote:
> > > Hi Matthias,
> > >
> > > The questions you asked are indeed very important. Here're some quick
> > > responses, based on the plans I had in mind, which I have not aligned
> > with
> > > other release managers yet.
> > >
> > > In the previous discussions between the RMs, we were not able to make
> > > proposals on things like how to make a time plan, how to manage the
> > release
> > > branch, etc., due to the lack of inputs on e.g., the work items that need to
> > be
> > > included (which transitively depends on the API compatibility to provide
> > > between major versions) and the workloads / time needed for them. With
> > the
> > > recent discussions, we have collected at least the majority of the inputs
> > > needed.
> > >
> > > Here are things that I think we as the release managers would do next
> > > (again, not aligned with other release managers yet)
> > > - Creating a time plan, by reaching out to people to understand the
> > > estimated workloads, prerequisites and ETA of each work item.
> > > - Make a proposal on how to manage the release branch, i.e., when to cut
> > > the branch and whether to ship the milestone releases, etc.
> > > - Set-up regular release syncs (bi-weekly / monthly) to update the status
> > > and draw attention to where help is needed.
> > >
> > > So back to your questions.
> > >
> > >> There are still to-be-discussed items in the list of features. What's the
> > >> plan with those?
> > > When collecting ETA, for items that the completion time cannot yet be
> > > estimated, we would like to have at least a time by which the estimation
> > > can be made. I think the same applies to the to-be-discussed items. And
> > if
> > > the items should be included as must-haves, we would need another vote to
> > > adjust the must-have item list.
> > >
> > > Some of them don't have anyone assigned.
> > > My concern is that they will be overlooked because nobody feels to be in
> > >> charge.
> > > This is a tricky one. For must-have items without assignees, we as the
> > > release managers should be responsible for raising them up in the release
> > > syncs, and try to find assignees for them. Hopefully, there will be
> > someone
> > > who stands out. But it is possible that for a must-have item nobody wants
> > > to work on it. If that happens, which I don't think it will, it probably
> > > means the item is not that critical and we may have to exclude it from
> > the
> > > release. Either way, they should not be overlooked, because IMHO release
> > > managers should be responsible for trying to get someone to work on the
> > > un-assigned items.
> > >
> > > We'll have more discussions soon and keep the community updated.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> > > <ma...@aiven.io.invalid> wrote:
> > >
> > >> Now that the vote is started on the must-have items: There are still
> > >> to-be-discussed items in the list of features. What's the plan with
> > those?
> > >> Some of them don't have anyone assigned. Were these items discussed
> > among
> > >> the release managers? So far, it looks like they are handled as
> > >> nice-to-have if someone volunteers to pick them up?
> > >>
> > >> My concern is that they will be overlooked because nobody feels to be in
> > >> charge.
> > >>
> > >> Best,
> > >> Matthias
> > >>
> > >> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <to...@gmail.com>
> > >> wrote:
> > >>
> > >>> Thanks all for the discussion.
> > >>>
> > >>> The wiki has been updated as discussed. I'm starting a vote now.
> > >>>
> > >>> Best,
> > >>>
> > >>> Xintong
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <to...@gmail.com>
> > >> wrote:
> > >>>> Hi ConradJam,
> > >>>>
> > >>>> I think Chesnay has already put his name as the Contributor for the
> > two
> > >>>> tasks you listed. Maybe you can reach out to him to see if you can
> > >>>> collaborate on this.
> > >>>>
> > >>>> In general, I don't think contributing to a release 2.0 issue is much
> > >>>> different from contributing to a regular issue. We haven't yet created
> > >>> JIRA
> > >>>> tickets for all the listed tasks because many of them need further
> > >>>> discussions and / or FLIPs to decide whether and how they should be
> > >>>> performed.
> > >>>>
> > >>>> Best,
> > >>>>
> > >>>> Xintong
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com>
> > wrote:
> > >>>>
> > >>>>> Hi Community:
> > >>>>>    I see some tasks in the 2.0 list that haven't been assigned yet. I
> > >>> want
> > >>>>> to take the initiative to take on some tasks that I can complete. How
> > >>> do I
> > >>>>> apply to the community for this part of the task? I am interested in
> > >> the
> > >>>>> following parts of FLINK-32377
> > >>>>> <https://issues.apache.org/jira/browse/FLINK-32377>, do I need to
> > >>> create
> > >>>>> an issue myself and assign it to myself?
> > >>>>>
> > >>>>> - the current timestamp, which is problematic w.r.t. caching and
> > >>> testing,
> > >>>>> while providing no value.
> > >>>>> - Remove JarRequestBody#programArgs in favor of #programArgsList.
> > >>>>>
> > >>>>> [1] FLINK-32377 <https://issues.apache.org/jira/browse/FLINK-32377>
> > >>>>> https://issues.apache.org/jira/browse/FLINK-32377
> > >>>>>
> > >>>>> Teoh, Hong <li...@amazon.co.uk.invalid> wrote on Fri, Jun 30, 2023 at 00:53:
> > >>>>>
> > >>>>>
> > >>>>>> Thanks Xintong for driving the effort.
> > >>>>>>
> > >>>>>> I’d add a +1 to reworking configs, as suggested by @Jark and
> > >> @Chesnay,
> > >>>>>> especially the types. We have various configs that encode Time /
> > >>>>> MemorySize
> > >>>>>> that are Long instead!
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>> Hong
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com>
> > >> wrote:
> > >>>>>>> CAUTION: This email originated from outside of the organization.
> > >> Do
> > >>>>> not
> > >>>>>> click links or open attachments unless you can confirm the sender
> > >> and
> > >>>>> know
> > >>>>>> the content is safe.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Thanks for driving this effort, Xintong!
> > >>>>>>>
> > >>>>>>> To Chesnay
> > >>>>>>>> I'm curious as to why the "Disaggregated State Management" item
> > >> is
> > >>>>>>>> marked as a must-have; will it require changes that break
> > >>> something?
> > >>>>>>>> What prevents it from being added in 2.1?
> > >>>>>>> As to "Disaggregated State Management".
> > >>>>>>>
> > >>>>>>> We plan to provide a new type of state backend to support DFS as
> > >>>>> primary
> > >>>>>>> storage.
> > >>>>>>> To achieve this, we at least need to include two parts of amends
> > >>> (not
> > >>>>>>> entirely sure yet, since we are still in the designing and
> > >> prototype
> > >>>>>> phase)
> > >>>>>>> 1. Statebackend Change
> > >>>>>>> 2. State Access Change
> > >>>>>>>
> > >>>>>>> Not all of the interfaces related are `@Internal`. Some of the
> > >>>>> interfaces
> > >>>>>>> like `StateBackend` are `@PublicEvolving`
> > >>>>>>> So, you are right in the sense that "Disaggregated State
> > >> Management"
> > >>>>>> itself
> > >>>>>>> probably does not need to be a "Must Have"
> > >>>>>>>
> > >>>>>>> But I was hoping changes that related to public APIs can be
> > >>> finalized
> > >>>>> and
> > >>>>>>> merged in Flink 2.0 (I will fix the wiki accordingly).
> > >>>>>>>
> > >>>>>>> I also agree with Jark that 2.0 is a good chance to rework the
> > >>> default
> > >>>>>>> value of configurations.
> > >>>>>>>
> > >>>>>>> Best
> > >>>>>>> Yuan
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> > >>> chesnay@apache.org>
> > >>>>>> wrote:
> > >>>>>>>> Something else configuration-related is that there are a bunch of
> > >>>>>>>> options where the type isn't quite correct (e.g., a String where
> > >> it
> > >>>>>>>> could be an enum, a string where it should be an int or
> > >> something).
> > >>>>>>>> Could do a pass over those as well.
> > >>>>>>>>
> > >>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> I think one more thing we need to consider to do in 2.0 is
> > >>> changing
> > >>>>> the
> > >>>>>>>>> default value of configuration to improve out-of-box user
> > >>>>> experience.
> > >>>>>>>>> Currently, in order to run a Flink job, users may need to set
> > >>>>>>>>> a bunch of configurations, such as minibatch, checkpoint
> > >> interval,
> > >>>>>>>>> exactly-once,
> > >>>>>>>>> incremental-checkpoint, etc. It's very verbose and hard to use
> > >> for
> > >>>>>>>>> beginners.
> > >>>>>>>>> Most of them can have a universally applicable value.  Because
> > >>>>> changing
> > >>>>>>>> the
> > >>>>>>>>> default value is a breaking change. I think It's worth
> > >> considering
> > >>>>>>>> changing
> > >>>>>>>>> them in 2.0.
> > >>>>>>>>>
> > >>>>>>>>> What do you think?
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Jark
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> > >>> snuyanzin@gmail.com>
> > >>>>>>>> wrote:
> > >>>>>>>>>> Hi Chesnay
> > >>>>>>>>>>
> > >>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> > >> this
> > >>>>> would
> > >>>>>>>> be
> > >>>>>>>>>>> an entirely internal change, and could thus be an incremental
> > >>>>> process
> > >>>>>>>>>>> independent of major releases.
> > >>>>>>>>>>> What is the actual scale of this item; how much are we
> > >> actually
> > >>>>>>>>>> re-writing?
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks for asking
> > >>>>>>>>>> yes, you're right, that should be internal change.
> > >>>>>>>>>> Yeah I was also thinking about incremental change (rule by rule
> > >>> or
> > >>>>>>>>>> reasonable small group of rules).
> > >>>>>>>>>> And yes, this could be an independent (on major release)
> > >> activity
> > >>>>>>>>>> The problem is actually for children of RelOptRule.
> > >>>>>>>>>> Currently I see 60+ such rules (in Scala) using the mentioned
> > >>>>>> deprecated
> > >>>>>>>>>> api.
> > >>>>>>>>>> There are also children of ConverterRule (50+) which do not
> > >> have
> > >>>>> such
> > >>>>>>>>>> issues.
> > >>>>>>>>>> Maybe it could be considered as the next step to have all the
> > >>>>> rules in
> > >>>>>>>>>> Java.
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > >>>>> tonysong820@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Alex & Gyula,
> > >>>>>>>>>>>
> > >>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > >> FLIP-321:
> > >>>>>>>>>> Introduce
> > >>>>>>>>>>>> an API deprecation process" thread [1]?
> > >>>>>>>>>>>>
> > >>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted
> > >>> the
> > >>>>>> wrong
> > >>>>>>>>>> url
> > >>>>>>>>>>> in my previous email. Sorry for the mistake.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I am also curious to know if the rationale behind this new API
> > >>> has
> > >>>>>> been
> > >>>>>>>>>>>> previously discussed on the mailing list. Do we have a list
> > >> of
> > >>>>>>>>>>> shortcomings
> > >>>>>>>>>>>> in the current DataStream API that it tries to resolve? How
> > >>> does
> > >>>>> the
> > >>>>>>>>>>>> current ProcessFunction functionality fit into the picture?
> > >>> Will
> > >>>>> it
> > >>>>>> be
> > >>>>>>>>>>> kept
> > >>>>>>>>>>>> as is or subsumed by new API?
> > >>>>>>>>>>>>
> > >>>>>>>>>>> I don't think we should create a replacement for the
> > >> DataStream
> > >>>>> API
> > >>>>>>>>>> unless
> > >>>>>>>>>>>> we have a very good reason to do so and with a proper
> > >>> discussion
> > >>>>>> about
> > >>>>>>>>>>> this
> > >>>>>>>>>>>> as Alex said.
> > >>>>>>>>>>> The ProcessFunction API which is targeting to replace
> > >> DataStream
> > >>>>> API
> > >>>>>> is
> > >>>>>>>>>>> still a proposal, not a decision. Sorry for the confusion, I
> > >>>>> should
> > >>>>>>>> have
> > >>>>>>>>>>> been more careful with my words, not giving the impression
> > >> that
> > >>>>> this
> > >>>>>> is
> > >>>>>>>>>>> something we'll do anyway.
> > >>>>>>>>>>>
> > >>>>>>>>>>> There will be a FLIP describing the motivations and designs in
> > >>>>>> detail,
> > >>>>>>>>>> for
> > >>>>>>>>>>> the community to discuss and vote on. We are still working on
> > >>> it.
> > >>>>>> TBH,
> > >>>>>>>>>> this
> > >>>>>>>>>>> is not trivial and we would need more time on it.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Just to quickly share some backgrounds:
> > >>>>>>>>>>>
> > >>>>>>>>>>>     - We see quite some problems with the current DataStream
> > >> APIs
> > >>>>>>>>>>>        - Users are working with concrete classes rather than
> > >>>>>>>> interfaces,
> > >>>>>>>>>>>        which means
> > >>>>>>>>>>>        - Users can access methods that are designed to be used
> > >> by
> > >>>>>>>> internal
> > >>>>>>>>>>>           classes, even though they are annotated with
> > >>> `@Internal`.
> > >>>>>>>> E.g.,
> > >>>>>>>>>>>           `DataStream#getTransformation`.
> > >>>>>>>>>>>           - Changes to the non-API implementations (e.g.,
> > >>>>>>>>>> `Transformation`)
> > >>>>>>>>>>>           would affect the API classes (e.g., `DataStream`),
> > >>> which
> > >>>>>>>>>>> makes it hard to
> > >>>>>>>>>>>           provide binary compatibility.
> > >>>>>>>>>>>        - Internal classes are used as parameter / return-value
> > >> of
> > >>>>>>>> public
> > >>>>>>>>>>>        APIs. E.g., while `AbstractStreamOperator` is
> > >>>>> PublicEvolving,
> > >>>>>>>>>>> `StreamTask`
> > >>>>>>>>>>>        which returns from
> > >>>>> `AbstractStreamOperator#getContainingTask`
> > >>>>>> is
> > >>>>>>>>>>> Internal.
> > >>>>>>>>>>>        - In many cases, users are asked to extend the API
> > >>> classes,
> > >>>>>>>> rather
> > >>>>>>>>>>>        than implementing interfaces. E.g.,
> > >>>>> `AbstractStreamOperator`.
> > >>>>>>>>>>>           - Any changes to the base classes, even the internal
> > >>>>> part,
> > >>>>>>>> may
> > >>>>>>>>>>>           affect the behavior of the user-provided sub-classes
> > >>>>>>>>>>>           - Users can override the behavior of the base classes
> > >>>>>>>>>>>        - The API module `flink-streaming-java` contains non-API
> > >>>>>>>> classes,
> > >>>>>>>>>> and
> > >>>>>>>>>>>        depends on internal modules such as `flink-runtime`,
> > >> which
> > >>>>>> means
> > >>>>>>>>>>>        - Changes to the internal modules may affect the API
> > >>>>> modules,
> > >>>>>>>> which
> > >>>>>>>>>>>           requires users to re-build their applications upon
> > >>>>> upgrading
> > >>>>>>>>>>>           - The artifact user needs for building their
> > >>> application
> > >>>>>>>> larger
> > >>>>>>>>>>>           than necessary.
> > >>>>>>>>>>>        - We probably should not expose operators (e.g.,
> > >>>>>>>>>>>        `AbstractStreamOperator`) to users. Functions should be
> > >>>>> enough
> > >>>>>>>>>>> for users to
> > >>>>>>>>>>>        define their data processing logics. Exposing
> > >>> operator-level
> > >>>>>>>>>> concepts
> > >>>>>>>>>>>        (e.g., mailbox thread model, checkpoint barrier
> > >> alignment,
> > >>>>>>>> etc.) is
> > >>>>>>>>>>>        unnecessary and limits the improvement regarding such
> > >>>>> exposed
> > >>>>>>>>>>> mechanisms
> > >>>>>>>>>>>        with compatibility considerations.
> > >>>>>>>>>>>        - The current DataStream API seems to be a mixture of
> > >> many
> > >>>>>>>> things,
> > >>>>>>>>>>>        making it hard to understand especially for newcomers.
> > >> It
> > >>>>> might
> > >>>>>>>> be
> > >>>>>>>>>>> better
> > >>>>>>>>>>>        to re-organize it into several parts: (the taxonomy
> > >> below
> > >>>>> are
> > >>>>>>>> just
> > >>>>>>>>>> an
> > >>>>>>>>>>>        example of the, we are still working on this)
> > >>>>>>>>>>>           - The most fundamental stateful stream processing:
> > >>>>> streams,
> > >>>>>>>>>>>           partitions / key, process functions, state,
> > >>>>> timeline-service
> > >>>>>>>>>>>           - An extension for common batch-streaming unified
> > >>>>> functions:
> > >>>>>>>>>> map,
> > >>>>>>>>>>>           flatmap, filter, agg, reduce, join, etc.
> > >>>>>>>>>>>           - An extension for windowing supports:  window,
> > >>>>> triggering
> > >>>>>>>>>>>           - An extension for event-time supports: event time,
> > >>>>>> watermark
> > >>>>>>>>>>>           - The extensions are like short-cuts / sugars,
> > >> without
> > >>>>> which
> > >>>>>>>>>> users
> > >>>>>>>>>>>           can probably still achieve the same behavior by
> > >> working
> > >>>>> with
> > >>>>>>>> the
> > >>>>>>>>>>>           fundamental APIs, but would be a lot easier with the
> > >>>>>>>> extensions
> > >>>>>>>>>>>        - The original plan was to do in-place refactors /
> > >> changes
> > >>>>> on
> > >>>>>>>>>>>     DataStream API. Some related items are listed in this doc
> > >> [2]
> > >>>>>>>> attached
> > >>>>>>>>>>> to
> > >>>>>>>>>>>     the kicking off email [3]. Not all of the above issues are
> > >>>>> listed,
> > >>>>>>>>>>> because
> > >>>>>>>>>>>     we haven't looked into this as deeply as now  by that time.
> > >>>>>>>>>>>     - We proposed this as a new API rather than in-place
> > >>> refactors
> > >>>>> in
> > >>>>>>>> the
> > >>>>>>>>>>>     2.0 work item list, because we realized the changes might
> > >> be
> > >>>>> too
> > >>>>>>>> big
> > >>>>>>>>>>> for an
> > >>>>>>>>>>>     in-place change. First having a new API then gradually
> > >>> retiring
> > >>>>>> the
> > >>>>>>>>>> old
> > >>>>>>>>>>> one
> > >>>>>>>>>>>     would help users to smoothly migrate between them.
> > >>>>>>>>>>>
> > >>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is
> > >> out.
> > >>>>> And
> > >>>>>> of
> > >>>>>>>>>>> course it's possible that the FLIP might be rejected. Given
> > >> that
> > >>>>> we
> > >>>>>> are
> > >>>>>>>>>>> planning for release 2.0, I just feel it would be better to
> > >>> bring
> > >>>>>> this
> > >>>>>>>> up
> > >>>>>>>>>>> early even the concrete plan is not yet ready,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Xintong
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> [1]
> > >>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>>>>>>>> [2]
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>
> > https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > >>>>>>>>>>> [3]
> > >>>>> https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > >>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gyfora@apache.org
> > >>>>>> wrote:
> > >>>>>>>>>>>> Hey!
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I share the same concerns mentioned above regarding the
> > >>>>>>>>>> "ProcessFunction
> > >>>>>>>>>>>> API".
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I don't think we should create a replacement for the
> > >> DataStream
> > >>>>> API
> > >>>>>>>>>>> unless
> > >>>>>>>>>>>> we have a very good reason to do so and with a proper
> > >>> discussion
> > >>>>>> about
> > >>>>>>>>>>> this
> > >>>>>>>>>>>> as Alex said.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>> Gyula
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > >>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi Xintong,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > >>> FLIP-321:
> > >>>>>>>>>>>> Introduce
> > >>>>>>>>>>>>> an API deprecation process" thread [1]?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I am also curious to know if the rationale behind this new
> > >> API
> > >>>>> has
> > >>>>>>>>>> been
> > >>>>>>>>>>>>> previously discussed on the mailing list. Do we have a list
> > >> of
> > >>>>>>>>>>>> shortcomings
> > >>>>>>>>>>>>> in the current DataStream API that it tries to resolve? How
> > >>> does
> > >>>>>> the
> > >>>>>>>>>>>>> current ProcessFunction functionality fit into the picture?
> > >>>>> Will it
> > >>>>>>>>>> be
> > >>>>>>>>>>>> kept
> > >>>>>>>>>>>>> as is or subsumed by new API?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> [1]
> > >>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>> Alex
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > >>>>> tonysong820@gmail.com>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > >> headaches
> > >>>>>>>>>>> because
> > >>>>>>>>>>>>> it's
> > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
> > >>> entirely
> > >>>>>>>>>>>> separate
> > >>>>>>>>>>>>>> API
> > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > >>>>> DataStream.
> > >>>>>>>>>>> How
> > >>>>>>>>>>>>>> much
> > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how does
> > >>> it
> > >>>>>>>>>>> relate
> > >>>>>>>>>>>> to
> > >>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > >>>>>>>>>> underneath).
> > >>>>>>>>>>>>>> I totally understand your confusion. We started planning
> > >> this
> > >>>>>> after
> > >>>>>>>>>>>>> kicking
> > >>>>>>>>>>>>>> off the release 2.0, so there's still a lot to be explored
> > >>> and
> > >>>>> the
> > >>>>>>>>>>> plan
> > >>>>>>>>>>>>>> keeps changing.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     - In the beginning, we planned to do an in-place
> > >> refactor
> > >>> of
> > >>>>>>>>>>>>> DataStream
> > >>>>>>>>>>>>>>     API, until the API migration period is proposed.
> > >>>>>>>>>>>>>>     - Then we want to make it an entirely separate API to
> > >>>>>>>>>> DataStream,
> > >>>>>>>>>>>> and
> > >>>>>>>>>>>>>>     listed as a must-have for release 2.0 so that we can
> > >>> remove
> > >>>>>>>>>>>> DataStream
> > >>>>>>>>>>>>>> once
> > >>>>>>>>>>>>>>     it's ready.
> > >>>>>>>>>>>>>>     - However, depending on the outcome of the API
> > >>> compatibility
> > >>>>>>>>>>>>> discussion
> > >>>>>>>>>>>>>>     [1], we may not be able to remove DataStream in 2.0
> > >>> anyway,
> > >>>>>>>>>> which
> > >>>>>>>>>>>>> means
> > >>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>     might need to re-evaluate the necessity of this item for
> > >>>>> 2.0.
> > >>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility
> > >> discussion
> > >>>>> [1]
> > >>>>>>>>>> and
> > >>>>>>>>>>>>>> decide the priority for this item afterwards.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Xintong
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> [1]
> > >> https://lists.apache.org/list.html?dev@flink.apache.org
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > >>>>>>>>>> chesnay@apache.org
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of items.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State Management"
> > >>>>> item
> > >>>>>>>>>> is
> > >>>>>>>>>>>>> marked
> > >>>>>>>>>>>>>>> as a must-have; will it require changes that break
> > >>> something?
> > >>>>>>>>>> What
> > >>>>>>>>>>>>>> prevents
> > >>>>>>>>>>>>>>> it from being added in 2.1?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make Java 17
> > >> the
> > >>>>>>>>>>> default,
> > >>>>>>>>>>>>> drop
> > >>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop
> > >> Java
> > >>> 8"
> > >>>>>>>>>> and
> > >>>>>>>>>>> a
> > >>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> > >>>>> this
> > >>>>>>>>>>> would
> > >>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>> an entirely internal change, and could thus be an
> > >>> incremental
> > >>>>>>>>>>> process
> > >>>>>>>>>>>>>>> independent of major releases.
> > >>>>>>>>>>>>>>> What is the actual scale of this item; how much are we
> > >>>>> actually
> > >>>>>>>>>>>>>> re-writing?
> > >>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > >>>>>>>>>> must-have; i
> > >>>>>>>>>>>>> think
> > >>>>>>>>>>>>>>> I marked it down as nice-to-have only because it depends
> > >> on
> > >>>>>>>>>> another
> > >>>>>>>>>>>>> item.
> > >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> > >> headaches
> > >>>>>>>>>>> because
> > >>>>>>>>>>>>> it's
> > >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
> > >>> entirely
> > >>>>>>>>>>>> separate
> > >>>>>>>>>>>>>> API
> > >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > >>>>> DataStream.
> > >>>>>>>>>>> How
> > >>>>>>>>>>>>>> much
> > >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how does
> > >>> it
> > >>>>>>>>>>> relate
> > >>>>>>>>>>>> to
> > >>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > >>>>>>>>>> underneath).
> > >>>>>>>>>>>>>>> There are a few items I added as ideas which don't have a
> > >>>>>>>>>> priority
> > >>>>>>>>>>>> yet;
> > >>>>>>>>>>>>>>> would love to get some feedback on those.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Hi devs,
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> As previously discussed in [1], we had been collecting
> > >> work
> > >>>>> item
> > >>>>>>>>>>>>>> proposals
> > >>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>     - As we have passed the due date, I'd like to kindly
> > >>> remind
> > >>>>>>>>>>>> everyone
> > >>>>>>>>>>>>>> *not
> > >>>>>>>>>>>>>>>     to add / remove items directly on the wiki page*. If
> > >>>>> needed,
> > >>>>>>>>>>>> please
> > >>>>>>>>>>>>>> post
> > >>>>>>>>>>>>>>>     in this thread or reach out to the release managers
> > >>>>> instead.
> > >>>>>>>>>>>>>>>     - I've reached out to some folks for clarifications
> > >> about
> > >>>>>>>>>> their
> > >>>>>>>>>>>>>>>     proposals. Some of them mentioned that they can not yet
> > >>>>> tell
> > >>>>>>>>>>>> whether
> > >>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>     should do an item or not, and would need more time /
> > >>>>>>>>>> discussions
> > >>>>>>>>>>>> to
> > >>>>>>>>>>>>>> make
> > >>>>>>>>>>>>>>>     the decision. So I added a new symbol for items whose
> > >>>>>>>>>> priorities
> > >>>>>>>>>>>> are
> > >>>>>>>>>>>>>> `TBD`.
> > >>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > >>>>>>>>>> must-have
> > >>>>>>>>>>>>> items.
> > >>>>>>>>>>>>>>> I've gone through the entire list of proposed items, and
> > >>> found
> > >>>>>>>>>> most
> > >>>>>>>>>>>> of
> > >>>>>>>>>>>>>> them
> > >>>>>>>>>>>>>>> make quite much sense. So I think an online sync might not
> > >>> be
> > >>>>>>>>>>>> necessary
> > >>>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread, where
> > >>> everyone
> > >>>>> can
> > >>>>>>>>>>>>> comment
> > >>>>>>>>>>>>>>> on how they think the list can be improved, followed by a
> > >>>>> VOTE to
> > >>>>>>>>>>>>>> formally
> > >>>>>>>>>>>>>>> make the decision.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Any feedback and opinions, including but not limited to
> > >> the
> > >>>>>>>>>>> following
> > >>>>>>>>>>>>>>> aspects, will be appreciated.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>     - Important items that are missing from the list
> > >>>>>>>>>>>>>>>     - Concerns regarding the listed items or their
> > >> priorities
> > >>>>>>>>>>>>>>> Looking forward to your feedback.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Xintong
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [1]
> > >>
> > https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > >>>>>>>>>>>>>>> [2]
> > >>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>> --
> > >>>>>>>>>> Best regards,
> > >>>>>>>>>> Sergey
> > >>>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>> --
> > >>>>> Best
> > >>>>>
> > >>>>> ConradJam
> > >>>>>
> >
> >

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
>
> At what point are the FLIP discussions coming into play?
>
> I keep wondering if these shouldn't have started already.


I think this depends on the responsible contributor and reviewer of
individual items. From my perspective, the FLIP discussions can start any
time as long as the contributors are ready, the earlier the better.


> What we need to ensure is that all breaking API changes are
> discussed/decided before 1.18 is released so we can deprecate affected APIs.
>

The introduction of the migration period has brought the requirement to
plan the removal of Public APIs 2 minor releases ahead of the major
release, which is TBH a bit unexpected. I agree it would be nice if we can
get the FLIPs ready by the time 1.18 is released. But I also don't think we
should rush it. If the deprecation of a Public API does not make 1.18, we
may carry it until 3.0. Or if there are many Public APIs whose deprecation
does not make 1.18, we may deprecate them in 1.19 and postpone the major
version bump to after a 1.20 release. Moreover, as mentioned in FLIP-321 [1],
exceptions can be discussed, given that the migration period is newly
proposed and we did not give developers the chance to plan ahead. To
sum up, I'd say we try to identify the APIs that need to be deprecated in
1.18 on a best-effort basis, and evaluate the remaining options (carrying
the API for the entire 2.x cycle, postponing 2.0, or making an exception)
case by case.
WDYT?
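
To make the timing constraint above concrete, here is a minimal sketch of the
arithmetic. It assumes, as described in this email rather than as a finalized
rule, that a Public API has to be deprecated for at least 2 minor releases
before the major release that removes it, and that the release introducing
the deprecation counts as the first of those two. The class and method names
are hypothetical and not part of Flink.

```java
/** Hypothetical helper, not part of Flink: models the migration-period arithmetic. */
public final class MigrationPeriod {

    /** Assumption taken from this thread: Public APIs need 2 minor releases of deprecation. */
    public static final int REQUIRED_MINOR_RELEASES = 2;

    private MigrationPeriod() {}

    /**
     * Returns the last 1.x minor version that must ship before a major release
     * may remove an API first deprecated in 1.{deprecatedInMinor}.
     */
    public static int lastMinorBeforeRemoval(int deprecatedInMinor) {
        // The release that introduces the deprecation counts as the first one.
        return deprecatedInMinor + REQUIRED_MINOR_RELEASES - 1;
    }

    /** True if a major bump that directly follows 1.{lastMinor} may remove the API. */
    public static boolean removableInNextMajor(int deprecatedInMinor, int lastMinor) {
        return lastMinor >= lastMinorBeforeRemoval(deprecatedInMinor);
    }

    public static void main(String[] args) {
        // Deprecated in 1.18, major bump after 1.19.
        System.out.println(removableInNextMajor(18, 19));
        // Deprecated in 1.19, major bump still after 1.19.
        System.out.println(removableInNextMajor(19, 19));
        // Deprecated in 1.19, major bump postponed to after 1.20.
        System.out.println(removableInNextMajor(19, 20));
    }
}
```

Under these assumptions, an API deprecated in 1.19 can only be removed if the
major version bump moves to after 1.20; otherwise it has to be carried through
the 2.x cycle or handled as an exception.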

Best,

Xintong


[1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9

On Mon, Jul 10, 2023 at 6:13 PM Chesnay Schepler <ch...@apache.org> wrote:

> At what point are the FLIP discussions coming into play?
>
> I keep wondering if these shouldn't have started already.
> It just seems that a lot of decisions are implicitly reliant on the
> items even being accepted.
> Estimates can only be provided if we actually know the scope of the
> change, but that's not always clear from the description in the doc.
>
> What we need to ensure is that all breaking API changes are
> discussed/decided before 1.18 is released so we can deprecate affected
> APIs.
>
> On 10/07/2023 11:32, Xintong Song wrote:
> > Hi Matthias,
> >
> > The questions you asked are indeed very important. Here're some quick
> > responses, based on the plans I had in mind, which I have not aligned
> > with other release managers yet.
> >
> > In the previous discussions between the RMs, we were not able to make
> > proposals on things like how to make a time plan, how to manage the
> > release branch, etc., due to the lack of input on, e.g., the work items
> > that need to be included (which transitively depends on the API
> > compatibility to provide between major versions) and the workloads /
> > time needed for them. With the recent discussions, we have collected at
> > least the majority of the input needed.
> > Here are things that I think we as the release managers would do next
> > (again, not aligned with other release managers yet):
> > - Create a time plan, by reaching out to people to understand the
> > estimated workloads, prerequisites and ETA of each work item.
> > - Make a proposal on how to manage the release branch, i.e., when to cut
> > the branch and whether to ship the milestone releases, etc.
> > - Set up regular release syncs (bi-weekly / monthly) to update the status
> > and draw attention to where help is needed.
> >
> > So back to your questions.
> >
> >> There are still to-be-discussed items in the list of features. What's the
> >> plan with those?
> > When collecting ETAs, for items whose completion time cannot yet be
> > estimated, we would like to have at least a time by which the estimation
> > can be made. I think the same applies to the to-be-discussed items. And
> > if the items should be included as must-haves, we would need another
> > vote to adjust the must-have item list.
> >
> >> Some of them don't have anyone assigned.
> >> My concern is that they will be overlooked because nobody feels in
> >> charge.
> > This is a tricky one. For must-have items without assignees, we as the
> > release managers should be responsible for raising them in the release
> > syncs, and try to find assignees for them. Hopefully, there will be
> > someone who steps up. But it is possible that for a must-have item nobody
> > wants to work on it. If that happens, which I don't think it will, it
> > probably means the item is not that critical and we may have to exclude
> > it from the release. Either way, they should not be overlooked, because
> > IMHO release managers should be responsible for trying to get someone to
> > work on the un-assigned items.
> >
> > We'll have more discussions soon and keep the community updated.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> > <ma...@aiven.io.invalid> wrote:
> >
> >> Now that the vote is started on the must-have items: There are still
> >> to-be-discussed items in the list of features. What's the plan with
> >> those? Some of them don't have anyone assigned. Were these items
> >> discussed among the release managers? So far, it looks like they are
> >> handled as nice-to-have if someone volunteers to pick them up?
> >>
> >> My concern is that they will be overlooked because nobody feels in
> >> charge.
> >>
> >> Best,
> >> Matthias
> >>
> >> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <to...@gmail.com>
> >> wrote:
> >>
> >>> Thanks all for the discussion.
> >>>
> >>> The wiki has been updated as discussed. I'm starting a vote now.
> >>>
> >>> Best,
> >>>
> >>> Xintong
> >>>
> >>>
> >>>
> >>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <to...@gmail.com>
> >> wrote:
> >>>> Hi ConradJam,
> >>>>
> >>>> I think Chesnay has already put his name as the Contributor for the
> >>>> two tasks you listed. Maybe you can reach out to him to see if you can
> >>>> collaborate on this.
> >>>>
> >>>> In general, I don't think contributing to a release 2.0 issue is much
> >>>> different from contributing to a regular issue. We haven't yet created
> >>>> JIRA tickets for all the listed tasks because many of them need further
> >>>> discussions and / or FLIPs to decide whether and how they should be
> >>>> performed.
> >>>>
> >>>> Best,
> >>>>
> >>>> Xintong
> >>>>
> >>>>
> >>>>
> >>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com>
> wrote:
> >>>>
> >>>>> Hi Community:
> >>>>>    I see some tasks in the 2.0 list that haven't been assigned yet. I
> >>>>> want to take the initiative to take on some tasks that I can complete.
> >>>>> How do I apply to the community for this part of the task? I am
> >>>>> interested in the following parts of FLINK-32377
> >>>>> <https://issues.apache.org/jira/browse/FLINK-32377>, do I need to
> >>>>> create the issue myself and assign it to myself?
> >>>>>
> >>>>> - the current timestamp, which is problematic w.r.t. caching and
> >>> testing,
> >>>>> while providing no value.
> >>>>> - Remove JarRequestBody#programArgs in favor of #programArgsList.
> >>>>>
> >>>>> [1] https://issues.apache.org/jira/browse/FLINK-32377
> >>>>>
> >>>>> Teoh, Hong <li...@amazon.co.uk.invalid> wrote on Fri, Jun 30, 2023 at 00:53:
> >>>>>
> >>>>>> Thanks Xintong for driving the effort.
> >>>>>>
> >>>>>> I’d add a +1 to reworking configs, as suggested by @Jark and
> >> @Chesnay,
> >>>>>> especially the types. We have various configs that encode Time /
> >>>>> MemorySize
> >>>>>> that are Long instead!
> >>>>>>
> >>>>>> Regards,
> >>>>>> Hong
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com>
> >> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks for driving this effort, Xintong!
> >>>>>>>
> >>>>>>> To Chesnay
> >>>>>>>> I'm curious as to why the "Disaggregated State Management" item
> >> is
> >>>>>>>> marked as a must-have; will it require changes that break
> >>> something?
> >>>>>>>> What prevents it from being added in 2.1?
> >>>>>>> As to "Disaggregated State Management".
> >>>>>>>
> >>>>>>> We plan to provide a new type of state backend to support DFS as
> >>>>>>> primary storage.
> >>>>>>> To achieve this, we need at least two kinds of changes (not
> >>>>>>> entirely sure yet, since we are still in the design and
> >>>>>>> prototyping phase):
> >>>>>>> 1. State backend change
> >>>>>>> 2. State access change
> >>>>>>>
> >>>>>>> Not all of the related interfaces are `@Internal`. Some of them,
> >>>>>>> like `StateBackend`, are `@PublicEvolving`.
> >>>>>>> So, you are right in the sense that "Disaggregated State
> >>>>>>> Management" itself probably does not need to be a "Must Have".
> >>>>>>>
> >>>>>>> But I was hoping that the changes related to public APIs can be
> >>>>>>> finalized and merged in Flink 2.0 (I will fix the wiki accordingly).
> >>>>>>>
> >>>>>>> I also agree with Jark that 2.0 is a good chance to rework the
> >>> default
> >>>>>>> value of configurations.
> >>>>>>>
> >>>>>>> Best
> >>>>>>> Yuan
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> >>> chesnay@apache.org>
> >>>>>> wrote:
> >>>>>>>> Something else configuration-related is that there are a bunch of
> >>>>>>>> options where the type isn't quite correct (e.g., a String where it
> >>>>>>>> could be an enum, a String where it should be an int or something).
> >>>>>>>> Could do a pass over those as well.
> >>>>>>>>
> >>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I think one more thing we need to consider doing in 2.0 is
> >>>>>>>>> changing the default values of configuration options to improve
> >>>>>>>>> the out-of-box user experience.
> >>>>>>>>> Currently, in order to run a Flink job, users may need to set
> >>>>>>>>> a bunch of configurations, such as minibatch, checkpoint interval,
> >>>>>>>>> exactly-once, incremental-checkpoint, etc. It's very verbose and
> >>>>>>>>> hard to use for beginners.
> >>>>>>>>> Most of them can have a universally applicable value. Because
> >>>>>>>>> changing a default value is a breaking change, I think it's worth
> >>>>>>>>> considering changing them in 2.0.
> >>>>>>>>>
> >>>>>>>>> What do you think?
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Jark
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> >>> snuyanzin@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>>> Hi Chesnay
> >>>>>>>>>>
> >>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> >> this
> >>>>> would
> >>>>>>>> be
> >>>>>>>>>>> an entirely internal change, and could thus be an incremental
> >>>>> process
> >>>>>>>>>>> independent of major releases.
> >>>>>>>>>>> What is the actual scale of this item; how much are we
> >> actually
> >>>>>>>>>> re-writing?
> >>>>>>>>>>
> >>>>>>>>>> Thanks for asking.
> >>>>>>>>>> Yes, you're right, that should be an internal change.
> >>>>>>>>>> Yeah, I was also thinking about an incremental change (rule by
> >>>>>>>>>> rule, or in reasonably small groups of rules).
> >>>>>>>>>> And yes, this could be an activity independent of major releases.
> >>>>>>>>>> The problem is actually with the children of RelOptRule.
> >>>>>>>>>> Currently I see 60+ such rules (in Scala) using the mentioned
> >>>>>>>>>> deprecated API.
> >>>>>>>>>> There are also children of ConverterRule (50+) which do not have
> >>>>>>>>>> such issues.
> >>>>>>>>>> Maybe it could be considered as the next step to have all the
> >>>>>>>>>> rules in Java.
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> >>>>> tonysong820@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Alex & Gyula,
> >>>>>>>>>>>
> >>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> >> FLIP-321:
> >>>>>>>>>> Introduce
> >>>>>>>>>>>> an API deprecation process" thread [1]?
> >>>>>>>>>>>>
> >>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted
> >>> the
> >>>>>> wrong
> >>>>>>>>>> url
> >>>>>>>>>>> in my previous email. Sorry for the mistake.
> >>>>>>>>>>>
> >>>>>>>>>>> I am also curious to know if the rationale behind this new API
> >>> has
> >>>>>> been
> >>>>>>>>>>>> previously discussed on the mailing list. Do we have a list
> >> of
> >>>>>>>>>>> shortcomings
> >>>>>>>>>>>> in the current DataStream API that it tries to resolve? How
> >>> does
> >>>>> the
> >>>>>>>>>>>> current ProcessFunction functionality fit into the picture?
> >>> Will
> >>>>> it
> >>>>>> be
> >>>>>>>>>>> kept
> >>>>>>>>>>>> as is or subsumed by new API?
> >>>>>>>>>>>>
> >>>>>>>>>>> I don't think we should create a replacement for the
> >> DataStream
> >>>>> API
> >>>>>>>>>> unless
> >>>>>>>>>>>> we have a very good reason to do so and with a proper
> >>> discussion
> >>>>>> about
> >>>>>>>>>>> this
> >>>>>>>>>>>> as Alex said.
> >>>>>>>>>>> The ProcessFunction API which is intended to replace the
> >> DataStream
> >>>>> API
> >>>>>> is
> >>>>>>>>>>> still a proposal, not a decision. Sorry for the confusion, I
> >>>>> should
> >>>>>>>> have
> >>>>>>>>>>> been more careful with my words, not giving the impression
> >> that
> >>>>> this
> >>>>>> is
> >>>>>>>>>>> something we'll do anyway.
> >>>>>>>>>>>
> >>>>>>>>>>> There will be a FLIP describing the motivations and designs in
> >>>>>> detail,
> >>>>>>>>>> for
> >>>>>>>>>>> the community to discuss and vote on. We are still working on
> >>> it.
> >>>>>> TBH,
> >>>>>>>>>> this
> >>>>>>>>>>> is not trivial and we would need more time on it.
> >>>>>>>>>>>
> >>>>>>>>>>> Just to quickly share some background:
> >>>>>>>>>>>
> >>>>>>>>>>>     - We see quite a few problems with the current DataStream
> >> APIs
> >>>>>>>>>>>        - Users are working with concrete classes rather than
> >>>>>>>> interfaces,
> >>>>>>>>>>>        which means
> >>>>>>>>>>>        - Users can access methods that are designed to be used
> >> by
> >>>>>>>> internal
> >>>>>>>>>>>           classes, even though they are annotated with
> >>> `@Internal`.
> >>>>>>>> E.g.,
> >>>>>>>>>>>           `DataStream#getTransformation`.
> >>>>>>>>>>>           - Changes to the non-API implementations (e.g.,
> >>>>>>>>>> `Transformation`)
> >>>>>>>>>>>           would affect the API classes (e.g., `DataStream`),
> >>> which
> >>>>>>>>>>> makes it hard to
> >>>>>>>>>>>           provide binary compatibility.
> >>>>>>>>>>>        - Internal classes are used as parameter / return-value
> >> of
> >>>>>>>> public
> >>>>>>>>>>>        APIs. E.g., while `AbstractStreamOperator` is
> >>>>> PublicEvolving,
> >>>>>>>>>>> `StreamTask`
> >>>>>>>>>>>        which is returned from
> >>>>> `AbstractStreamOperator#getContainingTask`
> >>>>>> is
> >>>>>>>>>>> Internal.
> >>>>>>>>>>>        - In many cases, users are asked to extend the API
> >>> classes,
> >>>>>>>> rather
> >>>>>>>>>>>        than implementing interfaces. E.g.,
> >>>>> `AbstractStreamOperator`.
> >>>>>>>>>>>           - Any changes to the base classes, even the internal
> >>>>> part,
> >>>>>>>> may
> >>>>>>>>>>>           affect the behavior of the user-provided sub-classes
> >>>>>>>>>>>           - Users can override the behavior of the base classes
> >>>>>>>>>>>        - The API module `flink-streaming-java` contains non-API
> >>>>>>>> classes,
> >>>>>>>>>> and
> >>>>>>>>>>>        depends on internal modules such as `flink-runtime`,
> >> which
> >>>>>> means
> >>>>>>>>>>>        - Changes to the internal modules may affect the API
> >>>>> modules,
> >>>>>>>> which
> >>>>>>>>>>>           requires users to re-build their applications upon
> >>>>> upgrading
> >>>>>>>>>>>           - The artifact users need for building their application is
> >>>>>>>>>>>           larger than necessary.
> >>>>>>>>>>>        - We probably should not expose operators (e.g.,
> >>>>>>>>>>>        `AbstractStreamOperator`) to users. Functions should be
> >>>>> enough
> >>>>>>>>>>> for users to
> >>>>>>>>>>>        define their data processing logic. Exposing
> >>> operator-level
> >>>>>>>>>> concepts
> >>>>>>>>>>>        (e.g., mailbox thread model, checkpoint barrier
> >> alignment,
> >>>>>>>> etc.) is
> >>>>>>>>>>>        unnecessary and limits the improvement regarding such
> >>>>> exposed
> >>>>>>>>>>> mechanisms
> >>>>>>>>>>>        with compatibility considerations.
> >>>>>>>>>>>        - The current DataStream API seems to be a mixture of
> >> many
> >>>>>>>> things,
> >>>>>>>>>>>        making it hard to understand especially for newcomers.
> >> It
> >>>>> might
> >>>>>>>> be
> >>>>>>>>>>> better
> >>>>>>>>>>>        to re-organize it into several parts (the taxonomy below is just an
> >>>>>>>>>>>        example; we are still working on this):
> >>>>>>>>>>>           - The most fundamental stateful stream processing:
> >>>>> streams,
> >>>>>>>>>>>           partitions / key, process functions, state,
> >>>>> timeline-service
> >>>>>>>>>>>           - An extension for common batch-streaming unified
> >>>>> functions:
> >>>>>>>>>> map,
> >>>>>>>>>>>           flatmap, filter, agg, reduce, join, etc.
> >>>>>>>>>>>           - An extension for windowing supports:  window,
> >>>>> triggering
> >>>>>>>>>>>           - An extension for event-time supports: event time,
> >>>>>> watermark
> >>>>>>>>>>>           - The extensions are like short-cuts / sugars,
> >> without
> >>>>> which
> >>>>>>>>>> users
> >>>>>>>>>>>           can probably still achieve the same behavior by
> >> working
> >>>>> with
> >>>>>>>> the
> >>>>>>>>>>>           fundamental APIs, but would be a lot easier with the
> >>>>>>>> extensions
> >>>>>>>>>>>        - The original plan was to do in-place refactors /
> >> changes
> >>>>> on
> >>>>>>>>>>>     DataStream API. Some related items are listed in this doc
> >> [2]
> >>>>>>>> attached
> >>>>>>>>>>> to
> >>>>>>>>>>>     the kicking off email [3]. Not all of the above issues are
> >>>>> listed,
> >>>>>>>>>>> because
> >>>>>>>>>>>     we had not looked into this as deeply at that time as we have now.
> >>>>>>>>>>>     - We proposed this as a new API rather than in-place
> >>> refactors
> >>>>> in
> >>>>>>>> the
> >>>>>>>>>>>     2.0 work item list, because we realized the changes might
> >> be
> >>>>> too
> >>>>>>>> big
> >>>>>>>>>>> for an
> >>>>>>>>>>>     in-place change. First having a new API then gradually
> >>> retiring
> >>>>>> the
> >>>>>>>>>> old
> >>>>>>>>>>> one
> >>>>>>>>>>>     would help users to smoothly migrate between them.
> >>>>>>>>>>>
> >>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is
> >> out.
> >>>>> And
> >>>>>> of
> >>>>>>>>>>> course it's possible that the FLIP might be rejected. Given
> >> that
> >>>>> we
> >>>>>> are
> >>>>>>>>>>> planning for release 2.0, I just feel it would be better to
> >>> bring
> >>>>>> this
> >>>>>>>> up
> >>>>>>>>>>> early, even though the concrete plan is not yet ready.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>>
> >>>>>>>>>>> Xintong
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> [1]
> >>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>>>>> [2]
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> >>>>>>>>>>> [3]
> >>>>> https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> >>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gyfora@apache.org
> >>>>>> wrote:
> >>>>>>>>>>>> Hey!
> >>>>>>>>>>>>
> >>>>>>>>>>>> I share the same concerns mentioned above regarding the
> >>>>>>>>>> "ProcessFunction
> >>>>>>>>>>>> API".
> >>>>>>>>>>>>
> >>>>>>>>>>>> I don't think we should create a replacement for the
> >> DataStream
> >>>>> API
> >>>>>>>>>>> unless
> >>>>>>>>>>>> we have a very good reason to do so and with a proper
> >>> discussion
> >>>>>> about
> >>>>>>>>>>> this
> >>>>>>>>>>>> as Alex said.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> Gyula
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> >>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Xintong,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> >>> FLIP-321:
> >>>>>>>>>>>> Introduce
> >>>>>>>>>>>>> an API deprecation process" thread [1]?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I am also curious to know if the rationale behind this new
> >> API
> >>>>> has
> >>>>>>>>>> been
> >>>>>>>>>>>>> previously discussed on the mailing list. Do we have a list
> >> of
> >>>>>>>>>>>> shortcomings
> >>>>>>>>>>>>> in the current DataStream API that it tries to resolve? How
> >>> does
> >>>>>> the
> >>>>>>>>>>>>> current ProcessFunction functionality fit into the picture?
> >>>>> Will it
> >>>>>>>>>> be
> >>>>>>>>>>>> kept
> >>>>>>>>>>>>> as is or subsumed by new API?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [1]
> >>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Alex
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> >>>>> tonysong820@gmail.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> >> headaches
> >>>>>>>>>>> because
> >>>>>>>>>>>>> it's
> >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
> >>> entirely
> >>>>>>>>>>>> separate
> >>>>>>>>>>>>>> API
> >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
> >>>>> DataStream.
> >>>>>>>>>>> How
> >>>>>>>>>>>>>> much
> >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how does
> >>> it
> >>>>>>>>>>> relate
> >>>>>>>>>>>> to
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >>>>>>>>>> underneath).
> >>>>>>>>>>>>>> I totally understand your confusion. We started planning
> >> this
> >>>>>> after
> >>>>>>>>>>>>> kicking
> >>>>>>>>>>>>>> off the release 2.0, so there's still a lot to be explored
> >>> and
> >>>>> the
> >>>>>>>>>>> plan
> >>>>>>>>>>>>>> keeps changing.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     - In the beginning, we planned to do an in-place
> >> refactor
> >>> of
> >>>>>>>>>>>>> DataStream
> >>>>>>>>>>>>>>     API, until the API migration period was proposed.
> >>>>>>>>>>>>>>     - Then we wanted to make it an entirely separate API to
> >>>>>>>>>> DataStream,
> >>>>>>>>>>>> and
> >>>>>>>>>>>>>>     listed it as a must-have for release 2.0 so that we can
> >>> remove
> >>>>>>>>>>>> DataStream
> >>>>>>>>>>>>>> once
> >>>>>>>>>>>>>>     it's ready.
> >>>>>>>>>>>>>>     - However, depending on the outcome of the API
> >>> compatibility
> >>>>>>>>>>>>> discussion
> >>>>>>>>>>>>>>     [1], we may not be able to remove DataStream in 2.0
> >>> anyway,
> >>>>>>>>>> which
> >>>>>>>>>>>>> means
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>     might need to re-evaluate the necessity of this item for
> >>>>> 2.0.
> >>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility
> >> discussion
> >>>>> [1]
> >>>>>>>>>> and
> >>>>>>>>>>>>>> decide the priority for this item afterwards.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Xintong
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [1]
> >> https://lists.apache.org/list.html?dev@flink.apache.org
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> >>>>>>>>>> chesnay@apache.org
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of items.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State Management"
> >>>>> item
> >>>>>>>>>> is
> >>>>>>>>>>>>> marked
> >>>>>>>>>>>>>>> as a must-have; will it require changes that break
> >>> something?
> >>>>>>>>>> What
> >>>>>>>>>>>>>> prevents
> >>>>>>>>>>>>>>> it from being added in 2.1?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make Java 17
> >> the
> >>>>>>>>>>> default,
> >>>>>>>>>>>>> drop
> >>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop
> >> Java
> >>> 8"
> >>>>>>>>>> and
> >>>>>>>>>>> a
> >>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> >>>>> this
> >>>>>>>>>>> would
> >>>>>>>>>>>>> be
> >>>>>>>>>>>>>>> an entirely internal change, and could thus be an
> >>> incremental
> >>>>>>>>>>> process
> >>>>>>>>>>>>>>> independent of major releases.
> >>>>>>>>>>>>>>> What is the actual scale of this item; how much are we
> >>>>> actually
> >>>>>>>>>>>>>> re-writing?
> >>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> >>>>>>>>>> must-have; I
> >>>>>>>>>>>>> think
> >>>>>>>>>>>>>>> I marked it down as nice-to-have only because it depends
> >> on
> >>>>>>>>>> another
> >>>>>>>>>>>>> item.
> >>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
> >> headaches
> >>>>>>>>>>> because
> >>>>>>>>>>>>> it's
> >>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
> >>> entirely
> >>>>>>>>>>>> separate
> >>>>>>>>>>>>>> API
> >>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
> >>>>> DataStream.
> >>>>>>>>>>> How
> >>>>>>>>>>>>>> much
> >>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how does
> >>> it
> >>>>>>>>>>> relate
> >>>>>>>>>>>> to
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >>>>>>>>>> underneath).
> >>>>>>>>>>>>>>> There are a few items I added as ideas which don't have a
> >>>>>>>>>> priority
> >>>>>>>>>>>> yet;
> >>>>>>>>>>>>>>> would love to get some feedback on those.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi devs,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> As previously discussed in [1], we had been collecting
> >> work
> >>>>> item
> >>>>>>>>>>>>>> proposals
> >>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>     - As we have passed the due date, I'd like to kindly
> >>> remind
> >>>>>>>>>>>> everyone
> >>>>>>>>>>>>>> *not
> >>>>>>>>>>>>>>>     to add / remove items directly on the wiki page*. If
> >>>>> needed,
> >>>>>>>>>>>> please
> >>>>>>>>>>>>>> post
> >>>>>>>>>>>>>>>     in this thread or reach out to the release managers
> >>>>> instead.
> >>>>>>>>>>>>>>>     - I've reached out to some folks for clarifications
> >> about
> >>>>>>>>>> their
> >>>>>>>>>>>>>>>     proposals. Some of them mentioned that they can not yet
> >>>>> tell
> >>>>>>>>>>>> whether
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>     should do an item or not, and would need more time /
> >>>>>>>>>> discussions
> >>>>>>>>>>>> to
> >>>>>>>>>>>>>> make
> >>>>>>>>>>>>>>>     the decision. So I added a new symbol for items whose
> >>>>>>>>>> priorities
> >>>>>>>>>>>> are
> >>>>>>>>>>>>>> `TBD`.
> >>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum set of
> >>>>>>>>>> must-have
> >>>>>>>>>>>>> items.
> >>>>>>>>>>>>>>> I've gone through the entire list of proposed items, and
> >>> found
> >>>>>>>>>> most
> >>>>>>>>>>>> of
> >>>>>>>>>>>>>> them
> >>>>>>>>>>>>>>> make a lot of sense. So I think an online sync might not
> >>> be
> >>>>>>>>>>>> necessary
> >>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread, where
> >>> everyone
> >>>>> can
> >>>>>>>>>>>>> comment
> >>>>>>>>>>>>>>> on how they think the list can be improved, followed by a
> >>>>> VOTE to
> >>>>>>>>>>>>>> formally
> >>>>>>>>>>>>>>> make the decision.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Any feedback and opinions, including but not limited to
> >> the
> >>>>>>>>>>> following
> >>>>>>>>>>>>>>> aspects, will be appreciated.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>     - Important items that are missing from the list
> >>>>>>>>>>>>>>>     - Concerns regarding the listed items or their
> >> priorities
> >>>>>>>>>>>>>>> Looking forward to your feedback.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Xintong
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> [1]
> >>
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> >>>>>>>>>>>>>>> [2]
> >>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> >>>>>>>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Best regards,
> >>>>>>>>>> Sergey
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>> --
> >>>>> Best
> >>>>>
> >>>>> ConradJam
> >>>>>
>
>
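
One configuration point raised in the thread above is that some options are
typed as a String or Long where an enum, a duration or a memory size would
fit better. The following is only a minimal, JDK-only sketch of that idea.
The class `TypedOptionSketch`, the nested `Option` type, the `Mode` enum and
the `sketch.*` keys are all hypothetical names made up for illustration; they
are not Flink APIs and not what the thread proposes to implement.

```java
import java.time.Duration;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, JDK-only sketch of typed configuration options.
// None of these names are Flink APIs.
final class TypedOptionSketch {

    /** An option that knows its key, its default, and how to parse the raw string. */
    static final class Option<T> {
        final String key;
        final T defaultValue;
        final Function<String, T> parser;

        Option(String key, T defaultValue, Function<String, T> parser) {
            this.key = key;
            this.defaultValue = defaultValue;
            this.parser = parser;
        }

        /** Returns the parsed value, or the default when the key is absent. */
        T read(Map<String, String> conf) {
            String raw = conf.get(key);
            return raw == null ? defaultValue : parser.apply(raw.trim());
        }
    }

    enum Mode { EXACTLY_ONCE, AT_LEAST_ONCE }

    // Typed as Duration instead of a bare long, so the unit is part of the value.
    // Values use the ISO-8601 form accepted by Duration.parse, e.g. "PT30S".
    static final Option<Duration> INTERVAL =
            new Option<>("sketch.checkpoint.interval", Duration.ofMinutes(1), Duration::parse);

    // Typed as an enum instead of a free-form String; unknown values throw on read.
    static final Option<Mode> MODE =
            new Option<>(
                    "sketch.checkpoint.mode",
                    Mode.EXACTLY_ONCE,
                    s -> Mode.valueOf(s.toUpperCase(Locale.ROOT)));

    public static void main(String[] args) {
        Map<String, String> conf = Map.of(INTERVAL.key, "PT30S");
        System.out.println(INTERVAL.read(conf).toMillis()); // 30000
        System.out.println(MODE.read(conf)); // EXACTLY_ONCE (the default)
    }
}
```

With a typed option, a malformed value fails when it is read rather than
somewhere deep in the job, and the default is visible next to the key, which
is also the place where a changed default for 2.0 would show up.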

Re: [DISCUSS] Release 2.0 Work Items

Posted by Chesnay Schepler <ch...@apache.org>.
At what point are the FLIP discussions coming into play?

I keep wondering if these shouldn't have started already.
It just seems that a lot of decisions are implicitly reliant on the 
items even being accepted.
Estimates can only be provided if we actually know the scope of the 
change, but that's not always clear from the description in the doc.

What we need to ensure is that all breaking API changes are 
discussed/decided before 1.18 is released so we can deprecate affected APIs.

On 10/07/2023 11:32, Xintong Song wrote:
> Hi Matthias,
>
> The questions you asked are indeed very important. Here're some quick
> responses, based on the plans I had in mind, which I have not aligned with
> other release managers yet.
>
> In the previous discussions between the RMs, we were not able to make
> proposals on things like how to make a time plan, how to manage the release
> branch, etc., due to the lack of input on, e.g., the work items that need to be
> included (which transitively depends on the API compatibility to provide
> between major versions) and the workloads / time needed for them. With the
> recent discussions, we have collected at least the majority of the inputs
> needed.
>
> Here are things that I think we as the release managers would do next
> (again, not aligned with other release managers yet)
> - Creating a time plan, by reaching out to people to understand the
> estimated workloads, prerequisites and ETA of each work item.
> - Make a proposal on how to manage the release branch, i.e., when to cut
> the branch and whether to ship the milestone releases, etc.
> - Set-up regular release syncs (bi-weekly / monthly) to update the status
> and draw attention to where help is needed.
>
> So back to your questions.
>
>> There are still to-be-discussed items in the list of features. What's the
>> plan with those?
> When collecting ETAs, for items whose completion time cannot yet be
> estimated, we would like to have at least a time by which the estimation
> can be made. I think the same applies to the to-be-discussed items. And if
> the items should be included as must-haves, we would need another vote to
> adjust the must-have item list.
>
>> Some of them don't have anyone assigned.
>> My concern is that they will be overlooked because nobody feels to be in
>> charge.
> This is a tricky one. For must-have items without assignees, we as the
> release managers should be responsible for raising them up in the release
> syncs, and try to find assignees for them. Hopefully, there will be someone
> who steps up. But it is possible that for a must-have item nobody wants
> to work on it. If that happens, which I don't think it will, it probably
> means the item is not that critical and we may have to exclude it from the
> release. Either way, they should not be overlooked, because IMHO release
> managers should be responsible for trying to get someone to work on the
> un-assigned items.
>
> We'll have more discussions soon and keep the community updated.
>
> Best,
>
> Xintong
>
>
>
> On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
> <ma...@aiven.io.invalid> wrote:
>
>> Now that the vote is started on the must-have items: There are still
>> to-be-discussed items in the list of features. What's the plan with those?
>> Some of them don't have anyone assigned. Were these items discussed among
>> the release managers? So far, it looks like they are handled as
>> nice-to-have if someone volunteers to pick them up?
>>
>> My concern is that they will be overlooked because nobody feels to be in
>> charge.
>>
>> Best,
>> Matthias
>>
>> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <to...@gmail.com>
>> wrote:
>>
>>> Thanks all for the discussion.
>>>
>>> The wiki has been updated as discussed. I'm starting a vote now.
>>>
>>> Best,
>>>
>>> Xintong
>>>
>>>
>>>
>>> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <to...@gmail.com>
>> wrote:
>>>> Hi ConradJam,
>>>>
>>>> I think Chesnay has already put his name as the Contributor for the two
>>>> tasks you listed. Maybe you can reach out to him to see if you can
>>>> collaborate on this.
>>>>
>>>> In general, I don't think contributing to a release 2.0 issue is much
>>>> different from contributing to a regular issue. We haven't yet created
>>> JIRA
>>>> tickets for all the listed tasks because many of them need further
>>>> discussions and / or FLIPs to decide whether and how they should be
>>>> performed.
>>>>
>>>> Best,
>>>>
>>>> Xintong
>>>>
>>>>
>>>>
>>>> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com> wrote:
>>>>
>>>>> Hi Community:
>>>>>    I see some tasks in the 2.0 list that haven't been assigned yet. I
>>> want
>>>>> to take the initiative to take on some tasks that I can complete. How
>>> do I
>>>>> apply to the community for this part of the task? I am interested in
>> the
>>>>> following parts of FLINK-32377
>>>>> <https://issues.apache.org/jira/browse/FLINK-32377>, do I need to
>>> create
>>>>> an issue myself and assign it to myself?
>>>>>
>>>>> - the current timestamp, which is problematic w.r.t. caching and
>>> testing,
>>>>> while providing no value.
>>>>> - Remove JarRequestBody#programArgs in favor of #programArgsList.
>>>>>
>>>>> [1] FLINK-32377 <https://issues.apache.org/jira/browse/FLINK-32377>
>>>>>
>>>>> Teoh, Hong <li...@amazon.co.uk.invalid> wrote on Fri, Jun 30, 2023 at 00:53:
>>>>>
>>>>>> Thanks Xintong for driving the effort.
>>>>>>
>>>>>> I’d add a +1 to reworking configs, as suggested by @Jark and
>> @Chesnay,
>>>>>> especially the types. We have various configs that encode Time /
>>>>> MemorySize
>>>>>> that are Long instead!
>>>>>>
>>>>>> Regards,
>>>>>> Hong
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com>
>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Thanks for driving this effort, Xintong!
>>>>>>>
>>>>>>> To Chesnay
>>>>>>>> I'm curious as to why the "Disaggregated State Management" item
>> is
>>>>>>>> marked as a must-have; will it require changes that break
>>> something?
>>>>>>>> What prevents it from being added in 2.1?
>>>>>>> As to "Disaggregated State Management".
>>>>>>>
>>>>>>> We plan to provide a new type of state backend to support DFS as
>>>>> primary
>>>>>>> storage.
>>>>>>> To achieve this, we need to include at least two sets of changes
>>> (not
>>>>>>> entirely sure yet, since we are still in the designing and
>> prototype
>>>>>> phase)
>>>>>>> 1. Statebackend Change
>>>>>>> 2. State Access Change
>>>>>>>
>>>>>>> Not all of the interfaces related are `@Internal`. Some of the
>>>>> interfaces
>>>>>>> like `StateBackend` are `@PublicEvolving`.
>>>>>>> So, you are right in the sense that "Disaggregated State
>> Management"
>>>>>> itself
>>>>>>> probably does not need to be a "Must Have"
>>>>>>>
>>>>>>> But I was hoping changes related to public APIs can be
>>> finalized
>>>>> and
>>>>>>> merged in Flink 2.0 (I will fix the wiki accordingly).
>>>>>>>
>>>>>>> I also agree with Jark that 2.0 is a good chance to rework the
>>> default
>>>>>>> value of configurations.
>>>>>>>
>>>>>>> Best
>>>>>>> Yuan
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
>>> chesnay@apache.org>
>>>>>> wrote:
>>>>>>>> Something else configuration-related is that there are a bunch of
>>>>>>>> options where the type isn't quite correct (e.g., a String where
>> it
>>>>>>>> could be an enum, a string where it should be an int or
>> something).
>>>>>>>> Could do a pass over those as well.
>>>>>>>>
>>>>>>>> On 29/06/2023 13:50, Jark Wu wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I think one more thing we need to consider doing in 2.0 is
>>> changing
>>>>> the
>>>>>>>>> default value of configuration to improve out-of-box user
>>>>> experience.
>>>>>>>>> Currently, in order to run a Flink job, users may need to set
>>>>>>>>> a bunch of configurations, such as minibatch, checkpoint
>> interval,
>>>>>>>>> exactly-once,
>>>>>>>>> incremental-checkpoint, etc. It's very verbose and hard to use
>> for
>>>>>>>>> beginners.
>>>>>>>>> Most of them can have a universally applicable value.  Because
>>>>> changing
>>>>>>>> the
>>>>>>>>> default value is a breaking change, I think it's worth
>> considering
>>>>>>>> changing
>>>>>>>>> them in 2.0.
>>>>>>>>>
>>>>>>>>> What do you think?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Jark
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
>>> snuyanzin@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>> Hi Chesnay
>>>>>>>>>>
>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
>> this
>>>>> would
>>>>>>>> be
>>>>>>>>>>> an entirely internal change, and could thus be an incremental
>>>>> process
>>>>>>>>>>> independent of major releases.
>>>>>>>>>>> What is the actual scale of this item; how much are we
>> actually
>>>>>>>>>> re-writing?
>>>>>>>>>>
>>>>>>>>>> Thanks for asking
>>>>>>>>>> yes, you're right, that should be an internal change.
>>>>>>>>>> Yeah I was also thinking about incremental change (rule by rule
>>> or
>>>>>>>>>> reasonably small group of rules).
>>>>>>>>>> And yes, this could be an independent (of major releases)
>> activity
>>>>>>>>>> The problem is actually for children of RelOptRule.
>>>>>>>>>> Currently I see 60+ such rules (in Scala) using the mentioned
>>>>>> deprecated
>>>>>>>>>> api.
>>>>>>>>>> There are also children of ConverterRule (50+) which do not
>> have
>>>>> such
>>>>>>>>>> issues.
>>>>>>>>>> Maybe it could be considered as the next step to have all the
>>>>> rules in
>>>>>>>>>> Java.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
>>>>> tonysong820@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Alex & Gyula,
>>>>>>>>>>>
>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
>> FLIP-321:
>>>>>>>>>> Introduce
>>>>>>>>>>>> an API deprecation process" thread [1]?
>>>>>>>>>>>>
>>>>>>>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted
>>> the
>>>>>> wrong
>>>>>>>>>> url
>>>>>>>>>>> in my previous email. Sorry for the mistake.
>>>>>>>>>>>
>>>>>>>>>>> I am also curious to know if the rationale behind this new API
>>> has
>>>>>> been
>>>>>>>>>>>> previously discussed on the mailing list. Do we have a list
>> of
>>>>>>>>>>> shortcomings
>>>>>>>>>>>> in the current DataStream API that it tries to resolve? How
>>> does
>>>>> the
>>>>>>>>>>>> current ProcessFunction functionality fit into the picture?
>>> Will
>>>>> it
>>>>>> be
>>>>>>>>>>> kept
>>>>>>>>>>>> as is or subsumed by new API?
>>>>>>>>>>>>
>>>>>>>>>>> I don't think we should create a replacement for the
>> DataStream
>>>>> API
>>>>>>>>>> unless
>>>>>>>>>>>> we have a very good reason to do so and with a proper
>>> discussion
>>>>>> about
>>>>>>>>>>> this
>>>>>>>>>>>> as Alex said.
>>>>>>>>>>> The ProcessFunction API which is intended to replace the
>> DataStream
>>>>> API
>>>>>> is
>>>>>>>>>>> still a proposal, not a decision. Sorry for the confusion, I
>>>>> should
>>>>>>>> have
>>>>>>>>>>> been more careful with my words, not giving the impression
>> that
>>>>> this
>>>>>> is
>>>>>>>>>>> something we'll do anyway.
>>>>>>>>>>>
>>>>>>>>>>> There will be a FLIP describing the motivations and designs in
>>>>>> detail,
>>>>>>>>>> for
>>>>>>>>>>> the community to discuss and vote on. We are still working on
>>> it.
>>>>>> TBH,
>>>>>>>>>> this
>>>>>>>>>>> is not trivial and we would need more time on it.
>>>>>>>>>>>
>>>>>>>>>>> Just to quickly share some background:
>>>>>>>>>>>
>>>>>>>>>>>     - We see quite a few problems with the current DataStream
>> APIs
>>>>>>>>>>>        - Users are working with concrete classes rather than
>>>>>>>> interfaces,
>>>>>>>>>>>        which means
>>>>>>>>>>>        - Users can access methods that are designed to be used
>> by
>>>>>>>> internal
>>>>>>>>>>>           classes, even though they are annotated with
>>> `@Internal`.
>>>>>>>> E.g.,
>>>>>>>>>>>           `DataStream#getTransformation`.
>>>>>>>>>>>           - Changes to the non-API implementations (e.g.,
>>>>>>>>>> `Transformation`)
>>>>>>>>>>>           would affect the API classes (e.g., `DataStream`),
>>> which
>>>>>>>>>>> makes it hard to
>>>>>>>>>>>           provide binary compatibility.
>>>>>>>>>>>        - Internal classes are used as parameter / return-value
>> of
>>>>>>>> public
>>>>>>>>>>>        APIs. E.g., while `AbstractStreamOperator` is
>>>>> PublicEvolving,
>>>>>>>>>>> `StreamTask`
>>>>>>>>>>>        which is returned from
>>>>> `AbstractStreamOperator#getContainingTask`
>>>>>> is
>>>>>>>>>>> Internal.
>>>>>>>>>>>        - In many cases, users are asked to extend the API
>>> classes,
>>>>>>>> rather
>>>>>>>>>>>        than implementing interfaces. E.g.,
>>>>> `AbstractStreamOperator`.
>>>>>>>>>>>           - Any changes to the base classes, even the internal
>>>>> part,
>>>>>>>> may
>>>>>>>>>>>           affect the behavior of the user-provided sub-classes
>>>>>>>>>>>           - Users can override the behavior of the base classes
>>>>>>>>>>>        - The API module `flink-streaming-java` contains non-API
>>>>>>>> classes,
>>>>>>>>>> and
>>>>>>>>>>>        depends on internal modules such as `flink-runtime`,
>> which
>>>>>> means
>>>>>>>>>>>        - Changes to the internal modules may affect the API
>>>>> modules,
>>>>>>>> which
>>>>>>>>>>>           requires users to re-build their applications upon
>>>>> upgrading
>>>>>>>>>>>           - The artifact users need for building their application is
>>>>>>>>>>>           larger than necessary.
>>>>>>>>>>>        - We probably should not expose operators (e.g.,
>>>>>>>>>>>        `AbstractStreamOperator`) to users. Functions should be
>>>>> enough
>>>>>>>>>>> for users to
>>>>>>>>>>>        define their data processing logic. Exposing
>>> operator-level
>>>>>>>>>> concepts
>>>>>>>>>>>        (e.g., mailbox thread model, checkpoint barrier
>> alignment,
>>>>>>>> etc.) is
>>>>>>>>>>>        unnecessary and limits the improvement regarding such
>>>>> exposed
>>>>>>>>>>> mechanisms
>>>>>>>>>>>        with compatibility considerations.
>>>>>>>>>>>        - The current DataStream API seems to be a mixture of
>> many
>>>>>>>> things,
>>>>>>>>>>>        making it hard to understand especially for newcomers.
>> It
>>>>> might
>>>>>>>> be
>>>>>>>>>>> better
>>>>>>>>>>>        to re-organize it into several parts (the taxonomy below is just an
>>>>>>>>>>>        example; we are still working on this):
>>>>>>>>>>>           - The most fundamental stateful stream processing:
>>>>> streams,
>>>>>>>>>>>           partitions / key, process functions, state,
>>>>> timeline-service
>>>>>>>>>>>           - An extension for common batch-streaming unified
>>>>> functions:
>>>>>>>>>> map,
>>>>>>>>>>>           flatmap, filter, agg, reduce, join, etc.
>>>>>>>>>>>           - An extension for windowing supports:  window,
>>>>> triggering
>>>>>>>>>>>           - An extension for event-time supports: event time,
>>>>>> watermark
>>>>>>>>>>>           - The extensions are like short-cuts / sugars,
>> without
>>>>> which
>>>>>>>>>> users
>>>>>>>>>>>           can probably still achieve the same behavior by
>> working
>>>>> with
>>>>>>>> the
>>>>>>>>>>>           fundamental APIs, but would be a lot easier with the
>>>>>>>> extensions
>>>>>>>>>>>        - The original plan was to do in-place refactors /
>> changes
>>>>> on
>>>>>>>>>>>     DataStream API. Some related items are listed in this doc
>> [2]
>>>>>>>> attached
>>>>>>>>>>> to
>>>>>>>>>>>     the kicking off email [3]. Not all of the above issues are
>>>>> listed,
>>>>>>>>>>> because
>>>>>>>>>>>     we haven't looked into this as deeply as now  by that time.
>>>>>>>>>>>     - We proposed this as a new API rather than in-place
>>> refactors
>>>>> in
>>>>>>>> the
>>>>>>>>>>>     2.0 work item list, because we realized the changes might
>> be
>>>>> too
>>>>>>>> big
>>>>>>>>>>> for an
>>>>>>>>>>>     in-place change. First having a new API then gradually
>>> retiring
>>>>>> the
>>>>>>>>>> old
>>>>>>>>>>> one
>>>>>>>>>>>     would help users to smoothly migrate between them.
>>>>>>>>>>>
>>>>>>>>>>> A thorough discussion is definitely needed once the FLIP is
>> out.
>>>>> And
>>>>>> of
>>>>>>>>>>> course it's possible that the FLIP might be rejected. Given
>> that
>>>>> we
>>>>>> are
>>>>>>>>>>> planning for release 2.0, I just feel it would be better to
>>> bring
>>>>>> this
>>>>>>>> up
>>>>>>>>>>> early even the concrete plan is not yet ready,
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Xintong
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>>>>>>>>>> [2]
>>>>>>>>>>>
>>>>>>>>>>>
>> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
>>>>>>>>>>> [3]
>>>>> https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
>>>>>>>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gyfora@apache.org
>>>>>> wrote:
>>>>>>>>>>>> Hey!
>>>>>>>>>>>>
>>>>>>>>>>>> I share the same concerns mentioned above regarding the
>>>>>>>>>> "ProcessFunction
>>>>>>>>>>>> API".
>>>>>>>>>>>>
>>>>>>>>>>>> I don't think we should create a replacement for the
>> DataStream
>>>>> API
>>>>>>>>>>> unless
>>>>>>>>>>>> we have a very good reason to do so and with a proper
>>> discussion
>>>>>> about
>>>>>>>>>>> this
>>>>>>>>>>>> as Alex said.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Gyula
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
>>>>>>>>>>>> alexander.fedulov@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Xintong,
>>>>>>>>>>>>>
>>>>>>>>>>>>> By compatibility discussion do you mean the "[DISCUSS]
>>> FLIP-321:
>>>>>>>>>>>> Introduce
>>>>>>>>>>>>> an API deprecation process" thread [1]?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am also curious to know if the rationale behind this new
>> API
>>>>> has
>>>>>>>>>> been
>>>>>>>>>>>>> previously discussed on the mailing list. Do we have a list
>> of
>>>>>>>>>>>> shortcomings
>>>>>>>>>>>>> in the current DataStream API that it tries to resolve? How
>>> does
>>>>>> the
>>>>>>>>>>>>> current ProcessFunction functionality fit into the picture?
>>>>> Will it
>>>>>>>>>> be
>>>>>>>>>>>> kept
>>>>>>>>>>>>> as is or subsumed by new API?
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Alex
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
>>>>> tonysong820@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
>> headaches
>>>>>>>>>>> because
>>>>>>>>>>>>> it's
>>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
>>> entirely
>>>>>>>>>>>> separate
>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
>>>>> DataStream.
>>>>>>>>>>> How
>>>>>>>>>>>>>> much
>>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how does
>>> it
>>>>>>>>>>> relate
>>>>>>>>>>>> to
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
>>>>>>>>>> underneath).
>>>>>>>>>>>>>> I totally understand your confusion. We started planning
>> this
>>>>>> after
>>>>>>>>>>>>> kicking
>>>>>>>>>>>>>> off the release 2.0, so there's still a lot to be explored
>>> and
>>>>> the
>>>>>>>>>>> plan
>>>>>>>>>>>>>> keeps changing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     - In the beginning, we planned to do an in-place
>> refactor
>>> of
>>>>>>>>>>>>> DataStream
>>>>>>>>>>>>>>     API, until the API migration period is proposed.
>>>>>>>>>>>>>>     - Then we want to make it an entirely separate API to
>>>>>>>>>> DataStream,
>>>>>>>>>>>> and
>>>>>>>>>>>>>>     listed as a must-have for release 2.0 so that we can
>>> remove
>>>>>>>>>>>> DataStream
>>>>>>>>>>>>>> once
>>>>>>>>>>>>>>     it's ready.
>>>>>>>>>>>>>>     - However, depending on the outcome of the API
>>> compatibility
>>>>>>>>>>>>> discussion
>>>>>>>>>>>>>>     [1], we may not be able to remove DataStream in 2.0
>>> anyway,
>>>>>>>>>> which
>>>>>>>>>>>>> means
>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>     might need to re-evaluate the necessity of this item for
>>>>> 2.0.
>>>>>>>>>>>>>> I'd say we wait a bit longer for the compatibility
>> discussion
>>>>> [1]
>>>>>>>>>> and
>>>>>>>>>>>>>> decide the priority for this item afterwards.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Xintong
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>> https://lists.apache.org/list.html?dev@flink.apache.org
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
>>>>>>>>>> chesnay@apache.org
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> by-and-large I'm quite happy with the list of items.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm curious as to why the "Disaggregated State Management"
>>>>> item
>>>>>>>>>> is
>>>>>>>>>>>>> marked
>>>>>>>>>>>>>>> as a must-have; will it require changes that break
>>> something?
>>>>>>>>>> What
>>>>>>>>>>>>>> prevents
>>>>>>>>>>>>>>> it from being added in 2.1?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We may want to update the Java 17 item to "Make Java 17
>> the
>>>>>>>>>>> default,
>>>>>>>>>>>>> drop
>>>>>>>>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop
>> Java
>>> 8"
>>>>>>>>>> and
>>>>>>>>>>> a
>>>>>>>>>>>>>>> nice-to-have "Drop Java 11"?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
>>>>> this
>>>>>>>>>>> would
>>>>>>>>>>>>> be
>>>>>>>>>>>>>>> an entirely internal change, and could thus be an
>>> incremental
>>>>>>>>>>> process
>>>>>>>>>>>>>>> independent of major releases.
>>>>>>>>>>>>>>> What is the actual scale of this item; how much are we
>>>>> actually
>>>>>>>>>>>>>> re-writing?
>>>>>>>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
>>>>>>>>>> must-have; i
>>>>>>>>>>>>> think
>>>>>>>>>>>>>>> I marked it down as nice-to-have only because it depends
>> on
>>>>>>>>>> another
>>>>>>>>>>>>> item.
>>>>>>>>>>>>>>> The ProcessFunction API item is giving me the most
>> headaches
>>>>>>>>>>> because
>>>>>>>>>>>>> it's
>>>>>>>>>>>>>>> very unclear what it actually entails; like is it an
>>> entirely
>>>>>>>>>>>> separate
>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>> to DataStream (sounds like it is!) or an extension of
>>>>> DataStream.
>>>>>>>>>>> How
>>>>>>>>>>>>>> much
>>>>>>>>>>>>>>> will it share the internals with DataStream etc.; how does
>>> it
>>>>>>>>>>> relate
>>>>>>>>>>>> to
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
>>>>>>>>>> underneath).
>>>>>>>>>>>>>>> There are a few items I added as ideas which don't have a
>>>>>>>>>> priority
>>>>>>>>>>>> yet;
>>>>>>>>>>>>>>> would love to get some feedback on those.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi devs,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As previously discussed in [1], we had been collecting
>> work
>>>>> item
>>>>>>>>>>>>>> proposals
>>>>>>>>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     - As we have passed the due date, I'd like to kindly
>>> remind
>>>>>>>>>>>> everyone
>>>>>>>>>>>>>> *not
>>>>>>>>>>>>>>>     to add / remove items directly on the wiki page*. If
>>>>> needed,
>>>>>>>>>>>> please
>>>>>>>>>>>>>> post
>>>>>>>>>>>>>>>     in this thread or reach out to the release managers
>>>>> instead.
>>>>>>>>>>>>>>>     - I've reached out to some folks for clarifications
>> about
>>>>>>>>>> their
>>>>>>>>>>>>>>>     proposals. Some of them mentioned that they can not yet
>>>>> tell
>>>>>>>>>>>> whether
>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>     should do an item or not, and would need more time /
>>>>>>>>>> discussions
>>>>>>>>>>>> to
>>>>>>>>>>>>>> make
>>>>>>>>>>>>>>>     the decision. So I added a new symbol for items whose
>>>>>>>>>> priorities
>>>>>>>>>>>> are
>>>>>>>>>>>>>> `TBD`.
>>>>>>>>>>>>>>> Now it's time to collaboratively decide a minimum set of
>>>>>>>>>> must-have
>>>>>>>>>>>>> items.
>>>>>>>>>>>>>>> I've gone through the entire list of proposed items, and
>>> found
>>>>>>>>>> most
>>>>>>>>>>>> of
>>>>>>>>>>>>>> them
>>>>>>>>>>>>>>> make quite much sense. So I think an online sync might not
>>> be
>>>>>>>>>>>> necessary
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>> this. I'd like to go with this DISCUSS thread, where
>>> everyone
>>>>> can
>>>>>>>>>>>>> comment
>>>>>>>>>>>>>>> on how they think the list can be improved, followed by a
>>>>> VOTE to
>>>>>>>>>>>>>> formally
>>>>>>>>>>>>>>> make the decision.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any feedback and opinions, including but not limited to
>> the
>>>>>>>>>>> following
>>>>>>>>>>>>>>> aspects, will be appreciated.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     - Important items that are missing from the list
>>>>>>>>>>>>>>>     - Concerns regarding the listed items or their
>> priorities
>>>>>>>>>>>>>>> Looking forward to your feedback.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Xintong
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
>>>>>>>>>>>>>>> [2]
>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>>>>>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Best regards,
>>>>>>>>>> Sergey
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>> --
>>>>> Best
>>>>>
>>>>> ConradJam
>>>>>


Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
Hi Matthias,

The questions you asked are indeed very important. Here are some quick
responses based on the plans I have in mind, which I have not yet aligned
with the other release managers.

In the previous discussions between the RMs, we were not able to make
proposals on things like how to make a time plan, how to manage the release
branch, etc., due to the lack of input on, e.g., the work items that need to
be included (which transitively depends on the API compatibility we provide
between major versions) and the workload / time needed for them. With the
recent discussions, we have collected at least the majority of the input
needed.

Here are the things that I think we as the release managers would do next
(again, not yet aligned with the other release managers):
- Create a time plan, by reaching out to people to understand the
estimated workload, prerequisites and ETA of each work item.
- Make a proposal on how to manage the release branch, i.e., when to cut
the branch, whether to ship milestone releases, etc.
- Set up regular release syncs (bi-weekly / monthly) to update the status
and draw attention to where help is needed.

So back to your questions.

> There are still to-be-discussed items in the list of features. What's the
> plan with those?

When collecting ETAs, for items whose completion time cannot yet be
estimated, we would like to have at least a date by which the estimate
can be made. I think the same applies to the to-be-discussed items. And if
any of those items should be included as must-haves, we would need another
vote to adjust the must-have item list.

> Some of them don't have anyone assigned.
>
> My concern is that they will be overlooked because nobody feels to be in
> charge.

This is a tricky one. For must-have items without assignees, we as the
release managers should be responsible for raising them in the release
syncs and trying to find assignees for them. Hopefully, someone will step
up. But it is possible that nobody wants to work on a must-have item. If
that happens, which I don't think it will, it probably means the item is
not that critical and we may have to exclude it from the release. Either
way, such items should not be overlooked, because IMHO the release managers
are responsible for trying to get someone to work on the un-assigned items.

We'll have more discussions soon and keep the community updated.

Best,

Xintong



On Mon, Jul 10, 2023 at 3:53 PM Matthias Pohl
<ma...@aiven.io.invalid> wrote:

> Now that the vote is started on the must-have items: There are still
> to-be-discussed items in the list of features. What's the plan with those?
> Some of them don't have anyone assigned. Were these items discussed among
> the release managers? So far, it looks like they are handled as
> nice-to-have if someone volunteers to pick them up?
>
> My concern is that they will be overlooked because nobody feels to be in
> charge.
>
> Best,
> Matthias
>
> On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <to...@gmail.com>
> wrote:
>
> > Thanks all for the discussion.
> >
> > The wiki has been updated as discussed. I'm starting a vote now.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <to...@gmail.com>
> wrote:
> >
> > > Hi ConradJam,
> > >
> > > I think Chesnay has already put his name as the Contributor for the two
> > > tasks you listed. Maybe you can reach out to him to see if you can
> > > collaborate on this.
> > >
> > > > In general, I don't think contributing to a release 2.0 issue is much
> > > > different from contributing to a regular issue. We haven't yet created
> > > > JIRA tickets for all the listed tasks because many of them need further
> > > > discussions and / or FLIPs to decide whether and how they should be
> > > > performed.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com> wrote:
> > >
> > >> Hi Community:
> > >>   I see some tasks in the 2.0 list that haven't been assigned yet. I want
> > >> to take the initiative to take on some tasks that I can complete. How do I
> > >> apply to the community for these tasks? I am interested in the
> > >> following parts of FLINK-32377
> > >> <https://issues.apache.org/jira/browse/FLINK-32377>. Do I need to create
> > >> the issues myself and assign them to myself?
> > >>
> > >> - the current timestamp, which is problematic w.r.t. caching and testing,
> > >> while providing no value.
> > >> - Remove JarRequestBody#programArgs in favor of #programArgsList.
> > >>
> > >> [1] https://issues.apache.org/jira/browse/FLINK-32377
> > >>
> > >> On Fri, Jun 30, 2023 at 00:53, Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
> > >>
> > >> > Thanks Xintong for driving the effort.
> > >> >
> > >> > I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> > >> > especially the types. We have various configs that encode Time /
> > >> > MemorySize values as Long instead!
> > >> >
> > >> > Regards,
> > >> > Hong
> > >> >
> > >> >
> > >> >
> > >> > > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com>
> wrote:
> > >> > >
> > >> > > Thanks for driving this effort, Xintong!
> > >> > >
> > >> > > To Chesnay
> > >> > >> I'm curious as to why the "Disaggregated State Management" item is
> > >> > >> marked as a must-have; will it require changes that break something?
> > >> > >> What prevents it from being added in 2.1?
> > >> > >
> > >> > > As to "Disaggregated State Management".
> > >> > >
> > >> > > We plan to provide a new type of state backend to support DFS as
> > >> > > primary storage.
> > >> > > To achieve this, we need to include at least two kinds of changes (not
> > >> > > entirely sure yet, since we are still in the design and prototype
> > >> > > phase):
> > >> > >
> > >> > > 1. Statebackend Change
> > >> > > 2. State Access Change
> > >> > >
> > >> > > Not all of the related interfaces are `@Internal`. Some of them,
> > >> > > like `StateBackend`, are `@PublicEvolving`.
> > >> > > So, you are right in the sense that "Disaggregated State Management"
> > >> > > itself probably does not need to be a "Must Have".
> > >> > >
> > >> > > But I was hoping changes related to public APIs can be finalized and
> > >> > > merged in Flink 2.0 (I will fix the wiki accordingly).
> > >> > >
> > >> > > I also agree with Jark that 2.0 is a good chance to rework the default
> > >> > > values of configurations.
> > >> > >
> > >> > > Best
> > >> > > Yuan
> > >> > >
> > >> > >
> > >> > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> > chesnay@apache.org>
> > >> > wrote:
> > >> > >
> > >> > >> Something else configuration-related is that there are a bunch of
> > >> > >> options where the type isn't quite correct (e.g., a String where it
> > >> > >> could be an enum, a String where it should be an int or something).
> > >> > >> We could do a pass over those as well.
> > >> > >>
> > >> > >> On 29/06/2023 13:50, Jark Wu wrote:
> > >> > >>> Hi,
> > >> > >>>
> > >> > >>> I think one more thing we need to consider doing in 2.0 is changing the
> > >> > >>> default values of configuration options to improve the out-of-box user
> > >> > >>> experience.
> > >> > >>>
> > >> > >>> Currently, in order to run a Flink job, users may need to set
> > >> > >>> a bunch of configurations, such as minibatch, checkpoint interval,
> > >> > >>> exactly-once, incremental-checkpoint, etc. It's very verbose and hard
> > >> > >>> to use for beginners.
> > >> > >>> Most of them can have a universally applicable value. Because changing
> > >> > >>> a default value is a breaking change, I think it's worth considering
> > >> > >>> changing them in 2.0.
> > >> > >>>
> > >> > >>> What do you think?
> > >> > >>>
> > >> > >>> Best,
> > >> > >>> Jark
> > >> > >>>
> > >> > >>>
> > >> > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> > snuyanzin@gmail.com>
> > >> > >> wrote:
> > >> > >>>
> > >> > >>>> Hi Chesnay
> > >> > >>>>
> > >> > >>>>> "Move Calcite rules from Scala to Java": I would hope that
> this
> > >> would
> > >> > >> be
> > >> > >>>>> an entirely internal change, and could thus be an incremental
> > >> process
> > >> > >>>>> independent of major releases.
> > >> > >>>>> What is the actual scale of this item; how much are we
> actually
> > >> > >>>> re-writing?
> > >> > >>>>
> > >> > >>>> Thanks for asking.
> > >> > >>>> Yes, you're right, that should be an internal change.
> > >> > >>>> I was also thinking about an incremental change (rule by rule, or by
> > >> > >>>> reasonably small groups of rules).
> > >> > >>>> And yes, this could be an activity independent of major releases.
> > >> > >>>>
> > >> > >>>> The problem is actually with children of RelOptRule.
> > >> > >>>> Currently I see 60+ such rules (in Scala) using the mentioned deprecated
> > >> > >>>> API.
> > >> > >>>> There are also children of ConverterRule (50+) which do not have such
> > >> > >>>> issues.
> > >> > >>>> Maybe it could be considered as the next step to have all the rules in
> > >> > >>>> Java.
> > >> > >>>>
> > >> > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> > >> tonysong820@gmail.com>
> > >> > >>>> wrote:
> > >> > >>>>
> > >> > >>>>> Hi Alex & Gyula,
> > >> > >>>>>
> > >> > >>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321: Introduce
> > >> > >>>>>> an API deprecation process" thread [1]?
> > >> > >>>>>>
> > >> > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the wrong
> > >> > >>>>> URL in my previous email. Sorry for the mistake.
> > >> > >>>>>
> > >> > >>>>>> I am also curious to know if the rationale behind this new API has been
> > >> > >>>>>> previously discussed on the mailing list. Do we have a list of
> > >> > >>>>>> shortcomings in the current DataStream API that it tries to resolve? How
> > >> > >>>>>> does the current ProcessFunction functionality fit into the picture?
> > >> > >>>>>> Will it be kept as is or subsumed by new API?
> > >> > >>>>>>
> > >> > >>>>>> I don't think we should create a replacement for the DataStream API
> > >> > >>>>>> unless we have a very good reason to do so and with a proper discussion
> > >> > >>>>>> about this as Alex said.
> > >> > >>>>>
> > >> > >>>>> The ProcessFunction API, which aims to replace the DataStream API, is
> > >> > >>>>> still a proposal, not a decision. Sorry for the confusion; I should have
> > >> > >>>>> been more careful with my words and not given the impression that this is
> > >> > >>>>> something we'll do anyway.
> > >> > >>>>>
> > >> > >>>>> There will be a FLIP describing the motivations and designs in detail,
> > >> > >>>>> for the community to discuss and vote on. We are still working on it.
> > >> > >>>>> TBH, this is not trivial and we would need more time on it.
> > >> > >>>>>
> > >> > >>>>> Just to quickly share some backgrounds:
> > >> > >>>>>
> > >> > >>>>>    - We see quite a few problems with the current DataStream APIs
> > >> > >>>>>       - Users are working with concrete classes rather than interfaces,
> > >> > >>>>>       which means
> > >> > >>>>>          - Users can access methods that are designed to be used by
> > >> > >>>>>          internal classes, even though they are annotated with
> > >> > >>>>>          `@Internal`. E.g., `DataStream#getTransformation`.
> > >> > >>>>>          - Changes to the non-API implementations (e.g.,
> > >> > >>>>>          `Transformation`) would affect the API classes (e.g.,
> > >> > >>>>>          `DataStream`), which makes it hard to provide binary
> > >> > >>>>>          compatibility.
> > >> > >>>>>       - Internal classes are used as parameters / return values of
> > >> > >>>>>       public APIs. E.g., while `AbstractStreamOperator` is
> > >> > >>>>>       PublicEvolving, `StreamTask`, which is returned from
> > >> > >>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
> > >> > >>>>>       - In many cases, users are asked to extend the API classes, rather
> > >> > >>>>>       than implementing interfaces. E.g., `AbstractStreamOperator`.
> > >> > >>>>>          - Any changes to the base classes, even the internal part, may
> > >> > >>>>>          affect the behavior of the user-provided sub-classes
> > >> > >>>>>          - Users can override the behavior of the base classes
> > >> > >>>>>       - The API module `flink-streaming-java` contains non-API classes,
> > >> > >>>>>       and depends on internal modules such as `flink-runtime`, which
> > >> > >>>>>       means
> > >> > >>>>>          - Changes to the internal modules may affect the API modules,
> > >> > >>>>>          which requires users to re-build their applications upon
> > >> > >>>>>          upgrading
> > >> > >>>>>          - The artifact users need for building their applications is
> > >> > >>>>>          larger than necessary.
> > >> > >>>>>       - We probably should not expose operators (e.g.,
> > >> > >>>>>       `AbstractStreamOperator`) to users. Functions should be enough
> > >> > >>>>>       for users to define their data processing logic. Exposing
> > >> > >>>>>       operator-level concepts (e.g., mailbox thread model, checkpoint
> > >> > >>>>>       barrier alignment, etc.) is unnecessary and, due to compatibility
> > >> > >>>>>       considerations, limits how much we can improve the exposed
> > >> > >>>>>       mechanisms.
> > >> > >>>>>       - The current DataStream API seems to be a mixture of many things,
> > >> > >>>>>       making it hard to understand, especially for newcomers. It might
> > >> > >>>>>       be better to re-organize it into several parts: (the taxonomy
> > >> > >>>>>       below is just an example; we are still working on this)
> > >> > >>>>>          - The most fundamental stateful stream processing: streams,
> > >> > >>>>>          partitions / key, process functions, state, timeline-service
> > >> > >>>>>          - An extension for common batch-streaming unified functions:
> > >> > >>>>>          map, flatmap, filter, agg, reduce, join, etc.
> > >> > >>>>>          - An extension for windowing support: window, triggering
> > >> > >>>>>          - An extension for event-time support: event time, watermark
> > >> > >>>>>          - The extensions are like short-cuts / sugar, without which
> > >> > >>>>>          users can probably still achieve the same behavior by working
> > >> > >>>>>          with the fundamental APIs, but it would be a lot easier with
> > >> > >>>>>          the extensions
> > >> > >>>>>    - The original plan was to do in-place refactors / changes on the
> > >> > >>>>>    DataStream API. Some related items are listed in this doc [2] attached
> > >> > >>>>>    to the kick-off email [3]. Not all of the above issues are listed,
> > >> > >>>>>    because at that time we hadn't looked into this as deeply as we have
> > >> > >>>>>    now.
> > >> > >>>>>    - We proposed this as a new API rather than in-place refactors in the
> > >> > >>>>>    2.0 work item list, because we realized the changes might be too big
> > >> > >>>>>    for an in-place change. First having a new API and then gradually
> > >> > >>>>>    retiring the old one would help users to smoothly migrate between
> > >> > >>>>>    them.
> > >> > >>>>>
> > >> > >>>>> A thorough discussion is definitely needed once the FLIP is out. And of
> > >> > >>>>> course it's possible that the FLIP might be rejected. Given that we are
> > >> > >>>>> planning for release 2.0, I just feel it would be better to bring this
> > >> > >>>>> up early, even though the concrete plan is not yet ready.
> > >> > >>>>>
> > >> > >>>>> Best,
> > >> > >>>>>
> > >> > >>>>> Xintong
> > >> > >>>>>
> > >> > >>>>>
> > >> > >>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >> > >>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > >> > >>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > >> > >>>>>
> > >> > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gyfora@apache.org
> >
> > >> > wrote:
> > >> > >>>>>
> > >> > >>>>>> Hey!
> > >> > >>>>>>
> > >> > >>>>>> I share the same concerns mentioned above regarding the
> > >> > >>>> "ProcessFunction
> > >> > >>>>>> API".
> > >> > >>>>>>
> > >> > >>>>>> I don't think we should create a replacement for the
> DataStream
> > >> API
> > >> > >>>>> unless
> > >> > >>>>>> we have a very good reason to do so and with a proper
> > discussion
> > >> > about
> > >> > >>>>> this
> > >> > >>>>>> as Alex said.
> > >> > >>>>>>
> > >> > >>>>>> Cheers,
> > >> > >>>>>> Gyula
> > >> > >>>>>>
> > >> > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > >> > >>>>>> alexander.fedulov@gmail.com> wrote:
> > >> > >>>>>>
> > >> > >>>>>>> Hi Xintong,
> > >> > >>>>>>>
> > >> > >>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> > FLIP-321:
> > >> > >>>>>> Introduce
> > >> > >>>>>>> an API deprecation process" thread [1]?
> > >> > >>>>>>>
> > >> > >>>>>>> I am also curious to know if the rationale behind this new
> API
> > >> has
> > >> > >>>> been
> > >> > >>>>>>> previously discussed on the mailing list. Do we have a list
> of
> > >> > >>>>>> shortcomings
> > >> > >>>>>>> in the current DataStream API that it tries to resolve? How
> > does
> > >> > the
> > >> > >>>>>>> current ProcessFunction functionality fit into the picture?
> > >> Will it
> > >> > >>>> be
> > >> > >>>>>> kept
> > >> > >>>>>>> as is or subsumed by new API?
> > >> > >>>>>>>
> > >> > >>>>>>> [1]
> > >> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >> > >>>>>>>
> > >> > >>>>>>> Best,
> > >> > >>>>>>> Alex
> > >> > >>>>>>>
> > >> > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> > >> tonysong820@gmail.com>
> > >> > >>>>>> wrote:
> > >> > >>>>>>>>> The ProcessFunction API item is giving me the most
> headaches
> > >> > >>>>> because
> > >> > >>>>>>> it's
> > >> > >>>>>>>>> very unclear what it actually entails; like is it an
> > entirely
> > >> > >>>>>> separate
> > >> > >>>>>>>> API
> > >> > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > >> DataStream.
> > >> > >>>>> How
> > >> > >>>>>>>> much
> > >> > >>>>>>>>> will it share the internals with DataStream etc.; how does
> > it
> > >> > >>>>> relate
> > >> > >>>>>> to
> > >> > >>>>>>>> the
> > >> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > >> > >>>> underneath).
> > >> > >>>>>>>> I totally understand your confusion. We started planning
> this
> > >> > after
> > >> > >>>>>>> kicking
> > >> > >>>>>>>> off the release 2.0, so there's still a lot to be explored
> > and
> > >> the
> > >> > >>>>> plan
> > >> > >>>>>>>> keeps changing.
> > >> > >>>>>>>>
> > >> > >>>>>>>>
> > >> > >>>>>>>>    - In the beginning, we planned to do an in-place
> refactor
> > of
> > >> > >>>>>>> DataStream
> > >> > >>>>>>>>    API, until the API migration period is proposed.
> > >> > >>>>>>>>    - Then we want to make it an entirely separate API to
> > >> > >>>> DataStream,
> > >> > >>>>>> and
> > >> > >>>>>>>>    listed as a must-have for release 2.0 so that we can
> > remove
> > >> > >>>>>> DataStream
> > >> > >>>>>>>> once
> > >> > >>>>>>>>    it's ready.
> > >> > >>>>>>>>    - However, depending on the outcome of the API
> > compatibility
> > >> > >>>>>>> discussion
> > >> > >>>>>>>>    [1], we may not be able to remove DataStream in 2.0
> > anyway,
> > >> > >>>> which
> > >> > >>>>>>> means
> > >> > >>>>>>>> we
> > >> > >>>>>>>>    might need to re-evaluate the necessity of this item for
> > >> 2.0.
> > >> > >>>>>>>>
> > >> > >>>>>>>> I'd say we wait a bit longer for the compatibility
> discussion
> > >> [1]
> > >> > >>>> and
> > >> > >>>>>>>> decide the priority for this item afterwards.
> > >> > >>>>>>>>
> > >> > >>>>>>>>
> > >> > >>>>>>>> Best,
> > >> > >>>>>>>>
> > >> > >>>>>>>> Xintong
> > >> > >>>>>>>>
> > >> > >>>>>>>>
> > >> > >>>>>>>> [1]
> https://lists.apache.org/list.html?dev@flink.apache.org
> > >> > >>>>>>>>
> > >> > >>>>>>>>
> > >> > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > >> > >>>> chesnay@apache.org
> > >> > >>>>>>>> wrote:
> > >> > >>>>>>>>
> > >> > >>>>>>>>> by-and-large I'm quite happy with the list of items.
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> I'm curious as to why the "Disaggregated State Management"
> > >> item
> > >> > >>>> is
> > >> > >>>>>>> marked
> > >> > >>>>>>>>> as a must-have; will it require changes that break
> > something?
> > >> > >>>> What
> > >> > >>>>>>>> prevents
> > >> > >>>>>>>>> it from being added in 2.1?
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17
> the
> > >> > >>>>> default,
> > >> > >>>>>>> drop
> > >> > >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop
> Java
> > 8"
> > >> > >>>> and
> > >> > >>>>> a
> > >> > >>>>>>>>> nice-to-have "Drop Java 11"?
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> > >> this
> > >> > >>>>> would
> > >> > >>>>>>> be
> > >> > >>>>>>>>> an entirely internal change, and could thus be an
> > incremental
> > >> > >>>>> process
> > >> > >>>>>>>>> independent of major releases.
> > >> > >>>>>>>>> What is the actual scale of this item; how much are we
> > >> actually
> > >> > >>>>>>>> re-writing?
> > >> > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > >> > >>>> must-have; i
> > >> > >>>>>>> think
> > >> > >>>>>>>>> I marked it down as nice-to-have only because it depends
> on
> > >> > >>>> another
> > >> > >>>>>>> item.
> > >> > >>>>>>>>> The ProcessFunction API item is giving me the most
> headaches
> > >> > >>>>> because
> > >> > >>>>>>> it's
> > >> > >>>>>>>>> very unclear what it actually entails; like is it an
> > entirely
> > >> > >>>>>> separate
> > >> > >>>>>>>> API
> > >> > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> > >> DataStream.
> > >> > >>>>> How
> > >> > >>>>>>>> much
> > >> > >>>>>>>>> will it share the internals with DataStream etc.; how does
> > it
> > >> > >>>>> relate
> > >> > >>>>>> to
> > >> > >>>>>>>> the
> > >> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > >> > >>>> underneath).
> > >> > >>>>>>>>> There are a few items I added as ideas which don't have a
> > >> > >>>> priority
> > >> > >>>>>> yet;
> > >> > >>>>>>>>> would love to get some feedback on those.
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> Hi devs,
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> As previously discussed in [1], we had been collecting
> work
> > >> item
> > >> > >>>>>>>> proposals
> > >> > >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> > >> > >>>>>>>>>
> > >> > >>>>>>>>>    - As we have passed the due date, I'd like to kindly
> > remind
> > >> > >>>>>> everyone
> > >> > >>>>>>>> *not
> > >> > >>>>>>>>>    to add / remove items directly on the wiki page*. If
> > >> needed,
> > >> > >>>>>> please
> > >> > >>>>>>>> post
> > >> > >>>>>>>>>    in this thread or reach out to the release managers
> > >> instead.
> > >> > >>>>>>>>>    - I've reached out to some folks for clarifications
> about
> > >> > >>>> their
> > >> > >>>>>>>>>    proposals. Some of them mentioned that they can not yet
> > >> tell
> > >> > >>>>>> whether
> > >> > >>>>>>>> we
> > >> > >>>>>>>>>    should do an item or not, and would need more time /
> > >> > >>>> discussions
> > >> > >>>>>> to
> > >> > >>>>>>>> make
> > >> > >>>>>>>>>    the decision. So I added a new symbol for items whose
> > >> > >>>> priorities
> > >> > >>>>>> are
> > >> > >>>>>>>> `TBD`.
> > >> > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > >> > >>>> must-have
> > >> > >>>>>>> items.
> > >> > >>>>>>>>> I've gone through the entire list of proposed items, and
> > found
> > >> > >>>> most
> > >> > >>>>>> of
> > >> > >>>>>>>> them
> > >> > >>>>>>>>> make quite much sense. So I think an online sync might not
> > be
> > >> > >>>>>> necessary
> > >> > >>>>>>>> for
> > >> > >>>>>>>>> this. I'd like to go with this DISCUSS thread, where
> > everyone
> > >> can
> > >> > >>>>>>> comment
> > >> > >>>>>>>>> on how they think the list can be improved, followed by a
> > >> VOTE to
> > >> > >>>>>>>> formally
> > >> > >>>>>>>>> make the decision.
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> Any feedback and opinions, including but not limited to
> the
> > >> > >>>>> following
> > >> > >>>>>>>>> aspects, will be appreciated.
> > >> > >>>>>>>>>
> > >> > >>>>>>>>>    - Important items that are missing from the list
> > >> > >>>>>>>>>    - Concerns regarding the listed items or their
> priorities
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> Looking forward to your feedback.
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> Best,
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> Xintong
> > >> > >>>>>>>>>
> > >> > >>>>>>>>>
> > >> > >>>>>>>>> [1]
> > >> > >>>>
> > >> > >>
> > >> >
> > >>
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > >> > >>>>>>>>> [2]
> > >> > >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >> > >>>>>>>>>
> > >> > >>>>>>>>>
> > >> > >>>>
> > >> > >>>> --
> > >> > >>>> Best regards,
> > >> > >>>> Sergey
> > >> > >>>>
> > >> > >>
> > >> > >>
> > >> >
> > >> >
> > >>
> > >> --
> > >> Best
> > >>
> > >> ConradJam
> > >>
> > >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Matthias Pohl <ma...@aiven.io.INVALID>.
Now that the vote has started on the must-have items: there are still
to-be-discussed items in the list of features. What's the plan with those?
Some of them don't have anyone assigned. Were these items discussed among
the release managers? So far, it looks like they are handled as
nice-to-have if someone volunteers to pick them up?

My concern is that they will be overlooked because nobody feels
responsible for them.

Best,
Matthias

On Fri, Jul 7, 2023 at 11:06 AM Xintong Song <to...@gmail.com> wrote:

> Thanks all for the discussion.
>
> The wiki has been updated as discussed. I'm starting a vote now.
>
> Best,
>
> Xintong
>
>
>
> On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <to...@gmail.com> wrote:
>
> > Hi ConradJam,
> >
> > I think Chesnay has already put his name as the Contributor for the two
> > tasks you listed. Maybe you can reach out to him to see if you can
> > collaborate on this.
> >
> > In general, I don't think contributing to a release 2.0 issue is much
> > different from contributing to a regular issue. We haven't yet created
> JIRA
> > tickets for all the listed tasks because many of them need further
> > discussions and / or FLIPs to decide whether and how they should be
> > performed.
> >
> > Best,
> >
> > Xintong
> >
> >
> >
> > On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com> wrote:
> >
> >> Hi Community:
> >>   I see some tasks in the 2.0 list that haven't been assigned yet. I
> want
> >> to take the initiative to take on some tasks that I can complete. How
> do I
> >> apply to the community for this part of the task? I am interested in the
> >> following parts of FLINK-32377
> >> <https://issues.apache.org/jira/browse/FLINK-32377>, do I need to
> create
> >> an issue myself and assign it to myself?
> >>
> >> - the current timestamp, which is problematic w.r.t. caching and
> testing,
> >> while providing no value.
> >> - Remove JarRequestBody#programArgs in favor of #programArgsList.
> >>
> >> [1] FLINK-32377 <https://issues.apache.org/jira/browse/FLINK-32377>
> >> https://issues.apache.org/jira/browse/FLINK-32377
> >>
> >> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
> >>
> >> > Thanks Xintong for driving the effort.
> >> >
> >> > I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> >> > especially the types. We have various configs that encode Time /
> >> MemorySize
> >> > that are Long instead!
> >> >
> >> > Regards,
> >> > Hong
> >> >
> >> >
> >> >
> >> > > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
> >> > >
> >> > > CAUTION: This email originated from outside of the organization. Do
> >> not
> >> > click links or open attachments unless you can confirm the sender and
> >> know
> >> > the content is safe.
> >> > >
> >> > >
> >> > >
> >> > > Thanks for driving this effort, Xintong!
> >> > >
> >> > > To Chesnay
> >> > >> I'm curious as to why the "Disaggregated State Management" item is
> >> > >> marked as a must-have; will it require changes that break
> something?
> >> > >> What prevents it from being added in 2.1?
> >> > >
> >> > > As to "Disaggregated State Management".
> >> > >
> >> > > We plan to provide a new type of state backend to support DFS as
> >> primary
> >> > > storage.
> >> > > To achieve this, we at least need to include two parts of amends
> (not
> >> > > entirely sure yet, since we are still in the designing and prototype
> >> > phase)
> >> > >
> >> > > 1. Statebackend Change
> >> > > 2. State Access Change
> >> > >
> >> > > Not all of the interfaces related are `@Internal`. Some of the
> >> interfaces
> >> > > like `StateBackend` are `@PublicEvolving`.
> >> > > So, you are right in the sense that "Disaggregated State Management"
> >> > itself
> >> > > probably does not need to be a "Must Have"
> >> > >
> >> > > But I was hoping changes related to public APIs can be
> finalized
> >> and
> >> > > merged in Flink 2.0 (I will fix the wiki accordingly).
> >> > >
> >> > > I also agree with Jark that 2.0 is a good chance to rework the
> default
> >> > > value of configurations.
> >> > >
> >> > > Best
> >> > > Yuan
> >> > >
> >> > >
> >> > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <
> chesnay@apache.org>
> >> > wrote:
> >> > >
> >> > >> Something else configuration-related is that there are a bunch of
> >> > >> options where the type isn't quite correct (e.g., a String where it
> >> > >> could be an enum, a string where it should be an int or something).
> >> > >> Could do a pass over those as well.
> >> > >>
> >> > >> On 29/06/2023 13:50, Jark Wu wrote:
> >> > >>> Hi,
> >> > >>>
> >> > >>> I think one more thing we need to consider to do in 2.0 is
> changing
> >> the
> >> > >>> default value of configuration to improve out-of-box user
> >> experience.
> >> > >>>
> >> > >>> Currently, in order to run a Flink job, users may need to set
> >> > >>> a bunch of configurations, such as minibatch, checkpoint interval,
> >> > >>> exactly-once,
> >> > >>> incremental-checkpoint, etc. It's very verbose and hard to use for
> >> > >>> beginners.
> >> > >>> Most of them can have a universally applicable value. Because changing
> >> > >>> the default value is a breaking change, I think it's worth considering
> >> > >>> changing them in 2.0.
> >> > >>>
> >> > >>> What do you think?
> >> > >>>
> >> > >>> Best,
> >> > >>> Jark
> >> > >>>
> >> > >>>
> >> > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <
> snuyanzin@gmail.com>
> >> > >> wrote:
> >> > >>>
> >> > >>>> Hi Chesnay
> >> > >>>>
> >> > >>>>> "Move Calcite rules from Scala to Java": I would hope that this
> >> would
> >> > >> be
> >> > >>>>> an entirely internal change, and could thus be an incremental
> >> process
> >> > >>>>> independent of major releases.
> >> > >>>>> What is the actual scale of this item; how much are we actually
> >> > >>>> re-writing?
> >> > >>>>
> >> > >>>> Thanks for asking
> >> > >>>> Yes, you're right, that should be an internal change.
> >> > >>>> I was also thinking about an incremental change (rule by rule or
> >> > >>>> a reasonably small group of rules).
> >> > >>>> And yes, this could be an activity independent of major releases.
> >> > >>>>
> >> > >>>> The problem is actually for children of RelOptRule.
> >> > >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> >> > deprecated
> >> > >>>> api.
> >> > >>>> There are also children of ConverterRule (50+) which do not have
> >> such
> >> > >>>> issues.
> >> > >>>> Maybe it could be considered as the next step to have all the
> >> rules in
> >> > >>>> Java.
> >> > >>>>
> >> > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
> >> tonysong820@gmail.com>
> >> > >>>> wrote:
> >> > >>>>
> >> > >>>>> Hi Alex & Gyula,
> >> > >>>>>
> >> > >>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> >> > >>>> Introduce
> >> > >>>>>> an API deprecation process" thread [1]?
> >> > >>>>>>
> >> > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted
> the
> >> > wrong
> >> > >>>> url
> >> > >>>>> in my previous email. Sorry for the mistake.
> >> > >>>>>
> >> > >>>>> I am also curious to know if the rationale behind this new API
> has
> >> > been
> >> > >>>>>> previously discussed on the mailing list. Do we have a list of
> >> > >>>>> shortcomings
> >> > >>>>>> in the current DataStream API that it tries to resolve? How
> does
> >> the
> >> > >>>>>> current ProcessFunction functionality fit into the picture?
> Will
> >> it
> >> > be
> >> > >>>>> kept
> >> > >>>>>> as is or subsumed by new API?
> >> > >>>>>>
> >> > >>>>> I don't think we should create a replacement for the DataStream
> >> API
> >> > >>>> unless
> >> > >>>>>> we have a very good reason to do so and with a proper
> discussion
> >> > about
> >> > >>>>> this
> >> > >>>>>> as Alex said.
> >> > >>>>>
> >> > >>>>> The ProcessFunction API which is targeting to replace DataStream
> >> API
> >> > is
> >> > >>>>> still a proposal, not a decision. Sorry for the confusion, I
> >> should
> >> > >> have
> >> > >>>>> been more careful with my words, not giving the impression that
> >> this
> >> > is
> >> > >>>>> something we'll do anyway.
> >> > >>>>>
> >> > >>>>> There will be a FLIP describing the motivations and designs in
> >> > detail,
> >> > >>>> for
> >> > >>>>> the community to discuss and vote on. We are still working on
> it.
> >> > TBH,
> >> > >>>> this
> >> > >>>>> is not trivial and we would need more time on it.
> >> > >>>>>
> >> > >>>>> Just to quickly share some backgrounds:
> >> > >>>>>
> >> > >>>>>    - We see quite some problems with the current DataStream APIs
> >> > >>>>>       - Users are working with concrete classes rather than
> >> > >> interfaces,
> >> > >>>>>       which means
> >> > >>>>>       - Users can access methods that are designed to be used by
> >> > >> internal
> >> > >>>>>          classes, even though they are annotated with
> `@Internal`.
> >> > >> E.g.,
> >> > >>>>>          `DataStream#getTransformation`.
> >> > >>>>>          - Changes to the non-API implementations (e.g.,
> >> > >>>> `Transformation`)
> >> > >>>>>          would affect the API classes (e.g., `DataStream`),
> which
> >> > >>>>> makes it hard to
> >> > >>>>>          provide binary compatibility.
> >> > >>>>>       - Internal classes are used as parameter / return-value of
> >> > >> public
> >> > >>>>>       APIs. E.g., while `AbstractStreamOperator` is
> >> PublicEvolving,
> >> > >>>>> `StreamTask`
> >> > >>>>>       which returns from
> >> `AbstractStreamOperator#getContainingTask`
> >> > is
> >> > >>>>> Internal.
> >> > >>>>>       - In many cases, users are asked to extend the API
> classes,
> >> > >> rather
> >> > >>>>>       than implementing interfaces. E.g.,
> >> `AbstractStreamOperator`.
> >> > >>>>>          - Any changes to the base classes, even the internal
> >> part,
> >> > >> may
> >> > >>>>>          affect the behavior of the user-provided sub-classes
> >> > >>>>>          - Users can override the behavior of the base classes
> >> > >>>>>       - The API module `flink-streaming-java` contains non-API
> >> > >> classes,
> >> > >>>> and
> >> > >>>>>       depends on internal modules such as `flink-runtime`, which
> >> > means
> >> > >>>>>       - Changes to the internal modules may affect the API
> >> modules,
> >> > >> which
> >> > >>>>>          requires users to re-build their applications upon
> >> upgrading
> >> > >>>>>          - The artifact users need for building their application is
> >> > >>>>>          larger than necessary.
> >> > >>>>>       - We probably should not expose operators (e.g.,
> >> > >>>>>       `AbstractStreamOperator`) to users. Functions should be
> >> enough
> >> > >>>>> for users to
> >> > >>>>>       define their data processing logics. Exposing
> operator-level
> >> > >>>> concepts
> >> > >>>>>       (e.g., mailbox thread model, checkpoint barrier alignment,
> >> > >> etc.) is
> >> > >>>>>       unnecessary and limits the improvement regarding such
> >> exposed
> >> > >>>>> mechanisms
> >> > >>>>>       with compatibility considerations.
> >> > >>>>>       - The current DataStream API seems to be a mixture of many
> >> > >> things,
> >> > >>>>>       making it hard to understand especially for newcomers. It
> >> might
> >> > >> be
> >> > >>>>> better
> >> > >>>>>       to re-organize it into several parts (the taxonomy below is just an
> >> > >>>>>       example; we are still working on this):
> >> > >>>>>          - The most fundamental stateful stream processing:
> >> streams,
> >> > >>>>>          partitions / key, process functions, state,
> >> timeline-service
> >> > >>>>>          - An extension for common batch-streaming unified
> >> functions:
> >> > >>>> map,
> >> > >>>>>          flatmap, filter, agg, reduce, join, etc.
> >> > >>>>>          - An extension for windowing supports:  window,
> >> triggering
> >> > >>>>>          - An extension for event-time supports: event time,
> >> > watermark
> >> > >>>>>          - The extensions are like short-cuts / sugars, without
> >> which
> >> > >>>> users
> >> > >>>>>          can probably still achieve the same behavior by working
> >> with
> >> > >> the
> >> > >>>>>          fundamental APIs, but would be a lot easier with the
> >> > >> extensions
> >> > >>>>>       - The original plan was to do in-place refactors / changes
> >> on
> >> > >>>>>    DataStream API. Some related items are listed in this doc [2]
> >> > >> attached
> >> > >>>>> to
> >> > >>>>>    the kicking off email [3]. Not all of the above issues are
> >> listed,
> >> > >>>>> because
> >> > >>>>>    we hadn't looked into this as deeply at that time as we have now.
> >> > >>>>>    - We proposed this as a new API rather than in-place
> refactors
> >> in
> >> > >> the
> >> > >>>>>    2.0 work item list, because we realized the changes might be
> >> too
> >> > >> big
> >> > >>>>> for an
> >> > >>>>>    in-place change. First having a new API then gradually
> retiring
> >> > the
> >> > >>>> old
> >> > >>>>> one
> >> > >>>>>    would help users to smoothly migrate between them.
> >> > >>>>>
> >> > >>>>> A thorough discussion is definitely needed once the FLIP is out.
> >> And
> >> > of
> >> > >>>>> course it's possible that the FLIP might be rejected. Given that
> >> we
> >> > are
> >> > >>>>> planning for release 2.0, I just feel it would be better to
> bring
> >> > this
> >> > >> up
> >> > >>>>> early even though the concrete plan is not yet ready.
> >> > >>>>>
> >> > >>>>> Best,
> >> > >>>>>
> >> > >>>>> Xintong
> >> > >>>>>
> >> > >>>>>
> >> > >>>>> [1]
> >> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >> > >>>>> [2]
> >> > >>>>>
> >> > >>>>>
> >> > >>>>
> >> > >>
> >> >
> >>
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> >> > >>>>> [3]
> >> https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> >> > >>>>>
> >> > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org>
> >> > wrote:
> >> > >>>>>
> >> > >>>>>> Hey!
> >> > >>>>>>
> >> > >>>>>> I share the same concerns mentioned above regarding the
> >> > >>>> "ProcessFunction
> >> > >>>>>> API".
> >> > >>>>>>
> >> > >>>>>> I don't think we should create a replacement for the DataStream
> >> API
> >> > >>>>> unless
> >> > >>>>>> we have a very good reason to do so and with a proper
> discussion
> >> > about
> >> > >>>>> this
> >> > >>>>>> as Alex said.
> >> > >>>>>>
> >> > >>>>>> Cheers,
> >> > >>>>>> Gyula
> >> > >>>>>>
> >> > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> >> > >>>>>> alexander.fedulov@gmail.com> wrote:
> >> > >>>>>>
> >> > >>>>>>> Hi Xintong,
> >> > >>>>>>>
> >> > >>>>>>> By compatibility discussion do you mean the "[DISCUSS]
> FLIP-321:
> >> > >>>>>> Introduce
> >> > >>>>>>> an API deprecation process" thread [1]?
> >> > >>>>>>>
> >> > >>>>>>> I am also curious to know if the rationale behind this new API
> >> has
> >> > >>>> been
> >> > >>>>>>> previously discussed on the mailing list. Do we have a list of
> >> > >>>>>> shortcomings
> >> > >>>>>>> in the current DataStream API that it tries to resolve? How
> does
> >> > the
> >> > >>>>>>> current ProcessFunction functionality fit into the picture?
> >> Will it
> >> > >>>> be
> >> > >>>>>> kept
> >> > >>>>>>> as is or subsumed by new API?
> >> > >>>>>>>
> >> > >>>>>>> [1]
> >> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >> > >>>>>>>
> >> > >>>>>>> Best,
> >> > >>>>>>> Alex
> >> > >>>>>>>
> >> > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> >> tonysong820@gmail.com>
> >> > >>>>>> wrote:
> >> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> >> > >>>>> because
> >> > >>>>>>> it's
> >> > >>>>>>>>> very unclear what it actually entails; like is it an
> entirely
> >> > >>>>>> separate
> >> > >>>>>>>> API
> >> > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> >> DataStream.
> >> > >>>>> How
> >> > >>>>>>>> much
> >> > >>>>>>>>> will it share the internals with DataStream etc.; how does
> it
> >> > >>>>> relate
> >> > >>>>>> to
> >> > >>>>>>>> the
> >> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >> > >>>> underneath).
> >> > >>>>>>>> I totally understand your confusion. We started planning this
> >> > after
> >> > >>>>>>> kicking
> >> > >>>>>>>> off the release 2.0, so there's still a lot to be explored
> and
> >> the
> >> > >>>>> plan
> >> > >>>>>>>> keeps changing.
> >> > >>>>>>>>
> >> > >>>>>>>>
> >> > >>>>>>>>    - In the beginning, we planned to do an in-place refactor
> of
> >> > >>>>>>> DataStream
> >> > >>>>>>>>    API, until the API migration period is proposed.
> >> > >>>>>>>>    - Then we want to make it an entirely separate API to
> >> > >>>> DataStream,
> >> > >>>>>> and
> >> > >>>>>>>>    listed as a must-have for release 2.0 so that we can
> remove
> >> > >>>>>> DataStream
> >> > >>>>>>>> once
> >> > >>>>>>>>    it's ready.
> >> > >>>>>>>>    - However, depending on the outcome of the API
> compatibility
> >> > >>>>>>> discussion
> >> > >>>>>>>>    [1], we may not be able to remove DataStream in 2.0
> anyway,
> >> > >>>> which
> >> > >>>>>>> means
> >> > >>>>>>>> we
> >> > >>>>>>>>    might need to re-evaluate the necessity of this item for
> >> 2.0.
> >> > >>>>>>>>
> >> > >>>>>>>> I'd say we wait a bit longer for the compatibility discussion
> >> [1]
> >> > >>>> and
> >> > >>>>>>>> decide the priority for this item afterwards.
> >> > >>>>>>>>
> >> > >>>>>>>>
> >> > >>>>>>>> Best,
> >> > >>>>>>>>
> >> > >>>>>>>> Xintong
> >> > >>>>>>>>
> >> > >>>>>>>>
> >> > >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> >> > >>>>>>>>
> >> > >>>>>>>>
> >> > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> >> > >>>> chesnay@apache.org
> >> > >>>>>>>> wrote:
> >> > >>>>>>>>
> >> > >>>>>>>>> by-and-large I'm quite happy with the list of items.
> >> > >>>>>>>>>
> >> > >>>>>>>>> I'm curious as to why the "Disaggregated State Management"
> >> item
> >> > >>>> is
> >> > >>>>>>> marked
> >> > >>>>>>>>> as a must-have; will it require changes that break
> something?
> >> > >>>> What
> >> > >>>>>>>> prevents
> >> > >>>>>>>>> it from being added in 2.1?
> >> > >>>>>>>>>
> >> > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> >> > >>>>> default,
> >> > >>>>>>> drop
> >> > >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java
> 8"
> >> > >>>> and
> >> > >>>>> a
> >> > >>>>>>>>> nice-to-have "Drop Java 11"?
> >> > >>>>>>>>>
> >> > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
> >> this
> >> > >>>>> would
> >> > >>>>>>> be
> >> > >>>>>>>>> an entirely internal change, and could thus be an
> incremental
> >> > >>>>> process
> >> > >>>>>>>>> independent of major releases.
> >> > >>>>>>>>> What is the actual scale of this item; how much are we
> >> actually
> >> > >>>>>>>> re-writing?
> >> > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> >> > >>>> must-have; i
> >> > >>>>>>> think
> >> > >>>>>>>>> I marked it down as nice-to-have only because it depends on
> >> > >>>> another
> >> > >>>>>>> item.
> >> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> >> > >>>>> because
> >> > >>>>>>> it's
> >> > >>>>>>>>> very unclear what it actually entails; like is it an
> entirely
> >> > >>>>>> separate
> >> > >>>>>>>> API
> >> > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> >> DataStream.
> >> > >>>>> How
> >> > >>>>>>>> much
> >> > >>>>>>>>> will it share the internals with DataStream etc.; how does
> it
> >> > >>>>> relate
> >> > >>>>>> to
> >> > >>>>>>>> the
> >> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >> > >>>> underneath).
> >> > >>>>>>>>> There are a few items I added as ideas which don't have a
> >> > >>>> priority
> >> > >>>>>> yet;
> >> > >>>>>>>>> would love to get some feedback on those.
> >> > >>>>>>>>>
> >> > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> >> > >>>>>>>>>
> >> > >>>>>>>>> Hi devs,
> >> > >>>>>>>>>
> >> > >>>>>>>>> As previously discussed in [1], we had been collecting work
> >> item
> >> > >>>>>>>> proposals
> >> > >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> >> > >>>>>>>>>
> >> > >>>>>>>>>    - As we have passed the due date, I'd like to kindly
> remind
> >> > >>>>>> everyone
> >> > >>>>>>>> *not
> >> > >>>>>>>>>    to add / remove items directly on the wiki page*. If
> >> needed,
> >> > >>>>>> please
> >> > >>>>>>>> post
> >> > >>>>>>>>>    in this thread or reach out to the release managers
> >> instead.
> >> > >>>>>>>>>    - I've reached out to some folks for clarifications about
> >> > >>>> their
> >> > >>>>>>>>>    proposals. Some of them mentioned that they can not yet
> >> tell
> >> > >>>>>> whether
> >> > >>>>>>>> we
> >> > >>>>>>>>>    should do an item or not, and would need more time /
> >> > >>>> discussions
> >> > >>>>>> to
> >> > >>>>>>>> make
> >> > >>>>>>>>>    the decision. So I added a new symbol for items whose
> >> > >>>> priorities
> >> > >>>>>> are
> >> > >>>>>>>> `TBD`.
> >> > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> >> > >>>> must-have
> >> > >>>>>>> items.
> >> > >>>>>>>>> I've gone through the entire list of proposed items, and
> found
> >> > >>>> most
> >> > >>>>>> of
> >> > >>>>>>>> them
> >> > >>>>>>>>> make quite much sense. So I think an online sync might not
> be
> >> > >>>>>> necessary
> >> > >>>>>>>> for
> >> > >>>>>>>>> this. I'd like to go with this DISCUSS thread, where
> everyone
> >> can
> >> > >>>>>>> comment
> >> > >>>>>>>>> on how they think the list can be improved, followed by a
> >> VOTE to
> >> > >>>>>>>> formally
> >> > >>>>>>>>> make the decision.
> >> > >>>>>>>>>
> >> > >>>>>>>>> Any feedback and opinions, including but not limited to the
> >> > >>>>> following
> >> > >>>>>>>>> aspects, will be appreciated.
> >> > >>>>>>>>>
> >> > >>>>>>>>>    - Important items that are missing from the list
> >> > >>>>>>>>>    - Concerns regarding the listed items or their priorities
> >> > >>>>>>>>>
> >> > >>>>>>>>> Looking forward to your feedback.
> >> > >>>>>>>>>
> >> > >>>>>>>>> Best,
> >> > >>>>>>>>>
> >> > >>>>>>>>> Xintong
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>> [1]
> >> > >>>>
> >> > >>
> >> >
> >>
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> >> > >>>>>>>>> [2]
> >> > >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>
> >> > >>>> --
> >> > >>>> Best regards,
> >> > >>>> Sergey
> >> > >>>>
> >> > >>
> >> > >>
> >> >
> >> >
> >>
> >> --
> >> Best
> >>
> >> ConradJam
> >>
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
Thanks all for the discussion.

The wiki has been updated as discussed. I'm starting a vote now.

Best,

Xintong



On Wed, Jul 5, 2023 at 9:52 AM Xintong Song <to...@gmail.com> wrote:

> Hi ConradJam,
>
> I think Chesnay has already put his name as the Contributor for the two
> tasks you listed. Maybe you can reach out to him to see if you can
> collaborate on this.
>
> In general, I don't think contributing to a release 2.0 issue is much
> different from contributing to a regular issue. We haven't yet created JIRA
> tickets for all the listed tasks because many of them need further
> discussions and / or FLIPs to decide whether and how they should be
> performed.
>
> Best,
>
> Xintong
>
>
>
> On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com> wrote:
>
>> Hi Community:
>>   I see some tasks in the 2.0 list that haven't been assigned yet. I want
>> to take the initiative to take on some tasks that I can complete. How do I
>> apply to the community for this part of the task? I am interested in the
>> following parts of FLINK-32377
>> <https://issues.apache.org/jira/browse/FLINK-32377>, do I need to create
>> an issue myself and assign it to myself?
>>
>> - the current timestamp, which is problematic w.r.t. caching and testing,
>> while providing no value.
>> - Remove JarRequestBody#programArgs in favor of #programArgsList.
>>
>> [1] FLINK-32377 <https://issues.apache.org/jira/browse/FLINK-32377>
>> https://issues.apache.org/jira/browse/FLINK-32377
>>
>> On Fri, Jun 30, 2023 at 12:53 AM Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
>>
>> > Thanks Xintong for driving the effort.
>> >
>> > I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
>> > especially the types. We have various configs that encode Time /
>> MemorySize
>> > that are Long instead!
>> >
>> > Regards,
>> > Hong
>> >
>> >
>> >
>> > > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
>> > >
>> > > CAUTION: This email originated from outside of the organization. Do
>> not
>> > click links or open attachments unless you can confirm the sender and
>> know
>> > the content is safe.
>> > >
>> > >
>> > >
>> > > Thanks for driving this effort, Xintong!
>> > >
>> > > To Chesnay
>> > >> I'm curious as to why the "Disaggregated State Management" item is
>> > >> marked as a must-have; will it require changes that break something?
>> > >> What prevents it from being added in 2.1?
>> > >
>> > > As to "Disaggregated State Management".
>> > >
>> > > We plan to provide a new type of state backend to support DFS as
>> primary
>> > > storage.
>> > > To achieve this, we at least need to include two parts of amends (not
>> > > entirely sure yet, since we are still in the designing and prototype
>> > phase)
>> > >
>> > > 1. Statebackend Change
>> > > 2. State Access Change
>> > >
>> > > Not all of the interfaces related are `@Internal`. Some of the
>> interfaces
>> > > like `StateBackend` are `@PublicEvolving`.
>> > > So, you are right in the sense that "Disaggregated State Management"
>> > itself
>> > > probably does not need to be a "Must Have"
>> > >
>> > > But I was hoping changes related to public APIs can be finalized
>> and
>> > > merged in Flink 2.0 (I will fix the wiki accordingly).
>> > >
>> > > I also agree with Jark that 2.0 is a good chance to rework the default
>> > > value of configurations.
>> > >
>> > > Best
>> > > Yuan
>> > >
>> > >
>> > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <ch...@apache.org>
>> > wrote:
>> > >
>> > >> Something else configuration-related is that there are a bunch of
>> > >> options where the type isn't quite correct (e.g., a String where it
>> > >> could be an enum, a string where it should be an int or something).
>> > >> Could do a pass over those as well.
>> > >>
>> > >> On 29/06/2023 13:50, Jark Wu wrote:
>> > >>> Hi,
>> > >>>
>> > >>> I think one more thing we need to consider doing in 2.0 is changing
>> > >>> the default values of configuration options to improve the
>> > >>> out-of-box user experience.
>> > >>>
>> > >>> Currently, in order to run a Flink job, users may need to set
>> > >>> a bunch of configurations, such as minibatch, checkpoint interval,
>> > >>> exactly-once, incremental-checkpoint, etc. It's very verbose and
>> > >>> hard to use for beginners.
>> > >>> Most of them can have a universally applicable value. Because
>> > >>> changing a default value is a breaking change, I think it's worth
>> > >>> considering changing them in 2.0.
>> > >>>
>> > >>> What do you think?
>> > >>>
>> > >>> Best,
>> > >>> Jark
>> > >>>
>> > >>>
>> > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com>
>> > >> wrote:
>> > >>>
>> > >>>> Hi Chesnay
>> > >>>>
>> > >>>>> "Move Calcite rules from Scala to Java": I would hope that this
>> would
>> > >> be
>> > >>>>> an entirely internal change, and could thus be an incremental
>> process
>> > >>>>> independent of major releases.
>> > >>>>> What is the actual scale of this item; how much are we actually
>> > >>>> re-writing?
>> > >>>>
>> > >>>> Thanks for asking.
>> > >>>> Yes, you're right, that should be an internal change.
>> > >>>> I was also thinking about an incremental change (rule by rule, or
>> > >>>> in reasonably small groups of rules).
>> > >>>> And yes, this could be an activity independent of major releases.
>> > >>>>
>> > >>>> The problem is actually with the children of RelOptRule.
>> > >>>> Currently I see 60+ such rules (in Scala) using the mentioned
>> > >>>> deprecated API.
>> > >>>> There are also children of ConverterRule (50+) which do not have
>> > >>>> such issues.
>> > >>>> Maybe having all the rules in Java could be considered as the next
>> > >>>> step.
>> > >>>>
>> > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <
>> tonysong820@gmail.com>
>> > >>>> wrote:
>> > >>>>
>> > >>>>> Hi Alex & Gyula,
>> > >>>>>
>> > >>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
>> > >>>>>> Introduce an API deprecation process" thread [1]?
>> > >>>>>
>> > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the
>> > >>>>> wrong url in my previous email. Sorry for the mistake.
>> > >>>>>
>> > >>>>>> I am also curious to know if the rationale behind this new API
>> > >>>>>> has been previously discussed on the mailing list. Do we have a
>> > >>>>>> list of shortcomings in the current DataStream API that it tries
>> > >>>>>> to resolve? How does the current ProcessFunction functionality
>> > >>>>>> fit into the picture? Will it be kept as is or subsumed by new
>> > >>>>>> API?
>> > >>>>>
>> > >>>>>> I don't think we should create a replacement for the DataStream
>> > >>>>>> API unless we have a very good reason to do so and with a proper
>> > >>>>>> discussion about this as Alex said.
>> > >>>>>
>> > >>>>> The ProcessFunction API, which is intended to replace the
>> > >>>>> DataStream API, is still a proposal, not a decision. Sorry for the
>> > >>>>> confusion, I should have been more careful with my words, not
>> > >>>>> giving the impression that this is something we'll do anyway.
>> > >>>>>
>> > >>>>> There will be a FLIP describing the motivations and designs in
>> > >>>>> detail, for the community to discuss and vote on. We are still
>> > >>>>> working on it. TBH, this is not trivial and we would need more
>> > >>>>> time on it.
>> > >>>>>
>> > >>>>> Just to quickly share some background:
>> > >>>>>
>> > >>>>>    - We see quite some problems with the current DataStream APIs
>> > >>>>>       - Users are working with concrete classes rather than
>> > >>>>>       interfaces, which means
>> > >>>>>          - Users can access methods that are designed to be used
>> > >>>>>          by internal classes, even though they are annotated with
>> > >>>>>          `@Internal`. E.g., `DataStream#getTransformation`.
>> > >>>>>          - Changes to the non-API implementations (e.g.,
>> > >>>>>          `Transformation`) would affect the API classes (e.g.,
>> > >>>>>          `DataStream`), which makes it hard to provide binary
>> > >>>>>          compatibility.
>> > >>>>>       - Internal classes are used as parameters / return values of
>> > >>>>>       public APIs. E.g., while `AbstractStreamOperator` is
>> > >>>>>       PublicEvolving, `StreamTask`, which is returned from
>> > >>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
>> > >>>>>       - In many cases, users are asked to extend the API classes,
>> > >>>>>       rather than implementing interfaces. E.g.,
>> > >>>>>       `AbstractStreamOperator`.
>> > >>>>>          - Any changes to the base classes, even the internal
>> > >>>>>          part, may affect the behavior of the user-provided
>> > >>>>>          sub-classes
>> > >>>>>          - Users can override the behavior of the base classes
>> > >>>>>       - The API module `flink-streaming-java` contains non-API
>> > >>>>>       classes, and depends on internal modules such as
>> > >>>>>       `flink-runtime`, which means
>> > >>>>>          - Changes to the internal modules may affect the API
>> > >>>>>          modules, which requires users to re-build their
>> > >>>>>          applications upon upgrading
>> > >>>>>          - The artifact users need for building their applications
>> > >>>>>          is larger than necessary.
>> > >>>>>       - We probably should not expose operators (e.g.,
>> > >>>>>       `AbstractStreamOperator`) to users. Functions should be
>> > >>>>>       enough for users to define their data processing logic.
>> > >>>>>       Exposing operator-level concepts (e.g., mailbox thread
>> > >>>>>       model, checkpoint barrier alignment, etc.) is unnecessary,
>> > >>>>>       and compatibility considerations then limit how much such
>> > >>>>>       exposed mechanisms can be improved.
>> > >>>>>       - The current DataStream API seems to be a mixture of many
>> > >>>>>       things, making it hard to understand, especially for
>> > >>>>>       newcomers. It might be better to re-organize it into several
>> > >>>>>       parts (the taxonomy below is just an example; we are still
>> > >>>>>       working on this):
>> > >>>>>          - The most fundamental stateful stream processing:
>> > >>>>>          streams, partitions / key, process functions, state,
>> > >>>>>          timeline-service
>> > >>>>>          - An extension for common batch-streaming unified
>> > >>>>>          functions: map, flatmap, filter, agg, reduce, join, etc.
>> > >>>>>          - An extension for windowing support: window, triggering
>> > >>>>>          - An extension for event-time support: event time,
>> > >>>>>          watermark
>> > >>>>>          - The extensions are like short-cuts / sugars, without
>> > >>>>>          which users can probably still achieve the same behavior
>> > >>>>>          by working with the fundamental APIs, but it would be a
>> > >>>>>          lot easier with the extensions
>> > >>>>>    - The original plan was to do in-place refactors / changes on
>> > >>>>>    DataStream API. Some related items are listed in this doc [2]
>> > >>>>>    attached to the kicking off email [3]. Not all of the above
>> > >>>>>    issues are listed, because we hadn't looked into this as deeply
>> > >>>>>    by that time as we have now.
>> > >>>>>    - We proposed this as a new API rather than in-place refactors
>> > >>>>>    in the 2.0 work item list, because we realized the changes
>> > >>>>>    might be too big for an in-place change. First having a new API
>> > >>>>>    then gradually retiring the old one would help users to
>> > >>>>>    smoothly migrate between them.
>> > >>>>>
>> > >>>>> A thorough discussion is definitely needed once the FLIP is out.
>> > >>>>> And of course it's possible that the FLIP might be rejected. Given
>> > >>>>> that we are planning for release 2.0, I just feel it would be
>> > >>>>> better to bring this up early, even if the concrete plan is not
>> > >>>>> yet ready.
>> > >>>>>
>> > >>>>> Best,
>> > >>>>>
>> > >>>>> Xintong
>> > >>>>>
>> > >>>>>
>> > >>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>> > >>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
>> > >>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
>> > >>>>>
>> > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org>
>> > wrote:
>> > >>>>>
>> > >>>>>> Hey!
>> > >>>>>>
>> > >>>>>> I share the same concerns mentioned above regarding the
>> > >>>> "ProcessFunction
>> > >>>>>> API".
>> > >>>>>>
>> > >>>>>> I don't think we should create a replacement for the DataStream
>> API
>> > >>>>> unless
>> > >>>>>> we have a very good reason to do so and with a proper discussion
>> > about
>> > >>>>> this
>> > >>>>>> as Alex said.
>> > >>>>>>
>> > >>>>>> Cheers,
>> > >>>>>> Gyula
>> > >>>>>>
>> > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
>> > >>>>>> alexander.fedulov@gmail.com> wrote:
>> > >>>>>>
>> > >>>>>>> Hi Xintong,
>> > >>>>>>>
>> > >>>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
>> > >>>>>> Introduce
>> > >>>>>>> an API deprecation process" thread [1]?
>> > >>>>>>>
>> > >>>>>>> I am also curious to know if the rationale behind this new API
>> has
>> > >>>> been
>> > >>>>>>> previously discussed on the mailing list. Do we have a list of
>> > >>>>>> shortcomings
>> > >>>>>>> in the current DataStream API that it tries to resolve? How does
>> > the
>> > >>>>>>> current ProcessFunction functionality fit into the picture?
>> Will it
>> > >>>> be
>> > >>>>>> kept
>> > >>>>>>> as is or subsumed by new API?
>> > >>>>>>>
>> > >>>>>>> [1]
>> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>> > >>>>>>>
>> > >>>>>>> Best,
>> > >>>>>>> Alex
>> > >>>>>>>
>> > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
>> tonysong820@gmail.com>
>> > >>>>>> wrote:
>> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
>> > >>>>>>>>> because it's very unclear what it actually entails; like is
>> > >>>>>>>>> it an entirely separate API to DataStream (sounds like it
>> > >>>>>>>>> is!) or an extension of DataStream. How much will it share
>> > >>>>>>>>> the internals with DataStream etc.; how does it relate to the
>> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
>> > >>>>>>>>> underneath).
>> > >>>>>>>>
>> > >>>>>>>> I totally understand your confusion. We started planning this
>> > >>>>>>>> after kicking off the release 2.0, so there's still a lot to
>> > >>>>>>>> be explored and the plan keeps changing.
>> > >>>>>>>>
>> > >>>>>>>>
>> > >>>>>>>>    - In the beginning, we planned to do an in-place refactor of
>> > >>>>>>>>    the DataStream API, until the API migration period was
>> > >>>>>>>>    proposed.
>> > >>>>>>>>    - Then we wanted to make it an entirely separate API from
>> > >>>>>>>>    DataStream, and listed it as a must-have for release 2.0 so
>> > >>>>>>>>    that we can remove DataStream once it's ready.
>> > >>>>>>>>    - However, depending on the outcome of the API compatibility
>> > >>>>>>>>    discussion [1], we may not be able to remove DataStream in
>> > >>>>>>>>    2.0 anyway, which means we might need to re-evaluate the
>> > >>>>>>>>    necessity of this item for 2.0.
>> > >>>>>>>>
>> > >>>>>>>> I'd say we wait a bit longer for the compatibility discussion
>> > >>>>>>>> [1] and decide the priority for this item afterwards.
>> > >>>>>>>>
>> > >>>>>>>>
>> > >>>>>>>> Best,
>> > >>>>>>>>
>> > >>>>>>>> Xintong
>> > >>>>>>>>
>> > >>>>>>>>
>> > >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
>> > >>>>>>>>
>> > >>>>>>>>
>> > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
>> > >>>> chesnay@apache.org
>> > >>>>>>>> wrote:
>> > >>>>>>>>
>> > >>>>>>>>> by-and-large I'm quite happy with the list of items.
>> > >>>>>>>>>
>> > >>>>>>>>> I'm curious as to why the "Disaggregated State Management"
>> item
>> > >>>> is
>> > >>>>>>> marked
>> > >>>>>>>>> as a must-have; will it require changes that break something?
>> > >>>> What
>> > >>>>>>>> prevents
>> > >>>>>>>>> it from being added in 2.1?
>> > >>>>>>>>>
>> > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
>> > >>>>> default,
>> > >>>>>>> drop
>> > >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8"
>> > >>>> and
>> > >>>>> a
>> > >>>>>>>>> nice-to-have "Drop Java 11"?
>> > >>>>>>>>>
>> > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that
>> this
>> > >>>>> would
>> > >>>>>>> be
>> > >>>>>>>>> an entirely internal change, and could thus be an incremental
>> > >>>>> process
>> > >>>>>>>>> independent of major releases.
>> > >>>>>>>>> What is the actual scale of this item; how much are we
>> actually
>> > >>>>>>>> re-writing?
>> > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
>> > >>>> must-have; i
>> > >>>>>>> think
>> > >>>>>>>>> I marked it down as nice-to-have only because it depends on
>> > >>>> another
>> > >>>>>>> item.
>> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
>> > >>>>> because
>> > >>>>>>> it's
>> > >>>>>>>>> very unclear what it actually entails; like is it an entirely
>> > >>>>>> separate
>> > >>>>>>>> API
>> > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
>> DataStream.
>> > >>>>> How
>> > >>>>>>>> much
>> > >>>>>>>>> will it share the internals with DataStream etc.; how does it
>> > >>>>> relate
>> > >>>>>> to
>> > >>>>>>>> the
>> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
>> > >>>> underneath).
>> > >>>>>>>>> There are a few items I added as ideas which don't have a
>> > >>>> priority
>> > >>>>>> yet;
>> > >>>>>>>>> would love to get some feedback on those.
>> > >>>>>>>>>
>> > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
>> > >>>>>>>>>
>> > >>>>>>>>> Hi devs,
>> > >>>>>>>>>
>> > >>>>>>>>> As previously discussed in [1], we had been collecting work
>> item
>> > >>>>>>>> proposals
>> > >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
>> > >>>>>>>>>
>> > >>>>>>>>>    - As we have passed the due date, I'd like to kindly remind
>> > >>>>>> everyone
>> > >>>>>>>> *not
>> > >>>>>>>>>    to add / remove items directly on the wiki page*. If
>> needed,
>> > >>>>>> please
>> > >>>>>>>> post
>> > >>>>>>>>>    in this thread or reach out to the release managers
>> instead.
>> > >>>>>>>>>    - I've reached out to some folks for clarifications about
>> > >>>> their
>> > >>>>>>>>>    proposals. Some of them mentioned that they can not yet
>> tell
>> > >>>>>> whether
>> > >>>>>>>> we
>> > >>>>>>>>>    should do an item or not, and would need more time /
>> > >>>> discussions
>> > >>>>>> to
>> > >>>>>>>> make
>> > >>>>>>>>>    the decision. So I added a new symbol for items whose
>> > >>>> priorities
>> > >>>>>> are
>> > >>>>>>>> `TBD`.
>> > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
>> > >>>> must-have
>> > >>>>>>> items.
>> > >>>>>>>>> I've gone through the entire list of proposed items, and found
>> > >>>> most
>> > >>>>>> of
>> > >>>>>>>> them
>> > >>>>>>>>> make quite much sense. So I think an online sync might not be
>> > >>>>>> necessary
>> > >>>>>>>> for
>> > >>>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone
>> can
>> > >>>>>>> comment
>> > >>>>>>>>> on how they think the list can be improved, followed by a
>> VOTE to
>> > >>>>>>>> formally
>> > >>>>>>>>> make the decision.
>> > >>>>>>>>>
>> > >>>>>>>>> Any feedback and opinions, including but not limited to the
>> > >>>>> following
>> > >>>>>>>>> aspects, will be appreciated.
>> > >>>>>>>>>
>> > >>>>>>>>>    - Important items that are missing from the list
>> > >>>>>>>>>    - Concerns regarding the listed items or their priorities
>> > >>>>>>>>>
>> > >>>>>>>>> Looking forward to your feedback.
>> > >>>>>>>>>
>> > >>>>>>>>> Best,
>> > >>>>>>>>>
>> > >>>>>>>>> Xintong
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>> [1]
>> > >>>>
>> > >>
>> >
>> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
>> > >>>>>>>>> [2]
>> > >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>
>> > >>>> --
>> > >>>> Best regards,
>> > >>>> Sergey
>> > >>>>
>> > >>
>> > >>
>> >
>> >
>>
>> --
>> Best
>>
>> ConradJam
>>
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
Hi ConradJam,

I think Chesnay has already put his name as the Contributor for the two
tasks you listed. Maybe you can reach out to him to see if you can
collaborate on this.

In general, I don't think contributing to a release 2.0 issue is much
different from contributing to a regular issue. We haven't yet created JIRA
tickets for all the listed tasks because many of them need further
discussion and / or FLIPs to decide whether and how they should be
performed.

Best,

Xintong



On Mon, Jul 3, 2023 at 10:37 PM ConradJam <ja...@gmail.com> wrote:

> Hi Community:
>   I see some tasks in the 2.0 list that haven't been assigned yet. I want
> to take the initiative to take on some tasks that I can complete. How do I
> apply to the community for this part of the task? I am interested in the
> following parts of FLINK-32377
> <https://issues.apache.org/jira/browse/FLINK-32377>. Do I need to create
> the issues myself and assign them to myself?
>
> - the current timestamp, which is problematic w.r.t. caching and testing,
> while providing no value.
> - Remove JarRequestBody#programArgs in favor of #programArgsList.
>
> [1] FLINK-32377 <https://issues.apache.org/jira/browse/FLINK-32377>
>
> On Fri, Jun 30, 2023 at 00:53, Teoh, Hong <li...@amazon.co.uk.invalid> wrote:
>
> > Thanks Xintong for driving the effort.
> >
> > I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> > especially the types. We have various configs that encode Time /
> > MemorySize that are Long instead!
> >
> > Regards,
> > Hong
> >
> >
> >
> > > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
> > >
> > > Thanks for driving this effort, Xintong!
> > >
> > > To Chesnay
> > >> I'm curious as to why the "Disaggregated State Management" item is
> > >> marked as a must-have; will it require changes that break something?
> > >> What prevents it from being added in 2.1?
> > >
> > > As to "Disaggregated State Management".
> > >
> > > We plan to provide a new type of state backend to support DFS as
> > > primary storage.
> > > To achieve this, we need at least two kinds of changes (not entirely
> > > sure yet, since we are still in the design and prototype phase):
> > >
> > > 1. Statebackend Change
> > > 2. State Access Change
> > >
> > > Not all of the related interfaces are `@Internal`. Some of them,
> > > like `StateBackend`, are `@PublicEvolving`.
> > > So, you are right in the sense that "Disaggregated State Management"
> > > itself probably does not need to be a "Must Have".
> > >
> > > But I was hoping that the changes related to public APIs can be
> > > finalized and merged in Flink 2.0 (I will fix the wiki accordingly).
> > >
> > > I also agree with Jark that 2.0 is a good chance to rework the default
> > > value of configurations.
> > >
> > > Best
> > > Yuan
> > >
> > >
> > > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <ch...@apache.org>
> > wrote:
> > >
> > >> Something else configuration-related is that there are a bunch of
> > >> options where the type isn't quite correct (e.g., a String where it
> > >> could be an enum, a string where it should be an int or something).
> > >> Could do a pass over those as well.
> > >>
> > >> On 29/06/2023 13:50, Jark Wu wrote:
> > >>> Hi,
> > >>>
> > >>> I think one more thing we need to consider doing in 2.0 is changing
> > >>> the default values of configuration options to improve the
> > >>> out-of-box user experience.
> > >>>
> > >>> Currently, in order to run a Flink job, users may need to set
> > >>> a bunch of configurations, such as minibatch, checkpoint interval,
> > >>> exactly-once, incremental-checkpoint, etc. It's very verbose and
> > >>> hard to use for beginners.
> > >>> Most of them can have a universally applicable value. Because
> > >>> changing a default value is a breaking change, I think it's worth
> > >>> considering changing them in 2.0.
> > >>>
> > >>> What do you think?
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>>
> > >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com>
> > >> wrote:
> > >>>
> > >>>> Hi Chesnay
> > >>>>
> > >>>>> "Move Calcite rules from Scala to Java": I would hope that this
> would
> > >> be
> > >>>>> an entirely internal change, and could thus be an incremental
> process
> > >>>>> independent of major releases.
> > >>>>> What is the actual scale of this item; how much are we actually
> > >>>> re-writing?
> > >>>>
> > >>>> Thanks for asking.
> > >>>> Yes, you're right, that should be an internal change.
> > >>>> I was also thinking about an incremental change (rule by rule, or
> > >>>> in reasonably small groups of rules).
> > >>>> And yes, this could be an activity independent of major releases.
> > >>>>
> > >>>> The problem is actually with the children of RelOptRule.
> > >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> > >>>> deprecated API.
> > >>>> There are also children of ConverterRule (50+) which do not have
> > >>>> such issues.
> > >>>> Maybe having all the rules in Java could be considered as the next
> > >>>> step.
> > >>>>
> > >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <tonysong820@gmail.com
> >
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Alex & Gyula,
> > >>>>>
> > >>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > >>>>>> Introduce an API deprecation process" thread [1]?
> > >>>>>
> > >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the
> > >>>>> wrong url in my previous email. Sorry for the mistake.
> > >>>>>
> > >>>>>> I am also curious to know if the rationale behind this new API
> > >>>>>> has been previously discussed on the mailing list. Do we have a
> > >>>>>> list of shortcomings in the current DataStream API that it tries
> > >>>>>> to resolve? How does the current ProcessFunction functionality
> > >>>>>> fit into the picture? Will it be kept as is or subsumed by new
> > >>>>>> API?
> > >>>>>
> > >>>>>> I don't think we should create a replacement for the DataStream
> > >>>>>> API unless we have a very good reason to do so and with a proper
> > >>>>>> discussion about this as Alex said.
> > >>>>>
> > >>>>> The ProcessFunction API, which is intended to replace the
> > >>>>> DataStream API, is still a proposal, not a decision. Sorry for the
> > >>>>> confusion, I should have been more careful with my words, not
> > >>>>> giving the impression that this is something we'll do anyway.
> > >>>>>
> > >>>>> There will be a FLIP describing the motivations and designs in
> > >>>>> detail, for the community to discuss and vote on. We are still
> > >>>>> working on it. TBH, this is not trivial and we would need more
> > >>>>> time on it.
> > >>>>>
> > >>>>> Just to quickly share some background:
> > >>>>>
> > >>>>>    - We see quite some problems with the current DataStream APIs
> > >>>>>       - Users are working with concrete classes rather than
> > >>>>>       interfaces, which means
> > >>>>>          - Users can access methods that are designed to be used
> > >>>>>          by internal classes, even though they are annotated with
> > >>>>>          `@Internal`. E.g., `DataStream#getTransformation`.
> > >>>>>          - Changes to the non-API implementations (e.g.,
> > >>>>>          `Transformation`) would affect the API classes (e.g.,
> > >>>>>          `DataStream`), which makes it hard to provide binary
> > >>>>>          compatibility.
> > >>>>>       - Internal classes are used as parameters / return values of
> > >>>>>       public APIs. E.g., while `AbstractStreamOperator` is
> > >>>>>       PublicEvolving, `StreamTask`, which is returned from
> > >>>>>       `AbstractStreamOperator#getContainingTask`, is Internal.
> > >>>>>       - In many cases, users are asked to extend the API classes,
> > >>>>>       rather than implementing interfaces. E.g.,
> > >>>>>       `AbstractStreamOperator`.
> > >>>>>          - Any changes to the base classes, even the internal
> > >>>>>          part, may affect the behavior of the user-provided
> > >>>>>          sub-classes
> > >>>>>          - Users can override the behavior of the base classes
> > >>>>>       - The API module `flink-streaming-java` contains non-API
> > >>>>>       classes, and depends on internal modules such as
> > >>>>>       `flink-runtime`, which means
> > >>>>>          - Changes to the internal modules may affect the API
> > >>>>>          modules, which requires users to re-build their
> > >>>>>          applications upon upgrading
> > >>>>>          - The artifact users need for building their applications
> > >>>>>          is larger than necessary.
> > >>>>>       - We probably should not expose operators (e.g.,
> > >>>>>       `AbstractStreamOperator`) to users. Functions should be
> > >>>>>       enough for users to define their data processing logic.
> > >>>>>       Exposing operator-level concepts (e.g., mailbox thread
> > >>>>>       model, checkpoint barrier alignment, etc.) is unnecessary,
> > >>>>>       and compatibility considerations then limit how much such
> > >>>>>       exposed mechanisms can be improved.
> > >>>>>       - The current DataStream API seems to be a mixture of many
> > >>>>>       things, making it hard to understand, especially for
> > >>>>>       newcomers. It might be better to re-organize it into several
> > >>>>>       parts (the taxonomy below is just an example; we are still
> > >>>>>       working on this):
> > >>>>>          - The most fundamental stateful stream processing:
> > >>>>>          streams, partitions / key, process functions, state,
> > >>>>>          timeline-service
> > >>>>>          - An extension for common batch-streaming unified
> > >>>>>          functions: map, flatmap, filter, agg, reduce, join, etc.
> > >>>>>          - An extension for windowing support: window, triggering
> > >>>>>          - An extension for event-time support: event time,
> > >>>>>          watermark
> > >>>>>          - The extensions are like short-cuts / sugars, without
> > >>>>>          which users can probably still achieve the same behavior
> > >>>>>          by working with the fundamental APIs, but it would be a
> > >>>>>          lot easier with the extensions
> > >>>>>    - The original plan was to do in-place refactors / changes on
> > >>>>>    DataStream API. Some related items are listed in this doc [2]
> > >>>>>    attached to the kicking off email [3]. Not all of the above
> > >>>>>    issues are listed, because we hadn't looked into this as deeply
> > >>>>>    by that time as we have now.
> > >>>>>    - We proposed this as a new API rather than in-place refactors
> > >>>>>    in the 2.0 work item list, because we realized the changes
> > >>>>>    might be too big for an in-place change. First having a new API
> > >>>>>    then gradually retiring the old one would help users to
> > >>>>>    smoothly migrate between them.
> > >>>>>
> > >>>>> A thorough discussion is definitely needed once the FLIP is out.
> > >>>>> And of course it's possible that the FLIP might be rejected. Given
> > >>>>> that we are planning for release 2.0, I just feel it would be
> > >>>>> better to bring this up early, even if the concrete plan is not
> > >>>>> yet ready.
> > >>>>>
> > >>>>> Best,
> > >>>>>
> > >>>>> Xintong
> > >>>>>
> > >>>>>
> > >>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>> [2] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > >>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > >>>>>
> > >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org>
> > wrote:
> > >>>>>
> > >>>>>> Hey!
> > >>>>>>
> > >>>>>> I share the same concerns mentioned above regarding the
> > >>>> "ProcessFunction
> > >>>>>> API".
> > >>>>>>
> > >>>>>> I don't think we should create a replacement for the DataStream
> API
> > >>>>> unless
> > >>>>>> we have a very good reason to do so and with a proper discussion
> > about
> > >>>>> this
> > >>>>>> as Alex said.
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Gyula
> > >>>>>>
> > >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > >>>>>> alexander.fedulov@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Hi Xintong,
> > >>>>>>>
> > >>>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > >>>>>> Introduce
> > >>>>>>> an API deprecation process" thread [1]?
> > >>>>>>>
> > >>>>>>> I am also curious to know if the rationale behind this new API
> has
> > >>>> been
> > >>>>>>> previously discussed on the mailing list. Do we have a list of
> > >>>>>> shortcomings
> > >>>>>>> in the current DataStream API that it tries to resolve? How does
> > the
> > >>>>>>> current ProcessFunction functionality fit into the picture? Will
> it
> > >>>> be
> > >>>>>> kept
> > >>>>>>> as is or subsumed by new API?
> > >>>>>>>
> > >>>>>>> [1]
> > https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Alex
> > >>>>>>>
> > >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <
> tonysong820@gmail.com>
> > >>>>>> wrote:
> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> > >>>>>>>>> because it's very unclear what it actually entails; like is
> > >>>>>>>>> it an entirely separate API to DataStream (sounds like it
> > >>>>>>>>> is!) or an extension of DataStream. How much will it share
> > >>>>>>>>> the internals with DataStream etc.; how does it relate to the
> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > >>>>>>>>> underneath).
> > >>>>>>>>
> > >>>>>>>> I totally understand your confusion. We started planning this
> > >>>>>>>> after kicking off the release 2.0, so there's still a lot to
> > >>>>>>>> be explored and the plan keeps changing.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>    - In the beginning, we planned to do an in-place refactor of
> > >>>>>>> DataStream
> > >>>>>>>>    API, until the API migration period is proposed.
> > >>>>>>>>    - Then we want to make it an entirely separate API to
> > >>>> DataStream,
> > >>>>>> and
> > >>>>>>>>    listed as a must-have for release 2.0 so that we can remove
> > >>>>>> DataStream
> > >>>>>>>> once
> > >>>>>>>>    it's ready.
> > >>>>>>>>    - However, depending on the outcome of the API compatibility
> > >>>>>>> discussion
> > >>>>>>>>    [1], we may not be able to remove DataStream in 2.0 anyway,
> > >>>> which
> > >>>>>>> means
> > >>>>>>>> we
> > >>>>>>>>    might need to re-evaluate the necessity of this item for 2.0.
> > >>>>>>>>
> > >>>>>>>> I'd say we wait a bit longer for the compatibility discussion
> [1]
> > >>>> and
> > >>>>>>>> decide the priority for this item afterwards.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>>
> > >>>>>>>> Xintong
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> > >>>> chesnay@apache.org
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> by-and-large I'm quite happy with the list of items.
> > >>>>>>>>>
> > >>>>>>>>> I'm curious as to why the "Disaggregated State Management" item
> > >>>> is
> > >>>>>>> marked
> > >>>>>>>>> as a must-have; will it require changes that break something?
> > >>>> What
> > >>>>>>>> prevents
> > >>>>>>>>> it from being added in 2.1?
> > >>>>>>>>>
> > >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> > >>>>> default,
> > >>>>>>> drop
> > >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8"
> > >>>> and
> > >>>>> a
> > >>>>>>>>> nice-to-have "Drop Java 11"?
> > >>>>>>>>>
> > >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
> > >>>>> would
> > >>>>>>> be
> > >>>>>>>>> an entirely internal change, and could thus be an incremental
> > >>>>> process
> > >>>>>>>>> independent of major releases.
> > >>>>>>>>> What is the actual scale of this item; how much are we actually
> > >>>>>>>> re-writing?
> > >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> > >>>> must-have; i
> > >>>>>>> think
> > >>>>>>>>> I marked it down as nice-to-have only because it depends on
> > >>>> another
> > >>>>>>> item.
> > >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> > >>>>> because
> > >>>>>>> it's
> > >>>>>>>>> very unclear what it actually entails; like is it an entirely
> > >>>>>> separate
> > >>>>>>>> API
> > >>>>>>>>> to DataStream (sounds like it is!) or an extension of
> DataStream.
> > >>>>> How
> > >>>>>>>> much
> > >>>>>>>>> will it share the internals with DataStream etc.; how does it
> > >>>>> relate
> > >>>>>> to
> > >>>>>>>> the
> > >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> > >>>> underneath).
> > >>>>>>>>> There are a few items I added as ideas which don't have a
> > >>>> priority
> > >>>>>> yet;
> > >>>>>>>>> would love to get some feedback on those.
> > >>>>>>>>>
> > >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> > >>>>>>>>>
> > >>>>>>>>> Hi devs,
> > >>>>>>>>>
> > >>>>>>>>> As previously discussed in [1], we had been collecting work
> item
> > >>>>>>>> proposals
> > >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> > >>>>>>>>>
> > >>>>>>>>>    - As we have passed the due date, I'd like to kindly remind
> > >>>>>> everyone
> > >>>>>>>> *not
> > >>>>>>>>>    to add / remove items directly on the wiki page*. If needed,
> > >>>>>> please
> > >>>>>>>> post
> > >>>>>>>>>    in this thread or reach out to the release managers instead.
> > >>>>>>>>>    - I've reached out to some folks for clarifications about
> > >>>> their
> > >>>>>>>>>    proposals. Some of them mentioned that they can not yet tell
> > >>>>>> whether
> > >>>>>>>> we
> > >>>>>>>>>    should do an item or not, and would need more time /
> > >>>> discussions
> > >>>>>> to
> > >>>>>>>> make
> > >>>>>>>>>    the decision. So I added a new symbol for items whose
> > >>>> priorities
> > >>>>>> are
> > >>>>>>>> `TBD`.
> > >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> > >>>> must-have
> > >>>>>>> items.
> > >>>>>>>>> I've gone through the entire list of proposed items, and found
> > >>>> most
> > >>>>>> of
> > >>>>>>>> them
> > >>>>>>>>> make quite much sense. So I think an online sync might not be
> > >>>>>> necessary
> > >>>>>>>> for
> > >>>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone
> can
> > >>>>>>> comment
> > >>>>>>>>> on how they think the list can be improved, followed by a VOTE
> to
> > >>>>>>>> formally
> > >>>>>>>>> make the decision.
> > >>>>>>>>>
> > >>>>>>>>> Any feedback and opinions, including but not limited to the
> > >>>>> following
> > >>>>>>>>> aspects, will be appreciated.
> > >>>>>>>>>
> > >>>>>>>>>    - Important items that are missing from the list
> > >>>>>>>>>    - Concerns regarding the listed items or their priorities
> > >>>>>>>>>
> > >>>>>>>>> Looking forward to your feedback.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>>
> > >>>>>>>>> Xintong
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> [1]
> > >>>>
> > >>
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > >>>>>>>>> [2]
> > >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>> Sergey
> > >>>>
> > >>
> > >>
> >
> >
>
> --
> Best
>
> ConradJam
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by ConradJam <ja...@gmail.com>.
Hi Community:
  I see some tasks in the 2.0 list that haven't been assigned yet, and I
would like to take on a few that I can complete. How do I apply to the
community for these tasks? I am interested in the following parts of
FLINK-32377 <https://issues.apache.org/jira/browse/FLINK-32377>. Do I need
to create the issues myself and assign them to myself?

- the current timestamp, which is problematic w.r.t. caching and testing,
while providing no value.
- Remove JarRequestBody#programArgs in favor of #programArgsList.

[1] FLINK-32377: https://issues.apache.org/jira/browse/FLINK-32377
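
As background for the JarRequestBody item above, here is a minimal sketch of why a list-valued argument field tends to be easier to handle than a single string. The class name and the `splitNaively` helper are hypothetical and only illustrate naive whitespace splitting; they are not Flink's actual argument handling.

```java
import java.util.Arrays;
import java.util.List;

public class ProgramArgsSketch {

    // Hypothetical helper: what a receiver has to do when all arguments
    // arrive as one string. This is NOT Flink's tokenizer.
    static List<String> splitNaively(String programArgs) {
        return Arrays.asList(programArgs.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // Two arguments; the second one contains a space.
        List<String> intended = List.of("--input", "/data/my file.csv");

        // Single-string form: the argument boundaries are lost on the way.
        String joined = String.join(" ", intended);
        List<String> recovered = splitNaively(joined);

        System.out.println("intended:  " + intended.size() + " args " + intended);
        System.out.println("recovered: " + recovered.size() + " args " + recovered);
    }
}
```

With a list-valued field the sender states the argument boundaries explicitly, so no quoting or escaping convention has to be agreed on between client and server.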

On Fri, Jun 30, 2023 at 00:53, Teoh, Hong <li...@amazon.co.uk.invalid> wrote:

> Thanks Xintong for driving the effort.
>
> I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay,
> especially the types. We have various configs that encode Time / MemorySize
> that are Long instead!
>
> Regards,
> Hong
>
>
>
> > On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
> >
> > Thanks for driving this effort, Xintong!
> >
> > To Chesnay
> >> I'm curious as to why the "Disaggregated State Management" item is
> >> marked as a must-have; will it require changes that break something?
> >> What prevents it from being added in 2.1?
> >
> > As to "Disaggregated State Management".
> >
> > We plan to provide a new type of state backend to support DFS as primary
> > storage.
> > To achieve this, we at least need to include two parts of amends (not
> > entirely sure yet, since we are still in the designing and prototype
> phase)
> >
> > 1. Statebackend Change
> > 2. State Access Change
> >
> > Not all of the interfaces related are `@Internal`. Some of the interfaces
> > like `StateBackend` is `@PublicEvolving`
> > So, you are right in the sense that "Disaggregated State Management"
> itself
> > probably does not need to be a "Must Have"
> >
> > But I was hoping changes that related to public APIs can be finalized and
> > merged in Flink 2.0 (I will fix the wiki accordingly).
> >
> > I also agree with Jark that 2.0 is a good chance to rework the default
> > value of configurations.
> >
> > Best
> > Yuan
> >
> >
> > On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <ch...@apache.org>
> wrote:
> >
> >> Something else configuration-related is that there are a bunch of
> >> options where the type isn't quite correct (e.g., a String where it
> >> could be an enum, a string where it should be an int or something).
> >> Could do a pass over those as well.
> >>
> >> On 29/06/2023 13:50, Jark Wu wrote:
> >>> Hi,
> >>>
> >>> I think one more thing we need to consider to do in 2.0 is changing the
> >>> default value of configuration to improve out-of-box user experience.
> >>>
> >>> Currently, in order to run a Flink job, users may need to set
> >>> a bunch of configurations, such as minibatch, checkpoint interval,
> >>> exactly-once,
> >>> incremental-checkpoint, etc. It's very verbose and hard to use for
> >>> beginners.
> >>> Most of them can have a universally applicable value.  Because changing
> >> the
> >>> default value is a breaking change. I think It's worth considering
> >> changing
> >>> them in 2.0.
> >>>
> >>> What do you think?
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>>
> >>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com>
> >> wrote:
> >>>
> >>>> Hi Chesnay
> >>>>
> >>>>> "Move Calcite rules from Scala to Java": I would hope that this would
> >> be
> >>>>> an entirely internal change, and could thus be an incremental process
> >>>>> independent of major releases.
> >>>>> What is the actual scale of this item; how much are we actually
> >>>> re-writing?
> >>>>
> >>>> Thanks for asking
> >>>> yes, you're right, that should be internal change.
> >>>> Yeah I was also thinking about incremental change (rule by rule or
> >>>> reasonable small group of rules).
> >>>> And yes, this could be an independent (on major release) activity
> >>>>
> >>>> The problem is actually for children of RelOptRule.
> >>>> Currently I see 60+ such rules (in Scala) using the mentioned
> deprecated
> >>>> api.
> >>>> There are also children of ConverterRule (50+) which do not have such
> >>>> issues.
> >>>> Maybe it could be considered as the next step to have all the rules in
> >>>> Java.
> >>>>
> >>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <to...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Alex & Gyula,
> >>>>>
> >>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> >>>> Introduce
> >>>>>> an API deprecation process" thread [1]?
> >>>>>>
> >>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the
> wrong
> >>>> url
> >>>>> in my previous email. Sorry for the mistake.
> >>>>>
> >>>>> I am also curious to know if the rationale behind this new API has
> been
> >>>>>> previously discussed on the mailing list. Do we have a list of
> >>>>> shortcomings
> >>>>>> in the current DataStream API that it tries to resolve? How does the
> >>>>>> current ProcessFunction functionality fit into the picture? Will it
> be
> >>>>> kept
> >>>>>> as is or subsumed by new API?
> >>>>>>
> >>>>> I don't think we should create a replacement for the DataStream API
> >>>> unless
> >>>>>> we have a very good reason to do so and with a proper discussion
> about
> >>>>> this
> >>>>>> as Alex said.
> >>>>>
> >>>>> The ProcessFunction API which is targeting to replace DataStream API
> is
> >>>>> still a proposal, not a decision. Sorry for the confusion, I should
> >> have
> >>>>> been more careful with my words, not giving the impression that this
> is
> >>>>> something we'll do anyway.
> >>>>>
> >>>>> There will be a FLIP describing the motivations and designs in
> detail,
> >>>> for
> >>>>> the community to discuss and vote on. We are still working on it.
> TBH,
> >>>> this
> >>>>> is not trivial and we would need more time on it.
> >>>>>
> >>>>> Just to quickly share some backgrounds:
> >>>>>
> >>>>>    - We see quite some problems with the current DataStream APIs
> >>>>>       - Users are working with concrete classes rather than
> >> interfaces,
> >>>>>       which means
> >>>>>       - Users can access methods that are designed to be used by
> >> internal
> >>>>>          classes, even though they are annotated with `@Internal`.
> >> E.g.,
> >>>>>          `DataStream#getTransformation`.
> >>>>>          - Changes to the non-API implementations (e.g.,
> >>>> `Transformation`)
> >>>>>          would affect the API classes (e.g., `DataStream`), which
> >>>>> makes it hard to
> >>>>>          provide binary compatibility.
> >>>>>       - Internal classes are used as parameter / return-value of
> >> public
> >>>>>       APIs. E.g., while `AbstractStreamOperator` is PublicEvolving,
> >>>>> `StreamTask`
> >>>>>       which returns from `AbstractStreamOperator#getContainingTask`
> is
> >>>>> Internal.
> >>>>>       - In many cases, users are asked to extend the API classes,
> >> rather
> >>>>>       than implementing interfaces. E.g., `AbstractStreamOperator`.
> >>>>>          - Any changes to the base classes, even the internal part,
> >> may
> >>>>>          affect the behavior of the user-provided sub-classes
> >>>>>          - Users can override the behavior of the base classes
> >>>>>       - The API module `flink-streaming-java` contains non-API
> >> classes,
> >>>> and
> >>>>>       depends on internal modules such as `flink-runtime`, which
> means
> >>>>>       - Changes to the internal modules may affect the API modules,
> >> which
> >>>>>          requires users to re-build their applications upon upgrading
> >>>>>          - The artifact user needs for building their application
> >> larger
> >>>>>          than necessary.
> >>>>>       - We probably should not expose operators (e.g.,
> >>>>>       `AbstractStreamOperator`) to users. Functions should be enough
> >>>>> for users to
> >>>>>       define their data processing logics. Exposing operator-level
> >>>> concepts
> >>>>>       (e.g., mailbox thread model, checkpoint barrier alignment,
> >> etc.) is
> >>>>>       unnecessary and limits the improvement regarding such exposed
> >>>>> mechanisms
> >>>>>       with compatibility considerations.
> >>>>>       - The current DataStream API seems to be a mixture of many
> >> things,
> >>>>>       making it hard to understand especially for newcomers. It might
> >> be
> >>>>> better
> >>>>>       to re-organize it into several parts: (the taxonomy below are
> >> just
> >>>> an
> >>>>>       example of the, we are still working on this)
> >>>>>          - The most fundamental stateful stream processing: streams,
> >>>>>          partitions / key, process functions, state, timeline-service
> >>>>>          - An extension for common batch-streaming unified functions:
> >>>> map,
> >>>>>          flatmap, filter, agg, reduce, join, etc.
> >>>>>          - An extension for windowing supports:  window, triggering
> >>>>>          - An extension for event-time supports: event time,
> watermark
> >>>>>          - The extensions are like short-cuts / sugars, without which
> >>>> users
> >>>>>          can probably still achieve the same behavior by working with
> >> the
> >>>>>          fundamental APIs, but would be a lot easier with the
> >> extensions
> >>>>>       - The original plan was to do in-place refactors / changes on
> >>>>>    DataStream API. Some related items are listed in this doc [2]
> >> attached
> >>>>> to
> >>>>>    the kicking off email [3]. Not all of the above issues are listed,
> >>>>> because
> >>>>>    we haven't looked into this as deeply as now  by that time.
> >>>>>    - We proposed this as a new API rather than in-place refactors in
> >> the
> >>>>>    2.0 work item list, because we realized the changes might be too
> >> big
> >>>>> for an
> >>>>>    in-place change. First having a new API then gradually retiring
> the
> >>>> old
> >>>>> one
> >>>>>    would help users to smoothly migrate between them.
> >>>>>
> >>>>> A thorough discussion is definitely needed once the FLIP is out. And
> of
> >>>>> course it's possible that the FLIP might be rejected. Given that we
> are
> >>>>> planning for release 2.0, I just feel it would be better to bring
> this
> >> up
> >>>>> early even the concrete plan is not yet ready,
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Xintong
> >>>>>
> >>>>>
> >>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>> [2]
> >>>>>
> >>>>>
> >>>>
> >>
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> >>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> >>>>>
> >>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org>
> wrote:
> >>>>>
> >>>>>> Hey!
> >>>>>>
> >>>>>> I share the same concerns mentioned above regarding the
> >>>> "ProcessFunction
> >>>>>> API".
> >>>>>>
> >>>>>> I don't think we should create a replacement for the DataStream API
> >>>>> unless
> >>>>>> we have a very good reason to do so and with a proper discussion
> about
> >>>>> this
> >>>>>> as Alex said.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Gyula
> >>>>>>
> >>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> >>>>>> alexander.fedulov@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Xintong,
> >>>>>>>
> >>>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> >>>>>> Introduce
> >>>>>>> an API deprecation process" thread [1]?
> >>>>>>>
> >>>>>>> I am also curious to know if the rationale behind this new API has
> >>>> been
> >>>>>>> previously discussed on the mailing list. Do we have a list of
> >>>>>> shortcomings
> >>>>>>> in the current DataStream API that it tries to resolve? How does
> the
> >>>>>>> current ProcessFunction functionality fit into the picture? Will it
> >>>> be
> >>>>>> kept
> >>>>>>> as is or subsumed by new API?
> >>>>>>>
> >>>>>>> [1]
> https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Alex
> >>>>>>>
> >>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> >>>>> because
> >>>>>>> it's
> >>>>>>>>> very unclear what it actually entails; like is it an entirely
> >>>>>> separate
> >>>>>>>> API
> >>>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
> >>>>> How
> >>>>>>>> much
> >>>>>>>>> will it share the internals with DataStream etc.; how does it
> >>>>> relate
> >>>>>> to
> >>>>>>>> the
> >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >>>> underneath).
> >>>>>>>> I totally understand your confusion. We started planning this
> after
> >>>>>>> kicking
> >>>>>>>> off the release 2.0, so there's still a lot to be explored and the
> >>>>> plan
> >>>>>>>> keeps changing.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>    - In the beginning, we planned to do an in-place refactor of
> >>>>>>> DataStream
> >>>>>>>>    API, until the API migration period is proposed.
> >>>>>>>>    - Then we want to make it an entirely separate API to
> >>>> DataStream,
> >>>>>> and
> >>>>>>>>    listed as a must-have for release 2.0 so that we can remove
> >>>>>> DataStream
> >>>>>>>> once
> >>>>>>>>    it's ready.
> >>>>>>>>    - However, depending on the outcome of the API compatibility
> >>>>>>> discussion
> >>>>>>>>    [1], we may not be able to remove DataStream in 2.0 anyway,
> >>>> which
> >>>>>>> means
> >>>>>>>> we
> >>>>>>>>    might need to re-evaluate the necessity of this item for 2.0.
> >>>>>>>>
> >>>>>>>> I'd say we wait a bit longer for the compatibility discussion [1]
> >>>> and
> >>>>>>>> decide the priority for this item afterwards.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>>
> >>>>>>>> Xintong
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> >>>> chesnay@apache.org
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> by-and-large I'm quite happy with the list of items.
> >>>>>>>>>
> >>>>>>>>> I'm curious as to why the "Disaggregated State Management" item
> >>>> is
> >>>>>>> marked
> >>>>>>>>> as a must-have; will it require changes that break something?
> >>>> What
> >>>>>>>> prevents
> >>>>>>>>> it from being added in 2.1?
> >>>>>>>>>
> >>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> >>>>> default,
> >>>>>>> drop
> >>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8"
> >>>> and
> >>>>> a
> >>>>>>>>> nice-to-have "Drop Java 11"?
> >>>>>>>>>
> >>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
> >>>>> would
> >>>>>>> be
> >>>>>>>>> an entirely internal change, and could thus be an incremental
> >>>>> process
> >>>>>>>>> independent of major releases.
> >>>>>>>>> What is the actual scale of this item; how much are we actually
> >>>>>>>> re-writing?
> >>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> >>>> must-have; i
> >>>>>>> think
> >>>>>>>>> I marked it down as nice-to-have only because it depends on
> >>>> another
> >>>>>>> item.
> >>>>>>>>> The ProcessFunction API item is giving me the most headaches
> >>>>> because
> >>>>>>> it's
> >>>>>>>>> very unclear what it actually entails; like is it an entirely
> >>>>>> separate
> >>>>>>>> API
> >>>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
> >>>>> How
> >>>>>>>> much
> >>>>>>>>> will it share the internals with DataStream etc.; how does it
> >>>>> relate
> >>>>>> to
> >>>>>>>> the
> >>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >>>> underneath).
> >>>>>>>>> There are a few items I added as ideas which don't have a
> >>>> priority
> >>>>>> yet;
> >>>>>>>>> would love to get some feedback on those.
> >>>>>>>>>
> >>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> >>>>>>>>>
> >>>>>>>>> Hi devs,
> >>>>>>>>>
> >>>>>>>>> As previously discussed in [1], we had been collecting work item
> >>>>>>>> proposals
> >>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> >>>>>>>>>
> >>>>>>>>>    - As we have passed the due date, I'd like to kindly remind
> >>>>>> everyone
> >>>>>>>> *not
> >>>>>>>>>    to add / remove items directly on the wiki page*. If needed,
> >>>>>> please
> >>>>>>>> post
> >>>>>>>>>    in this thread or reach out to the release managers instead.
> >>>>>>>>>    - I've reached out to some folks for clarifications about
> >>>> their
> >>>>>>>>>    proposals. Some of them mentioned that they can not yet tell
> >>>>>> whether
> >>>>>>>> we
> >>>>>>>>>    should do an item or not, and would need more time /
> >>>> discussions
> >>>>>> to
> >>>>>>>> make
> >>>>>>>>>    the decision. So I added a new symbol for items whose
> >>>> priorities
> >>>>>> are
> >>>>>>>> `TBD`.
> >>>>>>>>> Now it's time to collaboratively decide a minimum set of
> >>>> must-have
> >>>>>>> items.
> >>>>>>>>> I've gone through the entire list of proposed items, and found
> >>>> most
> >>>>>> of
> >>>>>>>> them
> >>>>>>>>> make quite much sense. So I think an online sync might not be
> >>>>>> necessary
> >>>>>>>> for
> >>>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone can
> >>>>>>> comment
> >>>>>>>>> on how they think the list can be improved, followed by a VOTE to
> >>>>>>>> formally
> >>>>>>>>> make the decision.
> >>>>>>>>>
> >>>>>>>>> Any feedback and opinions, including but not limited to the
> >>>>> following
> >>>>>>>>> aspects, will be appreciated.
> >>>>>>>>>
> >>>>>>>>>    - Important items that are missing from the list
> >>>>>>>>>    - Concerns regarding the listed items or their priorities
> >>>>>>>>>
> >>>>>>>>> Looking forward to your feedback.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>>
> >>>>>>>>> Xintong
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> [1]
> >>>>
> >>
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> >>>>>>>>> [2]
> >>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> >>>>>>>>>
> >>>>>>>>>
> >>>>
> >>>> --
> >>>> Best regards,
> >>>> Sergey
> >>>>
> >>
> >>
>
>

-- 
Best

ConradJam

Re: [DISCUSS] Release 2.0 Work Items

Posted by "Teoh, Hong" <li...@amazon.co.uk.INVALID>.
Thanks Xintong for driving the effort.

I’d add a +1 to reworking configs, as suggested by @Jark and @Chesnay, especially the types. We have various configs that encode a Time / MemorySize value as a plain Long instead!

Regards,
Hong
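
To illustrate the point about option types, here is a minimal sketch, assuming a plain-Java setting, of how a unit-carrying value removes the ambiguity of a bare Long. The class and both parsers are hypothetical and written only for this example; this is not Flink's `ConfigOption` API, and the accepted unit suffixes are an assumption.

```java
import java.time.Duration;

public class TypedConfigSketch {

    // Parses the number that precedes a unit suffix, e.g. "64" in "64mb".
    private static long numberBefore(String text, String suffix) {
        return Long.parseLong(text.substring(0, text.length() - suffix.length()).trim());
    }

    // Hypothetical parser: accepts values such as "500ms", "10s", "5min".
    static Duration parseDuration(String text) {
        String t = text.trim().toLowerCase();
        if (t.endsWith("ms")) return Duration.ofMillis(numberBefore(t, "ms"));
        if (t.endsWith("min")) return Duration.ofMinutes(numberBefore(t, "min"));
        if (t.endsWith("s")) return Duration.ofSeconds(numberBefore(t, "s"));
        throw new IllegalArgumentException("Missing or unknown time unit: " + text);
    }

    // Hypothetical parser: accepts values such as "512b", "64kb", "128mb", "1gb".
    static long parseMemoryBytes(String text) {
        String t = text.trim().toLowerCase();
        if (t.endsWith("kb")) return numberBefore(t, "kb") * 1024L;
        if (t.endsWith("mb")) return numberBefore(t, "mb") * 1024L * 1024L;
        if (t.endsWith("gb")) return numberBefore(t, "gb") * 1024L * 1024L * 1024L;
        if (t.endsWith("b")) return numberBefore(t, "b");
        throw new IllegalArgumentException("Missing or unknown memory unit: " + text);
    }

    public static void main(String[] args) {
        // A bare long: milliseconds or seconds? The type alone cannot say.
        long rawTimeout = 10_000L;

        // Typed values carry their unit, so the reader does not have to guess.
        Duration timeout = parseDuration("10s");
        long bufferBytes = parseMemoryBytes("64mb");

        System.out.println(rawTimeout + " (unit unknown)");
        System.out.println(timeout.toMillis() + " ms");   // 10000 ms
        System.out.println(bufferBytes + " bytes");       // 67108864 bytes
    }
}
```

Rejecting values without a unit is a deliberate choice in this sketch: it turns a silent misconfiguration into an immediate error.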



> On 29 Jun 2023, at 16:19, Yuan Mei <yu...@gmail.com> wrote:
> 
> Thanks for driving this effort, Xintong!
> 
> To Chesnay
>> I'm curious as to why the "Disaggregated State Management" item is
>> marked as a must-have; will it require changes that break something?
>> What prevents it from being added in 2.1?
> 
> As to "Disaggregated State Management".
> 
> We plan to provide a new type of state backend to support DFS as primary
> storage.
> To achieve this, we at least need to include two parts of amends (not
> entirely sure yet, since we are still in the designing and prototype phase)
> 
> 1. Statebackend Change
> 2. State Access Change
> 
> Not all of the interfaces related are `@Internal`. Some of the interfaces
> like `StateBackend` is `@PublicEvolving`
> So, you are right in the sense that "Disaggregated State Management" itself
> probably does not need to be a "Must Have"
> 
> But I was hoping changes that related to public APIs can be finalized and
> merged in Flink 2.0 (I will fix the wiki accordingly).
> 
> I also agree with Jark that 2.0 is a good chance to rework the default
> value of configurations.
> 
> Best
> Yuan
> 
> 
> On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <ch...@apache.org> wrote:
> 
>> Something else configuration-related is that there are a bunch of
>> options where the type isn't quite correct (e.g., a String where it
>> could be an enum, a string where it should be an int or something).
>> Could do a pass over those as well.
>> 
>> On 29/06/2023 13:50, Jark Wu wrote:
>>> Hi,
>>> 
>>> I think one more thing we need to consider to do in 2.0 is changing the
>>> default value of configuration to improve out-of-box user experience.
>>> 
>>> Currently, in order to run a Flink job, users may need to set
>>> a bunch of configurations, such as minibatch, checkpoint interval,
>>> exactly-once,
>>> incremental-checkpoint, etc. It's very verbose and hard to use for
>>> beginners.
>>> Most of them can have a universally applicable value.  Because changing
>> the
>>> default value is a breaking change. I think It's worth considering
>> changing
>>> them in 2.0.
>>> 
>>> What do you think?
>>> 
>>> Best,
>>> Jark
>>> 
>>> 
>>> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com>
>> wrote:
>>> 
>>>> Hi Chesnay
>>>> 
>>>>> "Move Calcite rules from Scala to Java": I would hope that this would
>> be
>>>>> an entirely internal change, and could thus be an incremental process
>>>>> independent of major releases.
>>>>> What is the actual scale of this item; how much are we actually
>>>> re-writing?
>>>> 
>>>> Thanks for asking
>>>> yes, you're right, that should be internal change.
>>>> Yeah I was also thinking about incremental change (rule by rule or
>>>> reasonable small group of rules).
>>>> And yes, this could be an independent (on major release) activity
>>>> 
>>>> The problem is actually for children of RelOptRule.
>>>> Currently I see 60+ such rules (in Scala) using the mentioned deprecated
>>>> api.
>>>> There are also children of ConverterRule (50+) which do not have such
>>>> issues.
>>>> Maybe it could be considered as the next step to have all the rules in
>>>> Java.
>>>> 
>>>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <to...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi Alex & Gyula,
>>>>> 
>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
>>>> Introduce
>>>>>> an API deprecation process" thread [1]?
>>>>>> 
>>>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the wrong
>>>> url
>>>>> in my previous email. Sorry for the mistake.
>>>>> 
>>>>> I am also curious to know if the rationale behind this new API has been
>>>>>> previously discussed on the mailing list. Do we have a list of
>>>>> shortcomings
>>>>>> in the current DataStream API that it tries to resolve? How does the
>>>>>> current ProcessFunction functionality fit into the picture? Will it be
>>>>> kept
>>>>>> as is or subsumed by new API?
>>>>>> 
>>>>> I don't think we should create a replacement for the DataStream API
>>>> unless
>>>>>> we have a very good reason to do so and with a proper discussion about
>>>>> this
>>>>>> as Alex said.
>>>>> 
>>>>> The ProcessFunction API which is targeting to replace DataStream API is
>>>>> still a proposal, not a decision. Sorry for the confusion, I should
>> have
>>>>> been more careful with my words, not giving the impression that this is
>>>>> something we'll do anyway.
>>>>> 
>>>>> There will be a FLIP describing the motivations and designs in detail,
>>>> for
>>>>> the community to discuss and vote on. We are still working on it. TBH,
>>>> this
>>>>> is not trivial and we would need more time on it.
>>>>> 
>>>>> Just to quickly share some backgrounds:
>>>>> 
>>>>>    - We see quite some problems with the current DataStream APIs
>>>>>       - Users are working with concrete classes rather than
>> interfaces,
>>>>>       which means
>>>>>       - Users can access methods that are designed to be used by
>> internal
>>>>>          classes, even though they are annotated with `@Internal`.
>> E.g.,
>>>>>          `DataStream#getTransformation`.
>>>>>          - Changes to the non-API implementations (e.g.,
>>>> `Transformation`)
>>>>>          would affect the API classes (e.g., `DataStream`), which
>>>>> makes it hard to
>>>>>          provide binary compatibility.
>>>>>       - Internal classes are used as parameter / return-value of
>> public
>>>>>       APIs. E.g., while `AbstractStreamOperator` is PublicEvolving,
>>>>> `StreamTask`
>>>>>       which returns from `AbstractStreamOperator#getContainingTask` is
>>>>> Internal.
>>>>>       - In many cases, users are asked to extend the API classes,
>> rather
>>>>>       than implementing interfaces. E.g., `AbstractStreamOperator`.
>>>>>          - Any changes to the base classes, even the internal part,
>> may
>>>>>          affect the behavior of the user-provided sub-classes
>>>>>          - Users can override the behavior of the base classes
>>>>>       - The API module `flink-streaming-java` contains non-API
>> classes,
>>>> and
>>>>>       depends on internal modules such as `flink-runtime`, which means
>>>>>       - Changes to the internal modules may affect the API modules,
>> which
>>>>>          requires users to re-build their applications upon upgrading
>>>>>          - The artifact user needs for building their application
>> larger
>>>>>          than necessary.
>>>>>       - We probably should not expose operators (e.g.,
>>>>>       `AbstractStreamOperator`) to users. Functions should be enough
>>>>> for users to
>>>>>       define their data processing logics. Exposing operator-level
>>>> concepts
>>>>>       (e.g., mailbox thread model, checkpoint barrier alignment,
>> etc.) is
>>>>>       unnecessary and limits the improvement regarding such exposed
>>>>> mechanisms
>>>>>       with compatibility considerations.
>>>>>       - The current DataStream API seems to be a mixture of many
>> things,
>>>>>       making it hard to understand especially for newcomers. It might
>> be
>>>>> better
>>>>>       to re-organize it into several parts: (the taxonomy below are
>> just
>>>> an
>>>>>       example of the, we are still working on this)
>>>>>          - The most fundamental stateful stream processing: streams,
>>>>>          partitions / key, process functions, state, timeline-service
>>>>>          - An extension for common batch-streaming unified functions:
>>>> map,
>>>>>          flatmap, filter, agg, reduce, join, etc.
>>>>>          - An extension for windowing supports:  window, triggering
>>>>>          - An extension for event-time supports: event time, watermark
>>>>>          - The extensions are like short-cuts / sugars, without which
>>>> users
>>>>>          can probably still achieve the same behavior by working with
>> the
>>>>>          fundamental APIs, but would be a lot easier with the
>> extensions
>>>>>       - The original plan was to do in-place refactors / changes on
>>>>>    DataStream API. Some related items are listed in this doc [2]
>> attached
>>>>> to
>>>>>    the kicking off email [3]. Not all of the above issues are listed,
>>>>> because
>>>>>    we haven't looked into this as deeply as now  by that time.
>>>>>    - We proposed this as a new API rather than in-place refactors in
>> the
>>>>>    2.0 work item list, because we realized the changes might be too
>> big
>>>>> for an
>>>>>    in-place change. First having a new API then gradually retiring the
>>>> old
>>>>> one
>>>>>    would help users to smoothly migrate between them.
>>>>> 
>>>>> A thorough discussion is definitely needed once the FLIP is out. And of
>>>>> course it's possible that the FLIP might be rejected. Given that we are
>>>>> planning for release 2.0, I just feel it would be better to bring this
>> up
>>>>> early even the concrete plan is not yet ready,
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Xintong
>>>>> 
>>>>> 
>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>>>> [2]
>>>>> 
>>>>> 
>>>> 
>> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
>>>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
>>>>> 
>>>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org> wrote:
>>>>> 
>>>>>> Hey!
>>>>>> 
>>>>>> I share the same concerns mentioned above regarding the
>>>> "ProcessFunction
>>>>>> API".
>>>>>> 
>>>>>> I don't think we should create a replacement for the DataStream API
>>>>> unless
>>>>>> we have a very good reason to do so and with a proper discussion about
>>>>> this
>>>>>> as Alex said.
>>>>>> 
>>>>>> Cheers,
>>>>>> Gyula
>>>>>> 
>>>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
>>>>>> alexander.fedulov@gmail.com> wrote:
>>>>>> 
>>>>>>> Hi Xintong,
>>>>>>> 
>>>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
>>>>>> Introduce
>>>>>>> an API deprecation process" thread [1]?
>>>>>>> 
>>>>>>> I am also curious to know if the rationale behind this new API has
>>>> been
>>>>>>> previously discussed on the mailing list. Do we have a list of
>>>>>> shortcomings
>>>>>>> in the current DataStream API that it tries to resolve? How does the
>>>>>>> current ProcessFunction functionality fit into the picture? Will it
>>>> be
>>>>>> kept
>>>>>>> as is or subsumed by new API?
>>>>>>> 
>>>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>>>>>> 
>>>>>>> Best,
>>>>>>> Alex
>>>>>>> 
>>>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com>
>>>>>> wrote:
>>>>>>>>> The ProcessFunction API item is giving me the most headaches
>>>>> because
>>>>>>> it's
>>>>>>>>> very unclear what it actually entails; like is it an entirely
>>>>>> separate
>>>>>>>> API
>>>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
>>>>> How
>>>>>>>> much
>>>>>>>>> will it share the internals with DataStream etc.; how does it
>>>>> relate
>>>>>> to
>>>>>>>> the
>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
>>>> underneath).
>>>>>>>> I totally understand your confusion. We started planning this after
>>>>>>> kicking
>>>>>>>> off the release 2.0, so there's still a lot to be explored and the
>>>>> plan
>>>>>>>> keeps changing.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>    - In the beginning, we planned to do an in-place refactor of
>>>>>>> DataStream
>>>>>>>>    API, until the API migration period is proposed.
>>>>>>>>    - Then we want to make it an entirely separate API to
>>>> DataStream,
>>>>>> and
>>>>>>>>    listed as a must-have for release 2.0 so that we can remove
>>>>>> DataStream
>>>>>>>> once
>>>>>>>>    it's ready.
>>>>>>>>    - However, depending on the outcome of the API compatibility
>>>>>>> discussion
>>>>>>>>    [1], we may not be able to remove DataStream in 2.0 anyway,
>>>> which
>>>>>>> means
>>>>>>>> we
>>>>>>>>    might need to re-evaluate the necessity of this item for 2.0.
>>>>>>>> 
>>>>>>>> I'd say we wait a bit longer for the compatibility discussion [1]
>>>> and
>>>>>>>> decide the priority for this item afterwards.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> 
>>>>>>>> Xintong
>>>>>>>> 
>>>>>>>> 
>>>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
>>>> chesnay@apache.org
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> by-and-large I'm quite happy with the list of items.
>>>>>>>>> 
>>>>>>>>> I'm curious as to why the "Disaggregated State Management" item
>>>> is
>>>>>>> marked
>>>>>>>>> as a must-have; will it require changes that break something?
>>>> What
>>>>>>>> prevents
>>>>>>>>> it from being added in 2.1?
>>>>>>>>> 
>>>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
>>>>> default,
>>>>>>> drop
>>>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8"
>>>> and
>>>>> a
>>>>>>>>> nice-to-have "Drop Java 11"?
>>>>>>>>> 
>>>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
>>>>> would
>>>>>>> be
>>>>>>>>> an entirely internal change, and could thus be an incremental
>>>>> process
>>>>>>>>> independent of major releases.
>>>>>>>>> What is the actual scale of this item; how much are we actually
>>>>>>>> re-writing?
>>>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
>>>> must-have; i
>>>>>>> think
>>>>>>>>> I marked it down as nice-to-have only because it depends on
>>>> another
>>>>>>> item.
>>>>>>>>> The ProcessFunction API item is giving me the most headaches
>>>>> because
>>>>>>> it's
>>>>>>>>> very unclear what it actually entails; like is it an entirely
>>>>>> separate
>>>>>>>> API
>>>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
>>>>> How
>>>>>>>> much
>>>>>>>>> will it share the internals with DataStream etc.; how does it
>>>>> relate
>>>>>> to
>>>>>>>> the
>>>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
>>>> underneath).
>>>>>>>>> There are a few items I added as ideas which don't have a
>>>> priority
>>>>>> yet;
>>>>>>>>> would love to get some feedback on those.
>>>>>>>>> 
>>>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
>>>>>>>>> 
>>>>>>>>> Hi devs,
>>>>>>>>> 
>>>>>>>>> As previously discussed in [1], we had been collecting work item
>>>>>>>> proposals
>>>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
>>>>>>>>> 
>>>>>>>>>    - As we have passed the due date, I'd like to kindly remind
>>>>>> everyone
>>>>>>>> *not
>>>>>>>>>    to add / remove items directly on the wiki page*. If needed,
>>>>>> please
>>>>>>>> post
>>>>>>>>>    in this thread or reach out to the release managers instead.
>>>>>>>>>    - I've reached out to some folks for clarifications about
>>>> their
>>>>>>>>>    proposals. Some of them mentioned that they can not yet tell
>>>>>> whether
>>>>>>>> we
>>>>>>>>>    should do an item or not, and would need more time /
>>>> discussions
>>>>>> to
>>>>>>>> make
>>>>>>>>>    the decision. So I added a new symbol for items whose
>>>> priorities
>>>>>> are
>>>>>>>> `TBD`.
>>>>>>>>> Now it's time to collaboratively decide a minimum set of
>>>> must-have
>>>>>>> items.
>>>>>>>>> I've gone through the entire list of proposed items, and found
>>>> most
>>>>>> of
>>>>>>>> them
>>>>>>>>> make quite much sense. So I think an online sync might not be
>>>>>> necessary
>>>>>>>> for
>>>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone can
>>>>>>> comment
>>>>>>>>> on how they think the list can be improved, followed by a VOTE to
>>>>>>>> formally
>>>>>>>>> make the decision.
>>>>>>>>> 
>>>>>>>>> Any feedback and opinions, including but not limited to the
>>>>> following
>>>>>>>>> aspects, will be appreciated.
>>>>>>>>> 
>>>>>>>>>    - Important items that are missing from the list
>>>>>>>>>    - Concerns regarding the listed items or their priorities
>>>>>>>>> 
>>>>>>>>> Looking forward to your feedback.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> 
>>>>>>>>> Xintong
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [1]
>>>> 
>> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
>>>>>>>>> [2]
>>>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>>>>>>>>> 
>>>>>>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> Sergey
>>>> 
>> 
>> 


Re: [DISCUSS] Release 2.0 Work Items

Posted by Yuan Mei <yu...@gmail.com>.
Thanks for driving this effort, Xintong!

To Chesnay
> I'm curious as to why the "Disaggregated State Management" item is
> marked as a must-have; will it require changes that break something?
> What prevents it from being added in 2.1?

Regarding "Disaggregated State Management":

We plan to provide a new type of state backend that supports DFS as primary
storage.
To achieve this, we need at least two sets of changes (not entirely sure
yet, since we are still in the design and prototyping phase):

1. State backend changes
2. State access changes

Not all of the related interfaces are `@Internal`. Some of them, such as
`StateBackend`, are `@PublicEvolving`.
So you are right in the sense that "Disaggregated State Management" itself
probably does not need to be a "Must Have".

But I was hoping that the changes related to public APIs can be finalized and
merged in Flink 2.0 (I will fix the wiki accordingly).

I also agree with Jark that 2.0 is a good chance to rework the default
values of configuration options.

Best
Yuan


On Thu, Jun 29, 2023 at 8:43 PM Chesnay Schepler <ch...@apache.org> wrote:

> Something else configuration-related is that there are a bunch of
> options where the type isn't quite correct (e.g., a String where it
> could be an enum, a string where it should be an int or something).
> Could do a pass over those as well.
>
> On 29/06/2023 13:50, Jark Wu wrote:
> > Hi,
> >
> > I think one more thing we need to consider to do in 2.0 is changing the
> > default value of configuration to improve out-of-box user experience.
> >
> > Currently, in order to run a Flink job, users may need to set
> > a bunch of configurations, such as minibatch, checkpoint interval,
> > exactly-once,
> > incremental-checkpoint, etc. It's very verbose and hard to use for
> > beginners.
> > Most of them can have a universally applicable value.  Because changing
> the
> > default value is a breaking change. I think It's worth considering
> changing
> > them in 2.0.
> >
> > What do you think?
> >
> > Best,
> > Jark
> >
> >
> > On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com>
> wrote:
> >
> >> Hi Chesnay
> >>
> >>> "Move Calcite rules from Scala to Java": I would hope that this would
> be
> >>> an entirely internal change, and could thus be an incremental process
> >>> independent of major releases.
> >>> What is the actual scale of this item; how much are we actually
> >> re-writing?
> >>
> >> Thanks for asking
> >> yes, you're right, that should be internal change.
> >> Yeah I was also thinking about incremental change (rule by rule or
> >> reasonable small group of rules).
> >> And yes, this could be an independent (on major release) activity
> >>
> >> The problem is actually for children of RelOptRule.
> >> Currently I see 60+ such rules (in Scala) using the mentioned deprecated
> >> api.
> >> There are also children of ConverterRule (50+) which do not have such
> >> issues.
> >> Maybe it could be considered as the next step to have all the rules in
> >> Java.
> >>
> >> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <to...@gmail.com>
> >> wrote:
> >>
> >>> Hi Alex & Gyula,
> >>>
> >>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> >> Introduce
> >>>> an API deprecation process" thread [1]?
> >>>>
> >>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the wrong
> >> url
> >>> in my previous email. Sorry for the mistake.
> >>>
> >>> I am also curious to know if the rationale behind this new API has been
> >>>> previously discussed on the mailing list. Do we have a list of
> >>> shortcomings
> >>>> in the current DataStream API that it tries to resolve? How does the
> >>>> current ProcessFunction functionality fit into the picture? Will it be
> >>> kept
> >>>> as is or subsumed by new API?
> >>>>
> >>> I don't think we should create a replacement for the DataStream API
> >> unless
> >>>> we have a very good reason to do so and with a proper discussion about
> >>> this
> >>>> as Alex said.
> >>>
> >>> The ProcessFunction API which is targeting to replace DataStream API is
> >>> still a proposal, not a decision. Sorry for the confusion, I should
> have
> >>> been more careful with my words, not giving the impression that this is
> >>> something we'll do anyway.
> >>>
> >>> There will be a FLIP describing the motivations and designs in detail,
> >> for
> >>> the community to discuss and vote on. We are still working on it. TBH,
> >> this
> >>> is not trivial and we would need more time on it.
> >>>
> >>> Just to quickly share some backgrounds:
> >>>
> >>>     - We see quite some problems with the current DataStream APIs
> >>>        - Users are working with concrete classes rather than
> interfaces,
> >>>        which means
> >>>        - Users can access methods that are designed to be used by
> internal
> >>>           classes, even though they are annotated with `@Internal`.
> E.g.,
> >>>           `DataStream#getTransformation`.
> >>>           - Changes to the non-API implementations (e.g.,
> >> `Transformation`)
> >>>           would affect the API classes (e.g., `DataStream`), which
> >>> makes it hard to
> >>>           provide binary compatibility.
> >>>        - Internal classes are used as parameter / return-value of
> public
> >>>        APIs. E.g., while `AbstractStreamOperator` is PublicEvolving,
> >>> `StreamTask`
> >>>        which returns from `AbstractStreamOperator#getContainingTask` is
> >>> Internal.
> >>>        - In many cases, users are asked to extend the API classes,
> rather
> >>>        than implementing interfaces. E.g., `AbstractStreamOperator`.
> >>>           - Any changes to the base classes, even the internal part,
> may
> >>>           affect the behavior of the user-provided sub-classes
> >>>           - Users can override the behavior of the base classes
> >>>        - The API module `flink-streaming-java` contains non-API
> classes,
> >> and
> >>>        depends on internal modules such as `flink-runtime`, which means
> >>>        - Changes to the internal modules may affect the API modules,
> which
> >>>           requires users to re-build their applications upon upgrading
> >>>           - The artifact user needs for building their application
> larger
> >>>           than necessary.
> >>>        - We probably should not expose operators (e.g.,
> >>>        `AbstractStreamOperator`) to users. Functions should be enough
> >>> for users to
> >>>        define their data processing logics. Exposing operator-level
> >> concepts
> >>>        (e.g., mailbox thread model, checkpoint barrier alignment,
> etc.) is
> >>>        unnecessary and limits the improvement regarding such exposed
> >>> mechanisms
> >>>        with compatibility considerations.
> >>>        - The current DataStream API seems to be a mixture of many
> things,
> >>>        making it hard to understand especially for newcomers. It might
> be
> >>> better
> >>>        to re-organize it into several parts: (the taxonomy below are
> just
> >> an
> >>>        example of the, we are still working on this)
> >>>           - The most fundamental stateful stream processing: streams,
> >>>           partitions / key, process functions, state, timeline-service
> >>>           - An extension for common batch-streaming unified functions:
> >> map,
> >>>           flatmap, filter, agg, reduce, join, etc.
> >>>           - An extension for windowing supports:  window, triggering
> >>>           - An extension for event-time supports: event time, watermark
> >>>           - The extensions are like short-cuts / sugars, without which
> >> users
> >>>           can probably still achieve the same behavior by working with
> the
> >>>           fundamental APIs, but would be a lot easier with the
> extensions
> >>>        - The original plan was to do in-place refactors / changes on
> >>>     DataStream API. Some related items are listed in this doc [2]
> attached
> >>> to
> >>>     the kicking off email [3]. Not all of the above issues are listed,
> >>> because
> >>>     we haven't looked into this as deeply as now  by that time.
> >>>     - We proposed this as a new API rather than in-place refactors in
> the
> >>>     2.0 work item list, because we realized the changes might be too
> big
> >>> for an
> >>>     in-place change. First having a new API then gradually retiring the
> >> old
> >>> one
> >>>     would help users to smoothly migrate between them.
> >>>
> >>> A thorough discussion is definitely needed once the FLIP is out. And of
> >>> course it's possible that the FLIP might be rejected. Given that we are
> >>> planning for release 2.0, I just feel it would be better to bring this
> up
> >>> early even the concrete plan is not yet ready,
> >>>
> >>> Best,
> >>>
> >>> Xintong
> >>>
> >>>
> >>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>> [2]
> >>>
> >>>
> >>
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> >>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> >>>
> >>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org> wrote:
> >>>
> >>>> Hey!
> >>>>
> >>>> I share the same concerns mentioned above regarding the
> >> "ProcessFunction
> >>>> API".
> >>>>
> >>>> I don't think we should create a replacement for the DataStream API
> >>> unless
> >>>> we have a very good reason to do so and with a proper discussion about
> >>> this
> >>>> as Alex said.
> >>>>
> >>>> Cheers,
> >>>> Gyula
> >>>>
> >>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> >>>> alexander.fedulov@gmail.com> wrote:
> >>>>
> >>>>> Hi Xintong,
> >>>>>
> >>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> >>>> Introduce
> >>>>> an API deprecation process" thread [1]?
> >>>>>
> >>>>> I am also curious to know if the rationale behind this new API has
> >> been
> >>>>> previously discussed on the mailing list. Do we have a list of
> >>>> shortcomings
> >>>>> in the current DataStream API that it tries to resolve? How does the
> >>>>> current ProcessFunction functionality fit into the picture? Will it
> >> be
> >>>> kept
> >>>>> as is or subsumed by new API?
> >>>>>
> >>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >>>>>
> >>>>> Best,
> >>>>> Alex
> >>>>>
> >>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com>
> >>>> wrote:
> >>>>>>> The ProcessFunction API item is giving me the most headaches
> >>> because
> >>>>> it's
> >>>>>>> very unclear what it actually entails; like is it an entirely
> >>>> separate
> >>>>>> API
> >>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
> >>> How
> >>>>>> much
> >>>>>>> will it share the internals with DataStream etc.; how does it
> >>> relate
> >>>> to
> >>>>>> the
> >>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >> underneath).
> >>>>>> I totally understand your confusion. We started planning this after
> >>>>> kicking
> >>>>>> off the release 2.0, so there's still a lot to be explored and the
> >>> plan
> >>>>>> keeps changing.
> >>>>>>
> >>>>>>
> >>>>>>     - In the beginning, we planned to do an in-place refactor of
> >>>>> DataStream
> >>>>>>     API, until the API migration period is proposed.
> >>>>>>     - Then we want to make it an entirely separate API to
> >> DataStream,
> >>>> and
> >>>>>>     listed as a must-have for release 2.0 so that we can remove
> >>>> DataStream
> >>>>>> once
> >>>>>>     it's ready.
> >>>>>>     - However, depending on the outcome of the API compatibility
> >>>>> discussion
> >>>>>>     [1], we may not be able to remove DataStream in 2.0 anyway,
> >> which
> >>>>> means
> >>>>>> we
> >>>>>>     might need to re-evaluate the necessity of this item for 2.0.
> >>>>>>
> >>>>>> I'd say we wait a bit longer for the compatibility discussion [1]
> >> and
> >>>>>> decide the priority for this item afterwards.
> >>>>>>
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Xintong
> >>>>>>
> >>>>>>
> >>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> >> chesnay@apache.org
> >>>>>> wrote:
> >>>>>>
> >>>>>>> by-and-large I'm quite happy with the list of items.
> >>>>>>>
> >>>>>>> I'm curious as to why the "Disaggregated State Management" item
> >> is
> >>>>> marked
> >>>>>>> as a must-have; will it require changes that break something?
> >> What
> >>>>>> prevents
> >>>>>>> it from being added in 2.1?
> >>>>>>>
> >>>>>>> We may want to update the Java 17 item to "Make Java 17 the
> >>> default,
> >>>>> drop
> >>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8"
> >> and
> >>> a
> >>>>>>> nice-to-have "Drop Java 11"?
> >>>>>>>
> >>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
> >>> would
> >>>>> be
> >>>>>>> an entirely internal change, and could thus be an incremental
> >>> process
> >>>>>>> independent of major releases.
> >>>>>>> What is the actual scale of this item; how much are we actually
> >>>>>> re-writing?
> >>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
> >> must-have; i
> >>>>> think
> >>>>>>> I marked it down as nice-to-have only because it depends on
> >> another
> >>>>> item.
> >>>>>>> The ProcessFunction API item is giving me the most headaches
> >>> because
> >>>>> it's
> >>>>>>> very unclear what it actually entails; like is it an entirely
> >>>> separate
> >>>>>> API
> >>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
> >>> How
> >>>>>> much
> >>>>>>> will it share the internals with DataStream etc.; how does it
> >>> relate
> >>>> to
> >>>>>> the
> >>>>>>> Table API (w.r.t. switching APIs / what Table API uses
> >> underneath).
> >>>>>>> There are a few items I added as ideas which don't have a
> >> priority
> >>>> yet;
> >>>>>>> would love to get some feedback on those.
> >>>>>>>
> >>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
> >>>>>>>
> >>>>>>> Hi devs,
> >>>>>>>
> >>>>>>> As previously discussed in [1], we had been collecting work item
> >>>>>> proposals
> >>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
> >>>>>>>
> >>>>>>>     - As we have passed the due date, I'd like to kindly remind
> >>>> everyone
> >>>>>> *not
> >>>>>>>     to add / remove items directly on the wiki page*. If needed,
> >>>> please
> >>>>>> post
> >>>>>>>     in this thread or reach out to the release managers instead.
> >>>>>>>     - I've reached out to some folks for clarifications about
> >> their
> >>>>>>>     proposals. Some of them mentioned that they can not yet tell
> >>>> whether
> >>>>>> we
> >>>>>>>     should do an item or not, and would need more time /
> >> discussions
> >>>> to
> >>>>>> make
> >>>>>>>     the decision. So I added a new symbol for items whose
> >> priorities
> >>>> are
> >>>>>> `TBD`.
> >>>>>>> Now it's time to collaboratively decide a minimum set of
> >> must-have
> >>>>> items.
> >>>>>>> I've gone through the entire list of proposed items, and found
> >> most
> >>>> of
> >>>>>> them
> >>>>>>> make quite much sense. So I think an online sync might not be
> >>>> necessary
> >>>>>> for
> >>>>>>> this. I'd like to go with this DISCUSS thread, where everyone can
> >>>>> comment
> >>>>>>> on how they think the list can be improved, followed by a VOTE to
> >>>>>> formally
> >>>>>>> make the decision.
> >>>>>>>
> >>>>>>> Any feedback and opinions, including but not limited to the
> >>> following
> >>>>>>> aspects, will be appreciated.
> >>>>>>>
> >>>>>>>     - Important items that are missing from the list
> >>>>>>>     - Concerns regarding the listed items or their priorities
> >>>>>>>
> >>>>>>> Looking forward to your feedback.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>>
> >>>>>>> Xintong
> >>>>>>>
> >>>>>>>
> >>>>>>> [1]
> >>
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> >>>>>>> [2]
> >> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> >>>>>>>
> >>>>>>>
> >>
> >> --
> >> Best regards,
> >> Sergey
> >>
>
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Chesnay Schepler <ch...@apache.org>.
Something else configuration-related: there are a bunch of options where
the type isn't quite correct (e.g., a String where it could be an enum,
or a String where it should be an int).
We could do a pass over those as well.
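
To illustrate the kind of cleanup meant here, below is a minimal,
self-contained sketch. It deliberately does not use Flink's actual
configuration classes: `TypedOption`, `DeliveryMode`, and the `example.*`
keys are hypothetical names made up for illustration. The idea is only that
an option declared with an enum or int type can reject a bad value when it
is read, whereas a String-typed option passes it through unchecked.

```java
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

public class TypedOptionSketch {

    /** Hypothetical set of allowed values for an option that used to be a free-form String. */
    enum DeliveryMode { EXACTLY_ONCE, AT_LEAST_ONCE }

    /** A tiny typed option: a key, a default, and a parser that rejects bad input early. */
    static final class TypedOption<T> {
        final String key;
        final T defaultValue;
        final Function<String, T> parser;

        TypedOption(String key, T defaultValue, Function<String, T> parser) {
            this.key = key;
            this.defaultValue = defaultValue;
            this.parser = parser;
        }

        /** Returns the parsed value, the default if the key is absent, or throws on bad input. */
        T read(Map<String, String> rawConfig) {
            String raw = rawConfig.get(key);
            if (raw == null) {
                return defaultValue;
            }
            try {
                return parser.apply(raw.trim());
            } catch (IllegalArgumentException e) {
                // Enum.valueOf and Integer.parseInt both fail with IllegalArgumentException
                // (NumberFormatException is a subclass), so one handler covers both options.
                throw new IllegalArgumentException(
                        "Invalid value '" + raw + "' for option '" + key + "'", e);
            }
        }
    }

    /** Enum-typed instead of String-typed (hypothetical key). */
    static final TypedOption<DeliveryMode> MODE = new TypedOption<>(
            "example.delivery.mode",
            DeliveryMode.AT_LEAST_ONCE,
            s -> DeliveryMode.valueOf(s.toUpperCase(Locale.ROOT).replace('-', '_')));

    /** Int-typed instead of String-typed (hypothetical key). */
    static final TypedOption<Integer> RETRIES = new TypedOption<>(
            "example.retries", 3, Integer::parseInt);

    public static void main(String[] args) {
        Map<String, String> raw = Map.of(
                "example.delivery.mode", "exactly-once",
                "example.retries", "5");
        System.out.println(MODE.read(raw));
        System.out.println(RETRIES.read(raw));
    }
}
```

With this shape, a typo such as "exactly-onse" fails when the option is
read, with the key named in the message, rather than surfacing later in
whatever code interprets the string.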

On 29/06/2023 13:50, Jark Wu wrote:
> Hi,
>
> I think one more thing we need to consider to do in 2.0 is changing the
> default value of configuration to improve out-of-box user experience.
>
> Currently, in order to run a Flink job, users may need to set
> a bunch of configurations, such as minibatch, checkpoint interval,
> exactly-once,
> incremental-checkpoint, etc. It's very verbose and hard to use for
> beginners.
> Most of them can have a universally applicable value.  Because changing the
> default value is a breaking change. I think It's worth considering changing
> them in 2.0.
>
> What do you think?
>
> Best,
> Jark
>
>
> On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com> wrote:
>
>> Hi Chesnay
>>
>>> "Move Calcite rules from Scala to Java": I would hope that this would be
>>> an entirely internal change, and could thus be an incremental process
>>> independent of major releases.
>>> What is the actual scale of this item; how much are we actually
>> re-writing?
>>
>> Thanks for asking
>> yes, you're right, that should be internal change.
>> Yeah I was also thinking about incremental change (rule by rule or
>> reasonable small group of rules).
>> And yes, this could be an independent (on major release) activity
>>
>> The problem is actually for children of RelOptRule.
>> Currently I see 60+ such rules (in Scala) using the mentioned deprecated
>> api.
>> There are also children of ConverterRule (50+) which do not have such
>> issues.
>> Maybe it could be considered as the next step to have all the rules in
>> Java.
>>
>> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <to...@gmail.com>
>> wrote:
>>
>>> Hi Alex & Gyula,
>>>
>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
>> Introduce
>>>> an API deprecation process" thread [1]?
>>>>
>>> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the wrong
>> url
>>> in my previous email. Sorry for the mistake.
>>>
>>> I am also curious to know if the rationale behind this new API has been
>>>> previously discussed on the mailing list. Do we have a list of
>>> shortcomings
>>>> in the current DataStream API that it tries to resolve? How does the
>>>> current ProcessFunction functionality fit into the picture? Will it be
>>> kept
>>>> as is or subsumed by new API?
>>>>
>>> I don't think we should create a replacement for the DataStream API
>> unless
>>>> we have a very good reason to do so and with a proper discussion about
>>> this
>>>> as Alex said.
>>>
>>> The ProcessFunction API which is targeting to replace DataStream API is
>>> still a proposal, not a decision. Sorry for the confusion, I should have
>>> been more careful with my words, not giving the impression that this is
>>> something we'll do anyway.
>>>
>>> There will be a FLIP describing the motivations and designs in detail,
>> for
>>> the community to discuss and vote on. We are still working on it. TBH,
>> this
>>> is not trivial and we would need more time on it.
>>>
>>> Just to quickly share some backgrounds:
>>>
>>>     - We see quite some problems with the current DataStream APIs
>>>        - Users are working with concrete classes rather than interfaces,
>>>        which means
>>>        - Users can access methods that are designed to be used by internal
>>>           classes, even though they are annotated with `@Internal`. E.g.,
>>>           `DataStream#getTransformation`.
>>>           - Changes to the non-API implementations (e.g.,
>> `Transformation`)
>>>           would affect the API classes (e.g., `DataStream`), which
>>> makes it hard to
>>>           provide binary compatibility.
>>>        - Internal classes are used as parameter / return-value of public
>>>        APIs. E.g., while `AbstractStreamOperator` is PublicEvolving,
>>> `StreamTask`
>>>        which returns from `AbstractStreamOperator#getContainingTask` is
>>> Internal.
>>>        - In many cases, users are asked to extend the API classes, rather
>>>        than implementing interfaces. E.g., `AbstractStreamOperator`.
>>>           - Any changes to the base classes, even the internal part, may
>>>           affect the behavior of the user-provided sub-classes
>>>           - Users can override the behavior of the base classes
>>>        - The API module `flink-streaming-java` contains non-API classes,
>> and
>>>        depends on internal modules such as `flink-runtime`, which means
>>>        - Changes to the internal modules may affect the API modules, which
>>>           requires users to re-build their applications upon upgrading
>>>           - The artifact user needs for building their application larger
>>>           than necessary.
>>>        - We probably should not expose operators (e.g.,
>>>        `AbstractStreamOperator`) to users. Functions should be enough
>>> for users to
>>>        define their data processing logics. Exposing operator-level
>> concepts
>>>        (e.g., mailbox thread model, checkpoint barrier alignment, etc.) is
>>>        unnecessary and limits the improvement regarding such exposed
>>> mechanisms
>>>        with compatibility considerations.
>>>        - The current DataStream API seems to be a mixture of many things,
>>>        making it hard to understand especially for newcomers. It might be
>>> better
>>>        to re-organize it into several parts: (the taxonomy below are just
>> an
>>>        example of the, we are still working on this)
>>>           - The most fundamental stateful stream processing: streams,
>>>           partitions / key, process functions, state, timeline-service
>>>           - An extension for common batch-streaming unified functions:
>> map,
>>>           flatmap, filter, agg, reduce, join, etc.
>>>           - An extension for windowing supports:  window, triggering
>>>           - An extension for event-time supports: event time, watermark
>>>           - The extensions are like short-cuts / sugars, without which
>> users
>>>           can probably still achieve the same behavior by working with the
>>>           fundamental APIs, but would be a lot easier with the extensions
>>>        - The original plan was to do in-place refactors / changes on
>>>     DataStream API. Some related items are listed in this doc [2] attached
>>> to
>>>     the kicking off email [3]. Not all of the above issues are listed,
>>> because
>>>     we haven't looked into this as deeply as now  by that time.
>>>     - We proposed this as a new API rather than in-place refactors in the
>>>     2.0 work item list, because we realized the changes might be too big
>>> for an
>>>     in-place change. First having a new API then gradually retiring the
>> old
>>> one
>>>     would help users to smoothly migrate between them.
>>>
>>> A thorough discussion is definitely needed once the FLIP is out. And of
>>> course it's possible that the FLIP might be rejected. Given that we are
>>> planning for release 2.0, I just feel it would be better to bring this up
>>> early even the concrete plan is not yet ready,
>>>
>>> Best,
>>>
>>> Xintong
>>>
>>>
>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>> [2]
>>>
>>>
>> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
>>> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
>>>
>>> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org> wrote:
>>>
>>>> Hey!
>>>>
>>>> I share the same concerns mentioned above regarding the
>> "ProcessFunction
>>>> API".
>>>>
>>>> I don't think we should create a replacement for the DataStream API
>>> unless
>>>> we have a very good reason to do so and with a proper discussion about
>>> this
>>>> as Alex said.
>>>>
>>>> Cheers,
>>>> Gyula
>>>>
>>>> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
>>>> alexander.fedulov@gmail.com> wrote:
>>>>
>>>>> Hi Xintong,
>>>>>
>>>>> By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
>>>> Introduce
>>>>> an API deprecation process" thread [1]?
>>>>>
>>>>> I am also curious to know if the rationale behind this new API has
>> been
>>>>> previously discussed on the mailing list. Do we have a list of
>>>> shortcomings
>>>>> in the current DataStream API that it tries to resolve? How does the
>>>>> current ProcessFunction functionality fit into the picture? Will it
>> be
>>>> kept
>>>>> as is or subsumed by new API?
>>>>>
>>>>> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>>>>>
>>>>> Best,
>>>>> Alex
>>>>>
>>>>> On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com>
>>>> wrote:
>>>>>>> The ProcessFunction API item is giving me the most headaches
>>> because
>>>>> it's
>>>>>>> very unclear what it actually entails; like is it an entirely
>>>> separate
>>>>>> API
>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
>>> How
>>>>>> much
>>>>>>> will it share the internals with DataStream etc.; how does it
>>> relate
>>>> to
>>>>>> the
>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
>> underneath).
>>>>>> I totally understand your confusion. We started planning this after
>>>>> kicking
>>>>>> off the release 2.0, so there's still a lot to be explored and the
>>> plan
>>>>>> keeps changing.
>>>>>>
>>>>>>
>>>>>>     - In the beginning, we planned to do an in-place refactor of
>>>>> DataStream
>>>>>>     API, until the API migration period is proposed.
>>>>>>     - Then we want to make it an entirely separate API to
>> DataStream,
>>>> and
>>>>>>     listed as a must-have for release 2.0 so that we can remove
>>>> DataStream
>>>>>> once
>>>>>>     it's ready.
>>>>>>     - However, depending on the outcome of the API compatibility
>>>>> discussion
>>>>>>     [1], we may not be able to remove DataStream in 2.0 anyway,
>> which
>>>>> means
>>>>>> we
>>>>>>     might need to re-evaluate the necessity of this item for 2.0.
>>>>>>
>>>>>> I'd say we wait a bit longer for the compatibility discussion [1]
>> and
>>>>>> decide the priority for this item afterwards.
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Xintong
>>>>>>
>>>>>>
>>>>>> [1] https://lists.apache.org/list.html?dev@flink.apache.org
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
>> chesnay@apache.org
>>>>>> wrote:
>>>>>>
>>>>>>> by-and-large I'm quite happy with the list of items.
>>>>>>>
>>>>>>> I'm curious as to why the "Disaggregated State Management" item
>> is
>>>>> marked
>>>>>>> as a must-have; will it require changes that break something?
>> What
>>>>>> prevents
>>>>>>> it from being added in 2.1?
>>>>>>>
>>>>>>> We may want to update the Java 17 item to "Make Java 17 the
>>> default,
>>>>> drop
>>>>>>> Java 8/11". Maybe even split it into a must-have "Drop Java 8"
>> and
>>> a
>>>>>>> nice-to-have "Drop Java 11"?
>>>>>>>
>>>>>>> "Move Calcite rules from Scala to Java": I would hope that this
>>> would
>>>>> be
>>>>>>> an entirely internal change, and could thus be an incremental
>>> process
>>>>>>> independent of major releases.
>>>>>>> What is the actual scale of this item; how much are we actually
>>>>>> re-writing?
>>>>>>> "Add MetricGroup#getLogicalScope": I'd raise this to a
>> must-have; i
>>>>> think
>>>>>>> I marked it down as nice-to-have only because it depends on
>> another
>>>>> item.
>>>>>>> The ProcessFunction API item is giving me the most headaches
>>> because
>>>>> it's
>>>>>>> very unclear what it actually entails; like is it an entirely
>>>> separate
>>>>>> API
>>>>>>> to DataStream (sounds like it is!) or an extension of DataStream.
>>> How
>>>>>> much
>>>>>>> will it share the internals with DataStream etc.; how does it
>>> relate
>>>> to
>>>>>> the
>>>>>>> Table API (w.r.t. switching APIs / what Table API uses
>> underneath).
>>>>>>> There are a few items I added as ideas which don't have a
>> priority
>>>> yet;
>>>>>>> would love to get some feedback on those.
>>>>>>>
>>>>>>> On 21/06/2023 08:41, Xintong Song wrote:
>>>>>>>
>>>>>>> Hi devs,
>>>>>>>
>>>>>>> As previously discussed in [1], we had been collecting work item
>>>>>> proposals
>>>>>>> for the 2.0 release until June 15th, on the wiki page [2].
>>>>>>>
>>>>>>>     - As we have passed the due date, I'd like to kindly remind
>>>> everyone
>>>>>> *not
>>>>>>>     to add / remove items directly on the wiki page*. If needed,
>>>> please
>>>>>> post
>>>>>>>     in this thread or reach out to the release managers instead.
>>>>>>>     - I've reached out to some folks for clarifications about
>> their
>>>>>>>     proposals. Some of them mentioned that they can not yet tell
>>>> whether
>>>>>> we
>>>>>>>     should do an item or not, and would need more time /
>> discussions
>>>> to
>>>>>> make
>>>>>>>     the decision. So I added a new symbol for items whose
>> priorities
>>>> are
>>>>>> `TBD`.
>>>>>>> Now it's time to collaboratively decide a minimum set of
>> must-have
>>>>> items.
>>>>>>> I've gone through the entire list of proposed items, and found
>> most
>>>> of
>>>>>> them
>>>>>>> make quite much sense. So I think an online sync might not be
>>>> necessary
>>>>>> for
>>>>>>> this. I'd like to go with this DISCUSS thread, where everyone can
>>>>> comment
>>>>>>> on how they think the list can be improved, followed by a VOTE to
>>>>>> formally
>>>>>>> make the decision.
>>>>>>>
>>>>>>> Any feedback and opinions, including but not limited to the
>>> following
>>>>>>> aspects, will be appreciated.
>>>>>>>
>>>>>>>     - Important items that are missing from the list
>>>>>>>     - Concerns regarding the listed items or their priorities
>>>>>>>
>>>>>>> Looking forward to your feedback.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Xintong
>>>>>>>
>>>>>>>
>>>>>>> [1]
>> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
>>>>>>> [2]
>> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>>>>>>>
>>>>>>>
>>
>> --
>> Best regards,
>> Sergey
>>


Re: [DISCUSS] Release 2.0 Work Items

Posted by Jark Wu <im...@gmail.com>.
Hi,

I think one more thing we should consider for 2.0 is changing the default
values of some configuration options to improve the out-of-the-box user
experience.

Currently, in order to run a Flink job, users may need to set
a bunch of configurations, such as minibatch, checkpoint interval,
exactly-once, incremental-checkpoint, etc. This is verbose and hard for
beginners to get right. Most of these options can have a universally
applicable value. Because changing a default value is a breaking change,
I think it's worth considering changing them in 2.0.
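
To make the verbosity concrete, here is a minimal sketch of the kind of
settings a beginner may end up spelling out today. The option keys are the
Flink 1.x names as far as I know; the values are purely illustrative and
are not proposed defaults. The class and method names are hypothetical and
not part of Flink.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: it only collects option keys and
// illustrative values so that the amount of boilerplate becomes visible.
class ExplicitSettingsSketch {

    // Settings a user might currently have to set by hand.
    // Values are examples only, not recommended or proposed defaults.
    static Map<String, String> explicitSettings() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("execution.checkpointing.interval", "3min");
        conf.put("execution.checkpointing.mode", "EXACTLY_ONCE");
        conf.put("state.backend.incremental", "true");
        conf.put("table.exec.mini-batch.enabled", "true");
        conf.put("table.exec.mini-batch.allow-latency", "5s");
        conf.put("table.exec.mini-batch.size", "5000");
        return conf;
    }

    // Renders the settings as flat "key: value" lines, one per option.
    static String render(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(explicitSettings()));
    }
}
```

With better defaults, most or all of these entries would ideally become
unnecessary for a typical first job.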

What do you think?

Best,
Jark


On Wed, 28 Jun 2023 at 14:10, Sergey Nuyanzin <sn...@gmail.com> wrote:

> Hi Chesnay
>
> >"Move Calcite rules from Scala to Java": I would hope that this would be
> >an entirely internal change, and could thus be an incremental process
> >independent of major releases.
> >What is the actual scale of this item; how much are we actually
> re-writing?
>
> Thanks for asking
> yes, you're right, that should be internal change.
> Yeah I was also thinking about incremental change (rule by rule or
> reasonable small group of rules).
> And yes, this could be an independent (on major release) activity
>
> The problem is actually for children of RelOptRule.
> Currently I see 60+ such rules (in Scala) using the mentioned deprecated
> api.
> There are also children of ConverterRule (50+) which do not have such
> issues.
> Maybe it could be considered as the next step to have all the rules in
> Java.
>
> On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <to...@gmail.com>
> wrote:
>
> > Hi Alex & Gyula,
> >
> > By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> Introduce
> > > an API deprecation process" thread [1]?
> > >
> >
> > Yes, I meant the FLIP-321 discussion. I just noticed I pasted the wrong
> url
> > in my previous email. Sorry for the mistake.
> >
> > I am also curious to know if the rationale behind this new API has been
> > > previously discussed on the mailing list. Do we have a list of
> > shortcomings
> > > in the current DataStream API that it tries to resolve? How does the
> > > current ProcessFunction functionality fit into the picture? Will it be
> > kept
> > > as is or subsumed by new API?
> > >
> >
> > I don't think we should create a replacement for the DataStream API
> unless
> > > we have a very good reason to do so and with a proper discussion about
> > this
> > > as Alex said.
> >
> >
> > The ProcessFunction API which is targeting to replace DataStream API is
> > still a proposal, not a decision. Sorry for the confusion, I should have
> > been more careful with my words, not giving the impression that this is
> > something we'll do anyway.
> >
> > There will be a FLIP describing the motivations and designs in detail,
> for
> > the community to discuss and vote on. We are still working on it. TBH,
> this
> > is not trivial and we would need more time on it.
> >
> > Just to quickly share some backgrounds:
> >
> >    - We see quite some problems with the current DataStream APIs
> >       - Users are working with concrete classes rather than interfaces,
> >       which means
> >       - Users can access methods that are designed to be used by internal
> >          classes, even though they are annotated with `@Internal`. E.g.,
> >          `DataStream#getTransformation`.
> >          - Changes to the non-API implementations (e.g.,
> `Transformation`)
> >          would affect the API classes (e.g., `DataStream`), which
> > makes it hard to
> >          provide binary compatibility.
> >       - Internal classes are used as parameter / return-value of public
> >       APIs. E.g., while `AbstractStreamOperator` is PublicEvolving,
> > `StreamTask`
> >       which returns from `AbstractStreamOperator#getContainingTask` is
> > Internal.
> >       - In many cases, users are asked to extend the API classes, rather
> >       than implementing interfaces. E.g., `AbstractStreamOperator`.
> >          - Any changes to the base classes, even the internal part, may
> >          affect the behavior of the user-provided sub-classes
> >          - Users can override the behavior of the base classes
> >       - The API module `flink-streaming-java` contains non-API classes,
> and
> >       depends on internal modules such as `flink-runtime`, which means
> >       - Changes to the internal modules may affect the API modules, which
> >          requires users to re-build their applications upon upgrading
> >          - The artifact user needs for building their application larger
> >          than necessary.
> >       - We probably should not expose operators (e.g.,
> >       `AbstractStreamOperator`) to users. Functions should be enough
> > for users to
> >       define their data processing logics. Exposing operator-level
> concepts
> >       (e.g., mailbox thread model, checkpoint barrier alignment, etc.) is
> >       unnecessary and limits the improvement regarding such exposed
> > mechanisms
> >       with compatibility considerations.
> >       - The current DataStream API seems to be a mixture of many things,
> >       making it hard to understand especially for newcomers. It might be
> > better
> >       to re-organize it into several parts: (the taxonomy below are just
> an
> >       example of the, we are still working on this)
> >          - The most fundamental stateful stream processing: streams,
> >          partitions / key, process functions, state, timeline-service
> >          - An extension for common batch-streaming unified functions:
> map,
> >          flatmap, filter, agg, reduce, join, etc.
> >          - An extension for windowing supports:  window, triggering
> >          - An extension for event-time supports: event time, watermark
> >          - The extensions are like short-cuts / sugars, without which
> users
> >          can probably still achieve the same behavior by working with the
> >          fundamental APIs, but would be a lot easier with the extensions
> >       - The original plan was to do in-place refactors / changes on
> >    DataStream API. Some related items are listed in this doc [2] attached
> > to
> >    the kicking off email [3]. Not all of the above issues are listed,
> > because
> >    we haven't looked into this as deeply as now  by that time.
> >    - We proposed this as a new API rather than in-place refactors in the
> >    2.0 work item list, because we realized the changes might be too big
> > for an
> >    in-place change. First having a new API then gradually retiring the
> old
> > one
> >    would help users to smoothly migrate between them.
> >
> > A thorough discussion is definitely needed once the FLIP is out. And of
> > course it's possible that the FLIP might be rejected. Given that we are
> > planning for release 2.0, I just feel it would be better to bring this up
> > early even the concrete plan is not yet ready,
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > [2]
> >
> >
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> >
> > On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org> wrote:
> >
> > > Hey!
> > >
> > > I share the same concerns mentioned above regarding the
> "ProcessFunction
> > > API".
> > >
> > > I don't think we should create a replacement for the DataStream API
> > unless
> > > we have a very good reason to do so and with a proper discussion about
> > this
> > > as Alex said.
> > >
> > > Cheers,
> > > Gyula
> > >
> > > On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > > alexander.fedulov@gmail.com> wrote:
> > >
> > > > Hi Xintong,
> > > >
> > > > By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > > Introduce
> > > > an API deprecation process" thread [1]?
> > > >
> > > > I am also curious to know if the rationale behind this new API has
> been
> > > > previously discussed on the mailing list. Do we have a list of
> > > shortcomings
> > > > in the current DataStream API that it tries to resolve? How does the
> > > > current ProcessFunction functionality fit into the picture? Will it
> be
> > > kept
> > > > as is or subsumed by new API?
> > > >
> > > > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > > >
> > > > Best,
> > > > Alex
> > > >
> > > > On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com>
> > > wrote:
> > > >
> > > > > >
> > > > > > The ProcessFunction API item is giving me the most headaches
> > because
> > > > it's
> > > > > > very unclear what it actually entails; like is it an entirely
> > > separate
> > > > > API
> > > > > > to DataStream (sounds like it is!) or an extension of DataStream.
> > How
> > > > > much
> > > > > > will it share the internals with DataStream etc.; how does it
> > relate
> > > to
> > > > > the
> > > > > > Table API (w.r.t. switching APIs / what Table API uses
> underneath).
> > > > > >
> > > > >
> > > > > I totally understand your confusion. We started planning this after
> > > > kicking
> > > > > off the release 2.0, so there's still a lot to be explored and the
> > plan
> > > > > keeps changing.
> > > > >
> > > > >
> > > > >    - In the beginning, we planned to do an in-place refactor of
> > > > DataStream
> > > > >    API, until the API migration period is proposed.
> > > > >    - Then we want to make it an entirely separate API to
> DataStream,
> > > and
> > > > >    listed as a must-have for release 2.0 so that we can remove
> > > DataStream
> > > > > once
> > > > >    it's ready.
> > > > >    - However, depending on the outcome of the API compatibility
> > > > discussion
> > > > >    [1], we may not be able to remove DataStream in 2.0 anyway,
> which
> > > > means
> > > > > we
> > > > >    might need to re-evaluate the necessity of this item for 2.0.
> > > > >
> > > > > I'd say we wait a bit longer for the compatibility discussion [1]
> and
> > > > > decide the priority for this item afterwards.
> > > > >
> > > > >
> > > > > Best,
> > > > >
> > > > > Xintong
> > > > >
> > > > >
> > > > > [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > > > >
> > > > >
> > > > > On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <
> chesnay@apache.org
> > >
> > > > > wrote:
> > > > >
> > > > > > by-and-large I'm quite happy with the list of items.
> > > > > >
> > > > > > I'm curious as to why the "Disaggregated State Management" item
> is
> > > > marked
> > > > > > as a must-have; will it require changes that break something?
> What
> > > > > prevents
> > > > > > it from being added in 2.1?
> > > > > >
> > > > > > We may want to update the Java 17 item to "Make Java 17 the
> > default,
> > > > drop
> > > > > > Java 8/11". Maybe even split it into a must-have "Drop Java 8"
> and
> > a
> > > > > > nice-to-have "Drop Java 11"?
> > > > > >
> > > > > > "Move Calcite rules from Scala to Java": I would hope that this
> > would
> > > > be
> > > > > > an entirely internal change, and could thus be an incremental
> > process
> > > > > > independent of major releases.
> > > > > > What is the actual scale of this item; how much are we actually
> > > > > re-writing?
> > > > > >
> > > > > > "Add MetricGroup#getLogicalScope": I'd raise this to a
> must-have; i
> > > > think
> > > > > > I marked it down as nice-to-have only because it depends on
> another
> > > > item.
> > > > > >
> > > > > > The ProcessFunction API item is giving me the most headaches
> > because
> > > > it's
> > > > > > very unclear what it actually entails; like is it an entirely
> > > separate
> > > > > API
> > > > > > to DataStream (sounds like it is!) or an extension of DataStream.
> > How
> > > > > much
> > > > > > will it share the internals with DataStream etc.; how does it
> > relate
> > > to
> > > > > the
> > > > > > Table API (w.r.t. switching APIs / what Table API uses
> underneath).
> > > > > >
> > > > > > There are a few items I added as ideas which don't have a
> priority
> > > yet;
> > > > > > would love to get some feedback on those.
> > > > > >
> > > > > > On 21/06/2023 08:41, Xintong Song wrote:
> > > > > >
> > > > > > Hi devs,
> > > > > >
> > > > > > As previously discussed in [1], we had been collecting work item
> > > > > proposals
> > > > > > for the 2.0 release until June 15th, on the wiki page [2].
> > > > > >
> > > > > >    - As we have passed the due date, I'd like to kindly remind
> > > everyone
> > > > > *not
> > > > > >    to add / remove items directly on the wiki page*. If needed,
> > > please
> > > > > post
> > > > > >    in this thread or reach out to the release managers instead.
> > > > > >    - I've reached out to some folks for clarifications about
> their
> > > > > >    proposals. Some of them mentioned that they can not yet tell
> > > whether
> > > > > we
> > > > > >    should do an item or not, and would need more time /
> discussions
> > > to
> > > > > make
> > > > > >    the decision. So I added a new symbol for items whose
> priorities
> > > are
> > > > > `TBD`.
> > > > > >
> > > > > > Now it's time to collaboratively decide a minimum set of
> must-have
> > > > items.
> > > > > > I've gone through the entire list of proposed items, and found
> most
> > > of
> > > > > them
> > > > > > make quite much sense. So I think an online sync might not be
> > > necessary
> > > > > for
> > > > > > this. I'd like to go with this DISCUSS thread, where everyone can
> > > > comment
> > > > > > on how they think the list can be improved, followed by a VOTE to
> > > > > formally
> > > > > > make the decision.
> > > > > >
> > > > > > Any feedback and opinions, including but not limited to the
> > following
> > > > > > aspects, will be appreciated.
> > > > > >
> > > > > >    - Important items that are missing from the list
> > > > > >    - Concerns regarding the listed items or their priorities
> > > > > >
> > > > > > Looking forward to your feedback.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Xintong
> > > > > >
> > > > > >
> > > > > > [1]
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > > >
> > > > > > [2]
> https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> Best regards,
> Sergey
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Sergey Nuyanzin <sn...@gmail.com>.
Hi Chesnay

>"Move Calcite rules from Scala to Java": I would hope that this would be
>an entirely internal change, and could thus be an incremental process
>independent of major releases.
>What is the actual scale of this item; how much are we actually re-writing?

Thanks for asking.
Yes, you're right, that should be an internal change.
I was also thinking about an incremental change (rule by rule, or in
reasonably small groups of rules).
And yes, this could be an activity independent of major releases.

The problem actually concerns the children of RelOptRule.
Currently I see 60+ such rules (in Scala) using the mentioned deprecated
API.
There are also children of ConverterRule (50+), which do not have such
issues.
Migrating them could perhaps be considered as a next step, to have all the
rules in Java.

On Tue, Jun 27, 2023 at 1:34 PM Xintong Song <to...@gmail.com> wrote:

> Hi Alex & Gyula,
>
> By compatibility discussion do you mean the "[DISCUSS] FLIP-321: Introduce
> > an API deprecation process" thread [1]?
> >
>
> Yes, I meant the FLIP-321 discussion. I just noticed I pasted the wrong url
> in my previous email. Sorry for the mistake.
>
> I am also curious to know if the rationale behind this new API has been
> > previously discussed on the mailing list. Do we have a list of
> shortcomings
> > in the current DataStream API that it tries to resolve? How does the
> > current ProcessFunction functionality fit into the picture? Will it be
> kept
> > as is or subsumed by new API?
> >
>
> I don't think we should create a replacement for the DataStream API unless
> > we have a very good reason to do so and with a proper discussion about
> this
> > as Alex said.
>
>
> The ProcessFunction API which is targeting to replace DataStream API is
> still a proposal, not a decision. Sorry for the confusion, I should have
> been more careful with my words, not giving the impression that this is
> something we'll do anyway.
>
> There will be a FLIP describing the motivations and designs in detail, for
> the community to discuss and vote on. We are still working on it. TBH, this
> is not trivial and we would need more time on it.
>
> Just to quickly share some backgrounds:
>
>    - We see quite some problems with the current DataStream APIs
>       - Users are working with concrete classes rather than interfaces,
>       which means
>       - Users can access methods that are designed to be used by internal
>          classes, even though they are annotated with `@Internal`. E.g.,
>          `DataStream#getTransformation`.
>          - Changes to the non-API implementations (e.g., `Transformation`)
>          would affect the API classes (e.g., `DataStream`), which
> makes it hard to
>          provide binary compatibility.
>       - Internal classes are used as parameter / return-value of public
>       APIs. E.g., while `AbstractStreamOperator` is PublicEvolving,
> `StreamTask`
>       which returns from `AbstractStreamOperator#getContainingTask` is
> Internal.
>       - In many cases, users are asked to extend the API classes, rather
>       than implementing interfaces. E.g., `AbstractStreamOperator`.
>          - Any changes to the base classes, even the internal part, may
>          affect the behavior of the user-provided sub-classes
>          - Users can override the behavior of the base classes
>       - The API module `flink-streaming-java` contains non-API classes, and
>       depends on internal modules such as `flink-runtime`, which means
>       - Changes to the internal modules may affect the API modules, which
>          requires users to re-build their applications upon upgrading
>          - The artifact user needs for building their application larger
>          than necessary.
>       - We probably should not expose operators (e.g.,
>       `AbstractStreamOperator`) to users. Functions should be enough
> for users to
>       define their data processing logics. Exposing operator-level concepts
>       (e.g., mailbox thread model, checkpoint barrier alignment, etc.) is
>       unnecessary and limits the improvement regarding such exposed
> mechanisms
>       with compatibility considerations.
>       - The current DataStream API seems to be a mixture of many things,
>       making it hard to understand especially for newcomers. It might be
> better
>       to re-organize it into several parts: (the taxonomy below are just an
>       example of the, we are still working on this)
>          - The most fundamental stateful stream processing: streams,
>          partitions / key, process functions, state, timeline-service
>          - An extension for common batch-streaming unified functions: map,
>          flatmap, filter, agg, reduce, join, etc.
>          - An extension for windowing supports:  window, triggering
>          - An extension for event-time supports: event time, watermark
>          - The extensions are like short-cuts / sugars, without which users
>          can probably still achieve the same behavior by working with the
>          fundamental APIs, but would be a lot easier with the extensions
>       - The original plan was to do in-place refactors / changes on
>    DataStream API. Some related items are listed in this doc [2] attached
> to
>    the kicking off email [3]. Not all of the above issues are listed,
> because
>    we haven't looked into this as deeply as now  by that time.
>    - We proposed this as a new API rather than in-place refactors in the
>    2.0 work item list, because we realized the changes might be too big
> for an
>    in-place change. First having a new API then gradually retiring the old
> one
>    would help users to smoothly migrate between them.
>
> A thorough discussion is definitely needed once the FLIP is out. And of
> course it's possible that the FLIP might be rejected. Given that we are
> planning for release 2.0, I just feel it would be better to bring this up
> early even the concrete plan is not yet ready,
>
> Best,
>
> Xintong
>
>
> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> [2]
>
> https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> [3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
>
> On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org> wrote:
>
> > Hey!
> >
> > I share the same concerns mentioned above regarding the "ProcessFunction
> > API".
> >
> > I don't think we should create a replacement for the DataStream API
> unless
> > we have a very good reason to do so and with a proper discussion about
> this
> > as Alex said.
> >
> > Cheers,
> > Gyula
> >
> > On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> > alexander.fedulov@gmail.com> wrote:
> >
> > > Hi Xintong,
> > >
> > > By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> > Introduce
> > > an API deprecation process" thread [1]?
> > >
> > > I am also curious to know if the rationale behind this new API has been
> > > previously discussed on the mailing list. Do we have a list of
> > shortcomings
> > > in the current DataStream API that it tries to resolve? How does the
> > > current ProcessFunction functionality fit into the picture? Will it be
> > kept
> > > as is or subsumed by new API?
> > >
> > > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> > >
> > > Best,
> > > Alex
> > >
> > > On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com>
> > wrote:
> > >
> > > > >
> > > > > The ProcessFunction API item is giving me the most headaches
> because
> > > it's
> > > > > very unclear what it actually entails; like is it an entirely
> > separate
> > > > API
> > > > > to DataStream (sounds like it is!) or an extension of DataStream.
> How
> > > > much
> > > > > will it share the internals with DataStream etc.; how does it
> relate
> > to
> > > > the
> > > > > Table API (w.r.t. switching APIs / what Table API uses underneath).
> > > > >
> > > >
> > > > I totally understand your confusion. We started planning this after
> > > kicking
> > > > off the release 2.0, so there's still a lot to be explored and the
> plan
> > > > keeps changing.
> > > >
> > > >
> > > >    - In the beginning, we planned to do an in-place refactor of
> > > DataStream
> > > >    API, until the API migration period is proposed.
> > > >    - Then we want to make it an entirely separate API to DataStream,
> > and
> > > >    listed as a must-have for release 2.0 so that we can remove
> > DataStream
> > > > once
> > > >    it's ready.
> > > >    - However, depending on the outcome of the API compatibility
> > > discussion
> > > >    [1], we may not be able to remove DataStream in 2.0 anyway, which
> > > means
> > > > we
> > > >    might need to re-evaluate the necessity of this item for 2.0.
> > > >
> > > > I'd say we wait a bit longer for the compatibility discussion [1] and
> > > > decide the priority for this item afterwards.
> > > >
> > > >
> > > > Best,
> > > >
> > > > Xintong
> > > >
> > > >
> > > > [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > > >
> > > >
> > > > On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <chesnay@apache.org
> >
> > > > wrote:
> > > >
> > > > > by-and-large I'm quite happy with the list of items.
> > > > >
> > > > > I'm curious as to why the "Disaggregated State Management" item is
> > > marked
> > > > > as a must-have; will it require changes that break something? What
> > > > prevents
> > > > > it from being added in 2.1?
> > > > >
> > > > > We may want to update the Java 17 item to "Make Java 17 the
> default,
> > > drop
> > > > > Java 8/11". Maybe even split it into a must-have "Drop Java 8" and
> a
> > > > > nice-to-have "Drop Java 11"?
> > > > >
> > > > > "Move Calcite rules from Scala to Java": I would hope that this
> would
> > > be
> > > > > an entirely internal change, and could thus be an incremental
> process
> > > > > independent of major releases.
> > > > > What is the actual scale of this item; how much are we actually
> > > > re-writing?
> > > > >
> > > > > "Add MetricGroup#getLogicalScope": I'd raise this to a must-have; i
> > > think
> > > > > I marked it down as nice-to-have only because it depends on another
> > > item.
> > > > >
> > > > > The ProcessFunction API item is giving me the most headaches
> because
> > > it's
> > > > > very unclear what it actually entails; like is it an entirely
> > separate
> > > > API
> > > > > to DataStream (sounds like it is!) or an extension of DataStream.
> How
> > > > much
> > > > > will it share the internals with DataStream etc.; how does it
> relate
> > to
> > > > the
> > > > > Table API (w.r.t. switching APIs / what Table API uses underneath).
> > > > >
> > > > > There are a few items I added as ideas which don't have a priority
> > yet;
> > > > > would love to get some feedback on those.
> > > > >
> > > > > On 21/06/2023 08:41, Xintong Song wrote:
> > > > >
> > > > > Hi devs,
> > > > >
> > > > > As previously discussed in [1], we had been collecting work item
> > > > proposals
> > > > > for the 2.0 release until June 15th, on the wiki page [2].
> > > > >
> > > > >    - As we have passed the due date, I'd like to kindly remind
> > everyone
> > > > *not
> > > > >    to add / remove items directly on the wiki page*. If needed,
> > please
> > > > post
> > > > >    in this thread or reach out to the release managers instead.
> > > > >    - I've reached out to some folks for clarifications about their
> > > > >    proposals. Some of them mentioned that they can not yet tell
> > whether
> > > > we
> > > > >    should do an item or not, and would need more time / discussions
> > to
> > > > make
> > > > >    the decision. So I added a new symbol for items whose priorities
> > are
> > > > `TBD`.
> > > > >
> > > > > Now it's time to collaboratively decide a minimum set of must-have
> > > items.
> > > > > I've gone through the entire list of proposed items, and found most
> > of
> > > > them
> > > > > make quite much sense. So I think an online sync might not be
> > necessary
> > > > for
> > > > > this. I'd like to go with this DISCUSS thread, where everyone can
> > > comment
> > > > > on how they think the list can be improved, followed by a VOTE to
> > > > formally
> > > > > make the decision.
> > > > >
> > > > > Any feedback and opinions, including but not limited to the
> following
> > > > > aspects, will be appreciated.
> > > > >
> > > > >    - Important items that are missing from the list
> > > > >    - Concerns regarding the listed items or their priorities
> > > > >
> > > > > Looking forward to your feedback.
> > > > >
> > > > > Best,
> > > > >
> > > > > Xintong
> > > > >
> > > > >
> > > > > [1]
> > > >
> > >
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > > >
> > > > > [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>


-- 
Best regards,
Sergey

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
Hi Alex & Gyula,

> By compatibility discussion do you mean the "[DISCUSS] FLIP-321: Introduce
> an API deprecation process" thread [1]?
>

Yes, I meant the FLIP-321 discussion. I just noticed I pasted the wrong url
in my previous email. Sorry for the mistake.

> I am also curious to know if the rationale behind this new API has been
> previously discussed on the mailing list. Do we have a list of shortcomings
> in the current DataStream API that it tries to resolve? How does the
> current ProcessFunction functionality fit into the picture? Will it be kept
> as is or subsumed by new API?
>

> I don't think we should create a replacement for the DataStream API unless
> we have a very good reason to do so and with a proper discussion about this
> as Alex said.


The ProcessFunction API, which is intended to replace the DataStream API, is
still a proposal, not a decision. Sorry for the confusion; I should have
been more careful with my words and not given the impression that this is
something we'll do anyway.

There will be a FLIP describing the motivations and designs in detail, for
the community to discuss and vote on. We are still working on it. TBH, this
is not trivial and we would need more time on it.

Just to quickly share some background:

   - We see quite a few problems with the current DataStream API:
      - Users work with concrete classes rather than interfaces, which
      means:
         - Users can access methods that are designed to be used by internal
         classes, even though they are annotated with `@Internal`. E.g.,
         `DataStream#getTransformation`.
         - Changes to the non-API implementations (e.g., `Transformation`)
         would affect the API classes (e.g., `DataStream`), which makes it
         hard to provide binary compatibility.
      - Internal classes are used as parameters / return values of public
      APIs. E.g., while `AbstractStreamOperator` is PublicEvolving,
      `StreamTask`, which is returned from
      `AbstractStreamOperator#getContainingTask`, is Internal.
      - In many cases, users are asked to extend the API classes rather
      than implement interfaces. E.g., `AbstractStreamOperator`.
         - Any change to the base classes, even to the internal parts, may
         affect the behavior of the user-provided sub-classes.
         - Users can override the behavior of the base classes.
      - The API module `flink-streaming-java` contains non-API classes and
      depends on internal modules such as `flink-runtime`, which means:
         - Changes to the internal modules may affect the API modules, which
         requires users to re-build their applications upon upgrading.
         - The artifact users need for building their applications is larger
         than necessary.
      - We probably should not expose operators (e.g.,
      `AbstractStreamOperator`) to users. Functions should be enough for
      users to define their data processing logic. Exposing operator-level
      concepts (e.g., mailbox thread model, checkpoint barrier alignment,
      etc.) is unnecessary and, because of compatibility considerations,
      limits how much such exposed mechanisms can be improved.
      - The current DataStream API seems to be a mixture of many things,
      making it hard to understand, especially for newcomers. It might be
      better to re-organize it into several parts (the taxonomy below is
      just an example; we are still working on this):
         - The most fundamental stateful stream processing: streams,
         partitions / key, process functions, state, timeline-service
         - An extension for common batch-streaming unified functions: map,
         flatmap, filter, agg, reduce, join, etc.
         - An extension for windowing support: window, triggering
         - An extension for event-time support: event time, watermark
         - The extensions are like shortcuts / sugar, without which users
         can probably still achieve the same behavior by working with the
         fundamental APIs, but it would be a lot easier with the extensions.
   - The original plan was to do in-place refactors / changes on the
   DataStream API. Some related items are listed in this doc [2] attached to
   the kick-off email [3]. Not all of the above issues are listed there,
   because at that time we hadn't looked into this as deeply as we have now.
   - We proposed this as a new API rather than in-place refactors in the
   2.0 work item list, because we realized the changes might be too big for
   an in-place change. First having a new API and then gradually retiring the
   old one would help users migrate smoothly between them.
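
To make the first problem in the list (concrete classes vs. interfaces) a bit
more tangible, here is a minimal sketch. All names in it (`ConcreteStream`,
`Stream`, `StreamImpl`, `InternalPlanNode`, `publicApi`) are hypothetical
stand-ins, not actual Flink classes, and it only illustrates the general
idea, not the design the FLIP will propose:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// All types below are hypothetical stand-ins for illustration; none are real Flink classes.
public class ApiSurfaceSketch {

    /** Stand-in for an internal, non-API class. */
    public static final class InternalPlanNode {
        final String description;

        InternalPlanNode(String description) {
            this.description = description;
        }
    }

    /** Style 1: users program directly against a concrete class. */
    public static class ConcreteStream {
        private final InternalPlanNode node;

        public ConcreteStream(String name) {
            this.node = new InternalPlanNode(name);
        }

        /** User-facing operation. */
        public ConcreteStream map(String functionName) {
            return new ConcreteStream(node.description + " -> " + functionName);
        }

        /** Meant for internal callers only, yet public and therefore callable by users. */
        public InternalPlanNode getPlanNode() {
            return node;
        }
    }

    /** Style 2: users program against an interface that lists only user-facing operations. */
    public interface Stream {
        Stream map(String functionName);
    }

    /** Implementation class; it can evolve without changing the Stream interface. */
    static final class StreamImpl implements Stream {
        private final InternalPlanNode node;

        StreamImpl(String name) {
            this.node = new InternalPlanNode(name);
        }

        @Override
        public Stream map(String functionName) {
            return new StreamImpl(node.description + " -> " + functionName);
        }

        /** Not declared on Stream, so user code holding a Stream cannot call it without a cast. */
        InternalPlanNode getPlanNode() {
            return node;
        }
    }

    /** Returns the sorted names of the public methods that the given type declares itself. */
    public static List<String> publicApi(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (Modifier.isPublic(method.getModifiers()) && !method.isSynthetic()) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(publicApi(ConcreteStream.class)); // [getPlanNode, map]
        System.out.println(publicApi(Stream.class));         // [map]
    }
}
```

With the concrete class, the internal accessor is part of what users can
compile against, so changing it risks breaking them. With the interface,
only `map` is visible and the implementation can change freely.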

A thorough discussion is definitely needed once the FLIP is out. And of
course it's possible that the FLIP might be rejected. Given that we are
planning for release 2.0, I just feel it would be better to bring this up
early, even though the concrete plan is not yet ready.

Best,

Xintong


[1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
[2]
https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
[3] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c

On Tue, Jun 27, 2023 at 5:15 PM Gyula Fóra <gy...@apache.org> wrote:

> Hey!
>
> I share the same concerns mentioned above regarding the "ProcessFunction
> API".
>
> I don't think we should create a replacement for the DataStream API unless
> we have a very good reason to do so and with a proper discussion about this
> as Alex said.
>
> Cheers,
> Gyula
>
> On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
> alexander.fedulov@gmail.com> wrote:
>
> > Hi Xintong,
> >
> > By compatibility discussion do you mean the "[DISCUSS] FLIP-321:
> Introduce
> > an API deprecation process" thread [1]?
> >
> > I am also curious to know if the rationale behind this new API has been
> > previously discussed on the mailing list. Do we have a list of
> shortcomings
> > in the current DataStream API that it tries to resolve? How does the
> > current ProcessFunction functionality fit into the picture? Will it be
> kept
> > as is or subsumed by new API?
> >
> > [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
> >
> > Best,
> > Alex
> >
> > On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com>
> wrote:
> >
> > > >
> > > > The ProcessFunction API item is giving me the most headaches because
> > it's
> > > > very unclear what it actually entails; like is it an entirely
> separate
> > > API
> > > > to DataStream (sounds like it is!) or an extension of DataStream. How
> > > much
> > > > will it share the internals with DataStream etc.; how does it relate
> to
> > > the
> > > > Table API (w.r.t. switching APIs / what Table API uses underneath).
> > > >
> > >
> > > I totally understand your confusion. We started planning this after
> > kicking
> > > off the release 2.0, so there's still a lot to be explored and the plan
> > > keeps changing.
> > >
> > >
> > >    - In the beginning, we planned to do an in-place refactor of
> > DataStream
> > >    API, until the API migration period is proposed.
> > >    - Then we want to make it an entirely separate API to DataStream,
> and
> > >    listed as a must-have for release 2.0 so that we can remove
> DataStream
> > > once
> > >    it's ready.
> > >    - However, depending on the outcome of the API compatibility
> > discussion
> > >    [1], we may not be able to remove DataStream in 2.0 anyway, which
> > means
> > > we
> > >    might need to re-evaluate the necessity of this item for 2.0.
> > >
> > > I'd say we wait a bit longer for the compatibility discussion [1] and
> > > decide the priority for this item afterwards.
> > >
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > > [1] https://lists.apache.org/list.html?dev@flink.apache.org
> > >
> > >
> > > On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <ch...@apache.org>
> > > wrote:
> > >
> > > > by-and-large I'm quite happy with the list of items.
> > > >
> > > > I'm curious as to why the "Disaggregated State Management" item is
> > marked
> > > > as a must-have; will it require changes that break something? What
> > > prevents
> > > > it from being added in 2.1?
> > > >
> > > > We may want to update the Java 17 item to "Make Java 17 the default,
> > drop
> > > > Java 8/11". Maybe even split it into a must-have "Drop Java 8" and a
> > > > nice-to-have "Drop Java 11"?
> > > >
> > > > "Move Calcite rules from Scala to Java": I would hope that this would
> > be
> > > > an entirely internal change, and could thus be an incremental process
> > > > independent of major releases.
> > > > What is the actual scale of this item; how much are we actually
> > > re-writing?
> > > >
> > > > "Add MetricGroup#getLogicalScope": I'd raise this to a must-have; i
> > think
> > > > I marked it down as nice-to-have only because it depends on another
> > item.
> > > >
> > > > The ProcessFunction API item is giving me the most headaches because
> > it's
> > > > very unclear what it actually entails; like is it an entirely
> separate
> > > API
> > > > to DataStream (sounds like it is!) or an extension of DataStream. How
> > > much
> > > > will it share the internals with DataStream etc.; how does it relate
> to
> > > the
> > > > Table API (w.r.t. switching APIs / what Table API uses underneath).
> > > >
> > > > There are a few items I added as ideas which don't have a priority
> yet;
> > > > would love to get some feedback on those.
> > > >
> > > > On 21/06/2023 08:41, Xintong Song wrote:
> > > >
> > > > Hi devs,
> > > >
> > > > As previously discussed in [1], we had been collecting work item
> > > proposals
> > > > for the 2.0 release until June 15th, on the wiki page [2].
> > > >
> > > >    - As we have passed the due date, I'd like to kindly remind
> everyone
> > > *not
> > > >    to add / remove items directly on the wiki page*. If needed,
> please
> > > post
> > > >    in this thread or reach out to the release managers instead.
> > > >    - I've reached out to some folks for clarifications about their
> > > >    proposals. Some of them mentioned that they can not yet tell
> whether
> > > we
> > > >    should do an item or not, and would need more time / discussions
> to
> > > make
> > > >    the decision. So I added a new symbol for items whose priorities
> are
> > > `TBD`.
> > > >
> > > > Now it's time to collaboratively decide a minimum set of must-have
> > items.
> > > > I've gone through the entire list of proposed items, and found most
> of
> > > them
> > > > make quite much sense. So I think an online sync might not be
> necessary
> > > for
> > > > this. I'd like to go with this DISCUSS thread, where everyone can
> > comment
> > > > on how they think the list can be improved, followed by a VOTE to
> > > formally
> > > > make the decision.
> > > >
> > > > Any feedback and opinions, including but not limited to the following
> > > > aspects, will be appreciated.
> > > >
> > > >    - Important items that are missing from the list
> > > >    - Concerns regarding the listed items or their priorities
> > > >
> > > > Looking forward to your feedback.
> > > >
> > > > Best,
> > > >
> > > > Xintong
> > > >
> > > >
> > > > [1]
> > >
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > > >
> > > > [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > >
> > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Gyula Fóra <gy...@apache.org>.
Hey!

I share the same concerns mentioned above regarding the "ProcessFunction
API".

I don't think we should create a replacement for the DataStream API unless
we have a very good reason to do so and with a proper discussion about this
as Alex said.

Cheers,
Gyula

On Tue, Jun 27, 2023 at 11:03 AM Alexander Fedulov <
alexander.fedulov@gmail.com> wrote:

> Hi Xintong,
>
> By compatibility discussion do you mean the "[DISCUSS] FLIP-321: Introduce
> an API deprecation process" thread [1]?
>
> I am also curious to know if the rationale behind this new API has been
> previously discussed on the mailing list. Do we have a list of shortcomings
> in the current DataStream API that it tries to resolve? How does the
> current ProcessFunction functionality fit into the picture? Will it be kept
> as is or subsumed by new API?
>
> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9
>
> Best,
> Alex
>
> On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com> wrote:
>
> > >
> > > The ProcessFunction API item is giving me the most headaches because
> it's
> > > very unclear what it actually entails; like is it an entirely separate
> > API
> > > to DataStream (sounds like it is!) or an extension of DataStream. How
> > much
> > > will it share the internals with DataStream etc.; how does it relate to
> > the
> > > Table API (w.r.t. switching APIs / what Table API uses underneath).
> > >
> >
> > I totally understand your confusion. We started planning this after
> kicking
> > off the release 2.0, so there's still a lot to be explored and the plan
> > keeps changing.
> >
> >
> >    - In the beginning, we planned to do an in-place refactor of
> DataStream
> >    API, until the API migration period is proposed.
> >    - Then we want to make it an entirely separate API to DataStream, and
> >    listed as a must-have for release 2.0 so that we can remove DataStream
> > once
> >    it's ready.
> >    - However, depending on the outcome of the API compatibility
> discussion
> >    [1], we may not be able to remove DataStream in 2.0 anyway, which
> means
> > we
> >    might need to re-evaluate the necessity of this item for 2.0.
> >
> > I'd say we wait a bit longer for the compatibility discussion [1] and
> > decide the priority for this item afterwards.
> >
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1] https://lists.apache.org/list.html?dev@flink.apache.org
> >
> >
> > On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <ch...@apache.org>
> > wrote:
> >
> > > by-and-large I'm quite happy with the list of items.
> > >
> > > I'm curious as to why the "Disaggregated State Management" item is
> marked
> > > as a must-have; will it require changes that break something? What
> > prevents
> > > it from being added in 2.1?
> > >
> > > We may want to update the Java 17 item to "Make Java 17 the default,
> drop
> > > Java 8/11". Maybe even split it into a must-have "Drop Java 8" and a
> > > nice-to-have "Drop Java 11"?
> > >
> > > "Move Calcite rules from Scala to Java": I would hope that this would
> be
> > > an entirely internal change, and could thus be an incremental process
> > > independent of major releases.
> > > What is the actual scale of this item; how much are we actually
> > re-writing?
> > >
> > > "Add MetricGroup#getLogicalScope": I'd raise this to a must-have; i
> think
> > > I marked it down as nice-to-have only because it depends on another
> item.
> > >
> > > The ProcessFunction API item is giving me the most headaches because
> it's
> > > very unclear what it actually entails; like is it an entirely separate
> > API
> > > to DataStream (sounds like it is!) or an extension of DataStream. How
> > much
> > > will it share the internals with DataStream etc.; how does it relate to
> > the
> > > Table API (w.r.t. switching APIs / what Table API uses underneath).
> > >
> > > There are a few items I added as ideas which don't have a priority yet;
> > > would love to get some feedback on those.
> > >
> > > On 21/06/2023 08:41, Xintong Song wrote:
> > >
> > > Hi devs,
> > >
> > > As previously discussed in [1], we had been collecting work item
> > proposals
> > > for the 2.0 release until June 15th, on the wiki page [2].
> > >
> > >    - As we have passed the due date, I'd like to kindly remind everyone
> > *not
> > >    to add / remove items directly on the wiki page*. If needed, please
> > post
> > >    in this thread or reach out to the release managers instead.
> > >    - I've reached out to some folks for clarifications about their
> > >    proposals. Some of them mentioned that they can not yet tell whether
> > we
> > >    should do an item or not, and would need more time / discussions to
> > make
> > >    the decision. So I added a new symbol for items whose priorities are
> > `TBD`.
> > >
> > > Now it's time to collaboratively decide a minimum set of must-have
> items.
> > > I've gone through the entire list of proposed items, and found most of
> > them
> > > make quite much sense. So I think an online sync might not be necessary
> > for
> > > this. I'd like to go with this DISCUSS thread, where everyone can
> comment
> > > on how they think the list can be improved, followed by a VOTE to
> > formally
> > > make the decision.
> > >
> > > Any feedback and opinions, including but not limited to the following
> > > aspects, will be appreciated.
> > >
> > >    - Important items that are missing from the list
> > >    - Concerns regarding the listed items or their priorities
> > >
> > > Looking forward to your feedback.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > > [1]
> >
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> > >
> > > [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > >
> > >
> > >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Alexander Fedulov <al...@gmail.com>.
Hi Xintong,

By compatibility discussion do you mean the "[DISCUSS] FLIP-321: Introduce
an API deprecation process" thread [1]?

I am also curious to know if the rationale behind this new API has been
previously discussed on the mailing list. Do we have a list of shortcomings
in the current DataStream API that it tries to resolve? How does the
current ProcessFunction functionality fit into the picture? Will it be kept
as is or subsumed by the new API?

[1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9

Best,
Alex

On Mon, 26 Jun 2023 at 14:33, Xintong Song <to...@gmail.com> wrote:

> >
> > The ProcessFunction API item is giving me the most headaches because it's
> > very unclear what it actually entails; like is it an entirely separate
> API
> > to DataStream (sounds like it is!) or an extension of DataStream. How
> much
> > will it share the internals with DataStream etc.; how does it relate to
> the
> > Table API (w.r.t. switching APIs / what Table API uses underneath).
> >
>
> I totally understand your confusion. We started planning this after kicking
> off the release 2.0, so there's still a lot to be explored and the plan
> keeps changing.
>
>
>    - In the beginning, we planned to do an in-place refactor of DataStream
>    API, until the API migration period is proposed.
>    - Then we want to make it an entirely separate API to DataStream, and
>    listed as a must-have for release 2.0 so that we can remove DataStream
> once
>    it's ready.
>    - However, depending on the outcome of the API compatibility discussion
>    [1], we may not be able to remove DataStream in 2.0 anyway, which means
> we
>    might need to re-evaluate the necessity of this item for 2.0.
>
> I'd say we wait a bit longer for the compatibility discussion [1] and
> decide the priority for this item afterwards.
>
>
> Best,
>
> Xintong
>
>
> [1] https://lists.apache.org/list.html?dev@flink.apache.org
>
>
> On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <ch...@apache.org>
> wrote:
>
> > by-and-large I'm quite happy with the list of items.
> >
> > I'm curious as to why the "Disaggregated State Management" item is marked
> > as a must-have; will it require changes that break something? What
> prevents
> > it from being added in 2.1?
> >
> > We may want to update the Java 17 item to "Make Java 17 the default, drop
> > Java 8/11". Maybe even split it into a must-have "Drop Java 8" and a
> > nice-to-have "Drop Java 11"?
> >
> > "Move Calcite rules from Scala to Java": I would hope that this would be
> > an entirely internal change, and could thus be an incremental process
> > independent of major releases.
> > What is the actual scale of this item; how much are we actually
> re-writing?
> >
> > "Add MetricGroup#getLogicalScope": I'd raise this to a must-have; i think
> > I marked it down as nice-to-have only because it depends on another item.
> >
> > The ProcessFunction API item is giving me the most headaches because it's
> > very unclear what it actually entails; like is it an entirely separate
> API
> > to DataStream (sounds like it is!) or an extension of DataStream. How
> much
> > will it share the internals with DataStream etc.; how does it relate to
> the
> > Table API (w.r.t. switching APIs / what Table API uses underneath).
> >
> > There are a few items I added as ideas which don't have a priority yet;
> > would love to get some feedback on those.
> >
> > On 21/06/2023 08:41, Xintong Song wrote:
> >
> > Hi devs,
> >
> > As previously discussed in [1], we had been collecting work item
> proposals
> > for the 2.0 release until June 15th, on the wiki page [2].
> >
> >    - As we have passed the due date, I'd like to kindly remind everyone
> *not
> >    to add / remove items directly on the wiki page*. If needed, please
> post
> >    in this thread or reach out to the release managers instead.
> >    - I've reached out to some folks for clarifications about their
> >    proposals. Some of them mentioned that they can not yet tell whether
> we
> >    should do an item or not, and would need more time / discussions to
> make
> >    the decision. So I added a new symbol for items whose priorities are
> `TBD`.
> >
> > Now it's time to collaboratively decide a minimum set of must-have items.
> > I've gone through the entire list of proposed items, and found most of
> them
> > make quite much sense. So I think an online sync might not be necessary
> for
> > this. I'd like to go with this DISCUSS thread, where everyone can comment
> > on how they think the list can be improved, followed by a VOTE to
> formally
> > make the decision.
> >
> > Any feedback and opinions, including but not limited to the following
> > aspects, will be appreciated.
> >
> >    - Important items that are missing from the list
> >    - Concerns regarding the listed items or their priorities
> >
> > Looking forward to your feedback.
> >
> > Best,
> >
> > Xintong
> >
> >
> > [1]
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
> >
> > [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> >
> >
> >
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Xintong Song <to...@gmail.com>.
>
> The ProcessFunction API item is giving me the most headaches because it's
> very unclear what it actually entails; like is it an entirely separate API
> to DataStream (sounds like it is!) or an extension of DataStream. How much
> will it share the internals with DataStream etc.; how does it relate to the
> Table API (w.r.t. switching APIs / what Table API uses underneath).
>

I totally understand your confusion. We started planning this after kicking
off the release 2.0, so there's still a lot to be explored and the plan
keeps changing.


   - In the beginning, we planned to do an in-place refactor of the DataStream
   API, until the API migration period was proposed.
   - Then we wanted to make it an entirely separate API from DataStream, and
   listed it as a must-have for release 2.0 so that we can remove DataStream
   once it's ready.
   - However, depending on the outcome of the API compatibility discussion
   [1], we may not be able to remove DataStream in 2.0 anyway, which means we
   might need to re-evaluate the necessity of this item for 2.0.

I'd say we wait a bit longer for the compatibility discussion [1] and
decide the priority for this item afterwards.


Best,

Xintong


[1] https://lists.apache.org/list.html?dev@flink.apache.org


On Mon, Jun 26, 2023 at 6:00 PM Chesnay Schepler <ch...@apache.org> wrote:

> by-and-large I'm quite happy with the list of items.
>
> I'm curious as to why the "Disaggregated State Management" item is marked
> as a must-have; will it require changes that break something? What prevents
> it from being added in 2.1?
>
> We may want to update the Java 17 item to "Make Java 17 the default, drop
> Java 8/11". Maybe even split it into a must-have "Drop Java 8" and a
> nice-to-have "Drop Java 11"?
>
> "Move Calcite rules from Scala to Java": I would hope that this would be
> an entirely internal change, and could thus be an incremental process
> independent of major releases.
> What is the actual scale of this item; how much are we actually re-writing?
>
> "Add MetricGroup#getLogicalScope": I'd raise this to a must-have; i think
> I marked it down as nice-to-have only because it depends on another item.
>
> The ProcessFunction API item is giving me the most headaches because it's
> very unclear what it actually entails; like is it an entirely separate API
> to DataStream (sounds like it is!) or an extension of DataStream. How much
> will it share the internals with DataStream etc.; how does it relate to the
> Table API (w.r.t. switching APIs / what Table API uses underneath).
>
> There are a few items I added as ideas which don't have a priority yet;
> would love to get some feedback on those.
>
> On 21/06/2023 08:41, Xintong Song wrote:
>
> Hi devs,
>
> As previously discussed in [1], we had been collecting work item proposals
> for the 2.0 release until June 15th, on the wiki page [2].
>
>    - As we have passed the due date, I'd like to kindly remind everyone *not
>    to add / remove items directly on the wiki page*. If needed, please post
>    in this thread or reach out to the release managers instead.
>    - I've reached out to some folks for clarifications about their
>    proposals. Some of them mentioned that they can not yet tell whether we
>    should do an item or not, and would need more time / discussions to make
>    the decision. So I added a new symbol for items whose priorities are `TBD`.
>
> Now it's time to collaboratively decide a minimum set of must-have items.
> I've gone through the entire list of proposed items, and found most of them
> make quite much sense. So I think an online sync might not be necessary for
> this. I'd like to go with this DISCUSS thread, where everyone can comment
> on how they think the list can be improved, followed by a VOTE to formally
> make the decision.
>
> Any feedback and opinions, including but not limited to the following
> aspects, will be appreciated.
>
>    - Important items that are missing from the list
>    - Concerns regarding the listed items or their priorities
>
> Looking forward to your feedback.
>
> Best,
>
> Xintong
>
>
> [1]https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
>
> [2] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>
>
>

Re: [DISCUSS] Release 2.0 Work Items

Posted by Chesnay Schepler <ch...@apache.org>.
By and large, I'm quite happy with the list of items.

I'm curious as to why the "Disaggregated State Management" item is 
marked as a must-have; will it require changes that break something? 
What prevents it from being added in 2.1?

We may want to update the Java 17 item to "Make Java 17 the default, 
drop Java 8/11". Maybe even split it into a must-have "Drop Java 8" and 
a nice-to-have "Drop Java 11"?

"Move Calcite rules from Scala to Java": I would hope that this would be 
an entirely internal change, and could thus be an incremental process 
independent of major releases.
What is the actual scale of this item; how much are we actually re-writing?

"Add MetricGroup#getLogicalScope": I'd raise this to a must-have; I
think I marked it down as nice-to-have only because it depends on
another item.

The ProcessFunction API item is giving me the most headaches because
it's very unclear what it actually entails; like, is it an entirely
separate API to DataStream (sounds like it is!) or an extension of
DataStream? How much will it share the internals with DataStream etc.?
How does it relate to the Table API (w.r.t. switching APIs / what Table
API uses underneath)?

There are a few items I added as ideas which don't have a priority yet; 
would love to get some feedback on those.

On 21/06/2023 08:41, Xintong Song wrote:
> Hi devs,
>
> As previously discussed in [1], we had been collecting work item proposals
> for the 2.0 release until June 15th, on the wiki page [2].
>
>     - As we have passed the due date, I'd like to kindly remind everyone *not
>     to add / remove items directly on the wiki page*. If needed, please post
>     in this thread or reach out to the release managers instead.
>     - I've reached out to some folks for clarifications about their
>     proposals. Some of them mentioned that they can not yet tell whether we
>     should do an item or not, and would need more time / discussions to make
>     the decision. So I added a new symbol for items whose priorities are `TBD`.
>
> Now it's time to collaboratively decide a minimum set of must-have items.
> I've gone through the entire list of proposed items, and found most of them
> make quite much sense. So I think an online sync might not be necessary for
> this. I'd like to go with this DISCUSS thread, where everyone can comment
> on how they think the list can be improved, followed by a VOTE to formally
> make the decision.
>
> Any feedback and opinions, including but not limited to the following
> aspects, will be appreciated.
>
>     - Important items that are missing from the list
>     - Concerns regarding the listed items or their priorities
>
> Looking forward to your feedback.
>
> Best,
>
> Xintong
>
>
> [1]
> https://lists.apache.org/list?dev@flink.apache.org:lte=1M:release%202.0%20status%20updates
>
> [2]https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
>