You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by Konstantin Knauf <kn...@apache.org> on 2023/02/03 18:04:31 UTC

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Hi everyone,

sorry for joining the discussion late.

1) Is there an option to deprecate SinkFunction in Flink 1.17 while leaving
SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2 @Public in
and remove SinkFunction in Flink 1.18. @PublicEvolving are intended for
public use. So, I don't see it as a blocker for deprecating SinkFunction
that we have to make SinkV2 @PublicEvovling. For reference this is the
description of @PublicEvovling:

/**
 * Annotation to mark classes and methods for public use, but with
evolving interfaces.
 *
 * <p>Classes and methods with this annotation are intended for public
use and have stable behavior.
 * However, their interfaces and signatures are not considered to be
stable and might be changed
 * across versions.
 *
 * <p>This annotation also excludes methods and classes with evolving
interfaces / signatures within
 * classes annotated with {@link Public}.
 */


Marking SinkFunction @Deprecated would already single everyone to move to
SinkV2, which we as a community, I believe, have a strong interest in. Its
almost comical how long the transition from SourceFurnction/SinkFunction to
Source/Sink takes us. At the same time, we leave ourselves the option to to
make small changes to SinkV2 if any problems arise during the migration of
these connector.

I think, we have a bit of a chicken/egg problem here. The pressure for
users and contributors is not high enough to move away from SinkFunction as
long as its not deprecated, but at the same time we need people to migrate
their connectors to see if there are any gaps in SinkV2. I believe, the
combination proposed above could bridge this problem.

2) I don't understand the argument of waiting until some of the
implementations are @Public. How can we make the implementations of the
SinkV2 API @Public without making SinkV2 @Public? All public methods of
SinkV2 are part of every implementation. So to me it actually seems to be
opposite: in order to make any of the implementation @Public we first need
to make the API @Public.

Cheers,

Konstantin

Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <li...@gmail.com>:

> Hi Martijn,
>
> Thanks for driving this effort to clean-up the Flink codebase!
>
> I like the idea to cleanup Flink codebase to avoid having two Sinks. On the
> other hand, I also thing the concern mentioned by Jing makes sense. In
> addition to thinking in terms of the rule proposed in FLIP-197
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> >
> (which
> seems to focus mostly on the Flink developers' perspective), it might be
> useful to also think about the story from users' perspective and make sure
> their concerns can be addressed.
>
> Typically, by marking an API as deprecated, we are effectively telling
> users *they should start to migrate to the new API ASAP and we reserve the
> right to remove this API completely in the 1-2 releases*. Then it might be
> reasonable for users to ask questions such as:
>
> 1) Does SinkV2 public API provides all the functionalities needed to
> migrate my existing code from subclassing SinkFunction to subclassing
> SinkV2?
>
> 2) Is the amount of migration work reasonable? If yes, why is a class such
> as HBaseSinkFunction in Flink's own codebase still depending on
> SinkFunction? Maybe Flink developers should eat their own dog food and
> migrate (or deprecate) those classes in the Flink codebase first?
>
> Based on the discussion in this thread, I am not sure we have good answers
> to those questions yet. For the 1st question above, the answer is *no*
> because we already know that the SinkV2 is currently not able to support
> migration for JdbcSink. For the 2nd question above, we know there are many
> non-deprecated class in Flink codebase that are still depending on SinkV2.
> It is probably not nice to users if we tell them to migrate while we know
> there are existing issues that can prevent them from doing so easily.
>
> In order to move forward to deprecate SinkV2, I think it will be super
> useful to first migrate all the connectors managed by Flink community
> (including all externalized connectors) to use SinkV2. This work won't be
> wasted since we need to do this anyway. And it will also give us a chance
> to validate the capabilities of SinkV2 and fix bugs by ourselves as much as
> possible.
>
> What do you think?
>
> Best Regards,
> Dong
>
>
> On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <ma...@apache.org>
> wrote:
>
> > Hi all,
> >
> > While discussing FLIP-281 [1] the discussion also turned to the
> > SinkFunction and the SinkV2 API. For a broader discussion I'm opening up
> a
> > separate discussion thread.
> >
> > As Yun Tang has mentioned in that discussion thread, it would be a good
> > time to deprecate the SinkFunction to avoid the need to introduce new
> > functions towards (to be) deprecated APIs. Jing rightfully mentioned that
> > it would be confusing to deprecate the SinkFunction if its successor is
> not
> > yet marked as @Public (it's currently @PublicEvolving).
> >
> > My proposal would be to promote the SinkV2 API to @public in Flink 1.17
> and
> > mark the SinkFunction as @deprecated in Flink 1.17
> >
> > The original Sink interface was introduced in Flink 1.12 with FLIP-143
> [2]
> > and extended with FLIP-177 in Flink 1.14 [3] and has been improved on
> > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> >
> > Looking at the API stability graduation process [5], the fact that Sink
> V2
> > was introduced in Flink 1.15 would mean that we could warrant a promotion
> > to @public already (given that there have been two releases with 1.15 and
> > 1.16 where it was introduced). Combined with the fact that SinkV2 has
> been
> > the result of iteration over the introduction of the original Sink API
> > since Flink 1.12, I would argue that the promotion is overdue.
> >
> > If we promote the Sink API to @public, I think we should also immediately
> > mark the SinkFunction as @deprecated.
> >
> > Looking forward to your thoughts.
> >
> > Best regards,
> >
> > Martijn
> >
> >
> > [1] https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > [3]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > [4]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > [5]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> >
>


-- 
https://twitter.com/snntrable
https://github.com/knaufk

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Jing Ge <ji...@ververica.com.INVALID>.
Hi,

We have the same goal to let users migrate to SinkV2. Just marking those
interfaces as deprecated or public won't achieve this goal. Frankly
speaking, users will start to migrate not because those interfaces are
marked as deprecated. They will do it because there is a workable
implementation which can replace the old one with a clear migration
guideline. What Dong Li and Jark Wu said makes a lot of sense. This leads
to the "bottom-up" solution - first have workable(graduated, not
necessarily means @public, please see below) connectors, then deprecate and
graduate top level interfaces.

Hi Konstanin

Thanks for sharing. Commonly, you are right, I'm with you. Interface and
its (solo) implementation should be graduated together. But SinkV2 is a
special case. It has more than 10 implementations. And it is a more
customer-facing API which has a big impact on the ecosystem. We should do
it with special care. If we go this way, it will be even harder to graduate
the SinkV2 API, because many connector implementations should be graduated
with SinkV2 interfaces together.

Speaking of the issue you mentioned, as far as I am concerned, the current
KafkaSink implementation[1] offers an option for this. Conceptually, it is
possible to mark KafkaSink as @public without marking SinkV2 as @public.
All potential unstable methods coming from SinkV2 interfaces will be kept
marked as @internal. We still can make breaking changes with those methods,
i.e. with SinkV2 interfaces(still @publicEvolving) after KafkaSink is
graduated. For KafkaSink itself, it is also possible to do further backward
compatible changes, like adding new methods, removing @internal(once the
related SinkV2 interface is graduated), etc. The graduation impact will be
limited only within the Kafka connector.

Another tiny different option could be that we make KafkaSink be able to
completely replace FlinkKafkaProducer, still keep it as @publicEvolving and
remove FlinkKafkaProducer. This could also be considered as the (soft)
graduation of KafkaSink even if it is still @publicEvolving. Users will not
be confused because there is only one Kafka sink
solution(FlinkKafkaProducer is gone).

It's my belief that this is a more flexible solution with small impact on
the ecosystem than directly marking top level interfaces as @deprecated or
@public without showing users how to migrate, i.e. without concrete
graduated implementations.

Best regards,
Jing



[1]
https://github.com/apache/flink/blob/2ae5df278958073fee63b2bf824a53a28a21701b/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/sink/KafkaSink.java#L89


On Mon, Feb 6, 2023 at 2:56 PM Galen Warren <ga...@cvillewarrens.com> wrote:

> Recently, a critical bug with the Unified Sink committer was reported:  [FLINK-30238]
> Unified Sink committer does not clean up state on final savepoint - ASF
> JIRA (apache.org) <https://issues.apache.org/jira/browse/FLINK-30238>.
>
> Fabian Paul reported:
>
> Hi folks,
>>
>> I did some initial investigation, and the problem seems twofold.
>>
>> If no post-commit topology is used, we do not run into a problem where
>> we could lose data but since we do not clean up the state correctly,
>> we will hit this [1] when trying to stop the pipeline with a savepoint
>> after we have started it from a savepoint.
>> AFAICT all two-phase commit sinks are affected Kafka, File etc.
>>
>> For sinks using the post-commit topology, the same applies.
>> Additionally, we might never do the commit from the post-commit
>> topology resulting in lost data.
>>
>> Best,
>> Fabian
>>
>
>
> Does this need to be addressed before people can safely move to V2 sinks?
> I'm using the StreamingFileSink for this reason.
>
> Thanks,
>
> Galen
>
>
>
>
>
> On Mon, Feb 6, 2023 at 8:48 AM Konstantin Knauf <kn...@apache.org> wrote:
>
>> Hi Steven,
>>
>> Sink is already deprecated. It was deprecated at the moment where we
>> introduced SinkV2.
>>
>> Hi Jark, Hi Dong,
>>
>> My understanding is the SinkV2 is a workable interface. The most important
>> connectors have been migrated (Kafka, Filesystem) and more connectors
>> (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make SinkV2
>> public, it does not need to have all possible functionality. Public APIs
>> can be extended. That's what we do all the time. There will also always be
>> bugs. So, these points can not be categorical blockers to promote the API.
>>
>> What are the specific issues/tickets that are blocking us? Can we in your
>> opinion only deprecate it when every single connector in Apache Flink is
>> migrated already?
>>
>> In my opinion it is the time to ask users to the migrate their connectors.
>> More importantly, @Deprecated would signal users not to build new
>> connectors on SinkFunction. I would arque its also very misleading to
>> users
>> to not @Deprecated SinkFunction given that is clearly will be deprecated.
>>
>> Cheers,
>>
>> Konstantin
>>
>>
>> Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <im...@gmail.com>:
>>
>> > I agree with Dong Lin.
>> >
>> > Oracle explains how to use Deprecate API [1]:
>> >
>> > You are strongly recommended to use the Javadoc @deprecated tag with
>> > > appropriate comments explaining how to use the new API. This ensures
>> > > developers will *have a workable migration path from the old API to
>> the
>> > > new API*.
>> >
>> >
>> > From a user's perspective, the workable migration path is very
>> important.
>> > Otherwise, it blurs the semantics of API deprecation. The Flink API's
>> > compatibility and stability issues in the past left a bad impression on
>> the
>> > downstream projects. We should be careful when changing and deprecating
>> > APIs, especially when there are known migration gaps. I think it's a
>> good
>> > idea to migrate Flink-owned connectors before marking old API
>> deprecated.
>> > This ensures downstream projects can migrate to new APIs smoothly.
>> >
>> > Best,
>> > Jark
>> >
>> > [1]:
>> >
>> >
>> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
>> >
>> > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com> wrote:
>> >
>> > > Regarding the discussion on global committer [1] for sinks with global
>> > > transactions, there is no consensus on solving that problem in SinkV2.
>> > Will
>> > > it require any breaking change in SinkV2?
>> > >
>> > > Also will SinkV1 be deprecated too? or it should happen sometime after
>> > > SinkFunction deprecation?
>> > >
>> > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
>> > >
>> > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com> wrote:
>> > >
>> > > > Hi Konstantin,
>> > > >
>> > > > Thanks for the comment! Please see my comment inline.
>> > > >
>> > > > Cheers,
>> > > > Dong
>> > > >
>> > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kn...@apache.org>
>> > > wrote:
>> > > >
>> > > > > Hi everyone,
>> > > > >
>> > > > > sorry for joining the discussion late.
>> > > > >
>> > > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17
>> while
>> > > > leaving
>> > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2
>> > > @Public
>> > > > in
>> > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are
>> intended
>> > for
>> > > > > public use. So, I don't see it as a blocker for deprecating
>> > > SinkFunction
>> > > > > that we have to make SinkV2 @PublicEvovling. For reference this is
>> > the
>> > > > > description of @PublicEvovling:
>> > > > >
>> > > > > /**
>> > > > >  * Annotation to mark classes and methods for public use, but with
>> > > > > evolving interfaces.
>> > > > >  *
>> > > > >  * <p>Classes and methods with this annotation are intended for
>> > public
>> > > > > use and have stable behavior.
>> > > > >  * However, their interfaces and signatures are not considered to
>> be
>> > > > > stable and might be changed
>> > > > >  * across versions.
>> > > > >  *
>> > > > >  * <p>This annotation also excludes methods and classes with
>> evolving
>> > > > > interfaces / signatures within
>> > > > >  * classes annotated with {@link Public}.
>> > > > >  */
>> > > > >
>> > > > >
>> > > > > Marking SinkFunction @Deprecated would already single everyone to
>> > move
>> > > to
>> > > > > SinkV2, which we as a community, I believe, have a strong interest
>> > in.
>> > > > Its
>> > > > >
>> > > >
>> > > > Yes, I also believe we all have this strong interest. I just hope
>> that
>> > > this
>> > > > can be done in the best possible way that does not confuse users.
>> > > >
>> > > > I probably still have the same concern regarding its impact on
>> users:
>> > if
>> > > we
>> > > > mark an API as deprecated, it effectively means the users of this
>> API
>> > > > should start to migrate to another API (e.g. SinkV2) and we might
>> > remove
>> > > > this API in the future. However, given that we know there are known
>> > > > problems preventing users from doing so, it seems that we are not
>> ready
>> > > to
>> > > > send this message to users right.
>> > > >
>> > > > If I understand correctly, I guess you are suggesting that by
>> marking
>> > > > SinkFunction as deprecated, we can put higher pressure on Flink
>> > > > contributors to update the existing Flink codebase to improve and
>> use
>> > > > SinkV2.
>> > > >
>> > > > I am not sure this is the right way to use @deprecated, which has a
>> > > > particular meaning for its users rather than contributors. And I am
>> > also
>> > > > not sure we can even pressure contributors of an open-source project
>> > into
>> > > > developing a feature (e.g. migrate all existing SinkFunction
>> subclasses
>> > > to
>> > > > SinkV2). IMO, the typical way is for the contributor with
>> interest/time
>> > > to
>> > > > work on the feature, or talk to other contributors whether they are
>> > > willing
>> > > > to collaborate/work on this, rather than pressuring other
>> contributors
>> > > into
>> > > > working on this.
>> > > >
>> > > >
>> > > > almost comical how long the transition from
>> > SourceFurnction/SinkFunction
>> > > to
>> > > > > Source/Sink takes us. At the same time, we leave ourselves the
>> option
>> > > to
>> > > > to
>> > > > > make small changes to SinkV2 if any problems arise during the
>> > migration
>> > > > of
>> > > > > these connector.
>> > > > >
>> > > > > I think, we have a bit of a chicken/egg problem here. The pressure
>> > for
>> > > > >
>> > > >
>> > > > Similar to the reason described above, I am not sure we have a
>> > > chicken/egg
>> > > > problem here. The issue here is that SinkV2 is not ready and we
>> have a
>> > > lot
>> > > > of existing SinkFunction that is not migrated by ourselves. We
>> (Flink
>> > > > contributors) probably do not need to mark SinkFunction as
>> deprecated
>> > in
>> > > > order to address these issues in our own codebase.
>> > > >
>> > > >
>> > > > users and contributors is not high enough to move away from
>> > SinkFunction
>> > > as
>> > > > > long as its not deprecated, but at the same time we need people to
>> > > > migrate
>> > > > > their connectors to see if there are any gaps in SinkV2. I
>> believe,
>> > the
>> > > > > combination proposed above could bridge this problem.
>> > > > >
>> > > > > 2) I don't understand the argument of waiting until some of the
>> > > > > implementations are @Public. How can we make the implementations
>> of
>> > the
>> > > > > SinkV2 API @Public without making SinkV2 @Public? All public
>> methods
>> > of
>> > > > > SinkV2 are part of every implementation. So to me it actually
>> seems
>> > to
>> > > be
>> > > > > opposite: in order to make any of the implementation @Public we
>> first
>> > > > need
>> > > > > to make the API @Public.
>> > > > >
>> > > >
>> > > > Yeah I also agree with you.
>> > > >
>> > > >
>> > > > >
>> > > > > Cheers,
>> > > > >
>> > > > > Konstantin
>> > > > >
>> > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
>> > > lindong28@gmail.com
>> > > > >:
>> > > > >
>> > > > > > Hi Martijn,
>> > > > > >
>> > > > > > Thanks for driving this effort to clean-up the Flink codebase!
>> > > > > >
>> > > > > > I like the idea to cleanup Flink codebase to avoid having two
>> > Sinks.
>> > > On
>> > > > > the
>> > > > > > other hand, I also thing the concern mentioned by Jing makes
>> sense.
>> > > In
>> > > > > > addition to thinking in terms of the rule proposed in FLIP-197
>> > > > > > <
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
>> > > > > > >
>> > > > > > (which
>> > > > > > seems to focus mostly on the Flink developers' perspective), it
>> > might
>> > > > be
>> > > > > > useful to also think about the story from users' perspective and
>> > make
>> > > > > sure
>> > > > > > their concerns can be addressed.
>> > > > > >
>> > > > > > Typically, by marking an API as deprecated, we are effectively
>> > > telling
>> > > > > > users *they should start to migrate to the new API ASAP and we
>> > > reserve
>> > > > > the
>> > > > > > right to remove this API completely in the 1-2 releases*. Then
>> it
>> > > might
>> > > > > be
>> > > > > > reasonable for users to ask questions such as:
>> > > > > >
>> > > > > > 1) Does SinkV2 public API provides all the functionalities
>> needed
>> > to
>> > > > > > migrate my existing code from subclassing SinkFunction to
>> > subclassing
>> > > > > > SinkV2?
>> > > > > >
>> > > > > > 2) Is the amount of migration work reasonable? If yes, why is a
>> > class
>> > > > > such
>> > > > > > as HBaseSinkFunction in Flink's own codebase still depending on
>> > > > > > SinkFunction? Maybe Flink developers should eat their own dog
>> food
>> > > and
>> > > > > > migrate (or deprecate) those classes in the Flink codebase
>> first?
>> > > > > >
>> > > > > > Based on the discussion in this thread, I am not sure we have
>> good
>> > > > > answers
>> > > > > > to those questions yet. For the 1st question above, the answer
>> is
>> > > *no*
>> > > > > > because we already know that the SinkV2 is currently not able to
>> > > > support
>> > > > > > migration for JdbcSink. For the 2nd question above, we know
>> there
>> > are
>> > > > > many
>> > > > > > non-deprecated class in Flink codebase that are still depending
>> on
>> > > > > SinkV2.
>> > > > > > It is probably not nice to users if we tell them to migrate
>> while
>> > we
>> > > > know
>> > > > > > there are existing issues that can prevent them from doing so
>> > easily.
>> > > > > >
>> > > > > > In order to move forward to deprecate SinkV2, I think it will be
>> > > super
>> > > > > > useful to first migrate all the connectors managed by Flink
>> > community
>> > > > > > (including all externalized connectors) to use SinkV2. This work
>> > > won't
>> > > > be
>> > > > > > wasted since we need to do this anyway. And it will also give
>> us a
>> > > > chance
>> > > > > > to validate the capabilities of SinkV2 and fix bugs by
>> ourselves as
>> > > > much
>> > > > > as
>> > > > > > possible.
>> > > > > >
>> > > > > > What do you think?
>> > > > > >
>> > > > > > Best Regards,
>> > > > > > Dong
>> > > > > >
>> > > > > >
>> > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
>> > > > martijnvisser@apache.org
>> > > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi all,
>> > > > > > >
>> > > > > > > While discussing FLIP-281 [1] the discussion also turned to
>> the
>> > > > > > > SinkFunction and the SinkV2 API. For a broader discussion I'm
>> > > opening
>> > > > > up
>> > > > > > a
>> > > > > > > separate discussion thread.
>> > > > > > >
>> > > > > > > As Yun Tang has mentioned in that discussion thread, it would
>> be
>> > a
>> > > > good
>> > > > > > > time to deprecate the SinkFunction to avoid the need to
>> introduce
>> > > new
>> > > > > > > functions towards (to be) deprecated APIs. Jing rightfully
>> > > mentioned
>> > > > > that
>> > > > > > > it would be confusing to deprecate the SinkFunction if its
>> > > successor
>> > > > is
>> > > > > > not
>> > > > > > > yet marked as @Public (it's currently @PublicEvolving).
>> > > > > > >
>> > > > > > > My proposal would be to promote the SinkV2 API to @public in
>> > Flink
>> > > > 1.17
>> > > > > > and
>> > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
>> > > > > > >
>> > > > > > > The original Sink interface was introduced in Flink 1.12 with
>> > > > FLIP-143
>> > > > > > [2]
>> > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been
>> > improved
>> > > on
>> > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
>> > > > > > >
>> > > > > > > Looking at the API stability graduation process [5], the fact
>> > that
>> > > > Sink
>> > > > > > V2
>> > > > > > > was introduced in Flink 1.15 would mean that we could warrant
>> a
>> > > > > promotion
>> > > > > > > to @public already (given that there have been two releases
>> with
>> > > 1.15
>> > > > > and
>> > > > > > > 1.16 where it was introduced). Combined with the fact that
>> SinkV2
>> > > has
>> > > > > > been
>> > > > > > > the result of iteration over the introduction of the original
>> > Sink
>> > > > API
>> > > > > > > since Flink 1.12, I would argue that the promotion is overdue.
>> > > > > > >
>> > > > > > > If we promote the Sink API to @public, I think we should also
>> > > > > immediately
>> > > > > > > mark the SinkFunction as @deprecated.
>> > > > > > >
>> > > > > > > Looking forward to your thoughts.
>> > > > > > >
>> > > > > > > Best regards,
>> > > > > > >
>> > > > > > > Martijn
>> > > > > > >
>> > > > > > >
>> > > > > > > [1]
>> > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
>> > > > > > > [2]
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
>> > > > > > > [3]
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
>> > > > > > > [4]
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
>> > > > > > > [5]
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
>> > > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > https://twitter.com/snntrable
>> > > > > https://github.com/knaufk
>> > > > >
>> > > >
>> > >
>> >
>>
>>
>> --
>> https://twitter.com/snntrable
>> https://github.com/knaufk
>>
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Galen Warren <ga...@cvillewarrens.com>.
Recently, a critical bug with the Unified Sink committer was reported:
 [FLINK-30238]
Unified Sink committer does not clean up state on final savepoint - ASF
JIRA (apache.org) <https://issues.apache.org/jira/browse/FLINK-30238>.

Fabian Paul reported:

Hi folks,
>
> I did some initial investigation, and the problem seems twofold.
>
> If no post-commit topology is used, we do not run into a problem where
> we could lose data but since we do not clean up the state correctly,
> we will hit this [1] when trying to stop the pipeline with a savepoint
> after we have started it from a savepoint.
> AFAICT all two-phase commit sinks are affected Kafka, File etc.
>
> For sinks using the post-commit topology, the same applies.
> Additionally, we might never do the commit from the post-commit
> topology resulting in lost data.
>
> Best,
> Fabian
>


Does this need to be addressed before people can safely move to V2 sinks?
I'm using the StreamingFileSink for this reason.

Thanks,

Galen





On Mon, Feb 6, 2023 at 8:48 AM Konstantin Knauf <kn...@apache.org> wrote:

> Hi Steven,
>
> Sink is already deprecated. It was deprecated at the moment where we
> introduced SinkV2.
>
> Hi Jark, Hi Dong,
>
> My understanding is the SinkV2 is a workable interface. The most important
> connectors have been migrated (Kafka, Filesystem) and more connectors
> (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make SinkV2
> public, it does not need to have all possible functionality. Public APIs
> can be extended. That's what we do all the time. There will also always be
> bugs. So, these points can not be categorical blockers to promote the API.
>
> What are the specific issues/tickets that are blocking us? Can we in your
> opinion only deprecate it when every single connector in Apache Flink is
> migrated already?
>
> In my opinion it is the time to ask users to the migrate their connectors.
> More importantly, @Deprecated would signal users not to build new
> connectors on SinkFunction. I would arque its also very misleading to users
> to not @Deprecated SinkFunction given that is clearly will be deprecated.
>
> Cheers,
>
> Konstantin
>
>
> Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <im...@gmail.com>:
>
> > I agree with Dong Lin.
> >
> > Oracle explains how to use Deprecate API [1]:
> >
> > You are strongly recommended to use the Javadoc @deprecated tag with
> > > appropriate comments explaining how to use the new API. This ensures
> > > developers will *have a workable migration path from the old API to the
> > > new API*.
> >
> >
> > From a user's perspective, the workable migration path is very important.
> > Otherwise, it blurs the semantics of API deprecation. The Flink API's
> > compatibility and stability issues in the past left a bad impression on
> the
> > downstream projects. We should be careful when changing and deprecating
> > APIs, especially when there are known migration gaps. I think it's a good
> > idea to migrate Flink-owned connectors before marking old API deprecated.
> > This ensures downstream projects can migrate to new APIs smoothly.
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> >
> > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com> wrote:
> >
> > > Regarding the discussion on global committer [1] for sinks with global
> > > transactions, there is no consensus on solving that problem in SinkV2.
> > Will
> > > it require any breaking change in SinkV2?
> > >
> > > Also will SinkV1 be deprecated too? or it should happen sometime after
> > > SinkFunction deprecation?
> > >
> > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > >
> > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com> wrote:
> > >
> > > > Hi Konstantin,
> > > >
> > > > Thanks for the comment! Please see my comment inline.
> > > >
> > > > Cheers,
> > > > Dong
> > > >
> > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kn...@apache.org>
> > > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > sorry for joining the discussion late.
> > > > >
> > > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17 while
> > > > leaving
> > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2
> > > @Public
> > > > in
> > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are intended
> > for
> > > > > public use. So, I don't see it as a blocker for deprecating
> > > SinkFunction
> > > > > that we have to make SinkV2 @PublicEvovling. For reference this is
> > the
> > > > > description of @PublicEvovling:
> > > > >
> > > > > /**
> > > > >  * Annotation to mark classes and methods for public use, but with
> > > > > evolving interfaces.
> > > > >  *
> > > > >  * <p>Classes and methods with this annotation are intended for
> > public
> > > > > use and have stable behavior.
> > > > >  * However, their interfaces and signatures are not considered to
> be
> > > > > stable and might be changed
> > > > >  * across versions.
> > > > >  *
> > > > >  * <p>This annotation also excludes methods and classes with
> evolving
> > > > > interfaces / signatures within
> > > > >  * classes annotated with {@link Public}.
> > > > >  */
> > > > >
> > > > >
> > > > > Marking SinkFunction @Deprecated would already single everyone to
> > move
> > > to
> > > > > SinkV2, which we as a community, I believe, have a strong interest
> > in.
> > > > Its
> > > > >
> > > >
> > > > Yes, I also believe we all have this strong interest. I just hope
> that
> > > this
> > > > can be done in the best possible way that does not confuse users.
> > > >
> > > > I probably still have the same concern regarding its impact on users:
> > if
> > > we
> > > > mark an API as deprecated, it effectively means the users of this API
> > > > should start to migrate to another API (e.g. SinkV2) and we might
> > remove
> > > > this API in the future. However, given that we know there are known
> > > > problems preventing users from doing so, it seems that we are not
> ready
> > > to
> > > > send this message to users right.
> > > >
> > > > If I understand correctly, I guess you are suggesting that by marking
> > > > SinkFunction as deprecated, we can put higher pressure on Flink
> > > > contributors to update the existing Flink codebase to improve and use
> > > > SinkV2.
> > > >
> > > > I am not sure this is the right way to use @deprecated, which has a
> > > > particular meaning for its users rather than contributors. And I am
> > also
> > > > not sure we can even pressure contributors of an open-source project
> > into
> > > > developing a feature (e.g. migrate all existing SinkFunction
> subclasses
> > > to
> > > > SinkV2). IMO, the typical way is for the contributor with
> interest/time
> > > to
> > > > work on the feature, or talk to other contributors whether they are
> > > willing
> > > > to collaborate/work on this, rather than pressuring other
> contributors
> > > into
> > > > working on this.
> > > >
> > > >
> > > > almost comical how long the transition from
> > SourceFurnction/SinkFunction
> > > to
> > > > > Source/Sink takes us. At the same time, we leave ourselves the
> option
> > > to
> > > > to
> > > > > make small changes to SinkV2 if any problems arise during the
> > migration
> > > > of
> > > > > these connector.
> > > > >
> > > > > I think, we have a bit of a chicken/egg problem here. The pressure
> > for
> > > > >
> > > >
> > > > Similar to the reason described above, I am not sure we have a
> > > chicken/egg
> > > > problem here. The issue here is that SinkV2 is not ready and we have
> a
> > > lot
> > > > of existing SinkFunction that is not migrated by ourselves. We (Flink
> > > > contributors) probably do not need to mark SinkFunction as deprecated
> > in
> > > > order to address these issues in our own codebase.
> > > >
> > > >
> > > > users and contributors is not high enough to move away from
> > SinkFunction
> > > as
> > > > > long as its not deprecated, but at the same time we need people to
> > > > migrate
> > > > > their connectors to see if there are any gaps in SinkV2. I believe,
> > the
> > > > > combination proposed above could bridge this problem.
> > > > >
> > > > > 2) I don't understand the argument of waiting until some of the
> > > > > implementations are @Public. How can we make the implementations of
> > the
> > > > > SinkV2 API @Public without making SinkV2 @Public? All public
> methods
> > of
> > > > > SinkV2 are part of every implementation. So to me it actually seems
> > to
> > > be
> > > > > opposite: in order to make any of the implementation @Public we
> first
> > > > need
> > > > > to make the API @Public.
> > > > >
> > > >
> > > > Yeah I also agree with you.
> > > >
> > > >
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Konstantin
> > > > >
> > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > lindong28@gmail.com
> > > > >:
> > > > >
> > > > > > Hi Martijn,
> > > > > >
> > > > > > Thanks for driving this effort to clean-up the Flink codebase!
> > > > > >
> > > > > > I like the idea to cleanup Flink codebase to avoid having two
> > Sinks.
> > > On
> > > > > the
> > > > > > other hand, I also thing the concern mentioned by Jing makes
> sense.
> > > In
> > > > > > addition to thinking in terms of the rule proposed in FLIP-197
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > >
> > > > > > (which
> > > > > > seems to focus mostly on the Flink developers' perspective), it
> > might
> > > > be
> > > > > > useful to also think about the story from users' perspective and
> > make
> > > > > sure
> > > > > > their concerns can be addressed.
> > > > > >
> > > > > > Typically, by marking an API as deprecated, we are effectively
> > > telling
> > > > > > users *they should start to migrate to the new API ASAP and we
> > > reserve
> > > > > the
> > > > > > right to remove this API completely in the 1-2 releases*. Then it
> > > might
> > > > > be
> > > > > > reasonable for users to ask questions such as:
> > > > > >
> > > > > > 1) Does SinkV2 public API provides all the functionalities needed
> > to
> > > > > > migrate my existing code from subclassing SinkFunction to
> > subclassing
> > > > > > SinkV2?
> > > > > >
> > > > > > 2) Is the amount of migration work reasonable? If yes, why is a
> > class
> > > > > such
> > > > > > as HBaseSinkFunction in Flink's own codebase still depending on
> > > > > > SinkFunction? Maybe Flink developers should eat their own dog
> food
> > > and
> > > > > > migrate (or deprecate) those classes in the Flink codebase first?
> > > > > >
> > > > > > Based on the discussion in this thread, I am not sure we have
> good
> > > > > answers
> > > > > > to those questions yet. For the 1st question above, the answer is
> > > *no*
> > > > > > because we already know that the SinkV2 is currently not able to
> > > > support
> > > > > > migration for JdbcSink. For the 2nd question above, we know there
> > are
> > > > > many
> > > > > > non-deprecated class in Flink codebase that are still depending
> on
> > > > > SinkV2.
> > > > > > It is probably not nice to users if we tell them to migrate while
> > we
> > > > know
> > > > > > there are existing issues that can prevent them from doing so
> > easily.
> > > > > >
> > > > > > In order to move forward to deprecate SinkV2, I think it will be
> > > super
> > > > > > useful to first migrate all the connectors managed by Flink
> > community
> > > > > > (including all externalized connectors) to use SinkV2. This work
> > > won't
> > > > be
> > > > > > wasted since we need to do this anyway. And it will also give us
> a
> > > > chance
> > > > > > to validate the capabilities of SinkV2 and fix bugs by ourselves
> as
> > > > much
> > > > > as
> > > > > > possible.
> > > > > >
> > > > > > What do you think?
> > > > > >
> > > > > > Best Regards,
> > > > > > Dong
> > > > > >
> > > > > >
> > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > martijnvisser@apache.org
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > While discussing FLIP-281 [1] the discussion also turned to the
> > > > > > > SinkFunction and the SinkV2 API. For a broader discussion I'm
> > > opening
> > > > > up
> > > > > > a
> > > > > > > separate discussion thread.
> > > > > > >
> > > > > > > As Yun Tang has mentioned in that discussion thread, it would
> be
> > a
> > > > good
> > > > > > > time to deprecate the SinkFunction to avoid the need to
> introduce
> > > new
> > > > > > > functions towards (to be) deprecated APIs. Jing rightfully
> > > mentioned
> > > > > that
> > > > > > > it would be confusing to deprecate the SinkFunction if its
> > > successor
> > > > is
> > > > > > not
> > > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > > >
> > > > > > > My proposal would be to promote the SinkV2 API to @public in
> > Flink
> > > > 1.17
> > > > > > and
> > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > >
> > > > > > > The original Sink interface was introduced in Flink 1.12 with
> > > > FLIP-143
> > > > > > [2]
> > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been
> > improved
> > > on
> > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > >
> > > > > > > Looking at the API stability graduation process [5], the fact
> > that
> > > > Sink
> > > > > > V2
> > > > > > > was introduced in Flink 1.15 would mean that we could warrant a
> > > > > promotion
> > > > > > > to @public already (given that there have been two releases
> with
> > > 1.15
> > > > > and
> > > > > > > 1.16 where it was introduced). Combined with the fact that
> SinkV2
> > > has
> > > > > > been
> > > > > > > the result of iteration over the introduction of the original
> > Sink
> > > > API
> > > > > > > since Flink 1.12, I would argue that the promotion is overdue.
> > > > > > >
> > > > > > > If we promote the Sink API to @public, I think we should also
> > > > > immediately
> > > > > > > mark the SinkFunction as @deprecated.
> > > > > > >
> > > > > > > Looking forward to your thoughts.
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Martijn
> > > > > > >
> > > > > > >
> > > > > > > [1]
> > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > [2]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > [3]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > [4]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > [5]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > https://twitter.com/snntrable
> > > > > https://github.com/knaufk
> > > > >
> > > >
> > >
> >
>
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Dong Lin <li...@gmail.com>.
Hi Galen,

I am not directly involved in the creation of FLINK-30238
<https://issues.apache.org/jira/browse/FLINK-30238> and my understanding
could be incorrect. But yes, I get a similar sense as you do. According to
FLINK-30238's description, there is the risk of losing data if we use
stop-with-savepoint with SinkV2.

Regards,
Dong

On Sat, Feb 11, 2023 at 11:11 PM Galen Warren <ga...@cvillewarrens.com>
wrote:

> Regarding this part:
>
> @Dong
> > And Yuxia mentioned earlier in this thread that there are bugs such as
> FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
> makes it hard to use SinkV2 properly in production.
>
> I'll admit that I don't understand everything that's going on here, but the
> sense I get from these is that I risk running into problems
> stopping-with-savepoint and restarting streaming jobs with SinkV2.
>
> Is that correct? It's quite possible I'm not understanding the issue and
> would be great if so!
>
> If that is correct, then I really can't use SinkV2 myself; I need to be
> able to reliably stop/start streaming jobs without losing state.
>
> Thanks,
> Galen
>
>
>
> On Fri, Feb 10, 2023 at 3:29 PM Martijn Visser <ma...@apache.org>
> wrote:
>
> > Hi all,
> >
> > I wanted to get back on a couple of comments and have a proposal at the
> end
> > for the next steps:
> >
> > @Steven Wu
> > If I look back at the original discussion of FLIP-191 and also the thread
> > that you're referring to, it appears from the discussion with Yun Gao
> that
> > the solution was in near-sight, but just not finished. Perhaps it needs
> to
> > be restarted once more so it can be brought to a closure. Also when I
> > looked back at the original introduction of SinkV2, there was FLINK-25726
> > [1] which looks like it was built specifically for Iceberg and Delta
> Sink?
> >
> > @Jing
> > > All potential unstable methods coming from SinkV2 interfaces will be
> kept
> > marked as @internal.
> >
> > This is a situation that I think should be avoided: if it's unstable, it
> > should be marked as @Experimental. Connectors should not need to rely
> > on @Internal interfaces; if they are needed by a connector, a maintainer
> > should create a proposal to expose.
> >
> > On the production readiness of SinkV2:
> >
> > @Dong
> > > And Yuxia mentioned earlier in this thread that there are bugs such as
> > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
> > makes it hard to use SinkV2 properly in production.
> >
> > @Jark
> > > The follow-up step should be trying the new APIs in externalized
> > connectors and giving users the confidence to migrate.
> >
> > I think that is not a fair assessment of the situation. Given that Kafka,
> > Elasticsearch, FileSink and everything that uses the AsyncSink is on
> SinkV2
> > since Flink 1.15 and the popularity of these connectors, they have
> > definitely already proven themselves. That's not to downplay that bugs
> have
> > no impact and shouldn't be resolved, but I don't think it's fair to state
> > that SinkV2 is not stable or usable in production.
> >
> > Coming back to the original topic of this thread, my proposal would be
> that
> > in Flink 1.17:
> >
> > 1. SinkV2 gets promoted from @PublicEvolving to @Public, which also
> means:
> > a. KafkaSink gets promoted from @PublicEvolving to @Public (which solves
> > the weird thing that FlinkKafkaProducer is already marked as deprecated
> > since Flink 1.14 and points to KafkaSink as a successor)
> > b. FileSink gets promoted from @Experimental to @PublicEvolving (note:
> this
> > has been marked as @Experimental since Flink 1.12, so one could argue
> that
> > this needs to be bumped to @Public too. The reason why I didn't do it is
> > because of Yuxia's remark that FileSystemTableSink is still using
> > SinkFunction)
> > c. ElasticsearchSink gets promoted from @PublicEvolving to @Public (has
> > been externalized so needs to happen there)
> > d. The AWS connectors (Firehose, Kinesis) get promoted from
> @PublicEvolving
> > to @Public (has been externalized so needs to happen there)
> > e. It's up to the maintainers for DynamoDB and Opensearch to decide this
> > themselves, given that I believe they've only been introduced recently
> and
> > support only Flink 1.16.
> >
> > 2. For the next Flink release, the community needs to establish what are
> > features that are currently not available in SinkV2 (e.g. FLIP-287), if
> > such a feature should be made available in SinkV2 (in case it's a feature
> > which currently resides in SinkFunction, while from new insights we think
> > it should now be solved differently) and then get these implemented asap.
> >
> > Best regards,
> >
> > Martijn
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-25726
> >
> >
> > On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com> wrote:
> >
> > > > Regarding the discussion on global committer [1] for sinks with
> global
> > > transactions, there is no consensus on solving that problem in SinkV2.
> > Will
> > > it require any breaking change in SinkV2?
> > >
> > > Just want to reiterate my earlier question. What is the migration path
> > for
> > > the Iceberg sink?
> > >
> > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > >
> > > On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:
> > >
> > > > Hi Konstantin,
> > > >
> > > > I totally agree with making SinkV2 @Public. I just have concerns
> about
> > > > deprecating SinkFunction at this point. Dong Lin has raised the
> blocker
> > > > issues of migration multiple times in this thread which I think we
> > should
> > > > address first. I don't know why we rush to deprecate SinkFunction
> while
> > > > SourceFunction is still public, but the new Source API is much more
> > > stable
> > > > and production-ready than SinkV2.
> > > >
> > > > Iceberg community raised concerns[1] about the workability and
> > stability
> > > of
> > > > Flink connector APIs.
> > > >
> > > > We are hoping any issues with the APIs for Iceberg connector will
> > surface
> > > > > sooner and get more attention from the Flink community when the
> > > connector
> > > > > is within Flink umbrella rather than in Iceberg repo.
> > > >
> > > >
> > > > The connector externalizing is a big step for building a mechanism to
> > > > guarantee Flink connector API is stable and workable. The follow-up
> > step
> > > > should be trying the new APIs in externalized connectors and giving
> > users
> > > > the confidence to migrate.
> > > >
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > [1]:
> https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> > > >
> > > > On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn>
> > wrote:
> > > >
> > > > > Hi Konstantin,
> > > > > Just FYI, the FileSystemTableSink are still using SinkFunction.
> > > > >
> > > > > Best regards,
> > > > > Yuxia
> > > > >
> > > > > ----- 原始邮件 -----
> > > > > 发件人: "Dong Lin" <li...@gmail.com>
> > > > > 收件人: "dev" <de...@flink.apache.org>
> > > > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <my...@live.com>
> > > > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate
> > SinkFunction
> > > > >
> > > > > Hi Konstantin,
> > > > >
> > > > > Thanks for the reply. Please see my comments inline.
> > > > >
> > > > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <knaufk@apache.org
> >
> > > > wrote:
> > > > >
> > > > > > Hi Steven,
> > > > > >
> > > > > > Sink is already deprecated. It was deprecated at the moment where
> > we
> > > > > > introduced SinkV2.
> > > > > >
> > > > > > Hi Jark, Hi Dong,
> > > > > >
> > > > > > My understanding is the SinkV2 is a workable interface. The most
> > > > > important
> > > > > > connectors have been migrated (Kafka, Filesystem) and more
> > connectors
> > > > > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make
> > > > SinkV2
> > > > > > public, it does not need to have all possible functionality.
> Public
> > > > APIs
> > > > > > can be extended. That's what we do all the time. There will also
> > > always
> > > > > be
> > > > > > bugs. So, these points can not be categorical blockers to promote
> > the
> > > > > API.
> > > > > >
> > > > >
> > > > > Right. I believe we all agree that we make SinkV2 @PublicEvolving.
> > The
> > > > > concern here is whether we can mark SinkFunction as deprecated at
> > this
> > > > > point.
> > > > >
> > > > >
> > > > > >
> > > > > > What are the specific issues/tickets that are blocking us? Can we
> > in
> > > > your
> > > > > >
> > > > >
> > > > > For example, Lijie mentioned earlier in this thread that according
> to
> > > > > FLIP-287,
> > > > > currently the Sink.InitContext still lacks some necessary
> information
> > > to
> > > > > migrate existing connectors to new sinks. This could be a blocker
> > issue
> > > > > since this is related to the SinkV2 API design.
> > > > >
> > > > > And Yuxia mentioned earlier in this thread that there are bugs such
> > as
> > > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238>
> and
> > > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> ,
> > > which
> > > > > makes it hard to use SinkV2 properly in production.  It seems that
> > > these
> > > > > two bugs are created months ago and are still unresolved or even
> > > > > unassigned. This looks like a clear signal that SinkV2 is not being
> > > > > actively maintained and used in production.
> > > > >
> > > > >
> > > > > > opinion only deprecate it when every single connector in Apache
> > Flink
> > > > is
> > > > > > migrated already?
> > > > > >
> > > > >
> > > > > Technically this is not a hard requirement. But I would suggest
> that
> > we
> > > > > should migrate all existing connectors so that we eat our own
> dogfood
> > > and
> > > > > prove to users that SinkV2 is ready for use in production.
> > > > >
> > > > >
> > > > > > In my opinion it is the time to ask users to the migrate their
> > > > > connectors.
> > > > > > More importantly, @Deprecated would signal users not to build new
> > > > > > connectors on SinkFunction. I would arque its also very
> misleading
> > to
> > > > > users
> > > > > > to not @Deprecated SinkFunction given that is clearly will be
> > > > deprecated.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > > >
> > > > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <
> > imjark@gmail.com
> > > >:
> > > > > >
> > > > > > > I agree with Dong Lin.
> > > > > > >
> > > > > > > Oracle explains how to use Deprecate API [1]:
> > > > > > >
> > > > > > > You are strongly recommended to use the Javadoc @deprecated tag
> > > with
> > > > > > > > appropriate comments explaining how to use the new API. This
> > > > ensures
> > > > > > > > developers will *have a workable migration path from the old
> > API
> > > to
> > > > > the
> > > > > > > > new API*.
> > > > > > >
> > > > > > >
> > > > > > > From a user's perspective, the workable migration path is very
> > > > > important.
> > > > > > > Otherwise, it blurs the semantics of API deprecation. The Flink
> > > API's
> > > > > > > compatibility and stability issues in the past left a bad
> > > impression
> > > > on
> > > > > > the
> > > > > > > downstream projects. We should be careful when changing and
> > > > deprecating
> > > > > > > APIs, especially when there are known migration gaps. I think
> > it's
> > > a
> > > > > good
> > > > > > > idea to migrate Flink-owned connectors before marking old API
> > > > > deprecated.
> > > > > > > This ensures downstream projects can migrate to new APIs
> > smoothly.
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > > [1]:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > > > >
> > > > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Regarding the discussion on global committer [1] for sinks
> with
> > > > > global
> > > > > > > > transactions, there is no consensus on solving that problem
> in
> > > > > SinkV2.
> > > > > > > Will
> > > > > > > > it require any breaking change in SinkV2?
> > > > > > > >
> > > > > > > > Also will SinkV1 be deprecated too? or it should happen
> > sometime
> > > > > after
> > > > > > > > SinkFunction deprecation?
> > > > > > > >
> > > > > > > > [1]
> > > > https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > > >
> > > > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <lindong28@gmail.com
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Konstantin,
> > > > > > > > >
> > > > > > > > > Thanks for the comment! Please see my comment inline.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Dong
> > > > > > > > >
> > > > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > > > knaufk@apache.org
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > sorry for joining the discussion late.
> > > > > > > > > >
> > > > > > > > > > 1) Is there an option to deprecate SinkFunction in Flink
> > 1.17
> > > > > while
> > > > > > > > > leaving
> > > > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make
> > > > SinkV2
> > > > > > > > @Public
> > > > > > > > > in
> > > > > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving
> are
> > > > > intended
> > > > > > > for
> > > > > > > > > > public use. So, I don't see it as a blocker for
> deprecating
> > > > > > > > SinkFunction
> > > > > > > > > > that we have to make SinkV2 @PublicEvovling. For
> reference
> > > this
> > > > > is
> > > > > > > the
> > > > > > > > > > description of @PublicEvovling:
> > > > > > > > > >
> > > > > > > > > > /**
> > > > > > > > > >  * Annotation to mark classes and methods for public use,
> > but
> > > > > with
> > > > > > > > > > evolving interfaces.
> > > > > > > > > >  *
> > > > > > > > > >  * <p>Classes and methods with this annotation are
> intended
> > > for
> > > > > > > public
> > > > > > > > > > use and have stable behavior.
> > > > > > > > > >  * However, their interfaces and signatures are not
> > > considered
> > > > to
> > > > > > be
> > > > > > > > > > stable and might be changed
> > > > > > > > > >  * across versions.
> > > > > > > > > >  *
> > > > > > > > > >  * <p>This annotation also excludes methods and classes
> > with
> > > > > > evolving
> > > > > > > > > > interfaces / signatures within
> > > > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > > > >  */
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Marking SinkFunction @Deprecated would already single
> > > everyone
> > > > to
> > > > > > > move
> > > > > > > > to
> > > > > > > > > > SinkV2, which we as a community, I believe, have a strong
> > > > > interest
> > > > > > > in.
> > > > > > > > > Its
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yes, I also believe we all have this strong interest. I
> just
> > > hope
> > > > > > that
> > > > > > > > this
> > > > > > > > > can be done in the best possible way that does not confuse
> > > users.
> > > > > > > > >
> > > > > > > > > I probably still have the same concern regarding its impact
> > on
> > > > > users:
> > > > > > > if
> > > > > > > > we
> > > > > > > > > mark an API as deprecated, it effectively means the users
> of
> > > this
> > > > > API
> > > > > > > > > should start to migrate to another API (e.g. SinkV2) and we
> > > might
> > > > > > > remove
> > > > > > > > > this API in the future. However, given that we know there
> are
> > > > known
> > > > > > > > > problems preventing users from doing so, it seems that we
> are
> > > not
> > > > > > ready
> > > > > > > > to
> > > > > > > > > send this message to users right.
> > > > > > > > >
> > > > > > > > > If I understand correctly, I guess you are suggesting that
> by
> > > > > marking
> > > > > > > > > SinkFunction as deprecated, we can put higher pressure on
> > Flink
> > > > > > > > > contributors to update the existing Flink codebase to
> improve
> > > and
> > > > > use
> > > > > > > > > SinkV2.
> > > > > > > > >
> > > > > > > > > I am not sure this is the right way to use @deprecated,
> which
> > > > has a
> > > > > > > > > particular meaning for its users rather than contributors.
> > And
> > > I
> > > > am
> > > > > > > also
> > > > > > > > > not sure we can even pressure contributors of an
> open-source
> > > > > project
> > > > > > > into
> > > > > > > > > developing a feature (e.g. migrate all existing
> SinkFunction
> > > > > > subclasses
> > > > > > > > to
> > > > > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > > > > interest/time
> > > > > > > > to
> > > > > > > > > work on the feature, or talk to other contributors whether
> > they
> > > > are
> > > > > > > > willing
> > > > > > > > > to collaborate/work on this, rather than pressuring other
> > > > > > contributors
> > > > > > > > into
> > > > > > > > > working on this.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > almost comical how long the transition from
> > > > > > > SourceFurnction/SinkFunction
> > > > > > > > to
> > > > > > > > > > Source/Sink takes us. At the same time, we leave
> ourselves
> > > the
> > > > > > option
> > > > > > > > to
> > > > > > > > > to
> > > > > > > > > > make small changes to SinkV2 if any problems arise during
> > the
> > > > > > > migration
> > > > > > > > > of
> > > > > > > > > > these connector.
> > > > > > > > > >
> > > > > > > > > > I think, we have a bit of a chicken/egg problem here. The
> > > > > pressure
> > > > > > > for
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Similar to the reason described above, I am not sure we
> have
> > a
> > > > > > > > chicken/egg
> > > > > > > > > problem here. The issue here is that SinkV2 is not ready
> and
> > we
> > > > > have
> > > > > > a
> > > > > > > > lot
> > > > > > > > > of existing SinkFunction that is not migrated by ourselves.
> > We
> > > > > (Flink
> > > > > > > > > contributors) probably do not need to mark SinkFunction as
> > > > > deprecated
> > > > > > > in
> > > > > > > > > order to address these issues in our own codebase.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > users and contributors is not high enough to move away from
> > > > > > > SinkFunction
> > > > > > > > as
> > > > > > > > > > long as its not deprecated, but at the same time we need
> > > people
> > > > > to
> > > > > > > > > migrate
> > > > > > > > > > their connectors to see if there are any gaps in SinkV2.
> I
> > > > > believe,
> > > > > > > the
> > > > > > > > > > combination proposed above could bridge this problem.
> > > > > > > > > >
> > > > > > > > > > 2) I don't understand the argument of waiting until some
> of
> > > the
> > > > > > > > > > implementations are @Public. How can we make the
> > > > implementations
> > > > > of
> > > > > > > the
> > > > > > > > > > SinkV2 API @Public without making SinkV2 @Public? All
> > public
> > > > > > methods
> > > > > > > of
> > > > > > > > > > SinkV2 are part of every implementation. So to me it
> > actually
> > > > > seems
> > > > > > > to
> > > > > > > > be
> > > > > > > > > > opposite: in order to make any of the implementation
> > @Public
> > > we
> > > > > > first
> > > > > > > > > need
> > > > > > > > > > to make the API @Public.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yeah I also agree with you.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > >
> > > > > > > > > > Konstantin
> > > > > > > > > >
> > > > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > > > > lindong28@gmail.com
> > > > > > > > > >:
> > > > > > > > > >
> > > > > > > > > > > Hi Martijn,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for driving this effort to clean-up the Flink
> > > > codebase!
> > > > > > > > > > >
> > > > > > > > > > > I like the idea to cleanup Flink codebase to avoid
> having
> > > two
> > > > > > > Sinks.
> > > > > > > > On
> > > > > > > > > > the
> > > > > > > > > > > other hand, I also thing the concern mentioned by Jing
> > > makes
> > > > > > sense.
> > > > > > > > In
> > > > > > > > > > > addition to thinking in terms of the rule proposed in
> > > > FLIP-197
> > > > > > > > > > > <
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > >
> > > > > > > > > > > (which
> > > > > > > > > > > seems to focus mostly on the Flink developers'
> > > perspective),
> > > > it
> > > > > > > might
> > > > > > > > > be
> > > > > > > > > > > useful to also think about the story from users'
> > > perspective
> > > > > and
> > > > > > > make
> > > > > > > > > > sure
> > > > > > > > > > > their concerns can be addressed.
> > > > > > > > > > >
> > > > > > > > > > > Typically, by marking an API as deprecated, we are
> > > > effectively
> > > > > > > > telling
> > > > > > > > > > > users *they should start to migrate to the new API ASAP
> > and
> > > > we
> > > > > > > > reserve
> > > > > > > > > > the
> > > > > > > > > > > right to remove this API completely in the 1-2
> releases*.
> > > > Then
> > > > > it
> > > > > > > > might
> > > > > > > > > > be
> > > > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > > > >
> > > > > > > > > > > 1) Does SinkV2 public API provides all the
> > functionalities
> > > > > needed
> > > > > > > to
> > > > > > > > > > > migrate my existing code from subclassing SinkFunction
> to
> > > > > > > subclassing
> > > > > > > > > > > SinkV2?
> > > > > > > > > > >
> > > > > > > > > > > 2) Is the amount of migration work reasonable? If yes,
> > why
> > > > is a
> > > > > > > class
> > > > > > > > > > such
> > > > > > > > > > > as HBaseSinkFunction in Flink's own codebase still
> > > depending
> > > > on
> > > > > > > > > > > SinkFunction? Maybe Flink developers should eat their
> own
> > > dog
> > > > > > food
> > > > > > > > and
> > > > > > > > > > > migrate (or deprecate) those classes in the Flink
> > codebase
> > > > > first?
> > > > > > > > > > >
> > > > > > > > > > > Based on the discussion in this thread, I am not sure
> we
> > > have
> > > > > > good
> > > > > > > > > > answers
> > > > > > > > > > > to those questions yet. For the 1st question above, the
> > > > answer
> > > > > is
> > > > > > > > *no*
> > > > > > > > > > > because we already know that the SinkV2 is currently
> not
> > > able
> > > > > to
> > > > > > > > > support
> > > > > > > > > > > migration for JdbcSink. For the 2nd question above, we
> > know
> > > > > there
> > > > > > > are
> > > > > > > > > > many
> > > > > > > > > > > non-deprecated class in Flink codebase that are still
> > > > depending
> > > > > > on
> > > > > > > > > > SinkV2.
> > > > > > > > > > > It is probably not nice to users if we tell them to
> > migrate
> > > > > while
> > > > > > > we
> > > > > > > > > know
> > > > > > > > > > > there are existing issues that can prevent them from
> > doing
> > > so
> > > > > > > easily.
> > > > > > > > > > >
> > > > > > > > > > > In order to move forward to deprecate SinkV2, I think
> it
> > > will
> > > > > be
> > > > > > > > super
> > > > > > > > > > > useful to first migrate all the connectors managed by
> > Flink
> > > > > > > community
> > > > > > > > > > > (including all externalized connectors) to use SinkV2.
> > This
> > > > > work
> > > > > > > > won't
> > > > > > > > > be
> > > > > > > > > > > wasted since we need to do this anyway. And it will
> also
> > > give
> > > > > us
> > > > > > a
> > > > > > > > > chance
> > > > > > > > > > > to validate the capabilities of SinkV2 and fix bugs by
> > > > > ourselves
> > > > > > as
> > > > > > > > > much
> > > > > > > > > > as
> > > > > > > > > > > possible.
> > > > > > > > > > >
> > > > > > > > > > > What do you think?
> > > > > > > > > > >
> > > > > > > > > > > Best Regards,
> > > > > > > > > > > Dong
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > > > > martijnvisser@apache.org
> > > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > While discussing FLIP-281 [1] the discussion also
> > turned
> > > to
> > > > > the
> > > > > > > > > > > > SinkFunction and the SinkV2 API. For a broader
> > discussion
> > > > I'm
> > > > > > > > opening
> > > > > > > > > > up
> > > > > > > > > > > a
> > > > > > > > > > > > separate discussion thread.
> > > > > > > > > > > >
> > > > > > > > > > > > As Yun Tang has mentioned in that discussion thread,
> it
> > > > would
> > > > > > be
> > > > > > > a
> > > > > > > > > good
> > > > > > > > > > > > time to deprecate the SinkFunction to avoid the need
> to
> > > > > > introduce
> > > > > > > > new
> > > > > > > > > > > > functions towards (to be) deprecated APIs. Jing
> > > rightfully
> > > > > > > > mentioned
> > > > > > > > > > that
> > > > > > > > > > > > it would be confusing to deprecate the SinkFunction
> if
> > > its
> > > > > > > > successor
> > > > > > > > > is
> > > > > > > > > > > not
> > > > > > > > > > > > yet marked as @Public (it's currently
> @PublicEvolving).
> > > > > > > > > > > >
> > > > > > > > > > > > My proposal would be to promote the SinkV2 API to
> > @public
> > > > in
> > > > > > > Flink
> > > > > > > > > 1.17
> > > > > > > > > > > and
> > > > > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > > > > >
> > > > > > > > > > > > The original Sink interface was introduced in Flink
> > 1.12
> > > > with
> > > > > > > > > FLIP-143
> > > > > > > > > > > [2]
> > > > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has
> > been
> > > > > > > improved
> > > > > > > > on
> > > > > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > > > > >
> > > > > > > > > > > > Looking at the API stability graduation process [5],
> > the
> > > > fact
> > > > > > > that
> > > > > > > > > Sink
> > > > > > > > > > > V2
> > > > > > > > > > > > was introduced in Flink 1.15 would mean that we could
> > > > > warrant a
> > > > > > > > > > promotion
> > > > > > > > > > > > to @public already (given that there have been two
> > > releases
> > > > > > with
> > > > > > > > 1.15
> > > > > > > > > > and
> > > > > > > > > > > > 1.16 where it was introduced). Combined with the fact
> > > that
> > > > > > SinkV2
> > > > > > > > has
> > > > > > > > > > > been
> > > > > > > > > > > > the result of iteration over the introduction of the
> > > > original
> > > > > > > Sink
> > > > > > > > > API
> > > > > > > > > > > > since Flink 1.12, I would argue that the promotion is
> > > > > overdue.
> > > > > > > > > > > >
> > > > > > > > > > > > If we promote the Sink API to @public, I think we
> > should
> > > > also
> > > > > > > > > > immediately
> > > > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > > > >
> > > > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > > > >
> > > > > > > > > > > > Best regards,
> > > > > > > > > > > >
> > > > > > > > > > > > Martijn
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > >
> > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > > > [2]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > > > [3]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > > > [4]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > > > [5]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > > https://github.com/knaufk
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > https://twitter.com/snntrable
> > > > > > https://github.com/knaufk
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Galen Warren <ga...@cvillewarrens.com>.
Regarding this part:

@Dong
> And Yuxia mentioned earlier in this thread that there are bugs such as
FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
makes it hard to use SinkV2 properly in production.

I'll admit that I don't understand everything that's going on here, but the
sense I get from these is that I risk running into problems
stopping-with-savepoint and restarting streaming jobs with SinkV2.

Is that correct? It's quite possible I'm not understanding the issue and
would be great if so!

If that is correct, then I really can't use SinkV2 myself; I need to be
able to reliably stop/start streaming jobs without losing state.

Thanks,
Galen



On Fri, Feb 10, 2023 at 3:29 PM Martijn Visser <ma...@apache.org>
wrote:

> Hi all,
>
> I wanted to get back on a couple of comments and have a proposal at the end
> for the next steps:
>
> @Steven Wu
> If I look back at the original discussion of FLIP-191 and also the thread
> that you're referring to, it appears from the discussion with Yun Gao that
> the solution was in near-sight, but just not finished. Perhaps it needs to
> be restarted once more so it can be brought to a closure. Also when I
> looked back at the original introduction of SinkV2, there was FLINK-25726
> [1] which looks like it was built specifically for Iceberg and Delta Sink?
>
> @Jing
> > All potential unstable methods coming from SinkV2 interfaces will be kept
> marked as @internal.
>
> This is a situation that I think should be avoided: if it's unstable, it
> should be marked as @Experimental. Connectors should not need to rely
> on @Internal interfaces; if they are needed by a connector, a maintainer
> should create a proposal to expose.
>
> On the production readiness of SinkV2:
>
> @Dong
> > And Yuxia mentioned earlier in this thread that there are bugs such as
> FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
> makes it hard to use SinkV2 properly in production.
>
> @Jark
> > The follow-up step should be trying the new APIs in externalized
> connectors and giving users the confidence to migrate.
>
> I think that is not a fair assessment of the situation. Given that Kafka,
> Elasticsearch, FileSink and everything that uses the AsyncSink is on SinkV2
> since Flink 1.15 and the popularity of these connectors, they have
> definitely already proven themselves. That's not to downplay that bugs have
> no impact and shouldn't be resolved, but I don't think it's fair to state
> that SinkV2 is not stable or usable in production.
>
> Coming back to the original topic of this thread, my proposal would be that
> in Flink 1.17:
>
> 1. SinkV2 gets promoted from @PublicEvolving to @Public, which also means:
> a. KafkaSink gets promoted from @PublicEvolving to @Public (which solves
> the weird thing that FlinkKafkaProducer is already marked as deprecated
> since Flink 1.14 and points to KafkaSink as a successor)
> b. FileSink gets promoted from @Experimental to @PublicEvolving (note: this
> has been marked as @Experimental since Flink 1.12, so one could argue that
> this needs to be bumped to @Public too. The reason why I didn't do it is
> because of Yuxia's remark that FileSystemTableSink is still using
> SinkFunction)
> c. ElasticsearchSink gets promoted from @PublicEvolving to @Public (has
> been externalized so needs to happen there)
> d. The AWS connectors (Firehose, Kinesis) get promoted from @PublicEvolving
> to @Public (has been externalized so needs to happen there)
> e. It's up to the maintainers for DynamoDB and Opensearch to decide this
> themselves, given that I believe they've only been introduced recently and
> support only Flink 1.16.
>
> 2. For the next Flink release, the community needs to establish what are
> features that are currently not available in SinkV2 (e.g. FLIP-287), if
> such a feature should be made available in SinkV2 (in case it's a feature
> which currently resides in SinkFunction, while from new insights we think
> it should now be solved differently) and then get these implemented asap.
>
> Best regards,
>
> Martijn
>
> [1] https://issues.apache.org/jira/browse/FLINK-25726
>
>
> On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com> wrote:
>
> > > Regarding the discussion on global committer [1] for sinks with global
> > transactions, there is no consensus on solving that problem in SinkV2.
> Will
> > it require any breaking change in SinkV2?
> >
> > Just want to reiterate my earlier question. What is the migration path
> for
> > the Iceberg sink?
> >
> > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> >
> > On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:
> >
> > > Hi Konstantin,
> > >
> > > I totally agree with making SinkV2 @Public. I just have concerns about
> > > deprecating SinkFunction at this point. Dong Lin has raised the blocker
> > > issues of migration multiple times in this thread which I think we
> should
> > > address first. I don't know why we rush to deprecate SinkFunction while
> > > SourceFunction is still public, but the new Source API is much more
> > stable
> > > and production-ready than SinkV2.
> > >
> > > Iceberg community raised concerns[1] about the workability and
> stability
> > of
> > > Flink connector APIs.
> > >
> > > We are hoping any issues with the APIs for Iceberg connector will
> surface
> > > > sooner and get more attention from the Flink community when the
> > connector
> > > > is within Flink umbrella rather than in Iceberg repo.
> > >
> > >
> > > The connector externalizing is a big step for building a mechanism to
> > > guarantee Flink connector API is stable and workable. The follow-up
> step
> > > should be trying the new APIs in externalized connectors and giving
> users
> > > the confidence to migrate.
> > >
> > >
> > > Best,
> > > Jark
> > >
> > > [1]: https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> > >
> > > On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn>
> wrote:
> > >
> > > > Hi Konstantin,
> > > > Just FYI, the FileSystemTableSink are still using SinkFunction.
> > > >
> > > > Best regards,
> > > > Yuxia
> > > >
> > > > ----- 原始邮件 -----
> > > > 发件人: "Dong Lin" <li...@gmail.com>
> > > > 收件人: "dev" <de...@flink.apache.org>
> > > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <my...@live.com>
> > > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate
> SinkFunction
> > > >
> > > > Hi Konstantin,
> > > >
> > > > Thanks for the reply. Please see my comments inline.
> > > >
> > > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <kn...@apache.org>
> > > wrote:
> > > >
> > > > > Hi Steven,
> > > > >
> > > > > Sink is already deprecated. It was deprecated at the moment where
> we
> > > > > introduced SinkV2.
> > > > >
> > > > > Hi Jark, Hi Dong,
> > > > >
> > > > > My understanding is the SinkV2 is a workable interface. The most
> > > > important
> > > > > connectors have been migrated (Kafka, Filesystem) and more
> connectors
> > > > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make
> > > SinkV2
> > > > > public, it does not need to have all possible functionality. Public
> > > APIs
> > > > > can be extended. That's what we do all the time. There will also
> > always
> > > > be
> > > > > bugs. So, these points can not be categorical blockers to promote
> the
> > > > API.
> > > > >
> > > >
> > > > Right. I believe we all agree that we make SinkV2 @PublicEvolving.
> The
> > > > concern here is whether we can mark SinkFunction as deprecated at
> this
> > > > point.
> > > >
> > > >
> > > > >
> > > > > What are the specific issues/tickets that are blocking us? Can we
> in
> > > your
> > > > >
> > > >
> > > > For example, Lijie mentioned earlier in this thread that according to
> > > > FLIP-287,
> > > > currently the Sink.InitContext still lacks some necessary information
> > to
> > > > migrate existing connectors to new sinks. This could be a blocker
> issue
> > > > since this is related to the SinkV2 API design.
> > > >
> > > > And Yuxia mentioned earlier in this thread that there are bugs such
> as
> > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> ,
> > which
> > > > makes it hard to use SinkV2 properly in production.  It seems that
> > these
> > > > two bugs are created months ago and are still unresolved or even
> > > > unassigned. This looks like a clear signal that SinkV2 is not being
> > > > actively maintained and used in production.
> > > >
> > > >
> > > > > opinion only deprecate it when every single connector in Apache
> Flink
> > > is
> > > > > migrated already?
> > > > >
> > > >
> > > > Technically this is not a hard requirement. But I would suggest that
> we
> > > > should migrate all existing connectors so that we eat our own dogfood
> > and
> > > > prove to users that SinkV2 is ready for use in production.
> > > >
> > > >
> > > > > In my opinion it is the time to ask users to the migrate their
> > > > connectors.
> > > > > More importantly, @Deprecated would signal users not to build new
> > > > > connectors on SinkFunction. I would arque its also very misleading
> to
> > > > users
> > > > > to not @Deprecated SinkFunction given that is clearly will be
> > > deprecated.
> > > >
> > > >
> > > >
> > > >
> > > > > Cheers,
> > > > >
> > > > > Konstantin
> > > > >
> > > > >
> > > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <
> imjark@gmail.com
> > >:
> > > > >
> > > > > > I agree with Dong Lin.
> > > > > >
> > > > > > Oracle explains how to use Deprecate API [1]:
> > > > > >
> > > > > > You are strongly recommended to use the Javadoc @deprecated tag
> > with
> > > > > > > appropriate comments explaining how to use the new API. This
> > > ensures
> > > > > > > developers will *have a workable migration path from the old
> API
> > to
> > > > the
> > > > > > > new API*.
> > > > > >
> > > > > >
> > > > > > From a user's perspective, the workable migration path is very
> > > > important.
> > > > > > Otherwise, it blurs the semantics of API deprecation. The Flink
> > API's
> > > > > > compatibility and stability issues in the past left a bad
> > impression
> > > on
> > > > > the
> > > > > > downstream projects. We should be careful when changing and
> > > deprecating
> > > > > > APIs, especially when there are known migration gaps. I think
> it's
> > a
> > > > good
> > > > > > idea to migrate Flink-owned connectors before marking old API
> > > > deprecated.
> > > > > > This ensures downstream projects can migrate to new APIs
> smoothly.
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > [1]:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > > >
> > > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Regarding the discussion on global committer [1] for sinks with
> > > > global
> > > > > > > transactions, there is no consensus on solving that problem in
> > > > SinkV2.
> > > > > > Will
> > > > > > > it require any breaking change in SinkV2?
> > > > > > >
> > > > > > > Also will SinkV1 be deprecated too? or it should happen
> sometime
> > > > after
> > > > > > > SinkFunction deprecation?
> > > > > > >
> > > > > > > [1]
> > > https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > >
> > > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Hi Konstantin,
> > > > > > > >
> > > > > > > > Thanks for the comment! Please see my comment inline.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Dong
> > > > > > > >
> > > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > > knaufk@apache.org
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > sorry for joining the discussion late.
> > > > > > > > >
> > > > > > > > > 1) Is there an option to deprecate SinkFunction in Flink
> 1.17
> > > > while
> > > > > > > > leaving
> > > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make
> > > SinkV2
> > > > > > > @Public
> > > > > > > > in
> > > > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are
> > > > intended
> > > > > > for
> > > > > > > > > public use. So, I don't see it as a blocker for deprecating
> > > > > > > SinkFunction
> > > > > > > > > that we have to make SinkV2 @PublicEvovling. For reference
> > this
> > > > is
> > > > > > the
> > > > > > > > > description of @PublicEvovling:
> > > > > > > > >
> > > > > > > > > /**
> > > > > > > > >  * Annotation to mark classes and methods for public use,
> but
> > > > with
> > > > > > > > > evolving interfaces.
> > > > > > > > >  *
> > > > > > > > >  * <p>Classes and methods with this annotation are intended
> > for
> > > > > > public
> > > > > > > > > use and have stable behavior.
> > > > > > > > >  * However, their interfaces and signatures are not
> > considered
> > > to
> > > > > be
> > > > > > > > > stable and might be changed
> > > > > > > > >  * across versions.
> > > > > > > > >  *
> > > > > > > > >  * <p>This annotation also excludes methods and classes
> with
> > > > > evolving
> > > > > > > > > interfaces / signatures within
> > > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > > >  */
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Marking SinkFunction @Deprecated would already single
> > everyone
> > > to
> > > > > > move
> > > > > > > to
> > > > > > > > > SinkV2, which we as a community, I believe, have a strong
> > > > interest
> > > > > > in.
> > > > > > > > Its
> > > > > > > > >
> > > > > > > >
> > > > > > > > Yes, I also believe we all have this strong interest. I just
> > hope
> > > > > that
> > > > > > > this
> > > > > > > > can be done in the best possible way that does not confuse
> > users.
> > > > > > > >
> > > > > > > > I probably still have the same concern regarding its impact
> on
> > > > users:
> > > > > > if
> > > > > > > we
> > > > > > > > mark an API as deprecated, it effectively means the users of
> > this
> > > > API
> > > > > > > > should start to migrate to another API (e.g. SinkV2) and we
> > might
> > > > > > remove
> > > > > > > > this API in the future. However, given that we know there are
> > > known
> > > > > > > > problems preventing users from doing so, it seems that we are
> > not
> > > > > ready
> > > > > > > to
> > > > > > > > send this message to users right.
> > > > > > > >
> > > > > > > > If I understand correctly, I guess you are suggesting that by
> > > > marking
> > > > > > > > SinkFunction as deprecated, we can put higher pressure on
> Flink
> > > > > > > > contributors to update the existing Flink codebase to improve
> > and
> > > > use
> > > > > > > > SinkV2.
> > > > > > > >
> > > > > > > > I am not sure this is the right way to use @deprecated, which
> > > has a
> > > > > > > > particular meaning for its users rather than contributors.
> And
> > I
> > > am
> > > > > > also
> > > > > > > > not sure we can even pressure contributors of an open-source
> > > > project
> > > > > > into
> > > > > > > > developing a feature (e.g. migrate all existing SinkFunction
> > > > > subclasses
> > > > > > > to
> > > > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > > > interest/time
> > > > > > > to
> > > > > > > > work on the feature, or talk to other contributors whether
> they
> > > are
> > > > > > > willing
> > > > > > > > to collaborate/work on this, rather than pressuring other
> > > > > contributors
> > > > > > > into
> > > > > > > > working on this.
> > > > > > > >
> > > > > > > >
> > > > > > > > almost comical how long the transition from
> > > > > > SourceFurnction/SinkFunction
> > > > > > > to
> > > > > > > > > Source/Sink takes us. At the same time, we leave ourselves
> > the
> > > > > option
> > > > > > > to
> > > > > > > > to
> > > > > > > > > make small changes to SinkV2 if any problems arise during
> the
> > > > > > migration
> > > > > > > > of
> > > > > > > > > these connector.
> > > > > > > > >
> > > > > > > > > I think, we have a bit of a chicken/egg problem here. The
> > > > pressure
> > > > > > for
> > > > > > > > >
> > > > > > > >
> > > > > > > > Similar to the reason described above, I am not sure we have
> a
> > > > > > > chicken/egg
> > > > > > > > problem here. The issue here is that SinkV2 is not ready and
> we
> > > > have
> > > > > a
> > > > > > > lot
> > > > > > > > of existing SinkFunction that is not migrated by ourselves.
> We
> > > > (Flink
> > > > > > > > contributors) probably do not need to mark SinkFunction as
> > > > deprecated
> > > > > > in
> > > > > > > > order to address these issues in our own codebase.
> > > > > > > >
> > > > > > > >
> > > > > > > > users and contributors is not high enough to move away from
> > > > > > SinkFunction
> > > > > > > as
> > > > > > > > > long as its not deprecated, but at the same time we need
> > people
> > > > to
> > > > > > > > migrate
> > > > > > > > > their connectors to see if there are any gaps in SinkV2. I
> > > > believe,
> > > > > > the
> > > > > > > > > combination proposed above could bridge this problem.
> > > > > > > > >
> > > > > > > > > 2) I don't understand the argument of waiting until some of
> > the
> > > > > > > > > implementations are @Public. How can we make the
> > > implementations
> > > > of
> > > > > > the
> > > > > > > > > SinkV2 API @Public without making SinkV2 @Public? All
> public
> > > > > methods
> > > > > > of
> > > > > > > > > SinkV2 are part of every implementation. So to me it
> actually
> > > > seems
> > > > > > to
> > > > > > > be
> > > > > > > > > opposite: in order to make any of the implementation
> @Public
> > we
> > > > > first
> > > > > > > > need
> > > > > > > > > to make the API @Public.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Yeah I also agree with you.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > >
> > > > > > > > > Konstantin
> > > > > > > > >
> > > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > > > lindong28@gmail.com
> > > > > > > > >:
> > > > > > > > >
> > > > > > > > > > Hi Martijn,
> > > > > > > > > >
> > > > > > > > > > Thanks for driving this effort to clean-up the Flink
> > > codebase!
> > > > > > > > > >
> > > > > > > > > > I like the idea to cleanup Flink codebase to avoid having
> > two
> > > > > > Sinks.
> > > > > > > On
> > > > > > > > > the
> > > > > > > > > > other hand, I also thing the concern mentioned by Jing
> > makes
> > > > > sense.
> > > > > > > In
> > > > > > > > > > addition to thinking in terms of the rule proposed in
> > > FLIP-197
> > > > > > > > > > <
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > >
> > > > > > > > > > (which
> > > > > > > > > > seems to focus mostly on the Flink developers'
> > perspective),
> > > it
> > > > > > might
> > > > > > > > be
> > > > > > > > > > useful to also think about the story from users'
> > perspective
> > > > and
> > > > > > make
> > > > > > > > > sure
> > > > > > > > > > their concerns can be addressed.
> > > > > > > > > >
> > > > > > > > > > Typically, by marking an API as deprecated, we are
> > > effectively
> > > > > > > telling
> > > > > > > > > > users *they should start to migrate to the new API ASAP
> and
> > > we
> > > > > > > reserve
> > > > > > > > > the
> > > > > > > > > > right to remove this API completely in the 1-2 releases*.
> > > Then
> > > > it
> > > > > > > might
> > > > > > > > > be
> > > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > > >
> > > > > > > > > > 1) Does SinkV2 public API provides all the
> functionalities
> > > > needed
> > > > > > to
> > > > > > > > > > migrate my existing code from subclassing SinkFunction to
> > > > > > subclassing
> > > > > > > > > > SinkV2?
> > > > > > > > > >
> > > > > > > > > > 2) Is the amount of migration work reasonable? If yes,
> why
> > > is a
> > > > > > class
> > > > > > > > > such
> > > > > > > > > > as HBaseSinkFunction in Flink's own codebase still
> > depending
> > > on
> > > > > > > > > > SinkFunction? Maybe Flink developers should eat their own
> > dog
> > > > > food
> > > > > > > and
> > > > > > > > > > migrate (or deprecate) those classes in the Flink
> codebase
> > > > first?
> > > > > > > > > >
> > > > > > > > > > Based on the discussion in this thread, I am not sure we
> > have
> > > > > good
> > > > > > > > > answers
> > > > > > > > > > to those questions yet. For the 1st question above, the
> > > answer
> > > > is
> > > > > > > *no*
> > > > > > > > > > because we already know that the SinkV2 is currently not
> > able
> > > > to
> > > > > > > > support
> > > > > > > > > > migration for JdbcSink. For the 2nd question above, we
> know
> > > > there
> > > > > > are
> > > > > > > > > many
> > > > > > > > > > non-deprecated class in Flink codebase that are still
> > > depending
> > > > > on
> > > > > > > > > SinkV2.
> > > > > > > > > > It is probably not nice to users if we tell them to
> migrate
> > > > while
> > > > > > we
> > > > > > > > know
> > > > > > > > > > there are existing issues that can prevent them from
> doing
> > so
> > > > > > easily.
> > > > > > > > > >
> > > > > > > > > > In order to move forward to deprecate SinkV2, I think it
> > will
> > > > be
> > > > > > > super
> > > > > > > > > > useful to first migrate all the connectors managed by
> Flink
> > > > > > community
> > > > > > > > > > (including all externalized connectors) to use SinkV2.
> This
> > > > work
> > > > > > > won't
> > > > > > > > be
> > > > > > > > > > wasted since we need to do this anyway. And it will also
> > give
> > > > us
> > > > > a
> > > > > > > > chance
> > > > > > > > > > to validate the capabilities of SinkV2 and fix bugs by
> > > > ourselves
> > > > > as
> > > > > > > > much
> > > > > > > > > as
> > > > > > > > > > possible.
> > > > > > > > > >
> > > > > > > > > > What do you think?
> > > > > > > > > >
> > > > > > > > > > Best Regards,
> > > > > > > > > > Dong
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > > > martijnvisser@apache.org
> > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > While discussing FLIP-281 [1] the discussion also
> turned
> > to
> > > > the
> > > > > > > > > > > SinkFunction and the SinkV2 API. For a broader
> discussion
> > > I'm
> > > > > > > opening
> > > > > > > > > up
> > > > > > > > > > a
> > > > > > > > > > > separate discussion thread.
> > > > > > > > > > >
> > > > > > > > > > > As Yun Tang has mentioned in that discussion thread, it
> > > would
> > > > > be
> > > > > > a
> > > > > > > > good
> > > > > > > > > > > time to deprecate the SinkFunction to avoid the need to
> > > > > introduce
> > > > > > > new
> > > > > > > > > > > functions towards (to be) deprecated APIs. Jing
> > rightfully
> > > > > > > mentioned
> > > > > > > > > that
> > > > > > > > > > > it would be confusing to deprecate the SinkFunction if
> > its
> > > > > > > successor
> > > > > > > > is
> > > > > > > > > > not
> > > > > > > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > > > > > > >
> > > > > > > > > > > My proposal would be to promote the SinkV2 API to
> @public
> > > in
> > > > > > Flink
> > > > > > > > 1.17
> > > > > > > > > > and
> > > > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > > > >
> > > > > > > > > > > The original Sink interface was introduced in Flink
> 1.12
> > > with
> > > > > > > > FLIP-143
> > > > > > > > > > [2]
> > > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has
> been
> > > > > > improved
> > > > > > > on
> > > > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > > > >
> > > > > > > > > > > Looking at the API stability graduation process [5],
> the
> > > fact
> > > > > > that
> > > > > > > > Sink
> > > > > > > > > > V2
> > > > > > > > > > > was introduced in Flink 1.15 would mean that we could
> > > > warrant a
> > > > > > > > > promotion
> > > > > > > > > > > to @public already (given that there have been two
> > releases
> > > > > with
> > > > > > > 1.15
> > > > > > > > > and
> > > > > > > > > > > 1.16 where it was introduced). Combined with the fact
> > that
> > > > > SinkV2
> > > > > > > has
> > > > > > > > > > been
> > > > > > > > > > > the result of iteration over the introduction of the
> > > original
> > > > > > Sink
> > > > > > > > API
> > > > > > > > > > > since Flink 1.12, I would argue that the promotion is
> > > > overdue.
> > > > > > > > > > >
> > > > > > > > > > > If we promote the Sink API to @public, I think we
> should
> > > also
> > > > > > > > > immediately
> > > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > > >
> > > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > > >
> > > > > > > > > > > Best regards,
> > > > > > > > > > >
> > > > > > > > > > > Martijn
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > >
> https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > > [2]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > > [3]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > > [4]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > > [5]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > https://github.com/knaufk
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > https://twitter.com/snntrable
> > > > > https://github.com/knaufk
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Jing Ge <ji...@ververica.com.INVALID>.
Since we didn't get any new feedback from the reporters, +1 for closing
FLINK-30238 according to the analysis done by Gordon, Martijn, and Alex.
Happy to see we can move a little bit forward. Thanks for your effort!

Best regards,
Jing

On Tue, Oct 31, 2023 at 3:03 AM Yun Gao <ga...@apache.org> wrote:

> Hi Martijn and Gordon,
>
> Very sorry for the very late reply, +1 for close FLINK-30238 and open
> dedicated issues
> for the remaining issues.
>
> Best,
> Yun
>
> On Mon, Oct 30, 2023 at 6:42 PM Martijn Visser <ma...@apache.org>
> wrote:
> >
> > Hi everyone,
> >
> > I would like to +1 Gordon's proposal to close FLINK-30238, create a
> > new follow-up ticket and try to address the specific
> > PostCommitTopology in the work that's currently being done by Peter on
> > SinkV2. If there's no feedback on this topic, I assume everyone's OK
> > with that.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Fri, Sep 29, 2023 at 8:11 AM Tzu-Li (Gordon) Tai <tz...@apache.org>
> wrote:
> > >
> > > Hi everyone,
> > >
> > > It’s been a while since this topic was last discussed, but
> nevertheless, it
> > > still remains very desirable to figure out a clear path towards making
> > > SinkV2 @Public.
> > >
> > > There’s a new thread [1] that has a pretty good description on missing
> > > features in SinkV2 from the Iceberg connector’s perspective, which
> prevents
> > > them from migrating - anything related to those new requirements, let's
> > > discuss there.
> > >
> > > Nevertheless, I think we should also revive and reuse this thread to
> reach
> > > consensus / closure on all concerns already brought up here.
> > >
> > > It’s quite a long thread where a lot of various concerns were brought
> up,
> > > but I’d start by addressing two very specific ones: FLIP-287 [2] and
> > > FLINK-30238 [3]
> > >
> > > First of all, FLIP-287 has been approved and merged already, and will
> be
> > > released with 1.18.0. So, connector migrations that were waiting on
> this
> > > should hopefully be unblocked after the release. So this seems to no
> longer
> > > be a concern - let’s see things through with those connectors actually
> > > being migrated.
> > >
> > > FLINK-30238 is sort of a confusing one, and I do believe it is
> (partially)
> > > a false alarm. After looking into this, the problem reported there
> > > essentially breaks down to two things:
> > > 1) TwoPhaseCommittingSink is unable to take a new savepoint after
> restoring
> > > from a savepoint generated via `stop-with-savepoint --drain`
> > > 2) SinkV2 sinks with a PostCommitTopology do not properly have
> post-commits
> > > completed after a stop-with-savepoint operation, since committed
> > > commitables are not emitted to the post-commit topology after the
> committer
> > > receives the end-of-input signal.
> > >
> > > My latest comment in [3] explains this in a bit more detail.
> > >
> > > I believe we can conclude that problem 1) is a non-concern - users
> should
> > > not restore from a job that is drained on stop-with-savepoint and
> cannot
> > > expect the restored job to function normally.
> > > Problem 2) remains a real issue though, and to help clear things up I
> think
> > > we should close FLINK-30238 in favor of a new ticket scoped to the
> specific
> > > PostCommitTopology problem.
> > >
> > > The other open concerns seem to mostly be around graduation criteria
> and
> > > process - I've yet to go through those and will follow up with a
> separate
> > > reply (or perhaps Martijn can help wrap up that part?).
> > >
> > > Thanks,
> > > Gordon
> > >
> > > [1] https://lists.apache.org/thread/h3kg7jcgjstpvwlhnofq093vk93ylgsn
> > > [2]
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
> > > [3] https://issues.apache.org/jira/browse/FLINK-30238
> > >
> > > On Mon, Feb 13, 2023 at 2:50 AM Jing Ge <ji...@ververica.com.invalid>
> wrote:
> > >
> > > > @Martijn
> > > >
> > > > What I shared previously is the fact of the current KafkaSink.
> Following
> > > > your suggestion, the KafkaSink should still be marked as
> @Experimental for
> > > > now which will need even longer time to graduate. BTW, KafkaSink
> does not
> > > > depend on any @Internal interfaces. The @Internal is used for methods
> > > > coming from @PublicEvolving SinkV2 interfaces, not interfaces
> themself.
> > > > Thanks for bringing this topic up. Currently, there is no rule
> defined to
> > > > say that no @Internal is allowed for methods implemented
> > > > from @PublicEvolving interfaces. Further (off-this-topic) discussion
> might
> > > > be required to check if it really makes sense to define such a rule,
> since
> > > > some methods defined in interfaces might only be used internally,
> i.e. no
> > > > connector user would be aware of them.
> > > >
> > > > @Dong
> > > >
> > > > I agree with everything you said and especially can't agree more to
> let
> > > > developers who will own it make the decision.
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > >
> > > > On Sun, Feb 12, 2023 at 2:53 AM Dong Lin <li...@gmail.com>
> wrote:
> > > >
> > > > > Hi Martijn,
> > > > >
> > > > > Thanks for the reply. Please see my comments inline.
> > > > >
> > > > > Regards,
> > > > > Dong
> > > > >
> > > > > On Sat, Feb 11, 2023 at 4:31 AM Martijn Visser <
> martijnvisser@apache.org
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I wanted to get back on a couple of comments and have a proposal
> at the
> > > > > end
> > > > > > for the next steps:
> > > > > >
> > > > > > @Steven Wu
> > > > > > If I look back at the original discussion of FLIP-191 and also
> the
> > > > thread
> > > > > > that you're referring to, it appears from the discussion with
> Yun Gao
> > > > > that
> > > > > > the solution was in near-sight, but just not finished. Perhaps
> it needs
> > > > > to
> > > > > > be restarted once more so it can be brought to a closure. Also
> when I
> > > > > > looked back at the original introduction of SinkV2, there was
> > > > FLINK-25726
> > > > > > [1] which looks like it was built specifically for Iceberg and
> Delta
> > > > > Sink?
> > > > > >
> > > > > > @Jing
> > > > > > > All potential unstable methods coming from SinkV2 interfaces
> will be
> > > > > kept
> > > > > > marked as @internal.
> > > > > >
> > > > > > This is a situation that I think should be avoided: if it's
> unstable,
> > > > it
> > > > > > should be marked as @Experimental. Connectors should not need to
> rely
> > > > > > on @Internal interfaces; if they are needed by a connector, a
> > > > maintainer
> > > > > > should create a proposal to expose.
> > > > > >
> > > > > > On the production readiness of SinkV2:
> > > > > >
> > > > > > @Dong
> > > > > > > And Yuxia mentioned earlier in this thread that there are bugs
> such
> > > > as
> > > > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238>
> and
> > > > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459>
> ,
> > > > which
> > > > > > makes it hard to use SinkV2 properly in production.
> > > > > >
> > > > > > @Jark
> > > > > > > The follow-up step should be trying the new APIs in
> externalized
> > > > > > connectors and giving users the confidence to migrate.
> > > > > >
> > > > > > I think that is not a fair assessment of the situation. Given
> that
> > > > Kafka,
> > > > > > Elasticsearch, FileSink and everything that uses the AsyncSink
> is on
> > > > > SinkV2
> > > > > > since Flink 1.15 and the popularity of these connectors, they
> have
> > > > > > definitely already proven themselves. That's not to downplay
> that bugs
> > > > > have
> > > > > > no impact and shouldn't be resolved, but I don't think it's fair
> to
> > > > state
> > > > > > that SinkV2 is not stable or usable in production.
> > > > > >
> > > > >
> > > > > I believe there exist jobs that are currently using SinkV2 (e.g.
> Kafka,
> > > > > FileSink) in production. The questions are whether 1) SinkV2 is
> ready to
> > > > > support all features of all existing SinkFunction; and 2) whether
> all
> > > > those
> > > > > connectors (e.g. Kafka) are ready to support all features of their
> > > > > SinkFunction version (e.g. FlinkKafkaProducer) reliable without
> critical
> > > > > bugs (e.g. losing data).
> > > > >
> > > > > If we can not answer "yes" to the above two questions, then we
> probably
> > > > can
> > > > > not say that SinkV2 is usable in production in general, even
> though we
> > > > can
> > > > > still say that some source/sink based on SinkV2 is usable in
> production
> > > > in
> > > > > selected scenarios (e.g. maybe those jobs that do not use on
> savepoint).
> > > > >
> > > > > Maybe we can use FLINK-30238
> > > > > <https://issues.apache.org/jira/browse/FLINK-30238> to explain
> this
> > > > point.
> > > > > The JIRA's description basically says that "there is the risk of
> losing
> > > > > data if we use stop-with-savepoint with SinkV2". Do you know
> whether this
> > > > > might potentially affect those SinkV2 (e.g. Kafka, FileSink) jobs
> in
> > > > > production when they are restarted with savepoint?
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > Coming back to the original topic of this thread, my proposal
> would be
> > > > > that
> > > > > > in Flink 1.17:
> > > > > >
> > > > > > 1. SinkV2 gets promoted from @PublicEvolving to @Public, which
> also
> > > > > means:
> > > > > >
> > > > >
> > > > > I believe we have agreed to address the issues with SinkV2. Now the
> > > > > question is, how likely would the solutions require backward
> incompatible
> > > > > change to SinkV2.
> > > > >
> > > > > For example, do you think the following questions (mentioned
> earlier in
> > > > > this thread) would require backward incompatible changes?
> > > > > - Sink.InitContext still lacks some necessary information to
> migrate
> > > > > existing connectors to new sinks
> > > > > - Address FLINK-30238 and FLINK-29459.
> > > > > - Find the migration path for the Iceberg sink
> > > > >
> > > > > IMO, it seems safer to mark an interface as @Public when we are
> pretty
> > > > sure
> > > > > that there is no known issue in the near future that would require
> > > > > backward-incompatible changes to the interface.
> > > > >
> > > > > Note that the cost of making an interface @Public lies on the
> > > > developer(s)
> > > > > who owns the task of updating SinkV2 to address those known issues
> (e.g.
> > > > > migration). Maybe we should find the developers who will own this
> task
> > > > and
> > > > > let them decide whether to mark it @Public?
> > > > >
> > > > >
> > > > > a. KafkaSink gets promoted from @PublicEvolving to @Public (which
> solves
> > > > > > the weird thing that FlinkKafkaProducer is already marked as
> deprecated
> > > > > > since Flink 1.14 and points to KafkaSink as a successor)
> > > > > > b. FileSink gets promoted from @Experimental to @PublicEvolving
> (note:
> > > > > this
> > > > > > has been marked as @Experimental since Flink 1.12, so one could
> argue
> > > > > that
> > > > > > this needs to be bumped to @Public too. The reason why I didn't
> do it
> > > > is
> > > > > > because of Yuxia's remark that FileSystemTableSink is still using
> > > > > > SinkFunction)
> > > > > > c. ElasticsearchSink gets promoted from @PublicEvolving to
> @Public (has
> > > > > > been externalized so needs to happen there)
> > > > > > d. The AWS connectors (Firehose, Kinesis) get promoted from
> > > > > @PublicEvolving
> > > > > > to @Public (has been externalized so needs to happen there)
> > > > > > e. It's up to the maintainers for DynamoDB and Opensearch to
> decide
> > > > this
> > > > > > themselves, given that I believe they've only been introduced
> recently
> > > > > and
> > > > > > support only Flink 1.16.
> > > > > >
> > > > > > 2. For the next Flink release, the community needs to establish
> what
> > > > are
> > > > > > features that are currently not available in SinkV2 (e.g.
> FLIP-287), if
> > > > > > such a feature should be made available in SinkV2 (in case it's a
> > > > feature
> > > > > > which currently resides in SinkFunction, while from new insights
> we
> > > > think
> > > > > > it should now be solved differently) and then get these
> implemented
> > > > asap.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Martijn
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-25726
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com>
> wrote:
> > > > > >
> > > > > > > > Regarding the discussion on global committer [1] for sinks
> with
> > > > > global
> > > > > > > transactions, there is no consensus on solving that problem in
> > > > SinkV2.
> > > > > > Will
> > > > > > > it require any breaking change in SinkV2?
> > > > > > >
> > > > > > > Just want to reiterate my earlier question. What is the
> migration
> > > > path
> > > > > > for
> > > > > > > the Iceberg sink?
> > > > > > >
> > > > > > > [1]
> https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > >
> > > > > > > On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com>
> wrote:
> > > > > > >
> > > > > > > > Hi Konstantin,
> > > > > > > >
> > > > > > > > I totally agree with making SinkV2 @Public. I just have
> concerns
> > > > > about
> > > > > > > > deprecating SinkFunction at this point. Dong Lin has raised
> the
> > > > > blocker
> > > > > > > > issues of migration multiple times in this thread which I
> think we
> > > > > > should
> > > > > > > > address first. I don't know why we rush to deprecate
> SinkFunction
> > > > > while
> > > > > > > > SourceFunction is still public, but the new Source API is
> much more
> > > > > > > stable
> > > > > > > > and production-ready than SinkV2.
> > > > > > > >
> > > > > > > > Iceberg community raised concerns[1] about the workability
> and
> > > > > > stability
> > > > > > > of
> > > > > > > > Flink connector APIs.
> > > > > > > >
> > > > > > > > We are hoping any issues with the APIs for Iceberg connector
> will
> > > > > > surface
> > > > > > > > > sooner and get more attention from the Flink community
> when the
> > > > > > > connector
> > > > > > > > > is within Flink umbrella rather than in Iceberg repo.
> > > > > > > >
> > > > > > > >
> > > > > > > > The connector externalizing is a big step for building a
> mechanism
> > > > to
> > > > > > > > guarantee Flink connector API is stable and workable. The
> follow-up
> > > > > > step
> > > > > > > > should be trying the new APIs in externalized connectors and
> giving
> > > > > > users
> > > > > > > > the confidence to migrate.
> > > > > > > >
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jark
> > > > > > > >
> > > > > > > > [1]:
> > > > > https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> > > > > > > >
> > > > > > > > On Tue, 7 Feb 2023 at 09:53, yuxia <
> luoyuxia@alumni.sjtu.edu.cn>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Konstantin,
> > > > > > > > > Just FYI, the FileSystemTableSink are still using
> SinkFunction.
> > > > > > > > >
> > > > > > > > > Best regards,
> > > > > > > > > Yuxia
> > > > > > > > >
> > > > > > > > > ----- 原始邮件 -----
> > > > > > > > > 发件人: "Dong Lin" <li...@gmail.com>
> > > > > > > > > 收件人: "dev" <de...@flink.apache.org>
> > > > > > > > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <
> myasuka@live.com
> > > > >
> > > > > > > > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > > > > > > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate
> > > > > > SinkFunction
> > > > > > > > >
> > > > > > > > > Hi Konstantin,
> > > > > > > > >
> > > > > > > > > Thanks for the reply. Please see my comments inline.
> > > > > > > > >
> > > > > > > > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <
> > > > knaufk@apache.org
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Steven,
> > > > > > > > > >
> > > > > > > > > > Sink is already deprecated. It was deprecated at the
> moment
> > > > where
> > > > > > we
> > > > > > > > > > introduced SinkV2.
> > > > > > > > > >
> > > > > > > > > > Hi Jark, Hi Dong,
> > > > > > > > > >
> > > > > > > > > > My understanding is the SinkV2 is a workable interface.
> The
> > > > most
> > > > > > > > > important
> > > > > > > > > > connectors have been migrated (Kafka, Filesystem) and
> more
> > > > > > connectors
> > > > > > > > > > (OpenSearch, ElasticSearch, Kinesis) use it
> successfully. To
> > > > make
> > > > > > > > SinkV2
> > > > > > > > > > public, it does not need to have all possible
> functionality.
> > > > > Public
> > > > > > > > APIs
> > > > > > > > > > can be extended. That's what we do all the time. There
> will
> > > > also
> > > > > > > always
> > > > > > > > > be
> > > > > > > > > > bugs. So, these points can not be categorical blockers to
> > > > promote
> > > > > > the
> > > > > > > > > API.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Right. I believe we all agree that we make SinkV2
> > > > @PublicEvolving.
> > > > > > The
> > > > > > > > > concern here is whether we can mark SinkFunction as
> deprecated at
> > > > > > this
> > > > > > > > > point.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > What are the specific issues/tickets that are blocking
> us? Can
> > > > we
> > > > > > in
> > > > > > > > your
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > For example, Lijie mentioned earlier in this thread that
> > > > according
> > > > > to
> > > > > > > > > FLIP-287,
> > > > > > > > > currently the Sink.InitContext still lacks some necessary
> > > > > information
> > > > > > > to
> > > > > > > > > migrate existing connectors to new sinks. This could be a
> blocker
> > > > > > issue
> > > > > > > > > since this is related to the SinkV2 API design.
> > > > > > > > >
> > > > > > > > > And Yuxia mentioned earlier in this thread that there are
> bugs
> > > > such
> > > > > > as
> > > > > > > > > FLINK-30238 <
> https://issues.apache.org/jira/browse/FLINK-30238>
> > > > > and
> > > > > > > > > FLINK-29459 <
> https://issues.apache.org/jira/browse/FLINK-29459>
> > > > ,
> > > > > > > which
> > > > > > > > > makes it hard to use SinkV2 properly in production.  It
> seems
> > > > that
> > > > > > > these
> > > > > > > > > two bugs are created months ago and are still unresolved
> or even
> > > > > > > > > unassigned. This looks like a clear signal that SinkV2 is
> not
> > > > being
> > > > > > > > > actively maintained and used in production.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > opinion only deprecate it when every single connector in
> Apache
> > > > > > Flink
> > > > > > > > is
> > > > > > > > > > migrated already?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Technically this is not a hard requirement. But I would
> suggest
> > > > > that
> > > > > > we
> > > > > > > > > should migrate all existing connectors so that we eat our
> own
> > > > > dogfood
> > > > > > > and
> > > > > > > > > prove to users that SinkV2 is ready for use in production.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > In my opinion it is the time to ask users to the migrate
> their
> > > > > > > > > connectors.
> > > > > > > > > > More importantly, @Deprecated would signal users not to
> build
> > > > new
> > > > > > > > > > connectors on SinkFunction. I would arque its also very
> > > > > misleading
> > > > > > to
> > > > > > > > > users
> > > > > > > > > > to not @Deprecated SinkFunction given that is clearly
> will be
> > > > > > > > deprecated.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > >
> > > > > > > > > > Konstantin
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <
> > > > > > imjark@gmail.com
> > > > > > > >:
> > > > > > > > > >
> > > > > > > > > > > I agree with Dong Lin.
> > > > > > > > > > >
> > > > > > > > > > > Oracle explains how to use Deprecate API [1]:
> > > > > > > > > > >
> > > > > > > > > > > You are strongly recommended to use the Javadoc
> @deprecated
> > > > tag
> > > > > > > with
> > > > > > > > > > > > appropriate comments explaining how to use the new
> API.
> > > > This
> > > > > > > > ensures
> > > > > > > > > > > > developers will *have a workable migration path from
> the
> > > > old
> > > > > > API
> > > > > > > to
> > > > > > > > > the
> > > > > > > > > > > > new API*.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > From a user's perspective, the workable migration path
> is
> > > > very
> > > > > > > > > important.
> > > > > > > > > > > Otherwise, it blurs the semantics of API deprecation.
> The
> > > > Flink
> > > > > > > API's
> > > > > > > > > > > compatibility and stability issues in the past left a
> bad
> > > > > > > impression
> > > > > > > > on
> > > > > > > > > > the
> > > > > > > > > > > downstream projects. We should be careful when
> changing and
> > > > > > > > deprecating
> > > > > > > > > > > APIs, especially when there are known migration gaps.
> I think
> > > > > > it's
> > > > > > > a
> > > > > > > > > good
> > > > > > > > > > > idea to migrate Flink-owned connectors before marking
> old API
> > > > > > > > > deprecated.
> > > > > > > > > > > This ensures downstream projects can migrate to new
> APIs
> > > > > > smoothly.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Jark
> > > > > > > > > > >
> > > > > > > > > > > [1]:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <
> stevenz3wu@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Regarding the discussion on global committer [1] for
> sinks
> > > > > with
> > > > > > > > > global
> > > > > > > > > > > > transactions, there is no consensus on solving that
> problem
> > > > > in
> > > > > > > > > SinkV2.
> > > > > > > > > > > Will
> > > > > > > > > > > > it require any breaking change in SinkV2?
> > > > > > > > > > > >
> > > > > > > > > > > > Also will SinkV1 be deprecated too? or it should
> happen
> > > > > > sometime
> > > > > > > > > after
> > > > > > > > > > > > SinkFunction deprecation?
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > >
> https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > > > > > > >
> > > > > > > > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <
> > > > lindong28@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Konstantin,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the comment! Please see my comment
> inline.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > Dong
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > > > > > > > knaufk@apache.org
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > sorry for joining the discussion late.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) Is there an option to deprecate SinkFunction
> in
> > > > Flink
> > > > > > 1.17
> > > > > > > > > while
> > > > > > > > > > > > > leaving
> > > > > > > > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then
> aim to
> > > > make
> > > > > > > > SinkV2
> > > > > > > > > > > > @Public
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > and remove SinkFunction in Flink 1.18.
> @PublicEvolving
> > > > > are
> > > > > > > > > intended
> > > > > > > > > > > for
> > > > > > > > > > > > > > public use. So, I don't see it as a blocker for
> > > > > deprecating
> > > > > > > > > > > > SinkFunction
> > > > > > > > > > > > > > that we have to make SinkV2 @PublicEvovling. For
> > > > > reference
> > > > > > > this
> > > > > > > > > is
> > > > > > > > > > > the
> > > > > > > > > > > > > > description of @PublicEvovling:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > /**
> > > > > > > > > > > > > >  * Annotation to mark classes and methods for
> public
> > > > use,
> > > > > > but
> > > > > > > > > with
> > > > > > > > > > > > > > evolving interfaces.
> > > > > > > > > > > > > >  *
> > > > > > > > > > > > > >  * <p>Classes and methods with this annotation
> are
> > > > > intended
> > > > > > > for
> > > > > > > > > > > public
> > > > > > > > > > > > > > use and have stable behavior.
> > > > > > > > > > > > > >  * However, their interfaces and signatures are
> not
> > > > > > > considered
> > > > > > > > to
> > > > > > > > > > be
> > > > > > > > > > > > > > stable and might be changed
> > > > > > > > > > > > > >  * across versions.
> > > > > > > > > > > > > >  *
> > > > > > > > > > > > > >  * <p>This annotation also excludes methods and
> classes
> > > > > > with
> > > > > > > > > > evolving
> > > > > > > > > > > > > > interfaces / signatures within
> > > > > > > > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > > > > > > > >  */
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Marking SinkFunction @Deprecated would already
> single
> > > > > > > everyone
> > > > > > > > to
> > > > > > > > > > > move
> > > > > > > > > > > > to
> > > > > > > > > > > > > > SinkV2, which we as a community, I believe, have
> a
> > > > strong
> > > > > > > > > interest
> > > > > > > > > > > in.
> > > > > > > > > > > > > Its
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, I also believe we all have this strong
> interest. I
> > > > > just
> > > > > > > hope
> > > > > > > > > > that
> > > > > > > > > > > > this
> > > > > > > > > > > > > can be done in the best possible way that does not
> > > > confuse
> > > > > > > users.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I probably still have the same concern regarding
> its
> > > > impact
> > > > > > on
> > > > > > > > > users:
> > > > > > > > > > > if
> > > > > > > > > > > > we
> > > > > > > > > > > > > mark an API as deprecated, it effectively means
> the users
> > > > > of
> > > > > > > this
> > > > > > > > > API
> > > > > > > > > > > > > should start to migrate to another API (e.g.
> SinkV2) and
> > > > we
> > > > > > > might
> > > > > > > > > > > remove
> > > > > > > > > > > > > this API in the future. However, given that we
> know there
> > > > > are
> > > > > > > > known
> > > > > > > > > > > > > problems preventing users from doing so, it seems
> that we
> > > > > are
> > > > > > > not
> > > > > > > > > > ready
> > > > > > > > > > > > to
> > > > > > > > > > > > > send this message to users right.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If I understand correctly, I guess you are
> suggesting
> > > > that
> > > > > by
> > > > > > > > > marking
> > > > > > > > > > > > > SinkFunction as deprecated, we can put higher
> pressure on
> > > > > > Flink
> > > > > > > > > > > > > contributors to update the existing Flink codebase
> to
> > > > > improve
> > > > > > > and
> > > > > > > > > use
> > > > > > > > > > > > > SinkV2.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am not sure this is the right way to use
> @deprecated,
> > > > > which
> > > > > > > > has a
> > > > > > > > > > > > > particular meaning for its users rather than
> > > > contributors.
> > > > > > And
> > > > > > > I
> > > > > > > > am
> > > > > > > > > > > also
> > > > > > > > > > > > > not sure we can even pressure contributors of an
> > > > > open-source
> > > > > > > > > project
> > > > > > > > > > > into
> > > > > > > > > > > > > developing a feature (e.g. migrate all existing
> > > > > SinkFunction
> > > > > > > > > > subclasses
> > > > > > > > > > > > to
> > > > > > > > > > > > > SinkV2). IMO, the typical way is for the
> contributor with
> > > > > > > > > > interest/time
> > > > > > > > > > > > to
> > > > > > > > > > > > > work on the feature, or talk to other contributors
> > > > whether
> > > > > > they
> > > > > > > > are
> > > > > > > > > > > > willing
> > > > > > > > > > > > > to collaborate/work on this, rather than
> pressuring other
> > > > > > > > > > contributors
> > > > > > > > > > > > into
> > > > > > > > > > > > > working on this.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > almost comical how long the transition from
> > > > > > > > > > > SourceFurnction/SinkFunction
> > > > > > > > > > > > to
> > > > > > > > > > > > > > Source/Sink takes us. At the same time, we leave
> > > > > ourselves
> > > > > > > the
> > > > > > > > > > option
> > > > > > > > > > > > to
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > make small changes to SinkV2 if any problems
> arise
> > > > during
> > > > > > the
> > > > > > > > > > > migration
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > these connector.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think, we have a bit of a chicken/egg problem
> here.
> > > > The
> > > > > > > > > pressure
> > > > > > > > > > > for
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Similar to the reason described above, I am not
> sure we
> > > > > have
> > > > > > a
> > > > > > > > > > > > chicken/egg
> > > > > > > > > > > > > problem here. The issue here is that SinkV2 is not
> ready
> > > > > and
> > > > > > we
> > > > > > > > > have
> > > > > > > > > > a
> > > > > > > > > > > > lot
> > > > > > > > > > > > > of existing SinkFunction that is not migrated by
> > > > ourselves.
> > > > > > We
> > > > > > > > > (Flink
> > > > > > > > > > > > > contributors) probably do not need to mark
> SinkFunction
> > > > as
> > > > > > > > > deprecated
> > > > > > > > > > > in
> > > > > > > > > > > > > order to address these issues in our own codebase.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > users and contributors is not high enough to move
> away
> > > > from
> > > > > > > > > > > SinkFunction
> > > > > > > > > > > > as
> > > > > > > > > > > > > > long as its not deprecated, but at the same time
> we
> > > > need
> > > > > > > people
> > > > > > > > > to
> > > > > > > > > > > > > migrate
> > > > > > > > > > > > > > their connectors to see if there are any gaps in
> > > > SinkV2.
> > > > > I
> > > > > > > > > believe,
> > > > > > > > > > > the
> > > > > > > > > > > > > > combination proposed above could bridge this
> problem.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) I don't understand the argument of waiting
> until
> > > > some
> > > > > of
> > > > > > > the
> > > > > > > > > > > > > > implementations are @Public. How can we make the
> > > > > > > > implementations
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > > > > SinkV2 API @Public without making SinkV2
> @Public? All
> > > > > > public
> > > > > > > > > > methods
> > > > > > > > > > > of
> > > > > > > > > > > > > > SinkV2 are part of every implementation. So to
> me it
> > > > > > actually
> > > > > > > > > seems
> > > > > > > > > > > to
> > > > > > > > > > > > be
> > > > > > > > > > > > > > opposite: in order to make any of the
> implementation
> > > > > > @Public
> > > > > > > we
> > > > > > > > > > first
> > > > > > > > > > > > > need
> > > > > > > > > > > > > > to make the API @Public.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yeah I also agree with you.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Konstantin
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong
> Lin <
> > > > > > > > > > > > lindong28@gmail.com
> > > > > > > > > > > > > >:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Martijn,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for driving this effort to clean-up the
> Flink
> > > > > > > > codebase!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I like the idea to cleanup Flink codebase to
> avoid
> > > > > having
> > > > > > > two
> > > > > > > > > > > Sinks.
> > > > > > > > > > > > On
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > other hand, I also thing the concern mentioned
> by
> > > > Jing
> > > > > > > makes
> > > > > > > > > > sense.
> > > > > > > > > > > > In
> > > > > > > > > > > > > > > addition to thinking in terms of the rule
> proposed in
> > > > > > > > FLIP-197
> > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > (which
> > > > > > > > > > > > > > > seems to focus mostly on the Flink developers'
> > > > > > > perspective),
> > > > > > > > it
> > > > > > > > > > > might
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > useful to also think about the story from
> users'
> > > > > > > perspective
> > > > > > > > > and
> > > > > > > > > > > make
> > > > > > > > > > > > > > sure
> > > > > > > > > > > > > > > their concerns can be addressed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Typically, by marking an API as deprecated, we
> are
> > > > > > > > effectively
> > > > > > > > > > > > telling
> > > > > > > > > > > > > > > users *they should start to migrate to the new
> API
> > > > ASAP
> > > > > > and
> > > > > > > > we
> > > > > > > > > > > > reserve
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > right to remove this API completely in the 1-2
> > > > > releases*.
> > > > > > > > Then
> > > > > > > > > it
> > > > > > > > > > > > might
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1) Does SinkV2 public API provides all the
> > > > > > functionalities
> > > > > > > > > needed
> > > > > > > > > > > to
> > > > > > > > > > > > > > > migrate my existing code from subclassing
> > > > SinkFunction
> > > > > to
> > > > > > > > > > > subclassing
> > > > > > > > > > > > > > > SinkV2?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2) Is the amount of migration work reasonable?
> If
> > > > yes,
> > > > > > why
> > > > > > > > is a
> > > > > > > > > > > class
> > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > as HBaseSinkFunction in Flink's own codebase
> still
> > > > > > > depending
> > > > > > > > on
> > > > > > > > > > > > > > > SinkFunction? Maybe Flink developers should
> eat their
> > > > > own
> > > > > > > dog
> > > > > > > > > > food
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > migrate (or deprecate) those classes in the
> Flink
> > > > > > codebase
> > > > > > > > > first?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Based on the discussion in this thread, I am
> not sure
> > > > > we
> > > > > > > have
> > > > > > > > > > good
> > > > > > > > > > > > > > answers
> > > > > > > > > > > > > > > to those questions yet. For the 1st question
> above,
> > > > the
> > > > > > > > answer
> > > > > > > > > is
> > > > > > > > > > > > *no*
> > > > > > > > > > > > > > > because we already know that the SinkV2 is
> currently
> > > > > not
> > > > > > > able
> > > > > > > > > to
> > > > > > > > > > > > > support
> > > > > > > > > > > > > > > migration for JdbcSink. For the 2nd question
> above,
> > > > we
> > > > > > know
> > > > > > > > > there
> > > > > > > > > > > are
> > > > > > > > > > > > > > many
> > > > > > > > > > > > > > > non-deprecated class in Flink codebase that
> are still
> > > > > > > > depending
> > > > > > > > > > on
> > > > > > > > > > > > > > SinkV2.
> > > > > > > > > > > > > > > It is probably not nice to users if we tell
> them to
> > > > > > migrate
> > > > > > > > > while
> > > > > > > > > > > we
> > > > > > > > > > > > > know
> > > > > > > > > > > > > > > there are existing issues that can prevent
> them from
> > > > > > doing
> > > > > > > so
> > > > > > > > > > > easily.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > In order to move forward to deprecate SinkV2,
> I think
> > > > > it
> > > > > > > will
> > > > > > > > > be
> > > > > > > > > > > > super
> > > > > > > > > > > > > > > useful to first migrate all the connectors
> managed by
> > > > > > Flink
> > > > > > > > > > > community
> > > > > > > > > > > > > > > (including all externalized connectors) to use
> > > > SinkV2.
> > > > > > This
> > > > > > > > > work
> > > > > > > > > > > > won't
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > wasted since we need to do this anyway. And it
> will
> > > > > also
> > > > > > > give
> > > > > > > > > us
> > > > > > > > > > a
> > > > > > > > > > > > > chance
> > > > > > > > > > > > > > > to validate the capabilities of SinkV2 and fix
> bugs
> > > > by
> > > > > > > > > ourselves
> > > > > > > > > > as
> > > > > > > > > > > > > much
> > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > possible.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best Regards,
> > > > > > > > > > > > > > > Dong
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser
> <
> > > > > > > > > > > > > martijnvisser@apache.org
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > While discussing FLIP-281 [1] the discussion
> also
> > > > > > turned
> > > > > > > to
> > > > > > > > > the
> > > > > > > > > > > > > > > > SinkFunction and the SinkV2 API. For a
> broader
> > > > > > discussion
> > > > > > > > I'm
> > > > > > > > > > > > opening
> > > > > > > > > > > > > > up
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > separate discussion thread.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > As Yun Tang has mentioned in that discussion
> > > > thread,
> > > > > it
> > > > > > > > would
> > > > > > > > > > be
> > > > > > > > > > > a
> > > > > > > > > > > > > good
> > > > > > > > > > > > > > > > time to deprecate the SinkFunction to avoid
> the
> > > > need
> > > > > to
> > > > > > > > > > introduce
> > > > > > > > > > > > new
> > > > > > > > > > > > > > > > functions towards (to be) deprecated APIs.
> Jing
> > > > > > > rightfully
> > > > > > > > > > > > mentioned
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > it would be confusing to deprecate the
> SinkFunction
> > > > > if
> > > > > > > its
> > > > > > > > > > > > successor
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > yet marked as @Public (it's currently
> > > > > @PublicEvolving).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > My proposal would be to promote the SinkV2
> API to
> > > > > > @public
> > > > > > > > in
> > > > > > > > > > > Flink
> > > > > > > > > > > > > 1.17
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > mark the SinkFunction as @deprecated in
> Flink 1.17
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The original Sink interface was introduced
> in Flink
> > > > > > 1.12
> > > > > > > > with
> > > > > > > > > > > > > FLIP-143
> > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3]
> and
> > > > has
> > > > > > been
> > > > > > > > > > > improved
> > > > > > > > > > > > on
> > > > > > > > > > > > > > > > further as Sink V2 via FLIP-191 in Flink
> 1.15 [4].
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Looking at the API stability graduation
> process
> > > > [5],
> > > > > > the
> > > > > > > > fact
> > > > > > > > > > > that
> > > > > > > > > > > > > Sink
> > > > > > > > > > > > > > > V2
> > > > > > > > > > > > > > > > was introduced in Flink 1.15 would mean that
> we
> > > > could
> > > > > > > > > warrant a
> > > > > > > > > > > > > > promotion
> > > > > > > > > > > > > > > > to @public already (given that there have
> been two
> > > > > > > releases
> > > > > > > > > > with
> > > > > > > > > > > > 1.15
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > 1.16 where it was introduced). Combined with
> the
> > > > fact
> > > > > > > that
> > > > > > > > > > SinkV2
> > > > > > > > > > > > has
> > > > > > > > > > > > > > > been
> > > > > > > > > > > > > > > > the result of iteration over the
> introduction of
> > > > the
> > > > > > > > original
> > > > > > > > > > > Sink
> > > > > > > > > > > > > API
> > > > > > > > > > > > > > > > since Flink 1.12, I would argue that the
> promotion
> > > > is
> > > > > > > > > overdue.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > If we promote the Sink API to @public, I
> think we
> > > > > > should
> > > > > > > > also
> > > > > > > > > > > > > > immediately
> > > > > > > > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Martijn
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > >
> > > > > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > > > > > > > [3]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > > > > > > > [4]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > > > > > > > [5]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > > > > > > https://github.com/knaufk
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > > https://github.com/knaufk
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Yun Gao <ga...@apache.org>.
Hi Martijn and Gordon,

Very sorry for the very late reply, +1 for close FLINK-30238 and open
dedicated issues
for the remaining issues.

Best,
Yun

On Mon, Oct 30, 2023 at 6:42 PM Martijn Visser <ma...@apache.org> wrote:
>
> Hi everyone,
>
> I would like to +1 Gordon's proposal to close FLINK-30238, create a
> new follow-up ticket and try to address the specific
> PostCommitTopology in the work that's currently being done by Peter on
> SinkV2. If there's no feedback on this topic, I assume everyone's OK
> with that.
>
> Best regards,
>
> Martijn
>
> On Fri, Sep 29, 2023 at 8:11 AM Tzu-Li (Gordon) Tai <tz...@apache.org> wrote:
> >
> > Hi everyone,
> >
> > It’s been a while since this topic was last discussed, but nevertheless, it
> > still remains very desirable to figure out a clear path towards making
> > SinkV2 @Public.
> >
> > There’s a new thread [1] that has a pretty good description on missing
> > features in SinkV2 from the Iceberg connector’s perspective, which prevents
> > them from migrating - anything related to those new requirements, let's
> > discuss there.
> >
> > Nevertheless, I think we should also revive and reuse this thread to reach
> > consensus / closure on all concerns already brought up here.
> >
> > It’s quite a long thread where a lot of various concerns were brought up,
> > but I’d start by addressing two very specific ones: FLIP-287 [2] and
> > FLINK-30238 [3]
> >
> > First of all, FLIP-287 has been approved and merged already, and will be
> > released with 1.18.0. So, connector migrations that were waiting on this
> > should hopefully be unblocked after the release. So this seems to no longer
> > be a concern - let’s see things through with those connectors actually
> > being migrated.
> >
> > FLINK-30238 is sort of a confusing one, and I do believe it is (partially)
> > a false alarm. After looking into this, the problem reported there
> > essentially breaks down to two things:
> > 1) TwoPhaseCommittingSink is unable to take a new savepoint after restoring
> > from a savepoint generated via `stop-with-savepoint --drain`
> > 2) SinkV2 sinks with a PostCommitTopology do not properly have post-commits
> > completed after a stop-with-savepoint operation, since committed
> > commitables are not emitted to the post-commit topology after the committer
> > receives the end-of-input signal.
> >
> > My latest comment in [3] explains this in a bit more detail.
> >
> > I believe we can conclude that problem 1) is a non-concern - users should
> > not restore from a job that is drained on stop-with-savepoint and cannot
> > expect the restored job to function normally.
> > Problem 2) remains a real issue though, and to help clear things up I think
> > we should close FLINK-30238 in favor of a new ticket scoped to the specific
> > PostCommitTopology problem.
> >
> > The other open concerns seem to mostly be around graduation criteria and
> > process - I've yet to go through those and will follow up with a separate
> > reply (or perhaps Martijn can help wrap up that part?).
> >
> > Thanks,
> > Gordon
> >
> > [1] https://lists.apache.org/thread/h3kg7jcgjstpvwlhnofq093vk93ylgsn
> > [2]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
> > [3] https://issues.apache.org/jira/browse/FLINK-30238
> >
> > On Mon, Feb 13, 2023 at 2:50 AM Jing Ge <ji...@ververica.com.invalid> wrote:
> >
> > > @Martijn
> > >
> > > What I shared previously is the fact of the current KafkaSink. Following
> > > your suggestion, the KafkaSink should still be marked as @Experimental for
> > > now which will need even longer time to graduate. BTW, KafkaSink does not
> > > depend on any @Internal interfaces. The @Internal is used for methods
> > > coming from @PublicEvolving SinkV2 interfaces, not interfaces themself.
> > > Thanks for bringing this topic up. Currently, there is no rule defined to
> > > say that no @Internal is allowed for methods implemented
> > > from @PublicEvolving interfaces. Further (off-this-topic) discussion might
> > > be required to check if it really makes sense to define such a rule, since
> > > some methods defined in interfaces might only be used internally, i.e. no
> > > connector user would be aware of them.
> > >
> > > @Dong
> > >
> > > I agree with everything you said and especially can't agree more to let
> > > developers who will own it make the decision.
> > >
> > > Best regards,
> > > Jing
> > >
> > >
> > > On Sun, Feb 12, 2023 at 2:53 AM Dong Lin <li...@gmail.com> wrote:
> > >
> > > > Hi Martijn,
> > > >
> > > > Thanks for the reply. Please see my comments inline.
> > > >
> > > > Regards,
> > > > Dong
> > > >
> > > > On Sat, Feb 11, 2023 at 4:31 AM Martijn Visser <martijnvisser@apache.org
> > > >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I wanted to get back on a couple of comments and have a proposal at the
> > > > end
> > > > > for the next steps:
> > > > >
> > > > > @Steven Wu
> > > > > If I look back at the original discussion of FLIP-191 and also the
> > > thread
> > > > > that you're referring to, it appears from the discussion with Yun Gao
> > > > that
> > > > > the solution was in near-sight, but just not finished. Perhaps it needs
> > > > to
> > > > > be restarted once more so it can be brought to a closure. Also when I
> > > > > looked back at the original introduction of SinkV2, there was
> > > FLINK-25726
> > > > > [1] which looks like it was built specifically for Iceberg and Delta
> > > > Sink?
> > > > >
> > > > > @Jing
> > > > > > All potential unstable methods coming from SinkV2 interfaces will be
> > > > kept
> > > > > marked as @internal.
> > > > >
> > > > > This is a situation that I think should be avoided: if it's unstable,
> > > it
> > > > > should be marked as @Experimental. Connectors should not need to rely
> > > > > on @Internal interfaces; if they are needed by a connector, a
> > > maintainer
> > > > > should create a proposal to expose.
> > > > >
> > > > > On the production readiness of SinkV2:
> > > > >
> > > > > @Dong
> > > > > > And Yuxia mentioned earlier in this thread that there are bugs such
> > > as
> > > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> ,
> > > which
> > > > > makes it hard to use SinkV2 properly in production.
> > > > >
> > > > > @Jark
> > > > > > The follow-up step should be trying the new APIs in externalized
> > > > > connectors and giving users the confidence to migrate.
> > > > >
> > > > > I think that is not a fair assessment of the situation. Given that
> > > Kafka,
> > > > > Elasticsearch, FileSink and everything that uses the AsyncSink is on
> > > > SinkV2
> > > > > since Flink 1.15 and the popularity of these connectors, they have
> > > > > definitely already proven themselves. That's not to downplay that bugs
> > > > have
> > > > > no impact and shouldn't be resolved, but I don't think it's fair to
> > > state
> > > > > that SinkV2 is not stable or usable in production.
> > > > >
> > > >
> > > > I believe there exist jobs that are currently using SinkV2 (e.g. Kafka,
> > > > FileSink) in production. The questions are whether 1) SinkV2 is ready to
> > > > support all features of all existing SinkFunction; and 2) whether all
> > > those
> > > > connectors (e.g. Kafka) are ready to support all features of their
> > > > SinkFunction version (e.g. FlinkKafkaProducer) reliable without critical
> > > > bugs (e.g. losing data).
> > > >
> > > > If we can not answer "yes" to the above two questions, then we probably
> > > can
> > > > not say that SinkV2 is usable in production in general, even though we
> > > can
> > > > still say that some source/sink based on SinkV2 is usable in production
> > > in
> > > > selected scenarios (e.g. maybe those jobs that do not use on savepoint).
> > > >
> > > > Maybe we can use FLINK-30238
> > > > <https://issues.apache.org/jira/browse/FLINK-30238> to explain this
> > > point.
> > > > The JIRA's description basically says that "there is the risk of losing
> > > > data if we use stop-with-savepoint with SinkV2". Do you know whether this
> > > > might potentially affect those SinkV2 (e.g. Kafka, FileSink) jobs in
> > > > production when they are restarted with savepoint?
> > > >
> > > >
> > > >
> > > > >
> > > > > Coming back to the original topic of this thread, my proposal would be
> > > > that
> > > > > in Flink 1.17:
> > > > >
> > > > > 1. SinkV2 gets promoted from @PublicEvolving to @Public, which also
> > > > means:
> > > > >
> > > >
> > > > I believe we have agreed to address the issues with SinkV2. Now the
> > > > question is, how likely would the solutions require backward incompatible
> > > > change to SinkV2.
> > > >
> > > > For example, do you think the following questions (mentioned earlier in
> > > > this thread) would require backward incompatible changes?
> > > > - Sink.InitContext still lacks some necessary information to migrate
> > > > existing connectors to new sinks
> > > > - Address FLINK-30238 and FLINK-29459.
> > > > - Find the migration path for the Iceberg sink
> > > >
> > > > IMO, it seems safer to mark an interface as @Public when we are pretty
> > > sure
> > > > that there is no known issue in the near future that would require
> > > > backward-incompatible changes to the interface.
> > > >
> > > > Note that the cost of making an interface @Public lies on the
> > > developer(s)
> > > > who owns the task of updating SinkV2 to address those known issues (e.g.
> > > > migration). Maybe we should find the developers who will own this task
> > > and
> > > > let them decide whether to mark it @Public?
> > > >
> > > >
> > > > a. KafkaSink gets promoted from @PublicEvolving to @Public (which solves
> > > > > the weird thing that FlinkKafkaProducer is already marked as deprecated
> > > > > since Flink 1.14 and points to KafkaSink as a successor)
> > > > > b. FileSink gets promoted from @Experimental to @PublicEvolving (note:
> > > > this
> > > > > has been marked as @Experimental since Flink 1.12, so one could argue
> > > > that
> > > > > this needs to be bumped to @Public too. The reason why I didn't do it
> > > is
> > > > > because of Yuxia's remark that FileSystemTableSink is still using
> > > > > SinkFunction)
> > > > > c. ElasticsearchSink gets promoted from @PublicEvolving to @Public (has
> > > > > been externalized so needs to happen there)
> > > > > d. The AWS connectors (Firehose, Kinesis) get promoted from
> > > > @PublicEvolving
> > > > > to @Public (has been externalized so needs to happen there)
> > > > > e. It's up to the maintainers for DynamoDB and Opensearch to decide
> > > this
> > > > > themselves, given that I believe they've only been introduced recently
> > > > and
> > > > > support only Flink 1.16.
> > > > >
> > > > > 2. For the next Flink release, the community needs to establish what
> > > are
> > > > > features that are currently not available in SinkV2 (e.g. FLIP-287), if
> > > > > such a feature should be made available in SinkV2 (in case it's a
> > > feature
> > > > > which currently resides in SinkFunction, while from new insights we
> > > think
> > > > > it should now be solved differently) and then get these implemented
> > > asap.
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Martijn
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-25726
> > > > >
> > > > >
> > > > > On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com> wrote:
> > > > >
> > > > > > > Regarding the discussion on global committer [1] for sinks with
> > > > global
> > > > > > transactions, there is no consensus on solving that problem in
> > > SinkV2.
> > > > > Will
> > > > > > it require any breaking change in SinkV2?
> > > > > >
> > > > > > Just want to reiterate my earlier question. What is the migration
> > > path
> > > > > for
> > > > > > the Iceberg sink?
> > > > > >
> > > > > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > >
> > > > > > On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Konstantin,
> > > > > > >
> > > > > > > I totally agree with making SinkV2 @Public. I just have concerns
> > > > about
> > > > > > > deprecating SinkFunction at this point. Dong Lin has raised the
> > > > blocker
> > > > > > > issues of migration multiple times in this thread which I think we
> > > > > should
> > > > > > > address first. I don't know why we rush to deprecate SinkFunction
> > > > while
> > > > > > > SourceFunction is still public, but the new Source API is much more
> > > > > > stable
> > > > > > > and production-ready than SinkV2.
> > > > > > >
> > > > > > > Iceberg community raised concerns[1] about the workability and
> > > > > stability
> > > > > > of
> > > > > > > Flink connector APIs.
> > > > > > >
> > > > > > > We are hoping any issues with the APIs for Iceberg connector will
> > > > > surface
> > > > > > > > sooner and get more attention from the Flink community when the
> > > > > > connector
> > > > > > > > is within Flink umbrella rather than in Iceberg repo.
> > > > > > >
> > > > > > >
> > > > > > > The connector externalizing is a big step for building a mechanism
> > > to
> > > > > > > guarantee Flink connector API is stable and workable. The follow-up
> > > > > step
> > > > > > > should be trying the new APIs in externalized connectors and giving
> > > > > users
> > > > > > > the confidence to migrate.
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > > [1]:
> > > > https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> > > > > > >
> > > > > > > On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn>
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi Konstantin,
> > > > > > > > Just FYI, the FileSystemTableSink are still using SinkFunction.
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > > Yuxia
> > > > > > > >
> > > > > > > > ----- 原始邮件 -----
> > > > > > > > 发件人: "Dong Lin" <li...@gmail.com>
> > > > > > > > 收件人: "dev" <de...@flink.apache.org>
> > > > > > > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <myasuka@live.com
> > > >
> > > > > > > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > > > > > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate
> > > > > SinkFunction
> > > > > > > >
> > > > > > > > Hi Konstantin,
> > > > > > > >
> > > > > > > > Thanks for the reply. Please see my comments inline.
> > > > > > > >
> > > > > > > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <
> > > knaufk@apache.org
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Steven,
> > > > > > > > >
> > > > > > > > > Sink is already deprecated. It was deprecated at the moment
> > > where
> > > > > we
> > > > > > > > > introduced SinkV2.
> > > > > > > > >
> > > > > > > > > Hi Jark, Hi Dong,
> > > > > > > > >
> > > > > > > > > My understanding is the SinkV2 is a workable interface. The
> > > most
> > > > > > > > important
> > > > > > > > > connectors have been migrated (Kafka, Filesystem) and more
> > > > > connectors
> > > > > > > > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To
> > > make
> > > > > > > SinkV2
> > > > > > > > > public, it does not need to have all possible functionality.
> > > > Public
> > > > > > > APIs
> > > > > > > > > can be extended. That's what we do all the time. There will
> > > also
> > > > > > always
> > > > > > > > be
> > > > > > > > > bugs. So, these points can not be categorical blockers to
> > > promote
> > > > > the
> > > > > > > > API.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Right. I believe we all agree that we make SinkV2
> > > @PublicEvolving.
> > > > > The
> > > > > > > > concern here is whether we can mark SinkFunction as deprecated at
> > > > > this
> > > > > > > > point.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > What are the specific issues/tickets that are blocking us? Can
> > > we
> > > > > in
> > > > > > > your
> > > > > > > > >
> > > > > > > >
> > > > > > > > For example, Lijie mentioned earlier in this thread that
> > > according
> > > > to
> > > > > > > > FLIP-287,
> > > > > > > > currently the Sink.InitContext still lacks some necessary
> > > > information
> > > > > > to
> > > > > > > > migrate existing connectors to new sinks. This could be a blocker
> > > > > issue
> > > > > > > > since this is related to the SinkV2 API design.
> > > > > > > >
> > > > > > > > And Yuxia mentioned earlier in this thread that there are bugs
> > > such
> > > > > as
> > > > > > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238>
> > > > and
> > > > > > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459>
> > > ,
> > > > > > which
> > > > > > > > makes it hard to use SinkV2 properly in production.  It seems
> > > that
> > > > > > these
> > > > > > > > two bugs are created months ago and are still unresolved or even
> > > > > > > > unassigned. This looks like a clear signal that SinkV2 is not
> > > being
> > > > > > > > actively maintained and used in production.
> > > > > > > >
> > > > > > > >
> > > > > > > > > opinion only deprecate it when every single connector in Apache
> > > > > Flink
> > > > > > > is
> > > > > > > > > migrated already?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Technically this is not a hard requirement. But I would suggest
> > > > that
> > > > > we
> > > > > > > > should migrate all existing connectors so that we eat our own
> > > > dogfood
> > > > > > and
> > > > > > > > prove to users that SinkV2 is ready for use in production.
> > > > > > > >
> > > > > > > >
> > > > > > > > > In my opinion it is the time to ask users to the migrate their
> > > > > > > > connectors.
> > > > > > > > > More importantly, @Deprecated would signal users not to build
> > > new
> > > > > > > > > connectors on SinkFunction. I would arque its also very
> > > > misleading
> > > > > to
> > > > > > > > users
> > > > > > > > > to not @Deprecated SinkFunction given that is clearly will be
> > > > > > > deprecated.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > >
> > > > > > > > > Konstantin
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <
> > > > > imjark@gmail.com
> > > > > > >:
> > > > > > > > >
> > > > > > > > > > I agree with Dong Lin.
> > > > > > > > > >
> > > > > > > > > > Oracle explains how to use Deprecate API [1]:
> > > > > > > > > >
> > > > > > > > > > You are strongly recommended to use the Javadoc @deprecated
> > > tag
> > > > > > with
> > > > > > > > > > > appropriate comments explaining how to use the new API.
> > > This
> > > > > > > ensures
> > > > > > > > > > > developers will *have a workable migration path from the
> > > old
> > > > > API
> > > > > > to
> > > > > > > > the
> > > > > > > > > > > new API*.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > From a user's perspective, the workable migration path is
> > > very
> > > > > > > > important.
> > > > > > > > > > Otherwise, it blurs the semantics of API deprecation. The
> > > Flink
> > > > > > API's
> > > > > > > > > > compatibility and stability issues in the past left a bad
> > > > > > impression
> > > > > > > on
> > > > > > > > > the
> > > > > > > > > > downstream projects. We should be careful when changing and
> > > > > > > deprecating
> > > > > > > > > > APIs, especially when there are known migration gaps. I think
> > > > > it's
> > > > > > a
> > > > > > > > good
> > > > > > > > > > idea to migrate Flink-owned connectors before marking old API
> > > > > > > > deprecated.
> > > > > > > > > > This ensures downstream projects can migrate to new APIs
> > > > > smoothly.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Jark
> > > > > > > > > >
> > > > > > > > > > [1]:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > > > > > > >
> > > > > > > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <stevenz3wu@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Regarding the discussion on global committer [1] for sinks
> > > > with
> > > > > > > > global
> > > > > > > > > > > transactions, there is no consensus on solving that problem
> > > > in
> > > > > > > > SinkV2.
> > > > > > > > > > Will
> > > > > > > > > > > it require any breaking change in SinkV2?
> > > > > > > > > > >
> > > > > > > > > > > Also will SinkV1 be deprecated too? or it should happen
> > > > > sometime
> > > > > > > > after
> > > > > > > > > > > SinkFunction deprecation?
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > > https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > > > > > >
> > > > > > > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <
> > > lindong28@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Konstantin,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the comment! Please see my comment inline.
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Dong
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > > > > > > knaufk@apache.org
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > >
> > > > > > > > > > > > > sorry for joining the discussion late.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1) Is there an option to deprecate SinkFunction in
> > > Flink
> > > > > 1.17
> > > > > > > > while
> > > > > > > > > > > > leaving
> > > > > > > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to
> > > make
> > > > > > > SinkV2
> > > > > > > > > > > @Public
> > > > > > > > > > > > in
> > > > > > > > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving
> > > > are
> > > > > > > > intended
> > > > > > > > > > for
> > > > > > > > > > > > > public use. So, I don't see it as a blocker for
> > > > deprecating
> > > > > > > > > > > SinkFunction
> > > > > > > > > > > > > that we have to make SinkV2 @PublicEvovling. For
> > > > reference
> > > > > > this
> > > > > > > > is
> > > > > > > > > > the
> > > > > > > > > > > > > description of @PublicEvovling:
> > > > > > > > > > > > >
> > > > > > > > > > > > > /**
> > > > > > > > > > > > >  * Annotation to mark classes and methods for public
> > > use,
> > > > > but
> > > > > > > > with
> > > > > > > > > > > > > evolving interfaces.
> > > > > > > > > > > > >  *
> > > > > > > > > > > > >  * <p>Classes and methods with this annotation are
> > > > intended
> > > > > > for
> > > > > > > > > > public
> > > > > > > > > > > > > use and have stable behavior.
> > > > > > > > > > > > >  * However, their interfaces and signatures are not
> > > > > > considered
> > > > > > > to
> > > > > > > > > be
> > > > > > > > > > > > > stable and might be changed
> > > > > > > > > > > > >  * across versions.
> > > > > > > > > > > > >  *
> > > > > > > > > > > > >  * <p>This annotation also excludes methods and classes
> > > > > with
> > > > > > > > > evolving
> > > > > > > > > > > > > interfaces / signatures within
> > > > > > > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > > > > > > >  */
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Marking SinkFunction @Deprecated would already single
> > > > > > everyone
> > > > > > > to
> > > > > > > > > > move
> > > > > > > > > > > to
> > > > > > > > > > > > > SinkV2, which we as a community, I believe, have a
> > > strong
> > > > > > > > interest
> > > > > > > > > > in.
> > > > > > > > > > > > Its
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, I also believe we all have this strong interest. I
> > > > just
> > > > > > hope
> > > > > > > > > that
> > > > > > > > > > > this
> > > > > > > > > > > > can be done in the best possible way that does not
> > > confuse
> > > > > > users.
> > > > > > > > > > > >
> > > > > > > > > > > > I probably still have the same concern regarding its
> > > impact
> > > > > on
> > > > > > > > users:
> > > > > > > > > > if
> > > > > > > > > > > we
> > > > > > > > > > > > mark an API as deprecated, it effectively means the users
> > > > of
> > > > > > this
> > > > > > > > API
> > > > > > > > > > > > should start to migrate to another API (e.g. SinkV2) and
> > > we
> > > > > > might
> > > > > > > > > > remove
> > > > > > > > > > > > this API in the future. However, given that we know there
> > > > are
> > > > > > > known
> > > > > > > > > > > > problems preventing users from doing so, it seems that we
> > > > are
> > > > > > not
> > > > > > > > > ready
> > > > > > > > > > > to
> > > > > > > > > > > > send this message to users right.
> > > > > > > > > > > >
> > > > > > > > > > > > If I understand correctly, I guess you are suggesting
> > > that
> > > > by
> > > > > > > > marking
> > > > > > > > > > > > SinkFunction as deprecated, we can put higher pressure on
> > > > > Flink
> > > > > > > > > > > > contributors to update the existing Flink codebase to
> > > > improve
> > > > > > and
> > > > > > > > use
> > > > > > > > > > > > SinkV2.
> > > > > > > > > > > >
> > > > > > > > > > > > I am not sure this is the right way to use @deprecated,
> > > > which
> > > > > > > has a
> > > > > > > > > > > > particular meaning for its users rather than
> > > contributors.
> > > > > And
> > > > > > I
> > > > > > > am
> > > > > > > > > > also
> > > > > > > > > > > > not sure we can even pressure contributors of an
> > > > open-source
> > > > > > > > project
> > > > > > > > > > into
> > > > > > > > > > > > developing a feature (e.g. migrate all existing
> > > > SinkFunction
> > > > > > > > > subclasses
> > > > > > > > > > > to
> > > > > > > > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > > > > > > > interest/time
> > > > > > > > > > > to
> > > > > > > > > > > > work on the feature, or talk to other contributors
> > > whether
> > > > > they
> > > > > > > are
> > > > > > > > > > > willing
> > > > > > > > > > > > to collaborate/work on this, rather than pressuring other
> > > > > > > > > contributors
> > > > > > > > > > > into
> > > > > > > > > > > > working on this.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > almost comical how long the transition from
> > > > > > > > > > SourceFurnction/SinkFunction
> > > > > > > > > > > to
> > > > > > > > > > > > > Source/Sink takes us. At the same time, we leave
> > > > ourselves
> > > > > > the
> > > > > > > > > option
> > > > > > > > > > > to
> > > > > > > > > > > > to
> > > > > > > > > > > > > make small changes to SinkV2 if any problems arise
> > > during
> > > > > the
> > > > > > > > > > migration
> > > > > > > > > > > > of
> > > > > > > > > > > > > these connector.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think, we have a bit of a chicken/egg problem here.
> > > The
> > > > > > > > pressure
> > > > > > > > > > for
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Similar to the reason described above, I am not sure we
> > > > have
> > > > > a
> > > > > > > > > > > chicken/egg
> > > > > > > > > > > > problem here. The issue here is that SinkV2 is not ready
> > > > and
> > > > > we
> > > > > > > > have
> > > > > > > > > a
> > > > > > > > > > > lot
> > > > > > > > > > > > of existing SinkFunction that is not migrated by
> > > ourselves.
> > > > > We
> > > > > > > > (Flink
> > > > > > > > > > > > contributors) probably do not need to mark SinkFunction
> > > as
> > > > > > > > deprecated
> > > > > > > > > > in
> > > > > > > > > > > > order to address these issues in our own codebase.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > users and contributors is not high enough to move away
> > > from
> > > > > > > > > > SinkFunction
> > > > > > > > > > > as
> > > > > > > > > > > > > long as its not deprecated, but at the same time we
> > > need
> > > > > > people
> > > > > > > > to
> > > > > > > > > > > > migrate
> > > > > > > > > > > > > their connectors to see if there are any gaps in
> > > SinkV2.
> > > > I
> > > > > > > > believe,
> > > > > > > > > > the
> > > > > > > > > > > > > combination proposed above could bridge this problem.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2) I don't understand the argument of waiting until
> > > some
> > > > of
> > > > > > the
> > > > > > > > > > > > > implementations are @Public. How can we make the
> > > > > > > implementations
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > SinkV2 API @Public without making SinkV2 @Public? All
> > > > > public
> > > > > > > > > methods
> > > > > > > > > > of
> > > > > > > > > > > > > SinkV2 are part of every implementation. So to me it
> > > > > actually
> > > > > > > > seems
> > > > > > > > > > to
> > > > > > > > > > > be
> > > > > > > > > > > > > opposite: in order to make any of the implementation
> > > > > @Public
> > > > > > we
> > > > > > > > > first
> > > > > > > > > > > > need
> > > > > > > > > > > > > to make the API @Public.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Yeah I also agree with you.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Konstantin
> > > > > > > > > > > > >
> > > > > > > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > > > > > > > lindong28@gmail.com
> > > > > > > > > > > > >:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Martijn,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for driving this effort to clean-up the Flink
> > > > > > > codebase!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I like the idea to cleanup Flink codebase to avoid
> > > > having
> > > > > > two
> > > > > > > > > > Sinks.
> > > > > > > > > > > On
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > other hand, I also thing the concern mentioned by
> > > Jing
> > > > > > makes
> > > > > > > > > sense.
> > > > > > > > > > > In
> > > > > > > > > > > > > > addition to thinking in terms of the rule proposed in
> > > > > > > FLIP-197
> > > > > > > > > > > > > > <
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > (which
> > > > > > > > > > > > > > seems to focus mostly on the Flink developers'
> > > > > > perspective),
> > > > > > > it
> > > > > > > > > > might
> > > > > > > > > > > > be
> > > > > > > > > > > > > > useful to also think about the story from users'
> > > > > > perspective
> > > > > > > > and
> > > > > > > > > > make
> > > > > > > > > > > > > sure
> > > > > > > > > > > > > > their concerns can be addressed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Typically, by marking an API as deprecated, we are
> > > > > > > effectively
> > > > > > > > > > > telling
> > > > > > > > > > > > > > users *they should start to migrate to the new API
> > > ASAP
> > > > > and
> > > > > > > we
> > > > > > > > > > > reserve
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > right to remove this API completely in the 1-2
> > > > releases*.
> > > > > > > Then
> > > > > > > > it
> > > > > > > > > > > might
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) Does SinkV2 public API provides all the
> > > > > functionalities
> > > > > > > > needed
> > > > > > > > > > to
> > > > > > > > > > > > > > migrate my existing code from subclassing
> > > SinkFunction
> > > > to
> > > > > > > > > > subclassing
> > > > > > > > > > > > > > SinkV2?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) Is the amount of migration work reasonable? If
> > > yes,
> > > > > why
> > > > > > > is a
> > > > > > > > > > class
> > > > > > > > > > > > > such
> > > > > > > > > > > > > > as HBaseSinkFunction in Flink's own codebase still
> > > > > > depending
> > > > > > > on
> > > > > > > > > > > > > > SinkFunction? Maybe Flink developers should eat their
> > > > own
> > > > > > dog
> > > > > > > > > food
> > > > > > > > > > > and
> > > > > > > > > > > > > > migrate (or deprecate) those classes in the Flink
> > > > > codebase
> > > > > > > > first?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Based on the discussion in this thread, I am not sure
> > > > we
> > > > > > have
> > > > > > > > > good
> > > > > > > > > > > > > answers
> > > > > > > > > > > > > > to those questions yet. For the 1st question above,
> > > the
> > > > > > > answer
> > > > > > > > is
> > > > > > > > > > > *no*
> > > > > > > > > > > > > > because we already know that the SinkV2 is currently
> > > > not
> > > > > > able
> > > > > > > > to
> > > > > > > > > > > > support
> > > > > > > > > > > > > > migration for JdbcSink. For the 2nd question above,
> > > we
> > > > > know
> > > > > > > > there
> > > > > > > > > > are
> > > > > > > > > > > > > many
> > > > > > > > > > > > > > non-deprecated class in Flink codebase that are still
> > > > > > > depending
> > > > > > > > > on
> > > > > > > > > > > > > SinkV2.
> > > > > > > > > > > > > > It is probably not nice to users if we tell them to
> > > > > migrate
> > > > > > > > while
> > > > > > > > > > we
> > > > > > > > > > > > know
> > > > > > > > > > > > > > there are existing issues that can prevent them from
> > > > > doing
> > > > > > so
> > > > > > > > > > easily.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In order to move forward to deprecate SinkV2, I think
> > > > it
> > > > > > will
> > > > > > > > be
> > > > > > > > > > > super
> > > > > > > > > > > > > > useful to first migrate all the connectors managed by
> > > > > Flink
> > > > > > > > > > community
> > > > > > > > > > > > > > (including all externalized connectors) to use
> > > SinkV2.
> > > > > This
> > > > > > > > work
> > > > > > > > > > > won't
> > > > > > > > > > > > be
> > > > > > > > > > > > > > wasted since we need to do this anyway. And it will
> > > > also
> > > > > > give
> > > > > > > > us
> > > > > > > > > a
> > > > > > > > > > > > chance
> > > > > > > > > > > > > > to validate the capabilities of SinkV2 and fix bugs
> > > by
> > > > > > > > ourselves
> > > > > > > > > as
> > > > > > > > > > > > much
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > possible.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best Regards,
> > > > > > > > > > > > > > Dong
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > > > > > > > martijnvisser@apache.org
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > While discussing FLIP-281 [1] the discussion also
> > > > > turned
> > > > > > to
> > > > > > > > the
> > > > > > > > > > > > > > > SinkFunction and the SinkV2 API. For a broader
> > > > > discussion
> > > > > > > I'm
> > > > > > > > > > > opening
> > > > > > > > > > > > > up
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > separate discussion thread.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > As Yun Tang has mentioned in that discussion
> > > thread,
> > > > it
> > > > > > > would
> > > > > > > > > be
> > > > > > > > > > a
> > > > > > > > > > > > good
> > > > > > > > > > > > > > > time to deprecate the SinkFunction to avoid the
> > > need
> > > > to
> > > > > > > > > introduce
> > > > > > > > > > > new
> > > > > > > > > > > > > > > functions towards (to be) deprecated APIs. Jing
> > > > > > rightfully
> > > > > > > > > > > mentioned
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > it would be confusing to deprecate the SinkFunction
> > > > if
> > > > > > its
> > > > > > > > > > > successor
> > > > > > > > > > > > is
> > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > yet marked as @Public (it's currently
> > > > @PublicEvolving).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > My proposal would be to promote the SinkV2 API to
> > > > > @public
> > > > > > > in
> > > > > > > > > > Flink
> > > > > > > > > > > > 1.17
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The original Sink interface was introduced in Flink
> > > > > 1.12
> > > > > > > with
> > > > > > > > > > > > FLIP-143
> > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and
> > > has
> > > > > been
> > > > > > > > > > improved
> > > > > > > > > > > on
> > > > > > > > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Looking at the API stability graduation process
> > > [5],
> > > > > the
> > > > > > > fact
> > > > > > > > > > that
> > > > > > > > > > > > Sink
> > > > > > > > > > > > > > V2
> > > > > > > > > > > > > > > was introduced in Flink 1.15 would mean that we
> > > could
> > > > > > > > warrant a
> > > > > > > > > > > > > promotion
> > > > > > > > > > > > > > > to @public already (given that there have been two
> > > > > > releases
> > > > > > > > > with
> > > > > > > > > > > 1.15
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > 1.16 where it was introduced). Combined with the
> > > fact
> > > > > > that
> > > > > > > > > SinkV2
> > > > > > > > > > > has
> > > > > > > > > > > > > > been
> > > > > > > > > > > > > > > the result of iteration over the introduction of
> > > the
> > > > > > > original
> > > > > > > > > > Sink
> > > > > > > > > > > > API
> > > > > > > > > > > > > > > since Flink 1.12, I would argue that the promotion
> > > is
> > > > > > > > overdue.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If we promote the Sink API to @public, I think we
> > > > > should
> > > > > > > also
> > > > > > > > > > > > > immediately
> > > > > > > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Martijn
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > [1]
> > > > > > > > > > >
> > > > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > > > > > > [3]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > > > > > > [4]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > > > > > > [5]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > > > > > https://github.com/knaufk
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > https://github.com/knaufk
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Martijn Visser <ma...@apache.org>.
Hi everyone,

I would like to +1 Gordon's proposal to close FLINK-30238, create a
new follow-up ticket and try to address the specific
PostCommitTopology in the work that's currently being done by Peter on
SinkV2. If there's no feedback on this topic, I assume everyone's OK
with that.

Best regards,

Martijn

On Fri, Sep 29, 2023 at 8:11 AM Tzu-Li (Gordon) Tai <tz...@apache.org> wrote:
>
> Hi everyone,
>
> It’s been a while since this topic was last discussed, but nevertheless, it
> still remains very desirable to figure out a clear path towards making
> SinkV2 @Public.
>
> There’s a new thread [1] that has a pretty good description on missing
> features in SinkV2 from the Iceberg connector’s perspective, which prevents
> them from migrating - anything related to those new requirements, let's
> discuss there.
>
> Nevertheless, I think we should also revive and reuse this thread to reach
> consensus / closure on all concerns already brought up here.
>
> It’s quite a long thread where a lot of various concerns were brought up,
> but I’d start by addressing two very specific ones: FLIP-287 [2] and
> FLINK-30238 [3]
>
> First of all, FLIP-287 has been approved and merged already, and will be
> released with 1.18.0. So, connector migrations that were waiting on this
> should hopefully be unblocked after the release. So this seems to no longer
> be a concern - let’s see things through with those connectors actually
> being migrated.
>
> FLINK-30238 is sort of a confusing one, and I do believe it is (partially)
> a false alarm. After looking into this, the problem reported there
> essentially breaks down to two things:
> 1) TwoPhaseCommittingSink is unable to take a new savepoint after restoring
> from a savepoint generated via `stop-with-savepoint --drain`
> 2) SinkV2 sinks with a PostCommitTopology do not properly have post-commits
> completed after a stop-with-savepoint operation, since committed
> commitables are not emitted to the post-commit topology after the committer
> receives the end-of-input signal.
>
> My latest comment in [3] explains this in a bit more detail.
>
> I believe we can conclude that problem 1) is a non-concern - users should
> not restore from a job that is drained on stop-with-savepoint and cannot
> expect the restored job to function normally.
> Problem 2) remains a real issue though, and to help clear things up I think
> we should close FLINK-30238 in favor of a new ticket scoped to the specific
> PostCommitTopology problem.
>
> The other open concerns seem to mostly be around graduation criteria and
> process - I've yet to go through those and will follow up with a separate
> reply (or perhaps Martijn can help wrap up that part?).
>
> Thanks,
> Gordon
>
> [1] https://lists.apache.org/thread/h3kg7jcgjstpvwlhnofq093vk93ylgsn
> [2]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
> [3] https://issues.apache.org/jira/browse/FLINK-30238
>
> On Mon, Feb 13, 2023 at 2:50 AM Jing Ge <ji...@ververica.com.invalid> wrote:
>
> > @Martijn
> >
> > What I shared previously is the fact of the current KafkaSink. Following
> > your suggestion, the KafkaSink should still be marked as @Experimental for
> > now which will need even longer time to graduate. BTW, KafkaSink does not
> > depend on any @Internal interfaces. The @Internal is used for methods
> > coming from @PublicEvolving SinkV2 interfaces, not interfaces themself.
> > Thanks for bringing this topic up. Currently, there is no rule defined to
> > say that no @Internal is allowed for methods implemented
> > from @PublicEvolving interfaces. Further (off-this-topic) discussion might
> > be required to check if it really makes sense to define such a rule, since
> > some methods defined in interfaces might only be used internally, i.e. no
> > connector user would be aware of them.
> >
> > @Dong
> >
> > I agree with everything you said and especially can't agree more to let
> > developers who will own it make the decision.
> >
> > Best regards,
> > Jing
> >
> >
> > On Sun, Feb 12, 2023 at 2:53 AM Dong Lin <li...@gmail.com> wrote:
> >
> > > Hi Martijn,
> > >
> > > Thanks for the reply. Please see my comments inline.
> > >
> > > Regards,
> > > Dong
> > >
> > > On Sat, Feb 11, 2023 at 4:31 AM Martijn Visser <martijnvisser@apache.org
> > >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I wanted to get back on a couple of comments and have a proposal at the
> > > end
> > > > for the next steps:
> > > >
> > > > @Steven Wu
> > > > If I look back at the original discussion of FLIP-191 and also the
> > thread
> > > > that you're referring to, it appears from the discussion with Yun Gao
> > > that
> > > > the solution was in near-sight, but just not finished. Perhaps it needs
> > > to
> > > > be restarted once more so it can be brought to a closure. Also when I
> > > > looked back at the original introduction of SinkV2, there was
> > FLINK-25726
> > > > [1] which looks like it was built specifically for Iceberg and Delta
> > > Sink?
> > > >
> > > > @Jing
> > > > > All potential unstable methods coming from SinkV2 interfaces will be
> > > kept
> > > > marked as @internal.
> > > >
> > > > This is a situation that I think should be avoided: if it's unstable,
> > it
> > > > should be marked as @Experimental. Connectors should not need to rely
> > > > on @Internal interfaces; if they are needed by a connector, a
> > maintainer
> > > > should create a proposal to expose.
> > > >
> > > > On the production readiness of SinkV2:
> > > >
> > > > @Dong
> > > > > And Yuxia mentioned earlier in this thread that there are bugs such
> > as
> > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> ,
> > which
> > > > makes it hard to use SinkV2 properly in production.
> > > >
> > > > @Jark
> > > > > The follow-up step should be trying the new APIs in externalized
> > > > connectors and giving users the confidence to migrate.
> > > >
> > > > I think that is not a fair assessment of the situation. Given that
> > Kafka,
> > > > Elasticsearch, FileSink and everything that uses the AsyncSink is on
> > > SinkV2
> > > > since Flink 1.15 and the popularity of these connectors, they have
> > > > definitely already proven themselves. That's not to downplay that bugs
> > > have
> > > > no impact and shouldn't be resolved, but I don't think it's fair to
> > state
> > > > that SinkV2 is not stable or usable in production.
> > > >
> > >
> > > I believe there exist jobs that are currently using SinkV2 (e.g. Kafka,
> > > FileSink) in production. The questions are whether 1) SinkV2 is ready to
> > > support all features of all existing SinkFunction; and 2) whether all
> > those
> > > connectors (e.g. Kafka) are ready to support all features of their
> > > SinkFunction version (e.g. FlinkKafkaProducer) reliable without critical
> > > bugs (e.g. losing data).
> > >
> > > If we can not answer "yes" to the above two questions, then we probably
> > can
> > > not say that SinkV2 is usable in production in general, even though we
> > can
> > > still say that some source/sink based on SinkV2 is usable in production
> > in
> > > selected scenarios (e.g. maybe those jobs that do not use on savepoint).
> > >
> > > Maybe we can use FLINK-30238
> > > <https://issues.apache.org/jira/browse/FLINK-30238> to explain this
> > point.
> > > The JIRA's description basically says that "there is the risk of losing
> > > data if we use stop-with-savepoint with SinkV2". Do you know whether this
> > > might potentially affect those SinkV2 (e.g. Kafka, FileSink) jobs in
> > > production when they are restarted with savepoint?
> > >
> > >
> > >
> > > >
> > > > Coming back to the original topic of this thread, my proposal would be
> > > that
> > > > in Flink 1.17:
> > > >
> > > > 1. SinkV2 gets promoted from @PublicEvolving to @Public, which also
> > > means:
> > > >
> > >
> > > I believe we have agreed to address the issues with SinkV2. Now the
> > > question is, how likely would the solutions require backward incompatible
> > > change to SinkV2.
> > >
> > > For example, do you think the following questions (mentioned earlier in
> > > this thread) would require backward incompatible changes?
> > > - Sink.InitContext still lacks some necessary information to migrate
> > > existing connectors to new sinks
> > > - Address FLINK-30238 and FLINK-29459.
> > > - Find the migration path for the Iceberg sink
> > >
> > > IMO, it seems safer to mark an interface as @Public when we are pretty
> > sure
> > > that there is no known issue in the near future that would require
> > > backward-incompatible changes to the interface.
> > >
> > > Note that the cost of making an interface @Public lies on the
> > developer(s)
> > > who owns the task of updating SinkV2 to address those known issues (e.g.
> > > migration). Maybe we should find the developers who will own this task
> > and
> > > let them decide whether to mark it @Public?
> > >
> > >
> > > a. KafkaSink gets promoted from @PublicEvolving to @Public (which solves
> > > > the weird thing that FlinkKafkaProducer is already marked as deprecated
> > > > since Flink 1.14 and points to KafkaSink as a successor)
> > > > b. FileSink gets promoted from @Experimental to @PublicEvolving (note:
> > > this
> > > > has been marked as @Experimental since Flink 1.12, so one could argue
> > > that
> > > > this needs to be bumped to @Public too. The reason why I didn't do it
> > is
> > > > because of Yuxia's remark that FileSystemTableSink is still using
> > > > SinkFunction)
> > > > c. ElasticsearchSink gets promoted from @PublicEvolving to @Public (has
> > > > been externalized so needs to happen there)
> > > > d. The AWS connectors (Firehose, Kinesis) get promoted from
> > > @PublicEvolving
> > > > to @Public (has been externalized so needs to happen there)
> > > > e. It's up to the maintainers for DynamoDB and Opensearch to decide
> > this
> > > > themselves, given that I believe they've only been introduced recently
> > > and
> > > > support only Flink 1.16.
> > > >
> > > > 2. For the next Flink release, the community needs to establish what
> > are
> > > > features that are currently not available in SinkV2 (e.g. FLIP-287), if
> > > > such a feature should be made available in SinkV2 (in case it's a
> > feature
> > > > which currently resides in SinkFunction, while from new insights we
> > think
> > > > it should now be solved differently) and then get these implemented
> > asap.
> > > >
> > > > Best regards,
> > > >
> > > > Martijn
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-25726
> > > >
> > > >
> > > > On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com> wrote:
> > > >
> > > > > > Regarding the discussion on global committer [1] for sinks with
> > > global
> > > > > transactions, there is no consensus on solving that problem in
> > SinkV2.
> > > > Will
> > > > > it require any breaking change in SinkV2?
> > > > >
> > > > > Just want to reiterate my earlier question. What is the migration
> > path
> > > > for
> > > > > the Iceberg sink?
> > > > >
> > > > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > >
> > > > > On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:
> > > > >
> > > > > > Hi Konstantin,
> > > > > >
> > > > > > I totally agree with making SinkV2 @Public. I just have concerns
> > > about
> > > > > > deprecating SinkFunction at this point. Dong Lin has raised the
> > > blocker
> > > > > > issues of migration multiple times in this thread which I think we
> > > > should
> > > > > > address first. I don't know why we rush to deprecate SinkFunction
> > > while
> > > > > > SourceFunction is still public, but the new Source API is much more
> > > > > stable
> > > > > > and production-ready than SinkV2.
> > > > > >
> > > > > > Iceberg community raised concerns[1] about the workability and
> > > > stability
> > > > > of
> > > > > > Flink connector APIs.
> > > > > >
> > > > > > We are hoping any issues with the APIs for Iceberg connector will
> > > > surface
> > > > > > > sooner and get more attention from the Flink community when the
> > > > > connector
> > > > > > > is within Flink umbrella rather than in Iceberg repo.
> > > > > >
> > > > > >
> > > > > > The connector externalizing is a big step for building a mechanism
> > to
> > > > > > guarantee Flink connector API is stable and workable. The follow-up
> > > > step
> > > > > > should be trying the new APIs in externalized connectors and giving
> > > > users
> > > > > > the confidence to migrate.
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > [1]:
> > > https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> > > > > >
> > > > > > On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn>
> > > > wrote:
> > > > > >
> > > > > > > Hi Konstantin,
> > > > > > > Just FYI, the FileSystemTableSink are still using SinkFunction.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Yuxia
> > > > > > >
> > > > > > > ----- 原始邮件 -----
> > > > > > > 发件人: "Dong Lin" <li...@gmail.com>
> > > > > > > 收件人: "dev" <de...@flink.apache.org>
> > > > > > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <myasuka@live.com
> > >
> > > > > > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > > > > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate
> > > > SinkFunction
> > > > > > >
> > > > > > > Hi Konstantin,
> > > > > > >
> > > > > > > Thanks for the reply. Please see my comments inline.
> > > > > > >
> > > > > > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <
> > knaufk@apache.org
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Steven,
> > > > > > > >
> > > > > > > > Sink is already deprecated. It was deprecated at the moment
> > where
> > > > we
> > > > > > > > introduced SinkV2.
> > > > > > > >
> > > > > > > > Hi Jark, Hi Dong,
> > > > > > > >
> > > > > > > > My understanding is the SinkV2 is a workable interface. The
> > most
> > > > > > > important
> > > > > > > > connectors have been migrated (Kafka, Filesystem) and more
> > > > connectors
> > > > > > > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To
> > make
> > > > > > SinkV2
> > > > > > > > public, it does not need to have all possible functionality.
> > > Public
> > > > > > APIs
> > > > > > > > can be extended. That's what we do all the time. There will
> > also
> > > > > always
> > > > > > > be
> > > > > > > > bugs. So, these points can not be categorical blockers to
> > promote
> > > > the
> > > > > > > API.
> > > > > > > >
> > > > > > >
> > > > > > > Right. I believe we all agree that we make SinkV2
> > @PublicEvolving.
> > > > The
> > > > > > > concern here is whether we can mark SinkFunction as deprecated at
> > > > this
> > > > > > > point.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > What are the specific issues/tickets that are blocking us? Can
> > we
> > > > in
> > > > > > your
> > > > > > > >
> > > > > > >
> > > > > > > For example, Lijie mentioned earlier in this thread that
> > according
> > > to
> > > > > > > FLIP-287,
> > > > > > > currently the Sink.InitContext still lacks some necessary
> > > information
> > > > > to
> > > > > > > migrate existing connectors to new sinks. This could be a blocker
> > > > issue
> > > > > > > since this is related to the SinkV2 API design.
> > > > > > >
> > > > > > > And Yuxia mentioned earlier in this thread that there are bugs
> > such
> > > > as
> > > > > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238>
> > > and
> > > > > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459>
> > ,
> > > > > which
> > > > > > > makes it hard to use SinkV2 properly in production.  It seems
> > that
> > > > > these
> > > > > > > two bugs are created months ago and are still unresolved or even
> > > > > > > unassigned. This looks like a clear signal that SinkV2 is not
> > being
> > > > > > > actively maintained and used in production.
> > > > > > >
> > > > > > >
> > > > > > > > opinion only deprecate it when every single connector in Apache
> > > > Flink
> > > > > > is
> > > > > > > > migrated already?
> > > > > > > >
> > > > > > >
> > > > > > > Technically this is not a hard requirement. But I would suggest
> > > that
> > > > we
> > > > > > > should migrate all existing connectors so that we eat our own
> > > dogfood
> > > > > and
> > > > > > > prove to users that SinkV2 is ready for use in production.
> > > > > > >
> > > > > > >
> > > > > > > > In my opinion it is the time to ask users to the migrate their
> > > > > > > connectors.
> > > > > > > > More importantly, @Deprecated would signal users not to build
> > new
> > > > > > > > connectors on SinkFunction. I would arque its also very
> > > misleading
> > > > to
> > > > > > > users
> > > > > > > > to not @Deprecated SinkFunction given that is clearly will be
> > > > > > deprecated.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Cheers,
> > > > > > > >
> > > > > > > > Konstantin
> > > > > > > >
> > > > > > > >
> > > > > > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <
> > > > imjark@gmail.com
> > > > > >:
> > > > > > > >
> > > > > > > > > I agree with Dong Lin.
> > > > > > > > >
> > > > > > > > > Oracle explains how to use Deprecate API [1]:
> > > > > > > > >
> > > > > > > > > You are strongly recommended to use the Javadoc @deprecated
> > tag
> > > > > with
> > > > > > > > > > appropriate comments explaining how to use the new API.
> > This
> > > > > > ensures
> > > > > > > > > > developers will *have a workable migration path from the
> > old
> > > > API
> > > > > to
> > > > > > > the
> > > > > > > > > > new API*.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > From a user's perspective, the workable migration path is
> > very
> > > > > > > important.
> > > > > > > > > Otherwise, it blurs the semantics of API deprecation. The
> > Flink
> > > > > API's
> > > > > > > > > compatibility and stability issues in the past left a bad
> > > > > impression
> > > > > > on
> > > > > > > > the
> > > > > > > > > downstream projects. We should be careful when changing and
> > > > > > deprecating
> > > > > > > > > APIs, especially when there are known migration gaps. I think
> > > > it's
> > > > > a
> > > > > > > good
> > > > > > > > > idea to migrate Flink-owned connectors before marking old API
> > > > > > > deprecated.
> > > > > > > > > This ensures downstream projects can migrate to new APIs
> > > > smoothly.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Jark
> > > > > > > > >
> > > > > > > > > [1]:
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > > > > > >
> > > > > > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <stevenz3wu@gmail.com
> > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Regarding the discussion on global committer [1] for sinks
> > > with
> > > > > > > global
> > > > > > > > > > transactions, there is no consensus on solving that problem
> > > in
> > > > > > > SinkV2.
> > > > > > > > > Will
> > > > > > > > > > it require any breaking change in SinkV2?
> > > > > > > > > >
> > > > > > > > > > Also will SinkV1 be deprecated too? or it should happen
> > > > sometime
> > > > > > > after
> > > > > > > > > > SinkFunction deprecation?
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > > > > >
> > > > > > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <
> > lindong28@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Konstantin,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the comment! Please see my comment inline.
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Dong
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > > > > > knaufk@apache.org
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > >
> > > > > > > > > > > > sorry for joining the discussion late.
> > > > > > > > > > > >
> > > > > > > > > > > > 1) Is there an option to deprecate SinkFunction in
> > Flink
> > > > 1.17
> > > > > > > while
> > > > > > > > > > > leaving
> > > > > > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to
> > make
> > > > > > SinkV2
> > > > > > > > > > @Public
> > > > > > > > > > > in
> > > > > > > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving
> > > are
> > > > > > > intended
> > > > > > > > > for
> > > > > > > > > > > > public use. So, I don't see it as a blocker for
> > > deprecating
> > > > > > > > > > SinkFunction
> > > > > > > > > > > > that we have to make SinkV2 @PublicEvovling. For
> > > reference
> > > > > this
> > > > > > > is
> > > > > > > > > the
> > > > > > > > > > > > description of @PublicEvovling:
> > > > > > > > > > > >
> > > > > > > > > > > > /**
> > > > > > > > > > > >  * Annotation to mark classes and methods for public
> > use,
> > > > but
> > > > > > > with
> > > > > > > > > > > > evolving interfaces.
> > > > > > > > > > > >  *
> > > > > > > > > > > >  * <p>Classes and methods with this annotation are
> > > intended
> > > > > for
> > > > > > > > > public
> > > > > > > > > > > > use and have stable behavior.
> > > > > > > > > > > >  * However, their interfaces and signatures are not
> > > > > considered
> > > > > > to
> > > > > > > > be
> > > > > > > > > > > > stable and might be changed
> > > > > > > > > > > >  * across versions.
> > > > > > > > > > > >  *
> > > > > > > > > > > >  * <p>This annotation also excludes methods and classes
> > > > with
> > > > > > > > evolving
> > > > > > > > > > > > interfaces / signatures within
> > > > > > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > > > > > >  */
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Marking SinkFunction @Deprecated would already single
> > > > > everyone
> > > > > > to
> > > > > > > > > move
> > > > > > > > > > to
> > > > > > > > > > > > SinkV2, which we as a community, I believe, have a
> > strong
> > > > > > > interest
> > > > > > > > > in.
> > > > > > > > > > > Its
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Yes, I also believe we all have this strong interest. I
> > > just
> > > > > hope
> > > > > > > > that
> > > > > > > > > > this
> > > > > > > > > > > can be done in the best possible way that does not
> > confuse
> > > > > users.
> > > > > > > > > > >
> > > > > > > > > > > I probably still have the same concern regarding its
> > impact
> > > > on
> > > > > > > users:
> > > > > > > > > if
> > > > > > > > > > we
> > > > > > > > > > > mark an API as deprecated, it effectively means the users
> > > of
> > > > > this
> > > > > > > API
> > > > > > > > > > > should start to migrate to another API (e.g. SinkV2) and
> > we
> > > > > might
> > > > > > > > > remove
> > > > > > > > > > > this API in the future. However, given that we know there
> > > are
> > > > > > known
> > > > > > > > > > > problems preventing users from doing so, it seems that we
> > > are
> > > > > not
> > > > > > > > ready
> > > > > > > > > > to
> > > > > > > > > > > send this message to users right.
> > > > > > > > > > >
> > > > > > > > > > > If I understand correctly, I guess you are suggesting
> > that
> > > by
> > > > > > > marking
> > > > > > > > > > > SinkFunction as deprecated, we can put higher pressure on
> > > > Flink
> > > > > > > > > > > contributors to update the existing Flink codebase to
> > > improve
> > > > > and
> > > > > > > use
> > > > > > > > > > > SinkV2.
> > > > > > > > > > >
> > > > > > > > > > > I am not sure this is the right way to use @deprecated,
> > > which
> > > > > > has a
> > > > > > > > > > > particular meaning for its users rather than
> > contributors.
> > > > And
> > > > > I
> > > > > > am
> > > > > > > > > also
> > > > > > > > > > > not sure we can even pressure contributors of an
> > > open-source
> > > > > > > project
> > > > > > > > > into
> > > > > > > > > > > developing a feature (e.g. migrate all existing
> > > SinkFunction
> > > > > > > > subclasses
> > > > > > > > > > to
> > > > > > > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > > > > > > interest/time
> > > > > > > > > > to
> > > > > > > > > > > work on the feature, or talk to other contributors
> > whether
> > > > they
> > > > > > are
> > > > > > > > > > willing
> > > > > > > > > > > to collaborate/work on this, rather than pressuring other
> > > > > > > > contributors
> > > > > > > > > > into
> > > > > > > > > > > working on this.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > almost comical how long the transition from
> > > > > > > > > SourceFurnction/SinkFunction
> > > > > > > > > > to
> > > > > > > > > > > > Source/Sink takes us. At the same time, we leave
> > > ourselves
> > > > > the
> > > > > > > > option
> > > > > > > > > > to
> > > > > > > > > > > to
> > > > > > > > > > > > make small changes to SinkV2 if any problems arise
> > during
> > > > the
> > > > > > > > > migration
> > > > > > > > > > > of
> > > > > > > > > > > > these connector.
> > > > > > > > > > > >
> > > > > > > > > > > > I think, we have a bit of a chicken/egg problem here.
> > The
> > > > > > > pressure
> > > > > > > > > for
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Similar to the reason described above, I am not sure we
> > > have
> > > > a
> > > > > > > > > > chicken/egg
> > > > > > > > > > > problem here. The issue here is that SinkV2 is not ready
> > > and
> > > > we
> > > > > > > have
> > > > > > > > a
> > > > > > > > > > lot
> > > > > > > > > > > of existing SinkFunction that is not migrated by
> > ourselves.
> > > > We
> > > > > > > (Flink
> > > > > > > > > > > contributors) probably do not need to mark SinkFunction
> > as
> > > > > > > deprecated
> > > > > > > > > in
> > > > > > > > > > > order to address these issues in our own codebase.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > users and contributors is not high enough to move away
> > from
> > > > > > > > > SinkFunction
> > > > > > > > > > as
> > > > > > > > > > > > long as its not deprecated, but at the same time we
> > need
> > > > > people
> > > > > > > to
> > > > > > > > > > > migrate
> > > > > > > > > > > > their connectors to see if there are any gaps in
> > SinkV2.
> > > I
> > > > > > > believe,
> > > > > > > > > the
> > > > > > > > > > > > combination proposed above could bridge this problem.
> > > > > > > > > > > >
> > > > > > > > > > > > 2) I don't understand the argument of waiting until
> > some
> > > of
> > > > > the
> > > > > > > > > > > > implementations are @Public. How can we make the
> > > > > > implementations
> > > > > > > of
> > > > > > > > > the
> > > > > > > > > > > > SinkV2 API @Public without making SinkV2 @Public? All
> > > > public
> > > > > > > > methods
> > > > > > > > > of
> > > > > > > > > > > > SinkV2 are part of every implementation. So to me it
> > > > actually
> > > > > > > seems
> > > > > > > > > to
> > > > > > > > > > be
> > > > > > > > > > > > opposite: in order to make any of the implementation
> > > > @Public
> > > > > we
> > > > > > > > first
> > > > > > > > > > > need
> > > > > > > > > > > > to make the API @Public.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Yeah I also agree with you.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > >
> > > > > > > > > > > > Konstantin
> > > > > > > > > > > >
> > > > > > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > > > > > > lindong28@gmail.com
> > > > > > > > > > > >:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Martijn,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for driving this effort to clean-up the Flink
> > > > > > codebase!
> > > > > > > > > > > > >
> > > > > > > > > > > > > I like the idea to cleanup Flink codebase to avoid
> > > having
> > > > > two
> > > > > > > > > Sinks.
> > > > > > > > > > On
> > > > > > > > > > > > the
> > > > > > > > > > > > > other hand, I also thing the concern mentioned by
> > Jing
> > > > > makes
> > > > > > > > sense.
> > > > > > > > > > In
> > > > > > > > > > > > > addition to thinking in terms of the rule proposed in
> > > > > > FLIP-197
> > > > > > > > > > > > > <
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > > > >
> > > > > > > > > > > > > (which
> > > > > > > > > > > > > seems to focus mostly on the Flink developers'
> > > > > perspective),
> > > > > > it
> > > > > > > > > might
> > > > > > > > > > > be
> > > > > > > > > > > > > useful to also think about the story from users'
> > > > > perspective
> > > > > > > and
> > > > > > > > > make
> > > > > > > > > > > > sure
> > > > > > > > > > > > > their concerns can be addressed.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Typically, by marking an API as deprecated, we are
> > > > > > effectively
> > > > > > > > > > telling
> > > > > > > > > > > > > users *they should start to migrate to the new API
> > ASAP
> > > > and
> > > > > > we
> > > > > > > > > > reserve
> > > > > > > > > > > > the
> > > > > > > > > > > > > right to remove this API completely in the 1-2
> > > releases*.
> > > > > > Then
> > > > > > > it
> > > > > > > > > > might
> > > > > > > > > > > > be
> > > > > > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1) Does SinkV2 public API provides all the
> > > > functionalities
> > > > > > > needed
> > > > > > > > > to
> > > > > > > > > > > > > migrate my existing code from subclassing
> > SinkFunction
> > > to
> > > > > > > > > subclassing
> > > > > > > > > > > > > SinkV2?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2) Is the amount of migration work reasonable? If
> > yes,
> > > > why
> > > > > > is a
> > > > > > > > > class
> > > > > > > > > > > > such
> > > > > > > > > > > > > as HBaseSinkFunction in Flink's own codebase still
> > > > > depending
> > > > > > on
> > > > > > > > > > > > > SinkFunction? Maybe Flink developers should eat their
> > > own
> > > > > dog
> > > > > > > > food
> > > > > > > > > > and
> > > > > > > > > > > > > migrate (or deprecate) those classes in the Flink
> > > > codebase
> > > > > > > first?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Based on the discussion in this thread, I am not sure
> > > we
> > > > > have
> > > > > > > > good
> > > > > > > > > > > > answers
> > > > > > > > > > > > > to those questions yet. For the 1st question above,
> > the
> > > > > > answer
> > > > > > > is
> > > > > > > > > > *no*
> > > > > > > > > > > > > because we already know that the SinkV2 is currently
> > > not
> > > > > able
> > > > > > > to
> > > > > > > > > > > support
> > > > > > > > > > > > > migration for JdbcSink. For the 2nd question above,
> > we
> > > > know
> > > > > > > there
> > > > > > > > > are
> > > > > > > > > > > > many
> > > > > > > > > > > > > non-deprecated class in Flink codebase that are still
> > > > > > depending
> > > > > > > > on
> > > > > > > > > > > > SinkV2.
> > > > > > > > > > > > > It is probably not nice to users if we tell them to
> > > > migrate
> > > > > > > while
> > > > > > > > > we
> > > > > > > > > > > know
> > > > > > > > > > > > > there are existing issues that can prevent them from
> > > > doing
> > > > > so
> > > > > > > > > easily.
> > > > > > > > > > > > >
> > > > > > > > > > > > > In order to move forward to deprecate SinkV2, I think
> > > it
> > > > > will
> > > > > > > be
> > > > > > > > > > super
> > > > > > > > > > > > > useful to first migrate all the connectors managed by
> > > > Flink
> > > > > > > > > community
> > > > > > > > > > > > > (including all externalized connectors) to use
> > SinkV2.
> > > > This
> > > > > > > work
> > > > > > > > > > won't
> > > > > > > > > > > be
> > > > > > > > > > > > > wasted since we need to do this anyway. And it will
> > > also
> > > > > give
> > > > > > > us
> > > > > > > > a
> > > > > > > > > > > chance
> > > > > > > > > > > > > to validate the capabilities of SinkV2 and fix bugs
> > by
> > > > > > > ourselves
> > > > > > > > as
> > > > > > > > > > > much
> > > > > > > > > > > > as
> > > > > > > > > > > > > possible.
> > > > > > > > > > > > >
> > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best Regards,
> > > > > > > > > > > > > Dong
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > > > > > > martijnvisser@apache.org
> > > > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > While discussing FLIP-281 [1] the discussion also
> > > > turned
> > > > > to
> > > > > > > the
> > > > > > > > > > > > > > SinkFunction and the SinkV2 API. For a broader
> > > > discussion
> > > > > > I'm
> > > > > > > > > > opening
> > > > > > > > > > > > up
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > separate discussion thread.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As Yun Tang has mentioned in that discussion
> > thread,
> > > it
> > > > > > would
> > > > > > > > be
> > > > > > > > > a
> > > > > > > > > > > good
> > > > > > > > > > > > > > time to deprecate the SinkFunction to avoid the
> > need
> > > to
> > > > > > > > introduce
> > > > > > > > > > new
> > > > > > > > > > > > > > functions towards (to be) deprecated APIs. Jing
> > > > > rightfully
> > > > > > > > > > mentioned
> > > > > > > > > > > > that
> > > > > > > > > > > > > > it would be confusing to deprecate the SinkFunction
> > > if
> > > > > its
> > > > > > > > > > successor
> > > > > > > > > > > is
> > > > > > > > > > > > > not
> > > > > > > > > > > > > > yet marked as @Public (it's currently
> > > @PublicEvolving).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > My proposal would be to promote the SinkV2 API to
> > > > @public
> > > > > > in
> > > > > > > > > Flink
> > > > > > > > > > > 1.17
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The original Sink interface was introduced in Flink
> > > > 1.12
> > > > > > with
> > > > > > > > > > > FLIP-143
> > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and
> > has
> > > > been
> > > > > > > > > improved
> > > > > > > > > > on
> > > > > > > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Looking at the API stability graduation process
> > [5],
> > > > the
> > > > > > fact
> > > > > > > > > that
> > > > > > > > > > > Sink
> > > > > > > > > > > > > V2
> > > > > > > > > > > > > > was introduced in Flink 1.15 would mean that we
> > could
> > > > > > > warrant a
> > > > > > > > > > > > promotion
> > > > > > > > > > > > > > to @public already (given that there have been two
> > > > > releases
> > > > > > > > with
> > > > > > > > > > 1.15
> > > > > > > > > > > > and
> > > > > > > > > > > > > > 1.16 where it was introduced). Combined with the
> > fact
> > > > > that
> > > > > > > > SinkV2
> > > > > > > > > > has
> > > > > > > > > > > > > been
> > > > > > > > > > > > > > the result of iteration over the introduction of
> > the
> > > > > > original
> > > > > > > > > Sink
> > > > > > > > > > > API
> > > > > > > > > > > > > > since Flink 1.12, I would argue that the promotion
> > is
> > > > > > > overdue.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If we promote the Sink API to @public, I think we
> > > > should
> > > > > > also
> > > > > > > > > > > > immediately
> > > > > > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Martijn
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1]
> > > > > > > > > >
> > > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > > > > > [3]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > > > > > [4]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > > > > > [5]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > > > > https://github.com/knaufk
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > https://twitter.com/snntrable
> > > > > > > > https://github.com/knaufk
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi everyone,

It’s been a while since this topic was last discussed, but nevertheless, it
still remains very desirable to figure out a clear path towards making
SinkV2 @Public.

There’s a new thread [1] that has a pretty good description on missing
features in SinkV2 from the Iceberg connector’s perspective, which prevents
them from migrating - anything related to those new requirements, let's
discuss there.

Nevertheless, I think we should also revive and reuse this thread to reach
consensus / closure on all concerns already brought up here.

It’s quite a long thread where a lot of various concerns were brought up,
but I’d start by addressing two very specific ones: FLIP-287 [2] and
FLINK-30238 [3]

First of all, FLIP-287 has been approved and merged already, and will be
released with 1.18.0. So, connector migrations that were waiting on this
should hopefully be unblocked after the release. So this seems to no longer
be a concern - let’s see things through with those connectors actually
being migrated.

FLINK-30238 is sort of a confusing one, and I do believe it is (partially)
a false alarm. After looking into this, the problem reported there
essentially breaks down to two things:
1) TwoPhaseCommittingSink is unable to take a new savepoint after restoring
from a savepoint generated via `stop-with-savepoint --drain`
2) SinkV2 sinks with a PostCommitTopology do not properly have post-commits
completed after a stop-with-savepoint operation, since committed
commitables are not emitted to the post-commit topology after the committer
receives the end-of-input signal.

My latest comment in [3] explains this in a bit more detail.

I believe we can conclude that problem 1) is a non-concern - users should
not restore from a job that is drained on stop-with-savepoint and cannot
expect the restored job to function normally.
Problem 2) remains a real issue though, and to help clear things up I think
we should close FLINK-30238 in favor of a new ticket scoped to the specific
PostCommitTopology problem.

The other open concerns seem to mostly be around graduation criteria and
process - I've yet to go through those and will follow up with a separate
reply (or perhaps Martijn can help wrap up that part?).

Thanks,
Gordon

[1] https://lists.apache.org/thread/h3kg7jcgjstpvwlhnofq093vk93ylgsn
[2]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240880853
[3] https://issues.apache.org/jira/browse/FLINK-30238

On Mon, Feb 13, 2023 at 2:50 AM Jing Ge <ji...@ververica.com.invalid> wrote:

> @Martijn
>
> What I shared previously is the fact of the current KafkaSink. Following
> your suggestion, the KafkaSink should still be marked as @Experimental for
> now which will need even longer time to graduate. BTW, KafkaSink does not
> depend on any @Internal interfaces. The @Internal is used for methods
> coming from @PublicEvolving SinkV2 interfaces, not interfaces themself.
> Thanks for bringing this topic up. Currently, there is no rule defined to
> say that no @Internal is allowed for methods implemented
> from @PublicEvolving interfaces. Further (off-this-topic) discussion might
> be required to check if it really makes sense to define such a rule, since
> some methods defined in interfaces might only be used internally, i.e. no
> connector user would be aware of them.
>
> @Dong
>
> I agree with everything you said and especially can't agree more to let
> developers who will own it make the decision.
>
> Best regards,
> Jing
>
>
> On Sun, Feb 12, 2023 at 2:53 AM Dong Lin <li...@gmail.com> wrote:
>
> > Hi Martijn,
> >
> > Thanks for the reply. Please see my comments inline.
> >
> > Regards,
> > Dong
> >
> > On Sat, Feb 11, 2023 at 4:31 AM Martijn Visser <martijnvisser@apache.org
> >
> > wrote:
> >
> > > Hi all,
> > >
> > > I wanted to get back on a couple of comments and have a proposal at the
> > end
> > > for the next steps:
> > >
> > > @Steven Wu
> > > If I look back at the original discussion of FLIP-191 and also the
> thread
> > > that you're referring to, it appears from the discussion with Yun Gao
> > that
> > > the solution was in near-sight, but just not finished. Perhaps it needs
> > to
> > > be restarted once more so it can be brought to a closure. Also when I
> > > looked back at the original introduction of SinkV2, there was
> FLINK-25726
> > > [1] which looks like it was built specifically for Iceberg and Delta
> > Sink?
> > >
> > > @Jing
> > > > All potential unstable methods coming from SinkV2 interfaces will be
> > kept
> > > marked as @internal.
> > >
> > > This is a situation that I think should be avoided: if it's unstable,
> it
> > > should be marked as @Experimental. Connectors should not need to rely
> > > on @Internal interfaces; if they are needed by a connector, a
> maintainer
> > > should create a proposal to expose.
> > >
> > > On the production readiness of SinkV2:
> > >
> > > @Dong
> > > > And Yuxia mentioned earlier in this thread that there are bugs such
> as
> > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> ,
> which
> > > makes it hard to use SinkV2 properly in production.
> > >
> > > @Jark
> > > > The follow-up step should be trying the new APIs in externalized
> > > connectors and giving users the confidence to migrate.
> > >
> > > I think that is not a fair assessment of the situation. Given that
> Kafka,
> > > Elasticsearch, FileSink and everything that uses the AsyncSink is on
> > SinkV2
> > > since Flink 1.15 and the popularity of these connectors, they have
> > > definitely already proven themselves. That's not to downplay that bugs
> > have
> > > no impact and shouldn't be resolved, but I don't think it's fair to
> state
> > > that SinkV2 is not stable or usable in production.
> > >
> >
> > I believe there exist jobs that are currently using SinkV2 (e.g. Kafka,
> > FileSink) in production. The questions are whether 1) SinkV2 is ready to
> > support all features of all existing SinkFunction; and 2) whether all
> those
> > connectors (e.g. Kafka) are ready to support all features of their
> > SinkFunction version (e.g. FlinkKafkaProducer) reliable without critical
> > bugs (e.g. losing data).
> >
> > If we can not answer "yes" to the above two questions, then we probably
> can
> > not say that SinkV2 is usable in production in general, even though we
> can
> > still say that some source/sink based on SinkV2 is usable in production
> in
> > selected scenarios (e.g. maybe those jobs that do not use on savepoint).
> >
> > Maybe we can use FLINK-30238
> > <https://issues.apache.org/jira/browse/FLINK-30238> to explain this
> point.
> > The JIRA's description basically says that "there is the risk of losing
> > data if we use stop-with-savepoint with SinkV2". Do you know whether this
> > might potentially affect those SinkV2 (e.g. Kafka, FileSink) jobs in
> > production when they are restarted with savepoint?
> >
> >
> >
> > >
> > > Coming back to the original topic of this thread, my proposal would be
> > that
> > > in Flink 1.17:
> > >
> > > 1. SinkV2 gets promoted from @PublicEvolving to @Public, which also
> > means:
> > >
> >
> > I believe we have agreed to address the issues with SinkV2. Now the
> > question is, how likely would the solutions require backward incompatible
> > change to SinkV2.
> >
> > For example, do you think the following questions (mentioned earlier in
> > this thread) would require backward incompatible changes?
> > - Sink.InitContext still lacks some necessary information to migrate
> > existing connectors to new sinks
> > - Address FLINK-30238 and FLINK-29459.
> > - Find the migration path for the Iceberg sink
> >
> > IMO, it seems safer to mark an interface as @Public when we are pretty
> sure
> > that there is no known issue in the near future that would require
> > backward-incompatible changes to the interface.
> >
> > Note that the cost of making an interface @Public lies on the
> developer(s)
> > who owns the task of updating SinkV2 to address those known issues (e.g.
> > migration). Maybe we should find the developers who will own this task
> and
> > let them decide whether to mark it @Public?
> >
> >
> > a. KafkaSink gets promoted from @PublicEvolving to @Public (which solves
> > > the weird thing that FlinkKafkaProducer is already marked as deprecated
> > > since Flink 1.14 and points to KafkaSink as a successor)
> > > b. FileSink gets promoted from @Experimental to @PublicEvolving (note:
> > this
> > > has been marked as @Experimental since Flink 1.12, so one could argue
> > that
> > > this needs to be bumped to @Public too. The reason why I didn't do it
> is
> > > because of Yuxia's remark that FileSystemTableSink is still using
> > > SinkFunction)
> > > c. ElasticsearchSink gets promoted from @PublicEvolving to @Public (has
> > > been externalized so needs to happen there)
> > > d. The AWS connectors (Firehose, Kinesis) get promoted from
> > @PublicEvolving
> > > to @Public (has been externalized so needs to happen there)
> > > e. It's up to the maintainers for DynamoDB and Opensearch to decide
> this
> > > themselves, given that I believe they've only been introduced recently
> > and
> > > support only Flink 1.16.
> > >
> > > 2. For the next Flink release, the community needs to establish what
> are
> > > features that are currently not available in SinkV2 (e.g. FLIP-287), if
> > > such a feature should be made available in SinkV2 (in case it's a
> feature
> > > which currently resides in SinkFunction, while from new insights we
> think
> > > it should now be solved differently) and then get these implemented
> asap.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-25726
> > >
> > >
> > > On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com> wrote:
> > >
> > > > > Regarding the discussion on global committer [1] for sinks with
> > global
> > > > transactions, there is no consensus on solving that problem in
> SinkV2.
> > > Will
> > > > it require any breaking change in SinkV2?
> > > >
> > > > Just want to reiterate my earlier question. What is the migration
> path
> > > for
> > > > the Iceberg sink?
> > > >
> > > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > >
> > > > On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:
> > > >
> > > > > Hi Konstantin,
> > > > >
> > > > > I totally agree with making SinkV2 @Public. I just have concerns
> > about
> > > > > deprecating SinkFunction at this point. Dong Lin has raised the
> > blocker
> > > > > issues of migration multiple times in this thread which I think we
> > > should
> > > > > address first. I don't know why we rush to deprecate SinkFunction
> > while
> > > > > SourceFunction is still public, but the new Source API is much more
> > > > stable
> > > > > and production-ready than SinkV2.
> > > > >
> > > > > Iceberg community raised concerns[1] about the workability and
> > > stability
> > > > of
> > > > > Flink connector APIs.
> > > > >
> > > > > We are hoping any issues with the APIs for Iceberg connector will
> > > surface
> > > > > > sooner and get more attention from the Flink community when the
> > > > connector
> > > > > > is within Flink umbrella rather than in Iceberg repo.
> > > > >
> > > > >
> > > > > The connector externalizing is a big step for building a mechanism
> to
> > > > > guarantee Flink connector API is stable and workable. The follow-up
> > > step
> > > > > should be trying the new APIs in externalized connectors and giving
> > > users
> > > > > the confidence to migrate.
> > > > >
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > [1]:
> > https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> > > > >
> > > > > On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn>
> > > wrote:
> > > > >
> > > > > > Hi Konstantin,
> > > > > > Just FYI, the FileSystemTableSink are still using SinkFunction.
> > > > > >
> > > > > > Best regards,
> > > > > > Yuxia
> > > > > >
> > > > > > ----- 原始邮件 -----
> > > > > > 发件人: "Dong Lin" <li...@gmail.com>
> > > > > > 收件人: "dev" <de...@flink.apache.org>
> > > > > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <myasuka@live.com
> >
> > > > > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > > > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate
> > > SinkFunction
> > > > > >
> > > > > > Hi Konstantin,
> > > > > >
> > > > > > Thanks for the reply. Please see my comments inline.
> > > > > >
> > > > > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <
> knaufk@apache.org
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi Steven,
> > > > > > >
> > > > > > > Sink is already deprecated. It was deprecated at the moment
> where
> > > we
> > > > > > > introduced SinkV2.
> > > > > > >
> > > > > > > Hi Jark, Hi Dong,
> > > > > > >
> > > > > > > My understanding is the SinkV2 is a workable interface. The
> most
> > > > > > important
> > > > > > > connectors have been migrated (Kafka, Filesystem) and more
> > > connectors
> > > > > > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To
> make
> > > > > SinkV2
> > > > > > > public, it does not need to have all possible functionality.
> > Public
> > > > > APIs
> > > > > > > can be extended. That's what we do all the time. There will
> also
> > > > always
> > > > > > be
> > > > > > > bugs. So, these points can not be categorical blockers to
> promote
> > > the
> > > > > > API.
> > > > > > >
> > > > > >
> > > > > > Right. I believe we all agree that we make SinkV2
> @PublicEvolving.
> > > The
> > > > > > concern here is whether we can mark SinkFunction as deprecated at
> > > this
> > > > > > point.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > What are the specific issues/tickets that are blocking us? Can
> we
> > > in
> > > > > your
> > > > > > >
> > > > > >
> > > > > > For example, Lijie mentioned earlier in this thread that
> according
> > to
> > > > > > FLIP-287,
> > > > > > currently the Sink.InitContext still lacks some necessary
> > information
> > > > to
> > > > > > migrate existing connectors to new sinks. This could be a blocker
> > > issue
> > > > > > since this is related to the SinkV2 API design.
> > > > > >
> > > > > > And Yuxia mentioned earlier in this thread that there are bugs
> such
> > > as
> > > > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238>
> > and
> > > > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459>
> ,
> > > > which
> > > > > > makes it hard to use SinkV2 properly in production.  It seems
> that
> > > > these
> > > > > > two bugs are created months ago and are still unresolved or even
> > > > > > unassigned. This looks like a clear signal that SinkV2 is not
> being
> > > > > > actively maintained and used in production.
> > > > > >
> > > > > >
> > > > > > > opinion only deprecate it when every single connector in Apache
> > > Flink
> > > > > is
> > > > > > > migrated already?
> > > > > > >
> > > > > >
> > > > > > Technically this is not a hard requirement. But I would suggest
> > that
> > > we
> > > > > > should migrate all existing connectors so that we eat our own
> > dogfood
> > > > and
> > > > > > prove to users that SinkV2 is ready for use in production.
> > > > > >
> > > > > >
> > > > > > > In my opinion it is the time to ask users to the migrate their
> > > > > > connectors.
> > > > > > > More importantly, @Deprecated would signal users not to build
> new
> > > > > > > connectors on SinkFunction. I would arque its also very
> > misleading
> > > to
> > > > > > users
> > > > > > > to not @Deprecated SinkFunction given that is clearly will be
> > > > > deprecated.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Konstantin
> > > > > > >
> > > > > > >
> > > > > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <
> > > imjark@gmail.com
> > > > >:
> > > > > > >
> > > > > > > > I agree with Dong Lin.
> > > > > > > >
> > > > > > > > Oracle explains how to use Deprecate API [1]:
> > > > > > > >
> > > > > > > > You are strongly recommended to use the Javadoc @deprecated
> tag
> > > > with
> > > > > > > > > appropriate comments explaining how to use the new API.
> This
> > > > > ensures
> > > > > > > > > developers will *have a workable migration path from the
> old
> > > API
> > > > to
> > > > > > the
> > > > > > > > > new API*.
> > > > > > > >
> > > > > > > >
> > > > > > > > From a user's perspective, the workable migration path is
> very
> > > > > > important.
> > > > > > > > Otherwise, it blurs the semantics of API deprecation. The
> Flink
> > > > API's
> > > > > > > > compatibility and stability issues in the past left a bad
> > > > impression
> > > > > on
> > > > > > > the
> > > > > > > > downstream projects. We should be careful when changing and
> > > > > deprecating
> > > > > > > > APIs, especially when there are known migration gaps. I think
> > > it's
> > > > a
> > > > > > good
> > > > > > > > idea to migrate Flink-owned connectors before marking old API
> > > > > > deprecated.
> > > > > > > > This ensures downstream projects can migrate to new APIs
> > > smoothly.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jark
> > > > > > > >
> > > > > > > > [1]:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > > > > >
> > > > > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <stevenz3wu@gmail.com
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Regarding the discussion on global committer [1] for sinks
> > with
> > > > > > global
> > > > > > > > > transactions, there is no consensus on solving that problem
> > in
> > > > > > SinkV2.
> > > > > > > > Will
> > > > > > > > > it require any breaking change in SinkV2?
> > > > > > > > >
> > > > > > > > > Also will SinkV1 be deprecated too? or it should happen
> > > sometime
> > > > > > after
> > > > > > > > > SinkFunction deprecation?
> > > > > > > > >
> > > > > > > > > [1]
> > > > > https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > > > >
> > > > > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <
> lindong28@gmail.com
> > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Konstantin,
> > > > > > > > > >
> > > > > > > > > > Thanks for the comment! Please see my comment inline.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Dong
> > > > > > > > > >
> > > > > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > > > > knaufk@apache.org
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > sorry for joining the discussion late.
> > > > > > > > > > >
> > > > > > > > > > > 1) Is there an option to deprecate SinkFunction in
> Flink
> > > 1.17
> > > > > > while
> > > > > > > > > > leaving
> > > > > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to
> make
> > > > > SinkV2
> > > > > > > > > @Public
> > > > > > > > > > in
> > > > > > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving
> > are
> > > > > > intended
> > > > > > > > for
> > > > > > > > > > > public use. So, I don't see it as a blocker for
> > deprecating
> > > > > > > > > SinkFunction
> > > > > > > > > > > that we have to make SinkV2 @PublicEvovling. For
> > reference
> > > > this
> > > > > > is
> > > > > > > > the
> > > > > > > > > > > description of @PublicEvovling:
> > > > > > > > > > >
> > > > > > > > > > > /**
> > > > > > > > > > >  * Annotation to mark classes and methods for public
> use,
> > > but
> > > > > > with
> > > > > > > > > > > evolving interfaces.
> > > > > > > > > > >  *
> > > > > > > > > > >  * <p>Classes and methods with this annotation are
> > intended
> > > > for
> > > > > > > > public
> > > > > > > > > > > use and have stable behavior.
> > > > > > > > > > >  * However, their interfaces and signatures are not
> > > > considered
> > > > > to
> > > > > > > be
> > > > > > > > > > > stable and might be changed
> > > > > > > > > > >  * across versions.
> > > > > > > > > > >  *
> > > > > > > > > > >  * <p>This annotation also excludes methods and classes
> > > with
> > > > > > > evolving
> > > > > > > > > > > interfaces / signatures within
> > > > > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > > > > >  */
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Marking SinkFunction @Deprecated would already single
> > > > everyone
> > > > > to
> > > > > > > > move
> > > > > > > > > to
> > > > > > > > > > > SinkV2, which we as a community, I believe, have a
> strong
> > > > > > interest
> > > > > > > > in.
> > > > > > > > > > Its
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes, I also believe we all have this strong interest. I
> > just
> > > > hope
> > > > > > > that
> > > > > > > > > this
> > > > > > > > > > can be done in the best possible way that does not
> confuse
> > > > users.
> > > > > > > > > >
> > > > > > > > > > I probably still have the same concern regarding its
> impact
> > > on
> > > > > > users:
> > > > > > > > if
> > > > > > > > > we
> > > > > > > > > > mark an API as deprecated, it effectively means the users
> > of
> > > > this
> > > > > > API
> > > > > > > > > > should start to migrate to another API (e.g. SinkV2) and
> we
> > > > might
> > > > > > > > remove
> > > > > > > > > > this API in the future. However, given that we know there
> > are
> > > > > known
> > > > > > > > > > problems preventing users from doing so, it seems that we
> > are
> > > > not
> > > > > > > ready
> > > > > > > > > to
> > > > > > > > > > send this message to users right.
> > > > > > > > > >
> > > > > > > > > > If I understand correctly, I guess you are suggesting
> that
> > by
> > > > > > marking
> > > > > > > > > > SinkFunction as deprecated, we can put higher pressure on
> > > Flink
> > > > > > > > > > contributors to update the existing Flink codebase to
> > improve
> > > > and
> > > > > > use
> > > > > > > > > > SinkV2.
> > > > > > > > > >
> > > > > > > > > > I am not sure this is the right way to use @deprecated,
> > which
> > > > > has a
> > > > > > > > > > particular meaning for its users rather than
> contributors.
> > > And
> > > > I
> > > > > am
> > > > > > > > also
> > > > > > > > > > not sure we can even pressure contributors of an
> > open-source
> > > > > > project
> > > > > > > > into
> > > > > > > > > > developing a feature (e.g. migrate all existing
> > SinkFunction
> > > > > > > subclasses
> > > > > > > > > to
> > > > > > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > > > > > interest/time
> > > > > > > > > to
> > > > > > > > > > work on the feature, or talk to other contributors
> whether
> > > they
> > > > > are
> > > > > > > > > willing
> > > > > > > > > > to collaborate/work on this, rather than pressuring other
> > > > > > > contributors
> > > > > > > > > into
> > > > > > > > > > working on this.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > almost comical how long the transition from
> > > > > > > > SourceFurnction/SinkFunction
> > > > > > > > > to
> > > > > > > > > > > Source/Sink takes us. At the same time, we leave
> > ourselves
> > > > the
> > > > > > > option
> > > > > > > > > to
> > > > > > > > > > to
> > > > > > > > > > > make small changes to SinkV2 if any problems arise
> during
> > > the
> > > > > > > > migration
> > > > > > > > > > of
> > > > > > > > > > > these connector.
> > > > > > > > > > >
> > > > > > > > > > > I think, we have a bit of a chicken/egg problem here.
> The
> > > > > > pressure
> > > > > > > > for
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Similar to the reason described above, I am not sure we
> > have
> > > a
> > > > > > > > > chicken/egg
> > > > > > > > > > problem here. The issue here is that SinkV2 is not ready
> > and
> > > we
> > > > > > have
> > > > > > > a
> > > > > > > > > lot
> > > > > > > > > > of existing SinkFunction that is not migrated by
> ourselves.
> > > We
> > > > > > (Flink
> > > > > > > > > > contributors) probably do not need to mark SinkFunction
> as
> > > > > > deprecated
> > > > > > > > in
> > > > > > > > > > order to address these issues in our own codebase.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > users and contributors is not high enough to move away
> from
> > > > > > > > SinkFunction
> > > > > > > > > as
> > > > > > > > > > > long as its not deprecated, but at the same time we
> need
> > > > people
> > > > > > to
> > > > > > > > > > migrate
> > > > > > > > > > > their connectors to see if there are any gaps in
> SinkV2.
> > I
> > > > > > believe,
> > > > > > > > the
> > > > > > > > > > > combination proposed above could bridge this problem.
> > > > > > > > > > >
> > > > > > > > > > > 2) I don't understand the argument of waiting until
> some
> > of
> > > > the
> > > > > > > > > > > implementations are @Public. How can we make the
> > > > > implementations
> > > > > > of
> > > > > > > > the
> > > > > > > > > > > SinkV2 API @Public without making SinkV2 @Public? All
> > > public
> > > > > > > methods
> > > > > > > > of
> > > > > > > > > > > SinkV2 are part of every implementation. So to me it
> > > actually
> > > > > > seems
> > > > > > > > to
> > > > > > > > > be
> > > > > > > > > > > opposite: in order to make any of the implementation
> > > @Public
> > > > we
> > > > > > > first
> > > > > > > > > > need
> > > > > > > > > > > to make the API @Public.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yeah I also agree with you.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > >
> > > > > > > > > > > Konstantin
> > > > > > > > > > >
> > > > > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > > > > > lindong28@gmail.com
> > > > > > > > > > >:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Martijn,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for driving this effort to clean-up the Flink
> > > > > codebase!
> > > > > > > > > > > >
> > > > > > > > > > > > I like the idea to cleanup Flink codebase to avoid
> > having
> > > > two
> > > > > > > > Sinks.
> > > > > > > > > On
> > > > > > > > > > > the
> > > > > > > > > > > > other hand, I also thing the concern mentioned by
> Jing
> > > > makes
> > > > > > > sense.
> > > > > > > > > In
> > > > > > > > > > > > addition to thinking in terms of the rule proposed in
> > > > > FLIP-197
> > > > > > > > > > > > <
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > > >
> > > > > > > > > > > > (which
> > > > > > > > > > > > seems to focus mostly on the Flink developers'
> > > > perspective),
> > > > > it
> > > > > > > > might
> > > > > > > > > > be
> > > > > > > > > > > > useful to also think about the story from users'
> > > > perspective
> > > > > > and
> > > > > > > > make
> > > > > > > > > > > sure
> > > > > > > > > > > > their concerns can be addressed.
> > > > > > > > > > > >
> > > > > > > > > > > > Typically, by marking an API as deprecated, we are
> > > > > effectively
> > > > > > > > > telling
> > > > > > > > > > > > users *they should start to migrate to the new API
> ASAP
> > > and
> > > > > we
> > > > > > > > > reserve
> > > > > > > > > > > the
> > > > > > > > > > > > right to remove this API completely in the 1-2
> > releases*.
> > > > > Then
> > > > > > it
> > > > > > > > > might
> > > > > > > > > > > be
> > > > > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > > > > >
> > > > > > > > > > > > 1) Does SinkV2 public API provides all the
> > > functionalities
> > > > > > needed
> > > > > > > > to
> > > > > > > > > > > > migrate my existing code from subclassing
> SinkFunction
> > to
> > > > > > > > subclassing
> > > > > > > > > > > > SinkV2?
> > > > > > > > > > > >
> > > > > > > > > > > > 2) Is the amount of migration work reasonable? If
> yes,
> > > why
> > > > > is a
> > > > > > > > class
> > > > > > > > > > > such
> > > > > > > > > > > > as HBaseSinkFunction in Flink's own codebase still
> > > > depending
> > > > > on
> > > > > > > > > > > > SinkFunction? Maybe Flink developers should eat their
> > own
> > > > dog
> > > > > > > food
> > > > > > > > > and
> > > > > > > > > > > > migrate (or deprecate) those classes in the Flink
> > > codebase
> > > > > > first?
> > > > > > > > > > > >
> > > > > > > > > > > > Based on the discussion in this thread, I am not sure
> > we
> > > > have
> > > > > > > good
> > > > > > > > > > > answers
> > > > > > > > > > > > to those questions yet. For the 1st question above,
> the
> > > > > answer
> > > > > > is
> > > > > > > > > *no*
> > > > > > > > > > > > because we already know that the SinkV2 is currently
> > not
> > > > able
> > > > > > to
> > > > > > > > > > support
> > > > > > > > > > > > migration for JdbcSink. For the 2nd question above,
> we
> > > know
> > > > > > there
> > > > > > > > are
> > > > > > > > > > > many
> > > > > > > > > > > > non-deprecated class in Flink codebase that are still
> > > > > depending
> > > > > > > on
> > > > > > > > > > > SinkV2.
> > > > > > > > > > > > It is probably not nice to users if we tell them to
> > > migrate
> > > > > > while
> > > > > > > > we
> > > > > > > > > > know
> > > > > > > > > > > > there are existing issues that can prevent them from
> > > doing
> > > > so
> > > > > > > > easily.
> > > > > > > > > > > >
> > > > > > > > > > > > In order to move forward to deprecate SinkV2, I think
> > it
> > > > will
> > > > > > be
> > > > > > > > > super
> > > > > > > > > > > > useful to first migrate all the connectors managed by
> > > Flink
> > > > > > > > community
> > > > > > > > > > > > (including all externalized connectors) to use
> SinkV2.
> > > This
> > > > > > work
> > > > > > > > > won't
> > > > > > > > > > be
> > > > > > > > > > > > wasted since we need to do this anyway. And it will
> > also
> > > > give
> > > > > > us
> > > > > > > a
> > > > > > > > > > chance
> > > > > > > > > > > > to validate the capabilities of SinkV2 and fix bugs
> by
> > > > > > ourselves
> > > > > > > as
> > > > > > > > > > much
> > > > > > > > > > > as
> > > > > > > > > > > > possible.
> > > > > > > > > > > >
> > > > > > > > > > > > What do you think?
> > > > > > > > > > > >
> > > > > > > > > > > > Best Regards,
> > > > > > > > > > > > Dong
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > > > > > martijnvisser@apache.org
> > > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > >
> > > > > > > > > > > > > While discussing FLIP-281 [1] the discussion also
> > > turned
> > > > to
> > > > > > the
> > > > > > > > > > > > > SinkFunction and the SinkV2 API. For a broader
> > > discussion
> > > > > I'm
> > > > > > > > > opening
> > > > > > > > > > > up
> > > > > > > > > > > > a
> > > > > > > > > > > > > separate discussion thread.
> > > > > > > > > > > > >
> > > > > > > > > > > > > As Yun Tang has mentioned in that discussion
> thread,
> > it
> > > > > would
> > > > > > > be
> > > > > > > > a
> > > > > > > > > > good
> > > > > > > > > > > > > time to deprecate the SinkFunction to avoid the
> need
> > to
> > > > > > > introduce
> > > > > > > > > new
> > > > > > > > > > > > > functions towards (to be) deprecated APIs. Jing
> > > > rightfully
> > > > > > > > > mentioned
> > > > > > > > > > > that
> > > > > > > > > > > > > it would be confusing to deprecate the SinkFunction
> > if
> > > > its
> > > > > > > > > successor
> > > > > > > > > > is
> > > > > > > > > > > > not
> > > > > > > > > > > > > yet marked as @Public (it's currently
> > @PublicEvolving).
> > > > > > > > > > > > >
> > > > > > > > > > > > > My proposal would be to promote the SinkV2 API to
> > > @public
> > > > > in
> > > > > > > > Flink
> > > > > > > > > > 1.17
> > > > > > > > > > > > and
> > > > > > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > > > > > >
> > > > > > > > > > > > > The original Sink interface was introduced in Flink
> > > 1.12
> > > > > with
> > > > > > > > > > FLIP-143
> > > > > > > > > > > > [2]
> > > > > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and
> has
> > > been
> > > > > > > > improved
> > > > > > > > > on
> > > > > > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > > > > > >
> > > > > > > > > > > > > Looking at the API stability graduation process
> [5],
> > > the
> > > > > fact
> > > > > > > > that
> > > > > > > > > > Sink
> > > > > > > > > > > > V2
> > > > > > > > > > > > > was introduced in Flink 1.15 would mean that we
> could
> > > > > > warrant a
> > > > > > > > > > > promotion
> > > > > > > > > > > > > to @public already (given that there have been two
> > > > releases
> > > > > > > with
> > > > > > > > > 1.15
> > > > > > > > > > > and
> > > > > > > > > > > > > 1.16 where it was introduced). Combined with the
> fact
> > > > that
> > > > > > > SinkV2
> > > > > > > > > has
> > > > > > > > > > > > been
> > > > > > > > > > > > > the result of iteration over the introduction of
> the
> > > > > original
> > > > > > > > Sink
> > > > > > > > > > API
> > > > > > > > > > > > > since Flink 1.12, I would argue that the promotion
> is
> > > > > > overdue.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If we promote the Sink API to @public, I think we
> > > should
> > > > > also
> > > > > > > > > > > immediately
> > > > > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Martijn
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1]
> > > > > > > > >
> > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > > > > [2]
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > > > > [3]
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > > > > [4]
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > > > > [5]
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > > > https://github.com/knaufk
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > https://twitter.com/snntrable
> > > > > > > https://github.com/knaufk
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Jing Ge <ji...@ververica.com.INVALID>.
@Martijn

What I shared previously is the fact of the current KafkaSink. Following
your suggestion, the KafkaSink should still be marked as @Experimental for
now which will need even longer time to graduate. BTW, KafkaSink does not
depend on any @Internal interfaces. The @Internal is used for methods
coming from @PublicEvolving SinkV2 interfaces, not interfaces themself.
Thanks for bringing this topic up. Currently, there is no rule defined to
say that no @Internal is allowed for methods implemented
from @PublicEvolving interfaces. Further (off-this-topic) discussion might
be required to check if it really makes sense to define such a rule, since
some methods defined in interfaces might only be used internally, i.e. no
connector user would be aware of them.

@Dong

I agree with everything you said and especially can't agree more to let
developers who will own it make the decision.

Best regards,
Jing


On Sun, Feb 12, 2023 at 2:53 AM Dong Lin <li...@gmail.com> wrote:

> Hi Martijn,
>
> Thanks for the reply. Please see my comments inline.
>
> Regards,
> Dong
>
> On Sat, Feb 11, 2023 at 4:31 AM Martijn Visser <ma...@apache.org>
> wrote:
>
> > Hi all,
> >
> > I wanted to get back on a couple of comments and have a proposal at the
> end
> > for the next steps:
> >
> > @Steven Wu
> > If I look back at the original discussion of FLIP-191 and also the thread
> > that you're referring to, it appears from the discussion with Yun Gao
> that
> > the solution was in near-sight, but just not finished. Perhaps it needs
> to
> > be restarted once more so it can be brought to a closure. Also when I
> > looked back at the original introduction of SinkV2, there was FLINK-25726
> > [1] which looks like it was built specifically for Iceberg and Delta
> Sink?
> >
> > @Jing
> > > All potential unstable methods coming from SinkV2 interfaces will be
> kept
> > marked as @internal.
> >
> > This is a situation that I think should be avoided: if it's unstable, it
> > should be marked as @Experimental. Connectors should not need to rely
> > on @Internal interfaces; if they are needed by a connector, a maintainer
> > should create a proposal to expose.
> >
> > On the production readiness of SinkV2:
> >
> > @Dong
> > > And Yuxia mentioned earlier in this thread that there are bugs such as
> > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
> > makes it hard to use SinkV2 properly in production.
> >
> > @Jark
> > > The follow-up step should be trying the new APIs in externalized
> > connectors and giving users the confidence to migrate.
> >
> > I think that is not a fair assessment of the situation. Given that Kafka,
> > Elasticsearch, FileSink and everything that uses the AsyncSink is on
> SinkV2
> > since Flink 1.15 and the popularity of these connectors, they have
> > definitely already proven themselves. That's not to downplay that bugs
> have
> > no impact and shouldn't be resolved, but I don't think it's fair to state
> > that SinkV2 is not stable or usable in production.
> >
>
> I believe there exist jobs that are currently using SinkV2 (e.g. Kafka,
> FileSink) in production. The questions are whether 1) SinkV2 is ready to
> support all features of all existing SinkFunction; and 2) whether all those
> connectors (e.g. Kafka) are ready to support all features of their
> SinkFunction version (e.g. FlinkKafkaProducer) reliable without critical
> bugs (e.g. losing data).
>
> If we can not answer "yes" to the above two questions, then we probably can
> not say that SinkV2 is usable in production in general, even though we can
> still say that some source/sink based on SinkV2 is usable in production in
> selected scenarios (e.g. maybe those jobs that do not use on savepoint).
>
> Maybe we can use FLINK-30238
> <https://issues.apache.org/jira/browse/FLINK-30238> to explain this point.
> The JIRA's description basically says that "there is the risk of losing
> data if we use stop-with-savepoint with SinkV2". Do you know whether this
> might potentially affect those SinkV2 (e.g. Kafka, FileSink) jobs in
> production when they are restarted with savepoint?
>
>
>
> >
> > Coming back to the original topic of this thread, my proposal would be
> that
> > in Flink 1.17:
> >
> > 1. SinkV2 gets promoted from @PublicEvolving to @Public, which also
> means:
> >
>
> I believe we have agreed to address the issues with SinkV2. Now the
> question is, how likely would the solutions require backward incompatible
> change to SinkV2.
>
> For example, do you think the following questions (mentioned earlier in
> this thread) would require backward incompatible changes?
> - Sink.InitContext still lacks some necessary information to migrate
> existing connectors to new sinks
> - Address FLINK-30238 and FLINK-29459.
> - Find the migration path for the Iceberg sink
>
> IMO, it seems safer to mark an interface as @Public when we are pretty sure
> that there is no known issue in the near future that would require
> backward-incompatible changes to the interface.
>
> Note that the cost of making an interface @Public lies on the developer(s)
> who owns the task of updating SinkV2 to address those known issues (e.g.
> migration). Maybe we should find the developers who will own this task and
> let them decide whether to mark it @Public?
>
>
> a. KafkaSink gets promoted from @PublicEvolving to @Public (which solves
> > the weird thing that FlinkKafkaProducer is already marked as deprecated
> > since Flink 1.14 and points to KafkaSink as a successor)
> > b. FileSink gets promoted from @Experimental to @PublicEvolving (note:
> this
> > has been marked as @Experimental since Flink 1.12, so one could argue
> that
> > this needs to be bumped to @Public too. The reason why I didn't do it is
> > because of Yuxia's remark that FileSystemTableSink is still using
> > SinkFunction)
> > c. ElasticsearchSink gets promoted from @PublicEvolving to @Public (has
> > been externalized so needs to happen there)
> > d. The AWS connectors (Firehose, Kinesis) get promoted from
> @PublicEvolving
> > to @Public (has been externalized so needs to happen there)
> > e. It's up to the maintainers for DynamoDB and Opensearch to decide this
> > themselves, given that I believe they've only been introduced recently
> and
> > support only Flink 1.16.
> >
> > 2. For the next Flink release, the community needs to establish what are
> > features that are currently not available in SinkV2 (e.g. FLIP-287), if
> > such a feature should be made available in SinkV2 (in case it's a feature
> > which currently resides in SinkFunction, while from new insights we think
> > it should now be solved differently) and then get these implemented asap.
> >
> > Best regards,
> >
> > Martijn
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-25726
> >
> >
> > On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com> wrote:
> >
> > > > Regarding the discussion on global committer [1] for sinks with
> global
> > > transactions, there is no consensus on solving that problem in SinkV2.
> > Will
> > > it require any breaking change in SinkV2?
> > >
> > > Just want to reiterate my earlier question. What is the migration path
> > for
> > > the Iceberg sink?
> > >
> > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > >
> > > On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:
> > >
> > > > Hi Konstantin,
> > > >
> > > > I totally agree with making SinkV2 @Public. I just have concerns
> about
> > > > deprecating SinkFunction at this point. Dong Lin has raised the
> blocker
> > > > issues of migration multiple times in this thread which I think we
> > should
> > > > address first. I don't know why we rush to deprecate SinkFunction
> while
> > > > SourceFunction is still public, but the new Source API is much more
> > > stable
> > > > and production-ready than SinkV2.
> > > >
> > > > Iceberg community raised concerns[1] about the workability and
> > stability
> > > of
> > > > Flink connector APIs.
> > > >
> > > > We are hoping any issues with the APIs for Iceberg connector will
> > surface
> > > > > sooner and get more attention from the Flink community when the
> > > connector
> > > > > is within Flink umbrella rather than in Iceberg repo.
> > > >
> > > >
> > > > The connector externalizing is a big step for building a mechanism to
> > > > guarantee Flink connector API is stable and workable. The follow-up
> > step
> > > > should be trying the new APIs in externalized connectors and giving
> > users
> > > > the confidence to migrate.
> > > >
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > [1]:
> https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> > > >
> > > > On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn>
> > wrote:
> > > >
> > > > > Hi Konstantin,
> > > > > Just FYI, the FileSystemTableSink are still using SinkFunction.
> > > > >
> > > > > Best regards,
> > > > > Yuxia
> > > > >
> > > > > ----- 原始邮件 -----
> > > > > 发件人: "Dong Lin" <li...@gmail.com>
> > > > > 收件人: "dev" <de...@flink.apache.org>
> > > > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <my...@live.com>
> > > > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate
> > SinkFunction
> > > > >
> > > > > Hi Konstantin,
> > > > >
> > > > > Thanks for the reply. Please see my comments inline.
> > > > >
> > > > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <knaufk@apache.org
> >
> > > > wrote:
> > > > >
> > > > > > Hi Steven,
> > > > > >
> > > > > > Sink is already deprecated. It was deprecated at the moment where
> > we
> > > > > > introduced SinkV2.
> > > > > >
> > > > > > Hi Jark, Hi Dong,
> > > > > >
> > > > > > My understanding is the SinkV2 is a workable interface. The most
> > > > > important
> > > > > > connectors have been migrated (Kafka, Filesystem) and more
> > connectors
> > > > > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make
> > > > SinkV2
> > > > > > public, it does not need to have all possible functionality.
> Public
> > > > APIs
> > > > > > can be extended. That's what we do all the time. There will also
> > > always
> > > > > be
> > > > > > bugs. So, these points can not be categorical blockers to promote
> > the
> > > > > API.
> > > > > >
> > > > >
> > > > > Right. I believe we all agree that we make SinkV2 @PublicEvolving.
> > The
> > > > > concern here is whether we can mark SinkFunction as deprecated at
> > this
> > > > > point.
> > > > >
> > > > >
> > > > > >
> > > > > > What are the specific issues/tickets that are blocking us? Can we
> > in
> > > > your
> > > > > >
> > > > >
> > > > > For example, Lijie mentioned earlier in this thread that according
> to
> > > > > FLIP-287,
> > > > > currently the Sink.InitContext still lacks some necessary
> information
> > > to
> > > > > migrate existing connectors to new sinks. This could be a blocker
> > issue
> > > > > since this is related to the SinkV2 API design.
> > > > >
> > > > > And Yuxia mentioned earlier in this thread that there are bugs such
> > as
> > > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238>
> and
> > > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> ,
> > > which
> > > > > makes it hard to use SinkV2 properly in production.  It seems that
> > > these
> > > > > two bugs are created months ago and are still unresolved or even
> > > > > unassigned. This looks like a clear signal that SinkV2 is not being
> > > > > actively maintained and used in production.
> > > > >
> > > > >
> > > > > > opinion only deprecate it when every single connector in Apache
> > Flink
> > > > is
> > > > > > migrated already?
> > > > > >
> > > > >
> > > > > Technically this is not a hard requirement. But I would suggest
> that
> > we
> > > > > should migrate all existing connectors so that we eat our own
> dogfood
> > > and
> > > > > prove to users that SinkV2 is ready for use in production.
> > > > >
> > > > >
> > > > > > In my opinion it is the time to ask users to the migrate their
> > > > > connectors.
> > > > > > More importantly, @Deprecated would signal users not to build new
> > > > > > connectors on SinkFunction. I would arque its also very
> misleading
> > to
> > > > > users
> > > > > > to not @Deprecated SinkFunction given that is clearly will be
> > > > deprecated.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > > >
> > > > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <
> > imjark@gmail.com
> > > >:
> > > > > >
> > > > > > > I agree with Dong Lin.
> > > > > > >
> > > > > > > Oracle explains how to use Deprecate API [1]:
> > > > > > >
> > > > > > > You are strongly recommended to use the Javadoc @deprecated tag
> > > with
> > > > > > > > appropriate comments explaining how to use the new API. This
> > > > ensures
> > > > > > > > developers will *have a workable migration path from the old
> > API
> > > to
> > > > > the
> > > > > > > > new API*.
> > > > > > >
> > > > > > >
> > > > > > > From a user's perspective, the workable migration path is very
> > > > > important.
> > > > > > > Otherwise, it blurs the semantics of API deprecation. The Flink
> > > API's
> > > > > > > compatibility and stability issues in the past left a bad
> > > impression
> > > > on
> > > > > > the
> > > > > > > downstream projects. We should be careful when changing and
> > > > deprecating
> > > > > > > APIs, especially when there are known migration gaps. I think
> > it's
> > > a
> > > > > good
> > > > > > > idea to migrate Flink-owned connectors before marking old API
> > > > > deprecated.
> > > > > > > This ensures downstream projects can migrate to new APIs
> > smoothly.
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > > [1]:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > > > >
> > > > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Regarding the discussion on global committer [1] for sinks
> with
> > > > > global
> > > > > > > > transactions, there is no consensus on solving that problem
> in
> > > > > SinkV2.
> > > > > > > Will
> > > > > > > > it require any breaking change in SinkV2?
> > > > > > > >
> > > > > > > > Also will SinkV1 be deprecated too? or it should happen
> > sometime
> > > > > after
> > > > > > > > SinkFunction deprecation?
> > > > > > > >
> > > > > > > > [1]
> > > > https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > > >
> > > > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <lindong28@gmail.com
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Konstantin,
> > > > > > > > >
> > > > > > > > > Thanks for the comment! Please see my comment inline.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Dong
> > > > > > > > >
> > > > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > > > knaufk@apache.org
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > sorry for joining the discussion late.
> > > > > > > > > >
> > > > > > > > > > 1) Is there an option to deprecate SinkFunction in Flink
> > 1.17
> > > > > while
> > > > > > > > > leaving
> > > > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make
> > > > SinkV2
> > > > > > > > @Public
> > > > > > > > > in
> > > > > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving
> are
> > > > > intended
> > > > > > > for
> > > > > > > > > > public use. So, I don't see it as a blocker for
> deprecating
> > > > > > > > SinkFunction
> > > > > > > > > > that we have to make SinkV2 @PublicEvovling. For
> reference
> > > this
> > > > > is
> > > > > > > the
> > > > > > > > > > description of @PublicEvovling:
> > > > > > > > > >
> > > > > > > > > > /**
> > > > > > > > > >  * Annotation to mark classes and methods for public use,
> > but
> > > > > with
> > > > > > > > > > evolving interfaces.
> > > > > > > > > >  *
> > > > > > > > > >  * <p>Classes and methods with this annotation are
> intended
> > > for
> > > > > > > public
> > > > > > > > > > use and have stable behavior.
> > > > > > > > > >  * However, their interfaces and signatures are not
> > > considered
> > > > to
> > > > > > be
> > > > > > > > > > stable and might be changed
> > > > > > > > > >  * across versions.
> > > > > > > > > >  *
> > > > > > > > > >  * <p>This annotation also excludes methods and classes
> > with
> > > > > > evolving
> > > > > > > > > > interfaces / signatures within
> > > > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > > > >  */
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Marking SinkFunction @Deprecated would already single
> > > everyone
> > > > to
> > > > > > > move
> > > > > > > > to
> > > > > > > > > > SinkV2, which we as a community, I believe, have a strong
> > > > > interest
> > > > > > > in.
> > > > > > > > > Its
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yes, I also believe we all have this strong interest. I
> just
> > > hope
> > > > > > that
> > > > > > > > this
> > > > > > > > > can be done in the best possible way that does not confuse
> > > users.
> > > > > > > > >
> > > > > > > > > I probably still have the same concern regarding its impact
> > on
> > > > > users:
> > > > > > > if
> > > > > > > > we
> > > > > > > > > mark an API as deprecated, it effectively means the users
> of
> > > this
> > > > > API
> > > > > > > > > should start to migrate to another API (e.g. SinkV2) and we
> > > might
> > > > > > > remove
> > > > > > > > > this API in the future. However, given that we know there
> are
> > > > known
> > > > > > > > > problems preventing users from doing so, it seems that we
> are
> > > not
> > > > > > ready
> > > > > > > > to
> > > > > > > > > send this message to users right.
> > > > > > > > >
> > > > > > > > > If I understand correctly, I guess you are suggesting that
> by
> > > > > marking
> > > > > > > > > SinkFunction as deprecated, we can put higher pressure on
> > Flink
> > > > > > > > > contributors to update the existing Flink codebase to
> improve
> > > and
> > > > > use
> > > > > > > > > SinkV2.
> > > > > > > > >
> > > > > > > > > I am not sure this is the right way to use @deprecated,
> which
> > > > has a
> > > > > > > > > particular meaning for its users rather than contributors.
> > And
> > > I
> > > > am
> > > > > > > also
> > > > > > > > > not sure we can even pressure contributors of an
> open-source
> > > > > project
> > > > > > > into
> > > > > > > > > developing a feature (e.g. migrate all existing
> SinkFunction
> > > > > > subclasses
> > > > > > > > to
> > > > > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > > > > interest/time
> > > > > > > > to
> > > > > > > > > work on the feature, or talk to other contributors whether
> > they
> > > > are
> > > > > > > > willing
> > > > > > > > > to collaborate/work on this, rather than pressuring other
> > > > > > contributors
> > > > > > > > into
> > > > > > > > > working on this.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > almost comical how long the transition from
> > > > > > > SourceFurnction/SinkFunction
> > > > > > > > to
> > > > > > > > > > Source/Sink takes us. At the same time, we leave
> ourselves
> > > the
> > > > > > option
> > > > > > > > to
> > > > > > > > > to
> > > > > > > > > > make small changes to SinkV2 if any problems arise during
> > the
> > > > > > > migration
> > > > > > > > > of
> > > > > > > > > > these connector.
> > > > > > > > > >
> > > > > > > > > > I think, we have a bit of a chicken/egg problem here. The
> > > > > pressure
> > > > > > > for
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Similar to the reason described above, I am not sure we
> have
> > a
> > > > > > > > chicken/egg
> > > > > > > > > problem here. The issue here is that SinkV2 is not ready
> and
> > we
> > > > > have
> > > > > > a
> > > > > > > > lot
> > > > > > > > > of existing SinkFunction that is not migrated by ourselves.
> > We
> > > > > (Flink
> > > > > > > > > contributors) probably do not need to mark SinkFunction as
> > > > > deprecated
> > > > > > > in
> > > > > > > > > order to address these issues in our own codebase.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > users and contributors is not high enough to move away from
> > > > > > > SinkFunction
> > > > > > > > as
> > > > > > > > > > long as its not deprecated, but at the same time we need
> > > people
> > > > > to
> > > > > > > > > migrate
> > > > > > > > > > their connectors to see if there are any gaps in SinkV2.
> I
> > > > > believe,
> > > > > > > the
> > > > > > > > > > combination proposed above could bridge this problem.
> > > > > > > > > >
> > > > > > > > > > 2) I don't understand the argument of waiting until some
> of
> > > the
> > > > > > > > > > implementations are @Public. How can we make the
> > > > implementations
> > > > > of
> > > > > > > the
> > > > > > > > > > SinkV2 API @Public without making SinkV2 @Public? All
> > public
> > > > > > methods
> > > > > > > of
> > > > > > > > > > SinkV2 are part of every implementation. So to me it
> > actually
> > > > > seems
> > > > > > > to
> > > > > > > > be
> > > > > > > > > > opposite: in order to make any of the implementation
> > @Public
> > > we
> > > > > > first
> > > > > > > > > need
> > > > > > > > > > to make the API @Public.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yeah I also agree with you.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > >
> > > > > > > > > > Konstantin
> > > > > > > > > >
> > > > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > > > > lindong28@gmail.com
> > > > > > > > > >:
> > > > > > > > > >
> > > > > > > > > > > Hi Martijn,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for driving this effort to clean-up the Flink
> > > > codebase!
> > > > > > > > > > >
> > > > > > > > > > > I like the idea to cleanup Flink codebase to avoid
> having
> > > two
> > > > > > > Sinks.
> > > > > > > > On
> > > > > > > > > > the
> > > > > > > > > > > other hand, I also thing the concern mentioned by Jing
> > > makes
> > > > > > sense.
> > > > > > > > In
> > > > > > > > > > > addition to thinking in terms of the rule proposed in
> > > > FLIP-197
> > > > > > > > > > > <
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > >
> > > > > > > > > > > (which
> > > > > > > > > > > seems to focus mostly on the Flink developers'
> > > perspective),
> > > > it
> > > > > > > might
> > > > > > > > > be
> > > > > > > > > > > useful to also think about the story from users'
> > > perspective
> > > > > and
> > > > > > > make
> > > > > > > > > > sure
> > > > > > > > > > > their concerns can be addressed.
> > > > > > > > > > >
> > > > > > > > > > > Typically, by marking an API as deprecated, we are
> > > > effectively
> > > > > > > > telling
> > > > > > > > > > > users *they should start to migrate to the new API ASAP
> > and
> > > > we
> > > > > > > > reserve
> > > > > > > > > > the
> > > > > > > > > > > right to remove this API completely in the 1-2
> releases*.
> > > > Then
> > > > > it
> > > > > > > > might
> > > > > > > > > > be
> > > > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > > > >
> > > > > > > > > > > 1) Does SinkV2 public API provides all the
> > functionalities
> > > > > needed
> > > > > > > to
> > > > > > > > > > > migrate my existing code from subclassing SinkFunction
> to
> > > > > > > subclassing
> > > > > > > > > > > SinkV2?
> > > > > > > > > > >
> > > > > > > > > > > 2) Is the amount of migration work reasonable? If yes,
> > why
> > > > is a
> > > > > > > class
> > > > > > > > > > such
> > > > > > > > > > > as HBaseSinkFunction in Flink's own codebase still
> > > depending
> > > > on
> > > > > > > > > > > SinkFunction? Maybe Flink developers should eat their
> own
> > > dog
> > > > > > food
> > > > > > > > and
> > > > > > > > > > > migrate (or deprecate) those classes in the Flink
> > codebase
> > > > > first?
> > > > > > > > > > >
> > > > > > > > > > > Based on the discussion in this thread, I am not sure
> we
> > > have
> > > > > > good
> > > > > > > > > > answers
> > > > > > > > > > > to those questions yet. For the 1st question above, the
> > > > answer
> > > > > is
> > > > > > > > *no*
> > > > > > > > > > > because we already know that the SinkV2 is currently
> not
> > > able
> > > > > to
> > > > > > > > > support
> > > > > > > > > > > migration for JdbcSink. For the 2nd question above, we
> > know
> > > > > there
> > > > > > > are
> > > > > > > > > > many
> > > > > > > > > > > non-deprecated class in Flink codebase that are still
> > > > depending
> > > > > > on
> > > > > > > > > > SinkV2.
> > > > > > > > > > > It is probably not nice to users if we tell them to
> > migrate
> > > > > while
> > > > > > > we
> > > > > > > > > know
> > > > > > > > > > > there are existing issues that can prevent them from
> > doing
> > > so
> > > > > > > easily.
> > > > > > > > > > >
> > > > > > > > > > > In order to move forward to deprecate SinkV2, I think
> it
> > > will
> > > > > be
> > > > > > > > super
> > > > > > > > > > > useful to first migrate all the connectors managed by
> > Flink
> > > > > > > community
> > > > > > > > > > > (including all externalized connectors) to use SinkV2.
> > This
> > > > > work
> > > > > > > > won't
> > > > > > > > > be
> > > > > > > > > > > wasted since we need to do this anyway. And it will
> also
> > > give
> > > > > us
> > > > > > a
> > > > > > > > > chance
> > > > > > > > > > > to validate the capabilities of SinkV2 and fix bugs by
> > > > > ourselves
> > > > > > as
> > > > > > > > > much
> > > > > > > > > > as
> > > > > > > > > > > possible.
> > > > > > > > > > >
> > > > > > > > > > > What do you think?
> > > > > > > > > > >
> > > > > > > > > > > Best Regards,
> > > > > > > > > > > Dong
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > > > > martijnvisser@apache.org
> > > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > While discussing FLIP-281 [1] the discussion also
> > turned
> > > to
> > > > > the
> > > > > > > > > > > > SinkFunction and the SinkV2 API. For a broader
> > discussion
> > > > I'm
> > > > > > > > opening
> > > > > > > > > > up
> > > > > > > > > > > a
> > > > > > > > > > > > separate discussion thread.
> > > > > > > > > > > >
> > > > > > > > > > > > As Yun Tang has mentioned in that discussion thread,
> it
> > > > would
> > > > > > be
> > > > > > > a
> > > > > > > > > good
> > > > > > > > > > > > time to deprecate the SinkFunction to avoid the need
> to
> > > > > > introduce
> > > > > > > > new
> > > > > > > > > > > > functions towards (to be) deprecated APIs. Jing
> > > rightfully
> > > > > > > > mentioned
> > > > > > > > > > that
> > > > > > > > > > > > it would be confusing to deprecate the SinkFunction
> if
> > > its
> > > > > > > > successor
> > > > > > > > > is
> > > > > > > > > > > not
> > > > > > > > > > > > yet marked as @Public (it's currently
> @PublicEvolving).
> > > > > > > > > > > >
> > > > > > > > > > > > My proposal would be to promote the SinkV2 API to
> > @public
> > > > in
> > > > > > > Flink
> > > > > > > > > 1.17
> > > > > > > > > > > and
> > > > > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > > > > >
> > > > > > > > > > > > The original Sink interface was introduced in Flink
> > 1.12
> > > > with
> > > > > > > > > FLIP-143
> > > > > > > > > > > [2]
> > > > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has
> > been
> > > > > > > improved
> > > > > > > > on
> > > > > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > > > > >
> > > > > > > > > > > > Looking at the API stability graduation process [5],
> > the
> > > > fact
> > > > > > > that
> > > > > > > > > Sink
> > > > > > > > > > > V2
> > > > > > > > > > > > was introduced in Flink 1.15 would mean that we could
> > > > > warrant a
> > > > > > > > > > promotion
> > > > > > > > > > > > to @public already (given that there have been two
> > > releases
> > > > > > with
> > > > > > > > 1.15
> > > > > > > > > > and
> > > > > > > > > > > > 1.16 where it was introduced). Combined with the fact
> > > that
> > > > > > SinkV2
> > > > > > > > has
> > > > > > > > > > > been
> > > > > > > > > > > > the result of iteration over the introduction of the
> > > > original
> > > > > > > Sink
> > > > > > > > > API
> > > > > > > > > > > > since Flink 1.12, I would argue that the promotion is
> > > > > overdue.
> > > > > > > > > > > >
> > > > > > > > > > > > If we promote the Sink API to @public, I think we
> > should
> > > > also
> > > > > > > > > > immediately
> > > > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > > > >
> > > > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > > > >
> > > > > > > > > > > > Best regards,
> > > > > > > > > > > >
> > > > > > > > > > > > Martijn
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > >
> > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > > > [2]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > > > [3]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > > > [4]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > > > [5]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > > https://github.com/knaufk
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > https://twitter.com/snntrable
> > > > > > https://github.com/knaufk
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Dong Lin <li...@gmail.com>.
Hi Martijn,

Thanks for the reply. Please see my comments inline.

Regards,
Dong

On Sat, Feb 11, 2023 at 4:31 AM Martijn Visser <ma...@apache.org>
wrote:

> Hi all,
>
> I wanted to get back on a couple of comments and have a proposal at the end
> for the next steps:
>
> @Steven Wu
> If I look back at the original discussion of FLIP-191 and also the thread
> that you're referring to, it appears from the discussion with Yun Gao that
> the solution was in near-sight, but just not finished. Perhaps it needs to
> be restarted once more so it can be brought to a closure. Also when I
> looked back at the original introduction of SinkV2, there was FLINK-25726
> [1] which looks like it was built specifically for Iceberg and Delta Sink?
>
> @Jing
> > All potential unstable methods coming from SinkV2 interfaces will be kept
> marked as @internal.
>
> This is a situation that I think should be avoided: if it's unstable, it
> should be marked as @Experimental. Connectors should not need to rely
> on @Internal interfaces; if they are needed by a connector, a maintainer
> should create a proposal to expose.
>
> On the production readiness of SinkV2:
>
> @Dong
> > And Yuxia mentioned earlier in this thread that there are bugs such as
> FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
> makes it hard to use SinkV2 properly in production.
>
> @Jark
> > The follow-up step should be trying the new APIs in externalized
> connectors and giving users the confidence to migrate.
>
> I think that is not a fair assessment of the situation. Given that Kafka,
> Elasticsearch, FileSink and everything that uses the AsyncSink is on SinkV2
> since Flink 1.15 and the popularity of these connectors, they have
> definitely already proven themselves. That's not to downplay that bugs have
> no impact and shouldn't be resolved, but I don't think it's fair to state
> that SinkV2 is not stable or usable in production.
>

I believe there exist jobs that are currently using SinkV2 (e.g. Kafka,
FileSink) in production. The questions are whether 1) SinkV2 is ready to
support all features of all existing SinkFunction; and 2) whether all those
connectors (e.g. Kafka) are ready to support all features of their
SinkFunction version (e.g. FlinkKafkaProducer) reliable without critical
bugs (e.g. losing data).

If we can not answer "yes" to the above two questions, then we probably can
not say that SinkV2 is usable in production in general, even though we can
still say that some source/sink based on SinkV2 is usable in production in
selected scenarios (e.g. maybe those jobs that do not use on savepoint).

Maybe we can use FLINK-30238
<https://issues.apache.org/jira/browse/FLINK-30238> to explain this point.
The JIRA's description basically says that "there is the risk of losing
data if we use stop-with-savepoint with SinkV2". Do you know whether this
might potentially affect those SinkV2 (e.g. Kafka, FileSink) jobs in
production when they are restarted with savepoint?



>
> Coming back to the original topic of this thread, my proposal would be that
> in Flink 1.17:
>
> 1. SinkV2 gets promoted from @PublicEvolving to @Public, which also means:
>

I believe we have agreed to address the issues with SinkV2. Now the
question is, how likely would the solutions require backward incompatible
change to SinkV2.

For example, do you think the following questions (mentioned earlier in
this thread) would require backward incompatible changes?
- Sink.InitContext still lacks some necessary information to migrate
existing connectors to new sinks
- Address FLINK-30238 and FLINK-29459.
- Find the migration path for the Iceberg sink

IMO, it seems safer to mark an interface as @Public when we are pretty sure
that there is no known issue in the near future that would require
backward-incompatible changes to the interface.

Note that the cost of making an interface @Public lies on the developer(s)
who owns the task of updating SinkV2 to address those known issues (e.g.
migration). Maybe we should find the developers who will own this task and
let them decide whether to mark it @Public?


a. KafkaSink gets promoted from @PublicEvolving to @Public (which solves
> the weird thing that FlinkKafkaProducer is already marked as deprecated
> since Flink 1.14 and points to KafkaSink as a successor)
> b. FileSink gets promoted from @Experimental to @PublicEvolving (note: this
> has been marked as @Experimental since Flink 1.12, so one could argue that
> this needs to be bumped to @Public too. The reason why I didn't do it is
> because of Yuxia's remark that FileSystemTableSink is still using
> SinkFunction)
> c. ElasticsearchSink gets promoted from @PublicEvolving to @Public (has
> been externalized so needs to happen there)
> d. The AWS connectors (Firehose, Kinesis) get promoted from @PublicEvolving
> to @Public (has been externalized so needs to happen there)
> e. It's up to the maintainers for DynamoDB and Opensearch to decide this
> themselves, given that I believe they've only been introduced recently and
> support only Flink 1.16.
>
> 2. For the next Flink release, the community needs to establish what are
> features that are currently not available in SinkV2 (e.g. FLIP-287), if
> such a feature should be made available in SinkV2 (in case it's a feature
> which currently resides in SinkFunction, while from new insights we think
> it should now be solved differently) and then get these implemented asap.
>
> Best regards,
>
> Martijn
>
> [1] https://issues.apache.org/jira/browse/FLINK-25726
>
>
> On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com> wrote:
>
> > > Regarding the discussion on global committer [1] for sinks with global
> > transactions, there is no consensus on solving that problem in SinkV2.
> Will
> > it require any breaking change in SinkV2?
> >
> > Just want to reiterate my earlier question. What is the migration path
> for
> > the Iceberg sink?
> >
> > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> >
> > On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:
> >
> > > Hi Konstantin,
> > >
> > > I totally agree with making SinkV2 @Public. I just have concerns about
> > > deprecating SinkFunction at this point. Dong Lin has raised the blocker
> > > issues of migration multiple times in this thread which I think we
> should
> > > address first. I don't know why we rush to deprecate SinkFunction while
> > > SourceFunction is still public, but the new Source API is much more
> > stable
> > > and production-ready than SinkV2.
> > >
> > > Iceberg community raised concerns[1] about the workability and
> stability
> > of
> > > Flink connector APIs.
> > >
> > > We are hoping any issues with the APIs for Iceberg connector will
> surface
> > > > sooner and get more attention from the Flink community when the
> > connector
> > > > is within Flink umbrella rather than in Iceberg repo.
> > >
> > >
> > > The connector externalizing is a big step for building a mechanism to
> > > guarantee Flink connector API is stable and workable. The follow-up
> step
> > > should be trying the new APIs in externalized connectors and giving
> users
> > > the confidence to migrate.
> > >
> > >
> > > Best,
> > > Jark
> > >
> > > [1]: https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> > >
> > > On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn>
> wrote:
> > >
> > > > Hi Konstantin,
> > > > Just FYI, the FileSystemTableSink are still using SinkFunction.
> > > >
> > > > Best regards,
> > > > Yuxia
> > > >
> > > > ----- 原始邮件 -----
> > > > 发件人: "Dong Lin" <li...@gmail.com>
> > > > 收件人: "dev" <de...@flink.apache.org>
> > > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <my...@live.com>
> > > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate
> SinkFunction
> > > >
> > > > Hi Konstantin,
> > > >
> > > > Thanks for the reply. Please see my comments inline.
> > > >
> > > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <kn...@apache.org>
> > > wrote:
> > > >
> > > > > Hi Steven,
> > > > >
> > > > > Sink is already deprecated. It was deprecated at the moment where
> we
> > > > > introduced SinkV2.
> > > > >
> > > > > Hi Jark, Hi Dong,
> > > > >
> > > > > My understanding is the SinkV2 is a workable interface. The most
> > > > important
> > > > > connectors have been migrated (Kafka, Filesystem) and more
> connectors
> > > > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make
> > > SinkV2
> > > > > public, it does not need to have all possible functionality. Public
> > > APIs
> > > > > can be extended. That's what we do all the time. There will also
> > always
> > > > be
> > > > > bugs. So, these points can not be categorical blockers to promote
> the
> > > > API.
> > > > >
> > > >
> > > > Right. I believe we all agree that we make SinkV2 @PublicEvolving.
> The
> > > > concern here is whether we can mark SinkFunction as deprecated at
> this
> > > > point.
> > > >
> > > >
> > > > >
> > > > > What are the specific issues/tickets that are blocking us? Can we
> in
> > > your
> > > > >
> > > >
> > > > For example, Lijie mentioned earlier in this thread that according to
> > > > FLIP-287,
> > > > currently the Sink.InitContext still lacks some necessary information
> > to
> > > > migrate existing connectors to new sinks. This could be a blocker
> issue
> > > > since this is related to the SinkV2 API design.
> > > >
> > > > And Yuxia mentioned earlier in this thread that there are bugs such
> as
> > > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> ,
> > which
> > > > makes it hard to use SinkV2 properly in production.  It seems that
> > these
> > > > two bugs are created months ago and are still unresolved or even
> > > > unassigned. This looks like a clear signal that SinkV2 is not being
> > > > actively maintained and used in production.
> > > >
> > > >
> > > > > opinion only deprecate it when every single connector in Apache
> Flink
> > > is
> > > > > migrated already?
> > > > >
> > > >
> > > > Technically this is not a hard requirement. But I would suggest that
> we
> > > > should migrate all existing connectors so that we eat our own dogfood
> > and
> > > > prove to users that SinkV2 is ready for use in production.
> > > >
> > > >
> > > > > In my opinion it is the time to ask users to the migrate their
> > > > connectors.
> > > > > More importantly, @Deprecated would signal users not to build new
> > > > > connectors on SinkFunction. I would arque its also very misleading
> to
> > > > users
> > > > > to not @Deprecated SinkFunction given that is clearly will be
> > > deprecated.
> > > >
> > > >
> > > >
> > > >
> > > > > Cheers,
> > > > >
> > > > > Konstantin
> > > > >
> > > > >
> > > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <
> imjark@gmail.com
> > >:
> > > > >
> > > > > > I agree with Dong Lin.
> > > > > >
> > > > > > Oracle explains how to use Deprecate API [1]:
> > > > > >
> > > > > > You are strongly recommended to use the Javadoc @deprecated tag
> > with
> > > > > > > appropriate comments explaining how to use the new API. This
> > > ensures
> > > > > > > developers will *have a workable migration path from the old
> API
> > to
> > > > the
> > > > > > > new API*.
> > > > > >
> > > > > >
> > > > > > From a user's perspective, the workable migration path is very
> > > > important.
> > > > > > Otherwise, it blurs the semantics of API deprecation. The Flink
> > API's
> > > > > > compatibility and stability issues in the past left a bad
> > impression
> > > on
> > > > > the
> > > > > > downstream projects. We should be careful when changing and
> > > deprecating
> > > > > > APIs, especially when there are known migration gaps. I think
> it's
> > a
> > > > good
> > > > > > idea to migrate Flink-owned connectors before marking old API
> > > > deprecated.
> > > > > > This ensures downstream projects can migrate to new APIs
> smoothly.
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > [1]:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > > >
> > > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Regarding the discussion on global committer [1] for sinks with
> > > > global
> > > > > > > transactions, there is no consensus on solving that problem in
> > > > SinkV2.
> > > > > > Will
> > > > > > > it require any breaking change in SinkV2?
> > > > > > >
> > > > > > > Also will SinkV1 be deprecated too? or it should happen
> sometime
> > > > after
> > > > > > > SinkFunction deprecation?
> > > > > > >
> > > > > > > [1]
> > > https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > > >
> > > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Hi Konstantin,
> > > > > > > >
> > > > > > > > Thanks for the comment! Please see my comment inline.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Dong
> > > > > > > >
> > > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > > knaufk@apache.org
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > sorry for joining the discussion late.
> > > > > > > > >
> > > > > > > > > 1) Is there an option to deprecate SinkFunction in Flink
> 1.17
> > > > while
> > > > > > > > leaving
> > > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make
> > > SinkV2
> > > > > > > @Public
> > > > > > > > in
> > > > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are
> > > > intended
> > > > > > for
> > > > > > > > > public use. So, I don't see it as a blocker for deprecating
> > > > > > > SinkFunction
> > > > > > > > > that we have to make SinkV2 @PublicEvovling. For reference
> > this
> > > > is
> > > > > > the
> > > > > > > > > description of @PublicEvovling:
> > > > > > > > >
> > > > > > > > > /**
> > > > > > > > >  * Annotation to mark classes and methods for public use,
> but
> > > > with
> > > > > > > > > evolving interfaces.
> > > > > > > > >  *
> > > > > > > > >  * <p>Classes and methods with this annotation are intended
> > for
> > > > > > public
> > > > > > > > > use and have stable behavior.
> > > > > > > > >  * However, their interfaces and signatures are not
> > considered
> > > to
> > > > > be
> > > > > > > > > stable and might be changed
> > > > > > > > >  * across versions.
> > > > > > > > >  *
> > > > > > > > >  * <p>This annotation also excludes methods and classes
> with
> > > > > evolving
> > > > > > > > > interfaces / signatures within
> > > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > > >  */
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Marking SinkFunction @Deprecated would already single
> > everyone
> > > to
> > > > > > move
> > > > > > > to
> > > > > > > > > SinkV2, which we as a community, I believe, have a strong
> > > > interest
> > > > > > in.
> > > > > > > > Its
> > > > > > > > >
> > > > > > > >
> > > > > > > > Yes, I also believe we all have this strong interest. I just
> > hope
> > > > > that
> > > > > > > this
> > > > > > > > can be done in the best possible way that does not confuse
> > users.
> > > > > > > >
> > > > > > > > I probably still have the same concern regarding its impact
> on
> > > > users:
> > > > > > if
> > > > > > > we
> > > > > > > > mark an API as deprecated, it effectively means the users of
> > this
> > > > API
> > > > > > > > should start to migrate to another API (e.g. SinkV2) and we
> > might
> > > > > > remove
> > > > > > > > this API in the future. However, given that we know there are
> > > known
> > > > > > > > problems preventing users from doing so, it seems that we are
> > not
> > > > > ready
> > > > > > > to
> > > > > > > > send this message to users right.
> > > > > > > >
> > > > > > > > If I understand correctly, I guess you are suggesting that by
> > > > marking
> > > > > > > > SinkFunction as deprecated, we can put higher pressure on
> Flink
> > > > > > > > contributors to update the existing Flink codebase to improve
> > and
> > > > use
> > > > > > > > SinkV2.
> > > > > > > >
> > > > > > > > I am not sure this is the right way to use @deprecated, which
> > > has a
> > > > > > > > particular meaning for its users rather than contributors.
> And
> > I
> > > am
> > > > > > also
> > > > > > > > not sure we can even pressure contributors of an open-source
> > > > project
> > > > > > into
> > > > > > > > developing a feature (e.g. migrate all existing SinkFunction
> > > > > subclasses
> > > > > > > to
> > > > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > > > interest/time
> > > > > > > to
> > > > > > > > work on the feature, or talk to other contributors whether
> they
> > > are
> > > > > > > willing
> > > > > > > > to collaborate/work on this, rather than pressuring other
> > > > > contributors
> > > > > > > into
> > > > > > > > working on this.
> > > > > > > >
> > > > > > > >
> > > > > > > > almost comical how long the transition from
> > > > > > SourceFurnction/SinkFunction
> > > > > > > to
> > > > > > > > > Source/Sink takes us. At the same time, we leave ourselves
> > the
> > > > > option
> > > > > > > to
> > > > > > > > to
> > > > > > > > > make small changes to SinkV2 if any problems arise during
> the
> > > > > > migration
> > > > > > > > of
> > > > > > > > > these connector.
> > > > > > > > >
> > > > > > > > > I think, we have a bit of a chicken/egg problem here. The
> > > > pressure
> > > > > > for
> > > > > > > > >
> > > > > > > >
> > > > > > > > Similar to the reason described above, I am not sure we have
> a
> > > > > > > chicken/egg
> > > > > > > > problem here. The issue here is that SinkV2 is not ready and
> we
> > > > have
> > > > > a
> > > > > > > lot
> > > > > > > > of existing SinkFunction that is not migrated by ourselves.
> We
> > > > (Flink
> > > > > > > > contributors) probably do not need to mark SinkFunction as
> > > > deprecated
> > > > > > in
> > > > > > > > order to address these issues in our own codebase.
> > > > > > > >
> > > > > > > >
> > > > > > > > users and contributors is not high enough to move away from
> > > > > > SinkFunction
> > > > > > > as
> > > > > > > > > long as its not deprecated, but at the same time we need
> > people
> > > > to
> > > > > > > > migrate
> > > > > > > > > their connectors to see if there are any gaps in SinkV2. I
> > > > believe,
> > > > > > the
> > > > > > > > > combination proposed above could bridge this problem.
> > > > > > > > >
> > > > > > > > > 2) I don't understand the argument of waiting until some of
> > the
> > > > > > > > > implementations are @Public. How can we make the
> > > implementations
> > > > of
> > > > > > the
> > > > > > > > > SinkV2 API @Public without making SinkV2 @Public? All
> public
> > > > > methods
> > > > > > of
> > > > > > > > > SinkV2 are part of every implementation. So to me it
> actually
> > > > seems
> > > > > > to
> > > > > > > be
> > > > > > > > > opposite: in order to make any of the implementation
> @Public
> > we
> > > > > first
> > > > > > > > need
> > > > > > > > > to make the API @Public.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Yeah I also agree with you.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > >
> > > > > > > > > Konstantin
> > > > > > > > >
> > > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > > > lindong28@gmail.com
> > > > > > > > >:
> > > > > > > > >
> > > > > > > > > > Hi Martijn,
> > > > > > > > > >
> > > > > > > > > > Thanks for driving this effort to clean-up the Flink
> > > codebase!
> > > > > > > > > >
> > > > > > > > > > I like the idea to cleanup Flink codebase to avoid having
> > two
> > > > > > Sinks.
> > > > > > > On
> > > > > > > > > the
> > > > > > > > > > other hand, I also thing the concern mentioned by Jing
> > makes
> > > > > sense.
> > > > > > > In
> > > > > > > > > > addition to thinking in terms of the rule proposed in
> > > FLIP-197
> > > > > > > > > > <
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > >
> > > > > > > > > > (which
> > > > > > > > > > seems to focus mostly on the Flink developers'
> > perspective),
> > > it
> > > > > > might
> > > > > > > > be
> > > > > > > > > > useful to also think about the story from users'
> > perspective
> > > > and
> > > > > > make
> > > > > > > > > sure
> > > > > > > > > > their concerns can be addressed.
> > > > > > > > > >
> > > > > > > > > > Typically, by marking an API as deprecated, we are
> > > effectively
> > > > > > > telling
> > > > > > > > > > users *they should start to migrate to the new API ASAP
> and
> > > we
> > > > > > > reserve
> > > > > > > > > the
> > > > > > > > > > right to remove this API completely in the 1-2 releases*.
> > > Then
> > > > it
> > > > > > > might
> > > > > > > > > be
> > > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > > >
> > > > > > > > > > 1) Does SinkV2 public API provides all the
> functionalities
> > > > needed
> > > > > > to
> > > > > > > > > > migrate my existing code from subclassing SinkFunction to
> > > > > > subclassing
> > > > > > > > > > SinkV2?
> > > > > > > > > >
> > > > > > > > > > 2) Is the amount of migration work reasonable? If yes,
> why
> > > is a
> > > > > > class
> > > > > > > > > such
> > > > > > > > > > as HBaseSinkFunction in Flink's own codebase still
> > depending
> > > on
> > > > > > > > > > SinkFunction? Maybe Flink developers should eat their own
> > dog
> > > > > food
> > > > > > > and
> > > > > > > > > > migrate (or deprecate) those classes in the Flink
> codebase
> > > > first?
> > > > > > > > > >
> > > > > > > > > > Based on the discussion in this thread, I am not sure we
> > have
> > > > > good
> > > > > > > > > answers
> > > > > > > > > > to those questions yet. For the 1st question above, the
> > > answer
> > > > is
> > > > > > > *no*
> > > > > > > > > > because we already know that the SinkV2 is currently not
> > able
> > > > to
> > > > > > > > support
> > > > > > > > > > migration for JdbcSink. For the 2nd question above, we
> know
> > > > there
> > > > > > are
> > > > > > > > > many
> > > > > > > > > > non-deprecated class in Flink codebase that are still
> > > depending
> > > > > on
> > > > > > > > > SinkV2.
> > > > > > > > > > It is probably not nice to users if we tell them to
> migrate
> > > > while
> > > > > > we
> > > > > > > > know
> > > > > > > > > > there are existing issues that can prevent them from
> doing
> > so
> > > > > > easily.
> > > > > > > > > >
> > > > > > > > > > In order to move forward to deprecate SinkV2, I think it
> > will
> > > > be
> > > > > > > super
> > > > > > > > > > useful to first migrate all the connectors managed by
> Flink
> > > > > > community
> > > > > > > > > > (including all externalized connectors) to use SinkV2.
> This
> > > > work
> > > > > > > won't
> > > > > > > > be
> > > > > > > > > > wasted since we need to do this anyway. And it will also
> > give
> > > > us
> > > > > a
> > > > > > > > chance
> > > > > > > > > > to validate the capabilities of SinkV2 and fix bugs by
> > > > ourselves
> > > > > as
> > > > > > > > much
> > > > > > > > > as
> > > > > > > > > > possible.
> > > > > > > > > >
> > > > > > > > > > What do you think?
> > > > > > > > > >
> > > > > > > > > > Best Regards,
> > > > > > > > > > Dong
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > > > martijnvisser@apache.org
> > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > While discussing FLIP-281 [1] the discussion also
> turned
> > to
> > > > the
> > > > > > > > > > > SinkFunction and the SinkV2 API. For a broader
> discussion
> > > I'm
> > > > > > > opening
> > > > > > > > > up
> > > > > > > > > > a
> > > > > > > > > > > separate discussion thread.
> > > > > > > > > > >
> > > > > > > > > > > As Yun Tang has mentioned in that discussion thread, it
> > > would
> > > > > be
> > > > > > a
> > > > > > > > good
> > > > > > > > > > > time to deprecate the SinkFunction to avoid the need to
> > > > > introduce
> > > > > > > new
> > > > > > > > > > > functions towards (to be) deprecated APIs. Jing
> > rightfully
> > > > > > > mentioned
> > > > > > > > > that
> > > > > > > > > > > it would be confusing to deprecate the SinkFunction if
> > its
> > > > > > > successor
> > > > > > > > is
> > > > > > > > > > not
> > > > > > > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > > > > > > >
> > > > > > > > > > > My proposal would be to promote the SinkV2 API to
> @public
> > > in
> > > > > > Flink
> > > > > > > > 1.17
> > > > > > > > > > and
> > > > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > > > >
> > > > > > > > > > > The original Sink interface was introduced in Flink
> 1.12
> > > with
> > > > > > > > FLIP-143
> > > > > > > > > > [2]
> > > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has
> been
> > > > > > improved
> > > > > > > on
> > > > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > > > >
> > > > > > > > > > > Looking at the API stability graduation process [5],
> the
> > > fact
> > > > > > that
> > > > > > > > Sink
> > > > > > > > > > V2
> > > > > > > > > > > was introduced in Flink 1.15 would mean that we could
> > > > warrant a
> > > > > > > > > promotion
> > > > > > > > > > > to @public already (given that there have been two
> > releases
> > > > > with
> > > > > > > 1.15
> > > > > > > > > and
> > > > > > > > > > > 1.16 where it was introduced). Combined with the fact
> > that
> > > > > SinkV2
> > > > > > > has
> > > > > > > > > > been
> > > > > > > > > > > the result of iteration over the introduction of the
> > > original
> > > > > > Sink
> > > > > > > > API
> > > > > > > > > > > since Flink 1.12, I would argue that the promotion is
> > > > overdue.
> > > > > > > > > > >
> > > > > > > > > > > If we promote the Sink API to @public, I think we
> should
> > > also
> > > > > > > > > immediately
> > > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > > >
> > > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > > >
> > > > > > > > > > > Best regards,
> > > > > > > > > > >
> > > > > > > > > > > Martijn
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > >
> https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > > [2]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > > [3]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > > [4]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > > [5]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > https://twitter.com/snntrable
> > > > > > > > > https://github.com/knaufk
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > https://twitter.com/snntrable
> > > > > https://github.com/knaufk
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Martijn Visser <ma...@apache.org>.
Hi all,

I wanted to get back on a couple of comments and have a proposal at the end
for the next steps:

@Steven Wu
If I look back at the original discussion of FLIP-191 and also the thread
that you're referring to, it appears from the discussion with Yun Gao that
the solution was in near-sight, but just not finished. Perhaps it needs to
be restarted once more so it can be brought to a closure. Also when I
looked back at the original introduction of SinkV2, there was FLINK-25726
[1] which looks like it was built specifically for Iceberg and Delta Sink?

@Jing
> All potential unstable methods coming from SinkV2 interfaces will be kept
marked as @internal.

This is a situation that I think should be avoided: if it's unstable, it
should be marked as @Experimental. Connectors should not need to rely
on @Internal interfaces; if they are needed by a connector, a maintainer
should create a proposal to expose.

On the production readiness of SinkV2:

@Dong
> And Yuxia mentioned earlier in this thread that there are bugs such as
FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
makes it hard to use SinkV2 properly in production.

@Jark
> The follow-up step should be trying the new APIs in externalized
connectors and giving users the confidence to migrate.

I think that is not a fair assessment of the situation. Given that Kafka,
Elasticsearch, FileSink and everything that uses the AsyncSink is on SinkV2
since Flink 1.15 and the popularity of these connectors, they have
definitely already proven themselves. That's not to downplay that bugs have
no impact and shouldn't be resolved, but I don't think it's fair to state
that SinkV2 is not stable or usable in production.

Coming back to the original topic of this thread, my proposal would be that
in Flink 1.17:

1. SinkV2 gets promoted from @PublicEvolving to @Public, which also means:
a. KafkaSink gets promoted from @PublicEvolving to @Public (which solves
the weird thing that FlinkKafkaProducer is already marked as deprecated
since Flink 1.14 and points to KafkaSink as a successor)
b. FileSink gets promoted from @Experimental to @PublicEvolving (note: this
has been marked as @Experimental since Flink 1.12, so one could argue that
this needs to be bumped to @Public too. The reason why I didn't do it is
because of Yuxia's remark that FileSystemTableSink is still using
SinkFunction)
c. ElasticsearchSink gets promoted from @PublicEvolving to @Public (has
been externalized so needs to happen there)
d. The AWS connectors (Firehose, Kinesis) get promoted from @PublicEvolving
to @Public (has been externalized so needs to happen there)
e. It's up to the maintainers for DynamoDB and Opensearch to decide this
themselves, given that I believe they've only been introduced recently and
support only Flink 1.16.

2. For the next Flink release, the community needs to establish what are
features that are currently not available in SinkV2 (e.g. FLIP-287), if
such a feature should be made available in SinkV2 (in case it's a feature
which currently resides in SinkFunction, while from new insights we think
it should now be solved differently) and then get these implemented asap.

Best regards,

Martijn

[1] https://issues.apache.org/jira/browse/FLINK-25726


On Tue, Feb 7, 2023 at 5:11 AM Steven Wu <st...@gmail.com> wrote:

> > Regarding the discussion on global committer [1] for sinks with global
> transactions, there is no consensus on solving that problem in SinkV2. Will
> it require any breaking change in SinkV2?
>
> Just want to reiterate my earlier question. What is the migration path for
> the Iceberg sink?
>
> [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
>
> On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:
>
> > Hi Konstantin,
> >
> > I totally agree with making SinkV2 @Public. I just have concerns about
> > deprecating SinkFunction at this point. Dong Lin has raised the blocker
> > issues of migration multiple times in this thread which I think we should
> > address first. I don't know why we rush to deprecate SinkFunction while
> > SourceFunction is still public, but the new Source API is much more
> stable
> > and production-ready than SinkV2.
> >
> > Iceberg community raised concerns[1] about the workability and stability
> of
> > Flink connector APIs.
> >
> > We are hoping any issues with the APIs for Iceberg connector will surface
> > > sooner and get more attention from the Flink community when the
> connector
> > > is within Flink umbrella rather than in Iceberg repo.
> >
> >
> > The connector externalizing is a big step for building a mechanism to
> > guarantee Flink connector API is stable and workable. The follow-up step
> > should be trying the new APIs in externalized connectors and giving users
> > the confidence to migrate.
> >
> >
> > Best,
> > Jark
> >
> > [1]: https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
> >
> > On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn> wrote:
> >
> > > Hi Konstantin,
> > > Just FYI, the FileSystemTableSink are still using SinkFunction.
> > >
> > > Best regards,
> > > Yuxia
> > >
> > > ----- 原始邮件 -----
> > > 发件人: "Dong Lin" <li...@gmail.com>
> > > 收件人: "dev" <de...@flink.apache.org>
> > > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <my...@live.com>
> > > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction
> > >
> > > Hi Konstantin,
> > >
> > > Thanks for the reply. Please see my comments inline.
> > >
> > > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <kn...@apache.org>
> > wrote:
> > >
> > > > Hi Steven,
> > > >
> > > > Sink is already deprecated. It was deprecated at the moment where we
> > > > introduced SinkV2.
> > > >
> > > > Hi Jark, Hi Dong,
> > > >
> > > > My understanding is the SinkV2 is a workable interface. The most
> > > important
> > > > connectors have been migrated (Kafka, Filesystem) and more connectors
> > > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make
> > SinkV2
> > > > public, it does not need to have all possible functionality. Public
> > APIs
> > > > can be extended. That's what we do all the time. There will also
> always
> > > be
> > > > bugs. So, these points can not be categorical blockers to promote the
> > > API.
> > > >
> > >
> > > Right. I believe we all agree that we make SinkV2 @PublicEvolving. The
> > > concern here is whether we can mark SinkFunction as deprecated at this
> > > point.
> > >
> > >
> > > >
> > > > What are the specific issues/tickets that are blocking us? Can we in
> > your
> > > >
> > >
> > > For example, Lijie mentioned earlier in this thread that according to
> > > FLIP-287,
> > > currently the Sink.InitContext still lacks some necessary information
> to
> > > migrate existing connectors to new sinks. This could be a blocker issue
> > > since this is related to the SinkV2 API design.
> > >
> > > And Yuxia mentioned earlier in this thread that there are bugs such as
> > > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> ,
> which
> > > makes it hard to use SinkV2 properly in production.  It seems that
> these
> > > two bugs are created months ago and are still unresolved or even
> > > unassigned. This looks like a clear signal that SinkV2 is not being
> > > actively maintained and used in production.
> > >
> > >
> > > > opinion only deprecate it when every single connector in Apache Flink
> > is
> > > > migrated already?
> > > >
> > >
> > > Technically this is not a hard requirement. But I would suggest that we
> > > should migrate all existing connectors so that we eat our own dogfood
> and
> > > prove to users that SinkV2 is ready for use in production.
> > >
> > >
> > > > In my opinion it is the time to ask users to the migrate their
> > > connectors.
> > > > More importantly, @Deprecated would signal users not to build new
> > > > connectors on SinkFunction. I would arque its also very misleading to
> > > users
> > > > to not @Deprecated SinkFunction given that is clearly will be
> > deprecated.
> > >
> > >
> > >
> > >
> > > > Cheers,
> > > >
> > > > Konstantin
> > > >
> > > >
> > > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <imjark@gmail.com
> >:
> > > >
> > > > > I agree with Dong Lin.
> > > > >
> > > > > Oracle explains how to use Deprecate API [1]:
> > > > >
> > > > > You are strongly recommended to use the Javadoc @deprecated tag
> with
> > > > > > appropriate comments explaining how to use the new API. This
> > ensures
> > > > > > developers will *have a workable migration path from the old API
> to
> > > the
> > > > > > new API*.
> > > > >
> > > > >
> > > > > From a user's perspective, the workable migration path is very
> > > important.
> > > > > Otherwise, it blurs the semantics of API deprecation. The Flink
> API's
> > > > > compatibility and stability issues in the past left a bad
> impression
> > on
> > > > the
> > > > > downstream projects. We should be careful when changing and
> > deprecating
> > > > > APIs, especially when there are known migration gaps. I think it's
> a
> > > good
> > > > > idea to migrate Flink-owned connectors before marking old API
> > > deprecated.
> > > > > This ensures downstream projects can migrate to new APIs smoothly.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > [1]:
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > > >
> > > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com>
> wrote:
> > > > >
> > > > > > Regarding the discussion on global committer [1] for sinks with
> > > global
> > > > > > transactions, there is no consensus on solving that problem in
> > > SinkV2.
> > > > > Will
> > > > > > it require any breaking change in SinkV2?
> > > > > >
> > > > > > Also will SinkV1 be deprecated too? or it should happen sometime
> > > after
> > > > > > SinkFunction deprecation?
> > > > > >
> > > > > > [1]
> > https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > > >
> > > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Hi Konstantin,
> > > > > > >
> > > > > > > Thanks for the comment! Please see my comment inline.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Dong
> > > > > > >
> > > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> > knaufk@apache.org
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > sorry for joining the discussion late.
> > > > > > > >
> > > > > > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17
> > > while
> > > > > > > leaving
> > > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make
> > SinkV2
> > > > > > @Public
> > > > > > > in
> > > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are
> > > intended
> > > > > for
> > > > > > > > public use. So, I don't see it as a blocker for deprecating
> > > > > > SinkFunction
> > > > > > > > that we have to make SinkV2 @PublicEvovling. For reference
> this
> > > is
> > > > > the
> > > > > > > > description of @PublicEvovling:
> > > > > > > >
> > > > > > > > /**
> > > > > > > >  * Annotation to mark classes and methods for public use, but
> > > with
> > > > > > > > evolving interfaces.
> > > > > > > >  *
> > > > > > > >  * <p>Classes and methods with this annotation are intended
> for
> > > > > public
> > > > > > > > use and have stable behavior.
> > > > > > > >  * However, their interfaces and signatures are not
> considered
> > to
> > > > be
> > > > > > > > stable and might be changed
> > > > > > > >  * across versions.
> > > > > > > >  *
> > > > > > > >  * <p>This annotation also excludes methods and classes with
> > > > evolving
> > > > > > > > interfaces / signatures within
> > > > > > > >  * classes annotated with {@link Public}.
> > > > > > > >  */
> > > > > > > >
> > > > > > > >
> > > > > > > > Marking SinkFunction @Deprecated would already single
> everyone
> > to
> > > > > move
> > > > > > to
> > > > > > > > SinkV2, which we as a community, I believe, have a strong
> > > interest
> > > > > in.
> > > > > > > Its
> > > > > > > >
> > > > > > >
> > > > > > > Yes, I also believe we all have this strong interest. I just
> hope
> > > > that
> > > > > > this
> > > > > > > can be done in the best possible way that does not confuse
> users.
> > > > > > >
> > > > > > > I probably still have the same concern regarding its impact on
> > > users:
> > > > > if
> > > > > > we
> > > > > > > mark an API as deprecated, it effectively means the users of
> this
> > > API
> > > > > > > should start to migrate to another API (e.g. SinkV2) and we
> might
> > > > > remove
> > > > > > > this API in the future. However, given that we know there are
> > known
> > > > > > > problems preventing users from doing so, it seems that we are
> not
> > > > ready
> > > > > > to
> > > > > > > send this message to users right.
> > > > > > >
> > > > > > > If I understand correctly, I guess you are suggesting that by
> > > marking
> > > > > > > SinkFunction as deprecated, we can put higher pressure on Flink
> > > > > > > contributors to update the existing Flink codebase to improve
> and
> > > use
> > > > > > > SinkV2.
> > > > > > >
> > > > > > > I am not sure this is the right way to use @deprecated, which
> > has a
> > > > > > > particular meaning for its users rather than contributors. And
> I
> > am
> > > > > also
> > > > > > > not sure we can even pressure contributors of an open-source
> > > project
> > > > > into
> > > > > > > developing a feature (e.g. migrate all existing SinkFunction
> > > > subclasses
> > > > > > to
> > > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > > interest/time
> > > > > > to
> > > > > > > work on the feature, or talk to other contributors whether they
> > are
> > > > > > willing
> > > > > > > to collaborate/work on this, rather than pressuring other
> > > > contributors
> > > > > > into
> > > > > > > working on this.
> > > > > > >
> > > > > > >
> > > > > > > almost comical how long the transition from
> > > > > SourceFurnction/SinkFunction
> > > > > > to
> > > > > > > > Source/Sink takes us. At the same time, we leave ourselves
> the
> > > > option
> > > > > > to
> > > > > > > to
> > > > > > > > make small changes to SinkV2 if any problems arise during the
> > > > > migration
> > > > > > > of
> > > > > > > > these connector.
> > > > > > > >
> > > > > > > > I think, we have a bit of a chicken/egg problem here. The
> > > pressure
> > > > > for
> > > > > > > >
> > > > > > >
> > > > > > > Similar to the reason described above, I am not sure we have a
> > > > > > chicken/egg
> > > > > > > problem here. The issue here is that SinkV2 is not ready and we
> > > have
> > > > a
> > > > > > lot
> > > > > > > of existing SinkFunction that is not migrated by ourselves. We
> > > (Flink
> > > > > > > contributors) probably do not need to mark SinkFunction as
> > > deprecated
> > > > > in
> > > > > > > order to address these issues in our own codebase.
> > > > > > >
> > > > > > >
> > > > > > > users and contributors is not high enough to move away from
> > > > > SinkFunction
> > > > > > as
> > > > > > > > long as its not deprecated, but at the same time we need
> people
> > > to
> > > > > > > migrate
> > > > > > > > their connectors to see if there are any gaps in SinkV2. I
> > > believe,
> > > > > the
> > > > > > > > combination proposed above could bridge this problem.
> > > > > > > >
> > > > > > > > 2) I don't understand the argument of waiting until some of
> the
> > > > > > > > implementations are @Public. How can we make the
> > implementations
> > > of
> > > > > the
> > > > > > > > SinkV2 API @Public without making SinkV2 @Public? All public
> > > > methods
> > > > > of
> > > > > > > > SinkV2 are part of every implementation. So to me it actually
> > > seems
> > > > > to
> > > > > > be
> > > > > > > > opposite: in order to make any of the implementation @Public
> we
> > > > first
> > > > > > > need
> > > > > > > > to make the API @Public.
> > > > > > > >
> > > > > > >
> > > > > > > Yeah I also agree with you.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > >
> > > > > > > > Konstantin
> > > > > > > >
> > > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > > lindong28@gmail.com
> > > > > > > >:
> > > > > > > >
> > > > > > > > > Hi Martijn,
> > > > > > > > >
> > > > > > > > > Thanks for driving this effort to clean-up the Flink
> > codebase!
> > > > > > > > >
> > > > > > > > > I like the idea to cleanup Flink codebase to avoid having
> two
> > > > > Sinks.
> > > > > > On
> > > > > > > > the
> > > > > > > > > other hand, I also thing the concern mentioned by Jing
> makes
> > > > sense.
> > > > > > In
> > > > > > > > > addition to thinking in terms of the rule proposed in
> > FLIP-197
> > > > > > > > > <
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > >
> > > > > > > > > (which
> > > > > > > > > seems to focus mostly on the Flink developers'
> perspective),
> > it
> > > > > might
> > > > > > > be
> > > > > > > > > useful to also think about the story from users'
> perspective
> > > and
> > > > > make
> > > > > > > > sure
> > > > > > > > > their concerns can be addressed.
> > > > > > > > >
> > > > > > > > > Typically, by marking an API as deprecated, we are
> > effectively
> > > > > > telling
> > > > > > > > > users *they should start to migrate to the new API ASAP and
> > we
> > > > > > reserve
> > > > > > > > the
> > > > > > > > > right to remove this API completely in the 1-2 releases*.
> > Then
> > > it
> > > > > > might
> > > > > > > > be
> > > > > > > > > reasonable for users to ask questions such as:
> > > > > > > > >
> > > > > > > > > 1) Does SinkV2 public API provides all the functionalities
> > > needed
> > > > > to
> > > > > > > > > migrate my existing code from subclassing SinkFunction to
> > > > > subclassing
> > > > > > > > > SinkV2?
> > > > > > > > >
> > > > > > > > > 2) Is the amount of migration work reasonable? If yes, why
> > is a
> > > > > class
> > > > > > > > such
> > > > > > > > > as HBaseSinkFunction in Flink's own codebase still
> depending
> > on
> > > > > > > > > SinkFunction? Maybe Flink developers should eat their own
> dog
> > > > food
> > > > > > and
> > > > > > > > > migrate (or deprecate) those classes in the Flink codebase
> > > first?
> > > > > > > > >
> > > > > > > > > Based on the discussion in this thread, I am not sure we
> have
> > > > good
> > > > > > > > answers
> > > > > > > > > to those questions yet. For the 1st question above, the
> > answer
> > > is
> > > > > > *no*
> > > > > > > > > because we already know that the SinkV2 is currently not
> able
> > > to
> > > > > > > support
> > > > > > > > > migration for JdbcSink. For the 2nd question above, we know
> > > there
> > > > > are
> > > > > > > > many
> > > > > > > > > non-deprecated class in Flink codebase that are still
> > depending
> > > > on
> > > > > > > > SinkV2.
> > > > > > > > > It is probably not nice to users if we tell them to migrate
> > > while
> > > > > we
> > > > > > > know
> > > > > > > > > there are existing issues that can prevent them from doing
> so
> > > > > easily.
> > > > > > > > >
> > > > > > > > > In order to move forward to deprecate SinkV2, I think it
> will
> > > be
> > > > > > super
> > > > > > > > > useful to first migrate all the connectors managed by Flink
> > > > > community
> > > > > > > > > (including all externalized connectors) to use SinkV2. This
> > > work
> > > > > > won't
> > > > > > > be
> > > > > > > > > wasted since we need to do this anyway. And it will also
> give
> > > us
> > > > a
> > > > > > > chance
> > > > > > > > > to validate the capabilities of SinkV2 and fix bugs by
> > > ourselves
> > > > as
> > > > > > > much
> > > > > > > > as
> > > > > > > > > possible.
> > > > > > > > >
> > > > > > > > > What do you think?
> > > > > > > > >
> > > > > > > > > Best Regards,
> > > > > > > > > Dong
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > > martijnvisser@apache.org
> > > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > While discussing FLIP-281 [1] the discussion also turned
> to
> > > the
> > > > > > > > > > SinkFunction and the SinkV2 API. For a broader discussion
> > I'm
> > > > > > opening
> > > > > > > > up
> > > > > > > > > a
> > > > > > > > > > separate discussion thread.
> > > > > > > > > >
> > > > > > > > > > As Yun Tang has mentioned in that discussion thread, it
> > would
> > > > be
> > > > > a
> > > > > > > good
> > > > > > > > > > time to deprecate the SinkFunction to avoid the need to
> > > > introduce
> > > > > > new
> > > > > > > > > > functions towards (to be) deprecated APIs. Jing
> rightfully
> > > > > > mentioned
> > > > > > > > that
> > > > > > > > > > it would be confusing to deprecate the SinkFunction if
> its
> > > > > > successor
> > > > > > > is
> > > > > > > > > not
> > > > > > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > > > > > >
> > > > > > > > > > My proposal would be to promote the SinkV2 API to @public
> > in
> > > > > Flink
> > > > > > > 1.17
> > > > > > > > > and
> > > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > > >
> > > > > > > > > > The original Sink interface was introduced in Flink 1.12
> > with
> > > > > > > FLIP-143
> > > > > > > > > [2]
> > > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been
> > > > > improved
> > > > > > on
> > > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > > >
> > > > > > > > > > Looking at the API stability graduation process [5], the
> > fact
> > > > > that
> > > > > > > Sink
> > > > > > > > > V2
> > > > > > > > > > was introduced in Flink 1.15 would mean that we could
> > > warrant a
> > > > > > > > promotion
> > > > > > > > > > to @public already (given that there have been two
> releases
> > > > with
> > > > > > 1.15
> > > > > > > > and
> > > > > > > > > > 1.16 where it was introduced). Combined with the fact
> that
> > > > SinkV2
> > > > > > has
> > > > > > > > > been
> > > > > > > > > > the result of iteration over the introduction of the
> > original
> > > > > Sink
> > > > > > > API
> > > > > > > > > > since Flink 1.12, I would argue that the promotion is
> > > overdue.
> > > > > > > > > >
> > > > > > > > > > If we promote the Sink API to @public, I think we should
> > also
> > > > > > > > immediately
> > > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > > >
> > > > > > > > > > Looking forward to your thoughts.
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > >
> > > > > > > > > > Martijn
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > > [2]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > > [3]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > > [4]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > > [5]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > https://twitter.com/snntrable
> > > > > > > > https://github.com/knaufk
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > https://twitter.com/snntrable
> > > > https://github.com/knaufk
> > > >
> > >
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Steven Wu <st...@gmail.com>.
> Regarding the discussion on global committer [1] for sinks with global
transactions, there is no consensus on solving that problem in SinkV2. Will
it require any breaking change in SinkV2?

Just want to reiterate my earlier question. What is the migration path for
the Iceberg sink?

[1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj

On Mon, Feb 6, 2023 at 6:22 PM Jark Wu <im...@gmail.com> wrote:

> Hi Konstantin,
>
> I totally agree with making SinkV2 @Public. I just have concerns about
> deprecating SinkFunction at this point. Dong Lin has raised the blocker
> issues of migration multiple times in this thread which I think we should
> address first. I don't know why we rush to deprecate SinkFunction while
> SourceFunction is still public, but the new Source API is much more stable
> and production-ready than SinkV2.
>
> Iceberg community raised concerns[1] about the workability and stability of
> Flink connector APIs.
>
> We are hoping any issues with the APIs for Iceberg connector will surface
> > sooner and get more attention from the Flink community when the connector
> > is within Flink umbrella rather than in Iceberg repo.
>
>
> The connector externalizing is a big step for building a mechanism to
> guarantee Flink connector API is stable and workable. The follow-up step
> should be trying the new APIs in externalized connectors and giving users
> the confidence to migrate.
>
>
> Best,
> Jark
>
> [1]: https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b
>
> On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn> wrote:
>
> > Hi Konstantin,
> > Just FYI, the FileSystemTableSink are still using SinkFunction.
> >
> > Best regards,
> > Yuxia
> >
> > ----- 原始邮件 -----
> > 发件人: "Dong Lin" <li...@gmail.com>
> > 收件人: "dev" <de...@flink.apache.org>
> > 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <my...@live.com>
> > 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> > 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction
> >
> > Hi Konstantin,
> >
> > Thanks for the reply. Please see my comments inline.
> >
> > On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <kn...@apache.org>
> wrote:
> >
> > > Hi Steven,
> > >
> > > Sink is already deprecated. It was deprecated at the moment where we
> > > introduced SinkV2.
> > >
> > > Hi Jark, Hi Dong,
> > >
> > > My understanding is the SinkV2 is a workable interface. The most
> > important
> > > connectors have been migrated (Kafka, Filesystem) and more connectors
> > > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make
> SinkV2
> > > public, it does not need to have all possible functionality. Public
> APIs
> > > can be extended. That's what we do all the time. There will also always
> > be
> > > bugs. So, these points can not be categorical blockers to promote the
> > API.
> > >
> >
> > Right. I believe we all agree that we make SinkV2 @PublicEvolving. The
> > concern here is whether we can mark SinkFunction as deprecated at this
> > point.
> >
> >
> > >
> > > What are the specific issues/tickets that are blocking us? Can we in
> your
> > >
> >
> > For example, Lijie mentioned earlier in this thread that according to
> > FLIP-287,
> > currently the Sink.InitContext still lacks some necessary information to
> > migrate existing connectors to new sinks. This could be a blocker issue
> > since this is related to the SinkV2 API design.
> >
> > And Yuxia mentioned earlier in this thread that there are bugs such as
> > FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> > FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
> > makes it hard to use SinkV2 properly in production.  It seems that these
> > two bugs are created months ago and are still unresolved or even
> > unassigned. This looks like a clear signal that SinkV2 is not being
> > actively maintained and used in production.
> >
> >
> > > opinion only deprecate it when every single connector in Apache Flink
> is
> > > migrated already?
> > >
> >
> > Technically this is not a hard requirement. But I would suggest that we
> > should migrate all existing connectors so that we eat our own dogfood and
> > prove to users that SinkV2 is ready for use in production.
> >
> >
> > > In my opinion it is the time to ask users to the migrate their
> > connectors.
> > > More importantly, @Deprecated would signal users not to build new
> > > connectors on SinkFunction. I would arque its also very misleading to
> > users
> > > to not @Deprecated SinkFunction given that is clearly will be
> deprecated.
> >
> >
> >
> >
> > > Cheers,
> > >
> > > Konstantin
> > >
> > >
> > > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <im...@gmail.com>:
> > >
> > > > I agree with Dong Lin.
> > > >
> > > > Oracle explains how to use Deprecate API [1]:
> > > >
> > > > You are strongly recommended to use the Javadoc @deprecated tag with
> > > > > appropriate comments explaining how to use the new API. This
> ensures
> > > > > developers will *have a workable migration path from the old API to
> > the
> > > > > new API*.
> > > >
> > > >
> > > > From a user's perspective, the workable migration path is very
> > important.
> > > > Otherwise, it blurs the semantics of API deprecation. The Flink API's
> > > > compatibility and stability issues in the past left a bad impression
> on
> > > the
> > > > downstream projects. We should be careful when changing and
> deprecating
> > > > APIs, especially when there are known migration gaps. I think it's a
> > good
> > > > idea to migrate Flink-owned connectors before marking old API
> > deprecated.
> > > > This ensures downstream projects can migrate to new APIs smoothly.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > [1]:
> > > >
> > > >
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > > >
> > > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com> wrote:
> > > >
> > > > > Regarding the discussion on global committer [1] for sinks with
> > global
> > > > > transactions, there is no consensus on solving that problem in
> > SinkV2.
> > > > Will
> > > > > it require any breaking change in SinkV2?
> > > > >
> > > > > Also will SinkV1 be deprecated too? or it should happen sometime
> > after
> > > > > SinkFunction deprecation?
> > > > >
> > > > > [1]
> https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > > >
> > > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com>
> wrote:
> > > > >
> > > > > > Hi Konstantin,
> > > > > >
> > > > > > Thanks for the comment! Please see my comment inline.
> > > > > >
> > > > > > Cheers,
> > > > > > Dong
> > > > > >
> > > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <
> knaufk@apache.org
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > sorry for joining the discussion late.
> > > > > > >
> > > > > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17
> > while
> > > > > > leaving
> > > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make
> SinkV2
> > > > > @Public
> > > > > > in
> > > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are
> > intended
> > > > for
> > > > > > > public use. So, I don't see it as a blocker for deprecating
> > > > > SinkFunction
> > > > > > > that we have to make SinkV2 @PublicEvovling. For reference this
> > is
> > > > the
> > > > > > > description of @PublicEvovling:
> > > > > > >
> > > > > > > /**
> > > > > > >  * Annotation to mark classes and methods for public use, but
> > with
> > > > > > > evolving interfaces.
> > > > > > >  *
> > > > > > >  * <p>Classes and methods with this annotation are intended for
> > > > public
> > > > > > > use and have stable behavior.
> > > > > > >  * However, their interfaces and signatures are not considered
> to
> > > be
> > > > > > > stable and might be changed
> > > > > > >  * across versions.
> > > > > > >  *
> > > > > > >  * <p>This annotation also excludes methods and classes with
> > > evolving
> > > > > > > interfaces / signatures within
> > > > > > >  * classes annotated with {@link Public}.
> > > > > > >  */
> > > > > > >
> > > > > > >
> > > > > > > Marking SinkFunction @Deprecated would already single everyone
> to
> > > > move
> > > > > to
> > > > > > > SinkV2, which we as a community, I believe, have a strong
> > interest
> > > > in.
> > > > > > Its
> > > > > > >
> > > > > >
> > > > > > Yes, I also believe we all have this strong interest. I just hope
> > > that
> > > > > this
> > > > > > can be done in the best possible way that does not confuse users.
> > > > > >
> > > > > > I probably still have the same concern regarding its impact on
> > users:
> > > > if
> > > > > we
> > > > > > mark an API as deprecated, it effectively means the users of this
> > API
> > > > > > should start to migrate to another API (e.g. SinkV2) and we might
> > > > remove
> > > > > > this API in the future. However, given that we know there are
> known
> > > > > > problems preventing users from doing so, it seems that we are not
> > > ready
> > > > > to
> > > > > > send this message to users right.
> > > > > >
> > > > > > If I understand correctly, I guess you are suggesting that by
> > marking
> > > > > > SinkFunction as deprecated, we can put higher pressure on Flink
> > > > > > contributors to update the existing Flink codebase to improve and
> > use
> > > > > > SinkV2.
> > > > > >
> > > > > > I am not sure this is the right way to use @deprecated, which
> has a
> > > > > > particular meaning for its users rather than contributors. And I
> am
> > > > also
> > > > > > not sure we can even pressure contributors of an open-source
> > project
> > > > into
> > > > > > developing a feature (e.g. migrate all existing SinkFunction
> > > subclasses
> > > > > to
> > > > > > SinkV2). IMO, the typical way is for the contributor with
> > > interest/time
> > > > > to
> > > > > > work on the feature, or talk to other contributors whether they
> are
> > > > > willing
> > > > > > to collaborate/work on this, rather than pressuring other
> > > contributors
> > > > > into
> > > > > > working on this.
> > > > > >
> > > > > >
> > > > > > almost comical how long the transition from
> > > > SourceFurnction/SinkFunction
> > > > > to
> > > > > > > Source/Sink takes us. At the same time, we leave ourselves the
> > > option
> > > > > to
> > > > > > to
> > > > > > > make small changes to SinkV2 if any problems arise during the
> > > > migration
> > > > > > of
> > > > > > > these connector.
> > > > > > >
> > > > > > > I think, we have a bit of a chicken/egg problem here. The
> > pressure
> > > > for
> > > > > > >
> > > > > >
> > > > > > Similar to the reason described above, I am not sure we have a
> > > > > chicken/egg
> > > > > > problem here. The issue here is that SinkV2 is not ready and we
> > have
> > > a
> > > > > lot
> > > > > > of existing SinkFunction that is not migrated by ourselves. We
> > (Flink
> > > > > > contributors) probably do not need to mark SinkFunction as
> > deprecated
> > > > in
> > > > > > order to address these issues in our own codebase.
> > > > > >
> > > > > >
> > > > > > users and contributors is not high enough to move away from
> > > > SinkFunction
> > > > > as
> > > > > > > long as its not deprecated, but at the same time we need people
> > to
> > > > > > migrate
> > > > > > > their connectors to see if there are any gaps in SinkV2. I
> > believe,
> > > > the
> > > > > > > combination proposed above could bridge this problem.
> > > > > > >
> > > > > > > 2) I don't understand the argument of waiting until some of the
> > > > > > > implementations are @Public. How can we make the
> implementations
> > of
> > > > the
> > > > > > > SinkV2 API @Public without making SinkV2 @Public? All public
> > > methods
> > > > of
> > > > > > > SinkV2 are part of every implementation. So to me it actually
> > seems
> > > > to
> > > > > be
> > > > > > > opposite: in order to make any of the implementation @Public we
> > > first
> > > > > > need
> > > > > > > to make the API @Public.
> > > > > > >
> > > > > >
> > > > > > Yeah I also agree with you.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Konstantin
> > > > > > >
> > > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > > lindong28@gmail.com
> > > > > > >:
> > > > > > >
> > > > > > > > Hi Martijn,
> > > > > > > >
> > > > > > > > Thanks for driving this effort to clean-up the Flink
> codebase!
> > > > > > > >
> > > > > > > > I like the idea to cleanup Flink codebase to avoid having two
> > > > Sinks.
> > > > > On
> > > > > > > the
> > > > > > > > other hand, I also thing the concern mentioned by Jing makes
> > > sense.
> > > > > In
> > > > > > > > addition to thinking in terms of the rule proposed in
> FLIP-197
> > > > > > > > <
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > >
> > > > > > > > (which
> > > > > > > > seems to focus mostly on the Flink developers' perspective),
> it
> > > > might
> > > > > > be
> > > > > > > > useful to also think about the story from users' perspective
> > and
> > > > make
> > > > > > > sure
> > > > > > > > their concerns can be addressed.
> > > > > > > >
> > > > > > > > Typically, by marking an API as deprecated, we are
> effectively
> > > > > telling
> > > > > > > > users *they should start to migrate to the new API ASAP and
> we
> > > > > reserve
> > > > > > > the
> > > > > > > > right to remove this API completely in the 1-2 releases*.
> Then
> > it
> > > > > might
> > > > > > > be
> > > > > > > > reasonable for users to ask questions such as:
> > > > > > > >
> > > > > > > > 1) Does SinkV2 public API provides all the functionalities
> > needed
> > > > to
> > > > > > > > migrate my existing code from subclassing SinkFunction to
> > > > subclassing
> > > > > > > > SinkV2?
> > > > > > > >
> > > > > > > > 2) Is the amount of migration work reasonable? If yes, why
> is a
> > > > class
> > > > > > > such
> > > > > > > > as HBaseSinkFunction in Flink's own codebase still depending
> on
> > > > > > > > SinkFunction? Maybe Flink developers should eat their own dog
> > > food
> > > > > and
> > > > > > > > migrate (or deprecate) those classes in the Flink codebase
> > first?
> > > > > > > >
> > > > > > > > Based on the discussion in this thread, I am not sure we have
> > > good
> > > > > > > answers
> > > > > > > > to those questions yet. For the 1st question above, the
> answer
> > is
> > > > > *no*
> > > > > > > > because we already know that the SinkV2 is currently not able
> > to
> > > > > > support
> > > > > > > > migration for JdbcSink. For the 2nd question above, we know
> > there
> > > > are
> > > > > > > many
> > > > > > > > non-deprecated class in Flink codebase that are still
> depending
> > > on
> > > > > > > SinkV2.
> > > > > > > > It is probably not nice to users if we tell them to migrate
> > while
> > > > we
> > > > > > know
> > > > > > > > there are existing issues that can prevent them from doing so
> > > > easily.
> > > > > > > >
> > > > > > > > In order to move forward to deprecate SinkV2, I think it will
> > be
> > > > > super
> > > > > > > > useful to first migrate all the connectors managed by Flink
> > > > community
> > > > > > > > (including all externalized connectors) to use SinkV2. This
> > work
> > > > > won't
> > > > > > be
> > > > > > > > wasted since we need to do this anyway. And it will also give
> > us
> > > a
> > > > > > chance
> > > > > > > > to validate the capabilities of SinkV2 and fix bugs by
> > ourselves
> > > as
> > > > > > much
> > > > > > > as
> > > > > > > > possible.
> > > > > > > >
> > > > > > > > What do you think?
> > > > > > > >
> > > > > > > > Best Regards,
> > > > > > > > Dong
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > > martijnvisser@apache.org
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > While discussing FLIP-281 [1] the discussion also turned to
> > the
> > > > > > > > > SinkFunction and the SinkV2 API. For a broader discussion
> I'm
> > > > > opening
> > > > > > > up
> > > > > > > > a
> > > > > > > > > separate discussion thread.
> > > > > > > > >
> > > > > > > > > As Yun Tang has mentioned in that discussion thread, it
> would
> > > be
> > > > a
> > > > > > good
> > > > > > > > > time to deprecate the SinkFunction to avoid the need to
> > > introduce
> > > > > new
> > > > > > > > > functions towards (to be) deprecated APIs. Jing rightfully
> > > > > mentioned
> > > > > > > that
> > > > > > > > > it would be confusing to deprecate the SinkFunction if its
> > > > > successor
> > > > > > is
> > > > > > > > not
> > > > > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > > > > >
> > > > > > > > > My proposal would be to promote the SinkV2 API to @public
> in
> > > > Flink
> > > > > > 1.17
> > > > > > > > and
> > > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > > >
> > > > > > > > > The original Sink interface was introduced in Flink 1.12
> with
> > > > > > FLIP-143
> > > > > > > > [2]
> > > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been
> > > > improved
> > > > > on
> > > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > > >
> > > > > > > > > Looking at the API stability graduation process [5], the
> fact
> > > > that
> > > > > > Sink
> > > > > > > > V2
> > > > > > > > > was introduced in Flink 1.15 would mean that we could
> > warrant a
> > > > > > > promotion
> > > > > > > > > to @public already (given that there have been two releases
> > > with
> > > > > 1.15
> > > > > > > and
> > > > > > > > > 1.16 where it was introduced). Combined with the fact that
> > > SinkV2
> > > > > has
> > > > > > > > been
> > > > > > > > > the result of iteration over the introduction of the
> original
> > > > Sink
> > > > > > API
> > > > > > > > > since Flink 1.12, I would argue that the promotion is
> > overdue.
> > > > > > > > >
> > > > > > > > > If we promote the Sink API to @public, I think we should
> also
> > > > > > > immediately
> > > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > > >
> > > > > > > > > Looking forward to your thoughts.
> > > > > > > > >
> > > > > > > > > Best regards,
> > > > > > > > >
> > > > > > > > > Martijn
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > [1]
> > > > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > > [2]
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > > [3]
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > > [4]
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > > [5]
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > https://twitter.com/snntrable
> > > > > > > https://github.com/knaufk
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > https://twitter.com/snntrable
> > > https://github.com/knaufk
> > >
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Jark Wu <im...@gmail.com>.
Hi Konstantin,

I totally agree with making SinkV2 @Public. I just have concerns about
deprecating SinkFunction at this point. Dong Lin has raised the blocker
issues of migration multiple times in this thread which I think we should
address first. I don't know why we rush to deprecate SinkFunction while
SourceFunction is still public, but the new Source API is much more stable
and production-ready than SinkV2.

Iceberg community raised concerns[1] about the workability and stability of
Flink connector APIs.

We are hoping any issues with the APIs for Iceberg connector will surface
> sooner and get more attention from the Flink community when the connector
> is within Flink umbrella rather than in Iceberg repo.


The connector externalizing is a big step for building a mechanism to
guarantee Flink connector API is stable and workable. The follow-up step
should be trying the new APIs in externalized connectors and giving users
the confidence to migrate.


Best,
Jark

[1]: https://lists.apache.org/thread/r5zqnkt01x2c611brkjmxxnt3bfcgl1b

On Tue, 7 Feb 2023 at 09:53, yuxia <lu...@alumni.sjtu.edu.cn> wrote:

> Hi Konstantin,
> Just FYI, the FileSystemTableSink are still using SinkFunction.
>
> Best regards,
> Yuxia
>
> ----- 原始邮件 -----
> 发件人: "Dong Lin" <li...@gmail.com>
> 收件人: "dev" <de...@flink.apache.org>
> 抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <my...@live.com>
> 发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
> 主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction
>
> Hi Konstantin,
>
> Thanks for the reply. Please see my comments inline.
>
> On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <kn...@apache.org> wrote:
>
> > Hi Steven,
> >
> > Sink is already deprecated. It was deprecated at the moment where we
> > introduced SinkV2.
> >
> > Hi Jark, Hi Dong,
> >
> > My understanding is the SinkV2 is a workable interface. The most
> important
> > connectors have been migrated (Kafka, Filesystem) and more connectors
> > (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make SinkV2
> > public, it does not need to have all possible functionality. Public APIs
> > can be extended. That's what we do all the time. There will also always
> be
> > bugs. So, these points can not be categorical blockers to promote the
> API.
> >
>
> Right. I believe we all agree that we make SinkV2 @PublicEvolving. The
> concern here is whether we can mark SinkFunction as deprecated at this
> point.
>
>
> >
> > What are the specific issues/tickets that are blocking us? Can we in your
> >
>
> For example, Lijie mentioned earlier in this thread that according to
> FLIP-287,
> currently the Sink.InitContext still lacks some necessary information to
> migrate existing connectors to new sinks. This could be a blocker issue
> since this is related to the SinkV2 API design.
>
> And Yuxia mentioned earlier in this thread that there are bugs such as
> FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
> FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
> makes it hard to use SinkV2 properly in production.  It seems that these
> two bugs are created months ago and are still unresolved or even
> unassigned. This looks like a clear signal that SinkV2 is not being
> actively maintained and used in production.
>
>
> > opinion only deprecate it when every single connector in Apache Flink is
> > migrated already?
> >
>
> Technically this is not a hard requirement. But I would suggest that we
> should migrate all existing connectors so that we eat our own dogfood and
> prove to users that SinkV2 is ready for use in production.
>
>
> > In my opinion it is the time to ask users to the migrate their
> connectors.
> > More importantly, @Deprecated would signal users not to build new
> > connectors on SinkFunction. I would arque its also very misleading to
> users
> > to not @Deprecated SinkFunction given that is clearly will be deprecated.
>
>
>
>
> > Cheers,
> >
> > Konstantin
> >
> >
> > Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <im...@gmail.com>:
> >
> > > I agree with Dong Lin.
> > >
> > > Oracle explains how to use Deprecate API [1]:
> > >
> > > You are strongly recommended to use the Javadoc @deprecated tag with
> > > > appropriate comments explaining how to use the new API. This ensures
> > > > developers will *have a workable migration path from the old API to
> the
> > > > new API*.
> > >
> > >
> > > From a user's perspective, the workable migration path is very
> important.
> > > Otherwise, it blurs the semantics of API deprecation. The Flink API's
> > > compatibility and stability issues in the past left a bad impression on
> > the
> > > downstream projects. We should be careful when changing and deprecating
> > > APIs, especially when there are known migration gaps. I think it's a
> good
> > > idea to migrate Flink-owned connectors before marking old API
> deprecated.
> > > This ensures downstream projects can migrate to new APIs smoothly.
> > >
> > > Best,
> > > Jark
> > >
> > > [1]:
> > >
> > >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> > >
> > > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com> wrote:
> > >
> > > > Regarding the discussion on global committer [1] for sinks with
> global
> > > > transactions, there is no consensus on solving that problem in
> SinkV2.
> > > Will
> > > > it require any breaking change in SinkV2?
> > > >
> > > > Also will SinkV1 be deprecated too? or it should happen sometime
> after
> > > > SinkFunction deprecation?
> > > >
> > > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > > >
> > > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com> wrote:
> > > >
> > > > > Hi Konstantin,
> > > > >
> > > > > Thanks for the comment! Please see my comment inline.
> > > > >
> > > > > Cheers,
> > > > > Dong
> > > > >
> > > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <knaufk@apache.org
> >
> > > > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > sorry for joining the discussion late.
> > > > > >
> > > > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17
> while
> > > > > leaving
> > > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2
> > > > @Public
> > > > > in
> > > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are
> intended
> > > for
> > > > > > public use. So, I don't see it as a blocker for deprecating
> > > > SinkFunction
> > > > > > that we have to make SinkV2 @PublicEvovling. For reference this
> is
> > > the
> > > > > > description of @PublicEvovling:
> > > > > >
> > > > > > /**
> > > > > >  * Annotation to mark classes and methods for public use, but
> with
> > > > > > evolving interfaces.
> > > > > >  *
> > > > > >  * <p>Classes and methods with this annotation are intended for
> > > public
> > > > > > use and have stable behavior.
> > > > > >  * However, their interfaces and signatures are not considered to
> > be
> > > > > > stable and might be changed
> > > > > >  * across versions.
> > > > > >  *
> > > > > >  * <p>This annotation also excludes methods and classes with
> > evolving
> > > > > > interfaces / signatures within
> > > > > >  * classes annotated with {@link Public}.
> > > > > >  */
> > > > > >
> > > > > >
> > > > > > Marking SinkFunction @Deprecated would already single everyone to
> > > move
> > > > to
> > > > > > SinkV2, which we as a community, I believe, have a strong
> interest
> > > in.
> > > > > Its
> > > > > >
> > > > >
> > > > > Yes, I also believe we all have this strong interest. I just hope
> > that
> > > > this
> > > > > can be done in the best possible way that does not confuse users.
> > > > >
> > > > > I probably still have the same concern regarding its impact on
> users:
> > > if
> > > > we
> > > > > mark an API as deprecated, it effectively means the users of this
> API
> > > > > should start to migrate to another API (e.g. SinkV2) and we might
> > > remove
> > > > > this API in the future. However, given that we know there are known
> > > > > problems preventing users from doing so, it seems that we are not
> > ready
> > > > to
> > > > > send this message to users right.
> > > > >
> > > > > If I understand correctly, I guess you are suggesting that by
> marking
> > > > > SinkFunction as deprecated, we can put higher pressure on Flink
> > > > > contributors to update the existing Flink codebase to improve and
> use
> > > > > SinkV2.
> > > > >
> > > > > I am not sure this is the right way to use @deprecated, which has a
> > > > > particular meaning for its users rather than contributors. And I am
> > > also
> > > > > not sure we can even pressure contributors of an open-source
> project
> > > into
> > > > > developing a feature (e.g. migrate all existing SinkFunction
> > subclasses
> > > > to
> > > > > SinkV2). IMO, the typical way is for the contributor with
> > interest/time
> > > > to
> > > > > work on the feature, or talk to other contributors whether they are
> > > > willing
> > > > > to collaborate/work on this, rather than pressuring other
> > contributors
> > > > into
> > > > > working on this.
> > > > >
> > > > >
> > > > > almost comical how long the transition from
> > > SourceFurnction/SinkFunction
> > > > to
> > > > > > Source/Sink takes us. At the same time, we leave ourselves the
> > option
> > > > to
> > > > > to
> > > > > > make small changes to SinkV2 if any problems arise during the
> > > migration
> > > > > of
> > > > > > these connector.
> > > > > >
> > > > > > I think, we have a bit of a chicken/egg problem here. The
> pressure
> > > for
> > > > > >
> > > > >
> > > > > Similar to the reason described above, I am not sure we have a
> > > > chicken/egg
> > > > > problem here. The issue here is that SinkV2 is not ready and we
> have
> > a
> > > > lot
> > > > > of existing SinkFunction that is not migrated by ourselves. We
> (Flink
> > > > > contributors) probably do not need to mark SinkFunction as
> deprecated
> > > in
> > > > > order to address these issues in our own codebase.
> > > > >
> > > > >
> > > > > users and contributors is not high enough to move away from
> > > SinkFunction
> > > > as
> > > > > > long as its not deprecated, but at the same time we need people
> to
> > > > > migrate
> > > > > > their connectors to see if there are any gaps in SinkV2. I
> believe,
> > > the
> > > > > > combination proposed above could bridge this problem.
> > > > > >
> > > > > > 2) I don't understand the argument of waiting until some of the
> > > > > > implementations are @Public. How can we make the implementations
> of
> > > the
> > > > > > SinkV2 API @Public without making SinkV2 @Public? All public
> > methods
> > > of
> > > > > > SinkV2 are part of every implementation. So to me it actually
> seems
> > > to
> > > > be
> > > > > > opposite: in order to make any of the implementation @Public we
> > first
> > > > > need
> > > > > > to make the API @Public.
> > > > > >
> > > > >
> > > > > Yeah I also agree with you.
> > > > >
> > > > >
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > > lindong28@gmail.com
> > > > > >:
> > > > > >
> > > > > > > Hi Martijn,
> > > > > > >
> > > > > > > Thanks for driving this effort to clean-up the Flink codebase!
> > > > > > >
> > > > > > > I like the idea to cleanup Flink codebase to avoid having two
> > > Sinks.
> > > > On
> > > > > > the
> > > > > > > other hand, I also thing the concern mentioned by Jing makes
> > sense.
> > > > In
> > > > > > > addition to thinking in terms of the rule proposed in FLIP-197
> > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > >
> > > > > > > (which
> > > > > > > seems to focus mostly on the Flink developers' perspective), it
> > > might
> > > > > be
> > > > > > > useful to also think about the story from users' perspective
> and
> > > make
> > > > > > sure
> > > > > > > their concerns can be addressed.
> > > > > > >
> > > > > > > Typically, by marking an API as deprecated, we are effectively
> > > > telling
> > > > > > > users *they should start to migrate to the new API ASAP and we
> > > > reserve
> > > > > > the
> > > > > > > right to remove this API completely in the 1-2 releases*. Then
> it
> > > > might
> > > > > > be
> > > > > > > reasonable for users to ask questions such as:
> > > > > > >
> > > > > > > 1) Does SinkV2 public API provides all the functionalities
> needed
> > > to
> > > > > > > migrate my existing code from subclassing SinkFunction to
> > > subclassing
> > > > > > > SinkV2?
> > > > > > >
> > > > > > > 2) Is the amount of migration work reasonable? If yes, why is a
> > > class
> > > > > > such
> > > > > > > as HBaseSinkFunction in Flink's own codebase still depending on
> > > > > > > SinkFunction? Maybe Flink developers should eat their own dog
> > food
> > > > and
> > > > > > > migrate (or deprecate) those classes in the Flink codebase
> first?
> > > > > > >
> > > > > > > Based on the discussion in this thread, I am not sure we have
> > good
> > > > > > answers
> > > > > > > to those questions yet. For the 1st question above, the answer
> is
> > > > *no*
> > > > > > > because we already know that the SinkV2 is currently not able
> to
> > > > > support
> > > > > > > migration for JdbcSink. For the 2nd question above, we know
> there
> > > are
> > > > > > many
> > > > > > > non-deprecated class in Flink codebase that are still depending
> > on
> > > > > > SinkV2.
> > > > > > > It is probably not nice to users if we tell them to migrate
> while
> > > we
> > > > > know
> > > > > > > there are existing issues that can prevent them from doing so
> > > easily.
> > > > > > >
> > > > > > > In order to move forward to deprecate SinkV2, I think it will
> be
> > > > super
> > > > > > > useful to first migrate all the connectors managed by Flink
> > > community
> > > > > > > (including all externalized connectors) to use SinkV2. This
> work
> > > > won't
> > > > > be
> > > > > > > wasted since we need to do this anyway. And it will also give
> us
> > a
> > > > > chance
> > > > > > > to validate the capabilities of SinkV2 and fix bugs by
> ourselves
> > as
> > > > > much
> > > > > > as
> > > > > > > possible.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > Best Regards,
> > > > > > > Dong
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > > martijnvisser@apache.org
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > While discussing FLIP-281 [1] the discussion also turned to
> the
> > > > > > > > SinkFunction and the SinkV2 API. For a broader discussion I'm
> > > > opening
> > > > > > up
> > > > > > > a
> > > > > > > > separate discussion thread.
> > > > > > > >
> > > > > > > > As Yun Tang has mentioned in that discussion thread, it would
> > be
> > > a
> > > > > good
> > > > > > > > time to deprecate the SinkFunction to avoid the need to
> > introduce
> > > > new
> > > > > > > > functions towards (to be) deprecated APIs. Jing rightfully
> > > > mentioned
> > > > > > that
> > > > > > > > it would be confusing to deprecate the SinkFunction if its
> > > > successor
> > > > > is
> > > > > > > not
> > > > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > > > >
> > > > > > > > My proposal would be to promote the SinkV2 API to @public in
> > > Flink
> > > > > 1.17
> > > > > > > and
> > > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > > >
> > > > > > > > The original Sink interface was introduced in Flink 1.12 with
> > > > > FLIP-143
> > > > > > > [2]
> > > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been
> > > improved
> > > > on
> > > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > > >
> > > > > > > > Looking at the API stability graduation process [5], the fact
> > > that
> > > > > Sink
> > > > > > > V2
> > > > > > > > was introduced in Flink 1.15 would mean that we could
> warrant a
> > > > > > promotion
> > > > > > > > to @public already (given that there have been two releases
> > with
> > > > 1.15
> > > > > > and
> > > > > > > > 1.16 where it was introduced). Combined with the fact that
> > SinkV2
> > > > has
> > > > > > > been
> > > > > > > > the result of iteration over the introduction of the original
> > > Sink
> > > > > API
> > > > > > > > since Flink 1.12, I would argue that the promotion is
> overdue.
> > > > > > > >
> > > > > > > > If we promote the Sink API to @public, I think we should also
> > > > > > immediately
> > > > > > > > mark the SinkFunction as @deprecated.
> > > > > > > >
> > > > > > > > Looking forward to your thoughts.
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > >
> > > > > > > > Martijn
> > > > > > > >
> > > > > > > >
> > > > > > > > [1]
> > > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > > [2]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > > [3]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > > [4]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > > [5]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > https://twitter.com/snntrable
> > > > > > https://github.com/knaufk
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > https://twitter.com/snntrable
> > https://github.com/knaufk
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by yuxia <lu...@alumni.sjtu.edu.cn>.
Hi Konstantin,
Just FYI, the FileSystemTableSink are still using SinkFunction.

Best regards,
Yuxia

----- 原始邮件 -----
发件人: "Dong Lin" <li...@gmail.com>
收件人: "dev" <de...@flink.apache.org>
抄送: "Jing Ge" <ji...@ververica.com>, "Yun Tang" <my...@live.com>
发送时间: 星期二, 2023年 2 月 07日 上午 9:41:07
主题: Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Hi Konstantin,

Thanks for the reply. Please see my comments inline.

On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <kn...@apache.org> wrote:

> Hi Steven,
>
> Sink is already deprecated. It was deprecated at the moment where we
> introduced SinkV2.
>
> Hi Jark, Hi Dong,
>
> My understanding is the SinkV2 is a workable interface. The most important
> connectors have been migrated (Kafka, Filesystem) and more connectors
> (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make SinkV2
> public, it does not need to have all possible functionality. Public APIs
> can be extended. That's what we do all the time. There will also always be
> bugs. So, these points can not be categorical blockers to promote the API.
>

Right. I believe we all agree that we make SinkV2 @PublicEvolving. The
concern here is whether we can mark SinkFunction as deprecated at this
point.


>
> What are the specific issues/tickets that are blocking us? Can we in your
>

For example, Lijie mentioned earlier in this thread that according to FLIP-287,
currently the Sink.InitContext still lacks some necessary information to
migrate existing connectors to new sinks. This could be a blocker issue
since this is related to the SinkV2 API design.

And Yuxia mentioned earlier in this thread that there are bugs such as
FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
makes it hard to use SinkV2 properly in production.  It seems that these
two bugs are created months ago and are still unresolved or even
unassigned. This looks like a clear signal that SinkV2 is not being
actively maintained and used in production.


> opinion only deprecate it when every single connector in Apache Flink is
> migrated already?
>

Technically this is not a hard requirement. But I would suggest that we
should migrate all existing connectors so that we eat our own dogfood and
prove to users that SinkV2 is ready for use in production.


> In my opinion it is the time to ask users to the migrate their connectors.
> More importantly, @Deprecated would signal users not to build new
> connectors on SinkFunction. I would arque its also very misleading to users
> to not @Deprecated SinkFunction given that is clearly will be deprecated.




> Cheers,
>
> Konstantin
>
>
> Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <im...@gmail.com>:
>
> > I agree with Dong Lin.
> >
> > Oracle explains how to use Deprecate API [1]:
> >
> > You are strongly recommended to use the Javadoc @deprecated tag with
> > > appropriate comments explaining how to use the new API. This ensures
> > > developers will *have a workable migration path from the old API to the
> > > new API*.
> >
> >
> > From a user's perspective, the workable migration path is very important.
> > Otherwise, it blurs the semantics of API deprecation. The Flink API's
> > compatibility and stability issues in the past left a bad impression on
> the
> > downstream projects. We should be careful when changing and deprecating
> > APIs, especially when there are known migration gaps. I think it's a good
> > idea to migrate Flink-owned connectors before marking old API deprecated.
> > This ensures downstream projects can migrate to new APIs smoothly.
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> >
> > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com> wrote:
> >
> > > Regarding the discussion on global committer [1] for sinks with global
> > > transactions, there is no consensus on solving that problem in SinkV2.
> > Will
> > > it require any breaking change in SinkV2?
> > >
> > > Also will SinkV1 be deprecated too? or it should happen sometime after
> > > SinkFunction deprecation?
> > >
> > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > >
> > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com> wrote:
> > >
> > > > Hi Konstantin,
> > > >
> > > > Thanks for the comment! Please see my comment inline.
> > > >
> > > > Cheers,
> > > > Dong
> > > >
> > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kn...@apache.org>
> > > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > sorry for joining the discussion late.
> > > > >
> > > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17 while
> > > > leaving
> > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2
> > > @Public
> > > > in
> > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are intended
> > for
> > > > > public use. So, I don't see it as a blocker for deprecating
> > > SinkFunction
> > > > > that we have to make SinkV2 @PublicEvovling. For reference this is
> > the
> > > > > description of @PublicEvovling:
> > > > >
> > > > > /**
> > > > >  * Annotation to mark classes and methods for public use, but with
> > > > > evolving interfaces.
> > > > >  *
> > > > >  * <p>Classes and methods with this annotation are intended for
> > public
> > > > > use and have stable behavior.
> > > > >  * However, their interfaces and signatures are not considered to
> be
> > > > > stable and might be changed
> > > > >  * across versions.
> > > > >  *
> > > > >  * <p>This annotation also excludes methods and classes with
> evolving
> > > > > interfaces / signatures within
> > > > >  * classes annotated with {@link Public}.
> > > > >  */
> > > > >
> > > > >
> > > > > Marking SinkFunction @Deprecated would already single everyone to
> > move
> > > to
> > > > > SinkV2, which we as a community, I believe, have a strong interest
> > in.
> > > > Its
> > > > >
> > > >
> > > > Yes, I also believe we all have this strong interest. I just hope
> that
> > > this
> > > > can be done in the best possible way that does not confuse users.
> > > >
> > > > I probably still have the same concern regarding its impact on users:
> > if
> > > we
> > > > mark an API as deprecated, it effectively means the users of this API
> > > > should start to migrate to another API (e.g. SinkV2) and we might
> > remove
> > > > this API in the future. However, given that we know there are known
> > > > problems preventing users from doing so, it seems that we are not
> ready
> > > to
> > > > send this message to users right.
> > > >
> > > > If I understand correctly, I guess you are suggesting that by marking
> > > > SinkFunction as deprecated, we can put higher pressure on Flink
> > > > contributors to update the existing Flink codebase to improve and use
> > > > SinkV2.
> > > >
> > > > I am not sure this is the right way to use @deprecated, which has a
> > > > particular meaning for its users rather than contributors. And I am
> > also
> > > > not sure we can even pressure contributors of an open-source project
> > into
> > > > developing a feature (e.g. migrate all existing SinkFunction
> subclasses
> > > to
> > > > SinkV2). IMO, the typical way is for the contributor with
> interest/time
> > > to
> > > > work on the feature, or talk to other contributors whether they are
> > > willing
> > > > to collaborate/work on this, rather than pressuring other
> contributors
> > > into
> > > > working on this.
> > > >
> > > >
> > > > almost comical how long the transition from
> > SourceFurnction/SinkFunction
> > > to
> > > > > Source/Sink takes us. At the same time, we leave ourselves the
> option
> > > to
> > > > to
> > > > > make small changes to SinkV2 if any problems arise during the
> > migration
> > > > of
> > > > > these connector.
> > > > >
> > > > > I think, we have a bit of a chicken/egg problem here. The pressure
> > for
> > > > >
> > > >
> > > > Similar to the reason described above, I am not sure we have a
> > > chicken/egg
> > > > problem here. The issue here is that SinkV2 is not ready and we have
> a
> > > lot
> > > > of existing SinkFunction that is not migrated by ourselves. We (Flink
> > > > contributors) probably do not need to mark SinkFunction as deprecated
> > in
> > > > order to address these issues in our own codebase.
> > > >
> > > >
> > > > users and contributors is not high enough to move away from
> > SinkFunction
> > > as
> > > > > long as its not deprecated, but at the same time we need people to
> > > > migrate
> > > > > their connectors to see if there are any gaps in SinkV2. I believe,
> > the
> > > > > combination proposed above could bridge this problem.
> > > > >
> > > > > 2) I don't understand the argument of waiting until some of the
> > > > > implementations are @Public. How can we make the implementations of
> > the
> > > > > SinkV2 API @Public without making SinkV2 @Public? All public
> methods
> > of
> > > > > SinkV2 are part of every implementation. So to me it actually seems
> > to
> > > be
> > > > > opposite: in order to make any of the implementation @Public we
> first
> > > > need
> > > > > to make the API @Public.
> > > > >
> > > >
> > > > Yeah I also agree with you.
> > > >
> > > >
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Konstantin
> > > > >
> > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > lindong28@gmail.com
> > > > >:
> > > > >
> > > > > > Hi Martijn,
> > > > > >
> > > > > > Thanks for driving this effort to clean-up the Flink codebase!
> > > > > >
> > > > > > I like the idea to cleanup Flink codebase to avoid having two
> > Sinks.
> > > On
> > > > > the
> > > > > > other hand, I also thing the concern mentioned by Jing makes
> sense.
> > > In
> > > > > > addition to thinking in terms of the rule proposed in FLIP-197
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > >
> > > > > > (which
> > > > > > seems to focus mostly on the Flink developers' perspective), it
> > might
> > > > be
> > > > > > useful to also think about the story from users' perspective and
> > make
> > > > > sure
> > > > > > their concerns can be addressed.
> > > > > >
> > > > > > Typically, by marking an API as deprecated, we are effectively
> > > telling
> > > > > > users *they should start to migrate to the new API ASAP and we
> > > reserve
> > > > > the
> > > > > > right to remove this API completely in the 1-2 releases*. Then it
> > > might
> > > > > be
> > > > > > reasonable for users to ask questions such as:
> > > > > >
> > > > > > 1) Does SinkV2 public API provides all the functionalities needed
> > to
> > > > > > migrate my existing code from subclassing SinkFunction to
> > subclassing
> > > > > > SinkV2?
> > > > > >
> > > > > > 2) Is the amount of migration work reasonable? If yes, why is a
> > class
> > > > > such
> > > > > > as HBaseSinkFunction in Flink's own codebase still depending on
> > > > > > SinkFunction? Maybe Flink developers should eat their own dog
> food
> > > and
> > > > > > migrate (or deprecate) those classes in the Flink codebase first?
> > > > > >
> > > > > > Based on the discussion in this thread, I am not sure we have
> good
> > > > > answers
> > > > > > to those questions yet. For the 1st question above, the answer is
> > > *no*
> > > > > > because we already know that the SinkV2 is currently not able to
> > > > support
> > > > > > migration for JdbcSink. For the 2nd question above, we know there
> > are
> > > > > many
> > > > > > non-deprecated class in Flink codebase that are still depending
> on
> > > > > SinkV2.
> > > > > > It is probably not nice to users if we tell them to migrate while
> > we
> > > > know
> > > > > > there are existing issues that can prevent them from doing so
> > easily.
> > > > > >
> > > > > > In order to move forward to deprecate SinkV2, I think it will be
> > > super
> > > > > > useful to first migrate all the connectors managed by Flink
> > community
> > > > > > (including all externalized connectors) to use SinkV2. This work
> > > won't
> > > > be
> > > > > > wasted since we need to do this anyway. And it will also give us
> a
> > > > chance
> > > > > > to validate the capabilities of SinkV2 and fix bugs by ourselves
> as
> > > > much
> > > > > as
> > > > > > possible.
> > > > > >
> > > > > > What do you think?
> > > > > >
> > > > > > Best Regards,
> > > > > > Dong
> > > > > >
> > > > > >
> > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > martijnvisser@apache.org
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > While discussing FLIP-281 [1] the discussion also turned to the
> > > > > > > SinkFunction and the SinkV2 API. For a broader discussion I'm
> > > opening
> > > > > up
> > > > > > a
> > > > > > > separate discussion thread.
> > > > > > >
> > > > > > > As Yun Tang has mentioned in that discussion thread, it would
> be
> > a
> > > > good
> > > > > > > time to deprecate the SinkFunction to avoid the need to
> introduce
> > > new
> > > > > > > functions towards (to be) deprecated APIs. Jing rightfully
> > > mentioned
> > > > > that
> > > > > > > it would be confusing to deprecate the SinkFunction if its
> > > successor
> > > > is
> > > > > > not
> > > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > > >
> > > > > > > My proposal would be to promote the SinkV2 API to @public in
> > Flink
> > > > 1.17
> > > > > > and
> > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > >
> > > > > > > The original Sink interface was introduced in Flink 1.12 with
> > > > FLIP-143
> > > > > > [2]
> > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been
> > improved
> > > on
> > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > >
> > > > > > > Looking at the API stability graduation process [5], the fact
> > that
> > > > Sink
> > > > > > V2
> > > > > > > was introduced in Flink 1.15 would mean that we could warrant a
> > > > > promotion
> > > > > > > to @public already (given that there have been two releases
> with
> > > 1.15
> > > > > and
> > > > > > > 1.16 where it was introduced). Combined with the fact that
> SinkV2
> > > has
> > > > > > been
> > > > > > > the result of iteration over the introduction of the original
> > Sink
> > > > API
> > > > > > > since Flink 1.12, I would argue that the promotion is overdue.
> > > > > > >
> > > > > > > If we promote the Sink API to @public, I think we should also
> > > > > immediately
> > > > > > > mark the SinkFunction as @deprecated.
> > > > > > >
> > > > > > > Looking forward to your thoughts.
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Martijn
> > > > > > >
> > > > > > >
> > > > > > > [1]
> > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > [2]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > [3]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > [4]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > [5]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > https://twitter.com/snntrable
> > > > > https://github.com/knaufk
> > > > >
> > > >
> > >
> >
>
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Dong Lin <li...@gmail.com>.
Hi Konstantin,

Thanks for the reply. Please see my comments inline.

On Mon, Feb 6, 2023 at 9:48 PM Konstantin Knauf <kn...@apache.org> wrote:

> Hi Steven,
>
> Sink is already deprecated. It was deprecated at the moment where we
> introduced SinkV2.
>
> Hi Jark, Hi Dong,
>
> My understanding is the SinkV2 is a workable interface. The most important
> connectors have been migrated (Kafka, Filesystem) and more connectors
> (OpenSearch, ElasticSearch, Kinesis) use it successfully. To make SinkV2
> public, it does not need to have all possible functionality. Public APIs
> can be extended. That's what we do all the time. There will also always be
> bugs. So, these points can not be categorical blockers to promote the API.
>

Right. I believe we all agree that we make SinkV2 @PublicEvolving. The
concern here is whether we can mark SinkFunction as deprecated at this
point.


>
> What are the specific issues/tickets that are blocking us? Can we in your
>

For example, Lijie mentioned earlier in this thread that according to FLIP-287,
currently the Sink.InitContext still lacks some necessary information to
migrate existing connectors to new sinks. This could be a blocker issue
since this is related to the SinkV2 API design.

And Yuxia mentioned earlier in this thread that there are bugs such as
FLINK-30238 <https://issues.apache.org/jira/browse/FLINK-30238> and
FLINK-29459 <https://issues.apache.org/jira/browse/FLINK-29459> , which
makes it hard to use SinkV2 properly in production.  It seems that these
two bugs are created months ago and are still unresolved or even
unassigned. This looks like a clear signal that SinkV2 is not being
actively maintained and used in production.


> opinion only deprecate it when every single connector in Apache Flink is
> migrated already?
>

Technically this is not a hard requirement. But I would suggest that we
should migrate all existing connectors so that we eat our own dogfood and
prove to users that SinkV2 is ready for use in production.


> In my opinion it is the time to ask users to the migrate their connectors.
> More importantly, @Deprecated would signal users not to build new
> connectors on SinkFunction. I would arque its also very misleading to users
> to not @Deprecated SinkFunction given that is clearly will be deprecated.




> Cheers,
>
> Konstantin
>
>
> Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <im...@gmail.com>:
>
> > I agree with Dong Lin.
> >
> > Oracle explains how to use Deprecate API [1]:
> >
> > You are strongly recommended to use the Javadoc @deprecated tag with
> > > appropriate comments explaining how to use the new API. This ensures
> > > developers will *have a workable migration path from the old API to the
> > > new API*.
> >
> >
> > From a user's perspective, the workable migration path is very important.
> > Otherwise, it blurs the semantics of API deprecation. The Flink API's
> > compatibility and stability issues in the past left a bad impression on
> the
> > downstream projects. We should be careful when changing and deprecating
> > APIs, especially when there are known migration gaps. I think it's a good
> > idea to migrate Flink-owned connectors before marking old API deprecated.
> > This ensures downstream projects can migrate to new APIs smoothly.
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> >
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
> >
> > On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com> wrote:
> >
> > > Regarding the discussion on global committer [1] for sinks with global
> > > transactions, there is no consensus on solving that problem in SinkV2.
> > Will
> > > it require any breaking change in SinkV2?
> > >
> > > Also will SinkV1 be deprecated too? or it should happen sometime after
> > > SinkFunction deprecation?
> > >
> > > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> > >
> > > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com> wrote:
> > >
> > > > Hi Konstantin,
> > > >
> > > > Thanks for the comment! Please see my comment inline.
> > > >
> > > > Cheers,
> > > > Dong
> > > >
> > > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kn...@apache.org>
> > > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > sorry for joining the discussion late.
> > > > >
> > > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17 while
> > > > leaving
> > > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2
> > > @Public
> > > > in
> > > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are intended
> > for
> > > > > public use. So, I don't see it as a blocker for deprecating
> > > SinkFunction
> > > > > that we have to make SinkV2 @PublicEvovling. For reference this is
> > the
> > > > > description of @PublicEvovling:
> > > > >
> > > > > /**
> > > > >  * Annotation to mark classes and methods for public use, but with
> > > > > evolving interfaces.
> > > > >  *
> > > > >  * <p>Classes and methods with this annotation are intended for
> > public
> > > > > use and have stable behavior.
> > > > >  * However, their interfaces and signatures are not considered to
> be
> > > > > stable and might be changed
> > > > >  * across versions.
> > > > >  *
> > > > >  * <p>This annotation also excludes methods and classes with
> evolving
> > > > > interfaces / signatures within
> > > > >  * classes annotated with {@link Public}.
> > > > >  */
> > > > >
> > > > >
> > > > > Marking SinkFunction @Deprecated would already single everyone to
> > move
> > > to
> > > > > SinkV2, which we as a community, I believe, have a strong interest
> > in.
> > > > Its
> > > > >
> > > >
> > > > Yes, I also believe we all have this strong interest. I just hope
> that
> > > this
> > > > can be done in the best possible way that does not confuse users.
> > > >
> > > > I probably still have the same concern regarding its impact on users:
> > if
> > > we
> > > > mark an API as deprecated, it effectively means the users of this API
> > > > should start to migrate to another API (e.g. SinkV2) and we might
> > remove
> > > > this API in the future. However, given that we know there are known
> > > > problems preventing users from doing so, it seems that we are not
> ready
> > > to
> > > > send this message to users right.
> > > >
> > > > If I understand correctly, I guess you are suggesting that by marking
> > > > SinkFunction as deprecated, we can put higher pressure on Flink
> > > > contributors to update the existing Flink codebase to improve and use
> > > > SinkV2.
> > > >
> > > > I am not sure this is the right way to use @deprecated, which has a
> > > > particular meaning for its users rather than contributors. And I am
> > also
> > > > not sure we can even pressure contributors of an open-source project
> > into
> > > > developing a feature (e.g. migrate all existing SinkFunction
> subclasses
> > > to
> > > > SinkV2). IMO, the typical way is for the contributor with
> interest/time
> > > to
> > > > work on the feature, or talk to other contributors whether they are
> > > willing
> > > > to collaborate/work on this, rather than pressuring other
> contributors
> > > into
> > > > working on this.
> > > >
> > > >
> > > > almost comical how long the transition from
> > SourceFurnction/SinkFunction
> > > to
> > > > > Source/Sink takes us. At the same time, we leave ourselves the
> option
> > > to
> > > > to
> > > > > make small changes to SinkV2 if any problems arise during the
> > migration
> > > > of
> > > > > these connector.
> > > > >
> > > > > I think, we have a bit of a chicken/egg problem here. The pressure
> > for
> > > > >
> > > >
> > > > Similar to the reason described above, I am not sure we have a
> > > chicken/egg
> > > > problem here. The issue here is that SinkV2 is not ready and we have
> a
> > > lot
> > > > of existing SinkFunction that is not migrated by ourselves. We (Flink
> > > > contributors) probably do not need to mark SinkFunction as deprecated
> > in
> > > > order to address these issues in our own codebase.
> > > >
> > > >
> > > > users and contributors is not high enough to move away from
> > SinkFunction
> > > as
> > > > > long as its not deprecated, but at the same time we need people to
> > > > migrate
> > > > > their connectors to see if there are any gaps in SinkV2. I believe,
> > the
> > > > > combination proposed above could bridge this problem.
> > > > >
> > > > > 2) I don't understand the argument of waiting until some of the
> > > > > implementations are @Public. How can we make the implementations of
> > the
> > > > > SinkV2 API @Public without making SinkV2 @Public? All public
> methods
> > of
> > > > > SinkV2 are part of every implementation. So to me it actually seems
> > to
> > > be
> > > > > opposite: in order to make any of the implementation @Public we
> first
> > > > need
> > > > > to make the API @Public.
> > > > >
> > > >
> > > > Yeah I also agree with you.
> > > >
> > > >
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Konstantin
> > > > >
> > > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > > lindong28@gmail.com
> > > > >:
> > > > >
> > > > > > Hi Martijn,
> > > > > >
> > > > > > Thanks for driving this effort to clean-up the Flink codebase!
> > > > > >
> > > > > > I like the idea to cleanup Flink codebase to avoid having two
> > Sinks.
> > > On
> > > > > the
> > > > > > other hand, I also thing the concern mentioned by Jing makes
> sense.
> > > In
> > > > > > addition to thinking in terms of the rule proposed in FLIP-197
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > >
> > > > > > (which
> > > > > > seems to focus mostly on the Flink developers' perspective), it
> > might
> > > > be
> > > > > > useful to also think about the story from users' perspective and
> > make
> > > > > sure
> > > > > > their concerns can be addressed.
> > > > > >
> > > > > > Typically, by marking an API as deprecated, we are effectively
> > > telling
> > > > > > users *they should start to migrate to the new API ASAP and we
> > > reserve
> > > > > the
> > > > > > right to remove this API completely in the 1-2 releases*. Then it
> > > might
> > > > > be
> > > > > > reasonable for users to ask questions such as:
> > > > > >
> > > > > > 1) Does SinkV2 public API provides all the functionalities needed
> > to
> > > > > > migrate my existing code from subclassing SinkFunction to
> > subclassing
> > > > > > SinkV2?
> > > > > >
> > > > > > 2) Is the amount of migration work reasonable? If yes, why is a
> > class
> > > > > such
> > > > > > as HBaseSinkFunction in Flink's own codebase still depending on
> > > > > > SinkFunction? Maybe Flink developers should eat their own dog
> food
> > > and
> > > > > > migrate (or deprecate) those classes in the Flink codebase first?
> > > > > >
> > > > > > Based on the discussion in this thread, I am not sure we have
> good
> > > > > answers
> > > > > > to those questions yet. For the 1st question above, the answer is
> > > *no*
> > > > > > because we already know that the SinkV2 is currently not able to
> > > > support
> > > > > > migration for JdbcSink. For the 2nd question above, we know there
> > are
> > > > > many
> > > > > > non-deprecated class in Flink codebase that are still depending
> on
> > > > > SinkV2.
> > > > > > It is probably not nice to users if we tell them to migrate while
> > we
> > > > know
> > > > > > there are existing issues that can prevent them from doing so
> > easily.
> > > > > >
> > > > > > In order to move forward to deprecate SinkV2, I think it will be
> > > super
> > > > > > useful to first migrate all the connectors managed by Flink
> > community
> > > > > > (including all externalized connectors) to use SinkV2. This work
> > > won't
> > > > be
> > > > > > wasted since we need to do this anyway. And it will also give us
> a
> > > > chance
> > > > > > to validate the capabilities of SinkV2 and fix bugs by ourselves
> as
> > > > much
> > > > > as
> > > > > > possible.
> > > > > >
> > > > > > What do you think?
> > > > > >
> > > > > > Best Regards,
> > > > > > Dong
> > > > > >
> > > > > >
> > > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > > martijnvisser@apache.org
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > While discussing FLIP-281 [1] the discussion also turned to the
> > > > > > > SinkFunction and the SinkV2 API. For a broader discussion I'm
> > > opening
> > > > > up
> > > > > > a
> > > > > > > separate discussion thread.
> > > > > > >
> > > > > > > As Yun Tang has mentioned in that discussion thread, it would
> be
> > a
> > > > good
> > > > > > > time to deprecate the SinkFunction to avoid the need to
> introduce
> > > new
> > > > > > > functions towards (to be) deprecated APIs. Jing rightfully
> > > mentioned
> > > > > that
> > > > > > > it would be confusing to deprecate the SinkFunction if its
> > > successor
> > > > is
> > > > > > not
> > > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > > >
> > > > > > > My proposal would be to promote the SinkV2 API to @public in
> > Flink
> > > > 1.17
> > > > > > and
> > > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > > >
> > > > > > > The original Sink interface was introduced in Flink 1.12 with
> > > > FLIP-143
> > > > > > [2]
> > > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been
> > improved
> > > on
> > > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > > >
> > > > > > > Looking at the API stability graduation process [5], the fact
> > that
> > > > Sink
> > > > > > V2
> > > > > > > was introduced in Flink 1.15 would mean that we could warrant a
> > > > > promotion
> > > > > > > to @public already (given that there have been two releases
> with
> > > 1.15
> > > > > and
> > > > > > > 1.16 where it was introduced). Combined with the fact that
> SinkV2
> > > has
> > > > > > been
> > > > > > > the result of iteration over the introduction of the original
> > Sink
> > > > API
> > > > > > > since Flink 1.12, I would argue that the promotion is overdue.
> > > > > > >
> > > > > > > If we promote the Sink API to @public, I think we should also
> > > > > immediately
> > > > > > > mark the SinkFunction as @deprecated.
> > > > > > >
> > > > > > > Looking forward to your thoughts.
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Martijn
> > > > > > >
> > > > > > >
> > > > > > > [1]
> > > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > > [2]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > > [3]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > > [4]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > > [5]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > https://twitter.com/snntrable
> > > > > https://github.com/knaufk
> > > > >
> > > >
> > >
> >
>
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Konstantin Knauf <kn...@apache.org>.
Hi Steven,

Sink is already deprecated. It was deprecated at the moment where we
introduced SinkV2.

Hi Jark, Hi Dong,

My understanding is the SinkV2 is a workable interface. The most important
connectors have been migrated (Kafka, Filesystem) and more connectors
(OpenSearch, ElasticSearch, Kinesis) use it successfully. To make SinkV2
public, it does not need to have all possible functionality. Public APIs
can be extended. That's what we do all the time. There will also always be
bugs. So, these points can not be categorical blockers to promote the API.

What are the specific issues/tickets that are blocking us? Can we in your
opinion only deprecate it when every single connector in Apache Flink is
migrated already?

In my opinion it is the time to ask users to the migrate their connectors.
More importantly, @Deprecated would signal users not to build new
connectors on SinkFunction. I would arque its also very misleading to users
to not @Deprecated SinkFunction given that is clearly will be deprecated.

Cheers,

Konstantin


Am Mo., 6. Feb. 2023 um 13:26 Uhr schrieb Jark Wu <im...@gmail.com>:

> I agree with Dong Lin.
>
> Oracle explains how to use Deprecate API [1]:
>
> You are strongly recommended to use the Javadoc @deprecated tag with
> > appropriate comments explaining how to use the new API. This ensures
> > developers will *have a workable migration path from the old API to the
> > new API*.
>
>
> From a user's perspective, the workable migration path is very important.
> Otherwise, it blurs the semantics of API deprecation. The Flink API's
> compatibility and stability issues in the past left a bad impression on the
> downstream projects. We should be careful when changing and deprecating
> APIs, especially when there are known migration gaps. I think it's a good
> idea to migrate Flink-owned connectors before marking old API deprecated.
> This ensures downstream projects can migrate to new APIs smoothly.
>
> Best,
> Jark
>
> [1]:
>
> https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html
>
> On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com> wrote:
>
> > Regarding the discussion on global committer [1] for sinks with global
> > transactions, there is no consensus on solving that problem in SinkV2.
> Will
> > it require any breaking change in SinkV2?
> >
> > Also will SinkV1 be deprecated too? or it should happen sometime after
> > SinkFunction deprecation?
> >
> > [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
> >
> > On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com> wrote:
> >
> > > Hi Konstantin,
> > >
> > > Thanks for the comment! Please see my comment inline.
> > >
> > > Cheers,
> > > Dong
> > >
> > > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kn...@apache.org>
> > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > sorry for joining the discussion late.
> > > >
> > > > 1) Is there an option to deprecate SinkFunction in Flink 1.17 while
> > > leaving
> > > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2
> > @Public
> > > in
> > > > and remove SinkFunction in Flink 1.18. @PublicEvolving are intended
> for
> > > > public use. So, I don't see it as a blocker for deprecating
> > SinkFunction
> > > > that we have to make SinkV2 @PublicEvovling. For reference this is
> the
> > > > description of @PublicEvovling:
> > > >
> > > > /**
> > > >  * Annotation to mark classes and methods for public use, but with
> > > > evolving interfaces.
> > > >  *
> > > >  * <p>Classes and methods with this annotation are intended for
> public
> > > > use and have stable behavior.
> > > >  * However, their interfaces and signatures are not considered to be
> > > > stable and might be changed
> > > >  * across versions.
> > > >  *
> > > >  * <p>This annotation also excludes methods and classes with evolving
> > > > interfaces / signatures within
> > > >  * classes annotated with {@link Public}.
> > > >  */
> > > >
> > > >
> > > > Marking SinkFunction @Deprecated would already single everyone to
> move
> > to
> > > > SinkV2, which we as a community, I believe, have a strong interest
> in.
> > > Its
> > > >
> > >
> > > Yes, I also believe we all have this strong interest. I just hope that
> > this
> > > can be done in the best possible way that does not confuse users.
> > >
> > > I probably still have the same concern regarding its impact on users:
> if
> > we
> > > mark an API as deprecated, it effectively means the users of this API
> > > should start to migrate to another API (e.g. SinkV2) and we might
> remove
> > > this API in the future. However, given that we know there are known
> > > problems preventing users from doing so, it seems that we are not ready
> > to
> > > send this message to users right.
> > >
> > > If I understand correctly, I guess you are suggesting that by marking
> > > SinkFunction as deprecated, we can put higher pressure on Flink
> > > contributors to update the existing Flink codebase to improve and use
> > > SinkV2.
> > >
> > > I am not sure this is the right way to use @deprecated, which has a
> > > particular meaning for its users rather than contributors. And I am
> also
> > > not sure we can even pressure contributors of an open-source project
> into
> > > developing a feature (e.g. migrate all existing SinkFunction subclasses
> > to
> > > SinkV2). IMO, the typical way is for the contributor with interest/time
> > to
> > > work on the feature, or talk to other contributors whether they are
> > willing
> > > to collaborate/work on this, rather than pressuring other contributors
> > into
> > > working on this.
> > >
> > >
> > > almost comical how long the transition from
> SourceFurnction/SinkFunction
> > to
> > > > Source/Sink takes us. At the same time, we leave ourselves the option
> > to
> > > to
> > > > make small changes to SinkV2 if any problems arise during the
> migration
> > > of
> > > > these connector.
> > > >
> > > > I think, we have a bit of a chicken/egg problem here. The pressure
> for
> > > >
> > >
> > > Similar to the reason described above, I am not sure we have a
> > chicken/egg
> > > problem here. The issue here is that SinkV2 is not ready and we have a
> > lot
> > > of existing SinkFunction that is not migrated by ourselves. We (Flink
> > > contributors) probably do not need to mark SinkFunction as deprecated
> in
> > > order to address these issues in our own codebase.
> > >
> > >
> > > users and contributors is not high enough to move away from
> SinkFunction
> > as
> > > > long as its not deprecated, but at the same time we need people to
> > > migrate
> > > > their connectors to see if there are any gaps in SinkV2. I believe,
> the
> > > > combination proposed above could bridge this problem.
> > > >
> > > > 2) I don't understand the argument of waiting until some of the
> > > > implementations are @Public. How can we make the implementations of
> the
> > > > SinkV2 API @Public without making SinkV2 @Public? All public methods
> of
> > > > SinkV2 are part of every implementation. So to me it actually seems
> to
> > be
> > > > opposite: in order to make any of the implementation @Public we first
> > > need
> > > > to make the API @Public.
> > > >
> > >
> > > Yeah I also agree with you.
> > >
> > >
> > > >
> > > > Cheers,
> > > >
> > > > Konstantin
> > > >
> > > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> > lindong28@gmail.com
> > > >:
> > > >
> > > > > Hi Martijn,
> > > > >
> > > > > Thanks for driving this effort to clean-up the Flink codebase!
> > > > >
> > > > > I like the idea to cleanup Flink codebase to avoid having two
> Sinks.
> > On
> > > > the
> > > > > other hand, I also thing the concern mentioned by Jing makes sense.
> > In
> > > > > addition to thinking in terms of the rule proposed in FLIP-197
> > > > > <
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > >
> > > > > (which
> > > > > seems to focus mostly on the Flink developers' perspective), it
> might
> > > be
> > > > > useful to also think about the story from users' perspective and
> make
> > > > sure
> > > > > their concerns can be addressed.
> > > > >
> > > > > Typically, by marking an API as deprecated, we are effectively
> > telling
> > > > > users *they should start to migrate to the new API ASAP and we
> > reserve
> > > > the
> > > > > right to remove this API completely in the 1-2 releases*. Then it
> > might
> > > > be
> > > > > reasonable for users to ask questions such as:
> > > > >
> > > > > 1) Does SinkV2 public API provides all the functionalities needed
> to
> > > > > migrate my existing code from subclassing SinkFunction to
> subclassing
> > > > > SinkV2?
> > > > >
> > > > > 2) Is the amount of migration work reasonable? If yes, why is a
> class
> > > > such
> > > > > as HBaseSinkFunction in Flink's own codebase still depending on
> > > > > SinkFunction? Maybe Flink developers should eat their own dog food
> > and
> > > > > migrate (or deprecate) those classes in the Flink codebase first?
> > > > >
> > > > > Based on the discussion in this thread, I am not sure we have good
> > > > answers
> > > > > to those questions yet. For the 1st question above, the answer is
> > *no*
> > > > > because we already know that the SinkV2 is currently not able to
> > > support
> > > > > migration for JdbcSink. For the 2nd question above, we know there
> are
> > > > many
> > > > > non-deprecated class in Flink codebase that are still depending on
> > > > SinkV2.
> > > > > It is probably not nice to users if we tell them to migrate while
> we
> > > know
> > > > > there are existing issues that can prevent them from doing so
> easily.
> > > > >
> > > > > In order to move forward to deprecate SinkV2, I think it will be
> > super
> > > > > useful to first migrate all the connectors managed by Flink
> community
> > > > > (including all externalized connectors) to use SinkV2. This work
> > won't
> > > be
> > > > > wasted since we need to do this anyway. And it will also give us a
> > > chance
> > > > > to validate the capabilities of SinkV2 and fix bugs by ourselves as
> > > much
> > > > as
> > > > > possible.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Best Regards,
> > > > > Dong
> > > > >
> > > > >
> > > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > > martijnvisser@apache.org
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > While discussing FLIP-281 [1] the discussion also turned to the
> > > > > > SinkFunction and the SinkV2 API. For a broader discussion I'm
> > opening
> > > > up
> > > > > a
> > > > > > separate discussion thread.
> > > > > >
> > > > > > As Yun Tang has mentioned in that discussion thread, it would be
> a
> > > good
> > > > > > time to deprecate the SinkFunction to avoid the need to introduce
> > new
> > > > > > functions towards (to be) deprecated APIs. Jing rightfully
> > mentioned
> > > > that
> > > > > > it would be confusing to deprecate the SinkFunction if its
> > successor
> > > is
> > > > > not
> > > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > > >
> > > > > > My proposal would be to promote the SinkV2 API to @public in
> Flink
> > > 1.17
> > > > > and
> > > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > > >
> > > > > > The original Sink interface was introduced in Flink 1.12 with
> > > FLIP-143
> > > > > [2]
> > > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been
> improved
> > on
> > > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > > >
> > > > > > Looking at the API stability graduation process [5], the fact
> that
> > > Sink
> > > > > V2
> > > > > > was introduced in Flink 1.15 would mean that we could warrant a
> > > > promotion
> > > > > > to @public already (given that there have been two releases with
> > 1.15
> > > > and
> > > > > > 1.16 where it was introduced). Combined with the fact that SinkV2
> > has
> > > > > been
> > > > > > the result of iteration over the introduction of the original
> Sink
> > > API
> > > > > > since Flink 1.12, I would argue that the promotion is overdue.
> > > > > >
> > > > > > If we promote the Sink API to @public, I think we should also
> > > > immediately
> > > > > > mark the SinkFunction as @deprecated.
> > > > > >
> > > > > > Looking forward to your thoughts.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Martijn
> > > > > >
> > > > > >
> > > > > > [1]
> > https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > > [2]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > > [3]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > > [4]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > > [5]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > https://twitter.com/snntrable
> > > > https://github.com/knaufk
> > > >
> > >
> >
>


-- 
https://twitter.com/snntrable
https://github.com/knaufk

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Jark Wu <im...@gmail.com>.
I agree with Dong Lin.

Oracle explains how to use Deprecate API [1]:

You are strongly recommended to use the Javadoc @deprecated tag with
> appropriate comments explaining how to use the new API. This ensures
> developers will *have a workable migration path from the old API to the
> new API*.


From a user's perspective, the workable migration path is very important.
Otherwise, it blurs the semantics of API deprecation. The Flink API's
compatibility and stability issues in the past left a bad impression on the
downstream projects. We should be careful when changing and deprecating
APIs, especially when there are known migration gaps. I think it's a good
idea to migrate Flink-owned connectors before marking old API deprecated.
This ensures downstream projects can migrate to new APIs smoothly.

Best,
Jark

[1]:
https://docs.oracle.com/javase/8/docs/technotes/guides/javadoc/deprecation/deprecation.html

On Mon, 6 Feb 2023 at 10:01, Steven Wu <st...@gmail.com> wrote:

> Regarding the discussion on global committer [1] for sinks with global
> transactions, there is no consensus on solving that problem in SinkV2. Will
> it require any breaking change in SinkV2?
>
> Also will SinkV1 be deprecated too? or it should happen sometime after
> SinkFunction deprecation?
>
> [1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj
>
> On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com> wrote:
>
> > Hi Konstantin,
> >
> > Thanks for the comment! Please see my comment inline.
> >
> > Cheers,
> > Dong
> >
> > On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kn...@apache.org>
> wrote:
> >
> > > Hi everyone,
> > >
> > > sorry for joining the discussion late.
> > >
> > > 1) Is there an option to deprecate SinkFunction in Flink 1.17 while
> > leaving
> > > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2
> @Public
> > in
> > > and remove SinkFunction in Flink 1.18. @PublicEvolving are intended for
> > > public use. So, I don't see it as a blocker for deprecating
> SinkFunction
> > > that we have to make SinkV2 @PublicEvovling. For reference this is the
> > > description of @PublicEvovling:
> > >
> > > /**
> > >  * Annotation to mark classes and methods for public use, but with
> > > evolving interfaces.
> > >  *
> > >  * <p>Classes and methods with this annotation are intended for public
> > > use and have stable behavior.
> > >  * However, their interfaces and signatures are not considered to be
> > > stable and might be changed
> > >  * across versions.
> > >  *
> > >  * <p>This annotation also excludes methods and classes with evolving
> > > interfaces / signatures within
> > >  * classes annotated with {@link Public}.
> > >  */
> > >
> > >
> > > Marking SinkFunction @Deprecated would already single everyone to move
> to
> > > SinkV2, which we as a community, I believe, have a strong interest in.
> > Its
> > >
> >
> > Yes, I also believe we all have this strong interest. I just hope that
> this
> > can be done in the best possible way that does not confuse users.
> >
> > I probably still have the same concern regarding its impact on users: if
> we
> > mark an API as deprecated, it effectively means the users of this API
> > should start to migrate to another API (e.g. SinkV2) and we might remove
> > this API in the future. However, given that we know there are known
> > problems preventing users from doing so, it seems that we are not ready
> to
> > send this message to users right.
> >
> > If I understand correctly, I guess you are suggesting that by marking
> > SinkFunction as deprecated, we can put higher pressure on Flink
> > contributors to update the existing Flink codebase to improve and use
> > SinkV2.
> >
> > I am not sure this is the right way to use @deprecated, which has a
> > particular meaning for its users rather than contributors. And I am also
> > not sure we can even pressure contributors of an open-source project into
> > developing a feature (e.g. migrate all existing SinkFunction subclasses
> to
> > SinkV2). IMO, the typical way is for the contributor with interest/time
> to
> > work on the feature, or talk to other contributors whether they are
> willing
> > to collaborate/work on this, rather than pressuring other contributors
> into
> > working on this.
> >
> >
> > almost comical how long the transition from SourceFurnction/SinkFunction
> to
> > > Source/Sink takes us. At the same time, we leave ourselves the option
> to
> > to
> > > make small changes to SinkV2 if any problems arise during the migration
> > of
> > > these connector.
> > >
> > > I think, we have a bit of a chicken/egg problem here. The pressure for
> > >
> >
> > Similar to the reason described above, I am not sure we have a
> chicken/egg
> > problem here. The issue here is that SinkV2 is not ready and we have a
> lot
> > of existing SinkFunction that is not migrated by ourselves. We (Flink
> > contributors) probably do not need to mark SinkFunction as deprecated in
> > order to address these issues in our own codebase.
> >
> >
> > users and contributors is not high enough to move away from SinkFunction
> as
> > > long as its not deprecated, but at the same time we need people to
> > migrate
> > > their connectors to see if there are any gaps in SinkV2. I believe, the
> > > combination proposed above could bridge this problem.
> > >
> > > 2) I don't understand the argument of waiting until some of the
> > > implementations are @Public. How can we make the implementations of the
> > > SinkV2 API @Public without making SinkV2 @Public? All public methods of
> > > SinkV2 are part of every implementation. So to me it actually seems to
> be
> > > opposite: in order to make any of the implementation @Public we first
> > need
> > > to make the API @Public.
> > >
> >
> > Yeah I also agree with you.
> >
> >
> > >
> > > Cheers,
> > >
> > > Konstantin
> > >
> > > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <
> lindong28@gmail.com
> > >:
> > >
> > > > Hi Martijn,
> > > >
> > > > Thanks for driving this effort to clean-up the Flink codebase!
> > > >
> > > > I like the idea to cleanup Flink codebase to avoid having two Sinks.
> On
> > > the
> > > > other hand, I also thing the concern mentioned by Jing makes sense.
> In
> > > > addition to thinking in terms of the rule proposed in FLIP-197
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > >
> > > > (which
> > > > seems to focus mostly on the Flink developers' perspective), it might
> > be
> > > > useful to also think about the story from users' perspective and make
> > > sure
> > > > their concerns can be addressed.
> > > >
> > > > Typically, by marking an API as deprecated, we are effectively
> telling
> > > > users *they should start to migrate to the new API ASAP and we
> reserve
> > > the
> > > > right to remove this API completely in the 1-2 releases*. Then it
> might
> > > be
> > > > reasonable for users to ask questions such as:
> > > >
> > > > 1) Does SinkV2 public API provides all the functionalities needed to
> > > > migrate my existing code from subclassing SinkFunction to subclassing
> > > > SinkV2?
> > > >
> > > > 2) Is the amount of migration work reasonable? If yes, why is a class
> > > such
> > > > as HBaseSinkFunction in Flink's own codebase still depending on
> > > > SinkFunction? Maybe Flink developers should eat their own dog food
> and
> > > > migrate (or deprecate) those classes in the Flink codebase first?
> > > >
> > > > Based on the discussion in this thread, I am not sure we have good
> > > answers
> > > > to those questions yet. For the 1st question above, the answer is
> *no*
> > > > because we already know that the SinkV2 is currently not able to
> > support
> > > > migration for JdbcSink. For the 2nd question above, we know there are
> > > many
> > > > non-deprecated class in Flink codebase that are still depending on
> > > SinkV2.
> > > > It is probably not nice to users if we tell them to migrate while we
> > know
> > > > there are existing issues that can prevent them from doing so easily.
> > > >
> > > > In order to move forward to deprecate SinkV2, I think it will be
> super
> > > > useful to first migrate all the connectors managed by Flink community
> > > > (including all externalized connectors) to use SinkV2. This work
> won't
> > be
> > > > wasted since we need to do this anyway. And it will also give us a
> > chance
> > > > to validate the capabilities of SinkV2 and fix bugs by ourselves as
> > much
> > > as
> > > > possible.
> > > >
> > > > What do you think?
> > > >
> > > > Best Regards,
> > > > Dong
> > > >
> > > >
> > > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> > martijnvisser@apache.org
> > > >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > While discussing FLIP-281 [1] the discussion also turned to the
> > > > > SinkFunction and the SinkV2 API. For a broader discussion I'm
> opening
> > > up
> > > > a
> > > > > separate discussion thread.
> > > > >
> > > > > As Yun Tang has mentioned in that discussion thread, it would be a
> > good
> > > > > time to deprecate the SinkFunction to avoid the need to introduce
> new
> > > > > functions towards (to be) deprecated APIs. Jing rightfully
> mentioned
> > > that
> > > > > it would be confusing to deprecate the SinkFunction if its
> successor
> > is
> > > > not
> > > > > yet marked as @Public (it's currently @PublicEvolving).
> > > > >
> > > > > My proposal would be to promote the SinkV2 API to @public in Flink
> > 1.17
> > > > and
> > > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > > >
> > > > > The original Sink interface was introduced in Flink 1.12 with
> > FLIP-143
> > > > [2]
> > > > > and extended with FLIP-177 in Flink 1.14 [3] and has been improved
> on
> > > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > > >
> > > > > Looking at the API stability graduation process [5], the fact that
> > Sink
> > > > V2
> > > > > was introduced in Flink 1.15 would mean that we could warrant a
> > > promotion
> > > > > to @public already (given that there have been two releases with
> 1.15
> > > and
> > > > > 1.16 where it was introduced). Combined with the fact that SinkV2
> has
> > > > been
> > > > > the result of iteration over the introduction of the original Sink
> > API
> > > > > since Flink 1.12, I would argue that the promotion is overdue.
> > > > >
> > > > > If we promote the Sink API to @public, I think we should also
> > > immediately
> > > > > mark the SinkFunction as @deprecated.
> > > > >
> > > > > Looking forward to your thoughts.
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Martijn
> > > > >
> > > > >
> > > > > [1]
> https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > > [2]
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > > [3]
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > > [4]
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > > [5]
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > > >
> > > >
> > >
> > >
> > > --
> > > https://twitter.com/snntrable
> > > https://github.com/knaufk
> > >
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Steven Wu <st...@gmail.com>.
Regarding the discussion on global committer [1] for sinks with global
transactions, there is no consensus on solving that problem in SinkV2. Will
it require any breaking change in SinkV2?

Also will SinkV1 be deprecated too? or it should happen sometime after
SinkFunction deprecation?

[1] https://lists.apache.org/thread/82bgvlton9olb591bfg2djv0cshj1bxj

On Sun, Feb 5, 2023 at 2:14 AM Dong Lin <li...@gmail.com> wrote:

> Hi Konstantin,
>
> Thanks for the comment! Please see my comment inline.
>
> Cheers,
> Dong
>
> On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kn...@apache.org> wrote:
>
> > Hi everyone,
> >
> > sorry for joining the discussion late.
> >
> > 1) Is there an option to deprecate SinkFunction in Flink 1.17 while
> leaving
> > SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2 @Public
> in
> > and remove SinkFunction in Flink 1.18. @PublicEvolving are intended for
> > public use. So, I don't see it as a blocker for deprecating SinkFunction
> > that we have to make SinkV2 @PublicEvovling. For reference this is the
> > description of @PublicEvovling:
> >
> > /**
> >  * Annotation to mark classes and methods for public use, but with
> > evolving interfaces.
> >  *
> >  * <p>Classes and methods with this annotation are intended for public
> > use and have stable behavior.
> >  * However, their interfaces and signatures are not considered to be
> > stable and might be changed
> >  * across versions.
> >  *
> >  * <p>This annotation also excludes methods and classes with evolving
> > interfaces / signatures within
> >  * classes annotated with {@link Public}.
> >  */
> >
> >
> > Marking SinkFunction @Deprecated would already single everyone to move to
> > SinkV2, which we as a community, I believe, have a strong interest in.
> Its
> >
>
> Yes, I also believe we all have this strong interest. I just hope that this
> can be done in the best possible way that does not confuse users.
>
> I probably still have the same concern regarding its impact on users: if we
> mark an API as deprecated, it effectively means the users of this API
> should start to migrate to another API (e.g. SinkV2) and we might remove
> this API in the future. However, given that we know there are known
> problems preventing users from doing so, it seems that we are not ready to
> send this message to users right.
>
> If I understand correctly, I guess you are suggesting that by marking
> SinkFunction as deprecated, we can put higher pressure on Flink
> contributors to update the existing Flink codebase to improve and use
> SinkV2.
>
> I am not sure this is the right way to use @deprecated, which has a
> particular meaning for its users rather than contributors. And I am also
> not sure we can even pressure contributors of an open-source project into
> developing a feature (e.g. migrate all existing SinkFunction subclasses to
> SinkV2). IMO, the typical way is for the contributor with interest/time to
> work on the feature, or talk to other contributors whether they are willing
> to collaborate/work on this, rather than pressuring other contributors into
> working on this.
>
>
> almost comical how long the transition from SourceFurnction/SinkFunction to
> > Source/Sink takes us. At the same time, we leave ourselves the option to
> to
> > make small changes to SinkV2 if any problems arise during the migration
> of
> > these connector.
> >
> > I think, we have a bit of a chicken/egg problem here. The pressure for
> >
>
> Similar to the reason described above, I am not sure we have a chicken/egg
> problem here. The issue here is that SinkV2 is not ready and we have a lot
> of existing SinkFunction that is not migrated by ourselves. We (Flink
> contributors) probably do not need to mark SinkFunction as deprecated in
> order to address these issues in our own codebase.
>
>
> users and contributors is not high enough to move away from SinkFunction as
> > long as its not deprecated, but at the same time we need people to
> migrate
> > their connectors to see if there are any gaps in SinkV2. I believe, the
> > combination proposed above could bridge this problem.
> >
> > 2) I don't understand the argument of waiting until some of the
> > implementations are @Public. How can we make the implementations of the
> > SinkV2 API @Public without making SinkV2 @Public? All public methods of
> > SinkV2 are part of every implementation. So to me it actually seems to be
> > opposite: in order to make any of the implementation @Public we first
> need
> > to make the API @Public.
> >
>
> Yeah I also agree with you.
>
>
> >
> > Cheers,
> >
> > Konstantin
> >
> > Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <lindong28@gmail.com
> >:
> >
> > > Hi Martijn,
> > >
> > > Thanks for driving this effort to clean-up the Flink codebase!
> > >
> > > I like the idea to cleanup Flink codebase to avoid having two Sinks. On
> > the
> > > other hand, I also thing the concern mentioned by Jing makes sense. In
> > > addition to thinking in terms of the rule proposed in FLIP-197
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > >
> > > (which
> > > seems to focus mostly on the Flink developers' perspective), it might
> be
> > > useful to also think about the story from users' perspective and make
> > sure
> > > their concerns can be addressed.
> > >
> > > Typically, by marking an API as deprecated, we are effectively telling
> > > users *they should start to migrate to the new API ASAP and we reserve
> > the
> > > right to remove this API completely in the 1-2 releases*. Then it might
> > be
> > > reasonable for users to ask questions such as:
> > >
> > > 1) Does SinkV2 public API provides all the functionalities needed to
> > > migrate my existing code from subclassing SinkFunction to subclassing
> > > SinkV2?
> > >
> > > 2) Is the amount of migration work reasonable? If yes, why is a class
> > such
> > > as HBaseSinkFunction in Flink's own codebase still depending on
> > > SinkFunction? Maybe Flink developers should eat their own dog food and
> > > migrate (or deprecate) those classes in the Flink codebase first?
> > >
> > > Based on the discussion in this thread, I am not sure we have good
> > answers
> > > to those questions yet. For the 1st question above, the answer is *no*
> > > because we already know that the SinkV2 is currently not able to
> support
> > > migration for JdbcSink. For the 2nd question above, we know there are
> > many
> > > non-deprecated class in Flink codebase that are still depending on
> > SinkV2.
> > > It is probably not nice to users if we tell them to migrate while we
> know
> > > there are existing issues that can prevent them from doing so easily.
> > >
> > > In order to move forward to deprecate SinkV2, I think it will be super
> > > useful to first migrate all the connectors managed by Flink community
> > > (including all externalized connectors) to use SinkV2. This work won't
> be
> > > wasted since we need to do this anyway. And it will also give us a
> chance
> > > to validate the capabilities of SinkV2 and fix bugs by ourselves as
> much
> > as
> > > possible.
> > >
> > > What do you think?
> > >
> > > Best Regards,
> > > Dong
> > >
> > >
> > > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <
> martijnvisser@apache.org
> > >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > While discussing FLIP-281 [1] the discussion also turned to the
> > > > SinkFunction and the SinkV2 API. For a broader discussion I'm opening
> > up
> > > a
> > > > separate discussion thread.
> > > >
> > > > As Yun Tang has mentioned in that discussion thread, it would be a
> good
> > > > time to deprecate the SinkFunction to avoid the need to introduce new
> > > > functions towards (to be) deprecated APIs. Jing rightfully mentioned
> > that
> > > > it would be confusing to deprecate the SinkFunction if its successor
> is
> > > not
> > > > yet marked as @Public (it's currently @PublicEvolving).
> > > >
> > > > My proposal would be to promote the SinkV2 API to @public in Flink
> 1.17
> > > and
> > > > mark the SinkFunction as @deprecated in Flink 1.17
> > > >
> > > > The original Sink interface was introduced in Flink 1.12 with
> FLIP-143
> > > [2]
> > > > and extended with FLIP-177 in Flink 1.14 [3] and has been improved on
> > > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > > >
> > > > Looking at the API stability graduation process [5], the fact that
> Sink
> > > V2
> > > > was introduced in Flink 1.15 would mean that we could warrant a
> > promotion
> > > > to @public already (given that there have been two releases with 1.15
> > and
> > > > 1.16 where it was introduced). Combined with the fact that SinkV2 has
> > > been
> > > > the result of iteration over the introduction of the original Sink
> API
> > > > since Flink 1.12, I would argue that the promotion is overdue.
> > > >
> > > > If we promote the Sink API to @public, I think we should also
> > immediately
> > > > mark the SinkFunction as @deprecated.
> > > >
> > > > Looking forward to your thoughts.
> > > >
> > > > Best regards,
> > > >
> > > > Martijn
> > > >
> > > >
> > > > [1] https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > > [2]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > > [3]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > > [4]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > > [5]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > > >
> > >
> >
> >
> > --
> > https://twitter.com/snntrable
> > https://github.com/knaufk
> >
>

Re: [DISCUSS] Promote SinkV2 to @Public and deprecate SinkFunction

Posted by Dong Lin <li...@gmail.com>.
Hi Konstantin,

Thanks for the comment! Please see my comment inline.

Cheers,
Dong

On Sat, Feb 4, 2023 at 2:06 AM Konstantin Knauf <kn...@apache.org> wrote:

> Hi everyone,
>
> sorry for joining the discussion late.
>
> 1) Is there an option to deprecate SinkFunction in Flink 1.17 while leaving
> SinkV2 @PublicEvolving in Flink 1.17. We then aim to make SinkV2 @Public in
> and remove SinkFunction in Flink 1.18. @PublicEvolving are intended for
> public use. So, I don't see it as a blocker for deprecating SinkFunction
> that we have to make SinkV2 @PublicEvovling. For reference this is the
> description of @PublicEvovling:
>
> /**
>  * Annotation to mark classes and methods for public use, but with
> evolving interfaces.
>  *
>  * <p>Classes and methods with this annotation are intended for public
> use and have stable behavior.
>  * However, their interfaces and signatures are not considered to be
> stable and might be changed
>  * across versions.
>  *
>  * <p>This annotation also excludes methods and classes with evolving
> interfaces / signatures within
>  * classes annotated with {@link Public}.
>  */
>
>
> Marking SinkFunction @Deprecated would already single everyone to move to
> SinkV2, which we as a community, I believe, have a strong interest in. Its
>

Yes, I also believe we all have this strong interest. I just hope that this
can be done in the best possible way that does not confuse users.

I probably still have the same concern regarding its impact on users: if we
mark an API as deprecated, it effectively means the users of this API
should start to migrate to another API (e.g. SinkV2) and we might remove
this API in the future. However, given that we know there are known
problems preventing users from doing so, it seems that we are not ready to
send this message to users right.

If I understand correctly, I guess you are suggesting that by marking
SinkFunction as deprecated, we can put higher pressure on Flink
contributors to update the existing Flink codebase to improve and use
SinkV2.

I am not sure this is the right way to use @deprecated, which has a
particular meaning for its users rather than contributors. And I am also
not sure we can even pressure contributors of an open-source project into
developing a feature (e.g. migrate all existing SinkFunction subclasses to
SinkV2). IMO, the typical way is for the contributor with interest/time to
work on the feature, or talk to other contributors whether they are willing
to collaborate/work on this, rather than pressuring other contributors into
working on this.


almost comical how long the transition from SourceFurnction/SinkFunction to
> Source/Sink takes us. At the same time, we leave ourselves the option to to
> make small changes to SinkV2 if any problems arise during the migration of
> these connector.
>
> I think, we have a bit of a chicken/egg problem here. The pressure for
>

Similar to the reason described above, I am not sure we have a chicken/egg
problem here. The issue here is that SinkV2 is not ready and we have a lot
of existing SinkFunction that is not migrated by ourselves. We (Flink
contributors) probably do not need to mark SinkFunction as deprecated in
order to address these issues in our own codebase.


users and contributors is not high enough to move away from SinkFunction as
> long as its not deprecated, but at the same time we need people to migrate
> their connectors to see if there are any gaps in SinkV2. I believe, the
> combination proposed above could bridge this problem.
>
> 2) I don't understand the argument of waiting until some of the
> implementations are @Public. How can we make the implementations of the
> SinkV2 API @Public without making SinkV2 @Public? All public methods of
> SinkV2 are part of every implementation. So to me it actually seems to be
> opposite: in order to make any of the implementation @Public we first need
> to make the API @Public.
>

Yeah I also agree with you.


>
> Cheers,
>
> Konstantin
>
> Am Mo., 30. Jan. 2023 um 13:18 Uhr schrieb Dong Lin <li...@gmail.com>:
>
> > Hi Martijn,
> >
> > Thanks for driving this effort to clean-up the Flink codebase!
> >
> > I like the idea to cleanup Flink codebase to avoid having two Sinks. On
> the
> > other hand, I also thing the concern mentioned by Jing makes sense. In
> > addition to thinking in terms of the rule proposed in FLIP-197
> > <
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > >
> > (which
> > seems to focus mostly on the Flink developers' perspective), it might be
> > useful to also think about the story from users' perspective and make
> sure
> > their concerns can be addressed.
> >
> > Typically, by marking an API as deprecated, we are effectively telling
> > users *they should start to migrate to the new API ASAP and we reserve
> the
> > right to remove this API completely in the 1-2 releases*. Then it might
> be
> > reasonable for users to ask questions such as:
> >
> > 1) Does SinkV2 public API provides all the functionalities needed to
> > migrate my existing code from subclassing SinkFunction to subclassing
> > SinkV2?
> >
> > 2) Is the amount of migration work reasonable? If yes, why is a class
> such
> > as HBaseSinkFunction in Flink's own codebase still depending on
> > SinkFunction? Maybe Flink developers should eat their own dog food and
> > migrate (or deprecate) those classes in the Flink codebase first?
> >
> > Based on the discussion in this thread, I am not sure we have good
> answers
> > to those questions yet. For the 1st question above, the answer is *no*
> > because we already know that the SinkV2 is currently not able to support
> > migration for JdbcSink. For the 2nd question above, we know there are
> many
> > non-deprecated class in Flink codebase that are still depending on
> SinkV2.
> > It is probably not nice to users if we tell them to migrate while we know
> > there are existing issues that can prevent them from doing so easily.
> >
> > In order to move forward to deprecate SinkV2, I think it will be super
> > useful to first migrate all the connectors managed by Flink community
> > (including all externalized connectors) to use SinkV2. This work won't be
> > wasted since we need to do this anyway. And it will also give us a chance
> > to validate the capabilities of SinkV2 and fix bugs by ourselves as much
> as
> > possible.
> >
> > What do you think?
> >
> > Best Regards,
> > Dong
> >
> >
> > On Wed, Jan 18, 2023 at 6:52 PM Martijn Visser <martijnvisser@apache.org
> >
> > wrote:
> >
> > > Hi all,
> > >
> > > While discussing FLIP-281 [1] the discussion also turned to the
> > > SinkFunction and the SinkV2 API. For a broader discussion I'm opening
> up
> > a
> > > separate discussion thread.
> > >
> > > As Yun Tang has mentioned in that discussion thread, it would be a good
> > > time to deprecate the SinkFunction to avoid the need to introduce new
> > > functions towards (to be) deprecated APIs. Jing rightfully mentioned
> that
> > > it would be confusing to deprecate the SinkFunction if its successor is
> > not
> > > yet marked as @Public (it's currently @PublicEvolving).
> > >
> > > My proposal would be to promote the SinkV2 API to @public in Flink 1.17
> > and
> > > mark the SinkFunction as @deprecated in Flink 1.17
> > >
> > > The original Sink interface was introduced in Flink 1.12 with FLIP-143
> > [2]
> > > and extended with FLIP-177 in Flink 1.14 [3] and has been improved on
> > > further as Sink V2 via FLIP-191 in Flink 1.15 [4].
> > >
> > > Looking at the API stability graduation process [5], the fact that Sink
> > V2
> > > was introduced in Flink 1.15 would mean that we could warrant a
> promotion
> > > to @public already (given that there have been two releases with 1.15
> and
> > > 1.16 where it was introduced). Combined with the fact that SinkV2 has
> > been
> > > the result of iteration over the introduction of the original Sink API
> > > since Flink 1.12, I would argue that the promotion is overdue.
> > >
> > > If we promote the Sink API to @public, I think we should also
> immediately
> > > mark the SinkFunction as @deprecated.
> > >
> > > Looking forward to your thoughts.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > >
> > > [1] https://lists.apache.org/thread/l05m6cf8fwkkbpnjtzbg9l2lo40oxzd1
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API
> > > [3]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-177%3A+Extend+Sink+API
> > > [4]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
> > > [5]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> > >
> >
>
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk
>