Posted to dev@flink.apache.org by Jing Ge <ji...@ververica.com.INVALID> on 2023/06/20 09:41:40 UTC

[DISCUSS] Graduate the FileSink to @PublicEvolving

Hi all,

The FileSink has been marked as @Experimental[1] since Oct. 2020.
According to FLIP-197[2], I would like to propose graduating it
to @PublicEvolving in the upcoming 1.18 release.

On a related note, FileSource was marked
as @PublicEvolving[3] 3 years ago. It deserves a graduation discussion too.
To keep this discussion lean and efficient, let's focus on FileSink in this
thread. There will be a separate discussion thread for the FileSource.

I was wondering if anyone might have any concerns. Looking forward to
hearing from you.


Best regards,
Jing

[1] https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/FileSink.java#L129
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
[3] https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java#L95

Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

Posted by Jing Ge <ji...@ververica.com.INVALID>.
Hi,

If there are no other concerns, I will start voting. Thanks!

Best Regards,
Jing

On Mon, Jun 26, 2023 at 11:35 AM Jing Ge <ji...@ververica.com> wrote:

> Hi,
>
> @Galen @Yuxia
>
> Your points are valid. Speaking of removing deprecated APIs, I have the
> same concern. As a matter of fact, I have been raising it in the discussion
> thread on the API deprecation process[1]. This is another example showing
> that we should consider more factors than just the migration period. Thanks
> for the hint! I will post another update in that thread with a reference to
> this thread.
>
> In a nutshell, this thread is focusing on the graduation process. Your
> valid concerns should be taken care of by the deprecation process.
> Please don't hesitate to share your thoughts in that thread.
>
>
> Best regards,
> Jing
>
> [1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9

Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

Posted by Jing Ge <ji...@ververica.com.INVALID>.
Hi,

@Galen @Yuxia

Your points are valid. Speaking of removing deprecated APIs, I have the same
concern. As a matter of fact, I have been raising it in the discussion
thread on the API deprecation process[1]. This is another example showing
that we should consider more factors than just the migration period. Thanks
for the hint! I will post another update in that thread with a reference to
this thread.

In a nutshell, this thread is focusing on the graduation process. Your
valid concerns should be taken care of by the deprecation process.
Please don't hesitate to share your thoughts in that thread.


Best regards,
Jing

[1] https://lists.apache.org/thread/vmhzv8fcw2b33pqxp43486owrxbkd5x9


On Sun, Jun 25, 2023 at 3:48 AM yuxia <lu...@alumni.sjtu.edu.cn> wrote:

> Thanks Jing for bringing this up for discussion.
> I agree it's not a blocker for graduating the FileSink to @PublicEvolving,
> since the Sink, which is the root cause, has already been marked
> as @PublicEvolving.
> But I do share Galen's concern. At the very least, it should be a
> blocker for removing StreamingFileSink.
> Btw, it seems migrating to Sink is a real headache. We may need
> to pay more attention to this ticket and try to fix it.
>
> Best regards,
> Yuxia

Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

Posted by yuxia <lu...@alumni.sjtu.edu.cn>.
Thanks Jing for bringing this up for discussion.
I agree it's not a blocker for graduating the FileSink to @PublicEvolving, since the Sink, which is the root cause, has already been marked as @PublicEvolving.
But I do share Galen's concern. At the very least, it should be a blocker for removing StreamingFileSink.
Btw, it seems migrating to Sink is a real headache. We may need to pay more attention to this ticket and try to fix it.

Best regards,
Yuxia

----- Original Message -----
From: "Galen Warren" <ga...@cvillewarrens.com.INVALID>
To: "dev" <de...@flink.apache.org>
Sent: Friday, June 23, 2023, 7:47:24 PM
Subject: Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

Thanks Jing. I can only offer my perspective on this, others may view it
differently.

If FileSink is subject to data loss in the "stop-on-savepoint then restart"
scenario, that makes it unusable for me, and presumably for anyone who uses
it in a long-running streaming application and who cannot tolerate data
loss. I still use the (deprecated!) StreamingFileSink for this reason.

The bigger picture here is that StreamingFileSink is deprecated and will
presumably ultimately be removed, to be replaced with FileSink. Graduating
the status of FileSink seems to be a step along that path; I'm concerned
about continuing down that path with such a critical issue present.
Ultimately, my concern is that FileSink will graduate fully and that
StreamingFileSink will be removed and that there will be no remaining
option to reliably stop/start streaming jobs that write to files without
incurring the risk of data loss.

I'm sure I'd feel better about things if there were an ongoing effort to
address this FileSink issue and/or a commitment that StreamingFileSink
would not be removed until this issue is addressed.

My two cents -- thanks.



Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

Posted by Galen Warren <ga...@cvillewarrens.com.INVALID>.
Thanks Jing. I can only offer my perspective on this, others may view it
differently.

If FileSink is subject to data loss in the "stop-on-savepoint then restart"
scenario, that makes it unusable for me, and presumably for anyone who uses
it in a long-running streaming application and who cannot tolerate data
loss. I still use the (deprecated!) StreamingFileSink for this reason.

The bigger picture here is that StreamingFileSink is deprecated and will
presumably ultimately be removed, to be replaced with FileSink. Graduating
the status of FileSink seems to be a step along that path; I'm concerned
about continuing down that path with such a critical issue present.
Ultimately, my concern is that FileSink will graduate fully and that
StreamingFileSink will be removed and that there will be no remaining
option to reliably stop/start streaming jobs that write to files without
incurring the risk of data loss.

I'm sure I'd feel better about things if there were an ongoing effort to
address this FileSink issue and/or a commitment that StreamingFileSink
would not be removed until this issue is addressed.

My two cents -- thanks.


On Fri, Jun 23, 2023 at 1:47 AM Jing Ge <ji...@ververica.com.invalid> wrote:

> Hi Galen,
>
> Thanks for the hint, which helps us get a clear big picture.
> Afaiac, this will not be a blocking issue for the graduation. There will
> always be some (potential) bugs in the implementation. The API has been very
> stable since 2020, so the timing is good for graduation. WDYT?
> Furthermore, I'd like to hear more opinions. All opinions together will
> help the community build a mature API graduation process.
>
> Best regards,
> Jing

Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

Posted by Jing Ge <ji...@ververica.com.INVALID>.
Hi Galen,

Thanks for the hint, which helps us get a clear big picture.
Afaiac, this will not be a blocking issue for the graduation. There will
always be some (potential) bugs in the implementation. The API has been very
stable since 2020, so the timing is good for graduation. WDYT?
Furthermore, I'd like to hear more opinions. All opinions together will
help the community build a mature API graduation process.

Best regards,
Jing

On Tue, Jun 20, 2023 at 12:48 PM Galen Warren
<ga...@cvillewarrens.com.invalid> wrote:

> Is this issue still unresolved?
>
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-30238
>
> Based on prior discussion, I believe this could lead to data loss with
> FileSink.

Re: [DISCUSS] Graduate the FileSink to @PublicEvolving

Posted by Galen Warren <ga...@cvillewarrens.com.INVALID>.
Is this issue still unresolved?

https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-30238

Based on prior discussion, I believe this could lead to data loss with
FileSink.



On Tue, Jun 20, 2023, 5:41 AM Jing Ge <ji...@ververica.com.invalid> wrote:

> Hi all,
>
> The FileSink has been marked as @Experimental[1] since Oct. 2020.
> According to FLIP-197[2], I would like to propose graduating it
> to @PublicEvolving in the upcoming 1.18 release.
>
> On a related note, FileSource was marked
> as @PublicEvolving[3] 3 years ago. It deserves a graduation discussion too.
> To keep this discussion lean and efficient, let's focus on FileSink in this
> thread. There will be a separate discussion thread for the FileSource.
>
> I was wondering if anyone might have any concerns. Looking forward to
> hearing from you.
>
> Best regards,
> Jing
>
> [1] https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/FileSink.java#L129
> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process
> [3] https://github.com/apache/flink/blob/4006de973525c5284e9bc8fa6196ab7624189261/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java#L95