Posted to user@flink.apache.org by Kevin Lam <ke...@shopify.com> on 2022/01/07 16:06:47 UTC

Plans to update StreamExecutionEnvironment.readFiles to use the FLIP-27 compatible FileSource?

Hi all,

Are there any plans to update StreamExecutionEnvironment.readFile
<https://nightlies.apache.org/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.html#readFile-org.apache.flink.api.common.io.FileInputFormat-java.lang.String->
to use the new FLIP-27 compatible FileSource
<https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/connector/file/src/FileSource.html>
?

readFile supports some features via its FileInputFormat, such as
setNestedFileEnumeration and setFilesFilter, that we'd like to keep
using, but FileSource doesn't seem to offer them.
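
For readers unfamiliar with the two options named above, here is a minimal plain-Java sketch of the behaviour they provide: descending into subdirectories and skipping paths that match a filter. It is not Flink code and does not use any Flink API. The class name NestedEnumerationSketch, the enumerate helper, and the sample file names are all hypothetical. The filter is modelled as "true means exclude", which is an assumption about how the filter passed to setFilesFilter is interpreted; unlike a real enumerator, this sketch applies the filter to files only, not to directories.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NestedEnumerationSketch {

    /**
     * Lists the regular files under root.
     *
     * @param nested  if true, descend into subdirectories; if false, look
     *                only at the direct children of root
     * @param exclude returns true for paths that should be skipped
     *                (assumed convention, see the lead-in)
     */
    static List<Path> enumerate(Path root, boolean nested, Predicate<Path> exclude)
            throws IOException {
        // Depth 1 means "root and its direct children only".
        int maxDepth = nested ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(root, maxDepth)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> !exclude.test(p))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout: one top-level file plus a nested partition.
        Path root = Files.createTempDirectory("enumeration-sketch");
        Files.createDirectories(root.resolve("2022/01"));
        Files.createFile(root.resolve("top.csv"));
        Files.createFile(root.resolve("2022/01/part-0.csv"));
        Files.createFile(root.resolve("2022/01/_SUCCESS"));

        // Skip marker and hidden files, i.e. names starting with "_" or ".".
        Predicate<Path> markerOrHidden = p -> {
            String name = p.getFileName().toString();
            return name.startsWith("_") || name.startsWith(".");
        };

        System.out.println("flat:   " + enumerate(root, false, markerOrHidden));
        System.out.println("nested: " + enumerate(root, true, markerOrHidden));
    }
}
```

With the flat setting only top.csv is found; with nested enumeration the part file in 2022/01 is picked up as well, while the _SUCCESS marker is filtered out in both cases.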

Re: Plans to update StreamExecutionEnvironment.readFiles to use the FLIP-27 compatible FileSource?

Posted by Kevin Lam <ke...@shopify.com>.
Hi Fabian,

No problem, thanks for the clarification. As for its importance: we
have some Flink applications running that use
StreamExecutionEnvironment.readFile
<https://nightlies.apache.org/flink/flink-docs-release-1.13/api/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.html#readFile-org.apache.flink.api.common.io.FileInputFormat-java.lang.String->,
so in order to adopt the new FileSource API, we would need to migrate those
applications. Ideally, we could migrate the state. If there isn't a way to
migrate state, it would be nice if there were some documentation or
guidance from the Flink community on how best to migrate.

Cheers,
Kevin

On Tue, Jan 11, 2022 at 10:19 AM Fabian Paul <fp...@apache.org> wrote:

> Hi Kevin,
>
> Sorry for the misleading information. The FileSink is compatible with
> its predecessor, but unfortunately that is not the case for the
> FileSource. I updated the ticket accordingly. There may be a way
> to migrate the state, but it would be a larger effort. Is this an
> important feature for you?
>
> Best,
> Fabian

Re: Plans to update StreamExecutionEnvironment.readFiles to use the FLIP-27 compatible FileSource?

Posted by Fabian Paul <fp...@apache.org>.
Hi Kevin,

Sorry for the misleading information. The FileSink is compatible with
its predecessor, but unfortunately that is not the case for the
FileSource. I updated the ticket accordingly. There may be a way
to migrate the state, but it would be a larger effort. Is this an
important feature for you?

Best,
Fabian

On Mon, Jan 10, 2022 at 3:58 PM Kevin Lam <ke...@shopify.com> wrote:
>
> Hi Fabian,
>
> Thanks for creating and sharing that ticket. I noticed the clause "The FileSource can already read the state of the previous version". This is a little off-topic from the original question in this thread, but could you elaborate on that? Can the new FileSource interoperate with the old .readFile operator state? Is there a smooth way to upgrade to the new FileSource API from the old one without losing state?
>
> Thanks!

Re: Plans to update StreamExecutionEnvironment.readFiles to use the FLIP-27 compatible FileSource?

Posted by Kevin Lam <ke...@shopify.com>.
Hi Fabian,

Thanks for creating and sharing that ticket. I noticed the clause "The
FileSource can already read the state of the previous version". This is a
little off-topic from the original question in this thread, but could you
elaborate on that? Can the new FileSource interoperate with the old
.readFile operator state? Is there a smooth way to upgrade to the new
FileSource API from the old one without losing state?

Thanks!

On Mon, Jan 10, 2022 at 7:20 AM Fabian Paul <fp...@apache.org> wrote:

> Hi Kevin,
>
> I created a ticket to track the effort [1]. Unfortunately, we are
> already in the last few weeks of the release cycle for 1.15, so I
> cannot guarantee that someone will implement it by then.
>
> Best,
> Fabian
>
> [1] https://issues.apache.org/jira/browse/FLINK-25591

Re: Plans to update StreamExecutionEnvironment.readFiles to use the FLIP-27 compatible FileSource?

Posted by Fabian Paul <fp...@apache.org>.
Hi Kevin,

I created a ticket to track the effort [1]. Unfortunately, we are
already in the last few weeks of the release cycle for 1.15, so I
cannot guarantee that someone will implement it by then.

Best,
Fabian

[1] https://issues.apache.org/jira/browse/FLINK-25591

On Fri, Jan 7, 2022 at 5:07 PM Kevin Lam <ke...@shopify.com> wrote:
>
> Hi all,
>
> Are there any plans to update StreamExecutionEnvironment.readFile to use the new FLIP-27 compatible FileSource?
>
> readFile supports some features via its FileInputFormat, such as setNestedFileEnumeration and setFilesFilter, that we'd like to keep using, but FileSource doesn't seem to offer them.