Posted to dev@flink.apache.org by Vijay Srinivasaraghavan <vi...@yahoo.com.INVALID> on 2017/02/15 19:28:18 UTC

Reliable Distributed FS support (HCFS)

Hello,
Regarding the file system abstraction support, we are planning to use a distributed file system that complies with the Hadoop Compatible File System (HCFS) standard in place of standard HDFS.
According to the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals/filesystems.html), persistence guarantees are listed as one of the main requirements; to be precise, this covers both visibility and durability guarantees.
My questions are:
1) Are we expecting the file system to support "atomic rename" semantics? I believe the checkpoint mechanism involves renaming files. Will there be an impact if atomic rename is not guaranteed by the underlying file system?
2) How does one certify Flink with HCFS (in place of standard HDFS) in terms of the scenarios/use cases that need to be tested? Is there any general guidance on this?
Thanks,
Vijay
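
For anyone wanting a first smoke test of the two properties asked about above, here is a minimal sketch in Python. It assumes the file system under test is reachable as a locally mounted directory (for example through a FUSE mount); the function names are hypothetical and nothing here is part of Flink or Hadoop. A passing run only shows that a closed file is readable in full and that a rename leaves the complete content at the target. It does not prove atomicity or durability across machine failures.

```python
import os
import tempfile
import uuid


def probe_visibility(base_dir: str, payload: bytes = b"flink-hcfs-probe") -> bool:
    """Write a file, close it, and check that the full content is readable."""
    path = os.path.join(base_dir, f"probe-{uuid.uuid4().hex}")
    with open(path, "wb") as out:
        out.write(payload)
        out.flush()
        os.fsync(out.fileno())  # ask the OS to push the bytes to the storage layer
    try:
        with open(path, "rb") as inp:
            return inp.read() == payload
    finally:
        os.remove(path)


def probe_rename(base_dir: str, payload: bytes = b"flink-hcfs-probe") -> bool:
    """Rename a closed file; check the source is gone and the target is complete."""
    src = os.path.join(base_dir, f"src-{uuid.uuid4().hex}")
    dst = os.path.join(base_dir, f"dst-{uuid.uuid4().hex}")
    with open(src, "wb") as out:
        out.write(payload)
    os.rename(src, dst)
    try:
        with open(dst, "rb") as inp:
            return (not os.path.exists(src)) and inp.read() == payload
    finally:
        os.remove(dst)


if __name__ == "__main__":
    # A temporary local directory stands in for the mounted file system here.
    with tempfile.TemporaryDirectory() as scratch:
        print("visibility:", probe_visibility(scratch))
        print("rename:", probe_rename(scratch))
```

To run it against a real mount, pass the mount path instead of the temporary directory.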

Re: Reliable Distributed FS support (HCFS)

Posted by Robert Metzger <rm...@apache.org>.
Hi Vijay,

Regarding your second question: First of all, the example jobs of Flink
need to pass.
Secondly, I would recommend implementing a test job that uses a lot of
state, different state backends (file system and RocksDB) and some artificial
failures.
We at data Artisans have some testing jobs internally for testing such
workloads. I'll try to publish them soon on GitHub so that others can use
them as well. Please ping me if you urgently need them :)

Regards,
Robert
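
To make the "state plus artificial failures" idea concrete, here is a toy sketch in Python. It is a stand-in written for illustration only: it does not use Flink's API, and every name in it (`run_with_failures`, `write_checkpoint`, and so on) is hypothetical. A running sum is checkpointed to a directory every few records, failures are injected at chosen offsets, and each restart resumes from the latest checkpoint. Each checkpoint is written to a fresh path, so the sketch never renames or overwrites a file.

```python
import json
import os
import tempfile


class InjectedFailure(Exception):
    """Artificial failure raised on purpose by the test harness."""


def write_checkpoint(ckpt_dir, checkpoint_id, state):
    # Each checkpoint goes to a fresh path; nothing is overwritten or renamed.
    path = os.path.join(ckpt_dir, f"chk-{checkpoint_id:06d}.json")
    with open(path, "w") as out:
        json.dump(state, out)
        out.flush()
        os.fsync(out.fileno())


def load_latest_checkpoint(ckpt_dir):
    # Zero-padded ids make lexicographic order match checkpoint order.
    names = sorted(n for n in os.listdir(ckpt_dir) if n.startswith("chk-"))
    if not names:
        return {"offset": 0, "total": 0}
    with open(os.path.join(ckpt_dir, names[-1])) as inp:
        return json.load(inp)


def run_attempt(records, ckpt_dir, checkpoint_every, fail_at=None):
    # In-memory state is rebuilt from the latest checkpoint on every attempt.
    state = load_latest_checkpoint(ckpt_dir)
    for offset in range(state["offset"], len(records)):
        if fail_at is not None and offset == fail_at:
            # Failures are injected between records, never during a write.
            raise InjectedFailure(f"injected failure before record {offset}")
        state["total"] += records[offset]
        state["offset"] = offset + 1
        if state["offset"] % checkpoint_every == 0:
            write_checkpoint(ckpt_dir, state["offset"], state)
    return state["total"]


def run_with_failures(records, ckpt_dir, checkpoint_every, fail_points):
    pending = list(fail_points)
    while True:
        try:
            return run_attempt(records, ckpt_dir, checkpoint_every,
                               pending[0] if pending else None)
        except InjectedFailure:
            pending.pop(0)  # "restart" the job and arm the next failure


if __name__ == "__main__":
    records = list(range(1, 101))
    with tempfile.TemporaryDirectory() as ckpt_dir:
        total = run_with_failures(records, ckpt_dir, 10, [25, 57])
        print("recovered total matches:", total == sum(records))
```

The check at the end is the important part: after any number of injected failures, the recovered result should equal the result of a failure-free run. Pointing `ckpt_dir` at a directory on the file system under test is one way to exercise it, under the same mounted-path assumption.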



On Fri, Feb 17, 2017 at 3:20 PM, Aljoscha Krettek <al...@apache.org>
wrote:

> Hi,
> I think atomic rename is not part of the requirements.
>
> I'll add +Stephan who recently wrote this document in case he has any
> additional input.
>
> Cheers,
> Aljoscha

Re: Reliable Distributed FS support (HCFS)

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
I think atomic rename is not part of the requirements.

I'll add +Stephan who recently wrote this document in case he has any
additional input.

Cheers,
Aljoscha


Re: Reliable Distributed FS support (HCFS)

Posted by Vijay Srinivasaraghavan <vi...@yahoo.com.INVALID>.
Following up on my question regarding the backing file system (HCFS) requirements. I would appreciate any input.
Regards,
Vijay


   
