Posted to user@flink.apache.org by Stephan Ewen <se...@apache.org> on 2017/10/14 20:25:23 UTC

Re: Empty directories left over from checkpointing

Some updates on this:

Aside from reworking how the S3 directory handling is done, we also looked
into supporting S3 differently than we currently do. Support currently goes
strictly through Hadoop's S3 file systems, which we need to change, because
we want it to be possible to use Flink without Hadoop dependencies.

In the next release, we will have S3 file systems without Hadoop dependency:

  - One implementation wraps and shades a newer version of s3a, for
compatibility with the current behavior.

  - The second is interesting for this directory problem: It uses Presto's
S3 support, which differs a bit from Hadoop's s3n and s3a. It does not
create empty directory marker files, i.e., it does not try to make S3 look
as much like a file system as s3n and s3a do, and that is actually an
advantage for checkpointing. With that implementation, the issue mentioned
here should not exist.

Caveat: The new file systems and their aggressive shading still need to be
tested at scale, but we are happy to take any feedback on this.

Merged as of
https://github.com/apache/flink/commit/991af3652479f85f732cbbade46bed7df1c5d819

You can use them by simply dropping the respective JARs from "/opt" into
"/lib" and using the file system scheme "s3://".
The configuration is as in Hadoop/Presto, but you can drop the config keys
into the Flink configuration - they will be forwarded to the Hadoop
configuration.
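A setup along these lines should work once the release is out (a sketch only; the exact jar name, config keys, and checkpoint-directory option are assumptions, so check the documentation of your Flink release):

```shell
# Sketch: jar name, paths, and keys below are assumptions for the upcoming
# release; verify against your distribution before relying on them.

# 1. Move the Presto-based S3 file system from opt/ to lib/:
cp ./opt/flink-s3-fs-presto-*.jar ./lib/

# 2. Drop the usual Presto-style S3 keys into conf/flink-conf.yaml;
#    Flink forwards them to the underlying file system, e.g.:
#      s3.access-key: <your-access-key>
#      s3.secret-key: <your-secret-key>
#      state.checkpoints.dir: s3://my-bucket/flink/checkpoints
```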

Hope this makes using S3 a lot easier and more fun...


On Wed, Sep 20, 2017 at 2:49 PM, Stefan Richter <s.richter@data-artisans.com> wrote:

> Hi,
>
> We recently removed some cleanup code, because it involved checking store
> metadata to determine when a directory could be deleted. For certain stores
> (like S3), requesting this metadata on every file delete was so expensive
> that it could bring down the job, because state removal could not be
> processed fast enough. We have a temporary fix in place now, so that jobs
> at large scale can still run reliably on stores like S3. Currently, this
> comes at the cost of not cleaning up directories, but we clearly plan to
> introduce a different directory-cleanup mechanism in the future that is not
> as fine-grained as a metadata query per file delete. In the meantime, the
> best way is unfortunately to clean up empty directories with some external
> tool.
>
> Best,
> Stefan
>
> Am 20.09.2017 um 01:23 schrieb Hao Sun <ha...@zendesk.com>:
>
> Thanks Elias! Seems like there is no better answer than "do not care about
> them now", or delete with a background job.
>
> On Tue, Sep 19, 2017 at 4:11 PM Elias Levy <fe...@gmail.com>
> wrote:
>
>> There are a couple of related JIRAs:
>>
>> https://issues.apache.org/jira/browse/FLINK-7587
>> https://issues.apache.org/jira/browse/FLINK-7266
>>
>>
>> On Tue, Sep 19, 2017 at 12:20 PM, Hao Sun <ha...@zendesk.com> wrote:
>>
>>> Hi, I am using RocksDB and S3 as the storage backend for my checkpoints.
>>> Can Flink delete these empty directories automatically, or do I need a
>>> background job to do the deletion?
>>>
>>> I know this has been discussed before, but I could not get a concrete
>>> answer for it yet. Thanks
>>>
>>
>>
>
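Until the Presto-based file system lands, the "external tool" cleanup Stefan mentions could look roughly like this (a sketch with assumptions: boto3 as the S3 client, and that leftover markers are zero-byte objects whose keys end in "/" in s3a style or "_$folder$" in s3n style; bucket and prefix are placeholders):

```python
# Sketch of an external cleanup tool for leftover S3 directory markers.
# Assumptions: boto3 client, marker keys end in "/" or "_$folder$".

def is_dir_marker(key: str, size: int) -> bool:
    """Heuristic: a zero-byte object that only marks a 'directory'."""
    return size == 0 and (key.endswith("/") or key.endswith("_$folder$"))

def delete_empty_dir_markers(bucket: str, prefix: str) -> int:
    import boto3  # assumed dependency: pip install boto3
    s3 = boto3.client("s3")
    deleted = 0
    # Paginate so buckets with many checkpoint objects are handled fully.
    for page in s3.get_paginator("list_objects_v2").paginate(
            Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if is_dir_marker(obj["Key"], obj["Size"]):
                s3.delete_object(Bucket=bucket, Key=obj["Key"])
                deleted += 1
    return deleted

# Usage (requires AWS credentials in the environment):
#   n = delete_empty_dir_markers("my-bucket", "flink/checkpoints/")
```

Run it periodically (e.g. from cron) against the checkpoint prefix; it only touches empty marker objects, never real state files.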

Re: Empty directories left over from checkpointing

Posted by Elias Levy <fe...@gmail.com>.
Stephan,

Thanks for taking care of this.  We'll give it a try once 1.4 drops.
