Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/01/14 15:43:00 UTC

[jira] [Commented] (HADOOP-16746) s3a empty dir markers are not created in s3guard as authoritative

    [ https://issues.apache.org/jira/browse/HADOOP-16746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015175#comment-17015175 ] 

Steve Loughran commented on HADOOP-16746:
-----------------------------------------

Here are my ideas

Simple

* when we put a file, if len==0 and the key ends with "/", assume it is an empty dir marker and mark it as auth

Complicated

* in a write op context, we pass along flags indicating this is an empty dir and should be marked as auth

I can't see how you'd ever create a file ending in / which wasn't a directory. Therefore, I'm going to go with simple first.
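
A minimal sketch of the simple check, to make the rule concrete. The class and method names here are hypothetical and are not taken from the S3A code; the only assumptions are that the object key and its length are known when the write completes.

```java
// Hypothetical helper for illustration only; not the actual S3A implementation.
final class EmptyDirMarkerCheck {

    private EmptyDirMarkerCheck() {
    }

    /**
     * Heuristic from the comment above: a zero-byte object whose key
     * ends in "/" is assumed to be an empty directory marker, and so
     * its metadata entry would be recorded as authoritative.
     *
     * @param key    object key of the completed write (assumed input)
     * @param length number of bytes written
     * @return true if the entry should be treated as an empty dir marker
     */
    static boolean isEmptyDirMarker(String key, long length) {
        return key != null && length == 0 && key.endsWith("/");
    }
}
```

With this rule, `isEmptyDirMarker("data/logs/", 0)` is true, while a zero-byte file such as `data/_SUCCESS` or a non-empty object is left alone.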

Yes, I am choosing the simple option over the advanced one, which we can put off. For the advanced one I want a context which

* includes an op-specific stats interface so we can collect stats across threads properly
* maybe: wraps BulkOperationContext (so no need to replicate)
* is the basis for more things

That's a spanning change; I'd actually like to isolate that.
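
For comparison, a rough sketch of what the context-based option could look like. Every name below is an assumption made for illustration (this is not the WriteOpContext of HADOOP-16134 and it does not wrap the real BulkOperationContext); it only shows the caller passing the "empty dir" flag explicitly and per-operation stats being collected safely across threads.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical context object; names and shape are illustrative only.
final class WriteOpContext {

    // Set by the caller (e.g. a mkdirs-style operation) instead of being inferred.
    private final boolean emptyDirMarker;

    // LongAdder lets several threads update the per-operation stats safely.
    private final LongAdder objectsWritten = new LongAdder();
    private final LongAdder bytesWritten = new LongAdder();

    WriteOpContext(boolean emptyDirMarker) {
        this.emptyDirMarker = emptyDirMarker;
    }

    boolean isEmptyDirMarker() {
        return emptyDirMarker;
    }

    void recordWrite(long bytes) {
        objectsWritten.increment();
        bytesWritten.add(bytes);
    }

    long objectsWritten() {
        return objectsWritten.sum();
    }

    long bytesWritten() {
        return bytesWritten.sum();
    }
}

// Hypothetical stand-in for the post-write hook.
final class WriteFinisher {

    private WriteFinisher() {
    }

    /**
     * Records the write in the context and reports whether the new entry
     * should be marked as authoritative, using the flag passed by the caller.
     */
    static boolean finishedWrite(String key, long length, WriteOpContext ctx) {
        ctx.recordWrite(length);
        return ctx.isEmptyDirMarker();
    }
}
```

The difference from the simple option is that the decision comes from the caller's flag rather than from the key and length, which is why it needs the flag threaded through every write path.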


> s3a empty dir markers are not created in s3guard as authoritative
> -----------------------------------------------------------------
>
>                 Key: HADOOP-16746
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16746
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> Newly created empty dirs, or markers created after delete operations, are not marked in S3Guard as auth. This has adverse consequences in that following changes (i.e. new files) don't get marked as auth either...it needs a listFiles call to scan the source and mark as auth.
> I could stick a quick fix into HADOOP-16697, but don't want to as I don't like what that would mean. Essentially, finishedWrite() needs to recognise when an empty directory marker is being created (it does this already) and then *always* declare it as auth.
> I'd prefer for the mkdirs operation to pass a flag all the way through to finishedWrite so that it doesn't need to infer this. The WriteOpContext of HADOOP-16134 would be the way to do this. Yes, it's a big change, but it would be extensible, and I already have some plans there.
> Instead it will be a follow-up.
> The tests for this problem are part of HADOOP-16697, just disabled for now.


