Posted to common-issues@hadoop.apache.org by "Mingliang Liu (JIRA)" <ji...@apache.org> on 2017/04/04 01:40:42 UTC

[jira] [Updated] (HADOOP-14255) S3A to delete unnecessary fake directory objects in mkdirs()

     [ https://issues.apache.org/jira/browse/HADOOP-14255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingliang Liu updated HADOOP-14255:
-----------------------------------
    Attachment: HADOOP-14255.001.patch

[~stevel@apache.org], thanks for reviewing.

{quote}
could we have the test actually create the whole list of children, rather than mkdirs(nested)? as today a mkdirs(nested) won't create the parents, but a mkdir(a), (a/b), (a/b/c) will create lots of those ancestors
{quote}
The v1 patch adds one more test to address the first comment. I'd prefer to keep the existing test as well, because it verifies that after FileSystem::mkdirs() all previously non-existent ancestors exist as Paths (not necessarily as S3 fake directory objects).

{quote}
Maybe we should add a test for that too:
mkdir a
mkdir a/b
assert a/b exists
rm -rf a
assert a/b doesn't exist
{quote}
The second scenario you proposed is already covered by {{AbstractContractDeleteTest::testDeleteDeepEmptyDir}}, which passes for S3AFileSystem.

> S3A to delete unnecessary fake directory objects in mkdirs()
> ------------------------------------------------------------
>
>                 Key: HADOOP-14255
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14255
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-14255.000.patch, HADOOP-14255.001.patch
>
>
> In S3AFileSystem, as an optimization, we delete unnecessary fake directory objects once the directory contains at least one (nested) file. That is done when the output stream of a newly created file is closed. However, if the directory becomes non-empty because we have just created an empty subdirectory in it, we do not delete its fake directory object, even though that object has also become "unnecessary".
> So in {{S3AFileSystem::mkdirs()}}, we have a pending TODO:
> {quote}
>   // TODO: If we have created an empty file at /foo/bar and we then call
>   // mkdirs for /foo/bar/baz/roo what happens to the empty file /foo/bar/?
>   private boolean innerMkdirs(Path p, FsPermission permission)
> {quote}
> This JIRA is to fix the TODO: delete the parent's fake directory object in mkdirs() too, so that a fake directory object is handled consistently whether the directory gains a nested subdirectory or a nested file.
> See related discussion in [HADOOP-14236]. Thanks [~stevel@apache.org] for discussion.
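
The behavior described in the issue can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the S3AFileSystem implementation: the class {{FakeDirModel}} and its methods are hypothetical names, the "bucket" is a sorted set of keys, paths are relative with no leading slash, and a key ending in "/" stands for a fake directory object.

```java
import java.util.TreeSet;

/** Toy model of an object store bucket; hypothetical, not the real S3A code. */
class FakeDirModel {
    // Object keys in the bucket; a key ending in "/" is a fake directory object.
    private final TreeSet<String> keys = new TreeSet<>();

    /** Creates a file object, then removes the ancestors' fake directory objects. */
    void createFile(String path) {
        keys.add(path);
        deleteFakeAncestors(path);
    }

    /** Creates a fake directory object for the leaf only, then removes the ancestors' ones. */
    void mkdirs(String path) {
        keys.add(path + "/");
        deleteFakeAncestors(path);
    }

    /** A path is a directory if its own marker exists or any key is stored under it. */
    boolean isDirectory(String path) {
        String prefix = path + "/";
        String next = keys.ceiling(prefix); // smallest key >= prefix
        return next != null && next.startsWith(prefix);
    }

    /** True if the fake directory object itself is stored for this path. */
    boolean hasMarker(String path) {
        return keys.contains(path + "/");
    }

    // Walks up from the parent of 'path' and deletes each ancestor's marker, if any.
    private void deleteFakeAncestors(String path) {
        int slash = path.lastIndexOf('/');
        while (slash > 0) {
            path = path.substring(0, slash);
            keys.remove(path + "/");
            slash = path.lastIndexOf('/');
        }
    }
}
```

In this model, calling mkdirs("a"), mkdirs("a/b") and mkdirs("a/b/c") in sequence leaves a marker only for a/b/c, while a and a/b are still reported as directories because a key exists under them. That is the same treatment createFile() gives to the ancestors of a new file.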



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)