Posted to common-issues@hadoop.apache.org by "Mingliang Liu (JIRA)" <ji...@apache.org> on 2017/04/06 21:13:41 UTC

[jira] [Updated] (HADOOP-14172) S3Guard: import does not import empty directory

     [ https://issues.apache.org/jira/browse/HADOOP-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingliang Liu updated HADOOP-14172:
-----------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: HADOOP-13345
           Status: Resolved  (was: Patch Available)

> S3Guard: import does not import empty directory
> -----------------------------------------------
>
>                 Key: HADOOP-14172
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14172
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>             Fix For: HADOOP-13345
>
>         Attachments: HADOOP-14172-HADOOP-13345.001.patch
>
>
> The import picks up everything that comes back from listFiles, which includes only files (and their parent directories as a side effect), so empty directories are never imported. My first thought was to override S3AFileSystem and add an optional parameter that uses AcceptAllButSelfAndS3nDirs instead of AcceptFilesOnly. But we could also manually traverse the tree to get all FileStatus objects directory by directory, like we do for diff. That's far slower but doesn't add surface area to S3AFileSystem. There's also the impact on other S3 clients to worry about - I could go either way on that.
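
The difference between the two listing strategies described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the local filesystem through java.nio.file as a stand-in for S3AFileSystem, and the class and method names (EmptyDirImportSketch, listFilesOnly, listAllEntries) are hypothetical, not part of Hadoop or of the attached patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: local-filesystem analogy, not the Hadoop/S3A API.
public class EmptyDirImportSketch {

    // Files-only recursive listing, analogous to the behaviour the issue
    // attributes to listFiles: directories never appear as entries.
    static List<Path> listFilesOnly(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    // Directory-by-directory traversal: every child is recorded, so an
    // empty directory shows up even though it contains no files.
    static List<Path> listAllEntries(Path root) throws IOException {
        List<Path> result = new ArrayList<>();
        collect(root, result);
        return result;
    }

    private static void collect(Path dir, List<Path> out) throws IOException {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                out.add(child);
                if (Files.isDirectory(child)) {
                    collect(child, out); // one listing call per directory
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("import-sketch");
        Files.createDirectories(root.resolve("data"));
        Files.createFile(root.resolve("data").resolve("part-0000"));
        Files.createDirectories(root.resolve("empty"));

        System.out.println("files only:  " + listFilesOnly(root));
        System.out.println("all entries: " + listAllEntries(root));
    }
}
```

The second method issues one listing call per directory, which is why the description calls the manual traversal far slower against S3 than a single recursive listing.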


