Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2017/02/27 10:58:45 UTC

[jira] [Commented] (HADOOP-14124) S3AFileSystem silently deletes "fake" directories when writing a file.

    [ https://issues.apache.org/jira/browse/HADOOP-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885572#comment-15885572 ] 

Steve Loughran commented on HADOOP-14124:
-----------------------------------------

Here's why: during directory scanning, when looking for files and their children, the check for a mock empty directory halts the scan down the tree. If we left those markers alone once real data exists underneath them, then operations looking for files and their children are potentially *not going to find child entries*. I think we can both agree: that'd be a disaster. It's not just us that hates mock dirs, BTW; Cyberduck deserves a mention.
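
To make that concrete, here's a rough sketch of the failure mode (hypothetical helper names, AWS SDK v1 style; this is not the actual S3A code): if an emptiness probe trusts a leftover zero-byte "key/" marker, it returns before ever issuing the LIST, so real children under the prefix are never seen.

{code:java}
// Hypothetical sketch, not S3AFileSystem code: why a stale "dir/" marker is
// dangerous if it is treated as proof of an empty directory.
import com.amazonaws.AmazonServiceException;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ListObjectsRequest;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.ObjectMetadata;

class FakeDirProbe {
  private final AmazonS3 s3;
  private final String bucket;

  FakeDirProbe(AmazonS3 s3, String bucket) {
    this.s3 = s3;
    this.bucket = bucket;
  }

  /** If a zero-byte "key/" marker exists, the scan stops here and reports an
   *  empty directory, without ever listing the real children below it. */
  boolean looksLikeEmptyDirectory(String key) {
    try {
      ObjectMetadata marker = s3.getObjectMetadata(bucket, key + "/");
      return marker.getContentLength() == 0;
    } catch (AmazonServiceException e) {
      return false;   // no marker: the caller has to LIST the prefix instead
    }
  }

  /** What is actually under the prefix, whether or not a marker exists. */
  ObjectListing listChildren(String key) {
    return s3.listObjects(new ListObjectsRequest()
        .withBucketName(bucket)
        .withPrefix(key + "/")
        .withDelimiter("/"));
  }
}
{code}

Deleting the marker once real data is written underneath is what keeps checks of that shape from lying.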

Regarding filesystem compatibility, what really drives us is making object stores look like Hadoop filesystems: that is, how close we can get to the [Hadoop Filesystem specification|http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/filesystem/filesystem.html] and its contract tests.
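
For illustration, this is the sort of invariant the spec and the contract tests demand (placeholder bucket and paths; any Hadoop {{FileSystem}} is supposed to behave this way, object store or not):

{code:java}
// Illustrative only: a spec-style invariant, not a real contract test.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SpecInvariantExample {
  public static void main(String[] args) throws Exception {
    // "s3a://example-bucket/" is a placeholder; the same must hold for HDFS, file://, etc.
    FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), new Configuration());
    Path file = new Path("/data/part-0000");

    fs.create(file).close();                         // write a file

    // The parent must now exist as a directory...
    FileStatus parent = fs.getFileStatus(file.getParent());
    if (!parent.isDirectory()) {
      throw new AssertionError(file.getParent() + " is not a directory");
    }

    // ...and listing it must show the child we just wrote.
    boolean found = false;
    for (FileStatus st : fs.listStatus(file.getParent())) {
      found |= st.getPath().getName().equals(file.getName());
    }
    if (!found) {
      throw new AssertionError(file + " missing from listing of its parent");
    }
  }
}
{code}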

Looking into S3A, the big places where we check {{S3AFileStatus.isEmptyDirectory()}} are rename and delete, where corner cases like "rename onto an empty dir is permitted, overwriting the empty dir with the data" and "deleting an empty dir doesn't need to be recursive" come up. We can argue about the suitability of those semantics in a blobstore, but they're what Posix dictates, for better or worse.
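
The delete() side looks roughly like this (a simplified sketch with hypothetical helpers, not the real {{S3AFileSystem}} code); rename() makes a similar {{isEmptyDirectory()}} probe before letting you rename onto an empty directory:

{code:java}
// Simplified sketch of where the emptiness check matters in delete();
// the abstract helpers stand in for HEAD/LIST/DELETE calls against the bucket.
import java.io.IOException;
import org.apache.hadoop.fs.Path;

abstract class BlobstoreDeleteSketch {
  abstract boolean isDirectory(Path p) throws IOException;       // HEAD "p" / "p/"
  abstract boolean isEmptyDirectory(Path p) throws IOException;  // only the "p/" marker exists
  abstract void deleteKey(Path p) throws IOException;            // DELETE one object/marker
  abstract void deleteTree(Path p) throws IOException;           // LIST prefix + bulk DELETE

  boolean delete(Path path, boolean recursive) throws IOException {
    if (!isDirectory(path)) {
      deleteKey(path);                 // plain file: one DELETE
      return true;
    }
    if (isEmptyDirectory(path)) {
      deleteKey(path);                 // empty dir: non-recursive delete is legal,
      return true;                     // just remove the "path/" marker
    }
    if (!recursive) {
      // the Posix-ish contract: refuse to delete a non-empty directory
      throw new IOException(path + " is not empty");
    }
    deleteTree(path);                  // recursive: remove every child key
    return true;
  }
}
{code}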

In HADOOP-9565 (?) we do say "we should have a minimal blobstore interface which just does PUT/GET/HEAD/DELETE" and stop pretending otherwise (there's a rough sketch of that after the list below), but it hits a couple of brick walls:
# projects like Hive don't want a new API; they want object stores to look like Posix, up to and including things that are near impossible, like O(1) atomic renames.
# the object stores are all different. WASB does leases, AWS does multipart puts and has finally added the ability to dynamically add metadata to objects; its consistency model still lags.
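
For reference, the kind of minimal interface meant there would be little more than the four verbs. This is a hypothetical sketch, not an API that exists anywhere in Hadoop:

{code:java}
// Hypothetical "minimal blobstore" API: four verbs, no directories, no rename.
import java.io.IOException;
import java.io.InputStream;

interface BlobStore {
  /** PUT: store a blob under a key, overwriting whatever was there. */
  void put(String key, InputStream data, long length) throws IOException;

  /** GET: open the blob for reading; fail if it does not exist. */
  InputStream get(String key) throws IOException;

  /** HEAD: cheap existence/length probe without downloading the data. */
  long head(String key) throws IOException;

  /** DELETE: remove the blob; deleting a missing key is not an error. */
  void delete(String key) throws IOException;
}
{code}

Everything directory-like ({{mkdirs()}}, {{rename()}}, recursive {{delete()}}) would then have to live above that layer, which is exactly the work projects like Hive don't want to take on.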

I really don't see a good solution here. We're forced to abuse object stores to make them pretend to be filestores, then field complaints about "race conditions in rename" and the like (HIVE-14269), which is the Posix metaphor failing, and in our best-effort attempts to come close to it we do some things in the blobstore that aren't great.

# we welcome submissions of documentation improvements to the project; something on the topic of s3a compatibility, based on your experience, would be good.
# we're aware of the assumption in s3n and s3a that they have unrestricted write access to every path in a bucket. Once you start playing with S3 ACLs, that assumption doesn't hold, and those recursive read+delete calls don't work well; the marker-cleanup sketch below shows why. With HADOOP-13164 we're resilient to problems there, probably.
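
On that second point, the post-write marker cleanup (the {{deleteUnnecessaryFakeDirectories}} behaviour mentioned in the description) is roughly a walk up the parent paths, and every step needs delete access; that's the assumption restrictive ACLs break. A sketch, with hypothetical helper names:

{code:java}
// Hedged sketch of the parent-marker cleanup after a file is written;
// hypothetical helpers, not the actual deleteUnnecessaryFakeDirectories().
import java.io.IOException;
import org.apache.hadoop.fs.Path;

abstract class MarkerCleanupSketch {
  abstract boolean markerExists(Path dir) throws IOException;  // HEAD "dir/"
  abstract void deleteMarker(Path dir) throws IOException;     // DELETE "dir/"

  /** After writing /a/b/c/file, drop any fake markers at /a/b/c, /a/b and /a. */
  void cleanupParents(Path file) throws IOException {
    Path dir = file.getParent();
    while (dir != null && !dir.isRoot()) {
      if (markerExists(dir)) {
        deleteMarker(dir);   // this is the call a restrictive ACL will reject
      }
      dir = dir.getParent();
    }
  }
}
{code}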

To conclude: code changes here have to go in as WONTFIX, sorry. More documentation on what's happening under the hood: we'd welcome that.

PS: can you declare the Hadoop version you are using? Thanks.



> S3AFileSystem silently deletes "fake" directories when writing a file.
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-14124
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14124
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs, fs/s3
>            Reporter: Joel Baranick
>              Labels: filesystem, s3
>
> I realize that you guys probably have a good reason for {{S3AFileSystem}} to clean up "fake" folders when a file is written to S3. That said, the fact that it silently does this feels like a separation of concerns issue. It also leads to weird behavior issues where calls to {{AmazonS3Client.getObjectMetadata}} for folders work before calling {{S3AFileSystem.create}} but not after. Also, there seems to be no mention in the javadoc that the {{deleteUnnecessaryFakeDirectories}} method is automatically invoked. Lastly, it seems like the goal of {{FileSystem}} should be to ensure that code built on top of it is portable to different implementations. This behavior is an example of a case where that can break down.


