Posted to common-issues@hadoop.apache.org by "Aaron Fabbri (JIRA)" <ji...@apache.org> on 2016/10/17 23:03:58 UTC

[jira] [Updated] (HADOOP-13651) S3Guard: S3AFileSystem Integration with MetadataStore

     [ https://issues.apache.org/jira/browse/HADOOP-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Fabbri updated HADOOP-13651:
----------------------------------
    Attachment: HADOOP-13651-HADOOP-13345.001.patch

I don't have all tests passing yet, but I wanted to attach a v1 / RFC patch in case folks want to take a look.  See my previous comment for an overview (except that I've now implemented create() in this patch).

This patch has really benefited from the great work folks have done on integration and FS contract tests, so thank you.

The create() case was interesting:  On create, we need to put a FileStatus in the MetadataStore.  The main wart is modification time:  S3A uses S3's server-side modification time to populate FileStatus objects.  We cannot know that value at create time unless we block and poll S3 for results, and those results would be subject to S3 consistency and multi-writer issues.  The other approach would be to put a PathMetadata in the MetadataStore that says "this file exists but we do not have a FileStatus for it yet".  That complicates the client a bit, so for now I just use local system time for the modification time.
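
To make the create() trade-off concrete, here is a minimal sketch, not code from the patch.  SketchFileStatus, SketchMetadataStore and CreateSketch are hypothetical stand-ins for illustration only; they are not the real Hadoop FileStatus or MetadataStore types.  It only shows the idea of stamping the entry with the client's clock at create time instead of waiting for S3's server-side time.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a file status entry (not Hadoop's FileStatus). */
final class SketchFileStatus {
    final String path;
    final long length;
    final long modificationTime; // milliseconds since the epoch

    SketchFileStatus(String path, long length, long modificationTime) {
        this.path = path;
        this.length = length;
        this.modificationTime = modificationTime;
    }
}

/** Hypothetical in-memory stand-in for a MetadataStore. */
final class SketchMetadataStore {
    private final Map<String, SketchFileStatus> entries = new HashMap<>();

    void put(SketchFileStatus status) {
        entries.put(status.path, status);
    }

    SketchFileStatus get(String path) {
        return entries.get(path);
    }
}

final class CreateSketch {
    /**
     * Records a newly created file. The server-side modification time is
     * not known yet, so the local system time is used as an approximation.
     */
    static SketchFileStatus recordCreate(SketchMetadataStore store,
                                         String path, long length) {
        long localNow = System.currentTimeMillis();
        SketchFileStatus status = new SketchFileStatus(path, length, localNow);
        store.put(status);
        return status;
    }
}
```

The obvious cost of this choice is that the stored time can differ from what S3 later reports, since it depends on the client's clock.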
 
The main issue I'm tackling next is {{S3AFileStatus#isEmptyDirectory()}}.  This one bit of state is a pain because it means you cannot simply cache an S3AFileStatus in isolation: it needs to be updated when the set of children changes.  Couple this with the fact that we do not require all metadata to be pre-loaded into the MetadataStore, and you have a nasty little problem.  I have an idea of how to tackle it.  I may post my solution to that part as a separate RFC patch on here so folks can comment on that part alone.
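
To illustrate why the emptiness bit is awkward, here is a small sketch of the problem.  It is not the solution referred to above and not code from the patch; EmptyDirSketch, its three-valued Emptiness result and the "fully listed" flag are all assumptions made up for this example.  The point is that emptiness depends on the children, and when the store may hold only part of a directory's metadata, "no known children" does not imply "empty".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of directory emptiness with partially loaded metadata. */
final class EmptyDirSketch {
    enum Emptiness { EMPTY, NOT_EMPTY, UNKNOWN }

    // Directory path -> child names the store currently knows about.
    private final Map<String, Set<String>> children = new HashMap<>();
    // Directories whose child set is known to be complete.
    private final Set<String> fullyListed = new HashSet<>();

    /** Marks a directory as having a complete child listing in the store. */
    void markFullyListed(String dir) {
        children.computeIfAbsent(dir, d -> new HashSet<>());
        fullyListed.add(dir);
    }

    void addChild(String dir, String name) {
        children.computeIfAbsent(dir, d -> new HashSet<>()).add(name);
    }

    void removeChild(String dir, String name) {
        Set<String> known = children.get(dir);
        if (known != null) {
            known.remove(name);
        }
    }

    /** Derives emptiness from the children instead of caching a flag. */
    Emptiness isEmptyDirectory(String dir) {
        Set<String> known = children.get(dir);
        if (known != null && !known.isEmpty()) {
            return Emptiness.NOT_EMPTY; // one known child is enough
        }
        if (fullyListed.contains(dir)) {
            return Emptiness.EMPTY; // complete listing with no children
        }
        return Emptiness.UNKNOWN; // the store cannot answer on its own
    }
}
```

In this sketch a directory that was never fully listed goes back to UNKNOWN once its last known child is removed, which is the case where a client would still have to consult S3.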


> S3Guard: S3AFileSystem Integration with MetadataStore
> -----------------------------------------------------
>
>                 Key: HADOOP-13651
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13651
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>         Attachments: HADOOP-13651-HADOOP-13345.001.patch
>
>
> Modify S3AFileSystem et al. to optionally use a MetadataStore for metadata consistency and caching.
> Implementation should have minimal overhead when no MetadataStore is configured.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
