Posted to common-dev@hadoop.apache.org by "Aaron Fabbri (JIRA)" <ji...@apache.org> on 2018/09/21 00:47:00 UTC

[jira] [Created] (HADOOP-15780) S3Guard: document how to deal with non-S3Guard processes writing data to S3Guarded buckets

Aaron Fabbri created HADOOP-15780:
-------------------------------------

             Summary: S3Guard: document how to deal with non-S3Guard processes writing data to S3Guarded buckets
                 Key: HADOOP-15780
                 URL: https://issues.apache.org/jira/browse/HADOOP-15780
             Project: Hadoop Common
          Issue Type: Sub-task
    Affects Versions: 3.2.0
            Reporter: Aaron Fabbri


Our general policy for S3Guard is this: every process that modifies a bucket configured for use with S3Guard must itself use S3Guard. Otherwise, the MetadataStore is not updated as the S3 bucket changes, and the store drifts out of sync with the bucket.

There are limited circumstances in which it may be safe to have an external (non-S3Guard) process writing data. There are also scenarios where it definitely breaks things.

I think we should start by documenting the cases where this works and the cases where it does not. Once we've enumerated those, we can suggest enhancements as needed to make this sort of configuration easier to use.

To get the ball rolling, some things that do not work:
- Deleting a path *p* with S3Guard, then writing a new file at path *p* without S3Guard. The delete marker for *p* is still present in S3Guard, so the file appears deleted to S3Guard clients even though it is visible in S3; S3Guard attributes the mismatch to (false) "eventual consistency". (As [~stevel@apache.org] and I have discussed.)
- When fs.s3a.metadatastore.authoritative is true: adding files to a directory without S3Guard, then listing that directory with S3Guard. The listing may exclude the externally written files.
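
The two failure cases listed above can be reproduced in a toy model. The sketch below is only an illustration and is not S3A or S3Guard code: ToyS3, ToyMetadataStore, GuardedClient and external_write are hypothetical names, the "authoritative" flag is a simplified stand-in for fs.s3a.metadatastore.authoritative, and the real implementation tracks authoritative state and delete markers in more detail than this.

```python
# Toy model (assumption: heavily simplified, NOT the real S3A/S3Guard logic).
PRESENT, DELETED = "present", "deleted"


class ToyS3:
    """In-memory stand-in for the bucket: maps path -> bytes."""
    def __init__(self):
        self.objects = {}


class ToyMetadataStore:
    """Stand-in for the MetadataStore: maps path -> PRESENT or DELETED."""
    def __init__(self):
        self.entries = {}


class GuardedClient:
    """A client that keeps the metadata store updated on every change."""
    def __init__(self, s3, store, authoritative=False):
        self.s3, self.store, self.authoritative = s3, store, authoritative

    def write(self, path, data):
        self.s3.objects[path] = data
        self.store.entries[path] = PRESENT

    def delete(self, path):
        self.s3.objects.pop(path, None)
        self.store.entries[path] = DELETED  # leave a delete marker behind

    def exists(self, path):
        state = self.store.entries.get(path)
        if state is not None:
            # The store's answer wins, including a delete marker.
            return state == PRESENT
        return path in self.s3.objects

    def list(self, prefix):
        in_store = {p for p, s in self.store.entries.items()
                    if p.startswith(prefix) and s == PRESENT}
        if self.authoritative:
            # Authoritative mode: trust the store and never consult S3.
            return sorted(in_store)
        deleted = {p for p, s in self.store.entries.items() if s == DELETED}
        in_s3 = {p for p in self.s3.objects if p.startswith(prefix)}
        return sorted((in_store | in_s3) - deleted)


def external_write(s3, path, data):
    """A non-S3Guard writer: touches S3 only, never the metadata store."""
    s3.objects[path] = data


# Case 1: delete with S3Guard, then rewrite the same path externally.
s3, store = ToyS3(), ToyMetadataStore()
client = GuardedClient(s3, store)
client.write("dir/p", b"v1")
client.delete("dir/p")
external_write(s3, "dir/p", b"v2")
print("in S3:", "dir/p" in s3.objects, "| seen via S3Guard:", client.exists("dir/p"))

# Case 2: authoritative listing misses an externally written file.
s3, store = ToyS3(), ToyMetadataStore()
auth_client = GuardedClient(s3, store, authoritative=True)
auth_client.write("data/a", b"1")
external_write(s3, "data/b", b"2")
print("authoritative listing:", auth_client.list("data/"))
```

In this model the first case leaves the object present in the bucket while the guarded client still reports it as missing, and the second case returns a listing that omits data/b because the store is never told about it.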

(Note: there are also S3A interop issues with non-S3A clients even without S3Guard, due to the unique way S3A interprets empty directory markers.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
