You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/01 10:00:50 UTC

[GitHub] [arrow] pitrou commented on a diff in pull request #13206: ARROW-15906: [C++][Python][R] By default, don't create or delete S3 buckets

pitrou commented on code in PR #13206:
URL: https://github.com/apache/arrow/pull/13206#discussion_r886614745


##########
cpp/src/arrow/filesystem/s3fs.h:
##########
@@ -218,6 +229,12 @@ class ARROW_EXPORT S3FileSystem : public FileSystem {
   /// Return the actual region this filesystem connects to
   std::string region() const;
 
+  /// Set create_buckets property of options

Review Comment:
   These trivial setters are not needed as people can access the attributes directly.



##########
r/R/filesystem.R:
##########
@@ -180,6 +184,11 @@ FileSelector$create <- function(base_dir, allow_not_found = FALSE, recursive = F
 #' - `$OpenAppendStream(path)`: Open an [output stream][OutputStream] for
 #'    appending.
 #'
+#' `S3FileSystem` also has methods:
+#'
+#' - `$allow_bucket_creation()` which can set the corresponding option above.
+#' - `$allow_bucket_deletion()` which can set the corresponding option above.

Review Comment:
   Why expose these as methods as well? Is it useful?



##########
cpp/src/arrow/filesystem/s3fs.cc:
##########
@@ -353,6 +354,12 @@ Result<S3Options> S3Options::FromUri(const Uri& uri, std::string* out_path) {
       options.scheme = kv.second;
     } else if (kv.first == "endpoint_override") {
       options.endpoint_override = kv.second;
+    } else if (kv.first == "allow_bucket_creation") {
+      options.allow_bucket_creation =
+          ::arrow::internal::AsciiEqualsCaseInsensitive(kv.second, "true");
+    } else if (kv.first == "allow_bucket_deletion") {
+      options.allow_bucket_deletion =
+          ::arrow::internal::AsciiEqualsCaseInsensitive(kv.second, "true");

Review Comment:
   Hmm, I think we should be a bit stricter and not blindly interpret any other value as "false".
   How about allowing the following values:
   * "0", "false" (case-insensitive) -> boolean false
   * "1", "true" (case-insensitive) -> boolean true
   * any other value -> raise `Status::Invalid`
   



##########
cpp/src/arrow/filesystem/s3fs.h:
##########
@@ -130,6 +130,17 @@ struct ARROW_EXPORT S3Options {
   /// Whether OutputStream writes will be issued in the background, without blocking.
   bool background_writes = true;
 
+  /// Whether to allow creation or deletion of buckets

Review Comment:
   ```suggestion
     /// Whether to allow creation of buckets
   ```



##########
python/pyarrow/_s3fs.pyx:
##########
@@ -312,3 +328,25 @@ cdef class S3FileSystem(FileSystem):
         The AWS region this filesystem connects to.
         """
         return frombytes(self.s3fs.region())
+
+    @property
+    def allow_bucket_creation(self):
+        """
+        Whether to allow CreateDir at the bucket-level.

Review Comment:
   We should probably use the same docstring for constructor parameters and the associated properties.
   Personally, I like "Whether to allow CreateDir at the bucket-level."



##########
python/pyarrow/includes/libarrow_fs.pxd:
##########
@@ -193,6 +195,8 @@ cdef extern from "arrow/filesystem/api.h" namespace "arrow::fs" nogil:
         CResult[shared_ptr[CS3FileSystem]] Make(const CS3Options& options)
         CS3Options options()
         c_string region()
+        void allow_bucket_creation(c_bool allow)
+        void allow_bucket_deletion(c_bool allow)

Review Comment:
   Not needed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org