Posted to jira@arrow.apache.org by "Dewey Dunnington (Jira)" <ji...@apache.org> on 2022/01/10 14:36:00 UTC

[jira] [Resolved] (ARROW-15097) [R] Can't write dataset on minio local s3

     [ https://issues.apache.org/jira/browse/ARROW-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dewey Dunnington resolved ARROW-15097.
--------------------------------------
    Resolution: Not A Problem

> [R] Can't write dataset on minio local s3
> -----------------------------------------
>
>                 Key: ARROW-15097
>                 URL: https://issues.apache.org/jira/browse/ARROW-15097
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Dewey Dunnington
>            Priority: Major
>
> While trying to reproduce the "Odd behaviour when writing a dataset to s3" error described in [https://github.com/apache/arrow/issues/11934], I ran into problems writing to a local minio-backed bucket. This may be user error on my part, since I'm unfamiliar with this kind of setup. If so, documenting how to create a local test setup, as alluded to in the S3 vignette, might be a good solution.
>  
> The code I'm using to reproduce is:
> {code:r}
> library(arrow, warn.conflicts = FALSE)
> dir <- tempfile()
> dir.create(dir)
> subdir <- file.path(dir, "some_subdir")
> dir.create(subdir)
> list.files(dir)
> #> [1] "some_subdir"
> minio_server <- processx::process$new("minio", args = c("server", dir), supervise = TRUE)
> Sys.sleep(1)
> stopifnot(minio_server$is_alive())
> # make sure we can connect
> s3_uri <- "s3://minioadmin:minioadmin@?scheme=http&endpoint_override=localhost%3A9000"
> bucket <- s3_bucket(s3_uri)
> bucket$ls("some_subdir")
> #> character(0)
> # write a dataset to minio (currently hangs or errors)
> data <- data.frame(x = letters[1:5])
> write_dataset(
>   dataset = data,
>   path = bucket$path("test_parquet")
> )
> #> Error: IOError: When creating bucket 'test_parquet': AWS Error [code 100]: Unable to parse ExceptionName: InvalidBucketName Message: The specified bucket is not valid.
> minio_server$interrupt()
> #> [1] TRUE
> Sys.sleep(1)
> stopifnot(!minio_server$is_alive())
> {code}
> The output of {{mc admin trace}} is:
> {noformat}
> $ mc alias set myminio http://localhost:9000 minioadmin minioadmin
> Added `myminio` successfully.
> $ mc admin trace myminio
> 2021-12-14T08:46:04:000 [200 OK] s3.ListBuckets localhost:9000/ ::1               444µs       ↑ 156 B ↓ 685 B
> 2021-12-14T08:46:26:000 [400 Bad Request] s3.PutBucket localhost:9000/test_parquet ::1               127µs       ↑ 187 B ↓ 625 B
> {noformat}
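
The error message and the trace above both show 'test_parquet' being sent as a bucket name (s3.PutBucket localhost:9000/test_parquet), presumably because the URI names no bucket and the first path component is taken as one. The sketch below is an illustration and not part of the original report. It checks a name against a subset of the general S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens; starting and ending with a letter or digit; no adjacent dots), which is one plausible reason for the InvalidBucketName response, since 'test_parquet' contains an underscore. The helpers is_valid_bucket_name() and first_component() and the name "some-bucket" are hypothetical, introduced here only for the example; minio may apply different or looser checks.

```r
# Hypothetical helper: TRUE if `name` passes a subset of the S3 bucket
# naming rules (length 3-63; lowercase letters, digits, dots, hyphens;
# first and last character a letter or digit; no two adjacent dots).
is_valid_bucket_name <- function(name) {
  stopifnot(is.character(name), length(name) == 1)
  n <- nchar(name)
  n >= 3 && n <= 63 &&
    grepl("^[a-z0-9][a-z0-9.-]*[a-z0-9]$", name, perl = TRUE) &&
    !grepl("..", name, fixed = TRUE)
}

# Hypothetical helper: first component of a relative path. With no bucket
# in the URI, this is assumed to be the part treated as the bucket name.
first_component <- function(path) {
  strsplit(path, "/", fixed = TRUE)[[1]][1]
}

# The path from the report: its first component contains an underscore.
is_valid_bucket_name(first_component("test_parquet"))

# A made-up path whose first component follows the naming rules.
is_valid_bucket_name(first_component("some-bucket/test_parquet"))
```

Under this reading, writing to a path whose first component is an existing, validly named bucket would be the thing to try. That is an assumption drawn from the error text and the trace, not something the report itself confirms.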



--
This message was sent by Atlassian Jira
(v8.20.1#820001)