Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/01/10 14:46:53 UTC

[GitHub] [arrow] paleolimbot commented on issue #11934: [R] errors when downloading parquet files from s3.

paleolimbot commented on issue #11934:
URL: https://github.com/apache/arrow/issues/11934#issuecomment-1008942966


   I couldn't reproduce this using a local minio server. Is there anything I'm misunderstanding about your setup? If you can modify this example so that it reproduces your error, we'll be better able to help fix it!
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   
   dir <- tempfile()
   dir.create(dir)
   subdir <- file.path(dir, "some_subdir")
   dir.create(subdir)
   list.files(dir)
   #> [1] "some_subdir"
   
   minio_server <- processx::process$new("minio", args = c("server", dir), supervise = TRUE)
   Sys.sleep(1)
   stopifnot(minio_server$is_alive())
   #> Error: minio_server$is_alive() is not TRUE
   
   # make sure we can connect
   s3_uri <- "s3://minioadmin:minioadmin@?scheme=http&endpoint_override=localhost%3A9000"
   bucket <- s3_bucket(s3_uri)
   bucket$ls("some_subdir")
   #> [1] "some_subdir/test"
   
   # write a dataset to minio
   data <- data.frame(x = letters[1:5])
   
   write_dataset(
     dataset = data,
     path = bucket$path("some_subdir/test")
   )
   
   bucket$ls("some_subdir/test")
   #> [1] "some_subdir/test/part-0.parquet"
   
   dplyr::collect(arrow::open_dataset(bucket$path("some_subdir/test")))
   #>   x
   #> 1 a
   #> 2 b
   #> 3 c
   #> 4 d
   #> 5 e
   
   minio_server$interrupt()
   #> [1] FALSE
   Sys.sleep(1)
   stopifnot(!minio_server$is_alive())
   ```
   
   <sup>Created on 2022-01-10 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>
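   
   When adapting the example to a different setup, it may help to check what the connection URI actually encodes: the access key, the secret key, the scheme, and the (percent-encoded) endpoint are all packed into one string. Below is a minimal base R sketch that splits the URI back into its parts. The helper `parse_s3_uri()` is hypothetical and not part of arrow, and it assumes simple keys that contain no `:` or `@`.
   
   ``` r
   # Hypothetical helper (not part of arrow): split an S3 URI of the form
   # s3://ACCESS_KEY:SECRET_KEY@BUCKET/PATH?key=value&key=value
   parse_s3_uri <- function(uri) {
     rest <- sub("^s3://", "", uri)
     # Credentials are everything before the "@"
     creds <- strsplit(sub("@.*$", "", rest), ":", fixed = TRUE)[[1]]
     # Query options are everything after the "?"
     query <- sub("^[^?]*\\?", "", rest)
     pairs <- strsplit(strsplit(query, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
     # Percent-decode each option value (e.g. "%3A" becomes ":")
     opts <- vapply(pairs, function(p) utils::URLdecode(p[2]), character(1))
     names(opts) <- vapply(pairs, function(p) p[1], character(1))
     list(access_key = creds[1], secret_key = creds[2], options = opts)
   }
   
   parts <- parse_s3_uri(
     "s3://minioadmin:minioadmin@?scheme=http&endpoint_override=localhost%3A9000"
   )
   parts$options[["endpoint_override"]]
   #> [1] "localhost:9000"
   ```
   
   If the decoded endpoint or scheme differs from what your S3-compatible server expects, that difference is a good candidate to include when modifying the reprex.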

