Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/01 14:48:38 UTC

[GitHub] [arrow] Mokubyow edited a comment on issue #10634: pq.write_to_dataset() and ds.write_dataset() both throw InvalidLocationConstraint when using S3FileSystem

Mokubyow edited a comment on issue #10634:
URL: https://github.com/apache/arrow/issues/10634#issuecomment-872305153


   @westonpace I'm 100% sure I'm using the exact same `path` variable for each of these writing methods; I've annotated below which calls work and which don't. I can also confirm that the `path` variable returned from `fs.FileSystem.from_uri(target_uri)` does indeed start with the bucket name: `my-existing-bucket/new_prefix/`. Additionally, I'm not creating a new bucket; the bucket already exists in my AWS account and I'm only trying to write to it, so I'm not sure why this error is being thrown in the first place.
   
   ```
   import pyarrow.dataset as ds
   import pyarrow.parquet as pq
   from pyarrow import fs

   # target_uri points at the existing bucket, e.g. "s3://my-existing-bucket/new_prefix/"
   filesystem, path = fs.FileSystem.from_uri(target_uri)

   # Throws InvalidLocationConstraint error
   ds.write_dataset(dataset, path, filesystem=filesystem, format="parquet")

   # Throws InvalidLocationConstraint error
   pq.write_to_dataset(dataset.to_table(), path, filesystem=filesystem)

   # Writes a single file successfully
   pq.write_table(dataset.to_table(), path, filesystem=filesystem)
   ```
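
   For what it's worth, `InvalidLocationConstraint` is an S3 error normally raised when a bucket-creation request carries a region (LocationConstraint) that doesn't match the endpoint, so one thing worth trying is constructing the filesystem with the bucket's region spelled out rather than letting `from_uri` resolve it. A minimal sketch, assuming the bucket lives in `us-east-2` (a placeholder; substitute the bucket's actual region):

   ```
   from pyarrow import fs

   # Build the S3 filesystem with an explicit region instead of relying on
   # fs.FileSystem.from_uri() to infer it. "us-east-2" is a placeholder;
   # use the region the bucket actually lives in.
   filesystem = fs.S3FileSystem(region="us-east-2")
   path = "my-existing-bucket/new_prefix/"
   ```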
   
   In the meantime, are there any workarounds you can think of for writing a large dataset as many Parquet files from pyarrow, to unblock me while we debug?
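
   One possible stopgap, assuming the single-file `pq.write_table` call above keeps working, would be to slice the table and write each slice as its own Parquet file under the prefix. A rough sketch (the `n_chunks` value and the `part-XXXX.parquet` naming are arbitrary illustrations, not anything pyarrow requires):

   ```
   import pyarrow.parquet as pq

   # Stopgap sketch: write the dataset as several Parquet files by hand,
   # reusing the pq.write_table call that already works.
   table = dataset.to_table()
   n_chunks = 16  # arbitrary; pick based on table size / target file size
   chunk_size = max(1, -(-table.num_rows // n_chunks))  # ceiling division

   for i, start in enumerate(range(0, table.num_rows, chunk_size)):
       chunk = table.slice(start, chunk_size)  # length is clamped at the end
       pq.write_table(chunk, f"{path.rstrip('/')}/part-{i:04d}.parquet",
                      filesystem=filesystem)
   ```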

