Posted to issues@arrow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/01/25 17:37:00 UTC

[jira] [Commented] (ARROW-1213) [Python] Enable s3fs to be used with ParquetDataset and reader/writer functions

    [ https://issues.apache.org/jira/browse/ARROW-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16339538#comment-16339538 ] 

ASF GitHub Bot commented on ARROW-1213:
---------------------------------------

gsakkis commented on issue #916: ARROW-1213: [Python] Support s3fs filesystem for Amazon S3 in ParquetDataset
URL: https://github.com/apache/arrow/pull/916#issuecomment-360541307
 
 
   Sorry to resurrect this, but has there been a regression since then? I am trying the code sample from @DrChrisLevy above and getting an `IndexError`. It looks like it doesn't accept the `s3://` scheme; passing `bucket/path` works.
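
   The workaround described above can be sketched as follows. This is a minimal sketch, not the code sample referenced in the thread: the helper names `strip_s3_scheme` and `read_partitioned_dataset` are hypothetical, and it assumes `ParquetDataset` is given an `s3fs.S3FileSystem` through its `filesystem` argument and a path of the form `bucket/path`.

```python
def strip_s3_scheme(path):
    """Return 'bucket/key' for 's3://bucket/key'; leave other paths unchanged.

    Hypothetical helper, written only to illustrate the workaround.
    """
    prefix = "s3://"
    return path[len(prefix):] if path.startswith(prefix) else path


def read_partitioned_dataset(path):
    """Read a partitioned Parquet dataset from S3 into a pandas DataFrame.

    Hypothetical wrapper; assumes pyarrow and s3fs are installed and that
    AWS credentials are available through the usual configuration.
    """
    # Imported here so that strip_s3_scheme works without these packages.
    import pyarrow.parquet as pq
    import s3fs

    fs = s3fs.S3FileSystem()
    # Pass 'bucket/path' rather than 's3://bucket/path'.
    dataset = pq.ParquetDataset(strip_s3_scheme(path), filesystem=fs)
    return dataset.read().to_pandas()
```

   Calling `read_partitioned_dataset("s3://bucket/path")` would then hand `bucket/path` to `ParquetDataset`. Whether the scheme needs stripping at all may depend on the pyarrow and s3fs versions in use.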



> [Python] Enable s3fs to be used with ParquetDataset and reader/writer functions
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-1213
>                 URL: https://issues.apache.org/jira/browse/ARROW-1213
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: Yacko
>            Assignee: Wes McKinney
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.6.0
>
>
> The pyarrow dataset function can't read from S3 using s3fs as the filesystem. Is there a way to add support for reading partitioned files from S3?
> I am trying to address the problem described in this Stack Overflow question:
> https://stackoverflow.com/questions/45082832/how-to-read-partitioned-parquet-files-from-s3-using-pyarrow-in-python


