Posted to issues@arrow.apache.org by "KellyWalker (via GitHub)" <gi...@apache.org> on 2023/09/29 20:06:38 UTC

[GitHub] [arrow] KellyWalker opened a new issue, #37960: Add support for GCS URI (gs://) to pyarrow.parquet.read_table

KellyWalker opened a new issue, #37960:
URL: https://github.com/apache/arrow/issues/37960

   ### Describe the enhancement requested
   
   Currently, this works:
   
   ```python
   from gcsfs import GCSFileSystem
   import pyarrow.parquet as pq
   gcs = GCSFileSystem()
   parquet_file = pq.read_table("/bucket/path/to/file.parquet", filesystem=gcs)
   ```
   
   But this does not:
   
   ```python
   import pyarrow.parquet as pq
   parquet_file = pq.read_table("gs://bucket/path/to/file.parquet")
   ```
   
   It would be nice if the latter worked directly without needing to specify the filesystem.
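
Until the URI form works in a given installation, one possible workaround is to try the `gs://` URI first and fall back to `gcsfs` when Arrow reports that it was compiled without GCS support. The sketch below is only an illustration: `gs_uri_to_path` and `read_table_from_gcs` are hypothetical helper names, not part of pyarrow or gcsfs, and the fallback assumes `gcsfs` is installed and credentials are already configured.

```python
def gs_uri_to_path(uri: str) -> str:
    """Strip the gs:// scheme so the path can be handed to gcsfs (hypothetical helper)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a GCS URI: {uri!r}")
    return uri[len(prefix):]


def read_table_from_gcs(uri: str):
    """Read a Parquet file from GCS, falling back to gcsfs (hypothetical helper)."""
    # Imports are local so that gs_uri_to_path stays usable without pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    try:
        # Succeeds when the pyarrow build includes native GCS support.
        return pq.read_table(uri)
    except pa.ArrowNotImplementedError:
        # Assumption: gcsfs is installed; use it as an explicit filesystem.
        from gcsfs import GCSFileSystem

        return pq.read_table(gs_uri_to_path(uri), filesystem=GCSFileSystem())


print(gs_uri_to_path("gs://bucket/path/to/file.parquet"))
# prints: bucket/path/to/file.parquet
```

Calling `read_table_from_gcs("gs://bucket/path/to/file.parquet")` would then behave the same on builds with and without native GCS support, at the cost of an extra dependency on `gcsfs`.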
   
   ### Component(s)
   
   Parquet, Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741521034

   Which wheel are you using?
   Wheel list: https://pypi.org/project/pyarrow/#files


[GitHub] [arrow] KellyWalker commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "KellyWalker (via GitHub)" <gi...@apache.org>.
KellyWalker commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741506624

   Yes.


[GitHub] [arrow] KellyWalker closed issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "KellyWalker (via GitHub)" <gi...@apache.org>.
KellyWalker closed issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table
URL: https://github.com/apache/arrow/issues/37960


[GitHub] [arrow] KellyWalker commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "KellyWalker (via GitHub)" <gi...@apache.org>.
KellyWalker commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741523600

   pyarrow-12.0.1-cp38-cp38-win_amd64.whl
   
   Listed here for 12.0.1: https://pypi.org/project/pyarrow/12.0.1/#files


[GitHub] [arrow] kou commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741503538

   Does this work?
   
   ```python
   import pyarrow.dataset as ds
   dataset = ds.dataset("gs://bucket/path/to/file.parquet", format="parquet")
   parquet_file = dataset.to_table()
   ```


[GitHub] [arrow] kou commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741524631

   OK. pyarrow 13.0.0 will solve your problem.
   See also: #35255/#35193


[GitHub] [arrow] kou commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741514555

   How did you install PyArrow? From a wheel? With conda?


[GitHub] [arrow] KellyWalker commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "KellyWalker (via GitHub)" <gi...@apache.org>.
KellyWalker commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741508963

   No.
   
   In both cases the following error is generated:
   `pyarrow.lib.ArrowNotImplementedError: Got GCS URI but Arrow compiled without GCS support`
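
The error above indicates that the installed pyarrow build lacks native GCS support. A quick way to check for this before attempting a read is sketched below. It relies on the assumption that `pyarrow.fs` raises `ImportError` when asked for a filesystem that was left out of the build, and `pyarrow_has_gcs_support` is a hypothetical helper name, not a pyarrow function.

```python
def pyarrow_has_gcs_support() -> bool:
    """Return True if the installed pyarrow exposes GcsFileSystem (hypothetical helper)."""
    try:
        # The import fails with ImportError when pyarrow is missing entirely
        # or when this build does not include the GCS filesystem.
        from pyarrow.fs import GcsFileSystem  # noqa: F401
    except ImportError:
        return False
    return True


if pyarrow_has_gcs_support():
    print("gs:// URIs should be usable with this pyarrow build")
else:
    print("no native GCS support; pass a gcsfs filesystem explicitly")
```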


[GitHub] [arrow] KellyWalker commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "KellyWalker (via GitHub)" <gi...@apache.org>.
KellyWalker commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741516297

   Installed from a wheel through Poetry.
   
   Poetry resolved the dependency to pyarrow 12.0.1.


[GitHub] [arrow] KellyWalker commented on issue #37960: [Python] Add support for GCS URI (gs://) to pyarrow.parquet.read_table

Posted by "KellyWalker (via GitHub)" <gi...@apache.org>.
KellyWalker commented on issue #37960:
URL: https://github.com/apache/arrow/issues/37960#issuecomment-1741540034

   I confirmed that it does. Thank you so much for the help.
   
   I did search for the issue before I reported it, but maybe I was only looking at open tickets.

