Posted to github@arrow.apache.org by "AlenkaF (via GitHub)" <gi...@apache.org> on 2023/05/29 11:07:15 UTC

[GitHub] [arrow] AlenkaF commented on issue #35713: [Python] Can't interchange zero-sized `polars` dataframes

AlenkaF commented on issue #35713:
URL: https://github.com/apache/arrow/issues/35713#issuecomment-1566980860

   The issue actually comes from the polars side and is tracked in https://github.com/pola-rs/polars/issues/8884. It seems that in `to_arrow()`, [`pa.Table.from_batches`](https://github.com/apache/arrow/blob/05fe0d25834fd1629d71ceb51f0281b44a511f94/python/pyarrow/table.pxi#L3955-L3956) receives an empty list of record batches and no schema: https://github.com/pola-rs/polars/blob/ba8187d85c47f199f26d28bc519f51afd1a1b885/py-polars/polars/dataframe/frame.py#L1849
   
   ```python
   >>> import polars as pl
   >>> pl.DataFrame({}).to_arrow()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/Users/alenkafrim/repos/pyarrow-dev/lib/python3.10/site-packages/polars/dataframe/frame.py", line 1845, in to_arrow
       return pa.Table.from_batches(record_batches)
     File "pyarrow/table.pxi", line 3955, in pyarrow.lib.Table.from_batches
       raise ValueError('Must pass schema, or at least '
   ValueError: Must pass schema, or at least one RecordBatch
   ```
   As a result, converting an empty polars dataframe to a pyarrow table via the dataframe interchange protocol also fails, because the `__dataframe__` method fails on an empty polars dataframe.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org