Posted to issues@iceberg.apache.org by "Fokko (via GitHub)" <gi...@apache.org> on 2023/03/10 16:17:47 UTC

[GitHub] [iceberg] Fokko commented on issue #7067: Polars Based Compute Engine

Fokko commented on issue #7067:
URL: https://github.com/apache/iceberg/issues/7067#issuecomment-1464044392

   Hey @asheeshgarg, thanks for raising this!
   
   Integrating with Polars would be awesome. 
   
   Currently, we can load data into an Arrow table and convert it to any Arrow-based backend (including Polars). As a short-term fix, we can add a method called [`to_polars()`](https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.from_arrow.html) next to `to_arrow()` and `to_pandas()`.
   
   In the longer term: unfortunately, we don't yet support pushing a predicate from a PyArrow dataset all the way down to Iceberg. Once that is done, we can easily add this to Polars as well. (Disclaimer: I'm more familiar with PyArrow at the moment, hence the PyArrow concepts.) I do think Polars and Iceberg would be an awesome combination, as Polars is also lazy by design, so you would only open the Parquet files that you actually need for your query.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
