Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/09/22 01:27:08 UTC

[GitHub] [iceberg] purefunctions commented on issue #5122: How can I use Iceberg in C++

purefunctions commented on issue #5122:
URL: https://github.com/apache/iceberg/issues/5122#issuecomment-1254395202

   @JanKaul Just got linked to this conversation from the Apache Arrow Discord chat! I just (today) created my own project https://github.com/trust-in-rust/rustberg with the modest goal of supporting a small subset of Iceberg operations through the DataFusion project. The iceberg-rs library that you linked was the first one I saw, but, as you discovered, it only supports metadata files for now. I also found https://github.com/joshuarobinson/rust_iceberg which actually reads more than metadata and supports querying through DataFusion. However, it doesn't support the split planning based on partition and schema evolution that one has to do when using Iceberg with evolving tables.
   
   @openinx Just like @snth, we also have a very small-scale, on-premises data lake based on Iceberg. We use Iceberg to make it easier to migrate the lake to a cloud later. We use Hive Metastore to store the latest metadata files and NFS for table storage. We use Spark (a single node is sufficient for our needs) to ingest data, and we provide a Python library for end users to query the data. The query needs are modest: the partitioning scheme handles most of the targeting, and the resulting data that needs further querying easily fits on one node. Currently we use PySpark in the user library, and we'd like to reduce the JVM/Spark startup time and the general slowness that PySpark introduces. Hence we're looking at Rust for an iceberg-lite kind of query API based on DataFusion.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

