Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/13 01:36:25 UTC

[GitHub] [iceberg] srilman commented on issue #3220: [Python] support iceberg hadoop catalog in python library

srilman commented on issue #3220:
URL: https://github.com/apache/iceberg/issues/3220#issuecomment-1347623323

   FYI, I started working on an initial implementation but ran into problems trying to use only FileIO for the special operations that Hadoop tables need (I believe the Java implementation uses the Hadoop FS libraries instead).
   
   As developers of a Python-based compute engine, we would like to start using PyIceberg. However, we use Hadoop tables extensively for:
   (1) Local testing, since it is easy to view and modify metadata
   (2) Public read-only examples on remote file systems like S3
   
   I think we should be able to replace (1) with a SQLite DB + JDBCCatalog locally, or some equivalent SQLCatalog for Python. However, as long as there are catalogs that Python does not support, I think there should be some way for users to access their tables in some limited fashion (read-only, as Fokko suggested). Thoughts?
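
To make the read-only idea concrete, here is a minimal sketch of locating and loading the current metadata of a Hadoop table. It assumes the usual Hadoop table layout (a `metadata/version-hint.text` file holding the current version number, next to uncompressed `v<N>.metadata.json` files) and a local file system. The function name `load_hadoop_table_metadata` is hypothetical and not part of PyIceberg, and the sketch does not handle a missing version hint, compressed metadata files, or remote stores like S3.

```python
import json
from pathlib import Path


def load_hadoop_table_metadata(table_location: str) -> dict:
    """Return the current table metadata of a Hadoop table as a dict (read-only).

    Hypothetical helper for illustration; it only reads files and never commits.
    """
    metadata_dir = Path(table_location) / "metadata"

    # Assumption: version-hint.text exists and contains the current version as an integer.
    version = int((metadata_dir / "version-hint.text").read_text().strip())

    # Assumption: metadata files are uncompressed and named v<N>.metadata.json.
    metadata_file = metadata_dir / f"v{version}.metadata.json"
    with metadata_file.open() as f:
        return json.load(f)
```

Something like `load_hadoop_table_metadata("/tmp/warehouse/db/events")["location"]` (path made up for the example) would then return the table location recorded in the metadata. Because nothing is written, the atomic-rename commit logic that makes Hadoop tables hard to support through FileIO alone is not needed for this limited case.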
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

