Posted to issues@iceberg.apache.org by "Fokko (via GitHub)" <gi...@apache.org> on 2023/03/03 11:02:33 UTC

[GitHub] [iceberg] Fokko commented on issue #6505: Python: Infer Iceberg schema from the Parquet file

Fokko commented on issue #6505:
URL: https://github.com/apache/iceberg/issues/6505#issuecomment-1453357137

   @bigluck Thanks for giving it a try.
   
   > I'm confused because the query is a simple COUNT(*), and I thought pyiceberg would read the metadata stored in the metadata folder to get the number of records.
   
   Unfortunately, the current DuckDB implementation pulls in all the (relevant) data instead of answering from the metadata. Since there is no filter on the scan, this means the entire table.
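
For comparison, here is a rough sketch of what a metadata-only count could look like. It assumes the scan tasks expose a per-file record count as `task.file.record_count`, which is how I understand the tasks returned by `table.scan().plan_files()` in PyIceberg; treat that attribute path as an assumption and check it against your version. The helper name `count_records_from_metadata` and the stand-in tasks are hypothetical and only there to make the snippet runnable. The result is only a true row count when the table has no delete files.

```python
from types import SimpleNamespace
from typing import Iterable


def count_records_from_metadata(tasks: Iterable) -> int:
    """Sum the per-file record counts carried by the scan tasks.

    Assumption: each task exposes `.file.record_count`, taken from the
    manifest metadata, so no data files are opened. Rows removed by
    delete files are NOT subtracted.
    """
    return sum(task.file.record_count for task in tasks)


# Hypothetical stand-ins for real scan tasks, so the sketch runs without a catalog.
fake_tasks = [
    SimpleNamespace(file=SimpleNamespace(record_count=n)) for n in (100, 250, 50)
]
total = count_records_from_metadata(fake_tasks)
print(total)
```

With a real table, the same helper would be called as `count_records_from_metadata(table.scan().plan_files())`, again assuming the attribute names above.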
   
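   How big is the table? Could the process be running out of memory? Running `echo $?` right after the process exits prints its exit code, which might indicate an out-of-memory situation (for example, 137 usually means the process was killed with SIGKILL, which is what the Linux out-of-memory killer sends).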
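
As a rough aid for reading that exit code, the sketch below decodes a shell-style status. It relies on the common shell convention that a process terminated by signal N is reported as 128 + N, so 137 maps to SIGKILL (9). The function name `describe_exit_code` is hypothetical, and a SIGKILL only hints at an out-of-memory kill; it does not prove it.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a shell-style exit status (the value printed by `echo $?`)."""
    if code == 0:
        return "success"
    if code > 128:
        # Shells report "killed by signal N" as 128 + N.
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a known signal number on this platform.
            return f"error exit code {code}"
        hint = " (possibly the out-of-memory killer)" if name == "SIGKILL" else ""
        return f"terminated by {name}{hint}"
    return f"error exit code {code}"


print(describe_exit_code(137))
```

On Linux, checking the kernel log (for example with `dmesg`) is the usual way to confirm whether the out-of-memory killer was actually involved.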


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

