Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/03/17 08:57:43 UTC

[GitHub] [iceberg] ankitkpandey opened a new issue #2342: Control on the number of data files created

ankitkpandey opened a new issue #2342:
URL: https://github.com/apache/iceberg/issues/2342


   Hi, I'm using Spark with Iceberg to capture differential data through Spark SQL's MERGE INTO command.
   But I end up with around 200 data files of roughly 1 MB each. Is there a configuration property I can use to reduce the number of files?
   
   Also, my current use case doesn't require time travel or old snapshots, so is there a way to delete them automatically while merging the new data, perhaps keeping only the latest snapshot?
   I have looked extensively through the docs but could only find methods that use the Table and Actions APIs.
   
   Any help would be appreciated.
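
One possible reading of the numbers above, offered as a sketch rather than a confirmed diagnosis: 200 is the default value of Spark's `spark.sql.shuffle.partitions`, and MERGE INTO shuffles data, so each shuffle task may be writing one small file. Assuming that is the cause, the arithmetic below estimates a partition count that would bring files closer to a chosen size. The helper name `suggested_shuffle_partitions` and the 128 MB target are hypothetical, picked only for illustration.

```python
import math

MB = 1024 * 1024


def suggested_shuffle_partitions(total_bytes: int, target_file_bytes: int) -> int:
    """Estimate how many shuffle partitions give files near the target size.

    Hypothetical helper: assumes each shuffle partition writes one data file.
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and target_file_bytes > 0")
    # Always keep at least one partition, even for empty input.
    return max(1, math.ceil(total_bytes / target_file_bytes))


# Figures from the report: about 200 files of about 1 MB each.
observed_bytes = 200 * 1 * MB
# Assumed target file size; adjust to the workload.
target_file_bytes = 128 * MB

partitions = suggested_shuffle_partitions(observed_bytes, target_file_bytes)
# Emit the Spark SQL statement that would apply the estimate to a session.
print(f"SET spark.sql.shuffle.partitions={partitions};")
```

With these figures the estimate is 2 partitions. On a partitioned table a single task can still write one file per table partition it touches, so the real file count may be higher than the estimate.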


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] ankitkpandey closed issue #2342: Control on the number of data files created

Posted by GitBox <gi...@apache.org>.
ankitkpandey closed issue #2342:
URL: https://github.com/apache/iceberg/issues/2342


   

