Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/11 18:03:05 UTC

[GitHub] [iceberg] flyrain commented on issue #5245: Optimize the performance of MOR on Trino

flyrain commented on issue #5245:
URL: https://github.com/apache/iceberg/issues/5245#issuecomment-1180706273

   Hi @shidayang, how many delete files were there in your test?
   
   I benchmarked reads with multiple delete files; you can see the results here: https://github.com/apache/iceberg/pull/3287#issuecomment-960433304.
   ```
   with 25% rows are deleted and distribute these deletes to 1, 2, 5, 10 delete files
   ```
   The perf doesn’t degrade much with more delete files. Please be aware that the non-vectorized read uses the path that does not cache the filter. I am guessing Trino could differ from Spark in terms of read pattern.
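
For illustration only, here is a minimal Python sketch of the setup described above: 25% of row positions are deleted and spread over 1, 2, 5, or 10 delete files, then read back two ways. It is not taken from the linked benchmark or from Iceberg's code. The names `split_deletes`, `read_uncached`, and `read_cached` are hypothetical, and modeling "caching the filter" as merging all deletes into one set up front is an assumption about what that means.

```python
import random


def split_deletes(num_rows, delete_fraction, num_files, seed=0):
    """Pick a fraction of row positions to delete and spread them
    round-robin over num_files lists (stand-ins for delete files)."""
    rng = random.Random(seed)
    deleted = rng.sample(range(num_rows), int(num_rows * delete_fraction))
    files = [[] for _ in range(num_files)]
    for i, pos in enumerate(deleted):
        files[i % num_files].append(pos)
    return files


def read_uncached(num_rows, delete_files):
    """Assumed 'no cached filter' path: every row is checked
    against every delete file separately."""
    return [pos for pos in range(num_rows)
            if not any(pos in f for f in delete_files)]


def read_cached(num_rows, delete_files):
    """Assumed 'cached filter' path: merge all delete files into
    one set once, then do a single lookup per row."""
    merged = set()
    for f in delete_files:
        merged.update(f)
    return [pos for pos in range(num_rows) if pos not in merged]


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        files = split_deletes(1000, 0.25, k)
        live = read_cached(1000, files)
        print(k, "delete files ->", len(live), "live rows")
```

Both read paths return the same live rows; in this toy model only the merged-set path keeps per-row cost independent of the number of delete files.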


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

