Posted to issues@iceberg.apache.org by "Fokko (via GitHub)" <gi...@apache.org> on 2023/05/20 09:27:49 UTC

[GitHub] [iceberg] Fokko commented on issue #7663: Rename table property for enabling Parquet dictionary encoding

Fokko commented on issue #7663:
URL: https://github.com/apache/iceberg/issues/7663#issuecomment-1555872635

   > Dictionary encoding is useful for low cardinality columns, so the space difference between the two is negligible, with the tradeoff being deterministic lookups vs False positives from the bloom filter.
   
   If the column is low cardinality, the likelihood of false positives is also low (assuming a fixed bit size for the bloom filter). I'm not sure whether the dictionary is used for skipping data, for example, but I don't think that should influence this decision: if it isn't, then we should fix that :)
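
   To put rough numbers on the false-positive point, here is a minimal sketch using the textbook bloom filter approximation p ≈ (1 - e^(-k·n/m))^k. The filter size (8 KiB) and hash count (7) are made-up values for illustration, not Iceberg or Parquet defaults, and Parquet's split-block bloom filter will not match this formula exactly:

```python
import math


def bloom_false_positive_rate(num_distinct: int, num_bits: int, num_hashes: int) -> float:
    """Classic approximation of the false-positive rate: (1 - exp(-k*n/m))^k.

    num_distinct (n): distinct values inserted into the filter
    num_bits (m):     size of the filter in bits
    num_hashes (k):   number of hash functions
    """
    if num_distinct < 0 or num_bits <= 0 or num_hashes <= 0:
        raise ValueError("num_distinct must be >= 0; num_bits and num_hashes must be > 0")
    return (1.0 - math.exp(-num_hashes * num_distinct / num_bits)) ** num_hashes


# Hypothetical fixed-size filter: 8 KiB of bits and 7 hash functions (assumed values).
NUM_BITS = 8 * 1024 * 8
NUM_HASHES = 7

# With the size fixed, the rate grows with the number of distinct values.
for cardinality in (10, 1_000, 100_000):
    rate = bloom_false_positive_rate(cardinality, NUM_BITS, NUM_HASHES)
    print(f"distinct values = {cardinality:>7}: false-positive rate ~ {rate:.3g}")
```

   With the size held constant, the rate is effectively zero for a handful of distinct values and only approaches 1 once the cardinality far exceeds what the filter was sized for.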
   
   I don't feel strongly about this, but in the end, I think less configuration is better.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

