You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/06/15 21:09:11 UTC

[GitHub] [iceberg] kbendick commented on pull request #5049: Spark: Fixes NPE when convert NOT IN filter

kbendick commented on PR #5049:
URL: https://github.com/apache/iceberg/pull/5049#issuecomment-1156945817

   I recall now when working on the `NOT_STARTS_WITH` that we wanted to leave it up to the engine to handle 3 value boolean logic.
   
   The example query, `DELETE FROM table WHERE id NOT IN (1, null)`, should probably be written as `DELETE FROM table WHERE id is not null and id != 1`. Spark will add the not null-check itself often times.
   
   With 3-valued boolean logic, for comparison purposes every null is a different null and nothing equals null. For those who might want a refresher (like I often need), I like this website's explanation of 3-valued boolean logic involving nulls: https://modern-sql.com/concept/three-valued-logic
   
   To avoid the headache that can incur, we explicitly avoid 3-valued boolean logic. There would be better filter pushdown and more advantage taken of Iceberg's statistics by writing the query with explicit `AND is [not] null` checking too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org