Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2018/12/19 10:25:17 UTC

[GitHub] mgaido91 opened a new pull request #23350: [SPARK-26366][SQL][BACKPORT-2.3] ReplaceExceptWithFilter should consider NULL as False

URL: https://github.com/apache/spark/pull/23350
 
 
   ## What changes were proposed in this pull request?
   
   In `ReplaceExceptWithFilter` we do not properly handle the case in which the condition evaluates to NULL. In that case, since negating NULL still yields NULL, the assumption that negating the condition returns all the rows which did not satisfy it no longer holds: rows for which the condition evaluates to NULL may not be returned. This happens when the constraints inferred by `InferFiltersFromConstraints` are not enough, as is the case with `OR` conditions.
   
   The rule also had problems with non-deterministic conditions: in such a scenario, applying the rule would change the probability of the output.
   
   The PR fixes these problems by:
    - returning False for the condition when it is NULL (this way, all the rows which did not satisfy it are returned);
    - avoiding any transformation when the condition is non-deterministic.
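
The NULL problem and the first fix can be illustrated with a small model of SQL three-valued logic. This is only a sketch in plain Python, not Spark code: `None` stands in for NULL, the table contents and the `OR` condition are made up for illustration, and `coalesce(cond, False)` is an assumed way of expressing "return False when the condition is NULL" (the exact Catalyst expression used by the patch is not shown in this message).

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[Optional[int], Optional[int]]  # hypothetical columns (a1, a2)


def sql_eq(x: Optional[int], y: Optional[int]) -> Optional[bool]:
    """SQL '=': the result is NULL if either operand is NULL."""
    return None if x is None or y is None else x == y


def sql_or(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """SQL 'OR' under three-valued logic."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def sql_not(v: Optional[bool]) -> Optional[bool]:
    """SQL 'NOT': negating NULL is still NULL."""
    return None if v is None else not v


def coalesce(v: Optional[bool], default: bool) -> bool:
    """Replace NULL with a default value."""
    return default if v is None else v


def sql_filter(rows: List[Row], pred: Callable[[Row], Optional[bool]]) -> List[Row]:
    """A WHERE clause keeps a row only when the predicate is exactly TRUE."""
    return [r for r in rows if pred(r) is True]


def distinct(rows: List[Row]) -> List[Row]:
    """DISTINCT, preserving first-seen order."""
    return list(dict.fromkeys(rows))


def cond(row: Row) -> Optional[bool]:
    """Illustrative condition: a1 = 5 OR a2 = 3."""
    a1, a2 = row
    return sql_or(sql_eq(a1, 5), sql_eq(a2, 3))


table: List[Row] = [(5, 12), (1, 3), (None, 12), (2, 12)]

# Reference semantics: SELECT * FROM t EXCEPT SELECT * FROM t WHERE cond
right = sql_filter(table, cond)
except_result = [r for r in distinct(table) if r not in right]

# Naive rewrite: SELECT DISTINCT * FROM t WHERE NOT cond
# The row (None, 12) is lost because NOT NULL is NULL, which a filter rejects.
naive_result = distinct(sql_filter(table, lambda r: sql_not(cond(r))))

# Rewrite treating NULL as False before negating.
fixed_result = distinct(
    sql_filter(table, lambda r: sql_not(coalesce(cond(r), False)))
)

print(except_result)
print(naive_result)
print(fixed_result)
```

In this model the naive rewrite drops the row whose condition evaluates to NULL, while the rewrite that treats NULL as False matches the `EXCEPT` result.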
   
   ## How was this patch tested?
   
   Added UTs.
   

