Posted to issues@iceberg.apache.org by "ConeyLiu (via GitHub)" <gi...@apache.org> on 2023/06/24 05:29:16 UTC

[GitHub] [iceberg] ConeyLiu opened a new pull request, #7898: Spark 3.4: Support NOT_EQ for V2 filters

ConeyLiu opened a new pull request, #7898:
URL: https://github.com/apache/iceberg/pull/7898

   This splits the `NOT_EQ` changes out of #7886.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1242595290


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +230,57 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> UnboundPredicate<T> handleEQPredicate(Predicate predicate) {

Review Comment:
   Sorry I was probably not so clear.
   
   Glad to see the lambda removed, but I think it's still not clean that there's a repeated if/else in the common code (the predicate EQ or NOT_EQ is checked twice: in the case statement and then in the common method). So how about just making sure each layer checks once?
   
   ```java
   case EQ: {
     PredicateValues predicateValues = predicateValues(predicate);
     return handleEqual(predicateValues);
   }
   case NOT_EQ: {
     PredicateValues predicateValues = predicateValues(predicate);
     return handleNotEqual(predicateValues);
   }

   class PredicateValues {
     Object value;
     Column column;
   }

   // predicateValues returns the pair if one side is a literal and the other is a column, null otherwise
   // handle[Not]Equal returns null if predicateValues is null
   ```





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1245030301


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,46 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> PredicateChildren<T> predicateChildren(Predicate predicate) {
+    T value;
+    UnboundTerm<T> term;
+    if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
+      term = ref(SparkUtil.toColumnName(leftChild(predicate)));
+      value = convertLiteral(rightChild(predicate));
+    } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
+      term = ref(SparkUtil.toColumnName(rightChild(predicate)));
+      value = convertLiteral(leftChild(predicate));
+    } else {
+      return null;
+    }
+
+    return new PredicateChildren<>(term, value);
+  }
+
+  private static <T> UnboundPredicate<T> handleEqual(PredicateChildren<T> children) {

Review Comment:
   Updated.





[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248352867


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,27 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
+  private static UnboundPredicate<Object> handleEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return isNull(term);
+    }

Review Comment:
   +1





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248120744


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,28 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return isNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return isNaN(children.first());
+    } else {
+      return equal(children.first(), children.second());
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(
+      Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return notNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return notNaN(children.first());
     } else {
-      return equal(attribute, value);
+      return notEqual(children.first(), children.second());

Review Comment:
   I see, and you plan to pass `UnboundTerm` instead of String in a following PR.





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1247609599


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,28 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return isNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return isNaN(children.first());
+    } else {
+      return equal(children.first(), children.second());
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(
+      Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return notNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return notNaN(children.first());
     } else {
-      return equal(attribute, value);
+      return notEqual(children.first(), children.second());

Review Comment:
   The string will be wrapped into a `NamedReference`. There are two APIs, as follows:
   ```java
     public static <T> UnboundPredicate<T> equal(String name, T value) {
       return new UnboundPredicate<>(Expression.Operation.EQ, ref(name), value);
     }
   
     public static <T> UnboundPredicate<T> equal(UnboundTerm<T> expr, T value) {
       return new UnboundPredicate<>(Expression.Operation.EQ, expr, value);
     }
   ```
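For illustration, here is a minimal self-contained model (hypothetical stand-in methods, not the real Iceberg classes) of why the two overloads build the same predicate: the String overload simply wraps the name with `ref(name)` and delegates to the term overload.

```java
// Hypothetical stand-ins sketching the relationship between the two overloads.
// Java erasure would not allow two methods named equal(String, Object) here,
// so the term-based overload is named equalTerm in this model.
public class OverloadSketch {
  // stand-in for ref(name): wraps a column name as a reference
  static String ref(String name) {
    return "ref(" + name + ")";
  }

  // stand-in for equal(String name, T value): delegates to the term overload
  static String equal(String name, Object value) {
    return equalTerm(ref(name), value);
  }

  // stand-in for equal(UnboundTerm<T> expr, T value)
  static String equalTerm(String term, Object value) {
    return "EQ(" + term + ", " + value + ")";
  }

  public static void main(String[] args) {
    // Both calls produce the same predicate text
    System.out.println(equal("id", 1));           // EQ(ref(id), 1)
    System.out.println(equalTerm(ref("id"), 1));  // EQ(ref(id), 1)
  }
}
```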





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248440612


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,27 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
+  private static UnboundPredicate<Object> handleEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return isNull(term);
+    }
+
+    if (NaNUtil.isNaN(value)) {
+      return isNaN(term);
+    } else {
+      return equal(term, value);
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return notNull(term);
+    }

Review Comment:
   Updated
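The dispatch these final methods perform can be sketched standalone (a simplified model with plain strings, not the real Iceberg expressions; the real code uses `NaNUtil.isNaN`): a null value maps to IS [NOT] NULL, NaN to IS [NOT] NAN, and everything else to a plain comparison.

```java
public class EqDispatchSketch {
  // stand-in for NaNUtil.isNaN: true for a boxed Double/Float NaN
  static boolean isNaN(Object value) {
    return (value instanceof Double && ((Double) value).isNaN())
        || (value instanceof Float && ((Float) value).isNaN());
  }

  // null-safe equality ("<=>"): null selects null rows; NaN needs its own predicate
  static String handleEqual(String term, Object value) {
    if (value == null) {
      return "is_null(" + term + ")";
    }
    return isNaN(value) ? "is_nan(" + term + ")" : term + " = " + value;
  }

  static String handleNotEqual(String term, Object value) {
    if (value == null) {
      return "not_null(" + term + ")";
    }
    return isNaN(value) ? "not_nan(" + term + ")" : term + " != " + value;
  }

  public static void main(String[] args) {
    System.out.println(handleEqual("col", null));        // is_null(col)
    System.out.println(handleEqual("col", Double.NaN));  // is_nan(col)
    System.out.println(handleNotEqual("col", 5));        // col != 5
  }
}
```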





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1245029106


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -152,31 +159,30 @@ public static Expression convert(Predicate predicate) {
           }
 
         case EQ: // used for both eq and null-safe-eq
-          Object value;
-          String columnName;
-          if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
-            columnName = SparkUtil.toColumnName(leftChild(predicate));
-            value = convertLiteral(rightChild(predicate));
-          } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
-            columnName = SparkUtil.toColumnName(rightChild(predicate));
-            value = convertLiteral(leftChild(predicate));
-          } else {
+          PredicateChildren<Object> eqChildren = predicateChildren(predicate);
+          if (eqChildren == null) {
             return null;
           }
 
-          if (predicate.name().equals(EQ)) {
-            // comparison with null in normal equality is always null. this is probably a mistake.
+          if (EQ.equals(predicate.name())) {

Review Comment:
   We should compute the `predicateChildren` first, and then use the returned `children` to determine whether the `value` is null.
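For context, a simplified self-contained sketch (hypothetical stand-in types, not the real Spark/Iceberg classes) of what `predicateChildren` does: it accepts the column on either side of the comparison, normalizes "literal op column" to (column, literal), and returns null when neither side is a plain column reference.

```java
public class PredicateChildrenSketch {
  // A predicate child is either a column reference (isRef) or a literal value
  record Child(boolean isRef, Object value) {}
  record Predicate(Child left, Child right) {}
  // stand-in for the (term, value) pair the real method produces
  record Children(String term, Object value) {}

  static Children predicateChildren(Predicate p) {
    if (p.left().isRef() && !p.right().isRef()) {
      return new Children((String) p.left().value(), p.right().value());
    } else if (p.right().isRef() && !p.left().isRef()) {
      // normalize "literal = column" to (column, literal)
      return new Children((String) p.right().value(), p.left().value());
    } else {
      return null; // e.g. column = column, or literal = literal
    }
  }

  public static void main(String[] args) {
    Predicate colEqLit = new Predicate(new Child(true, "id"), new Child(false, 5));
    Predicate litEqCol = new Predicate(new Child(false, 5), new Child(true, "id"));
    System.out.println(predicateChildren(colEqLit)); // Children[term=id, value=5]
    System.out.println(predicateChildren(litEqCol)); // Children[term=id, value=5]
  }
}
```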






[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1247552059


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,22 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static Pair<UnboundTerm<Object>, Object> predicateChildren(Predicate predicate) {
+    Object value;
+    UnboundTerm<Object> term;

Review Comment:
   Optional: my thought is that it would be clearer for `term` to be of type `NamedReference` instead of the superclass `UnboundTerm`; not sure what you think.



##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,28 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return isNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return isNaN(children.first());
+    } else {
+      return equal(children.first(), children.second());
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(

Review Comment:
   Sorry for not mentioning this earlier, but can we make these two methods take actual arguments rather than a Pair, so it's easier to read?
   
   Then just invoke in the caller: handle[Not]Equal(children.first(), children.second())



##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,28 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return isNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return isNaN(children.first());
+    } else {
+      return equal(children.first(), children.second());
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(
+      Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return notNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return notNaN(children.first());
     } else {
-      return equal(attribute, value);
+      return notEqual(children.first(), children.second());

Review Comment:
   I had a question: was this originally a bug? It was returning `equal(string, value)`, not `equal(NamedReference, value)`.





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1247619639


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,28 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return isNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return isNaN(children.first());
+    } else {
+      return equal(children.first(), children.second());
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(

Review Comment:
   Good idea, updated the methods to be clearer.





[GitHub] [iceberg] rdblue merged pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue merged PR #7898:
URL: https://github.com/apache/iceberg/pull/7898





[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248353665


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,22 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static Pair<UnboundTerm<Object>, Object> predicateChildren(Predicate predicate) {
+    Object value;
+    UnboundTerm<Object> term;
+    if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
+      term = ref(SparkUtil.toColumnName(leftChild(predicate)));
+      value = convertLiteral(rightChild(predicate));
+    } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
+      term = ref(SparkUtil.toColumnName(rightChild(predicate)));
+      value = convertLiteral(leftChild(predicate));
+    } else {
+      return null;

Review Comment:
   Optional: I am not sure there is a lot of value in keeping `term` and `value` declared outside, given that each branch returns and they are not used anywhere below. I'd probably format it like this; it's up to you, though:
   
   ```
   if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
     UnboundTerm<Object> term = ref(SparkUtil.toColumnName(leftChild(predicate)));
     Object value = convertLiteral(rightChild(predicate));
     return Pair.of(term, value);
   
   } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
     UnboundTerm<Object> term = ref(SparkUtil.toColumnName(rightChild(predicate)));
     Object value = convertLiteral(leftChild(predicate));
     return Pair.of(term, value);
   
   } else {
     return null;
   }
   ```





[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248353864


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkV2Filters.java:
##########
@@ -39,6 +39,7 @@
 
 public class TestSparkV2Filters {
 
+  @SuppressWarnings("checkstyle:MethodLength")

Review Comment:
   I'd be okay doing that in a follow-up change as well.





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1240615096


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -152,30 +160,22 @@ public static Expression convert(Predicate predicate) {
           }
 
         case EQ: // used for both eq and null-safe-eq
-          Object value;
-          String columnName;
-          if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
-            columnName = SparkUtil.toColumnName(leftChild(predicate));
-            value = convertLiteral(rightChild(predicate));
-          } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
-            columnName = SparkUtil.toColumnName(rightChild(predicate));
-            value = convertLiteral(leftChild(predicate));
-          } else {
-            return null;
-          }
-
-          if (predicate.name().equals(EQ)) {
-            // comparison with null in normal equality is always null. this is probably a mistake.
-            Preconditions.checkNotNull(
-                value, "Expression is always false (eq is not null-safe): %s", predicate);
-            return handleEqual(columnName, value);
-          } else { // "<=>"
-            if (value == null) {
-              return isNull(columnName);
-            } else {
-              return handleEqual(columnName, value);
-            }
-          }
+          return handleEQPredicate(
+              predicate,
+              (uboundTerm, value) -> {

Review Comment:
   Nit: unbound



##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +245,47 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> UnboundPredicate<T> handleEQPredicate(
+      Predicate predicate, BiFunction<UnboundTerm<T>, T, UnboundPredicate<T>> func) {

Review Comment:
   I think this lambda makes it harder to read. My vote is that we just return a Pair (or better, a modeled class of value/columnName) and invoke the function directly after this method.
   
   Probably the method name can be improved as well, as `handle` doesn't tell much; it could be called something like `columnAndValue`, or something similar.



##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -152,30 +160,22 @@ public static Expression convert(Predicate predicate) {
           }
 
         case EQ: // used for both eq and null-safe-eq
-          Object value;
-          String columnName;
-          if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
-            columnName = SparkUtil.toColumnName(leftChild(predicate));
-            value = convertLiteral(rightChild(predicate));
-          } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
-            columnName = SparkUtil.toColumnName(rightChild(predicate));
-            value = convertLiteral(leftChild(predicate));
-          } else {
-            return null;
-          }
-
-          if (predicate.name().equals(EQ)) {
-            // comparison with null in normal equality is always null. this is probably a mistake.
-            Preconditions.checkNotNull(
-                value, "Expression is always false (eq is not null-safe): %s", predicate);
-            return handleEqual(columnName, value);
-          } else { // "<=>"
-            if (value == null) {
-              return isNull(columnName);
-            } else {
-              return handleEqual(columnName, value);
-            }
-          }
+          return handleEQPredicate(
+              predicate,
+              (uboundTerm, value) -> {
+                if (predicate.name().equals(EQ)) {
+                  // comparison with null in normal equality is always null. this is probably a
+                  // mistake.
+                  Preconditions.checkNotNull(
+                      value, "Expression is always false (eq is not null-safe): %s", predicate);
+                  return handleEqual(uboundTerm, value);

Review Comment:
   Seems we can just pull it out of the if/else?





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1240637935


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -152,30 +160,22 @@ public static Expression convert(Predicate predicate) {
           }
 
         case EQ: // used for both eq and null-safe-eq
-          Object value;
-          String columnName;
-          if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
-            columnName = SparkUtil.toColumnName(leftChild(predicate));
-            value = convertLiteral(rightChild(predicate));
-          } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
-            columnName = SparkUtil.toColumnName(rightChild(predicate));
-            value = convertLiteral(leftChild(predicate));
-          } else {
-            return null;
-          }
-
-          if (predicate.name().equals(EQ)) {
-            // comparison with null in normal equality is always null. this is probably a mistake.
-            Preconditions.checkNotNull(
-                value, "Expression is always false (eq is not null-safe): %s", predicate);
-            return handleEqual(columnName, value);
-          } else { // "<=>"
-            if (value == null) {
-              return isNull(columnName);
-            } else {
-              return handleEqual(columnName, value);
-            }
-          }
+          return handleEQPredicate(
+              predicate,
+              (uboundTerm, value) -> {

Review Comment:
   Updated





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1242595290


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +230,57 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> UnboundPredicate<T> handleEQPredicate(Predicate predicate) {

Review Comment:
   Sorry I was probably not so clear.
   
   Glad to see the lambda removed, but I think it's still not clean that there's a repeated if/else in the common code (the predicate is checked for EQ or NOT_EQ twice: once in the case statement and again in the common method). So how about making sure each layer checks only once?
   
   ```
   
   case EQ:
      PredicateChildren children = predicateChildren(predicate);
      return handleEqual(children);
   case NOT_EQ:  
      PredicateChildren children = predicateChildren(predicate);
      return handleNotEqual(children);
    
   class PredicateChildren {
     Object value;
     Column column;
   }
   
   // predicateChildren returns if one side is literal and other is column, null otherwise
   // handle[Not]Equal returns null if predicateChildren is null
   ```
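To make the shape concrete, here is a minimal self-contained sketch of the two-layer split described above. Iceberg's Spark expression types are replaced by plain stand-ins (a `String` plays the column reference, any other object plays the literal); the `PredicateChildren`, `predicateChildren`, and `handle*` names mirror the suggestion and are not the actual Iceberg API:

```java
// Minimal stand-ins for the Spark/Iceberg types involved; only the
// control flow of the suggestion is modeled here.
final class PredicateChildren {
    final String column;   // stand-in for the column reference
    final Object value;    // stand-in for the converted literal

    PredicateChildren(String column, Object value) {
        this.column = column;
        this.value = value;
    }
}

final class FilterSketch {
    // Returns the (column, literal) pair if exactly one side is a column
    // reference and the other a literal; null otherwise.
    static PredicateChildren predicateChildren(Object left, Object right) {
        if (left instanceof String && !(right instanceof String)) {
            return new PredicateChildren((String) left, right);
        } else if (right instanceof String && !(left instanceof String)) {
            return new PredicateChildren((String) right, left);
        } else {
            return null; // unsupported shape; caller converts to null
        }
    }

    // Each handler checks its input once; the EQ vs NOT_EQ decision
    // stays in the caller's case statement, so nothing is checked twice.
    static String handleEqual(PredicateChildren children) {
        if (children == null) {
            return null;
        }
        return "eq(" + children.column + ", " + children.value + ")";
    }

    static String handleNotEqual(PredicateChildren children) {
        if (children == null) {
            return null;
        }
        return "not_eq(" + children.column + ", " + children.value + ")";
    }
}
```

The column can appear on either side of the operator, so `predicateChildren` normalizes both orderings before the handlers run.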





[GitHub] [iceberg] rdblue commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1244369323


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,46 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> PredicateChildren<T> predicateChildren(Predicate predicate) {
+    T value;
+    UnboundTerm<T> term;
+    if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
+      term = ref(SparkUtil.toColumnName(leftChild(predicate)));
+      value = convertLiteral(rightChild(predicate));
+    } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
+      term = ref(SparkUtil.toColumnName(rightChild(predicate)));
+      value = convertLiteral(leftChild(predicate));
+    } else {
+      return null;
+    }
+
+    return new PredicateChildren<>(term, value);
+  }
+
+  private static <T> UnboundPredicate<T> handleEqual(PredicateChildren<T> children) {

Review Comment:
   Can you move this back to where it was so that the diff is readable?





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1246952518


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,22 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> Pair<UnboundTerm<T>, T> predicateChildren(Predicate predicate) {

Review Comment:
   Question: should we remove `<T>`? It looks like all callers just use `Object` in the end, and `<T>` clutters things a bit.





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1247552059


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,22 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static Pair<UnboundTerm<Object>, Object> predicateChildren(Predicate predicate) {
+    Object value;
+    UnboundTerm<Object> term;

Review Comment:
   Optional: my thought is it would be clearer for the term to be of type 'NamedReference' instead of the superclass 'UnboundTerm', because it's more specific. Not sure what you think?





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1242595290


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +230,57 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> UnboundPredicate<T> handleEQPredicate(Predicate predicate) {

Review Comment:
   Sorry I was probably not so clear.
   
   Glad to see the lambda removed, but I think it's still not clean that there's a repeated if/else in the common code (the predicate is checked for EQ or NOT_EQ twice: once in the case statement and again in the common method). So how about making sure each layer checks only once?
   
   ```
   
   case EQ:
      PredicateValues predicateValues = predicateValues(predicate);
      handleEqual(predicateValues);
   case NOT_EQ:  
      PredicateValues predicateValues = predicateValues(predicate);
      return handleNotEqual(predicateValues);
    
   class PredicateValues {
     Object value;
     Column column;
   }
   
   // predicateValues returns if one side is literal and other is column, null otherwise
   // handle[Not]Equal returns null if predicateValues is null
   ```





[GitHub] [iceberg] ConeyLiu commented on pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#issuecomment-1621902465

   Thanks @rdblue @szehon-ho @RussellSpitzer @aokolnychyi !




[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1244109346


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -152,31 +159,30 @@ public static Expression convert(Predicate predicate) {
           }
 
         case EQ: // used for both eq and null-safe-eq
-          Object value;
-          String columnName;
-          if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
-            columnName = SparkUtil.toColumnName(leftChild(predicate));
-            value = convertLiteral(rightChild(predicate));
-          } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
-            columnName = SparkUtil.toColumnName(rightChild(predicate));
-            value = convertLiteral(leftChild(predicate));
-          } else {
+          PredicateChildren<Object> eqChildren = predicateChildren(predicate);
+          if (eqChildren == null) {
             return null;
           }
 
-          if (predicate.name().equals(EQ)) {
-            // comparison with null in normal equality is always null. this is probably a mistake.
+          if (EQ.equals(predicate.name())) {

Review Comment:
   Probably this can move to the top of the case as well, to avoid unnecessary work if the Precondition fails?





[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248352958


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,27 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
+  private static UnboundPredicate<Object> handleEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return isNull(term);
+    }
+
+    if (NaNUtil.isNaN(value)) {
+      return isNaN(term);
+    } else {
+      return equal(term, value);
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return notNull(term);
+    }

Review Comment:
   Minor: Same here, let's use if/else if/else.
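As a sketch of the single if / else if / else chain being asked for, with plain stand-ins for Iceberg's term and predicate types (the string results are illustrative only, not the real `UnboundPredicate` values):

```java
// Null/NaN-aware NOT_EQ dispatch written as one if / else if / else
// chain; a String stands in for the term, a Double for the literal.
final class NotEqualSketch {
    static String handleNotEqual(String term, Double value) {
        if (value == null) {
            // a NOT_EQ comparison against null becomes "is not null"
            return "not_null(" + term + ")";
        } else if (value.isNaN()) {
            // mirrors the isNaN branch of the equal case
            return "not_nan(" + term + ")";
        } else {
            return "not_eq(" + term + ", " + value + ")";
        }
    }
}
```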





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1240638322


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +230,57 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> UnboundPredicate<T> handleEQPredicate(Predicate predicate) {

Review Comment:
   It would be more accurate as `handleEQOrNotEQPredicate` or `handleEqualOrNotEqualPredicate`. However, those names read a little strange and are too long.





[GitHub] [iceberg] rdblue commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1244368724


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -328,4 +367,14 @@ private static boolean isSupportedInPredicate(Predicate predicate) {
       return Arrays.stream(predicate.children()).skip(1).allMatch(SparkV2Filters::isLiteral);
     }
   }
+
+  private static class PredicateChildren<T> {

Review Comment:
   I don't think it should be necessary to create a custom class for this. Can you use `Pair` instead?
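For illustration, a minimal immutable pair with the same `of`/`first`/`second` shape as Iceberg's `org.apache.iceberg.util.Pair` (this is a stand-in written out so the sketch stays self-contained, not the library class itself):

```java
// Minimal immutable pair, standing in for org.apache.iceberg.util.Pair.
// It holds the (term, literal) children without a one-off holder class.
final class Pair<L, R> {
    private final L first;
    private final R second;

    private Pair(L first, R second) {
        this.first = first;
        this.second = second;
    }

    static <L, R> Pair<L, R> of(L first, R second) {
        return new Pair<>(first, second);
    }

    L first() {
        return first;
    }

    R second() {
        return second;
    }
}
```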





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1246952682


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -282,18 +304,35 @@ private static boolean isLiteral(org.apache.spark.sql.connector.expressions.Expr
     return expr instanceof Literal;
   }
 
-  private static Object convertLiteral(Literal<?> literal) {
+  @SuppressWarnings("unchecked")
+  private static <T> T convertLiteral(Literal<?> literal) {

Review Comment:
   Same



##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,22 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> Pair<UnboundTerm<T>, T> predicateChildren(Predicate predicate) {

Review Comment:
   Question: should we remove `<T>`? It looks like all callers just use `Object` in the end, and `<T>` clutters things a bit.





[GitHub] [iceberg] rdblue commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1251323173


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkV2Filters.java:
##########
@@ -48,145 +48,150 @@ public void testV2Filters() {
     attrMap.put("`d`.b.`dd```", "d.b.dd`");
     attrMap.put("a.`aa```.c", "a.aa`.c");
 
-    attrMap.forEach(
-        (quoted, unquoted) -> {
-          NamedReference namedReference = FieldReference.apply(quoted);
-          org.apache.spark.sql.connector.expressions.Expression[] attrOnly =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference};
-
-          LiteralValue value = new LiteralValue(1, DataTypes.IntegerType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndValue =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, value};
-          org.apache.spark.sql.connector.expressions.Expression[] valueAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {value, namedReference};
-
-          Predicate isNull = new Predicate("IS_NULL", attrOnly);
-          Expression expectedIsNull = Expressions.isNull(unquoted);
-          Expression actualIsNull = SparkV2Filters.convert(isNull);
-          Assert.assertEquals(
-              "IsNull must match", expectedIsNull.toString(), actualIsNull.toString());
-
-          Predicate isNotNull = new Predicate("IS_NOT_NULL", attrOnly);
-          Expression expectedIsNotNull = Expressions.notNull(unquoted);
-          Expression actualIsNotNull = SparkV2Filters.convert(isNotNull);
-          Assert.assertEquals(
-              "IsNotNull must match", expectedIsNotNull.toString(), actualIsNotNull.toString());
-
-          Predicate lt1 = new Predicate("<", attrAndValue);
-          Expression expectedLt1 = Expressions.lessThan(unquoted, 1);
-          Expression actualLt1 = SparkV2Filters.convert(lt1);
-          Assert.assertEquals("LessThan must match", expectedLt1.toString(), actualLt1.toString());
-
-          Predicate lt2 = new Predicate("<", valueAndAttr);
-          Expression expectedLt2 = Expressions.greaterThan(unquoted, 1);
-          Expression actualLt2 = SparkV2Filters.convert(lt2);
-          Assert.assertEquals("LessThan must match", expectedLt2.toString(), actualLt2.toString());
-
-          Predicate ltEq1 = new Predicate("<=", attrAndValue);
-          Expression expectedLtEq1 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualLtEq1 = SparkV2Filters.convert(ltEq1);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq1.toString(), actualLtEq1.toString());
-
-          Predicate ltEq2 = new Predicate("<=", valueAndAttr);
-          Expression expectedLtEq2 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualLtEq2 = SparkV2Filters.convert(ltEq2);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq2.toString(), actualLtEq2.toString());
-
-          Predicate gt1 = new Predicate(">", attrAndValue);
-          Expression expectedGt1 = Expressions.greaterThan(unquoted, 1);
-          Expression actualGt1 = SparkV2Filters.convert(gt1);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt1.toString(), actualGt1.toString());
-
-          Predicate gt2 = new Predicate(">", valueAndAttr);
-          Expression expectedGt2 = Expressions.lessThan(unquoted, 1);
-          Expression actualGt2 = SparkV2Filters.convert(gt2);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt2.toString(), actualGt2.toString());
-
-          Predicate gtEq1 = new Predicate(">=", attrAndValue);
-          Expression expectedGtEq1 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualGtEq1 = SparkV2Filters.convert(gtEq1);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq1.toString(), actualGtEq1.toString());
-
-          Predicate gtEq2 = new Predicate(">=", valueAndAttr);
-          Expression expectedGtEq2 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualGtEq2 = SparkV2Filters.convert(gtEq2);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq2.toString(), actualGtEq2.toString());
-
-          Predicate eq1 = new Predicate("=", attrAndValue);
-          Expression expectedEq1 = Expressions.equal(unquoted, 1);
-          Expression actualEq1 = SparkV2Filters.convert(eq1);
-          Assert.assertEquals("EqualTo must match", expectedEq1.toString(), actualEq1.toString());
-
-          Predicate eq2 = new Predicate("=", valueAndAttr);
-          Expression expectedEq2 = Expressions.equal(unquoted, 1);
-          Expression actualEq2 = SparkV2Filters.convert(eq2);
-          Assert.assertEquals("EqualTo must match", expectedEq2.toString(), actualEq2.toString());
-
-          Predicate eqNullSafe1 = new Predicate("<=>", attrAndValue);
-          Expression expectedEqNullSafe1 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe1 = SparkV2Filters.convert(eqNullSafe1);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe1.toString(),
-              actualEqNullSafe1.toString());
-
-          Predicate eqNullSafe2 = new Predicate("<=>", valueAndAttr);
-          Expression expectedEqNullSafe2 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe2 = SparkV2Filters.convert(eqNullSafe2);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe2.toString(),
-              actualEqNullSafe2.toString());
-
-          LiteralValue str =
-              new LiteralValue(UTF8String.fromString("iceberg"), DataTypes.StringType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndStr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, str};
-          Predicate startsWith = new Predicate("STARTS_WITH", attrAndStr);
-          Expression expectedStartsWith = Expressions.startsWith(unquoted, "iceberg");
-          Expression actualStartsWith = SparkV2Filters.convert(startsWith);
-          Assert.assertEquals(
-              "StartsWith must match", expectedStartsWith.toString(), actualStartsWith.toString());
-
-          Predicate in = new Predicate("IN", attrAndValue);
-          Expression expectedIn = Expressions.in(unquoted, 1);
-          Expression actualIn = SparkV2Filters.convert(in);
-          Assert.assertEquals("In must match", expectedIn.toString(), actualIn.toString());
-
-          Predicate and = new And(lt1, eq1);
-          Expression expectedAnd = Expressions.and(expectedLt1, expectedEq1);
-          Expression actualAnd = SparkV2Filters.convert(and);
-          Assert.assertEquals("And must match", expectedAnd.toString(), actualAnd.toString());
-
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {
-                namedReference, namedReference
-              };
-          Predicate invalid = new Predicate("<", attrAndAttr);
-          Predicate andWithInvalidLeft = new And(invalid, eq1);
-          Expression convertedAnd = SparkV2Filters.convert(andWithInvalidLeft);
-          Assert.assertEquals("And must match", convertedAnd, null);
-
-          Predicate or = new Or(lt1, eq1);
-          Expression expectedOr = Expressions.or(expectedLt1, expectedEq1);
-          Expression actualOr = SparkV2Filters.convert(or);
-          Assert.assertEquals("Or must match", expectedOr.toString(), actualOr.toString());
-
-          Predicate orWithInvalidLeft = new Or(invalid, eq1);
-          Expression convertedOr = SparkV2Filters.convert(orWithInvalidLeft);
-          Assert.assertEquals("Or must match", convertedOr, null);
-
-          Predicate not = new Not(lt1);
-          Expression expectedNot = Expressions.not(expectedLt1);
-          Expression actualNot = SparkV2Filters.convert(not);
-          Assert.assertEquals("Not must match", expectedNot.toString(), actualNot.toString());
-        });
+    attrMap.forEach(this::testV2Filter);
+  }
+
+  private void testV2Filter(String quoted, String unquoted) {
+    NamedReference namedReference = FieldReference.apply(quoted);
+    org.apache.spark.sql.connector.expressions.Expression[] attrOnly =
+        new org.apache.spark.sql.connector.expressions.Expression[] {namedReference};
+
+    LiteralValue value = new LiteralValue(1, DataTypes.IntegerType);
+    org.apache.spark.sql.connector.expressions.Expression[] attrAndValue =
+        new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, value};
+    org.apache.spark.sql.connector.expressions.Expression[] valueAndAttr =
+        new org.apache.spark.sql.connector.expressions.Expression[] {value, namedReference};
+
+    Predicate isNull = new Predicate("IS_NULL", attrOnly);
+    Expression expectedIsNull = Expressions.isNull(unquoted);
+    Expression actualIsNull = SparkV2Filters.convert(isNull);
+    Assert.assertEquals("IsNull must match", expectedIsNull.toString(), actualIsNull.toString());
+
+    Predicate isNotNull = new Predicate("IS_NOT_NULL", attrOnly);
+    Expression expectedIsNotNull = Expressions.notNull(unquoted);
+    Expression actualIsNotNull = SparkV2Filters.convert(isNotNull);
+    Assert.assertEquals(
+        "IsNotNull must match", expectedIsNotNull.toString(), actualIsNotNull.toString());
+
+    Predicate lt1 = new Predicate("<", attrAndValue);
+    Expression expectedLt1 = Expressions.lessThan(unquoted, 1);
+    Expression actualLt1 = SparkV2Filters.convert(lt1);
+    Assert.assertEquals("LessThan must match", expectedLt1.toString(), actualLt1.toString());
+
+    Predicate lt2 = new Predicate("<", valueAndAttr);
+    Expression expectedLt2 = Expressions.greaterThan(unquoted, 1);
+    Expression actualLt2 = SparkV2Filters.convert(lt2);
+    Assert.assertEquals("LessThan must match", expectedLt2.toString(), actualLt2.toString());
+
+    Predicate ltEq1 = new Predicate("<=", attrAndValue);
+    Expression expectedLtEq1 = Expressions.lessThanOrEqual(unquoted, 1);
+    Expression actualLtEq1 = SparkV2Filters.convert(ltEq1);
+    Assert.assertEquals(
+        "LessThanOrEqual must match", expectedLtEq1.toString(), actualLtEq1.toString());
+
+    Predicate ltEq2 = new Predicate("<=", valueAndAttr);
+    Expression expectedLtEq2 = Expressions.greaterThanOrEqual(unquoted, 1);
+    Expression actualLtEq2 = SparkV2Filters.convert(ltEq2);
+    Assert.assertEquals(
+        "LessThanOrEqual must match", expectedLtEq2.toString(), actualLtEq2.toString());
+
+    Predicate gt1 = new Predicate(">", attrAndValue);
+    Expression expectedGt1 = Expressions.greaterThan(unquoted, 1);
+    Expression actualGt1 = SparkV2Filters.convert(gt1);
+    Assert.assertEquals("GreaterThan must match", expectedGt1.toString(), actualGt1.toString());
+
+    Predicate gt2 = new Predicate(">", valueAndAttr);
+    Expression expectedGt2 = Expressions.lessThan(unquoted, 1);
+    Expression actualGt2 = SparkV2Filters.convert(gt2);
+    Assert.assertEquals("GreaterThan must match", expectedGt2.toString(), actualGt2.toString());
+
+    Predicate gtEq1 = new Predicate(">=", attrAndValue);
+    Expression expectedGtEq1 = Expressions.greaterThanOrEqual(unquoted, 1);
+    Expression actualGtEq1 = SparkV2Filters.convert(gtEq1);
+    Assert.assertEquals(
+        "GreaterThanOrEqual must match", expectedGtEq1.toString(), actualGtEq1.toString());
+
+    Predicate gtEq2 = new Predicate(">=", valueAndAttr);
+    Expression expectedGtEq2 = Expressions.lessThanOrEqual(unquoted, 1);
+    Expression actualGtEq2 = SparkV2Filters.convert(gtEq2);
+    Assert.assertEquals(
+        "GreaterThanOrEqual must match", expectedGtEq2.toString(), actualGtEq2.toString());
+
+    Predicate eq1 = new Predicate("=", attrAndValue);
+    Expression expectedEq1 = Expressions.equal(unquoted, 1);
+    Expression actualEq1 = SparkV2Filters.convert(eq1);
+    Assert.assertEquals("EqualTo must match", expectedEq1.toString(), actualEq1.toString());
+
+    Predicate eq2 = new Predicate("=", valueAndAttr);
+    Expression expectedEq2 = Expressions.equal(unquoted, 1);
+    Expression actualEq2 = SparkV2Filters.convert(eq2);
+    Assert.assertEquals("EqualTo must match", expectedEq2.toString(), actualEq2.toString());
+
+    Predicate notEq1 = new Predicate("<>", attrAndValue);
+    Expression expectedNotEq1 = Expressions.notEqual(unquoted, 1);
+    Expression actualNotEq1 = SparkV2Filters.convert(notEq1);
+    Assert.assertEquals(
+        "NotEqualTo must match", expectedNotEq1.toString(), actualNotEq1.toString());
+
+    Predicate notEq2 = new Predicate("<>", valueAndAttr);

Review Comment:
   Are there cases for null and NaN values? I think there should be since there is special handling for them.
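A self-contained illustration of why NaN (and null) deserve dedicated test cases — this is plain Java floating-point semantics, independent of the Iceberg API, and is the reason the converter rewrites `equal(col, NaN)` to an `isNaN` predicate:

```java
public class NaNComparisonDemo {
    public static void main(String[] args) {
        double nan = Double.NaN;
        // NaN never compares equal to anything, including itself, so a plain
        // equality predicate against NaN would match no rows.
        System.out.println(nan == nan);                        // false
        // Double.compare, by contrast, treats NaN as equal to itself.
        System.out.println(Double.compare(nan, nan) == 0);     // true
    }
}
```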



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1252177062


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkV2Filters.java:
##########
@@ -48,145 +48,150 @@ public void testV2Filters() {
     attrMap.put("`d`.b.`dd```", "d.b.dd`");
     attrMap.put("a.`aa```.c", "a.aa`.c");
 
-    attrMap.forEach(
-        (quoted, unquoted) -> {
-          NamedReference namedReference = FieldReference.apply(quoted);
-          org.apache.spark.sql.connector.expressions.Expression[] attrOnly =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference};
-
-          LiteralValue value = new LiteralValue(1, DataTypes.IntegerType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndValue =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, value};
-          org.apache.spark.sql.connector.expressions.Expression[] valueAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {value, namedReference};
-
-          Predicate isNull = new Predicate("IS_NULL", attrOnly);
-          Expression expectedIsNull = Expressions.isNull(unquoted);
-          Expression actualIsNull = SparkV2Filters.convert(isNull);
-          Assert.assertEquals(
-              "IsNull must match", expectedIsNull.toString(), actualIsNull.toString());
-
-          Predicate isNotNull = new Predicate("IS_NOT_NULL", attrOnly);
-          Expression expectedIsNotNull = Expressions.notNull(unquoted);
-          Expression actualIsNotNull = SparkV2Filters.convert(isNotNull);
-          Assert.assertEquals(
-              "IsNotNull must match", expectedIsNotNull.toString(), actualIsNotNull.toString());
-
-          Predicate lt1 = new Predicate("<", attrAndValue);
-          Expression expectedLt1 = Expressions.lessThan(unquoted, 1);
-          Expression actualLt1 = SparkV2Filters.convert(lt1);
-          Assert.assertEquals("LessThan must match", expectedLt1.toString(), actualLt1.toString());
-
-          Predicate lt2 = new Predicate("<", valueAndAttr);
-          Expression expectedLt2 = Expressions.greaterThan(unquoted, 1);
-          Expression actualLt2 = SparkV2Filters.convert(lt2);
-          Assert.assertEquals("LessThan must match", expectedLt2.toString(), actualLt2.toString());
-
-          Predicate ltEq1 = new Predicate("<=", attrAndValue);
-          Expression expectedLtEq1 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualLtEq1 = SparkV2Filters.convert(ltEq1);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq1.toString(), actualLtEq1.toString());
-
-          Predicate ltEq2 = new Predicate("<=", valueAndAttr);
-          Expression expectedLtEq2 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualLtEq2 = SparkV2Filters.convert(ltEq2);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq2.toString(), actualLtEq2.toString());
-
-          Predicate gt1 = new Predicate(">", attrAndValue);
-          Expression expectedGt1 = Expressions.greaterThan(unquoted, 1);
-          Expression actualGt1 = SparkV2Filters.convert(gt1);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt1.toString(), actualGt1.toString());
-
-          Predicate gt2 = new Predicate(">", valueAndAttr);
-          Expression expectedGt2 = Expressions.lessThan(unquoted, 1);
-          Expression actualGt2 = SparkV2Filters.convert(gt2);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt2.toString(), actualGt2.toString());
-
-          Predicate gtEq1 = new Predicate(">=", attrAndValue);
-          Expression expectedGtEq1 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualGtEq1 = SparkV2Filters.convert(gtEq1);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq1.toString(), actualGtEq1.toString());
-
-          Predicate gtEq2 = new Predicate(">=", valueAndAttr);
-          Expression expectedGtEq2 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualGtEq2 = SparkV2Filters.convert(gtEq2);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq2.toString(), actualGtEq2.toString());
-
-          Predicate eq1 = new Predicate("=", attrAndValue);
-          Expression expectedEq1 = Expressions.equal(unquoted, 1);
-          Expression actualEq1 = SparkV2Filters.convert(eq1);
-          Assert.assertEquals("EqualTo must match", expectedEq1.toString(), actualEq1.toString());
-
-          Predicate eq2 = new Predicate("=", valueAndAttr);
-          Expression expectedEq2 = Expressions.equal(unquoted, 1);
-          Expression actualEq2 = SparkV2Filters.convert(eq2);
-          Assert.assertEquals("EqualTo must match", expectedEq2.toString(), actualEq2.toString());
-
-          Predicate eqNullSafe1 = new Predicate("<=>", attrAndValue);
-          Expression expectedEqNullSafe1 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe1 = SparkV2Filters.convert(eqNullSafe1);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe1.toString(),
-              actualEqNullSafe1.toString());
-
-          Predicate eqNullSafe2 = new Predicate("<=>", valueAndAttr);
-          Expression expectedEqNullSafe2 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe2 = SparkV2Filters.convert(eqNullSafe2);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe2.toString(),
-              actualEqNullSafe2.toString());
-
-          LiteralValue str =
-              new LiteralValue(UTF8String.fromString("iceberg"), DataTypes.StringType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndStr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, str};
-          Predicate startsWith = new Predicate("STARTS_WITH", attrAndStr);
-          Expression expectedStartsWith = Expressions.startsWith(unquoted, "iceberg");
-          Expression actualStartsWith = SparkV2Filters.convert(startsWith);
-          Assert.assertEquals(
-              "StartsWith must match", expectedStartsWith.toString(), actualStartsWith.toString());
-
-          Predicate in = new Predicate("IN", attrAndValue);
-          Expression expectedIn = Expressions.in(unquoted, 1);
-          Expression actualIn = SparkV2Filters.convert(in);
-          Assert.assertEquals("In must match", expectedIn.toString(), actualIn.toString());
-
-          Predicate and = new And(lt1, eq1);
-          Expression expectedAnd = Expressions.and(expectedLt1, expectedEq1);
-          Expression actualAnd = SparkV2Filters.convert(and);
-          Assert.assertEquals("And must match", expectedAnd.toString(), actualAnd.toString());
-
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {
-                namedReference, namedReference
-              };
-          Predicate invalid = new Predicate("<", attrAndAttr);
-          Predicate andWithInvalidLeft = new And(invalid, eq1);
-          Expression convertedAnd = SparkV2Filters.convert(andWithInvalidLeft);
-          Assert.assertEquals("And must match", convertedAnd, null);
-
-          Predicate or = new Or(lt1, eq1);
-          Expression expectedOr = Expressions.or(expectedLt1, expectedEq1);
-          Expression actualOr = SparkV2Filters.convert(or);
-          Assert.assertEquals("Or must match", expectedOr.toString(), actualOr.toString());
-
-          Predicate orWithInvalidLeft = new Or(invalid, eq1);
-          Expression convertedOr = SparkV2Filters.convert(orWithInvalidLeft);
-          Assert.assertEquals("Or must match", convertedOr, null);
-
-          Predicate not = new Not(lt1);
-          Expression expectedNot = Expressions.not(expectedLt1);
-          Expression actualNot = SparkV2Filters.convert(not);
-          Assert.assertEquals("Not must match", expectedNot.toString(), actualNot.toString());
-        });
+    attrMap.forEach(this::testV2Filter);
+  }
+
+  private void testV2Filter(String quoted, String unquoted) {
+    NamedReference namedReference = FieldReference.apply(quoted);
+    org.apache.spark.sql.connector.expressions.Expression[] attrOnly =
+        new org.apache.spark.sql.connector.expressions.Expression[] {namedReference};
+
+    LiteralValue value = new LiteralValue(1, DataTypes.IntegerType);
+    org.apache.spark.sql.connector.expressions.Expression[] attrAndValue =
+        new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, value};
+    org.apache.spark.sql.connector.expressions.Expression[] valueAndAttr =
+        new org.apache.spark.sql.connector.expressions.Expression[] {value, namedReference};
+
+    Predicate isNull = new Predicate("IS_NULL", attrOnly);
+    Expression expectedIsNull = Expressions.isNull(unquoted);
+    Expression actualIsNull = SparkV2Filters.convert(isNull);
+    Assert.assertEquals("IsNull must match", expectedIsNull.toString(), actualIsNull.toString());
+
+    Predicate isNotNull = new Predicate("IS_NOT_NULL", attrOnly);
+    Expression expectedIsNotNull = Expressions.notNull(unquoted);
+    Expression actualIsNotNull = SparkV2Filters.convert(isNotNull);
+    Assert.assertEquals(
+        "IsNotNull must match", expectedIsNotNull.toString(), actualIsNotNull.toString());
+
+    Predicate lt1 = new Predicate("<", attrAndValue);
+    Expression expectedLt1 = Expressions.lessThan(unquoted, 1);
+    Expression actualLt1 = SparkV2Filters.convert(lt1);
+    Assert.assertEquals("LessThan must match", expectedLt1.toString(), actualLt1.toString());
+
+    Predicate lt2 = new Predicate("<", valueAndAttr);
+    Expression expectedLt2 = Expressions.greaterThan(unquoted, 1);
+    Expression actualLt2 = SparkV2Filters.convert(lt2);
+    Assert.assertEquals("LessThan must match", expectedLt2.toString(), actualLt2.toString());
+
+    Predicate ltEq1 = new Predicate("<=", attrAndValue);
+    Expression expectedLtEq1 = Expressions.lessThanOrEqual(unquoted, 1);
+    Expression actualLtEq1 = SparkV2Filters.convert(ltEq1);
+    Assert.assertEquals(
+        "LessThanOrEqual must match", expectedLtEq1.toString(), actualLtEq1.toString());
+
+    Predicate ltEq2 = new Predicate("<=", valueAndAttr);
+    Expression expectedLtEq2 = Expressions.greaterThanOrEqual(unquoted, 1);
+    Expression actualLtEq2 = SparkV2Filters.convert(ltEq2);
+    Assert.assertEquals(
+        "LessThanOrEqual must match", expectedLtEq2.toString(), actualLtEq2.toString());
+
+    Predicate gt1 = new Predicate(">", attrAndValue);
+    Expression expectedGt1 = Expressions.greaterThan(unquoted, 1);
+    Expression actualGt1 = SparkV2Filters.convert(gt1);
+    Assert.assertEquals("GreaterThan must match", expectedGt1.toString(), actualGt1.toString());
+
+    Predicate gt2 = new Predicate(">", valueAndAttr);
+    Expression expectedGt2 = Expressions.lessThan(unquoted, 1);
+    Expression actualGt2 = SparkV2Filters.convert(gt2);
+    Assert.assertEquals("GreaterThan must match", expectedGt2.toString(), actualGt2.toString());
+
+    Predicate gtEq1 = new Predicate(">=", attrAndValue);
+    Expression expectedGtEq1 = Expressions.greaterThanOrEqual(unquoted, 1);
+    Expression actualGtEq1 = SparkV2Filters.convert(gtEq1);
+    Assert.assertEquals(
+        "GreaterThanOrEqual must match", expectedGtEq1.toString(), actualGtEq1.toString());
+
+    Predicate gtEq2 = new Predicate(">=", valueAndAttr);
+    Expression expectedGtEq2 = Expressions.lessThanOrEqual(unquoted, 1);
+    Expression actualGtEq2 = SparkV2Filters.convert(gtEq2);
+    Assert.assertEquals(
+        "GreaterThanOrEqual must match", expectedGtEq2.toString(), actualGtEq2.toString());
+
+    Predicate eq1 = new Predicate("=", attrAndValue);
+    Expression expectedEq1 = Expressions.equal(unquoted, 1);
+    Expression actualEq1 = SparkV2Filters.convert(eq1);
+    Assert.assertEquals("EqualTo must match", expectedEq1.toString(), actualEq1.toString());
+
+    Predicate eq2 = new Predicate("=", valueAndAttr);
+    Expression expectedEq2 = Expressions.equal(unquoted, 1);
+    Expression actualEq2 = SparkV2Filters.convert(eq2);
+    Assert.assertEquals("EqualTo must match", expectedEq2.toString(), actualEq2.toString());
+
+    Predicate notEq1 = new Predicate("<>", attrAndValue);
+    Expression expectedNotEq1 = Expressions.notEqual(unquoted, 1);
+    Expression actualNotEq1 = SparkV2Filters.convert(notEq1);
+    Assert.assertEquals(
+        "NotEqualTo must match", expectedNotEq1.toString(), actualNotEq1.toString());
+
+    Predicate notEq2 = new Predicate("<>", valueAndAttr);

Review Comment:
   Added





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1251157591


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkV2Filters.java:
##########
@@ -48,145 +48,150 @@ public void testV2Filters() {
     attrMap.put("`d`.b.`dd```", "d.b.dd`");
     attrMap.put("a.`aa```.c", "a.aa`.c");
 
-    attrMap.forEach(
-        (quoted, unquoted) -> {
-          NamedReference namedReference = FieldReference.apply(quoted);
-          org.apache.spark.sql.connector.expressions.Expression[] attrOnly =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference};
-
-          LiteralValue value = new LiteralValue(1, DataTypes.IntegerType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndValue =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, value};
-          org.apache.spark.sql.connector.expressions.Expression[] valueAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {value, namedReference};
-
-          Predicate isNull = new Predicate("IS_NULL", attrOnly);
-          Expression expectedIsNull = Expressions.isNull(unquoted);
-          Expression actualIsNull = SparkV2Filters.convert(isNull);
-          Assert.assertEquals(
-              "IsNull must match", expectedIsNull.toString(), actualIsNull.toString());
-
-          Predicate isNotNull = new Predicate("IS_NOT_NULL", attrOnly);
-          Expression expectedIsNotNull = Expressions.notNull(unquoted);
-          Expression actualIsNotNull = SparkV2Filters.convert(isNotNull);
-          Assert.assertEquals(
-              "IsNotNull must match", expectedIsNotNull.toString(), actualIsNotNull.toString());
-
-          Predicate lt1 = new Predicate("<", attrAndValue);
-          Expression expectedLt1 = Expressions.lessThan(unquoted, 1);
-          Expression actualLt1 = SparkV2Filters.convert(lt1);
-          Assert.assertEquals("LessThan must match", expectedLt1.toString(), actualLt1.toString());
-
-          Predicate lt2 = new Predicate("<", valueAndAttr);
-          Expression expectedLt2 = Expressions.greaterThan(unquoted, 1);
-          Expression actualLt2 = SparkV2Filters.convert(lt2);
-          Assert.assertEquals("LessThan must match", expectedLt2.toString(), actualLt2.toString());
-
-          Predicate ltEq1 = new Predicate("<=", attrAndValue);
-          Expression expectedLtEq1 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualLtEq1 = SparkV2Filters.convert(ltEq1);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq1.toString(), actualLtEq1.toString());
-
-          Predicate ltEq2 = new Predicate("<=", valueAndAttr);
-          Expression expectedLtEq2 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualLtEq2 = SparkV2Filters.convert(ltEq2);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq2.toString(), actualLtEq2.toString());
-
-          Predicate gt1 = new Predicate(">", attrAndValue);
-          Expression expectedGt1 = Expressions.greaterThan(unquoted, 1);
-          Expression actualGt1 = SparkV2Filters.convert(gt1);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt1.toString(), actualGt1.toString());
-
-          Predicate gt2 = new Predicate(">", valueAndAttr);
-          Expression expectedGt2 = Expressions.lessThan(unquoted, 1);
-          Expression actualGt2 = SparkV2Filters.convert(gt2);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt2.toString(), actualGt2.toString());
-
-          Predicate gtEq1 = new Predicate(">=", attrAndValue);
-          Expression expectedGtEq1 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualGtEq1 = SparkV2Filters.convert(gtEq1);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq1.toString(), actualGtEq1.toString());
-
-          Predicate gtEq2 = new Predicate(">=", valueAndAttr);
-          Expression expectedGtEq2 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualGtEq2 = SparkV2Filters.convert(gtEq2);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq2.toString(), actualGtEq2.toString());
-
-          Predicate eq1 = new Predicate("=", attrAndValue);
-          Expression expectedEq1 = Expressions.equal(unquoted, 1);
-          Expression actualEq1 = SparkV2Filters.convert(eq1);
-          Assert.assertEquals("EqualTo must match", expectedEq1.toString(), actualEq1.toString());
-
-          Predicate eq2 = new Predicate("=", valueAndAttr);
-          Expression expectedEq2 = Expressions.equal(unquoted, 1);
-          Expression actualEq2 = SparkV2Filters.convert(eq2);
-          Assert.assertEquals("EqualTo must match", expectedEq2.toString(), actualEq2.toString());
-
-          Predicate eqNullSafe1 = new Predicate("<=>", attrAndValue);
-          Expression expectedEqNullSafe1 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe1 = SparkV2Filters.convert(eqNullSafe1);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe1.toString(),
-              actualEqNullSafe1.toString());
-
-          Predicate eqNullSafe2 = new Predicate("<=>", valueAndAttr);
-          Expression expectedEqNullSafe2 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe2 = SparkV2Filters.convert(eqNullSafe2);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe2.toString(),
-              actualEqNullSafe2.toString());
-
-          LiteralValue str =
-              new LiteralValue(UTF8String.fromString("iceberg"), DataTypes.StringType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndStr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, str};
-          Predicate startsWith = new Predicate("STARTS_WITH", attrAndStr);
-          Expression expectedStartsWith = Expressions.startsWith(unquoted, "iceberg");
-          Expression actualStartsWith = SparkV2Filters.convert(startsWith);
-          Assert.assertEquals(
-              "StartsWith must match", expectedStartsWith.toString(), actualStartsWith.toString());
-
-          Predicate in = new Predicate("IN", attrAndValue);
-          Expression expectedIn = Expressions.in(unquoted, 1);
-          Expression actualIn = SparkV2Filters.convert(in);
-          Assert.assertEquals("In must match", expectedIn.toString(), actualIn.toString());
-
-          Predicate and = new And(lt1, eq1);
-          Expression expectedAnd = Expressions.and(expectedLt1, expectedEq1);
-          Expression actualAnd = SparkV2Filters.convert(and);
-          Assert.assertEquals("And must match", expectedAnd.toString(), actualAnd.toString());
-
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {
-                namedReference, namedReference
-              };
-          Predicate invalid = new Predicate("<", attrAndAttr);
-          Predicate andWithInvalidLeft = new And(invalid, eq1);
-          Expression convertedAnd = SparkV2Filters.convert(andWithInvalidLeft);
-          Assert.assertEquals("And must match", convertedAnd, null);
-
-          Predicate or = new Or(lt1, eq1);
-          Expression expectedOr = Expressions.or(expectedLt1, expectedEq1);
-          Expression actualOr = SparkV2Filters.convert(or);
-          Assert.assertEquals("Or must match", expectedOr.toString(), actualOr.toString());
-
-          Predicate orWithInvalidLeft = new Or(invalid, eq1);
-          Expression convertedOr = SparkV2Filters.convert(orWithInvalidLeft);
-          Assert.assertEquals("Or must match", convertedOr, null);
-
-          Predicate not = new Not(lt1);
-          Expression expectedNot = Expressions.not(expectedLt1);
-          Expression actualNot = SparkV2Filters.convert(not);
-          Assert.assertEquals("Not must match", expectedNot.toString(), actualNot.toString());
-        });
+    attrMap.forEach(this::testV2Filter);
+  }
+
+  private void testV2Filter(String quoted, String unquoted) {

Review Comment:
   I think this is a bit hacky to avoid the length warning, as it doesn't add more readability than the suppression does.  I think @RussellSpitzer may have meant individual test methods, like:
   
   ```
   @Test
   public void testIsNullV2Filter() {
   }
   
   @Test
   public void testIsNotNullV2Filter() {
   }
   ```
   
   Maybe we can revert it to what you had, and we can make a cleanup issue for this?





[GitHub] [iceberg] rdblue commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1251322690


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,23 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return isNull(term);
+    } else if (NaNUtil.isNaN(value)) {
+      return isNaN(term);
+    } else {
+      return equal(term, value);
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return notNull(term);

Review Comment:
   `<>` is not null-safe, right? So this should return `null` or throw an exception if the value is null.
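A minimal stand-alone sketch of the semantics under discussion. The real code uses Iceberg's `UnboundTerm`/`Expressions` API; the string results here are illustrative stand-ins, and returning `null` means "predicate not convertible, don't push down":

```java
public class NotEqNullSemantics {
    // Illustrative stand-in for SparkV2Filters.handleNotEqual: `<>` is not
    // null-safe, so a null literal cannot be converted (x <> null is always
    // unknown in SQL) and conversion is refused rather than rewritten to notNull.
    static String handleNotEqual(String term, Object value) {
        if (value == null) {
            return null; // not convertible; caller falls back to a Spark-side filter
        } else if (value instanceof Double && ((Double) value).isNaN()) {
            return "notNaN(" + term + ")"; // the real code uses NaNUtil.isNaN, which also covers Float
        } else {
            return "notEqual(" + term + ", " + value + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(handleNotEqual("col", null));        // null
        System.out.println(handleNotEqual("col", Double.NaN));  // notNaN(col)
        System.out.println(handleNotEqual("col", 1));           // notEqual(col, 1)
    }
}
```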





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1247610921


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,22 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static Pair<UnboundTerm<Object>, Object> predicateChildren(Predicate predicate) {
+    Object value;
+    UnboundTerm<Object> term;

Review Comment:
   I would like this to be `UnboundTerm` because it could be an `UnboundTransform` as well in follow-up PRs.





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248320570


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkV2Filters.java:
##########
@@ -39,6 +39,7 @@
 
 public class TestSparkV2Filters {
 
+  @SuppressWarnings("checkstyle:MethodLength")

Review Comment:
   We should probably break this test up by operator :)





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1247550348


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,28 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return isNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return isNaN(children.first());
+    } else {
+      return equal(children.first(), children.second());
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(

Review Comment:
   Sorry for not mentioning earlier, but can we make these two methods take the actual arguments instead of a Pair, so the method itself is easier to read?

   Then just invoke it in the caller: `handle[Not]Equal(children.first(), children.second())`
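A hedged sketch of the suggested shape, with simplified stand-in types (the real code uses Iceberg's `Pair` and `UnboundTerm`; `java.util.AbstractMap.SimpleEntry` stands in for the pair here):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class PredicateChildrenDemo {
    // The handler takes the unpacked arguments rather than the Pair,
    // so its body reads on its own without first()/second() noise.
    static String handleEqual(String term, Object value) {
        return value == null ? "isNull(" + term + ")" : "equal(" + term + ", " + value + ")";
    }

    public static void main(String[] args) {
        // The caller unpacks the pair once and forwards the pieces.
        Map.Entry<String, Object> children = new SimpleEntry<>("col", 1);
        System.out.println(handleEqual(children.getKey(), children.getValue())); // equal(col, 1)
    }
}
```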





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1245026734


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -152,31 +159,30 @@ public static Expression convert(Predicate predicate) {
           }
 
         case EQ: // used for both eq and null-safe-eq
-          Object value;
-          String columnName;
-          if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
-            columnName = SparkUtil.toColumnName(leftChild(predicate));
-            value = convertLiteral(rightChild(predicate));
-          } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
-            columnName = SparkUtil.toColumnName(rightChild(predicate));
-            value = convertLiteral(leftChild(predicate));
-          } else {
+          PredicateChildren<Object> eqChildren = predicateChildren(predicate);
+          if (eqChildren == null) {
             return null;
           }
 
-          if (predicate.name().equals(EQ)) {
-            // comparison with null in normal equality is always null. this is probably a mistake.
+          if (EQ.equals(predicate.name())) {
+            // comparison with null in normal equality is always null. this is probably a
+            // mistake.

Review Comment:
   Reverted





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1245029587


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -328,4 +367,14 @@ private static boolean isSupportedInPredicate(Predicate predicate) {
       return Arrays.stream(predicate.children()).skip(1).allMatch(SparkV2Filters::isLiteral);
     }
   }
+
+  private static class PredicateChildren<T> {

Review Comment:
   Changed to Pair.





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1240638008


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +245,47 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> UnboundPredicate<T> handleEQPredicate(
+      Predicate predicate, BiFunction<UnboundTerm<T>, T, UnboundPredicate<T>> func) {

Review Comment:
   Inlined the lambda in the `handleEQPredicate`.







[GitHub] [iceberg] rdblue commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1244025696


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -152,31 +159,30 @@ public static Expression convert(Predicate predicate) {
           }
 
         case EQ: // used for both eq and null-safe-eq
-          Object value;
-          String columnName;
-          if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
-            columnName = SparkUtil.toColumnName(leftChild(predicate));
-            value = convertLiteral(rightChild(predicate));
-          } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
-            columnName = SparkUtil.toColumnName(rightChild(predicate));
-            value = convertLiteral(leftChild(predicate));
-          } else {
+          PredicateChildren<Object> eqChildren = predicateChildren(predicate);
+          if (eqChildren == null) {
             return null;
           }
 
-          if (predicate.name().equals(EQ)) {
-            // comparison with null in normal equality is always null. this is probably a mistake.
+          if (EQ.equals(predicate.name())) {
+            // comparison with null in normal equality is always null. this is probably a
+            // mistake.

Review Comment:
   Why did the formatting of this line change? It looks like it doesn't need to.





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1243803595


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +230,57 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> UnboundPredicate<T> handleEQPredicate(Predicate predicate) {

Review Comment:
   Thanks for the good suggestion. Updated based on the comments.





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248440325


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkV2Filters.java:
##########
@@ -39,6 +39,7 @@
 
 public class TestSparkV2Filters {
 
+  @SuppressWarnings("checkstyle:MethodLength")

Review Comment:
   Split the internal test block into the function `testV2Filter`.



##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,27 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
+  private static UnboundPredicate<Object> handleEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return isNull(term);
+    }

Review Comment:
   Updated





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248440812


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,22 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static Pair<UnboundTerm<Object>, Object> predicateChildren(Predicate predicate) {
+    Object value;
+    UnboundTerm<Object> term;
+    if (isRef(leftChild(predicate)) && isLiteral(rightChild(predicate))) {
+      term = ref(SparkUtil.toColumnName(leftChild(predicate)));
+      value = convertLiteral(rightChild(predicate));
+    } else if (isRef(rightChild(predicate)) && isLiteral(leftChild(predicate))) {
+      term = ref(SparkUtil.toColumnName(rightChild(predicate)));
+      value = convertLiteral(leftChild(predicate));
+    } else {
+      return null;

Review Comment:
   Updated
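   For context, the normalization this `predicateChildren` helper performs can be sketched standalone. This is a hedged stand-in, not the PR's code: plain `String` objects play the role of Spark's `NamedReference` and any other object plays the role of a `Literal`, and an `Object[]` stands in for Iceberg's `Pair`. The point is only the commutative handling: `(ref, literal)` and `(literal, ref)` both normalize to the same `(term, value)` pair, and any other shape yields `null`.

   ```java
   public class PredicateChildrenSketch {
     // Normalize a binary predicate's two children into (term, value) order.
     // A String argument stands in for a column reference; anything else
     // stands in for a literal. Returns null for unsupported shapes, which
     // mirrors how the converter bails out and returns null overall.
     static Object[] predicateChildren(Object left, Object right) {
       if (left instanceof String && !(right instanceof String)) {
         return new Object[] {left, right}; // ref on the left, literal on the right
       } else if (right instanceof String && !(left instanceof String)) {
         return new Object[] {right, left}; // literal on the left, ref on the right
       } else {
         return null; // e.g. ref/ref or literal/literal: cannot convert
       }
     }
   }
   ```

   Both `col = 1` and `1 = col` therefore reach the same downstream handling, which is why the EQ/NOT_EQ cases can share this helper.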





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1252178334


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkV2Filters.java:
##########
@@ -48,145 +48,150 @@ public void testV2Filters() {
     attrMap.put("`d`.b.`dd```", "d.b.dd`");
     attrMap.put("a.`aa```.c", "a.aa`.c");
 
-    attrMap.forEach(
-        (quoted, unquoted) -> {
-          NamedReference namedReference = FieldReference.apply(quoted);
-          org.apache.spark.sql.connector.expressions.Expression[] attrOnly =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference};
-
-          LiteralValue value = new LiteralValue(1, DataTypes.IntegerType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndValue =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, value};
-          org.apache.spark.sql.connector.expressions.Expression[] valueAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {value, namedReference};
-
-          Predicate isNull = new Predicate("IS_NULL", attrOnly);
-          Expression expectedIsNull = Expressions.isNull(unquoted);
-          Expression actualIsNull = SparkV2Filters.convert(isNull);
-          Assert.assertEquals(
-              "IsNull must match", expectedIsNull.toString(), actualIsNull.toString());
-
-          Predicate isNotNull = new Predicate("IS_NOT_NULL", attrOnly);
-          Expression expectedIsNotNull = Expressions.notNull(unquoted);
-          Expression actualIsNotNull = SparkV2Filters.convert(isNotNull);
-          Assert.assertEquals(
-              "IsNotNull must match", expectedIsNotNull.toString(), actualIsNotNull.toString());
-
-          Predicate lt1 = new Predicate("<", attrAndValue);
-          Expression expectedLt1 = Expressions.lessThan(unquoted, 1);
-          Expression actualLt1 = SparkV2Filters.convert(lt1);
-          Assert.assertEquals("LessThan must match", expectedLt1.toString(), actualLt1.toString());
-
-          Predicate lt2 = new Predicate("<", valueAndAttr);
-          Expression expectedLt2 = Expressions.greaterThan(unquoted, 1);
-          Expression actualLt2 = SparkV2Filters.convert(lt2);
-          Assert.assertEquals("LessThan must match", expectedLt2.toString(), actualLt2.toString());
-
-          Predicate ltEq1 = new Predicate("<=", attrAndValue);
-          Expression expectedLtEq1 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualLtEq1 = SparkV2Filters.convert(ltEq1);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq1.toString(), actualLtEq1.toString());
-
-          Predicate ltEq2 = new Predicate("<=", valueAndAttr);
-          Expression expectedLtEq2 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualLtEq2 = SparkV2Filters.convert(ltEq2);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq2.toString(), actualLtEq2.toString());
-
-          Predicate gt1 = new Predicate(">", attrAndValue);
-          Expression expectedGt1 = Expressions.greaterThan(unquoted, 1);
-          Expression actualGt1 = SparkV2Filters.convert(gt1);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt1.toString(), actualGt1.toString());
-
-          Predicate gt2 = new Predicate(">", valueAndAttr);
-          Expression expectedGt2 = Expressions.lessThan(unquoted, 1);
-          Expression actualGt2 = SparkV2Filters.convert(gt2);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt2.toString(), actualGt2.toString());
-
-          Predicate gtEq1 = new Predicate(">=", attrAndValue);
-          Expression expectedGtEq1 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualGtEq1 = SparkV2Filters.convert(gtEq1);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq1.toString(), actualGtEq1.toString());
-
-          Predicate gtEq2 = new Predicate(">=", valueAndAttr);
-          Expression expectedGtEq2 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualGtEq2 = SparkV2Filters.convert(gtEq2);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq2.toString(), actualGtEq2.toString());
-
-          Predicate eq1 = new Predicate("=", attrAndValue);
-          Expression expectedEq1 = Expressions.equal(unquoted, 1);
-          Expression actualEq1 = SparkV2Filters.convert(eq1);
-          Assert.assertEquals("EqualTo must match", expectedEq1.toString(), actualEq1.toString());
-
-          Predicate eq2 = new Predicate("=", valueAndAttr);
-          Expression expectedEq2 = Expressions.equal(unquoted, 1);
-          Expression actualEq2 = SparkV2Filters.convert(eq2);
-          Assert.assertEquals("EqualTo must match", expectedEq2.toString(), actualEq2.toString());
-
-          Predicate eqNullSafe1 = new Predicate("<=>", attrAndValue);
-          Expression expectedEqNullSafe1 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe1 = SparkV2Filters.convert(eqNullSafe1);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe1.toString(),
-              actualEqNullSafe1.toString());
-
-          Predicate eqNullSafe2 = new Predicate("<=>", valueAndAttr);
-          Expression expectedEqNullSafe2 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe2 = SparkV2Filters.convert(eqNullSafe2);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe2.toString(),
-              actualEqNullSafe2.toString());
-
-          LiteralValue str =
-              new LiteralValue(UTF8String.fromString("iceberg"), DataTypes.StringType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndStr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, str};
-          Predicate startsWith = new Predicate("STARTS_WITH", attrAndStr);
-          Expression expectedStartsWith = Expressions.startsWith(unquoted, "iceberg");
-          Expression actualStartsWith = SparkV2Filters.convert(startsWith);
-          Assert.assertEquals(
-              "StartsWith must match", expectedStartsWith.toString(), actualStartsWith.toString());
-
-          Predicate in = new Predicate("IN", attrAndValue);
-          Expression expectedIn = Expressions.in(unquoted, 1);
-          Expression actualIn = SparkV2Filters.convert(in);
-          Assert.assertEquals("In must match", expectedIn.toString(), actualIn.toString());
-
-          Predicate and = new And(lt1, eq1);
-          Expression expectedAnd = Expressions.and(expectedLt1, expectedEq1);
-          Expression actualAnd = SparkV2Filters.convert(and);
-          Assert.assertEquals("And must match", expectedAnd.toString(), actualAnd.toString());
-
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {
-                namedReference, namedReference
-              };
-          Predicate invalid = new Predicate("<", attrAndAttr);
-          Predicate andWithInvalidLeft = new And(invalid, eq1);
-          Expression convertedAnd = SparkV2Filters.convert(andWithInvalidLeft);
-          Assert.assertEquals("And must match", convertedAnd, null);
-
-          Predicate or = new Or(lt1, eq1);
-          Expression expectedOr = Expressions.or(expectedLt1, expectedEq1);
-          Expression actualOr = SparkV2Filters.convert(or);
-          Assert.assertEquals("Or must match", expectedOr.toString(), actualOr.toString());
-
-          Predicate orWithInvalidLeft = new Or(invalid, eq1);
-          Expression convertedOr = SparkV2Filters.convert(orWithInvalidLeft);
-          Assert.assertEquals("Or must match", convertedOr, null);
-
-          Predicate not = new Not(lt1);
-          Expression expectedNot = Expressions.not(expectedLt1);
-          Expression actualNot = SparkV2Filters.convert(not);
-          Assert.assertEquals("Not must match", expectedNot.toString(), actualNot.toString());
-        });
+    attrMap.forEach(this::testV2Filter);
+  }
+
+  private void testV2Filter(String quoted, String unquoted) {

Review Comment:
   Oh, sorry for the misunderstanding. Reverted; will address it in a follow-up.





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1252177250


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,23 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return isNull(term);
+    } else if (NaNUtil.isNaN(value)) {
+      return isNaN(term);
+    } else {
+      return equal(term, value);
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return notNull(term);

Review Comment:
   Sorry for missing that. Updated.





[GitHub] [iceberg] ConeyLiu commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "ConeyLiu (via GitHub)" <gi...@apache.org>.
ConeyLiu commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1247503853


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -245,6 +251,22 @@ public static Expression convert(Predicate predicate) {
     return null;
   }
 
+  private static <T> Pair<UnboundTerm<T>, T> predicateChildren(Predicate predicate) {

Review Comment:
   @szehon-ho It should be OK to remove the `<T>` and use `Object` instead. Also updated `handleEqual` and `handleNotEqual`.



##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -282,18 +304,35 @@ private static boolean isLiteral(org.apache.spark.sql.connector.expressions.Expr
     return expr instanceof Literal;
   }
 
-  private static Object convertLiteral(Literal<?> literal) {
+  @SuppressWarnings("unchecked")
+  private static <T> T convertLiteral(Literal<?> literal) {

Review Comment:
   Updated as well.





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1247553984


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,28 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
-    if (NaNUtil.isNaN(value)) {
-      return isNaN(attribute);
+  private static UnboundPredicate<Object> handleEqual(Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return isNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return isNaN(children.first());
+    } else {
+      return equal(children.first(), children.second());
+    }
+  }
+
+  private static UnboundPredicate<Object> handleNotEqual(
+      Pair<UnboundTerm<Object>, Object> children) {
+    if (children.second() == null) {
+      return notNull(children.first());
+    }
+
+    if (NaNUtil.isNaN(children.second())) {
+      return notNaN(children.first());
     } else {
-      return equal(attribute, value);
+      return notEqual(children.first(), children.second());

Review Comment:
   (My question is about the original code)





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1248321463


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/SparkV2Filters.java:
##########
@@ -289,11 +311,27 @@ private static Object convertLiteral(Literal<?> literal) {
     return literal.value();
   }
 
-  private static Expression handleEqual(String attribute, Object value) {
+  private static UnboundPredicate<Object> handleEqual(UnboundTerm<Object> term, Object value) {
+    if (value == null) {
+      return isNull(term);
+    }

Review Comment:
   Minor style suggestion here: I find these a bit clearer as if / else if / else.
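   A minimal standalone sketch of that if / else-if / else shape. This is not the PR's code: string results stand in for Iceberg's `isNull`/`isNaN`/`equal` expression factories, and the NaN check covers only `Double` here (the real code uses `NaNUtil.isNaN`, which also handles `Float`). Only the branching structure is the point.

   ```java
   public class HandleEqualSketch {
     // Equality against null becomes an is-null predicate, equality against
     // NaN becomes an is-nan predicate, and everything else is a plain equal.
     static String handleEqual(String term, Object value) {
       if (value == null) {
         return "isNull(" + term + ")";
       } else if (value instanceof Double && ((Double) value).isNaN()) {
         return "isNaN(" + term + ")";
       } else {
         return "equal(" + term + ", " + value + ")";
       }
     }
   }
   ```

   With a single if / else-if / else chain, each input falls into exactly one branch, which reads more directly than an early return followed by a second conditional.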





[GitHub] [iceberg] szehon-ho commented on a diff in pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "szehon-ho (via GitHub)" <gi...@apache.org>.
szehon-ho commented on code in PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#discussion_r1251157591


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/TestSparkV2Filters.java:
##########
@@ -48,145 +48,150 @@ public void testV2Filters() {
     attrMap.put("`d`.b.`dd```", "d.b.dd`");
     attrMap.put("a.`aa```.c", "a.aa`.c");
 
-    attrMap.forEach(
-        (quoted, unquoted) -> {
-          NamedReference namedReference = FieldReference.apply(quoted);
-          org.apache.spark.sql.connector.expressions.Expression[] attrOnly =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference};
-
-          LiteralValue value = new LiteralValue(1, DataTypes.IntegerType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndValue =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, value};
-          org.apache.spark.sql.connector.expressions.Expression[] valueAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {value, namedReference};
-
-          Predicate isNull = new Predicate("IS_NULL", attrOnly);
-          Expression expectedIsNull = Expressions.isNull(unquoted);
-          Expression actualIsNull = SparkV2Filters.convert(isNull);
-          Assert.assertEquals(
-              "IsNull must match", expectedIsNull.toString(), actualIsNull.toString());
-
-          Predicate isNotNull = new Predicate("IS_NOT_NULL", attrOnly);
-          Expression expectedIsNotNull = Expressions.notNull(unquoted);
-          Expression actualIsNotNull = SparkV2Filters.convert(isNotNull);
-          Assert.assertEquals(
-              "IsNotNull must match", expectedIsNotNull.toString(), actualIsNotNull.toString());
-
-          Predicate lt1 = new Predicate("<", attrAndValue);
-          Expression expectedLt1 = Expressions.lessThan(unquoted, 1);
-          Expression actualLt1 = SparkV2Filters.convert(lt1);
-          Assert.assertEquals("LessThan must match", expectedLt1.toString(), actualLt1.toString());
-
-          Predicate lt2 = new Predicate("<", valueAndAttr);
-          Expression expectedLt2 = Expressions.greaterThan(unquoted, 1);
-          Expression actualLt2 = SparkV2Filters.convert(lt2);
-          Assert.assertEquals("LessThan must match", expectedLt2.toString(), actualLt2.toString());
-
-          Predicate ltEq1 = new Predicate("<=", attrAndValue);
-          Expression expectedLtEq1 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualLtEq1 = SparkV2Filters.convert(ltEq1);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq1.toString(), actualLtEq1.toString());
-
-          Predicate ltEq2 = new Predicate("<=", valueAndAttr);
-          Expression expectedLtEq2 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualLtEq2 = SparkV2Filters.convert(ltEq2);
-          Assert.assertEquals(
-              "LessThanOrEqual must match", expectedLtEq2.toString(), actualLtEq2.toString());
-
-          Predicate gt1 = new Predicate(">", attrAndValue);
-          Expression expectedGt1 = Expressions.greaterThan(unquoted, 1);
-          Expression actualGt1 = SparkV2Filters.convert(gt1);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt1.toString(), actualGt1.toString());
-
-          Predicate gt2 = new Predicate(">", valueAndAttr);
-          Expression expectedGt2 = Expressions.lessThan(unquoted, 1);
-          Expression actualGt2 = SparkV2Filters.convert(gt2);
-          Assert.assertEquals(
-              "GreaterThan must match", expectedGt2.toString(), actualGt2.toString());
-
-          Predicate gtEq1 = new Predicate(">=", attrAndValue);
-          Expression expectedGtEq1 = Expressions.greaterThanOrEqual(unquoted, 1);
-          Expression actualGtEq1 = SparkV2Filters.convert(gtEq1);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq1.toString(), actualGtEq1.toString());
-
-          Predicate gtEq2 = new Predicate(">=", valueAndAttr);
-          Expression expectedGtEq2 = Expressions.lessThanOrEqual(unquoted, 1);
-          Expression actualGtEq2 = SparkV2Filters.convert(gtEq2);
-          Assert.assertEquals(
-              "GreaterThanOrEqual must match", expectedGtEq2.toString(), actualGtEq2.toString());
-
-          Predicate eq1 = new Predicate("=", attrAndValue);
-          Expression expectedEq1 = Expressions.equal(unquoted, 1);
-          Expression actualEq1 = SparkV2Filters.convert(eq1);
-          Assert.assertEquals("EqualTo must match", expectedEq1.toString(), actualEq1.toString());
-
-          Predicate eq2 = new Predicate("=", valueAndAttr);
-          Expression expectedEq2 = Expressions.equal(unquoted, 1);
-          Expression actualEq2 = SparkV2Filters.convert(eq2);
-          Assert.assertEquals("EqualTo must match", expectedEq2.toString(), actualEq2.toString());
-
-          Predicate eqNullSafe1 = new Predicate("<=>", attrAndValue);
-          Expression expectedEqNullSafe1 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe1 = SparkV2Filters.convert(eqNullSafe1);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe1.toString(),
-              actualEqNullSafe1.toString());
-
-          Predicate eqNullSafe2 = new Predicate("<=>", valueAndAttr);
-          Expression expectedEqNullSafe2 = Expressions.equal(unquoted, 1);
-          Expression actualEqNullSafe2 = SparkV2Filters.convert(eqNullSafe2);
-          Assert.assertEquals(
-              "EqualNullSafe must match",
-              expectedEqNullSafe2.toString(),
-              actualEqNullSafe2.toString());
-
-          LiteralValue str =
-              new LiteralValue(UTF8String.fromString("iceberg"), DataTypes.StringType);
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndStr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {namedReference, str};
-          Predicate startsWith = new Predicate("STARTS_WITH", attrAndStr);
-          Expression expectedStartsWith = Expressions.startsWith(unquoted, "iceberg");
-          Expression actualStartsWith = SparkV2Filters.convert(startsWith);
-          Assert.assertEquals(
-              "StartsWith must match", expectedStartsWith.toString(), actualStartsWith.toString());
-
-          Predicate in = new Predicate("IN", attrAndValue);
-          Expression expectedIn = Expressions.in(unquoted, 1);
-          Expression actualIn = SparkV2Filters.convert(in);
-          Assert.assertEquals("In must match", expectedIn.toString(), actualIn.toString());
-
-          Predicate and = new And(lt1, eq1);
-          Expression expectedAnd = Expressions.and(expectedLt1, expectedEq1);
-          Expression actualAnd = SparkV2Filters.convert(and);
-          Assert.assertEquals("And must match", expectedAnd.toString(), actualAnd.toString());
-
-          org.apache.spark.sql.connector.expressions.Expression[] attrAndAttr =
-              new org.apache.spark.sql.connector.expressions.Expression[] {
-                namedReference, namedReference
-              };
-          Predicate invalid = new Predicate("<", attrAndAttr);
-          Predicate andWithInvalidLeft = new And(invalid, eq1);
-          Expression convertedAnd = SparkV2Filters.convert(andWithInvalidLeft);
-          Assert.assertEquals("And must match", convertedAnd, null);
-
-          Predicate or = new Or(lt1, eq1);
-          Expression expectedOr = Expressions.or(expectedLt1, expectedEq1);
-          Expression actualOr = SparkV2Filters.convert(or);
-          Assert.assertEquals("Or must match", expectedOr.toString(), actualOr.toString());
-
-          Predicate orWithInvalidLeft = new Or(invalid, eq1);
-          Expression convertedOr = SparkV2Filters.convert(orWithInvalidLeft);
-          Assert.assertEquals("Or must match", convertedOr, null);
-
-          Predicate not = new Not(lt1);
-          Expression expectedNot = Expressions.not(expectedLt1);
-          Expression actualNot = SparkV2Filters.convert(not);
-          Assert.assertEquals("Not must match", expectedNot.toString(), actualNot.toString());
-        });
+    attrMap.forEach(this::testV2Filter);
+  }
+
+  private void testV2Filter(String quoted, String unquoted) {

Review Comment:
   I think this is a bit hacky as a way to avoid the method-length warning; @RussellSpitzer may have meant individual test methods, like:
   
   ```
   @Test
   public void testIsNullV2Filter() {
   }
   
   @Test
   public void testIsNotNullV2Filter() {
   }
   ```
   
   Maybe we can revert this to what you had, and file a cleanup issue for it?





[GitHub] [iceberg] rdblue commented on pull request #7898: Spark 3.4: Support NOT_EQ for V2 filters

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on PR #7898:
URL: https://github.com/apache/iceberg/pull/7898#issuecomment-1620741008

   Thanks, @ConeyLiu! Now that this is in we can finish reviewing #7886.

