You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/05/27 16:11:15 UTC

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #6927: Normalize LHS and RHS numerical types for >, >=, <, and <= operators.

siddharthteotia commented on a change in pull request #6927:
URL: https://github.com/apache/incubator-pinot/pull/6927#discussion_r640770793



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/optimizer/filter/NumericalFilterOptimizer.java
##########
@@ -80,22 +83,29 @@ public Expression optimize(Expression expression, @Nullable Schema schema) {
     List<Expression> operands = function.getOperands();
     String operator = function.getOperator();
     if (operator.equals(FilterKind.AND.name()) || operator.equals(FilterKind.OR.name())) {
-      // One of the operands may be an EQUALS or NOT_EQUALS function so recursively traverse the expression tree to see
-      // if we find an EQUALS or NOT_EQUALS function to rewrite.
+      // Recursively traverse the expression tree to find an operator node that can be rewritten.
       operands.forEach(operand -> optimize(operand, schema));
 
       // We have rewritten the child operands, so rewrite the parent if needed.
       return optimizeCurrent(expression);
-    } else if (operator.equals(FilterKind.EQUALS.name()) || operator.equals(FilterKind.NOT_EQUALS.name())) {
+    } else {
       // Verify that LHS is a numeric column and RHS is a numeric literal before rewriting.
       Expression lhs = operands.get(0);
       Expression rhs = operands.get(1);
       if (isNumericColumn(lhs, schema) && isNumericLiteral(rhs)) {
-        // Rewrite the expression.
-        return rewrite(expression, lhs, rhs, schema);
+        FilterKind kind = FilterKind.valueOf(operator);
+        switch (kind) {
+          case EQUALS:
+          case NOT_EQUALS:
+            return rewriteEqualsExpression(expression, kind, lhs, rhs, schema);
+          case GREATER_THAN:
+          case GREATER_THAN_OR_EQUAL:
+          case LESS_THAN:
+          case LESS_THAN_OR_EQUAL:
+            return rewriteRangeExpression(expression, kind, lhs, rhs, schema);

Review comment:
       Will we consider BETWEEN in the future ?

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/optimizer/filter/NumericalFilterOptimizer.java
##########
@@ -271,7 +242,211 @@ private Expression rewrite(Expression equals, Expression lhs, Expression rhs, Sc
             break;
           }
         }
+      }
     }
     return equals;
   }
+
+  /**
+   * Rewrite expressions of form "column > literal", "column >= literal", "column < literal", and "column <= literal"
+   * to ensure that RHS literal is the same datatype as LHS column.
+   */
+  private static Expression rewriteRangeExpression(Expression range, FilterKind kind, Expression lhs, Expression rhs, Schema schema) {
+    // Get column data type.
+    FieldSpec.DataType dataType = schema.getFieldSpecFor(lhs.getIdentifier().getName()).getDataType();
+
+    switch (rhs.getLiteral().getSetField()) {
+      case SHORT_VALUE:
+      case INT_VALUE:
+        // No rewrites needed since SHORT and INT conversion to numeric column types (INT, LONG, FLOAT, and DOUBLE) is
+        // lossless and will be implicitly handled on the server side.
+        break;
+      case LONG_VALUE: {
+        long actual = rhs.getLiteral().getLongValue();
+        switch (dataType) {
+          case INT: {
+            int converted = (int) actual;
+            int comparison = Long.compare(actual, converted);
+            if (comparison > 0) {
+              // Literal value is greater than the bounds of INT. > and >= expressions will always be false because an
+              // INT column can never have a value greater than Integer.MAX_VALUE. < and <= expressions will always be
+              // true, because an INT column will always have values less than Integer.MIN_VALUE.

Review comment:
       typo ?
   
   because an INT column will always have values **bigger** than Integer.MIN_VALUE and **less** than Integer.MAX_VALUE
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org