You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2019/12/31 23:07:47 UTC

[GitHub] [drill] paul-rogers opened a new pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

paul-rogers opened a new pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945
 
 
   The gist of the problem is that codegen for a `UNION` using a `FieldReader` for a `@Param` was handled incorrectly. We had code to handle complex and repeated types specially. Turned out we just needed to add `UNION` type to that same logic. Since the fix turned out to be simple (after much sleuthing), I suspect this may have been a regression and this fix simply puts things back the way the original author intended.
   
   Also fixes DRILL-6362: typeof() reports NULL for primitive columns with a NULL value.
   
   typeof() is meant to return "NULL" if a UNION has a NULL value, but the column type when known, such as for non-UNION columns.
   
   Also fixes DRILL-7499: sqltypeof() function with an array returns "ARRAY", not type. This was due to treating REPEATED like LIST.
   
   Includes cleanup in files visited during the work.
   
   ### Documentation
   
   User-visible behaviour changes after this fix:
   
   * `typeof()` returns the value type of a specific column value. This is the Drill minor type for most types. For types other than `UNION`, if the value is `NULL`, `typeof() `still reports the column type (since the NULL has a type.) But for a `UNION` column, `typeof()` returns the type of that specific column value, or "NULL" if the column is null. (Since, for UNION, NULL has no type.)
   * `sqlTypeOf()` works the same except for arrays. Previously (sometime between Drill 1.15 and Drill 1.17), `sqlTypeOf()` returned "ARRAY" for a repeated column. After this fix, it works as in Drill 1.14: `sqlTypeOf()` returns the type name. Thus a repeated INT returns "INT", not "ARRAY". (The `LIST` type still returns "ARRAY", however.)
   * `drillTypeOf()` returns the actual minor type of a column, even for `UNION` and `DICT` columns. Thus, if a column is `UNION`, `typeOf()` returns the value type, but `drillTypeOf()` returns "UNION."
   
   ### Tests
   
   Added two new tests to `TestTypeFns`, one which uses the JSON reader to verify `UNION` type behaviour (the test which uncovered this bug.) Reran all unit tests.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362869398
 
 

 ##########
 File path: exec/java-exec/src/main/codegen/templates/TypeHelper.java
 ##########
 @@ -82,28 +82,28 @@ public static JType getHolderType(JCodeModel model, MinorType type, DataMode mod
     case MAP:
     case LIST:
       return model._ref(ComplexHolder.class);
-      
+
 <#list vv.types as type>
   <#list type.minor as minor>
-      case ${minor.class?upper_case}:
-        switch (mode) {
-          case REQUIRED:
-            return model._ref(${minor.class}Holder.class);
-          case OPTIONAL:
-            return model._ref(Nullable${minor.class}Holder.class);
-          case REPEATED:
-            return model._ref(Repeated${minor.class}Holder.class);
-        }
+    case ${minor.class?upper_case}:
+      switch (mode) {
+        case REQUIRED:
+          return model._ref(${minor.class}Holder.class);
+        case OPTIONAL:
+          return model._ref(Nullable${minor.class}Holder.class);
+        case REPEATED:
+          return model._ref(Repeated${minor.class}Holder.class);
+      }
   </#list>
 </#list>
-      case GENERIC_OBJECT:
-        return model._ref(ObjectHolder.class);
+    case GENERIC_OBJECT:
+      return model._ref(ObjectHolder.class);
     case NULL:
       return model._ref(UntypedNullHolder.class);
-      default:
-        break;
-      }
-      throw new UnsupportedOperationException(buildErrorMessage("get holder type", type, mode));
+    default:
+      break;
+    }
+    throw new UnsupportedOperationException(buildErrorMessage("get holder type", type, mode));
 
 Review comment:
   I think the line may be located under ```default:``` inside switch-case. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r363090485
 
 

 ##########
 File path: exec/java-exec/src/main/codegen/templates/UnionFunctions.java
 ##########
 @@ -43,15 +43,15 @@
  * Additional functions can be found in the class UnionFunctions
  */
 public class GUnionFunctions {
-
   <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
   <#assign fields = minor.fields!type.fields />
   <#assign uncappedName = name?uncap_first/>
 
   <#if !minor.class?starts_with("Decimal")>
-
-  @SuppressWarnings("unused")
-  @FunctionTemplate(name = "IS_${name?upper_case}", scope = FunctionTemplate.FunctionScope.SIMPLE, nulls=NullHandling.INTERNAL)
+  @FunctionTemplate(
+      name = "IS_${name?upper_case}",
+      scope = FunctionTemplate.FunctionScope.SIMPLE,
+      nulls=NullHandling.INTERNAL)
 
 Review comment:
   ```suggestion
         nulls = NullHandling.INTERNAL)
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362885373
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java
 ##########
 @@ -117,9 +106,9 @@ public HoldingContainer addExpr(LogicalExpression e, ClassGenerator<?> generator
   }
 
   private class ExpressionHolder {
 
 Review comment:
   ```suggestion
     private static class ExpressionHolder {
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r363377148
 
 

 ##########
 File path: logical/src/main/java/org/apache/drill/common/expression/FunctionHolderExpression.java
 ##########
 @@ -98,4 +105,20 @@ public FieldReference getFieldReference() {
   public void setFieldReference(FieldReference fieldReference) {
     this.fieldReference = fieldReference;
   }
+
+  @Override
+  public String toString() {
+    StringBuilder buf = new StringBuilder()
+        .append("[").append(getClass().getSimpleName())
 
 Review comment:
   Please separate somehow separate `getClass().getSimpleName()` and `nameUsed` in resulting string.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362777154
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -906,4 +908,16 @@ public static boolean isNullable(final MajorType type) {
         throw new UnsupportedOperationException("Unexpected/unhandled DataMode value " + type.getMode());
     }
   }
+
+  /**
+   * The number of minor types. Actually, the largest minor type ordinal.
+   * (There are holes in the ordering.) Update this if a new type
+   * is added. Use this value when allocating arrays to be indexed
+   * by minor type.
+   *
+   * @return the maximum minor type ordinal
+   */
+  public static final int typeCount() {
 
 Review comment:
   I think there is an issue with misusing `getNumber()` method in places where `ordinal()` should be used. It should guarantee that we will use indexes less than `MinorType.values().length`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#issuecomment-572377107
 
 
   @vvysotskyi, thanks much for the review. Addressed the one comment I missed last time. Squashed commits. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362892653
 
 

 ##########
 File path: exec/vector/src/main/codegen/templates/AbstractFieldReader.java
 ##########
 @@ -28,11 +28,11 @@
 /*
  * This class is generated using freemarker and the ${.template_name} template.
  */
-@SuppressWarnings("unused")
 public abstract class AbstractFieldReader extends AbstractBaseReader implements FieldReader {
 
-  public AbstractFieldReader() {
-  }
+  public AbstractFieldReader() { }
 
 Review comment:
   ```suggestion
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#issuecomment-570762571
 
 
   Another thing to note is that we may want to go though one more round of sorting out the type functions. In normal SQL, there is just one level of type: a columns is an `INT` or a `DOUBLE`.
   
   In Sqllite (IIRC) there is the `Variant` type: so a column can be `Variant` but the values can be `INT` or `DOUBLE`. The "variant" name comes from Visual Basic. In Drill, the variant is called `UNION`.
   
   In Drill, we also have structured types: `ARRAY<INTEGER>` or `DICT<STRING,DOUBLE>`.
   
   We've been trying to handle all these cases with originally one function (`typeof()`), then later three (adding `sqlTypeOf()` and `drillTypeOf()`).
   
   We probably want four variations. Two forms of the type:
   
   * The full type description: `ARRAY<ARRAY<MAP<a: INTEGER, b: VARCHAR>>>` say, or `UNION<DOUBLE, INTEGER>`.
   * The short type of the vector itself: `LIST`, `UNION`.
   
   For Variant types (`UNION`, `LIST`) we then want two forms:
   
   * The vector itself. (`DICT`, `MAP`, `UNION`, `LIST`)
   * The value: (`INTEGER`, `DOUBLE`).
   
   Of course, the value can itself be structured: a `UNION` which has a `LIST` of `UNION` types. So, for both the vector and the value, we'd want both the simple and verbose forms.
   
   And, now that @arina-ielchiieva added the nice type description system, we have no function which returns the full description, including the `NOT NULL` attribute, as it appears in the schema file: `INTEGER NOT NULL`.
   
   The present fix gets us closer, but it is hard to solve four or five cases with three functions. I suspect we could do better. Suggestions welcomed!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r363435140
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -906,4 +908,16 @@ public static boolean isNullable(final MajorType type) {
         throw new UnsupportedOperationException("Unexpected/unhandled DataMode value " + type.getMode());
     }
   }
+
+  /**
+   * The number of minor types. Actually, the largest minor type ordinal.
+   * (There are holes in the ordering.) Update this if a new type
+   * is added. Use this value when allocating arrays to be indexed
+   * by minor type.
+   *
+   * @return the maximum minor type ordinal
+   */
+  public static final int typeCount() {
 
 Review comment:
   Thank you for finding and fixing all these issues! union data type was broken for a long time, and it is great that you will fix these code-gen issues.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r363376466
 
 

 ##########
 File path: logical/src/main/java/org/apache/drill/common/expression/FunctionHolderExpression.java
 ##########
 @@ -21,31 +21,37 @@
 
 import org.apache.drill.common.expression.fn.FuncHolder;
 import org.apache.drill.common.expression.visitors.ExprVisitor;
-
 import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
-import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
 
+/**
+ * Represents an actual call (a reference) to a declared function.
+ * Holds the name used (functions can have multiple aliases), the
+ * function declaration, and the actual argument expressions used
+ * in this call. This might be better named
+ * <code>FunctionCallExpression</code> as it represents a use
+ * of a function. Subclasses hold references to the declaration
+ * depending on the type (Drill, Hive) of the function.
+ */
 public abstract class FunctionHolderExpression extends LogicalExpressionBase {
   public final ImmutableList<LogicalExpression> args;
   public final String nameUsed;
 
   /**
-   * A field reference identifies the output field and
-   * is used to reference that field in the generated classes.
+   * Identifies the output field. References that field in the
+   * generated classes.
    */
   private FieldReference fieldReference;
 
   public FunctionHolderExpression(String nameUsed, ExpressionPosition pos, List<LogicalExpression> args) {
     super(pos);
+    this.nameUsed = nameUsed;
     if (args == null) {
-      args = Lists.newArrayList();
-    }
-
-    if (!(args instanceof ImmutableList)) {
-      args = ImmutableList.copyOf(args);
+      this.args = ImmutableList.of();
+    } else if (args instanceof ImmutableList) {
 
 Review comment:
   Could you please replace these if blocks with `ImmutableList.copyOf`, it produces these checks internally to avoid copying is possible.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r363369497
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java
 ##########
 @@ -324,7 +323,12 @@ protected DrillRel convertToDrel(RelNode relNode) throws SqlUnsupportedException
   /**
    * A shuttle designed to finalize all RelNodes.
    */
-  private static class PrelFinalizer extends RelShuttleImpl {
+  public static class PrelFinalizer extends RelShuttleImpl {
+
+    // The non-default constructor appears to be called from Calcite
+    // code when processing Hive-based queries.
+    public PrelFinalizer(DefaultSqlHandler handler) { }
 
 Review comment:
   Could you please explain it a little? I thought this is a regular visitor which is used only in places where its instance is created and used. By the way, since it doesn't store any state, its instance may be reused.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#issuecomment-572012181
 
 
   @paul-rogers, could you please squash the commits.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r363012403
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -906,4 +908,16 @@ public static boolean isNullable(final MajorType type) {
         throw new UnsupportedOperationException("Unexpected/unhandled DataMode value " + type.getMode());
     }
   }
+
+  /**
+   * The number of minor types. Actually, the largest minor type ordinal.
+   * (There are holes in the ordering.) Update this if a new type
+   * is added. Use this value when allocating arrays to be indexed
+   * by minor type.
+   *
+   * @return the maximum minor type ordinal
+   */
+  public static final int typeCount() {
 
 Review comment:
   @vvysotskyi, if I could e-mail you a beer, or even a pizza, I would do so. You uncovered a huge mess in the Union vector and reader. It is amazing it ever worked. Adding the `DICT` type pushed it over the edge. As it turns out, the code has long confused ordinals and Protobuf values. The problem is made worse because the Probuf-generated classes replace the `valueOf()` ordinal function with a lookup based on the Protobuf values. We had the two systems all mixed up.
   
   So, I cleaned up all that, following your advice to use only the ordinal values. Added comments to help people avoid falling into this trap in the future.
   
   Plus, I think this accidentally fixed another issue: DRILL-7510.
   
   Not bad for a single review comment!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362882552
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java
 ##########
 @@ -146,6 +135,19 @@ public boolean equals(Object obj) {
 
   Stack<Map<ExpressionHolder,HoldingContainer>> mapStack = new Stack<>();
 
+  public EvaluationVisitor() { }
 
 Review comment:
   ```suggestion
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#issuecomment-570328592
 
 
   Am now seeing failures in the `GUnionFunctions` where a parameter is declared as `UnionHolder` but is now being created as `FieldReader`. Seems that the bug that I fixed may have been introduced to make these functions work, but had the side effect of forcing union parameters to be a `UnionReader` even if declared as a `FieldReader`. Looks like my change now does the opposite. So, more work to be done to make the decision based on the kind of field.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362884952
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java
 ##########
 @@ -146,6 +135,19 @@ public boolean equals(Object obj) {
 
   Stack<Map<ExpressionHolder,HoldingContainer>> mapStack = new Stack<>();
 
+  public EvaluationVisitor() { }
+
+  public HoldingContainer addExpr(LogicalExpression e, ClassGenerator<?> generator) {
+
 
 Review comment:
   ```suggestion
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362616012
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -906,4 +908,16 @@ public static boolean isNullable(final MajorType type) {
         throw new UnsupportedOperationException("Unexpected/unhandled DataMode value " + type.getMode());
     }
   }
+
+  /**
+   * The number of minor types. Actually, the largest minor type ordinal.
+   * (There are holes in the ordering.) Update this if a new type
+   * is added. Use this value when allocating arrays to be indexed
+   * by minor type.
+   *
+   * @return the maximum minor type ordinal
+   */
+  public static final int typeCount() {
 
 Review comment:
   I agree it seems it *looks like* it should work. However, when a I ran a test that uses the `DICT` type I got array out of bounds errors. Reason: the number of values in the types list has holes: Type 2 is missing, as are a number of others. So, when we ask how many values exist, we get something like 40. But, `DICT` has an index of 44 (because of the hole.) So, when we request the `DICT`th entry in an array, we get an OOB error.
   
   The other approach is to introduce a value for the missing ordinals, maybe "UNUSEDx" type. I'll try that instead.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362596363
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java
 ##########
 @@ -400,8 +403,10 @@ public HoldingContainer visitUnknown(LogicalExpression e, ClassGenerator<?> gene
      * seedValue0 .value = seedValue;
      * </p>
      *
-     * @param e parameter expression
-     * @param generator class generator
+     * @param e
 
 Review comment:
   Please revert this change, initial formatting looks better.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362594843
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java
 ##########
 @@ -97,31 +98,37 @@
   private JVar innerClassField;
 
   /**
-   * Assumed that field has 3 indexes within the constant pull: index of the CONSTANT_Fieldref_info +
-   * CONSTANT_Fieldref_info.name_and_type_index + CONSTANT_NameAndType_info.name_index.
-   * CONSTANT_NameAndType_info.descriptor_index has limited range of values, CONSTANT_Fieldref_info.class_index is
-   * the same for a single class, they will be taken into account later.
+   * Assumed that field has 3 indexes within the constant pull: index of the
+   * CONSTANT_Fieldref_info + CONSTANT_Fieldref_info.name_and_type_index +
+   * CONSTANT_NameAndType_info.name_index.
+   * CONSTANT_NameAndType_info.descriptor_index has limited range of values,
+   * CONSTANT_Fieldref_info.class_index is the same for a single class, they
+   * will be taken into account later.
    * <p>
    * Local variable has 1 index within the constant pool.
    * {@link org.objectweb.asm.MethodWriter#visitLocalVariable(String, String, String, Label, Label, int)}
    * <p>
-   * For upper estimation of max index value, suppose that each field and local variable uses different literal
-   * values that have two indexes, then the number of occupied indexes within the constant pull is
-   * fieldCount * 3 + fieldCount * 2 + (index - fieldCount) * 3 => fieldCount * 2 + index * 3
+   * For upper estimation of max index value, suppose that each field and local
+   * variable uses different literal values that have two indexes, then the
+   * number of occupied indexes within the constant pull is fieldCount * 3 +
+   * fieldCount * 2 + (index - fieldCount) * 3 => fieldCount * 2 + index * 3
    * <p>
-   * Assumed that method has 3 indexes within the constant pull: index of the CONSTANT_Methodref_info +
-   * CONSTANT_Methodref_info.name_and_type_index + CONSTANT_NameAndType_info.name_index.
+   * Assumed that method has 3 indexes within the constant pull: index of the
+   * CONSTANT_Methodref_info + CONSTANT_Methodref_info.name_and_type_index +
+   * CONSTANT_NameAndType_info.name_index.
    * <p>
-   * For the upper estimation of number of split methods suppose that each expression in the method uses single variable.
-   * Suppose that the max number of indexes within the constant pull occupied by fields and local variables is M,
-   * the number of split methods is N, number of abstract methods in the template is A, then splitted methods count is
-   * N = (M - A * N * 3) / 50 => N = M / (50 + A * 3)
+   * For the upper estimation of number of split methods suppose that each
+   * expression in the method uses single variable. Suppose that the max number
+   * of indexes within the constant pull occupied by fields and local variables
+   * is M, the number of split methods is N, number of abstract methods in the
+   * template is A, then splitted methods count is N = (M - A * N * 3) / 50 => N
 
 Review comment:
   Please leave formulas to start from the new line.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] asfgit closed pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r363565664
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java
 ##########
 @@ -324,7 +323,12 @@ protected DrillRel convertToDrel(RelNode relNode) throws SqlUnsupportedException
   /**
    * A shuttle designed to finalize all RelNodes.
    */
-  private static class PrelFinalizer extends RelShuttleImpl {
+  public static class PrelFinalizer extends RelShuttleImpl {
+
+    // The non-default constructor appears to be called from Calcite
+    // code when processing Hive-based queries.
+    public PrelFinalizer(DefaultSqlHandler handler) { }
 
 Review comment:
   @vvysotskyi, not sure I can explain this. I got a runtime exception that some code could not create an instance. The error message appeared to indicate that the code was called via introspection, with the above signature. Adding the above code made the problem disappear.
   
   I should have noted the test that failed when run in Eclipse.From the comment, it seemed to be from some of the Hive tests I ran. But, today, those tests run fine without this addition. So, I removed it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] paul-rogers commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
paul-rogers commented on issue #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#issuecomment-570761952
 
 
   @vvysotskyi, @ihuzenko thank you for your reviews. I believe I've addressed your comments. Please take another look.
   
   This commit contains some significant changes. As it turns out, we have three different ways we work with Union vectors in code gen. Two worked, the third did not. Figuring that out was a bit of an involved exercise. I added quite a few comments and split up some overly-large methods, to help gain understanding. I added a bit of a hack to work around the missing case. It works; all tests pass including the new one that initially failed. (And DRILL-7510 got magically fixed along the way.)
   
   I say the fix is a "hack" because I had to add more state: to access the Union vector when setting up parameters that use a `FieldReader`. It seems there should be a simpler way; but I could not find it.
   
   @KazydubB, these changes touched some of the same areas you apparently touched for your `DICT` work. Can you take a look to 1) make sure I didn't accidentally break anything, and 2) perhaps see the pattern I may be missing about how to set up a `FieldReader` for a Union vector. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362592822
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -906,4 +908,16 @@ public static boolean isNullable(final MajorType type) {
         throw new UnsupportedOperationException("Unexpected/unhandled DataMode value " + type.getMode());
     }
   }
+
+  /**
+   * The number of minor types. Actually, the largest minor type ordinal.
+   * (There are holes in the ordering.) Update this if a new type
+   * is added. Use this value when allocating arrays to be indexed
+   * by minor type.
+   *
+   * @return the maximum minor type ordinal
+   */
+  public static final int typeCount() {
 
 Review comment:
   This method in the most places replaces `MinorType.values().length` occurrences, and for arrays, where it is used, are indexed using `Enum.ordinal()` method, so it should work correctly, and the existing code doesn't require updating every time when a new type is added.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362891731
 
 

 ##########
 File path: exec/vector/src/main/codegen/templates/UnionReader.java
 ##########
 @@ -31,28 +34,33 @@
 /*
  * This class is generated using freemarker and the ${.template_name} template.
  */
-@SuppressWarnings("unused")
 public class UnionReader extends AbstractFieldReader {
 
-  private BaseReader[] readers = new BaseReader[45];
+  private static final int TYPE_COUNT = Types.typeCount();
+  private BaseReader[] readers = new BaseReader[TYPE_COUNT];
   public UnionVector data;
-  
+
   public UnionReader(UnionVector data) {
     this.data = data;
   }
 
-  private static MajorType[] TYPES = new MajorType[45];
+  private static MajorType[] TYPES = new MajorType[TYPE_COUNT];
 
 Review comment:
   can be final?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
ihuzenko commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362886476
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java
 ##########
 @@ -1282,11 +1285,9 @@ public HoldingContainer visitConvertExpression(ConvertExpression e, ClassGenerat
     }
   }
 
-
-
   private class ConstantFilter extends EvalVisitor {
 
-    private Set<LogicalExpression> constantBoundaries;
+    private final Set<LogicalExpression> constantBoundaries;
 
     public ConstantFilter(Set<LogicalExpression> constantBoundaries) {
       super();
 
 Review comment:
   ```suggestion
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r363090494
 
 

 ##########
 File path: exec/java-exec/src/main/codegen/templates/UnionFunctions.java
 ##########
 @@ -68,8 +68,10 @@ public void eval() {
     }
   }
 
-  @SuppressWarnings("unused")
-  @FunctionTemplate(name = "ASSERT_${name?upper_case}", scope = FunctionTemplate.FunctionScope.SIMPLE, nulls=NullHandling.INTERNAL)
+  @FunctionTemplate(
+      name = "ASSERT_${name?upper_case}",
+      scope = FunctionTemplate.FunctionScope.SIMPLE,
+      nulls=NullHandling.INTERNAL)
 
 Review comment:
   ```suggestion
         nulls = NullHandling.INTERNAL)
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on a change in pull request #1945: DRILL-7502: Invalid codegen for typeof() with UNION
URL: https://github.com/apache/drill/pull/1945#discussion_r362777154
 
 

 ##########
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##########
 @@ -906,4 +908,16 @@ public static boolean isNullable(final MajorType type) {
         throw new UnsupportedOperationException("Unexpected/unhandled DataMode value " + type.getMode());
     }
   }
+
+  /**
+   * The number of minor types. Actually, the largest minor type ordinal.
+   * (There are holes in the ordering.) Update this if a new type
+   * is added. Use this value when allocating arrays to be indexed
+   * by minor type.
+   *
+   * @return the maximum minor type ordinal
+   */
+  public static final int typeCount() {
 
 Review comment:
   I think there is an issue with misusing `getNumber()` field in places where `ordinal()` should be used. It should guarantee that we will use indexes less than `MinorType.values().length`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services