You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by KulykRoman <gi...@git.apache.org> on 2017/12/08 17:19:06 UTC

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.13

GitHub user KulykRoman opened a pull request:

    https://github.com/apache/drill/pull/1066

    DRILL-3993: Changes to support Calcite 1.13

    Works with Drill-specific Calcite 1.13 from a branch: https://github.com/KulykRoman/incubator-calcite/tree/DRILL-3993.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/KulykRoman/drill DRILL-3993

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1066.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1066
    
----
commit 71d8b03d60dcbe4d46f1f12aba82354064c25ef6
Author: Roman Kulyk <ro...@gmail.com>
Date:   2017-11-30T16:19:12Z

    REVERTED:  DRILL-5089
    
    Dynamically load schema of storage plugin only when needed for every query

commit 1c3cea13c798eb831f9e9911ec144bf4824ac5ad
Author: Roman Kulyk <ro...@gmail.com>
Date:   2017-08-29T14:10:24Z

    DRILL-3993: Changes to support Calcite 1.13
    
    - fixed all compiling errors (main changes were: Maven changes, chenges RelNode -> RelRoot, implementing some new methods from updated interfaces, chenges some literals, logger changes);
    - fixed unexpected column errors, validation errors and assertion errors after Calcite update;
    - fixed describe table/schema statement according to updated logic;
    - added fixes with time-intervals;
    - changed precision of BINARY to 65536 (was 1048576) according to updated logic (Calcite overrides bigger precision to own maxPrecision);
    - ignored some incorrect tests with DRILL-3244;
    - changed "Table not found" message to "Object not found within" according to new Calcite changes.

commit 7196b366994f6ebcdcb80ca07701e06955e2b856
Author: Volodymyr Vysotskyi <vv...@gmail.com>
Date:   2017-11-03T12:18:09Z

    DRILL-3993: Fix unit test failures connected with support Calcite 1.13
    
    - Use root schema as default for describe table statement.
    Fix TestOpenTSDBPlugin.testDescribe() and TestInfoSchemaOnHiveStorage.varCharMaxLengthAndDecimalPrecisionInInfoSchema() unit tests.
    - Modify expected results for tests:
    TestPreparedStatementProvider.invalidQueryValidationError();
    TestProjectPushDown.testTPCH1();
    TestProjectPushDown.testTPCH3();
    TestStorageBasedHiveAuthorization.selectUser1_db_u0_only();
    TestStorageBasedHiveAuthorization.selectUser0_db_u1g1_only()
    - Fix TestCTAS.whenTableQueryColumnHasStarAndTableFiledListIsSpecified(), TestViewSupport.createViewWhenViewQueryColumnHasStarAndViewFiledListIsSpecified(), TestInbuiltHiveUDFs.testIf(), testDisableUtf8SupportInQueryString unit tests.
    - Fix UnsupportedOperationException and NPE for jdbc tests.
    - Fix AssertionError: Conversion to relational algebra failed to preserve datatypes
    
    *DrillCompoundIdentifier:
    According to the changes, made in [CALCITE-546], star Identifier is replaced by empty string during parsing the query. Since Drill uses its own DrillCompoundIdentifier, it should also replace star by empty string before creating SqlIdentifier instance to avoid further errors connected with star column. see SqlIdentifier.isStar() method.
    
    *SqlConverter:
    In [CALCITE-1417] added simplification of expressions which should be projected every time when a new project rel node is created using RelBuilder. It causes assertion errors connected with types nullability. This hook was set to false to avoid project expressions simplification. See usage of this hook and RelBuilder.project() method.
    
    In Drill the type nullability of the function depends on only the nullability of its arguments. In some cases, a function may return null value even if it had non-nullable arguments. When Calice simplifies expressions, it checks that the type of the result is the same as the type of the expression. Otherwise, makeCast() method is called. But when a function returns null literal, this cast does nothing, even when the function has a non-nullable type. So to avoid this issue, method makeCast() was overridden.
    
    *DrillAvgVarianceConvertlet:
    Problem with sum0 and specific changes in old Calcite (it is CALCITE-777). (see HistogramShuttle.visitCall method) Changes were made to avoid changes in Calcite.
    
    *SqlConverter, DescribeTableHandler, ShowTablesHandler:
    New Calcite tries to combine both default and specified workspaces during the query validation. In some cases, for example, when describe table statement is used, Calcite tries to find INFORMATION_SCHEMA in the schema used as default. When it does not find the schema, it tries to find a table with such name. For some storage plugins, such as opentsdb and hbase, when a table was not found, the error is thrown, and the query fails. To avoid this issue, default schema was changed to root schema for validation stage for describe table and show tables queries.

commit 56f2c9ac47d224e2ac02ab5036e7aa01df1951f2
Author: Volodymyr Vysotskyi <vv...@gmail.com>
Date:   2017-11-15T10:37:50Z

    DRILL-3993: Use custom RelBuilder implementation in rules
    
    After the changes, made in CALCITE-1056 if the filter has a predicate that is always false, RelBuilder.filter() method returns values rel node instead of filter rel node. In order to preserve column types, DrillRelBuilder.empty() method, which is returned by filter method  was overridden, and now it returns filter with a false predicate. (advice to override this method was in its javadoc) The goal of all other changes in this commit is to use our custom RelBuilder for all rules that are used in Drill.

commit 82c750aee2ad67a1e9e0227ca06d39f9362cba99
Author: Roman Kulyk <ro...@gmail.com>
Date:   2017-11-02T18:22:36Z

    DRILL-3993: Fix failed tests after Calcite update
    
    - fix temporary table errors according to updated logic;
    - fixed errors when we trying to make select from hbase table with schema name in query (example: "SELECT row_key FROM hbase.TestTableNullStr) from hbase schema (did "USE hbase" before). Added test for it;
    - added fix for views which were created on Calcite 1.4 and test for it.

----


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158103733
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSchemaFactory.java ---
    @@ -72,7 +72,16 @@ public AbstractSchema getSubSchema(String name) {
         @Override
         public Table getTable(String name) {
           HBaseScanSpec scanSpec = new HBaseScanSpec(name);
    -      return new DrillHBaseTable(schemaName, plugin, scanSpec);
    +      try {
    +        return new DrillHBaseTable(schemaName, plugin, scanSpec);
    +      } catch (Exception e) {
    +        // Calcite firstly is looking for a table in the default schema, if a table was not found,
    +        // it is looking in root schema.
    +        // If a table does not exist, a query will fail at validation stage,
    +        // so the error should not be thrown there.
    --- End diff --
    
    do you mean 'should not be thrown HERE'? The same for other places.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158117382
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SqlConverter.java ---
    @@ -470,34 +576,32 @@ public void disallowTemporaryTables() {
          * @throws UserException if temporary tables usage is disallowed
          */
         @Override
    -    public RelOptTableImpl getTable(final List<String> names) {
    -      RelOptTableImpl temporaryTable = null;
    -
    -      if (mightBeTemporaryTable(names, session.getDefaultSchemaPath(), drillConfig)) {
    -        String temporaryTableName = session.resolveTemporaryTableName(names.get(names.size() - 1));
    -        if (temporaryTableName != null) {
    -          List<String> temporaryNames = Lists.newArrayList(temporarySchema, temporaryTableName);
    -          temporaryTable = super.getTable(temporaryNames);
    +    public Prepare.PreparingTable getTable(final List<String> names) {
    +      String originalTableName = session.getOriginalTableNameFromTemporaryTable(names.get(names.size() - 1));
    +      if (originalTableName != null) {
    +        if (!allowTemporaryTables) {
    +          throw UserException
    +              .validationError()
    +              .message("Temporary tables usage is disallowed. Used temporary table name: [%s].", originalTableName)
    +              .build(logger);
             }
           }
    -      if (temporaryTable != null) {
    -        if (allowTemporaryTables) {
    -          return temporaryTable;
    +      // Fix for select from hbase table with schema name in query (example: "SELECT col FROM hbase.t)
    +      // from hbase schema (did "USE hbase" before).
    --- End diff --
    
    Could you explain why this is needed now? I think this used to work -- if a schema is not found under default, Drill falls back to the root to do the search. 
    
    What got changed thus you have to introduce this fix? 
    
    What about this test case?
    "use hbase; select t.col, t2.col2 from hbase2.t2 as t2, hbase.t as t where t.id = t2.id"



---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158258149
  
    --- Diff: exec/java-exec/src/main/codegen/data/Parser.tdd ---
    @@ -75,6 +72,26 @@
       implementationFiles: [
         "parserImpls.ftl"
       ]
    +
    +  # List of methods for parsing extensions to "CREATE [OR REPLACE]" calls.
    +  # Each must accept arguments "(SqlParserPos pos, boolean replace)".
    +  createStatementParserMethods: [
    --- End diff --
    
    Calcites `Parser.jj` uses these lists to extend existing functionality when desired methods specified inside it. If we did not specify these lists, java class `Parser.java` could not be generated and build would fail with the error:
    ```
    [ERROR] Failed to execute goal org.apache.drill.tools:drill-fmpp-maven-plugin:1.12.0-SNAPSHOT:generate (generate-fmpp) on project drill-java-exec: FMPP processing session failed.
    [ERROR] Caused by: freemarker.core.InvalidReferenceException: The following has evaluated to null or missing:
    [ERROR] ==> parser.createStatementParserMethods  [in template "Parser.jj" at line 881, column 6]
    ```


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158173488
  
    --- Diff: exec/java-exec/src/test/resources/record/test_recorditerator.json ---
    @@ -60,7 +60,7 @@
             @id:2,
             child:1,
             pop:"project",
    -        exprs:[ { ref : "`*`", expr : "`*`"} ]
    +        exprs:[ { ref : "`**`", expr : "`**`"} ]
    --- End diff --
    
    Not sure I understand this '**' thing, can you explain more about this change?


---

[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/1066
  
    Other than one comment above, rest LGTM.  +1


---

[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on the issue:

    https://github.com/apache/drill/pull/1066
  
    @amansinha100, regarding HashAggregate OOM related change, it was done in the scope of this pull request since with new Calcite a physical plan for the query was changed to the correct one but it caused an infinite loop in HashAgg operator. Therefore I made these changes in order to prevent this infinite loop for the queries that previously worked. 
    I still think that it is better to keep this change in this PR.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r161607926
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
    @@ -1303,6 +1305,8 @@ private void checkGroupAndAggrValues(int incomingRowIdx) {
           long memDiff = allocator.getAllocatedMemory() - allocatedBeforeHTput;
           if ( memDiff > 0 ) { logger.warn("Leak: HashTable put() OOM left behind {} bytes allocated",memDiff); }
     
    +      checkForSpillPossibility(currentPartition);
    --- End diff --
    
    Not sure this check 'chooseAPartitionToFlush'  is needed. If an exception is desired, I would think modifying doSpill() is better way e.g. modifying this line: "  if ( victimPartition < 0 ) { return; } " Otherwise in this process chooseAPartitionToFlush will be called twice.
    
          int victimPartition = chooseAPartitionToFlush(currentPartition, forceSpill);
    
          // In case no partition has more than one batch -- try and "push the limits"; maybe next
          // time the spill could work.
          if ( victimPartition < 0 ) { return; } 


---

[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on the issue:

    https://github.com/apache/drill/pull/1066
  
    Can you file a pull request to mapr/incubator-calcite with your changes in https://github.com/KulykRoman/incubator-calcite/commits/DrillCalcite1.15.0_rc0?
    So we can review and get these changes into official branch to publish.


---

[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/1066
  
    @vvysotskyi could you pls separate out the HashAggregate OOM related change from this PR and file a separate JIRA for it since it is unrelated to the calcite rebase per-se.  Lumping them together causes confusion.   Thanks. 


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r157854723
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/DrillRelBuilder.java ---
    @@ -0,0 +1,69 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to you under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner;
    +
    +import org.apache.calcite.plan.Context;
    +import org.apache.calcite.plan.Contexts;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelOptSchema;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.core.RelFactories;
    +import org.apache.calcite.tools.RelBuilder;
    +import org.apache.calcite.tools.RelBuilderFactory;
    +import org.apache.calcite.util.Util;
    +
    +public class DrillRelBuilder extends RelBuilder {
    +  private final RelFactories.FilterFactory filterFactory;
    +
    +  protected DrillRelBuilder(Context context, RelOptCluster cluster, RelOptSchema relOptSchema) {
    +    super(context, cluster, relOptSchema);
    +    this.filterFactory =
    +        Util.first(context.unwrap(RelFactories.FilterFactory.class),
    +            RelFactories.DEFAULT_FILTER_FACTORY);
    +  }
    +
    +  /**
    +   * Original method {@link RelBuilder#empty} returns empty values rel.
    +   * In the order to preserve dara row types, filter with false predicate is created.
    +   */
    +  @Override
    +  public RelBuilder empty() {
    --- End diff --
    
    Is a separate DrillRelBuilder absolutely needed ?  Should the original behavior of empty() in Calcite be changed to preserve the RowType of the input ? The fewer things we extend in Drill the better. 


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158260235
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SqlConverter.java ---
    @@ -470,34 +576,32 @@ public void disallowTemporaryTables() {
          * @throws UserException if temporary tables usage is disallowed
          */
         @Override
    -    public RelOptTableImpl getTable(final List<String> names) {
    -      RelOptTableImpl temporaryTable = null;
    -
    -      if (mightBeTemporaryTable(names, session.getDefaultSchemaPath(), drillConfig)) {
    -        String temporaryTableName = session.resolveTemporaryTableName(names.get(names.size() - 1));
    -        if (temporaryTableName != null) {
    -          List<String> temporaryNames = Lists.newArrayList(temporarySchema, temporaryTableName);
    -          temporaryTable = super.getTable(temporaryNames);
    +    public Prepare.PreparingTable getTable(final List<String> names) {
    +      String originalTableName = session.getOriginalTableNameFromTemporaryTable(names.get(names.size() - 1));
    +      if (originalTableName != null) {
    +        if (!allowTemporaryTables) {
    +          throw UserException
    +              .validationError()
    +              .message("Temporary tables usage is disallowed. Used temporary table name: [%s].", originalTableName)
    +              .build(logger);
             }
           }
    -      if (temporaryTable != null) {
    -        if (allowTemporaryTables) {
    -          return temporaryTable;
    +      // Fix for select from hbase table with schema name in query (example: "SELECT col FROM hbase.t)
    +      // from hbase schema (did "USE hbase" before).
    --- End diff --
    
    This change was made before the change, where we catch and log exceptions from `HBaseSchema.getTable()` method. So yes, it is not needed now.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158263816
  
    --- Diff: exec/jdbc-all/pom.xml ---
    @@ -572,7 +572,7 @@
                               This is likely due to you adding new dependencies to a java-exec and not updating the excludes in this module. This is important as it minimizes the size of the dependency of Drill application users.
     
                             </message>
    -                        <maxsize>29000000</maxsize>
    +                        <maxsize>31000000</maxsize>
    --- End diff --
    
    We didn't have problems on our local machines or test cluster connected with `maxsize` after changing it to 31000000. Which version of Maven did you use? 
    Anyway, I have increased it to 32000000.


---

[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on the issue:

    https://github.com/apache/drill/pull/1066
  
    @chunhui-shi, I have made additional fixes in new commits (commits after DRILL-3993: Changes after code review. 3120762). Could you please take a look? 
    Also, I have created pull request on incubator-calcite: https://github.com/mapr/incubator-calcite/pull/16


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r157853167
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/DrillRelBuilder.java ---
    @@ -0,0 +1,69 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to you under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner;
    +
    +import org.apache.calcite.plan.Context;
    +import org.apache.calcite.plan.Contexts;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelOptSchema;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.core.RelFactories;
    +import org.apache.calcite.tools.RelBuilder;
    +import org.apache.calcite.tools.RelBuilderFactory;
    +import org.apache.calcite.util.Util;
    +
    +public class DrillRelBuilder extends RelBuilder {
    +  private final RelFactories.FilterFactory filterFactory;
    +
    +  protected DrillRelBuilder(Context context, RelOptCluster cluster, RelOptSchema relOptSchema) {
    +    super(context, cluster, relOptSchema);
    +    this.filterFactory =
    +        Util.first(context.unwrap(RelFactories.FilterFactory.class),
    +            RelFactories.DEFAULT_FILTER_FACTORY);
    +  }
    +
    +  /**
    +   * Original method {@link RelBuilder#empty} returns empty values rel.
    +   * In the order to preserve dara row types, filter with false predicate is created.
    --- End diff --
    
    'data' spelling


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/1066


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r162161040
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/DrillRelBuilder.java ---
    @@ -0,0 +1,69 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to you under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner;
    +
    +import org.apache.calcite.plan.Context;
    +import org.apache.calcite.plan.Contexts;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelOptSchema;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.core.RelFactories;
    +import org.apache.calcite.tools.RelBuilder;
    +import org.apache.calcite.tools.RelBuilderFactory;
    +import org.apache.calcite.util.Util;
    +
    +public class DrillRelBuilder extends RelBuilder {
    +  private final RelFactories.FilterFactory filterFactory;
    +
    +  protected DrillRelBuilder(Context context, RelOptCluster cluster, RelOptSchema relOptSchema) {
    +    super(context, cluster, relOptSchema);
    +    this.filterFactory =
    +        Util.first(context.unwrap(RelFactories.FilterFactory.class),
    +            RelFactories.DEFAULT_FILTER_FACTORY);
    +  }
    +
    +  /**
    +   * Original method {@link RelBuilder#empty} returns empty values rel.
    +   * In the order to preserve dara row types, filter with false predicate is created.
    +   */
    +  @Override
    +  public RelBuilder empty() {
    --- End diff --
    
    In some cases, RowType of the input may not be known at the moment when this method is called (some of the columns or all columns were not dynamically discovered yet), therefore I had to make changes directly in Calcite to allow using of custom RelBuilder and make changes in Drill to pass custom RelBuilder into them. For more details please see [1] and [2].
    
    [1] https://lists.apache.org/list.html?dev@calcite.apache.org:lte=1y:Make%20RelBuilder.filter%28%29%20configurable
    [2] https://issues.apache.org/jira/browse/CALCITE-2043



---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158173904
  
    --- Diff: exec/jdbc-all/pom.xml ---
    @@ -572,7 +572,7 @@
                               This is likely due to you adding new dependencies to a java-exec and not updating the excludes in this module. This is important as it minimizes the size of the dependency of Drill application users.
     
                             </message>
    -                        <maxsize>29000000</maxsize>
    +                        <maxsize>31000000</maxsize>
    --- End diff --
    
    I played with this branch and I have to change the size of 31000000 to 32000000, it might due to my build environment but we may want to increase it to 32000000.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158261430
  
    --- Diff: exec/java-exec/src/test/resources/record/test_recorditerator.json ---
    @@ -60,7 +60,7 @@
             @id:2,
             child:1,
             pop:"project",
    -        exprs:[ { ref : "`*`", expr : "`*`"} ]
    +        exprs:[ { ref : "`**`", expr : "`**`"} ]
    --- End diff --
    
    This change connected with commit `[CALCITE-1150] Add dynamic record type and dynamic star for schema-on-read table`. `**` is used as the dynamic star column name prefix. It is equivalent to `*` in our Drill-Calcite 1.4.0.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r162157833
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/DrillRelBuilder.java ---
    @@ -0,0 +1,69 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to you under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner;
    +
    +import org.apache.calcite.plan.Context;
    +import org.apache.calcite.plan.Contexts;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelOptSchema;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.core.RelFactories;
    +import org.apache.calcite.tools.RelBuilder;
    +import org.apache.calcite.tools.RelBuilderFactory;
    +import org.apache.calcite.util.Util;
    +
    +public class DrillRelBuilder extends RelBuilder {
    +  private final RelFactories.FilterFactory filterFactory;
    +
    +  protected DrillRelBuilder(Context context, RelOptCluster cluster, RelOptSchema relOptSchema) {
    +    super(context, cluster, relOptSchema);
    +    this.filterFactory =
    +        Util.first(context.unwrap(RelFactories.FilterFactory.class),
    +            RelFactories.DEFAULT_FILTER_FACTORY);
    +  }
    +
    +  /**
    +   * Original method {@link RelBuilder#empty} returns empty values rel.
    +   * In the order to preserve dara row types, filter with false predicate is created.
    --- End diff --
    
    Thanks, fixed


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r161796052
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
    @@ -1303,6 +1305,8 @@ private void checkGroupAndAggrValues(int incomingRowIdx) {
           long memDiff = allocator.getAllocatedMemory() - allocatedBeforeHTput;
           if ( memDiff > 0 ) { logger.warn("Leak: HashTable put() OOM left behind {} bytes allocated",memDiff); }
     
    +      checkForSpillPossibility(currentPartition);
    --- End diff --
    
    These checks were needed to avoid infinite loop when there is not enough memory for the spill. 
    I moved these checks into `spillIfNeeded()` method, so when called `doSpill()`, `forceSpill` in `spillIfNeeded()` is true and check should be done.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by amansinha100 <gi...@git.apache.org>.
Github user amansinha100 commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r162171143
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/DrillRelBuilder.java ---
    @@ -0,0 +1,69 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to you under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner;
    +
    +import org.apache.calcite.plan.Context;
    +import org.apache.calcite.plan.Contexts;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelOptSchema;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.core.RelFactories;
    +import org.apache.calcite.tools.RelBuilder;
    +import org.apache.calcite.tools.RelBuilderFactory;
    +import org.apache.calcite.util.Util;
    +
    +public class DrillRelBuilder extends RelBuilder {
    +  private final RelFactories.FilterFactory filterFactory;
    +
    +  protected DrillRelBuilder(Context context, RelOptCluster cluster, RelOptSchema relOptSchema) {
    +    super(context, cluster, relOptSchema);
    +    this.filterFactory =
    +        Util.first(context.unwrap(RelFactories.FilterFactory.class),
    +            RelFactories.DEFAULT_FILTER_FACTORY);
    +  }
    +
    +  /**
    +   * Original method {@link RelBuilder#empty} returns empty values rel.
    +   * In the order to preserve dara row types, filter with false predicate is created.
    +   */
    +  @Override
    +  public RelBuilder empty() {
    --- End diff --
    
    Ok, thanks for clarifying.  The schema-on-read does make it difficult to avoid overriding the empty() behavior.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158106870
  
    --- Diff: exec/java-exec/src/main/codegen/data/Parser.tdd ---
    @@ -75,6 +72,26 @@
       implementationFiles: [
         "parserImpls.ftl"
       ]
    +
    +  # List of methods for parsing extensions to "CREATE [OR REPLACE]" calls.
    +  # Each must accept arguments "(SqlParserPos pos, boolean replace)".
    +  createStatementParserMethods: [
    --- End diff --
    
    these lists are empty. What is the reason we need them here?


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158258451
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSchemaFactory.java ---
    @@ -72,7 +72,16 @@ public AbstractSchema getSubSchema(String name) {
         @Override
         public Table getTable(String name) {
           HBaseScanSpec scanSpec = new HBaseScanSpec(name);
    -      return new DrillHBaseTable(schemaName, plugin, scanSpec);
    +      try {
    +        return new DrillHBaseTable(schemaName, plugin, scanSpec);
    +      } catch (Exception e) {
    +        // Calcite firstly is looking for a table in the default schema, if a table was not found,
    +        // it is looking in root schema.
    +        // If a table does not exist, a query will fail at validation stage,
    +        // so the error should not be thrown there.
    --- End diff --
    
    Yes, I meant it. Thanks, replaced.


---

[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on the issue:

    https://github.com/apache/drill/pull/1066
  
    FYI: changes in pom files:
    We observed issue with Avatica similar to the issue described in https://issues.apache.org/jira/browse/CALCITE-1694. Therefore we had to make changes in our pom files similar to the changes, made in CALCITE-1694 to fix NoClassDefFoundError.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158259594
  
    --- Diff: contrib/storage-jdbc/src/main/java/org/apache/drill/exec/store/jdbc/JdbcPrel.java ---
    @@ -62,7 +62,7 @@ public JdbcPrel(RelOptCluster cluster, RelTraitSet traitSet, JdbcIntermediatePre
             (JavaTypeFactory) getCluster().getTypeFactory());
         final JdbcImplementor.Result result =
             jdbcImplementor.visitChild(0, input.accept(new SubsetRemover()));
    -    sql = result.asQuery().toSqlString(dialect).getSql();
    +    sql = result.asSelect().toSqlString(dialect).getSql();
    --- End diff --
    
    It is also may be another `SqlCall` instance. Thanks for pointing this, replaced it by `asStatement()` method.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158120135
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSchemaFactory.java ---
    @@ -72,7 +72,16 @@ public AbstractSchema getSubSchema(String name) {
         @Override
         public Table getTable(String name) {
           HBaseScanSpec scanSpec = new HBaseScanSpec(name);
    -      return new DrillHBaseTable(schemaName, plugin, scanSpec);
    +      try {
    +        return new DrillHBaseTable(schemaName, plugin, scanSpec);
    +      } catch (Exception e) {
    +        // Calcite firstly is looking for a table in the default schema, if a table was not found,
    --- End diff --
    
    'is looking for' and 'is looking in' seems to be saying Calcite IS working in this line of code, but I think you meant that for the new version calcite, it 'looks for' something... so would like to get some rephrase here.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158258329
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseSchemaFactory.java ---
    @@ -72,7 +72,16 @@ public AbstractSchema getSubSchema(String name) {
         @Override
         public Table getTable(String name) {
           HBaseScanSpec scanSpec = new HBaseScanSpec(name);
    -      return new DrillHBaseTable(schemaName, plugin, scanSpec);
    +      try {
    +        return new DrillHBaseTable(schemaName, plugin, scanSpec);
    +      } catch (Exception e) {
    +        // Calcite firstly is looking for a table in the default schema, if a table was not found,
    --- End diff --
    
    Thanks, done.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158105572
  
    --- Diff: contrib/storage-jdbc/src/main/java/org/apache/drill/exec/store/jdbc/JdbcPrel.java ---
    @@ -62,7 +62,7 @@ public JdbcPrel(RelOptCluster cluster, RelTraitSet traitSet, JdbcIntermediatePre
             (JavaTypeFactory) getCluster().getTypeFactory());
         final JdbcImplementor.Result result =
             jdbcImplementor.visitChild(0, input.accept(new SubsetRemover()));
    -    sql = result.asQuery().toSqlString(dialect).getSql();
    +    sql = result.asSelect().toSqlString(dialect).getSql();
    --- End diff --
    
    Is the 'result' here guaranteed to be a SqlSelect?


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158374197
  
    --- Diff: exec/java-exec/src/main/codegen/includes/parserImpls.ftl ---
    @@ -351,4 +351,23 @@ SqlNode SqlDropFunction() :
        {
            return new SqlDropFunction(pos, jar);
        }
    -}
    \ No newline at end of file
    +}
    +
    +<#if !parser.includeCompoundIdentifier >
    --- End diff --
    
    Actually, it does not cause a new behaviour or functionality, it just helps to preserve old one after changes in Calcite `Parser.jj`. Therefore existing unit tests cover this change.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r161077516
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/AggPruleBase.java ---
    @@ -82,4 +83,19 @@ protected boolean create2PhasePlan(RelOptRuleCall call, DrillAggregateRel aggreg
         }
         return true;
       }
    +
    +  /**
    +   * Returns group-by keys with the remapped arguments for specified aggregate.
    +   *
    +   * @param groupSet ImmutableBitSet of aggregate rel node, whose group-by keys should be remapped.
    +   * @return {@link ImmutableBitSet} instance with remapped keys.
    +   */
    +  public static ImmutableBitSet remapGroupSet(ImmutableBitSet groupSet) {
    --- End diff --
    
    what is the reason we are going this remap with new calcite? 
    And if the result is only depended on size of groupSet, we don't really need to iterate through the groupSet.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r158335378
  
    --- Diff: exec/java-exec/src/main/codegen/includes/parserImpls.ftl ---
    @@ -351,4 +351,23 @@ SqlNode SqlDropFunction() :
        {
            return new SqlDropFunction(pos, jar);
        }
    -}
    \ No newline at end of file
    +}
    +
    +<#if !parser.includeCompoundIdentifier >
    --- End diff --
    
    do we need a test case for this newly added ParenthesizedCompoundIdentifierList?


---

[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on the issue:

    https://github.com/apache/drill/pull/1066
  
    @chunhui-shi, this branch is not ready completely. There is opened a pull request on Apache Calcite [CALCITE-2018](https://issues.apache.org/jira/browse/CALCITE-2018) which should be cherry-picked into this branch. When it is merged into Apache master, we will update both branches and create a pull request on mapr/incubator-calcite.


---

[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by vvysotskyi <gi...@git.apache.org>.
Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1066#discussion_r161185413
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/AggPruleBase.java ---
    @@ -82,4 +83,19 @@ protected boolean create2PhasePlan(RelOptRuleCall call, DrillAggregateRel aggreg
         }
         return true;
       }
    +
    +  /**
    +   * Returns group-by keys with the remapped arguments for specified aggregate.
    +   *
    +   * @param groupSet ImmutableBitSet of aggregate rel node, whose group-by keys should be remapped.
    +   * @return {@link ImmutableBitSet} instance with remapped keys.
    +   */
    +  public static ImmutableBitSet remapGroupSet(ImmutableBitSet groupSet) {
    --- End diff --
    
    After the changes in CALCITE-1930 in the class `AggregateExpandDistinctAggregatesRule`, this rule started applying more correctly, since in the older version there were checks like this:
    ```
    aggCall.getAggregation() instanceof SqlCountAggFunction
    ```
    but they were replaced by checks like this:
    ```
    final SqlKind aggCallKind = aggCall.getAggregation().getKind();
    switch (aggCallKind)
    ```
    So for the cases when instead of Calcite s `SqlCountAggFunction` were used `DrillCalciteSqlAggFunctionWrapper`s this rule changed its behaviour and instead of returning joins of distinct and non-distinct aggregate rel nodes, it returns distinct aggregate which has an input with non-distinct aggregates grouped by column, which is distinct in outer aggregate.
    
    These Drill rules were able to work correctly only for the cases when aggregate rel node does not contain ` aggCalls` and contains only the single field in `rowType` (in this case `groupSet` is always `{0}`, and it still correct for outer aggregates which are created in `*AggPrule`s)
    
    With the new version of `AggregateExpandDistinctAggregatesRule` these Drill rules received aggregate rel nodes with the non-empty lists of ` aggCalls`, therefore its `rowType` contains more than one field. But before my changes, the same `groupSet` was passed to the constructor of outer aggregate and row type of aggregate differed from the row type of its input. So it was incorrect `groupSet`. 
    Aggregate rel nodes always specify the group by columns in the first positions of the list, so correct `groupSet` for outer aggregate should be `ImmutableBitSet` with the same size as the `groupSet` of nested aggregate, but the ordinals of columns should start from 0. 
    
    As for the point of iterating through the group set, `ImmutableBitSet.size()`, `ImmutableBitSet.length()` and `ImmutableBitSet.cardinality()` does not return desired "size" of the `groupSet`. `AggregateExpandDistinctAggregatesRule` also contains similar code which iterating through the `groupSet` for similar purposes.


---

[GitHub] drill issue #1066: DRILL-3993: Changes to support Calcite 1.15

Posted by chunhui-shi <gi...@git.apache.org>.
Github user chunhui-shi commented on the issue:

    https://github.com/apache/drill/pull/1066
  
    +1. Thank you for addressing the comments.


---