You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by sudheeshkatkam <gi...@git.apache.org> on 2015/10/09 00:56:16 UTC

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

GitHub user sudheeshkatkam opened a pull request:

    https://github.com/apache/drill/pull/193

    DRILL-3623: Use shorter query path for LIMIT 0 queries on schema-ed tables

    Initial patch.
    
    DrillTable#providesDeferredSchema function is used by the NonDeferredSchemaTableLimit0Visitor to check if the table can provide schema directly, and if so the result is directly returned.
    
    It seems the shorter query path for this query needs a hacky "otherPlan" in the DefaultSqlHandler without major refactoring (Should I go ahead and make changes?). This also means that "EXPLAIN PLAN ..." returns a plan that is different the actual query plan (without a check in ExplainHandler, another hack).
    
    I think the classes need more meaningful names (NonDeferredSchemaTableLimit0Visitor).
    
    Also, note the type conversion using CALCITE_TO_DRILL_TYPE_MAPPING.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sudheeshkatkam/drill DRILL-3623

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/193.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #193
    
----
commit a766c54b34697df8b851204705ea1ce16c7114b7
Author: Sudheesh Katkam <sk...@maprtech.com>
Date:   2015-10-08T22:38:00Z

    DRILL-3623: Use shorter query path for LIMIT 0 queries on schema-ed tables

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152018974
  
    Got it. Thanks for the explanation. So this is a hack until we can solve those issues.
    
    I think we have to do this work, however. a 1-2 second response to a limit 0 query is too much. We should open up jiras for all of these inconsistency issues and then get Calcite and Drill in alignment. 
    
    What do you think we're talking about: aggregation outputs, implicit casting. What else? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-191973058
  
    Moving to #405.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by julianhyde <gi...@git.apache.org>.
Github user julianhyde commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r43316183
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java ---
    @@ -46,6 +51,32 @@ public static boolean containsLimit0(RelNode rel) {
         return visitor.isContains();
       }
     
    +  public static DrillRel addLimitOnTopOfLeafNodes(final DrillRel rel) {
    +    final RelShuttleImpl shuttle = new RelShuttleImpl() {
    +
    +      private RelNode addLimitAsParent(RelNode node) {
    +        final RexBuilder builder = node.getCluster().getRexBuilder();
    +        final RexLiteral offset = builder.makeExactLiteral(BigDecimal.ZERO);
    +        final RexLiteral fetch = builder.makeExactLiteral(BigDecimal.ZERO);
    +        return new DrillLimitRel(node.getCluster(), node.getTraitSet(), node, offset, fetch);
    --- End diff --
    
    Agree with @jinfengni. In more recent versions of Calcite, use RelBuilder.limit() or .sortLimit(). The RelBuilder will be configured to create the appropriate Drill variants of all RelNodes. It might also do some useful canonization/optimization. We recommend using RelBuilder for most tasks involving creating RelNodes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45010657
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    Sounds like that the DrillDirectScanRel approach works only when "limit 0" is on the root level; it would not apply when "limit 0" is inside a subquery. 
    
    The Values approach may apply for the case "limit 0" in a subquery. Certainly, this JIRA only targets the case of "limit 0" on root. But with the VALUES approach, there is room for improvement for a more general scenarios. 
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45008717
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    Should I stick to the current impl?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152034091
  
    Sudheesh and I feel this new approach is more like a big optimization step towards solving the performance issue for "limit 0" query, rather than hack solution :  1) It shows quite significantly reduction in query time, from hundreds of seconds to couple of seconds in some cases. That's a big improvement. 2) it would benefit not only schema-based query, but also schema-less query, while the original approach would only apply for schema-based query. 
    
    I agree we should continue to optimize "limit 0" query. But for now, I think this new approach has its own merits.
    
    The aggregation /implicit casting are the two things that I can think of, if we go with the schema-based approach. 
     



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45162213
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    The `getValuesRelIfFullySchemaed(...)` check is done before logical transformation to avoid creating expensive objects while applying rules (unless rules are ordered). For example, DrillScanRule creates a DrillScanRel object which constructs a group scan object that can be quite expensive (see HiveScan, MongoGroupScan, HbaseGroupScan).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r43314473
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java ---
    @@ -46,6 +51,32 @@ public static boolean containsLimit0(RelNode rel) {
         return visitor.isContains();
       }
     
    +  public static DrillRel addLimitOnTopOfLeafNodes(final DrillRel rel) {
    +    final RelShuttleImpl shuttle = new RelShuttleImpl() {
    +
    +      private RelNode addLimitAsParent(RelNode node) {
    +        final RexBuilder builder = node.getCluster().getRexBuilder();
    +        final RexLiteral offset = builder.makeExactLiteral(BigDecimal.ZERO);
    +        final RexLiteral fetch = builder.makeExactLiteral(BigDecimal.ZERO);
    +        return new DrillLimitRel(node.getCluster(), node.getTraitSet(), node, offset, fetch);
    +      }
    +
    +      @Override
    +      public RelNode visit(TableScan scan) {
    --- End diff --
    
    You also need override visitValues, since Value could be leaf operator as well, in addition to TableScan. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r43339305
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java ---
    @@ -46,6 +51,32 @@ public static boolean containsLimit0(RelNode rel) {
         return visitor.isContains();
       }
     
    +  public static DrillRel addLimitOnTopOfLeafNodes(final DrillRel rel) {
    +    final RelShuttleImpl shuttle = new RelShuttleImpl() {
    +
    +      private RelNode addLimitAsParent(RelNode node) {
    +        final RexBuilder builder = node.getCluster().getRexBuilder();
    +        final RexLiteral offset = builder.makeExactLiteral(BigDecimal.ZERO);
    +        final RexLiteral fetch = builder.makeExactLiteral(BigDecimal.ZERO);
    +        return new DrillLimitRel(node.getCluster(), node.getTraitSet(), node, offset, fetch);
    --- End diff --
    
    Thank you Julian, RelBuilder seems perfect for this case.
    
    Jinfeng, for now, making this visitor more general and using RelBuilder needs bigger changes, so I am adding a TODO(DRILL-3993).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152047328
  
    Interesting. Can you explain where the time is coming from? It isn't clear to me why this will have a big impact over what we had before. While you're pushing the limit down to just above the scan nodes, we already had an optimization which avoided parallelization. Since we're pipelined this really shouldn't matter much. Is limit zero not working right in the limit operator? It should terminate upon receiving schema, not wait until a batch of actual records (I'm wondering if it is doing the latter). Is sending zero records through causing operators to skip compilation? In what cases was this change taking something from hundreds of seconds to a few seconds? I'm asking these questions so I can better understand as I want to make sure there isn't a bug somewhere else. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41628985
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java ---
    @@ -162,6 +166,9 @@ public PhysicalPlan getPlan(SqlNode sqlNode) throws ValidationException, RelConv
     
         log("Optiq Logical", queryRelNode, logger);
         DrillRel drel = convertToDrel(queryRelNode, validatedRowType);
    +    if (otherPlan != null) {
    --- End diff --
    
    This is a hack. Let's create a relnode which is equivalent to using a pojorecord reader and then when we create the plan, it will effectively be a direct plan.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-151987787
  
    Please modify the title of JIRA DRILL-3623, since the new pull request is using a completely different approach to address the performance issue for "LIMIT 0".


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45406832
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java ---
    @@ -36,16 +48,62 @@
      * executing a schema-only query.
      */
     public class FindLimit0Visitor extends RelShuttleImpl {
    -  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(FindLimit0Visitor.class);
    +//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(FindLimit0Visitor.class);
    +
    +  /**
    +   * Values in the {@link DrillConstExecutor#DRILL_TO_CALCITE_TYPE_MAPPING} map.
    +   * + without {@link SqlTypeName#ANY} (avoid late binding)
    +   * + without {@link SqlTypeName#VARBINARY} ({@link DrillValuesRel values} does not support this)
    +   * + with {@link SqlTypeName#CHAR} ({@link DrillValuesRel values} supports this, but the above mapping does not
    +   *   contain this type.
    +   */
    +  private static final ImmutableSet<SqlTypeName> TYPES = ImmutableSet.<SqlTypeName>builder()
    +      .add(SqlTypeName.INTEGER, SqlTypeName.BIGINT, SqlTypeName.FLOAT, SqlTypeName.DOUBLE, SqlTypeName.VARCHAR,
    +          SqlTypeName.BOOLEAN, SqlTypeName.DATE, SqlTypeName.DECIMAL, SqlTypeName.TIME, SqlTypeName.TIMESTAMP,
    +          SqlTypeName.INTERVAL_YEAR_MONTH, SqlTypeName.INTERVAL_DAY_TIME, SqlTypeName.MAP, SqlTypeName.ARRAY,
    +          SqlTypeName.TINYINT, SqlTypeName.SMALLINT, SqlTypeName.CHAR)
    +      .build();
     
       private boolean contains = false;
     
    +  /**
    +   * Checks if the root portion of the RelNode tree contains a limit 0 pattern.
    +   */
       public static boolean containsLimit0(RelNode rel) {
         FindLimit0Visitor visitor = new FindLimit0Visitor();
         rel.accept(visitor);
         return visitor.isContains();
       }
     
    +  /**
    +   * If all field types of the given node are {@link #TYPES recognized types}, then this method returns the tree:
    +   *   DrillLimitRel(0)
    +   *     \
    +   *     DrillValuesRel(field types)
    +   * Otherwise, the method returns null.
    +   */
    +  public static DrillRel getValuesRelIfFullySchemaed(final RelNode rel) {
    +    final List<RelDataTypeField> fieldList = rel.getRowType().getFieldList();
    +    final ImmutableList.Builder<RexLiteral> tupleBuilder = new ImmutableList.Builder<>();
    +    final RexBuilder literalBuilder = new RexBuilder(rel.getCluster().getTypeFactory());
    +    for (final RelDataTypeField field : fieldList) {
    +      if (!TYPES.contains(field.getType().getSqlTypeName())) {
    +        return null;
    +      } else {
    +        tupleBuilder.add((RexLiteral) literalBuilder.makeLiteral(null, field.getType(), false));
    +      }
    +    }
    +
    +    final ImmutableList<ImmutableList<RexLiteral>> tuples = new ImmutableList.Builder<ImmutableList<RexLiteral>>()
    +        .add(tupleBuilder.build())
    +        .build();
    +    final RelTraitSet traits = rel.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
    +    // TODO: ideally, we want the limit to be pushed into values
    +    final DrillValuesRel values = new DrillValuesRel(rel.getCluster(), rel.getRowType(), tuples, traits);
    --- End diff --
    
    FindLimit0Visitor.containsLimit0(relNode) is called when relNode is LogicalRel.  Can we create LogicalValue with empty list of tuple, in stead of creating DrillValues with one tuple followed by a DrillLimit(0) ? 
    
    I think LogicalValue would allow an empty list of tuple, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152056732
  
    I'm sorry to say that I'm -1 on this change
    
    It seems to be adding a planning rewrite rule where there should be a simple fix execution bug. Let's just fix the execution bug. 
    
    Limit 0 should complete its execution the moment it receives a schema (as part of fast schema). It doesn't need to receive any records. You just described a situation where it is waiting for records from a blocking operator. That shouldn't be the case. If there is some other real benefit to this change after that execution bug is fixed, let's revisit in that light.
    
    If you think I'm misunderstanding your description of the execution behavior or the dynamics involved, please help me to better understand. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45408655
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java ---
    @@ -36,16 +48,62 @@
      * executing a schema-only query.
      */
     public class FindLimit0Visitor extends RelShuttleImpl {
    -  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(FindLimit0Visitor.class);
    +//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(FindLimit0Visitor.class);
    +
    +  /**
    +   * Values in the {@link DrillConstExecutor#DRILL_TO_CALCITE_TYPE_MAPPING} map.
    +   * + without {@link SqlTypeName#ANY} (avoid late binding)
    +   * + without {@link SqlTypeName#VARBINARY} ({@link DrillValuesRel values} does not support this)
    +   * + with {@link SqlTypeName#CHAR} ({@link DrillValuesRel values} supports this, but the above mapping does not
    +   *   contain this type.
    +   */
    +  private static final ImmutableSet<SqlTypeName> TYPES = ImmutableSet.<SqlTypeName>builder()
    +      .add(SqlTypeName.INTEGER, SqlTypeName.BIGINT, SqlTypeName.FLOAT, SqlTypeName.DOUBLE, SqlTypeName.VARCHAR,
    +          SqlTypeName.BOOLEAN, SqlTypeName.DATE, SqlTypeName.DECIMAL, SqlTypeName.TIME, SqlTypeName.TIMESTAMP,
    +          SqlTypeName.INTERVAL_YEAR_MONTH, SqlTypeName.INTERVAL_DAY_TIME, SqlTypeName.MAP, SqlTypeName.ARRAY,
    +          SqlTypeName.TINYINT, SqlTypeName.SMALLINT, SqlTypeName.CHAR)
    +      .build();
     
       private boolean contains = false;
     
    +  /**
    +   * Checks if the root portion of the RelNode tree contains a limit 0 pattern.
    +   */
       public static boolean containsLimit0(RelNode rel) {
         FindLimit0Visitor visitor = new FindLimit0Visitor();
         rel.accept(visitor);
         return visitor.isContains();
       }
     
    +  /**
    +   * If all field types of the given node are {@link #TYPES recognized types}, then this method returns the tree:
    +   *   DrillLimitRel(0)
    +   *     \
    +   *     DrillValuesRel(field types)
    +   * Otherwise, the method returns null.
    +   */
    +  public static DrillRel getValuesRelIfFullySchemaed(final RelNode rel) {
    +    final List<RelDataTypeField> fieldList = rel.getRowType().getFieldList();
    +    final ImmutableList.Builder<RexLiteral> tupleBuilder = new ImmutableList.Builder<>();
    +    final RexBuilder literalBuilder = new RexBuilder(rel.getCluster().getTypeFactory());
    +    for (final RelDataTypeField field : fieldList) {
    +      if (!TYPES.contains(field.getType().getSqlTypeName())) {
    +        return null;
    +      } else {
    +        tupleBuilder.add((RexLiteral) literalBuilder.makeLiteral(null, field.getType(), false));
    +      }
    +    }
    +
    +    final ImmutableList<ImmutableList<RexLiteral>> tuples = new ImmutableList.Builder<ImmutableList<RexLiteral>>()
    +        .add(tupleBuilder.build())
    +        .build();
    +    final RelTraitSet traits = rel.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
    +    // TODO: ideally, we want the limit to be pushed into values
    +    final DrillValuesRel values = new DrillValuesRel(rel.getCluster(), rel.getRowType(), tuples, traits);
    --- End diff --
    
    I did this to avoid logical transformation completely, as this is an exceptional case.
    
    LogicalValues [allows an empty list of tuples](https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/logical/LogicalValues.java#L101) but as Jacques pointed out Drill does not handle that well (we use data to encode types).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45406939
  
    --- Diff: contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestHiveStorage.java ---
    @@ -42,6 +43,25 @@ public void hiveReadWithDb() throws Exception {
       }
     
       @Test
    +  public void simpleLimitZero() throws Exception {
    +    testBuilder()
    +        .sqlQuery("select * from hive.kv limit 0")
    +        .expectsEmptyResultSet()
    +        .baselineColumns("key", "value")
    --- End diff --
    
    In the test with schemed "limit 0" case, I think we had better verify the column type as well. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45150593
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    My note was in reference to DrillReduceExpressionsRule.
    
    Basically, you could be able to modify this classes implementations of createEmptyRelOrEquivalent() to switch to a values (with fake data) operator followed by a limit(0). At least that was the thought.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r44852176
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    Can we just use the VALUES operator instead? I think that Calcite already has the code to do this in the reduceexpressionsrule.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-158212142
  
    Overall, looks good to me.  +1



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152053724
  
    I think I see the source of confusion (sorry); this patch does not address that query in the JIRA, which is why Jinfeng asked me to change the title in one of his comments. Regarding that query, DRILL-3921 helps avoids most of the execution time, but we still incur the planning time. And my initial approach address this issue but as mentioned above, this is blocked by DRILL-2288 and other things.
    
    The new approach actually addresses any query that has a limit 0 above a blocking operator that consumes all records. And avoiding parallelization made the query much worse. (Actually, was fast-schema supposed to still kick in? Did not seem like it from my experiments.)
    
    I tested against a query like `SELECT * FROM (SELECT COUNT(DISTINCT a), COUNT(DISTINCT b), COUNT(DISTINCT c) FROM very_large_table) T LIMIT 0` and this completed two orders of magnitude faster.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152034356
  
    Also, on the execution side, I was actually hitting [DRILL-2288](https://issues.apache.org/jira/browse/DRILL-2288), where sending exactly one batch with schema and without data is not handled correctly by various RecordBatches. With a fix for that issue, we could add further optimization for schemaed tables (i.e. add the previous implementation) with this implementation as the fallback.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41629260
  
    --- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/DrillHiveTable.java ---
    @@ -179,4 +179,9 @@ private void throwUnsupportedHiveDataTypeError(String hiveType) {
     
         throw new RuntimeException(errMsg.toString());
       }
    +
    +  @Override
    +  public boolean providesDeferredSchema() {
    --- End diff --
    
    or maybe simply
    
    isFullySchemaed()


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41682242
  
    --- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/DrillHiveTable.java ---
    @@ -179,4 +179,9 @@ private void throwUnsupportedHiveDataTypeError(String hiveType) {
     
         throw new RuntimeException(errMsg.toString());
       }
    +
    +  @Override
    +  public boolean providesDeferredSchema() {
    --- End diff --
    
    I think DrillHiveTable or other schema-aware table would return RelRecordType (in Calcite), when calling typeFactory.createStructType() as RowType.
    
    For DrillDynamicTable, which is schema-less, the RowType would be RelDataTypeDrillImpl.
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-157839747
  
    Does the latest patch look fine? Also, please see [the note in FindLimit0Visitor](https://github.com/apache/drill/pull/193/files#diff-6d8243a8f379ee0041618329fb132ca6R54).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45090762
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    One other note on the Calcite rule: it seems like we should just modify the Calcite mainline rule to avoid applying the zero records values operator optimization in the case that a column is not yet bound to a type (is an ANY column). That way we can stop maintaining our version of the rule. @jaltekruse, thoughts?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45090426
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    There is one challenge with the Values execution in Drill. We use data to encode types (and generate the vectors). It seems like the ideal would be expressing a values operation that has no data. Maybe we should just support a local limit in the values operator? That would allow us to bypass adding the limit(0) and sv remover for the simple case. Generally, we should probably support leaf node limit pushdown anyway. I see the new patch takes a different approach to the one above. One of the things that seemed to be an issue above is that the Limit operator was properly terminating its parents in the fast schema case of a limit 0. @sudheeshkatkam and @jinfengni, do you agree that is an issue? If it is, we should get a JIRA opened for it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41629498
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/NonDeferredSchemaTableLimit0Visitor.java ---
    @@ -0,0 +1,126 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p/>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p/>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.sql.handlers;
    +
    +import com.google.common.collect.Lists;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.RelShuttleImpl;
    +import org.apache.calcite.rel.core.TableScan;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.calcite.sql.type.SqlTypeFactoryImpl;
    +import org.apache.calcite.sql.type.SqlTypeName;
    +import org.apache.drill.common.exceptions.ExecutionSetupException;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.common.types.TypeProtos;
    +import org.apache.drill.exec.exception.SchemaChangeException;
    +import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.ops.OperatorContext;
    +import org.apache.drill.exec.physical.impl.OutputMutator;
    +import org.apache.drill.exec.planner.logical.DrillConstExecutor;
    +import org.apache.drill.exec.planner.logical.DrillScanRel;
    +import org.apache.drill.exec.planner.logical.DrillTable;
    +import org.apache.drill.exec.planner.types.DrillRelDataTypeSystem;
    +import org.apache.drill.exec.record.MaterializedField;
    +import org.apache.drill.exec.store.AbstractRecordReader;
    +import org.apache.drill.exec.store.RecordReader;
    +
    +import java.util.List;
    +import java.util.Map;
    +
    +/**
    + *
    --- End diff --
    
    A little more explanation in this javadoc would be good :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41629465
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/NonDeferredSchemaTableLimit0Visitor.java ---
    @@ -0,0 +1,126 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p/>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p/>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.sql.handlers;
    +
    +import com.google.common.collect.Lists;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.RelShuttleImpl;
    +import org.apache.calcite.rel.core.TableScan;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.calcite.sql.type.SqlTypeFactoryImpl;
    +import org.apache.calcite.sql.type.SqlTypeName;
    +import org.apache.drill.common.exceptions.ExecutionSetupException;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.common.types.TypeProtos;
    +import org.apache.drill.exec.exception.SchemaChangeException;
    +import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.ops.OperatorContext;
    +import org.apache.drill.exec.physical.impl.OutputMutator;
    +import org.apache.drill.exec.planner.logical.DrillConstExecutor;
    +import org.apache.drill.exec.planner.logical.DrillScanRel;
    +import org.apache.drill.exec.planner.logical.DrillTable;
    +import org.apache.drill.exec.planner.types.DrillRelDataTypeSystem;
    +import org.apache.drill.exec.record.MaterializedField;
    +import org.apache.drill.exec.store.AbstractRecordReader;
    +import org.apache.drill.exec.store.RecordReader;
    +
    +import java.util.List;
    +import java.util.Map;
    +
    +/**
    + *
    + */
    +public class NonDeferredSchemaTableLimit0Visitor extends RelShuttleImpl {
    +//  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(NonDeferredSchemaTableLimit0Visitor.class);
    +
    +  private RecordReader reader;
    +
    +  public static RecordReader getReader(RelNode node) {
    +    final NonDeferredSchemaTableLimit0Visitor visitor = new NonDeferredSchemaTableLimit0Visitor();
    +    node.accept(visitor);
    +    return visitor.getReader();
    +  }
    +
    +  private NonDeferredSchemaTableLimit0Visitor() {
    +  }
    +
    +  RecordReader getReader() {
    +    return reader;
    +  }
    +
    +  @Override
    +  public RelNode visit(TableScan scan) {
    +    if (scan instanceof DrillScanRel) {
    +      final DrillTable table = ((DrillScanRel) scan).getDrillTable();
    +
    +      System.out.println("here?");
    --- End diff --
    
    yes?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41908422
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/NonDeferredSchemaTableLimit0Visitor.java ---
    @@ -0,0 +1,126 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p/>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p/>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.sql.handlers;
    +
    +import com.google.common.collect.Lists;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.RelShuttleImpl;
    +import org.apache.calcite.rel.core.TableScan;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.calcite.sql.type.SqlTypeFactoryImpl;
    +import org.apache.calcite.sql.type.SqlTypeName;
    +import org.apache.drill.common.exceptions.ExecutionSetupException;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.common.types.TypeProtos;
    +import org.apache.drill.exec.exception.SchemaChangeException;
    +import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.ops.OperatorContext;
    +import org.apache.drill.exec.physical.impl.OutputMutator;
    +import org.apache.drill.exec.planner.logical.DrillConstExecutor;
    +import org.apache.drill.exec.planner.logical.DrillScanRel;
    +import org.apache.drill.exec.planner.logical.DrillTable;
    +import org.apache.drill.exec.planner.types.DrillRelDataTypeSystem;
    +import org.apache.drill.exec.record.MaterializedField;
    +import org.apache.drill.exec.store.AbstractRecordReader;
    +import org.apache.drill.exec.store.RecordReader;
    +
    +import java.util.List;
    +import java.util.Map;
    +
    +/**
    + *
    --- End diff --
    
    Yes, this is a reminder for me that I should add comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45138996
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    @jinfengni I'll post the patch with Values approach.
    
    @julianhyde That makes sense.
    
    @jacques-n Since we use data to encode types in Values, pushing Limit (specially zero) into into Values requires a sizable change (?). Also, I think [there is a bug](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillValuesRel.java#L141) since Calcite allows for creating a LogicalValues [with types and without literals](https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/logical/LogicalValues.java#L101), and Drill does not handle this case.
    
    Regarding the above approach (putting a limit 0 on top of scan), I think fast schema wasn't sufficient because there was only one record batch during execution (extreme case). Right now, I do not see any specific issue there.
    
    Regarding the note, can you expand on what you mean? Change the visitor pattern to a logical rule? Or is there a logical rule that conflicts with this change, and this shorter path should be avoided?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152001143
  
    What happened to the original strategy of short circuiting on schema'd files. This approach still means we have to pay for all the operation compilations for no reason.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-156602174
  
    Updated the patch: for LIMIT 0 queries, short circuit before logical transformation if all field types are known for the root node.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45006594
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    With DrillDirectScanRel, the execution plan is: *Screen <-- Project <-- DrillDirectScanRel*. In my experiments ([TPCDS queries](https://github.com/mapr/drill-test-framework/tree/master/framework/resources/Functional/tpcds/original/hive)), the total time did not cross 0.6 s.
    
    With VALUES operator, there needs to be a LIMIT(0) operator on top. Then, the execution plan is: *Screen <-- Project <-- SVR <-- Limit <-- Values*. In my experiments, sometimes (seen twice) Project and SVR take ~0.5s to setup, and the query take 2s to complete, but this is **not reproducible.**


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by julianhyde <gi...@git.apache.org>.
Github user julianhyde commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45012029
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillDirectScanRel.java ---
    @@ -0,0 +1,111 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.logical;
    +
    +import com.google.common.collect.Iterators;
    +import org.apache.calcite.plan.RelOptCluster;
    +import org.apache.calcite.plan.RelTraitSet;
    +import org.apache.calcite.rel.AbstractRelNode;
    +import org.apache.calcite.rel.RelWriter;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.drill.common.logical.data.LogicalOperator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.planner.physical.DrillScanPrel;
    +import org.apache.drill.exec.planner.physical.PhysicalPlanCreator;
    +import org.apache.drill.exec.planner.physical.PlannerSettings;
    +import org.apache.drill.exec.planner.physical.Prel;
    +import org.apache.drill.exec.planner.physical.PrelUtil;
    +import org.apache.drill.exec.planner.physical.visitor.PrelVisitor;
    +import org.apache.drill.exec.record.BatchSchema;
    +import org.apache.drill.exec.store.direct.DirectGroupScan;
    +
    +import java.io.IOException;
    +import java.util.Iterator;
    +
    +/**
    + * Logical and physical RelNode representing a {@link DirectGroupScan}. This is not backed by a {@link DrillTable},
    + * unlike {@link DrillScanRel}.
    + */
    +public class DrillDirectScanRel extends AbstractRelNode implements DrillScanPrel, DrillRel {
    --- End diff --
    
    Yeah, I think DrillDirectScanRel is a wrong turn for this functionality, because you can't see that it is empty and reason about it. A DirectGroupScan is a runtime thing, so shouldn't be floating around at planning time.
    
    I notice that DrillValuesRel does not extend Values. If it did, it would get all of the rules in PruneEmptyRules for free - pruning away Filter, Project etc. on top of it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152013964
  
    The original approach (skipping the execution phase for limit 0 completely), actually could potentially have issues in some cases, due to the difference in Calcite rule and Drill execution rule, in terms of how type is determined.
    
    For example, sum(int) in calcite is resolved to be int, while in Drill execution, we changed to bigint. Another case is implicit cast. Currently, there are some small differences between Calcite and Drill execution. That means, if we skip the execution for limit 0, then types which are resolved in Calcite could be different from the type if the query goes through Drill execution. For BI tool like Tableau, that means the type returned from "limit 0" query and type from a second query w/o "limit 0" could be different. 
    
    If we want to avoid the above issues, we have to detect all those cases, which are painful. That's why Sudheesh and I are now more inclined to this new approach. 
     


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45408783
  
    --- Diff: contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestHiveStorage.java ---
    @@ -42,6 +43,25 @@ public void hiveReadWithDb() throws Exception {
       }
     
       @Test
    +  public void simpleLimitZero() throws Exception {
    +    testBuilder()
    +        .sqlQuery("select * from hive.kv limit 0")
    +        .expectsEmptyResultSet()
    +        .baselineColumns("key", "value")
    --- End diff --
    
    will do.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-151711063
  
    @jinfengni please review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r43315255
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java ---
    @@ -46,6 +51,32 @@ public static boolean containsLimit0(RelNode rel) {
         return visitor.isContains();
       }
     
    +  public static DrillRel addLimitOnTopOfLeafNodes(final DrillRel rel) {
    +    final RelShuttleImpl shuttle = new RelShuttleImpl() {
    +
    +      private RelNode addLimitAsParent(RelNode node) {
    +        final RexBuilder builder = node.getCluster().getRexBuilder();
    +        final RexLiteral offset = builder.makeExactLiteral(BigDecimal.ZERO);
    +        final RexLiteral fetch = builder.makeExactLiteral(BigDecimal.ZERO);
    +        return new DrillLimitRel(node.getCluster(), node.getTraitSet(), node, offset, fetch);
    --- End diff --
    
    I understand in your case, you only putDrillLimitRel. But you may want to make this Visitor more general, such that it could create any kind of LimitRel, including DrillLimitRel, LogicalLimitRel, DrillLimitPrel, etc. You can do that by define a LimitFactory, and pass to this Visitor. This's similar to what other Calcite rule would do, to make the code more general. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41628570
  
    --- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/DrillHiveTable.java ---
    @@ -179,4 +179,9 @@ private void throwUnsupportedHiveDataTypeError(String hiveType) {
     
         throw new RuntimeException(errMsg.toString());
       }
    +
    +  @Override
    +  public boolean providesDeferredSchema() {
    --- End diff --
    
    Double negative here. How about 
    hasKnownSchema()
    
    With javadoc.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41628774
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DirectPlan.java ---
    @@ -45,11 +52,15 @@ public static PhysicalPlan createDirectPlan(QueryContext context, boolean result
       public static <T> PhysicalPlan createDirectPlan(QueryContext context, T obj){
         Iterator<T> iter = (Iterator<T>) Collections.singleton(obj).iterator();
         return createDirectPlan(context.getCurrentEndpoint(), iter, (Class<T>) obj.getClass());
    +  }
     
    +  public static <T> PhysicalPlan createDirectPlan(DrillbitEndpoint endpoint, Iterator<T> iterator, Class<T> clazz) {
    +    PojoRecordReader<T> reader = new PojoRecordReader<>(clazz, iterator);
    +    return createDirectPlan(endpoint, reader, ScanStats.TRIVIAL_TABLE);
       }
    -  public static <T> PhysicalPlan createDirectPlan(DrillbitEndpoint endpoint, Iterator<T> iterator, Class<T> clazz){
    -    PojoRecordReader<T> reader = new PojoRecordReader<T>(clazz, iterator);
    -    DirectGroupScan scan = new DirectGroupScan(reader);
    +
    +  public static PhysicalPlan createDirectPlan(DrillbitEndpoint endpoint, RecordReader reader, ScanStats stats) {
    --- End diff --
    
    Let's keep the default behavior static constructor (which uses trivial) and then we don't have to update the constructor everywhere.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r45407282
  
    --- Diff: contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/TestHiveStorage.java ---
    @@ -42,6 +43,25 @@ public void hiveReadWithDb() throws Exception {
       }
     
       @Test
    +  public void simpleLimitZero() throws Exception {
    +    testBuilder()
    +        .sqlQuery("select * from hive.kv limit 0")
    +        .expectsEmptyResultSet()
    +        .baselineColumns("key", "value")
    +        .go();
    +  }
    +
    +  @Test
    +  public void simpleLimitZeroPlan() throws Exception {
    +    final String[] expectedPlan = {
    +      ".*Limit.*\n" +
    +          ".*Values.*"
    +    };
    +    final String[] excludedPlan = {};
    +    PlanTestBase.testPlanMatchingPatterns("select * from hive.kv limit 0", expectedPlan, excludedPlan);
    +  }
    +
    --- End diff --
    
    Also, you may consider adding a negative test, such as "schemed limit 0", but select clause has a function. 
    
    select substr(char_col, 1, 5) from hive.table limit 0;
    
    In such case, I assume this patch would not do the short cut, and the function return type is not known for now in Drill's planning time.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41630368
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/NonDeferredSchemaTableLimit0Visitor.java ---
    @@ -0,0 +1,126 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p/>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p/>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.sql.handlers;
    +
    +import com.google.common.collect.Lists;
    +import org.apache.calcite.rel.RelNode;
    +import org.apache.calcite.rel.RelShuttleImpl;
    +import org.apache.calcite.rel.core.TableScan;
    +import org.apache.calcite.rel.type.RelDataType;
    +import org.apache.calcite.sql.type.SqlTypeFactoryImpl;
    +import org.apache.calcite.sql.type.SqlTypeName;
    +import org.apache.drill.common.exceptions.ExecutionSetupException;
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.common.types.TypeProtos;
    +import org.apache.drill.exec.exception.SchemaChangeException;
    +import org.apache.drill.exec.expr.TypeHelper;
    +import org.apache.drill.exec.ops.OperatorContext;
    +import org.apache.drill.exec.physical.impl.OutputMutator;
    +import org.apache.drill.exec.planner.logical.DrillConstExecutor;
    +import org.apache.drill.exec.planner.logical.DrillScanRel;
    +import org.apache.drill.exec.planner.logical.DrillTable;
    +import org.apache.drill.exec.planner.types.DrillRelDataTypeSystem;
    +import org.apache.drill.exec.record.MaterializedField;
    +import org.apache.drill.exec.store.AbstractRecordReader;
    +import org.apache.drill.exec.store.RecordReader;
    +
    +import java.util.List;
    +import java.util.Map;
    +
    +/**
    + *
    + */
    +public class NonDeferredSchemaTableLimit0Visitor extends RelShuttleImpl {
    --- End diff --
    
    I think the logic here is incorrect/incomplete. We should be: 
    
    1. traverse the entire tree to determine if all the table leaves are FullySchemaed and the root portion of the tree is limit 0. 
      - For schema detection, a better possibility would be to determine if the output is fully schemaed as opposed to all the inputs are fully schemaed. Maybe this would work if we verify that all of the fields of the root node are not an ANY type. @jinfengni could provide comments here.
    2. If (1) is true, then we should read the schema of the root node of the tree
    3. Create a reader for the root node of the tree.
    
    The query here can be arbitrarily complex (many joins, group bys, etc).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on the pull request:

    https://github.com/apache/drill/pull/193#issuecomment-152059989
  
    Just to add to my comment above, if you want to do a quick call or hangout to discuss I'm more than happy to. As I said above, it is possible I am misunderstanding. If so, I'll definitely revise my objection.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by jacques-n <gi...@git.apache.org>.
Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41628637
  
    --- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/DrillHiveTable.java ---
    @@ -179,4 +179,9 @@ private void throwUnsupportedHiveDataTypeError(String hiveType) {
     
         throw new RuntimeException(errMsg.toString());
       }
    +
    +  @Override
    +  public boolean providesDeferredSchema() {
    --- End diff --
    
    I believe that there other places (the select * replacement) which depend on the same type of property. We should use a single method to determine this. @jinfengni , can you point Sudheesh in the right direction?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam closed the pull request at:

    https://github.com/apache/drill/pull/193


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

Posted by sudheeshkatkam <gi...@git.apache.org>.
Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/193#discussion_r41908298
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java ---
    @@ -162,6 +166,9 @@ public PhysicalPlan getPlan(SqlNode sqlNode) throws ValidationException, RelConv
     
         log("Optiq Logical", queryRelNode, logger);
         DrillRel drel = convertToDrel(queryRelNode, validatedRowType);
    +    if (otherPlan != null) {
    --- End diff --
    
    The DrillRel node for pojo record reader needs to have corresponding Prel and Prule, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---