Posted to issues@drill.apache.org by "Paul Rogers (Jira)" <ji...@apache.org> on 2019/11/20 07:00:00 UTC

[jira] [Commented] (DRILL-7451) Planner inserts project node even if scan handles project push-down

    [ https://issues.apache.org/jira/browse/DRILL-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978120#comment-16978120 ] 

Paul Rogers commented on DRILL-7451:
------------------------------------

The actual behavior appears to be a bit more complex. Run the same test as above, with the same query, but now mark the plugin as *not* supporting projection pushdown. In this case the plan contains two projects. This suggests that the project above is added for a different reason, but it is still trivial and should be removed.

Logical plan with scan project pushdown disabled:

{code:json}
  "graph" : [ {
    "pop" : "DummyGroupScan",
    "@id" : 3,
    "columns" : [ "`**`" ],
    "userName" : "progers",
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "project",
    "@id" : 2,
    "exprs" : [ {
      "ref" : "`a`",
      "expr" : "`a`"
    }, {
      "ref" : "`b`",
      "expr" : "`b`"
    }, {
      "ref" : "`c`",
      "expr" : "`c`"
    } ],
    "child" : 3,
    "outputProj" : true,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "project",
    "@id" : 1,
    "exprs" : [ {
      "ref" : "`a`",
      "expr" : "`a`"
    }, {
      "ref" : "`b`",
      "expr" : "`b`"
    }, {
      "ref" : "`c`",
      "expr" : "`c`"
    } ],
    "child" : 2,
    "outputProj" : true,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "screen",
    "@id" : 0,
    "child" : 1,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  } ]
{code}
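
To make "trivial" concrete, here is a minimal sketch of the check one might apply to a project such as {{@id 1}} in the plan above. It is not Drill's or Calcite's actual rule: {{NamedExpr}} and {{TrivialProjectCheck}} are hypothetical names invented for illustration, and columns are modeled as plain strings. The assumption is that a project is trivial when every {{ref}} equals its {{expr}} and the projected columns are exactly the input columns, in the same order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one entry in a project's "exprs" list.
final class NamedExpr {
  final String ref;   // output column name
  final String expr;  // expression; here always a plain column reference

  NamedExpr(String ref, String expr) {
    this.ref = ref;
    this.expr = expr;
  }
}

// Hypothetical helper; not part of Drill or Calcite.
final class TrivialProjectCheck {

  // Returns true only if the project renames nothing, computes nothing,
  // and emits exactly the input columns in the same order.
  static boolean isTrivial(List<NamedExpr> exprs, List<String> inputColumns) {
    if (exprs.size() != inputColumns.size()) {
      return false;
    }
    for (int i = 0; i < exprs.size(); i++) {
      NamedExpr e = exprs.get(i);
      if (!e.ref.equals(e.expr) || !e.expr.equals(inputColumns.get(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<NamedExpr> exprs = Arrays.asList(
        new NamedExpr("a", "a"), new NamedExpr("b", "b"), new NamedExpr("c", "c"));
    // Upper project over the lower project's output (a, b, c).
    System.out.println(isTrivial(exprs, Arrays.asList("a", "b", "c")));
    // Lower project over a wildcard scan: the input columns are not known.
    System.out.println(isTrivial(exprs, Arrays.asList("**")));
  }
}
```

Under this definition the upper project is trivial relative to the output of {{@id 2}}, while a project directly above a {{`**`}} scan is not provably trivial, because the planner cannot see which columns the scan will actually return.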


> Planner inserts project node even if scan handles project push-down
> -------------------------------------------------------------------
>
>                 Key: DRILL-7451
>                 URL: https://issues.apache.org/jira/browse/DRILL-7451
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Paul Rogers
>            Priority: Minor
>
> I created a "dummy" storage plugin for testing. The test does a simple query:
> {code:sql}
> SELECT a, b, c from dummy.myTable
> {code}
> The first test is to mark the plugin's group scan as supporting projection push down. However, Drill still creates a projection node in the logical plan:
> {code:json}
>   "graph" : [ {
>     "pop" : "DummyGroupScan",
>     "@id" : 2,
>     "columns" : [ "`**`" ],
>     "userName" : "progers",
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   }, {
>     "pop" : "project",
>     "@id" : 1,
>     "exprs" : [ {
>       "ref" : "`a`",
>       "expr" : "`a`"
>     }, {
>       "ref" : "`b`",
>       "expr" : "`b`"
>     }, {
>       "ref" : "`c`",
>       "expr" : "`c`"
>     } ],
>     "child" : 2,
>     "outputProj" : true,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   }, {
>     "pop" : "screen",
>     "@id" : 0,
>     "child" : 1,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   } ]
> {code}
> There is [a comment in the code|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjectIntoScanRule.java#L109] that suggests the project should be removed:
> {code:java}
>         // project above scan may be removed in ProjectRemoveRule for
>         // the case when it is trivial
> {code}
> As shown in the example, the project is trivial. There is a subtlety: it may be that the scan, unknown to the planner, produces additional columns, say {{d}} and {{e}}, which the project operator is needed to remove.
> If this is the reason the project remains, perhaps we can add a flag of some kind where the group scan can insist that not only does it handle projection, it will not insert additional columns. At that point, the project is completely unnecessary in this case.
> This is not a functional bug; just a performance issue: we exercise the machinery of the project operator to do exactly nothing.
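
The flag proposed in the description could look roughly like the sketch below. Everything here is an assumption made for illustration: {{ScanCapabilities}}, {{emitsOnlyRequestedColumns}}, and {{ProjectElision}} are hypothetical names, not Drill's actual group scan API, and columns are modeled as plain strings.

```java
import java.util.List;

// Hypothetical interface; not Drill's actual GroupScan API.
interface ScanCapabilities {
  // The scan can restrict its output to a requested column list.
  boolean supportsProjectPushDown();

  // Proposed flag: the scan promises to emit only the requested columns,
  // with no extras such as d or e.
  boolean emitsOnlyRequestedColumns();
}

// Hypothetical planner-side helper.
final class ProjectElision {

  // The project can be dropped only when the scan handles projection,
  // guarantees no additional columns, and the project selects exactly the
  // columns requested from the scan, in the same order.
  static boolean canDropProject(ScanCapabilities scan,
                                List<String> requestedFromScan,
                                List<String> projected) {
    return scan.supportsProjectPushDown()
        && scan.emitsOnlyRequestedColumns()
        && requestedFromScan.equals(projected);
  }
}
```

With such a flag, a scan that handles projection but may still return extra columns would keep its project, while a scan that sets both capabilities would let the planner skip it.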


