Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/05/18 18:18:04 UTC

[jira] [Created] (DRILL-5524) Remove no-op projects from query plan

Paul Rogers created DRILL-5524:
----------------------------------

             Summary: Remove no-op projects from query plan
                 Key: DRILL-5524
                 URL: https://issues.apache.org/jira/browse/DRILL-5524
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.10.0
            Reporter: Paul Rogers
            Priority: Minor


Consider a very simple query using the mock data source:

{code}
SELECT id_i, name_s10 FROM `mock`.`employees_10K`
{code}

This asks the mock data source to create two columns, one int and one varchar of length 10, and to fill them with random data for 10,000 records.

The query simply passes the columns directly from the input to the client.

Yet, the query plan includes a "no-op" project:

{code}
  "graph" : [ {
    "pop" : "mock-scan",
    "@id" : 2, ...
  }, {
    "pop" : "project",
    "@id" : 1,
    "exprs" : [ {
      "ref" : "`id_i`",
      "expr" : "`id_i`"
    }, {
      "ref" : "`name_s10`",
      "expr" : "`name_s10`"
    } ], ...
  }, {
    "pop" : "screen",
    "@id" : 0, ...
  } ]
}
{code}

When executed, the project operator generates code that does nothing:

{code}
public class ProjectorGen0 extends ProjectorTemplate {
    public void doEval(int inIndex, int outIndex)
        throws SchemaChangeException
    { }

    public void doSetup(FragmentContext context, RecordBatch incoming, RecordBatch outgoing)
        throws SchemaChangeException
    { }
}
{code}

Yet the project operator still steps through each row, even though the generated code does nothing per record:

{code}
      for (i = startIndex; i < startIndex + recordCount; i++, firstOutputIndex++) {
        try {
          doEval(i, firstOutputIndex);
        } ...
      }
{code}
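As a rough illustration, the loop above could be guarded so that it runs only when some expression needs per-row work. This is a minimal sketch, not Drill's actual code: the class `ProjectLoopSketch`, the `RowEvaluator` interface, and the `rowLevelExprCount` parameter are all hypothetical names, and it assumes the operator already knows how many expressions could not be handled as whole-vector transfers.

```java
// Hypothetical sketch: these names are illustrative, not Drill's actual classes.
class ProjectLoopSketch {

    /** Stand-in for the generated per-row evaluator (doEval in the issue). */
    interface RowEvaluator {
        void doEval(int inIndex, int outIndex);
    }

    /**
     * Runs the per-row loop only when at least one expression needs
     * per-row evaluation. Returns the number of doEval calls made.
     *
     * rowLevelExprCount is an assumed input: the number of projected
     * expressions that are NOT simple whole-vector transfers.
     */
    static int projectRecords(int startIndex, int recordCount, int firstOutputIndex,
                              int rowLevelExprCount, RowEvaluator evaluator) {
        if (rowLevelExprCount == 0) {
            // Every column moves as a whole vector, so there is nothing to do per row.
            return 0;
        }
        int calls = 0;
        for (int i = startIndex; i < startIndex + recordCount; i++, firstOutputIndex++) {
            evaluator.doEval(i, firstOutputIndex);
            calls++;
        }
        return calls;
    }
}
```

With `rowLevelExprCount` equal to zero, as in the two-column query here, the 10,000 empty `doEval` calls are skipped entirely.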

The request is to both:

1. Skip the per-record loop if all transfers are at the vector level, and
2. Omit the entire project step if nothing changes.
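For the second item, one possible test for "nothing changes" is to compare the project's "exprs" list against the columns of its input. The sketch below is only an illustration under stated assumptions: `NoOpProjectSketch`, `NamedExpr`, and `isNoOp` are hypothetical names, column references are compared as plain strings, and the input's column list is assumed to be known at planning time. A real implementation would work on the planner's own expression and schema types.

```java
import java.util.List;

// Hypothetical sketch: these names are illustrative, not Drill's planner API.
class NoOpProjectSketch {

    /** One "ref"/"expr" pair from the project's "exprs" list in the plan. */
    static final class NamedExpr {
        final String ref;
        final String expr;

        NamedExpr(String ref, String expr) {
            this.ref = ref;
            this.expr = expr;
        }
    }

    /**
     * Returns true when the project emits exactly the input columns,
     * in the same order, with no renaming and no computation.
     */
    static boolean isNoOp(List<String> inputColumns, List<NamedExpr> exprs) {
        if (inputColumns.size() != exprs.size()) {
            return false; // columns were added or dropped
        }
        for (int i = 0; i < exprs.size(); i++) {
            NamedExpr e = exprs.get(i);
            boolean sameName = e.ref.equals(e.expr);                 // no rename, no computation
            boolean sameOrder = e.expr.equals(inputColumns.get(i));  // no reordering
            if (!sameName || !sameOrder) {
                return false;
            }
        }
        return true;
    }
}
```

For the plan shown earlier, the two pairs (`id_i`, `id_i`) and (`name_s10`, `name_s10`) would pass this check if the scan emits those two columns in that order, so the project could be dropped and the scan connected directly to the screen.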


