Posted to issues@drill.apache.org by "Jinfeng Ni (JIRA)" <ji...@apache.org> on 2017/06/13 23:34:00 UTC

[jira] [Created] (DRILL-5585) UnionAll operator generates run-time code for every incoming batch

Jinfeng Ni created DRILL-5585:
---------------------------------

             Summary: UnionAll operator generates run-time code for every incoming batch
                 Key: DRILL-5585
                 URL: https://issues.apache.org/jira/browse/DRILL-5585
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Jinfeng Ni
            Assignee: Jinfeng Ni


In Drill's execution framework, each operator may generate run-time code for various purposes. Code generation and compilation should happen only when an incoming batch arrives with a new schema ({{OK_NEW_SCHEMA}}). For any follow-up batch with the same schema ({{OK}}), the operator should not generate the run-time code again, since the previously generated code is still available.

However, in the current implementation of UnionAll, regardless of whether the incoming batch returns {{OK_NEW_SCHEMA}} or {{OK}}, the operator always calls doWork(), which essentially 1) generates code and possibly compiles it, 2) calls doSetup, and 3) calls doEvaluation. The code generation step is unnecessary for {{OK}} batches, and repeating it for every batch significantly hurts the operator's performance and slows down query execution.

{code}
        case OK_NEW_SCHEMA:
          outputFields = unionAllInput.getOutputFields();
        case OK:
          IterOutcome workOutcome = doWork();
{code}
When the run-time code is generated multiple times, compilation can be skipped unless there is a miss in the code cache. However, the current logic is still problematic, since it still has to use {{ClassGenerator}} to generate the run-time source code for every batch.
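
One possible shape of the intended behavior is sketched below. This is a simplified, self-contained illustration and not Drill's actual UnionAll code: the class {{UnionAllSketch}}, its nested {{IterOutcome}} enum, and the methods {{generateAndSetup()}} and {{evaluate()}} are hypothetical stand-ins that split the work described above into a per-schema part and a per-batch part.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are Drill's real classes or methods.
class UnionAllSketch {
    // Simplified stand-in for the outcomes mentioned in this issue.
    enum IterOutcome { OK_NEW_SCHEMA, OK, NONE }

    private int codeGenCount = 0;      // times run-time code was generated
    private int batchesProcessed = 0;  // times a batch was evaluated
    private boolean setupDone = false;

    // Handles one incoming batch. Note there is no fall-through between cases.
    void onBatch(IterOutcome outcome) {
        switch (outcome) {
            case OK_NEW_SCHEMA:
                generateAndSetup();    // only a new schema triggers code generation
                evaluate();
                break;
            case OK:
                evaluate();            // same schema: reuse the generated code
                break;
            default:
                break;                 // nothing to process
        }
    }

    // Stand-in for code generation, possible compilation, and setup.
    private void generateAndSetup() {
        codeGenCount++;
        setupDone = true;
    }

    // Stand-in for evaluating the batch with the already generated code.
    private void evaluate() {
        if (!setupDone) {
            throw new IllegalStateException("batch received before any schema");
        }
        batchesProcessed++;
    }

    int getCodeGenCount() { return codeGenCount; }

    int getBatchesProcessed() { return batchesProcessed; }

    public static void main(String[] args) {
        UnionAllSketch op = new UnionAllSketch();
        List<IterOutcome> outcomes = Arrays.asList(
            IterOutcome.OK_NEW_SCHEMA, IterOutcome.OK, IterOutcome.OK,
            IterOutcome.OK_NEW_SCHEMA, IterOutcome.OK);
        for (IterOutcome outcome : outcomes) {
            op.onBatch(outcome);
        }
        System.out.println("code generations: " + op.getCodeGenCount());
        System.out.println("batches processed: " + op.getBatchesProcessed());
    }
}
```

In this sketch the five batches fed in by {{main}} trigger code generation twice, once per {{OK_NEW_SCHEMA}}, while all five batches are still evaluated. How the real operator should split doWork() is a design decision for the actual fix and is not settled by this illustration.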




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)