Posted to issues@drill.apache.org by "Jinfeng Ni (JIRA)" <ji...@apache.org> on 2017/06/14 00:00:00 UTC

[jira] [Created] (DRILL-5586) UnionAll operator does more than necessary value vector allocation and copy

Jinfeng Ni created DRILL-5586:
---------------------------------

             Summary: UnionAll operator does more than necessary value vector allocation and copy
                 Key: DRILL-5586
                 URL: https://issues.apache.org/jira/browse/DRILL-5586
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Jinfeng Ni


When the inputs to a UnionAll operator are simple field references, rather than expressions involving a function that requires evaluation, the operator should leverage the value vector's transfer API. A transfer avoids allocating a buffer for the value vector in the outgoing batch, as well as the overhead of copying the data from the incoming batch to the outgoing batch.

For example, in the following query:
{code}
select l_orderkey from cp.`tpch/lineitem.parquet` l union all select n_nationkey from cp.`tpch/nation.parquet`
{code}

Both the left and right sides of the UnionAll operator are simple field references, so Drill should call the transfer API. However, the current code does buffer allocation and copy for both the left and right sides. Such processing significantly slows the UnionAll operator's performance and, in turn, query evaluation.
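
To make the cost difference concrete, here is a minimal sketch of copy versus transfer on a toy vector. {{ToyIntVector}} and its methods are hypothetical stand-ins invented for illustration; they are not Drill's value vector classes, and the allocation counter only mimics the buffer allocation described above.

```java
import java.util.Arrays;

// Toy stand-in for a value vector. Hypothetical; NOT Drill's ValueVector API.
final class ToyIntVector {
    // Counts buffer allocations so the two paths can be compared.
    static int allocations = 0;

    // Backing storage; null while the vector holds no data.
    private int[] buffer;

    // Creates an empty vector, as an outgoing batch would hold before it is filled.
    ToyIntVector() {
        this.buffer = null;
    }

    // Creates a vector with its own freshly allocated buffer.
    ToyIntVector(int[] values) {
        this.buffer = Arrays.copyOf(values, values.length);
        allocations++;
    }

    // Copy path: allocate a new buffer in the target, then copy every value.
    void copyTo(ToyIntVector target) {
        target.buffer = new int[buffer.length];
        allocations++;
        System.arraycopy(buffer, 0, target.buffer, 0, buffer.length);
    }

    // Transfer path: hand buffer ownership to the target.
    // No allocation and no per-value copy; the source is left empty.
    void transferTo(ToyIntVector target) {
        target.buffer = this.buffer;
        this.buffer = null;
    }

    int size() {
        return buffer == null ? 0 : buffer.length;
    }

    int get(int index) {
        return buffer[index];
    }
}
```

In this sketch the transfer path leaves the allocation counter unchanged and empties the incoming vector, while the copy path allocates once more and touches every value.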

DRILL-5521 reverts a change, made in DRILL-5419, to the logic that decides whether transfer is applied, based on a SchemaPath equality comparison. Even if we fix that problem, a SchemaPath equality comparison is not sufficient as the criterion for whether transfer should be used. Ideally, even when the output field and the incoming field have different names, the UnionAll operator should do {{transfer}} instead of {{copy}}, as long as the expression is a simple field reference. For example:

{code}
select l_orderkey as Key1 from cp.`tpch/lineitem.parquet` l union all select n_nationkey as Key2 from cp.`tpch/nation.parquet`
{code}
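
The criterion proposed above can be sketched as a predicate over the input expression. Everything here is hypothetical: {{ToyExpr}}, {{ToyFieldRef}}, {{ToyFunctionCall}}, and {{ToyUnionAllPlanner}} are illustrative names rather than Drill classes, and the name-equality variant is only a simplified approximation of a SchemaPath comparison.

```java
// Hypothetical expression model for illustration; these are not Drill classes.
interface ToyExpr {}

// A plain reference to an incoming column, e.g. l_orderkey.
final class ToyFieldRef implements ToyExpr {
    final String name;

    ToyFieldRef(String name) {
        this.name = name;
    }
}

// A function applied to an argument; this requires evaluation.
final class ToyFunctionCall implements ToyExpr {
    final String function;
    final ToyExpr argument;

    ToyFunctionCall(String function, ToyExpr argument) {
        this.function = function;
        this.argument = argument;
    }
}

final class ToyUnionAllPlanner {
    // Proposed criterion: transfer whenever the input is a simple field reference.
    // The output name is deliberately ignored, since a rename needs no evaluation.
    static boolean canTransfer(ToyExpr input, String outputName) {
        return input instanceof ToyFieldRef;
    }

    // Stricter criterion (assumed approximation of a path-equality check):
    // transfer only when the incoming and outgoing names also match.
    static boolean canTransferByNameEquality(ToyExpr input, String outputName) {
        return input instanceof ToyFieldRef
            && ((ToyFieldRef) input).name.equals(outputName);
    }
}
```

Under these assumptions, {{l_orderkey as Key1}} qualifies for transfer under the first predicate but not under the name-equality one, which is the gap this issue describes.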




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)