You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Volodymyr Vysotskyi (JIRA)" <ji...@apache.org> on 2017/08/03 16:29:00 UTC

[jira] [Commented] (DRILL-1162) 25 way join ended up with OOM

    [ https://issues.apache.org/jira/browse/DRILL-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113039#comment-16113039 ] 

Volodymyr Vysotskyi commented on DRILL-1162:
--------------------------------------------

Thanks for the explanation of background about DRILL-2236. I have no doubt that this fix was required before Drill started using {{HepPlanner}}. 
Considering only the case when tables are joining on PK, I agree with you that DRILL-2236 would set smaller input for the build side since {{LoptOptimizeJoinRule}} considers not only rows count but also columns count, if the rows count is the same ([swapInputs()|https://github.com/mapr/incubator-calcite/blob/DrillCalcite1.4.0-mapr-1.4.0/core/src/main/java/org/apache/calcite/rel/rules/LoptOptimizeJoinRule.java#L1857] method). So in the case when left and right inputs have the same rows count, after applying this rule input with the join will be on the probe side, since join operator has usually larger columns count. But real rows count is not greater than estimated for joining on PK, so we have benefit when the fix for DRILL-2236 swaps join inputs after applying {{LoptOptimizeJoinRule}}.

Unfortunately, for the case of non-PK join condition and when estimated rows count is the same, this swap makes build side larger.

Disabling the hashjoin_swap would help only for the queries like in this Jira (join inputs have the same rows count). 
For example, with these tables:
{code:sql}
create table p as select * from `lineitem1.parquet` limit 60040;
create table o as select * from `lineitem1.parquet` limit 50030;
create table n as select * from `lineitem1.parquet` limit 40020;
create table m as select * from `lineitem1.parquet` limit 30010;
create table l as select * from `lineitem1.parquet` limit 20000;
create table k as select * from `lineitem1.parquet` limit 10000;
{code}
query 
{code:sql}
select count(*) from k
inner join l on k.l_partkey=l.l_partkey
inner join m on m.l_partkey=l.l_partkey
inner join n on m.l_partkey=n.l_partkey
inner join o on n.l_partkey=o.l_partkey
inner join p on p.l_partkey=o.l_partkey;
{code}
also fails with OOM when hashjoin_swap is disabled or enabled.

I agree with you that the real problem is the join row count estimation.
Hive, for example, estimates join row count considering both PK join and non-PK join cases: [getRowCount()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdRowCount.java#L65]. It determines the case and calculates row count depending on it. 
I realize that we could not use the similar code since Drill does not have table and column statistics. In DRILL-1328 table statistics will be implemented, but it may cause inconsistent behaviour, since this statistics may be not created for all tables, or for a single table from the query with a join, or a file with statistics may be corrupted. 

Hash join spilling to the disk should also help to omit this OOM even with such plan.

> 25 way join ended up with OOM
> -----------------------------
>
>                 Key: DRILL-1162
>                 URL: https://issues.apache.org/jira/browse/DRILL-1162
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow, Query Planning & Optimization
>            Reporter: Rahul Challapalli
>            Assignee: Volodymyr Vysotskyi
>            Priority: Critical
>             Fix For: Future
>
>         Attachments: error.log, oom_error.log
>
>
> git.commit.id.abbrev=e5c2da0
> The below query results in 0 results being returned 
> {code:sql}
> select count(*) from `lineitem1.parquet` a 
> inner join `part.parquet` j on a.l_partkey = j.p_partkey 
> inner join `orders.parquet` k on a.l_orderkey = k.o_orderkey 
> inner join `supplier.parquet` l on a.l_suppkey = l.s_suppkey 
> inner join `partsupp.parquet` m on j.p_partkey = m.ps_partkey and l.s_suppkey = m.ps_suppkey 
> inner join `customer.parquet` n on k.o_custkey = n.c_custkey 
> inner join `lineitem2.parquet` b on a.l_orderkey = b.l_orderkey 
> inner join `lineitem2.parquet` c on a.l_partkey = c.l_partkey 
> inner join `lineitem2.parquet` d on a.l_suppkey = d.l_suppkey 
> inner join `lineitem2.parquet` e on a.l_extendedprice = e.l_extendedprice 
> inner join `lineitem2.parquet` f on a.l_comment = f.l_comment 
> inner join `lineitem2.parquet` g on a.l_shipdate = g.l_shipdate 
> inner join `lineitem2.parquet` h on a.l_commitdate = h.l_commitdate 
> inner join `lineitem2.parquet` i on a.l_receiptdate = i.l_receiptdate 
> inner join `lineitem2.parquet` o on a.l_receiptdate = o.l_receiptdate 
> inner join `lineitem2.parquet` p on a.l_receiptdate = p.l_receiptdate 
> inner join `lineitem2.parquet` q on a.l_receiptdate = q.l_receiptdate 
> inner join `lineitem2.parquet` r on a.l_receiptdate = r.l_receiptdate 
> inner join `lineitem2.parquet` s on a.l_receiptdate = s.l_receiptdate 
> inner join `lineitem2.parquet` t on a.l_receiptdate = t.l_receiptdate 
> inner join `lineitem2.parquet` u on a.l_receiptdate = u.l_receiptdate 
> inner join `lineitem2.parquet` v on a.l_receiptdate = v.l_receiptdate 
> inner join `lineitem2.parquet` w on a.l_receiptdate = w.l_receiptdate 
> inner join `lineitem2.parquet` x on a.l_receiptdate = x.l_receiptdate;
> {code}
> However when we remove the last 'inner join' and run the query it returns '716372534'. Since the last inner join is similar to the one's before it, it should match some records and return the data appropriately.
> The logs indicated that it actually returned 0 results. Attached the log file.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)