Posted to issues@drill.apache.org by "Chun Chang (JIRA)" <ji...@apache.org> on 2015/04/25 00:53:40 UTC

[jira] [Closed] (DRILL-2390) regression: MergeJoinBatch size exceeds 64k limit

     [ https://issues.apache.org/jira/browse/DRILL-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chun Chang closed DRILL-2390.
-----------------------------
    Assignee: Chun Chang  (was: Venki Korukanti)

Added a test case for this bug. The test case still fails because merge join and hash join still give different results, but the batch size limit error reported in this bug is fixed.

{code}
[root@qa-node120 resources]# cat Advanced/Failing/complextype/json/complex318.q
alter session set `planner.enable_mergejoin` = true;
alter session set `planner.enable_hashjoin` = false;
select count(a.id) from `complexga100k.json` a inner join `complexga100k.json` b on a.gbyi=b.gbyi;
alter session set `planner.enable_mergejoin` = false;
alter session set `planner.enable_hashjoin` = true;
{code}

{code}
0: jdbc:drill:schema=dfs.drillTestDirAdvanced> alter session set `planner.enable_mergejoin` = true;
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | planner.enable_mergejoin updated. |
+------------+------------+
1 row selected (0.193 seconds)
0: jdbc:drill:schema=dfs.drillTestDirAdvanced> alter session set `planner.enable_hashjoin` = false;
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | planner.enable_hashjoin updated. |
+------------+------------+
1 row selected (0.088 seconds)
0: jdbc:drill:schema=dfs.drillTestDirAdvanced> select count(a.id) from `complexga100k.json` a inner join `complexga100k.json` b on a.gbyi=b.gbyi;
+------------+
|   EXPR$0   |
+------------+
| 659100760  |
+------------+
1 row selected (125.569 seconds)
0: jdbc:drill:schema=dfs.drillTestDirAdvanced> alter session set `planner.enable_mergejoin` = false;
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | planner.enable_mergejoin updated. |
+------------+------------+
1 row selected (0.089 seconds)
0: jdbc:drill:schema=dfs.drillTestDirAdvanced> alter session set `planner.enable_hashjoin` = true;
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | planner.enable_hashjoin updated. |
+------------+------------+
1 row selected (0.111 seconds)
0: jdbc:drill:schema=dfs.drillTestDirAdvanced> select count(a.id) from `complexga100k.json` a inner join `complexga100k.json` b on a.gbyi=b.gbyi;
+------------+
|   EXPR$0   |
+------------+
| 666666670  |
+------------+
1 row selected (155.362 seconds)
{code}
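
One way to tell which of the two counts is correct, without relying on either join operator, is to compute the expected value from per-key aggregates. For a self-join on gbyi, each non-null key with r rows, n of which have a non-null id, contributes n * r to count(a.id). The query below is only a sketch: it assumes id and gbyi are top-level columns in `complexga100k.json` and that the aggregation itself returns correct results. The aliases t, id_cnt, row_cnt and expected_count are made up for illustration.

{code}
-- Expected result of count(a.id) for the self-join on gbyi,
-- computed without any join operator.
select sum(t.id_cnt * t.row_cnt) as expected_count
from (
  select gbyi,
         count(id) as id_cnt,   -- rows on the "a" side with a non-null id
         count(*)  as row_cnt   -- matching rows on the "b" side
  from `complexga100k.json`
  where gbyi is not null        -- null keys never match in an inner join
  group by gbyi
) t;
{code}

Whichever join result equals expected_count is the correct one.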

> regression: MergeJoinBatch size exceeds 64k limit
> -------------------------------------------------
>
>                 Key: DRILL-2390
>                 URL: https://issues.apache.org/jira/browse/DRILL-2390
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 0.8.0
>            Reporter: Chun Chang
>            Assignee: Chun Chang
>             Fix For: 0.8.0
>
>
> #Wed Mar 04 01:23:42 EST 2015
> git.commit.id.abbrev=71b6bfe
> With merge join enabled, the following query hits the batch size limit:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set `planner.enable_mergejoin` = true;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | planner.enable_mergejoin updated. |
> +------------+------------+
> 1 row selected (0.03 seconds)
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set `planner.enable_hashjoin` = false;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | planner.enable_hashjoin updated. |
> +------------+------------+
> 1 row selected (0.025 seconds)
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(a.id) from `complex100k.json` a inner join `complex100k.json` b on a.gbyi=b.gbyi;
> +------------+
> |   EXPR$0   |
> +------------+
> Query failed: RemoteRpcException: Failure while running fragment., Incoming batch of org.apache.drill.exec.physical.impl.join.MergeJoinBatch has size 3276800, which is beyond the limit of 65536 [ e26545df-a8c3-4cd6-b02a-1872db4ac41f on qa-node120.qa.lab:31010 ]
> [ e26545df-a8c3-4cd6-b02a-1872db4ac41f on qa-node120.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
> 	at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
> 	at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
> 	at sqlline.SqlLine.print(SqlLine.java:1809)
> 	at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
> 	at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:889)
> 	at sqlline.SqlLine.begin(SqlLine.java:763)
> 	at sqlline.SqlLine.start(SqlLine.java:498)
> 	at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> This worked before.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)