Posted to dev@hive.apache.org by "cw (JIRA)" <ji...@apache.org> on 2014/11/17 08:34:34 UTC

[jira] [Created] (HIVE-8895) bugs in mergejoin

cw created HIVE-8895:
------------------------

             Summary: bugs in mergejoin
                 Key: HIVE-8895
                 URL: https://issues.apache.org/jira/browse/HIVE-8895
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.13.1, 0.13.0, 0.14.0
            Reporter: cw
            Priority: Minor


I get an IndexOutOfBoundsException when compiling a query on Hive 0.13.1, although the same query runs fine on Hive 0.11. The following SQL triggers the exception:

{code}
create table test_join_1(a string, b string);
create table test_join_2(a string, b string);
-- got an IndexOutOfBoundsException error
explain 
select * from

(
    SELECT a a, b b
    FROM test_join_1
)t1
join 
(
    SELECT a a, b b
    FROM test_join_1
)t2
    on  t1.a = t2.a
    and t1.a = t2.b
join
(
    select a from test_join_2
)t3 on t1.a = t3.a;
{code}

Here is the relevant part of the stack trace:
{code}
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
    at java.util.ArrayList.rangeCheck(ArrayList.java:604)
    at java.util.ArrayList.get(ArrayList.java:382)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:7403)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:7616)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8946)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9220)
    ...
{code}

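However, the following query runs fine. It differs only in the second condition of the first join (t2.a = t2.b instead of t1.a = t2.b):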
{code}
explain select * from
(
    SELECT a a, b b
    FROM test_join_1
)t1
join 
(
    SELECT a a, b b
    FROM test_join_1
)t2
    on  t1.a = t2.a
    and t2.a = t2.b
join
(
    select a from test_join_2
)t3 on t1.a = t3.a;
{code}

I don't fully understand how mergejoin works, but I noticed that the patch for HIVE-5556 changed SemanticAnalyzer.java as follows:
{code}
-    if ((targetCondn == null) || (nodeCondn.size() != targetCondn.size())) {
-      return -1;
+    if ( targetCondn == null ) {
+      return new ObjectPair(-1, null);
+    }
{code}

It may be a good idea to restore the previous condition of the 'if' statement, including the nodeCondn.size() != targetCondn.size() check.
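
To illustrate the kind of failure I suspect, here is a simplified, hypothetical sketch. It is not the actual Hive code: the class MergeGuardSketch, the method comparePositions and the checkSize flag are made up for illustration, and plain strings stand in for the join key expressions. My guess (unverified) is that in the failing query t1 contributes two key expressions to the first join but only one to the second, so the two lists differ in length. Without a size check, a position-by-position comparison can read past the end of the shorter list:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeGuardSketch {

    // Hypothetical stand-in for a positional comparison of join key lists.
    // Returns 0 when the lists match and -1 when they cannot be merged.
    // checkSize toggles the size guard that the old 'if' statement had.
    static int comparePositions(List<String> nodeCondn,
                                List<String> targetCondn,
                                boolean checkSize) {
        if (targetCondn == null) {
            return -1;
        }
        if (checkSize && nodeCondn.size() != targetCondn.size()) {
            return -1; // different number of keys: refuse to merge
        }
        for (int i = 0; i < nodeCondn.size(); i++) {
            // Without the guard, targetCondn.get(i) can go out of bounds.
            if (!nodeCondn.get(i).equals(targetCondn.get(i))) {
                return -1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Assumed key lists: two keys on one side, one key on the other.
        List<String> nodeCondn = new ArrayList<>(Arrays.asList("t1.a", "t1.a"));
        List<String> targetCondn = new ArrayList<>(Arrays.asList("t1.a"));

        // With the size guard the mismatch is rejected cleanly.
        System.out.println("guarded result: "
                + comparePositions(nodeCondn, targetCondn, true));

        // Without the guard the loop indexes past the shorter list.
        try {
            comparePositions(nodeCondn, targetCondn, false);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded: " + e);
        }
    }
}
```

The exception in the stack trace above is thrown from mergeJoins, so the real code path is certainly more involved than this. The sketch only shows why dropping a size comparison can turn a clean "cannot merge" result into an out-of-bounds access.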


