Posted to dev@hive.apache.org by "cw (JIRA)" <ji...@apache.org> on 2014/11/17 08:36:34 UTC

[jira] [Updated] (HIVE-8895) bugs in mergejoin

     [ https://issues.apache.org/jira/browse/HIVE-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cw updated HIVE-8895:
---------------------
    Attachment: HIVE-8895.1.patch

> bugs in mergejoin
> -----------------
>
>                 Key: HIVE-8895
>                 URL: https://issues.apache.org/jira/browse/HIVE-8895
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.13.0, 0.14.0, 0.13.1
>            Reporter: cw
>            Priority: Minor
>              Labels: patch
>         Attachments: HIVE-8895.1.patch
>
>
> I got an IndexOutOfBoundsException when running a SQL query on Hive 0.13.1, although the same query runs fine on Hive 0.11. Here is an example query that triggers the exception:
> {code}
> create table test_join_1(a string, b string);
> create table test_join_2(a string, b string);
> -- got an IndexOutOfBoundsException error
> explain 
> select * from
> (
>     SELECT a a, b b
>     FROM test_join_1
> )t1
> join 
> (
>     SELECT a a, b b
>     FROM test_join_1
> )t2
>     on  t1.a = t2.a
>     and t1.a = t2.b
> join
> (
>     select a from test_join_2
> )t3 on t1.a = t3.a;
> {code}
> Here is the relevant part of the stack trace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>     at java.util.ArrayList.rangeCheck(ArrayList.java:604)
>     at java.util.ArrayList.get(ArrayList.java:382)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:7403)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:7616)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:8946)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9220)
>     ...
> {code}
> However, the following query, which differs only in the second join condition (t2.a = t2.b instead of t1.a = t2.b), runs fine:
> {code}
> explain select * from
> (
>     SELECT a a, b b
>     FROM test_join_1
> )t1
> join 
> (
>     SELECT a a, b b
>     FROM test_join_1
> )t2
>     on  t1.a = t2.a
>     and t2.a = t2.b
> join
> (
>     select a from test_join_2
> )t3 on t1.a = t3.a;
> {code}
> I don't fully understand the details of mergejoin, but I noticed that the patch for HIVE-5556 changed SemanticAnalyzer.java as follows:
> {code}
> -    if ((targetCondn == null) || (nodeCondn.size() != targetCondn.size())) {
> -      return -1;
> +    if ( targetCondn == null ) {
> +      return new ObjectPair(-1, null);
> +    }
> {code}
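> It may be a good idea to restore the previous logic of the 'if' statement, including the size comparison (nodeCondn.size() != targetCondn.size()) that the HIVE-5556 patch removed.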
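To make the suspected failure mode concrete, here is a minimal Java sketch. It is not the Hive code: the class MergeGuardSketch, its methods, and the string-based condition lists are all hypothetical. It assumes that in the failing query the t1 side of the t1/t2 join carries two conditions (one per ON clause term) while the t1 side of the join with t3 carries one, and that some later merge step walks both lists with the same index. Under those assumptions the guard from before HIVE-5556 rejects the merge, while the null-only guard lets the mismatched lists through.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the guard discussed above; NOT Hive's code.
class MergeGuardSketch {

    // Guard as shown on the removed lines of the diff: refuse to merge when
    // the target has no conditions or the two lists differ in size.
    static boolean passesOldGuard(List<String> nodeCondn, List<String> targetCondn) {
        return targetCondn != null && nodeCondn.size() == targetCondn.size();
    }

    // Guard as shown on the added lines of the diff: only the null check remains.
    static boolean passesNewGuard(List<String> nodeCondn, List<String> targetCondn) {
        return targetCondn != null;
    }

    // Assumed stand-in for a later merge step that indexes both lists in lockstep.
    static String pairUp(List<String> nodeCondn, List<String> targetCondn) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < targetCondn.size(); i++) {
            // Throws IndexOutOfBoundsException if nodeCondn is shorter.
            sb.append(targetCondn.get(i)).append('~').append(nodeCondn.get(i)).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: t1-side conditions of "t1.a = t2.a and t1.a = t2.b".
        List<String> target = new ArrayList<>(Arrays.asList("t1.a", "t1.a"));
        // Assumption: t1-side conditions of "t1.a = t3.a".
        List<String> node = new ArrayList<>(Arrays.asList("t1.a"));

        System.out.println("old guard allows merge: " + passesOldGuard(node, target));
        System.out.println("new guard allows merge: " + passesNewGuard(node, target));
        try {
            pairUp(node, target);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("lockstep access failed: " + e.getMessage());
        }
    }
}
```

In the working query the second condition (t2.a = t2.b) references only t2, so under the same assumptions both t1-side lists would have one element and either guard would behave the same. Whether restoring the size check is the right fix, or whether the later merge step should handle lists of different sizes, is something the attached patch and the HIVE-5556 authors would need to confirm.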


