Posted to issues@hive.apache.org by "Wei Zheng (JIRA)" <ji...@apache.org> on 2015/09/08 22:16:45 UTC

[jira] [Commented] (HIVE-11433) NPE for a multiple inner join query

    [ https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735529#comment-14735529 ] 

Wei Zheng commented on HIVE-11433:
----------------------------------

[~xuefuz] Do you have a repro case for the problem? I'm doing some tests for multiple joins. Thanks.

> NPE for a multiple inner join query
> -----------------------------------
>
>                 Key: HIVE-11433
>                 URL: https://issues.apache.org/jira/browse/HIVE-11433
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.2.0, 1.1.0, 2.0.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 1.3.0, 2.0.0
>
>         Attachments: HIVE-11433.patch, HIVE-11433.patch
>
>
> A NullPointerException is thrown for a query that has multiple (more than 3) inner joins. Stack trace for 1.1.0:
> {code}
> NullPointerException null
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149)
>         at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166)
>         at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
>         at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
>         at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
>         at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101)
>         at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172)
>         at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386)
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373)
>         at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
>         at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
>         at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
>         at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> However, the problem can also be reproduced on the latest master branch. Further investigation shows that the following code (in ParseUtils.java) is problematic:
> {code}
>   static int getIndex(String[] list, String elem) {
>     for(int i=0; i < list.length; i++) {
>       if (list[i].toLowerCase().equals(elem)) {
>         return i;
>       }
>     }
>     return -1;
>   }
> {code}
> The code assumes that every element in the list is not null, which isn't true because of the following code in SemanticAnalyzer.java (method genJoinTree()):
> {code}
>     if ((right.getToken().getType() == HiveParser.TOK_TABREF)
>         || (right.getToken().getType() == HiveParser.TOK_SUBQUERY)
>         || (right.getToken().getType() == HiveParser.TOK_PTBLFUNCTION)) {
>       String tableName = getUnescapedUnqualifiedTableName((ASTNode) right.getChild(0))
>           .toLowerCase();
>       String alias = extractJoinAlias(right, tableName);
>       String[] rightAliases = new String[1];
>       rightAliases[0] = alias;
>       joinTree.setRightAliases(rightAliases);
>       String[] children = joinTree.getBaseSrc();
>       if (children == null) {
>         children = new String[2];
>       }
>       children[1] = alias;
>       joinTree.setBaseSrc(children);
>       joinTree.setId(qb.getId());
>       joinTree.getAliasToOpInfo().put(
>           getModifiedAlias(qb, alias), aliasToOpInfo.get(alias));
>       // remember rhs table for semijoin
>       if (joinTree.getNoSemiJoin() == false) {
>         joinTree.addRHSSemijoin(alias);
>       }
>     } else {
> {code}
> Specifically, this code can leave a null element in the base source array:
> {code}
>       if (children == null) {
>         children = new String[2];
>       }
>       children[1] = alias;
> {code}
> This appears to be a regression from an earlier release (0.14.1). However, it is unclear which commit caused it.
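
For readers following the quoted description, here is a minimal standalone sketch of the failure mode it describes. It is not Hive code: the class name JoinAliasLookup, the alias "t2", and the method getIndexNullSafe are hypothetical, and the null check is only one possible guard, not necessarily what the attached patch does. The getIndex method is copied from the snippet quoted above.

```java
class JoinAliasLookup {

    // Copy of the lookup quoted in the description: it dereferences
    // every element, so a null slot raises NullPointerException.
    static int getIndex(String[] list, String elem) {
        for (int i = 0; i < list.length; i++) {
            if (list[i].toLowerCase().equals(elem)) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical null-tolerant variant: skips unset slots instead of
    // dereferencing them. An assumption for illustration only.
    static int getIndexNullSafe(String[] list, String elem) {
        for (int i = 0; i < list.length; i++) {
            if (list[i] != null && list[i].toLowerCase().equals(elem)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Mimics the genJoinTree() branch: a fresh two-slot array in which
        // only the right-hand alias is filled in, so slot 0 stays null.
        String[] children = new String[2];
        children[1] = "t2";

        try {
            getIndex(children, "t2");
        } catch (NullPointerException e) {
            System.out.println("original lookup threw NullPointerException");
        }
        System.out.println("null-safe lookup returned "
                + getIndexNullSafe(children, "t2"));
    }
}
```

The sketch only shows why an unset first slot breaks the lookup. Whether the right fix is to guard the lookup or to avoid creating the null slot in the first place is a separate question that this example does not settle.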



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)