Posted to issues@hive.apache.org by "Wei Zheng (JIRA)" <ji...@apache.org> on 2015/09/08 22:16:45 UTC
[jira] [Commented] (HIVE-11433) NPE for a multiple inner join query
[ https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735529#comment-14735529 ]
Wei Zheng commented on HIVE-11433:
----------------------------------
[~xuefuz] Do you have a repro case for the problem? I'm doing some tests for multiple joins. Thanks.
> NPE for a multiple inner join query
> -----------------------------------
>
> Key: HIVE-11433
> URL: https://issues.apache.org/jira/browse/HIVE-11433
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 1.2.0, 1.1.0, 2.0.0
> Reporter: Xuefu Zhang
> Assignee: Xuefu Zhang
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11433.patch, HIVE-11433.patch
>
>
> A NullPointerException is thrown for a query that has multiple (more than 3) inner joins. Stack trace for 1.1.0:
> {code}
> NullPointerException null
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149)
> at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166)
> at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
> at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104)
> at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101)
> at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172)
> at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373)
> at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> However, the problem can also be reproduced on the latest master branch. Further investigation shows that the following code (in ParseUtils.java) is problematic:
> {code}
> static int getIndex(String[] list, String elem) {
>   for(int i=0; i < list.length; i++) {
>     if (list[i].toLowerCase().equals(elem)) {
>       return i;
>     }
>   }
>   return -1;
> }
> {code}
> The code assumes that every element in the list is not null, which isn't true because of the following code in SemanticAnalyzer.java (method genJoinTree()):
> {code}
> if ((right.getToken().getType() == HiveParser.TOK_TABREF)
>     || (right.getToken().getType() == HiveParser.TOK_SUBQUERY)
>     || (right.getToken().getType() == HiveParser.TOK_PTBLFUNCTION)) {
>   String tableName = getUnescapedUnqualifiedTableName((ASTNode) right.getChild(0))
>       .toLowerCase();
>   String alias = extractJoinAlias(right, tableName);
>   String[] rightAliases = new String[1];
>   rightAliases[0] = alias;
>   joinTree.setRightAliases(rightAliases);
>   String[] children = joinTree.getBaseSrc();
>   if (children == null) {
>     children = new String[2];
>   }
>   children[1] = alias;
>   joinTree.setBaseSrc(children);
>   joinTree.setId(qb.getId());
>   joinTree.getAliasToOpInfo().put(
>       getModifiedAlias(qb, alias), aliasToOpInfo.get(alias));
>   // remember rhs table for semijoin
>   if (joinTree.getNoSemiJoin() == false) {
>     joinTree.addRHSSemijoin(alias);
>   }
> } else {
> {code}
> Specifically, the following code can leave a null element in the base source array, because a newly allocated children array has only index 1 assigned:
> {code}
> if (children == null) {
>   children = new String[2];
> }
> children[1] = alias;
> {code}
> This appears to be a regression from an earlier release (0.14.1). However, it's unclear which commit caused it.
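For illustration only, here is a minimal standalone sketch of the failure mode described in the issue. It copies the quoted getIndex() logic, mimics the two-element array allocation from genJoinTree(), and shows that the unassigned slot 0 triggers the NullPointerException. The class name GetIndexSketch, the helper baseSrcWithOnlyRightAlias(), the alias "t2", and the null-tolerant variant getIndexNullSafe() are all hypothetical names made up for this sketch. The null check is one possible workaround, not necessarily what the attached HIVE-11433.patch does.

```java
import java.util.Arrays;

public class GetIndexSketch {

    // Same logic as the getIndex() quoted in the issue: every element is dereferenced.
    static int getIndex(String[] list, String elem) {
        for (int i = 0; i < list.length; i++) {
            if (list[i].toLowerCase().equals(elem)) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical null-tolerant variant (an assumption, not the committed fix):
    // null slots are skipped instead of dereferenced.
    static int getIndexNullSafe(String[] list, String elem) {
        for (int i = 0; i < list.length; i++) {
            if (list[i] != null && list[i].toLowerCase().equals(elem)) {
                return i;
            }
        }
        return -1;
    }

    // Mimics the allocation quoted from genJoinTree(): only slot 1 is assigned,
    // so slot 0 keeps Java's default value for object arrays, which is null.
    static String[] baseSrcWithOnlyRightAlias(String alias) {
        String[] children = new String[2];
        children[1] = alias;
        return children;
    }

    public static void main(String[] args) {
        String[] baseSrc = baseSrcWithOnlyRightAlias("t2");
        System.out.println(Arrays.toString(baseSrc));

        try {
            getIndex(baseSrc, "t2");
        } catch (NullPointerException e) {
            System.out.println("getIndex threw NullPointerException");
        }

        System.out.println(getIndexNullSafe(baseSrc, "t2"));
    }
}
```

The sketch only reproduces the null dereference in isolation. It does not cover how the join tree reaches this state in a real query, which would still need a repro case against Hive itself.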