Posted to issues@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2023/02/23 16:44:00 UTC
[jira] [Commented] (HIVE-27102) Upgrade Calcite to 1.33.0 and Avatica to 1.23.0
[ https://issues.apache.org/jira/browse/HIVE-27102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692776#comment-17692776 ]
Stamatis Zampetakis commented on HIVE-27102:
--------------------------------------------
The branch on which I am working can be found here: [https://github.com/zabetak/hive/tree/calcite-upgrade-1.33]
All the compilation problems are fixed in the branch. Moreover, there are workarounds and notes for some other problems found along the way.
At this stage I am mainly running the CBO queries with the TPCDS stats, trying to resolve the errors and to explain the plan changes.
{noformat}
mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver -Dqfile_regex=cbo_query.* -Dtest.output.overwrite{noformat}
In terms of errors, there are at least a few queries (cbo_query8.q and others) that fail with the stack trace below:
{noformat}
[ERROR] org.apache.hadoop.hive.cli.TestTezTPCDS30TBPerfCliDriver.testCliDriver[cbo_query8] Time elapsed: 0.425 s <<< FAILURE!
java.lang.AssertionError
at org.apache.calcite.rex.RexCall.<init>(RexCall.java:86)
at org.apache.calcite.rex.RexBuilder.makeCall(RexBuilder.java:253)
at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.getExpression(HiveFunctionHelper.java:322)
at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:649)
at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:97)
at org.apache.hadoop.hive.ql.parse.type.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1081)
at org.apache.hadoop.hive.ql.parse.type.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1471)
at org.apache.hadoop.hive.ql.lib.CostLessRuleDispatcher.dispatch(CostLessRuleDispatcher.java:66)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:101)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
at org.apache.hadoop.hive.ql.parse.type.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:228)
at org.apache.hadoop.hive.ql.parse.type.RexNodeTypeCheck.genExprNode(RexNodeTypeCheck.java:40)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genAllRexNode(CalcitePlanner.java:5516)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genRexNode(CalcitePlanner.java:5474)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genRexNode(CalcitePlanner.java:5434)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:3164)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:3482)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterLogicalPlan(CalcitePlanner.java:3493)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5163)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5059)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5065)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5104)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5059)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5104)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5059)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5104)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1648)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1592)
at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:140)
at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:936)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:191)
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:135)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1344)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:571)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12816)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:466)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:326)
at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:326)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:106)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:518)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:470)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:435)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:429)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:121)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257)
at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356)
at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:733)
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:703)
at org.apache.hadoop.hive.cli.control.CorePerfCliDriver.runTest(CorePerfCliDriver.java:88)
at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
at org.apache.hadoop.hive.cli.TestTezTPCDS30TBPerfCliDriver.testCliDriver(TestTezTPCDS30TBPerfCliDriver.java:79){noformat}
In terms of plan changes, one that definitely needs to be addressed is the expansion of IN/BETWEEN operators into ORs/ANDs (see cbo_query5.q and cbo_query10.q).
{noformat}
cbo_query5.q:
HiveFilter(condition=[BETWEEN(false, CAST($2):TIMESTAMP(9), 1998-08-04 00:00:00:TIMESTAMP(9), 1998-08-18 00:00:00:TIMESTAMP(9))])
vs.
HiveFilter(condition=[AND(>=(CAST($2):TIMESTAMP(9), 1998-08-04 00:00:00), <=(CAST($2):TIMESTAMP(9), 1998-08-18 00:00:00))])
cbo_query10.q:
HiveFilter(condition=[IN($7, _UTF-16LE'Walker County':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'Richland County':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'Gaines County':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'Douglas County':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'Dona Ana County':VARCHAR(30) CHARACTER SET "UTF-16LE")])
vs.
HiveFilter(condition=[OR(=(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Gaines County'), =(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Walker County'), =(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Douglas County'), =(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Dona Ana County'), =(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Richland County'))]){noformat}
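For readers less familiar with these operators, the plan change above can be illustrated with a small standalone sketch. It is not taken from Hive or Calcite: the class and method names are hypothetical, it works on plain strings rather than RexNode objects, and it ignores the extra CASTs, the leading "false" flag of Hive's BETWEEN, and the operand reordering visible in the real plans. It only shows the logical shape of the two forms.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hive or Calcite: it only mimics the
// textual shape of the expanded plan variants on plain strings.
final class InBetweenExpansion {

  // x BETWEEN lower AND upper is logically equivalent to x >= lower AND x <= upper.
  static String expandBetween(String operand, String lower, String upper) {
    return "AND(>=(" + operand + ", " + lower + "), <=(" + operand + ", " + upper + "))";
  }

  // x IN (v1, ..., vn) is logically equivalent to x = v1 OR ... OR x = vn.
  static String expandIn(String operand, List<String> values) {
    return values.stream()
        .map(v -> "=(" + operand + ", " + v + ")")
        .collect(Collectors.joining(", ", "OR(", ")"));
  }

  public static void main(String[] args) {
    System.out.println(expandBetween("$2", "1998-08-04", "1998-08-18"));
    System.out.println(expandIn("$7", List.of("'Walker County'", "'Gaines County'")));
  }
}
```

Both forms select the same rows, so the change is about plan shape rather than query results. The compact IN/BETWEEN form is the one that appears in the existing expected outputs.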
> Upgrade Calcite to 1.33.0 and Avatica to 1.23.0
> -----------------------------------------------
>
> Key: HIVE-27102
> URL: https://issues.apache.org/jira/browse/HIVE-27102
> Project: Hive
> Issue Type: Improvement
> Components: CBO
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
>
> New versions for Calcite and Avatica are available so we should upgrade to them.
> I had some WIP in HIVE-26610 for upgrading Calcite to 1.32.0, but given that the work was not in a very advanced state, it is preferable to jump directly to 1.33.0.
> Avatica must be in line with Calcite, so both need to be updated at the same time.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)