Posted to issues@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2023/02/23 16:44:00 UTC

[jira] [Commented] (HIVE-27102) Upgrade Calcite to 1.33.0 and Avatica to 1.23.0

    [ https://issues.apache.org/jira/browse/HIVE-27102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692776#comment-17692776 ] 

Stamatis Zampetakis commented on HIVE-27102:
--------------------------------------------

The branch I am working on can be found here: [https://github.com/zabetak/hive/tree/calcite-upgrade-1.33]

All the compilation problems are fixed in the branch. Moreover, there are workarounds and notes for some other problems found along the way.

At this stage, I am mainly running the CBO queries with the TPC-DS stats, trying to resolve errors and explain plan changes.
{noformat}
mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver -Dqfile_regex=cbo_query.* -Dtest.output.overwrite{noformat}
In terms of errors, there are at least a few queries (cbo_query8.q and others) that fail with the stack trace below:
{noformat}
[ERROR] org.apache.hadoop.hive.cli.TestTezTPCDS30TBPerfCliDriver.testCliDriver[cbo_query8]  Time elapsed: 0.425 s  <<< FAILURE!
java.lang.AssertionError
    at org.apache.calcite.rex.RexCall.<init>(RexCall.java:86)
    at org.apache.calcite.rex.RexBuilder.makeCall(RexBuilder.java:253)
    at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.getExpression(HiveFunctionHelper.java:322)
    at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:649)
    at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:97)
    at org.apache.hadoop.hive.ql.parse.type.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1081)
    at org.apache.hadoop.hive.ql.parse.type.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1471)
    at org.apache.hadoop.hive.ql.lib.CostLessRuleDispatcher.dispatch(CostLessRuleDispatcher.java:66)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
    at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:101)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
    at org.apache.hadoop.hive.ql.parse.type.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:228)
    at org.apache.hadoop.hive.ql.parse.type.RexNodeTypeCheck.genExprNode(RexNodeTypeCheck.java:40)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genAllRexNode(CalcitePlanner.java:5516)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genRexNode(CalcitePlanner.java:5474)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genRexNode(CalcitePlanner.java:5434)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:3164)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterRelNode(CalcitePlanner.java:3482)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genFilterLogicalPlan(CalcitePlanner.java:3493)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5163)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5059)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5065)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5104)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5059)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5104)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5059)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5104)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1648)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1592)
    at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:140)
    at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:936)
    at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:191)
    at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:135)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1344)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:571)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12816)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:466)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:326)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:326)
    at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:106)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:518)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:470)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:435)
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:429)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:121)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:227)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356)
    at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:733)
    at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:703)
    at org.apache.hadoop.hive.cli.control.CorePerfCliDriver.runTest(CorePerfCliDriver.java:88)
    at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
    at org.apache.hadoop.hive.cli.TestTezTPCDS30TBPerfCliDriver.testCliDriver(TestTezTPCDS30TBPerfCliDriver.java:79){noformat}
In terms of plan changes, one that definitely needs to be addressed is the rewrite of IN/BETWEEN operators into ORs/ANDs (see cbo_query5.q and cbo_query10.q).
{noformat}
cbo_query5.q:
HiveFilter(condition=[BETWEEN(false, CAST($2):TIMESTAMP(9), 1998-08-04 00:00:00:TIMESTAMP(9), 1998-08-18 00:00:00:TIMESTAMP(9))])
vs.
HiveFilter(condition=[AND(>=(CAST($2):TIMESTAMP(9), 1998-08-04 00:00:00), <=(CAST($2):TIMESTAMP(9), 1998-08-18 00:00:00))])

cbo_query10.q:
HiveFilter(condition=[IN($7, _UTF-16LE'Walker County':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'Richland County':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'Gaines County':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'Douglas County':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'Dona Ana County':VARCHAR(30) CHARACTER SET "UTF-16LE")])
vs.
HiveFilter(condition=[OR(=(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Gaines County'), =(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Walker County'), =(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Douglas County'), =(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Dona Ana County'), =(CAST($7):VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'Richland County'))]){noformat}
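To make the shape of the rewrite concrete, here is a minimal sketch in plain Java with no Calcite dependency. The class and method names (FilterExpansionSketch, expandBetween, expandIn) are hypothetical and only mimic the textual form of the plans above; this is not how Calcite or Hive implement the expansion. It shows that BETWEEN turns into a conjunction of two range comparisons and IN into a disjunction of equalities, which is the difference visible in cbo_query5.q and cbo_query10.q.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only: renders the expanded form of BETWEEN and IN
// as strings, mirroring the plan output. Not taken from Calcite or Hive code.
public class FilterExpansionSketch {

    // BETWEEN(x, lower, upper) is logically AND(>=(x, lower), <=(x, upper)).
    static String expandBetween(String operand, String lower, String upper) {
        return "AND(>=(" + operand + ", " + lower + "), <=(" + operand + ", " + upper + "))";
    }

    // IN(x, v1, ..., vn) is logically OR(=(x, v1), ..., =(x, vn)).
    static String expandIn(String operand, List<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("IN requires at least one value");
        }
        if (values.size() == 1) {
            // A single-value IN degenerates to a plain equality.
            return "=(" + operand + ", " + values.get(0) + ")";
        }
        return values.stream()
            .map(v -> "=(" + operand + ", " + v + ")")
            .collect(Collectors.joining(", ", "OR(", ")"));
    }

    public static void main(String[] args) {
        System.out.println(expandBetween("$2", "1998-08-04 00:00:00", "1998-08-18 00:00:00"));
        System.out.println(expandIn("$7", Arrays.asList("'Walker County'", "'Richland County'")));
    }
}
```

Note that in the actual plans the expanded IN also introduces a CAST of $7 to VARCHAR(2147483647) and lists the literals in a different order, so the change is more than a purely syntactic expansion.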

> Upgrade Calcite to 1.33.0 and Avatica to 1.23.0
> -----------------------------------------------
>
>                 Key: HIVE-27102
>                 URL: https://issues.apache.org/jira/browse/HIVE-27102
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>
> New versions for Calcite and Avatica are available so we should upgrade to them.
> I had some WIP in HIVE-26610 for upgrading Calcite to 1.32.0, but given that the work was not in a very advanced state, it is preferable to jump directly to 1.33.0.
> Avatica must be in line with Calcite, so both need to be updated at the same time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)