Posted to issues@hive.apache.org by "Laljo John Pullokkaran (JIRA)" <ji...@apache.org> on 2015/04/17 04:24:59 UTC
[jira] [Updated] (HIVE-10369) CBO: Don't use HiveDefaultCostModel when With Tez and hive.cbo.costmodel.extended enabled
[ https://issues.apache.org/jira/browse/HIVE-10369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Laljo John Pullokkaran updated HIVE-10369:
------------------------------------------
Attachment: HIVE-10369.patch
> CBO: Don't use HiveDefaultCostModel when With Tez and hive.cbo.costmodel.extended enabled
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-10369
> URL: https://issues.apache.org/jira/browse/HIVE-10369
> Project: Hive
> Issue Type: Sub-task
> Components: CBO
> Affects Versions: 1.2.0
> Reporter: Mostafa Mokhtar
> Assignee: Laljo John Pullokkaran
> Fix For: 1.2.0
>
> Attachments: HIVE-10369.patch
>
>
> When calculating parallelism, we end up calling HiveDefaultCostModel.getSplitCount, which returns null, instead of HiveOnTezCostModel.getSplitCount. This results in wrong parallelism.
> This happens for the following join:
> {code}
> org.apache.calcite.plan.RelOptUtil.toString(join)
> (java.lang.String) HiveJoin(condition=[=($1, $3)], joinType=[inner], algorithm=[none], cost=[not available])
>   HiveProject(cs_sold_date_sk=[$0], cs_bill_customer_sk=[$3], cs_sales_price=[$21])
>     HiveTableScan(table=[[tpcds_bin_orc_200.catalog_sales]])
>   HiveJoin(condition=[=($1, $2)], joinType=[inner], algorithm=[MapJoin], cost=[{2400000.0 rows, 6.400008E11 cpu, 1294.6098 io}])
>     HiveProject(c_customer_sk=[$0], c_current_addr_sk=[$4])
>       HiveTableScan(table=[[tpcds_bin_orc_200.customer]])
>     HiveProject(ca_address_sk=[$0], ca_state=[$8], ca_zip=[$9])
>       HiveTableScan(table=[[tpcds_bin_orc_200.customer_address]])
> {code}
> The issue appears to occur very early, in the following call:
> {code}
> if (pushDownTree != null) {
>   costPushDown =
>       RelMetadataQuery.getCumulativeCost(pushDownTree.getJoinTree());
> }
> {code}
> Here, pushDownTree.getJoinTree().joinAlgorithm = HiveOnTezCostModel$TezMapJoinAlgorithm.
> Call stack:
> {code}
> HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount(HiveJoin) line: 114
> HiveJoin.getSplitCount() line: 136
> HiveRelMdParallelism.splitCount(HiveJoin) line: 63
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182
> $Proxy46.splitCount() line: not available
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy46.splitCount() line: not available
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy46.splitCount() line: not available
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
> $Proxy46.splitCount() line: not available
> RelMetadataQuery.splitCount(RelNode) line: 401
> HiveOnTezCostModel$TezMapJoinAlgorithm.getCost(HiveJoin) line: 255
> HiveOnTezCostModel(HiveCostModel).getJoinCost(HiveJoin) line: 64
> HiveRelMdCost.getNonCumulativeCost(HiveJoin) line: 56
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182
> $Proxy41.getNonCumulativeCost() line: not available
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy41.getNonCumulativeCost() line: not available
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy41.getNonCumulativeCost() line: not available
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy41.getNonCumulativeCost() line: not available
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
> $Proxy41.getNonCumulativeCost() line: not available
> RelMetadataQuery.getNonCumulativeCost(RelNode) line: 115
> HiveRelMdDistinctRowCount.getCumulativeCost(HiveJoin) line: 114
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182
> $Proxy40.getCumulativeCost() line: not available
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy40.getCumulativeCost() line: not available
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy40.getCumulativeCost() line: not available
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
> $Proxy40.getCumulativeCost() line: not available
> RelMetadataQuery.getCumulativeCost(RelNode) line: 101
> LoptOptimizeJoinRule.addFactorToTree(LoptMultiJoin, LoptSemiJoinOptimizer, LoptJoinTree, int, BitSet, List<RexNode>, boolean) line: 940
> LoptOptimizeJoinRule.createOrdering(LoptMultiJoin, LoptSemiJoinOptimizer, int) line: 726
> LoptOptimizeJoinRule.findBestOrderings(LoptMultiJoin, LoptSemiJoinOptimizer, RelOptRuleCall) line: 458
> LoptOptimizeJoinRule.onMatch(RelOptRuleCall) line: 128
> HepPlanner(AbstractRelOptPlanner).fireRule(RelOptRuleCall) line: 326
> HepPlanner.applyRule(RelOptRule, HepRelVertex, boolean) line: 515
> HepPlanner.applyRules(Collection<RelOptRule>, boolean) line: 392
> HepPlanner.executeInstruction(HepInstruction$RuleInstance) line: 255
> HepInstruction$RuleInstance.execute(HepPlanner) line: 125
> HepPlanner.executeProgram(HepProgram) line: 207
> HepPlanner.findBestExp() line: 194
> CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 849
> CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 761
> Frameworks$1.apply(RelOptCluster, RelOptSchema, SchemaPlus, CalciteServerStatement) line: 109
> CalcitePrepareImpl.perform(CalciteServerStatement, PrepareAction<R>) line: 730
> Frameworks.withPrepare(PrepareAction<R>) line: 145
> Frameworks.withPlanner(PlannerAction<R>, FrameworkConfig) line: 105
> CalcitePlanner.getOptimizedAST() line: 602
> CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext) line: 240
> CalcitePlanner(SemanticAnalyzer).analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext) line: 10003
> CalcitePlanner.analyzeInternal(ASTNode) line: 203
> CalcitePlanner(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224
> ExplainSemanticAnalyzer.analyzeInternal(ASTNode) line: 74
> ExplainSemanticAnalyzer(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224
> Driver.compile(String, boolean) line: 424
> Driver.compile(String) line: 308
> Driver.compileInternal(String) line: 1122
> Driver.runInternal(String, boolean) line: 1170
> Driver.run(String, boolean) line: 1059
> Driver.run(String) line: 1049
> CliDriver.processLocalCmd(String, CommandProcessor, CliSessionState) line: 213
> CliDriver.processCmd(String) line: 165
> CliDriver.processLine(String, boolean) line: 376
> CliDriver.executeDriver(CliSessionState, HiveConf, OptionsProcessor) line: 736
> CliDriver.run(String[]) line: 681
> CliDriver.main(String[]) line: 621
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> {code}
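
The failure mode described in the issue can be illustrated with a small, self-contained sketch. It is not Hive code: JoinAlgorithm, JoinNode, DEFAULT_JOIN and TEZ_MAP_JOIN are hypothetical stand-ins, and the rule that the Tez-style algorithm reports the larger input's split count is an assumption made only for illustration. The sketch shows how a join node that dispatches to a default algorithm hands back a null split count, while one bound to an engine-specific algorithm hands back a usable value.

```java
public class SplitCountSketch {

    /** Hypothetical stand-in for a join algorithm owned by a cost model. */
    interface JoinAlgorithm {
        /** Returns the split count of the join output, or null if it is unknown. */
        Integer getSplitCount(int leftSplits, int rightSplits);
    }

    /** Mirrors the reported behaviour of the default model: no split information. */
    static final JoinAlgorithm DEFAULT_JOIN = (left, right) -> null;

    /** Illustrative Tez-style algorithm; taking the larger input is an assumption. */
    static final JoinAlgorithm TEZ_MAP_JOIN = (left, right) -> Math.max(left, right);

    /** Hypothetical join node that delegates split-count queries to its algorithm. */
    static final class JoinNode {
        private final JoinAlgorithm algorithm;
        private final int leftSplits;
        private final int rightSplits;

        JoinNode(JoinAlgorithm algorithm, int leftSplits, int rightSplits) {
            this.algorithm = algorithm;
            this.leftSplits = leftSplits;
            this.rightSplits = rightSplits;
        }

        Integer getSplitCount() {
            return algorithm.getSplitCount(leftSplits, rightSplits);
        }
    }

    public static void main(String[] args) {
        JoinNode viaDefault = new JoinNode(DEFAULT_JOIN, 8, 2);
        JoinNode viaTez = new JoinNode(TEZ_MAP_JOIN, 8, 2);

        // The default algorithm provides no split count, so callers see null.
        System.out.println("default model: " + viaDefault.getSplitCount());
        // The engine-specific algorithm provides a concrete value.
        System.out.println("tez model:     " + viaTez.getSplitCount());
    }
}
```

Any parallelism estimate built on top of the first node has nothing to work with, which matches the symptom reported above.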