Posted to issues@hive.apache.org by "Laljo John Pullokkaran (JIRA)" <ji...@apache.org> on 2015/04/17 04:24:59 UTC

[jira] [Updated] (HIVE-10369) CBO: Don't use HiveDefaultCostModel when With Tez and hive.cbo.costmodel.extended enabled

     [ https://issues.apache.org/jira/browse/HIVE-10369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Laljo John Pullokkaran updated HIVE-10369:
------------------------------------------
    Attachment: HIVE-10369.patch

> CBO: Don't use HiveDefaultCostModel when With Tez and hive.cbo.costmodel.extended enabled 
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-10369
>                 URL: https://issues.apache.org/jira/browse/HIVE-10369
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>    Affects Versions: 1.2.0
>            Reporter: Mostafa Mokhtar
>            Assignee: Laljo John Pullokkaran
>             Fix For: 1.2.0
>
>         Attachments: HIVE-10369.patch
>
>
> When calculating parallelism, we end up calling HiveDefaultCostModel.getSplitCount, which returns null, instead of HiveOnTezCostModel.getSplitCount. This results in wrong parallelism.
> This happens for the following join:
> {code}
> org.apache.calcite.plan.RelOptUtil.toString(join)
> 	 (java.lang.String) HiveJoin(condition=[=($1, $3)], joinType=[inner], algorithm=[none], cost=[not available])
>   HiveProject(cs_sold_date_sk=[$0], cs_bill_customer_sk=[$3], cs_sales_price=[$21])
>     HiveTableScan(table=[[tpcds_bin_orc_200.catalog_sales]])
>   HiveJoin(condition=[=($1, $2)], joinType=[inner], algorithm=[MapJoin], cost=[{2400000.0 rows, 6.400008E11 cpu, 1294.6098 io}])
>     HiveProject(c_customer_sk=[$0], c_current_addr_sk=[$4])
>       HiveTableScan(table=[[tpcds_bin_orc_200.customer]])
>     HiveProject(ca_address_sk=[$0], ca_state=[$8], ca_zip=[$9])
>       HiveTableScan(table=[[tpcds_bin_orc_200.customer_address]])
> {code}
> The issue appears to occur very early, when calling:
> {code}
> if (pushDownTree != null) {
>       costPushDown =
>           RelMetadataQuery.getCumulativeCost(pushDownTree.getJoinTree());
>     }
> {code}
> At this point, pushDownTree.getJoinTree().joinAlgorithm = HiveOnTezCostModel$TezMapJoinAlgorithm.
> Call stack:
> {code}
> HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount(HiveJoin) line: 114	
> HiveJoin.getSplitCount() line: 136	
> HiveRelMdParallelism.splitCount(HiveJoin) line: 63	
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182	
> $Proxy46.splitCount() line: not available	
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
> $Proxy46.splitCount() line: not available	
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
> $Proxy46.splitCount() line: not available	
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132	
> $Proxy46.splitCount() line: not available	
> RelMetadataQuery.splitCount(RelNode) line: 401	
> HiveOnTezCostModel$TezMapJoinAlgorithm.getCost(HiveJoin) line: 255	
> HiveOnTezCostModel(HiveCostModel).getJoinCost(HiveJoin) line: 64	
> HiveRelMdCost.getNonCumulativeCost(HiveJoin) line: 56	
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182	
> $Proxy41.getNonCumulativeCost() line: not available	
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
> $Proxy41.getNonCumulativeCost() line: not available	
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
> $Proxy41.getNonCumulativeCost() line: not available	
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
> $Proxy41.getNonCumulativeCost() line: not available	
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132	
> $Proxy41.getNonCumulativeCost() line: not available	
> RelMetadataQuery.getNonCumulativeCost(RelNode) line: 115	
> HiveRelMdDistinctRowCount.getCumulativeCost(HiveJoin) line: 114	
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182	
> $Proxy40.getCumulativeCost() line: not available	
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
> $Proxy40.getCumulativeCost() line: not available	
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
> $Proxy40.getCumulativeCost() line: not available	
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> Method.invoke(Object, Object...) line: 606	
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132	
> $Proxy40.getCumulativeCost() line: not available	
> RelMetadataQuery.getCumulativeCost(RelNode) line: 101	
> LoptOptimizeJoinRule.addFactorToTree(LoptMultiJoin, LoptSemiJoinOptimizer, LoptJoinTree, int, BitSet, List<RexNode>, boolean) line: 940	
> LoptOptimizeJoinRule.createOrdering(LoptMultiJoin, LoptSemiJoinOptimizer, int) line: 726	
> LoptOptimizeJoinRule.findBestOrderings(LoptMultiJoin, LoptSemiJoinOptimizer, RelOptRuleCall) line: 458	
> LoptOptimizeJoinRule.onMatch(RelOptRuleCall) line: 128	
> HepPlanner(AbstractRelOptPlanner).fireRule(RelOptRuleCall) line: 326	
> HepPlanner.applyRule(RelOptRule, HepRelVertex, boolean) line: 515	
> HepPlanner.applyRules(Collection<RelOptRule>, boolean) line: 392	
> HepPlanner.executeInstruction(HepInstruction$RuleInstance) line: 255	
> HepInstruction$RuleInstance.execute(HepPlanner) line: 125	
> HepPlanner.executeProgram(HepProgram) line: 207	
> HepPlanner.findBestExp() line: 194	
> CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 849	
> CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 761	
> Frameworks$1.apply(RelOptCluster, RelOptSchema, SchemaPlus, CalciteServerStatement) line: 109	
> CalcitePrepareImpl.perform(CalciteServerStatement, PrepareAction<R>) line: 730	
> Frameworks.withPrepare(PrepareAction<R>) line: 145	
> Frameworks.withPlanner(PlannerAction<R>, FrameworkConfig) line: 105	
> CalcitePlanner.getOptimizedAST() line: 602	
> CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext) line: 240	
> CalcitePlanner(SemanticAnalyzer).analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext) line: 10003	
> CalcitePlanner.analyzeInternal(ASTNode) line: 203	
> CalcitePlanner(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224	
> ExplainSemanticAnalyzer.analyzeInternal(ASTNode) line: 74	
> ExplainSemanticAnalyzer(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224	
> Driver.compile(String, boolean) line: 424	
> Driver.compile(String) line: 308	
> Driver.compileInternal(String) line: 1122	
> Driver.runInternal(String, boolean) line: 1170	
> Driver.run(String, boolean) line: 1059	
> Driver.run(String) line: 1049	
> CliDriver.processLocalCmd(String, CommandProcessor, CliSessionState) line: 213	
> CliDriver.processCmd(String) line: 165	
> CliDriver.processLine(String, boolean) line: 376	
> CliDriver.executeDriver(CliSessionState, HiveConf, OptionsProcessor) line: 736	
> CliDriver.run(String[]) line: 681	
> CliDriver.main(String[]) line: 621	
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57	
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
> {code}
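
The failure mode described above can be sketched in isolation. The code below is a minimal illustration, not the attached patch and not Hive's code: every class and method name in it is a hypothetical stand-in. It assumes that a join whose algorithm is still undecided (algorithm=[none]) falls back to a default algorithm whose split count is null, and contrasts that with falling back to an engine-specific algorithm. The split count of 8 is a made-up value.

```java
// Hypothetical, simplified stand-ins for illustration; these are NOT Hive's classes.
final class SplitCountSketch {

    /** A join algorithm that may or may not know its split count. */
    interface JoinAlgorithm {
        Integer getSplitCount(); // null means "no split information"
    }

    /** Models the reported behaviour of the default cost model: always null. */
    static final class DefaultJoinAlgorithm implements JoinAlgorithm {
        @Override
        public Integer getSplitCount() {
            return null;
        }
    }

    /** Models a Tez map join; the split count is a made-up constructor argument. */
    static final class TezMapJoinAlgorithm implements JoinAlgorithm {
        private final int splits;

        TezMapJoinAlgorithm(int splits) {
            this.splits = splits;
        }

        @Override
        public Integer getSplitCount() {
            return splits;
        }
    }

    /** A join whose algorithm may still be undecided (algorithm=[none]). */
    static final class Join {
        private final JoinAlgorithm chosen; // null models algorithm=[none]

        Join(JoinAlgorithm chosen) {
            this.chosen = chosen;
        }

        /** Assumed problematic lookup: an undecided join uses the default algorithm. */
        Integer splitCountWithDefaultFallback() {
            JoinAlgorithm algorithm = (chosen != null) ? chosen : new DefaultJoinAlgorithm();
            return algorithm.getSplitCount();
        }

        /** Alternative lookup: an undecided join uses an engine-specific algorithm. */
        Integer splitCountWithEngineFallback(JoinAlgorithm engineFallback) {
            JoinAlgorithm algorithm = (chosen != null) ? chosen : engineFallback;
            return algorithm.getSplitCount();
        }
    }

    public static void main(String[] args) {
        Join undecided = new Join(null);
        System.out.println("default fallback: " + undecided.splitCountWithDefaultFallback());
        System.out.println("engine fallback:  "
                + undecided.splitCountWithEngineFallback(new TezMapJoinAlgorithm(8)));
    }
}
```

In this sketch, any caller that derives parallelism from the first lookup has to cope with null, while the second lookup always yields a usable number. Whether the real fix takes this shape is not stated in this message.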



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)