Posted to dev@hive.apache.org by "Mostafa Mokhtar (JIRA)" <ji...@apache.org> on 2015/04/17 00:45:59 UTC

[jira] [Created] (HIVE-10369) CBO: Don't use HiveDefaultCostModel when running on Tez with hive.cbo.costmodel.extended enabled

Mostafa Mokhtar created HIVE-10369:
--------------------------------------

             Summary: CBO: Don't use HiveDefaultCostModel when running on Tez with hive.cbo.costmodel.extended enabled
                 Key: HIVE-10369
                 URL: https://issues.apache.org/jira/browse/HIVE-10369
             Project: Hive
          Issue Type: Sub-task
          Components: CBO
    Affects Versions: 1.2.0
            Reporter: Mostafa Mokhtar
            Assignee: Laljo John Pullokkaran
             Fix For: 1.2.0


When calculating parallelism, we end up calling HiveDefaultCostModel.getSplitCount, which returns null, instead of HiveOnTezCostModel.getSplitCount. This results in wrong parallelism.
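
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not Hive's code: the class SplitCountSketch, the JoinAlgorithm interface, the Join class, and the parallelism helper are all hypothetical names. The sketch also assumes, for illustration only, that a map join reports the split count of its larger side and that a missing split count degrades to a parallelism of 1.

```java
public class SplitCountSketch {

    /** Hypothetical stand-in for a join algorithm owned by a cost model. */
    public interface JoinAlgorithm {
        /** Returns the number of splits, or null when the algorithm cannot tell. */
        Integer getSplitCount(int leftSplits, int rightSplits);
    }

    /** Mirrors the reported behavior of the default model: always null. */
    public static final JoinAlgorithm DEFAULT_ALGORITHM = (left, right) -> null;

    /** Assumption for illustration only: a map join keeps the splits of its larger side. */
    public static final JoinAlgorithm TEZ_MAP_JOIN = (left, right) -> Math.max(left, right);

    /** Hypothetical join node; a null algorithm plays the role of algorithm=[none]. */
    public static final class Join {
        private final JoinAlgorithm algorithm;
        private final int leftSplits;
        private final int rightSplits;

        public Join(JoinAlgorithm algorithm, int leftSplits, int rightSplits) {
            this.algorithm = algorithm;
            this.leftSplits = leftSplits;
            this.rightSplits = rightSplits;
        }

        /** Falls back to the default algorithm when none has been chosen yet. */
        public Integer getSplitCount() {
            JoinAlgorithm effective = (algorithm != null) ? algorithm : DEFAULT_ALGORITHM;
            return effective.getSplitCount(leftSplits, rightSplits);
        }
    }

    /** Assumption: a missing split count silently degrades to a parallelism of 1. */
    public static int parallelism(Join join) {
        Integer splits = join.getSplitCount();
        return (splits == null) ? 1 : splits;
    }

    public static void main(String[] args) {
        Join undecided = new Join(null, 40, 8);
        Join mapJoin = new Join(TEZ_MAP_JOIN, 40, 8);
        System.out.println("undecided: splits=" + undecided.getSplitCount()
            + ", parallelism=" + parallelism(undecided));
        System.out.println("map join:  splits=" + mapJoin.getSplitCount()
            + ", parallelism=" + parallelism(mapJoin));
    }
}
```

In this sketch, the same inputs yield very different parallelism depending only on which algorithm object answers getSplitCount. That is the kind of discrepancy this issue reports.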

This happens for the following join:
{code}
org.apache.calcite.plan.RelOptUtil.toString(join)
	 (java.lang.String) HiveJoin(condition=[=($1, $3)], joinType=[inner], algorithm=[none], cost=[not available])
  HiveProject(cs_sold_date_sk=[$0], cs_bill_customer_sk=[$3], cs_sales_price=[$21])
    HiveTableScan(table=[[tpcds_bin_orc_200.catalog_sales]])
  HiveJoin(condition=[=($1, $2)], joinType=[inner], algorithm=[MapJoin], cost=[{2400000.0 rows, 6.400008E11 cpu, 1294.6098 io}])
    HiveProject(c_customer_sk=[$0], c_current_addr_sk=[$4])
      HiveTableScan(table=[[tpcds_bin_orc_200.customer]])
    HiveProject(ca_address_sk=[$0], ca_state=[$8], ca_zip=[$9])
      HiveTableScan(table=[[tpcds_bin_orc_200.customer_address]])
{code}


The issue appears to occur very early, in this call:
{code}
if (pushDownTree != null) {
  costPushDown =
      RelMetadataQuery.getCumulativeCost(pushDownTree.getJoinTree());
}
{code}

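At that point, pushDownTree.getJoinTree().joinAlgorithm is HiveOnTezCostModel$TezMapJoinAlgorithm, yet the call stack below ends in HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount.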
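
As a rough illustration of why a single cumulative-cost request can reach joins other than the one being costed, here is a hedged sketch of a recursive cumulative cost: a node's own cost plus the cumulative cost of its inputs. The class CumulativeCostSketch, the Node type, and the cost figures are hypothetical. The real metadata handlers go through Calcite's provider chain, which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CumulativeCostSketch {

    /** Hypothetical plan node with a made-up non-cumulative cost. */
    public static final class Node {
        final String name;
        final double ownCost;
        final List<Node> inputs;

        public Node(String name, double ownCost, Node... inputs) {
            this.name = name;
            this.ownCost = ownCost;
            this.inputs = new ArrayList<>(Arrays.asList(inputs));
        }
    }

    /**
     * Cumulative cost = own cost + cumulative cost of every input.
     * The visited list records which nodes were consulted along the way.
     */
    public static double cumulativeCost(Node node, List<String> visited) {
        visited.add(node.name);
        double total = node.ownCost;
        for (Node input : node.inputs) {
            total += cumulativeCost(input, visited);
        }
        return total;
    }

    public static void main(String[] args) {
        Node inner = new Node("customer JOIN customer_address", 5.0);
        Node outer = new Node("catalog_sales JOIN (inner join)", 10.0, inner);
        List<String> visited = new ArrayList<>();
        double total = cumulativeCost(outer, visited);
        System.out.println("total=" + total + ", visited=" + visited);
    }
}
```

Because every node in the subtree is consulted, a join whose algorithm has not been chosen yet can still be asked for cost-related metadata while another join is being costed.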


Call stack:
{code}
HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount(HiveJoin) line: 114	
HiveJoin.getSplitCount() line: 136	
HiveRelMdParallelism.splitCount(HiveJoin) line: 63	
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182	
$Proxy46.splitCount() line: not available	
GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
$Proxy46.splitCount() line: not available	
GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
$Proxy46.splitCount() line: not available	
GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132	
$Proxy46.splitCount() line: not available	
RelMetadataQuery.splitCount(RelNode) line: 401	
HiveOnTezCostModel$TezMapJoinAlgorithm.getCost(HiveJoin) line: 255	
HiveOnTezCostModel(HiveCostModel).getJoinCost(HiveJoin) line: 64	
HiveRelMdCost.getNonCumulativeCost(HiveJoin) line: 56	
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182	
$Proxy41.getNonCumulativeCost() line: not available	
GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
$Proxy41.getNonCumulativeCost() line: not available	
GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
$Proxy41.getNonCumulativeCost() line: not available	
GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
$Proxy41.getNonCumulativeCost() line: not available	
GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132	
$Proxy41.getNonCumulativeCost() line: not available	
RelMetadataQuery.getNonCumulativeCost(RelNode) line: 115	
HiveRelMdDistinctRowCount.getCumulativeCost(HiveJoin) line: 114	
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182	
$Proxy40.getCumulativeCost() line: not available	
GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
$Proxy40.getCumulativeCost() line: not available	
GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109	
$Proxy40.getCumulativeCost() line: not available	
GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
Method.invoke(Object, Object...) line: 606	
CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132	
$Proxy40.getCumulativeCost() line: not available	
RelMetadataQuery.getCumulativeCost(RelNode) line: 101	
LoptOptimizeJoinRule.addFactorToTree(LoptMultiJoin, LoptSemiJoinOptimizer, LoptJoinTree, int, BitSet, List<RexNode>, boolean) line: 940	
LoptOptimizeJoinRule.createOrdering(LoptMultiJoin, LoptSemiJoinOptimizer, int) line: 726	
LoptOptimizeJoinRule.findBestOrderings(LoptMultiJoin, LoptSemiJoinOptimizer, RelOptRuleCall) line: 458	
LoptOptimizeJoinRule.onMatch(RelOptRuleCall) line: 128	
HepPlanner(AbstractRelOptPlanner).fireRule(RelOptRuleCall) line: 326	
HepPlanner.applyRule(RelOptRule, HepRelVertex, boolean) line: 515	
HepPlanner.applyRules(Collection<RelOptRule>, boolean) line: 392	
HepPlanner.executeInstruction(HepInstruction$RuleInstance) line: 255	
HepInstruction$RuleInstance.execute(HepPlanner) line: 125	
HepPlanner.executeProgram(HepProgram) line: 207	
HepPlanner.findBestExp() line: 194	
CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 849	
CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 761	
Frameworks$1.apply(RelOptCluster, RelOptSchema, SchemaPlus, CalciteServerStatement) line: 109	
CalcitePrepareImpl.perform(CalciteServerStatement, PrepareAction<R>) line: 730	
Frameworks.withPrepare(PrepareAction<R>) line: 145	
Frameworks.withPlanner(PlannerAction<R>, FrameworkConfig) line: 105	
CalcitePlanner.getOptimizedAST() line: 602	
CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext) line: 240	
CalcitePlanner(SemanticAnalyzer).analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext) line: 10003	
CalcitePlanner.analyzeInternal(ASTNode) line: 203	
CalcitePlanner(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224	
ExplainSemanticAnalyzer.analyzeInternal(ASTNode) line: 74	
ExplainSemanticAnalyzer(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224	
Driver.compile(String, boolean) line: 424	
Driver.compile(String) line: 308	
Driver.compileInternal(String) line: 1122	
Driver.runInternal(String, boolean) line: 1170	
Driver.run(String, boolean) line: 1059	
Driver.run(String) line: 1049	
CliDriver.processLocalCmd(String, CommandProcessor, CliSessionState) line: 213	
CliDriver.processCmd(String) line: 165	
CliDriver.processLine(String, boolean) line: 376	
CliDriver.executeDriver(CliSessionState, HiveConf, OptionsProcessor) line: 736	
CliDriver.run(String[]) line: 681	
CliDriver.main(String[]) line: 621	
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]	
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57	
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43	
{code}


