Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/05/07 08:05:00 UTC

[jira] [Work logged] (HIVE-25046) Log CBO plans right after major transformations

     [ https://issues.apache.org/jira/browse/HIVE-25046?focusedWorklogId=593185&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593185 ]

ASF GitHub Bot logged work on HIVE-25046:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/May/21 08:04
            Start Date: 07/May/21 08:04
    Worklog Time Spent: 10m 
      Work Description: zabetak commented on a change in pull request #2205:
URL: https://github.com/apache/hive/pull/2205#discussion_r628010636



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
##########
@@ -2067,51 +2069,49 @@ public RelNode apply(RelOptCluster cluster, RelOptSchema relOptSchema, SchemaPlu
       // and on top of that we should check that it only contains operators that
       // are supported by the rewriting algorithm.
       HiveRelOptMaterializationValidator materializationValidator = new HiveRelOptMaterializationValidator();
-      materializationValidator.validate(calciteGenPlan);
+      materializationValidator.validate(calcitePlan);
       setInvalidResultCacheReason(
           materializationValidator.getResultCacheInvalidReason());
       setInvalidAutomaticRewritingMaterializationReason(
           materializationValidator.getAutomaticRewritingInvalidReason());
 
       // 2. Apply pre-join order optimizations
-      calcitePreCboPlan = applyPreJoinOrderingTransforms(calciteGenPlan,
-          mdProvider.getMetadataProvider(), executorProvider);
-
+      calcitePlan = applyPreJoinOrderingTransforms(calcitePlan, mdProvider.getMetadataProvider(), executorProvider);
+      if (LOG.isDebugEnabled()) {
+        LOG.debug("Plan after pre-join transformations:\n" + RelOptUtil.toString(calcitePlan));
+      }
       // 3. Materialized view based rewriting
       // We disable it for CTAS and MV creation queries (trying to avoid any problem
       // due to data freshness)
       if (conf.getBoolVar(ConfVars.HIVE_MATERIALIZED_VIEW_ENABLE_AUTO_REWRITING) &&
               !getQB().isMaterializedView() && !ctx.isLoadingMaterializedView() && !getQB().isCTAS() &&
                getQB().hasTableDefined() &&
               !forViewCreation) {
-        calcitePreCboPlan = applyMaterializedViewRewriting(planner,
-            calcitePreCboPlan, mdProvider.getMetadataProvider(), executorProvider);
+        calcitePlan =
+            applyMaterializedViewRewriting(planner, calcitePlan, mdProvider.getMetadataProvider(), executorProvider);
+        if (LOG.isDebugEnabled()) {
+          LOG.debug("Plan after view-based rewriting:\n" + RelOptUtil.toString(calcitePlan));
+        }
       }
 
       // 4. Apply join order optimizations: reordering MST algorithm
       //    If join optimizations failed because of missing stats, we continue with
       //    the rest of optimizations
       if (profilesCBO.contains(ExtendedCBOProfile.JOIN_REORDERING)) {
-        calciteOptimizedPlan = applyJoinOrderingTransform(calcitePreCboPlan,
-            mdProvider.getMetadataProvider(), executorProvider);
+        calcitePlan = applyJoinOrderingTransform(calcitePlan, mdProvider.getMetadataProvider(), executorProvider);
+        if (LOG.isDebugEnabled()) {
+          LOG.debug("Plan after join transformations:\n" + RelOptUtil.toString(calcitePlan));
+        }
       } else {
-        calciteOptimizedPlan = calcitePreCboPlan;
         disableSemJoinReordering = false;
       }
 
       // 5. Apply post-join order optimizations
-      calciteOptimizedPlan = applyPostJoinOrderingTransform(calciteOptimizedPlan,
-          mdProvider.getMetadataProvider(), executorProvider);
-
-      if (LOG.isDebugEnabled() && !conf.getBoolVar(ConfVars.HIVE_IN_TEST)) {

Review comment:
We are already logging plans in various places before this `if` (in the same method, in the trimmer, etc.), so I don't think one or two more log lines will have a dramatic impact on performance or on the amount of logged information.

Moreover, if for some reason we don't want these logs to appear when we run the tests, we can suppress them in the usual and expected way: by changing the Log4j configuration.
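
To illustrate the point about configuration, here is a minimal sketch of a guarded debug statement whose output depends only on the configured logger level, with no test-mode check in the code. It is not Hive code: `java.util.logging` stands in for the SLF4J/Log4j setup so that the example runs without dependencies, and the class name, logger name, and plan text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical stand-in for the planner class; not part of Hive.
class LogLevelSketch {
  static final Logger LOG = Logger.getLogger("sketch.CalcitePlanner");

  // Collects published messages in memory so the effect of the level is visible.
  static final class CapturingHandler extends Handler {
    final List<String> messages = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      if (isLoggable(record)) {
        messages.add(record.getMessage());
      }
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  // Runs the same guarded log statement under the given level and returns what was emitted.
  static List<String> runWithLevel(Level level) {
    CapturingHandler handler = new CapturingHandler();
    LOG.setUseParentHandlers(false);
    LOG.addHandler(handler);
    LOG.setLevel(level); // in a real setup this comes from the logging configuration
    try {
      // Level.FINE plays the role of DEBUG here; the guard avoids building the plan string needlessly.
      if (LOG.isLoggable(Level.FINE)) {
        LOG.fine("Plan after pre-join transformations:\n" + "<placeholder plan text>");
      }
    } finally {
      LOG.removeHandler(handler);
    }
    return handler.messages;
  }

  public static void main(String[] args) {
    System.out.println("Messages at FINE: " + runWithLevel(Level.FINE).size());
    System.out.println("Messages at INFO: " + runWithLevel(Level.INFO).size());
  }
}
```

The statement itself never changes between the two runs; only the level does, which is the behavior a Log4j configuration change would give for the test runs.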






Issue Time Tracking
-------------------

    Worklog Id:     (was: 593185)
    Time Spent: 40m  (was: 0.5h)

> Log CBO plans right after major transformations
> -----------------------------------------------
>
>                 Key: HIVE-25046
>                 URL: https://issues.apache.org/jira/browse/HIVE-25046
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently the results of various CBO transformations are logged (in DEBUG mode) at the end of the optimization [phase|https://github.com/apache/hive/blob/9f5bd72e908244b2fe915e8dc39f55afa94bbffa/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L2106] and only if we are not in test mode. This has some disadvantages:
> * If there is a failure (exception) in some intermediate step, we miss all the intermediate plans, possibly losing track of which plan led to the problem.
> * Intermediate logs are very useful for identifying plan problems while working on a patch; unfortunately, the logs are explicitly disabled in test mode, which means the code has to be changed every time we need to see them.
> * Logging at the end necessitates keeping additional local variables that make the code harder to read.
> The goal of this issue is to place DEBUG logging right after major transformations, regardless of whether we are running in test mode, to alleviate the shortcomings mentioned above.
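
The first and third disadvantages can be illustrated with a small sketch. It is not the Hive implementation: plans are plain strings, the step names and the simulated failure are made up, and a list stands in for the DEBUG logger. It only shows that when each plan is recorded right after its transformation, a single plan variable is enough and the plans produced before a failing step are still available.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration; plans are strings and debugLog stands in for LOG.debug.
class StagedPlanLoggingSketch {

  // Applies each named step in order and records the plan right after the step.
  static String optimize(String plan,
                         Map<String, UnaryOperator<String>> steps,
                         List<String> debugLog) {
    for (Map.Entry<String, UnaryOperator<String>> step : steps.entrySet()) {
      plan = step.getValue().apply(plan); // one variable, reassigned after each step
      debugLog.add("Plan after " + step.getKey() + ":\n" + plan);
    }
    return plan;
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the steps in insertion order.
    Map<String, UnaryOperator<String>> steps = new LinkedHashMap<>();
    steps.put("pre-join transformations", p -> p + " -> preJoin");
    steps.put("join transformations", p -> {
      throw new IllegalStateException("simulated failure");
    });

    List<String> debugLog = new ArrayList<>();
    try {
      optimize("scan", steps, debugLog);
    } catch (IllegalStateException e) {
      System.out.println("Optimization failed: " + e.getMessage());
    }
    // The plan produced before the failing step was already recorded.
    debugLog.forEach(System.out::println);
  }
}
```

Had the logging been placed after the loop instead, the exception would have skipped it and the pre-join plan would have been lost.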


