Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/01/11 10:55:54 UTC

[GitHub] [hive] kasakrisz commented on a change in pull request #2932: HIVE-25856: Intermittent null ordering in plans of queries with GROUP BY and LIMIT

kasakrisz commented on a change in pull request #2932:
URL: https://github.com/apache/hive/pull/2932#discussion_r782032263



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateSortLimitRule.java
##########
@@ -55,29 +54,13 @@
  */
 public class HiveAggregateSortLimitRule extends RelOptRule {
 
-  private static HiveAggregateSortLimitRule instance = null;
-
-  public static final HiveAggregateSortLimitRule getInstance(HiveConf hiveConf) {
-    if (instance == null) {
-      RelFieldCollation.NullDirection defaultAscNullDirection;
-      if (HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVE_DEFAULT_NULLS_LAST)) {
-        defaultAscNullDirection = RelFieldCollation.NullDirection.LAST;
-      } else {
-        defaultAscNullDirection = RelFieldCollation.NullDirection.FIRST;
-      }
-      instance = new HiveAggregateSortLimitRule(defaultAscNullDirection);
-    }
-
-    return instance;
-  }
-
   private final RelFieldCollation.NullDirection defaultAscNullDirection;
 
-
-  private HiveAggregateSortLimitRule(RelFieldCollation.NullDirection defaultAscNullDirection) {
+  public HiveAggregateSortLimitRule(boolean nullsLast) {

Review comment:
       I would generally avoid hardcoding this, because the null ordering behavior affects the Top N Key operator pushdown optimization:
   https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/topnkey/TopNKeyPushdownProcessor.java
   
   The `Top N Key` (TNK) operator is introduced into the physical plan as a parent operator of the `Reduce Sink`. It takes its sort keys and ordering parameters from that `Reduce Sink`. The pushdown optimization then tries to move the TNK as far toward the TableScan (TS) as possible.
   More complex queries may have several `Reduce Sink` operators, or even other TNKs that should be merged. This is the point where null ordering also matters.
   
   I think it is safer to use the config.
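
To make the concern concrete, here is a minimal, self-contained sketch. It is not Hive code: `NullOrderingSketch`, `SortKey`, `defaultAscNullDirection` and `sameOrdering` are hypothetical names, and the local `NullDirection` enum only stands in for Calcite's `RelFieldCollation.NullDirection`. The flag-to-direction mapping mirrors the removed `getInstance` code in the diff above. The merge check is an assumption about how two sets of sort keys might be compared, not the actual logic of `TopNKeyPushdownProcessor`.

```java
import java.util.Collections;
import java.util.List;

/** Illustrative only: hypothetical names, not part of Hive or Calcite. */
final class NullOrderingSketch {

  /** Local stand-in for Calcite's RelFieldCollation.NullDirection. */
  enum NullDirection { FIRST, LAST }

  /** Same mapping as the removed getInstance(): flag set means nulls last. */
  static NullDirection defaultAscNullDirection(boolean nullsLast) {
    return nullsLast ? NullDirection.LAST : NullDirection.FIRST;
  }

  /** One sort key: column index, sort direction and null ordering. */
  static final class SortKey {
    final int column;
    final boolean ascending;
    final NullDirection nulls;

    SortKey(int column, boolean ascending, NullDirection nulls) {
      this.column = column;
      this.ascending = ascending;
      this.nulls = nulls;
    }
  }

  /**
   * Assumed merge precondition: two key lists describe the same ordering
   * only if column, direction AND null ordering agree for every key.
   */
  static boolean sameOrdering(List<SortKey> a, List<SortKey> b) {
    if (a.size() != b.size()) {
      return false;
    }
    for (int i = 0; i < a.size(); i++) {
      SortKey x = a.get(i);
      SortKey y = b.get(i);
      if (x.column != y.column || x.ascending != y.ascending || x.nulls != y.nulls) {
        return false;
      }
    }
    return true;
  }

  /** Convenience factory for a single ascending key on column 0. */
  static List<SortKey> ascendingOnFirstColumn(NullDirection nulls) {
    return Collections.singletonList(new SortKey(0, true, nulls));
  }
}
```

In this sketch, keys built with a hardcoded `NullDirection.LAST` compare as a different ordering from keys built with `defaultAscNullDirection(false)`, even though both sort the same column ascending. Deriving the direction from the configuration on both sides avoids that mismatch.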
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
