Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/03/01 16:39:26 UTC

[GitHub] [hive] szlta commented on a change in pull request #3060: HIVE-25975: Optimize ClusteredWriter for bucketed Iceberg tables

szlta commented on a change in pull request #3060:
URL: https://github.com/apache/hive/pull/3060#discussion_r816946090



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/plan/DynamicPartitionCtx.java
##########
@@ -50,6 +52,14 @@
   private String defaultPartName; // default partition name in case of null or empty value
   private int maxPartsPerNode;    // maximum dynamic partitions created per mapper/reducer
   private Pattern whiteListPattern;
+  /**
+   * Expressions describing a custom way of sorting the table before write. Expressions can reference simple
+   * column descriptions or a tree of expressions containing more columns and UDFs.
+   * Can be useful for custom bucket/hash sorting.
+   * A custom expression should be a lambda that is given the original column description expressions as per read
+   * schema and returns a single expression. Example for simply just referencing column 3: cols -> cols.get(3).clone()
+   */
+  private transient List<Function<List<ExprNodeDesc>, ExprNodeDesc>> customSortExpressions;

Review comment:
       The dpCtx is serialized as part of the job, and for some reason Kryo throws exceptions for lambdas on the executor side.
   The expressions here are only needed in HS2 during query planning, so I marked the field transient to keep them from being sent to the executors in the first place.
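
   A minimal sketch of the effect, assuming plain Java serialization rather than Kryo (the Hive job itself uses Kryo, which as far as I know also skips transient fields by default). PlanCtx and roundTrip are hypothetical names for illustration and are not part of Hive; the lambda works on strings instead of ExprNodeDesc to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class TransientLambdaSketch {

  /** Hypothetical stand-in for a plan context; this is NOT Hive's DynamicPartitionCtx. */
  static class PlanCtx implements Serializable {
    private static final long serialVersionUID = 1L;

    // Regular field: travels with the serialized object.
    String defaultPartName = "__DEFAULT__";

    // Planning-time only: transient, so serialization skips it and the
    // non-serializable lambdas inside never reach the stream.
    transient List<Function<List<String>, String>> customSortExpressions = new ArrayList<>();
  }

  /** Serializes and deserializes the context using plain Java serialization (an assumption; not Kryo). */
  static PlanCtx roundTrip(PlanCtx ctx) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(ctx);
    }
    try (ObjectInputStream in =
             new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (PlanCtx) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    PlanCtx ctx = new PlanCtx();
    // A lambda that picks one column; it does not implement Serializable.
    ctx.customSortExpressions.add(cols -> cols.get(3));

    PlanCtx copy = roundTrip(ctx);
    System.out.println(copy.defaultPartName);
    // The transient field is not restored, so it is null on the receiving side.
    System.out.println(copy.customSortExpressions == null);
  }
}
```

   Without the transient modifier, the same round trip would fail under Java serialization with a NotSerializableException for the lambda, which is roughly analogous to the executor-side failure described above.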




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org