You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/19 17:45:08 UTC

[GitHub] [iceberg] rdblue commented on a change in pull request #2064: Flink: Support write.distribution-mode.

rdblue commented on a change in pull request #2064:
URL: https://github.com/apache/iceberg/pull/2064#discussion_r560365972



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -138,6 +138,9 @@ private TableProperties() {
   public static final String ENGINE_HIVE_ENABLED = "engine.hive.enabled";
   public static final boolean ENGINE_HIVE_ENABLED_DEFAULT = false;
 
+  public static final String WRITE_SHUFFLE_BY_PARTITION = "write.shuffle-by.partition";

Review comment:
       The sort order of a table is a recommendation, not a requirement. And you're right that it is for writing. That's why the DDL to update it is `WRITE ORDERED BY ...`.
   
   We don't guarantee a sort order on read except when a data or (eq) delete file has a sort order in metadata (see #1975). The sort order for a table may change and even if writes are globally sorted, multiple writes to the same partitions will produce different file sets that can't be read in order to produce sorted records. That's why we don't make guarantees about reads. Ordering on write is primarily a way to cluster rows for efficient filtering.
   
   If row-level ordering is expensive, as it is for Flink, then it is perfectly fine to ignore the recommendation. Flink may eventually provide a way to order within data files, but I think that is less important than clustering data across files so that data files can be skipped in queries. That's what Steven's idea would achieve, along with handling skew.
   
   It is still valuable to have a write order, even if Flink doesn't guarantee it. If Flink can cluster data by that order, then that's really helpful. And, other services can rewrite those data files after the data is available if row ordering is needed for page skipping within data files. A service that sorts data files after Flink writes them also needs to know the desired order.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org