Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/19 16:24:02 UTC

[GitHub] [iceberg] aokolnychyi opened a new pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

aokolnychyi opened a new pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930


   This PR changes the default distribution mode for copy-on-write DELETE. Previously, if the table was unpartitioned and unsorted, and neither the write nor the delete distribution mode was set, we did not request any distribution. It is a follow-up to the original PR.
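   
   For context, the delete distribution mode under discussion is a per-table setting. Below is a minimal sketch, assuming Iceberg's `write.delete.distribution-mode` property (`TableProperties.DELETE_DISTRIBUTION_MODE`); the method and table wiring are hypothetical:
   
       import org.apache.iceberg.Table;
       import org.apache.iceberg.TableProperties;
   
       // Pin the delete distribution mode explicitly instead of relying on the
       // default this PR changes; valid values are none, hash, and range.
       void pinDeleteDistributionMode(Table table) {
         table.updateProperties()
             .set(TableProperties.DELETE_DISTRIBUTION_MODE, "hash")
             .commit();
       }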



[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787947162



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY id, data
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY id, data
+  //
+  // PARTITIONED BY date, days(ts) UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY date, days(ts)
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Same for when delete mode is not set.





[GitHub] [iceberg] aokolnychyi commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r788129115



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       That said, I think we should order by `id, data` in this case to have properly sorted output files. If there are two files with ids (0, 10) and (2, 11) and they end up in one writing task, we probably want to produce a single, properly sorted file.
   
   This case would become `CLUSTER BY _file + LOCALLY ORDER BY id, data` then.





[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787937696



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering

Review comment:
       I see why you'd not introduce extra work here, since there is no table sort order and the delete mode is explicitly set to do very little. But I think we could still introduce a local order by `_file` and `_pos` if we wanted to. If no order is specified, we get to choose, and I'd probably opt for doing a bit more work and keeping the original row order.
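   
   For reference, a minimal sketch (not this PR's code) of how such a request can be expressed with Spark 3.2's DSv2 write distribution and ordering API; `_file` and `_pos` are Iceberg's metadata columns, everything else is standard Spark 3.2:
   
       import org.apache.spark.sql.connector.distributions.Distribution;
       import org.apache.spark.sql.connector.distributions.Distributions;
       import org.apache.spark.sql.connector.expressions.Expression;
       import org.apache.spark.sql.connector.expressions.Expressions;
       import org.apache.spark.sql.connector.expressions.SortDirection;
       import org.apache.spark.sql.connector.expressions.SortOrder;
   
       Expression file = Expressions.column("_file");
       Expression pos = Expressions.column("_pos");
   
       // CLUSTER BY _file: rows read from the same data file go to the same task
       Distribution distribution = Distributions.clustered(new Expression[] { file });
   
       // LOCALLY ORDER BY _file, _pos: each task preserves the original row order
       SortOrder[] ordering = new SortOrder[] {
           Expressions.sort(file, SortDirection.ASCENDING),
           Expressions.sort(pos, SortDirection.ASCENDING)
       };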





[GitHub] [iceberg] aokolnychyi commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r788130015



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY id, data
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY id, data
+  //
+  // PARTITIONED BY date, days(ts) UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY date, days(ts)
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY date, days(ts)
+  //
+  // PARTITIONED BY date ORDERED BY id
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Yeah, I think we should use the sort order if it is present instead of ordering by `_file` and `_pos`.
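   
   A hypothetical sketch of that rule (`convert` and `FILE_POSITION_ORDERING` are illustrative placeholders, not Iceberg's actual helpers):
   
       // If the table defines a sort order, sort the rewritten files by it;
       // otherwise fall back to preserving the original row order.
       SortOrder[] deleteLocalOrdering(org.apache.iceberg.Table table) {
         if (!table.sortOrder().isUnsorted()) {
           return convert(table.sortOrder());   // e.g. LOCALLY ORDER BY id, data
         } else {
           return FILE_POSITION_ORDERING;       // LOCALLY ORDER BY _file, _pos
         }
       }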





[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r789257088



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       The comment in your last reply sounds good to me.






[GitHub] [iceberg] aokolnychyi commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r788116320



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering

Review comment:
       I don't mind doing that. In many cases, unfortunately, it won't give us any benefit. A single task can frequently end up with records from multiple files, and if those original files were sorted, the produced file won't be. There is also no guarantee that records from the same original file will end up in the same write task; they can be scattered across all tasks.





[GitHub] [iceberg] aokolnychyi commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787925408



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       I am debating a bit what the correct ordering should be in this case, given that our table has a defined sort order. On one hand, sorting by `_file` and `_pos` may lead to unsorted files on S3 if object storage paths are enabled. On the other hand, sorting by the defined sort columns may rebalance files (could be a good thing?).






[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787932904



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Why not global order by `_file` and `_pos`? That would make sense to me because range is telling us to handle skew.






[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787932308



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering

Review comment:
       Shouldn't this at least locally order by `_file` and `_pos` to keep the original row order? The delete mode should affect the distribution, not the order that we choose.







[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r789256311



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Yeah, since this is explicit, let's change it.






[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787943744



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY id, data
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY id, data
+  //
+  // PARTITIONED BY date, days(ts) UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY date, days(ts)
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Do we need to add `date, days(ts)` as a prefix to the local ordering in case the files are coming from a different partition spec?
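   
   A hypothetical sketch of that prefixing, reusing the example table's columns and Spark's `Expressions` transforms:
   
       // Prefix the local ordering with the current partition expressions so rows
       // rewritten from files of an older spec still group by partition, then
       // preserve the original row order within each partition.
       SortOrder[] localOrdering = new SortOrder[] {
           Expressions.sort(Expressions.column("date"), SortDirection.ASCENDING),
           Expressions.sort(Expressions.days("ts"), SortDirection.ASCENDING),
           Expressions.sort(Expressions.column("_file"), SortDirection.ASCENDING),
           Expressions.sort(Expressions.column("_pos"), SortDirection.ASCENDING)
       };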






[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787940129



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Does `ORDERED BY` above mean that we have the table distribution mode set to `range`? If so, I'd probably do a global order as though the delete mode were set to `range`. I think that makes sense since the order is requested.







[GitHub] [iceberg] rdblue merged pull request #3930: Spark 3.2: Revise distribution and ordering in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930


   




[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787941920



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY id, data
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY id, data
+  //
+  // PARTITIONED BY date, days(ts) UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY date, days(ts)
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY date, days(ts)

Review comment:
       I agree with these.
   
   The only thing I would add is that eventually we can add the table identifier fields to the end of this global ordering to give Spark more places to split.
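   
   A hypothetical sketch of that suggestion for the RANGE case, assuming `id` is one of the table's identifier fields; `Distributions.ordered` is Spark 3.2's range-distribution request:
   
       // ORDER BY date, days(ts), id: appending the identifier field gives the
       // range partitioner more distinct keys, and so more places to split.
       SortOrder[] globalOrdering = new SortOrder[] {
           Expressions.sort(Expressions.column("date"), SortDirection.ASCENDING),
           Expressions.sort(Expressions.days("ts"), SortDirection.ASCENDING),
           Expressions.sort(Expressions.column("id"), SortDirection.ASCENDING)
       };
       Distribution distribution = Distributions.ordered(globalOrdering);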






[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787944895



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY id, data
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY id, data
+  //
+  // PARTITIONED BY date, days(ts) UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY date, days(ts)
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY date, days(ts)
+  //
+  // PARTITIONED BY date ORDERED BY id
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Why not order by date, id like the range case?






[GitHub] [iceberg] aokolnychyi commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r788129541



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY id, data
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY id, data
+  //
+  // PARTITIONED BY date, days(ts) UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY date, days(ts)
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Sounds reasonable.








[GitHub] [iceberg] aokolnychyi commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787952617



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       No, it did not mean that but I may need to take it into account too. Let me think.







[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r789255943



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering

Review comment:
       That's a fair point. If we aren't clustering by _file then there is probably either a mix (if there's a shuffle) or we end up with a good ordering in the first place because we're reading directly from the incoming files.
   
   Let's leave this as is.






[GitHub] [iceberg] aokolnychyi commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r788117648



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       I think we can do that. Most likely we will produce more files, as the range boundaries will not match the original file boundaries; we could produce up to as many extra files as there are shuffle partitions. However, you are right that the user explicitly asked for range, so maybe there is skew that needs to be addressed. I agree on this one.







[GitHub] [iceberg] rdblue commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r787940323



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + LOCALLY ORDER BY id, data
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> ORDER BY id, data

Review comment:
       I agree with these 3 cases.







[GitHub] [iceberg] aokolnychyi commented on a change in pull request #3930: Spark 3.2: Use hash distribution by default in copy-on-write DELETE

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #3930:
URL: https://github.com/apache/iceberg/pull/3930#discussion_r788127706



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkDistributionAndOrderingUtil.java
##########
@@ -299,14 +300,70 @@ public void testRangeWritePartitionedSortedTable() {
     checkWriteDistributionAndOrdering(table, expectedDistribution, expectedOrdering);
   }
 
+  // =============================================================
+  // Distribution and ordering for copy-on-write DELETE operations
+  // =============================================================
+  //
+  // UNPARTITIONED UNORDERED
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is NONE -> unspecified distribution + empty ordering
+  // delete mode is HASH -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  // delete mode is RANGE -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos
+  //
+  // UNPARTITIONED ORDERED BY id, data
+  // -------------------------------------------------------------------------
+  // delete mode is NOT SET -> CLUSTER BY _file + LOCALLY ORDER BY _file, _pos

Review comment:
       Well, I am not sure I agree on this one. I think regular writes and deletes are fundamentally different, and it is in our best interest to avoid a range-based shuffle during deletes unless the user explicitly asked for one. I think our default should be to keep the existing order at minimal cost, and a hash shuffle would be way cheaper. What do you think?
   
   This would also be a change in behavior compared to what we released before.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org