Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/15 00:50:37 UTC

[GitHub] [iceberg] dramaticlly opened a new pull request, #5280: Docs: add missing table properties for update and merge write distrib…

dramaticlly opened a new pull request, #5280:
URL: https://github.com/apache/iceberg/pull/5280

   Add missing documentation for `write.update.distribution-mode` and `write.merge.distribution-mode`.
   
   I believe
   - `write.delete.distribution-mode`
   - `write.update.distribution-mode`
   - `write.merge.distribution-mode` (not entirely sure about this one; deduced from https://github.com/apache/iceberg/blob/master/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkWriteConf.java#L228-L240)
   
   were all introduced in #3511, but only one of them appears on the documentation site.
   
   
   CC @aokolnychyi @samredai 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5280: Docs: add missing table properties for update and merge write distrib…

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on code in PR #5280:
URL: https://github.com/apache/iceberg/pull/5280#discussion_r932555882


##########
docs/configuration.md:
##########
@@ -71,6 +71,8 @@ Iceberg tables support table properties to configure table behavior, like the de
 | write.delete.target-file-size-bytes| 67108864 (64 MB)   | Controls the size of delete files generated to target about this many bytes |
 | write.distribution-mode            | none               | Defines distribution of write data: __none__: don't shuffle rows; __hash__: hash distribute by partition key ; __range__: range distribute by partition key or sort key if table has an SortOrder |
 | write.delete.distribution-mode     | hash               | Defines distribution of write delete data          |
+| write.update.distribution-mode     | hash               | Defines distribution of write update data          |
+| write.merge.distribution-mode      | none               | Defines distribution of write merge data           |

Review Comment:
   Looks like the default here is a bit complicated. From my reading of the code, it uses `write.distribution-mode` if that is set; otherwise it falls back to "range" if the table is sorted and "none" if it is unsorted. Do you see that as well, and should we mention it?
   
   I feel we should also add something to spark-writes about distribution mode.
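
   For illustration only, the fallback order described above could be sketched roughly as follows. This is not the actual `SparkWriteConf` code: the class name `MergeDistributionModeSketch`, the method `resolveMergeMode`, and the `tableIsSorted` flag are hypothetical, and the assumption that an explicitly set `write.merge.distribution-mode` takes precedence is mine rather than something confirmed in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the default resolution described in the review comment.
// It does not call any Iceberg API; table properties are modeled as a plain Map.
final class MergeDistributionModeSketch {
    static final String WRITE_DISTRIBUTION_MODE = "write.distribution-mode";
    static final String MERGE_DISTRIBUTION_MODE = "write.merge.distribution-mode";

    // tableIsSorted stands in for "the table has a sort order" (assumption).
    static String resolveMergeMode(Map<String, String> props, boolean tableIsSorted) {
        // Assumption: an explicit merge-specific setting wins.
        String explicit = props.get(MERGE_DISTRIBUTION_MODE);
        if (explicit != null) {
            return explicit;
        }
        // Next, fall back to the general write distribution mode if it is set.
        String general = props.get(WRITE_DISTRIBUTION_MODE);
        if (general != null) {
            return general;
        }
        // Otherwise: "range" for sorted tables, "none" for unsorted ones.
        return tableIsSorted ? "range" : "none";
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        System.out.println(resolveMergeMode(props, true));   // nothing set, sorted table
        System.out.println(resolveMergeMode(props, false));  // nothing set, unsorted table
        props.put(WRITE_DISTRIBUTION_MODE, "hash");
        System.out.println(resolveMergeMode(props, true));   // general mode is set
    }
}
```

   If this reading is right, a single "none" in the defaults column would not tell the whole story, which is why spelling out the fallback in the description may be worth it.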





[GitHub] [iceberg] aokolnychyi closed pull request #5280: Docs: add missing table properties for update and merge write distrib…

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi closed pull request #5280: Docs: add missing table properties for update and merge write distrib…
URL: https://github.com/apache/iceberg/pull/5280




[GitHub] [iceberg] aokolnychyi commented on pull request #5280: Docs: add missing table properties for update and merge write distrib…

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on PR #5280:
URL: https://github.com/apache/iceberg/pull/5280#issuecomment-1410959675

   Oops, I missed this PR and merged #6683 instead as that one popped up earlier in my feed. Sorry, @dramaticlly!
   I am going to close this one as that PR covers this.
   
   We will need to add a section with examples from `TestSparkDistributionAndOrderingUtil` once #6679 is resolved.

