Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/12/07 01:24:47 UTC

[GitHub] [iceberg] JMin824 opened a new issue, #6373: Question about usage of RewriteFile with Zorder Strategy

JMin824 opened a new issue, #6373:
URL: https://github.com/apache/iceberg/issues/6373

   ### Query engine
   
   Iceberg:0.14.1
   Spark:3.2.1
   
   ### Question
   
   I have a question about the usage of Zorder. After I run RewriteFile with the Zorder strategy, if I insert some new rows into the table, do I need to run RewriteFile with Zorder again? Or will the new rows be written into files according to their z-value?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] JMin824 commented on issue #6373: Question about usage of RewriteFile with Zorder Strategy

Posted by GitBox <gi...@apache.org>.
JMin824 commented on issue #6373:
URL: https://github.com/apache/iceberg/issues/6373#issuecomment-1340607468

   
   
   
   > Rewrite all rewrites all files: it reads all the data in those files, orders it, then writes out new ordered files. If no predicates are selected, this would completely rewrite the table.
   
   Thanks a lot, I understand now after your reply.




[GitHub] [iceberg] RussellSpitzer commented on issue #6373: Question about usage of RewriteFile with Zorder Strategy

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on issue #6373:
URL: https://github.com/apache/iceberg/issues/6373#issuecomment-1340250944

   There are two main issues here:
   
   1) If you only add enough data for a single file, ZOrdering that single file all by itself is most likely not going to help. Z ordering as we currently have it implemented can only really help when multiple files are ZOrdered at the same time. This is similar to a normal sort: sorting a single file in isolation would not change the min/max stats of that file, so it wouldn't help Iceberg.
   
   2) We do not yet have the ability to use ZOrder as a table-defined sort order. There are tickets for that, but currently, if you want to take advantage of ZOrdering, you need to run the rewrite again. Again, as with a normal sort, you would only gain the benefit from the affected files.
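
A toy sketch of point 1, in plain Python rather than Iceberg code. It uses an ordinary sort on a single made-up column instead of a real Z-order, and the helper names (`file_stats`, `files_to_read`, and so on) are hypothetical. It only illustrates why sorting inside one file leaves that file's min/max stats unchanged, while sorting rows across several files can narrow them.

```python
# Toy model, not Iceberg code: each "file" is a list of values, and the only
# metadata kept per file is its (min, max), which is what lets a reader skip files.

def file_stats(files):
    """Return the (min, max) of every file."""
    return [(min(f), max(f)) for f in files]

def files_to_read(files, value):
    """Count files whose min/max range could contain `value`."""
    return sum(lo <= value <= hi for lo, hi in file_stats(files))

def sort_each_file(files):
    """Sort every file in isolation; rows never move between files."""
    return [sorted(f) for f in files]

def sort_across_files(files, rows_per_file):
    """Sort all rows together, then cut the result into new files."""
    rows = sorted(row for f in files for row in f)
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]

# Three unsorted files with overlapping value ranges (made-up data).
files = [[9, 1, 5], [2, 8, 4], [7, 3, 6]]

isolated = sort_each_file(files)
together = sort_across_files(files, rows_per_file=3)

print(file_stats(files))     # stats before any sorting
print(file_stats(isolated))  # same as above: sorting inside a file keeps its min/max
print(file_stats(together))  # ranges no longer overlap, so lookups can skip files
print(files_to_read(isolated, 5), files_to_read(together, 5))
```

In this toy data a lookup for the value 5 has to open all three files when each file was sorted on its own, but only one file after the rows were sorted together and redistributed.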




[GitHub] [iceberg] JMin824 closed issue #6373: Question about usage of RewriteFile with Zorder Strategy

Posted by GitBox <gi...@apache.org>.
JMin824 closed issue #6373: Question about usage of RewriteFile with Zorder Strategy
URL: https://github.com/apache/iceberg/issues/6373




[GitHub] [iceberg] JMin824 commented on issue #6373: Question about usage of RewriteFile with Zorder Strategy

Posted by GitBox <gi...@apache.org>.
JMin824 commented on issue #6373:
URL: https://github.com/apache/iceberg/issues/6373#issuecomment-1340297485

   > * If you only add enough data for a single file, ZOrdering that single file all by itself is most likely not going to help. Z ordering as we currently have it implemented can only really help when multiple files are ZOrdered at the same time. This is similar to a normal sort: sorting a single file in isolation would not change the min/max stats of that file, so it wouldn't help Iceberg.
   > * We do not yet have the ability to use ZOrder as a table-defined sort order. There are tickets for that, but currently, if you want to take advantage of ZOrdering, you need to run the rewrite again. Again, as with a normal sort, you would only gain the benefit from the affected files.
   
   I found an option named rewrite-all in RewriteFile. If I use it with Zorder after I have added enough data that some new files are created, what will happen? Will the old files not be rewritten? I guess that would cost less than rewriting all files, and maybe queries would be a little slower than after rewriting all files. Thanks a lot for your reply.




[GitHub] [iceberg] RussellSpitzer commented on issue #6373: Question about usage of RewriteFile with Zorder Strategy

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on issue #6373:
URL: https://github.com/apache/iceberg/issues/6373#issuecomment-1340330287

   Rewrite all rewrites all files: it reads all the data in those files, orders it, then writes out new ordered files. If no predicates are selected, this would completely rewrite the table.
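
A minimal sketch of how a predicate could scope such a rewrite, assuming the Spark SQL procedure `rewrite_data_files` with its `strategy`, `sort_order`, `options` and `where` arguments. The catalog, table, column names and the helper `rewrite_zorder_sql` are hypothetical, and `zorder(...)` in `sort_order` may need a newer release than the 0.14.1 mentioned in the question, so check the documentation for your version. The code only builds the statement text; it does not contact Spark.

```python
# Sketch only: catalog, table, and column names below are hypothetical.

def rewrite_zorder_sql(catalog, table, columns, rewrite_all=False, where=None):
    """Build a CALL statement for Iceberg's rewrite_data_files Spark procedure."""
    args = [
        f"table => '{table}'",
        "strategy => 'sort'",
        f"sort_order => 'zorder({','.join(columns)})'",
    ]
    if rewrite_all:
        # rewrite-all makes every file matched by the filter a rewrite candidate.
        args.append("options => map('rewrite-all', 'true')")
    if where is not None:
        # A predicate limits the rewrite to matching files instead of the whole table.
        args.append(f"where => '{where}'")
    return f"CALL {catalog}.system.rewrite_data_files({', '.join(args)})"

# No predicate: with rewrite-all this would rewrite the whole table.
full = rewrite_zorder_sql("my_catalog", "db.events", ["c1", "c2"], rewrite_all=True)

# With a predicate: only files that may contain matching rows are rewritten.
recent = rewrite_zorder_sql(
    "my_catalog", "db.events", ["c1", "c2"],
    rewrite_all=True, where="event_day >= 20221201",
)

print(full)
print(recent)
# With a live Spark session, each statement would be submitted via spark.sql(...).
```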

