Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/07 01:03:33 UTC

[GitHub] [iceberg] cbwn123 opened a new issue, #5216: Rewritedatafiles is not work with total-delete-files on Spark 3.2.1, Scala 2.12, Iceberg 0.14.0

cbwn123 opened a new issue, #5216:
URL: https://github.com/apache/iceberg/issues/5216

   This is the snapshot in the metadata file that was created by the rewriteDataFiles action.
    {
       "sequence-number" : 479,
       "snapshot-id" : 7678743334037913715,
       "parent-snapshot-id" : 704591498781525614,
       "timestamp-ms" : 1657152639626,
       "summary" : {
         "operation" : "replace",
         "manifests-created" : "1",
         "manifests-kept" : "0",
         "manifests-replaced" : "3",
         "entries-processed" : "0",
         "changed-partition-count" : "0",
         "total-records" : "190361791",
         "total-files-size" : "11035003231",
         "total-data-files" : "56",
         "total-delete-files" : "24356",
         "total-position-deletes" : "226070",
         "total-equality-deletes" : "0"
       }
    }
   
   The total-data-files count fell sharply, but total-delete-files did not.
   
   Why doesn't the rewriteDataFiles action merge data files and delete files?
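
   To make the mismatch concrete, the counters in the summary above can be compared directly. The following is a minimal sketch that uses only the values quoted in this snapshot; the helper name `total_counters` is hypothetical and not part of any Iceberg API.

```python
# Summary counters copied from the snapshot quoted above.
# In the metadata JSON the values are strings, so they are converted to int here.
summary = {
    "operation": "replace",
    "total-records": "190361791",
    "total-files-size": "11035003231",
    "total-data-files": "56",
    "total-delete-files": "24356",
    "total-position-deletes": "226070",
    "total-equality-deletes": "0",
}


def total_counters(summary):
    """Return the total-* entries of a snapshot summary as integers (hypothetical helper)."""
    return {key: int(value) for key, value in summary.items() if key.startswith("total-")}


counters = total_counters(summary)
delete_files_per_data_file = counters["total-delete-files"] / counters["total-data-files"]

print(f"{counters['total-data-files']} data files, {counters['total-delete-files']} delete files")
print(f"about {delete_files_per_data_file:.0f} delete files per data file")
```

   The ratio shows that the delete file count is out of proportion to the number of data files left after the rewrite.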


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Initial-neko commented on issue #5216: Rewritedatafiles is not work with total-delete-files on Spark 3.2.1, Scala 2.12, Iceberg 0.14.0

Posted by GitBox <gi...@apache.org>.
Initial-neko commented on issue #5216:
URL: https://github.com/apache/iceberg/issues/5216#issuecomment-1200075292

   @RussellSpitzer means that the total-delete-files statistic is not accurate yet, but query efficiency will not be affected once the action is done.




[GitHub] [iceberg] github-actions[bot] closed issue #5216: Rewritedatafiles is not work with total-delete-files on Spark 3.2.1, Scala 2.12, Iceberg 0.14.0

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #5216: Rewritedatafiles is not work with total-delete-files on Spark 3.2.1, Scala 2.12, Iceberg 0.14.0
URL: https://github.com/apache/iceberg/issues/5216




[GitHub] [iceberg] github-actions[bot] commented on issue #5216: Rewritedatafiles is not work with total-delete-files on Spark 3.2.1, Scala 2.12, Iceberg 0.14.0

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #5216:
URL: https://github.com/apache/iceberg/issues/5216#issuecomment-1426514651

   This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'




[GitHub] [iceberg] RussellSpitzer commented on issue #5216: Rewritedatafiles is not work with total-delete-files on Spark 3.2.1, Scala 2.12, Iceberg 0.14.0

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on issue #5216:
URL: https://github.com/apache/iceberg/issues/5216#issuecomment-1179083356

   It does merge data and delete files. Unfortunately the logic for cleaning up delete files is more complicated and we haven't set that up quite yet. The main issue being that just because a delete file is applied to 1 data file doesn't mean it was applied to all relevant data files. Currently the way we do the processing doesn't let us know if a delete file has been completely or only partially merged. There are various efforts going forward to clean this up at the moment.
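
   The partial-merge problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Iceberg's implementation: the class `PositionDeleteFile`, the function `classify_delete_files`, and the file names are all hypothetical. It assumes a position delete file is a list of (data file, row position) pairs and that a delete file is safe to drop only when every data file it references was part of the rewrite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionDeleteFile:
    """Toy stand-in for a position delete file (hypothetical, not an Iceberg class)."""
    path: str
    entries: tuple  # pairs of (data_file_path, row_position)

    def referenced_data_files(self):
        """Set of data files this delete file has entries for."""
        return {data_file for data_file, _ in self.entries}


def classify_delete_files(delete_files, rewritten_data_files):
    """Split delete files into fully merged and still needed after a rewrite.

    A delete file counts as fully merged only if every data file it
    references was rewritten; otherwise some of its deletes still apply
    to live data files and it must be kept.
    """
    rewritten = set(rewritten_data_files)
    fully_merged, still_needed = [], []
    for delete_file in delete_files:
        if delete_file.referenced_data_files() <= rewritten:
            fully_merged.append(delete_file.path)
        else:
            still_needed.append(delete_file.path)
    return fully_merged, still_needed


delete_files = [
    PositionDeleteFile("delete-1.parquet", (("data-a.parquet", 3), ("data-b.parquet", 7))),
    PositionDeleteFile("delete-2.parquet", (("data-a.parquet", 9),)),
]

# Assume only data-a.parquet was compacted in this hypothetical rewrite.
fully_merged, still_needed = classify_delete_files(delete_files, ["data-a.parquet"])
print("fully merged:", fully_merged)
print("still needed:", still_needed)
```

   In this model delete-1.parquet has to stay because it still applies to data-b.parquet, even though its deletes for data-a.parquet were already merged. Tracking this per delete file is the bookkeeping that, according to the comment above, the rewrite action did not yet have.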




[GitHub] [iceberg] github-actions[bot] commented on issue #5216: Rewritedatafiles is not work with total-delete-files on Spark 3.2.1, Scala 2.12, Iceberg 0.14.0

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #5216:
URL: https://github.com/apache/iceberg/issues/5216#issuecomment-1405841743

   This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

