Posted to issues@iceberg.apache.org by "jessiedanwang (via GitHub)" <gi...@apache.org> on 2023/03/06 23:50:59 UTC

[GitHub] [iceberg] jessiedanwang opened a new issue, #7030: What's the recommended way to do iceberg table maintenance for spark structured streaming application?

jessiedanwang opened a new issue, #7030:
URL: https://github.com/apache/iceberg/issues/7030

   ### Query engine
   
   Spark on EMR
   
   ### Question
   
   We have a Spark Structured Streaming application that streams data into Iceberg tables on AWS, using the Glue catalog. Currently, we stop the streaming query every n batches, run compaction and snapshot expiration using Spark actions, and restart the streaming query once the maintenance jobs are done. A second option is to run a separate Spark application that periodically performs the Iceberg table maintenance tasks while the streaming application keeps running. A third option is to run OPTIMIZE and VACUUM in Athena to do the Iceberg table maintenance. Is there any significant difference in performance between these approaches, and what is the recommended way of doing table maintenance? Thanks.
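
For concreteness, the second option (a separate, periodically scheduled maintenance job) could look roughly like the sketch below. It is not a recommendation from the Iceberg project, only an illustration built on Iceberg's Spark SQL procedures `rewrite_data_files` and `expire_snapshots`. The catalog name `glue_catalog`, the table name `db.events`, the helper names, and the retention values are all assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone


def maintenance_statements(catalog, table, retain_days=7, retain_last=10, now=None):
    """Build the SQL for one maintenance pass: compaction, then snapshot expiration.

    `catalog` and `table` are hypothetical names supplied by the caller;
    the retention defaults are illustrative, not recommended values.
    """
    now = now or datetime.now(timezone.utc)
    # Snapshots older than this cutoff become candidates for expiration.
    # Note: Spark interprets the TIMESTAMP literal in the session time zone.
    cutoff = (now - timedelta(days=retain_days)).strftime("%Y-%m-%d %H:%M:%S")
    return [
        # Compact small data files written by the streaming micro-batches.
        f"CALL {catalog}.system.rewrite_data_files(table => '{table}')",
        # Expire old snapshots, always keeping the most recent `retain_last`.
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', older_than => TIMESTAMP '{cutoff}', "
        f"retain_last => {retain_last})",
    ]


def run_maintenance(spark, catalog, table, **kwargs):
    """Run each maintenance statement through an existing SparkSession."""
    for stmt in maintenance_statements(catalog, table, **kwargs):
        spark.sql(stmt)
```

A scheduled job would call something like `run_maintenance(spark, "glue_catalog", "db.events")` from its own Spark application, assuming the session is already configured with the Iceberg SQL extensions and the Glue catalog. Whether this performs better than pausing the streaming query or using Athena is exactly the open question in this issue; the sketch only shows the mechanics.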


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] github-actions[bot] commented on issue #7030: What's the recommended way to do iceberg table maintenance for spark structured streaming application?

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #7030:
URL: https://github.com/apache/iceberg/issues/7030#issuecomment-1722347879

   This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'




[GitHub] [iceberg] github-actions[bot] closed issue #7030: What's the recommended way to do iceberg table maintenance for spark structured streaming application?

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #7030: What's the recommended way to do iceberg table maintenance for spark structured streaming application? 
URL: https://github.com/apache/iceberg/issues/7030




[GitHub] [iceberg] github-actions[bot] commented on issue #7030: What's the recommended way to do iceberg table maintenance for spark structured streaming application?

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #7030:
URL: https://github.com/apache/iceberg/issues/7030#issuecomment-1703969441

   This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

