Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/08/22 18:49:34 UTC

[GitHub] [iceberg] stevenzwu opened a new issue, #5613: Flink: throttle FLIP-27 source enumerator for split discovery when Flink job is falling behind in streaming execution

stevenzwu opened a new issue, #5613:
URL: https://github.com/apache/iceberg/issues/5613

   ### Feature Request / Improvement
   
   Right now, the FLIP-27 source eagerly discovers all available splits from the Iceberg table using an incremental append scan. If the Flink job is falling behind, with many snapshots and data files outstanding, eagerly discovering all available splits can overwhelm the enumerator with too many pending splits, which increases memory pressure and enumerator checkpoint size. There is no real benefit to eagerly discovering all splits into memory. It would be better to throttle split discovery once a certain (configurable) number of pending splits has already accumulated.
   
   PR #4943 (on the pre-FLIP-27 source) tries to prevent a single incremental scan from covering too many snapshots. That can help, but it does not throttle or pause split discovery when necessary.
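
   To make the request concrete, here is a minimal sketch of the throttling idea in Python. It is a toy model, not Iceberg or Flink code: the class `ThrottledEnumerator`, the parameter `max_pending_splits`, and both method names are hypothetical and chosen only for illustration. Each discovery cycle plans snapshots one at a time and stops as soon as the backlog of pending splits reaches the threshold.

```python
from collections import deque


class ThrottledEnumerator:
    """Toy model of an enumerator that pauses discovery when its backlog is large."""

    def __init__(self, max_pending_splits):
        # Hypothetical configurable threshold; not an actual Iceberg option name.
        self.max_pending_splits = max_pending_splits
        self.pending = deque()
        self.next_snapshot = 0  # index of the next undiscovered snapshot

    def discover(self, snapshots):
        """Run one discovery cycle over `snapshots` (a list of lists of split names).

        Returns the number of snapshots planned in this cycle.
        """
        planned = 0
        # Check the backlog before planning each snapshot; stop once it is full.
        while (self.next_snapshot < len(snapshots)
               and len(self.pending) < self.max_pending_splits):
            self.pending.extend(snapshots[self.next_snapshot])
            self.next_snapshot += 1
            planned += 1
        return planned

    def assign(self, count):
        """Hand out up to `count` pending splits to readers."""
        n = min(count, len(self.pending))
        return [self.pending.popleft() for _ in range(n)]
```

   Because the threshold is checked before each snapshot is planned, the backlog can exceed it by at most one snapshot's worth of splits. Discovery resumes in a later cycle once readers have drained enough pending splits, so the enumerator state (and hence its checkpoint) stays bounded even when the job is far behind.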
   
   ### Query engine
   
   Flink


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on issue #5613: Flink: throttle FLIP-27 source enumerator for split discovery when Flink job is falling behind in streaming execution

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on issue #5613:
URL: https://github.com/apache/iceberg/issues/5613#issuecomment-1435810205

   This is implemented by PR #6299.
   




[GitHub] [iceberg] stevenzwu closed issue #5613: Flink: throttle FLIP-27 source enumerator for split discovery when Flink job is falling behind in streaming execution

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu closed issue #5613: Flink: throttle FLIP-27 source enumerator for split discovery when Flink job is falling behind in streaming execution
URL: https://github.com/apache/iceberg/issues/5613




[GitHub] [iceberg] github-actions[bot] commented on issue #5613: Flink: throttle FLIP-27 source enumerator for split discovery when Flink job is falling behind in streaming execution

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #5613:
URL: https://github.com/apache/iceberg/issues/5613#issuecomment-1435798046

   This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

