You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/08/12 12:38:04 UTC

[GitHub] [flink] echauchot edited a comment on pull request #13040: [FLINK-17073] [checkpointing] checkpointing backpressure if there are too many checkpoints to clean

echauchot edited a comment on pull request #13040:
URL: https://github.com/apache/flink/pull/13040#issuecomment-672843805


   > Thanks for the PR @echauchot
   > 
   > I think the changes are correct in general, I've left some comments in code.
   > 
   > Besides that, I have some general remarks:
   > 
   > 1. Please make sure CI build passes (see @flinkbot comments). I think you missed the license in a new file.
   > 2. Cleanup commit history (see also [changes guide](https://flink.apache.org/contributing/code-style-and-quality-pull-requests.html))
   >    remove reverted, e.g. `de8197`
   >    squash, e.g. `06c763`
   >    make each commit correct, e.g. `numberOfCleaningCheckpointsSupplier.get() > maxQueuedRequests` check in `05f1cb`
   >    probably it's easier to have just one commit? I think the changes aren't so big and are related
   > 3. I think we can re-structure the code so that ZKStore doesn't depend on `CheckpointCleaner`; for this, include a call to `CheckpointCleaner` into `CompletedCheckpoint.discardCallback`
   > 4. These tests are missing:
   > 
   > * unit tests for `CheckpointsCleaner`
   > * test  that checkpoint request is triggered after deletion completes
   
   Thanks Roman for the detailed review and also for your suggestions throughout the design ! 
   1. Yes I saw the RAT failure in Azure CI build, it did not show up using the verify maven phase when I launched the build locally, that is why I forgot it :)
   2. yes I left separate commits to ease the review (allow to isolate parts corresponding to comments in the design doc and isolate further review commits). My plan was indeed to do all the squashing at the end of the review. This can definitely be a single commit as the changes are bound to only one module and do not change the public API etc...
   3. How can we do that ? I can decrement the counter in the callback which is executed in the Executor but the incrementation of the counter is done before submitting the callback to the executor, so outside the callback. Do I misunderstand something ?
   4.  it is just a delegate for checkpoint async tasks which are already tested in ZooKeeperCompletedCheckpointStoreTest, PendingCheckpoint. The only thing added is the counter inc/dec. I added a test only for CheckpointRequestDeciderTest to test that when threshold is exceeded, there is no CP request triggered and that there is one when the counter decreases. 
   Isn't the UTest you want the test  that checkpoint request is triggered after deletion completes ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org