Posted to jira@kafka.apache.org by "rvansa (via GitHub)" <gi...@apache.org> on 2023/04/20 11:03:12 UTC

[GitHub] [kafka] rvansa commented on pull request #13619: Initial support for OpenJDK CRaC snapshotting

rvansa commented on PR #13619:
URL: https://github.com/apache/kafka/pull/13619#issuecomment-1516133197

   Sure, thanks for the pointers! I'll go through the docs and compose a proposal on the mailing list. If you don't mind I'll keep this PR open in the meantime.
   
   > What is the downside of adding checkpoint & restore to the producer threads? What expense does it add?
   
   My take is that unless a checkpoint is actually performed, there should be no performance overhead, or only a very minimal one. In this PR the sender performs a volatile read in its loop, which is cheap (unless it contends with frequent writes). Components that need to handle the checkpoint process usually also need a little memory for tracking, but applications typically have only one or a few instances of each such component. The cost of the checkpoint itself, on the other hand, can be significant; that is acceptable because it happens in a controlled manner, sometimes even outside the production environment.
   
   > Looks like we are trying to save resources by suspending the threads (on the producer) that are not actively doing anything and restoring them when they are needed? Is that right?
   
   The sender thread is paused, not to save resources but only to achieve correctness. Before performing the checkpoint we need to close all network connections, and we don't want them re-created unexpectedly before the restore. From my understanding of the code, the affected components are used exclusively by the Sender thread (which processes the request queues), so the most natural and performant option was to block it entirely rather than synchronize using locks (which would add non-trivial overhead even when no checkpoint happens).
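
   To make the idea concrete, here is a minimal sketch of that pattern. It is not the code from this PR and it does not use Kafka's Sender or the org.crac API: `PausableSender`, its fields, and the boolean standing in for a network connection are all hypothetical, and the `beforeCheckpoint`/`afterRestore` method names merely mirror the CRaC callback names. The loop pays for one volatile read per iteration, and the thread that owns the connection closes it and parks itself when a checkpoint is requested.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a sender thread; NOT Kafka's Sender class. */
final class PausableSender implements Runnable {
    private final Object lock = new Object();
    private final BlockingQueue<String> requests = new LinkedBlockingQueue<>();
    private final AtomicInteger sent = new AtomicInteger();
    private final Thread thread = new Thread(this, "sketch-sender");

    // Read once per loop iteration; cheap as long as writes are rare.
    private volatile boolean checkpointRequested;
    private volatile boolean running = true;
    // Written only by the sender thread; volatile so others can observe it.
    private volatile boolean connected;
    // Guarded by lock.
    private boolean parked;

    void start() { thread.start(); }
    void enqueue(String request) { requests.add(request); }
    int sentCount() { return sent.get(); }
    boolean isConnected() { return connected; }

    @Override
    public void run() {
        try {
            while (running) {
                if (checkpointRequested) {   // the only cost when no checkpoint runs
                    parkUntilRestore();
                    continue;
                }
                String request = requests.poll(10, TimeUnit.MILLISECONDS);
                if (request != null) {
                    connected = true;        // pretend to (re)connect lazily
                    sent.incrementAndGet();  // pretend to send the request
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs on the sender thread, the sole user of the "connection".
    private void parkUntilRestore() throws InterruptedException {
        synchronized (lock) {
            connected = false;               // close connections before the snapshot
            parked = true;
            lock.notifyAll();                // tell beforeCheckpoint() we are quiescent
            while (checkpointRequested && running) {
                lock.wait();
            }
            parked = false;
        }
    }

    /** Blocks until the sender thread is parked; the sender must have been started. */
    void beforeCheckpoint() {
        synchronized (lock) {
            checkpointRequested = true;
            try {
                while (!parked) {
                    lock.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted before sender parked", e);
            }
        }
    }

    /** Lets the sender thread resume; it reconnects on the next request. */
    void afterRestore() {
        synchronized (lock) {
            checkpointRequested = false;
            lock.notifyAll();
        }
    }

    void stop() {
        synchronized (lock) {
            running = false;
            lock.notifyAll();
        }
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

   A caller would invoke `start()`, then `beforeCheckpoint()` just before the snapshot and `afterRestore()` afterwards. Requests enqueued in between stay in the queue until the restore. The lock is touched only around the checkpoint itself, so the normal send path stays lock-free apart from the queue.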

